Commit graph

246 commits

Author SHA1 Message Date
Simon Willison
f02484c3de From 409 warnings down to 52 warnings.
By closing unclosed database connections.

Refs #2614
2026-04-14 18:46:47 -07:00
Simon Willison
0b639a8122
Replace token-based CSRF with Sec-Fetch-Site header protection (#2689)
- New CSRF protection middleware inspired by Go 1.25 and research by Filippo Valsorda - https://words.filippo.io/csrf/ - this replaces the old CSRF token based protection.
- Removes all instances of `<input type="hidden" name="csrftoken" value="{{ csrftoken() }}">` in the templates - they are no longer needed.
- Removes the `def skip_csrf(datasette, scope):` plugin hook defined in `datasette/hookspecs.py` and its documentation and tests.
- Updated CSRF protection documentation to describe the new approach.
- Upgrade guide now describes the CSRF change.
2026-04-14 17:11:36 -07:00
Simon Willison
c479e7dec9
Document call_with_supported_arguments as a supported public API (#2678)
* Document call_with_supported_arguments as a supported public API

Mark both call_with_supported_arguments and async_call_with_supported_arguments
with the @documented decorator and add documentation to docs/internals.rst
so plugin authors can use these dependency injection utilities in their own code.

https://claude.ai/code/session_01DKogZpHwzCTrbeG4XjXmNc
2026-03-30 10:44:10 -07:00
Simon Willison
e800312b54 allowed_resources(view-query, actor) fix
Previously we could not filter for canned queries that a
specific actor could view.
2026-03-18 09:05:23 -07:00
Claude
ad6a020e6d
Add NOT NULL constraints to column_types primary key columns
SQLite allows NULLs in primary key columns by default, so mark
database_name, resource_name, and column_name as NOT NULL explicitly.

https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 03:58:18 +00:00
Claude
73225ccad0
Add column types system for semantic column annotations
Implements the column types feature that lets Datasette and plugins annotate
columns with semantic types beyond SQLite storage types (e.g. markdown, email,
url, json, file, point). This enables type-appropriate rendering, validation,
form widgets, and API behavior.

Key changes:
- New `column_types` internal DB table for storing assignments
- `ColumnType` dataclass in datasette/column_types.py with render_cell,
  validate, and transform_value methods
- `register_column_types` plugin hook for registering types
- Built-in url, email, and json column types
- Datasette API methods: get/set/remove_column_type(s),
  get_column_type_class
- Config loading from datasette.json `column_types` table config key
- `column_types` extra on the table JSON endpoint
- Column type info in display_columns extra
- Column type render_cell gets priority in rendering pipeline
- column_type/column_type_config args added to render_cell hookspec
- Write-path validation on insert and update

https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 02:40:37 +00:00
Simon Willison
5c3137d148 Black formatting 2026-02-17 13:30:24 -08:00
Simon Willison
40a37307de
Add request.form() for multipart form data and file uploads
* Add request.form() for multipart form data and file uploads

New Request.form() method that handles both application/x-www-form-urlencoded
and multipart/form-data content types with streaming parsing.

Features:
- Streaming multipart parser that doesn't buffer entire body in memory
- Files spill to disk above 1MB threshold via SpooledTemporaryFile
- files=False (default) discards file content, files=True stores them
- Security limits: max_request_size, max_file_size, max_fields, max_files
- FormData container with dict-like access and getlist() for multiple values
- UploadedFile class with async read(), seek(), filename, content_type, size
- Support for RFC 5987 filename* encoding for international filenames

Uses multipart-form-data-conformance test suite for validation.

* Update views to use request.form() and document new API

- Migrate PermissionsDebugView, MessagesDebugView, and CreateTokenView
  from post_vars() to form()
- Add documentation for request.form(), FormData, and UploadedFile classes

Centralize multipart defaults and expose stricter limits via Request.form().

Enforce header, part, file, and disk space limits even when files are discarded; detect truncated bodies and client disconnects; and move blocking work off the event loop.

Add FormData close/aclose context managers, update internals docs, and expand multipart tests (including len semantics and stricter conformance expectations).
2026-01-28 18:41:03 -08:00
Simon Willison
ffadb5f74c
Workaround for intermittent test failure on SQLite 3.25.3
Closes:
- #2632
2026-01-28 18:34:00 -08:00
Simon Willison
2f7b120177 Minor speedup for remove_infinites, refs #2629 2026-01-24 22:07:54 -08:00
Simon Willison
7915c46ddd
Fix flaky test_database_page test with deterministic ordering (#2628)
* Fix flaky test_database_page test with deterministic ordering

- Add ORDER BY to table_names() query in database.py
- Sort foreign keys deterministically in get_all_foreign_keys()
- Refactor test_database_page to use property-based assertions instead of
  500+ lines of hardcoded expected data
- Run blacken-docs on plugin_hooks.rst

* Update test_row_foreign_key_tables for new deterministic FK ordering

The foreign keys are now sorted by (other_table, column, other_column),
so complex_foreign_keys comes before foreign_key_references alphabetically.

* Update test_table_names for new alphabetical ordering

The table_names() method now returns tables sorted alphabetically.

* Fix for test that fails prior to SQLite 3.37

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-23 20:57:25 -08:00
Simon Willison
170b3ff61c Better fix for stale catalog_databases, closes #2606
Refs 2605
2025-12-02 19:00:13 -08:00
Simon Willison
c6c2a238c3 Fix for stale internal database bug, closes #2605 2025-12-02 16:22:42 -08:00
Simon Willison
d814e81b32
datasette.client.get(..., skip_permission_checks=True)
Closes #2580
2025-11-05 13:38:01 -08:00
Simon Willison
8b371495dc Move open redirect fix to asgi_send_redirect, refs #2429
See https://github.com/simonw/datasette/pull/2500#issuecomment-3488632278
2025-11-04 17:08:06 -08:00
Simon Willison
18fd373a8f
New PermissionSQL.restriction_sql mechanism for actor restrictions
Implement INTERSECT-based actor restrictions to prevent permission bypass

Actor restrictions are now implemented as SQL filters using INTERSECT rather
than as deny/allow permission rules. This ensures restrictions act as hard
limits that cannot be overridden by other permission plugins or config blocks.

Previously, actor restrictions (_r in actor dict) were implemented by 
generating permission rules with deny/allow logic. This approach had a 
critical flaw: database-level config allow blocks could bypass table-level 
restrictions, granting access to tables not in the actor's allowlist.

The new approach separates concerns:

- Permission rules determine what's allowed based on config and plugins
- Restriction filters limit the result set to only allowlisted resources
- Restrictions use INTERSECT to ensure all restriction criteria are met
- Database-level restrictions (parent, NULL) properly match all child tables

Implementation details:

- Added restriction_sql field to PermissionSQL dataclass
- Made PermissionSQL.sql optional to support restriction-only plugins
- Updated actor_restrictions_sql() to return restriction filters instead of rules
- Modified SQL builders to apply restrictions via INTERSECT and EXISTS clauses

Closes #2572
2025-11-03 14:17:51 -08:00
Simon Willison
400fa08e4c
Add keyset pagination to allowed_resources() (#2562)
* Add keyset pagination to allowed_resources()

This replaces the unbounded list return with PaginatedResources,
which supports efficient keyset pagination for handling thousands
of resources.

Closes #2560

Changes:
- allowed_resources() now returns PaginatedResources instead of list
- Added limit (1-1000, default 100) and next (keyset token) parameters
- Added include_reasons parameter (replaces allowed_resources_with_reasons)
- Removed allowed_resources_with_reasons() method entirely
- PaginatedResources.all() async generator for automatic pagination
- Uses tilde-encoding for tokens (matching table pagination)
- Updated all callers to use .resources accessor
- Updated documentation with new API and examples

The PaginatedResources object has:
- resources: List of Resource objects for current page
- next: Token for next page (None if no more results)
- all(): Async generator that yields all resources across pages

Example usage:
    page = await ds.allowed_resources("view-table", actor, limit=100)
    for table in page.resources:
        print(table.child)

    # Iterate all pages automatically
    async for table in page.all():
        print(table.child)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-31 14:50:46 -07:00
Simon Willison
6a71bde37f
Permissions SQL API improvements (#2558)
* Neater design for PermissionSQL class, refs #2556
  - source is now automatically set to the source plugin
  - params is optional
* PermissionSQL.allow() and PermissionSQL.deny() shortcuts

Closes #2556

* Filter out temp database from attached_databases()

Refs https://github.com/simonw/datasette/issues/2557#issuecomment-3470510837
2025-10-30 15:48:46 -07:00
Simon Willison
5c537e0a3e Fix type annotation bugs and remove unused imports
This fixes issues introduced by the ruff commit e57f391a which converted
Optional[x] to x | None:

- Fixed datasette/app.py line 1024: Dict[id | str, Dict] -> Dict[int | str, Dict]
  (was using id built-in function instead of int type)
- Fixed datasette/app.py line 1074: Optional["Resource"] -> "Resource" | None
- Added 'from __future__ import annotations' for Python 3.10 compatibility
- Added TYPE_CHECKING blocks to avoid circular imports
- Removed dead code (unused variable assignments) from cli.py and views
- Removed unused imports flagged by ruff across multiple files
- Fixed test fixtures: moved app_client fixture imports to conftest.py
  (fixed 71 test errors caused by fixtures not being registered)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-26 16:03:13 -07:00
Simon Willison
d769e97ab8 Show multiple permission reasons as JSON arrays, refs #2531
- Modified /-/allowed to show all reasons that grant access to a resource
- Changed from MAX(reason) to json_group_array() in SQL to collect all reasons
- Reasons now displayed as JSON arrays in both HTML and JSON responses
- Only show Reason column to users with permissions-debug permission
- Removed obsolete "Source Plugin" column from /-/rules interface
- Updated allowed_resources_with_reasons() to parse and return reason lists
- Fixed alert() on /-/allowed by replacing with disabled input state
2025-10-25 21:24:05 -07:00
Simon Willison
82cc3d5c86 Migrate view-query permission to SQL-based system, refs #2510
This change integrates canned queries with Datasette's new SQL-based
permissions system by making the following changes:

1. **Default canned_queries plugin hook**: Added a new hookimpl in
   default_permissions.py that returns canned queries from datasette
   configuration. This extracts config-reading logic into a plugin hook,
   allowing QueryResource to discover all queries.

2. **Async resources_sql()**: Converted Resource.resources_sql() from a
   synchronous class method returning a string to an async method that
   receives the datasette instance. This allows QueryResource to call
   plugin hooks and query the database.

3. **QueryResource implementation**: Implemented QueryResource.resources_sql()
   to gather all canned queries by:
   - Querying catalog_databases for all databases
   - Calling canned_queries hooks for each database with actor=None
   - Building a UNION ALL SQL query of all (database, query_name) pairs
   - Properly escaping single quotes in resource names

4. **Simplified get_canned_queries()**: Removed config-reading logic since
   it's now handled by the default plugin hook.

5. **Added view-query to default allow**: Added "view-query" to the
   default_allow_actions set so canned queries are accessible by default.

6. **Removed xfail markers**: Removed test xfail markers from:
   - tests/test_canned_queries.py (entire module)
   - tests/test_html.py (2 tests)
   - tests/test_permissions.py (1 test)
   - tests/test_plugins.py (1 test)

All canned query tests now pass with the new permission system.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 15:38:07 -07:00
Simon Willison
a5910f200e Code cleanup: rename variables, remove WHERE 0 check, cleanup files, refs #2528
- Rename permission_name to action_name in debug templates for consistency
- Remove confusing WHERE 0 check from check_permission_for_resource()
- Rename tests/test_special.py to tests/test_search_tables.py
- Remove tests/vec.db that shouldn't have been committed
2025-10-25 15:38:07 -07:00
Simon Willison
60a38cee85 Run black formatter 2025-10-25 15:38:07 -07:00
Simon Willison
224084facc Make allowed() and check_permission_for_resource keyword-only, add default resource
- Made allowed() accept resource=None with InstanceResource() as default
- Made both functions keyword-argument only
- Added logging to _permission_checks for debug endpoints
- Fixed check_permission_for_resource to handle empty params correctly
- Created build_permission_rules_sql() helper function for debug views
2025-10-25 15:38:07 -07:00
Simon Willison
e8b79970fb Implement also_requires to enforce view-database for execute-sql
Adds Action.also_requires field to specify dependencies between permissions.
When an action has also_requires set, users must have permission for BOTH
the main action AND the required action on a resource.

Applies this to execute-sql, which now requires view-database permission.
This prevents the illogical scenario where users can execute SQL on a
database they cannot view.

Changes:
- Add also_requires field to Action dataclass in datasette/permissions.py
- Update execute-sql action with also_requires="view-database"
- Implement also_requires handling in build_allowed_resources_sql()
- Implement also_requires handling in AllowedResourcesView endpoint
- Add test verifying execute-sql requires view-database permission

Fixes #2527
2025-10-24 12:14:52 -07:00
Simon Willison
a2994cc5bb Remove automatic parameter namespacing from permission plugins
Simplifies the permission system by removing automatic parameter namespacing.
Plugins are now responsible for using unique parameter names. The recommended
convention is to prefix parameters with the plugin source name (e.g.,
:myplugin_user_id). System reserves :actor, :actor_id, :action, :filter_parent.

- Remove _namespace_params() function from datasette/utils/permissions.py
- Update build_rules_union() to use plugin params directly
- Document parameter naming convention in plugin_hooks.rst
- Update example plugins to use prefixed parameters
- Add test_multiple_plugins_with_own_parameters() to verify convention works
2025-10-24 11:44:43 -07:00
Simon Willison
5138e95d69 Migrate homepage to use bulk allowed_resources() and fix NULL handling in SQL JOINs
- Updated IndexView in datasette/views/index.py to fetch all allowed databases and tables
  in bulk upfront using allowed_resources() instead of calling check_visibility() for each
  database, table, and view individually
- Fixed SQL bug in build_allowed_resources_sql() where USING (parent, child) clauses failed
  for database resources because NULL = NULL evaluates to NULL in SQL, not TRUE
- Changed all INNER JOINs to use explicit ON conditions with NULL-safe comparisons:
  ON b.parent = x.parent AND (b.child = x.child OR (b.child IS NULL AND x.child IS NULL))

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-24 10:32:18 -07:00
Simon Willison
8674aaa392 Add parent filter and include_is_private to allowed_resources()
Major improvements to the allowed_resources() API:

1. **parent filter**: Filter results to specific database in SQL, not Python
   - Avoids loading thousands of tables into Python memory
   - Filtering happens efficiently in SQLite

2. **include_is_private flag**: Detect private resources in single SQL query
   - Compares actor permissions vs anonymous permissions in SQL
   - LEFT JOIN between actor_allowed and anon_allowed CTEs
   - Returns is_private column: 1 if anonymous blocked, 0 otherwise
   - No individual check_visibility() calls needed

3. **Resource.private property**: Safe access with clear error messages
   - Raises AttributeError if accessed without include_is_private=True
   - Prevents accidental misuse of the property

4. **Database view optimization**: Use new API to eliminate redundant checks
   - Single bulk query replaces N individual permission checks
   - Private flag computed in SQL, not via check_visibility() calls
   - Views filtered from allowed_dict instead of checking db.view_names()

All permission filtering now happens in SQLite where it belongs, with
minimal data transferred to Python.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-24 10:32:18 -07:00
Simon Willison
58ac5ccd6e Simplify types in datasette/permissions.py 2025-10-24 10:32:18 -07:00
Simon Willison
b311f735f9 Fix schema mismatch in empty result query
When no permission rules exist, the query was returning 2 columns (parent, child)
but the function contract specifies 3 columns (parent, child, reason). This could
cause schema mismatches in consuming code.

Added 'NULL AS reason' to match the documented 3-column schema.

Added regression test that verifies the schema has 3 columns even when no
permission rules are returned. The test fails without the fix (showing only
2 columns) and passes with it.

Thanks to @asg017 for catching this
2025-10-24 10:32:18 -07:00
Simon Willison
79879b834a Address PR #2515 review comments
- Add URL to sqlite-permissions-poc in module docstring
- Replace Optional with | None for modern Python syntax
- Add Datasette type annotations
- Add SQL comment explaining cascading permission logic
- Refactor duplicated plugin result processing into helper function
2025-10-24 10:32:18 -07:00
Simon Willison
65c427e4ee Ensure :actor, :actor_id and :action are all available to permissions SQL, closes #2520
- Updated build_rules_union() to accept actor as dict and provide :actor (JSON) and :actor_id
- Updated resolve_permissions_from_catalog() and resolve_permissions_with_candidates() to accept actor dict
- :actor is now the full actor dict as JSON (use json_extract() to access fields)
- :actor_id is the actor's id field for simple comparisons
- :action continues to be available as before
- Updated all call sites and tests to use new parameter format
- Added test demonstrating all three parameters working together
2025-10-24 10:32:18 -07:00
Simon Willison
b9c6e7a0f6 PluginSQL renamed to PermissionSQL, closes #2524 2025-10-24 10:32:18 -07:00
Simon Willison
2b879e462f Implement resource-based permission system with SQL-driven access control
This introduces a new hierarchical permission system that uses SQL queries
for efficient permission checking across resources. The system replaces the
older permission_allowed() pattern with a more flexible resource-based
approach.

Core changes:

- New Resource ABC and Action dataclass in datasette/permissions.py
  * Resources represent hierarchical entities (instance, database, table)
  * Each resource type implements resources_sql() to list all instances
  * Actions define operations on resources with cascading rules

- New plugin hook: register_actions(datasette)
  * Plugins register actions with their associated resource types
  * Replaces register_permissions() and register_resource_types()
  * See docs/plugin_hooks.rst for full documentation

- Three new Datasette methods for permission checks:
  * allowed_resources(action, actor) - returns list[Resource]
  * allowed_resources_with_reasons(action, actor) - for debugging
  * allowed(action, resource, actor) - checks single resource
  * All use SQL for filtering, never Python iteration

- New /-/tables endpoint (TablesView)
  * Returns JSON list of tables user can view
  * Supports ?q= parameter for regex filtering
  * Format: {"matches": [{"name": "db/table", "url": "/db/table"}]}
  * Respects all permission rules from configuration and plugins

- SQL-based permission evaluation (datasette/utils/actions_sql.py)
  * Cascading rules: child-level → parent-level → global-level
  * DENY beats ALLOW at same specificity
  * Uses CTEs for efficient SQL-only filtering
  * Combines permission_resources_sql() hook results

- Default actions in datasette/default_actions.py
  * InstanceResource, DatabaseResource, TableResource, QueryResource
  * Core actions: view-instance, view-database, view-table, etc.

- Fixed default_permissions.py to handle database-level allow blocks
  * Now creates parent-level rules for view-table action
  * Fixes: datasette ... -s databases.fixtures.allow.id root

Documentation:

- Comprehensive register_actions() hook documentation
- Detailed resources_sql() method explanation
- /-/tables endpoint documentation in docs/introspection.rst
- Deprecated register_permissions() with migration guide

Tests:

- tests/test_actions_sql.py: 7 tests for core permission API
- tests/test_tables_endpoint.py: 13 tests for /-/tables endpoint
- All 118 documentation tests pass
- Tests verify SQL does filtering (not Python)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-24 10:32:18 -07:00
Simon Willison
e2a739c496 Fix for asyncio.iscoroutinefunction deprecation warnings
Closes #2512

Refs https://github.com/simonw/asyncinject/issues/18
2025-10-08 20:32:16 -07:00
Simon Willison
27084caa04
New allowed_resources_sql plugin hook and debug tools (#2505)
* allowed_resources_sql plugin hook and infrastructure
* New methods for checking permissions with the new system
* New /-/allowed and /-/check and /-/rules special endpoints

Still needs to be integrated more deeply into Datasette, especially for listing visible tables.

Refs: #2502

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-08 14:27:51 -07:00
Simon Willison
7a602140df catalog_views table, closes #2495
Refs https://github.com/datasette/datasette-queries/issues/1#issuecomment-3074491003
2025-07-15 10:22:56 -07:00
Simon Willison
e59fd01757 Fix for incorrect REFERENCES in internal DB
Refs #2466
2025-02-12 19:40:43 -08:00
Simon Willison
53a3b3c80e
Test improvements and fixed deprecation warnings (#2464)
* `asyncio_default_fixture_loop_scope = function`
* Fix a bunch of BeautifulSoup deprecation warnings
* Fix for PytestUnraisableExceptionWarning: Exception ignored in: <_io.FileIO [closed]>
* xfail for sql_time_limit tests (these can be flaky in CI)

Refs #2461
2025-02-04 14:49:52 -08:00
Simon Willison
34a6b2ac84 Fixed bug with ?_trace=1 and large responses, closes #2404 2024-08-21 10:58:17 -07:00
Simon Willison
39dfc7d7d7
Removed units functionality and Pint dependency
Closes #2400, unblocks #2320
2024-08-20 19:03:33 -07:00
Simon Willison
bf953628bb Fix bug where -s could reset settings to defaults, closes #2389 2024-08-14 14:28:48 -07:00
Simon Willison
2b0a61ee19 Rename metadata tables and add schema to docs, refs #2382 2024-08-05 13:53:55 -07:00
Simon Willison
263788906a Fix for RowNotFound, refs #2359 2024-06-21 16:10:16 -07:00
Simon Willison
7316dd4ac6 Fix for TableNotFound, refs #2359 2024-06-21 16:09:20 -07:00
Simon Willison
62686114ee Do not show database name in Database Not Found error, refs #2359 2024-06-21 16:02:15 -07:00
Simon Willison
64a125b860 Removed unnecessary comments, refs #2354 2024-06-12 16:56:59 -07:00
Simon Willison
d118d5c5bb named_parameters(sql) sync function, refs #2354
Also refs #2353 and #2352
2024-06-12 16:51:07 -07:00
Simon Willison
2b6bfddafc Workaround for #2353 2024-06-11 14:04:55 -07:00
Alex Garcia
e1bfab3fca
Move Metadata to --internal database
Refs:
- https://github.com/simonw/datasette/pull/2343
- https://github.com/simonw/datasette/issues/2341
2024-06-11 09:33:23 -07:00