Commit graph

1,179 commits

Author SHA1 Message Date
Simon Willison
3110faa0ba
Replace Janus queue with asyncio.Future
Closes #1752

AI generated patch explanation: https://gisthost.github.io/?e2b8d9c7666e988b5c003ff5e5ef3098
2026-05-16 11:45:43 -07:00
Simon Willison
345f910043
Fix for Database.close()/Datasette.close() order (#2710)
Closes:
- #2709

The key behavior change: after close() starts, no new execute work can be submitted, but already-running execute work is allowed to finish before SQLite connections are closed.
2026-05-12 16:31:36 -07:00
Simon Willison
0dc7bb19d9 Table headers and column options visible for 0 rows
Closes #2701
2026-04-22 22:23:02 -07:00
Simon Willison
b15ce18ddc
TokenRestrictions.abbreviated(datasette) utility method for creating _r dicts (#2696)
Closes #2695
Refs https://github.com/simonw/datasette-auth-tokens/pull/42
2026-04-17 08:44:43 -07:00
Simon Willison
630e557cdb Ran black 2026-04-16 20:44:21 -07:00
Simon Willison
b3001c1e5a Drop redundant _ds_client global now that ds_client is session-scoped
Session-scoped fixtures are cached per worker by pytest itself, so the
manual _ds_client module global is no longer needed.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:41:58 -07:00
Simon Willison
c9a7dc9be2 Declare ds_client as session-scoped so auto-close plugin spares it
ds_client already caches a single Datasette for the whole session via a
module-level _ds_client global, so the declared fixture scope should
match. With function scope the auto-close plugin correctly closes it
after the first test that uses it, which then breaks every subsequent
test that reuses the cached (now-closed) instance — as seen in the CI
coverage job, which runs serially rather than under pytest-xdist.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:40:51 -07:00
Simon Willison
ede942a32e Fix ruff lints in close-related tests
Drop unused `bad = ...` assignment and unused `import pytest`.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:34:48 -07:00
Simon Willison
d23b32c3e5 Call ds.close() in more places in tests
Refs #2692
2026-04-16 20:25:58 -07:00
Simon Willison
c0153386ef FD-leak regression test for Datasette.close()
Creates and disposes 50 Datasette instances in a loop and asserts that
the number of open file descriptors and live threads does not grow,
exercising the full close() path end to end.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:18:05 -07:00
Simon Willison
34cc320eab Pytest auto-close plugin for Datasette instances
Installs a pytest11 entry point so that every Datasette() constructed
inside a pytest_runtest_call phase is auto-closed at the end of the test.
Fixture-scoped instances are untouched. Opt out via the
datasette_autoclose = false ini option.

This gives large test suites a safety net against FD exhaustion and leaked
write threads from the now-default temp-disk internal database without
requiring every existing test to be rewritten.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:15:50 -07:00
Simon Willison
d72dd35378 Wire Datasette.close into ASGI lifespan shutdown
AsgiLifespan now receives an on_shutdown callback that invokes
Datasette.close(), so resources are released cleanly when the ASGI server
delivers a lifespan.shutdown message (SIGTERM / SIGINT for uvicorn).

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:11:02 -07:00
Simon Willison
290f27158f Datasette.close() closes databases, shuts down executor, unlinks temp file
Datasette.close() iterates over every attached Database (including the
internal database), calls Database.close() on each, then shuts down the
ThreadPoolExecutor. Exceptions raised by one Database don't prevent the
others from being closed; the first exception is re-raised afterwards.
Idempotent.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:10:18 -07:00
Simon Willison
dabf8e4199 Database.close() shuts down write thread and raises DatasetteClosedError
After this commit, Database.close() sends a sentinel to the write queue so
the background write thread exits cleanly, closes cached read/write
connections, and marks the instance closed. Subsequent calls to execute*()
raise DatasetteClosedError. close() remains idempotent and one-way.

Refs #2692

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 20:09:47 -07:00
Simon Willison
ade0ef8a60 Restore compatibility with existing execute_write_fn() callbacks
Closes #2691
2026-04-16 19:14:32 -07:00
Simon Willison
1a7030d668 API explorer special case for rowid in /-/upsert
Refs #1936
2026-04-15 15:47:48 -07:00
Simon Willison
5f39036b9b ok: true in /db.json for consistency 2026-04-15 15:44:06 -07:00
Simon Willison
73f338b9f3 Better example in API explorer for /-/upsert, closes #1936 2026-04-15 15:29:59 -07:00
Simon Willison
4922fc2e39 Disallow null primary keys in upsert
Refs https://github.com/simonw/datasette/issues/1936#issuecomment-1341849496
2026-04-15 15:11:33 -07:00
Simon Willison
a973e3ffa1 Normalize headers in CSRF checks, refs #2689 2026-04-14 19:24:38 -07:00
Simon Willison
028cc2446f Don't allow cookies with Authorization: Bearer to bypass CSRF
Refs #2689
2026-04-14 19:23:21 -07:00
Simon Willison
f02484c3de From 409 warnings down to 52 warnings.
By closing unclosed database connections.

Refs #2614
2026-04-14 18:46:47 -07:00
Simon Willison
9c164572d3
Add actor= parameter to datasette.client methods (#2688)
`datasette.client.get(path, actor={"id": "root"}` now makes the internal request with that actor as `request.actor` - same for the other HTTP verb methods on `datasette.client`.

Upgraded relevant tests to use the new `actor=` mechanism.
2026-04-14 18:31:57 -07:00
Simon Willison
0b639a8122
Replace token-based CSRF with Sec-Fetch-Site header protection (#2689)
- New CSRF protection middleware inspired by Go 1.25 and research by Filippo Valsorda - https://words.filippo.io/csrf/ - this replaces the old CSRF token based protection.
- Removes all instances of `<input type="hidden" name="csrftoken" value="{{ csrftoken() }}">` in the templates - they are no longer needed.
- Removes the `def skip_csrf(datasette, scope):` plugin hook defined in `datasette/hookspecs.py` and its documentation and tests.
- Updated CSRF protection documentation to describe the new approach.
- Upgrade guide now describes the CSRF change.
2026-04-14 17:11:36 -07:00
Simon Willison
fc1794719a
Database(is_temp_disk=True) option, used for internal database (#2684)
Closes #2683

* Add is_temp_disk option to Database for temp file-backed databases

Replace the default in-memory internal database with a temporary
file-backed database using WAL mode. This fixes concurrent read/write
locking errors that occur with named in-memory SQLite databases.

The new is_temp_disk parameter on Database creates a temp file via
tempfile.mkstemp, connects to it as a regular file-based database
with WAL mode enabled, and cleans it up on close() and via atexit.

https://claude.ai/code/session_01TteLrUjpDcARjnP1GMRqz2
2026-03-30 21:03:21 -07:00
Simon Willison
312f41b0c2
RenameTableEvent, plus write connection track_event() mechanism (#2682)
* Add track_event callback to execute_write_fn and write_wrapper

Allows write functions and write_wrapper generators to queue events
during a write operation that are dispatched after successful commit.
The fn or wrapper can optionally accept a `track_event` parameter
(detected via call_with_supported_arguments). Events are discarded
if the write raises an exception.

Does not yet handle the block=False (non-blocking) case - events
queued during non-blocking writes are currently silently discarded.

Refs https://github.com/simonw/datasette/issues/2681

* Dispatch track_event events for non-blocking (block=False) writes

Spawns a background asyncio task that awaits the write thread's reply
queue and dispatches pending events after a successful non-blocking
write. Events are still discarded if the write raises an exception.

Refs https://github.com/simonw/datasette/issues/2681

* Warn that events won't fire for other processes

Refs https://github.com/simonw/datasette/issues/2681#issuecomment-4157118662
2026-03-30 11:20:46 -07:00
Simon Willison
cb293572c4 UI for setting custom column types, refs #2671 2026-03-18 14:08:44 -07:00
Simon Willison
2b06da29a1 Rename set-column-types action to et-column-type
Refs https://github.com/simonw/datasette/pull/2674#issuecomment-4085015792
2026-03-18 12:33:09 -07:00
Simon Willison
d440c20984 /db/table/-/set-column-type JSON API, refs #2671 2026-03-18 12:33:09 -07:00
Simon Willison
fa1d8f0fa5 set-column-types permission, refs #2671 2026-03-18 12:33:09 -07:00
Simon Willison
feaba9b18b
Optionally limit ColumnType subclasses to specific SQLite types (#2673)
* ColumnTypes now have optional SQLite column types

Refs #2672
2026-03-18 11:37:09 -07:00
Simon Willison
68966880c2
Fix mobile column actions not showing items for SQL views (#2670)
* Fix mobile column actions not showing items for SQL views

The previous fix to exclude the Link column from mobile column actions
(d02072b) used .dropdown-menu-icon presence as a proxy, but dropdown
icons are only added to sortable columns (those with <a> tags). This
caused all non-sortable columns to be excluded too.

Instead, explicitly mark the Link column with a data-is-link-column
attribute and filter by that in mobileColumnHeaders, so non-sortable
columns on views and tables still appear in the mobile column actions.

* Prettier formatting for mobile-column-actions.js

https://claude.ai/code/session_01CG545gLcZxet7dS5nMzfCd
2026-03-18 09:46:36 -07:00
Simon Willison
e800312b54 allowed_resources(view-query, actor) fix
Previously we could not filter for canned queries that a
specific actor could view.
2026-03-18 09:05:23 -07:00
Simon Willison
fd016f7986
Column actions panel on mobile (#2669)
On mobile widths the column actions were no longer available.

This adds a new modal to help with that.

https://gisthost.github.io/?ec60eb27e22cf5d96642eec1715586b6
2026-03-18 09:04:28 -07:00
Claude
b7578a4884
Remove pointless test_column_type_with_config test
https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 05:24:09 +00:00
Claude
dd9b83301c
Refactor ColumnType: register classes, return instances with config
- register_column_types() now returns classes instead of instances
- ColumnType.__init__ takes optional config=, baking it into the instance
- get_column_type() returns a ColumnType instance (or None) instead of a
  (name, config) tuple
- get_column_types() returns {col: ColumnType instance} instead of tuples
- Remove get_column_type_class() - no longer needed
- render_cell/validate/transform_value methods no longer take config arg;
  use self.config instead
- render_cell hook takes column_type (ColumnType or None) instead of
  column_type + column_type_config

https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 05:18:14 +00:00
Claude
8af98c24c2
Move name and description to class attributes on ColumnType
Instead of passing name= and description= as constructor arguments,
define them as class attributes on each subclass. This better reflects
that they are intrinsic to the type, not configurable per-instance.

https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 05:00:57 +00:00
Claude
5db4f6953d
Fix linting: remove unused import, apply black formatting
https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 03:56:20 +00:00
Claude
de4269629b
Remove duplicate import logging line
https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 03:55:51 +00:00
Claude
e8472bc0cd
Add missing tests and transform_value integration
- Add transform_value integration in table JSON endpoint rows
- Add tests for: duplicate type name error, row endpoint rendering,
  transform_value in JSON output, column type priority over plugins,
  row detail HTML rendering, table HTML rendering, upsert validation,
  unknown type warning logging, config overwrite on restart, and
  no-config edge case
- Total: 34 column type tests, all passing

https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 02:48:55 +00:00
Claude
73225ccad0
Add column types system for semantic column annotations
Implements the column types feature that lets Datasette and plugins annotate
columns with semantic types beyond SQLite storage types (e.g. markdown, email,
url, json, file, point). This enables type-appropriate rendering, validation,
form widgets, and API behavior.

Key changes:
- New `column_types` internal DB table for storing assignments
- `ColumnType` dataclass in datasette/column_types.py with render_cell,
  validate, and transform_value methods
- `register_column_types` plugin hook for registering types
- Built-in url, email, and json column types
- Datasette API methods: get/set/remove_column_type(s),
  get_column_type_class
- Config loading from datasette.json `column_types` table config key
- `column_types` extra on the table JSON endpoint
- Column type info in display_columns extra
- Column type render_cell gets priority in rendering pipeline
- column_type/column_type_config args added to render_cell hookspec
- Write-path validation on insert and update

https://claude.ai/code/session_01SvPEPqHgURTWESRp28pTC3
2026-03-17 02:40:37 +00:00
Simon Willison
7f93353549
Fix startup hook to fire after metadata and schema tables are populated (#2666)
* Fix startup hook to fire after metadata and schema tables are populated

Previously, the startup() plugin hook fired before internal database
tables were populated from metadata.yaml and before catalog schema
tables were filled. This meant plugins couldn't read or modify metadata
during startup. Now invoke_startup() calls refresh_schemas() before
firing startup hooks, ensuring metadata and catalog tables are available.

* Fix startup hook to fire after metadata and schema tables are populated

Previously, the startup() plugin hook fired before internal database
tables were populated from metadata.yaml and before catalog schema
tables were filled. This meant plugins couldn't read or modify metadata
during startup. Now invoke_startup() calls _refresh_schemas() before
firing startup hooks, ensuring metadata and catalog tables are available.

Updated test_tracer to reflect that internal DB creation SQL now runs
during startup rather than during the first traced request.

* Move check_databases before invoke_startup in CLI serve

Since invoke_startup now calls _refresh_schemas() which queries each
database, the spatialite connection check must run first to provide
the friendly error message instead of a raw OperationalError.

https://claude.ai/code/session_01KL4t5FZYb32rZY7xaqrrZU
2026-03-16 17:56:40 -07:00
Simon Willison
e2c1e81ec9
UI for selecting and re-ordering columns on the table page (#2662)
New Web Component on table/view page with a dialog for selecting and re-ordering columns.

Closes #2661
Refs #1298
2026-03-09 17:45:24 -07:00
Simon Willison
97201f067c Row pages link to foreign keys from table display, closes #1592
https://gisthost.github.io/?40813f5b3e4d83c0efe1c09135f84290/index.html

Also now shows primary key column first and in bold on that page.
2026-03-06 20:16:50 -08:00
Simon Willison
24d801b7f7
Respect metadata-defined facet ordering in sorted_facet_results (#2648)
* Preserve metadata-defined facet ordering on table pages

When facets are explicitly defined in table metadata/config, they now
appear in the order specified in the configuration rather than being
sorted by result count. Request-added facets still appear after
metadata-defined facets, sorted by count as before.

* Document metadata-defined facet ordering behavior

* Apply black formatting

https://claude.ai/code/session_01PbSHtjsUpNk3Fx7xjvVqDb
2026-02-25 16:33:27 -08:00
Simon Willison
c96dc5ce26
register_token_handler() plugin hook for custom API token backends (#2650)
Closes #2649

* Add register_token_handler plugin hook for pluggable token backends

Adds a new register_token_handler hook that allows plugins to provide
custom token creation and verification backends. This enables plugins
like datasette-oauth to issue tokens without depending on specific
backend plugins like datasette-auth-tokens.

Key changes:
- New datasette/tokens.py with TokenHandler base class and SignedTokenHandler
  (the default signed-token implementation moved here)
- New register_token_handler hookspec in hookspecs.py
- Datasette.create_token() is now async and delegates to token handlers
- New Datasette.verify_token() method tries all handlers in sequence
- handler= parameter on create_token() to select a specific backend
- TokenHandler exported from datasette package for plugin use
- Fixed actor_from_request loop to await all coroutines (avoids warnings)

* Add documentation and hook test for register_token_handler

Fixes CI failures: the new hook needs a section in docs/plugin_hooks.rst
(checked by test_plugin_hooks_are_documented) and a test_hook_* function
in test_plugins.py (checked by test_plugin_hooks_have_tests).

* Register tokens module as separate default plugin

Instead of re-exporting hookimpls from default_permissions/__init__.py,
register datasette.default_permissions.tokens as its own DEFAULT_PLUGINS
entry. Cleaner and avoids confusing import-for-side-effect patterns.

* Replace restrict_x params with TokenRestrictions dataclass

Consolidates the three separate restrict_all, restrict_database, and
restrict_resource parameters into a single TokenRestrictions dataclass.
Cleaner API surface for both Datasette.create_token() and
TokenHandler.create_token().

Also clarifies docs re: default handler selection via pluggy ordering.

* Add builder methods to TokenRestrictions

Adds allow_all(), allow_database(), and allow_resource() methods that
return self for chaining. Callers no longer need to manipulate nested
dicts directly:

    restrictions = (TokenRestrictions()
        .allow_all("view-instance")
        .allow_database("mydb", "create-table")
        .allow_resource("mydb", "mytable", "insert-row"))

* docs: add 1.0a25 upgrade guide section for create_token() signature change

Ref: https://github.com/simonw/datasette/issues/2649#issuecomment-3962639393

* docs: note that create_token() is now async in upgrade guide

* docs: update internals, plugin_hooks, authentication for new token API

- internals.rst: new async create_token() signature with restrictions
  and handler params, add TokenRestrictions reference docs
- plugin_hooks.rst: show full create_token signature in TokenHandler
  example, note list returns and error cases
- authentication.rst: cross-reference TokenRestrictions from the
  restrictions section

* style: apply black formatting to token handler files

* docs: fix RST heading underline length in internals.rst

* tests: add restrictions round-trip and expiration tests for token handler

Covers allow_database/allow_resource builders, _r payload encoding,
and token_expires in verified actors. Coverage 76% -> 90%.

* tests: add test for signed tokens disabled

* fix: add TokenRestrictions TYPE_CHECKING import to fix ruff F821

* docs: regenerate plugins.rst with cog

* docs: reformat code blocks in plugin_hooks.rst with blacken-docs

* docs: add await .verify_token() to internals.rst

* tests: rewrite register_token_handler test to use real plugin handler

Adds a HardcodedTokenHandler to the test plugins dir that creates
tokens like dstok_hardcoded_token_1. The test now exercises creating
tokens via the default handler (which is the plugin's hardcoded one),
by explicitly naming the hardcoded handler, and by explicitly naming
the signed handler -- then verifies each token round-trips correctly.

* tests: clarify test_token_handler_via_http tests the default signed handler

* fix: use handler="signed" explicitly where signed tokens are expected

The HardcodedTokenHandler in my_plugin.py gets globally registered,
so create_token() without a handler name picks it up as the default.
Fix the create-token view, CLI, and tests to explicitly request the
signed handler where they depend on signed token behavior.

* fix: use handler="signed" in test_create_table_permissions

https://claude.ai/code/session_013cQFiDQjYRrRBH2biFfKuS
2026-02-25 16:32:45 -08:00
Simon Willison
1c6c6d2e68 Fix test_write_wrapper_set_authorizer: use permissive callback instead of None
conn.set_authorizer(None) does not clear the authorizer - SQLite treats
None as an invalid callback. The denied state persists on the shared
write connection, causing subsequent non-deny test cases to fail.

Fixes test added in 8a315f3d.
2026-02-17 13:30:46 -08:00
Simon Willison
5c3137d148 Black formatting 2026-02-17 13:30:24 -08:00
Claude
51e341b06a Fix test assertions broken by new fixture rows in 170f9de
The render_cell pks parameter commit added rows to compound_primary_key
(2->3 rows) and no_primary_key (201->202 rows) tables but did not
update existing tests that had hardcoded row count expectations.

https://claude.ai/code/session_01XfPSZfK57bzRRiEa7Kz5n1
2026-02-17 13:22:57 -08:00
Claude
170f9de774 Add pks parameter to render_cell() plugin hook
The render_cell() hook now receives a pks parameter containing the list
of primary key column names for the table being rendered. This avoids
plugins needing to make redundant async calls to look up primary keys.

For tables without an explicit primary key, pks is ["rowid"]. For custom
SQL queries and views, pks is an empty list [].

https://claude.ai/code/session_01HFYfevAziq4fSYTNRD9ZCh
2026-02-17 11:13:30 -08:00