Queries in the internal database
Plan for https://github.com/simonw/datasette/issues/2735.
Goal
Move named query definitions into Datasette's internal database, so hundreds or thousands of queries can be listed, searched, permission-filtered, managed, and executed efficiently.
Terminology change: these are now "queries", not "canned queries". Legacy code and documentation can mention the old name only when describing compatibility or migration.
Decisions so far
- Internal table name: queries.
- Query definitions should use real columns, not a JSON blob for all options.
- Query parameter names live in a parameters text column as a JSON array. No default values for parameters in this pass.
- No separate index is needed for the privacy/trust flags yet.
- User-created queries require execute-sql and insert-query on the database. They default to private, and writable queries additionally require matching table write permissions discovered by Database.analyze_sql().
- Configured queries default to trusted, which means actors who can view them can execute them without also holding execute-sql or the relevant write permissions. Config can opt out with is_trusted: false.
- Add update-query and delete-query, so administrators can manage queries created by other users.
- Remove the old canned_queries() hook from core. If we want compatibility later, build a separate datasette-old-canned-queries plugin.
- Writable user-created queries can be supported using Database.analyze_sql(), provided we fail closed when analysis cannot prove the required permissions.
Current shape
- Query definitions currently come from datasette.yaml or the canned_queries() plugin hook.
- Datasette.get_canned_queries(database_name, actor) calls that hook every time it needs query definitions.
- QueryResource.resources_sql() currently enumerates databases and calls the hook for each one, because permissions and /-/jump need query resources.
- Query pages are visible if the actor has view-query for QueryResource(database, query). Executing an untrusted stored query also checks execute-sql or the relevant write permissions.
- Arbitrary SQL executes if the actor has execute-sql for DatabaseResource(database).
The main performance and architecture win is making query resource enumeration a direct SQL query against the internal database.
Proposed internal schema
Start with one queries table.
CREATE TABLE IF NOT EXISTS queries (
    database_name TEXT NOT NULL,
    name TEXT NOT NULL,
    sql TEXT NOT NULL,
    title TEXT,
    description TEXT,
    description_html TEXT,
    options TEXT NOT NULL DEFAULT '{}',
    parameters TEXT NOT NULL DEFAULT '[]',
    is_write INTEGER NOT NULL DEFAULT 0 CHECK (is_write IN (0, 1)),
    is_private INTEGER NOT NULL DEFAULT 0 CHECK (is_private IN (0, 1)),
    is_trusted INTEGER NOT NULL DEFAULT 0 CHECK (is_trusted IN (0, 1)),
    source TEXT NOT NULL DEFAULT 'user',
    owner_id TEXT,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (database_name, name)
);
CREATE INDEX IF NOT EXISTS queries_owner_idx
    ON queries(owner_id);
Column notes:
- database_name, name, and sql are the routing and execution core.
- Display fields become columns: title, description, and description_html.
- Less common presentation and writable-query behavior lives in options, stored as a JSON object. That covers hide_sql, fragment, on_success_message, on_success_message_sql, on_success_redirect, on_error_message, and on_error_redirect.
- parameters is a JSON array of parameter names, stored as text. This preserves explicit parameter order, but does not support labels or default values.
- Existing writable query behavior gets is_write as a column. Success/error messages, success/error redirects, and on_success_message_sql are stored in options.
- is_private means the query is only visible to its owning actor. This is enforced as a permission restriction, so broader view-query grants do not expose private rows.
- is_trusted means execution skips the usual execute-sql or write-permission checks after view-query has allowed access.
- source distinguishes user, config, and plugin rows.
- owner_id is the actor id for user-created rows. It is NULL for config/plugin rows.
No separate index is needed on (database_name, name) because the primary key already creates one.
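The index claim can be checked directly against SQLite. The following is a minimal sketch using an in-memory database and a trimmed copy of the table (the shortened column list is an assumption made for brevity, not the proposed schema):

```python
import sqlite3

# Trimmed copy of the proposed table: only the columns relevant to indexing.
SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    database_name TEXT NOT NULL,
    name TEXT NOT NULL,
    sql TEXT NOT NULL,
    owner_id TEXT,
    PRIMARY KEY (database_name, name)
);
CREATE INDEX IF NOT EXISTS queries_owner_idx ON queries(owner_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# PRAGMA index_list rows are (seq, name, unique, origin, partial).
# origin is "pk" for the index SQLite creates to back the primary key
# and "c" for indexes made with CREATE INDEX.
indexes = {row[1]: row[3] for row in conn.execute("PRAGMA index_list('queries')")}
for index_name, origin in sorted(indexes.items()):
    print(index_name, origin)
```

The listing should show one index with origin pk alongside queries_owner_idx, which is why the plan adds no explicit index on (database_name, name).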
QueryResource.resources_sql() can become:
SELECT q.database_name AS parent, q.name AS child
FROM queries q
JOIN catalog_databases cd ON cd.database_name = q.database_name
The join keeps persisted queries for detached databases from appearing as live resources.
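To illustrate the effect of that join, here is a small sketch against an in-memory database. The one-column catalog_databases table is a stand-in assumed for this example, and the database and query names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the internal catalog table: only the join column is modelled.
CREATE TABLE catalog_databases (database_name TEXT PRIMARY KEY);
CREATE TABLE queries (
    database_name TEXT NOT NULL,
    name TEXT NOT NULL,
    sql TEXT NOT NULL,
    PRIMARY KEY (database_name, name)
);
-- 'fixtures' is attached; 'old_db' was detached but its query row persists.
INSERT INTO catalog_databases VALUES ('fixtures');
INSERT INTO queries VALUES ('fixtures', 'top_customers', 'select 1');
INSERT INTO queries VALUES ('old_db', 'stale_report', 'select 2');
""")

RESOURCES_SQL = """
SELECT q.database_name AS parent, q.name AS child
FROM queries q
JOIN catalog_databases cd ON cd.database_name = q.database_name
"""

# Only the query belonging to a live database comes back as a resource.
resources = conn.execute(RESOURCES_SQL).fetchall()
print(resources)
```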
Config and plugin migration
datasette.yaml can continue to support databases: {db}: queries: blocks, but core should import them directly into the internal queries table at startup:
- Ensure the internal schema exists.
- Delete previous source='config' rows.
- Read configured query blocks for each live database.
- Normalize string definitions to {"sql": ...}.
- Insert rows into queries, storing explicit params as JSON in parameters.
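The import steps above could look roughly like the sketch below. The function name import_config_queries, the live_databases argument, and the trimmed table are assumptions for illustration, not existing Datasette APIs:

```python
import json
import sqlite3


def import_config_queries(conn, config, live_databases):
    """Replace every source='config' row with the queries found in config."""
    with conn:  # one transaction, so the delete and the inserts land together
        conn.execute("DELETE FROM queries WHERE source = 'config'")
        for db_name, db_config in config.get("databases", {}).items():
            if db_name not in live_databases:
                continue  # skip databases that are not currently attached
            for name, definition in (db_config.get("queries") or {}).items():
                if isinstance(definition, str):
                    definition = {"sql": definition}  # legacy string form
                conn.execute(
                    "INSERT INTO queries"
                    " (database_name, name, sql, title, parameters, is_trusted, source)"
                    " VALUES (?, ?, ?, ?, ?, ?, 'config')",
                    (
                        db_name,
                        name,
                        definition["sql"],
                        definition.get("title"),
                        json.dumps(definition.get("params") or []),
                        # Configured queries are trusted unless they opt out.
                        int(definition.get("is_trusted", True)),
                    ),
                )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (database_name TEXT NOT NULL, name TEXT NOT NULL,"
    " sql TEXT NOT NULL, title TEXT, parameters TEXT NOT NULL DEFAULT '[]',"
    " is_trusted INTEGER NOT NULL DEFAULT 0, source TEXT NOT NULL DEFAULT 'user',"
    " PRIMARY KEY (database_name, name))"
)
config = {
    "databases": {
        "fixtures": {
            "queries": {
                "recent": "select * from log order by id desc limit 10",
                "by_region": {
                    "sql": "select * from customers where region = :region",
                    "params": ["region"],
                    "is_trusted": False,
                },
            }
        },
        "detached": {"queries": {"ignored": "select 1"}},
    }
}
import_config_queries(conn, config, live_databases={"fixtures"})
rows = conn.execute(
    "SELECT name, parameters, is_trusted, source FROM queries ORDER BY name"
).fetchall()
print(rows)
```

Running the import a second time produces the same rows, because the previous source='config' rows are deleted first.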
Plugins should move to:
await datasette.add_query(...)
await datasette.remove_query(...)
Remove the old canned_queries() hookspec and all core calls to it. If compatibility is needed, build datasette-old-canned-queries later as a plugin that restores the hook and imports old hook results using datasette.add_query().
Permission model
Add core actions:
- insert-query, database-level, for creating queries in a database.
- update-query, query-level, for modifying existing query definitions.
- delete-query, query-level, for deleting existing query definitions.
User-created query creation requires:
- execute-sql on DatabaseResource(database)
- insert-query on DatabaseResource(database)
- If analysis shows the query is writable, the table-level write permissions described in the writable query section.
Updating an existing query requires:
- update-query on QueryResource(database, query) or default owner permission for a user-owned row.
- If the SQL changes, also require execute-sql on the database.
- If the changed SQL is writable, also require the table-level write permissions described in the writable query section.
Deleting an existing query requires:
- delete-query on QueryResource(database, query) or default owner permission for a user-owned row.
Default owner permissions:
- For source='user' AND owner_id = actor.id, grant update-query and delete-query.
- For source='user' AND owner_id = actor.id, grant view-query. If the query is private, restriction SQL ensures no other actor sees it through a broader grant.
Executing queries
Default execution rule for read-only queries:
- If is_trusted=0, the actor needs execute-sql on the database.
- If is_trusted=1, the actor can execute the query without execute-sql, provided view-query allows access.
Default execution rule for user-created writable queries:
- is_trusted must be 0.
- The actor must have view-query.
- The actor must currently have every write permission required by fresh Database.analyze_sql() results for the query SQL.
Implementation:
- Keep view-query in the broad DEFAULT_ALLOW_ACTIONS set, so saved queries remain visible by default in all-public Datasette.
- Emit default view-query allows for the owning actor.
- Use restriction_sql to limit private rows to their owner even when broader view-query permissions exist.
- Have QueryView perform the fresh execute-sql or table-permission check before execution unless the row has is_trusted=1.
For read-only queries this keeps QueryView explicit: it checks view-query for the query resource, then checks execute-sql unless the row is trusted. User-created writable queries need one additional runtime permission check because their required table permissions are derived from fresh SQL analysis.
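The execution rules can be summarized as a single decision function. This is a hedged sketch only: StoredQuery, can_execute, and the set-of-tuples permission model are hypothetical stand-ins for Datasette's real permission checks, chosen to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class StoredQuery:
    database: str
    name: str
    is_write: bool = False
    is_trusted: bool = False


def can_execute(query, allowed, required_writes=()):
    """Decide whether an actor may execute a stored query.

    allowed: set of (action, resource) pairs already granted to the actor.
    required_writes: (action, table) pairs derived from fresh SQL analysis.
    """
    # view-query is always required, including for trusted queries.
    if ("view-query", (query.database, query.name)) not in allowed:
        return False
    if query.is_trusted:
        return True
    if query.is_write:
        # Fail closed: an untrusted writable query with no analysis results
        # is refused rather than allowed.
        return bool(required_writes) and all(
            (action, (query.database, table)) in allowed
            for action, table in required_writes
        )
    return ("execute-sql", query.database) in allowed


view = {("view-query", ("fixtures", "top_customers"))}
read_only = StoredQuery("fixtures", "top_customers")
print(can_execute(read_only, view))                                 # False
print(can_execute(read_only, view | {("execute-sql", "fixtures")}))  # True
```

Keeping the view-query check first means an explicit deny or --default-deny blocks trusted queries as well.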
Explicit deny rules should still be able to block a query, and --default-deny still blocks trusted queries unless something grants view-query.
Writable queries
Writable user-created queries should be in scope, guarded by Database.analyze_sql().
The secure rule: a user can create, update, or execute a writable user-created query only if they currently have the corresponding write permissions for every table the SQL can affect.
Database.analyze_sql(sql, params=None) runs the SQL through SQLite's authorizer on an isolated connection and returns a SQLAnalysis object containing SQLTableAccess rows:
- operation: read, insert, update, or delete
- database: Datasette database name for main, or SQLite schema name where no Datasette mapping exists
- table: affected table or view
- columns: read/updated columns where SQLite reports them
- source: trigger/view/CTE source when SQLite reports one
Validation flow for user-created queries:
- Derive named parameters from the SQL and pass harmless placeholder values into db.analyze_sql() so SQLite can prepare statements with bindings.
- If analysis raises a SQLite error, reject the query.
- If every table access is read, treat the query as read-only and require execute-sql plus insert-query/update-query as described above.
- If any table access is insert, update, or delete, treat the query as writable and force is_trusted=0.
- Reject writable user-created queries that access a database other than the database they are being saved against, until analyze_sql() can reliably map attached SQLite schemas back to Datasette database names.
- For every write access returned by analysis, require the corresponding permission on TableResource(access.database, access.table):
  - insert -> insert-row
  - update -> update-row
  - delete -> delete-row
- Include write accesses reported from triggers and views, since those are real side effects.
- Re-run the same analysis and permission checks when SQL changes through update_query() or POST .../-/update.
- Re-run analysis before executing user-created writable queries, so schema or trigger changes cannot leave a previously saved query with stale permission assumptions.
The user-facing API should not trust a submitted is_write value. It should derive is_write from analysis.
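For intuition about how authorizer-based analysis can derive is_write, here is a standalone sketch built on Python's sqlite3 module. It is not the Database.analyze_sql() implementation: find_writes, the regex parameter scan, and the use of EXPLAIN to compile without executing are all assumptions made for this example.

```python
import re
import sqlite3

WRITE_ACTIONS = {
    sqlite3.SQLITE_INSERT: "insert",
    sqlite3.SQLITE_UPDATE: "update",
    sqlite3.SQLITE_DELETE: "delete",
}

SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT, revenue REAL);
CREATE TABLE audit_log (id INTEGER PRIMARY KEY, note TEXT);
CREATE TRIGGER customers_audit AFTER INSERT ON customers
BEGIN
    INSERT INTO audit_log (note) VALUES ('added');
END;
"""


def find_writes(schema_sql, sql):
    """Return {(operation, table, source)} for writes reported while compiling sql."""
    writes = set()

    def authorizer(action, arg1, arg2, db_name, source):
        # For insert/update/delete, arg1 is the table name; source is the
        # trigger or view responsible, or None for the top-level statement.
        if action in WRITE_ACTIONS:
            writes.add((WRITE_ACTIONS[action], arg1, source))
        return sqlite3.SQLITE_OK

    # Naive named-parameter scan; it would also match ":x" inside string literals.
    params = {name: None for name in re.findall(r":([A-Za-z_]\w*)", sql)}
    conn = sqlite3.connect(":memory:")  # isolated, throwaway connection
    try:
        conn.executescript(schema_sql)
        conn.set_authorizer(authorizer)  # installed after the schema is built
        # EXPLAIN compiles the statement, which invokes the authorizer,
        # but returns bytecode rows instead of running the statement.
        conn.execute("EXPLAIN " + sql, params).fetchall()
    finally:
        conn.close()
    return writes


read_writes = find_writes(SCHEMA, "select * from customers where region = :region")
insert_writes = find_writes(SCHEMA, "insert into customers (region) values (:region)")
is_write = bool(insert_writes)  # derived from analysis, never from client input
print(sorted((op, table) for op, table, _ in insert_writes))
```

The insert is reported against both customers and audit_log, because the trigger's write is a real side effect of the statement. Invalid SQL raises sqlite3.Error, which a caller would treat as a rejection.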
Trusted configuration and plugin code can still call datasette.add_query(..., is_write=True, ...). Those are treated as deployment/admin-authored queries. They keep the existing execution model: they require view-query, and the default view-query hook should preserve current default-open behavior for trusted writable queries while still respecting --default-deny.
Fail closed cases for user-created writable queries:
- Analysis fails.
- Analysis reports any write operation that cannot be mapped to a Datasette table resource.
- Analysis reports writes outside the target database.
- The actor lacks any required table write permission.
- is_trusted=1 is requested through the user-facing API.
This gives us writable user-created queries without letting execute-sql alone become a path to create arbitrary write endpoints.
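A fail-closed permission check over analysis results might be shaped like the sketch below. The function name check_write_accesses and the plain-tuple representation of accesses and grants are hypothetical; the real check would go through Datasette's permission system and TableResource.

```python
PERMISSION_FOR_OPERATION = {
    "insert": "insert-row",
    "update": "update-row",
    "delete": "delete-row",
}


def check_write_accesses(target_database, accesses, allowed):
    """Return a list of problems; an empty list means every write is permitted.

    accesses: (operation, database, table) tuples for write operations.
    allowed: set of (action, database, table) tuples granted to the actor.
    """
    problems = []
    for operation, database, table in accesses:
        action = PERMISSION_FOR_OPERATION.get(operation)
        if action is None or not table:
            # Anything that cannot be mapped to a table permission is refused.
            problems.append(
                f"cannot map {operation!r} on {table!r} to a table permission"
            )
        elif database != target_database:
            problems.append(f"write to {database}.{table} is outside {target_database}")
        elif (action, database, table) not in allowed:
            problems.append(f"missing {action} on {database}.{table}")
    return problems


allowed = {("insert-row", "fixtures", "customers")}
accesses = [
    ("insert", "fixtures", "customers"),
    ("insert", "fixtures", "audit_log"),  # e.g. reported from a trigger
]
print(check_write_accesses("fixtures", accesses, allowed))
```

Returning the full list of problems, rather than stopping at the first, would also suit the analysis panel in the create UI.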
HTTP API sketch
JSON endpoints should follow Datasette's existing write API style: use POST plus action paths such as /-/insert, /-/update, and /-/delete, not HTTP PATCH or DELETE.
Endpoints:
- GET /-/queries and GET /{database}/-/queries show searchable HTML query browsers.
- GET /-/queries.json lists query definitions across every database the actor can view; GET /{database}/-/queries.json scopes that list to one database. Both JSON endpoints use cursor pagination with _next and _size.
- POST /{database}/-/queries/insert creates a query.
- GET /{database}/{query}/-/definition returns one query definition without executing it.
- POST /{database}/{query}/-/update updates one query.
- POST /{database}/{query}/-/delete deletes one query.
Create request:
{
    "query": {
        "name": "top_customers",
        "sql": "select * from customers order by revenue desc limit 20",
        "title": "Top customers",
        "description": "Highest revenue customers",
        "is_private": true,
        "parameters": ["region"]
    }
}
Successful create returns 201 and the created query definition:
{
    "ok": true,
    "query": {
        "database": "fixtures",
        "name": "top_customers",
        "sql": "select * from customers order by revenue desc limit 20",
        "title": "Top customers",
        "description": "Highest revenue customers",
        "is_private": true,
        "is_trusted": false,
        "parameters": ["region"]
    }
}
Update request, imitating RowUpdateView:
{
    "update": {
        "title": "Top customers by revenue",
        "is_private": false
    },
    "return": true
}
Successful update returns {"ok": true} by default. With "return": true, return the updated query definition:
{
    "ok": true,
    "query": {
        "database": "fixtures",
        "name": "top_customers",
        "sql": "select * from customers order by revenue desc limit 20",
        "title": "Top customers by revenue",
        "is_private": false,
        "is_trusted": false
    }
}
Delete request:
POST /{database}/{query}/-/delete
Content-Type: application/json
Successful delete returns:
{
    "ok": true
}
Validation:
- Update bodies must be dictionaries containing an update dictionary, with optional return; invalid keys return {"ok": false, "errors": [...]}.
- Validate route-safe query names.
- Reject names that collide with a table or view in the same database, since table routes currently win over query routes.
- Analyze user-created SQL with Database.analyze_sql().
- Use validate_sql_select(sql) as the read-only fast path when analysis shows only reads, but do not require it for writable queries that pass analysis and permission checks.
- Reject magic parameters such as :_actor_id, :_cookie_*, and :_header_* for user-created queries.
- Reject client-supplied is_write; derive it from analysis.
- Reject writable-only success/error fields for read-only queries.
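Some of these checks are simple enough to sketch for a create request. Everything here is illustrative: validate_create, the NAME_RE pattern for "route-safe", and the magic-parameter prefix list are assumptions, and the SQL analysis step is left out.

```python
import re

# Assumed definition of "route-safe": letters, digits, underscore, hyphen.
NAME_RE = re.compile(r"^[A-Za-z0-9_-]+$")
# Prefixes taken from the magic parameter examples listed above.
MAGIC_PREFIXES = ("_actor_", "_cookie_", "_header_")
# Fields the user-facing API derives or controls itself.
FORBIDDEN_KEYS = {"is_write", "is_trusted"}


def validate_create(payload, existing_tables):
    """Return a list of error strings for a create request body."""
    query = payload.get("query") if isinstance(payload, dict) else None
    if not isinstance(query, dict):
        return ['Body must be a dictionary containing a "query" dictionary']
    errors = []
    name = query.get("name") or ""
    if not NAME_RE.match(name):
        errors.append("Query name is not route-safe")
    elif name in existing_tables:
        errors.append(f"Name collides with table or view: {name}")
    for key in sorted(FORBIDDEN_KEYS & query.keys()):
        errors.append(f"{key} cannot be set by the client")
    # Naive scan: would also match ":x" inside string literals.
    for param in re.findall(r":(\w+)", query.get("sql") or ""):
        if param.startswith(MAGIC_PREFIXES):
            errors.append(f"Magic parameter not allowed: :{param}")
    return errors


body = {
    "query": {
        "name": "customers",
        "sql": "select * from customers where owner = :_actor_id",
        "is_write": True,
    }
}
errors = validate_create(body, existing_tables={"customers"})
print({"ok": False, "errors": errors})
```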
Python API sketch
Add methods on Datasette:
await datasette.add_query(
    database,
    name,
    sql,
    title=None,
    description=None,
    description_html=None,
    hide_sql=False,
    fragment=None,
    parameters=None,
    is_write=False,
    is_private=False,
    is_trusted=False,
    source="plugin",
    owner_id=None,
    on_success_message=None,
    on_success_message_sql=None,
    on_success_redirect=None,
    on_error_message=None,
    on_error_redirect=None,
    replace=True,
)
await datasette.update_query(
    database,
    name,
    *,
    sql=UNCHANGED,
    title=UNCHANGED,
    description=UNCHANGED,
    description_html=UNCHANGED,
    hide_sql=UNCHANGED,
    fragment=UNCHANGED,
    parameters=UNCHANGED,
    is_write=UNCHANGED,
    is_private=UNCHANGED,
    is_trusted=UNCHANGED,
    source=UNCHANGED,
    owner_id=UNCHANGED,
    on_success_message=UNCHANGED,
    on_success_message_sql=UNCHANGED,
    on_success_redirect=UNCHANGED,
    on_error_message=UNCHANGED,
    on_error_redirect=UNCHANGED,
)
await datasette.remove_query(database, name, source=None)
await datasette.get_query(database, name)
await datasette.list_queries(
    database,
    actor=None,
    limit=50,
    cursor=None,
    q=None,
    is_write=None,
    is_private=None,
    is_trusted=None,
    source=None,
    owner_id=None,
)
list_queries() should return a bounded page shaped like {"queries": [...], "next": "...", "has_more": true, "limit": 50}. The next value is an opaque cursor token, not an offset. Passing database=None lists visible queries across all live databases, still filtered through view-query permission SQL.
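One way to produce that page shape is keyset pagination on the primary key. The sketch below is synchronous, omits permission filtering and search, and encodes the cursor as base64 JSON; all of those are simplifications and assumptions, not the planned implementation.

```python
import base64
import json
import sqlite3


def encode_cursor(database_name, name):
    """Pack the last row's key into an opaque token."""
    raw = json.dumps([database_name, name]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(token):
    return json.loads(base64.urlsafe_b64decode(token.encode()))


def list_queries(conn, limit=50, cursor=None):
    sql = "SELECT database_name, name FROM queries"
    params = []
    if cursor:
        # Row-value comparison (SQLite 3.15+) resumes after the last key seen.
        sql += " WHERE (database_name, name) > (?, ?)"
        params.extend(decode_cursor(cursor))
    sql += " ORDER BY database_name, name LIMIT ?"
    params.append(limit + 1)  # fetch one extra row to detect a further page
    rows = conn.execute(sql, params).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    return {
        "queries": [{"database": d, "name": n} for d, n in rows],
        "next": encode_cursor(*rows[-1]) if has_more else None,
        "has_more": has_more,
        "limit": limit,
    }


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (database_name TEXT NOT NULL, name TEXT NOT NULL,"
    " PRIMARY KEY (database_name, name))"
)
conn.executemany(
    "INSERT INTO queries VALUES (?, ?)", [("fixtures", f"q{i}") for i in range(5)]
)
first = list_queries(conn, limit=2)
second = list_queries(conn, limit=2, cursor=first["next"])
third = list_queries(conn, limit=2, cursor=second["next"])
print([q["name"] for q in second["queries"]], third["has_more"])
```

Because the cursor carries a key rather than an offset, each page is served by a range scan on the primary key index, however deep the caller pages.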
update_query() should use an internal sentinel default such as UNCHANGED = object() so callers can distinguish "leave this column alone" from "set this column to NULL":
await datasette.update_query(
    "fixtures",
    "top_customers",
    on_success_redirect=None,
)
For column-backed fields, None should write SQL NULL. For option fields, None should remove that key from the JSON object so get_query() returns None; omitting the field should leave the existing option unchanged.
Implementation detail: build the UPDATE statement dynamically from fields whose value is not UNCHANGED, validate non-nullable fields before writing, and update updated_at whenever at least one field changes.
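A reduced, synchronous sketch of that approach follows. It covers only two column-backed fields and one option field, takes a sqlite3 connection directly, and raises KeyError for a missing row; those choices are assumptions for the example rather than the planned method signature.

```python
import json
import sqlite3

UNCHANGED = object()  # sentinel: "leave this field alone"
NOT_NULL_COLUMNS = {"sql"}


def update_query(conn, database, name, *, sql=UNCHANGED, title=UNCHANGED,
                 on_success_redirect=UNCHANGED):
    """Apply only the fields that were passed. Returns True if a write happened."""
    columns = {"sql": sql, "title": title}
    options = {"on_success_redirect": on_success_redirect}
    sets, params = [], []
    for column, value in columns.items():
        if value is UNCHANGED:
            continue
        if value is None and column in NOT_NULL_COLUMNS:
            raise ValueError(f"{column} cannot be NULL")
        sets.append(f"{column} = ?")  # None is written as SQL NULL
        params.append(value)
    changed = {k: v for k, v in options.items() if v is not UNCHANGED}
    if changed:
        row = conn.execute(
            "SELECT options FROM queries WHERE database_name = ? AND name = ?",
            (database, name),
        ).fetchone()
        if row is None:
            raise KeyError((database, name))
        current = json.loads(row[0])
        for key, value in changed.items():
            if value is None:
                current.pop(key, None)  # None removes the option key
            else:
                current[key] = value
        sets.append("options = ?")
        params.append(json.dumps(current))
    if not sets:
        return False  # nothing changed, so updated_at is left alone
    sets.append("updated_at = CURRENT_TIMESTAMP")
    conn.execute(
        f"UPDATE queries SET {', '.join(sets)} WHERE database_name = ? AND name = ?",
        params + [database, name],
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (database_name TEXT NOT NULL, name TEXT NOT NULL,"
    " sql TEXT NOT NULL, title TEXT, options TEXT NOT NULL DEFAULT '{}',"
    " updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,"
    " PRIMARY KEY (database_name, name))"
)
conn.execute(
    "INSERT INTO queries (database_name, name, sql, title, options)"
    " VALUES ('fixtures', 'top_customers', 'select 1', 'Old title',"
    " '{\"on_success_redirect\": \"/done\"}')"
)
update_query(conn, "fixtures", "top_customers", title=None, on_success_redirect=None)
print(conn.execute("SELECT sql, title, options FROM queries").fetchone())
```

The column names interpolated into the UPDATE come from a fixed dictionary in the function, never from caller input, so the dynamic statement does not open an injection path.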
The read methods should reconstruct the existing dictionary shape used by query execution and templates, with name, sql, display fields, write fields, params, is_private, is_trusted, owner_id, and source. parameters should be returned as the decoded JSON array and exposed as params where existing query execution code expects that key. Option values should be unpacked from the options JSON object and returned as the same top-level keys accepted by add_query() and update_query().
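As an illustration of that reconstruction, the sketch below turns a stored row (represented here as a plain dict) into the dictionary shape described. The helper name row_to_query is hypothetical, and the option key list is copied from the column notes.

```python
import json

OPTION_KEYS = (
    "hide_sql",
    "fragment",
    "on_success_message",
    "on_success_message_sql",
    "on_success_redirect",
    "on_error_message",
    "on_error_redirect",
)
FLAG_COLUMNS = ("is_write", "is_private", "is_trusted")


def row_to_query(row):
    """Rebuild the query dictionary from a stored row."""
    query = dict(row)
    query["parameters"] = json.loads(query["parameters"])
    # Existing execution code expects the list under "params".
    query["params"] = query["parameters"]
    options = json.loads(query.pop("options"))
    for key in OPTION_KEYS:
        query[key] = options.get(key)  # absent options come back as None
    for flag in FLAG_COLUMNS:
        query[flag] = bool(query[flag])  # stored as 0/1 integers
    return query


row = {
    "database_name": "fixtures",
    "name": "top_customers",
    "sql": "select * from customers where region = :region",
    "title": "Top customers",
    "options": '{"hide_sql": true}',
    "parameters": '["region"]',
    "is_write": 0,
    "is_private": 1,
    "is_trusted": 0,
    "source": "user",
    "owner_id": "alice",
}
query = row_to_query(row)
print(query["params"], query["hide_sql"], query["on_success_redirect"])
```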
Query page save UI
On /{database}/-/query, if the actor has both execute-sql and insert-query, show a save control for valid read-only SQL. That page already executes read-only arbitrary SQL, so the first UI can stay read-only even though the JSON API can accept writable SQL after Database.analyze_sql() validation.
The save form should call POST /{database}/-/queries/insert and default to is_private=true.
On /{database}, show a preview of the first 5 visible queries using list_queries(..., limit=5). If the page has has_more, show a link to /{database}/-/queries rather than rendering hundreds or thousands of query links inline. The full /{database}/-/queries page provides search, filters, and cursor pagination. The global /-/queries page reuses the same interface and shows the database for each query.
Dedicated create query UI
Add /{database}/-/queries/-/create for the fuller query authoring flow, including writable queries.
This page should require execute-sql and insert-query to access. It should provide a SQL editor and a mode control:
- Read-only
- Writable
Read-only mode can share the same fields as the arbitrary SQL save flow: name, title, description, parameters, and privacy status.
Writable mode should always run Database.analyze_sql() and show an analysis panel before saving:
- detected operation
- database and table
- required permission
- whether the actor has that permission
- source, when the operation comes from a trigger or view
The Save button should be disabled until analysis succeeds and every required table write permission is allowed.
The existing edit-SQL flow from query pages can continue to point back to arbitrary SQL. A later enhancement can add "update this query" when the actor owns it or has update-query.
Test plan
- Internal schema creates queries.
- Query parameters are stored in the queries.parameters text column as a JSON array of names.
- Config queries: blocks import into internal tables.
- Legacy string query definitions normalize to SQL rows.
- The old canned_queries() hook is no longer called by core.
- QueryResource.resources_sql() returns rows from queries.
- Database page and /-/jump list queries from the internal DB.
- view-query remains globally default-allowed, with restriction_sql narrowing private queries to their owner.
- Private query is only visible to its owner, even when a broader view-query rule applies.
- Non-trusted read-only query requires execute-sql to execute.
- Trusted read-only query can be executed without execute-sql after view-query passes.
- Config queries default to trusted and can opt out with is_trusted: false.
- User API rejects client-supplied is_trusted.
- User-created query requires both execute-sql and insert-query.
- User-created writable query creation uses Database.analyze_sql() and requires matching insert-row, update-row, and/or delete-row permissions for every reported write access.
- /{database}/-/queries/-/create provides the writable-query authoring UI with an analysis panel and disabled save until all required write permissions pass.
- User-created writable query execution re-runs Database.analyze_sql() and re-checks table write permissions.
- User-created writable query cannot be trusted through the user API.
- Query update uses POST /{database}/{query}/-/update with an {"update": {...}} body.
- Query delete uses POST /{database}/{query}/-/delete.
- There are no PATCH or HTTP DELETE routes for query management.
- datasette.update_query(..., field=None) writes NULL for column-backed fields and removes JSON keys for option fields, while omitted fields are left unchanged.
- Owner gets default update-query and delete-query for their own user-created rows.
- Admin can manage other users' queries with update-query and delete-query.
- User API rejects magic parameters.
- User API rejects writable queries if analysis fails, reports writes outside the target database, or reports writes the actor is not allowed to perform.
- Trusted config/plugin writable queries still execute through view-query.
- Trusted config/plugin writable queries are not default-allowed under --default-deny.
- Persisted internal DB does not expose queries for detached databases.