Queries in the internal database
Plan for https://github.com/simonw/datasette/issues/2735.
Goal
Move named query definitions into Datasette's internal database, so hundreds or thousands of queries can be listed, searched, permission-filtered, managed, and executed efficiently.
Terminology change: these are now "queries", not "canned queries". Legacy code and documentation can mention the old name only when describing compatibility or migration.
Decisions so far
- Internal table name: queries.
- Query definitions should use real columns, not a JSON blob for all options.
- Query parameter names live in a parameters text column as a JSON array. No default values for parameters in this pass.
- No queries_database_published_idx index.
- User-created queries require execute-sql and insert-query on the database. Writable queries additionally require matching table write permissions discovered by Database.analyze_sql().
- publish-query is the permission for creating or updating a query so users without execute-sql can execute it.
- Add update-query and delete-query, so administrators can manage queries created by other users.
- Remove the old canned_queries() hook from core. If we want compatibility later, build a separate datasette-old-canned-queries plugin.
- Writable user-created queries can be supported using Database.analyze_sql(), provided we fail closed when analysis cannot prove the required permissions.
Current shape
- Query definitions currently come from datasette.yaml or the canned_queries() plugin hook.
- Datasette.get_canned_queries(database_name, actor) calls that hook every time it needs query definitions.
- QueryResource.resources_sql() currently enumerates databases and calls the hook for each one, because permissions and /-/jump need query resources.
- Query pages execute if the actor has view-query for QueryResource(database, query).
- Arbitrary SQL executes if the actor has execute-sql for DatabaseResource(database).
The main performance and architecture win is making query resource enumeration a direct SQL query against the internal database.
Proposed internal schema
Start with one queries table.
CREATE TABLE IF NOT EXISTS queries (
    database_name TEXT NOT NULL,
    name TEXT NOT NULL,
    sql TEXT NOT NULL,
    title TEXT,
    description TEXT,
    description_html TEXT,
    hide_sql INTEGER NOT NULL DEFAULT 0 CHECK (hide_sql IN (0, 1)),
    fragment TEXT,
    parameters TEXT NOT NULL DEFAULT '[]',
    is_write INTEGER NOT NULL DEFAULT 0 CHECK (is_write IN (0, 1)),
    published INTEGER NOT NULL DEFAULT 0 CHECK (published IN (0, 1)),
    source TEXT NOT NULL DEFAULT 'user',
    owner_id TEXT,
    on_success_message TEXT,
    on_success_message_sql TEXT,
    on_success_redirect TEXT,
    on_error_message TEXT,
    on_error_redirect TEXT,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (database_name, name),
    CHECK (is_write = 0 OR published = 0)
);
CREATE INDEX IF NOT EXISTS queries_owner_idx
    ON queries(owner_id);
Column notes:
- database_name, name, and sql are the routing and execution core.
- Display fields become columns: title, description, description_html, hide_sql, and fragment.
- parameters is a JSON array of parameter names, stored as text. This preserves explicit parameter order, but does not support labels or default values.
- Existing writable query behavior gets columns too: is_write, success/error messages, success/error redirects, and on_success_message_sql.
- published only applies to read-only queries. A writable query can still be public through explicit view-query permissions, but the "publish for users without execute-sql" shortcut should be read-only.
- source distinguishes user, config, and plugin rows.
- owner_id is the actor id for user-created rows. It is NULL for config/plugin rows.
No separate index is needed on (database_name, name) because the primary key already creates one. Do not add a queries_database_published_idx index for now.
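As a quick check of the table-level constraint, the sketch below uses Python's standard sqlite3 module with a cut-down version of the schema (only the columns needed here, which is an assumption for brevity) to show that SQLite itself refuses a row that is both writable and published.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down version of the proposed table: only the columns this demo needs.
conn.execute(
    """
    CREATE TABLE queries (
        database_name TEXT NOT NULL,
        name TEXT NOT NULL,
        sql TEXT NOT NULL,
        is_write INTEGER NOT NULL DEFAULT 0 CHECK (is_write IN (0, 1)),
        published INTEGER NOT NULL DEFAULT 0 CHECK (published IN (0, 1)),
        PRIMARY KEY (database_name, name),
        CHECK (is_write = 0 OR published = 0)
    )
    """
)


def try_insert(name, is_write, published):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO queries (database_name, name, sql, is_write, published) "
            "VALUES ('fixtures', ?, 'select 1', ?, ?)",
            (name, is_write, published),
        )
        return True
    except sqlite3.IntegrityError:
        return False


outcomes = {
    "published_read_only": try_insert("a", 0, 1),
    "unpublished_writable": try_insert("b", 1, 0),
    "published_writable": try_insert("c", 1, 1),
}
print(outcomes)
```

The constraint is a backstop: the API layer should still reject the combination with a readable error before it reaches the database.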
QueryResource.resources_sql() can become:
SELECT q.database_name AS parent, q.name AS child
FROM queries q
JOIN catalog_databases cd ON cd.database_name = q.database_name
The join keeps persisted queries for detached databases from appearing as live resources.
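To illustrate the effect of that join, here is a small sqlite3 sketch. The two tables are reduced to the columns the join touches, and the old_db and stale_report names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Reduced stand-ins for the internal tables (assumed shapes).
    CREATE TABLE catalog_databases (database_name TEXT PRIMARY KEY);
    CREATE TABLE queries (
        database_name TEXT NOT NULL,
        name TEXT NOT NULL,
        sql TEXT NOT NULL,
        PRIMARY KEY (database_name, name)
    );
    -- Only 'fixtures' is currently attached.
    INSERT INTO catalog_databases VALUES ('fixtures');
    INSERT INTO queries VALUES ('fixtures', 'top_customers', 'select 1');
    -- Persisted row for a database that is no longer attached.
    INSERT INTO queries VALUES ('old_db', 'stale_report', 'select 2');
    """
)

RESOURCES_SQL = """
SELECT q.database_name AS parent, q.name AS child
FROM queries q
JOIN catalog_databases cd ON cd.database_name = q.database_name
ORDER BY parent, child
"""

resources = conn.execute(RESOURCES_SQL).fetchall()
print(resources)
```

The stale_report row stays in the table, so it reappears as a resource if old_db is attached again.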
Config and plugin migration
datasette.yaml can continue to support databases: {db}: queries: blocks, but core should import them directly into the internal queries table at startup:
- Ensure the internal schema exists.
- Delete previous source='config' rows.
- Read configured query blocks for each live database.
- Normalize string definitions to {"sql": ...}.
- Insert rows into queries, storing explicit params as JSON in parameters.
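The normalization steps above could look roughly like this. It is a sketch only: config_query_rows is a hypothetical helper, it copies just a few columns, and it assumes the existing config keys params and write map onto parameters and is_write.

```python
import json


def config_query_rows(config, live_databases):
    """Turn databases -> {db} -> queries config blocks into row dicts.

    Hypothetical helper: only a subset of the proposed columns is filled in.
    """
    rows = []
    for database_name, db_config in (config.get("databases") or {}).items():
        if database_name not in live_databases:
            continue  # only import queries for databases that are attached
        for name, definition in (db_config.get("queries") or {}).items():
            if isinstance(definition, str):
                # Legacy shorthand: a bare SQL string.
                definition = {"sql": definition}
            rows.append(
                {
                    "database_name": database_name,
                    "name": name,
                    "sql": definition["sql"],
                    "title": definition.get("title"),
                    "parameters": json.dumps(definition.get("params") or []),
                    "is_write": int(bool(definition.get("write"))),
                    "source": "config",
                }
            )
    return rows


example_config = {
    "databases": {
        "fixtures": {
            "queries": {
                "simple": "select 1",
                "by_region": {
                    "sql": "select * from customers where region = :region",
                    "params": ["region"],
                    "title": "By region",
                },
            }
        },
        "gone": {"queries": {"x": "select 2"}},
    }
}
rows = config_query_rows(example_config, live_databases={"fixtures"})
```

Because previous source='config' rows are deleted first, re-running the import on every startup keeps the table in step with the config file.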
Plugins should move to:
await datasette.add_query(...)
await datasette.remove_query(...)
Remove the old canned_queries() hookspec and all core calls to it. If compatibility is needed, build datasette-old-canned-queries later as a plugin that restores the hook and imports old hook results using datasette.add_query().
Permission model
Add core actions:
- insert-query, database-level, for creating queries in a database.
- publish-query, database-level, for marking read-only queries as executable by actors who lack execute-sql.
- update-query, query-level, for modifying existing query definitions.
- delete-query, query-level, for deleting existing query definitions.
User-created query creation requires:
- execute-sql on DatabaseResource(database)
- insert-query on DatabaseResource(database)
- If analysis shows the query is writable, the table-level write permissions described in the writable query section.
Setting published=1 requires:
- publish-query on DatabaseResource(database)
- The query must be read-only according to Database.analyze_sql().
Updating an existing query requires:
- update-query on QueryResource(database, query) or default owner permission for a user-owned row.
- If the SQL changes, also require execute-sql on the database.
- If the changed SQL is writable, also require the table-level write permissions described in the writable query section.
- If published changes from 0 to 1, also require publish-query on the database.
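One way to read the update rules is as a function from the old and new definitions to a checklist of permissions. The sketch below is illustrative: actions_required_for_update and its (action, resource) tuples are hypothetical, and the owner case is modelled as the owner already holding update-query through the default grant.

```python
def actions_required_for_update(old, new, table_write_actions=()):
    """Return the (action, resource) pairs an update must pass.

    old/new are dicts with "sql" and "published" keys.
    table_write_actions is an iterable of (action, table) pairs already
    derived from analysis of the new SQL (empty for read-only SQL).
    """
    required = [("update-query", "query")]
    if new["sql"] != old["sql"]:
        required.append(("execute-sql", "database"))
        required.extend(
            (action, f"table:{table}") for action, table in table_write_actions
        )
    if not old["published"] and new["published"]:
        required.append(("publish-query", "database"))
    return required


old = {"sql": "select 1", "published": 0}
title_only = actions_required_for_update(old, dict(old))
publishing = actions_required_for_update(old, {"sql": "select 1", "published": 1})
new_sql = actions_required_for_update(
    old,
    {"sql": "update customers set name = :name", "published": 0},
    table_write_actions=[("update-row", "customers")],
)
```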
Deleting an existing query requires:
- delete-query on QueryResource(database, query) or default owner permission for a user-owned row.
Default owner permissions:
- For source='user' AND owner_id = actor.id, grant update-query and delete-query.
- Do not automatically grant execution if the user no longer has the execution permission described below.
Executing queries
Default execution rule for read-only queries:
- If published=0, the actor needs execute-sql on the database.
- If published=1, the actor can execute the query without execute-sql.
Default execution rule for user-created writable queries:
- published must be 0.
- The actor must have view-query.
- The actor must currently have every write permission required by fresh Database.analyze_sql() results for the query SQL.
Implementation:
- Remove view-query from the broad DEFAULT_ALLOW_ACTIONS set.
- Replace it with query-aware default view-query permission SQL.
- For published=1 AND is_write=0, emit a child-level view-query allow.
- For published=0 AND is_write=0, emit child-level view-query allows for queries whose parent database is in the actor's execute-sql allowed resources.
- For is_write=1 AND source='user', emit view-query only for the owner or actors with explicit view-query permission, then have QueryView perform the fresh analysis/table-permission check before execution.
- For trusted writable queries, preserve current behavior by emitting child-level view-query allows for is_write=1 AND source IN ('config', 'plugin') when Datasette is not running with --default-deny.
For read-only queries this keeps QueryView simple: it checks view-query for the query resource, and the default permission hook encodes the relationship with execute-sql. User-created writable queries need one additional runtime permission check because their required table permissions are derived from fresh SQL analysis.
Explicit deny rules should still be able to block a published query.
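The default rules in this section can be summarised as a single decision function. This is a plain-Python sketch rather than the permission SQL the plan calls for; default_view_query is a hypothetical name, and explicit allow or deny rules are deliberately left out.

```python
def default_view_query(query, *, actor_id, can_execute_sql, default_deny=False):
    """Default view-query decision for one query row, before explicit rules.

    query is a dict with is_write, published, source and owner_id keys.
    """
    if query["is_write"]:
        if query["source"] == "user":
            # Owner only by default; QueryView still re-runs the analysis
            # and table-permission check before executing.
            return actor_id is not None and query["owner_id"] == actor_id
        # Trusted config/plugin writable queries keep default-open behavior.
        return not default_deny
    if query["published"]:
        return True
    return can_execute_sql


published = {"is_write": 0, "published": 1, "source": "user", "owner_id": "alice"}
unpublished = {"is_write": 0, "published": 0, "source": "user", "owner_id": "alice"}
user_write = {"is_write": 1, "published": 0, "source": "user", "owner_id": "alice"}
config_write = {"is_write": 1, "published": 0, "source": "config", "owner_id": None}

decisions = [
    default_view_query(published, actor_id=None, can_execute_sql=False),
    default_view_query(unpublished, actor_id="bob", can_execute_sql=False),
    default_view_query(unpublished, actor_id="bob", can_execute_sql=True),
    default_view_query(user_write, actor_id="alice", can_execute_sql=True),
    default_view_query(user_write, actor_id="bob", can_execute_sql=True),
    default_view_query(config_write, actor_id=None, can_execute_sql=False),
    default_view_query(
        config_write, actor_id=None, can_execute_sql=False, default_deny=True
    ),
]
```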
Writable queries
Writable user-created queries should be in scope, guarded by Database.analyze_sql().
The secure rule: a user can create, update, or execute a writable user-created query only if they currently have the corresponding write permissions for every table the SQL can affect.
Database.analyze_sql(sql, params=None) runs the SQL through SQLite's authorizer on an isolated connection and returns a SQLAnalysis object containing SQLTableAccess rows:
- operation: read, insert, update, or delete
- database: Datasette database name for main, or SQLite schema name where no Datasette mapping exists
- table: affected table or view
- columns: read/updated columns where SQLite reports them
- source: trigger/view/CTE source when SQLite reports one
Validation flow for user-created queries:
- Derive named parameters from the SQL and pass harmless placeholder values into db.analyze_sql() so SQLite can prepare statements with bindings.
- If analysis raises a SQLite error, reject the query.
- If every table access is read, treat the query as read-only and require execute-sql plus insert-query/update-query as described above.
- If any table access is insert, update, or delete, treat the query as writable and force published=0.
- Reject writable user-created queries that access a database other than the database they are being saved against, until analyze_sql() can reliably map attached SQLite schemas back to Datasette database names.
- For every write access returned by analysis, require the corresponding permission on TableResource(access.database, access.table):
  - insert -> insert-row
  - update -> update-row
  - delete -> delete-row
- Include write accesses reported from triggers and views, since those are real side effects.
- Re-run the same analysis and permission checks when SQL changes through update_query() or POST .../-/update.
- Re-run analysis before executing user-created writable queries, so schema or trigger changes cannot leave a previously saved query with stale permission assumptions.
The user-facing API should not trust a submitted is_write value. It should derive is_write from analysis.
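For intuition about how is_write can be derived rather than trusted, the sketch below uses the standard library's sqlite3.Connection.set_authorizer directly. It does not call Database.analyze_sql(), and collect_write_accesses is a hypothetical helper. It prefixes the statement with EXPLAIN so SQLite compiles it, which is when the authorizer fires, without running the write.

```python
import sqlite3

# SQLite authorizer action codes that represent writes.
WRITE_OPERATIONS = {
    sqlite3.SQLITE_INSERT: "insert",
    sqlite3.SQLITE_UPDATE: "update",
    sqlite3.SQLITE_DELETE: "delete",
}


def collect_write_accesses(conn, sql, params):
    """Return a set of (operation, table) pairs the statement would write.

    conn should be an isolated connection: the authorizer stays installed.
    Raises sqlite3.Error if the SQL cannot be prepared.
    """
    accesses = set()

    def authorizer(action, arg1, arg2, db_name, source):
        operation = WRITE_OPERATIONS.get(action)
        if operation is not None:
            accesses.add((operation, arg1))  # arg1 is the table name
        return sqlite3.SQLITE_OK

    conn.set_authorizer(authorizer)
    # EXPLAIN compiles the statement but lists bytecode instead of running it.
    conn.execute("EXPLAIN " + sql, params)
    return accesses


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE audit_log (id INTEGER PRIMARY KEY, note TEXT);
    CREATE TRIGGER customers_audit AFTER UPDATE ON customers
    BEGIN
        INSERT INTO audit_log (note) VALUES ('customer updated');
    END;
    """
)

placeholders = {"name": None, "id": None}  # harmless placeholder bindings
update_accesses = collect_write_accesses(
    conn, "update customers set name = :name where id = :id", placeholders
)
select_accesses = collect_write_accesses(
    conn, "select name from customers where id = :id", {"id": None}
)
is_write = bool(update_accesses)
```

The trigger's insert into audit_log shows up alongside the update, which is why writes reported from triggers have to be permission-checked too.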
Trusted configuration and plugin code can still call datasette.add_query(..., is_write=True, ...). Those are treated as deployment/admin-authored queries. They keep the existing execution model: they require view-query, and the default view-query hook should preserve current default-open behavior for trusted writable queries while still respecting --default-deny.
Fail closed cases for user-created writable queries:
- Analysis fails.
- Analysis reports any write operation that cannot be mapped to a Datasette table resource.
- Analysis reports writes outside the target database.
- The actor lacks any required table write permission.
- published=1 is requested.
This gives us writable user-created queries without letting execute-sql alone become a path to create arbitrary write endpoints.
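The fail-closed cases can be sketched as one function that returns a rejection reason, or None when the query may be saved. The function name, the tuple shapes, and the allowed set are all hypothetical; a failed analysis is assumed to be handled by the caller catching the SQLite error before this point.

```python
PERMISSION_FOR_OPERATION = {
    "insert": "insert-row",
    "update": "update-row",
    "delete": "delete-row",
}


def rejection_reason(
    accesses, target_database, known_tables, allowed, publish_requested=False
):
    """Return why a user-created query must be rejected, or None if it passes.

    accesses: iterable of (operation, database, table) from SQL analysis.
    known_tables: table names in target_database that map to table resources.
    allowed: set of (action, database, table) the actor currently holds.
    """
    writes = [access for access in accesses if access[0] != "read"]
    if not writes:
        return None  # read-only: nothing further to prove here
    if publish_requested:
        return "writable queries cannot be published"
    for operation, database, table in writes:
        action = PERMISSION_FOR_OPERATION.get(operation)
        if action is None:
            return f"unrecognised operation: {operation}"
        if database != target_database:
            return f"write outside target database: {database}"
        if table not in known_tables:
            return f"write to unmapped table: {table}"
        if (action, database, table) not in allowed:
            return f"missing {action} on {table}"
    return None


accesses = [
    ("read", "fixtures", "customers"),
    ("update", "fixtures", "customers"),
    ("insert", "fixtures", "audit_log"),
]
tables = {"customers", "audit_log"}
full = {
    ("update-row", "fixtures", "customers"),
    ("insert-row", "fixtures", "audit_log"),
}
partial = {("update-row", "fixtures", "customers")}

ok = rejection_reason(accesses, "fixtures", tables, full)
missing = rejection_reason(accesses, "fixtures", tables, partial)
published = rejection_reason(accesses, "fixtures", tables, full, publish_requested=True)
foreign = rejection_reason([("delete", "other", "t")], "fixtures", tables, full)
```

Every unrecognised situation returns a reason instead of falling through, so new kinds of analysis output are rejected until they are handled explicitly.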
HTTP API sketch
JSON endpoints should follow Datasette's existing write API style: use POST plus action paths such as /-/insert, /-/update, and /-/delete, not HTTP PATCH or DELETE.
Endpoints:
- GET /{database}/-/queries lists query definitions the actor can view or manage, probably paginated.
- POST /{database}/-/queries/-/insert creates a query.
- GET /{database}/{query}/-/definition returns one query definition without executing it.
- POST /{database}/{query}/-/update updates one query.
- POST /{database}/{query}/-/delete deletes one query.
Create request:
{
  "query": {
    "name": "top_customers",
    "sql": "select * from customers order by revenue desc limit 20",
    "title": "Top customers",
    "description": "Highest revenue customers",
    "published": false,
    "parameters": ["region"]
  }
}
Successful create returns 201 and the created query definition:
{
  "ok": true,
  "query": {
    "database": "fixtures",
    "name": "top_customers",
    "sql": "select * from customers order by revenue desc limit 20",
    "title": "Top customers",
    "description": "Highest revenue customers",
    "published": false,
    "parameters": ["region"]
  }
}
Update request, imitating RowUpdateView:
{
  "update": {
    "title": "Top customers by revenue",
    "published": true
  },
  "return": true
}
Successful update returns {"ok": true} by default. With "return": true, return the updated query definition:
{
  "ok": true,
  "query": {
    "database": "fixtures",
    "name": "top_customers",
    "sql": "select * from customers order by revenue desc limit 20",
    "title": "Top customers by revenue",
    "published": true
  }
}
Delete request:
POST /{database}/{query}/-/delete
Content-Type: application/json
Successful delete returns:
{
  "ok": true
}
Validation:
- Update bodies must be dictionaries containing an update dictionary, with optional return; invalid keys return {"ok": false, "errors": [...]}.
- Validate route-safe query names.
- Reject names that collide with a table or view in the same database, since table routes currently win over query routes.
- Analyze user-created SQL with Database.analyze_sql().
- Use validate_sql_select(sql) as the read-only fast path when analysis shows only reads, but do not require it for writable queries that pass analysis and permission checks.
- Reject magic parameters such as :_actor_id, :_cookie_*, and :_header_* for user-created queries.
- Reject client-supplied is_write; derive it from analysis.
- Reject writable-only success/error fields for read-only queries.
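The magic-parameter check could be approximated as below. Both helpers are hypothetical, and the rule "any parameter starting with an underscore is magic" is a conservative assumption rather than Datasette's actual definition. The regex is also naive: it does not skip string literals, so a literal such as 'a:b' would be misread as a parameter.

```python
import re

# A colon followed by an identifier, not preceded by another colon.
_named_parameter = re.compile(r"(?<!:):([a-zA-Z_][a-zA-Z0-9_]*)")


def named_parameters(sql):
    """Return named parameters in first-seen order, without duplicates."""
    seen = []
    for name in _named_parameter.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


def magic_parameters(sql):
    """Return parameters that look like magic parameters (assumed: _ prefix)."""
    return [name for name in named_parameters(sql) if name.startswith("_")]


plain = named_parameters(
    "select * from customers where region = :region "
    "and revenue > :min_revenue and home_region = :region"
)
magic = magic_parameters(
    "select :_actor_id, :_cookie_session from customers where region = :region"
)
errors = [f"Magic parameter not allowed: :{name}" for name in magic]
```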
Python API sketch
Add methods on Datasette:
await datasette.add_query(
    database,
    name,
    sql,
    title=None,
    description=None,
    description_html=None,
    hide_sql=False,
    fragment=None,
    parameters=None,
    is_write=False,
    published=False,
    source="plugin",
    owner_id=None,
    on_success_message=None,
    on_success_message_sql=None,
    on_success_redirect=None,
    on_error_message=None,
    on_error_redirect=None,
    replace=True,
)
await datasette.update_query(
    database,
    name,
    *,
    sql=UNCHANGED,
    title=UNCHANGED,
    description=UNCHANGED,
    description_html=UNCHANGED,
    hide_sql=UNCHANGED,
    fragment=UNCHANGED,
    parameters=UNCHANGED,
    is_write=UNCHANGED,
    published=UNCHANGED,
    source=UNCHANGED,
    owner_id=UNCHANGED,
    on_success_message=UNCHANGED,
    on_success_message_sql=UNCHANGED,
    on_success_redirect=UNCHANGED,
    on_error_message=UNCHANGED,
    on_error_redirect=UNCHANGED,
)
await datasette.remove_query(database, name, source=None)
await datasette.get_query(database, name)
await datasette.get_queries(database)
update_query() should use an internal sentinel default such as UNCHANGED = object() so callers can distinguish "leave this column alone" from "set this column to NULL":
await datasette.update_query(
    "fixtures",
    "top_customers",
    on_success_redirect=None,
)
That call should set on_success_redirect to SQL NULL; omitting on_success_redirect should leave the existing value unchanged.
Implementation detail: build the UPDATE statement dynamically from fields whose value is not UNCHANGED, validate non-nullable fields before writing, and update updated_at whenever at least one field changes.
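A synchronous sketch of that dynamic UPDATE, using sqlite3 directly instead of Datasette's internal database API. It covers only three columns, and this update_query is a stand-in for illustration, not the proposed method itself.

```python
import sqlite3

UNCHANGED = object()  # sentinel: "leave this column alone"
NOT_NULL_COLUMNS = {"sql"}


def update_query(
    conn, database, name, *, sql=UNCHANGED, title=UNCHANGED,
    on_success_redirect=UNCHANGED,
):
    """Update only the columns that were passed; return rows affected."""
    candidates = {
        "sql": sql,
        "title": title,
        "on_success_redirect": on_success_redirect,
    }
    changes = {c: v for c, v in candidates.items() if v is not UNCHANGED}
    if not changes:
        return 0  # nothing to write, so updated_at is left alone too
    for column, value in changes.items():
        if column in NOT_NULL_COLUMNS and value is None:
            raise ValueError(f"{column} cannot be NULL")
    # Column names come from the fixed dict above, never from user input.
    assignments = ", ".join(f"{column} = ?" for column in changes)
    cursor = conn.execute(
        f"UPDATE queries SET {assignments}, updated_at = CURRENT_TIMESTAMP "
        "WHERE database_name = ? AND name = ?",
        [*changes.values(), database, name],
    )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queries (database_name TEXT NOT NULL, name TEXT NOT NULL, "
    "sql TEXT NOT NULL, title TEXT, on_success_redirect TEXT, "
    "updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, "
    "PRIMARY KEY (database_name, name))"
)
conn.execute(
    "INSERT INTO queries (database_name, name, sql, title, on_success_redirect) "
    "VALUES ('fixtures', 'top_customers', 'select 1', 'Top customers', '/fixtures')"
)

# Explicit None writes NULL; title is omitted and so stays as it was.
update_query(conn, "fixtures", "top_customers", on_success_redirect=None)
row = conn.execute(
    "SELECT title, on_success_redirect FROM queries WHERE name = 'top_customers'"
).fetchone()
```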
The read methods should reconstruct the existing dictionary shape used by query execution and templates, with name, sql, display fields, write fields, params, published, owner_id, and source. parameters should be returned as the decoded JSON array and exposed as params where existing query execution code expects that key.
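A minimal sketch of that reconstruction for a single row, assuming rows are fetched with sqlite3.Row (or anything dict() accepts). row_to_query is a hypothetical helper, and converting the 0/1 flag columns to booleans is an extra assumption.

```python
import json


def row_to_query(row):
    """Convert a queries table row into the dictionary shape used by callers."""
    query = dict(row)
    parameters = json.loads(query.get("parameters") or "[]")
    query["parameters"] = parameters
    # Existing query execution code expects the decoded list under "params".
    query["params"] = parameters
    for flag in ("hide_sql", "is_write", "published"):
        if flag in query:
            query[flag] = bool(query[flag])
    return query


query = row_to_query(
    {
        "database_name": "fixtures",
        "name": "top_customers",
        "sql": "select * from customers where region = :region",
        "parameters": '["region"]',
        "published": 1,
        "is_write": 0,
        "source": "user",
        "owner_id": "alice",
    }
)
```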
Query page save UI
On /{database}/-/query, if the actor has both execute-sql and insert-query, show a save control for valid read-only SQL. That page already executes read-only arbitrary SQL, so the first UI can stay read-only even though the JSON API can accept writable SQL after Database.analyze_sql() validation.
The save form should call POST /{database}/-/queries/-/insert and default to published=false.
If the actor also has publish-query, include a publish control. The UI copy should make it clear that publishing allows people without arbitrary SQL permission to run this query.
Dedicated create query UI
Add /{database}/-/queries/-/create for the fuller query authoring flow, including writable queries.
This page should require execute-sql and insert-query to access. It should provide a SQL editor and a mode control:
- Read-only
- Writable
Read-only mode can share the same fields as the arbitrary SQL save flow: name, title, description, parameters, and optional published status if the actor has publish-query.
Writable mode should always run Database.analyze_sql() and show an analysis panel before saving:
- detected operation
- database and table
- required permission
- whether the actor has that permission
- source, when the operation comes from a trigger or view
The Save button should be disabled until analysis succeeds and every required table write permission is allowed. Writable mode should not show a publish control, because user-created writable queries cannot be published.
The existing edit-SQL flow from query pages can continue to point back to arbitrary SQL. A later enhancement can add "update this query" when the actor owns it or has update-query.
Test plan
- Internal schema creates queries.
- Query parameters are stored in the queries.parameters text column as a JSON array of names.
- Config queries: blocks import into the internal queries table.
- Legacy string query definitions normalize to SQL rows.
- The old canned_queries() hook is no longer called by core.
- QueryResource.resources_sql() returns rows from queries.
- Database page and /-/jump list queries from the internal DB.
- view-query is no longer globally default-allowed; default query permissions come from the query-aware hook.
- Unpublished read-only query requires execute-sql to execute.
- Published read-only query can be executed without execute-sql.
- Setting published=true requires publish-query.
- User-created query requires both execute-sql and insert-query.
- User-created writable query creation uses Database.analyze_sql() and requires matching insert-row, update-row, and/or delete-row permissions for every reported write access.
- /{database}/-/queries/-/create provides the writable-query authoring UI with an analysis panel and disabled save until all required write permissions pass.
- User-created writable query execution re-runs Database.analyze_sql() and re-checks table write permissions.
- User-created writable query cannot be published.
- Query update uses POST /{database}/{query}/-/update with an {"update": {...}} body.
- Query delete uses POST /{database}/{query}/-/delete.
- There are no PATCH or HTTP DELETE routes for query management.
- datasette.update_query(..., field=None) writes NULL, while omitted fields are left unchanged.
- Owner gets default update-query and delete-query for their own user-created rows.
- Admin can manage other users' queries with update-query and delete-query.
- User API rejects magic parameters.
- User API rejects writable queries if analysis fails, reports writes outside the target database, or reports writes the actor is not allowed to perform.
- Trusted config/plugin writable queries still execute through view-query.
- Trusted config/plugin writable queries are not default-allowed under --default-deny.
- Persisted internal DB does not expose queries for detached databases.