Datasette previously only supported one type of faceting: exact column value counting.
With this change, faceting logic is extracted out into one or more separate classes which can implement other patterns of faceting - this is discussed in #427, but potential upcoming facet types include facet-by-date, facet-by-JSON-array, facet-by-many-2-many and more.
A new plugin hook, register_facet_classes, can be used by plugins to add in additional facet classes.
Each class must implement two methods: suggest(), which scans columns in the table to decide if they might be worth suggesting for faceting, and facet_results(), which executes the facet operation and returns results ready to be displayed in the UI.
Also introduced a mechanism whereby table counts are calculated against a time limit
but immutable databases have their table counts calculated on server startup.
Prior to this commit Datasette would calculate the content hash of every
database and redirect to a URL containing that hash, like so:
https://v0-27.datasette.io/fixtures => https://v0-27.datasette.io/fixtures-dd88475
This assumed that all databases were opened in immutable mode and were not
expected to change.
This will be changing as a result of #419 - so this commit takes the first step
in implementing that change by changing this default behaviour. Datasette will
now only redirect hash-free URLs under two circumstances:
* The new `hash_urls` config option is set to true (it defaults to false).
* The user passes `?_hash=1` in the URL
If you start Datasette with no files, it will connect to :memory: instead.
When starting it with files you can add --memory to also get a :memory: database.
* table.csv?_stream=1 to download all rows - refs #266
This option causes Datasette to serve ALL rows in the table, by internally
following the _next= pagination links and serving everything out as a stream.
Also added new config option, allow_csv_stream, which can be used to disable
this feature.
* New config option max_csv_mb limiting size of CSV export
These new querystring arguments can be used to request expanded foreign keys
in both JSON and CSV formats.
?_labels=on turns on expansions for ALL foreign key columns
?_label=COLUMN1&_label=COLUMN2 can be used to pick specific columns to expand
e.g. `Street_Tree_List.json?_label=qSpecies&_label=qLegalStatus`
{
"rowid": 233,
"TreeID": 121240,
"qLegalStatus": {
"value" 2,
"label": "Private"
}
"qSpecies": {
"value": 16,
"label": "Sycamore"
}
"qAddress": "91 Commonwealth Ave",
...
}
The labels option also works for the HTML and CSV views.
HTML defaults to `?_labels=on`, so if you pass `?_labels=off` you can disable
foreign key expansion entirely - or you can use `?_label=COLUMN` to request
just specific columns.
If you expand labels on CSV you get additional columns in the output:
`/Street_Tree_List.csv?_label=qLegalStatus`
rowid,TreeID,qLegalStatus,qLegalStatus_label...
1,141565,1,Permitted Site...
2,232565,2,Undocumented...
I also refactored the existing foreign key expansion code.
Closes#233. Refs #266.
Tables and custom SQL query results can now be exported as CSV.
The easiest way to do this is to use the .csv extension, e.g.
/test_tables/facet_cities.csv
By default this is served as Content-Type: text/plain so you can see it in
your browser. If you want to download the file (using text/csv and with an
appropriate Content-Disposition: attachment header) you can do so like this:
/test_tables/facet_cities.csv?_dl=1
We link to the CSV and downloadable CSV URLs from the table and query pages.
The links use ?_size=max and so by default will return 1,000 rows.
Also fixes#303 - table names ending in .json or .csv are now detected and
URLs are generated that look like this instead:
/test_tables/table%2Fwith%2Fslashes.csv?_format=csv
The ?_format= option is available for everything else too, but we link to the
.csv / .json versions in most cases because they are aesthetically pleasing.
https://github.com/pytest-dev/pytest/issues/1875 made it impossible to declare
a function as a fixture multiple times, which we were doing across different
modules. The fix was to move our @pytest.fixture calls into decorators in the
tests/fixtures.py module.
Removed the --page_size= argument to datasette serve in favour of:
datasette serve --config default_page_size:50 mydb.db
Added new help section:
$ datasette --help-config
Config options:
default_page_size Default page size for the table view
(default=100)
max_returned_rows Maximum rows that can be returned from a table
or custom query (default=1000)
sql_time_limit_ms Time limit for a SQL query in milliseconds
(default=1000)
default_facet_size Number of values to return for requested facets
(default=30)
facet_time_limit_ms Time limit for calculating a requested facet
(default=200)
facet_suggest_time_limit_ms Time limit for calculating a suggested facet
(default=50)