ASGI cannot differentiate between / and %2F in a URL, so we need an
alternative scheme for encoding the names of tables that contain special
characters such as /.
For background, see
https://github.com/django/asgiref/issues/51#issuecomment-450603464
Some examples:
"table/and/slashes" => "tableU+002FandU+002Fslashes"
"~table" => "U+007Etable"
"+bobcats!" => "U+002Bbobcats!"
"U+007Etable" => "UU+002B007Etable"
I've run the black code formatting tool against everything:
black tests datasette setup.py
I also added a new unit test, in tests/test_black.py, which will fail if the code does not
conform to black's exacting standards.
This unit test only runs on Python 3.6 or higher, because black itself doesn't run on 3.5.
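The actual contents of tests/test_black.py are not reproduced here, but the test can be approximated along these lines:

```python
import subprocess
import sys

import pytest

@pytest.mark.skipif(
    sys.version_info[:2] < (3, 6), reason="black requires Python 3.6 or higher"
)
def test_black():
    # black --check exits non-zero if any file would be reformatted
    result = subprocess.run(
        [sys.executable, "-m", "black", "--check", "tests", "datasette", "setup.py"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    assert result.returncode == 0, result.stdout.decode() + result.stderr.decode()
```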
Thanks @russss!
* Add register_output_renderer hook
This changeset refactors out the JSON renderer and then adds a hook and
dispatcher system to allow custom output renderers to be registered.
The CSV output renderer is untouched because supporting streaming
renderers through this system would be significantly more complex, and
probably not worthwhile.
We can't simply allow hooks to be called at request time because we need
a list of supported file extensions when the request is being routed in
order to resolve ambiguous database/table names. So, renderers need to
be registered at startup.
I've tried to make this API independent of Sanic's request/response
objects so that this can remain stable during the switch to ASGI. I'm
using dictionaries to keep it simple and to make adding additional
options in the future easy.
Fixes #440
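For illustration, a plugin using this hook looks roughly like the following; this is a sketch based on the changeset described above, so the dictionary keys and callback signature should be treated as approximate rather than the definitive API:

```python
from datasette import hookimpl

def render_test(args, data, view_name):
    # args: querystring arguments, data: the data that would otherwise have
    # been rendered as JSON, view_name: the view that produced it
    return {
        "body": "Hello from the {} view".format(view_name),
        "content_type": "text/plain; charset=utf-8",
        "status_code": 200,
    }

@hookimpl
def register_output_renderer(datasette):
    # Registered at startup so the .test extension can take part in routing
    return {"extension": "test", "callback": render_test}
```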
If an error occurred inside the `with` block, the progress handler (used to
enforce a time limit) was not being correctly cleared, resulting in
timeout errors potentially occurring during subsequent SQL queries.
The fix is described here: https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager
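The shape of the fix, as a sketch (the names and numbers here are illustrative, not the exact Datasette code): the cleanup moves into a finally: block so it always runs.

```python
import time
from contextlib import contextmanager

@contextmanager
def sqlite_timelimit(conn, ms):
    # Install a progress handler that aborts any query still running once
    # the deadline has passed
    deadline = time.monotonic() + ms / 1000

    def handler():
        # A non-zero return value tells SQLite to interrupt the query
        return 1 if time.monotonic() > deadline else 0

    conn.set_progress_handler(handler, 1000)
    try:
        yield
    finally:
        # Without try/finally, an exception inside the with block skipped
        # this line and the handler stayed attached to the connection,
        # causing spurious timeouts on later queries
        conn.set_progress_handler(None, 1000)
```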
Prior to this commit Datasette would calculate the content hash of every
database and redirect to a URL containing that hash, like so:
https://v0-27.datasette.io/fixtures => https://v0-27.datasette.io/fixtures-dd88475
This assumed that all databases were opened in immutable mode and were not
expected to change.
This will be changing as a result of #419, so this commit takes the first step
by changing the default behaviour. Datasette will now only redirect hash-free
URLs to their hashed equivalents under two circumstances:
* The new `hash_urls` config option is set to true (it defaults to false).
* The user passes `?_hash=1` in the URL.
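For example, using the standard --config name:value syntax, the old behaviour can be turned back on like this:
datasette fixtures.db --config hash_urls:1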
This change introduces a new plugin hook, publish_subcommand, which can be
used to implement new subcommands for the "datasette publish" command family.
I've used this new hook to refactor out the "publish now" and "publish heroku"
implementations into separate modules. I've also added unit tests for these
two publishers, mocking the subprocess.call and subprocess.check_output
functions.
As part of this, I introduced a mechanism for loading default plugins. These
are defined in the new "default_plugins" list inside datasette/app.py
Closes #217 (Plugin support for datasette publish)
Closes #348 (Unit tests for "datasette publish")
Refs #14, #59, #102, #103, #146, #236, #347
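To illustrate the publish_subcommand hook itself, a plugin adding a new publishing target might look roughly like this; the "mycloud" target and its behaviour are made up for the example:

```python
import click
from datasette import hookimpl

@hookimpl
def publish_subcommand(publish):
    # "publish" is the click command group behind "datasette publish";
    # this adds a hypothetical "mycloud" target to it
    @publish.command()
    @click.argument("files", type=click.Path(exists=True), nargs=-1)
    def mycloud(files):
        "Publish the given SQLite databases to the imaginary mycloud service"
        click.echo("Would publish: {}".format(", ".join(files)))
```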
* table.csv?_stream=1 to download all rows - refs #266
This option causes Datasette to serve ALL rows in the table, by internally
following the _next= pagination links and serving everything out as a stream.
Also added a new config option, allow_csv_stream, which can be used to disable
this feature.
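As a usage sketch (the URL below assumes a local instance serving the fixtures database), the full table can be saved with nothing more than the standard library:

```python
import shutil
import urllib.request

# _stream=1 asks Datasette to keep following its own _next= pagination
# internally and stream every row back in one response
url = "http://127.0.0.1:8001/fixtures/compound_three_primary_keys.csv?_stream=1"
with urllib.request.urlopen(url) as response, open("all_rows.csv", "wb") as fp:
    shutil.copyfileobj(response, fp)
```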
* New config option max_csv_mb limiting size of CSV export
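Both of these config options, allow_csv_stream and max_csv_mb, can be set at startup using the existing --config name:value mechanism, e.g.:
datasette mydatabase.db --config max_csv_mb:10 --config allow_csv_stream:off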
The new `--version-note` option is a relatively obscure command-line argument
that helps solve the problem of showing accurate version information in
deployed instances of Datasette, even if they were deployed directly from
source code.
You can pass `--version-note` to `datasette publish` and `datasette package`
and it will then in turn be passed to `datasette` when it starts:
datasette --version-note=hello fixtures.db
Now if you visit /-/versions.json you will see this:
{
    "datasette": {
        "note": "hello",
        "version": "0+unknown"
    },
    "python": {
        "full": "3.6.5 (default, Jun 6 2018, 19:19:24) \n[GCC 6.3.0 20170516]",
        "version": "3.6.5"
    },
    ...
}
I plan to use this in some Travis CI configuration, refs #313
These new querystring arguments can be used to request expanded foreign keys
in both JSON and CSV formats.
?_labels=on turns on expansions for ALL foreign key columns
?_label=COLUMN1&_label=COLUMN2 can be used to pick specific columns to expand
e.g. `Street_Tree_List.json?_label=qSpecies&_label=qLegalStatus`
{
    "rowid": 233,
    "TreeID": 121240,
    "qLegalStatus": {
        "value": 2,
        "label": "Private"
    },
    "qSpecies": {
        "value": 16,
        "label": "Sycamore"
    },
    "qAddress": "91 Commonwealth Ave",
    ...
}
The labels option also works for the HTML and CSV views.
HTML defaults to `?_labels=on`, so if you pass `?_labels=off` you can disable
foreign key expansion entirely - or you can use `?_label=COLUMN` to request
just specific columns.
If you expand labels on CSV you get additional columns in the output:
`/Street_Tree_List.csv?_label=qLegalStatus`
rowid,TreeID,qLegalStatus,qLegalStatus_label...
1,141565,1,Permitted Site...
2,232565,2,Undocumented...
I also refactored the existing foreign key expansion code.
Closes #233. Refs #266.
Tables and custom SQL query results can now be exported as CSV.
The easiest way to do this is to use the .csv extension, e.g.
/test_tables/facet_cities.csv
By default this is served as Content-Type: text/plain so you can see it in
your browser. If you want to download the file (using text/csv and with an
appropriate Content-Disposition: attachment header) you can do so like this:
/test_tables/facet_cities.csv?_dl=1
We link to the CSV and downloadable CSV URLs from the table and query pages.
The links use ?_size=max and so by default will return 1,000 rows.
Also fixes #303 - table names ending in .json or .csv are now detected and
URLs are generated that look like this instead:
/test_tables/table%2Fwith%2Fslashes.csv?_format=csv
The ?_format= option is available for everything else too, but we link to the
.csv / .json versions in most cases because they are aesthetically pleasing.
New command-line argument which causes SpatiaLite to be installed and
configured for the published Datasette.
datasette publish now --spatialite mydb.db
If the user requests some _facet= options that do not successfully execute
within the configured facet_time_limit_ms, we now show a warning message like this:
These facets timed out: rowid, Title
To build this I had to clean up our SQLite interruption logic. We now raise a
custom InterruptedError exception when SQLite cuts off a query because it
exceeded the time limit.
In implementing this I found and fixed a logic error where invalid SQL was
being generated in some cases for our faceting calculations, but the resulting
sqlite3.OperationalError had been incorrectly caught and treated as a
timeout.
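Roughly, the new logic distinguishes the two cases like this (a sketch, not the exact code; the function name is illustrative):

```python
import sqlite3

class InterruptedError(Exception):
    "Custom exception raised when SQLite stops a query for exceeding the time limit"

def execute_with_time_limit(conn, sql, params=None):
    try:
        return conn.execute(sql, params or [])
    except sqlite3.OperationalError as e:
        if e.args == ("interrupted",):
            # The progress handler aborted the query: this really is a timeout
            raise InterruptedError(e)
        # Anything else (such as invalid SQL generated by the faceting code)
        # should surface as a genuine error, not be swallowed as a timeout
        raise
```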
Refs #255. Closes #269.