Tilde encoding (#1659)

Closes #1657

Refs #1439
This commit is contained in:
Simon Willison 2022-03-15 11:01:57 -07:00 committed by GitHub
commit a35393b29c
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
14 changed files with 125 additions and 141 deletions

View file

@ -59,21 +59,3 @@ truncation error message.
You can increase or remove this limit using the :ref:`setting_max_csv_mb` config
setting. You can also disable the CSV export feature entirely using
:ref:`setting_allow_csv_stream`.
A note on URLs
--------------
The default URL for the CSV representation of a table is that table with
``.csv`` appended to it:
* https://latest.datasette.io/fixtures/facetable - HTML interface
* https://latest.datasette.io/fixtures/facetable.csv - CSV export
* https://latest.datasette.io/fixtures/facetable.json - JSON API
This pattern doesn't work for tables with names that already end in ``.csv`` or
``.json``. For those tables, you can instead use the ``_format=`` query string
parameter:
* https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv - HTML interface
* https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv?_format=csv - CSV export
* https://latest.datasette.io/fixtures/table%2Fwith%2Fslashes.csv?_format=json - JSON API

View file

@ -545,7 +545,7 @@ These functions can be accessed via the ``{{ urls }}`` object in Datasette templ
<a href="{{ urls.table("fixtures", "facetable") }}">facetable table</a>
<a href="{{ urls.query("fixtures", "pragma_cache_size") }}">pragma_cache_size query</a>
Use the ``format="json"`` (or ``"csv"`` or other formats supported by plugins) arguments to get back URLs to the JSON representation. This is usually the path with ``.json`` added on the end, but it may use ``?_format=json`` in cases where the path already includes ``.json``, for example a URL to a table named ``table.json``.
Use the ``format="json"`` (or ``"csv"`` or other formats supported by plugins) arguments to get back URLs to the JSON representation. This is the path with ``.json`` added on the end.
These methods each return a ``datasette.utils.PrefixedUrlString`` object, which is a subclass of the Python ``str`` type. This allows the logic that considers the ``base_url`` setting to detect if that prefix has already been applied to the path.
@ -876,31 +876,31 @@ Utility function for calling ``await`` on a return value if it is awaitable, oth
.. autofunction:: datasette.utils.await_me_maybe
.. _internals_dash_encoding:
.. _internals_tilde_encoding:
Dash encoding
-------------
Tilde encoding
--------------
Datasette uses a custom encoding scheme in some places, called **dash encoding**. This is primarily used for table names and row primary keys, to avoid any confusion between ``/`` characters in those values and the Datasette URLs that reference them.
Datasette uses a custom encoding scheme in some places, called **tilde encoding**. This is primarily used for table names and row primary keys, to avoid any confusion between ``/`` characters in those values and the Datasette URLs that reference them.
Dash encoding uses the same algorithm as `URL percent-encoding <https://developer.mozilla.org/en-US/docs/Glossary/percent-encoding>`__, but with the ``-`` hyphen character used in place of ``%``.
Tilde encoding uses the same algorithm as `URL percent-encoding <https://developer.mozilla.org/en-US/docs/Glossary/percent-encoding>`__, but with the ``~`` tilde character used in place of ``%``.
Any character other than ``ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789_`` will be replaced by the numeric equivalent preceded by a hyphen. For example:
Any character other than ``ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789_-`` will be replaced by the numeric equivalent preceded by a tilde. For example:
- ``/`` becomes ``-2F``
- ``.`` becomes ``-2E``
- ``%`` becomes ``-25``
- ``-`` becomes ``-2D``
- Space character becomes ``-20``
- ``polls/2022.primary`` becomes ``polls-2F2022-2Eprimary``
- ``/`` becomes ``~2F``
- ``.`` becomes ``~2E``
- ``%`` becomes ``~25``
- ``~`` becomes ``~7E``
- Space character becomes ``~20``
- ``polls/2022.primary`` becomes ``polls~2F2022~2Eprimary``
.. _internals_utils_dash_encode:
.. _internals_utils_tilde_encode:
.. autofunction:: datasette.utils.dash_encode
.. autofunction:: datasette.utils.tilde_encode
.. _internals_utils_dash_decode:
.. _internals_utils_tilde_decode:
.. autofunction:: datasette.utils.dash_decode
.. autofunction:: datasette.utils.tilde_decode
.. _internals_tracer: