The `slugify()` function used by Pelican is in general very good at
coming up with something both readable and URL-safe. However, there are
a few specific cases where it causes conflicts. One that I've run into
is using the strings `C++` and `C` as tags, both of which transform to
the slug `c`. This commit adds an optional `SLUG_SUBSTITUTIONS` setting
which is a list of 2-tuples of substitutions to be carried out
case-insensitively just prior to stripping out non-alphanumeric
characters. This allows cases like `C++` to be transformed to `CPP` or
similar. This can also improve the readability of slugs.
If DELETE_OUTPUT_DIRECTORY is set to True, all files and directories are
deleted from the output directory. There are, however, several reasons
one might want to retain certain files/directories and avoid their
deletion from the output directory. One such use case is version control
system data: a versioned output directory can facilitate deployment via
Heroku and/or allow the user to easily revert to a prior version of the
site without having to rely on regeneration via Pelican.
This change introduces the OUTPUT_RETENTION setting, a tuple of
filenames that will be preserved when the clean_output_dir function in
pelican.utils is run. Setting OUTPUT_RETENTION = (".hg", ".git") would,
for example, prevent the relevant VCS data from being deleted when the
output directory is cleaned.
Support the forms listed by the W3C [1]. I also removed the
'%Y-%d-%m' form, which can be confused with the '%Y-%m-%d' ISO form.
The new ISO forms can use 'Z' to designate UTC or '[+-]HHMM' to
specify offsets from UTC. Other time zone designators are not
supported.
The '%z' directive has only been supported since Python 3.2 [2], so if
you're running Pelican on Python 2.7, you're stuck with 'Z' for UTC.
Conveniently, we get ValueErrors for both invalid directives and
data/format missmatches, so we don't need special handling for the 2.7
case inside get_date().
[1]: http://www.w3.org/TR/NOTE-datetime
[2]: http://bugs.python.org/issue6641
The old get_relative_path() implementation assumed os.sep == '/',
which doesn't hold on MS Windows. The new implementation uses
split_all() for a more general component count.
I added path_to_url(), because the:
'/'.join(split_all(path))
idiom was showing up in a number of cases, and it's easier to
understand what's going on when that reads:
path_to_url(path)
This will fix a number of places where I think paths and URLs were
conflated, and should improve MS Windows support.
I think the conversion from native paths to URLs is best put off until
we are actually trying to generate the URL. The old handling
(introduced in 2692586, Fixes#645 - Making cross-content linking
windows compatible, 2012-12-19) converted the path at StaticContent
initialization, which left you with a bogus StaticContent.src.
Once we drop the 'static' subdirectory, we will be able to drop the
`dest` and `url` parts from the StaticGenerator.generate_context()
handling, which will leave things looking a good deal cleaner than
they do now.
Making everything consistent is a bit awkward, since this is a
commonly used attribute, but I've done my best.
Reasons for not consolidating on `filename`:
* It is often used for the "basename" (last component in the path).
Using `source_path` makes it clear that this attribute can contain
multiple components.
Reasons for not consolidating on `filepath`:
* It is barely used in the Pelican source, and therefore easy to
change.
* `path` is more Pythonic. The only place `filepath` ever show up in
the documentation for `os`, `os.path`, and `shutil` is in the
`os.path.relpath` documentation [1].
Reasons for not consolidating on `path`:
* The Page elements have both a source (this attribute) and a
destination (.save_as). To avoid confusion for developers not aware
of this, make it painfully obvious that this attribute is for the
source. Explicit is better than implicit ;).
Where I was touching the line, I also updated the string formatting in
StaticGenerator.generate_output to use the forward compatible
'{}'.format() syntax.
[1]: http://docs.python.org/2/library/os.path.html#os.path.relpath
If the destination directory specified by FILES_TO_COPY does not
exist, site generation crashes with a CRITICAL error. This creates the
destination if it does not exist.
Used an exception so show error state.
Used a bool flag to make sure the error is only shown once PER error.
Updated tests to check for the correct Exception raised
Added Debug level logging for each deletion
Added error level logging when a delete has an exception
Built a test case for clean_output_directory in tests.utils.py
Updated `settings` in documentation to reflect new behavior
Removed the tip from the bottom of getting started since this setting should no longer break web servers pointing to the output directory.