This is a reworked and improved version of content caching.
Notable changes:
- by default only raw content and metadata returned by readers are
cached which should prevent conficts with plugins, the speed benefit
of content objects caching is not very big with a simple setup
- renamed --full-rebuild to --ignore-cache
- added more elaborate logging to caching code
Under python 2, with non-ascii locales, u"{:%b}".format(date) can raise UnicodeDecodeError
because u"{:%b}".format(date) will call date.__format__(u"%b"), which will return a byte string
and not a unicode string.
eg:
locale.setlocale(locale.LC_ALL, 'ja_JP.utf8')
date.__format__(u"%b") == '12\xe6\x9c\x88' # True
This commit catches UnicodeDecodeError and calls date.__format__() with byte strings instead
of characters, since it to work with character strings
invoked before categories and tags lists are created
useful when e.g. modifying the list of articles to be generated
so that removed articles are not leaked in categories or tags
* Adds period tuple of (year, month, day) matching the time
period of the current archive. Note that this is done
to the archive context if period_archives.html doesn't exist.
* Adds tests to verify this.
* Adds documentation in themes.rst about period in period_archives.html.
Previously if you tried to mark an article as a draft by using a different
casing (for example, draft) you would get a warning when building:
`Unknown status Draft for file foo.md, skipping it.` This uses a
case-insensitive comparison when looking at article status instead. I
believe this behavior is a little easier for new Pelican users.
Add a `Readers` class which contains a dict of file extensions / `Reader`
instances. This dict can be overwritten with a `READERS` settings, for instance
to avoid processing *.html files:
READERS = {'html': None}
Or to add a custom reader for the `foo` extension:
READERS = {'foo': FooReader}
This dict is no storing the Reader classes as it was done before with
`EXTENSIONS`. It stores the instances of the Reader classes to avoid instancing
for each file reading.
Make deliberate overriding (*) works with overwrites detection.
(*) first introduced by d0e9c52410
The following are decided to be deliberate override:
- articles using the `save_as` metadata
- pages using the `save_as` metadata
- template pages (always)
Pelican now exits in the following 2 cases:
- at least 2 not deliberate writes to the same file name (behaviour introduced
by the overwrite detection feature ff7410ce2a)
- at least 2 deliberate writes to the same file name (new behaviour)
Also added info logging when deliberate overrides are performed.
Switched to StandardError instead of IOError, thanks to @ametaireau and
@russkel.
This adds the lstrip_blocks Jinja parameter and removes unnecessary
whitespace from a few notmyidea templates.
Note: The lstrip_blocks parameter requires Jinja 2.7+, which has been
noted in Pelican's setup.py.
Credit for this commit goes entirely to Russ Webber, who has earned my
eternal thanks for discovering and applying this useful Jinja parameter.
Refs #969
All paths should be relative to Generator.path unless we're actively
accessing the filesystem. This makes the argument less ambiguous, so
we have less likelyhood of joining paths multiple times.
We no longer instantiate the Static object in the StaticGenerator, so
we can't set the save_as argument anymore. If you want to adjust the
output path, use the upcoming EXTRA_PATH_METADATA setting.
If a setting exists in DEFAULT_CONFIG, assume it will be there
(instead of checking and/or providing a local default). The earlier
code was split between the two idioms, which was confusing.
This cuts down on the remaining difference between static files and
articles/pages. The main difference is that path-based metadata is
now parsed for static content.
The old get_relative_path() implementation assumed os.sep == '/',
which doesn't hold on MS Windows. The new implementation uses
split_all() for a more general component count.
I added path_to_url(), because the:
'/'.join(split_all(path))
idiom was showing up in a number of cases, and it's easier to
understand what's going on when that reads:
path_to_url(path)
This will fix a number of places where I think paths and URLs were
conflated, and should improve MS Windows support.
From the Python docs for os.sep [1]:
Note that knowing this is not sufficient to be able to parse or
concatenate pathnames - use os.path.split() and os.path.join()...
Where I touched a line, I also changed double quoted string literals
to single quotes, since they are used more often in the source:
wking@mjolnir ~/src/pelican $ git grep "'" pelican/*.py | wc -l
683
wking@mjolnir ~/src/pelican $ git grep '"' pelican/*.py | wc -l
181
[1]: http://docs.python.org/3/library/os.html#os.sep
Allows users to have per-year, per-month, and per-day archives of posts
automatically generated. The feature is disabled by default; to enable
it a user must supply format strings for a period's respective
`_SAVE_AS` setting.
I think the conversion from native paths to URLs is best put off until
we are actually trying to generate the URL. The old handling
(introduced in 2692586, Fixes#645 - Making cross-content linking
windows compatible, 2012-12-19) converted the path at StaticContent
initialization, which left you with a bogus StaticContent.src.
Once we drop the 'static' subdirectory, we will be able to drop the
`dest` and `url` parts from the StaticGenerator.generate_context()
handling, which will leave things looking a good deal cleaner than
they do now.
Static needs a lot of the same handling as other pages, so make it a
subclass of Page. The rename from StaticContent to Static makes for
cleaner configuration settings (STATIC_URL instead of
STATICCONTENT_URL).
All currently generated Static instances override the save_as
attribute explicitly on initialization, but it isn't hard to imagine
wanting to adjust STATIC file output based on metadata (e.g. extracted
from their source filename). With this union, the framework for
manipulating URLs and filenames is shared between all source file
types.
This makes it easier for StaticGenerator to walk FILES_TO_COPY, where
the input may be a directory or a bare filename.
Non-traversable file types (e.g. everything but directories and
symlinks to directories) are not checked against the exclude list.
The user-level effect of this is that explicit entries in STATIC_PATHS
or FILES_TO_COPY will override a hypothetical STATIC_EXCLUDES setting,
which seems like a reasonable approach.
I also removed the Python 2.5 compatibility check for `followlinks` in
os.walk, since Pelican is now Python >=2.7.