This is a reworked and improved version of content caching.
Notable changes:
- by default only raw content and metadata returned by readers are
cached which should prevent conficts with plugins, the speed benefit
of content objects caching is not very big with a simple setup
- renamed --full-rebuild to --ignore-cache
- added more elaborate logging to caching code
Previously, the error returned by Python when docutils is not installed
was not explicit, instead saying that HTMLTranslator is not defined
(needed by FeedGenerator and such), forcing the user to go into
readers.py to figure out that this happens because "import docutils"
failed.
This pull request makes the docutils dependency explicit, so that there
is an ImportError if doctutils is not found.
The _cache_open attribute of the FileDataCacher class was not set when
settings[load_policy_key] was not True, so saving later failed.
As a precaution, replaced the `if ...: return` style with a plain
if structure to prevent such readability issues and added tests.
The locale is a global state, and it was not properly reset to
whatever it was before the unitttest possibly changed it.
This is now fixed.
Not restoring the locale led to weird issues: depending on
the order chosen by "python -m unittest discover" to run
the unit tests, some tests would apparently randomly fail
due to the locale not being what was expected.
For example, test_period_in_timeperiod_archive would
call mock('posts/1970/ 1月/index.html',...) instead of
expected mock('posts/1970/Jan/index.html',...) and fail.
Under python 2, with non-ascii locales, u"{:%b}".format(date) can raise UnicodeDecodeError
because u"{:%b}".format(date) will call date.__format__(u"%b"), which will return a byte string
and not a unicode string.
eg:
locale.setlocale(locale.LC_ALL, 'ja_JP.utf8')
date.__format__(u"%b") == '12\xe6\x9c\x88' # True
This commit catches UnicodeDecodeError and calls date.__format__() with byte strings instead
of characters, since it to work with character strings
The download_attachments error is triggered in the unit tests by a japanese
error message (接続を拒否されました) (connexion denied), that
python is not able to serialize the into a byte string.
This error weirdly does not appear every time the unit tests are run.
It might be related to the order in which the tests are run.
This error was found and fixed during the PyconUS 2014 pelican
sprint. It was discovered on a Linux Fedora20 computer running
Python2.7 in virtualenv
The test_datetime test passed on python3 but not python2 because
datetime.strftime is a byte string in python2, and a unicode string in python3
This patch allows the test to pass in both python2 and python3 (3.3+ only)
Drop duplicates logs.
Allow for logs to be grouped, enforcing a maximum number of logs per group.
Add the LOG_FILTER setting to ask from the configuration file to ignore some
logs (of level up to warning).
Previously, the documentation claimed the value of None for this purpose
even though False was used for certain defaults. The values False and
None cause warnings to be emitted from URLWrapper._from_settings though,
so the new way of inhibiting page generation is to set a *_SAVE_AS value
to the empty string.
PAGINATION_PATTERNS was hard coded so that all files had a ".html" extension. This fixes that and add a test to
ensure that the pagination code is not changing the filename incorrectly.