Typogrify interferes with certain sections of the output that it should not touch (see #1407 for more details).
This feature adds a setting called TYPOGRIFY_IGNORE_LIST which is a list of tag for Typogrify to ignore.
The following was updated:
1. readers.py - if TYPOGRIFY_IGNORE_TAGS is present, then use it
2. settings.ps - default TYPOGRIFY_IGNORE_TAGS to []
3. contents/article_with_code_block.rst - an article with a code block for typogrify to ignore
4. updated tests
5. updated documentation
Old system was using manual string formatting for log messages.
This caused issues with common operations like exception logging
because often they need to be handled differently for Py2/Py3
compatibility. In order to unify the effort:
- All logging is changed to `logging.level(msg, arg1, arg2)` style.
- A `SafeLogger` is implemented to auto-decode exceptions properly
in the args (ref #1403).
- Custom formatters were overriding useful logging functionality
like traceback outputing (ref #1402). They are refactored to be
more transparent. Traceback information is provided in `--debug`
mode for `read_file` errors in generators.
- Formatters will now auto-format multiline log messages in order
to make them look related. Similarly, traceback will be formatted in
the same fashion.
- `pelican.log.LimitFilter` was (ab)using logging message which
would result in awkward syntax for argumented logging style. This
functionality is moved to `extra` keyword argument.
- Levels for errors that would result skipping a file (`read_file`)
changed from `warning` to `error` in order to make them stand out
among other logs.
- Small consistency changes to log messages (i.e. changing all
to start with an uppercase letter) and quality-of-life improvements
(some log messages were dumping raw object information).
"U" mode is redundant in py3 since "newline" argument replaces it and by default
universal newlines is enabled. As of py3.4, "U" mode triggers a deprecation warning.
reverts getpelican/pelican@ddcccfeaa9
If one used a locale that made use of unicode characters (like fr_FR.UTF-8)
the files on disk would be in correct locale while links would be to C.
Uses a SafeDatetime class that works with unicode format strigns
by using custom strftime to prevent ascii decoding errors with Python2.
Also added unicode decoding for the calendar module to fix period
archives.
The reader would return a list of authors already, but
METADATA_PROCESSORS['authors'] expects a string.
Added a test case for this (only the HTMLReader had it).
This is a reworked and improved version of content caching.
Notable changes:
- by default only raw content and metadata returned by readers are
cached which should prevent conficts with plugins, the speed benefit
of content objects caching is not very big with a simple setup
- renamed --full-rebuild to --ignore-cache
- added more elaborate logging to caching code
Previously, the error returned by Python when docutils is not installed
was not explicit, instead saying that HTMLTranslator is not defined
(needed by FeedGenerator and such), forcing the user to go into
readers.py to figure out that this happens because "import docutils"
failed.
This pull request makes the docutils dependency explicit, so that there
is an ImportError if doctutils is not found.
Drop duplicates logs.
Allow for logs to be grouped, enforcing a maximum number of logs per group.
Add the LOG_FILTER setting to ask from the configuration file to ignore some
logs (of level up to warning).
publication time and date and the last modified time and date
independently.
This makes it possible to access the last updated date with {{ article.locale_modified }} in templates.
Additionally, an already delivered feed entry can be corrected by changing the modified date and time, as it is used for atom:update
/ rss pubDate field now.
There was several issues here:
- `self.extensions` was adding 'meta' multiple times (ref #1058)
- `self.extensions` was keeping a reference to `self.settings['MD_EXTENSIONS']`,
so adding 'meta' to it.
- the `%s_EXTENSIONS` block coming after, it was overriding `self.extensions`
with `self.settings['EXTENSIONS']` (while it was a reference, it was working,
but ...). As this is currently used only for Mardown, the simplest solution is
to remove this, and let each reader manage its `_EXTENSIONS` setting.
Add a `Readers` class which contains a dict of file extensions / `Reader`
instances. This dict can be overwritten with a `READERS` settings, for instance
to avoid processing *.html files:
READERS = {'html': None}
Or to add a custom reader for the `foo` extension:
READERS = {'foo': FooReader}
This dict is no storing the Reader classes as it was done before with
`EXTENSIONS`. It stores the instances of the Reader classes to avoid instancing
for each file reading.
The assorted generators all use read_file() to read in the file
contents and metadata. Previously, they sometimes parse additional
metadata, fire off signals, and initialize a pelican.contents.Content
subclass on their own. We can reduce duplicated code and increase
consistency by shifting all that stuff into read_file() itself, and
this commit is a step in that direction.
If a setting exists in DEFAULT_CONFIG, assume it will be there
(instead of checking and/or providing a local default). The earlier
code was split between the two idioms, which was confusing.
Markdown instance carries state for subsequent uses. Content
and summary parsing is done with the same instance. Since
footnotes are processed with an extension and stored as state,
content footnote is duplicated for summary.
This PR adds a ``.reset()`` call before summary parsing to clear
the state. It also adds a test case with footnotes.