The assorted generators all use read_file() to read in the file
contents and metadata. Previously, they sometimes parse additional
metadata, fire off signals, and initialize a pelican.contents.Content
subclass on their own. We can reduce duplicated code and increase
consistency by shifting all that stuff into read_file() itself, and
this commit is a step in that direction.
If a setting exists in DEFAULT_CONFIG, assume it will be there
(instead of checking and/or providing a local default). The earlier
code was split between the two idioms, which was confusing.
Markdown instance carries state for subsequent uses. Content
and summary parsing is done with the same instance. Since
footnotes are processed with an extension and stored as state,
content footnote is duplicated for summary.
This PR adds a ``.reset()`` call before summary parsing to clear
the state. It also adds a test case with footnotes.
This cuts down on the remaining difference between static files and
articles/pages. The main difference is that path-based metadata is
now parsed for static content.
Instead of just being a base class, we can use it to parse static
files. It won't actually do any parsing, but we will get path
metadata extraction from read_file, and this will make the
StaticGenerator implementation simpler.
Sometimes the base filename doesn't have everything you need.
Remember that os.sep is platform dependent, so using it in a regular
expression for this setting may not be portable (boo MS Windows!).
For easier re-use. I also extract the path metadata first, so
metadata explicitly listed in the file contents will take precedence
over metadata parsed from the filename.
These additions are to make it easier to disable pygments or any other
extension the user may not want. In the previous version, these plugins are
hardcoded, but by making it a variable in the config, it is possible to not use
pygments or easily load extra markdown plugins if needed; you can have multiple
plugins in one virtual environment and have different configs load them as
needed.
In my `pelicanconf.py` I then have the following:
MD_EXTENSIONS = ['extra', 'syntaxhighlighter']
where `syntaxhighlighter` is a custom markdown extension I am working on to use
syntax highlighter instead of pygments for code highlighting.
Since version 0.9 `docutils` supports `code` directive. But this directive
can generate fullname classes for the `pygments` style classes.
For example, the following code
```reStructuredText
.. code:: c++
GetFoo()->do_something();
```
generate the following output
```html
<pre class="code c++ literal-block">
<span class="name">GetFoo</span>
<span class="punctuation">()</span>
<span class="operator">-></span>
<span class="name">do_something</span>
<span class="punctuation">();</span>
</pre>
```
Note, that fullname classes were used, when we need a short one
```html
<pre class="code c++ literal-block">
<span class="n">GetFoo</span>
<span class="p">()</span>
<span class="o">-></span>
<span class="n">do_something</span>
<span class="p">();</span>
</pre>
```
Making everything consistent is a bit awkward, since this is a
commonly used attribute, but I've done my best.
Reasons for not consolidating on `filename`:
* It is often used for the "basename" (last component in the path).
Using `source_path` makes it clear that this attribute can contain
multiple components.
Reasons for not consolidating on `filepath`:
* It is barely used in the Pelican source, and therefore easy to
change.
* `path` is more Pythonic. The only place `filepath` ever show up in
the documentation for `os`, `os.path`, and `shutil` is in the
`os.path.relpath` documentation [1].
Reasons for not consolidating on `path`:
* The Page elements have both a source (this attribute) and a
destination (.save_as). To avoid confusion for developers not aware
of this, make it painfully obvious that this attribute is for the
source. Explicit is better than implicit ;).
Where I was touching the line, I also updated the string formatting in
StaticGenerator.generate_output to use the forward compatible
'{}'.format() syntax.
[1]: http://docs.python.org/2/library/os.path.html#os.path.relpath
wrote unit tests and documentation, improved regular expression.
The HtmlReader is enabled by default now and parses metadata in html
files of the form:
<!-- key:value -->