189 commits

Author SHA1 Message Date
Deniz Turgut
2afef16008 Use language_code if it is supported 2018-11-26 03:45:20 +03:00
Justin Mayer
d974ba898c Merge branch 'master' into html_list_tags 2018-11-11 11:30:28 +01:00
Deniz Turgut
015eefe7de Let's handle reST errors in a simple and nice way, finally. 2018-11-10 23:08:10 +01:00
Justin Mayer
11de7b2e47 Merge branch 'master' into html_list_tags 2018-11-01 15:43:14 +01:00
Justin Mayer
25045e8fb8 Merge pull request #2311 from gwax/warn_rst_no_document_title
Multiple reST headers yields "could not find information about 'date'"
2018-04-06 11:51:22 -07:00
Justin Mayer
5330453579 Merge pull request #2017 from JulienPalard/master
Explicitly disallow duplications of URL and save_as
2018-04-06 11:29:37 -07:00
George Leslie-Waksman
fb6d44712b Warn on missing rst document title. Fixes #2311. 2018-03-27 10:16:29 -07:00
Mr. Senko
f62217f38e Make HTMLReader parse multiple occurrences of metadata tags as a list
this means you can now specify:
<meta name="custom_field" content="value_1" />
<meta name="custom_field" content="value_2" />

and the resulting object.custom_field will be ['value_1', 'value_2']
2017-12-02 13:21:46 +02:00
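The behavior described in f62217f38e can be sketched with the standard library alone. This is not Pelican's HTMLReader; `MetaCollector` and `collect_meta` are hypothetical names, and the rule "a single tag stays a plain string, repeats become a list" is inferred from the commit message.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Hypothetical helper that gathers <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here.
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        if name is None or content is None:
            return
        name = name.lower()
        if name not in self.metadata:
            # First occurrence: keep the plain string.
            self.metadata[name] = content
        elif isinstance(self.metadata[name], list):
            # Third and later occurrences: extend the existing list.
            self.metadata[name].append(content)
        else:
            # Second occurrence: promote the stored string to a list.
            self.metadata[name] = [self.metadata[name], content]


def collect_meta(html):
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.metadata
```

Feeding it the two `custom_field` tags from the commit message gives a two-item list, while a tag that appears once stays a string.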
Vladimír Vondruš
abb91fc3be Propagate value of DEFAULT_LANG to docutils reST parser.
That way the parsed node tree gets proper information about the
language, which can be then used for e.g. language-aware smart quotes or
hyphenation.
2017-11-29 22:42:54 +01:00
Vladimír Vondruš
7336de45cb Ability to override docutils HTML writer/translator.
The RstReader class can now use user-specified writer/translator classes
instead of the hardcoded ones from docutils. This allows for far easier
overriding of the default HTML output -- in the past one would need to
override the internal _parse_metadata() and _get_publisher() functions.
With hypothetical Html5Writer and Html5FieldBodyTranslator classes,
based for example on docutils.writers.html5_polyglot.Writer and
docutils.writers.html5_polyglot.HTMLTranslator, a plugin that overrides
the default behavior would now look just like this:

    import pelican.signals
    from pelican.readers import RstReader

    # (definition of Writer / Translator classes omitted)

    class Html5RstReader(RstReader):
        writer_class = Html5Writer
        field_body_translator_class = Html5FieldBodyTranslator

    def add_reader(readers):
        readers.reader_classes['rst'] = Html5RstReader

    def register():
        pelican.signals.readers_init.connect(add_reader)
2017-06-30 22:59:42 +02:00
derwinlu
f49037e0ca Fixup 89b28fd
We need to mark the whole doctest string as raw as it contains
regular expressions.
2017-03-29 10:46:51 +02:00
derwinlu
623eb0a4c0 Fix more Python 3.6 regex DeprecationWarnings 2017-03-29 10:19:47 +02:00
derwinlu
89b28fd36b Fix warnings originating from bad regexes
Starting with Python 3.6, warnings are issued for invalid escape
sequences in regular expressions. This commit fixes all
DeprecationWarnings by declaring the offending
regular expressions as raw strings.

Resolves #2095.
2017-03-27 16:09:08 +02:00
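As a small illustration of the fix in 89b28fd36b: a raw string passes backslashes to the regex engine untouched, so sequences like `\d` no longer count as invalid string escapes. The pattern and the `find_date` helper are made up for this example and are not taken from Pelican.

```python
import re

# In a normal string literal "\d" is an invalid escape sequence, which is
# what triggers the warning. The r"" prefix keeps the backslash literal.
DATE_PATTERN = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2})")


def find_date(text):
    """Return the first YYYY-MM-DD date found in text, or None."""
    match = DATE_PATTERN.search(text)
    return match.group("date") if match else None
```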
Tim Wienk
4917b8618a Fix setting None metadata from FILENAME_METADATA matches.
This is relevant when using optional items in the expression. E.g. if an
optional captured group is not matched, the result of
`match.groupdict()` contains the captured group with value `None`.
2017-03-15 17:07:31 +01:00
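The situation described in 4917b8618a is easy to reproduce with `re` directly. The pattern below is a hypothetical FILENAME_METADATA-style expression with an optional `slug` group, and `metadata_from_filename` is an illustrative helper, not Pelican's code.

```python
import re

# Hypothetical pattern: the date is required, the "_slug" part is optional.
PATTERN = r"(?P<date>\d{4}-\d{2}-\d{2})(?:_(?P<slug>[a-z-]+))?"


def metadata_from_filename(pattern, filename):
    """Return captured groups, skipping optional groups that did not match."""
    match = re.match(pattern, filename)
    if match is None:
        return {}
    # groupdict() reports unmatched optional groups with the value None,
    # so those entries are dropped instead of being stored as metadata.
    return {key: value for key, value in match.groupdict().items()
            if value is not None}
```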
John
6f9f0def0f Fix #1325 and add test_find_empty_alt 2016-11-17 23:29:19 +00:00
Bernhard Scheirle
a445e81ae6 Make markdown extensions order non-arbitrary 2016-11-15 17:05:12 +01:00
Bernhard Scheirle
35dba138e0 Replaces MD_EXTENSIONS with MARKDOWN
MARKDOWN allows configuring the Python Markdown module

fixes #1024
2016-11-02 21:11:42 +01:00
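A settings fragment along the following lines shows roughly what the MARKDOWN setting from 35dba138e0 looks like in a Pelican configuration file. The key names follow the Pelican settings documentation as best understood here; the particular extensions and options are only an illustrative choice.

```python
# pelicanconf.py (fragment): options passed on to the Python Markdown module.
MARKDOWN = {
    "extension_configs": {
        # Full dotted extension names mapped to their option dicts.
        "markdown.extensions.codehilite": {"css_class": "highlight"},
        "markdown.extensions.extra": {},
        "markdown.extensions.meta": {},
    },
    "output_format": "html5",
}
```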
Julien Palard
e07c53a09d FIX: Those keys are looked up lowercased. 2016-10-26 08:34:52 +02:00
Julien Palard
e8a87e5d3c Disallowing duplicates in processed items is counterintuitive, so let's allow them.
Processing multiple values may also be supported in the future.

Also, @avaris thinks it's bad to test something twice (see
https://github.com/getpelican/pelican/pull/2017), but for me the confusion
lies in "Why is list processing forbidden?", so, in a way, our
ideas converge on "let's not disallow processed items to be lists".

This reverts commit 9e574e9d8c.
2016-10-10 12:23:26 +02:00
Julien Palard
13ac28b6d4 Merge remote-tracking branch 'upstream/master' 2016-10-10 12:23:01 +02:00
Justin Mayer
22861aa1c1 Merge pull request #2015 from jpli/improve_path_metadata_processing
Improve path metadata processing
2016-10-06 11:30:27 -06:00
Julien Palard
9e574e9d8c Guard against a key that was added to METADATA_PROCESSORS but forgotten in DUPLICATES_DEFINITIONS_ALLOWED. 2016-09-30 15:33:05 +02:00
Julien Palard
24a1254f03 Explicitly disallow duplications of URL and save_as. 2016-09-30 15:29:14 +02:00
Justin Mayer
5cc4c9f4ab Merge pull request #1880 from allanman/patch-1
Ensure DEFAULT_DATE = 'fs' actually uses modified time
2016-09-22 16:12:46 -06:00
Li Jiapeng
0f6b98506e Revert to the old category processing order
If no category is specified in PATH_METADATA or FILENAME_METADATA,
fall back to USE_FOLDER_AS_CATEGORY, which defaults to True.
2016-09-19 18:29:58 +08:00
Li Jiapeng
9185e0b7a8 Avoid circumvention of metadata name checking
See https://github.com/getpelican/pelican/issues/2011
2016-09-19 18:28:43 +08:00
Will Thompson
904f57d9c3 MarkdownReader: don't raise AttributeError on empty files
Markdown.convert() returns early, without running any preprocessors, if
source.strip() is empty.

Before, Pelican would raise AttributeError in this case; now, it logs a
more friendly error:

ERROR: Skipping ./foo.md: could not find information about 'NameError: title'

which is more consistent with the error from empty .rst files:

ERROR: Skipping ./foo.rst: could not find information about 'NameError: date'
2016-08-11 07:51:39 +01:00
Marcin Kurczewski
12c72d3e19 Fix typogrifying objects without title
This fixes my use case where I use `readers.read_file` from within a plugin to load something that is neither a page nor an article.
2016-05-29 10:53:56 +02:00
Robert Utter
3f2d89c9d6 Makes DEFAULT_DATE accept string dates; fixes #1464 2016-05-24 17:22:27 +03:00
A Björck
749f85e468 Update readers.py
Change so that DEFAULT_DATE = 'fs' makes Pelican actually use the file mtime, as stated in the manual (it was using ctime)
2015-12-26 19:07:41 +01:00
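The difference fixed in 749f85e468 comes down to which `os.stat` field is read. A minimal sketch, assuming a hypothetical `filesystem_date` helper that is not Pelican's own function:

```python
import datetime
import os


def filesystem_date(path):
    """Return the file's last-modification time as a datetime."""
    # st_mtime is the time the content was last modified. st_ctime is the
    # metadata-change time on Unix and the creation time on Windows.
    return datetime.datetime.fromtimestamp(os.stat(path).st_mtime)
```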
Kernc
510961bbb9 Avoid Markdown 2.6 deprecations; make MD_EXTENSIONS a dict
* Make MD_EXTENSIONS setting a dict and add tests for this change;
* Short extension names ('extra', 'meta') are deprecated
  https://pythonhosted.org/Markdown/release-2.6.html#shortened-extension-names-deprecated
* Extension config as part of extension name is deprecated
  https://pythonhosted.org/Markdown/release-2.6.html#extension-configuration-as-part-of-extension-name-deprecated
2015-11-30 18:12:28 +01:00
Justin Mayer
1bbffad7b9 Merge pull request #1837 from SimonStJG/fix-quote-escaping-in-html-attributes
Fix quote escaping in HTML attributes. Fixes issue #1260.
2015-11-02 13:44:56 -08:00
Simon StJG
d333ed12c6 Fix quote escaping in read html attributes.
* Wrap HTML attributes in quotes according to their content.  If the value contains a double quote, use single quotes; otherwise wrap it in double quotes.
* Add escape_html utility to ensure quote entities are converted identically across Python versions.

Fixes #1260
2015-10-14 21:03:01 +01:00
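The quoting rule from d333ed12c6 can be sketched as below. `quote_attribute` is a hypothetical name. Escaping `&`, `<` and `>` and replacing stray single quotes with `&#x27;` are assumptions added here so that the result stays well-formed; the commit message itself only describes the choice of quote character.

```python
from html import escape


def quote_attribute(value):
    """Wrap an attribute value in whichever quote style suits its content."""
    escaped = escape(value, quote=False)  # escapes &, < and > only
    if '"' in escaped:
        # The value contains a double quote: delimit with single quotes.
        # Assumption: any single quotes inside are turned into entities.
        return "'" + escaped.replace("'", "&#x27;") + "'"
    # No double quote inside: plain double-quote delimiters are safe.
    return '"' + escaped + '"'
```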
derwinlu
554cde1a22 rename summary references to formatted_fields
Since FORMATTED_FIELDS was introduced the variables are not specific to
summary and contain other data as well.
2015-10-13 10:26:52 +02:00
Jesús Fernández
7f795ed558 Remove duplicate tags and authors in metadata 2015-08-26 12:07:38 +02:00
derwinlu
8993c55e6e fulfil pep8 standard 2015-08-17 13:34:32 +02:00
Justin Mayer
b7e8af5977 Merge pull request #1747 from ingwinlu/fix_cache
Fix caching and disable by default
2015-06-09 08:42:51 -07:00
derwinlu
b7e6390f04 fix caching
*  break out cache into cache.py
*  break out cache-tests into test_cache.py
*  fix broken cache tests
   *  replace nonexistent assert calls with self.assertEqual
   *  fix path for page caching test (was invalid)
   *  cleanup test code
*  restructure generate_context in Article and Path Generator
   * distinguish between valid/invalid files correctly and cache accordingly
*  use cPickle if available for increased performance
2015-06-08 09:34:30 +02:00
Zack Weinberg
c918380802 Support semicolon-separated author/tag lists.
Idea borrowed from Docutils.  This allows one to write author lists in
lastname,firstname format.  The code change also means that readers with
fancy metadata that can natively represent lists (e.g. Docutils itself,
or MD-Yaml) don't have to merge 'em back together for process_metadata's
sake.
2015-06-04 17:31:20 -04:00
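One way to read the rule in c918380802 is: if the value contains a semicolon, split on semicolons so that commas can appear inside items; otherwise split on commas as before. The `split_list` helper below is an illustrative sketch of that reading, not the code from the commit.

```python
def split_list(text):
    """Split an author/tag string on ';' when present, otherwise on ','."""
    separator = ";" if ";" in text else ","
    # Strip whitespace and drop empty items left by stray separators.
    return [item.strip() for item in text.split(separator) if item.strip()]
```

With this rule, "Weinberg, Zack; Turgut, Deniz" yields two authors in lastname,firstname form, while "python, pelican" still yields two tags.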
Forest
db2e517450 Ignore empty metadata. Fixes #1469. Fixes #1398.
Some metadata values cause problems when empty.  For example, a Markdown file
containing a Slug: line with no additional text causes Pelican to produce a
file named ".html" instead of generating a proper file name.  Others, like
those created by a PATH_METADATA regex, must be preserved even if empty,
so things like PAGE_URL="filename{customvalue}.html" will always work.
Essentially, we want to discard empty metadata that we know will be useless
or problematic.  This is better than raising an exception because (a) it
allows users to deliberately keep empty metadata in their source files for
filling in later, and (b) users shouldn't be forced to fix empty metadata
created by blog migration tools (see #1398).

The metadata processors are the ideal place to do this, because they know
the type of data they are handling and whether an empty value is wanted.
Unfortunately, they can't discard items, and neither can process_metadata(),
because their return values are always saved by calling code.  We can't
safely change the calling code, because some of it lives in custom reader
classes out in the field, and we don't want to break those working systems.
Discarding empty values at the time of use isn't good enough, because that
still allows useless empty values in a source file to override configured
defaults.

My solution:
- When processing a list of values, a metadata processor will omit any
  unwanted empty ones from the list it returns.
- When processing an entirely unwanted value, it will return something easily
  identifiable that will pass through the reader code.
- When collecting the processed metadata, read_file() will filter out items
  identified as unwanted.

These metadata are affected by this change:
author, authors, category, slug, status, tags.

I also removed a bit of now-superfluous code from generators.py that was
discarding empty authors at the time of use.
2015-03-24 11:37:07 -07:00
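The three-step approach in db2e517450 resembles a sentinel pattern. The sketch below is a simplified reconstruction under that assumption: `_DISCARD`, `process_slug`, `process_tags`, `PROCESSORS` and `collect_metadata` are all hypothetical names, and only two of the affected fields are modeled.

```python
# Unique marker a processor returns for a value that should be dropped.
_DISCARD = object()


def process_slug(value):
    """An entirely empty slug is unwanted, so signal that it be discarded."""
    value = value.strip()
    return value if value else _DISCARD


def process_tags(value):
    """When processing a list, omit the empty items from the result."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


PROCESSORS = {"slug": process_slug, "tags": process_tags}


def collect_metadata(raw):
    """Run processors and filter out anything marked as unwanted."""
    processed = {}
    for name, value in raw.items():
        key = name.lower()
        processor = PROCESSORS.get(key)
        result = processor(value) if processor else value
        if result is _DISCARD:
            continue  # the empty value never overrides configured defaults
        processed[key] = result
    return processed
```

Because the marker passes through any reader code as an ordinary return value, custom readers that call the processors need no changes; only the final collection step has to know about it.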
Deniz Turgut
3ea4542015 Make sure Content uses URLWrappers 2015-03-06 16:06:20 -05:00
Patrick Fournier
d0afaa5fbe Format custom metadata fields listed in the FORMATTED_FIELDS setting.
Adding FORMATTED_FIELDS to the default settings with ['summary'] as the default value.
2015-02-24 16:57:05 -05:00
Justin Mayer
88ec7026ea Merge pull request #1533 from kernc/underscore_dates
Replace underscores in dates with spaces before parsing
2015-02-18 09:14:37 -08:00
Justin Mayer
bfbb7d4bb5 Merge pull request #1581 from georgevreilly/win-fixes
Fix Pelican rendering and unit tests on Windows.
2015-02-17 17:06:19 -08:00
John Mastro
0949fa62ec Tell smartypants to also process &quot; entities
This is necessary because Docutils has already replaced double quotes
with &quot; HTML entities by the time the typogrify filter is applied.
2015-02-12 16:27:30 -08:00
George V. Reilly
4c25610cd8 Fix Pelican rendering and unit tests on Windows.
* Fix {filename} links on Windows.
  Otherwise '{filename}/foo/bar.jpg' doesn't work
* Clean up relative Posix path handling in contents.
* Use Posix paths in readers
* Environment for Popen must be strs, not unicodes.
* Ignore Git CRLF warnings.
* Replace CRLFs with LFs in inputs on Windows.
* Fix importer tests
* Fix test_contents
* Fix one last backslash in paginated output
* Skip the remaining failing locale tests on Windows.
* Document the use of forward slashes on Windows.
* Add some Fabric and ghp-import notes
2015-01-25 17:42:53 -08:00
Kernc
88d19d47b5 Replace underscores in dates with spaces before parsing 2014-11-17 06:54:22 +01:00
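The idea in 88d19d47b5 can be shown with a fixed date format. This is only a sketch: `parse_date` is a hypothetical helper, and the single `strptime` format is an assumption made to keep the example self-contained, whereas Pelican's own date parsing is more flexible.

```python
from datetime import datetime


def parse_date(value):
    """Parse dates such as '2014-11-17_06:54' by normalizing underscores."""
    cleaned = value.replace("_", " ")
    return datetime.strptime(cleaned, "%Y-%m-%d %H:%M")
```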
Deniz Turgut
a2bb80b8bd Fixes #1420: Handle multiple definitions of standard metadata for Markdown 2014-08-22 17:53:36 -04:00
Justin Mayer
b8c9d61f20 Merge pull request #1411 from barrysteyn/typogrify-ignore-list
Allow Typogrify to ignore user specified tags. Refs #1407
2014-08-17 07:18:19 -06:00
Barry Steyn
a0ecab901f Allows Typogrify to ignore user specified tags. Refs #1407
Typogrify interferes with certain sections of the output that it should not touch (see #1407 for more details).
This feature adds a setting called TYPOGRIFY_IGNORE_TAGS which is a list of tags for Typogrify to ignore.

The following was updated:

 1. readers.py - if TYPOGRIFY_IGNORE_TAGS is present, then use it
 2. settings.py - default TYPOGRIFY_IGNORE_TAGS to []
 3. contents/article_with_code_block.rst - an article with a code block for typogrify to ignore
 4. updated tests
 5. updated documentation
2014-07-28 15:17:12 -07:00