Previously, with RELATIVE_URLS disabled, when both SITEURL and
STATIC_URL were absolute, the final generated data URLs were wrong
(two absolute URLs joined by `/`), like this:
http://your.site/http://static.your.site/image.png
With this patch, the data URLs are correct:
http://static.your.site/image.png
This also applies to all *_URL configuration options (for example,
making it possible to have pages and articles on different domains) and
behaves as expected even with URLs starting with just `//`, thanks to
urllib.parse.urljoin().
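
As a quick illustration of the behavior described above, the following
sketch compares naive joining with urllib.parse.urljoin(). The URLs are the
example ones from this message; the `naive` variable only mimics the old
result and is not Pelican's actual code.

```python
from urllib.parse import urljoin

siteurl = "http://your.site/"

# Old behavior (mimicked): two absolute URLs glued together with "/".
naive = "/".join([siteurl.rstrip("/"), "http://static.your.site/image.png"])

# urljoin() lets an absolute reference replace the base entirely.
absolute = urljoin(siteurl, "http://static.your.site/image.png")

# A scheme-relative reference ("//host/path") inherits the base's scheme.
scheme_relative = urljoin(siteurl, "//static.your.site/image.png")

# A plain relative reference is resolved against the base as usual.
relative = urljoin(siteurl, "images/image.png")
```
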
However, when RELATIVE_URLS is enabled, urllib.parse.urljoin() doesn't
handle the relative base correctly, so a simple os.path.join() is used
in that case. That breaks the multi-domain case above, but as
RELATIVE_URLS is meant for local development (thus no data scattered
across multiple domains), I don't see any problem.
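
The fallback described here could be sketched roughly as follows.
`join_url` and its `relative` flag are hypothetical names used only for
illustration, and posixpath.join() stands in for os.path.join() so the
result does not depend on the platform; the actual patch may be structured
differently.

```python
import posixpath
from urllib.parse import urljoin


def join_url(base, url, relative=False):
    """Join base and url (hypothetical helper, not Pelican's API)."""
    if relative:
        # Relative base such as ".." or ".": plain path joining.
        return posixpath.join(base, url)
    # Absolute base: urljoin() handles absolute and "//" references.
    return urljoin(base, url)


local = join_url("..", "static/image.png", relative=True)
# With a relative base, an absolute URL is not recognized as such:
broken = join_url("..", "http://static.your.site/image.png", relative=True)
```

The `broken` value shows the limitation accepted above: path joining treats
an absolute URL as just another path segment.
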
Just to clarify, this is a fully backwards-compatible change; it only
enables new use cases that were impossible before.
Check for 0 dates.
For my own blog this means the import no longer breaks, but I don't know Pelican well enough yet to say whether this is the correct fix.
Error that I was getting before applying this fix:
```
Traceback (most recent call last):
  File "/mnt/data/home/stu/.virtualenvs/blog/bin/pelican-import", line 11, in <module>
    sys.exit(main())
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 896, in main
    attachments=attachments or None)
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 684, in fields2pelican
    kind, in_markup) in fields:
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 163, in wp2fields
    date_object = time.strptime(raw_date, '%Y-%m-%d %H:%M:%S')
  File "/usr/lib/python2.7/_strptime.py", line 478, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
    (data_string, format))
ValueError: time data u'0000-00-00 00:00:00' does not match format u'%Y-%m-%d %H:%M:%S'
```
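
A minimal sketch of such a check, assuming zero dates should simply be
treated as missing. `parse_wp_date` and `ZERO_DATE` are hypothetical names,
and the actual change in pelican_import.py may handle the case differently.

```python
import time

ZERO_DATE = "0000-00-00 00:00:00"  # the value from the traceback above


def parse_wp_date(raw_date):
    """Return 'YYYY-MM-DD HH:MM', or None for a zero date (hypothetical helper)."""
    if raw_date == ZERO_DATE:
        # time.strptime() raises ValueError on this value, so skip parsing.
        return None
    date_object = time.strptime(raw_date, "%Y-%m-%d %H:%M:%S")
    return time.strftime("%Y-%m-%d %H:%M", date_object)
```
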
* Consolidate validation of content
Previously we validated content outside of the content class via
calls to `is_valid_content` and some additional checks in page /
article generators (valid status).
This commit moves all of those checks into content.valid(), resulting
in a cleaner code structure.
This allows us to restructure how generators interact with content,
removing several old bugs in pelican (#1748, #1356, #2098).
- move verification function into content class
- move generator verifying content to contents class
- remove unused quote class
- remove draft class (no more rereading drafts)
- move auto draft status setter into Article.__init__
- add the now-parsed draft to the basic test output
- remove problematic DEFAULT_STATUS setting
- add setter/getter for content.status
  (removes the need for lower() calls when verifying status)
* expand c4b184fa32
Mostly implement feedback by @iKevinY.
* rename content.valid to content.is_valid
* rename valid_* functions to has_valid_*
* update tests and function calls in code accordingly
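
To make the resulting shape concrete, here is a stripped-down sketch of a
content class with a normalizing status property and a single is_valid()
entry point. Apart from the is_valid / has_valid_* naming described above,
everything here (the class body, `allowed_statuses`, the specific checks) is
an assumption for illustration and not Pelican's actual implementation.

```python
class Content:
    """Stripped-down stand-in for a content class (illustration only)."""

    allowed_statuses = ("published", "draft")  # assumed set of statuses

    def __init__(self, title=None, status="published"):
        self.title = title
        self.status = status  # goes through the setter below

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, value):
        # Normalize once here so callers never need to call lower().
        self._status = value.lower()

    def has_valid_mandatory_properties(self):
        return self.title is not None

    def has_valid_status(self):
        return self.status in self.allowed_statuses

    def is_valid(self):
        # Single entry point that generators can call.
        return self.has_valid_mandatory_properties() and self.has_valid_status()
```
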
The RstReader class can now use user-specified writer/translator classes
instead of the hardcoded ones from docutils. This allows for far easier
overriding of the default HTML output -- in the past one would need to
override the internal _parse_metadata() and _get_publisher() functions.
With hypothetical Html5Writer and Html5FieldBodyTranslator classes,
based for example on docutils.writers.html5_polyglot.Writer and
docutils.writers.html5_polyglot.HTMLTranslator, a plugin that overrides
the default behavior would now look just like this:
    import pelican.signals
    from pelican.readers import RstReader

    # (definition of Writer / Translator classes omitted)

    class Html5RstReader(RstReader):
        writer_class = Html5Writer
        field_body_translator_class = Html5FieldBodyTranslator

    def add_reader(readers):
        readers.reader_classes['rst'] = Html5RstReader

    def register():
        pelican.signals.readers_init.connect(add_reader)
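
The mechanism behind this is the ordinary class-attribute hook pattern. The
dependency-free sketch below shows the idea only; `BaseReader`,
`DefaultWriter`, and `get_writer` are hypothetical stand-ins and do not
mirror the real RstReader or docutils internals.

```python
class DefaultWriter:
    name = "html4"  # stand-in for the previously hardcoded writer


class Html5Writer:
    name = "html5"  # stand-in for a user-supplied writer


class BaseReader:
    writer_class = DefaultWriter  # overridable hook

    def get_writer(self):
        # Instantiate whatever class the (sub)class attribute points at.
        return self.writer_class()


class Html5Reader(BaseReader):
    writer_class = Html5Writer
```

A subclass changes the output by reassigning one attribute instead of
overriding the method that builds the writer.
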