Fix intrasite links for non-'summary' metadata
Metadata like `MyArticleBanner: ` would be properly parsed (as defined in `FORMATTED_FIELDS`), but the intrasite links would not be processed.
Only the summary gets its intrasite links processed, has its value is either generated from the content (calling self._update_content at some point) or self._update_content is explicitly called if summary is passed as metadata.
This PR expands the paths as soon as possible in (`Content.__init__`) for metadata defined in `FORMATTED_FIELDS`.
Previously, with RELATIVE_URLS disabled, when both SITEURL and
STATIC_URL were absolute, the final generate data URLs looked wrong like
this (two absolute URLs joined by `/`):
http://your.site/http://static.your.site/image.png
With this patch, the data URLs are correctly:
http://static.your.site/image.png
This also applies to all *_URL configuration options (for example,
ability to have pages and articles on different domains) and behaves
like one expects even with URLs starting with just `//`, thanks to
making use of urllib.parse.urljoin().
However, when RELATIVE_URLS are enabled, urllib.parse.urljoin() doesn't
handle the relative base correctly. In that case, simple os.path.join()
is used. That, however, breaks the above case, but as RELATIVE_URLS are
meant for local development (thus no data scattered across multiple
domains), I don't see any problem.
Just to clarify, this is a fully backwards-compatible change, it only
enables new use cases that were impossible before.
* Consolidate validation of content
Previously we validated content outside of the content class via
calls to `is_valid_content` and some additional checks in page /
article generators (valid status).
This commit moves those checks all into content.valid() resulting
in a cleaner code structure.
This allows us to restructure how generators interact with content,
removing several old bugs in pelican (#1748, #1356, #2098).
- move verification function into content class
- move generator verifying content to contents class
- remove unused quote class
- remove draft class (no more rereading drafts)
- move auto draft status setter into Article.__init__
- add now parsing draft to basic test output
- remove problematic DEFAULT_STATUS setting
- add setter/getter for content.status
removes need for lower() calls when verifying status
* expand c4b184fa32
Mostly implement feedback by @iKevinY.
* rename content.valid to content.is_valid
* rename valid_* functions to has_valid_*
* update tests and function calls in code accordingly
a html tag always starts with <[a-z], < [a-z] is invalid
a space can be found after the = in href='bleh'
This function is taking 10% of the compilation time, with caching enabled,
maybe it's worth optimising the regexp a bit more, I don't know.
There was no test case checking the behaviour of what happens when
trying to {filename} an unknown file.
Also changed the braces to `'` chars that we use elseware.
When memoizing summary a dummy get_summary function was introduced to
properly work. This has multiple problems:
* The siteurl argument is actually not used in the function, it is only
used so it properly memoizes.
* It differs from how content is accessed and memoized.
This commit brings summary inline with how content is accessed and
processed. It also removes the old get_summary function with the unused
siteurl argument.
Additionally _get_summary is marked as deprecated and calls .summary
instead while also issueing a warning.
If an unknown replacement indicator {bla} was used, it was ignored
silently. This commit adds a warning when an unmatched indicator occurs
to help identify the issue.
closes#1794
* speed up via reduced slugify calls (only call when needed)
* fix __repr__ to not contain str, should call repr on name
* add test_urlwrappers and move URLWrappers tests there
* add new equality test
* cleanup header
additionally:
* Content is now decorated with python_2_unicode_compatible
instead of treating __str__ differently
* better formatting for test_article_metadata_key_lowercase
to actually output the conflict instead of a non descriptive
error
* remove content_object_init section from docs
* improve content_object_init test
The content_object_init signal used to set its class as sender and pass
the instance as additional arg in 6100773. Commit ed907b4 removed this
behaviour to bring it inline with other signals, on the basis that
you can test for the class of the object anyway.
We also had a test in place, checking this behaviour, but it was poorly
implemented, not actually checking if the function ever got called.
closes#1711
* Fix {filename} links on Windows.
Otherwise '{filename}/foo/bar.jpg' doesn't work
* Clean up relative Posix path handling in contents.
* Use Posix paths in readers
* Environment for Popen must be strs, not unicodes.
* Ignore Git CRLF warnings.
* Replace CRLFs with LFs in inputs on Windows.
* Fix importer tests
* Fix test_contents
* Fix one last backslash in paginated output
* Skip the remaining failing locale tests on Windows.
* Document the use of forward slashes on Windows.
* Add some Fabric and ghp-import notes
Until now, making static files end up in the same output directory as an
article that links to them has been difficult, especially when the article's
output path is generated based on metadata. This changeset introduces the
{attach} link syntax, which works like the {filename} syntax, but also
overrides the static file's output path with the directory of the
linking document.
It also clarifies and expands the documentation on linking to internal content.
reverts getpelican/pelican@ddcccfeaa9
If one used a locale that made use of unicode characters (like fr_FR.UTF-8)
the files on disk would be in correct locale while links would be to C.
Uses a SafeDatetime class that works with unicode format strigns
by using custom strftime to prevent ascii decoding errors with Python2.
Also added unicode decoding for the calendar module to fix period
archives.
The locale is a global state, and it was not properly reset to
whatever it was before the unitttest possibly changed it.
This is now fixed.
Not restoring the locale led to weird issues: depending on
the order chosen by "python -m unittest discover" to run
the unit tests, some tests would apparently randomly fail
due to the locale not being what was expected.
For example, test_period_in_timeperiod_archive would
call mock('posts/1970/ 1月/index.html',...) instead of
expected mock('posts/1970/Jan/index.html',...) and fail.
The test_datetime test passed on python3 but not python2 because
datetime.strftime is a byte string in python2, and a unicode string in python3
This patch allows the test to pass in both python2 and python3 (3.3+ only)
This avoids harcoding test-specific overrides, and makes it easy to
setup a settings dictionary based on DEFAULT_CONFIG for testing.
Because you can trust Pelican to use settings based on DEFAULT_CONFIG,
you are free to go about using:
settings[my_key]
instead of:
settings.get(my_key, some_fallback)
or:
if my_key in settings:
...
if you know that `my_key` is in DEFAULT_CONFIG.
This dictionary is accessed by plugins (like `summary`) which add new
settings, so it should be public (i.e. no prefixed underscore).
The changed name length would have led to a re-indenting of the
default contents anyway, so I shifted them all to four spaces.