forked from github/pelican
Commit graph

62 commits

Author SHA1 Message Date
manhhomienbienthuy
d5d792060c
Fix #2982: Improve _HTMLWordTruncator (#3002) 2022-07-11 19:47:37 +02:00
ImBearChild
22192c148a Improve word count behavior when generating summary
Improve _HTMLWordTruncator by using more than one unicode block in
_word_regex, making word count function behave properly with CJK,
Cyrillic, and more Latin characters when generating summary.
2021-09-29 12:41:00 +02:00
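A minimal sketch of the idea behind this change, not Pelican's actual `_word_regex`: the character ranges and the `count_words` helper below are assumptions chosen for illustration. Runs of Latin or Cyrillic letters count as one word each, while each CJK ideograph counts as its own word.

```python
import re

# Hypothetical pattern covering more than one Unicode block.
WORD_RE = re.compile(
    r"[A-Za-z\u00C0-\u024F\u0400-\u04FF]+"  # a run of Latin (incl. accented) or Cyrillic letters
    r"|[\u4E00-\u9FFF]"                      # a single CJK Unified Ideograph
)


def count_words(text):
    """Count words, treating every CJK ideograph as one word."""
    return len(WORD_RE.findall(text))
```

With an ASCII-only pattern, text written entirely in CJK or Cyrillic would count as zero words, so a summary limited by word count would never be truncated.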
MinchinWeb
332be6e5c8
Support date format codes G, V, and u (used by ISO dates) (#2902) 2021-07-13 09:35:22 +02:00
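For reference, `G`, `V`, and `u` are the ISO 8601 week-based year, week number, and weekday (Monday is 1). The sketch below uses only the standard library, not Pelican's own `strftime` wrapper, to show what those three values are for this commit's date.

```python
from datetime import date, datetime

d = date(2021, 7, 13)

# isocalendar() yields the values that %G, %V and %u stand for.
iso_year, iso_week, iso_weekday = d.isocalendar()

# The same three codes are accepted by datetime.strptime (Python 3.6+).
parsed = datetime.strptime("2021 28 2", "%G %V %u").date()
```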
Justin Mayer
2eb9c26cdb
Merge pull request #2750 from avaris/autoreload
Refactor file/folder watchers and autoreload
2020-05-10 07:29:27 +02:00
Deniz Turgut
48d842faa7
Refactor file/folder watchers and autoreload
Combined file and folder watchers under a class and refactored
common watcher related code from __init__.py to the class.
This simplifies the main and autoreload functions in __init__
and fixes crashes related to multiprocessing on systems where
the default start method is "spawn" instead of "fork".
2020-05-09 16:22:36 +03:00
Deniz Turgut
2e482b207b
Fix Windows tests
* Unskip passable tests
* Fix broken tests
2020-05-09 16:17:14 +03:00
Justin Mayer
d43b786b30 Modernize code base to Python 3+ syntax
Replaces syntax that was relevant in earlier Python versions but that
now has modernized equivalents.
2020-04-27 09:45:31 +02:00
Deniz Turgut
03d9c38871 Rewrite pelican.utils.slugify to use unicode and add tests
Adds a use_unicode kwarg to slugify to keep unicode
characters as is (no ASCII-fying) and add tests for
it. Also reworks the slugification logic.

slugify started with the Django method for slugifying:
 - Normalize to compatibility decomposed form (NFKD)
 - Encode and decode with 'ascii'

This works fine if the decomposed form contains ASCII
characters (e.g. ç can be changed into c+CEDILLA and
ASCII would keep c only), but fails when decomposition
doesn't result in ASCII characters (e.g. Chinese). To
solve that, 'unidecode' was added, which works fine for
both cases. That made the old method redundant, but it
was kept. This commit removes the old method and
adjusts the logic slightly.

Now slugify will normalize all text with composition
mode (NFKC) to unify format for regex substitutions.
And then if use_unicode is False, uses unidecode to
convert it to ASCII.
2020-04-19 20:10:46 +03:00
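The limitation of the old NFKD-plus-ASCII approach can be reproduced with the standard library alone. This is a sketch of the behavior described in the message, not the code that was removed; `ascii_fold` is a hypothetical helper name.

```python
import unicodedata


def ascii_fold(text):
    """NFKD-decompose, then drop every character that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# "ç" decomposes to "c" plus a combining cedilla, so the "c" survives.
french = ascii_fold("fran\u00e7ais")

# Chinese characters have no ASCII decomposition, so nothing survives.
chinese = ascii_fold("\u4e2d\u6587")

# NFKC goes the other way and composes "c" + cedilla back into "ç".
composed = unicodedata.normalize("NFKC", "c\u0327")
```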
Justin Mayer
8ba00dd9f1 Preserve category case in importer
Adds a `preserve_case` parameter to the `slugify()` function and uses it
to preserve capital letters in category names when using the Pelican
importer.
2020-04-15 20:42:21 +02:00
Kevin Yap
1e0e541b57 Initial pass of removing Python 2 support
This commit removes Six as a dependency for Pelican, replacing the
relevant aliases with the proper Python 3 imports. It also removes
references to Python 2 logic that did not require Six.
2019-11-26 06:16:41 +09:00
MinchinWeb
2ee423017b Skip some non-Windows tests on Windows
Some tests will never pass on Windows due to differences in filesystems between
Windows and Linux.
2019-08-21 11:27:31 -06:00
Oliver Urs Lenz
77c967f1db control scope of identification of translations with new settings 2018-11-01 10:12:47 +01:00
Oliver Urs Lenz
5199fa51ea control slug substitutions from settings with regex 2018-10-31 16:20:21 +01:00
Andrea Corbellini
01480a539f more accurate code and tests 2018-02-08 20:10:08 +01:00
Andrea Corbellini
fc7af9e1c3 flake8 fixes 2018-02-08 18:39:29 +01:00
Andrea Corbellini
b573576b00 Fix utils.truncate_html_words() to work with invalid HTML references
Invalid references like those missing semicolons (e.g. `&mdash`) or
those causing overflows (e.g. `�`) are now gracefully
handled and no exception is thrown.

This commit also adds tests and comments where needed.
2018-02-08 18:30:09 +01:00
Jonas Wielicki
018f4468cc Check safety of save_as earlier if possible
The check in the writer still serves as a safety net.
2017-02-27 21:49:17 +01:00
Rogdham
4645def789 Fix translation metadata support 2016-11-20 20:01:56 +01:00
Mr. Senko
648165b839 More granular control of tags and categories slugs. Fixes #1873
- add TAG_SUBSTITUTIONS and CATEGORY_SUBSTITUTIONS settings
- make slugify keep non-alphanumeric characters if configured
2016-04-01 23:00:08 +03:00
Daniel Himmelstein
f864dd832c Change ... (periods) to … (ellipsis) in summary
Also update HTML output by running (after making sure to have the fr_FR.utf8
locale installed):

```sh
LC_ALL=en_US.utf8 pelican -o pelican/tests/output/custom/ -s samples/pelican.conf.py samples/content/
LC_ALL=fr_FR.utf8 pelican -o pelican/tests/output/custom_locale/ -s samples/pelican.conf_FR.py samples/content/
LC_ALL=en_US.utf8 pelican -o pelican/tests/output/basic/ samples/content/
```
as described at
http://docs.getpelican.com/en/3.6.3/contribute.html#running-the-test-suite
2016-02-22 13:03:47 -08:00
Andrea Corbellini
c255a35800 Use unichr() instead of chr() with Python 2. 2015-09-22 23:25:24 +02:00
Andrea Corbellini
9d0804de7a When truncating, consider hyphens, apostrophes and HTML entities. 2015-08-28 13:22:54 +02:00
derwinlu
8993c55e6e fulfil pep8 standard 2015-08-17 13:34:32 +02:00
Andrea Corbellini
379f8666c1 Rewrite pelican.utils.truncate_html_words() to use an HTML parser instead of regular expressions. 2015-07-30 21:04:28 +02:00
derwinlu
39fd4936b5 improve result output of a pelican run
*  display count of hidden pages
*  pluralize only if necessary
*  add maybe_pluralize to utils
*  add tests for maybe_pluralize
2015-06-03 08:58:59 +02:00
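A helper of this kind can be very small. The sketch below is a guess at its shape; the signature and return format are assumptions, not taken from the commit.

```python
def maybe_pluralize(count, singular, plural):
    """Return e.g. '1 page' or '3 pages', picking the form from the count."""
    word = singular if count == 1 else plural
    return f"{count} {word}"
```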
Kevin Yap
95860c6b1b Remove unused modules/variables from tests 2015-02-17 18:25:44 -08:00
Deniz Turgut
fa269d7c6f Fix tests that were skipped in #1581 2015-02-17 20:50:27 -05:00
George V. Reilly
4c25610cd8 Fix Pelican rendering and unit tests on Windows.
* Fix {filename} links on Windows.
  Otherwise '{filename}/foo/bar.jpg' doesn't work
* Clean up relative Posix path handling in contents.
* Use Posix paths in readers
* Environment for Popen must be strs, not unicodes.
* Ignore Git CRLF warnings.
* Replace CRLFs with LFs in inputs on Windows.
* Fix importer tests
* Fix test_contents
* Fix one last backslash in paginated output
* Skip the remaining failing locale tests on Windows.
* Document the use of forward slashes on Windows.
* Add some Fabric and ghp-import notes
2015-01-25 17:42:53 -08:00
Carlos E. Garcia
f5775bcba0 minor spelling fixes 2014-09-15 20:29:00 -04:00
Justin Mayer
1fae9534d5 Merge pull request #1446 from avaris/enhanced_strftime
Fixes #1395: extends pelican.utils.strftime with `-` prefix to strip leading zeros
2014-08-28 11:17:28 -07:00
Deniz Turgut
7c3cc8fc0d Fixes #1395: extends pelican.utils.strftime with - prefix to strip leading zeros
Adds the ability to use `-` prefix with C89 format codes to strip any
leading zeros.
2014-07-16 03:40:53 -04:00
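One way to get this behavior portably is to expand each format code separately and strip zeros only where the `-` prefix is present. The sketch below illustrates that approach under stated assumptions; it is not the implementation in `pelican.utils`, and `strftime_strip_zeros` is a hypothetical name.

```python
import re
from datetime import date


def strftime_strip_zeros(value, fmt):
    """Format `value`, treating `%-X` as `%X` with leading zeros removed."""

    def render(match):
        strip, code = match.groups()
        if code == "%":
            return "%"  # literal percent sign ("%%")
        text = value.strftime("%" + code)
        if strip:
            text = text.lstrip("0") or "0"  # keep a lone zero
        return text

    return re.sub(r"%(-?)([A-Za-z%])", render, fmt)


print(strftime_strip_zeros(date(2014, 7, 6), "%-d/%-m/%Y"))  # 6/7/2014
```

Handling the prefix in Python avoids relying on the platform C library, where `%-d` is an extension that is not available everywhere.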
OGINO Masanori
2c50ccb764 Add timezone to datetime objects. Refs #962.
Based on https://github.com/getpelican/pelican/pull/977, but it adds
timezone information before formatting.

Signed-off-by: OGINO Masanori <masanori.ogino@gmail.com>
2014-07-04 01:23:57 +09:00
Ondrej Grover
3f6b130d6e Fix #1198, enable custom locale in template rendering, fixes links
reverts getpelican/pelican@ddcccfeaa9

If one used a locale that made use of unicode characters (like fr_FR.UTF-8)
the files on disk would be in correct locale while links would be to C.

Uses a SafeDatetime class that works with unicode format strings
by using custom strftime to prevent ascii decoding errors with Python2.

Also added unicode decoding for the calendar module to fix period
archives.
2014-06-26 00:00:19 -04:00
Antoine Brenner
fd7cb9e213 Test to reproduce an issue that occurs with python3.3 under macos10 only
This test passes fine under linux
2014-04-15 22:01:20 +02:00
Antoine Brenner
7277c95fb5 Make sure locale is what we want before/after the tests
The locale is a global state, and it was not properly reset to
whatever it was before the unittest possibly changed it.
This is now fixed.

Not restoring the locale led to weird issues: depending on
the order chosen by "python -m unittest discover" to run
the unit tests, some tests would apparently randomly fail
due to the locale not being what was expected.

For example, test_period_in_timeperiod_archive would
call mock('posts/1970/ 1月/index.html',...) instead of
expected mock('posts/1970/Jan/index.html',...) and fail.
2014-04-15 16:45:45 +02:00
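The usual pattern for this kind of fix is to save the locale in `setUp` and restore it in `tearDown`. The test case below is a sketch of that pattern only; the class and test names are hypothetical and do not come from Pelican's test suite.

```python
import locale
import unittest
from datetime import date


class LocaleSafeTestCase(unittest.TestCase):
    def setUp(self):
        # Calling setlocale() without a value returns the current setting.
        self._old_locale = locale.setlocale(locale.LC_ALL)
        locale.setlocale(locale.LC_ALL, "C")

    def tearDown(self):
        # Restore the global state so later tests see what they expect.
        locale.setlocale(locale.LC_ALL, self._old_locale)

    def test_month_abbreviation(self):
        self.assertEqual(date(1970, 1, 1).strftime("%b"), "Jan")
```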
Stefan hr Berder
7f2bc2a23b change date metadata parsing to dateutil.parser 2014-02-23 11:21:44 +01:00
Justin Mayer
dcadf33988 Merge pull request #1183 from Rogdham/pelican-fixcopy
Fix `utils.copy` for copying files
2014-02-05 08:19:34 -08:00
Alistair Magee
dc552bb869 fix test-suite import error 2014-01-10 16:33:02 +00:00
Rogdham
fd7fc2e202 Simplify usage of utils.copy
Remove confusing parameters, clarify usage in __doc__
2013-12-07 21:11:15 +01:00
Rogdham
7da0506f2d Fix utils.copy for copying files, add unit tests
`copy('', 'a/b.ext0', 'c/d.ext1')` is copying `a/b.ext0` into `c/d.ext1/b.ext0`
(creating folder `c/d.ext1` in the process) instead of `c/d.ext1`.
Bug introduced by e03cf3f517.
2013-12-07 20:58:19 +01:00
Simon Conseil
cfe72c2736 Disable asciidoc files for tests 2013-08-06 23:42:41 +02:00
Simon Conseil
4bc4b1500c Refactor readers and remove MARKUP
Add a `Readers` class which contains a dict of file extensions / `Reader`
instances. This dict can be overwritten with a `READERS` settings, for instance
to avoid processing *.html files:

    READERS = {'html': None}

Or to add a custom reader for the `foo` extension:

    READERS = {'foo': FooReader}

This dict no longer stores the Reader classes, as was done before with
`EXTENSIONS`. It stores instances of the Reader classes to avoid
instantiating a reader for each file read.
2013-08-06 23:42:41 +02:00
Andy Pearce
39518e15ef Allow text substitutions when generating slugs
The `slugify()` function used by Pelican is in general very good at
coming up with something both readable and URL-safe. However, there are
a few specific cases where it causes conflicts. One that I've run into
is using the strings `C++` and `C` as tags, both of which transform to
the slug `c`. This commit adds an optional `SLUG_SUBSTITUTIONS` setting
which is a list of 2-tuples of substitutions to be carried out
case-insensitively just prior to stripping out non-alphanumeric
characters. This allows cases like `C++` to be transformed to `CPP` or
similar. This can also improve the readability of slugs.
2013-07-04 12:17:21 +01:00
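The collision and the fix can be sketched in a few lines. This is a simplified stand-in for `slugify()`, not Pelican's code: the regular expressions and the `substitutions` parameter name are assumptions made for illustration.

```python
import re


def slugify(value, substitutions=()):
    """Simplified slugify with case-insensitive pre-substitutions."""
    for src, dst in substitutions:
        # A function replacement keeps `dst` literal (no backslash escapes).
        value = re.sub(re.escape(src), lambda m, dst=dst: dst, value,
                       flags=re.IGNORECASE)
    # Strip characters that are not word characters, whitespace or hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into single hyphens.
    return re.sub(r"[-\s]+", "-", value)


print(slugify("C++"), slugify("C"))           # c c
print(slugify("C++", [("C++", "CPP")]))       # cpp
```

Because the substitution runs before the `+` characters are stripped, `C++` and `C` end up with distinct slugs.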
Justin Mayer
6f36b0a246 Keep certain files when cleaning output; fix #574
If DELETE_OUTPUT_DIRECTORY is set to True, all files and directories are
deleted from the output directory. There are, however, several reasons
one might want to retain certain files/directories and avoid their
deletion from the output directory. One such use case is version control
system data: a versioned output directory can facilitate deployment via
Heroku and/or allow the user to easily revert to a prior version of the
site without having to rely on regeneration via Pelican.

This change introduces the OUTPUT_RETENTION setting, a tuple of
filenames that will be preserved when the clean_output_dir function in
pelican.utils is run. Setting OUTPUT_RETENTION = (".hg", ".git") would,
for example, prevent the relevant VCS data from being deleted when the
output directory is cleaned.
2013-06-25 19:03:32 -07:00
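A rough sketch of how such a retention check might look. It only mirrors the behavior described in the message; the signature of `clean_output_dir` here is an assumption and the real function in `pelican.utils` may differ.

```python
import os
import shutil


def clean_output_dir(path, retention=()):
    """Delete everything directly under `path` except names in `retention`."""
    for name in os.listdir(path):
        if name in retention:
            continue  # e.g. ".git" or ".hg" stays untouched
        target = os.path.join(path, name)
        if os.path.isdir(target) and not os.path.islink(target):
            shutil.rmtree(target)
        else:
            os.remove(target)
```

Calling `clean_output_dir("output", retention=(".hg", ".git"))` would then leave the VCS data in place while removing the generated site.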
Justin Mayer
6fcb6d3766 Merge pull request #928 from wking/iso-8601
utils: Add some ISO 8601 forms to get_date()
2013-06-14 08:27:37 -07:00
W. Trevor King
5a61600bc9 tests: Avoid hidden logic with better .assert*() method choices
We'll get better failure messages if we use an assertion method that
understands the comparison we're trying to make.  If you make the
comparison by hand and assertTrue(), you don't get much constructive
feedback ;).
2013-06-12 14:52:23 -04:00
W. Trevor King
1102143c33 utils: Use pytz instead of datetime.timezone for timezones
datetime.timezone is new in Python 3.2 [1], so pytz allows us to keep
support for Python 2.7.

[1]: http://docs.python.org/dev/library/datetime.html#datetime.timezone
2013-06-11 22:54:01 -04:00
W. Trevor King
228fc82fc9 utils: Add some ISO 8601 forms to get_date()
Support the forms listed by the W3C [1].  I also removed the
'%Y-%d-%m' form, which can be confused with the '%Y-%m-%d' ISO form.
The new ISO forms can use 'Z' to designate UTC or '[+-]HHMM' to
specify offsets from UTC.  Other time zone designators are not
supported.

The '%z' directive has only been supported since Python 3.2 [2], so if
you're running Pelican on Python 2.7, you're stuck with 'Z' for UTC.
Conveniently, we get ValueErrors for both invalid directives and
data/format mismatches, so we don't need special handling for the 2.7
case inside get_date().

[1]: http://www.w3.org/TR/NOTE-datetime
[2]: http://bugs.python.org/issue6641
2013-06-11 22:53:21 -04:00
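The "try each format, rely on ValueError" approach described above can be sketched as follows. The format list is an illustrative subset of the W3C forms and the function body is an assumption, not the code from this commit; it targets Python 3.7+, where `%z` also accepts `Z`.

```python
from datetime import datetime

# Illustrative subset of the W3C date-time forms.
ISO_FORMATS = [
    "%Y",
    "%Y-%m",
    "%Y-%m-%d",
    "%Y-%m-%dT%H:%M%z",
    "%Y-%m-%dT%H:%M:%S%z",
]


def get_date(string):
    """Return a datetime parsed from the first format that matches."""
    string = string.strip()
    for fmt in ISO_FORMATS:
        try:
            return datetime.strptime(string, fmt)
        except ValueError:
            continue  # wrong format for this input; try the next one
    raise ValueError(f"{string!r} is not a valid date")
```

Since `strptime` must consume the whole string, at most one of these formats can match a given input, so the order of the list does not change the result.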
Deniz Turgut
09a332aff3 reset locale after DateFormatter test 2013-04-22 20:07:53 -04:00
Deniz Turgut
62a9b05595 added tests for DateFormatter 2013-04-22 19:54:52 -04:00