forked from github/pelican

111 commits

Author SHA1 Message Date
boxydog
9b77a9027b File fixes for ruff B007, RUF015, PLR1722 2024-05-31 08:48:44 -05:00
boxydog
7577dd7603 More ruff fixes in files: stop ignoring C408, UP007, PLR5501, B006 2024-05-30 13:21:12 -05:00
boxydog
6d8597addb The ruff and ruff-format fixes 2024-05-30 09:08:16 -05:00
boxydog
d6a33f1d21 Medium post importer (from medium export) 2024-01-16 16:56:07 -06:00
Justin Mayer
ecd598f293 Update code base for Python 3.8 and above
Result of: pipx run pyupgrade --py38-plus pelican/**/*.py
2023-11-12 13:53:02 +01:00
Chris Rose
cabdb26cee Apply code style to project via: ruff format . 2023-10-29 22:18:29 +01:00
Deniz Turgut
11c13ceae1
use a tempfile for intermediate html file for pandoc in importer 2023-10-28 16:50:34 +03:00
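A minimal sketch of the temporary-file approach named in this commit. The helper name `write_intermediate_html` is hypothetical and is not taken from the importer; it only illustrates writing the intermediate HTML to a file that pandoc could later read.

```python
import os
import tempfile


def write_intermediate_html(content: str) -> str:
    """Write HTML to a temporary file and return its path (illustrative helper)."""
    # delete=False keeps the file on disk after the handle is closed,
    # so an external program can open it by name.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".html", encoding="utf-8", delete=False
    ) as fp:
        fp.write(content)
    return fp.name


path = write_intermediate_html("<p>Paragraph content</p>")
try:
    with open(path, encoding="utf-8") as fp:
        print(fp.read())
finally:
    # The caller is responsible for cleaning up the intermediate file.
    os.remove(path)
```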
Deniz Turgut
83a8059d02
force timestamp conversion in tumblr importer to be UTC with offset and adjust tests 2023-10-28 16:50:34 +03:00
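The UTC conversion described in this commit can be sketched as follows. The function name and the output format string are assumptions chosen for illustration, not the importer's actual code.

```python
from datetime import datetime, timezone


def timestamp_to_utc_string(timestamp: int) -> str:
    """Format a Unix timestamp as a UTC date string with an explicit offset."""
    # Passing tz=timezone.utc makes the result independent of the
    # local timezone of the machine running the import.
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S%z")


print(timestamp_to_utc_string(0))
```

Without the `tz` argument, `datetime.fromtimestamp` returns a naive local time, which makes test results depend on where the tests run.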
boxydog
9c87d8f3a3
Deal with broken embedded video links when importing from Tumblr (#3218)
Co-authored-by: boxydog <boxydog@users.noreply.github.com>
Co-authored-by: Will Thong <will@willthong.com>
2023-10-28 12:56:00 +02:00
boxydog
1404a2dbc3
Remove newline when importing Tumblr post photos (#3215)
Co-authored-by: Dan Frankowski <dfrankow@gmail.com>
2023-10-27 21:56:34 +02:00
Justin Mayer
777a708ef7
Merge pull request #3198 from mart-e/remove-posterous 2023-10-24 10:22:19 +02:00
Martin Trigaux
5d8c03108b Remove Posterous integration
Posterous closed down in 2013.
The API is no longer accessible, and the code did not work in Python 3
(base64.encodestring was expecting bytes, not string)
2023-10-06 09:34:26 +02:00
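To illustrate the bytes-versus-string problem mentioned in this commit: in Python 3 the `base64` functions operate on bytes, so text has to be encoded first. The sketch below assumes, purely for illustration, that the value being encoded is a credentials string for an HTTP Basic header; the function name is hypothetical.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic authorization value (illustrative helper)."""
    # base64 functions expect bytes in Python 3, so encode the text first.
    credentials = f"{user}:{password}".encode("utf-8")
    # b64encode returns bytes; decode to get a str usable in a header.
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


print(basic_auth_header("user", "pass"))
```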
Martin (mart-e)
48166bd687 Convert WordPress caption to figure
In WordPress, inserting an image with a caption can look like:

[caption id="attachment_42" caption="Image Description"]<a ...><img ... /></a>[/caption]
[caption id="attachment_42"]<a ...><img ... /></a> Image Description[/caption]
[caption id="attachment_42"]<img ... > Image Description[/caption]

Replace these with an HTML figure tag.
2023-10-03 11:45:31 +02:00
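The three caption shapes listed in this commit could be handled with a regex-based sketch like the one below. This is not the importer's implementation: the pattern names and `caption_to_figure` are hypothetical, and the patterns assume simple attribute values that contain no `>` or `]` characters.

```python
import re

# Whole shortcode: [caption ...]body[/caption]
CAPTION_RE = re.compile(
    r"\[caption(?P<attrs>[^\]]*)\](?P<body>.*?)\[/caption\]", re.DOTALL
)
# Optional caption="..." attribute inside the opening shortcode
CAPTION_ATTR_RE = re.compile(r'caption="(?P<text>[^"]*)"')
# An image, optionally wrapped in a link, followed by trailing caption text
IMAGE_RE = re.compile(
    r"\s*(?P<image>(?:<a\b[^>]*>\s*)?<img\b[^>]*>(?:\s*</a>)?)(?P<rest>.*)",
    re.DOTALL,
)


def caption_to_figure(html: str) -> str:
    """Replace WordPress caption shortcodes with <figure> elements."""

    def replace(match: re.Match) -> str:
        image_match = IMAGE_RE.match(match.group("body"))
        if image_match is None:
            # Leave shortcodes we do not recognize untouched.
            return match.group(0)
        attr_match = CAPTION_ATTR_RE.search(match.group("attrs"))
        # Prefer the caption attribute; otherwise use the text after the image.
        text = attr_match.group("text") if attr_match else image_match.group("rest")
        image = image_match.group("image")
        return f"<figure>{image}<figcaption>{text.strip()}</figcaption></figure>"

    return CAPTION_RE.sub(replace, html)


print(caption_to_figure('[caption id="attachment_42"]<img src="y"> Image Description[/caption]'))
```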
Justin Mayer
23c50ea885
Merge pull request #3115 from mart-e/import-config-file 2023-06-04 10:36:55 +02:00
Martin (mart-e)
ef844dbe0a Use the default configuration
When importing a blog, an error is logged: 'No timezone information
specified in the settings'.
This is because the code calls read_settings() but no configuration
file is provided.
Instead of providing one (users may not already have one if they are
at the import step), use the default settings.
2023-05-14 08:50:06 +02:00
Martin (mart-e)
219c01afb0 [IMP] pelican_import with gfm instead of markdown
Pandoc's markdown output is its own flavour of Markdown. For instance,
it uses fenced divs[1], which are not supported by python-markdown.
When importing content from WordPress, there are several issues, as
explained in discussion 3113[2].
This change follows a discussion with a pandoc developer[3].

[1] https://pandoc.org/MANUAL.html#divs-and-spans
[2] https://github.com/getpelican/pelican/discussions/3113
[3] https://fosstodon.org/@pandoc/110105559949588768

Take the following WordPress blog post sample:
```html
<p><!-- wp:paragraph --></p>
<p>Paragraph content</p>
<p><!-- /wp:paragraph --></p>
<p><!-- wp:image {"align":"center","id":3747,"sizeSlug":"full"} --></p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img src="https://test.com/test.jpg" alt="" class="wp-image-3747" title="Some title"/><br />
<figcaption><em>Some caption</em></figcaption>
</figure>
</div>
<p><!-- /wp:image --></p>
```
Before this commit, it was imported as:

```md
`<!-- wp:paragraph -->`{=html}

Paragraph content

`<!-- /wp:paragraph -->`{=html}

`<!-- wp:image {"align":"center","id":3747,"sizeSlug":"full"} -->`{=html}

::: wp-block-image
<figure class="aligncenter size-full">
<img src="https://test.com/test.jpg" title="Some title"
class="wp-image-3747" /><br />

<figcaption><em>Some caption</em></figcaption>
</figure>
:::

`<!-- /wp:image -->`{=html}
```

After this change:
```md
<!-- wp:paragraph -->

Paragraph content

<!-- /wp:paragraph -->

<!-- wp:image {"align":"center","id":3747,"sizeSlug":"full"} -->

<div class="wp-block-image">

<figure class="aligncenter size-full">
<img src="https://test.com/test.jpg" title="Some title"
class="wp-image-3747" /><br />

<figcaption><em>Some caption</em></figcaption>
</figure>

</div>

<!-- /wp:image -->
```

Fixes #3113
2023-03-29 14:07:23 +02:00
Justin Mayer
21e855a29f Adjust code style for Flake8 5.0+
We are pinned to Flake8 <4.0, but at least we'll be compliant if we ever
upgrade to Flake8 5.0+.
2022-08-01 13:24:21 +02:00
Deniz Turgut
2e482b207b
Fix Windows tests
* Unskip passable tests
* Fix broken tests
2020-05-09 16:17:14 +03:00
Kernc
b8f7c584c5
Fix error strings whitespace 2020-04-29 18:08:38 +02:00
Justin Mayer
d43b786b30 Modernize code base to Python 3+ syntax
Replaces syntax that was relevant in earlier Python versions but that
now has modernized equivalents.
2020-04-27 09:45:31 +02:00
Justin Mayer
fc031174bb Flake8 fix 2020-04-16 08:10:30 +02:00
Justin Mayer
86ff02541f Fix building asciidoc headers in importer & add docs 2020-04-16 08:01:10 +02:00
Tim Janik
5365a1cdb3 PELICAN: pelican_import.py: add support for pelican-import -m asciidoc
Signed-off-by: Tim Janik <timj@gnu.org>
2020-04-16 07:48:04 +02:00
Justin Mayer
8ba00dd9f1 Preserve category case in importer
Adds a `preserve_case` parameter to the `slugify()` function and uses it
to preserve capital letters in category names when using the Pelican
importer.
2020-04-15 20:42:21 +02:00
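A minimal sketch of what a `preserve_case` switch on a slug function might look like. This is a simplified stand-in written for illustration, not Pelican's `slugify()`; the name `slugify_sketch` and the exact normalization steps are assumptions.

```python
import re
import unicodedata


def slugify_sketch(value: str, preserve_case: bool = False) -> str:
    """Turn text into a URL-friendly slug, optionally keeping capital letters."""
    # Decompose accented characters and drop anything outside ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove characters that are not word characters, whitespace, or hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip()
    if not preserve_case:
        value = value.lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", value)


print(slugify_sketch("My Category"))
print(slugify_sketch("My Category", preserve_case=True))
```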
Kurt McKee
7bbd3dc6fb
Update links to HTTPS and current 301 redirects in docs/templates/themes (#2661)
This also updates the Tumblr API to use HTTPS as documented in the
current Tumblr API docs.
2020-04-12 16:38:35 +02:00
Deniz Turgut
49bc6ed47f Further remove python2-isms 2019-11-26 06:17:04 +09:00
Kevin Yap
1e0e541b57 Initial pass of removing Python 2 support
This commit removes Six as a dependency for Pelican, replacing the
relevant aliases with the proper Python 3 imports. It also removes
references to Python 2 logic that did not require Six.
2019-11-26 06:16:41 +09:00
Justin Mayer
d9e98a5a39
Merge pull request #2514 from rask004/fix-2487
Fix pelican-import error regarding wp-attach and Unicode
2019-03-07 21:13:37 +01:00
John Franey
63a72fc619 Remove Python 3.4 references
This PR removes the Python 3.4 tox task and updates references in the
code to Python 3.5+.

tox complains about Python 3.4, which is EOL after next month:

> py34 installed: DEPRECATION: Python 3.4 support has been deprecated. pip 19.1 will be the last one supporting it. Please upgrade your Python as Python 3.4 won't be maintained after March 2019 (cf PEP 429).
2019-02-06 10:23:27 -04:00
Roland Askew
2aebfd1cdc fix pelican-import error regarding wp-attach and Unicode 2019-01-12 12:44:04 +13:00
Oliver Urs Lenz
3cdf4fd410 reverts bug involving strftime accidentally introduced in feed importer 2019-01-05 19:19:46 +01:00
Stuart Axon
a597a31dad Make the blogger tests consistent with the wp ones - cast
to list in test if needed.
2018-11-26 16:58:12 +00:00
Stuart Axon
ded234467d Update pelican_import.py
pelican-import: Move pandoc check inside loop, fixing #2448
2018-11-26 16:37:10 +00:00
Justin Mayer
3596e04639
Merge pull request #2452 from stuaxo/patch-6
Importer: Avoid downloading duplicate post attachments
2018-11-26 08:12:54 -08:00
Stuart Axon
4d1869002e Update pelican_import.py
Use a set to avoid downloading duplicate attachments on a post more than once.
2018-11-17 15:02:31 +00:00
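The set-based deduplication described in this commit can be sketched like this. The function name and the `(post_name, url)` input shape are assumptions made for the example, not the importer's data structures.

```python
def collect_attachment_urls(pairs):
    """Map each post name to the unique attachment URLs found for it."""
    attachments = {}
    for post_name, url in pairs:
        # A set silently ignores a URL that was already recorded for the post,
        # so each attachment is queued for download only once.
        attachments.setdefault(post_name, set()).add(url)
    return attachments


found = collect_attachment_urls(
    [
        ("first-post", "https://test.com/test.jpg"),
        ("first-post", "https://test.com/test.jpg"),
        ("first-post", "https://test.com/other.jpg"),
    ]
)
print(len(found["first-post"]))
```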
Stuart Axon
033d6ac4d6 pelican_import wordpress import
get_filename:
   use "post_name", where the parameter is the postname.
   fixup names that are entirely made up of spaces.
2018-11-15 14:02:10 +00:00
Oliver Urs Lenz
048ea4dc0c automatically copy linked static files 2018-11-01 18:08:11 +01:00
Oliver Urs Lenz
5199fa51ea control slug substitutions from settings with regex 2018-10-31 16:20:21 +01:00
Justin Mayer
e9b654bbaa
Merge pull request #2395 from oulenz/import_from_blogger
Add Blogger XML backup importer
2018-08-08 09:45:52 +02:00
Oliver Urs Lenz
f3e95cf473 importer: don't wrap, because it breaks html attributes 2018-08-07 17:16:23 +02:00
Oliver Urs Lenz
c388f14d3e add blogger importer 2018-08-07 14:33:10 +02:00
David Alfonso
150d1f05d0 Add pandoc2 support to pelican-import. Fix #2255
Specific options are passed to pandoc2 to get results similar to those
with pandoc1:

- Disable smart quotes from the markdown output.

- Enable raw parsing from html.
2018-08-03 19:44:50 +02:00
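The two options listed in this commit correspond to pandoc's `-smart` output extension and `+raw_html` input extension. The sketch below shows one way a version-dependent argument list might be assembled; the function name and parameters are hypothetical, and the pandoc 1 branch is reduced to the `--parse-raw` flag for brevity.

```python
def build_pandoc_args(pandoc_major, out_markup, out_file, html_file):
    """Assemble a pandoc argument list for an HTML conversion (illustrative)."""
    if pandoc_major < 2:
        # pandoc 1.x kept raw HTML only when asked to via --parse-raw.
        args = ["pandoc", "--parse-raw", "--from=html", f"--to={out_markup}"]
    else:
        # pandoc 2.x: enable raw HTML parsing on the reader and
        # disable smart quotes on the writer.
        args = ["pandoc", "--from=html+raw_html", f"--to={out_markup}-smart"]
    return args + ["-o", out_file, html_file]


print(build_pandoc_args(2, "markdown", "post.md", "post.html"))
```

Returning a list rather than a formatted string lets the caller pass it to `subprocess.run` without shell quoting concerns.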
Oliver Urs Lenz
a5571ba1d5 importer: update links to attachments if --wp-attach 2018-07-09 11:26:50 +02:00
David Alfonso
e44c4aba36 Add missing wordpress options to importer doc 2018-06-22 22:36:43 +02:00
Justin Mayer
34103cd5dd
Merge pull request #2251 from fgallaire/content
Change imported content directory name to "content". Fixes #2250
2017-11-12 08:10:23 -08:00
Florent Gallaire
a091a4b8b9 Change pelican-import output directory default name to "content" (fix #2250) 2017-11-09 19:23:38 +01:00
Justin Mayer
8ebc120f36 Align import style with flake8-import-order 0.15
Addresses: https://github.com/PyCQA/flake8-import-order/issues/120

Refs #2246
2017-11-07 04:18:03 -08:00
Stuart Axon
012d034cba Check for 0 dates in pelican-import
Check for 0 dates.

For my own blog this means it doesn't break during import, but I don't know pelican well enough yet to say if this is correct or not.

Error that I was getting before applying this fix
```
Traceback (most recent call last):
  File "/mnt/data/home/stu/.virtualenvs/blog/bin/pelican-import", line 11, in <module>
    sys.exit(main())
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 896, in main
    attachments=attachments or None)
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 684, in fields2pelican
    kind, in_markup) in fields:
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 163, in wp2fields
    date_object = time.strptime(raw_date, '%Y-%m-%d %H:%M:%S')
  File "/usr/lib/python2.7/_strptime.py", line 478, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
    (data_string, format))
ValueError: time data u'0000-00-00 00:00:00' does not match format u'%Y-%m-%d %H:%M:%S'

```
2017-10-02 22:05:42 +01:00
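A minimal sketch of a guard against the zero date shown in the traceback above. The helper name and the choice to return `None` are assumptions; the commit itself does not say how the importer treats such posts.

```python
import time

ZERO_DATE = "0000-00-00 00:00:00"


def parse_wp_date(raw_date):
    """Parse a WordPress date string, returning None for the all-zero date."""
    # time.strptime raises ValueError on the zero date, so check for it first.
    if raw_date == ZERO_DATE:
        return None
    return time.strptime(raw_date, "%Y-%m-%d %H:%M:%S")


print(parse_wp_date(ZERO_DATE))
print(parse_wp_date("2017-10-02 22:05:42").tm_year)
```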
derwinlu
623eb0a4c0 Fix more Python 3.6 regex DeprecationWarnings 2017-03-29 10:19:47 +02:00
Justin Mayer
ca389e70e1 Merge pull request #1753 from ingwinlu/flake8
Make Pelican codebase compliant with PEP8
2015-08-24 19:27:25 -07:00