forked from github/pelican
pelican-theme/pelican
Martin (mart-e) 219c01afb0 [IMP] pelican_import with gmf instead of markdown
The `markdown` format that pandoc produces is pandoc's own flavour of
Markdown. For instance, it uses fenced divs[1], which python-markdown
does not support. When importing content from WordPress, this causes
several issues, as explained in discussion 3113[2].
This change follows a discussion with a pandoc developer[3].

[1] https://pandoc.org/MANUAL.html#divs-and-spans
[2] https://github.com/getpelican/pelican/discussions/3113
[3] https://fosstodon.org/@pandoc/110105559949588768
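
As a rough illustration of the change, the sketch below builds a pandoc command line that targets GitHub-Flavored Markdown instead of pandoc's own `markdown` writer. It assumes that "gmf" in the commit title refers to pandoc's `gfm` format. The helper names `build_pandoc_command` and `convert`, and the file names, are hypothetical and are not taken from `pelican_import`; only the pandoc options `--from`, `--to`, `--wrap` and `-o` are real.

```python
import subprocess


def build_pandoc_command(html_path, out_path, to_format="gfm"):
    """Build the argv for converting one HTML file to Markdown with pandoc.

    Hypothetical helper for illustration; not the pelican_import code.
    """
    return [
        "pandoc",
        "--wrap=none",        # do not re-wrap long lines
        "--from=html",
        f"--to={to_format}",  # assumed: "gfm" instead of pandoc's "markdown"
        "-o", out_path,
        html_path,
    ]


def convert(html_path, out_path, to_format="gfm"):
    """Run pandoc; requires the pandoc executable on PATH."""
    subprocess.run(build_pandoc_command(html_path, out_path, to_format), check=True)
```

Passing `to_format="markdown"` reproduces the older target format, which makes it easy to compare the two outputs on the same HTML file.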

Take the following WordPress blog post sample:
```html
<p><!-- wp:paragraph --></p>
<p>Paragraph content</p>
<p><!-- /wp:paragraph --></p>
<p><!-- wp:image {"align":"center","id":3747,"sizeSlug":"full"} --></p>
<div class="wp-block-image">
<figure class="aligncenter size-full"><img src="https://test.com/test.jpg" alt="" class="wp-image-3747" title="Some title"/><br />
<figcaption><em>Some caption</em></figcaption>
</figure>
</div>
<p><!-- /wp:image --></p>
```
Before this commit, it was imported as:

```md
`<!-- wp:paragraph -->`{=html}

Paragraph content

`<!-- /wp:paragraph -->`{=html}

`<!-- wp:image {"align":"center","id":3747,"sizeSlug":"full"} -->`{=html}

::: wp-block-image
<figure class="aligncenter size-full">
<img src="https://test.com/test.jpg" title="Some title"
class="wp-image-3747" /><br />

<figcaption><em>Some caption</em></figcaption>
</figure>
:::

`<!-- /wp:image -->`{=html}
```

After this change, it is imported as:
```md
<!-- wp:paragraph -->

Paragraph content

<!-- /wp:paragraph -->

<!-- wp:image {"align":"center","id":3747,"sizeSlug":"full"} -->

<div class="wp-block-image">

<figure class="aligncenter size-full">
<img src="https://test.com/test.jpg" title="Some title"
class="wp-image-3747" /><br />

<figcaption><em>Some caption</em></figcaption>
</figure>

</div>

<!-- /wp:image -->
```
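
To check already-imported files for the pandoc-specific constructs visible in the first output, a small scan along these lines may help. It is only a sketch: the function name `find_pandoc_only_syntax` and the two regular expressions are assumptions, and they cover just fenced divs (`:::`) and raw-attribute spans (`{=html}`), not every pandoc extension.

```python
import re

# Lines such as "::: wp-block-image" or a bare ":::" that open or close a fenced div.
FENCED_DIV = re.compile(r"^:::+(\s+\S.*)?\s*$")
# Inline raw spans such as `<!-- wp:paragraph -->`{=html}.
RAW_ATTRIBUTE = re.compile(r"`[^`]*`\{=\w+\}")


def find_pandoc_only_syntax(text):
    """Return (line_number, kind) pairs for pandoc-specific constructs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if FENCED_DIV.match(line):
            hits.append((number, "fenced div"))
        elif RAW_ATTRIBUTE.search(line):
            hits.append((number, "raw attribute"))
    return hits
```

An empty result for a file suggests that python-markdown will not encounter either of these two constructs in it.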

Fixes #3113
2023-03-29 14:07:23 +02:00

| Name | Last commit | Date |
| --- | --- | --- |
| plugins | Stringify plugin definitions so they can be pickled during caching (#2835) | 2021-01-04 17:13:32 +01:00 |
| tests | [IMP] pelican_import with gmf instead of markdown | 2023-03-29 14:07:23 +02:00 |
| themes | Remove unnecessary ids and classes in simple theme | 2022-02-20 10:29:46 +08:00 |
| tools | [IMP] pelican_import with gmf instead of markdown | 2023-03-29 14:07:23 +02:00 |
| `__init__.py` | Fix #2938 | 2022-02-09 06:05:50 -07:00 |
| `__main__.py` | Initial pass of removing Python 2 support | 2019-11-26 06:16:41 +09:00 |
| cache.py | Fix typo in cache.py (#2978) | 2022-02-09 06:15:59 -07:00 |
| contents.py | Use (?P=) to replace \2 for intrasite link | 2022-10-24 18:05:40 -07:00 |
| generators.py | typo fix | 2021-06-15 22:41:38 -06:00 |
| log.py | Move rich's console to log.py | 2021-07-08 21:33:22 -06:00 |
| paginator.py | Support last page pattern in PAGINATION_PATTERNS | 2021-01-13 11:19:36 +01:00 |
| readers.py | use docutils.Node.findall instead of traverse | 2021-11-25 23:57:09 +03:00 |
| rstdirectives.py | Modernize code base to Python 3+ syntax | 2020-04-27 09:45:31 +02:00 |
| server.py | server: for extension_map, refer to upstream version rather than only overwriting it | 2021-10-07 14:44:36 -06:00 |
| settings.py | Fix #2938 | 2022-02-09 06:05:50 -07:00 |
| signals.py | restore pelican.signals with an explicit ImportError mentioning move | 2020-10-12 14:53:18 +03:00 |
| urlwrappers.py | Modernize code base to Python 3+ syntax | 2020-04-27 09:45:31 +02:00 |
| utils.py | Fix #2982: Improve _HTMLWordTruncator (#3002) | 2022-07-11 19:47:37 +02:00 |
| writers.py | Allow easy subclassing of Writer | 2021-06-08 14:01:32 -06:00 |