.. _import:

Importing an existing site
##########################

Description
===========

``pelican-import`` is a command-line tool for converting articles from other
software to reStructuredText or Markdown. The supported import formats are:

- Blogger XML export
- Dotclear export
- Posterous API
- Tumblr API
- WordPress XML export
- RSS/Atom feed

The conversion from HTML to reStructuredText or Markdown relies on `Pandoc`_.
For Dotclear, if the source posts are written with Markdown syntax, they will
not be converted (as Pelican also supports Markdown).

.. note::

   Unlike Pelican, WordPress supports multiple categories per article. These
   are imported as a comma-separated string. You have to resolve these
   manually, or use a plugin that enables multiple categories per article
   (like `more_categories`_).

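One manual way to resolve an imported comma-separated category string is to
split it and keep a single category. The following is a minimal sketch, not
part of Pelican: the function names are hypothetical, and it assumes that no
category name itself contains a comma.

```python
def split_categories(raw):
    """Split an imported comma-separated category string into clean names."""
    # Trim whitespace around each name and skip empty entries such as "a,,b".
    return [name.strip() for name in raw.split(",") if name.strip()]


def primary_category(raw):
    """Return the first listed category, or None if the string is empty."""
    names = split_categories(raw)
    return names[0] if names else None
```

Keeping the first category is only one possible policy; the remaining names
could, for example, be moved into the article's tags instead.
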
Dependencies
============

``pelican-import`` has some dependencies not required by the rest of Pelican:

- *BeautifulSoup4* and *lxml*, for WordPress and Dotclear import. Can be
  installed like any other Python package (``pip install BeautifulSoup4
  lxml``).
- *Feedparser*, for feed import (``pip install feedparser``).
- *Pandoc*, see the `Pandoc site`_ for installation instructions on your
  operating system.

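Before running an import, it can help to check which of these dependencies
are available. The sketch below is an illustration only and is not shipped
with Pelican: the helper name ``missing_dependencies`` is hypothetical, and it
assumes that *BeautifulSoup4* is imported as ``bs4`` and that the Pandoc
executable is named ``pandoc``.

```python
import importlib.util
import shutil

# Maps each package name from the list above to its import name.
PYTHON_MODULES = {"BeautifulSoup4": "bs4", "lxml": "lxml", "feedparser": "feedparser"}


def missing_dependencies(modules=PYTHON_MODULES, executables=("pandoc",)):
    """Return the names of the dependencies that cannot be found."""
    # A package counts as missing if its top-level module cannot be located.
    missing = [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]
    # An executable counts as missing if it is not on the PATH.
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing
```

Calling ``missing_dependencies()`` returns an empty list when everything is
installed; otherwise it lists the names to install.
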
.. _Pandoc: https://pandoc.org/
.. _Pandoc site: https://pandoc.org/installing.html


Usage
=====

::

    pelican-import [-h] [--blogger] [--dotclear] [--posterous] [--tumblr] [--wpfile] [--feed]
                   [-o OUTPUT] [-m MARKUP] [--dir-cat] [--dir-page] [--strip-raw] [--wp-custpost]
                   [--wp-attach] [--disable-slugs] [-e EMAIL] [-p PASSWORD] [-b BLOGNAME]
                   input|api_token|api_key

Positional arguments
--------------------
=============   ============================================================================
``input``       The input file to read
``api_token``   (Posterous only) api_token can be obtained from http://posterous.com/api/
``api_key``     (Tumblr only) api_key can be obtained from https://www.tumblr.com/oauth/apps
=============   ============================================================================

Optional arguments
------------------

  -h, --help            Show this help message and exit
  --blogger             Blogger XML export (default: False)
  --dotclear            Dotclear export (default: False)
  --posterous           Posterous API (default: False)
  --tumblr              Tumblr API (default: False)
  --wpfile              WordPress XML export (default: False)
  --feed                Feed to parse (default: False)
  -o OUTPUT, --output OUTPUT
                        Output path (default: content)
  -m MARKUP, --markup MARKUP
                        Output markup format (supports rst & markdown)
                        (default: rst)
  --dir-cat             Put files in directories named after their category
                        (default: False)
  --dir-page            Put files recognised as pages in "pages/"
                        sub-directory (Blogger and WordPress import only)
                        (default: False)
  --filter-author       Import only posts from the specified author
  --strip-raw           Strip raw HTML code that can't be converted to markup
                        such as flash embeds or iframes (WordPress import
                        only) (default: False)
  --wp-custpost         Put WordPress custom post types in directories. If
                        used with --dir-cat option directories will be created
                        as "/post_type/category/" (WordPress import only)
  --wp-attach           Download files uploaded to WordPress as attachments.
                        Files will be added to posts as a list in the post
                        header and links to the files within the post will be
                        updated. All files will be downloaded, even if they
                        aren't associated with a post. Files will be downloaded
                        with their original path inside the output directory,
                        e.g. "output/wp-uploads/date/postname/file.jpg".
                        (WordPress import only) (requires an internet
                        connection)
  --disable-slugs       Disable storing slugs from imported posts within
                        output. With this disabled, your Pelican URLs may not
                        be consistent with your original posts. (default:
                        False)
  -e EMAIL, --email=EMAIL
                        Email used to authenticate Posterous API
  -p PASSWORD, --password=PASSWORD
                        Password used to authenticate Posterous API
  -b BLOGNAME, --blogname=BLOGNAME
                        Blog name used in Tumblr API


Examples
========

For Blogger::

    $ pelican-import --blogger -o ~/output ~/posts.xml

For Dotclear::

    $ pelican-import --dotclear -o ~/output ~/backup.txt

For Posterous::

    $ pelican-import --posterous -o ~/output --email=<email_address> --password=<password> <api_token>

For Tumblr::

    $ pelican-import --tumblr -o ~/output --blogname=<blogname> <api_key>

For WordPress::

    $ pelican-import --wpfile -o ~/output ~/posts.xml

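When the import is driven from a script, the same command lines can be
assembled as an argument list. The following is a minimal sketch under stated
assumptions: ``build_import_command`` is a hypothetical helper, not a Pelican
API, and it only uses the flags documented above. It builds the command
without running it.

```python
import shlex

# Source flags listed under "Optional arguments".
SOURCE_FLAGS = {"blogger", "dotclear", "posterous", "tumblr", "wpfile", "feed"}


def build_import_command(source, positional, output="content", markup="rst", extra=()):
    """Build an argument list for pelican-import (nothing is executed)."""
    if source not in SOURCE_FLAGS:
        raise ValueError(f"unknown source: {source!r}")
    if markup not in ("rst", "markdown"):
        raise ValueError(f"unsupported markup: {markup!r}")
    # The positional argument (input file, api_token or api_key) goes last.
    return ["pelican-import", f"--{source}", "-o", output, "-m", markup, *extra, positional]


cmd = build_import_command("wpfile", "posts.xml", output="output", markup="markdown")
print(shlex.join(cmd))  # pelican-import --wpfile -o output -m markdown posts.xml
```

The resulting list can be passed to ``subprocess.run(cmd, check=True)``.
Options such as ``--dir-cat`` or ``--blogname=<blogname>`` can be supplied
through ``extra``.
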
Tests
=====

To test the module, one can use sample files:

- for WordPress: https://www.wpbeginner.com/wp-themes/how-to-add-dummy-content-for-theme-development-in-wordpress/
- for Dotclear: http://media.dotaddict.org/tda/downloads/lorem-backup.txt

.. _more_categories: https://github.com/getpelican/pelican-plugins/tree/master/more_categories