71 commits

Author SHA1 Message Date
Oliver Urs Lenz
f3e95cf473 importer: don't wrap, because it breaks html attributes 2018-08-07 17:16:23 +02:00
David Alfonso
150d1f05d0 Add pandoc2 support to pelican-import. Fix #2255
Specific options are passed to pandoc2 to get results similar to those
with pandoc1:

- Disable smart quotes from the markdown output.

- Enable raw parsing from html.
2018-08-03 19:44:50 +02:00
Oliver Urs Lenz
a5571ba1d5 importer: update links to attachments if --wp-attach 2018-07-09 11:26:50 +02:00
David Alfonso
e44c4aba36 Add missing wordpress options to importer doc 2018-06-22 22:36:43 +02:00
Justin Mayer
34103cd5dd
Merge pull request #2251 from fgallaire/content
Change imported content directory name to "content". Fixes #2250
2017-11-12 08:10:23 -08:00
Florent Gallaire
a091a4b8b9 Change pelican-import output directory default name to "content" (fix #2250) 2017-11-09 19:23:38 +01:00
Justin Mayer
8ebc120f36 Align import style with flake8-import-order 0.15
Addresses: https://github.com/PyCQA/flake8-import-order/issues/120

Refs #2246
2017-11-07 04:18:03 -08:00
Stuart Axon
012d034cba Check for 0 dates in pelican-import
Check for 0 dates.

For my own blog this means it doesn't break during import, but I don't know pelican well enough yet to say if this is correct or not.

Error that I was getting before applying this fix
```
Traceback (most recent call last):
  File "/mnt/data/home/stu/.virtualenvs/blog/bin/pelican-import", line 11, in <module>
    sys.exit(main())
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 896, in main
    attachments=attachments or None)
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 684, in fields2pelican
    kind, in_markup) in fields:
  File "/mnt/data/home/stu/.virtualenvs/blog/local/lib/python2.7/site-packages/pelican/tools/pelican_import.py", line 163, in wp2fields
    date_object = time.strptime(raw_date, '%Y-%m-%d %H:%M:%S')
  File "/usr/lib/python2.7/_strptime.py", line 478, in _strptime_time
    return _strptime(data_string, format)[0]
  File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
    (data_string, format))
ValueError: time data u'0000-00-00 00:00:00' does not match format u'%Y-%m-%d %H:%M:%S'

```
2017-10-02 22:05:42 +01:00
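The traceback above comes from `time.strptime` rejecting the all-zero date string. A guard of the following shape avoids it. This is a minimal sketch, not the code from the commit: the helper name `parse_wp_date` and the choice to return `None` for the zero date are assumptions made for illustration.

```python
import time

WP_DATE_FORMAT = '%Y-%m-%d %H:%M:%S'
ZERO_DATE = '0000-00-00 00:00:00'  # the value seen in the traceback


def parse_wp_date(raw_date):
    """Parse an export date, returning None for the all-zero date.

    Hypothetical helper: both the name and the None fallback are
    assumptions, not the importer's actual behaviour.
    """
    if raw_date == ZERO_DATE:
        # strptime would raise ValueError here, so bail out early.
        return None
    return time.strptime(raw_date, WP_DATE_FORMAT)
```

A caller would then need to decide what a missing date means for the post, for example skipping the date field or falling back to another timestamp.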
derwinlu
623eb0a4c0 Fix more python 3.6 regex DeprecationWarning's 2017-03-29 10:19:47 +02:00
Justin Mayer
ca389e70e1 Merge pull request #1753 from ingwinlu/flake8
Make Pelican codebase compliant with PEP8
2015-08-24 19:27:25 -07:00
Julien Vehent
9d57dcf020 Fix calculation of tag count in dotclear import
Upon import of a dotclear backup, `pelican-import` returned this stacktrace:

```
  File "/usr/bin/pelican-import", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python3.4/site-packages/pelican/tools/pelican_import.py", line 809, in main
    attachments = attachments or None)
  File "/usr/lib/python3.4/site-packages/pelican/tools/pelican_import.py", line 621, in fields2pelican
    kind, in_markup) in fields:
  File "/usr/lib/python3.4/site-packages/pelican/tools/pelican_import.py", line 262, in dc2fields
    if int(tag[:1]) == 1:
ValueError: invalid literal for int() with base 10: 'a'
```
2015-08-19 12:28:14 -04:00
derwinlu
8993c55e6e fulfil pep8 standard 2015-08-17 13:34:32 +02:00
Alex Chan
641b3ffa71 Fix capitalisation of WordPress 2015-05-24 13:08:30 +01:00
Justin Mayer
bfbb7d4bb5 Merge pull request #1581 from georgevreilly/win-fixes
Fix Pelican rendering and unit tests on Windows.
2015-02-17 17:06:19 -08:00
Arul
77231e97c0 wordpress importer support for draft article 2015-02-04 16:57:40 +05:30
George V. Reilly
4c25610cd8 Fix Pelican rendering and unit tests on Windows.
* Fix {filename} links on Windows.
  Otherwise '{filename}/foo/bar.jpg' doesn't work
* Clean up relative Posix path handling in contents.
* Use Posix paths in readers
* Environment for Popen must be strs, not unicodes.
* Ignore Git CRLF warnings.
* Replace CRLFs with LFs in inputs on Windows.
* Fix importer tests
* Fix test_contents
* Fix one last backslash in paginated output
* Skip the remaining failing locale tests on Windows.
* Document the use of forward slashes on Windows.
* Add some Fabric and ghp-import notes
2015-01-25 17:42:53 -08:00
Deniz Turgut
ed3209888a Refactor logging handling
Old system was using manual string formatting for log messages.
This caused issues with common operations like exception logging
because often they need to be handled differently for Py2/Py3
compatibility. In order to unify the effort:

 - All logging is changed to `logging.level(msg, arg1, arg2)` style.
 - A `SafeLogger` is implemented to auto-decode exceptions properly
in the args (ref #1403).
 - Custom formatters were overriding useful logging functionality
like traceback outputting (ref #1402). They are refactored to be
more transparent. Traceback information is provided in `--debug`
mode for `read_file` errors in generators.
 - Formatters will now auto-format multiline log messages in order
to make them look related. Similarly, traceback will be formatted in
the same fashion.
 - `pelican.log.LimitFilter` was (ab)using logging message which
would result in awkward syntax for argumented logging style. This
functionality is moved to `extra` keyword argument.
 - Levels for errors that would result in skipping a file (`read_file`)
changed from `warning` to `error` in order to make them stand out
among other logs.
 - Small consistency changes to log messages (i.e. changing all
to start with an uppercase letter) and quality-of-life improvements
(some log messages were dumping raw object information).
2014-07-22 12:39:39 -04:00
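The `logging.level(msg, arg1, arg2)` style mentioned in the first bullet passes the format string and its arguments separately, so the standard `logging` module only builds the final message when a handler needs it. The sketch below uses only the standard library. The logger name, the `ListHandler` class and the message text are illustrative assumptions, and it does not reproduce Pelican's `SafeLogger` or `LimitFilter`.

```python
import logging


class ListHandler(logging.Handler):
    """Collect records in a list so they can be inspected (illustrative)."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger('logging_style_example')
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

path = 'content/post.md'

# Arguments are passed separately instead of being formatted by hand
# with the % operator or str.format().
logger.error('Could not process %s: %s', path, 'bad metadata')

record = handler.records[0]
# The record keeps the template and the arguments apart ...
template, args = record.msg, record.args
# ... and the final text is only built on demand.
message = record.getMessage()
```

Keeping the template and arguments separate is what lets a filter or formatter treat the arguments specially, which manual string formatting does not allow.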
Justin Mayer
8fe05bb599 Merge pull request #1380 from avaris/py34_warnings
Fix for Python 3.4 deprecation warnings while running tests
2014-06-27 05:49:03 -07:00
Deniz Turgut
ce8574aff4 Fix HTMLParser related deprecation warnings in Py3.4 2014-06-26 01:10:52 -04:00
Ondrej Grover
3f6b130d6e Fix #1198, enable custom locale in template rendering, fixes links
reverts getpelican/pelican@ddcccfeaa9

If one used a locale that made use of unicode characters (like fr_FR.UTF-8)
the files on disk would be in correct locale while links would be to C.

Uses a SafeDatetime class that works with unicode format strings
by using custom strftime to prevent ascii decoding errors with Python2.

Also added unicode decoding for the calendar module to fix period
archives.
2014-06-26 00:00:19 -04:00
OGINO Masanori
ca3aa1e75f Use six.moves.urllib.
Signed-off-by: OGINO Masanori <masanori.ogino@gmail.com>
2014-06-10 17:30:17 +09:00
Antoine Brenner
aabb7f9345 Fix error in download_attachments() triggered by python2 unit test
The download_attachments error is triggered in the unit tests by a Japanese
error message (接続を拒否されました, "connection refused") that
Python is not able to serialize into a byte string.

This error weirdly does not appear every time the unit tests are run.
It might be related to the order in which the tests are run.

This error was found and fixed during the PyCon US 2014 Pelican
sprint. It was discovered on a Fedora 20 Linux computer running
Python 2.7 in a virtualenv.
2014-04-14 22:39:10 +02:00
Justin Mayer
80842cbc0e Fix deprecated logger warning for Python 3
logger.warn() has been deprecated in Python 3 in favor of logger.warning()
2014-04-02 12:38:49 -07:00
Anatoly Bubenkov
2c25e488c4 multiple authors implemented 2014-02-14 03:21:06 +01:00
Justin Mayer
6b0a99932f Revert test-failing change from #1114 2014-02-09 08:45:06 -08:00
Justin Mayer
c60e0d03fb Merge pull request #1114 from brannerchinese/master
xml => lxml for bs4, in pelican-import.wp2fields()
2014-02-08 15:38:06 -08:00
Alistair Magee
ea3e160db1 Extra functionality for pelican-import for wordpress imports 2014-02-03 17:36:41 +00:00
Joe Shaw
eeae09be5e wordpress importer: fallback onto wp:post_id if wp:post_name is empty 2013-12-04 09:46:44 -05:00
Kevin Deldycke
83e4d35b44 Produce inline links instead of references. 2013-11-26 10:04:15 +01:00
David Branner
00150f3556 xml => lxml for bs4, in pelican-import.wp2fields() 2013-10-09 11:53:11 -04:00
Justin Mayer
e2f50750d2 Add Tumblr and Posterous to importer description 2013-10-08 13:20:56 +02:00
Kyle Fuller
f83d0d3b0c Handle east asian character column width in the importer
Fixes #682
Closes #923
2013-10-08 09:46:40 +01:00
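East Asian wide and fullwidth characters occupy two terminal columns, so `len()` under-counts the displayed width of a title that contains them. The sketch below shows one way to measure column width with the standard `unicodedata` module. It assumes the width is needed for something like matching an underline to a title; the function name `display_width` is hypothetical and this is not necessarily how the importer does it.

```python
import unicodedata


def display_width(text):
    """Return the number of terminal columns `text` occupies.

    Hypothetical helper: characters classified as Wide ('W') or
    Fullwidth ('F') count as two columns, everything else as one.
    """
    return sum(
        2 if unicodedata.east_asian_width(char) in ('W', 'F') else 1
        for char in text
    )


title = '日本語'
underline = '#' * display_width(title)
```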
Russ Webber
dc58a17e64 fix missing 'kind' arg in importer
fixes #983
2013-08-04 17:21:35 +08:00
Justin Mayer
d4b64a3c8e Merge pull request #873 from xlz/tumblr-import
Support importing Tumblr
2013-07-18 09:17:26 -07:00
Benjamin Port
cb650c1c99 Add filter-author option to importer
Allow importing posts from only one author when importing data
2013-07-13 16:15:45 +02:00
Lingzhu Xiang
241ac2400a Use better titles than None for Tumblr posts without title 2013-07-07 19:05:21 +08:00
Lingzhu Xiang
75263fa852 Fix importing Tumblr photo caption
Besides each photo's caption, the general caption is also needed.

While we're at it, also add a linefeed at the end of file.
2013-07-07 19:05:21 +08:00
Lingzhu Xiang
00a1cbb6b8 Support importing Tumblr
Try to integrate Tumblr's various post types without using
additional templates.
2013-07-07 19:05:21 +08:00
Justin Mayer
7ec4d5faa2 Merge pull request #880 from joeshaw/pre-index-fix
Exception on WP import looking for <pre> tag
2013-06-29 08:02:27 -07:00
Joe Shaw
5fa3504ad0 use string find() instead of index(). Fixes #880.
We're expecting a non-match to return -1, which is what find() does,
but index() instead throws a ValueError.
2013-05-08 09:41:55 -04:00
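The difference between the two string methods described above is easy to check in isolation. The sample text here is made up for illustration.

```python
text = 'a post body without any preformatted block'

# str.find() signals "not found" by returning -1.
position = text.find('<pre')

# str.index() raises ValueError for the same input.
try:
    text.index('<pre')
    index_raised = False
except ValueError:
    index_raised = True
```

Code that tests the result against -1 therefore only works with `find()`.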
James Murty
8c7ea8df98 Import wordpress pages to pages/ subdir with --dir-page option
When importing from WordPress, the --dir-page directive (disabled by
default) automatically adds files to the pages/ directory when they are
recognised as pages, as opposed to posts.
2013-04-19 23:06:59 +01:00
Rogdham
b2aabdc02b Do not generate invalid filenames. Fixes #814.
Turn invalid characters into underscores, remove leading dots and enforce
a maximum length. Should be fine on main file systems used by Windows, Mac OS
and Linux.
Thanks to @Avaris for helping to clean my code.
2013-04-14 13:20:16 +01:00
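The three steps named in the message (replace invalid characters with underscores, remove leading dots, enforce a maximum length) can be sketched as follows. This is an illustration under assumptions, not the commit's code: the function name, the exact set of characters treated as invalid and the 255-character limit are all chosen here for the example.

```python
import re

# Assumed set of problematic characters: those reserved on Windows
# plus ASCII control characters. The real importer may use another set.
INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(name, max_length=255):
    """Return a filename that is safe on common file systems (sketch)."""
    cleaned = INVALID_CHARS.sub('_', name)  # invalid characters -> '_'
    cleaned = cleaned.lstrip('.')           # drop leading dots
    return cleaned[:max_length]             # enforce the length limit
```

For example, `sanitize_filename('..a/b:c')` gives `'a_b_c'`.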
Justin Mayer
87735b5215 Merge pull request #842 from avaris/remove-unittest2
remove unittest2 and fix various warnings in py3
2013-04-13 14:23:38 -07:00
Deniz Turgut
bc4bd773a0 remove unittest2 and fix various warnings in py3 2013-04-13 16:36:05 -04:00
Rogdham
a4c16e1b53 Warn user in case of missing title. Fixes #440.
When a WP XML file is imported, items with a missing title are no longer
dropped: they are generated with a placeholder title, which is probably not
the right one, and a warning is displayed to the user.
2013-04-13 20:44:18 +01:00
Irfan Ahmad
58faf9462e Implement Posterous import - fixes #608 2013-03-29 09:10:27 -07:00
James King
999980c07c Added WordPress content decoding to importer 2013-03-28 07:16:01 -07:00
Steve Schwarz
986733e8fb Corrected parsing of categories/tags 2013-03-03 21:17:42 -08:00
Steve Schwarz
8a6d96b289 pelican_import fix for bs4
Quick fix for this traceback:
$ pelican-import --wpfile ~/Downloads/mysite.wordpress.2013-02-24.xml 
Traceback (most recent call last):
  File "/Users/me/.virtualenvs/pelican/bin/pelican-import", line 8, in <module>
    load_entry_point('pelican==3.2', 'console_scripts', 'pelican-import')()
  File "/Users/me/.virtualenvs/pelican/src/pelican/pelican/tools/pelican_import.py", line 363, in main
    disable_slugs=args.disable_slugs or False)
  File "/Users/me/.virtualenvs/pelican/src/pelican/pelican/tools/pelican_import.py", line 238, in fields2pelican
    for title, content, filename, date, author, categories, tags, in_markup in fields:
  File "/Users/me/.virtualenvs/pelican/src/pelican/pelican/tools/pelican_import.py", line 37, in wp2fields
    if item.fetch('wp:status')[0].contents[0] == "publish":
TypeError: 'NoneType' object is not callable

I'm a BeautifulSoup novice but these changes allowed me to import two of my wordpress.xml files.
2013-03-03 21:17:42 -08:00
Alexis Métaireau
149ca493e0 Annotate py3k code when needed. 2013-01-11 18:55:04 +01:00