Merge pull request #1011 from saimn/readers

Refactor readers and remove MARKUP. Fixes #866
2025-10-15 20:28:56 +02:00 · 2013-08-07 12:34:22 -07:00 · 2013-08-07 12:34:22 -07:00 · 5a469dc2e3
commit 5a469dc2e3
parent 0df12e31e2 f47f054d0b
11 changed files with 265 additions and 221 deletions
--- a/docs/internals.rst
+++ b/docs/internals.rst
@@ -24,7 +24,7 @@ The logic is separated into different classes and concepts:
  then passed to the generators.

 * **Readers** are used to read from various formats (AsciiDoc, HTML, Markdown and
-  reStructuredText for now, but the system is extensible). Given a file, they 
+  reStructuredText for now, but the system is extensible). Given a file, they
  return metadata (author, tags, category, etc.) and content (HTML-formatted).

 * **Generators** generate the different outputs. For instance, Pelican comes with
@@ -44,7 +44,7 @@ method that returns HTML content and some metadata.

 Take a look at the Markdown reader::

-    class MarkdownReader(Reader):
+    class MarkdownReader(BaseReader):
        enabled = bool(Markdown)

        def read(self, source_path):
--- a/docs/plugins.rst
+++ b/docs/plugins.rst
@@ -71,6 +71,7 @@ finalized                       pelican object                  invoked after al
                                                                - minifying js/css assets.
                                                                - notify/ping search engines with an updated sitemap.
 generator_init                  generator                       invoked in the Generator.__init__
+readers_init                    readers                         invoked in the Readers.__init__
 article_generate_context        article_generator, metadata
 article_generate_preread        article_generator               invoked before an article is read in ArticlesGenerator.generate_context;
                                                                use if code needs to do something before every article is parsed
@@ -144,13 +145,13 @@ write and don't slow down pelican itself when they're not active.
 No more talking, here is the example::

    from pelican import signals
-    from pelican.readers import EXTENSIONS, Reader
+    from pelican.readers import BaseReader

-    # Create a new reader class, inheriting from the pelican.reader.Reader
-    class NewReader(Reader):
+    # Create a new reader class, inheriting from pelican.readers.BaseReader
+    class NewReader(BaseReader):
        enabled = True  # Yeah, you probably want that :-)

-        # The list of extensions you want this reader to match with.
+        # The list of file extensions you want this reader to match with.
        # In the case multiple readers use the same extensions, the latest will
        # win (so the one you're defining here, most probably).
        file_extensions = ['yeah']
@@ -168,12 +169,12 @@ No more talking, here is the example::

            return "Some content", parsed

-    def add_reader(arg):
-        EXTENSIONS['yeah'] = NewReader
+    def add_reader(readers):
+        readers.reader_classes['yeah'] = NewReader

    # this is how pelican works.
    def register():
-        signals.initialized.connect(add_reader)
+        signals.readers_init.connect(add_reader)


 Adding a new generator
--- a/docs/settings.rst
+++ b/docs/settings.rst
@@ -84,9 +84,10 @@ Setting name (default value)                                            What doe
                                                                        here or a single string representing one locale.
                                                                        When providing a list, all the locales will be tried
                                                                        until one works.
-`MARKUP` (``('rst', 'md')``)                                            A list of available markup languages you want
-                                                                        to use. For the moment, the only available values
-                                                                        are `rst`, `md`, `markdown`, `mkd`, `mdown`, `html`, and `htm`.
+`READERS` (``{}``)                                                      A dict of file extensions / Reader classes to overwrite or
+                                                                        add file readers. For instance, to avoid processing .html files:
+                                                                        ``READERS = {'html': None}``. Or to add a custom reader for the
+                                                                        `foo` extension: ``READERS = {'foo': FooReader}``
 `IGNORE_FILES` (``['.#*']``)                                            A list of file globbing patterns to match against the
                                                                        source files to be ignored by the processor. For example,
                                                                        the default ``['.#*']`` will ignore emacs lock files.
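
To illustrate the `READERS` behaviour documented in the settings.rst hunk above, here is a minimal, self-contained sketch of how a per-extension registry might apply such overrides. It does not import Pelican: `BaseReader`, `HTMLReader`, `FooReader` and `ReaderRegistry` below are hypothetical stand-ins written for this example, and only the `READERS` setting, the `reader_classes` mapping and the "`None` means do not process" convention are taken from the diff. How Pelican itself stores or skips disabled entries is an assumption here.

```python
class BaseReader:
    """Hypothetical stand-in for the BaseReader named in the diff."""
    enabled = True
    file_extensions = []

    def read(self, source_path):
        # Readers return content and metadata; this stub returns empty values.
        return "", {}


class HTMLReader(BaseReader):
    file_extensions = ['html', 'htm']


class FooReader(BaseReader):
    file_extensions = ['foo']


class ReaderRegistry:
    """Hypothetical registry mapping file extensions to reader classes."""

    def __init__(self, settings, defaults):
        self.reader_classes = {}
        # Register every enabled default reader under each of its extensions.
        for cls in defaults:
            if cls.enabled:
                for ext in cls.file_extensions:
                    self.reader_classes[ext] = cls
        # Entries from READERS overwrite or extend the defaults.
        self.reader_classes.update(settings.get('READERS', {}))

    def reader_for(self, path):
        # Take the text after the last dot as the extension.
        ext = path.rsplit('.', 1)[-1] if '.' in path else ''
        # A missing entry and an explicit None both mean "no reader".
        return self.reader_classes.get(ext)


# Disable .html processing and add a custom reader for .foo files.
settings = {'READERS': {'html': None, 'foo': FooReader}}
registry = ReaderRegistry(settings, defaults=[HTMLReader])
```

In this sketch `registry.reader_for('page.html')` yields no reader because of the `None` override, while `.htm` still maps to the default reader and `.foo` maps to the custom one. A plugin connected to `readers_init` could make the same kind of change by assigning into `reader_classes` directly, as the plugins.rst example in the diff does.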