Mirror of https://github.com/getpelican/pelican.git (synced 2025-10-15 20:28:56 +02:00)

Cache content to speed up reading. Fixes #224.

Cache read content so that it doesn't have to be read next time if its source has not been modified.

parent de9ef74479
commit fd77926700

9 changed files with 336 additions and 34 deletions
docs/faq.rst (19 changes)

@@ -205,3 +205,22 @@ You can also disable generation of tag-related pages via::

     TAGS_SAVE_AS = ''
     TAG_SAVE_AS = ''
 
+Why does Pelican always write all HTML files even with content caching enabled?
+===============================================================================
+
+In order to reliably determine whether the HTML output is different
+before writing it, a large part of the generation environment,
+including the template contexts, imported plugins, etc., would have to
+be saved and compared, at least in the form of a hash (which would
+require special handling of unhashable types), because of all the
+possible combinations of plugins, pagination, etc., which may change in
+many different ways. This would require a lot more processing time,
+memory, and storage space. Simply writing the files each time is a
+lot faster and a lot more reliable.
+
+However, this means that the modification time of the files changes
+every time, so an ``rsync``-based upload will transfer them even if
+their content hasn't changed. A simple solution is to make ``rsync``
+use the ``--checksum`` option, which will make it compare the file
+checksums in a much faster way than Pelican would.
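The behaviour described in this FAQ entry is easy to observe outside Pelican: rewriting a file with byte-identical content still updates its modification time, which is exactly why a plain ``rsync`` (mtime+size check) re-uploads everything. A minimal, self-contained demonstration (the 1-second sleep is only there because some filesystems store mtime at 1-second resolution):

```python
import os
import tempfile
import time

# Write identical content twice and compare modification times.
path = os.path.join(tempfile.mkdtemp(), 'article.html')

with open(path, 'w') as f:
    f.write('<html>same content</html>')
first_mtime = os.path.getmtime(path)

time.sleep(1.1)  # some filesystems only store mtime at 1-second resolution

with open(path, 'w') as f:
    f.write('<html>same content</html>')
second_mtime = os.path.getmtime(path)

# The bytes are identical, but a plain rsync (mtime+size check) would
# still re-upload this file; rsync --checksum would skip it.
print(second_mtime > first_mtime)  # True
```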
@@ -173,6 +173,12 @@ Setting name (default value)
 `SLUGIFY_SOURCE` (``'input'``)        Specifies where you want the slug to be automatically generated
                                       from. Can be set to 'title' to use the 'Title:' metadata tag or
                                       'basename' to use the articles basename when creating the slug.
+`CACHE_CONTENT` (``True``)            If ``True``, save read content in a cache file.
+                                      See :ref:`reading_only_modified_content` for details about caching.
+`CACHE_DIRECTORY` (``cache``)         Directory in which to store cache files.
+`CHECK_MODIFIED_METHOD` (``mtime``)   Controls how files are checked for modifications.
+`LOAD_CONTENT_CACHE` (``True``)       If ``True``, load unmodified content from cache.
+`GZIP_CACHE` (``True``)               If ``True``, use gzip to (de)compress the cache files.
 =============================================================================== =====================================================================
 
 .. [#] Default is the system locale.
@@ -602,7 +608,7 @@ Setting name (default value) What does it do?
 .. [3] %s is the language
 
 Ordering content
-=================
+================
 
 ================================================ =====================================================
 Setting name (default value)                     What does it do?
@@ -697,7 +703,6 @@ adding the following to your configuration::
     CSS_FILE = "wide.css"
 
-
 Logging
 =======
@@ -713,6 +718,61 @@ be filtered out.
 For example: ``[(logging.WARN, 'TAG_SAVE_AS is set to False')]``
 
+.. _reading_only_modified_content:
+
+Reading only modified content
+=============================
+
+To speed up the build process, Pelican can optionally read only articles
+and pages with modified content.
+
+When Pelican is about to read some content source file:
+
+1. The hash or modification time information for the file from a
+   previous build is loaded from a cache file if `LOAD_CONTENT_CACHE`
+   is ``True``. These files are stored in the `CACHE_DIRECTORY`
+   directory. If the file has no record in the cache file, it is read
+   as usual.
+2. The file is checked according to `CHECK_MODIFIED_METHOD`:
+
+   - If set to ``'mtime'``, the modification time of the file is
+     checked.
+   - If set to the name of a function provided by the ``hashlib``
+     module, e.g. ``'md5'``, the file hash is checked.
+   - If set to anything else, or if the necessary information about the
+     file cannot be found in the cache file, the content is read as
+     usual.
+
+3. If the file is considered unchanged, the content object saved in a
+   previous build corresponding to the file is loaded from the cache,
+   and the file is not read.
+4. If the file is considered changed, the file is read, and the new
+   modification information and the content object are saved to the
+   cache if `CACHE_CONTENT` is ``True``.
+
+Modification time based checking is faster than comparing file hashes,
+but it is not as reliable, because mtime information can be lost when,
+e.g., copying the content sources with the ``cp`` or ``rsync``
+commands without the mtime preservation mode (invoked, e.g., by
+``--archive``).
+
+The cache files are Python pickles, so they may not be readable by
+different versions of Python, as the pickle format often changes. If
+such an error is encountered, the cache files have to be rebuilt using
+the Pelican command-line option ``--full-rebuild``.
+The cache files also have to be rebuilt after changing the
+`GZIP_CACHE` setting, otherwise cache file reading will not work.
+
+The ``--full-rebuild`` command-line option is also useful when the
+whole site needs to be regenerated due to, e.g., modifications to the
+settings file or theme files. When Pelican runs in autoreload mode,
+modification of the settings file or theme will trigger a full rebuild
+automatically.
+
+Note that even when using cached content, all output is always
+written, so the modification times of the ``*.html`` files always
+change. Therefore, an ``rsync``-based upload may benefit from the
+``--checksum`` option.
+
 Example settings
 ================
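The four steps above can be sketched as a small standalone function. The cache layout here, a dict mapping each source path to a ``(stamp, content)`` pair, mirrors the description but is purely illustrative, not Pelican's actual API:

```python
import hashlib
import os

CHECK_MODIFIED_METHOD = 'md5'  # 'mtime' or any hashlib function name


def file_stamp(path):
    """Step 2: compute the stamp used to detect modifications."""
    if CHECK_MODIFIED_METHOD == 'mtime':
        return os.path.getmtime(path)
    hash_func = getattr(hashlib, CHECK_MODIFIED_METHOD)
    with open(path, 'rb') as f:
        return hash_func(f.read()).digest()


def read_content(path, cache):
    """Return the content for path, reading the file only if it changed."""
    stamp, content = cache.get(path, (None, None))    # step 1: look up record
    if stamp is not None and stamp == file_stamp(path):
        return content                                # step 3: serve from cache
    with open(path) as f:                             # step 4: (re)read and
        content = f.read()                            # update the record
    cache[path] = (file_stamp(path), content)
    return content
```

Using a hash-based stamp here sidesteps the mtime-resolution caveat the text mentions; swapping `CHECK_MODIFIED_METHOD` to `'mtime'` exercises the faster but less reliable branch.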
@@ -260,6 +260,10 @@ def parse_arguments():
                         action='store_true',
                         help='Relaunch pelican each time a modification occurs'
                         ' on the content files.')
 
+    parser.add_argument('-f', '--full-rebuild', action='store_true',
+        dest='full_rebuild', help='Rebuild everything by not loading from cache')
+
     return parser.parse_args()
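In isolation, the new ``--full-rebuild`` flag behaves like any ``store_true`` argparse option; a standalone sketch with a throwaway parser (not Pelican's full one):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-f', '--full-rebuild', action='store_true',
                    dest='full_rebuild',
                    help='Rebuild everything by not loading from cache')

# Absent, the flag defaults to False; when given, get_config() later
# forces LOAD_CONTENT_CACHE to False so nothing is served from cache.
print(parser.parse_args([]).full_rebuild)                  # False
print(parser.parse_args(['--full-rebuild']).full_rebuild)  # True
```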
@@ -275,6 +279,8 @@ def get_config(args):
     config['THEME'] = abstheme if os.path.exists(abstheme) else args.theme
     if args.delete_outputdir is not None:
         config['DELETE_OUTPUT_DIRECTORY'] = args.delete_outputdir
+    if args.full_rebuild:
+        config['LOAD_CONTENT_CACHE'] = False
 
     # argparse returns bytes in Py2. There is no definite answer as to which
     # encoding argparse (or sys.argv) uses.
@@ -327,6 +333,7 @@ def main():
             print(' --- AutoReload Mode: Monitoring `content`, `theme` and'
                   ' `settings` for changes. ---')
+            first_run = True  # load cache on first run
             while True:
                 try:
                     # Check source dir for changed files ending with the given
@@ -335,9 +342,14 @@ def main():
                     # have changed, no matter what extension the filenames
                     # have.
                     modified = {k: next(v) for k, v in watchers.items()}
+                    original_load_cache = settings['LOAD_CONTENT_CACHE']
 
                     if modified['settings']:
                         pelican, settings = get_instance(args)
+                        if not first_run:
+                            original_load_cache = settings['LOAD_CONTENT_CACHE']
+                            # invalidate cache
+                            pelican.settings['LOAD_CONTENT_CACHE'] = False
 
                     if any(modified.values()):
                         print('\n-> Modified: {}. re-generating...'.format(
@@ -349,8 +361,15 @@ def main():
                         if modified['theme'] is None:
                             logger.warning('Empty theme folder. Using `basic` '
                                            'theme.')
+                        elif modified['theme']:
+                            # theme modified, needs full rebuild -> no cache
+                            if not first_run:  # but not on first run
+                                pelican.settings['LOAD_CONTENT_CACHE'] = False
 
                         pelican.run()
+                        first_run = False
+                        # restore original caching policy
+                        pelican.settings['LOAD_CONTENT_CACHE'] = original_load_cache
 
                 except KeyboardInterrupt:
                     logger.warning("Keyboard interrupt, quitting.")
@@ -325,6 +325,13 @@ class Content(object):
                 os.path.abspath(self.settings['PATH']))
             )
 
+    def __eq__(self, other):
+        """Compare with metadata and content of other Content object"""
+        return other and self.metadata == other.metadata and self.content == other.content
+
+    # keep basic hashing functionality for caching to work
+    __hash__ = object.__hash__
+
 
 class Page(Content):
     mandatory_properties = ('title',)
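The ``__hash__ = object.__hash__`` line matters on Python 3, where defining ``__eq__`` on a class implicitly sets ``__hash__`` to ``None`` and makes instances unhashable; restoring identity-based hashing keeps content objects usable in sets and as dict keys. A minimal illustration with stand-in classes (not Pelican's ``Content``):

```python
class WithEq(object):
    def __init__(self, content):
        self.content = content

    def __eq__(self, other):
        return other is not None and self.content == other.content


class WithEqAndHash(WithEq):
    # same trick as the commit: keep identity-based hashing
    __hash__ = object.__hash__


try:
    hash(WithEq('x'))
    eq_only_hashable = True
except TypeError:  # Python 3: __hash__ was implicitly set to None
    eq_only_hashable = False

print(eq_only_hashable)  # False on Python 3
print(isinstance(hash(WithEqAndHash('x')), int))  # True: usable as dict key
```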
@@ -20,14 +20,15 @@ from jinja2 import (Environment, FileSystemLoader, PrefixLoader, ChoiceLoader,
 from pelican.contents import Article, Draft, Page, Static, is_valid_content
 from pelican.readers import Readers
-from pelican.utils import copy, process_translations, mkdir_p, DateFormatter
+from pelican.utils import (copy, process_translations, mkdir_p, DateFormatter,
+                           FileStampDataCacher)
 from pelican import signals
 
 
 logger = logging.getLogger(__name__)
 
 
-class Generator(object):
+class Generator(FileStampDataCacher):
     """Baseclass generator"""
 
     def __init__(self, context, settings, path, theme, output_path, **kwargs):
@@ -73,6 +74,10 @@ class Generator(object):
         custom_filters = self.settings['JINJA_FILTERS']
         self.env.filters.update(custom_filters)
 
+        # set up caching
+        super(Generator, self).__init__(settings, 'CACHE_CONTENT',
+                                        'LOAD_CONTENT_CACHE')
+
         signals.generator_init.send(self)
 
     def get_template(self, name):
@@ -408,20 +413,24 @@ class ArticlesGenerator(Generator):
         for f in self.get_files(
                 self.settings['ARTICLE_DIR'],
                 exclude=self.settings['ARTICLE_EXCLUDES']):
+            article = self.get_cached_data(f, None)
+            if article is None:
                 try:
                     article = self.readers.read_file(
                         base_path=self.path, path=f, content_class=Article,
                         context=self.context,
                         preread_signal=signals.article_generator_preread,
                         preread_sender=self,
                         context_signal=signals.article_generator_context,
                         context_sender=self)
                 except Exception as e:
                     logger.warning('Could not process {}\n{}'.format(f, e))
                     continue
 
                 if not is_valid_content(article, f):
                     continue
 
+                self.cache_data(f, article)
+
             self.add_source_path(article)
@@ -502,7 +511,8 @@ class ArticlesGenerator(Generator):
         self._update_context(('articles', 'dates', 'tags', 'categories',
                               'tag_cloud', 'authors', 'related_posts'))
+        self.save_cache()
         signals.article_generator_finalized.send(self)
 
     def generate_output(self, writer):
@@ -527,20 +536,24 @@ class PagesGenerator(Generator):
         for f in self.get_files(
                 self.settings['PAGE_DIR'],
                 exclude=self.settings['PAGE_EXCLUDES']):
+            page = self.get_cached_data(f, None)
+            if page is None:
                 try:
                     page = self.readers.read_file(
                         base_path=self.path, path=f, content_class=Page,
                         context=self.context,
                         preread_signal=signals.page_generator_preread,
                         preread_sender=self,
                         context_signal=signals.page_generator_context,
                         context_sender=self)
                 except Exception as e:
                     logger.warning('Could not process {}\n{}'.format(f, e))
                     continue
 
                 if not is_valid_content(page, f):
                     continue
 
+                self.cache_data(f, page)
+
             self.add_source_path(page)
@@ -560,6 +573,7 @@ class PagesGenerator(Generator):
         self._update_context(('pages', ))
         self.context['PAGES'] = self.pages
 
+        self.save_cache()
         signals.page_generator_finalized.send(self)
 
     def generate_output(self, writer):
@@ -119,7 +119,12 @@ DEFAULT_CONFIG = {
     'IGNORE_FILES': ['.#*'],
     'SLUG_SUBSTITUTIONS': (),
     'INTRASITE_LINK_REGEX': '[{|](?P<what>.*?)[|}]',
-    'SLUGIFY_SOURCE': 'title'
+    'SLUGIFY_SOURCE': 'title',
+    'CACHE_CONTENT': True,
+    'CACHE_DIRECTORY': 'cache',
+    'GZIP_CACHE': True,
+    'CHECK_MODIFIED_METHOD': 'mtime',
+    'LOAD_CONTENT_CACHE': True,
     }
 
 PYGMENTS_RST_OPTIONS = None
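The new defaults added to ``DEFAULT_CONFIG`` above can be overridden per project in the settings file; a hypothetical ``pelicanconf.py`` fragment (the values shown are illustrative choices, not requirements):

```python
# Content caching (defaults come from DEFAULT_CONFIG, tweaked here):
CACHE_CONTENT = True              # save read content to a cache file
CACHE_DIRECTORY = 'cache'         # where the pickled caches live
CHECK_MODIFIED_METHOD = 'md5'     # 'mtime' (default) or a hashlib name
LOAD_CONTENT_CACHE = True         # set False to force a full read
GZIP_CACHE = False                # store the pickle uncompressed
```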
@@ -42,6 +42,7 @@ class TestArticlesGenerator(unittest.TestCase):
         settings['DEFAULT_CATEGORY'] = 'Default'
         settings['DEFAULT_DATE'] = (1970, 1, 1)
         settings['READERS'] = {'asc': None}
+        settings['CACHE_CONTENT'] = False  # cache not needed for these logic tests
 
         cls.generator = ArticlesGenerator(
             context=settings.copy(), settings=settings,
@@ -50,8 +51,15 @@ class TestArticlesGenerator(unittest.TestCase):
         cls.articles = [[page.title, page.status, page.category.name,
                          page.template] for page in cls.generator.articles]
 
+    def setUp(self):
+        self.temp_cache = mkdtemp(prefix='pelican_cache.')
+
+    def tearDown(self):
+        rmtree(self.temp_cache)
+
     def test_generate_feeds(self):
         settings = get_settings()
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         generator = ArticlesGenerator(
             context=settings, settings=settings,
             path=None, theme=settings['THEME'], output_path=None)
@@ -127,6 +135,7 @@ class TestArticlesGenerator(unittest.TestCase):
         settings['DEFAULT_CATEGORY'] = 'Default'
         settings['DEFAULT_DATE'] = (1970, 1, 1)
         settings['USE_FOLDER_AS_CATEGORY'] = False
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         settings['READERS'] = {'asc': None}
         settings['filenames'] = {}
         generator = ArticlesGenerator(
@@ -151,6 +160,7 @@ class TestArticlesGenerator(unittest.TestCase):
     def test_direct_templates_save_as_default(self):
 
         settings = get_settings(filenames={})
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         generator = ArticlesGenerator(
             context=settings, settings=settings,
             path=None, theme=settings['THEME'], output_path=None)
@@ -165,6 +175,7 @@ class TestArticlesGenerator(unittest.TestCase):
         settings = get_settings()
         settings['DIRECT_TEMPLATES'] = ['archives']
         settings['ARCHIVES_SAVE_AS'] = 'archives/index.html'
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         generator = ArticlesGenerator(
             context=settings, settings=settings,
             path=None, theme=settings['THEME'], output_path=None)
@@ -180,6 +191,7 @@ class TestArticlesGenerator(unittest.TestCase):
         settings = get_settings()
         settings['DIRECT_TEMPLATES'] = ['archives']
         settings['ARCHIVES_SAVE_AS'] = 'archives/index.html'
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         generator = ArticlesGenerator(
             context=settings, settings=settings,
             path=None, theme=settings['THEME'], output_path=None)
@@ -206,6 +218,7 @@ class TestArticlesGenerator(unittest.TestCase):
         settings = get_settings(filenames={})
 
         settings['YEAR_ARCHIVE_SAVE_AS'] = 'posts/{date:%Y}/index.html'
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         generator = ArticlesGenerator(
             context=settings, settings=settings,
             path=CONTENT_DIR, theme=settings['THEME'], output_path=None)
@@ -268,6 +281,25 @@ class TestArticlesGenerator(unittest.TestCase):
         authors_expected = ['alexis-metaireau', 'first-author', 'second-author']
         self.assertEqual(sorted(authors), sorted(authors_expected))
 
+    def test_content_caching(self):
+        """Test that the articles are read only once when caching"""
+        settings = get_settings(filenames={})
+        settings['CACHE_DIRECTORY'] = self.temp_cache
+        settings['READERS'] = {'asc': None}
+
+        generator = ArticlesGenerator(
+            context=settings.copy(), settings=settings,
+            path=CONTENT_DIR, theme=settings['THEME'], output_path=None)
+        generator.generate_context()
+        self.assertTrue(hasattr(generator, '_cache'))
+
+        generator = ArticlesGenerator(
+            context=settings.copy(), settings=settings,
+            path=CONTENT_DIR, theme=settings['THEME'], output_path=None)
+        generator.readers.read_file = MagicMock()
+        generator.generate_context()
+        self.assertEqual(generator.readers.read_file.call_count, 0)
+
 
 class TestPageGenerator(unittest.TestCase):
     # Note: Every time you want to test for a new field; Make sure the test
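The assertion pattern these tests use, stubbing the reader with ``MagicMock`` and checking its ``call_count``, can be exercised in isolation (the tiny cache and reader below are stand-ins, not Pelican objects):

```python
from unittest.mock import MagicMock

cache = {'content/article.rst': 'cached article object'}
reader = MagicMock(return_value='freshly read article')


def get_content(path):
    # serve from the cache when possible, fall back to the reader otherwise
    if path in cache:
        return cache[path]
    return reader(path)


assert get_content('content/article.rst') == 'cached article object'
assert reader.call_count == 0   # cache hit: the reader was never invoked

assert get_content('content/new-post.rst') == 'freshly read article'
assert reader.call_count == 1   # cache miss: exactly one real read
```

Asserting on `call_count` actually fails the test on a regression; a bare expression like `mock.assert_called_count == 0` is evaluated and discarded, so it can never fail.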
@@ -275,12 +307,19 @@ class TestPageGenerator(unittest.TestCase):
     # distill_pages Then update the assertEqual in test_generate_context
     # to match expected
 
+    def setUp(self):
+        self.temp_cache = mkdtemp(prefix='pelican_cache.')
+
+    def tearDown(self):
+        rmtree(self.temp_cache)
+
     def distill_pages(self, pages):
         return [[page.title, page.status, page.template] for page in pages]
 
     def test_generate_context(self):
         settings = get_settings(filenames={})
         settings['PAGE_DIR'] = 'TestPages'  # relative to CUR_DIR
+        settings['CACHE_DIRECTORY'] = self.temp_cache
         settings['DEFAULT_DATE'] = (1970, 1, 1)
 
         generator = PagesGenerator(
@@ -306,6 +345,25 @@ class TestPageGenerator(unittest.TestCase):
         self.assertEqual(sorted(pages_expected), sorted(pages))
         self.assertEqual(sorted(hidden_pages_expected), sorted(hidden_pages))
 
+    def test_content_caching(self):
+        """Test that the pages are read only once when caching"""
+        settings = get_settings(filenames={})
+        settings['CACHE_DIRECTORY'] = self.temp_cache
+        settings['READERS'] = {'asc': None}
+
+        generator = PagesGenerator(
+            context=settings.copy(), settings=settings,
+            path=CUR_DIR, theme=settings['THEME'], output_path=None)
+        generator.generate_context()
+        self.assertTrue(hasattr(generator, '_cache'))
+
+        generator = PagesGenerator(
+            context=settings.copy(), settings=settings,
+            path=CUR_DIR, theme=settings['THEME'], output_path=None)
+        generator.readers.read_file = MagicMock()
+        generator.generate_context()
+        self.assertEqual(generator.readers.read_file.call_count, 0)
+
 
 class TestTemplatePagesGenerator(unittest.TestCase):
@@ -43,12 +43,14 @@ class TestPelican(LoggedTestCase):
     def setUp(self):
         super(TestPelican, self).setUp()
         self.temp_path = mkdtemp(prefix='pelicantests.')
+        self.temp_cache = mkdtemp(prefix='pelican_cache.')
         self.old_locale = locale.setlocale(locale.LC_ALL)
         self.maxDiff = None
         locale.setlocale(locale.LC_ALL, str('C'))
 
     def tearDown(self):
         rmtree(self.temp_path)
+        rmtree(self.temp_cache)
         locale.setlocale(locale.LC_ALL, self.old_locale)
         super(TestPelican, self).tearDown()
@@ -77,6 +79,7 @@ class TestPelican(LoggedTestCase):
         settings = read_settings(path=None, override={
             'PATH': INPUT_PATH,
             'OUTPUT_PATH': self.temp_path,
+            'CACHE_DIRECTORY': self.temp_cache,
             'LOCALE': locale.normalize('en_US'),
             })
         pelican = Pelican(settings=settings)
@@ -92,6 +95,7 @@ class TestPelican(LoggedTestCase):
         settings = read_settings(path=SAMPLE_CONFIG, override={
             'PATH': INPUT_PATH,
             'OUTPUT_PATH': self.temp_path,
+            'CACHE_DIRECTORY': self.temp_cache,
             'LOCALE': locale.normalize('en_US'),
             })
         pelican = Pelican(settings=settings)
@@ -103,6 +107,7 @@ class TestPelican(LoggedTestCase):
         settings = read_settings(path=SAMPLE_CONFIG, override={
             'PATH': INPUT_PATH,
             'OUTPUT_PATH': self.temp_path,
+            'CACHE_DIRECTORY': self.temp_cache,
             'THEME_STATIC_PATHS': [os.path.join(SAMPLES_PATH, 'very'),
                                    os.path.join(SAMPLES_PATH, 'kinda'),
                                    os.path.join(SAMPLES_PATH, 'theme_standard')]
@@ -123,6 +128,7 @@ class TestPelican(LoggedTestCase):
         settings = read_settings(path=SAMPLE_CONFIG, override={
             'PATH': INPUT_PATH,
             'OUTPUT_PATH': self.temp_path,
+            'CACHE_DIRECTORY': self.temp_cache,
             'THEME_STATIC_PATHS': [os.path.join(SAMPLES_PATH, 'theme_standard')]
             })
pelican/utils.py (113 changes)

@@ -12,6 +12,8 @@ import pytz
 import re
 import shutil
 import traceback
+import pickle
+import hashlib
 
 from collections import Hashable
 from contextlib import contextmanager
@@ -545,3 +547,114 @@ def split_all(path):
             break
         path = head
     return components
+
+
+class FileDataCacher(object):
+    '''Class that can cache data contained in files'''
+
+    def __init__(self, settings, cache_policy_key, load_policy_key):
+        '''Load the specified cache within CACHE_DIRECTORY,
+        but only if *load_policy_key* in *settings* is True.
+
+        Uses gzip for the cache file if the GZIP_CACHE setting is True.
+        Sets the caching policy according to *cache_policy_key*
+        in *settings*.
+        '''
+        self.settings = settings
+        name = self.__class__.__name__
+        self._cache_path = os.path.join(self.settings['CACHE_DIRECTORY'], name)
+        self._cache_data_policy = self.settings[cache_policy_key]
+        if self.settings['GZIP_CACHE']:
+            import gzip
+            self._cache_open = gzip.open
+        else:
+            self._cache_open = open
+        if not self.settings[load_policy_key]:
+            self._cache = {}
+            return
+        try:
+            with self._cache_open(self._cache_path, 'rb') as f:
+                self._cache = pickle.load(f)
+        except Exception:
+            self._cache = {}
+
+    def cache_data(self, filename, data):
+        '''Cache data for the given file'''
+        if not self._cache_data_policy:
+            return
+        self._cache[filename] = data
+
+    def get_cached_data(self, filename, default={}):
+        '''Get cached data for the given file
+
+        If no data is cached, return the default object.
+        '''
+        return self._cache.get(filename, default)
+
+    def save_cache(self):
+        '''Save the updated cache'''
+        if not self._cache_data_policy:
+            return
+        try:
+            mkdir_p(self.settings['CACHE_DIRECTORY'])
+            with self._cache_open(self._cache_path, 'wb') as f:
+                pickle.dump(self._cache, f)
+        except Exception as e:
+            logger.warning('Could not save cache {}\n{}'.format(
+                self._cache_path, e))
+
+
+class FileStampDataCacher(FileDataCacher):
+    '''Subclass that also caches the stamp of the file'''
+
+    def __init__(self, settings, cache_policy_key, load_policy_key):
+        '''This subclass additionally sets the file stamp function'''
+        super(FileStampDataCacher, self).__init__(settings, cache_policy_key,
+                                                  load_policy_key)
+
+        method = self.settings['CHECK_MODIFIED_METHOD']
+        if method == 'mtime':
+            self._filestamp_func = os.path.getmtime
+        else:
+            try:
+                hash_func = getattr(hashlib, method)
+
+                def filestamp_func(filename):
+                    '''Hash the file contents'''
+                    with open(filename, 'rb') as f:
+                        return hash_func(f.read()).digest()
+
+                self._filestamp_func = filestamp_func
+            except AttributeError:
+                self._filestamp_func = None
+
+    def cache_data(self, filename, data):
+        '''Cache stamp and data for the given file'''
+        stamp = self._get_file_stamp(filename)
+        super(FileStampDataCacher, self).cache_data(filename, (stamp, data))
+
+    def _get_file_stamp(self, filename):
+        '''Get the stamp of the given file, used to detect modifications
+        since the previous build.
+
+        Depending on CHECK_MODIFIED_METHOD,
+        a float is returned for 'mtime',
+        a hash for a function name in the hashlib module,
+        or an empty bytes string otherwise.
+        '''
+        filename = os.path.join(self.path, filename)
+        try:
+            return self._filestamp_func(filename)
+        except Exception:
+            return b''
+
+    def get_cached_data(self, filename, default=None):
+        '''Get the cached data for the given filename
+        if the file has not been modified.
+
+        If no record exists or the file has been modified, return default.
+        Modification is checked by comparing the cached
+        and the current file stamp.
+        '''
+        stamp, data = super(FileStampDataCacher, self).get_cached_data(
+            filename, (None, default))
+        if stamp != self._get_file_stamp(filename):
+            return default
+        return data
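The storage format used by ``FileDataCacher``, a pickled dict written through ``gzip.open`` when ``GZIP_CACHE`` is enabled, can be demonstrated standalone (the record contents below are made-up placeholders):

```python
import gzip
import os
import pickle
import tempfile

cache_dir = tempfile.mkdtemp()
# FileDataCacher names the cache file after the class using it.
cache_path = os.path.join(cache_dir, 'ArticlesGenerator')

# One record per source file: a (stamp, data) pair, as in FileStampDataCacher.
cache = {'content/article.rst': (1234567890.0, '<pickled Article object>')}

with gzip.open(cache_path, 'wb') as f:   # GZIP_CACHE = True branch
    pickle.dump(cache, f)

with gzip.open(cache_path, 'rb') as f:   # the next build loads it back
    loaded = pickle.load(f)

print(loaded == cache)  # True: records survive the round trip
```

This also shows why toggling `GZIP_CACHE` between builds requires a `--full-rebuild`: a gzipped pickle cannot be opened with plain `open`, and vice versa.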