Wrote unit tests and documentation, and improved the regular expression. The HtmlReader is now enabled by default and parses metadata in HTML files from comments of the form: <!-- key:value -->
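
For illustration, a minimal sketch of how comments of this form could be extracted. The pattern and the function name `parse_html_metadata` are assumptions made for this example, not the regular expression or API actually used by the HtmlReader:

```python
import re

# Hypothetical pattern (an assumption, not the project's actual regex):
# matches <!-- key:value --> with optional whitespace around key and value.
METADATA_COMMENT = re.compile(r"<!--\s*([A-Za-z0-9_-]+)\s*:\s*(.*?)\s*-->")


def parse_html_metadata(html):
    """Collect key/value pairs from <!-- key:value --> comments into a dict."""
    metadata = {}
    for key, value in METADATA_COMMENT.findall(html):
        # Lowercasing keys is a choice made for this sketch only.
        metadata[key.lower()] = value
    return metadata


sample = (
    "<!-- title: My Post -->\n"
    "<!-- Date:2011-01-01 -->\n"
    "<p>body</p><!-- plain comment -->"
)
print(parse_html_metadata(sample))
```

Comments without a colon, such as the last one in the sample, are ignored by this pattern.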