Added docbook reader (with contributions from Mauro Bieg).
Fixed bug in fromEntities. The previous version would turn hi & low you know; into hi &.
HTML reader:
would be rendered as an empty paragraph. Thanks to Paul Vorbach for pointing out the bug.
<col> and <caption> in tables. Closes #486.
Markdown reader:
LaTeX reader:
\bgroup, \egroup, \begingroup, \endgroup.
\begingroup was parsed as \begin followed by group.
\parindent0pt
\label and \ref sensitive to --parse-raw. If --parse-raw is selected, these will be parsed as raw latex inlines, rather than bracketed text.
\vspace{10pt}) inside \author; just skip them. Closes #505.
Textile reader:
== and <notextile>. Closes #473.
Docx writer: Fixed multi-paragraph list items. Previously they each got a list marker. Closes #457.
LaTeX writer:
--no-tex-ligatures option to avoid replacing quotation marks and dashes with TeX ligatures.
fixltx2e package to provide ‘’.
ConTeXt writer: Fixed escaping of %. In text, % needs to be escaped as \letterpercent, not \%. Inside URLs, % needs to be escaped as \%. Thanks to jmarca and adityam for the fix. Closes #492.
Texinfo writer: Escape special characters in node titles. This fixes a problem pointed out by Joost Kremers. Pandoc used to escape an ‘@’ in a chapter title, but not in the corresponding node title, leading to invalid texinfo.
Fixed document encoding in texinfo template. Resolves Debian Bug #667816.
Markdown writer:
< and $.
LaTeX writer: Use \hyperref[ident]{text} for internal links. Previously we used \href{\#ident}{text}, which didn’t work on all systems. Thanks to Dirk Laurie.
RST writer: Don’t wrap link references. Closes #487.
Updated to use latest versions of blaze-html, mtl.
LaTeX reader:
lstlisting work as a proper verbatim environment.
LaTeX writer:
{} around ctable caption, so that formatting can be used.
LaTeX template: Added variables for geometry, romanfont, sansfont, mathfont, mainfont so users can more easily customize fonts.
PDF writer:
Texinfo writer: retain directories in image paths. (Peter Wang)
RST writer: Better handling of inline formatting, in accord with docutils’ “inline markup recognition rules” (though we don’t implement the unicode rules fully). Now hi*there*hi gets rendered properly as hi\ *there*\ hi, and unnecessary \ are avoided around :math:, :sub:, :sup:.
RST reader:
\ as null, not escaped space.
:math:`...` even when not followed by blank or \. This does not implement the complex rule docutils follows, but it should be good enough for most purposes.
Text.Pandoc.Parsing: Added stateRstDefaultRole field to ParserState. (Greg Maslov)
Markdown reader: Properly handle citations nested in other inline elements.
Markdown writer: don’t replace empty alt in image with “image”.
DZSlides: Updated template.html and styles in default template. Removed bizarre CSS for q in dzslides template.
Avoid repeated id attribute in section and header in HTML slides.
README improvements: new instructions on internal links, removed misleading note on reST math.
Build system:
Added beamer+lhs as output format.
Don’t escape < in <style> tags with --self-contained. This fixes a bug which prevented highlighting from working when using --self-contained.
PDF: run latex engine three times if --toc specified. This fixes page numbers in the table of contents.
Docx writer: Added TableNormal style to tables.
LaTeX math environment fixes. aligned is now used instead of the nonexistent aligned*, and multline instead of the nonexistent multiline.
LaTeX writer: Use \textasciitilde for literal ~.
HTML writer: Don’t escape contents of EQ tags with --gladtex. This fixes a regression from 1.8.
Use <q> tags for Quoted items for HTML5 output. The quote style can be changed by modifying the template or including a css file. A default quote style is included.
LaTeX reader: Fixed accents (~{a}, \c{c}). Correctly handle ^{}. Support “minted” as a LaTeX verbatim block.
Updated LaTeX template for better language support. Use polyglossia instead of babel with xetex. Set lang as documentclass option. \setmainlanguage will use the last of a comma-separated list of languages. Thanks to François Gannaz.
Fixed default LaTeX template so \euro and € work. The eurosym package is needed if you are using pdflatex.
Fixed escaping of period in man writer (thanks to Michael Thompson).
Fixed list label positions in beamer.
Set mainlang variable in context writer. This parallels behavior of latex writer. mainlang is the last of a comma-separated list of languages in lang.
EPUB language metadata: convert e.g. en_US from locale to en-US.
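The locale conversion described above can be illustrated with a small sketch. This is Python rather than pandoc's own Haskell, and it is not pandoc's implementation; the function name `locale_to_lang_tag` is hypothetical, and stripping an encoding or modifier suffix is an assumption that goes beyond what the entry states.

```python
def locale_to_lang_tag(locale_name: str) -> str:
    """Turn a POSIX-style locale name such as 'en_US' into 'en-US'."""
    # Assumption: drop any encoding ('.UTF-8') or modifier ('@euro') suffix.
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    # The documented change: underscore becomes hyphen.
    return base.replace("_", "-")


if __name__ == "__main__":
    print(locale_to_lang_tag("en_US"))
```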
Changed -V so that you can specify a key without a value. Such keys get the value true.
Fixed permissions on installed man pages - thanks Magnus Therning.
Windows installer: require XP or higher. The installer is now compiled on a Windows 7 machine, which fixes a problem using citation functions on Windows 7.
OSX package: Check for 64-bit Intel CPU before installing.
Better handling of raw latex environments in markdown. Now
\begin{equation}
a_1
\end{equation}
turns into a raw latex block as expected.
Improvements to LaTeX reader:
{\\} in braced.
, to suffix if it doesn’t start with space or punctuation. Otherwise we get no space between the year and the suffix in author-date styles.
Added two needed data files for S5. This fixes a problem with pandoc -t s5 --self-contained. Also removed slides.min.js, which was no longer being used.
Fixed some minor problems in reference.docx: name on “Date” style, xCs instead of xIs.
Fixed a problem creating docx files using a reference docx modified using Word. The problem seems to be that Word modifies _rels/.rels, changing the Type of the Relationship to docProps/core.xml. Pandoc now changes this back to the correct value if it has been altered, fixing the problem.
Fixed html5 template so it works properly with highlighting.
LaTeX reader:
Markdown reader:
a**a*a**a*a**a*a**a*a**a*a**a*a**a*a**.
Headers no longer wrap in markdown or RST writers.
Added stateMaxNestingLevel to ParserState. We set this to 6, so you can still have Emph inside Emph, just not indefinitely.
More efficient implementation of nowrap in Text.Pandoc.Pretty.
Text.Pandoc.PDF: Only run latex twice if \tableofcontents is present.
Require highlighting-kate >= 0.5.0.2, texmath >= 0.6.0.2.
Changed cabal file so that build-depends for the test program are not required unless the tests flag is used.
LaTeX writer: insert {} between adjacent hyphens so they don’t form ligatures (dashes) in code spans.
Raised version bound on test-framework to avoid problems compiling tests on GHC 7.4.1.
LaTeX reader: Use raw LaTeX as fallback inline text for Cites, so citations don’t just disappear unless you process with citeproc. Ignore \bibliographystyle, \nocite.
Simplified tex2pdf; it will always run latex twice to resolve table of contents and hyperrefs.
Require Cabal >= 1.10.
Tweaked cabal file to meet Cabal 1.10 requirements.
Added a Microsoft Word docx writer. The writer includes support for highlighted code and for math (which is converted from TeX to OMML, Office’s native math markup language, using texmath’s new OMML module). A new option --reference-docx allows the user to customize the styles.
Added an asciidoc writer (http://www.methods.co.nz/asciidoc/).
Better support for slide shows:
Added a dzslides writer. DZSlides is a lightweight HTML5/javascript slide show format due to Paul Rouget (http://paulrouget.com/dzslides/).
Added a LaTeX beamer writer. Beamer is a LaTeX package for creating slide presentations.
New, flexible rules for dividing documents into sections and slides (see “Structuring the slide show” in the User’s Guide). These are backward-compatible with the old rules, but they allow slide shows to be organized into sections and subsections containing multiple slides.
A new --slide-level option allows users to override defaults and select a slide level below the first header level with content.
A new --self-contained option produces HTML output that does not depend on an internet connection or the presence of any external files. Linked images, CSS, and javascript are downloaded (or fetched locally) and encoded in data: URIs. This is useful for making portable HTML slide shows. The --offline option has been deprecated and is now treated as a synonym of --self-contained.
Support for PDF output:
markdown2pdf.
pandoc can now create PDFs (assuming you have latex and a set of appropriate packages installed): just specify an output file with the .pdf extension.
--latex-engine allows you to specify pdflatex, xelatex, or lualatex as the processor.
Highlighting changes:
highlighting flag is no longer needed when compiling.
--no-highlight option allows highlighting to be disabled.
docx, latex, and epub, as well as html, html5, dzslides, s5, and slidy.
--highlight-style option selects between various highlighting color themes.
Internal links to sections now work in ConTeXt and LaTeX as well as HTML.
LaTeX \include and \usepackage commands are now processed, provided the files are in the working directory.
EPUB improvements:
--epub-embed-font option.
epub-page.html, epub-coverimage.html, epub-titlepage.html.
--mathml now works with DocBook.
Added support for math in RST reader and writer. Inline math uses the :math:`...` construct. Display math uses
.. math:: ...
or if the math is multiline,
.. math::
   ...
These constructions are now supported by rst2latex.py.
Github syntax for fenced code blocks is supported in pandoc’s markdown. You can now write
```ruby
x = 2
```
instead of
~~~ {.ruby}
x = 2
~~~~
Easier scripting: a new toJsonFilter function makes it easier to write Haskell scripts to manipulate the Pandoc AST. See Scripting with pandoc.
Fixed parsing of consecutive lists in markdown. Pandoc previously behaved like Markdown.pl for consecutive lists of different styles. Thus, the following would be parsed as a single ordered list, rather than an ordered list followed by an unordered list:
1. one
2. two
- one
- two
This change makes pandoc behave more sensibly, parsing this as two lists. Any change in list type (ordered/unordered) or in list number style will trigger a new list. Thus, the following will also be parsed as two lists:
1. one
2. two
a. one
b. two
Since we regard this as a bug in Markdown.pl, and not something anyone would ever rely on, we do not preserve the old behavior even when --strict is selected.
Dashes work differently with --smart: --- is always em-dash, and -- is always en-dash. Pandoc no longer tries to guess when - should be en-dash. Note: This may change how existing documents look when processed with pandoc. A new option, --old-dashes, is provided for legacy documents.
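As a rough illustration of the new dash rule, the following Python sketch applies the fixed mapping to plain text. It is not pandoc's parser: it ignores code spans, escapes, and every other context a real reader must respect, and the name `smart_dashes` is hypothetical.

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def smart_dashes(text: str) -> str:
    """Map '---' to an em-dash and '--' to an en-dash; leave single '-' alone."""
    # Replace the longer run first so '---' is not consumed as '--' plus '-'.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


if __name__ == "__main__":
    print(smart_dashes("pages 3--5 --- roughly"))
```

The order of the two replacements is the only subtle point: reversing it would turn `---` into an en-dash followed by a hyphen.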
The markdown writer now uses setext headers for levels 1-2. The old behavior (ATX headers for all levels) can be restored using the new --atx-headers option.
Links are now allowed in markdown image captions. They are also allowed in links, but will appear there as regular text. So,
[link with [link](/url)](/url)
will turn into
<p><a href="/url">link with link</a></p>
Improved handling of citations using citeproc-hs-0.3.4. Added --citation-abbreviations option.
Citation keys can no longer end with a punctuation character. This means that @item1. will be parsed as a citation with key ‘item1’, followed by a period, instead of a citation with key ‘item1.’, as was the case previously.
In HTML output, citations are now put in a span with class citation.
The markdown reader now recognizes DocBook block and inline tags. It was always possible to include raw DocBook tags in a markdown document, but now pandoc will be able to distinguish block from inline tags and behave accordingly. Thus, for example,
<sidebar>
hello
</sidebar>
will not be wrapped in <para> tags.
The LaTeX parser has been completely rewritten; it is now much more accurate, robust, and extensible. However, there are two important changes in how it treats unknown LaTeX. (1) Previously, unknown environments became BlockQuote elements; now, they are treated as “transparent”, so \begin{unknown}xyz\end{unknown} is the same as xyz. (2) Previously, arguments of unknown commands were passed through with their braces; now the braces are stripped off.
--smart is no longer selected automatically with man output.
The deprecated --xetex option has been removed.
The --html5/-5 option has been deprecated. Use -t html5 instead. html5 and html5+lhs are now separate output formats.
Single quotes are no longer escaped in HTML output. They do not need to be escaped outside of attributes.
Pandoc will no longer transform leading newlines in code blocks to <br/> tags.
The ODT writer now sizes images appropriately, using the image size and DPI information embedded in the image.
--standalone is once again implicit for a non-text output format (ODT, EPUB). You can again do pandoc test.txt -o test.odt and get a standalone ODT file.
The Docbook writer now uses <sect1>, <sect2>, etc. instead of <section>.
The HTML writer now uses <del> for strikeout.
In HTML output with --section-divs, the classes section and level[1,2,..6] are put on the div tags so they can be styled. In HTML 5 output with --section-divs, the classes level[1,2,...6] are put on section tags.
EPUB writer changes:
lang variable now sets the language in the metadata (if it is not set, we default to the locale).
Added titleslide class to title slide in S5 template.
In HTML, EPUB, and docx metadata, the date is normalized into YYYY-MM-DD format if possible. (This is required for validation.)
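The "if possible" normalization above can be sketched as trying a list of known input formats and falling back when none matches. This Python sketch is only illustrative: the format list `FORMATS` and the name `normalize_date` are assumptions, not the set of formats pandoc actually accepts, and month-name formats depend on the active locale.

```python
from datetime import datetime
from typing import Optional

# Hypothetical list of accepted input formats (an assumption for this sketch).
FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y"]


def normalize_date(s: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(s.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    return None


if __name__ == "__main__":
    print(normalize_date("03/05/2012"))
```

A caller would keep the original string when the result is `None`, which corresponds to the "if possible" wording.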
Attributes in highlighted code blocks are now preserved in HTML. The container element will have the classes, id, and key-value attributes you specified in the delimited code block. Previously these were stripped off.
The reference backlink in the HTML writer no longer has a special footnoteBacklink class.
The HTML template has been split into html and html5 templates.
Author and date are treated more consistently in HTML templates. Authors are now <h2>, date <h3>.
URLs are hyphenated in the ConTeXt writer (B. Scott Michel).
In Text.Pandoc.Builder, +++ has been replaced by <>.
Better support for combining characters and East Asian wide characters in markdown and reST.
Better handling of single quotes with --smart. Previously D'oh l'*aide* would be parsed with left and right single quotes instead of apostrophes. This kind of error is now fixed.
Highlighting: Use reads instead of read for better error handling. Fixes crash on startNum="abc".
Added blank comment after directives in rst template.
Unescape entities in citation refId. The refIds coming from citeproc contain XML numeric entities, and these don’t match with the citation keys parsed by pandoc. Solution is to unescape them.
HTML reader: Fixed bug parsing tables with both thead and tbody.
Markdown reader:
RST reader: allow footnotes followed by newline without space characters.
LaTeX reader:
\@.
\itemsep.
RST writer: Fixed bug involving empty table cells. isSimple was being calculated in a way that assumed there were no non-empty cells.
ConTeXt writer:
--toc work even without --number-sections.
LaTeX writer:
~ inside \href{...}.
# in href URLs.
documentclass variable, and if that is not set, we look through the template itself. Also, we have added the KOMA classes scrreprt and scrbook. You can now make a book using pandoc -V documentclass:book mybook.txt -o mybook.pdf
code environment will be included in the template.
hyperref settings in the template).
lang variable to LaTeX template.
HTML writer:
<section> for footnotes if HTML5.
S5/slidy writer: Make footnotes appear on separate slide at end.
MIME: Added ‘layout-cache’ to getMimeType. This ensures that the META-INF/manifest.xml for ODT files will have everything it needs, so that ODT files modified by LibreOffice can be used as --reference-odt.
Text.Pandoc.Templates: Return empty string for json template.
Text.Pandoc.Biblio:
\160 as space when parsing locator and suffix. This fixes a bug with “p. 33” when --smart is used. Previously the whole “p. 33” would be included in the suffix, with no locator.
Updated chicago-author-date.csl. The old version did not work properly for edited volumes with no author.
EPUB writer:
cover-image.jpg.
Modified make_osx_package.sh to use cabal-dev. Items are no longer installed as root. Man pages are zipped and given proper permissions.
Modified windows installer generator to use cabal-dev.
Setup: Making man pages now works with cabal-dev (at least on OSX). In Setup.hs we now invoke ‘runghc’ in a way that points it to the correct package databases, instead of always falling back to the default user package db.
Updated to work with GHC 7.4.1.
Removed dependency on old-time.
Removed dependency on dlist.
New slidy directory for “self-contained.”
TeXMath writer: Use unicode thin spaces for thin spaces.
Markdown citations: don’t strip off initial space in locator.
Removed Apostrophe, EmDash, EnDash, and Ellipses from the native Inline type in pandoc-types. Now we use Str elements with unicode.
Improvements to Text.Pandoc.Builder:
Inlines and Blocks are now newtypes (not synonyms for sequences).
IsString, Show, Read, Monoid, and a new Listable class, which allows these to be manipulated to some extent like lists. Monoid append includes automatic normalization.
+++ has been replaced by <> (mappend).
Use blaze-html instead of xhtml for HTML generation. This changes the type of writeHtml.
Text.Pandoc.Shared:
warn and err.
unescapeURI, modified escapeURI. (See under [behavior changes], above.)
Changes in URI escaping: Previously the readers escaped URIs by converting unicode characters to octets and then percent encoding. Now unicode characters are left as they are, and escapeURI only percent-encodes space characters. This gives more readable URIs, and works well with modern user agents. URIs are no longer unescaped at all on conversion to markdown, asciidoc, rst, org.
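The difference between the old and new URI escaping can be sketched in Python. Both functions are hypothetical stand-ins rather than the real escapeURI: `escape_uri_old` approximates "convert to UTF-8 octets, then percent-encode" with the standard library's `quote`, and `escape_uri_new` encodes only the space character.

```python
from urllib.parse import quote


def escape_uri_old(uri: str) -> str:
    """Approximate the old behaviour: UTF-8 octets, then percent-encoding."""
    # Reserved URI characters and existing '%' escapes are left alone.
    return quote(uri, safe=":/?#[]@!$&'()*+,;=%")


def escape_uri_new(uri: str) -> str:
    """Approximate the new behaviour: only spaces are percent-encoded."""
    return uri.replace(" ", "%20")


if __name__ == "__main__":
    uri = "http://example.com/caf\u00e9 menu"
    print(escape_uri_old(uri))
    print(escape_uri_new(uri))
```

The second form keeps non-ASCII characters readable in the output document, which is the motivation given in the entry above.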
New module Text.Pandoc.SelfContained.
New module Text.Pandoc.Docx.
New module Text.Pandoc.PDF.
Added writerBeamer to WriterOptions.
Added normalizeDate to Text.Pandoc.Shared.
Added splitStringWithIndices in Text.Pandoc.Shared. This is like splitWithIndices, but it is sensitive to distinctions between wide, combining, and regular characters.
Text.Pandoc.Pretty:
chomp combinator.
beforeNonBreak combinator. This allows you to include something conditionally on it being before a nonblank. Used for RST inline math.
charWidth function. All characters marked W or F in the unicode spec EastAsianWidth.txt get width 2.
realLength, based on charWidth. realLength is now used in calculating offsets.
New module Text.Pandoc.Slides, for common functions for breaking a document into slides.
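The charWidth and realLength entries under Text.Pandoc.Pretty describe a display-width calculation that can be sketched with Python's `unicodedata` module. This is an illustration only: the W/F rule comes from the entry, while giving combining characters width 0 is an assumption of this sketch.

```python
import unicodedata


def char_width(c: str) -> int:
    """Approximate display width of a single character."""
    if unicodedata.combining(c):
        return 0  # assumption: combining marks take no column of their own
    if unicodedata.east_asian_width(c) in ("W", "F"):
        return 2  # wide and fullwidth characters occupy two columns
    return 1


def real_length(s: str) -> int:
    """Display width of a string, as the sum of its character widths."""
    return sum(char_width(c) for c in s)


if __name__ == "__main__":
    print(real_length("abc"), real_length("\u65e5\u672c"))
```

Using a width like this instead of the character count keeps table columns and wrapped lines aligned when the text contains East Asian wide characters.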
Removed Text.Pandoc.S5, which is no longer needed.
Removed Text.Pandoc.CharacterReferences. Moved characterReference to Text.Pandoc.Parsing. decodeCharacterReferences is replaced by fromEntities in Text.Pandoc.XML.
Added Text.Pandoc.ImageSize. This is intended for use in docx and odt writers, so the size and dpi of images can be calculated.
Removed writerAscii in WriterOptions.
Added writerHighlight to WriterOptions.
Added DZSlides to HTMLSlideVariant.
writeEPUB has a new argument for font files to embed.
Added stateLastStrPos to ParserState. This lets us keep track of whether we’re parsing the position immediately after a regular (non-space, non-symbol) string, which is useful for distinguishing apostrophes from single quote starts.
Text.Pandoc.Parsing:
escaped now returns a Char.
charsInBalanced', added a character parser as a parameter of charsInBalanced. This is needed for proper handling of escapes, etc.
withRaw.
Added toEntities to Text.Pandoc.XML.
Text.Pandoc.Readers.LaTeX:
handleIncludes.
rawLaTeXBlock instead of rawLaTeXEnvironment'.
Added ToJsonFilter class and toJsonFilter function to Text.Pandoc, deprecating the old jsonFilter function.
Text.Pandoc.Highlighting:
highlightHtml, defaultHighlightingCss.
formatLaTeXInline, formatLaTeXBlock, and highlight, plus key functions from highlighting-kate.
highlight returns a Maybe, not an Either.
Adjusted Arbitrary instance to help avoid timeouts in tests.
Added Tests.Writers.Markdown to cabal file.
Relaxed version bounds on pandoc-types, test-framework.
Added script to produce OS X package.
Made templates directory a git submodule. This should make it easier for people to revise their custom templates when the default templates change.
Changed template naming scheme: FORMAT.template -> default.FORMAT. Note: If you have existing templates in ~/.pandoc/templates, you must rename them to conform to the new scheme!
Default template improvements:
s5-url and slidy-url variables, instead of hard-coding. If you want to put your slidy files in the slidy subdirectory, for example, you can do pandoc -t slidy -V slidy-url=slidy -s.
\and to separate authors in LaTeX documents (reader & writer). Closes #279.
\emergencystretch to prevent overfull lines.
hyperref options for xetex, fixing problems with unicode bookmarks (thanks to CircleCode).
ucs package, use utf8 rather than utf8x with inputenc. This covers fewer characters but is more robust with other packages, and ucs is unmaintained. Users who need better unicode support should use xelatex or lualatex.
If a template specified with --template is not found, look for it in datadir. Also, if no extension is provided, supply one based on the writer. So now you can put your special.latex template in ~/.pandoc/templates, and use it from any directory via pandoc -t latex --template special.
Added nonspaceChar to Text.Pandoc.Parsing.
Fixed smart quotes bug, now handling '...hi' properly.
RST reader:
simpleReferenceName parser.
HTML reader:
LaTeX reader: Handle \subtitle command (a subtitle is added to the title, after a colon and linebreak). Closes #280.
Leaner reference.odt.
Added unexported module Text.Pandoc.MIME for use in the ODT writer.
ODT writer: Construct manifest.xml based on archive contents. This fixes a bug in ODTs containing images. Recent versions of LibreOffice would reject these as corrupt, because manifest.xml did not contain a reference to the image files.
LaTeX writer:
\texttt and escapes instead of \verb!...!, which is too fragile (doesn’t work in command arguments).
\enquote{} for quotes if the template includes the csquotes package. This provides better support for local quoting styles. (Thanks to Andreas Wagner for the idea.)
ConTeXt writer: Make \starttyping/\stoptyping flush with margin, preventing spurious blank lines.
Slidy writer:
slidy.css with --offline option, so users can more easily edit it.
S5 writer:
s5/default/slides.js.{comment,packed} with new compressed s5/default/slides.min.js.
data: protocol to embed S5 CSS in <link> tags, when --offline is specified. Using inline CSS didn’t work with Chrome or Safari. This fixes offline S5 on those browsers.
HTML writer: Removed English title on footnote backlinks. This is incongruous in non-English documents.
Docbook writer:
programlisting tags (instead of screen) for code blocks.
markdown2pdf:
-halt-on-error -interaction nonstopmode instead of -interaction=batchmode, which essentially just ignored errors, leading to bad results. Better to know when something is wrong.
pdflatex.
--mathjax now takes an optional URL argument. If it is not provided, pandoc links directly to the (secure) mathjax CDN, as now recommended (thanks to dsanson).
Deprecated --xetex option in pandoc. It is no longer needed, since the LaTeX writer now produces a file that can be processed by latex, pdflatex, lualatex, or xelatex.
Introduced --luatex option to markdown2pdf. This causes lualatex to be used to create the PDF.
Added --epub-cover-image option.
Documented --biblatex and --natbib options.
Allow --section-divs with slidy output. Resolves Issue #296.
Disallow notes within notes in reST and markdown. These previously caused infinite looping and stack overflows. For example:
[^1]
[^1]: See [^1]
Note references are allowed in reST notes, so this isn’t a full implementation of reST. That can come later. For now we need to prevent the stack overflows. Partially resolves Issue #297.
EPUB writer: Allow non-plain math methods.
Forbid ()s in citation item keys. Resolves Issue #304: problems with (@item1; @item2) because the final paren was being parsed as part of the item key.
Changed URI parser so it doesn’t include trailing punctuation. So, in RST, http://google.com. should be parsed as a link followed by a period. The parser is smart enough to recognize balanced parentheses, as often occur in wikipedia links: http://foo.bar/baz_(bam).
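The trailing-punctuation rule above can be sketched as a small post-processing step in Python. This is not how pandoc's parser is written: the function name `split_trailing_punctuation` and the exact punctuation set are assumptions. It only illustrates the idea that a closing parenthesis is kept when it balances an opening one inside the URI.

```python
from typing import Tuple


def split_trailing_punctuation(uri: str) -> Tuple[str, str]:
    """Split a candidate URI into (link, trailing punctuation)."""
    end = len(uri)
    while end > 0:
        ch = uri[end - 1]
        if ch in ".,;:!?":
            end -= 1  # sentence punctuation never ends the link
        elif ch == ")" and uri[:end].count(")") > uri[:end].count("("):
            end -= 1  # unbalanced ')' belongs to the surrounding text
        else:
            break
    return uri[:end], uri[end:]


if __name__ == "__main__":
    print(split_trailing_punctuation("http://google.com."))
    print(split_trailing_punctuation("http://foo.bar/baz_(bam)"))
```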
Markdown+lhs reader: Require space after inverse bird tracks, so that HTML tags can be used freely at the left margin of a markdown+lhs document. Thanks to Conal Elliot for the suggestion.
Markdown reader: Fixed bug in footnote order (reported by CircleCode).
specialChars, so (http://google.com) will be parsed as a link in parens. Resolves Issue #291.
| followed by newline in RST line block.
\dots.
\\[10pt].
<b>, <emph>, etc.
Paras instead of Plains in some contexts.
OpenDocument writer: Use special First paragraph style for first paragraph after most non-paragraph blocks. This allows users to specify e.g. that only paragraphs after the first paragraph of a block are to be indented. Thanks to Andrea Rossato for the patch. Closes #20.
LaTeX writer: use deVerb on table and picture captions. Otherwise LaTeX complains about \verb inside command argument. Thanks to bbanier for reporting the bug.
Markdown writer: Insert HTML comment between list and indented code block. This prevents the code block from being interpreted as part of the list.
EPUB writer: Add a meta element to specify the cover. Some EPUB e-readers, such as the Nook, require a meta element inside the OPF metadata block to ensure the cover image is properly displayed. (Kelsey Hightower)
HTML writer: Use embed tag for images with non-image extensions. (e.g. PDFs). Closes #264.
LaTeX writer: Improved tables.
Un-URI-escape image filenames in LaTeX, ConTeXt, RTF, Texinfo. Also do this when copying image files into EPUBs and ODTs. Closes #263.
Changed to github issue tracker.
Added failing emph/strong markdown test case due to Perry Wagle.
duration variable in template. Setting this activates the timer.
markdown2pdf: Removed some debugging lines accidentally included in the 1.8.1 release. With those lines, the temp directory is created in the working directory, and it is not deleted. This fix restores the original behavior.
Added --ascii option. Currently supported only in the HTML writer, where it causes numerical entities to be used instead of UTF-8.
EPUB writer: --toc now works to provide a table of contents at the beginning of each chapter.
LaTeX writer: Change figure defaults to htbp. This prevents “too many unprocessed floats.” Resolves Issue #285.
Text.Pandoc.UTF8: Encode filenames even when using recent base.
markdown2pdf: Fixed filename encoding issues. With help from Paulo Tanimoto. Resolves Issue #286.
HTML writer: Put line breaks in section divs.
Text.Pandoc.Shared: Make writerSectionDivs default to False.
HTML writer:
Markdown reader: Fixed bug in footnote block parser (pointed out by Jesse Rosenthal). The problem arose when the blank line at the end of a footnote block contained indenting spaces.
Shared: Improved ‘normalize’ function so it normalizes Spaces too. In normal form, Space elements only occur to separate two non-Space elements. So, we never have [Space], or [, …, Space].
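The Space invariant stated above (a Space only ever separates two non-Space elements) can be illustrated on a plain list. In this Python sketch inline elements are modelled as strings and the marker `"Space"` stands in for pandoc's Space constructor; the function name is hypothetical and the real normalize function does considerably more than this.

```python
SPACE = "Space"  # stand-in for the Space inline element


def normalize_spaces(inlines):
    """Drop leading, trailing, and repeated Space markers from a list."""
    out = []
    for x in inlines:
        if x == SPACE and (not out or out[-1] == SPACE):
            continue  # skip a leading Space or a Space right after another
        out.append(x)
    if out and out[-1] == SPACE:
        out.pop()  # at most one trailing Space can remain at this point
    return out


if __name__ == "__main__":
    print(normalize_spaces([SPACE, "a", SPACE, SPACE, "b", SPACE]))
```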
Tests:
README:
markdown2pdf: Fixed bug with output file extensions. Previously markdown2pdf test.txt -o test.en.pdf would produce test.pdf, not test.en.pdf. Thanks to Paolo Tanimoto for the fix.
Revised Interact.hs so that it works with the CPP macros in the UTF8 module.
Revised Setup.hs so that we don’t call MakeManPage.hs unless the man pages are out of date.
Support for citations using Andrea Rossato’s citeproc-hs 0.3. You can now write, for example,
Water is wet [see @doe99, pp. 33-35; also @smith04, ch. 1].
and, when you process your document using pandoc, specifying a citation style using --csl and a bibliography using --bibliography, the citation will be replaced by an appropriately formatted citation, and a list of works cited will be added to the end of the document.
This means that you can switch effortlessly between different citation and bibliography styles, including footnote, numerical, and author-date formats. The bibliography can be in any of the following formats: MODS, BibTeX, BibLaTeX, RIS, EndNote, EndNote XML, ISI, MEDLINE, Copac, or JSON. See the README for further details.
Citations are supported in the markdown reader, using a special syntax, and in the LaTeX reader, using natbib or biblatex syntax. (Thanks to Nathan Gass for the natbib and biblatex support.)
New textile reader and writer. Thanks to Paul Rivier for contributing the textile reader, an almost complete implementation of the textile syntax used by the ruby RedCloth library. Resolves Issue #51.
New org writer, for Emacs Org-mode, contributed by Puneeth Chaganti.
New json reader and writer, for reading and writing a JSON representation of the native Pandoc AST. These are much faster than the native reader and writer, and should be used for serializing Pandoc to text. To convert between the JSON representation and native Pandoc, use encodeJSON and decodeJSON from Text.JSON.Generic.
A new jsonFilter function in Text.Pandoc makes it easy to write scripts that transform a JSON-encoded pandoc document. For example:
-- removelinks.hs - removes links from document
import Text.Pandoc
main = interact $ jsonFilter $ bottomUp removeLink
  where removeLink (Link xs _) = Emph xs
        removeLink x = x
To use this to remove links while translating markdown to LaTeX:
pandoc -t json | runghc removelinks.hs | pandoc -f json -t latex
Attributes are now allowed in inline Code elements, for example:
In this code, `ulist ! [theclass "special"] << elts`{.haskell} is...
The attribute syntax is the same as for delimited code blocks. Code inline has an extra argument place for attributes, just like CodeBlock. Inline code will be highlighted in HTML output, if pandoc is compiled with highlighting support. Resolves Issue #119.
New RawBlock and RawInline elements (replacing RawHtml, HtmlInline, and TeX) provide lots of flexibility in writing scripts to transform Pandoc documents. Scripts can now change how each element is rendered in each output format.
You can now define LaTeX macros in markdown documents, and pandoc will apply them to TeX math. For example,
\newcommand{\plus}[2]{#1 + #2}
$\plus{3}{4}$
yields 3+4. Since the macros are applied in the reader, they will work in every output format, not just LaTeX.
LaTeX macros can also be used in LaTeX documents (both in math and in non-math contexts).
A new --mathjax option has been added for displaying math in HTML using MathJax. Resolves issue #259.
Footnotes are now supported in the RST reader. (Note, however, that unlike docutils, pandoc ignores the numeral or symbol used in the note; footnotes are put in an auto-numbered ordered list.) Resolves Issue #258.
A new --normalize option causes pandoc to normalize the AST before writing the document. This means that, for example, *hi**there* will be rendered as <em>hithere</em> instead of <em>hi</em><em>there</em>. This is not the default, because there is a significant performance penalty.
A new --chapters command-line option causes headers in DocBook, LaTeX, and ConTeXt to start with “chapter” (level one). Resolves Issue #265.
In DocBook output, <chapter> is now used for top-level headers if the template contains <book>. Resolves Issue #265.
A new --listings option in pandoc and markdown2pdf causes the LaTeX writer to use the listings package for code blocks. (Thanks to Josef Svennigsson for the pandoc patch, and Etienne Millon for the markdown2pdf patch.)
markdown2pdf now supports --data-dir.
URLs in autolinks now have class “url” so they can be styled.
Improved prettyprinting in most formats. Lines will be wrapped more evenly and duplicate blank lines avoided.
New --columns command-line option sets the column width for line wrapping and relative width calculations for tables.
Made --smart work in HTML, RST, and Textile readers, as well as markdown.
Added --html5 option for HTML5 output.
Added support for listings package in LaTeX reader (Puneeth Chaganti).
Added support for simple tables in the LaTeX reader.
Added support for simple tables in the HTML reader.
Significant performance improvements in many readers and writers.
Moved Text.Pandoc.Definition from the pandoc package to a new auxiliary package, pandoc-types. This will make it possible for other programs to supply output in Pandoc format, without depending on the whole pandoc package.
Added Attr field to Code.
Removed RawHtml, HtmlInline, and TeX elements; added generic RawBlock and RawInline.
Moved generic functions to Text.Pandoc.Generic. Deprecated processWith, replacing it with two functions, bottomUp and topDown. Removed previously deprecated functions processPandoc and queryPandoc.
Added Text.Pandoc.Builder, for building Pandoc structures.
Text.Pandoc now exports association lists readers and writers.
Added Text.Pandoc.Readers.Native, which exports readNative. readNative can now read full pandoc documents, block lists, blocks, inline lists, or inlines. It will interpret Str "hi" as if it were Pandoc (Meta [] [] []) [Plain [Str "hi"]]. This should make testing easier.
Removed deprecated -C/--custom-header option. Use --template instead.
--biblio-file has been replaced by --bibliography. --biblio-format has been removed; pandoc now guesses the format from the file extension (see README).
pandoc will treat an argument as a URI only if it has an http(s) scheme. Previously pandoc would treat some Windows pathnames beginning with C:/ as URIs.
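The reason for this change is easy to reproduce with any generic URI parser: a Windows drive letter looks like a one-letter scheme. The following Python sketch shows the stricter check; the function name is_remote_source is hypothetical, and this is not pandoc's code.

```python
from urllib.parse import urlparse


def is_remote_source(arg):
    # Treat an argument as a URI only if its scheme is http or https.
    # A generic parser would also report a scheme for "C:/notes/input.txt".
    return urlparse(arg).scheme in ("http", "https")


for arg in ("http://example.com/page.html", "C:/notes/input.txt", "input.txt"):
    print(arg, is_remote_source(arg))
```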
The --sanitize-html option and the stateSanitize field in ParserState have been removed. Sanitization is better done in the resulting HTML using xss-sanitize, which is based on pandoc’s sanitization, but improved.
pandoc now adds a newline to the end of its output in fragment mode (= not --standalone).
Added support for lang in html tag in the HTML template, so you can do pandoc -s -V lang=es, for example.
highlightHtml in Text.Pandoc.Highlighting now takes a boolean argument that selects between “inline” and “block” HTML.
Text.Pandoc.Writers.RTF now exports rtfEmbedImage. Images are embedded in RTF output when possible (png, jpeg). Resolves Issue #275.
Added Text.Pandoc.Pretty. This is better suited for pandoc than the pretty package. Changed all writers that used Text.PrettyPrint.HughesPJ to use Text.Pandoc.Pretty instead.
Rewrote writeNative using the new prettyprinting module. It is now much faster. The output has been made more consistent and compressed. writeNative is also now sensitive to writerStandalone, and will simply print a block list if writerStandalone is False.
Removed Text.Pandoc.Blocks. Text.Pandoc.Pretty allows you to define blocks and concatenate them, so a separate module is no longer needed.
Text.Pandoc.Shared:
writerColumns, writerChapters, and writerHtml5 to WriterOptions.
normalize.
wrapped, wrapIfNeeded, wrappedTeX, wrapTeXIfNeeded, hang', BlockWrapper, wrappedBlocksToDoc.
splitBy take a test instead of an element.
findDataFile, refactored readDataFile.
stringify. Rewrote inlineListToIdentifier using stringify.
inlineListToIdentifier to treat ‘60’ as ’ ’.
Text.Pandoc.Readers.HTML:
rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment
htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
Moved smartPunctuation from Text.Pandoc.Readers.Markdown to Text.Pandoc.Readers.Parsing, and parameterized it with an inline parser.
Ellipses are no longer allowed to contain spaces. Previously we allowed ‘. . .’, ‘ . . . ’, etc. This caused too many complications, and removed authors’ flexibility in combining ellipses with spaces and periods.
Allow linebreaks in URLs (treat as spaces). Also, a string of consecutive spaces or tabs is now parsed as a single space. If you have multiple spaces in your URL, use %20%20.
Text.Pandoc.Parsing:
refsMatch.
Key constructor.
Ord and Eq instances for Key.
toKey and fromKey to convert between Key and [Inline].
readWith.
Small change in calculation of relative widths of table columns. If the size of the header > the specified column width, use the header size as 100% for purposes of calculating relative widths of columns.
Markdown writer now uses some pandoc-specific features when --strict is not specified: a backslash followed by a newline is used for a hard linebreak instead of two spaces and a newline, and delimited code blocks are used when there are attributes.
HTML writer: improved gladTeX output by setting ENV appropriately for display or inline math (Jonathan Daugherty).
LaTeX writer: Use \paragraph, \subparagraph for level 4,5 headers.
LaTeX reader:
\label{foo} and \ref{foo} now become {foo} instead of (foo).
\index{} commands are skipped.
Added fontsize variable to default LaTeX template. This makes it easy to set the font size using markdown2pdf: markdown2pdf -V fontsize=12pt input.txt.
Fixed problem with strikeout in LaTeX headers when using hyperref, by adding a command to the default LaTeX template that disables \sout inside pdf strings. Thanks to Joost Kremers for the fix.
The COLUMNS environment variable no longer has any effect.
Pandoc now compiles with GHC 7. (This alone leads to a significant performance improvement, 15-20%.)
Completely rewrote HTML reader using tagsoup as a lexer. The new reader is faster and more accurate. Unlike the old reader, it does not get bogged down on some input (Issues #277, 255). And it handles namespaces in tags (Issue #274).
Replaced escapeStringAsXML with a faster version.
Rewrote spaceChar and some other parsers in Text.Pandoc.Parsing for a significant performance boost.
Improved performance of all readers by rewriting parsers.
Simplified Text.Pandoc.CharacterReferences by using entity lookup functions from TagSoup.
Text.Pandoc.UTF8 now uses the unicode-aware IO functions from System.IO if base >= 4.2. This gives support for Windows line endings on Windows.
Remove duplications in documentation by generating the pandoc man page from README, using MakeManPage.hs.
README now includes a full description of markdown syntax, including non-pandoc-specific parts. A new pandoc_markdown man page is extracted from this, so you can look up markdown syntax by doing man pandoc_markdown.
Completely revised test framework (with help from Nathan Gass). The new test framework is built when the tests Cabal flag is set. It includes the old integration tests, but also some new unit and quickcheck tests. Test output has been much improved, and you can now specify a glob pattern after cabal test to indicate which tests should be run; for example cabal test citations will run all the citation tests.
Added a shell script, stripansi.sh, for filtering ANSI control sequences from test output: cabal test | ./stripansi.sh > test.log.
Added Interact.hs to make it easier to use ghci while developing. Interact.hs loads ghci from the src directory, specifying all the options needed to load pandoc modules (including specific package dependencies, which it gets by parsing dist/setup-config).
Added Benchmark.hs, testing all readers + writers using criterion.
Added stats.sh, to make it easier to collect and archive benchmark and lines-of-code stats.
Added upper bounds to all cabal dependencies.
Include man pages in extra-source-files. This allows users to install pandoc from the tarball without needing to build the man pages.
Filenames are encoded as UTF8. Resolves Issue #252.
Handle curly quotes better in --smart mode. Previously, curly quotes were just parsed literally, leading to problems in some output formats. Now they are parsed as Quoted inlines, if --smart is specified. Resolves Issue #270.
Text.Pandoc.Parsing: Fixed bug in grid table parser. Spaces at end of line were not being stripped properly, resulting in unintended LineBreaks.
Markdown reader:
aaa <!-- comment --> bbb can be a single paragraph.
^[link](/foo)^ gets recognized as a superscripted link, not an inline note followed by garbage.
Mr.) at the end of a line.
RST reader:
Para instead of Plain, matching behavior of rst2xml.py.
LaTeX reader:
\begin{document} in, say, a verbatim block.
\begin or \end and {.
\L and \l.
LaTeX writer:
\href{..}.
\parbox.
OpenDocument writer: don’t print raw TeX.
Markdown writer:
Image. URI was getting unescaped twice!
LaTeX and ConTeXt: Escape [ and ] as {[} and {]}. This avoids unwanted interpretation as an optional argument.
ConTeXt writer: Fixed problem with inline code. Previously } would be rendered \type{}}. Now we check the string for ‘}’ and ‘{’. If it contains neither, use \type{}; otherwise use \mono{} with an escaped version of the string.
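The decision rule for inline code can be sketched like this. The function name context_inline_code is hypothetical, and the escaping shown for the \mono{} branch (a backslash before each brace) is a simplifying assumption; the real writer escapes other special characters as well.

```python
def context_inline_code(code):
    # Choose between \type{} and \mono{} for a piece of inline code.
    if "{" not in code and "}" not in code:
        # No braces: the string can go into \type{} verbatim.
        return "\\type{" + code + "}"
    # Assumption: only braces are escaped here, as \{ and \}.
    escaped = code.replace("{", "\\{").replace("}", "\\}")
    return "\\mono{" + escaped + "}"


print(context_inline_code("x = 1"))
print(context_inline_code("}"))
```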
: now allowed in HTML tags. Resolves Issue #274.
New EPUB and HTML Slidy writers. (Issue #122)
All input is assumed to be UTF-8, no matter what the locale and ghc version, and all output is UTF-8. This reverts to pre-1.5 behavior. Also, a BOM, if present, is stripped from the input.
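In Python terms, the input and output handling described here corresponds roughly to the sketch below. The function names decode_input and encode_output are hypothetical; the "utf-8-sig" codec is Python's standard way to decode UTF-8 while dropping a leading BOM if one is present.

```python
def decode_input(raw):
    # Decode bytes as UTF-8 regardless of locale, stripping a leading BOM.
    return raw.decode("utf-8-sig")


def encode_output(text):
    # Always write UTF-8, and never emit a BOM.
    return text.encode("utf-8")


with_bom = b"\xef\xbb\xbfcaf\xc3\xa9"
print(decode_input(with_bom))
print(encode_output(decode_input(with_bom)))
```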
Markdown now supports grid tables, whose cells can contain arbitrary block elements. (Issue #43)
Sequentially numbered example lists in markdown with @ marker.
Markdown table captions can begin with a bare colon and no longer need to include the English word “table.” Also, a caption can now occur either before or after the table. (Issue #227)
New command-line options:
--epub-stylesheet allows you to specify a CSS file that will be used to style your ebook.
--epub-metadata allows you to specify metadata for the ebook.
--offline causes the generated HTML slideshow to include all needed scripts and stylesheets.
--webtex causes TeX math to be converted to images using the Google Charts API (unless a different URL is specified).
--section-divs causes div tags to be added around each section in an HTML document. (Issue #230, 239)
Default behavior of S5 writer in standalone mode has changed: previously, it would include all needed scripts and stylesheets in the generated HTML; now, only links are included unless the --offline option is used.
Default behavior of HTML writer has changed. Between 1.2 and 1.5, pandoc would enclose sections in div tags with identifiers on the div tags, so that the sections can be manipulated in javascript. This caused undesirable interactions with raw HTML div tags. So, starting with 1.6, the default is to put the identifiers directly on the header tags, and not to include the divs. The --section-divs option selects the 1.2-1.5 behavior.
API changes:
HTMLMathMethod: Added WebTeX, removed MimeTeX.
WriterOptions: Added writerUserDataDir, writerSourceDirectory, writerEPUBMetadata fields. Removed writerIncludeBefore, writerIncludeAfter.
headerShift to Text.Pandoc.Shared.
ParserState from Text.Pandoc.Shared to a new module, Text.Pandoc.Parsing.
stateHasChapters to ParserState.
HTMLSlideVariant.
KeyTable a map instead of an association list.
Meta fields (docTitle, docAuthors, docDate).
Pandoc, Meta, Inline, and Block have been given Ord instances.
Key), with its own Ord instance for case-insensitive comparison.
Text.Pandoc.Writers.EPUB.
Text.Pandoc.UUID.
Text.Pandoc.ODT, added Text.Pandoc.Writers.ODT. Removed saveOpenDocumentAsODT, added writeODT.
Text.Pandoc.Writers.Native and writeNative. Removed prettyPandoc.
Text.Pandoc.UTF8 for portable UTF8 string IO.
Text.Pandoc.Writers.S5 and the writeS5 function. Moved s5Includes to a new module, Text.Pandoc.S5. To write S5, you now use writeHtml with writerSlideVariant set to S5Slides or SlidySlides.
Template changes. If you use custom templates, please update them, particularly if you use syntax highlighting with pandoc. The old HTML templates hardcoded highlighting CSS that will no longer work with the most recent version of highlighting-kate.
<body> tag, as documented. (Issue #241)
Removed excess newlines at the end of output. Note: because output will not contain an extra newline, you may need to make adjustments if you are inserting pandoc’s output into a template.
In S5 and slidy, horizontal rules now cause a new slide, so you are no longer limited to one slide per section.
Improved handling of code in man writer. Inline code is now monospace, not bold, and code blocks now use .nf (no fill) and .IP (indented para).
HTML reader parses <tt> as Code. (Issue #247)
html+lhs output now contains bird tracks, even when compiled without highlighting support. (Issue #242)
Colons are no longer allowed in autogenerated XML/HTML identifiers, since they have a special meaning in XML.
Code improvements in ODT writer. Remote images are now replaced with their alt text rather than a broken link.
LaTeX reader improvements:
\section, \chapter parsers more forgiving of whitespace.
\chapter{} in latex.
rawLaTeXInline to accept \section, \begin, etc.
rawLaTeXInline' in LaTeX reader, and export rawLaTeXInline for use in markdown reader.
\section{foo} was not recognized as raw TeX in markdown document.
LaTeX writer: images are automatically shrunk if they would extend beyond the page margin.
Plain, markdown, RST writers now use unicode for smart punctuation.
Man writer converts math to unicode when possible, as in other writers.
markdown2pdf can now recognize citeproc options.
Command-line arguments are converted to UTF-8. (Issue #234)
Text.Pandoc.TeXMath has been rewritten to use texmath’s parser. This allows it to handle a wider range of formulas. Also, if a formula cannot be converted, it is left in raw TeX; formulas are no longer partially converted.
Unicode curly quotes are left alone when parsing smart quotes. (Issue #143)
Cabal file changes:
Use explicit imports from Data.Generics. Otherwise we have a conflict with the ‘empty’ symbol, introduced in syb >= 0.2. (Issue #237)
New data files: slidy/slidy.min.js, slidy/slidy.min.css, epub.css.
--mathml option, for display of TeX math as MathML.
--data-dir option, allowing users to specify a data directory other than ~/.pandoc. Files placed in this directory will be used instead of system defaults.
--base-header-level option. For example, --base-header-level=2 changes level 1 headers to level 2, level 2 to level 3, etc.
html2markdown has been removed; it is no longer necessary, given the last two changes. pandoc can be used by itself to convert web pages to markdown or other formats.
hsmarkdown has also been removed. Use pandoc --strict instead. Or symlink pandoc’s executable to hsmarkdown; pandoc will then behave like hsmarkdown used to.
= head = is now level 1 instead of level 2.
-B and -A options now imply -s and no longer work in fragment mode.
\chapter is now used instead of \section when the documentclass is book, report, or memoir.
--standalone mode. Added --template and --variable options. The --print-default-header option is now --print-default-template. See README under “Templates” for details.
--custom-header option should still work, but it has been deprecated.
--reference-odt option allows users to customize styles in ODT output.
~/.pandoc directory, where they will override system defaults. See README for details.
>, <, ", &).
--xetex option for pandoc and markdown2pdf.
markdown2pdf and hsmarkdown wrappers.
--id-prefix option to help prevent duplicate identifiers when you’re generating HTML fragments.
--indented-code-classes option, which specifies default highlighting syntax for indented code blocks.
--number-sections now affects HTML output.
--smart).
--strict compatible with --standalone and --toc.
--email-obfuscation option.
--jsmath option supporting use of pandoc with [jsMath].
--sanitize-html option (and a corresponding parameter in ParserState for those using the pandoc libraries in programs). This option causes pandoc to sanitize HTML (in HTML or Markdown input) using a whitelist method. Possibly harmful HTML elements are replaced with HTML comments. This should be useful in the context of web applications, where pandoc may be used to convert user input into HTML.
--gladtex, --mimetex, and --asciimathml options are provided. See the User’s Guide for details.
--no-wrap option that disables line wrapping and minimizes whitespace in HTML output.