pandoc 1.13.1 (30 August 2014)

  • Fixed --self-contained with Windows paths (#1558). Previously C:\foo.js was being wrongly interpreted as a URI.
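
    The distinction can be sketched as follows. This is a minimal Python illustration, not pandoc's Haskell code; the name is_uri and the rule that a one-letter "scheme" is a drive letter are assumptions made for the example.

```python
import re

# URI scheme syntax: a letter followed by letters, digits, "+", "-" or ".".
SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")

def is_uri(path):
    """Return True if path looks like a URI rather than a file path.

    Assumption for this sketch: a one-letter "scheme" such as "C:" is
    treated as a Windows drive letter, not as a URI scheme.
    """
    match = SCHEME.match(path)
    return bool(match) and len(match.group(1)) > 1

print(is_uri(r"C:\foo.js"))                  # False: drive letter, not a scheme
print(is_uri("http://example.com/foo.js"))   # True
```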

  • HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline):

    <video controls="controls">
       <source src="../videos/test.mp4" type="video/mp4" />
       <source src="../videos/test.webm" type="video/webm" />
          The videos can not be played back on your system.<br/>
          Try viewing on Youtube (requires Internet connection):
          <a href="">Relative Velocity on

    This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we’re parsing a “plain” sequence of inlines, we don’t parse anything that COULD be a block-level tag.

  • Docx reader:

    • Be sensitive to user styles. Note that “Hyperlink” is “blacklisted,” as we don’t want the default underline styling to be inherited by all links (Jesse Rosenthal).
    • Read single paragraph in table cell as Plain (Jesse Rosenthal). This makes the docx reader’s native output fit with the way the markdown reader understands its markdown output.
  • Txt2Tags reader:

    • Header is now parsed only if standalone flag is set (Matthew Pickering).
    • The header is now parsed as meta information. The first line is the title, the second is the author, and the third is the date (Matthew Pickering).
    • Corrected formatting of %%mtime macro (Matthew Pickering).
    • Fixed crash when reading from stdin.
  • Textile writer: Extended the range of cases where native textile tables will be used (as opposed to raw HTML): we now handle any alignment type, but only for simple tables with no captions.

  • EPUB writer: Don’t use page-progression-direction in EPUB2, which doesn’t support it. Also, if page-progression-direction is not specified in metadata, don’t include the attribute even in EPUB3; not including it is the same as including it with the value “default”, as we did before. (#1550)

  • Org writer: Accept example lines with indentation at the beginning (Calvin Beck).

  • DokuWiki writer:

    • Refactor to use Reader monad (Matthew Pickering).
    • Avoid using raw HTML in table cells; use \\ instead of newlines (Jesse Rosenthal).
    • Properly handle HTML table cell alignments, and use spacing to make the tables look prettier (#1566).
  • Docx writer:

    • Bibliography entries get Bibliography style (#1559).
    • Implement change tracking (Jesse Rosenthal).
  • LaTeX writer:

    • Fixed a bug that caused a table caption to repeat across all pages (Jose Luis Duran).
    • Improved vertical spacing in tables and made it customizable using standard lengths set by booktabs. See!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J (Jose Luis Duran).
    • Added \strut to fix spacing in multiline tables (Jose Luis Duran).
    • Use \tabularnewline instead of \\ in table cells (Jose Luis Duran).
    • Made horizontal rules more flexible (Jose Luis Duran).
  • Text.Pandoc.MIME:

    • Added MimeType (type synonym for String) and getMimeTypeDef. Code cleanups (Artyom Kazak).
  • Templates:

    • LaTeX template: disable microtype protrusion for typewriter font (#1549, thanks lemzwerg).
  • Improved OSX build procedure.

  • Added network-uri flag, to deal with split of network-uri from network.

  • Fix build dependencies for the trypandoc flag, so that they are ignored if trypandoc flag is set to False (Gabor Pali).

  • Updated README to remove outdated claim that --self-contained looks in the user data directory for missing files.

pandoc (17 August 2014)

  • Docx writer:

    • Fixed regression which bungled list numbering (#1544), causing all lists to appear as basic ordered lists.
    • Include row width in table rows (Christoffer Ackelman, Viktor Kronvall). Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000). This helps persuade Word to lay out the table with the widths we specify.
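
    The pct arithmetic above can be sketched in a few lines. This is a Python illustration only; it assumes column widths are given as fractions of the text width, and the rounding step is an assumption rather than something the notes specify.

```python
def row_width_pct(column_widths):
    """Sum relative column widths (fractions of the text width) and
    express the total as a fraction of 5000, so 5000 means full width."""
    return round(sum(column_widths) * 5000)

print(row_width_pct([0.25, 0.25, 0.5]))  # 5000: the row spans the full width
print(row_width_pct([0.25, 0.25]))       # 2500: the row spans half the width
```
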
  • Fixed a bug in Windows 8 which caused pandoc not to find the pandoc-citeproc filter (#1542).

  • Docx reader: miscellaneous under-the-hood improvements (Jesse Rosenthal). Most significantly, the reader now uses Builder, leading to some performance improvements.

  • HTML reader: Parse appropriately styled span as SmallCaps.

  • Markdown writer: don’t escape $, ^, ~ when tex_math_dollars, superscript, and subscript extensions, respectively, are deactivated (#1127).
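
    A rough sketch of this extension-dependent escaping, in Python. The function name and the use of a set of extension names are assumptions for illustration; the real writer escapes many other characters that are not modelled here.

```python
def escape_markdown(text, extensions):
    """Backslash-escape $, ^ and ~ only when the extension that gives
    each character a special meaning is enabled."""
    special = {
        "$": "tex_math_dollars",
        "^": "superscript",
        "~": "subscript",
    }
    out = []
    for ch in text:
        if ch in special and special[ch] in extensions:
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)

# With only superscript enabled, only ^ is escaped.
print(escape_markdown("a^b ~c $5", {"superscript"}))
```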

  • Added trypandoc flag to build CGI executable used in the online demo.

  • Makefile: Added ‘quick’, ‘osxpkg’ targets.

  • Updated README in templates to indicate templates license. The templates are dual-licensed, BSD3 and GPL2+.

pandoc 1.13 (15 August 2014)

New features

  • Added docx as an input format (Jesse Rosenthal). The docx reader includes conversion of native Word equations to pandoc LaTeX Math elements. Metadata is taken from paragraphs at the beginning of the document with styles Author, Title, Subtitle, Date, and Abstract.

  • Added epub as an input format (Matthew Pickering). The epub reader includes conversion of MathML to pandoc LaTeX Math elements.

  • Added t2t (Txt2Tags) as an input format (Matthew Pickering). Txt2tags is a lightweight markup format described at

  • Added dokuwiki as an output format (Clare Macrae).

  • Added haddock as an output format.

  • Added --extract-media option to extract media contained in a zip container (docx or epub) while adjusting image paths to point to the extracted images.

  • Added a new markdown extension, compact_definition_lists, that restores the syntax for definition lists of pandoc 1.12.x, allowing tight definition lists with no blank space between items, and disallowing lazy wrapping. (See below under behavior changes.)

  • Added an extension epub_html_exts for parsing HTML in EPUBs.

  • Added extensions native_spans and native_divs to activate parsing of material in HTML span or div tags as Pandoc Span inlines or Div blocks.

  • --trace now works with the Markdown, HTML, Haddock, EPUB, Textile, and MediaWiki readers. This is an option intended for debugging parsing problems; ordinary users should not need to use it.

Behavior changes

  • Changed behavior of the markdown_attribute extension, to bring it in line with PHP markdown extra and multimarkdown. Setting markdown="1" on an outer tag affects all contained tags, recursively, until it is reversed with markdown="0" (#1378).

  • Revised markdown definition list syntax (#1429). Both the reader and writer are affected. This change brings pandoc’s definition list syntax into alignment with that used in PHP markdown extra and multimarkdown (with the exception that pandoc is more flexible about the definition markers, allowing tildes as well as colons). Lazily wrapped definitions are now allowed. Blank space is required between list items. The space before a definition is used to determine whether it is a paragraph or a “plain” element. WARNING: This change may break existing documents! Either check your documents for definition lists without blank space between items, or use markdown+compact_definition_lists for the old behavior.
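
    One way to check documents for definition lists without blank space between items is a small script like the one below. It is a heuristic sketch in Python, not part of pandoc: the function name and the marker pattern are assumptions, and it only catches a term that directly follows a definition and is directly followed by a definition marker.

```python
import re

# A definition marker: colon or tilde, optionally indented one or two spaces.
DEF_MARKER = re.compile(r"^ {0,2}[:~]\s+\S")

def find_compact_definition_items(text):
    """Return 1-based line numbers of terms that follow the previous
    definition with no blank line in between."""
    lines = text.splitlines()
    hits = []
    for i in range(1, len(lines) - 1):
        prev, term, nxt = lines[i - 1], lines[i], lines[i + 1]
        is_term = (
            bool(term.strip())
            and not term[0].isspace()
            and not DEF_MARKER.match(term)
        )
        if is_term and prev.strip() and DEF_MARKER.match(nxt):
            hits.append(i + 1)
    return hits

compact = "apple\n:   red fruit\norange\n:   orange fruit\n"
print(find_compact_definition_items(compact))  # [3]
```

    Any line number reported is a place where a blank line should be added, or where markdown+compact_definition_lists is needed to keep the old reading.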

  • .numberLines now works in fenced code blocks even if no language is given (#1287, jgm/highlighting-kate#40).

  • Improvements to --filter:

    • Don’t search PATH for a filter with an explicit path. This fixed a bug wherein a filter given with an explicit ./ path would be run from the system path, even if a file of that name existed in the working directory.
    • Respect shebang if filter is executable (#1389).
    • Don’t print misleading error message. Previously pandoc would say that a filter was not found, even in a case where the filter had a syntax error.
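
    The explicit-path rule above can be sketched as follows. This is a Python illustration under assumptions: resolve_filter and the file name myfilter.py are hypothetical, and pandoc's own lookup is written in Haskell.

```python
import os
import shutil

def resolve_filter(name):
    """Return the path to run for a --filter argument.

    A name containing a directory separator is an explicit path and is
    used as given; only a bare name is looked up on PATH.
    """
    if os.sep in name or "/" in name:
        return name
    return shutil.which(name)

# An explicit path is never replaced by a same-named program on PATH.
print(resolve_filter("./myfilter.py"))
```
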
  • HTML reader:

    • Parse div and span elements even without --parse-raw, provided native_divs and native_spans extensions are set. Motivation: these now generate native pandoc Div and Span elements, not raw HTML.
    • Parse EPUB-specific elements if the epub_html_exts extension is enabled. These include switch, footnote, rearnote, noteref.
  • Org reader:

    • Support for inline LaTeX. Inline LaTeX is now accepted and parsed by the org-mode reader. Both math symbols (like \tau) and LaTeX commands (like \cite{Coffee}), can be used without any further escaping (Albert Krewinkel).
  • Textile reader and writer:

    • The raw_tex extension is no longer set by default. You can enable it with textile+raw_tex.
  • DocBook reader:

    • Support equation, informalequation, inlineequation elements with mml:math content. This is converted into LaTeX and put into a Pandoc Math inline.
  • Revised plain output, largely following the style of Project Gutenberg:

    • Emphasis is rendered with _underscores_, strong emphasis with ALL CAPS.
    • Headings are rendered differently, with space to set them off, not with setext style underlines. Level 1 headers are ALL CAPS.
    • Math is rendered using unicode when possible, but without the distracting emphasis markers around variables.
    • Footnotes use a regular [n] style.
  • Markdown writer:

    • Horizontal rules are now a line across the whole page.
    • Prettier pipe tables. Columns are now aligned (#1323).
    • Respect the raw_html extension. pandoc -t markdown-raw_html no longer emits any raw HTML, including span and div tags generated by Span and Div elements.
    • Use span with style for SmallCaps (#1360).
  • HTML writer:

    • Autolinks now have class uri, and email autolinks have class email, so they can be styled.
  • Docx writer:

    • Document formatting is carried over from reference.docx. This includes margins, page size, page orientation, header, and footer, including images in headers and footers.
    • Include abstract (if present) with Abstract style (#1451).
    • Include subtitle (if present) with Subtitle style, rather than tacking it on to the title (#1451).
  • Org writer:

    • Write empty span elements with an id attribute as org anchors. For example Span ("uid",[],[]) [] becomes <<uid>>.
  • LaTeX writer:

    • Put table captions above tables, to match the conventional standard. (Previously they appeared below tables.)
    • Use \(..\) instead of $..$ for inline math (#1464).
    • Use \nolinkurl in email autolinks. This allows them to be styled using \urlstyle{tt}. Thanks to Ulrike Fischer for the solution.
    • Use \textquotesingle for ' in inline code. Otherwise we get curly quotes in the PDF output (#1364).
    • Use \footnote<.>{..} for notes in beamer, so that footnotes do not appear before the overlays in which their markers appear (#1525).
    • Don’t produce a \label{..} for a Div or Span element. Do produce a \hyperdef{..} (#1519).
  • EPUB writer:

    • If the metadata includes page-progression-direction (which can be ltr or rtl), the page-progression-direction attribute will be set in the EPUB spine (#1455).
  • Custom lua writers:

    • Custom writers now work with --template.
    • Removed HTML header scaffolding from sample.lua.
    • Made citation information available in lua writers.
  • --normalize and Text.Pandoc.Shared.normalize now consolidate adjacent RawBlocks when possible.

API changes

  • Added Text.Pandoc.Readers.Docx, exporting readDocx (Jesse Rosenthal).

  • Added Text.Pandoc.Readers.EPUB, exporting readEPUB (Matthew Pickering).

  • Added Text.Pandoc.Readers.Txt2Tags, exporting readTxt2Tags (Matthew Pickering).

  • Added Text.Pandoc.Writers.DokuWiki, exporting writeDokuWiki (Clare Macrae).

  • Added Text.Pandoc.Writers.Haddock, exporting writeHaddock.

  • Added Text.Pandoc.MediaBag, exporting MediaBag, lookupMedia, insertMedia, mediaDirectory, extractMediaBag. The docx and epub readers return a pair of a Pandoc document and a MediaBag with the media resources they contain. This can be extracted using --extract-media. Writers that incorporate media (PDF, Docx, ODT, EPUB, RTF, or HTML formats with --self-contained) will look for resources in the MediaBag generated by the reader, in addition to the file system or web.

  • Text.Pandoc.Readers.TexMath: Removed deprecated readTeXMath. Renamed readTeXMath' to texMathToInlines.

  • Text.Pandoc: Added Reader data type (Matthew Pickering). readers now associates names of readers with Reader structures. This allows inclusion of readers, like the docx reader, that take binary rather than textual input.

  • Text.Pandoc.Shared:

    • Added capitalize (Artyom Kazak), and replaced uses of map toUpper (which give bad results for many languages).
    • Added collapseFilePath, which removes intermediate . and .. from a path (Matthew Pickering).
    • Added fetchItem', which works like fetchItem but searches a MediaBag before looking on the net or file system.
    • Added withTempDir.
    • Added removeFormatting.
    • Added extractSpaces (from HTML reader) and generalized its type so that it can be used by the docx reader (Matthew Pickering).
    • Added ordNub.
    • Added normalizeInlines, normalizeBlocks.
    • normalize is now Pandoc -> Pandoc instead of Data a => a -> a. Some users may need to change their uses of normalize to the newly exported normalizeInlines or normalizeBlocks.
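
    To illustrate what collapseFilePath in the list above does, here is a rough Python analogue. The name collapse_path is hypothetical, only forward slashes are handled, and edge cases may differ from the Haskell function.

```python
def collapse_path(path):
    """Remove intermediate "." and ".." components from a slash-separated path."""
    out = []
    for part in path.split("/"):
        if part == ".":
            continue                      # "." never changes the location
        if part == ".." and out and out[-1] not in ("..", ""):
            out.pop()                     # ".." cancels the previous component
        else:
            out.append(part)
    return "/".join(out)

print(collapse_path("foo/./bar/../baz.png"))  # foo/baz.png
```
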
  • Text.Pandoc.Options:

    • Added writerMediaBag to WriterOptions.
    • Removed deprecated and no longer used readerStrict in ReaderOptions. This is handled by readerExtensions now.
    • Added Ext_compact_definition_lists.
    • Added Ext_epub_html_exts.
    • Added Ext_native_divs and Ext_native_spans. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines.
  • Text.Pandoc.Parsing:

    • Generalized readWith to readWithM (Matthew Pickering).
    • Export runParserT and Stream (Matthew Pickering).
    • Added HasQuoteContext type class (Matthew Pickering).
    • Generalized types of mathInline, smartPunctuation, quoted, singleQuoted, doubleQuoted, failIfInQuoteContext, applyMacros (Matthew Pickering).
    • Added custom token (Matthew Pickering).
    • Added stateInHtmlBlock to ParserState. This is used to keep track of the ending tag we’re waiting for when we’re parsing inside HTML block tags.
    • Added stateMarkdownAttribute to ParserState. This is used to keep track of whether the markdown attribute has been set in an enclosing tag.
    • Generalized type of registerHeader, using new type classes HasReaderOptions, HasIdentifierList, HasHeaderMap (Matthew Pickering). These allow certain common functions to be reused even in parsers that use custom state (instead of ParserState), such as the MediaWiki reader.
    • Moved inlineMath, displayMath from Markdown reader to Parsing, and generalized their types (Matthew Pickering).
  • Text.Pandoc.Pretty:

    • Added nestle.
    • Added blanklines, which guarantees a certain number of blank lines (and no more).

Bug fixes

  • Markdown reader:

    • Fixed parsing of indented code in list items. Indented code at the beginning of a list item must be indented eight spaces from the margin (or edge of the container), or four spaces from the list marker, whichever is greater.
    • Fixed small bug in HTML parsing with markdown_attribute, which caused incorrect tag nesting for input like <aside markdown="1">*hi*</aside>.
    • Fixed regression with intraword underscores (#1121).
    • Improved parsing of inline links containing quote characters (#1534).
    • Slight rewrite of enclosure/emphOrStrong code.
    • Revamped raw HTML block parsing in markdown (#1330). We no longer include trailing spaces and newlines in the raw blocks. We look for closing tags for elements (but without backtracking). Each block-level tag is its own RawBlock; we no longer try to consolidate them (though --normalize will do so).
    • Combine consecutive latex environments. This helps when you have two minipages which can’t have blank lines between them (#690, #1196).
    • Support smallcaps through span. <span style="font-variant:small-caps;">foo</span> will be parsed as a SmallCaps inline, and will work in all output formats that support small caps (#1360).
    • Prevent spurious line breaks after list items (#1137). When the hard_line_breaks option was specified, pandoc would formerly produce a spurious line break after a tight list item.
    • Fixed table parsing bug (#1333).
    • Handle c++ and objective-c as language identifiers in github-style fenced blocks (#1318).
    • Inline math must have nonspace before final $ (#1313).
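
    The last rule in the list above (a nonspace character is required before the closing $) can be illustrated with a regular expression. This is a Python sketch, not pandoc's parser; the pattern is an assumption and ignores escapes and the other conditions pandoc applies.

```python
import re

# Opening $ followed by a nonspace, closing $ preceded by a nonspace.
INLINE_MATH = re.compile(r"\$([^\s$](?:[^$\n]*[^\s$])?)\$")

def inline_math_spans(text):
    """Return the contents of spans that qualify as inline math under this rule."""
    return INLINE_MATH.findall(text)

print(inline_math_spans("the sum $x + y$ is small"))  # ['x + y']
print(inline_math_spans("not math: $x $"))            # []
```
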
  • LaTeX reader:

    • Handle comments at the end of tables. This resolves the issue illustrated in
    • Correctly handle table rows with too few cells. LaTeX seems to treat them as if they have empty cells at the end (#241).
    • Handle leading/trailing spaces in \emph better. \emph{ hi } gets parsed as [Space, Emph [Str "hi"], Space] so that we don’t get things like * hi * in markdown output. Also applies to \textbf and some other constructions (#1146).
    • Don’t assume preamble doesn’t contain environments (#1338).
    • Allow (and discard) optional argument for \caption (James Aspnes).
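
    The treatment of leading and trailing spaces in \emph described above can be modelled in a few lines of Python. The tuple representation and the function name are assumptions for illustration, and the content is treated as a single word.

```python
def emph_with_spaces(content):
    # Models \emph{content}: whitespace at either end is moved outside
    # the Emph node instead of being kept inside it.
    stripped = content.strip()
    if not stripped:
        return ["Space"] if content else []
    result = []
    if content[0].isspace():
        result.append("Space")
    result.append(("Emph", [("Str", stripped)]))
    if content[-1].isspace():
        result.append("Space")
    return result

print(emph_with_spaces(" hi "))
```
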
  • HTML reader:

    • Fixed major parsing problem with HTML tables. Table cells were being combined into one cell (#1341).
    • Fixed performance issue with malformed HTML tables. We let a </table> tag close an open <tr> or <td> (#1167).
    • Allow space between <col> and </col>.
    • Added audio and source in eitherBlockOrInline.
    • Moved video, svg, progress, script, noscript from blockTags to eitherBlockOrInline.
    • map and object were mistakenly in both lists; they have been removed from blockTags.
    • Ignore DOCTYPE and xml declarations.
  • MediaWiki reader:

    • Don’t parse backslash escapes inside <source> (#1445).
    • Tightened up template parsing. The opening {{ must be followed by an alphanumeric or :. This prevents the exponential slowdown in #1033.
    • Support “Bild” for images.
  • DocBook reader:

    • Better handle elements inside code environments. Pandoc’s document model does not allow structure inside code blocks, but at least this way we preserve the text (#1449).
    • Support <?asciidoc-br?> (#1236).
  • Textile reader:

    • Fixed list parsing. Lists can now start without an intervening blank line (#1513).
    • HTML block-level tags that do not start a line are parsed as inline HTML and do not interrupt paragraphs (as in RedCloth).
  • Org reader:

    • Make tildes create inline code (#1345). Also relabeled code and verbatim parsers to accord with the org-mode manual.
    • Respect :exports header argument in code blocks (Craig Bosma).
    • Fixed tight lists with sublists (#1437).
  • EPUB writer:

    • Avoid excess whitespace in nav.xhtml. This should improve TOC view in iBooks (#1392).
    • Fixed regression on cover image. In 1.12.4 and, the cover image would not appear properly, because the metadata id was not correct. Now we derive the id from the actual cover image filename, which we preserve rather than using “cover-image.”
    • Keep newlines between block elements. This allows easier diff-ability (#1424).
    • Use stringify instead of custom plainify.
    • Use renderTags' for all tag rendering. This properly handles tags that should be self-closing. Previously <hr/> would appear in EPUB output as <hr></hr> (#1420).
    • Better handle HTML media tags.
    • Handle multiple dates with OPF event attributes. Note: in EPUB3 we can have only one dc:date, so only the first one is used.
  • LaTeX writer:

    • Correctly handle figures in notes. Notes can’t contain figures in LaTeX, so we fake it to avoid an error (#1053).
    • Fixed strikeout + highlighted code (#1294). Previously strikeout highlighted code caused an error.
  • ConTeXt writer:

    • Improved detection of autolinks with URLs containing escapes.
  • RTF writer:

    • Improved image embedding: fetchItem' is now used to get the images, and calculated image sizes are indicated in the RTF.
    • Avoid extra paragraph tags in metadata (#1421).
  • HTML writer:

    • Deactivate “incremental” inside slide speaker notes (#1394).
    • Don’t include empty items in the table of contents for slide shows. (These would result from creating a slide using a horizontal rule.)
  • MediaWiki writer:

    • Minor renaming of st prefixed names.
  • AsciiDoc writer:

    • Double up emphasis and strong emphasis markers in intraword contexts, as required by asciidoc (#1441).
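
    A minimal Python sketch of the doubling rule, assuming the caller already knows whether the emphasis is intraword. The function name and its parameters are hypothetical.

```python
def asciidoc_emph(text, intraword=False, strong=False):
    """Wrap text in AsciiDoc emphasis markers, doubling them inside a word."""
    marker = "*" if strong else "_"
    if intraword:
        marker *= 2
    return marker + text + marker

print("un" + asciidoc_emph("believ", intraword=True) + "able")  # un__believ__able
```
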
  • Markdown writer:

    • Avoid wrapping that might start a list, blockquote, or header (#1013).
    • Use Span instead of (hackish) SmallCaps in plainify.
    • Don’t use braced attributes for fenced code (#1416). If Ext_fenced_code_attributes is not set, the first class attribute will be printed after the opening fence as a bare word.
    • Separate adjacent lists of the same kind with an HTML comment (#1458).
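
    The fenced code rule above (#1416) can be sketched like this in Python. The function name is hypothetical and the exact spacing of the braced form is illustrative rather than taken from the writer.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed

def opening_fence(classes, fenced_code_attributes=True):
    """Return the opening fence line for a code block with the given classes."""
    if not classes:
        return FENCE
    if fenced_code_attributes:
        # Braced attribute syntax, available only with the extension.
        return FENCE + " {" + " ".join("." + c for c in classes) + "}"
    # Without the extension, only the first class survives, as a bare word.
    return FENCE + classes[0]

print(opening_fence(["haskell", "numberLines"], fenced_code_attributes=False))
```
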
  • PDF writer:

    • Fixed treatment of data uris for images (#1062).
  • Docx writer:

    • Use Compact style for empty table cells (#1353). Otherwise we get overly tall lines when there are empty table cells and the other cells are compact.
    • Create overrides per-image for media/ in reference docx. This should be somewhat more robust and cover more types of images.
    • Improved entryFromArchive to avoid an unneeded parse.
    • Section numbering carries over from reference.docx (#1305).
    • Simplified abstractNumId numbering. Instead of sequential numbering, we assign numbers based on the list marker styles.
  • Text.Pandoc.Options:

    • Removed Ext_fenced_code_attributes from markdown_github extensions.
  • Text.Pandoc.ImageSize:

    • Use default instead of failing if image size not found in exif header (#1358).
    • Ignore unknown exif header tag rather than crashing. Some images seem to have tag type of 256, which was causing a runtime error.
  • Text.Pandoc.Shared:

    • fetchItem: unescape URI encoding before reading local file (#1427).
    • fetchItem: strip a fragment like ?#iefix from the extension before doing mime lookup, to improve mime type guessing.
    • Improved logic of fetchItem: absolute URIs are fetched from the net; other things are treated as relative URIs if sourceURL is Just _, otherwise as file paths on the local file system.
    • fetchItem now properly handles links without a protocol (#1477).
    • fetchItem now escapes characters not allowed in URIs before trying to parse the URIs.
    • Fixed runtime error with compactify'DL on certain lists (#1452).
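
    The fetchItem decision logic described above can be summarized in a short Python sketch. The function name, the return values, and the rule that a one-letter scheme is a drive letter are assumptions; the real function also performs the fetching, unescaping, and mime lookup.

```python
from urllib.parse import urljoin, urlparse

def resolve_resource(src, source_url=None):
    """Decide where a resource comes from: ("net", url) or ("file", path)."""
    scheme = urlparse(src).scheme
    if len(scheme) > 1:
        # Absolute URI: fetched from the net as given.
        return ("net", src)
    if source_url is not None:
        # Relative reference: resolved against the document's URL.
        return ("net", urljoin(source_url, src))
    # No source URL: a path on the local file system.
    return ("file", src)

print(resolve_resource("img/a.png", "http://example.com/doc/index.html"))
```
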
  • pandoc.hs: Don’t strip path off of writerSourceURL: the path is needed to resolve relative URLs when we fetch resources (#750).

  • Text.Pandoc.Parsing:

    • Simplified dash and ellipsis (#1419).
    • Removed (>>~) in favor of the equivalent (<*) (Matthew Pickering).
    • Generalized functions to use ParsecT (Matthew Pickering).
    • Added isbn and pmid to list of recognized schemes (Matthew Pickering).

Template changes

  • Added haddock template.
  • EPUB3: Added type attribute to link tags. They are supposed to be “advisory” in HTML5, but kindlegen seems to require them.
  • EPUB3: Put title page in section with epub:type="titlepage".
  • LaTeX: Made \subtitle work properly (#1327).
  • LaTeX/Beamer: remove conditional around date (#1321).
  • LaTeX: Added lot and lof variables, which can be set to get \listoftables and \listoffigures (#1407). Note that these variables can be set at the command line with -Vlot -Vlof or in YAML metadata.

Under the hood improvements

  • Rewrote normalize for efficiency (#1385).

  • Rewrote Haddock reader to use haddock-library (#1346).

    • This brings pandoc’s rendering of haddock markup in line with the new haddock.
    • Fixed line breaks in @ code blocks.
    • alex and happy are no longer build-depends.
  • Added Text.Pandoc.Compat.Directory to allow building against different versions of the directory library.

  • Added Text.Pandoc.Compat.Except to allow building against different versions of mtl.

  • Code cleanup in some writers, using Reader monad to avoid passing options parameter around (Matej Kollar).

  • Improved readability in pandoc.hs.

  • Miscellaneous code cleanups (Artyom Kazak).

  • Avoid import Prelude hiding (catch) (#1309, thanks to Michael Thompson).

  • Changed http-conduit flag to https. Depend on http-client and http-client-tls instead of http-conduit. (Note: pandoc still depends on conduit via yaml.)

  • Require highlighting-kate >= (#1271, #1317, Debian #753299). This change to highlighting-kate means that PHP fragments no longer need to start with <?php. It also fixes a serious bug causing failures with ocaml and fsharp.

  • Require latest texmath. This fixes \tilde{E} and allows \left to be used with ], ) etc. (#1319), among many other improvements.

  • Require latest zip-archive. This has fixes for unicode path names.

  • Added tests for plain writer.

  • Text.Pandoc.Templates:

    • Fail informatively on template syntax errors. With the move from parsec to attoparsec, we lost good error reporting. In fact, since we weren’t testing for end of input, malformed templates would fail silently. Here we revert back to Parsec for better error messages.
    • Use ordNub (#1022).
  • Benchmarks:

    • Made benchmarks compile again (Artyom Kazak).
    • Fixed so that the failure of one benchmark does not prevent others from running (Artyom Kazak).
    • Use nfIO instead of the getLength trick to force full evaluation.
    • Changed benchmark to use only the test suite, so that benchmarks run more quickly.
  • Windows build script:

    • Add -windows to file name.
    • Use one install command for pandoc, pandoc-citeproc.
    • Force install of pandoc-citeproc.
  • make_osx_package: Call zip file The zip should not be named, or OSX finder will extract it into a folder named SOMETHING.pkg, which it will interpret as a defective package (#1308).


    • Made headers for all extensions so they have IDs and can be linked to (Beni Cherniavsky-Paskin).
    • Fixed typos (Phillip Alday).
    • Fixed documentation of attributes (#1315).
    • Clarified documentation on small caps (#1360).
    • Better documentation for fenced_code_attributes extension (Caleb McDaniel).
    • Documented fact that you can put YAML metadata in a separate file (#1412).

pandoc (14 May 2014)

  • Require highlighting-kate >= 0.5.8. Fixes a performance regression.

  • Shared: addMetaValue now behaves slightly differently: if both the new and old values are lists, it concatenates their contents to form a new list.
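
    A sketch of the list-concatenation behavior, using a Python dict as a stand-in for pandoc metadata. Only the both-lists case comes from the note above; how non-list values are combined here is an assumption made to keep the example complete.

```python
def add_meta_value(meta, key, new):
    """Return a copy of meta with new added under key."""
    old = meta.get(key)
    if old is None:
        combined = new
    elif isinstance(old, list) and isinstance(new, list):
        combined = old + new  # both lists: concatenate their contents
    else:
        # Assumption for this sketch: wrap non-list values so nothing is lost.
        def as_list(value):
            return value if isinstance(value, list) else [value]
        combined = as_list(old) + as_list(new)
    return {**meta, key: combined}

print(add_meta_value({"author": ["A"]}, "author", ["B", "C"]))
```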

  • LaTeX reader:

    • Set bibliography in metadata from \bibliography or \addbibresource command.
    • Don’t error on %foo with no trailing newline.
  • Org reader:

    • Support code block headers (#+BEGIN_SRC ...) (Albert Krewinkel).
    • Fix parsing of blank lines within blocks (Albert Krewinkel).
    • Support pandoc citation extension (Albert Krewinkel). This can be turned off by specifying org-citation as the input format.
  • Markdown reader:

    • citeKey moved to Text.Pandoc.Parsing so it can be used by other readers (Albert Krewinkel).
  • Text.Pandoc.Parsing:

    • Added citeKey (see above).
    • Added HasLastStrPosition type class and updateLastStrPos and notAfterString functions.
  • Updated copyright notices (Albert Krewinkel).

  • Added default.icml to data files so it installs with the package.

  • OSX package:

    • The binary is now built with options to ensure that it can be used with OSX 10.6+.
    • Moved OSX package materials to osx directory.
    • Added OSX package uninstall script, included in the zip container (thanks to Daniel T. Staal).

pandoc 1.12.4 (07 May 2014)

  • Made it possible to run filters that aren’t executable (#1096). Pandoc first tries to find the executable (searching the path if path isn’t given). If it fails, but the file exists and has a .py, .pl, .rb, .hs, or .php extension, pandoc runs the filter using the appropriate interpreter. This should make it easier to use filters on Windows, and make it more convenient for everyone.
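
    The lookup order described above can be sketched in Python. The interpreter command names in the table are assumptions (the note lists only the extensions), and filter_command is a hypothetical name.

```python
import os
import shutil

# Assumed interpreter commands for the extensions listed above.
INTERPRETERS = {
    ".py": "python",
    ".pl": "perl",
    ".rb": "ruby",
    ".hs": "runhaskell",
    ".php": "php",
}

def filter_command(path):
    """Build the argument list used to run a filter."""
    # First try to find an executable (PATH is searched only for bare names).
    executable = shutil.which(path)
    if executable is not None:
        return [executable]
    # Otherwise fall back on an interpreter chosen by file extension.
    extension = os.path.splitext(path)[1].lower()
    if os.path.isfile(path) and extension in INTERPRETERS:
        return [INTERPRETERS[extension], path]
    raise FileNotFoundError("filter not found: " + path)
```

    With this ordering, an executable filter still runs directly, and the interpreter table is consulted only when that fails.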

  • Added Emacs org-mode reader (Albert Krewinkel).

  • Added InDesign ICML Writer (mb21).

  • MediaWiki reader:

    • Accept image links in more languages (Jaime Marquínez Ferrándiz).
    • Fixed bug in certain nested lists (#1213). If a level 2 list was followed by a level 1 list, the first item of the level 1 list would be lost.
    • Handle table rows containing just an HTML comment (#1230).
  • LaTeX reader:

    • Give better location information on errors, pointing to line numbers within included files (#1274).
    • Better handling of table environment (#1204). Positioning options no longer rendered verbatim.
    • Better handling of figure and table with caption (#1204).
    • Handle @{} and p{length} in tabular. The length is not actually recorded, but at least we get a table (#1180).
    • Properly handle \nocite. It now adds a nocite metadata field. Citations there will appear in the bibliography but not in the text (unless you explicitly put a $nocite$ variable in your template).
  • Markdown reader:

    • Ensure that whole numbers in YAML metadata are rendered without decimal points. (This became necessary with changes to aeson and yaml libraries. aeson >= 0.7 and yaml >= are now required.)
    • Fixed regression on line breaks in strict mode (#1203).
    • Small efficiency improvements.
    • Improved parsing of nested divs. Formerly a closing div tag would be missed if it came right after other block-level tags.
    • Avoid backtracking when closing </div> not found.
    • Fixed bug in reference link parsing in markdown_mmd.
    • Fixed a bug in list parsing (#1154). When reading a raw list item, we now strip off up to 4 spaces.
    • Fixed parsing of empty reference link definitions (#1186).
    • Made one-column pipe tables work (#1218).
  • Textile reader:

    • Better support for attributes. Instead of being ignored, attributes are now parsed and included in Span inlines. The output will be a bit different from stock textile: e.g. for *(foo)hi*, we’ll get <em><span class="foo">hi</span></em> instead of <em class="foo">hi</em>. But at least the data is not lost.
    • Improved treatment of HTML spans (%) (#1115).
    • Improved link parsing. In particular we now pick up on attributes. Since pandoc links can’t have attributes, we enclose the whole link in a span if there are attributes (#1008).
    • Implemented correct parsing rules for inline markup (#1175, Matthew Pickering).
    • Use Builder (Matthew Pickering).
  • DocBook reader:

    • Better treatment of formalpara. We now emit the title (if present) as a separate paragraph with boldface text (#1215).
    • Set metadata author not authors.
    • Added recognition of authorgroup and releaseinfo elements (#1214, Matthew Pickering).
    • Converted current meta information parsing in DocBook to a more extensible version which is aware of the more recent meta representation (Matthew Pickering).
  • HTML reader:

    • Require tagsoup 0.13.1, to fix a bug with parsing of script tags (#1248).
    • Treat processing instructions & declarations as block. Previously these were treated as inline, and included in paragraph tags in HTML or DocBook output, which is generally not what is wanted (#1233).
    • Updated closes with rules from HTML5 spec.
    • Use Builder (Matthew Pickering, #1162).
  • RST reader:

    • Remove duplicate http in PEP links (Albert Krewinkel).
    • Make rst figures true figures (#1168, CasperVector).
    • Enhanced Pandoc’s support for rST roles (Merijn Verstaaten). rST parser now supports: all built-in rST roles, new role definition, role inheritance, though with some limitations.
    • Use author rather than authors in metadata.
    • Better handling of directives. We now correctly handle field lists that are indented more than three spaces. We treat an aafig directive as a code block with attributes, so it can be processed in a filter (#1212).
  • LaTeX writer:

    • Mark span contents with label if span has an ID (Albert Krewinkel).
    • Made --toc-depth work well with books in latex/pdf output (#1210).
    • Handle line breaks in simple table cells (#1217).
    • Workaround for level 4-5 headers in quotes. These previously produced invalid LaTeX: \paragraph or \subparagraph in a quote environment. This adds an \mbox{} in these contexts to work around the problem (#1221).
    • Use \/ to avoid en-dash ligature instead of -{}- (Vaclav Zeman). This is to fix LuaLaTeX output. The -{}- sequence does not avoid the ligature with LuaLaTeX but \/ does.
    • Fixed string escaping in hyperref and hyperdef (#1130).
  • ConTeXt writer: Improved autolinks (#1270).

  • DocBook writer:

    • Improve handling of hard line breaks (Neil Mayhew). Use a <literallayout> for the entire paragraph, not just for the newline character.
    • Don’t let line breaks inside footnotes influence the enclosing paragraph (Neil Mayhew).
    • Distinguish tight and loose lists in DocBook output, using spacing="compact" (Neil Mayhew, #1250).
  • Docx writer: When needed files are not present in the user’s reference.docx, fall back on the versions in the reference.docx in pandoc’s data files. This fixes a bug that occurs when a reference.docx saved by LibreOffice is used. (#1185)

  • EPUB writer:

    • Include extension in epub ids. This fixes a problem with duplicate ids for fonts and images with the same base name but different extensions (#1254).
    • Handle files linked in raw img tags (#1170).
    • Handle media in audio source tags (#1170). Note that we now use a media directory rather than images.
    • Incorporate files linked in video tags (#1170). src and poster will both be incorporated into content.opf and the epub container.
  • HTML writer:

    • Add colgroup around col tags (#877). Also affects EPUB writer.
    • Fixed bug with unnumbered section headings. Unnumbered section headings (with class unnumbered) were getting numbers.
    • Improved detection of image links. Previously image links with queries were not recognized, causing <embed> to be used instead of <img>.
  • Man writer: Ensure that terms in definition lists aren’t line wrapped (#1195).

  • Markdown writer:

    • Use proper escapes to avoid unwanted lists (#980). Previously we used 0-width spaces, an ugly hack.
    • Use longer backtick fences if needed (#1206). If the content contains a backtick fence and there are attributes, make sure longer fences are used to delimit the code. Note: This works well in pandoc, but github markdown is more limited, and will interpret the first string of three or more backticks as ending the code block.
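
    The fence-length rule can be sketched as follows. This is a minimal Python illustration, not the Markdown writer's actual (Haskell) code; the function name fence_for and the default minimum of three backticks are assumptions made for the example.

```python
import re


def fence_for(content: str, minimum: int = 3) -> str:
    """Return a backtick fence strictly longer than any backtick run in content."""
    # Find every maximal run of backticks in the code to be fenced.
    runs = re.findall(r"`+", content)
    longest = max((len(run) for run in runs), default=0)
    # One backtick more than the longest run, but never fewer than the minimum.
    return "`" * max(minimum, longest + 1)


def fenced_block(content: str) -> str:
    """Wrap content in a fence that the content itself cannot close."""
    fence = fence_for(content)
    return "\n".join([fence, content, fence])
```

    For content that itself contains a three-backtick fence, fence_for returns four backticks, so the inner fence is kept as literal text.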
  • RST writer: Avoid stack overflow with certain tables (#1197).

  • RTF writer: Fixed table cells containing paragraphs.

  • Custom writer:

    • Correctly handle UTF-8 in custom lua scripts (#1189).
    • Fix bugs with lua scripts with mixed-case filenames and paths containing + or - (#1267). Note that getWriter in Text.Pandoc no longer returns a custom writer on input foo.lua.
  • AsciiDoc writer: Handle multiblock and empty table cells (#1245, #1246). Added tests.

  • Text.Pandoc.Options: Added readerTrace to ReaderOptions

  • Text.Pandoc.Shared:

    • Added compactify'DL (formerly in markdown reader) (Albert Krewinkel).
    • Fixed bug in toRomanNumeral: numbers ending with ‘9’ would be rendered as Roman numerals ending with ‘IXIV’ (#1249). Thanks to Jesse Rosenthal.
    • openURL: set proxy with value of http_proxy env variable (#1211). Note: proxies with non-root paths are not supported, due to limitations in http-conduit.
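
    For reference, the expected behaviour of a Roman numeral conversion for numbers ending in 9 can be sketched with a greedy table lookup. This Python version is only an illustration and does not reproduce toRomanNumeral itself; the name to_roman, the table PAIRS, and the 1..3999 range check are assumptions.

```python
# Value/numeral pairs in descending order; the subtractive forms (900, 400,
# 90, 40, 9, 4) come before the smaller values they would otherwise
# decompose into.
PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Convert 1..3999 to an upper-case Roman numeral."""
    if not 1 <= n <= 3999:
        raise ValueError("out of range for this sketch")
    parts = []
    for value, numeral in PAIRS:
        # Take as many copies of this numeral as fit, then continue with
        # the remainder.
        count, n = divmod(n, value)
        parts.append(numeral * count)
    return "".join(parts)
```

    Because 9 is consumed as a unit, nothing is left over for the 5 and 4 entries, so a number such as 19 ends in IX and nothing more.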
  • Text.Pandoc.PDF:

    • Ensure that temp directories are deleted on Windows (#1192). The PDF is now read as a strict bytestring, ensuring that process ownership will be terminated, so the temp directory can be deleted.
    • Use / as path separators in a few places, even on Windows. This seems to be necessary for texlive (#1151, thanks to Tim Lin).
    • Use ; for TEXINPUTS separator on Windows (#1151).
    • Changes to error reporting, to handle non-UTF8 error output.
  • Text.Pandoc.Templates:

    • Removed unneeded datatype context (Merijn Verstraaten).

    • YAML objects resolve to “true” in conditionals (#1133). Note: If address is a YAML object and you just have $address$ in your template, the word true will appear, which may be unexpected. (Previously nothing would appear.)

  • Text.Pandoc.SelfContained: Handle poster attribute in video tags (#1188).

  • Text.Pandoc.Parsing:

    • Made F an instance of Applicative (#1138).
    • Added stateCaption.
    • Added HasMacros, simplified other typeclasses. Removed updateHeaderMap, setHeaderMap, getHeaderMap, updateIdentifierList, setIdentifierList, getIdentifierList.
    • Changed the smart punctuation parser to return Inlines rather than Inline (Matthew Pickering).
    • Changed HasReaderOptions, HasHeaderMap, HasIdentifierList from typeclasses of monads to typeclasses of states. This simplifies the instance definitions and provides more flexibility. Generalized type of getOption and added a default definition. Removed askReaderOption. Added extractReaderOption. Added extractHeaderMap and updateHeaderMap in HasHeaderMap. Gave default definitions for getHeaderMap, putHeaderMap, modifyHeaderMap. Added extractIdentifierList and updateIdentifierList in HasIdentifierList. Gave defaults for getIdentifierList, putIdentifierList, and modifyIdentifierList. The ultimate goal here is to allow different parsers to use their own, tailored parser states (instead of ParserState) while still using shared functions.
  • Template changes:

    • LaTeX template: Use fontenc package only with pdflatex (#1164).
    • LaTeX template: Add linestretch and fontfamily variables.
    • LaTeX template: Conditionalize author and date commands.
    • Beamer template: Consistent styles for figure and table captions (aaronwolen).
    • LaTeX and beamer template: Adjust widths correctly for oversized images. Use \setkeys{Gin}{} to set appropriate defaults for \includegraphics (Yihui Xie, Garrick Aden-Buie). Load upquote only after fontenc (Yihui Xie).
    • Beamer template: Added caption package (#1200).
    • Beamer template: changes for better unicode handling (KarolS).
    • DocBook template: use authorgroup if there are authors.
    • revealjs template: Move include-after to end (certainlyakey).
    • revealjs template: Fixed PDF print function (#1220, kevinkenan).
  • Bumped version bounds of dependencies.

  • Added a --trace command line option, for debugging backtracking bugs. So far this only works with the markdown reader.

  • MathMLinHTML: Fixed deprecation warning (#362, gwern, Albert Krewinkel).

  • Updated travis script to test with multiple GHC versions.

  • Force failure of a Travis build if GHC produces warnings (Albert Krewinkel).

  • Add .editorconfig (Albert Krewinkel). See for details.

  • Give more useful error message if ‘-t pdf’ is specified (#1155).

  • Added Cite, SmallCaps to Arbitrary instance (#1269).

  • Allow html4 as a synonym of html as a reader (it already works as a writer).

  • README:


    • Added an explanation of how to use YAML metadata to force items to appear in the bibliography without citations in the text (like LaTeX \nocite).
    • Added note to --biblatex/--natbib: not for use in making PDF (#1194, thanks to nahoj).
    • Added explanatory notes about --natbib and --biblatex.
    • Added specification of legal syntax for citation keys.
    • Fixed variable defaults documentation (Albert Krewinkel).
  • Removed copyright statements for files that have been removed (Albert Krewinkel).

  • Moved some doc files from data-files to extra-source-files (#1123). They aren’t needed at runtime. We keep README and COPYRIGHT in data to ensure that they’ll be available on all systems on which pandoc is installed.

  • Use cabal sandboxes in Windows build script.

pandoc (03 Feb 2014)

  • No changes to source; recompiled tarball with latest alex and happy, so they will work with GHC 7.8.

pandoc (03 Feb 2014)

  • Bumped version bounds for blaze-html, blaze-markup.

  • ImageSize: Avoid use of lookAhead, which is not in binary >= 0.6 (#1124).

  • Fixed mediawiki ordered list parsing (#1122).

  • HTML reader: Fixed bug reading inline math with $$ (#225).

  • Added support for LaTeX style literate Haskell code blocks in rST (Merijn Verstraaten).

pandoc (14 Jan 2014)

  • Relaxed version constraint on binary, allowing the use of binary 0.5.

pandoc 1.12.3 (10 Jan 2014)

  • The --bibliography option now sets the biblio-files variable. So, if you’re using --natbib or --biblatex, you can just use --bibliography=foo.bib instead of -V biblio-files=foo.

  • Don’t run pandoc-citeproc filter if --bibliography is used together with --natbib or --biblatex (Florian Eitel).

  • Template changes:

    • Updated beamer template to include booktabs.
    • Added abstract variable to LaTeX template.
    • Put header-includes after title in LaTeX template (#908).
    • Allow use of \includegraphics[size] in beamer. This just required porting a macro definition from the default LaTeX template to the default beamer template.
  • reference.docx: Include FootnoteText style. Otherwise Word ignores the style, even when specified in the pPr. (#901)

  • reference.odt: Tidied styles.xml.

  • Relaxed version bounds for dependencies.

  • Added withSocketsDo around http conduit code in openURL, so it works on Windows (#1080).

  • Added Cite function to sample.lua.

  • Markdown reader:

    • Fixed regression in title blocks (#1089). If author field was empty, date was being ignored.
    • Allow backslash-newline hard line breaks in grid and multiline table cells.
    • Citation keys may now start with underscores, and may contain underscores adjacent to internal punctuation.
  • LaTeX reader:

    • Add support for Verb macro (jrnold) (#1090).
    • Support babel-style quoting: "`..."'.
  • Properly handle script blocks in strict mode. (That is, markdown-markdown_in_html_blocks.) Previously a spurious <p> tag was being added (#1093).

  • DocBook reader: Avoid failure if tbody contains no tr or row elements.

  • LaTeX writer:

    • Factored out function for table cell creation.
    • Better treatment of footnotes in tables. Notes now appear in the regular sequence, rather than in the table cell. (This was a regression in 1.10.)
  • HTML reader: Parse name/content pairs from meta tags as metadata. Closes #1106.

  • Moved fixDisplayMath from Docx writer to Writer.Shared.

  • OpenDocument writer: Fixed RawInline, RawBlock so they don’t escape.

  • ODT writer: Use mathml for proper rendering of formulas. Note: LibreOffice’s support for this seems a bit buggy. But it should be better than what we had before.

  • RST writer: Ensure no blank line after def in definition list (#992).

  • Markdown writer: Don’t use tilde code blocks with braced attributes in markdown_github output. A consequence of this change is that the backtick form will be preferred in general if both are enabled. That is good, as it is much more widespread than the tilde form. (#1084)

  • Docx writer: Fixed problem with some modified reference docx files. Include word/_rels/settings.xml.rels if it exists, as well as other rels files besides the ones pandoc generates explicitly.

  • HTML writer:

    • With --toc, headers no longer link to themselves (#1081).
    • Omit footnotes from TOC entries. Otherwise we get doubled footnotes when headers have notes!
  • EPUB writer:

    • Avoid duplicate notes when headings contain notes. This arose because the headings are copied into the metadata “title” field, and the note gets rendered twice. We strip the note now before putting the heading in “title”.
    • Strip out footnotes from toc entries.
    • Fixed bug with --epub-stylesheet. Now the contents of writerEpubStylesheet (set by --epub-stylesheet) should again work, and take precedence over a stylesheet specified in the metadata.
  • Text.Pandoc.Pretty: Added nestle. API change.

  • Text.Pandoc.MIME: Added wmf, emf.

  • Text.Pandoc.Shared: fetchItem now handles image URLs beginning with //.

  • Text.Pandoc.ImageSize: Parse EXIF format JPEGs. Previously we could only get size information for JFIF format, which led to squished images in Word documents. Closes #976.

  • Removed old MarkdownTest_1.0.3 directory (#1104).

pandoc (2013-12-08)

  • Markdown reader: Fixed regression in list parser, involving continuation lines containing raw HTML (or even verbatim raw HTML).

pandoc 1.12.2 (2013-12-07)

  • Metadata may now be included in YAML blocks in a markdown document. For example,

    ---
    title:
    - type: main
      text: My Book
    - type: subtitle
      text: An investigation of metadata
    creator:
    - role: author
      text: John Smith
    - role: editor
      text: Sarah Jones
    identifier:
    - scheme: DOI
      text: doi:10.234234.234/33
    publisher:  My Press
    rights:  (c) 2007 John Smith, CC BY-NC
    cover-image: img/mypic.jpg
    stylesheet: style.css
    ...

    Metadata may still be provided using --epub-metadata; it will be merged with the metadata in YAML blocks.

  • EPUB writer:

    • meta tags are now used instead of opf attributes for EPUB3.
    • Insert “svg” property as needed in opf (EPUB 3).
    • Simplify imageTypeOf using getMimeType.
    • Add properties attribute to cover-image item for EPUB 3.
    • Don’t include node for cover.xhtml if no cover!
    • Ensure that same identifier is used throughout (#1044). If an identifier is given in metadata, we use that; otherwise we generate a random uuid.
    • Add cover reference to guide element (EPUB 2) (Shaun Attfield). Fixes an issue with Calibre putting the cover at the end of the book if the spine has linear="no". Apparently this is best practice for other converters as well:
    • Allow stylesheet in metadata. The value is a path to the stylesheet.
    • Allow partial dates: YYYY, YYYY-MM.
  • Markdown writer: Fix rendering of tight sublists (#1050). Previously a spurious blank line was included after a tight sublist.

  • ODT writer: Add draw:name attribute to draw:frame elements (#1069). This is reported to be necessary to avoid an error from recent versions of LibreOffice when files contain more than one image. Thanks to wmanley for reporting and diagnosing the problem.

  • ConTeXt writer: Don’t hardcode figure/table placement and numbering. Instead, let this be set in the template, using \setupfloat. Thanks to on4aa and Aditya Mahajan for the suggestion (#1067).

  • Implemented CSL flipflopping spans in DOCX, LaTeX, and HTML writers.

  • Fixed bug with markdown intraword emphasis. Closes #1066.

  • DocBook writer: Hierarchicalize block content in metadata. Previously headers just disappeared from block-level metadata when it was used in templates. Now we apply the ‘hierarchicalize’ transformation. Note that a block headed by a level-2 header will turn into a <sect1> element.

  • OpenDocument writer: Skip raw HTML (#1035). Previously it was erroneously included as verbatim text.

  • HTML/EPUB writer, footnotes: Put <sup> tag inside <a> tags. This allows better control of formatting, since the <a> tags have a distinguishing class (#1049).

  • Docx writer:

    • Use mime type info returned by fetchItem.
    • Fixed core metadata (#1046). Don’t create empty date nodes if no date given. Don’t create multiple dc:creator nodes; instead separate by semicolons.
    • Fix URL for core-properties in _rels/.rels (#1046).
  • Plain writer: don’t print <span> tags.

  • LaTeX writer:

    • Fix definition lists with internal links in terms (#1032). This fix puts braces around a term that contains an internal link, to avoid problems with square brackets.
    • Properly escape pdftitle, pdfauthor (#1059).
    • Use booktabs package for tables (thanks to Jose Luis Duran).
  • Updated beamer template. Now references should work properly (in a slide) when --biblatex or --natbib is used.

  • LaTeX reader:

    • Parse contents of curly quotes or matched " as quotes.
    • Support \textnormal as span with class nodecor. This is needed for pandoc-citeproc.
    • Improved citation parsing. This fixes a run-time error that occurred with \citet{} (empty list of keys). It also ensures that empty keys don’t get produced.
  • MediaWiki reader: Add automatic header identifiers.

  • HTML reader:

    • Use pandoc Div and Span for raw <div>, <span> when --parse-raw.
    • Recognize svg tags as block level content (thanks to MinRK).
    • Parse LaTeX math if appropriate options are set.
  • Markdown reader:

    • YAML block must start immediately after ---. If there’s a blank line after ---, we interpret it as a horizontal rule.
    • Correctly handle empty bullet list items.
    • Stop parsing “list lines” when we hit a block tag. This fixes exponential slowdown in certain input, e.g. a series of lists followed by </div>.
  • Slides: Preserve <div class="references"> in references slide.

  • Text.Pandoc.Writer.Shared:

    • Fixed bug in tagWithAttrs. A space was omitted before key-value attributes, leading to invalid HTML.
    • normalizeDate: Allow dates with year only (thanks to Shaun Attfield).
    • Fixed bug in openURL with data: URIs. Previously the base-64 encoded bytestring was returned. We now decode it so it’s a proper image!
  • DocBook reader: Handle numerical attributes starting with decimal. Also use safeRead instead of read.

  • Text.Pandoc.Parsing:

    • Generalized type of registerHeader, using new type classes HasReaderOptions, HasIdentifierList, HasHeaderMap. These allow certain common functions to be reused even in parsers that use custom state (instead of ParserState), such as the MediaWiki reader.
    • Moved inlineMath, displayMath from Markdown reader to Parsing. Generalize their types and export them from Parsing. (API change.)
  • Text.Pandoc.Readers.TexMath: Export readTeXMath', which attends to display/inline. Deprecate readTeXMath, and use readTeXMath' in all the writers. Require texmath >=

  • Text.Pandoc.MIME:

    • Add entry for jfif.
    • In looking up extensions, drop the encoding info. E.g. for ‘image/jpg;base64’ we should lookup ‘image/jpg’.
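
    The extension lookup described above can be sketched like this. It is a Python illustration only, not the code in Text.Pandoc.MIME; the table EXTENSIONS and the names base_mime_type and extension_for are hypothetical.

```python
from typing import Optional

# Hypothetical lookup table, for illustration only.
EXTENSIONS = {
    "image/jpeg": "jpg",
    "image/jpg": "jpg",
    "image/png": "png",
}


def base_mime_type(mime: str) -> str:
    """Drop any ';'-separated parameters, e.g. 'image/jpg;base64' -> 'image/jpg'."""
    return mime.split(";", 1)[0].strip()


def extension_for(mime: str) -> Optional[str]:
    """Look up a file extension after discarding the encoding info."""
    return EXTENSIONS.get(base_mime_type(mime))
```

    With this, a type carrying encoding info resolves to the same extension as the bare type, and unknown types yield None.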
  • Templates: Changed how array variables are resolved. Previously if foo is an array (which might be because multiple values were set on the command line), $foo$ would resolve to the concatenation of the elements of foo. This is rarely useful behavior. It has been changed so that the first value is rendered. Of course, you can still iterate over the values using $for(foo)$. This has the result that you can override earlier settings using -V by putting new values later on the command line, which is useful for many purposes.

  • Text.Pandoc: Don’t default to pandocExtensions for all writers.

  • Allow “epub2” as synonym for “epub”, “html4” for “html”.

  • Don’t look for slidy files in data files with --self-contained.

  • Allow https: command line arguments to be downloaded.

  • Fixed so data files embedded in pandoc-citeproc.

pandoc 1.12.1 (2013-10-20)

  • Text.Pandoc.Definition: Changed default JSON serialization format. Instead of {"Str": "foo"}, for example, we now have {"t": "Str", "c": "foo"}. This new format is easier to work with outside of Haskell. Incidentally, “t” stands for “tag”, “c” for “contents”.
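
    The difference between the two shapes can be sketched with a small converter. This is a Python illustration under a strong simplifying assumption: every single-key object is treated as a tagged node in the old format, which need not hold for a complete document (metadata maps, for instance). The function name retag is hypothetical.

```python
def retag(value):
    """Rewrite {"Str": "foo"} style nodes as {"t": "Str", "c": "foo"}."""
    if isinstance(value, dict) and len(value) == 1:
        # Assumption: a single-key object is a node keyed by its constructor.
        (tag, contents), = value.items()
        return {"t": tag, "c": retag(contents)}
    if isinstance(value, list):
        # Recurse into lists of nodes.
        return [retag(item) for item in value]
    # Strings, numbers and anything else are left unchanged.
    return value
```

    In the new shape a consumer can dispatch on node["t"] without inspecting the keys of each object, which is what makes it easier to handle outside of Haskell.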

  • MediaWiki reader: Trim contents of <math> tags, to avoid problems when converting to markdown (#1027).

  • LaTeX reader:

    • Ensure that preamble doesn’t contribute to the text of the document.
    • Fixed character escaping in . Previously \~ wasn’t handled properly, among others.
    • Parse {groups} as Span. This is needed for accurate conversion of bibtex titles, since we need to know what was protected from titlecase conversions.
  • LaTeX writer:

    • Specially escape non-ascii characters in labels. Otherwise we can get compile errors and other bugs when compiled with pdflatex (#1007). Thanks to begemotv2718 for the fix.
    • Add link anchors for code blocks with identifiers (#1025).
  • Throughout the code, use isURI instead of isAbsoluteURI. It allows fragment identifiers.

  • Slide formats:

    • A Div element with class “notes” is treated as speaker notes. Currently beamer goes to \note{}, revealjs to <aside class="notes">, and the notes are simply suppressed in other formats (#925).
    • Fixed . . . (pause) on HTML slide formats. Closes #1029. The old version caused a pause to be inserted before the first material on a slide. This has been fixed.
    • Removed data files for s5, slideous, slidy. Users of s5 and slideous will have to download the needed files, as has been documented for some time in the README. By default, slidy code will be sought on the web, as before.
  • HTML writer: Insert command to typeset mathjax only in slideous output (#966, #1012).

  • RST writer: Skip spaces after display math. Otherwise we get indentation problems, and part of the next paragraph may be rendered as part of the math.

  • OpenDocument writer: Fix formatting of strikeout code (#995), thanks to wilx. Don’t use font-face-decls variable.

  • Fixed test suite so it works with cabal sandboxes.

pandoc (2013-09-20)

  • Removed an unused dependency (stringable) from pandoc.cabal. This will help packagers, but users should not need to upgrade.

pandoc (2013-09-20)

  • Allow --metadata to be repeated for the same key to form a list. This also has the effect that --bibliography can be repeated, as before.

  • Handle boolean values in --metadata. Note that anything not parseable as a YAML boolean or string is treated as a literal string. You can get a string value with “yes”, or any of the strings interpretable as booleans, by quoting it:

    -M boolvalue=yes -M stringvalue='"yes"'
  • LaTeX writer: Don’t print references if --natbib or --biblatex option used.

  • DOCX writer: Add settings.xml to the zip container. Fixes a bug in which docx files could not be read by some versions of Word and LibreOffice (#990).

  • Fixed a regression involving slide shows with bibliographies. The Div container around references messed up the procedure for carving a document into slides. So we now remove the surrounding Div in prepSlides.

  • More informative error message when a filter is not found in path.

  • Depend on pandoc-types 1.12.1. This provides ToJSONFilter instances for Data a => a -> [a] and Data a => a -> IO [a].

  • Don’t use unicode_collation in building OSX package: it adds something like 50MB of dependencies to the package.

  • Declare alex and happy as build-tools (#986).

pandoc 1.12 (2013-09-15)

New features

  • Much more flexible metadata, including arbitrary fields and structured values. Metadata can be specified flexibly in pandoc markdown using YAML metadata blocks, which may occur anywhere in the document:

    ---
    title: Here is my title.
    abstract: |
      This is the abstract.

      1. It can contain
      2. block content
         and *inline markup*
    tags: [cat, dog, animal]
    ...

    Metadata fields automatically populate template variables.

  • Added opml (OPML) as input and output format. The _note attribute, used in OmniOutliner and supported by multimarkdown, is supported. We treat the contents as markdown blocks under a section header.

  • Added haddock (Haddock markup) as input format (David Lazar).

  • Added revealjs output format, for reveal.js HTML 5 slide shows. (Thanks to Jamie F. Olson for the initial patch.) Nested vertical stacks are used for hierarchical structure. Results for more than one level of nesting may be odd.

  • Custom writers can now be written in lua.

    pandoc -t data/sample.lua

    will load the script sample.lua and use it as a custom writer. (For a sample, do pandoc --print-default-data-file sample.lua.) Note that pandoc embeds a lua interpreter, so lua need not be installed separately.

  • New --filter/-F option to make it easier to run “filters” (Pandoc AST transformations that operate on JSON serializations). Filters are always passed the name of the output format, so their behavior can be tailored to it. The repository contains a python module for writing pandoc filters in python, with a number of examples.

  • Added --metadata/-M option. This is like --variable/-V, but actually adds to metadata, not just variables.

  • Added --print-default-data-file option, which allows printing of any of pandoc’s data files. (For example, pandoc --print-default-data-file reference.odt will print reference.odt.)

  • Added syntax for “pauses” in slide shows:

    This gives

    . . .

    me pause.
  • New markdown extensions:

    • ignore_line_breaks: causes intra-paragraph line breaks to be ignored, rather than being treated as hard line breaks or spaces. This is useful for some East Asian languages, where spaces aren’t used between words, but text is separated into lines for readability.
    • yaml_metadata_block: Parse YAML metadata blocks. (Default.)
    • ascii_identifiers: This will force auto_identifiers to use ASCII only. (Default for markdown_github.) (#807)
    • lists_without_preceding_blankline: Allow lists to start without preceding blank space. (Default for markdown_github.) (#972)

Behavior changes

  • --toc-depth no longer implies --toc. Reason: EPUB users who don’t want a visible TOC may still want to set the TOC depth for in-book navigation.

  • --help now prints in and out formats in alphabetical order, and says something about PDF output (#720).

  • --self-contained now returns less verbose output (telling you which URLs it is fetching, but not giving the full header). In addition, there are better error messages when fetching a URL fails.

  • Citation support is no longer baked in to core pandoc. Users who need citations will need to install and use a separate filter (--filter pandoc-citeproc). This filter will take bibliography, csl, and citation-abbreviations from the metadata, though they may still be specified on the command line as before.

  • A Cite element is now created in parsing markdown whether or not there is a matching reference.

  • The pandoc-citeproc script will put the bibliography at the end of the document, as before. However, it will be put inside a Div element with class “references”, allowing users some control over the styling of references. A final header, if any, will be included in the Div.

  • The markdown writer will not print a bibliography if the citations extension is enabled. (If the citations are formatted as markdown citations, it is redundant to have a bibliography, since one will be generated automatically.)

  • Previously we used to store the directory of the first input file, even if it was local, and used this as a base directory for finding images in ODT, EPUB, Docx, and PDF. This has been confusing to many users. So we now look for images relative to the current working directory, even if the first file argument is in another directory. Note that this change may break some existing workflows. If you have been assuming that relative links will be interpreted relative to the directory of the first file argument, you’ll need to make that the current directory before running pandoc. (#942)

  • Better error reporting in some readers, due to changes in readWith: the line in which the error occurred is printed, with a caret pointing to the column.

  • All slide formats now support incremental slide view for definition lists.

  • Parse \(..\) and \[..\] as math in MediaWiki reader. Parse :<math>...</math> as display math. These notations are used with the MathJax MediaWiki extension.

  • All writers: template variables are set automatically from metadata fields. However, variables specified on the command line with --variable will completely shadow metadata fields.

  • If --variable is used to set many variables with the same name, a list is created.

  • Man writer: The title, section, header, and footer can now all be set individually in metadata. The description variable has been removed. Quotes have been added so that spaces are allowed in the title. If you have a title that begins

    COMMAND(1) footer here | header here

    pandoc will still parse it into a title, section, header, and footer. But you can also specify these elements explicitly (#885).

  • Markdown reader

    • Added support for YAML metadata blocks, which can come anywhere in the document (not just at the beginning). A document can contain multiple YAML metadata blocks.
    • HTML span and div tags are parsed as pandoc Span and Div elements.
  • Markdown writer

    • Allow simple tables to be printed as grid tables, if other table options are disabled. This means you can do pandoc -t markdown-pipe_tables-simple_tables-multiline_tables and all tables will render as grid tables.
    • Support YAML title block (render fields in alphabetical order to make output predictable).

API changes

  • Meta in Text.Pandoc.Definition has been changed to allow structured metadata. (Note: existing code that pattern-matches on Meta will have to be revised.) Metadata can now contain indefinitely many fields, with content that can be a string, a Boolean, a list of Inline elements, a list of Block elements, or a map or list of these.

  • A new generic block container (Div) has been added to Block, and a generic inline container (Span) has been added to Inline. These can take attributes. They will render in HTML, Textile, MediaWiki, Org, RST and Markdown (with markdown_in_html_blocks extension) as HTML <div> and <span> elements; in other formats they will simply pass through their contents. But they can be targeted by scripts.

  • Format is now a newtype, not an alias for String. Equality comparisons are case-insensitive.
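
    The effect of case-insensitive equality can be illustrated with a Python analogue. This is not the Haskell definition from Text.Pandoc.Definition; the class below is a hypothetical stand-in that only mirrors the comparison behaviour described above.

```python
class Format:
    """Wrapper around a format name that compares case-insensitively."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Format):
            return NotImplemented
        # Compare lower-cased names, so "HTML" and "html" are equal.
        return self.name.lower() == other.name.lower()

    def __hash__(self) -> int:
        # Keep hashing consistent with the case-insensitive equality.
        return hash(self.name.lower())
```

    Code that previously compared raw strings would have treated "HTML" and "html" as different formats; with a wrapper like this they compare equal.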

  • Added Text.Pandoc.Walk, which exports hand-written tree-walking functions that are much faster than the SYB functions from Text.Pandoc.Generic. These functions are now used where possible in pandoc’s code. (Tests.Walk verifies that walk and query match the generic traversals bottomUp and queryWith.)

  • Added Text.Pandoc.JSON, which provides ToJSON and FromJSON instances for the basic pandoc types. They use GHC generics and should be faster than the old JSON serialization using Data.Aeson.Generic.

  • Added Text.Pandoc.Process, exporting pipeProcess. This is a souped-up version of readProcessWithErrorcode that uses lazy bytestrings instead of strings and allows setting environment variables. (Used in Text.Pandoc.PDF.)

  • New module Text.Pandoc.Readers.OPML.

  • New module Text.Pandoc.Writers.OPML.

  • New module Text.Pandoc.Readers.Haddock (David Lazar). This is based on Haddock’s own lexer/parser.

  • New module Text.Pandoc.Writers.Custom.

  • In Text.Pandoc.Shared, openURL and fetchItem now return an Either, for better error handling.

  • Made stringify polymorphic in Text.Pandoc.Shared.

  • Removed stripTags from Text.Pandoc.XML.

  • Text.Pandoc.Templates:

    • Simplified Template type to a newtype.
    • Removed Empty.
    • Changed type of renderTemplate: it now takes a JSON context and a compiled template.
    • Export compileTemplate.
    • Export renderTemplate' that takes a string instead of a compiled template.
    • Export varListToJSON.
  • Text.Pandoc.PDF exports makePDF instead of tex2pdf.

  • Text.Pandoc:

    • Made toJsonFilter an alias for toJSONFilter from Text.Pandoc.JSON.
    • Removed ToJsonFilter typeclass. ToJSONFilter from Text.Pandoc.JSON should be used instead. (Compiling against pandoc-types instead of pandoc will also produce smaller executables.)
    • Removed the deprecated jsonFilter function.
    • Added readJSON, writeJSON to the API (#817).
  • Text.Pandoc.Options:

    • Added Ext_lists_without_preceding_blankline, Ext_ascii_identifiers, Ext_ignore_line_breaks, Ext_yaml_metadata_block to Extension.
    • Changed writerSourceDirectory to writerSourceURL and changed the type to a Maybe. writerSourceURL is set to ‘Just url’ when the first command-line argument is an absolute URL. (So, relative links will be resolved in relation to the first page.) Otherwise, ‘Nothing’.
    • All bibliography-related fields have been removed from ReaderOptions and WriterOptions: writerBiblioFiles, readerReferences, readerCitationStyle.
  • The Text.Pandoc.Biblio module has been removed. Users of the pandoc library who want citation support will need to use Text.CSL.Pandoc from pandoc-citeproc.

Bug fixes

  • In markdown, don’t autolink a bare URI that is followed by </a> (#937).

  • Text.Pandoc.Shared

    • openURL now follows redirects (#701), properly handles data: URIs, and prints diagnostic output to stderr rather than stdout.
    • readDefaultDataFile: normalize the paths. This fixes bugs in --self-contained on pandoc compiled with embed_data_files (#833).
    • Fixed readDefaultDataFile so it works on Windows.
    • Better error messages for readDefaultDataFile. Instead of listing the last path tried, which can confuse people who are using --self-contained, we now just list the data file name.
    • URL-escape pipe characters. Even though these are legal, Network.URI doesn’t regard them as legal in URLs. So we escape them first (#535).
  • Mathjax in HTML slide shows: include explicit “Typeset” call. This seems to be needed for some formats (e.g. slideous) and won’t hurt in others (#966).

  • Text.Pandoc.PDF

    • On Windows, create the temporary directory in the working directory, since the system temp directory path may contain tildes, which can cause problems in LaTeX (#777).
    • Put temporary output directory in TEXINPUTS (see #917).
    • makePDF tries to download images that are not found locally, if the first argument is a URL (#917).
    • If compiling with pdflatex yields an encoding error, offer the suggestion to use --latex-engine=xelatex.
  • Produce automatic header identifiers in parsing textile, RST, and LaTeX, unless auto_identifiers extension is disabled (#967).

  • Text.Pandoc.SelfContained: Strip off fragment, query of relative URL before treating as a filename. This fixes --self-contained when used with CSS files that include web fonts using the method described here: (#739). Handle src in embed, audio, source, input tags.
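
    The fragment/query stripping described above can be illustrated with a minimal Python sketch. This is not pandoc's implementation (which is in Haskell); the function name local_filename and the sample paths are hypothetical, and the sketch assumes that anything with a scheme or host is not a local file.

```python
from urllib.parse import urlsplit

def local_filename(ref):
    """Return the path part of a relative reference, without ?query or #fragment.

    Returns None for absolute URLs, which are not local files.
    (Hypothetical helper for illustration only.)
    """
    parts = urlsplit(ref)
    if parts.scheme or parts.netloc:
        return None
    return parts.path

# A CSS web-font reference with a query and fragment still maps to a plain file name.
print(local_filename("fonts/myfont.eot?#iefix"))
```

    Without this step, a reference such as fonts/myfont.eot?#iefix would be looked up under a file name that includes the trailing ?#iefix, which does not exist on disk.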

  • Text.Pandoc.Parsing: uri parser no longer treats punctuation before percent-encoding, or a + character, as final punctuation.

  • Text.Pandoc.ImageSize: Handle EPS (#903). This change will make EPS images properly sized on conversion to Word.

  • Slidy: Use slidy.js rather than slidy.js.gz. Reason: some browsers have trouble with the gzipped js file, at least on the local file system (#795).

  • Markdown reader

    • Properly handle blank line at beginning of input (#882).
    • Fixed bug in unmatched reference links. The input [*infile*] [*outfile*] was getting improperly parsed: “infile” was emphasized, but “outfile” was literal (#883).
    • Allow internal + in citation identifiers (#856).
    • Allow . or ) after # in ATX headers if no fancy_lists.
    • Do not generate blank title, author, or date metadata elements. Leave these out entirely if they aren’t present.
    • Allow backtick code blocks not to be preceded by blank line (#975).
  • Textile reader:

    • Correctly handle entities.
    • Improved handling of <pre> blocks (#927). Remove internal HTML tags in code blocks, rather than printing them verbatim. Parse attributes on <pre> tag for code blocks.
  • HTML reader: Handle non-simple tables (#893). Column widths are read from col tags if present, otherwise divided equally.

  • LaTeX reader

    • Support alltt environment (#892).
    • Support \textasciitilde, \textasciicircum (#810).
    • Treat \textsl as emphasized text (#850).
    • Skip positional options after \begin{figure}.
    • Support \v{} for hacek (#926).
    • Don’t add spurious “,” to citation suffixes. This is added when needed in pandoc-citeproc.
    • Allow spaces in alignment spec in tables, e.g. { l r c }.
    • Improved support for accented characters (thanks to Scott Morrison).
    • Parse label after section command and set id (#951).
  • RST reader:

    • Don’t insert paragraphs where docutils doesn’t. rst2html doesn’t add <p> tags to list items (even when they are separated by blank lines) unless there are multiple paragraphs in the list. This commit changes the RST reader to conform more closely to what docutils does (#880).
    • Improved metadata. Treat initial field list as metadata when standalone specified. Previously ALL fields “title”, “author”, “date” in field lists were treated as metadata, even if not at the beginning. Use subtitle metadata field for subtitle.
    • Fixed ‘authors’ metadata parsing in reST. Semicolons separate different authors.
  • MediaWiki reader

    • Allow space before table rows.
    • Fixed regression for <ref>URL</ref>. < is no longer allowed in URLs, according to the uri parser in Text.Pandoc.Parsing. Added a test case.
    • Correctly handle indented preformatted text without preceding or following blank line.
    • Fixed | links inside table cells. Improved attribute parsing.
    • Skip attributes on table rows. Previously we just crashed if rows had attributes, now we ignore them.
    • Ignore attributes on headers.
    • Allow Image: for images (#971).
    • Parse an image with caption in a paragraph by itself as a figure.
  • LaTeX writer

    • Don’t use ligatures in escaping inline code.
    • Fixed footnote numbers in LaTeX/PDF tables. This fixes a bug wherein notes were numbered incorrectly in tables (#827).
    • Always create labels for sections. Previously the labels were only created when there were links to the section in the document (#871).
    • Stop escaping | in LaTeX math. This caused problems with array environments (#891).
    • Change \ to / in paths. / works even on Windows in LaTeX. \ will cause major problems if unescaped.
    • Write id for code block to label attribute in LaTeX when listings is used (thanks to Florian Eitel).
    • Scale LaTeX tables so they don’t exceed columnwidth.
    • Avoid problem with footnotes in unnumbered headers (#940).
  • Beamer writer: when creating beamer slides, add allowframebreaks option to the slide if it is one of the header classes. It is recommended that your bibliography slide have this attribute:

    # References {.allowframebreaks}

    This causes multiple slides to be created if necessary, depending on the length of the bibliography.

  • ConTeXt writer: Properly handle tables without captions. The old output only worked in MkII. This should work in MkIV as well (#837).

  • MediaWiki writer: Use native mediawiki tables instead of HTML (#720).

  • HTML writer:

    • Fixed --no-highlight (Alexander Kondratskiy).
    • Don’t convert to lowercase in email obfuscation (#839).
    • Ensure proper escaping in <title> and <meta> fields.
  • AsciiDoc writer:

    • Support --atx-headers (Max Rydahl Andersen).
    • Don’t print empty identifier blocks ([[]]) on headers (Max Rydahl Andersen).
  • ODT writer:

    • Fixed wrong numbered-list indentation in open document format (Alexander Kondratskiy) (#369).
    • reference.odt: Added pandoc as “generator” in meta.xml.
    • Minor changes for ODF 1.2 conformance (#939). We leave the nonconforming contextual-spacing attribute, which is provided by LibreOffice itself and seems well supported.
  • Docx writer:

    • Fixed rendering of display math in lists. In 1.11 and 1.11.1, display math in lists rendered as a new list item. Now it always appears centered, just as outside of lists, and in proper display math style, no matter how far indented the containing list item is (#784).
    • Use w:br with w:type textWrapping for linebreaks. Previously we used w:cr (#873).
    • Use Compact style for Plain block elements, to differentiate between tight and loose lists (#775).
    • Ignore most components of reference.docx. We take the word/styles.xml, docProps/app.xml, word/theme/theme1.xml, and word/fontTable.xml from reference.docx, ignoring everything else. This should help with the corruption problems caused when different versions of Word resave the reference.docx and reorganize things.
    • Made --no-highlight work properly.
  • EPUB writer

    • Don’t add dc:creator tags if present in EPUB metadata.
    • Add id="toc-title" to h1 in nav.xhtml (#799).
    • Don’t put blank title page in reading sequence. Set linear="no" if no title block. Addresses #797.
    • Download webtex images and include as data URLs. This allows you to use --webtex in creating EPUBs. Math with --webtex is automatically made self-contained.
    • In data/epub.css, removed highlighting styles (which are no longer needed, since styles are added by the HTML writer according to --highlight-style). Simplified margin fields.
    • If resource not found, skip it, as in Docx writer (#916).
  • RTF writer:

    • Properly handle characters above the 0000-FFFF range. Uses surrogate pairs. Thanks to Hiromi Ishii for the patch.
    • Fixed regression with RTF table of contents.
    • Only autolink absolute URIs. This fixes a regression, #830.
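
    The surrogate-pair encoding mentioned for characters above the 0000-FFFF range can be sketched as follows. This is an illustrative Python version of the standard UTF-16 computation, not the writer's actual Haskell code; the function names are hypothetical, and formatting the resulting code units as RTF escapes is left out.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the supplementary planes")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low

def utf16_units(text):
    """Return the UTF-16 code units for text, using surrogate pairs where needed."""
    units = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            units.extend(to_surrogate_pair(cp))
        else:
            units.append(cp)
    return units

print([hex(u) for u in utf16_units("a\U0001F600")])
```

    Each code unit fits in 16 bits, which is why a format limited to 16-bit Unicode escapes can still represent these characters.
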
  • Markdown writer:

    • Only autolink absolute URIs. This fixes a regression, #830.
    • Don’t wrap attributes in fenced code blocks.
    • Write full metadata in MMD style title blocks.
    • Put multiple authors on separate lines in pandoc titleblock. Also, don’t wrap long author entries, as new lines get treated as new authors.
  • Text.Pandoc.Templates:

    • Fixed bug retrieving default template for markdown variants.
    • Templates can now contain “record lookups” in variables; for example, author.institution will retrieve the institution field of the author variable.
    • More consistent behavior of $for$. When foo is not a list, $for(foo)$...$endfor$ should behave like $if(foo)$...$endif$. So if foo resolves to “”, no output should be produced. See pandoc-templates#39.
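
    The record lookups and the $for$ behavior described above can be modeled with a small Python sketch. It is only an illustration under assumptions: the context is taken to be nested dictionaries, and the helper names lookup and render_for are hypothetical, not part of Text.Pandoc.Templates.

```python
def lookup(context, dotted_name):
    """Resolve a name such as 'author.institution' against nested dicts."""
    value = context
    for key in dotted_name.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def render_for(context, name, body):
    """Render body(item) once per list element; treat a non-list like an 'if'."""
    value = lookup(context, name)
    if isinstance(value, list):
        return "".join(body(item) for item in value)
    if value is None or value == "" or value is False:
        return ""          # unset or empty: produce no output
    return body(value)     # a single non-empty value: render exactly once

ctx = {"author": {"name": "A. Writer", "institution": "Some University"},
       "keywords": ["pandoc", "templates"], "foo": ""}
print(lookup(ctx, "author.institution"))
print(render_for(ctx, "keywords", lambda k: "[" + k + "]"))
print(repr(render_for(ctx, "foo", lambda v: "[" + v + "]")))
```

    The last call prints an empty string: a variable that resolves to “” yields no output, just as an if-block would.
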
  • Citation processing improvements (now part of pandoc-citeproc):

    • Fixed endWithPunct. The new version correctly sees a sentence ending in ‘.)’ as ending with punctuation. This fixes a bug which led such sentences to receive an extra period at the end: ‘.).’. Thanks to Steve Petersen for reporting.
    • Don’t interfere with Notes that aren’t citation notes. This fixes a bug in which notes not generated from citations were being altered (e.g. first letter capitalized) (#898).
    • Only capitalize footnote citations when they have a prefix.
    • Changes in suffix parsing. A suffix beginning with a digit gets ‘p’ inserted before it before passing to citeproc-hs, so that bare numbers are treated as page numbers by default. A suffix not beginning with punctuation has a space added at the beginning (rather than a comma and space, as was done before for not-author-in-text citations). The result is that \citep[23]{item1} in LaTeX will be interpreted properly, with ‘23’ treated as a locator of type ‘page’.
    • Many improvements to citation rendering, due to fixes in citeproc-hs (thanks to Andrea Rossato).
    • Warnings are issued for undefined citations, which are rendered as ???.
    • Fixed hanging behavior when locale files cannot be found.

Template changes

  • DocBook: Use DocBook 4.5 doctype.

  • Org: ‘#+TITLE:’ is inserted before the title. Previously the writer did this.

  • LaTeX: Changes to make mathfont work with xelatex. We need the mathspec library, not just fontspec, for this. We also need to set options for setmathfont (#734).

  • LaTeX: Use tex-ansi mapping for monofont. This ensures that straight quotes appear as straight, rather than being treated as curly. See #889.

  • Made \includegraphics more flexible in LaTeX template. Now it can be used with options, if needed. Thanks to Bernhard Weichel.

  • LaTeX/Beamer: Added classoption variable. This is intended for class options like oneside; it may be repeated with different options. (Thanks to Oliver Matthews.)

  • Beamer: Added fonttheme variable. (Thanks to Luis Osa.)

  • LaTeX: Added biblio-style variable (#920).

  • DZSlides: title attribute on title section.

  • HTML5: add meta tag to allow scaling by user (Erik Evenson)

Under-the-hood improvements

  • Markdown reader: Improved strong/emph parsing. The new parsing algorithm requires no backtracking, and no keeping track of nesting levels. It will give different results in some edge cases, but these should not affect normal uses.

  • Added Text.Pandoc.Compat.Monoid. This allows pandoc to compile with base < 4.5, where Data.Monoid doesn’t export <>. Thanks to Dirk Ullrich for the patch.

  • Added Text.Pandoc.Compat.TagSoupEntity. This allows pandoc to compile with tagsoup 0.13.x. Thanks to Dirk Ullrich for the patch.

  • Most of Text.Pandoc.Readers.TeXMath has been moved to the texmath module (0.6.4). (This allows pandoc-citeproc to handle simple math in bibliography fields.)

  • Added Text.Pandoc.Writers.Shared for shared functions used only in writers. metaToJSON is used in writers to create a JSON object for use in the templates from the pandoc metadata and variables. getField, setField, and defField are for working with JSON template contexts.

  • Added Text.Pandoc.Asciify utility module. This exports functions to create ASCII-only versions of identifiers.

  • Text.Pandoc.Parsing

    • Generalized state type on readWith (API change).
    • Specialize readWith to String input. (API change).
    • In ParserState, replace stateTitle, stateAuthors, stateDate with stateMeta and stateMeta'.
  • Text.Pandoc.UTF8: use strict bytestrings in reading. The use of lazy bytestrings seemed to cause problems using pandoc on 64-bit Windows 7/8 (#874).

  • Factored out registerHeader from markdown reader, added to Text.Pandoc.Parsing.

  • Removed blaze_html_0_5 flag, require blaze-html >= 0.5. Reason: < 0.5 does not provide a monoid instance for Attribute, which is now needed by the HTML writer (#803).

  • Added http-conduit flag, which allows fetching https resources. It also brings in a large number of dependencies (http-conduit and its dependencies), which is why for now it is an optional flag (#820).

  • Added

  • Improved INSTALL instructions.

  • make-windows-installer.bat: Removed explicit paths for executables.

  • aeson is now used instead of json for JSON.

  • Set default stack size to 16M. This is needed for some large conversions, esp. if pandoc is compiled with 64-bit ghc.

  • Various small documentation improvements. Thanks to achalddave and drothlis for patches.

  • Removed comment that chokes recent versions of CPP (#933).

  • Removed support for GHC version < 7.2, since pandoc-types now requires at least GHC 7.2 for GHC generics.

pandoc 1.11.1 (2013-03-17)

  • Markdown reader:

    • Fixed regression in which parentheses were lost in link URLs. Added tests. Closes #786.
    • Better handling of unmatched double quotes in --smart mode. These occur frequently in fiction, since it is customary not to close quotes in dialogue if the speaker does not change between paragraphs. The unmatched quotes now get turned into literal left double quotes. (No Quoted inline is generated, however.) Closes #99 (again).
  • HTML writer: Fixed numbering mismatch between TOC and sections. --number-offset now affects TOC numbering as well as section numbering, as it should have all along. Closes #789.

  • Markdown writer: Reverted 1.11 change that caused citations to be rendered as markdown citations, even if --bibliography was specified, unless citation extension is disabled. Now, formatted citations are always printed if --bibliography was specified. If you want to reformat markdown keeping pandoc markdown citations intact, don’t use --bibliography. Note that citations parsed from LaTeX documents will be rendered as pandoc markdown citations when --bibliography is not specified.

  • ODT writer: Fixed regression leading to corrupt ODTs. This was due to a change in the Show instance for Text.Pandoc.Pretty.Doc. Closes #780.

  • Fixed spacing bugs involving code block attributes in RST reader and Markdown writer. Closes #763.

  • Windows package: Various improvements due to Fyodor Sheremetyev.

    • Automatically set installation path (Program Files or Local App Data).
    • Set system PATH environment variable when installing for all users.
    • Pandoc can be installed for all users using the following command: msiexec /i pandoc-1.11.msi ALLUSERS=1.
  • Bumped QuickCheck version bound.

pandoc 1.11 (2013-03-09)

  • Added --number-offset option. (See README for description.)

  • Added --default-image-extension option. (See README for description.)

  • --number-sections behavior change: headers with class unnumbered will not be numbered.

  • --version now reports the default data directory.

  • Text.Pandoc.Parsing is no longer exposed. (API change.)

  • Text.Pandoc.Highlighting is no longer exposed. (API change.)

  • Text.Pandoc.Shared: Changed type of Element. Sec now includes a field for Attr rather than just String. (API change.)

  • Added markdown_github as input format. This was an accidental omission in 1.10.

  • Added readerDefaultImageExtension field to ReaderOptions. (API change.)

  • Added writerNumberOffset field in WriterOptions. (API change.)

  • Beamer template:

    • Fixed captions with longtable. Thanks to Joost Kremers.
    • Provide \Oldincludegraphics as in LaTeX template (Benjamin Bannier).
  • LaTeX template:

    • Load microtype after fonts. Microtype needs to know what fonts are being used. Thanks to dfc for the patch.
    • Set secnumdepth to 5 if --number-sections specified. This yields behavior equivalent to the other writers, numbering level 4 and 5 headers too. Closes #753.
  • HTML reader:

    • Handle <colgroup> tag.
    • Preserve all header attributes.
  • LaTeX reader:

    • Parse \hrule as HorizontalRule. Closes #746.
    • Parse starred variants of \section etc. as headers with attribute unnumbered.
    • Read optional attributes in lstlisting and Verbatim environments. We convert these to pandoc standard names, e.g. numberLines for numbers=left, startFrom=100 for firstnumber=100.
    • Handle language attribute for lstlistings.
    • Better support for Verbatim and minted environments. Closes #763.
  • Markdown reader:

    • A - in an attribute context is now equivalent to .unnumbered. The point of this is to provide a way to specify unnumbered headers in non-English documents.
    • Fixed bug parsing key/value attributes. Parsing failed if you had an unquoted attribute immediately before the final ‘}’.
    • Make backslash escape work in attributes.
    • Fix title block parsing. Now if mmd_title_blocks is specified, pandoc will parse a MMD title block if it sees one, even if pandoc_title_blocks is enabled.
    • Refactoring: litChar now includes entities, so we don’t need to use fromEntities e.g. on titles.
    • Allow spaces around borders in pipe tables. Closes #772.
    • Allow all punctuation in angle-bracket autolinks. Previously things like ---- were disallowed, because the uri parser treated them as trailing punctuation. Closes #768.
    • Make implicit_header_references work properly when headers are given explicit identifiers.
    • Check for tables before line blocks. Otherwise some pipe tables get treated as line blocks.
    • Allow & in emails (for entities).
    • Properly handle entities in titles and links. A markdown link <http://g&ouml;> should be a link to http://gö. Closes #723.
  • Textile reader:

    • Handle attributes on headers.
  • LaTeX reader:

    • Add fig: as title for images with captions. This is needed for them to be rendered as figures. Closes #766.
    • Never emit an empty paragraph. See #761.
    • Handle \caption for images in figures. Closes #766.
    • Parse \section*, etc. as unnumbered sections.
  • HTML writer:

    • Support header attributes. The attributes go on the enclosing section or div tag if --section-divs is specified.
    • Fixed a regression (only now noticed) in html+lhs output. Previously the bird tracks were being omitted.
  • LaTeX writer:

    • Omit lists with no items to avoid LaTeX errors.
    • Support line numbering with --listings. If numberLines class is present, we add numbers=left; if startFrom is present, we add firstnumber=. (#763)
  • ConTeXt writer:

    • Removed \placecontent. This produced a duplicate toc, in conjunction with \placelist.
    • Use \title, \subject etc. for headers with unnumbered class.
  • Textile writer:

    • Support header attributes.
  • Markdown writer:

    • Use grid tables when needed, and if enabled. Closes #740.
    • Render citations as pandoc-markdown citations. Previously citations were rendered as citeproc-formatted citations by default. Now we render them as pandoc citations, e.g. [@item1], unless the citations extension is disabled. If you still want formatted citations in your markdown output, use pandoc -t markdown-citations.
  • RST writer:

    • Support :number-lines: in code blocks.
  • Docx writer:

    • Better treatment of display math. Display math inside a paragraph is now put in a separate paragraph, so it will render properly (centered and without extra blank lines around it). Partially addresses #742.
    • Content types and document rels xml files are now created from scratch, rather than being taken over from reference.docx. This fixes problems that arise when you edit the reference.docx with Word.
    • We also now encode mime types for each individual image rather than using defaults. This should allow us to handle a wider range of image types (including PDF). Closes #414.
    • Changed style names in reference docx. FootnoteReference -> FootnoteRef, Hyperlink -> Link. The old names got changed by Word when the reference.docx was edited. Closes #414.
  • EPUB writer:

    • Fix section numbering. Previously the numbering restarted from 1 in each chapter (with --number-sections), though the numbers in the table of contents were correct.
    • Headers with “unnumbered” attribute are not numbered. (Nor do they cause an increment in running numbering.) Section numbers now work properly, even when there is material before the first numbered section.
    • Include HTML TOC, even in epub2. The TOC is included in <spine>, but linear is set to no unless the --toc option is specified. Include <guide> element in OPF. This should allow the TOC to be useable in Kindles when converted with kindlegen. Closes #773.
  • Text.Pandoc.Parsing: Optimized oneOfStringsCI. This dramatically reduces the speed penalty that comes from enabling the autolink_bare_uris extension. The penalty is still substantial (in one test, from 0.33s to 0.44s), but nowhere near what it used to be. The RST reader is also much faster now, as it autodetects URIs.

  • Text.Pandoc.Shared: hierarchicalize will not number sections with class “unnumbered”. Unnumbered sections get [] for their section number.
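
    The numbering rule for unnumbered sections can be sketched in Python. This is a simplified model, not the hierarchicalize function itself: headers are assumed to be (level, title, classes) tuples, the name number_sections is hypothetical, and the sketch assumes an unnumbered header leaves all running counters untouched.

```python
def number_sections(headers):
    """Assign section numbers to (level, title, classes) tuples.

    Returns a list of (number, title), where number is a list such as [2, 1].
    Sections with class "unnumbered" get [] and do not advance the counters.
    """
    counters = []   # counters[i] is the running count at level i + 1
    result = []
    for level, title, classes in headers:
        if "unnumbered" in classes:
            result.append(([], title))
            continue
        # Pad with zeros when jumping to a deeper level, then drop deeper levels.
        while len(counters) < level:
            counters.append(0)
        counters = counters[:level]
        counters[-1] += 1
        result.append((list(counters), title))
    return result

doc = [(1, "Preface", ["unnumbered"]), (1, "Intro", []),
       (2, "Background", []), (1, "Methods", [])]
for number, title in number_sections(doc):
    print(number, title)
```

    In this example "Preface" gets [], so "Intro" is still numbered [1] even though material precedes it.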

  • Text.Pandoc.Pretty:

    • Fixed chomp so it works inside Prefixed elements.
    • Changed Show instance so it is better for debugging.
  • Text.Pandoc.ImageSize: Added Pdf to ImageType.

  • Text.Pandoc.UTF8: Strip off BOM if present. Closes #743.

  • Windows installer improvements:

    • The installer is now signed with a certificate (thanks to Fyodor Sheremetyev).
    • WiX is used instead of InnoSetup. The installer is now a standard msi file.
    • The version number is now auto-detected, and need not be updated separately.
  • OSX installer improvements:

    • The package and pandoc executable are now signed with a certificate (thanks to Fyodor Sheremetyev).
    • RTF version of license is used.
    • Use full path for sysctl in InstallationCheck script (jonahbull). Closes #580.
  • Converted COPYING to markdown.

  • pandoc.cabal: Require latest versions of highlighting-kate, texmath, citeproc-hs, zip-archive.

pandoc 1.10.1 (2013-01-23)

  • Markdown reader: various optimizations, leading to a significant performance boost.

  • RST reader: Allow anonymous form of inline links: `hello <url>`__ Closes #724.

  • MediaWiki reader: Don’t require newlines after tables. Thanks to jrunningen for the patch. Closes #733.

  • Fixed LaTeX macro parsing. Now LaTeX macro definitions are preserved when output is LaTeX, and applied when it is another format. Partially addresses #730.

  • Markdown and RST readers: Added parser to block that skips blank lines. This fixes a subtle regression involving grid tables with empty cells. Also added test for grid table with empty cells. Closes #732.

  • RST writer: Use .. code:: language for code blocks with language. Closes #721.

  • DocBook writer: Fixed output for hard line breaks, adding a newline between <literallayout> tags.

  • Markdown writer: Use an autolink when link text matches url. Previously we also checked for a null title, but this test fails for links produced by citeproc-hs in bibliographies. So, if the link has a title, it will be lost on conversion to an autolink, but that seems okay.

  • Markdown writer: Set title, author, date variables as before. These are no longer used in the default template, since we use titleblock, but we set them anyway for those who use custom templates.

  • LaTeX writer: Avoid extra space at start/end of table cell. Thanks to Nick Bart for the suggestion of using @{}.

  • Text.Pandoc.Parsing:

    • More efficient version of anyLine.
    • Type of macro has changed; the parser now returns Blocks instead of Block.
  • Relaxed old-time version bound, allowing 1.0.*.

  • Removed obsolete hsmarkdown script. Those who need hsmarkdown should create a symlink as described in the README.

pandoc (2013-01-23)

  • Markdown reader: Try lhsCodeBlock before rawTeXBlock. Otherwise \begin{code}...\end{code} isn’t handled properly in markdown+lhs. Thanks to Daniel Miot for noticing the bug and suggesting the fix.

  • Markdown reader: Fixed bug with headerless grid tables. The 1.10 code assumed that each table header cell contains exactly one block. That failed for headerless tables (0) and also for tables with multiple blocks in a header cell. The code is fixed and tests provided. Thanks to Andrew Lee for pointing out the bug.

  • Markdown reader: Fixed regressions in fenced code blocks. Closes #722.

    • Tilde code fences can again take a bare language string (~~~ haskell), not just curly-bracketed attributes (~~~ {.haskell}).
    • Backtick code blocks can take the curly-bracketed attributes.
    • Backtick code blocks don’t require a language.
    • Consolidated code for the two kinds of fenced code blocks.
  • LaTeX template: Use \urlstyle{same} to avoid monospace URLs.

  • Markdown writer: Use proportional font for email autolinks with obfuscation. Closes #714.

  • Corrected name of blank_before_blockquote in README. Closes #718.

  • Text.Pandoc.Shared: Fixed bug in uri parser. The bug prevented an autolink at the end of a string (e.g. at the end of a line block line) from counting as a link. Closes #711.

  • Use the hsb2hs preprocessor instead of TH for embed_data_files. This should work on Windows, unlike the TH solution with file-embed.

  • Eliminated use of TH in test suite.

  • Added Text.Pandoc.Data (non-exported) to hold the association list of embedded data files, if the embed_data_files flag is selected. This isolates the code that needs special treatment with file-embed or hsb2hs.

  • Changes to make-windows-installer.bat.

    • Exit batch file if any of the cabal-dev installs fail.
    • There’s no longer any need to reinstall highlighting-kate.
    • Don’t start with a cabal update; leave that to the user.
    • Force reinstall of pandoc.
  • Fixed EPUB writer so it builds with blaze-html 0.4.x. Thanks to Jens Petersen.

pandoc (2013-01-20)

  • Fixed bug with escaped % in LaTeX reader. Closes #710.

pandoc (2013-01-20)

  • Added further missing fb2 tests to cabal file.

pandoc (2013-01-20)

  • Added fb2 tests to cabal file’s extra-source-files.

pandoc (2013-01-20)

  • Bump version bounds on test-framework packages.

pandoc 1.10 (2013-01-19)

New features

  • New input formats: mediawiki (MediaWiki markup).

  • New output formats: epub3 (EPUB v3 with MathML), fb2 (FictionBook2 ebooks).

  • New --toc-depth option, specifying how many levels of headers to include in a table of contents.

  • New --epub-chapter-level option, specifying the header level at which to divide EPUBs into separate files. Note that this normally affects only performance, not the visual presentation of the EPUB in a reader.

  • Removed the --strict option. Instead of using --strict, one can now use the format name markdown_strict for either input or output. This gives more fine-grained control than --strict did, allowing one to convert from pandoc’s markdown to strict markdown or vice versa.

  • It is now possible to enable or disable specific syntax extensions by appending them (with + or -) to the writer or reader name. For example,

    pandoc -f markdown-footnotes+hard_line_breaks

    disables footnotes and enables treating newlines as hard line breaks. The literate Haskell extensions are now implemented this way as well, using either +lhs or +literate_haskell. For a list of extension names, see the README under “Pandoc’s Markdown.”
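
    The way a format name with extension modifiers breaks down can be sketched in Python. This is only an illustration of the syntax described above, not pandoc's own parser; the function name parse_format_spec is hypothetical, and the sketch assumes extension names consist of letters, digits, and underscores.

```python
import re

def parse_format_spec(spec):
    """Split e.g. 'markdown-footnotes+hard_line_breaks' into its parts.

    Returns (base_format, enabled_extensions, disabled_extensions).
    """
    match = re.match(r"^([^+-]+)((?:[+-][A-Za-z0-9_]+)*)$", spec)
    if match is None:
        raise ValueError("malformed format spec: %r" % spec)
    base, modifiers = match.group(1), match.group(2)
    enabled, disabled = [], []
    for sign, name in re.findall(r"([+-])([A-Za-z0-9_]+)", modifiers):
        (enabled if sign == "+" else disabled).append(name)
    return base, enabled, disabled

print(parse_format_spec("markdown-footnotes+hard_line_breaks"))
```

    For the command shown above, the base format is markdown, hard_line_breaks is enabled, and footnotes is disabled.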

  • The following aliases have been introduced for specific combinations of markdown extensions: markdown_phpextra, markdown_github, markdown_mmd, markdown_strict. These aliases work just like regular reader and writer names, and can be modified with extension modifiers as described above. (Note that conversion from one markdown dialect to another does not work perfectly, because there are differences in markdown parsers besides just the extensions, and because pandoc’s internal document model is not rich enough to capture all of the extensions.)

  • New --html-q-tags option. The previous default was to use <q> tags for smart quotes in HTML5. But <q> tags are also valid HTML4. Moreover, they are not a robust way of typesetting quotes, since some user agents don’t support them, and some CSS resets (e.g. bootstrap) prevent pandoc’s quotes CSS from working properly. We now just insert literal quote characters by default in both html and html5 output, but this option is provided for those who still want <q> tags.

  • The markdown reader now prints warnings (to stderr) about duplicate link and note references. Closes #375.

  • Markdown syntax extensions:

    • Added pipe tables. Thanks to François Gannaz for the initial patch. These conform to PHP Markdown Extra’s pipe table syntax. A subset of org-mode table syntax is also supported, which means that you can use org-mode’s nice table editor to create tables.

    • Added support for RST-style line blocks. These are useful for verse and addresses.

    • Attributes can now be specified for headers, using the same syntax as in code blocks. (However, currently only the identifier has any effect in most writers.) For example,

      # My header {#foo}
      See [the header above](#foo).
    • Pandoc will now act as if link references have been defined for all headers without explicit identifiers. So, you can do this:

      # My header
      Link to [My header].
      Another link to [it][My header].

    Closes #691.

  • LaTeX reader:

    • Command macros now work everywhere, including non-math. Environment macros still not supported.
    • \input now works, as well as \include. TEXINPUTS is used. Pandoc looks recursively into included files for more included files.

Behavior changes

  • The Markdown reader no longer puts the text of autolinks in a Code inline. This means that autolinks will no longer appear in a monospace font.

  • The character / can now appear in markdown citation keys.

  • HTML blocks in markdown_strict are no longer required to begin at the left margin. Technically this is required, according to the markdown syntax document, but other markdown processors are more liberal.

  • The -V option has been changed so that if there are duplicate variables, those specified later on the command line take precedence.
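
    The precedence rule can be pictured with a short Python sketch. It is a simplified model rather than pandoc's code: the function name resolve_variables is hypothetical, and it assumes each -V argument is a KEY=VAL string, with a bare KEY treated as the value true.

```python
def resolve_variables(assignments):
    """Resolve 'KEY=VAL' (or bare 'KEY') strings given in command-line order."""
    resolved = {}
    for item in assignments:
        key, sep, value = item.partition("=")
        # A later assignment to the same key overwrites the earlier one.
        resolved[key] = value if sep else "true"
    return resolved

print(resolve_variables(["title=Draft", "lang=en", "title=Final"]))
```

    Here the second title assignment wins, which lets a wrapper script set defaults that a user can override by appending further -V options.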

  • Tight lists now work in LaTeX and ConTeXt output.

  • The LaTeX writer no longer relies on the enumerate package. Instead, it uses standard LaTeX commands to change the list numbering style.

  • The LaTeX writer now uses longtable instead of ctable. This allows tables to be split over page boundaries.

  • The RST writer now uses a line block to render paragraphs containing linebreaks (which previously weren’t supported at all).

  • The markdown writer now applies the --id-prefix to footnote IDs. Closes #614.

  • The plain writer no longer uses backslash-escaped line breaks (which are not very “plain”).

  • Text.Pandoc.UTF8: Better error message for invalid UTF8. Read bytestring and use Text’s decodeUtf8 instead of using System.IO.hGetContents. This way you get a message saying “invalid UTF-8 stream” instead of “invalid byte sequence.” You are also told which byte caused the problem.

  • Docx, ODT, and EPUB writers now download images specified by a URL instead of skipping them or raising an error.

  • EPUB writer:

    • The default CSS now left-aligns headers instead of centering them. This is more consistent with the rest of the writers.
    • A proper multi-level table of contents is now used in toc.ncx. There is no longer a subsidiary table of contents at the beginning of each chapter.
    • Code highlighting now works by default.
    • Section divs are used by default for better semantic markup.
    • The title is used instead of “Title Page” in the table of contents. Otherwise we have a hard-coded English string, which looks strange in ebooks written in other languages. Closes #572.
  • HTML writer:

    • Put mathjax in span with class “math”. Closes #562.
    • Put citations in a span with class “citation.” In HTML5, also include a data-cite attribute with a space-separated list of citation keys.
  • Text.Pandoc.UTF8: use universalNewlineMode in reading. This treats both \r\n and \n as \n on input, no matter what platform we’re running on.
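The effect of reading in universal newline mode, as described above, can be sketched in Python. This is a rough illustration of the input normalization only (CRLF and LF both become LF); the helper name `normalize_newlines` is hypothetical and is not part of pandoc.

```python
def normalize_newlines(text):
    """Map Windows-style CRLF line endings to a bare LF; bare LF is unchanged."""
    return text.replace("\r\n", "\n")


sample = "one\r\ntwo\nthree\r\n"
print(repr(normalize_newlines(sample)))
```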

  • Citation processing is now done in the Markdown and LaTeX readers, not in pandoc.hs. This makes it easier for library users to use citations.

Template changes

  • HTML: Added css to template to preserve spaces in <code> tags. Thanks to Dirk Laurie.

  • Beamer: Remove English-centric strings in section pages. Section pages used to have “Section” and a number as well as the section title. Now they just have the title. Similarly for part and subsection. Closes #566.

  • LaTeX, ConTeXt: Added papersize variable.

  • LaTeX, Beamer templates: Use longtable instead of ctable.

  • LaTeX, Beamer templates: Don’t require ‘float’ package for tables. We don’t actually seem to use the ‘[H]’ option.

  • Markdown, plain: Fixed titleblock so it is just a single string. Previously separate title, author, and date variables were used, but this didn’t allow different kinds of title blocks.

  • EPUB:

    • Rationalized templates. Previously there were three different templates involved in epub production. There is now just one template, default.epub or default.epub3. It can now be overridden using --template, just like other templates. The titlepage is now folded into the default template. A titlepage variable selects it.
    • UTF-8, lang tag, meta tags, title element.
  • Added scale-to-width feature to beamer template

API changes

  • Text.Pandoc.Definition: Added Attr field to Header. Previously header identifiers were autogenerated by the writers. Now they are added in the readers (either automatically or explicitly).

  • Text.Pandoc.Builder:

    • Inlines and Blocks are now synonyms for Many Inline and Many Block. Many is a newtype wrapper around Seq, with custom Monoid instances for Many Inline and Many Block. This allows Many to be made an instance of Foldable and Traversable.
    • The old Listable class has been removed.
    • The module now exports isNull, toList, fromList.
    • The old Read and Show instances have been removed; derived instances are now used.
    • Added headerWith.
  • The readers now take a ReaderOptions rather than a ParserState as a parameter. Indeed, not all parsers use the ParserState type; some have a custom state. The motivation for this change was to separate user-specifiable options from the accounting functions of parser state.

  • New module Text.Pandoc.Options. This includes the WriterOptions formerly in Text.Pandoc.Shared, and its associated data types. It also includes a new type ReaderOptions, which contains many options formerly in ParserState, and its associated data types:

    • ParserState.stateParseRaw -> ReaderOptions.readerParseRaw.
    • ParserState.stateColumns -> ReaderOptions.readerColumns.
    • ParserState.stateTabStop -> ReaderOptions.readerTabStop.
    • ParserState.stateOldDashes -> ReaderOptions.readerOldDashes.
    • ParserState.stateLiterateHaskell -> ReaderOptions.readerLiterateHaskell.
    • ParserState.stateCitations -> ReaderOptions.readerReferences.
    • ParserState.stateApplyMacros -> ReaderOptions.readerApplyMacros.
    • ParserState.stateIndentedCodeClasses -> ReaderOptions.readerIndentedCodeClasses.
    • Added ReaderOptions.readerCitationStyle.
  • WriterOptions now includes writerEpubVersion, writerEpubChapterLevel, writerEpubStylesheet, writerEpubFonts, writerReferenceODT, writerReferenceDocx, and writerTOCDepth. writerEPUBMetadata has been renamed writerEpubMetadata for consistency.

  • Changed signatures of writeODT, writeDocx, writeEPUB, since they no longer take stylesheet, fonts, and reference files as separate parameters.

  • Removed writerLiterateHaskell from WriterOptions, and readerLiterateHaskell from ReaderOptions. LHS is now handled by an extension (Ext_literate_haskell).

  • Removed deprecated writerXeTeX.

  • Removed writerStrict from WriterOptions. Added writerExtensions. Strict is now handled through extensions.

  • Text.Pandoc.Options exports pandocExtensions, strictExtensions, phpMarkdownExtraExtensions, githubMarkdownExtensions, and multimarkdownExtensions, as well as the Extensions type.

  • New Text.Pandoc.Readers.MediaWiki module, exporting readMediaWiki.

  • New Text.Pandoc.Writers.FB2 module, exporting writeFB2 (thanks to Sergey Astanin).

  • Text.Pandoc:

    • Added getReader, getWriter to Text.Pandoc.
    • writers is now an association list (String, Writer). A Writer can be a PureStringWriter, an IOStringWriter, or an IOByteStringWriter. ALL writers are now in the ‘writers’ list, including the binary writers and FB2 writer. This allows code in pandoc.hs to be simplified.
    • Changed type of readers, so all readers are in IO. Users who want pure readers can still get them from the reader modules; this just affects the function getReader that looks up a reader based on the format name. The point of this change is to make it possible to print warnings from the parser.
  • Text.Pandoc.Parsing:

    • Text.Pandoc.Parsing now exports all Parsec functions used in pandoc code. No other module directly imports Parsec. This will make it easier to change the parsing backend in the future, if we want to.
    • Text.Parsec is used instead of Text.ParserCombinators.Parsec.
    • Export the type synonym Parser.
    • Export widthsFromIndices, NoteTable', KeyTable', Key', toKey', withQuoteContext, singleQuoteStart, singleQuoteEnd, doubleQuoteStart, doubleQuoteEnd, ellipses, apostrophe, dash, nested, F(..), askF, asksF, runF, lineBlockLines.
    • ParserState is no longer an instance of Show.
    • Added stateSubstitutions and stateWarnings to ParserState.
    • Generalized type of withQuoteContext.
    • Added guardEnabled, guardDisabled, getOption.
    • Removed failIfStrict.
    • lookupKeySrc and fromKey are no longer exported.
  • Data.Default instances are now provided for ReaderOptions, WriterOptions, and ParserState. Text.Pandoc re-exports def, so you can now use def instead of defaultWriterOptions (which is still defined). Closes #546.

  • Text.Pandoc.Shared:

    • Added safeRead.
    • Renamed removeLeadingTrailingSpace to trim, removeLeadingSpace to triml, and removeTrailingSpace to trimr.
    • Count \r as space in trim functions.
    • Moved renderTags' from HTML reader and Text.Pandoc.SelfContained to Shared.
    • Removed failUnlessLHS.
    • Export compactify', formerly in Markdown reader.
    • Export isTightList.
    • Do not export findDataFile.
    • readDataFile now returns a strict ByteString.
    • Export readDataFileUTF8 which returns a String, like the old readDataFile.
    • Export fetchItem and openURL.
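The behaviour of the renamed trim functions listed above, including the change to count \r as space, can be approximated in Python. This is a sketch under the assumption that the whitespace set is space, tab, newline, and carriage return; the exact set used by Text.Pandoc.Shared is not stated in these notes.

```python
# Assumed whitespace set; including "\r" is the point being illustrated.
SPACE_CHARS = " \t\n\r"


def triml(s):
    """Remove leading whitespace."""
    return s.lstrip(SPACE_CHARS)


def trimr(s):
    """Remove trailing whitespace."""
    return s.rstrip(SPACE_CHARS)


def trim(s):
    """Remove both leading and trailing whitespace."""
    return triml(trimr(s))


print(repr(trim("  hello\r\n")))
```

With "\r" in the set, a line read from a file with Windows line endings trims cleanly instead of keeping a stray carriage return.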
  • Text.Pandoc.ImageSize: Use strict, not lazy bytestrings. Removed readImageSize.

  • Text.Pandoc.UTF8: Export encodePath, decodePath, decodeArg, toString, fromString, toStringLazy, fromStringLazy.

  • Text.Pandoc.UTF8 is now an exposed module.

  • Text.Pandoc.Biblio:

    • csl parameter now a String rather than a FilePath.
    • Changed type of processBiblio. It is no longer in the IO monad. It now takes a Maybe Style argument rather than parameters for CSL and abbrev filenames. (pandoc.hs now calls the functions to parse the style file and add abbreviations.)
  • Markdown reader now exports readMarkdownWithWarnings.

  • Text.Pandoc.RTF now exports writeRTFWithEmbeddedImages instead of rtfEmbedImage.

Bug fixes

  • Make --ascii work properly with --self-contained. Closes #568.

  • Markdown reader:

    • Fixed link parser to avoid exponential slowdowns. Closes #620. Previously the parser would hang on input like this:


    We fixed this by making the link parser parse characters between balanced brackets (skipping brackets in inline code spans), then parsing the result as an inline list. One change is that

        [hi *there]* bud](/url)

    is now no longer parsed as a link. But in this respect pandoc behaved differently from most other implementations anyway, so that seems okay.

    • Look for raw html/latex blocks before tables. Otherwise the following gets parsed as a table:

      -- My comment.

    Closes #578.
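The balanced-bracket strategy described above for the link parser can be sketched in Python. This is a simplified illustration, not the reader's actual Haskell code: the function names are hypothetical, backtick runs are matched naively, and the second stage (parsing the bracket contents as inlines) is left out.

```python
def bracketed_span(text, start=0):
    """Return (inner, end) for the balanced [...] beginning at text[start].

    Brackets inside backtick code spans and backslash-escaped characters
    are not counted. Returns None if there is no balanced span.
    """
    if start >= len(text) or text[start] != "[":
        return None
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if ch == "`":
            # Measure the opening backtick run, then jump past the closing run.
            run = 1
            while i + run < len(text) and text[i + run] == "`":
                run += 1
            close = text.find("`" * run, i + run)
            i = close + run if close != -1 else i + run
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
        i += 1
    return None


def looks_like_inline_link(text):
    """True if a balanced [...] at the start is followed directly by '('."""
    span = bracketed_span(text)
    if span is None:
        return False
    _, end = span
    return text.startswith("(", end)


print(looks_like_inline_link("[hi *there]* bud](/url)"))
print(looks_like_inline_link("[a `]` b](/url)"))
```

In the first input the brackets balance right after "there", and the next character is "*" rather than "(", which is why that text is no longer treated as a link.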

  • RST reader:

    • Added support for :target: on .. image:: blocks and substitutions.
    • Field list fixes:

      • Fixed field list items with body beginning after a new line (Denis Laxalde).
      • Allow any char but ‘:’ in names of field lists in RST reader (Denis Laxalde).
      • Don’t allow line breaks in field names.
      • Require whitespace after field list field names.
      • Don’t create empty definition list for metadata field lists. Previously a field list consisting only of metadata fields (author, title, date) would be parsed as an empty DefinitionList, which is not legal in LaTeX and not needed in any format.
    • Don’t recognize inline-markup starts inside words. For example, 2*2 = 4*1 should not contain an emphasized section. Added test case for “Literal symbols”. Closes #569.
    • Allow dashes as separator in simple tables. Closes #555.
    • Added support for container, compound, epigraph, rubric, highlights, pull-quote.
    • Added support for .. code::.
    • Made directive labels case-insensitive.
    • Removed requirement that directives begin at left margin. This was (correctly) not in earlier releases; docutils doesn’t make the requirement.
    • Added support for replace:: and unicode:: substitutions.
    • Ignore unknown interpreted roles.
    • Renamed image parser to subst, since it now handles all substitution references.

  • Textile reader:

    • Allow newlines before pipes in table. Closes #654.
    • Fixed bug with list items containing line breaks. Now pandoc correctly handles hard line breaks inside list items. Previously they broke list parsing.
    • Implemented comment blocks.
    • Fixed bug affecting words ending in hyphen.
    • Properly handle links with surrounding brackets. Square brackets need to be used when the link isn’t surrounded by spaces or punctuation, or when the URL ending may be ambiguous. Closes #564.
    • Removed nullBlock. Better to know about parsing problems than to skip stuff when we get stuck.
    • Allow ID attributes on headers.
    • Avoid parsing dashes as strikeout. Previously the input

    would be parsed with strikeouts rather than dashes. This fixes the problem by requiring that a strikeout delimiting - not be followed by a -. Closes #631.
    • Expanded list of stringBreakers. This fixes a bug on input like “(hello)” which should be a parenthesized emphasized “hello”. The new list is taken from the PHP source of textile 2.4.
    • Fixed autolinks. Previously the textile reader and writer incorrectly implemented RST-style autolinks for URLs and email addresses. This has been fixed. Now an autolink is done this way: "$":
    • Fixed footnotes bug in textile. This affected notes occurring before punctuation, e.g. foo[1].. Closes #518.
  • LaTeX reader:

    • Better handling of citation commands.
    • Better handling of \noindent.
    • Added a ‘try’ in rawLaTeXBlock, so we can handle \begin without {. Closes #622.
    • Made rawLaTeXInline try to parse block commands as well. This is usually what we want, given how rawLaTeXInline is used in the markdown and textile readers. If a block-level LaTeX command is used in the middle of a paragraph (e.g. \subtitle inside a title), we can treat it as raw inline LaTeX.
    • Handle command. Closes #605.
    • Basic \enquote support.
    • Fixed parsing of paragraphs beginning with a group. Closes #606.
    • Use curly quotes for bare straight quotes.
    • Support obeylines environment. Closes #604.
    • Guard against “begin”, “end” in inlineCommand and blockCommand.
    • Better error messages for environments. Now it should tell you that it was looking for \end{env}, instead of giving “unknown parse error.”
  • HTML reader:

    • Added HTML 5 tags to list of block-level tags.
    • Fixed bug in htmlBalanced, which caused hangs in parsing certain markdown input using strict mode.
    • Parse <q> as Quoted DoubleQuote.
    • Handle nested <q> tags properly.
    • Modified htmlTag for fewer false positives. A tag must start with < followed by !, ?, /, or a letter. This makes it more useful in the MediaWiki and markdown parsers.
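The tag-start rule stated for htmlTag above (a "<" followed by "!", "?", "/", or a letter) can be expressed as a small predicate. The Python below is only a sketch of that one check; the name `could_start_tag` is hypothetical, and treating "letter" as an ASCII letter is an assumption.

```python
def could_start_tag(text, pos=0):
    """True if text[pos:] begins with '<' followed by '!', '?', '/', or a letter."""
    if pos + 1 >= len(text) or text[pos] != "<":
        return False
    nxt = text[pos + 1]
    # Assumption: "letter" means an ASCII letter.
    return nxt in "!?/" or (nxt.isascii() and nxt.isalpha())


for candidate in ["<div>", "</p>", "<!-- c -->", "<?xml?>", "< 5", "<3"]:
    print(candidate, could_start_tag(candidate))
```

Rejecting "< 5" and "<3" early is what reduces false positives when a bare "<" appears in ordinary prose.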
  • DocBook reader: Support title in “figure” element. Closes #650.

  • MediaWiki writer:

    • Remove newline after <br/> in translation of LineBreak. There’s no particular need for a newline (other than making the generated MediaWiki source look nice to a human), and in fact sometimes it is incorrect: in particular, inside an enumeration, list items cannot have embedded newline characters. (Brent Yorgey)
    • Use <code> not <tt> for Code.
  • Man writer: Escape - as \-. Unescaped -’s become hyphens, while \-’s are left as ascii minus signs. That is preferable for use with command-line options. See Thanks to Andrea Bolognani for bringing the issue to our attention.
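The escaping rule for the man writer described above amounts to a simple substitution. The Python sketch below shows only that substitution; the function name is hypothetical, and a real writer also has to escape other characters, which is omitted here.

```python
def escape_man_hyphens(text):
    """Write every '-' as '\\-' so it stays an ASCII minus sign in man output."""
    return text.replace("-", "\\-")


print(escape_man_hyphens("--self-contained"))
```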

  • RST writer:

    • Improved line block output. Use nonbreaking spaces for initial indent (otherwise lost in HTML and LaTeX). Allow multiple paragraphs in a single line block. Allow soft breaks with continuations in line blocks.
    • Properly handle images with no alt text. Closes #678.
    • Fixed bug with links with duplicate text. We now (a) use anonymous links for links with inline URLs, and (b) use an inline link instead of a reference link if the reference link would require a label that has already been used for a different link. Closes #511.
    • Fixed hyperlinked images. Closes #611. Use :target: field when you have a simple linked image.
    • Don’t add :align: center to figures.
  • Texinfo writer: Fixed internal cross-references. Now we insert anchors after each header, and use @ref instead of @uref for links. Commas are now escaped as @comma{} only when needed; previously all commas were escaped. (This change is needed, in part, because @ref commands must be followed by a real comma or period.) Also insert a blank line in front of @verbatim environments.

  • DocBook writer:

    • Made --id-prefix work in DocBook as well as HTML. Closes #607.
    • Don’t include empty captions in figures. Closes #581.
  • LaTeX writer:

    • Use \hspace* for nonbreaking space after line break, since ~ spaces after a line break are just ignored. Closes #687.
    • Don’t escape _ in URLs or hyperref identifiers.
    • Properly escape strings inside . Closes #576.
    • Use [fragile] only for slides containing code rendered using listings. Closes #649.
    • Escape | as \vert in LaTeX math. This avoids a clash with highlighting-kate’s macros, which redefine | as a short verbatim delimiter. Thanks to Björn Peemöller for raising this issue.
    • Use minipage rather than parbox for block containers in tables. This allows verbatim code to be included in grid tables. Closes #663.
    • Prevent paragraphs containing only linebreaks or spaces.
  • HTML writer:

    • Included highlighting-css for code spans, too. Previously it was only included if used in a code block. Closes #653.
    • Improved line breaks with <dd> tags. We now put a newline between </dd> and <dd> when there are multiple definitions.
    • Changed mathjax cdn url so it doesn’t use https. (This caused problems when used with --self-contained.) See #609.
  • EPUB writer:

    • --number-sections now works properly.
    • Don’t strip meta and link elements in epub metadata. Patch from aberrancy. Closes #589.
    • Fixed a couple validation bugs.
    • Use ch001, ch002, etc. for chapter filenames. This improves sorting of chapters in some readers, which apparently sort ch2 after ch10. Closes #610.
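The sorting problem behind the ch001-style chapter filenames above is easy to reproduce. The Python sketch below compares plain string sorting of unpadded and zero-padded names; the `.xhtml` extension and the helper name are assumptions for illustration, not details taken from the EPUB writer.

```python
def chapter_filename(n, width=3):
    """Build a zero-padded chapter filename such as ch001.xhtml."""
    return f"ch{n:0{width}d}.xhtml"


numbers = [1, 2, 10]

# Plain string sorting puts "ch10" before "ch2".
unpadded = sorted(f"ch{n}.xhtml" for n in numbers)

# Zero padding makes string order agree with numeric order.
padded = sorted(chapter_filename(n) for n in numbers)

print(unpadded)
print(padded)
```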
  • ODT writer: properly set title property (Arlo O’Keeffe).

  • Docx writer:

    • Fixed bug with nested lists. Previously a list like

      1. one
          - a
          - b
      2. two
    would come out with a bullet instead of “2.” Thanks to Russell Allen for reporting the bug.
    • Use w:cr in w:r instead of w:br for linebreaks. This seems to fix a problem viewing pandoc-generated docx files in LibreOffice.
    • Use integer ids for bookmarks. Closes #626.
    • Added nsid to abstractNum elements. This helps when merging word documents with numbered or bulleted lists. Closes #627.
    • Use separate footnotes.xml for notes. This seems to help LibreOffice convert the file, even though it was valid docx before. Closes #637.
    • Use rIdNN identifiers for r:embed in images.
    • Avoid reading image files again when we’ve already processed them.
    • Fixed typo in reference.docx that prevented image captions from working. Thanks to Huashan Chen.
  • Text.Pandoc.Parsing:

    • Fixed bug in withRaw, which didn’t correctly handle the case where nothing is parsed.
    • Made emailAddress parser more correct. Now it is based on RFC 822, though it still doesn’t implement quoted strings in email addresses.
    • Revised URI parser. It now allows many more schemes, allows uppercase URIs, and better handles trailing punctuation and trailing slashes in bare URIs. Added many tests.
    • Simplified and improved singleQuoteStart. This makes 's', 'l', etc. parse properly. Formerly we had some English-centric heuristics, but they are no longer needed. Closes #698.
  • Text.Pandoc.Pretty: Added wide punctuation range to charWidth. This fixes bug with Chinese commas in markdown and reST tables, and a bug that caused combining characters to be dropped.
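Why character width matters for table layout, as in the charWidth fix above, can be shown with a Python sketch. It uses the standard `unicodedata` tables as a stand-in: wide and fullwidth characters count as two columns and combining marks as zero. Pandoc's charWidth uses its own ranges, so this approximates the idea rather than porting it.

```python
import unicodedata


def char_width(ch):
    """Approximate display width of one character in terminal columns."""
    if unicodedata.combining(ch):
        return 0  # combining marks attach to the previous character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters take two columns
    return 1


def text_width(s):
    """Sum of the per-character widths."""
    return sum(char_width(ch) for ch in s)


print(text_width("a\uff0cb"))   # fullwidth comma between two ASCII letters
print(text_width("e\u0301"))    # 'e' followed by a combining acute accent
```

A table renderer that pads cells using `len()` instead of a width function like this would misalign columns containing such characters.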

  • Text.Pandoc.MIME: Added MIME types for .wof and .eot. Closes #640.

  • Text.Pandoc.Biblio:

    • Run mvPunc and deNote on metadata too. This fixed a bug with notes on titles using footnote styles.
    • Fixed bug in fetching CSL files from CSL data directory.
  • pandoc.hs: Give correct value to writerSourceDirectory when a URL is provided. It should be the URL up to the path.

  • Fixed/simplified diff output for tests.

Under the hood improvements

  • We no longer depend on utf8-string. Instead we use functions defined in Text.Pandoc.UTF8 that use Data.Text’s conversions.

  • Use safeRead instead of using reads directly (various modules).

  • “Implicit figures” (images alone in a paragraph) are now handled differently. The markdown reader gives their titles the prefix fig:; the writers look for this before treating the image as a figure. Though this is a bit of a hack, it has two advantages: (i) implicit figures can be limited to the markdown reader, and (ii) they can be deactivated by turning off the implicit_figures extension.
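The fig: title convention described above can be pictured as a small handshake between reader and writer. The Python below is a sketch of that convention only; the function names are hypothetical and the real readers and writers operate on pandoc's document types, not on bare strings.

```python
FIG_PREFIX = "fig:"


def reader_mark_figure(title):
    """Reader side: tag the title of an image that stands alone in a paragraph."""
    return FIG_PREFIX + title


def writer_split_title(title):
    """Writer side: return (is_figure, title with the marker removed)."""
    if title.startswith(FIG_PREFIX):
        return True, title[len(FIG_PREFIX):]
    return False, title


marked = reader_mark_figure("A cat")
print(writer_split_title(marked))
print(writer_split_title("A cat"))
```

Because the marker travels inside the title, a reader that never adds it (or has the extension turned off) produces ordinary inline images without the writers needing any extra option.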

  • catch from Control.Exception is now used instead of the old Prelude catch.

  • Text.Pandoc.Shared: Improved algorithm for normalizeSpaces and oneOfStrings (which is now non-backtracking).

  • Text.Pandoc.Biblio: Remove workaround for toCapital. Now citeproc-hs is fixed upstream, so this is no longer needed. Closes #531.

  • Textile reader: Improved speed of hyphenedWords. This speeds up the textile reader by about a factor of 4.

  • Use Text.Pandoc.Builder in RST reader, for more flexibility, better performance, and automatic normalization.

  • Major rewrite of markdown reader:

    • Use Text.Pandoc.Builder instead of lists. This also means that everything is normalized automatically.
    • Move to a one-pass parsing strategy, returning values in the reader monad, which are then run (at the end of parsing) against the final parser state.
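The one-pass strategy described above, where parsers return values that are later run against the final parser state, can be imitated in Python with closures. This is a loose analogy rather than the reader's design: the tiny reference-link syntax, the regular expressions, the HTML output, and the example URL are all assumptions made for the sketch.

```python
import re

REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")   # [label]: url
REF_USE = re.compile(r"\[([^\]]+)\]\[\]")             # [label][]


def parse(lines):
    """Single pass: collect definitions and defer rendering of other lines."""
    state = {"refs": {}}
    deferred = []
    for line in lines:
        m = REF_DEF.match(line)
        if m:
            state["refs"][m.group(1).lower()] = m.group(2)
            continue

        def render(final_state, line=line):
            # Runs only after parsing, so it sees every definition.
            def link(match):
                label = match.group(1)
                url = final_state["refs"].get(label.lower())
                if url is None:
                    return match.group(0)  # unknown label: leave text as-is
                return '<a href="{}">{}</a>'.format(url, label)
            return REF_USE.sub(link, line)

        deferred.append(render)
    # Run the deferred values against the final state.
    return [run(state) for run in deferred]


print(parse(["See [pandoc][].", "[pandoc]: http://example.com"]))
```

The reference is used on the first line but defined on the second; deferring the lookup lets a single pass resolve it without first scanning the whole document for definitions.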
  • In HTML writer, we now use toHtml instead of pre-escaping. We work around the problem that blaze-html unnecessarily escapes ' by pre-escaping just the ' characters, instead of the whole string. If blaze-html later stops escaping ' characters, we can simplify strToHtml to toHtml. Closes #629.

  • Moved code for embedding images in RTFs from pandoc.hs to the RTF writer (which now exports writeRTFWithEmbeddedImages).

  • Moved citation processing from pandoc.hs into the readers. This makes things more convenient for library users.

  • The man pages are now built by an executable make-pandoc-man-pages, which has its own stanza in the cabal file so that dependencies can be handled by Cabal. Special treatment in Setup.hs ensures that this executable never gets installed; it is only used to create the man pages.

  • The cabal file has been modified so that the pandoc library is used in building the pandoc executable. (This required moving pandoc.hs from src to the top-level directory.) This cuts compile time in half.

  • The executable and library flags have been removed.

  • -threaded has been removed from ghc-options.

  • Version bounds of dependencies have been raised, and the blaze_html_0_5 flag now defaults to True. Pandoc now compiles on GHC 7.6.

  • We now require base >= 4.2.

  • Integrated the benchmark program into cabal. One can now do:

    cabal configure --enable-benchmarks && cabal build
    cabal bench --benchmark-option='markdown' --benchmark-option='-s 20'

    The benchmark now uses README + testsuite, so benchmark results from older versions aren’t comparable.

  • Integrated test suite with cabal. To run tests, configure with --enable-tests, then cabal test. You can specify particular tests using --test-options='-t markdown'. No output is shown unless tests fail. The Haskell test modules have been moved from src/ to tests/.

  • Moved all data files and templates to the data/ subdirectory.

  • Added an embed_data_files cabal flag. This causes all data files to be embedded in the binary, so that the binary is self-sufficient and can be relocated anywhere, copied on a USB key, etc. The Windows installer now uses this. (Since we no longer have the option to build the executable without the library, this is the only way to get a relocatable binary on Windows.)

  • Removed pcre3.dll from windows package. It isn’t needed unless highlighting-kate is compiled with the pcre-light flag. By default, regex-pcre-builtin is used.

pandoc (2012-10-21)

  • Raised version bounds on network, base64-bytestring, json, and template-haskell.

pandoc (2012-10-20)

  • Removed tests flag and made test suite into a proper cabal test suite, which can now be enabled using --enable-tests and run with cabal test.

  • Moved man page creation out of Setup.hs and into an executable built by Cabal, but never installed. This allows dependencies to be specified, and solves a problem with, which could only be installed if data-default had already been installed.

  • Updated lhs-latex.tex test for latest highlighting-kate representation of backticks.

pandoc (2012-10-20)

  • Removed -threaded from default compile flags.

  • Modified modules to compile with GHC 7.6 and latest version of time package.

pandoc (2012-06-29)

  • Don’t encode/decode file paths if base >= 4.4. Prior to base 4.4, filepaths and command line arguments were treated as unencoded lists of bytes, not unicode strings, so we had to work around that by encoding and decoding them. This commit adds CPP checks for the base version that intelligently enable encoding/decoding when needed. Fixes a bug with multilingual filenames when pandoc was compiled with ghc 7.4 (#540).

  • Don’t generate an empty H1 after hrule slide breaks. We now use a slide-level header with contents [Str "\0"] to mark an hrule break. This avoids creation of an empty H1 in these contexts. Closes #484.

  • Docbook reader: Added support for “bold” emphasis. Thanks to mb21.

  • In, ensure citeproc-hs is built with the embed_data_files flag.

  • MediaWiki writer: Avoid extra blank lines after sublists (Gavin Beatty).

  • ConTeXt writer: Don’t escape &, ^, <, >, _; simplified escapes for { and } to \{ and \} (Aditya Mahajan).

  • Fixed handling of absolute URLs in CSS imports with --self-contained. Closes #535.

  • Added webm to mime types. Closes #543.

  • Added some missing exports and tests to the cabal file (Alexander V Vershilov).

  • Compile with -rtsopts and -threaded by default.

pandoc (2012-06-08)

  • Markdown reader: Added cf. and cp. to list of likely abbreviations.

  • LaTeX template: Added linkcolor, urlcolor and links-as-notes variables. Make TOC links black.

  • LaTeX template improvements.

    • Don’t print date unless one is given explicitly in the document.
    • Simplified templates.
    • Use fontenc [T1] by default, and lmodern.
    • Use microtype if available.
  • Biblio:

    • Add comma to beginning of bare suffix, e.g. @item1 [50]. Motivation: @item1 [50] should be as close as possible to [@item1, 50].
    • Added workaround for a bug in citeproc-hs 0.3.4 that causes footnotes beginning with a citation to be empty. Closes #531.
  • Fixed documentation on mixed lists. Closes #533.

pandoc 1.9.4 (2012-06-03)

  • Simplified Text.Pandoc.Biblio and fixed bugs with citations inside footnotes and captions. We now handle note citations by inserting footnotes during initial citation processing, and doing a separate pass later to remove notes inside notes.

  • Added ‘zenburn’ highlight style from highlighting-kate.

  • Added Slideous writer. Slideous is an HTML + javascript slide show format, similar to Slidy, but works with IE 7. (Jonas Smedegaard)

  • LaTeX writer:

    • Ensure we don’t have extra blank lines at ends of cells. This can cause LaTeX errors, as they are interpreted as new paragraphs.
    • More consistent interblock spacing.
    • Require highlighting-kate >= 0.5.1, for proper highlighted inline code in LaTeX. Closes #527.
    • Ensure that a Verbatim at the end of a footnote is followed by a newline. (Fixes a regression in the previous version.)
    • In default template, use black for internal links and TOC. Added commented-out code to use footnotes for links, as would be suitable in print output.
  • Beamer writer: When --incremental is used, lists inside a block quote should appear all at once. (This makes Beamer output consistent with the HTML slide show formats.)

  • ConTeXt writer:

    • Escape % as \letterpercent{} not \letterpercent, to avoid gobbling spaces after the % sign.
    • Ensure space after \stopformula.
  • Markdown writer:

    • Use : form instead of ~ in definition lists, for better compatibility with other markdown implementations.
    • Don’t wrap the term, because it breaks definition lists.
    • Use a nonzero space to prevent false recognition of list marker in ordered lists. Closes #516.
  • Org writer: Add space before language name. Closes #523.

  • Docx writer: Simplified bullet characters so they work properly with Word 2007. Closes #520.

  • LaTeX reader: Support \centerline.

  • RST reader: handle figures. Closes #522.

  • Textile reader: fix for <notextile> and ==. Closes #517. (Paul Rivier)

pandoc 1.9.3 (2012-05-12)

  • Added docbook reader (with contributions from Mauro Bieg).

  • Fixed bug in fromEntities. The previous version would turn hi & low you know; into hi &.
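The kind of input that tripped up fromEntities above (a bare "&" followed much later by a ";") can be handled by only decoding well-formed references. The Python below is an illustrative sketch using the standard `html.entities` table; it is not a port of pandoc's function, and the name `from_entities` is hypothetical.

```python
import re
from html.entities import name2codepoint

# A reference is '&', then a name or a decimal/hex number, then ';' with no spaces.
ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def from_entities(text):
    """Decode well-formed character references; leave everything else alone."""
    def decode(match):
        body = match.group(1)
        if body.startswith(("#x", "#X")):
            return chr(int(body[2:], 16))
        if body.startswith("#"):
            return chr(int(body[1:]))
        if body in name2codepoint:
            return chr(name2codepoint[body])
        return match.group(0)  # unknown name: keep the original text
    return ENTITY.sub(decode, text)


print(from_entities("hi & low you know;"))
print(from_entities("&lt;b&gt; &amp; &#65;&#x42;"))
```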

  • HTML reader:

    • Don’t skip nonbreaking spaces. Previously a paragraph containing just &nbsp; would be rendered as an empty paragraph. Thanks to Paul Vorbach for pointing out the bug.
    • Support <col> and <caption> in tables. Closes #486.
  • Markdown reader:

    • Don’t recognize references inside delimited code blocks.
    • Allow list items to begin with lists.
  • LaTeX reader:

    • Handle \bgroup, \egroup, \begingroup, \endgroup.
    • Control sequences can’t be followed by a letter. This fixes a bug where \begingroup was parsed as \begin followed by group.
    • Parse ‘dimension’ arguments to unknown commands. e.g. \parindent0pt
    • Make \label and \ref sensitive to --parse-raw. If --parse-raw is selected, these will be parsed as raw latex inlines, rather than bracketed text.
    • Don’t crash on unknown block commands (like \vspace{10pt}) inside \author; just skip them. Closes #505.
  • Textile reader:

    • Implemented literal escapes with == and <notextile>. Closes #473.
    • Added support for LaTeX blocks and inlines (Paul Rivier).
    • Better conformance to RedCloth inline parsing (Paul Rivier).
    • Parse ‘+text+’ as emphasized (should be underlined, but this is better than leaving literal plus characters in the output).
  • Docx writer: Fixed multi-paragraph list items. Previously they each got a list marker. Closes #457.

  • LaTeX writer:

    • Added --no-tex-ligatures option to avoid replacing quotation marks and dashes with TeX ligatures.
    • Use fixltx2e package to provide ‘’.
    • Improve spacing around LaTeX block environments: quote, verbatim, itemize, description, enumerate. Closes #502.
    • Use blue instead of pink for URL links in latex/pdf output.
  • ConTeXt writer: Fixed escaping of %. In text, % needs to be escaped as \letterpercent, not \%. Inside URLs, % needs to be escaped as \%. Thanks to jmarca and adityam for the fix. Closes #492.

  • Texinfo writer: Escape special characters in node titles. This fixes a problem pointed out by Joost Kremers. Pandoc used to escape an ‘@’ in a chapter title, but not in the corresponding node title, leading to invalid texinfo.

  • Fixed document encoding in texinfo template. Resolves Debian Bug #667816.

  • Markdown writer:

    • Don’t force delimited code blocks to be flush left. Fixes bug with delimited code blocks inside lists etc.
    • Escape < and $.
  • LaTeX writer: Use \hyperref[ident]{text} for internal links. Previously we used \href{\#ident}{text}, which didn’t work on all systems. Thanks to Dirk Laurie.

  • RST writer: Don’t wrap link references. Closes #487.

  • Updated to use latest versions of blaze-html, mtl.

pandoc 1.9.2 (2012-04-05)

  • LaTeX reader:

    • Made lstlisting work as a proper verbatim environment.
    • Fixed bug parsing LaTeX tables with one column.
  • LaTeX writer:

    • Use {} around ctable caption, so that formatting can be used.
    • Don’t require eurosym package unless document has a €.
  • LaTeX template: Added variables for geometry, romanfont, sansfont, mathfont, mainfont so users can more easily customize fonts.

  • PDF writer:

    • Run latex engine at least two times, to ensure that PDFs will have hyperlinked bookmarks.
    • Added PDF metadata (title,author) in LaTeX standalone + PDF output.
  • Texinfo writer: retain directories in image paths. (Peter Wang)

  • RST writer: Better handling of inline formatting, in accord with docutils’ “inline markup recognition rules” (though we don’t implement the unicode rules fully). Now hi*there*hi gets rendered properly as hi\ *there*\ hi, and unnecessary \ are avoided around :math:, :sub:, :sup:.

  • RST reader:

    • Parse \ as null, not escaped space.
    • Allow :math:`...` even when not followed by blank or \. This does not implement the complex rule docutils follows, but it should be good enough for most purposes.
    • Add support for the rST default-role directive. (Greg Maslov)
  • Text.Pandoc.Parsing: Added stateRstDefaultRole field to ParserState. (Greg Maslov)

  • Markdown reader: Properly handle citations nested in other inline elements.

  • Markdown writer: don’t replace empty alt in image with “image”.

  • DZSlides: Updated template.html and styles in default template. Removed bizarre CSS for q in dzslides template.

  • Avoid repeated id attribute in section and header in HTML slides.

  • README improvements: new instructions on internal links, removed misleading note on reST math.

  • Build system:

    • Fixed Windows installer so that dzslides works.
    • Removed
    • Added .travis.yml for Travis continuous integration support.
    • Fixed upper bound for zlib (Sergei Trofimovich).
    • Fixed upper bound for test-framework.
    • Updated haddocks for haddock-2.10 (Sergei Trofimovich).

pandoc (2012-03-09)

  • Added beamer+lhs as output format.

  • Don’t escape < in <style> tags with --self-contained. This fixes a bug which prevented highlighting from working when using --self-contained.

  • PDF: run latex engine three times if --toc specified. This fixes page numbers in the table of contents.

  • Docx writer: Added TableNormal style to tables.

  • LaTeX math environment fixes. aligned is now used instead of the nonexistent aligned*, and multline instead of the nonexistent multiline.

  • LaTeX writer: Use \textasciitilde for literal ~.

  • HTML writer: Don’t escape contents of EQ tags with --gladtex. This fixes a regression from 1.8.

  • Use <q> tags for Quoted items for HTML5 output. The quote style can be changed by modifying the template or including a css file. A default quote style is included.

  • LaTeX reader: Fixed accents (\~{a}, \c{c}). Correctly handle ^{}. Support “minted” as a LaTeX verbatim block.

  • Updated LaTeX template for better language support. Use polyglossia instead of babel with xetex. Set lang as documentclass option. \setmainlanguage will use the last of a comma-separated list of languages. Thanks to François Gannaz.

  • Fixed default LaTeX template so \euro and € work. The eurosym package is needed if you are using pdflatex.

  • Fixed escaping of period in man writer (thanks to Michael Thompson).

  • Fixed list label positions in beamer.

  • Set mainlang variable in context writer. This parallels behavior of latex writer. mainlang is the last of a comma-separated list of languages in lang.

  • EPUB language metadata: convert e.g. en_US from locale to en-US.
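
The locale-to-language-tag conversion above can be sketched in Python (an illustration only, not pandoc's Haskell code). The function name `locale_to_lang_tag` is hypothetical, and dropping an encoding or modifier suffix such as `.UTF-8` is an assumption the changelog does not state.

```python
def locale_to_lang_tag(locale_name: str) -> str:
    """Convert a POSIX-style locale name (en_US) to a language tag (en-US)."""
    # Assumption: drop any encoding or modifier suffix, e.g. ".UTF-8" or "@euro".
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    # The only structural change is the separator: underscore becomes hyphen.
    return base.replace("_", "-")
```

A locale with no territory part, such as `fr`, passes through unchanged.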

  • Changed -V so that you can specify a key without a value. Such keys get the value true.

  • Fixed permissions on installed man pages - thanks Magnus Therning.

  • Windows installer: require XP or higher. The installer is now compiled on a Windows 7 machine, which fixes a problem using citation functions on Windows 7.

  • OSX package: Check for 64-bit Intel CPU before installing.

pandoc (2012-02-11)

  • Better handling of raw latex environments in markdown. Now


    turns into a raw latex block as expected.

  • Improvements to LaTeX reader:

    • Skip options after block commands.
    • Correctly handle {\\} in braced.
    • Added a needed ‘try’.
    • Citations: add , to suffix if it doesn’t start with space or punctuation. Otherwise we get no space between the year and the suffix in author-date styles.
  • Added two needed data files for S5. This fixes a problem with pandoc -t s5 --self-contained. Also removed slides.min.js, which was no longer being used.

  • Fixed some minor problems in reference.docx: name on “Date” style, xCs instead of xIs.

  • Fixed a problem creating docx files using a reference docx modified using Word. The problem seems to be that Word modifies _rels/.rels, changing the Type of the Relationship to docProps/core.xml. Pandoc now changes this back to the correct value if it has been altered, fixing the problem.

  • Fixed html5 template so it works properly with highlighting.

pandoc 1.9.1 (2012-02-09)

  • LaTeX reader:

    • Fixed regression in 1.9; properly handle escaped $ in latex math.
    • Put LaTeX verse environments in blockquotes.
  • Markdown reader:

    • Limit nesting of strong/emph. This avoids exponential lookahead in parasitic cases, like a**a*a**a*a**a*a**a*a**a*a**a*a**a*a**.
    • Improved attributes syntax (in code blocks/spans): (1) Attributes can contain line breaks. (2) Values in key-value attributes can be surrounded by either double or single quotes, or left unquoted if they contain no spaces.
  • Headers no longer wrap in markdown or RST writers.

  • Added stateMaxNestingLevel to ParserState. We set this to 6, so you can still have Emph inside Emph, just not indefinitely.

  • More efficient implementation of nowrap in Text.Pandoc.Pretty.

  • Text.Pandoc.PDF: Only run latex twice if \tableofcontents is present.

  • Require highlighting-kate >=, texmath >=

pandoc (2012-02-06)

  • Changed cabal file so that build-depends for the test program are not required unless the tests flag is used.

  • LaTeX writer: insert {} between adjacent hyphens so they don’t form ligatures (dashes) in code spans.

pandoc (2012-02-06)

  • Raised version bound on test-framework to avoid problems compiling tests on GHC 7.4.1.

  • LaTeX reader: Use raw LaTeX as fallback inline text for Cites, so citations don’t just disappear unless you process with citeproc. Ignore \bibliographystyle, \nocite.

  • Simplified tex2pdf; it will always run latex twice to resolve table of contents and hyperrefs.

pandoc (2012-02-06)

  • Require Cabal >= 1.10.

  • Tweaked cabal file to meet Cabal 1.10 requirements.

pandoc (2012-02-05)

  • Allow build with json 0.4 or 0.5. Otherwise we can’t build with ghc 6.12.

pandoc 1.9 (2012-02-05)

New features

  • Added a Microsoft Word docx writer. The writer includes support for highlighted code and for math (which is converted from TeX to OMML, Office’s native math markup language, using texmath’s new OMML module). A new option --reference-docx allows the user to customize the styles.

  • Added an asciidoc writer (

  • Better support for slide shows:

    • Added a dzslides writer. DZSlides is a lightweight HTML5/javascript slide show format due to Paul Rouget (

    • Added a LaTeX beamer writer. Beamer is a LaTeX package for creating slide presentations.

    • New, flexible rules for dividing documents into sections and slides (see “Structuring the slide show” in the User’s Guide). These are backward-compatible with the old rules, but they allow slide shows to be organized into sections and subsections containing multiple slides.

    • A new --slide-level option allows users to override defaults and select a slide level below the first header level with content.

  • A new --self-contained option produces HTML output that does not depend on an internet connection or the presence of any external files. Linked images, CSS, and javascript are downloaded (or fetched locally) and encoded in data: URIs. This is useful for making portable HTML slide shows. The --offline option has been deprecated and is now treated as a synonym of --self-contained.

  • Support for PDF output:

    • Removed the old markdown2pdf.
    • pandoc can now create PDFs (assuming you have latex and a set of appropriate packages installed): just specify an output file with the .pdf extension.
    • A new option --latex-engine allows you to specify pdflatex, xelatex, or lualatex as the processor.
  • Highlighting changes:

    • Syntax highlighting is now a standard feature; the highlighting flag is no longer needed when compiling.
    • A new --no-highlight option allows highlighting to be disabled.
    • Highlighting now works in docx, latex, and epub, as well as html, html5, dzslides, s5, and slidy.
    • A new --highlight-style option selects between various highlighting color themes.
  • Internal links to sections now work in ConTeXt and LaTeX as well as HTML.

  • LaTeX \include and \usepackage commands are now processed, provided the files are in the working directory.

  • EPUB improvements:

    • Internal and external links now work in EPUB.
    • Raw HTML is allowed.
    • New --epub-embed-font option.
    • Customizable templates for EPUB pages offer more control over formatting: epub-page.html, epub-coverimage.html, epub-titlepage.html.
  • --mathml now works with DocBook.

  • Added support for math in RST reader and writer. Inline math uses the :math:`...` construct. Display math uses

    .. math:: ...

    or if the math is multiline,

    .. math::

    These constructions are now supported now by

  • Github syntax for fenced code blocks is supported in pandoc’s markdown. You can now write

    x = 2

    instead of

    ~~~ {.ruby}
    x = 2
    ~~~
  • Easier scripting: a new toJsonFilter function makes it easier to write Haskell scripts to manipulate the Pandoc AST. See Scripting with pandoc.

Behavior changes

  • Fixed parsing of consecutive lists in markdown. Pandoc previously behaved like for consecutive lists of different styles. Thus, the following would be parsed as a single ordered list, rather than an ordered list followed by an unordered list:

    1. one
    2. two
    - one
    - two

    This change makes pandoc behave more sensibly, parsing this as two lists. Any change in list type (ordered/unordered) or in list number style will trigger a new list. Thus, the following will also be parsed as two lists:

    1. one
    2. two
    a. one
    b. two

    Since we regard this as a bug in, and not something anyone would ever rely on, we do not preserve the old behavior even when --strict is selected.

  • Dashes work differently with --smart: --- is always em-dash, and -- is always en-dash. Pandoc no longer tries to guess when - should be en-dash. Note: This may change how existing documents look when processed with pandoc. A new option, --old-dashes, is provided for legacy documents.
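
A minimal Python sketch of the new dash rule, for illustration only. The function name `smart_dashes` is hypothetical, and unlike pandoc this sketch works on a raw string and does not protect code spans or other verbatim text.

```python
def smart_dashes(text: str) -> str:
    """Map --- to an em-dash and -- to an en-dash; leave single hyphens alone."""
    # Replace the longer sequence first so "---" is not consumed as "--" plus "-".
    return text.replace("---", "\u2014").replace("--", "\u2013")
```

Because single hyphens are never touched, no guessing about en-dashes is involved.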

  • The markdown writer now uses setext headers for levels 1-2. The old behavior (ATX headers for all levels) can be restored using the new --atx-headers option.

  • Links are now allowed in markdown image captions. They are also allowed in links, but will appear there as regular text. So,

    [link with [link](/url)](/url)

    will turn into

    <p><a href="/url">link with link</a></p>
  • Improved handling of citations using citeproc-hs-0.3.4. Added --citation-abbreviations option.

  • Citation keys can no longer end with a punctuation character. This means that @item1. will be parsed as a citation with key ‘item1’, followed by a period, instead of a citation with key ‘item1.’, as was the case previously.
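
The effect of this rule can be illustrated with a small Python sketch. The regular expression is an assumption made for the example: the changelog does not say which internal punctuation characters a key may contain, and `citation_keys` is a hypothetical helper, not a pandoc function.

```python
import re

# Assumption: a key starts with a letter, digit, or underscore and may contain
# internal punctuation from this illustrative set, but must not end with it.
CITE_KEY = re.compile(
    r"@([A-Za-z0-9_](?:[A-Za-z0-9_:.#$%&+?<>~/-]*[A-Za-z0-9_])?)"
)

def citation_keys(text: str) -> list:
    """Return the citation keys found in text, without trailing punctuation."""
    return CITE_KEY.findall(text)
```

With this pattern, `@item1.` yields the key `item1`, and the period is left in the surrounding text.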

  • In HTML output, citations are now put in a span with class citation.

  • The markdown reader now recognizes DocBook block and inline tags. It was always possible to include raw DocBook tags in a markdown document, but now pandoc will be able to distinguish block from inline tags and behave accordingly. Thus, for example,


    will not be wrapped in <para> tags.

  • The LaTeX parser has been completely rewritten; it is now much more accurate, robust, and extensible. However, there are two important changes in how it treats unknown LaTeX. (1) Previously, unknown environments became BlockQuote elements; now, they are treated as “transparent”, so \begin{unknown}xyz\end{unknown} is the same as xyz. (2) Previously, arguments of unknown commands were passed through with their braces; now the braces are stripped off.

  • --smart is no longer selected automatically with man output.

  • The deprecated --xetex option has been removed.

  • The --html5/-5 option has been deprecated. Use -t html5 instead. html5 and html5+lhs are now separate output formats.

  • Single quotes are no longer escaped in HTML output. They do not need to be escaped outside of attributes.

  • Pandoc will no longer transform leading newlines in code blocks to <br/> tags.

  • The ODT writer now sizes images appropriately, using the image size and DPI information embedded in the image.

  • --standalone is once again implicit for a non-text output format (ODT, EPUB). You can again do pandoc test.txt -o test.odt and get a standalone ODT file.

  • The Docbook writer now uses <sect1>, <sect2>, etc. instead of <section>.

  • The HTML writer now uses <del> for strikeout.

  • In HTML output with --section-divs, the classes section and level[1,2,...6] are put on the div tags so they can be styled. In HTML 5 output with --section-divs, the classes level[1,2,...6] are put on section tags.

  • EPUB writer changes:

    • The lang variable now sets the language in the metadata (if it is not set, we default to the locale).
    • EPUB: UTF-8 is used rather than decimal entities.
  • Added titleslide class to title slide in S5 template.

  • In HTML, EPUB, and docx metadata, the date is normalized into YYYY-MM-DD format if possible. (This is required for validation.)
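
One way to picture "normalized if possible" is a try-each-format loop, sketched here in Python. This is not pandoc's implementation: the list of candidate formats is an assumption chosen for illustration, and month names are parsed according to the current locale (English in the default C locale).

```python
from datetime import datetime
from typing import Optional

# Assumption: these input formats are illustrative only; the changelog does
# not list which formats pandoc recognizes.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %B %Y", "%b %d, %Y"]

def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no candidate format matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next format
    return None
```

Returning `None` for unrecognized input corresponds to the "if possible" qualifier: a caller would then keep the original string.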

  • Attributes in highlighted code blocks are now preserved in HTML. The container element will have the classes, id, and key-value attributes you specified in the delimited code block. Previously these were stripped off.

  • The reference backlink in the HTML writer no longer has a special footnoteBacklink class.

  • The HTML template has been split into html and html5 templates.

  • Author and date are treated more consistently in HTML templates. Authors are now <h2>, date <h3>.

  • URLs are hyphenated in the ConTeXt writer (B. Scott Michel).

  • In Text.Pandoc.Builder, +++ has been replaced by <>.

Bug fixes

  • Better support for combining characters and East Asian wide characters in markdown and reST.

  • Better handling of single quotes with --smart. Previously D'oh l'*aide* would be parsed with left and right single quotes instead of apostrophes. This kind of error is now fixed.

  • Highlighting: Use reads instead of read for better error handling. Fixes crash on startNum="abc".

  • Added blank comment after directives in rst template.

  • Unescape entities in citation refId. The refIds coming from citeproc contain XML numeric entities, and these don’t match with the citation keys parsed by pandoc. Solution is to unescape them.

  • HTML reader: Fixed bug parsing tables with both thead and tbody.

  • Markdown reader:

    • Better handling of escapes in link URLs and titles.
    • Fixed backslash escapes in reference links.
    • Fixed bug in table/hrule parsing, by checking that the top line of a table is not followed by a blank line. This bug caused slowdowns on some files with hrules and tables, as pandoc tried to interpret the hrules as the tops of multiline tables.
    • Fixed bug in code block attribute parser. Previously the ID attribute got lost if it didn’t come first. Now attributes can come in any order.
  • RST reader: allow footnotes followed by newline without space characters.

  • LaTeX reader:

    • Ignore empty groups {}, { }.
    • Handle \@.
    • Don’t crash on commands like \itemsep.
    • Better handling of letter environments.
  • RST writer: Fixed bug involving empty table cells. isSimple was being calculated in a way that assumed there were no non-empty cells.

  • ConTeXt writer:

    • Made --toc work even without --number-sections.
    • Escape # in link URLs.
    • Use buffering for footnotes containing code blocks.
    • Changed ‘descr’ to ‘description’, fixed alignment.
  • LaTeX writer:

    • Escape euro character.
    • Don’t escape ~ inside \href{...}.
    • Escape # in href URLs.
    • Improved detection of book classes. We now check the documentclass variable, and if that is not set, we look through the template itself. Also, we have added the KOMA classes scrreprt and scrbook. You can now make a book using pandoc -V documentclass:book mybook.txt -o mybook.pdf
    • LHS files now set the “listings” variable, so that the definition of the code environment will be included in the template.
    • Links are colored blue by default (this can be changed by modifying hyperref settings in the template).
    • Added lang variable to LaTeX template.
  • HTML writer:

    • Fixed bug in HTML template with html5 and mathml.
    • Don’t use self-closing img, br, hr tags for HTML5.
    • Use <section> for footnotes if HTML5.
    • Update HTML templates to use Content-Style-Type meta tag.
    • Use separate variables for meta-date, meta-author. This makes footnotes work in author and date fields.
    • Use ‘vertical-align:middle’ in WebTeX math for better alignment.
  • S5/slidy writer: Make footnotes appear on separate slide at end.

  • MIME: Added ‘layout-cache’ to getMimeType. This ensures that the META-INF/manifest.xml for ODT files will have everything it needs, so that ODT files modified by LibreOffice can be used as --reference-odt.

  • Text.Pandoc.Templates: Return empty string for json template.

  • Text.Pandoc.Biblio:

    • Expand citations recursively inside nested inlines.
    • Treat \160 as space when parsing locator and suffix. This fixes a bug with “p. 33” when --smart is used. Previously the whole “p. 33” would be included in the suffix, with no locator.
    • Put whole author-in-text citation in a Cite. Previously just the date and other info went in the Cite.
    • Don’t add comma+space to prefix if it ends in punctuation.
  • Updated chicago-author-date.csl. The old version did not work properly for edited volumes with no author.

  • EPUB writer:

    • Add date to EPUB titlepage and metadata.
    • Added TOC identifier in EPUB page template.
    • Don’t generate superfluous file cover-image.jpg.

Under the hood improvements

  • Modified to use cabal-dev. Items are no longer installed as root. Man pages are zipped and given proper permissions.

  • Modified windows installer generator to use cabal-dev.

  • Setup: Making man pages now works with cabal-dev (at least on OSX). In Setup.hs we now invoke ‘runghc’ in a way that points it to the correct package databases, instead of always falling back to the default user package db.

  • Updated to work with GHC 7.4.1.

  • Removed dependency on old-time.

  • Removed dependency on dlist.

  • New slidy directory for “self-contained.”

  • TeXMath writer: Use unicode thin spaces for thin spaces.

  • Markdown citations: don’t strip off initial space in locator.

API changes

  • Removed Apostrophe, EmDash, EnDash, and Ellipses from the native Inline type in pandoc-types. Now we use Str elements with unicode.

  • Improvements to Text.Pandoc.Builder:

    • Inlines and Blocks are now newtypes (not synonyms for sequences).
    • Instances are defined for IsString, Show, Read, Monoid, and a new Listable class, which allows these to be manipulated to some extent like lists. Monoid append includes automatic normalization.
    • +++ has been replaced by <> (mappend).
  • Use blaze-html instead of xhtml for HTML generation. This changes the type of writeHtml.

  • Text.Pandoc.Shared:

    • Added warn and err.
    • Removed unescapeURI, modified escapeURI. (See under [behavior changes], above.)
  • Changes in URI escaping: Previously the readers escaped URIs by converting unicode characters to octets and then percent encoding. Now unicode characters are left as they are, and escapeURI only percent-encodes space characters. This gives more readable URIs, and works well with modern user agents. URIs are no longer unescaped at all on conversion to markdown, asciidoc, rst, org.
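
The new escaping policy can be sketched in Python as follows. This is an illustration, not the Haskell `escapeURI` itself; the name `escape_uri` is hypothetical, and treating every whitespace character (not only U+0020) as a "space character" is an assumption.

```python
from urllib.parse import quote

def escape_uri(uri: str) -> str:
    """Percent-encode whitespace only; leave every other character as it is."""
    # quote(ch, safe="") turns " " into "%20"; non-space characters, including
    # non-ASCII ones, are passed through untouched.
    return "".join(quote(ch, safe="") if ch.isspace() else ch for ch in uri)
```

Under this policy a URI containing non-ASCII letters stays readable, and only embedded spaces change.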

  • New module Text.Pandoc.SelfContained.

  • New module Text.Pandoc.Docx.

  • New module Text.Pandoc.PDF.

  • Added writerBeamer to WriterOptions.

  • Added normalizeDate to Text.Pandoc.Shared.

  • Added splitStringWithIndices in Text.Pandoc.Shared. This is like splitWithIndices, but it is sensitive to distinctions between wide, combining, and regular characters.

  • Text.Pandoc.Pretty:

    • Added chomp combinator.
    • Added beforeNonBreak combinator. This allows you to include something conditionally on it being before a nonblank. Used for RST inline math.
    • Added charWidth function. All characters marked W or F in the unicode spec EastAsianWidth.txt get width 2.
    • Added realLength, based on charWidth. realLength is now used in calculating offsets.
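
The width rule can be sketched in Python with the standard `unicodedata` module. This is only an illustration of the idea behind charWidth and realLength: the names `char_width` and `real_length` are hypothetical, and giving combining marks width 0 is an assumption not stated in the entries above.

```python
import unicodedata

def char_width(ch: str) -> int:
    """Approximate the number of terminal columns a character occupies."""
    # Assumption (not stated in the changelog entry): combining marks
    # attach to the previous character and occupy no column of their own.
    if unicodedata.combining(ch):
        return 0
    # Wide (W) and fullwidth (F) characters occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def real_length(s: str) -> int:
    """Sum the column widths of all characters in s."""
    return sum(char_width(ch) for ch in s)
```

Using a column count like this rather than the character count keeps offsets aligned when a line mixes wide and regular characters.
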
  • New module Text.Pandoc.Slides, for common functions for breaking a document into slides.

  • Removed Text.Pandoc.S5, which is no longer needed.

  • Removed Text.Pandoc.CharacterReferences. Moved characterReference to Text.Pandoc.Parsing. decodeCharacterReferences is replaced by fromEntities in Text.Pandoc.XML.

  • Added Text.Pandoc.ImageSize. This is intended for use in docx and odt writers, so the size and dpi of images can be calculated.

  • Removed writerAscii in WriterOptions.

  • Added writerHighlight to WriterOptions.

  • Added DZSlides to HTMLSlideVariant.

  • writeEPUB has a new argument for font files to embed.

  • Added stateLastStrPos to ParserState. This lets us keep track of whether we’re parsing the position immediately after a regular (non-space, non-symbol) string, which is useful for distinguishing apostrophes from single quote starts.

  • Text.Pandoc.Parsing:

    • escaped now returns a Char.
    • Removed charsInBalanced', added a character parser as a parameter of charsInBalanced. This is needed for proper handling of escapes, etc.
    • Added withRaw.
  • Added toEntities to Text.Pandoc.XML.

  • Text.Pandoc.Readers.LaTeX:

    • Export handleIncludes.
    • Export rawLaTeXBlock instead of rawLaTeXEnvironment'.
  • Added ToJsonFilter class and toJsonFilter function to Text.Pandoc, deprecating the old jsonFilter function.

  • Text.Pandoc.Highlighting:

    • Removed highlightHtml, defaultHighlightingCss.
    • Export formatLaTeXInline, formatLaTeXBlock, and highlight, plus key functions from highlighting-kate.
    • Changed types of highlighting function. highlight returns a Maybe, not an Either.

pandoc (2011-08-01)

  • Adjusted Arbitrary instance to help avoid timeouts in tests.

  • Added Tests.Writers.Markdown to cabal file.

  • Relaxed version bounds on pandoc-types, test-framework.

pandoc 1.8.2 (2011-07-30)

  • Added script to produce OS X package.

  • Made templates directory a git submodule. This should make it easier for people to revise their custom templates when the default templates change.

  • Changed template naming scheme: FORMAT.template -> default.FORMAT. Note: If you have existing templates in ~/.pandoc/templates, you must rename them to conform to the new scheme!

  • Default template improvements:

    • HTML: Display author and date after title.
    • HTML: Made table of contents more customizable. The container for the TOC is now in the template, so users can insert a header or other styling. (Thanks to Bruce D’Arcus for the suggestion.)
    • HTML, Slidy, S5: Enclose scripts in CDATA tags.
    • Slidy, S5: Added s5-url and slidy-url variables, instead of hard-coding. If you want to put your slidy files in the slidy subdirectory, for example, you can do pandoc -t slidy -V slidy-url=slidy -s.
    • LaTeX: Use \and to separate authors in LaTeX documents (reader & writer). Closes #279.
    • LaTeX: Set \emergencystretch to prevent overfull lines.
    • LaTeX: Use different hyperref options for xetex, fixing problems with unicode bookmarks (thanks to CircleCode).
    • LaTeX: Removed ucs package, use utf8 rather than utf8x with inputenc. This covers fewer characters but is more robust with other packages, and ucs is unmaintained. Users who need better unicode support should use xelatex or lualatex.
  • If a template specified with --template is not found, look for it in datadir. Also, if no extension is provided, supply one based on the writer. So now you can put your special.latex template in ~/.pandoc/templates, and use it from any directory via pandoc -t latex --template special.
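
The lookup order described above can be sketched in Python. This is a rough illustration under stated assumptions: `find_template` is a hypothetical helper, the `templates` subdirectory of the data directory is inferred from the `~/.pandoc/templates` example, and checking the working directory first is an assumption.

```python
from pathlib import Path
from typing import Optional

def find_template(name: str, writer: str, datadir: Path, cwd: Path) -> Optional[Path]:
    """Resolve a template name to a file, falling back to the data directory."""
    # If no extension is given, supply one based on the writer name.
    filename = name if Path(name).suffix else f"{name}.{writer}"
    # Look relative to the working directory first, then in datadir/templates.
    for candidate in (cwd / filename, datadir / "templates" / filename):
        if candidate.is_file():
            return candidate
    return None
```

With a file `special.latex` in the data directory's `templates` folder, `find_template("special", "latex", datadir, cwd)` resolves to that file from any working directory.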

  • Added nonspaceChar to Text.Pandoc.Parsing.

  • Fixed smart quotes bug, now handling '...hi' properly.

  • RST reader:

    • Partial support for labeled footnotes.
    • Improved accuracy of simpleReferenceName parser.
  • HTML reader:

    • Substitute correct unicode characters for characters in the 128..159 range, which are often found even in HTML that purports to be UTF-8.
  • LaTeX reader: Handle \subtitle command (a subtitle is added to the title, after a colon and linebreak). Closes #280.

  • Leaner reference.odt.

  • Added unexported module Text.Pandoc.MIME for use in the ODT writer.

  • ODT writer: Construct manifest.xml based on archive contents. This fixes a bug in ODTs containing images. Recent versions of LibreOffice would reject these as corrupt, because manifest.xml did not contain a reference to the image files.

  • LaTeX writer:

    • Make verbatim environments flush to avoid spurious blank lines. Closes #277.
    • Use \texttt and escapes instead of \verb!...!, which is too fragile (doesn’t work in command arguments).
    • Use \enquote{} for quotes if the template includes the csquotes package. This provides better support for local quoting styles. (Thanks to Andreas Wagner for the idea.)
  • ConTeXt writer: Make \starttyping/\stoptyping flush with margin, preventing spurious blank lines.

  • Slidy writer:

    • Use non-minimized version of slidy.css with --offline option, so users can more easily edit it.
    • Also fixed a bug in the CSS that prevented proper centering of title (now reported and fixed upstream).
  • S5 writer:

    • Replaced s5/default/slides.js.{comment,packed} with new compressed s5/default/slides.min.js.
    • Use data: protocol to embed S5 CSS in <link> tags, when --offline is specified. Using inline CSS didn’t work with Chrome or Safari. This fixes offline S5 on those browsers.
  • HTML writer: Removed English title on footnote backlinks. This is incongruous in non-English documents.

  • Docbook writer:

    • Use CALS tables. (Some older docbook software does not work well with XHTML tables.) Closes #77.
    • Use programlisting tags (instead of screen) for code blocks.
  • markdown2pdf:

    • Calls latex with -halt-on-error -interaction nonstopmode instead of -interaction=batchmode, which essentially just ignored errors, leading to bad results. Better to know when something is wrong.
    • Fixed issues with non-UTF-8 output of pdflatex.
    • Better error reporting.
  • --mathjax now takes an optional URL argument. If it is not provided, pandoc links directly to the (secure) mathjax CDN, as now recommended (thanks to dsanson).

  • Deprecated --xetex option in pandoc. It is no longer needed, since the LaTeX writer now produces a file that can be processed by latex, pdflatex, lualatex, or xelatex.

  • Introduced --luatex option to markdown2pdf. This causes lualatex to be used to create the PDF.

pandoc (2011-07-16)

  • Added --epub-cover-image option.

  • Documented --biblatex and --natbib options.

  • Allow --section-divs with slidy output. Resolves Issue #296.

  • Disallow notes within notes in reST and markdown. These previously caused infinite looping and stack overflows. For example:

    [^1]: See [^1]

    Note references are allowed in reST notes, so this isn’t a full implementation of reST. That can come later. For now we need to prevent the stack overflows. Partially resolves Issue #297.

  • EPUB writer: Allow non-plain math methods.

  • Forbid ()s in citation item keys. Resolves Issue #304: problems with (@item1; @item2) because the final paren was being parsed as part of the item key.

  • Changed URI parser so it doesn’t include trailing punctuation. So, in RST, should be parsed as a link followed by a period. The parser is smart enough to recognize balanced parentheses, as often occur in wikipedia links:
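
The trailing-punctuation rule can be sketched in Python. This is a hedged illustration, not pandoc's parser: `split_trailing_punctuation` is a hypothetical helper, the punctuation set is an assumption, and the `example.com` addresses in the usage note are placeholders.

```python
def split_trailing_punctuation(candidate: str) -> tuple:
    """Split a matched URI candidate into (uri, trailing_text)."""
    end = len(candidate)
    while end > 0:
        ch = candidate[end - 1]
        if ch in ".,;:!?":
            # Sentence punctuation at the end is not part of the URI.
            end -= 1
        elif ch == ")" and candidate[:end].count("(") < candidate[:end].count(")"):
            # Unbalanced closing paren: it belongs to the surrounding text.
            end -= 1
        else:
            break
    return candidate[:end], candidate[end:]
```

For example, `http://example.com/page.` splits into the link and a final period, while `http://example.com/wiki/Foo_(bar)` is kept whole because its parentheses balance.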

  • Markdown+lhs reader: Require space after inverse bird tracks, so that HTML tags can be used freely at the left margin of a markdown+lhs document. Thanks to Conal Elliott for the suggestion.

  • Markdown reader: Fixed bug in footnote order (reported by CircleCode).

  • RST reader:
    • Fixed bug in field lists with multi-line items at the end of the list.
    • Added parentheses to RST specialChars, so ( will be parsed as a link in parens. Resolves Issue #291.
    • Allow | followed by newline in RST line block.
  • LaTeX reader:
    • Support \dots.
    • Gobble option & space after linebreak \\[10pt].
  • Textile reader:
    • Make it possible to have colons after links. (qerub)
    • Make it possible to have colons after links. (Christoffer Sawicki)
  • HTML reader:
    • Skip spaces after <b>, <emph>, etc.
    • Handle tbody, thead in simple tables. Closes #274.
    • Implicit Paras instead of Plains in some contexts.
  • OpenDocument writer: Use special First paragraph style for first paragraph after most non-paragraph blocks. This allows users to specify e.g. that only paragraphs after the first paragraph of a block are to be indented. Thanks to Andrea Rossato for the patch. Closes #20.

  • LaTeX writer: use deVerb on table and picture captions. Otherwise LaTeX complains about \verb inside command argument. Thanks to bbanier for reporting the bug.

  • Markdown writer: Insert HTML comment between list and indented code block. This prevents the code block from being interpreted as part of the list.

  • EPUB writer: Add a meta element to specify the cover. Some EPUB e-readers, such as the Nook, require a meta element inside the OPF metadata block to ensure the cover image is properly displayed. (Kelsey Hightower)

  • HTML writer: Use embed tag for images with non-image extensions (e.g. PDFs). Closes #264.

  • LaTeX writer: Improved tables.

    • More space between lines, top-align cells.
    • Use ctable package, which allows footnotes and provides additional options.
    • Made cell alignments work in multiline tables.
    • Closes #271, #272.
  • Un-URI-escape image filenames in LaTeX, ConTeXt, RTF, Texinfo. Also do this when copying image files into EPUBs and ODTs. Closes #263.

  • Changed to github issue tracker.

  • Added failing emph/strong markdown test case due to Perry Wagle.

  • Slidy improvements:
    • Updated to use Slidy2.
    • Fixed bug, unclosed div tag.
    • Added duration variable in template. Setting this activates the timer.
    • Use ‘titlepage’ instead of ‘cover’ for title div.

pandoc (2011-02-13)

  • markdown2pdf: Removed some debugging lines accidentally included in the 1.8.1 release. With those lines, the temp directory is created in the working directory, and it is not deleted. This fix restores the original behavior.

pandoc 1.8.1 (2011-02-13)

  • Added --ascii option. Currently supported only in the HTML writer, where it causes numerical entities to be used instead of UTF-8.

  • EPUB writer: --toc now works to provide a table of contents at the beginning of each chapter.

  • LaTeX writer: Change figure defaults to htbp. This prevents “too many unprocessed floats.” Resolves Issue #285.

  • Text.Pandoc.UTF8: Encode filenames even when using recent base.

  • markdown2pdf: Fixed filename encoding issues. With help from Paulo Tanimoto. Resolves Issue #286.

  • HTML writer: Put line breaks in section divs.

  • Text.Pandoc.Shared: Make writerSectionDivs default to False.

pandoc (2011-02-05)

  • Fixed Source-repository stanza in cabal file.

pandoc (2011-02-05)

  • HTML writer:

    • Stringify alt text instead of converting to HTML.
    • Break lines after block elements, not inside tags. HTML output now closely resembles that of tidy. Resolves Issue #134.
  • Markdown reader: Fixed bug in footnote block parser (pointed out by Jesse Rosenthal). The problem arose when the blank line at the end of a footnote block contained indenting spaces.

  • Shared: Improved ‘normalize’ function so it normalizes Spaces too. In normal form, Space elements only occur to separate two non-Space elements. So, we never have [Space], or [, …, Space].

  • Tests:

    • Improved Arbitrary instance.
    • Added timeout for test instances.
  • README:

    • Added section on four-space rule for lists. Resolves Issue #283.
    • Clarified optional arguments on math options.
  • markdown2pdf: Fixed bug with output file extensions. Previously markdown2pdf test.txt -o test.en.pdf would produce test.pdf, not test.en.pdf. Thanks to Paulo Tanimoto for the fix.
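
The Space normalization described in the Shared entry above can be sketched as follows. This is a minimal illustration on a toy Inline type, not pandoc's actual normalize function; the names Inline and normalizeSpaces are assumptions made for the example.

```haskell
-- Toy inline type for illustration only; pandoc's real Inline type
-- has many more constructors.
data Inline = Str String | Space
  deriving (Eq, Show)

-- Hypothetical helper: drop leading and trailing Spaces and collapse
-- runs of Spaces, so that a Space only ever separates two non-Space
-- elements.
normalizeSpaces :: [Inline] -> [Inline]
normalizeSpaces = go . dropWhile (== Space)
  where
    go [] = []
    go (Space : rest) =
      case dropWhile (== Space) rest of
        []    -> []                -- trailing Spaces vanish
        rest' -> Space : go rest'  -- keep exactly one separator
    go (x : rest) = x : go rest

main :: IO ()
main = do
  print (normalizeSpaces [Space])
  print (normalizeSpaces [Space, Str "a", Space, Space, Str "b", Space])
```

In this sketch a list consisting only of Space normalizes to the empty list, and the second example keeps a single Space between the two Str elements.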

pandoc (2011-01-31)

  • Revised Interact.hs so that it works with the CPP macros in the UTF8 module.

  • Revised Setup.hs so that we don’t call MakeManPage.hs unless the man pages are out of date.

pandoc 1.8 (2011-01-30)

New features

  • Support for citations using Andrea Rossato’s citeproc-hs 0.3. You can now write, for example,

    Water is wet [see @doe99, pp. 33-35; also @smith04, ch. 1].

    and, when you process your document using pandoc, specifying a citation style using --csl and a bibliography using --bibliography, the citation will be replaced by an appropriately formatted citation, and a list of works cited will be added to the end of the document.

    This means that you can switch effortlessly between different citation and bibliography styles, including footnote, numerical, and author-date formats. The bibliography can be in any of the following formats: MODS, BibTeX, BibLaTeX, RIS, EndNote, EndNote XML, ISI, MEDLINE, Copac, or JSON. See the README for further details.

    Citations are supported in the markdown reader, using a special syntax, and in the LaTeX reader, using natbib or biblatex syntax. (Thanks to Nathan Gass for the natbib and biblatex support.)

  • New textile reader and writer. Thanks to Paul Rivier for contributing the textile reader, an almost complete implementation of the textile syntax used by the ruby RedCloth library. Resolves Issue #51.

  • New org writer, for Emacs Org-mode, contributed by Puneeth Chaganti.

  • New json reader and writer, for reading and writing a JSON representation of the native Pandoc AST. These are much faster than the native reader and writer, and should be used for serializing Pandoc to text. To convert between the JSON representation and native Pandoc, use encodeJSON and decodeJSON from Text.JSON.Generic.

  • A new jsonFilter function in Text.Pandoc makes it easy to write scripts that transform a JSON-encoded pandoc document. For example:

    -- removelinks.hs - removes links from document
    import Text.Pandoc
    main = interact $ jsonFilter $ bottomUp removeLink
             where removeLink (Link xs _) = Emph xs
                   removeLink x = x

    To use this to remove links while translating markdown to LaTeX:

    pandoc -t json | runghc removelinks.hs | pandoc -f json -t latex
  • Attributes are now allowed in inline Code elements, for example:

    In this code, `ulist ! [theclass "special"] << elts`{.haskell} is...

    The attribute syntax is the same as for delimited code blocks. The Code inline element now takes an extra argument for attributes, just like CodeBlock. Inline code will be highlighted in HTML output if pandoc is compiled with highlighting support. Resolves Issue #119.

  • New RawBlock and RawInline elements (replacing RawHtml, HtmlInline, and TeX) provide lots of flexibility in writing scripts to transform Pandoc documents. Scripts can now change how each element is rendered in each output format.

  • You can now define LaTeX macros in markdown documents, and pandoc will apply them to TeX math. For example,

    \newcommand{\plus}[2]{#1 + #2}
    $\plus{3}{4}$

    yields 3+4. Since the macros are applied in the reader, they will work in every output format, not just LaTeX.

  • LaTeX macros can also be used in LaTeX documents (both in math and in non-math contexts).

  • A new --mathjax option has been added for displaying math in HTML using MathJax. Resolves issue #259.

  • Footnotes are now supported in the RST reader. (Note, however, that unlike docutils, pandoc ignores the numeral or symbol used in the note; footnotes are put in an auto-numbered ordered list.) Resolves Issue #258.

  • A new --normalize option causes pandoc to normalize the AST before writing the document. This means that, for example, *hi**there* will be rendered as <em>hithere</em> instead of <em>hi</em><em>there</em>. This is not the default, because there is a significant performance penalty.

  • A new --chapters command-line option causes headers in DocBook, LaTeX, and ConTeXt to start with “chapter” (level one). Resolves Issue #265.

  • In DocBook output, <chapter> is now used for top-level headers if the template contains <book>. Resolves Issue #265.

  • A new --listings option in pandoc and markdown2pdf causes the LaTeX writer to use the listings package for code blocks. (Thanks to Josef Svenningsson for the pandoc patch, and Etienne Millon for the markdown2pdf patch.)

  • markdown2pdf now supports --data-dir.

  • URLs in autolinks now have class “url” so they can be styled.

  • Improved prettyprinting in most formats. Lines will be wrapped more evenly and duplicate blank lines avoided.

  • New --columns command-line option sets the column width for line wrapping and relative width calculations for tables.

  • Made --smart work in HTML, RST, and Textile readers, as well as markdown.

  • Added --html5 option for HTML5 output.

  • Added support for listings package in LaTeX reader (Puneeth Chaganti).

  • Added support for simple tables in the LaTeX reader.

  • Added support for simple tables in the HTML reader.

  • Significant performance improvements in many readers and writers.
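
As a rough illustration of what the --normalize entry above describes, the following sketch merges adjacent Emph elements on a toy AST. It is an assumption-laden example, not pandoc's implementation: the Inline type is reduced to two constructors and mergeAdjacent is a hypothetical name.

```haskell
-- Toy inline type for illustration only.
data Inline = Str String | Emph [Inline]
  deriving (Eq, Show)

-- Hypothetical normalizer: merge neighbouring Emph elements and
-- neighbouring Str elements, recursing into Emph contents.
mergeAdjacent :: [Inline] -> [Inline]
mergeAdjacent (Emph xs : Emph ys : rest) = mergeAdjacent (Emph (xs ++ ys) : rest)
mergeAdjacent (Str a : Str b : rest)     = mergeAdjacent (Str (a ++ b) : rest)
mergeAdjacent (Emph xs : rest)           = Emph (mergeAdjacent xs) : mergeAdjacent rest
mergeAdjacent (x : rest)                 = x : mergeAdjacent rest
mergeAdjacent []                         = []

main :: IO ()
main =
  -- Corresponds to *hi**there*: two adjacent emphasized spans.
  print (mergeAdjacent [Emph [Str "hi"], Emph [Str "there"]])
```

Here the two emphasized spans collapse into one Emph containing a single Str, which is the shape that would render as <em>hithere</em>.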

API and program changes

  • Moved Text.Pandoc.Definition from the pandoc package to a new auxiliary package, pandoc-types. This will make it possible for other programs to supply output in Pandoc format, without depending on the whole pandoc package.

  • Added Attr field to Code.

  • Removed RawHtml, HtmlInline, and TeX elements; added generic RawBlock and RawInline.

  • Moved generic functions to Text.Pandoc.Generic. Deprecated processWith, replacing it with two functions, bottomUp and topDown. Removed previously deprecated functions processPandoc and queryPandoc.

  • Added Text.Pandoc.Builder, for building Pandoc structures.

  • Text.Pandoc now exports the association lists readers and writers.

  • Added Text.Pandoc.Readers.Native, which exports readNative. readNative can now read full pandoc documents, block lists, blocks, inline lists, or inlines. It will interpret Str "hi" as if it were Pandoc (Meta [] [] []) [Plain [Str "hi"]]. This should make testing easier.

  • Removed deprecated -C/--custom-header option. Use --template instead.

  • --biblio-file has been replaced by --bibliography. --biblio-format has been removed; pandoc now guesses the format from the file extension (see README).

  • pandoc will treat an argument as a URI only if it has an http(s) scheme. Previously pandoc would treat some Windows pathnames beginning with C:/ as URIs.

  • The --sanitize-html option and the stateSanitize field in ParserState have been removed. Sanitization is better done in the resulting HTML using xss-sanitize, which is based on pandoc’s sanitization, but improved.

  • pandoc now adds a newline to the end of its output in fragment mode (= not --standalone).

  • Added support for lang in html tag in the HTML template, so you can do pandoc -s -V lang=es, for example.

  • highlightHtml in Text.Pandoc.Highlighting now takes a boolean argument that selects between “inline” and “block” HTML.

  • Text.Pandoc.Writers.RTF now exports rtfEmbedImage. Images are embedded in RTF output when possible (png, jpeg). Resolves Issue #275.

  • Added Text.Pandoc.Pretty. This is better suited for pandoc than the pretty package. Changed all writers that used Text.PrettyPrint.HughesPJ to use Text.Pandoc.Pretty instead.

  • Rewrote writeNative using the new prettyprinting module. It is now much faster. The output has been made more consistent and compressed. writeNative is also now sensitive to writerStandalone, and will simply print a block list if writerStandalone is False.

  • Removed Text.Pandoc.Blocks. Text.Pandoc.Pretty allows you to define blocks and concatenate them, so a separate module is no longer needed.

  • Text.Pandoc.Shared:

    • Added writerColumns, writerChapters, and writerHtml5 to WriterOptions.
    • Added normalize.
    • Removed unneeded prettyprinting functions: wrapped, wrapIfNeeded, wrappedTeX, wrapTeXIfNeeded, hang', BlockWrapper, wrappedBlocksToDoc.
    • Made splitBy take a test instead of an element.
    • Added findDataFile, refactored readDataFile.
    • Added stringify. Rewrote inlineListToIdentifier using stringify.
    • Fixed inlineListToIdentifier to treat ‘\160’ (nonbreaking space) as ‘ ’.
  • Text.Pandoc.Readers.HTML:

    • Removed rawHtmlBlock, anyHtmlBlockTag, anyHtmlInlineTag, anyHtmlTag, anyHtmlEndTag, htmlEndTag, extractTagType, htmlBlockElement, htmlComment
    • Added htmlTag, htmlInBalanced, isInlineTag, isBlockTag, isTextTag
  • Moved smartPunctuation from Text.Pandoc.Readers.Markdown to Text.Pandoc.Readers.Parsing, and parameterized it with an inline parser.

  • Ellipses are no longer allowed to contain spaces. Previously we allowed ‘. . .’, ‘. . .’, etc. This caused too many complications, and removed authors’ flexibility in combining ellipses with spaces and periods.

  • Allow linebreaks in URLs (treat as spaces). Also, a string of consecutive spaces or tabs is now parsed as a single space. If you have multiple spaces in your URL, use %20%20.

  • Text.Pandoc.Parsing:

    • Removed refsMatch.
    • Hid Key constructor.
    • Removed custom Ord and Eq instances for Key.
    • Added toKey and fromKey to convert between Key and [Inline].
    • Generalized type on readWith.
  • Small change in calculation of relative widths of table columns. If the size of the header > the specified column width, use the header size as 100% for purposes of calculating relative widths of columns.

  • Markdown writer now uses some pandoc-specific features when --strict is not specified: \ newline is used for a hard linebreak instead of two spaces then a newline. And delimited code blocks are used when there are attributes.

  • HTML writer: improved gladTeX output by setting ENV appropriately for display or inline math (Jonathan Daugherty).

  • LaTeX writer: Use \paragraph, \subparagraph for level 4,5 headers.

  • LaTeX reader:

    • \label{foo} and \ref{foo} now become {foo} instead of (foo).
    • \index{} commands are skipped.
  • Added fontsize variable to default LaTeX template. This makes it easy to set the font size using markdown2pdf: markdown2pdf -V fontsize=12pt input.txt.

  • Fixed problem with strikeout in LaTeX headers when using hyperref, by adding a command to the default LaTeX template that disables \sout inside pdf strings. Thanks to Joost Kremers for the fix.

  • The COLUMNS environment variable no longer has any effect.
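
The Text.Pandoc.Shared entry above says that splitBy now takes a test instead of an element. A minimal sketch of a predicate-based splitter is given below; the type signature and the treatment of consecutive separators are assumptions for illustration and may differ from pandoc's own definition.

```haskell
import Data.Char (isSpace)

-- Split a list wherever the test holds. Runs of consecutive
-- separators are treated as a single separator in this sketch.
splitBy :: (a -> Bool) -> [a] -> [[a]]
splitBy _ [] = []
splitBy isSep xs =
  let (chunk, rest) = break isSep xs
  in  chunk : splitBy isSep (dropWhile isSep rest)

main :: IO ()
main = do
  -- The old element-based behaviour is recovered with (== x).
  print (splitBy (== ',') "a,b,,c")
  -- A test allows splitting on a whole class of elements.
  print (splitBy isSpace "one two\tthree")
```

Taking a test rather than an element is the more general design: the element-based version is just the special case where the test is an equality check.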

Under-the-hood improvements

  • Pandoc now compiles with GHC 7. (This alone leads to a significant performance improvement, 15-20%.)

  • Completely rewrote HTML reader using tagsoup as a lexer. The new reader is faster and more accurate. Unlike the old reader, it does not get bogged down on some input (Issues #277, 255). And it handles namespaces in tags (Issue #274).

  • Replaced escapeStringAsXML with a faster version.

  • Rewrote spaceChar and some other parsers in Text.Pandoc.Parsing for a significant performance boost.

  • Improved performance of all readers by rewriting parsers.

  • Simplified Text.Pandoc.CharacterReferences by using entity lookup functions from TagSoup.

  • Text.Pandoc.UTF8 now uses the unicode-aware IO functions from System.IO if base >= 4.2. This gives support for Windows line endings on Windows.

  • Remove duplications in documentation by generating the pandoc man page from README, using MakeManPage.hs.

  • README now includes a full description of markdown syntax, including non-pandoc-specific parts. A new pandoc_markdown man page is extracted from this, so you can look up markdown syntax by doing man pandoc_markdown.

  • Completely revised test framework (with help from Nathan Gass). The new test framework is built when the tests Cabal flag is set. It includes the old integration tests, but also some new unit and quickcheck tests. Test output has been much improved, and you can now specify a glob pattern after cabal test to indicate which tests should be run; for example cabal test citations will run all the citation tests.

  • Added a shell script for filtering ANSI control sequences from test output: cabal test | ./ > test.log.

  • Added Interact.hs to make it easier to use ghci while developing. Interact.hs loads ghci from the src directory, specifying all the options needed to load pandoc modules (including specific package dependencies, which it gets by parsing dist/setup-config).

  • Added Benchmark.hs, testing all readers + writers using criterion.

  • Added a script to make it easier to collect and archive benchmark and lines-of-code stats.

  • Added upper bounds to all cabal dependencies.

  • Include man pages in extra-source-files. This allows users to install pandoc from the tarball without needing to build the man pages.

Bug fixes

  • Filenames are encoded as UTF8. Resolves Issue #252.

  • Handle curly quotes better in --smart mode. Previously, curly quotes were just parsed literally, leading to problems in some output formats. Now they are parsed as Quoted inlines, if --smart is specified. Resolves Issue #270.

  • Text.Pandoc.Parsing: Fixed bug in grid table parser. Spaces at end of line were not being stripped properly, resulting in unintended LineBreaks.

  • Markdown reader:

    • Allow HTML comments as inline elements in markdown. So, aaa <!-- comment --> bbb can be a single paragraph.
    • Fixed superscripts with links: ^[link](/foo)^ gets recognized as a superscripted link, not an inline note followed by garbage.
    • Fixed regression, making markdown reference keys case-insensitive again. Resolves Issue #272.
    • Properly handle abbreviations (like Mr.) at the end of a line.
    • Better handling of intraword underscores, avoiding exponential slowdowns in some cases. Resolves Issue #182.
    • Fixed bug in alignments in tables with blank rows in the header.
  • RST reader:

    • Field lists now allow spaces in field names, and block content in field values. (Thanks to Lachlan Musicman for pointing out the bug.)
    • Definition list items are now always Para instead of Plain, matching behavior of
    • In image blocks, the description is parsed properly and used for the alt attribute, not also the title.
    • Skip blank lines at beginning of file. Resolves Debian #611328.
  • LaTeX reader:

    • Improved parsing of preamble. Previously you’d get unexpected behavior on a document that contained \begin{document} in, say, a verbatim block.
    • Allow spaces between \begin or \end and {.
    • Support \L and \l.
    • Skip comments inside paragraphs.
  • LaTeX writer:

    • Escape strings in \href{..}.
    • In nonsimple tables, put cells in \parbox.
  • OpenDocument writer: don’t print raw TeX.

  • Markdown writer:

    • Fixed bug in Image. URI was getting unescaped twice!
    • Avoid printing extra blank lines at the end if there are no notes or references.
  • LaTeX and ConTeXt: Escape [ and ] as {[} and {]}. This avoids unwanted interpretation as an optional argument.

  • ConTeXt writer: Fixed problem with inline code. Previously } would be rendered \type{}}. Now we check the string for ‘}’ and ‘{’. If it contains neither, use \type{}; otherwise use \mono{} with an escaped version of the string.

  • : now allowed in HTML tags. Resolves Issue #274.

pandoc 1.6 (2010-07-24)

  • New EPUB and HTML Slidy writers. (Issue #122)

    • EPUB is a standard ebook format, used in Apple’s iBooks for the iPad and iPhone, Barnes and Noble’s nook reader, the Sony reader, and many other devices, and by online ebook readers like bookworm. (Amazon’s Kindle uses a different format, MobiPocket, but EPUB books can easily be converted to Kindle format.) Now you can write your book in markdown and produce an ebook with a single command! I’ve put up a short tutorial here.
    • Slidy, like S5, is a system for producing HTML+javascript slide shows.
  • All input is assumed to be UTF-8, no matter what the locale and ghc version, and all output is UTF-8. This reverts to pre-1.5 behavior. Also, a BOM, if present, is stripped from the input.

  • Markdown now supports grid tables, whose cells can contain arbitrary block elements. (Issue #43)

  • Sequentially numbered example lists in markdown with @ marker.

  • Markdown table captions can begin with a bare colon and no longer need to include the English word “table.” Also, a caption can now occur either before or after the table. (Issue #227)

  • New command-line options:

    • --epub-stylesheet allows you to specify a CSS file that will be used to style your ebook.
    • --epub-metadata allows you to specify metadata for the ebook.
    • --offline causes the generated HTML slideshow to include all needed scripts and stylesheets.
    • --webtex causes TeX math to be converted to images using the Google Charts API (unless a different URL is specified).
    • --section-divs causes div tags to be added around each section in an HTML document. (Issue #230, 239)
  • Default behavior of S5 writer in standalone mode has changed: previously, it would include all needed scripts and stylesheets in the generated HTML; now, only links are included unless the --offline option is used.

  • Default behavior of HTML writer has changed. Between 1.2 and 1.5, pandoc would enclose sections in div tags with identifiers on the div tags, so that the sections can be manipulated in javascript. This caused undesirable interactions with raw HTML div tags. So, starting with 1.6, the default is to put the identifiers directly on the header tags, and not to include the divs. The --section-divs option selects the 1.2-1.5 behavior.

  • API changes:

    • HTMLMathMethod: Added WebTeX, removed MimeTeX.
    • WriterOptions: Added writerUserDataDir, writerSourceDirectory, writerEPUBMetadata fields. Removed writerIncludeBefore, writerIncludeAfter.
    • Added headerShift to Text.Pandoc.Shared.
    • Moved parsing code and ParserState from Text.Pandoc.Shared to a new module, Text.Pandoc.Parsing.
    • Added stateHasChapters to ParserState.
    • Added HTMLSlideVariant.
    • Made KeyTable a map instead of an association list.
    • Added accessors for Meta fields (docTitle, docAuthors, docDate).
    • Pandoc, Meta, Inline, and Block have been given Ord instances.
    • Reference keys now have a type of their own (Key), with its own Ord instance for case-insensitive comparison.
    • Added Text.Pandoc.Writers.EPUB.
    • Added Text.Pandoc.UUID.
    • Removed Text.Pandoc.ODT, added Text.Pandoc.Writers.ODT. Removed saveOpenDocumentAsODT, added writeODT.
    • Added Text.Pandoc.Writers.Native and writeNative. Removed prettyPandoc.
    • Added Text.Pandoc.UTF8 for portable UTF8 string IO.
    • Removed Text.Pandoc.Writers.S5 and the writeS5 function. Moved s5Includes to a new module, Text.Pandoc.S5. To write S5, you now use writeHtml with writerSlideVariant set to S5Slides or SlidySlides.
  • Template changes. If you use custom templates, please update them, particularly if you use syntax highlighting with pandoc. The old HTML templates hardcoded highlighting CSS that will no longer work with the most recent version of highlighting-kate.

    • HTML template: avoid empty meta tag if no date.
    • HTML template: Use default highlighting CSS from highlighting-kate instead of hard-coding the CSS into the template.
    • HTML template: insert-before text goes before the title, and immediately after the <body> tag, as documented. (Issue #241)
    • Added slidy and s5 templates.
    • Added amssymb to preamble of latex template. (github Issue 1)
  • Removed excess newlines at the end of output. Note: because output will not contain an extra newline, you may need to make adjustments if you are inserting pandoc’s output into a template.

  • In S5 and slidy, horizontal rules now cause a new slide, so you are no longer limited to one slide per section.

  • Improved handling of code in man writer. Inline code is now monospace, not bold, and code blocks now use .nf (no fill) and .IP (indented para).

  • HTML reader parses <tt> as Code. (Issue #247)

  • html+lhs output now contains bird tracks, even when compiled without highlighting support. (Issue #242)

  • Colons are now no longer allowed in autogenerated XML/HTML identifiers, since they have a special meaning in XML.

  • Code improvements in ODT writer. Remote images are now replaced with their alt text rather than a broken link.

  • LaTeX reader improvements:

    • Made latex \section, \chapter parsers more forgiving of whitespace.
    • Parse \chapter{} in latex.
    • Changed rawLaTeXInline to accept \section, \begin, etc.
    • Use new rawLaTeXInline' in LaTeX reader, and export rawLaTeXInline for use in markdown reader.
    • Fixes bug wherein \section{foo} was not recognized as raw TeX in markdown document.
  • LaTeX writer: images are automatically shrunk if they would extend beyond the page margin.

  • Plain, markdown, RST writers now use unicode for smart punctuation.

  • Man writer converts math to unicode when possible, as in other writers.

  • markdown2pdf can now recognize citeproc options.

  • Command-line arguments are converted to UTF-8. (Issue #234)

  • Text.Pandoc.TeXMath has been rewritten to use texmath’s parser. This allows it to handle a wider range of formulas. Also, if a formula cannot be converted, it is left in raw TeX; formulas are no longer partially converted.

  • Unicode curly quotes are left alone when parsing smart quotes. (Issue #143)

  • Cabal file changes:

    • Removed parsec < 3 restriction.
    • Added ‘threaded’ flag for architectures where GHC lacks a threaded runtime.
    • Use ‘threaded’ only for markdown2pdf; it is not needed for pandoc.
    • Require highlighting-kate 0.2.7.
  • Use explicit imports from Data.Generics. Otherwise we have a conflict with the ‘empty’ symbol, introduced in syb >= 0.2. (Issue #237)

  • New data files: slidy/slidy.min.js, slidy/slidy.min.css, epub.css.
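
The API changes above mention that reference keys now have their own type (Key) with an Ord instance for case-insensitive comparison, and that KeyTable is a map. The sketch below shows how those two pieces can fit together. It is a simplified assumption: this Key wraps a String, whereas pandoc's Key is built from inline elements, and the table contents are made up for the example.

```haskell
import Data.Char (toLower)
import qualified Data.Map as M

-- Simplified stand-in for a reference key.
newtype Key = Key String
  deriving Show

-- Equality and ordering both ignore case, so they stay consistent
-- with each other, which Data.Map relies on.
instance Eq Key where
  Key a == Key b = map toLower a == map toLower b

instance Ord Key where
  compare (Key a) (Key b) = compare (map toLower a) (map toLower b)

main :: IO ()
main = do
  -- Hypothetical key table mapping a reference key to a URL.
  let table = M.fromList [(Key "Pandoc", "http://example.com/")]
  -- Lookup succeeds although the case differs from the stored key.
  print (M.lookup (Key "PANDOC") table)
```

With the comparison built into the key type, a map lookup is case-insensitive without any normalization at the call site.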

pandoc (2010-03-29)

  • Fixed header identifiers (uniqueIdent in Shared) so they work as advertized in README and are guaranteed to be valid XHTML names. Thanks to Xyne for reporting the bug.

pandoc 1.5.1 (2010-03-23)

  • Fixed treatment of unicode characters in URIs.
  • Revised Setup.hs so it works with debian’s build process.
  • Fixed bug in OpenDocument writer that led to invalid XML for some input.

pandoc (2010-03-21)

  • HTML writer: Fixed error in math writer (with MathML option) that caused an infinite loop for unparsable MathML.

pandoc 1.5 (2010-03-20)

  • Moved repository to github.
  • New --mathml option, for display of TeX math as MathML.
  • New --data-dir option, allowing users to specify a data directory other than ~/.pandoc. Files placed in this directory will be used instead of system defaults.
  • New --base-header-level option. For example, --base-header-level=2 changes level 1 headers to level 2, level 2 to level 3, etc.
  • New ‘plain’ output format: plain text without pictures, hyperlinks, inline formatting, or anything else that looks even vaguely markupish.
  • Titles and authors in title blocks can now span multiple lines, as long as the continuation lines begin with a space character.
  • When given an absolute URI as a parameter, pandoc will fetch the content via HTTP.
  • The HTML reader has been made much more forgiving. It no longer requires well-formed xhtml as input.
  • html2markdown has been removed; it is no longer necessary, given the last two changes. pandoc can be used by itself to convert web pages to markdown or other formats.
  • hsmarkdown has also been removed. Use pandoc --strict instead. Or symlink pandoc’s executable to hsmarkdown; pandoc will then behave like hsmarkdown used to.
  • An image in a paragraph by itself is now rendered as a figure in most writers, with the alt text as the caption.
  • Incomplete support for reST tables (simple and grid). Thanks to Eric Kow. Colspans and rowspans not yet supported.
  • In mediawiki, links with relative URLs are now formatted as wikilinks. Also, headers have been promoted: = head = is now level 1 instead of level 2.
  • The markdown reader now handles “inverse bird tracks” when parsing literate haskell. These are used for haskell example code that is not part of the literate program.
  • The -B and -A options now imply -s and no longer work in fragment mode.
  • Headerless tables are now printed properly in all writers. In addition, tbody, thead, and cols are used in HTML and DocBook tables.
  • Improved build system; removed obsolete Makefile.
  • In LaTeX writer, \chapter is now used instead of \section when the documentclass is book, report, or memoir.
  • Many small bug fixes. See changelog for details.
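
The effect of the --base-header-level option listed above can be sketched on a toy document type. This is only an illustration under stated assumptions: Block and shiftHeaders are hypothetical names, and pandoc's real Block type and header handling are richer than this.

```haskell
-- Toy block type for illustration only.
data Block = Header Int String | Para String
  deriving (Eq, Show)

-- Raise every header level by n; other blocks are left untouched.
-- A base header level of 2 corresponds to a shift of 1.
shiftHeaders :: Int -> [Block] -> [Block]
shiftHeaders n = map shift
  where
    shift (Header level txt) = Header (level + n) txt
    shift other              = other

main :: IO ()
main =
  mapM_ print (shiftHeaders 1 [Header 1 "Intro", Para "Text.", Header 2 "Details"])
```

In this sketch the level 1 header becomes level 2 and the level 2 header becomes level 3, matching the example given for --base-header-level=2.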

pandoc 1.4 (2010-01-02)

  • New template system replaces old headers, giving users much more control over pandoc’s output in --standalone mode. Added --template and --variable options. The --print-default-header option is now --print-default-template. See README under “Templates” for details.
  • The old --custom-header option should still work, but it has been deprecated.
  • New --reference-odt option allows users to customize styles in ODT output.
  • Users may now put custom templates, s5 styles, and a reference ODT in the ~/.pandoc directory, where they will override system defaults. See README for details.
  • Unicode is now used whenever possible in HTML and XML output. Entities are used only where necessary (&gt;, &lt;, &quot;, &amp;).
  • Authors and dates may now include formatting and notes.
  • Added --xetex option for pandoc and markdown2pdf.
  • Windows installer now includes highlighting support and markdown2pdf and hsmarkdown wrappers.
  • Pandoc no longer requires Template Haskell, which should make it more portable.
  • Pandoc can now be built on GHC 6.12, as well as earlier versions.
  • See README for other small improvements and bug fixes.

pandoc 1.3 (2009-12-10)

  • Added --id-prefix option to help prevent duplicate identifiers when you’re generating HTML fragments.
  • Added --indented-code-classes option, which specifies default highlighting syntax for indented code blocks.
  • --number-sections now affects HTML output.
  • Improved syntax for markdown definition lists.
  • Better looking simple tables.
  • Markdown tables without headers are now possible.
  • New hard line break syntax: backslash followed by newline.
  • Improved performance of markdown reader by ~10% by eliminating the need for a separate parsing pass for notes.
  • Improved syntax highlighting for literate Haskell.
  • Support for “.. code-block” directive in RST reader.
  • Windows binary now includes highlighting support.
  • Many bug fixes and small improvements. See changelog for details.

pandoc 1.2.1 (2009-07-18)

  • Improved the efficiency of the markdown reader’s abbreviation parsing (should give a big performance boost with --smart).
  • HTML writer now wraps sections in divs with unique identifiers, for easier manipulation.
  • Improved LaTeX reader’s coverage of math modes.
  • Added a portable Haskell version of markdown2pdf (thanks to Paulo Tanimoto).
  • Made --strict compatible with --standalone and --toc.
  • Many other small improvements and bug fixes. See changelog for details.

pandoc 1.2 (2009-03-01)

  • Added support for literate Haskell. lhs support is triggered by ‘+lhs’ suffixes in formats. For example, ‘latex+lhs’ is literate Haskell LaTeX. ‘.lhs’ files are treated by default as literate markdown.
  • Added --email-obfuscation option.
  • Brought citeproc support up to date for citeproc-hs-0.2.
  • Many bugs fixed. See changelog for details.

pandoc 1.1 (2008-11-06)

  • New --jsmath option supporting use of pandoc with jsMath.
  • Classes on HTML table output for better CSS styling.
  • Windows installer no longer requires admin privileges.
  • Many bugs fixed. See changelog for details.

pandoc 1.0 (2008-09-13)

  • New writers for MediaWiki, GNU Texinfo (thanks to Peter Wang), OpenDocument XML (thanks to Andrea Rossato), and ODT (OpenOffice document).
  • New delimited code blocks, with optional syntax highlighting.
  • Reorganized build system: pandoc can now be built using standard Cabal tools. It can be compiled on Windows without Cygwin. The tests can also be run without perl or unix tools.
  • LaTeXMathML replaces ASCIIMathML for rendering math in HTML.
  • Support for “displayed” math.
  • Common abbreviations are now handled more intelligently, with a non-breaking space (and not a sentence-ending space) after the period.
  • Code is -Wall clean.
  • Many bug fixes and small improvements. See changelog for full details.

pandoc 0.46 (2008-01-08)

  • Added a --sanitize-html option (and a corresponding parameter in ParserState for those using the pandoc libraries in programs). This option causes pandoc to sanitize HTML (in HTML or Markdown input) using a whitelist method. Possibly harmful HTML elements are replaced with HTML comments. This should be useful in the context of web applications, where pandoc may be used to convert user input into HTML.
  • Made -H, -A, and -B options cumulative: if they are specified multiple times, multiple files will be included.
  • Many bug fixes and small improvements. See changelog for full details.

pandoc 0.45 (2007-12-09)

  • Many bug fixes and structural improvements. See changelog for full details.
  • Improved treatment of math. Math is now rendered using unicode by default in HTML, RTF, and DocBook output. For more accurate display of math in HTML, --gladtex, --mimetex, and --asciimathml options are provided. See the User’s Guide for details.
  • Removed support for box-style block quotes in markdown.
  • More idiomatic ConTeXt output.
  • Text wrapping in ConTeXt and LaTeX output.
  • Pandoc now correctly handles all standard line endings (CR, LF, CRLF).
  • New --no-wrap option that disables line wrapping and minimizes whitespace in HTML output.
  • Build process is now compatible with both GHC 6.8 and GHC 6.6. GHC and GHC_PKG environment variables may be used to specify which version of the compiler to use, when multiple versions are installed.