Description

Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library. It can read markdown and (subsets of) Textile, reStructuredText, HTML, LaTeX, MediaWiki markup, Haddock markup, OPML, and DocBook; and it can write plain text, markdown, reStructuredText, XHTML, HTML 5, LaTeX (including beamer slide shows), ConTeXt, RTF, OPML, DocBook, OpenDocument, ODT, Word docx, GNU Texinfo, MediaWiki markup, EPUB (v2 or v3), FictionBook2, Textile, groff man pages, Emacs Org-Mode, AsciiDoc, and Slidy, Slideous, DZSlides, reveal.js or S5 HTML slide shows. It can also produce PDF output on systems where LaTeX is installed.

Pandoc’s enhanced version of markdown includes syntax for footnotes, tables, flexible ordered lists, definition lists, fenced code blocks, superscript, subscript, strikeout, title blocks, automatic tables of contents, embedded LaTeX math, citations, and markdown inside HTML block elements. (These enhancements, described below under Pandoc’s markdown, can be disabled using the markdown_strict input or output format.)

In contrast to most existing tools for converting markdown to HTML, which use regex substitutions, Pandoc has a modular design: it consists of a set of readers, which parse text in a given format and produce a native representation of the document, and a set of writers, which convert this native representation into a target format. Thus, adding an input or output format requires only adding a reader or writer.

Using pandoc

If no input file is specified, input is read from stdin. Otherwise, the input files are concatenated (with a blank line between each) and used as input. Output goes to stdout by default (though output to stdout is disabled for the odt, docx, epub, and epub3 output formats). For output to a file, use the -o option:

pandoc -o output.html input.txt

Instead of a file, an absolute URI may be given. In this case pandoc will fetch the content using HTTP:

pandoc -f html -t markdown http://www.fsf.org

If multiple input files are given, pandoc will concatenate them all (with blank lines between them) before parsing.
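The concatenation step can be approximated outside pandoc. The following is only a sketch: concat_inputs is a hypothetical helper, not part of pandoc, and it assumes that "a blank line between each" means exactly one empty line inserted between consecutive files. Pandoc's internal handling of trailing newlines may differ.

```shell
# Hypothetical helper (not part of pandoc): join files with one blank
# line between each, approximating how multiple inputs are combined.
concat_inputs() {
    first=1
    for f in "$@"; do
        # Emit the separating blank line before every file except the first.
        if [ "$first" -eq 0 ]; then
            printf '\n'
        fi
        cat "$f"
        first=0
    done
}
```

Because pandoc reads stdin when no input file is given, the result could then be piped in, for example (file names are placeholders):

concat_inputs chap1.txt chap2.txt | pandoc -o book.html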

The format of the input and output can be specified explicitly using command-line options. The input format can be specified using the -r/--read or -f/--from options, the output format using the -w/--write or -t/--to options. Thus, to convert hello.txt from markdown to LaTeX, you could type:

pandoc -f markdown -t latex hello.txt

To convert hello.html from html to markdown:

pandoc -f html -t markdown hello.html

Supported output formats are listed below under the -t/--to option. Supported input formats are listed below under the -f/--from option. Note that the rst, textile, latex, and html readers are not complete; there are some constructs that they do not parse.

If the input or output format is not specified explicitly, pandoc will attempt to guess it from the extensions of the input and output filenames. Thus, for example,

pandoc -o hello.tex hello.txt

will convert hello.txt from markdown to LaTeX. If no output file is specified (so that output goes to stdout), or if the output file’s extension is unknown, the output format will default to HTML. If no input file is specified (so that input comes from stdin), or if the input files’ extensions are unknown, the input format will be assumed to be markdown.
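The fallback rules above can be pictured as a lookup on the file extension. This is a rough sketch only: guess_output_format and guess_input_format are hypothetical names, and the two extensions shown are just examples. Pandoc's real table of known extensions is larger and is not reproduced here.

```shell
# Hypothetical sketch of extension-based guessing. Only a couple of
# extensions are listed; pandoc itself recognizes many more.
guess_output_format() {
    case "$1" in
        *.tex)  echo latex ;;
        *.html) echo html ;;
        *)      echo html ;;      # no file or unknown extension: HTML
    esac
}

guess_input_format() {
    case "$1" in
        *.html) echo html ;;
        *)      echo markdown ;;  # stdin or unknown extension: markdown
    esac
}
```

Under these assumptions, guess_output_format hello.tex prints latex, and guess_input_format hello.txt prints markdown, matching the example above.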

Pandoc uses the UTF-8 character encoding for both input and output. If your local character encoding is not UTF-8, you should pipe input and output through iconv:

iconv -t utf-8 input.txt | pandoc | iconv -f utf-8
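The command above relies on iconv using the current locale's encoding whenever -f or -t is omitted. If the file's encoding differs from the locale, it may be safer to name it explicitly. A minimal sketch, where to_utf8 and from_utf8 are hypothetical wrappers and the encoding name is supplied by the caller:

```shell
# Hypothetical wrappers around iconv; the encoding name (for example
# ISO-8859-1) is an assumption supplied by the caller.
to_utf8() {
    # Convert stdin from the given encoding to UTF-8.
    iconv -f "$1" -t UTF-8
}

from_utf8() {
    # Convert stdin from UTF-8 back to the given encoding.
    iconv -f UTF-8 -t "$1"
}
```

With these, a Latin-1 file could be processed as follows:

to_utf8 ISO-8859-1 < input.txt | pandoc | from_utf8 ISO-8859-1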

Creating a PDF

Earlier versions of pandoc came with a program, markdown2pdf, that used pandoc and pdflatex to produce a PDF. This is no longer needed, since pandoc can now produce PDF output itself. To produce a PDF, simply specify an output file with a .pdf extension. Pandoc will create a LaTeX file and use pdflatex (or another engine, see --latex-engine) to convert it to PDF:

pandoc test.txt -o test.pdf

Production of a PDF requires that a LaTeX engine be installed (see --latex-engine, below), and assumes that the following LaTeX packages are available: amssymb, amsmath, ifxetex, ifluatex, listings (if the --listings option is used), fancyvrb, longtable, booktabs, url, graphicx, hyperref, ulem, babel (if the lang variable is set), fontspec (if xelatex or lualatex is used as the LaTeX engine), xltxtra and xunicode (if xelatex is used).

hsmarkdown

A user who wants a drop-in replacement for Markdown.pl may create a symbolic link to the pandoc executable called hsmarkdown. When invoked under the name hsmarkdown, pandoc will behave as if invoked with -f markdown_strict --email-obfuscation=references, and all command-line options will be treated as regular arguments. However, this approach does not work under Cygwin, due to problems with its simulation of symbolic links.
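Creating the link might look like the sketch below. make_hsmarkdown_link is a hypothetical helper, and both the path to the pandoc executable and the target directory are assumptions that depend on the local installation.

```shell
# Hypothetical helper: create a symbolic link named hsmarkdown that
# points at the given pandoc executable.
make_hsmarkdown_link() {
    pandoc_path=$1   # full path to the pandoc executable
    link_dir=$2      # directory on the PATH where the link should live
    ln -s "$pandoc_path" "$link_dir/hsmarkdown"
}
```

For example, assuming $HOME/bin exists and is on the PATH:

make_hsmarkdown_link "$(command -v pandoc)" "$HOME/bin"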