pandoc-1.9.3: Conversion between markup formats

Portabilityportable
Stabilityalpha
MaintainerJohn MacFarlane <jgm@berkeley.edu>
Safe HaskellSafe-Infered

Text.Pandoc

Contents

Description

This helper module exports the main writers, readers, and data structure definitions from the Pandoc libraries.

A typical application will chain together a reader and a writer to convert strings from one format to another. For example, the following simple program will act as a filter converting markdown fragments to reStructuredText, using reference-style links instead of inline links:

 module Main where
 import Text.Pandoc
 -- include the following two lines only if you're using ghc < 6.12:
 import Prelude hiding (getContents, putStrLn)
 import System.IO.UTF8

 markdownToRST :: String -> String
 markdownToRST =
   (writeRST defaultWriterOptions {writerReferenceLinks = True}) .
   readMarkdown defaultParserState
 
 main = getContents >>= putStrLn . markdownToRST

Note: all of the readers assume that the input text has '\n' line endings. So if you get your input text from a web form, you should remove '\r' characters using filter (/='\r').

Synopsis

Definitions

Generics

Lists of readers and writers

readers :: [(String, ParserState -> String -> Pandoc)]

Association list of formats and readers.

writers :: [(String, WriterOptions -> Pandoc -> String)]

Association list of formats and writers (omitting the binary writers, odt, docx, and epub).

Readers: converting to Pandoc format

readMarkdown

Arguments

:: ParserState

Parser state, including options for parser

-> String

String to parse (assuming '\n' line endings)

-> Pandoc 

Read markdown from an input string and return a Pandoc document.

readRST

Arguments

:: ParserState

Parser state, including options for parser

-> String

String to parse (assuming '\n' line endings)

-> Pandoc 

Parse reStructuredText string and return Pandoc document.

readLaTeX

Arguments

:: ParserState

Parser state, including options for parser

-> String

String to parse (assumes '\n' line endings)

-> Pandoc 

Parse LaTeX from string and return Pandoc document.

readHtml

Arguments

:: ParserState

Parser state

-> String

String to parse (assumes '\n' line endings)

-> Pandoc 

Convert HTML-formatted string to Pandoc document.

readTextile

Arguments

:: ParserState

Parser state, including options for parser

-> String

String to parse (assuming '\n' line endings)

-> Pandoc 

Parse a Textile text and return a Pandoc document.

readNative

Arguments

:: String

String to parse (assuming '\n' line endings)

-> Pandoc 

Read native formatted text and return a Pandoc document. The input may be a full pandoc document, a block list, a block, an inline list, or an inline. Thus, for example,

 Str "hi"

will be treated as if it were

 Pandoc (Meta [] [] []) [Plain [Str "hi"]]

Parser state used in readers

data ParserState

Parsing options.

Constructors

ParserState 

Fields

stateParseRaw :: Bool

Parse raw HTML and LaTeX?

stateParserContext :: ParserContext

Inside list?

stateQuoteContext :: QuoteContext

Inside quoted environment?

stateMaxNestingLevel :: Int

Max # of nested Strong/Emph

stateLastStrPos :: Maybe SourcePos

Position after last str parsed

stateKeys :: KeyTable

List of reference keys

stateCitations :: [String]

List of available citations

stateNotes :: NoteTable

List of notes

stateTabStop :: Int

Tab stop

stateStandalone :: Bool

Parse bibliographic info?

stateTitle :: [Inline]

Title of document

stateAuthors :: [[Inline]]

Authors of document

stateDate :: [Inline]

Date of document

stateStrict :: Bool

Use strict markdown syntax?

stateSmart :: Bool

Use smart typography?

stateOldDashes :: Bool

Use pandoc <= 1.8.2.1 behavior in parsing dashes; -- is em-dash; before numeral is en-dash

stateLiterateHaskell :: Bool

Treat input as literate haskell

stateColumns :: Int

Number of columns in terminal

stateHeaderTable :: [HeaderType]

Ordered list of header types used

stateIndentedCodeClasses :: [String]

Classes to use for indented code blocks

stateNextExample :: Int

Number of next example

stateExamples :: Map String Int

Map from example labels to numbers

stateHasChapters :: Bool

True if chapter encountered

stateApplyMacros :: Bool

Apply LaTeX macros?

stateMacros :: [Macro]

List of macros defined so far

stateRstDefaultRole :: String

Current rST default interpreted text role

Instances

data ParserContext

Constructors

ListItemState

Used when running parser on list item contents

NullState

Default state

data QuoteContext

Constructors

InSingleQuote

Used when parsing inside single quotes

InDoubleQuote

Used when parsing inside double quotes

NoQuote

Used when not parsing inside quotes

type NoteTable = [(String, String)]

data HeaderType

Constructors

SingleHeader Char

Single line of characters underneath

DoubleHeader Char

Lines of characters above and below

Writers: converting from Pandoc format

writeNative :: WriterOptions -> Pandoc -> String

Prettyprint Pandoc document.

writeMarkdown :: WriterOptions -> Pandoc -> String

Convert Pandoc to Markdown.

writePlain :: WriterOptions -> Pandoc -> String

Convert Pandoc to plain text (like markdown, but without links, pictures, or inline formatting).

writeRST :: WriterOptions -> Pandoc -> String

Convert Pandoc to RST.

writeLaTeX :: WriterOptions -> Pandoc -> String

Convert Pandoc to LaTeX.

writeConTeXt :: WriterOptions -> Pandoc -> String

Convert Pandoc to ConTeXt.

writeTexinfo :: WriterOptions -> Pandoc -> String

Convert Pandoc to Texinfo.

writeHtml :: WriterOptions -> Pandoc -> Html

Convert Pandoc document to Html structure.

writeHtmlString :: WriterOptions -> Pandoc -> String

Convert Pandoc document to Html string.

writeDocbook :: WriterOptions -> Pandoc -> String

Convert Pandoc document to string in Docbook format.

writeOpenDocument :: WriterOptions -> Pandoc -> String

Convert Pandoc document to string in OpenDocument format.

writeMan :: WriterOptions -> Pandoc -> String

Convert Pandoc to Man.

writeMediaWiki :: WriterOptions -> Pandoc -> String

Convert Pandoc to MediaWiki.

writeTextile :: WriterOptions -> Pandoc -> String

Convert Pandoc to Textile.

writeRTF :: WriterOptions -> Pandoc -> String

Convert Pandoc to a string in rich text format.

writeODT

Arguments

:: Maybe FilePath

Path specified by --reference-odt

-> WriterOptions

Writer options

-> Pandoc

Document to convert

-> IO ByteString 

Produce an ODT file from a Pandoc document.

writeDocx

Arguments

:: Maybe FilePath

Path specified by --reference-docx

-> WriterOptions

Writer options

-> Pandoc

Document to convert

-> IO ByteString 

Produce an Docx file from a Pandoc document.

writeEPUB

Arguments

:: Maybe String

EPUB stylesheet specified at command line

-> [FilePath]

Paths to fonts to embed

-> WriterOptions

Writer options

-> Pandoc

Document to convert

-> IO ByteString 

Produce an EPUB file from a Pandoc document.

writeOrg :: WriterOptions -> Pandoc -> String

Convert Pandoc to Org.

writeAsciiDoc :: WriterOptions -> Pandoc -> String

Convert Pandoc to AsciiDoc.

Writer options used in writers

data WriterOptions

Options for writers

Constructors

WriterOptions 

Fields

writerStandalone :: Bool

Include header and footer

writerTemplate :: String

Template to use in standalone mode

writerVariables :: [(String, String)]

Variables to set in template

writerEPUBMetadata :: String

Metadata to include in EPUB

writerTabStop :: Int

Tabstop for conversion btw spaces and tabs

writerTableOfContents :: Bool

Include table of contents

writerSlideVariant :: HTMLSlideVariant

Are we writing S5 or Slidy?

writerIncremental :: Bool

True if lists should be incremental

writerXeTeX :: Bool

Create latex suitable for use by xetex

writerHTMLMathMethod :: HTMLMathMethod

How to print math in HTML

writerIgnoreNotes :: Bool

Ignore footnotes (used in making toc)

writerNumberSections :: Bool

Number sections in LaTeX

writerSectionDivs :: Bool

Put sections in div tags in HTML

writerStrictMarkdown :: Bool

Use strict markdown syntax

writerReferenceLinks :: Bool

Use reference links in writing markdown, rst

writerWrapText :: Bool

Wrap text to line length

writerColumns :: Int

Characters in a line (for text wrapping)

writerLiterateHaskell :: Bool

Write as literate haskell

writerEmailObfuscation :: ObfuscationMethod

How to obfuscate emails

writerIdentifierPrefix :: String

Prefix for section & note ids in HTML

writerSourceDirectory :: FilePath

Directory path of 1st source file

writerUserDataDir :: Maybe FilePath

Path of user data directory

writerCiteMethod :: CiteMethod

How to print cites

writerBiblioFiles :: [FilePath]

Biblio files to use for citations

writerHtml5 :: Bool

Produce HTML5

writerBeamer :: Bool

Produce beamer LaTeX slide show

writerSlideLevel :: Maybe Int

Force header level of slides

writerChapters :: Bool

Use chapter for top-level sects

writerListings :: Bool

Use listings package for code

writerHighlight :: Bool

Highlight source code

writerHighlightStyle :: Style

Style to use for highlighting

writerSetextHeaders :: Bool

Use setext headers for levels 1-2 in markdown

writerTeXLigatures :: Bool

Use tex ligatures quotes, dashes in latex

Instances

data HTMLSlideVariant

Varieties of HTML slide shows.

defaultWriterOptions :: WriterOptions

Default writer options.

Rendering templates and default templates

Version

pandocVersion :: String

Version number of pandoc library.

Miscellaneous

rtfEmbedImage :: Inline -> IO Inline

Convert Image inlines into a raw RTF embedded image, read from a file. If file not found or filetype not jpeg or png, leave the inline unchanged.

jsonFilter :: (Pandoc -> Pandoc) -> String -> String

Converts a transformation on the Pandoc AST into a function that reads and writes a JSON-encoded string. This is useful for writing small scripts.

class ToJsonFilter a where

toJsonFilter convert a function into a filter that reads pandoc's json output from stdin, transforms it by walking the AST and applying the specified function, and writes the result as json to stdout. Usage example:

 -- capitalize.hs
 -- compile with:  ghc --make capitalize
 -- run with:      pandoc -t json | ./capitalize | pandoc -f json

 import Text.Pandoc
 import Data.Char (toUpper)

 main :: IO ()
 main = toJsonFilter capitalizeStrings

 capitalizeStrings :: Inline -> Inline
 capitalizeStrings (Str s) = Str $ map toUpper s
 capitalizeStrings x       = x

The function can be any type (a -> a), (a -> IO a), (a -> [a]), or (a -> IO [a]), where a is an instance of Data. So, for example, a can be Pandoc, Inline, Block, [Inline], [Block], Meta, ListNumberStyle, Alignment, ListNumberDelim, QuoteType, etc. See Definition.

Methods

toJsonFilter :: a -> IO ()

Instances

Data a => ToJsonFilter (a -> IO [a]) 
Data a => ToJsonFilter (a -> [a]) 
Data a => ToJsonFilter (a -> IO a) 
Data a => ToJsonFilter (a -> a)