Available at https://github.com/Tiefseetauchner/TiefDownConverter

TiefDownConverter Documentation

Tiefseetauchner et al.

Introduction

For the documentation of the library, see docs.rs.

For the documentation of the CLI, see TiefDownConverter.

This is the documentation for the TiefDown concepts. It does not explain how to use the library or the CLI; instead, it serves as an introduction to the basics of TiefDown for users and contributors alike.

What is TiefDown?

TiefDown is a project format for managing markdown files and converting them to other formats. It’s not a markdown parser, but rather a project format and management system.

Importantly, the project is split into a few parts:

Manifest File

The manifest file is the heart of the project. It contains all the information needed to manage and convert the project.

It consists of a few important parts (for the full documentation, check https://docs.rs/tiefdownlib/latest/tiefdownlib/manifest_model/index.html):

Templates and processors (at a glance)

Each entry under [[templates]] specifies one output variant. Important fields are:

Project-wide [[custom_processors.preprocessors]] and [[custom_processors.processors]] define reusable building blocks referenced by templates.

Example snippets

# CustomPreprocessors: you decide how inputs are preprocessed; the combined
# output is copied to the final destination without an additional processor step.
[[templates]]
name = "Website HTML"
template_type = "CustomPreprocessors"
output = "site/index.html"

  [templates.preprocessors]
  preprocessors = ["html-from-md"]
  combined_output = "output.html"

[[custom_processors.preprocessors]]
name = "html-from-md"
cli = "pandoc"
cli_args = ["-f", "markdown", "-t", "html"]

# CustomProcessor: preprocess to Pandoc Native, then run Pandoc once more with
# processor arguments to produce the final artifact.
[[templates]]
name = "Docx Export"
template_type = "CustomProcessor"
output = "book.docx"
processor = "docx-out"

  [templates.preprocessors]
  preprocessors = ["native-parts"]
  combined_output = "output.pandoc_native"

[[custom_processors.preprocessors]]
name = "native-parts"
cli = "pandoc"
cli_args = ["-t", "native"]

[[custom_processors.processors]]
name = "docx-out"
processor_args = ["--reference-doc", "resources/reference.docx"]

Conversion folders

The conversion folder is where inputs are collected and processed for a run. A new, time-stamped folder is created inside the project whenever a conversion starts. If a markdown project specifies an output directory, TiefDown uses that as the base for that project’s conversion folder. Old conversion folders can be removed manually with the clean command or automatically with smart clean.

To understand what goes into the folder, it helps to look at the conversion pipeline (see the overview diagram):

Workflow
  1. Input discovery and ordering: TiefDown scans the markdown project directory, orders files by the first number in the filename (e.g. Chapter 10 - …), and recurses into similarly numbered subfolders, preserving their order.

  2. Preprocessing by extension: Inputs are grouped by file extension. For each group, TiefDown selects the matching preprocessor for the active template (either a default or a custom one filtered by extension) and runs it in the conversion folder. The stdout from each run is captured.

  3. Combined output: The captured outputs are concatenated and written to the template’s configured preprocessors.combined_output file (typically output.tex for LaTeX or output.typ for Typst). Your template includes this file.

  4. Metadata files: TiefDown generates metadata.tex or metadata.typ (only if they don’t already exist) based on [shared_metadata], any project-specific metadata, and your optional metadata settings.

  5. Template processing: Depending on the template type, TiefDown runs XeLaTeX (twice) or Typst on the template file in the conversion folder, optionally passing arguments from a named processor. EPUB templates invoke Pandoc directly. CustomPreprocessors templates copy the combined output as-is to the final destination. CustomProcessor templates run a final Pandoc invocation reading the combined Pandoc Native input and passing the configured processor arguments.

  6. Finalization: The produced artifact (e.g. a PDF or EPUB) is then copied to the markdown project’s configured output path.
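The ordering rule from step 1 can be sketched in a few lines of Python. This is only an illustration of "order by the first number in the filename", not TiefDown's actual implementation; the helper names first_number and order_inputs are hypothetical, and placing unnumbered names last is an assumption made for this sketch.

```python
import re
from typing import List, Optional


def first_number(name: str) -> Optional[int]:
    """Return the first run of digits in a name as an int, or None if absent."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else None


def order_inputs(names: List[str]) -> List[str]:
    """Sort names by their first number.

    Assumption: names without any number are placed last, alphabetically.
    """
    def key(name: str):
        number = first_number(name)
        return (number is None, number or 0, name)

    return sorted(names, key=key)


chapters = ["Chapter 10 - End.md", "Chapter 2 - Middle.md", "Chapter 1 - Start.md"]
print(order_inputs(chapters))
```

Sorting numerically matters because a plain alphabetical sort would place "Chapter 10" before "Chapter 2".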

Templates

Templating in TiefDown is done in several ways:

LaTeX Templates

LaTeX templates are the most intuitive form of templating in TiefDown, and also the most fleshed out. In basic usage, TiefDown generates a LaTeX document from the markdown files, usually output.tex, with Lua filters applied depending on the template, and then converts the template file to a PDF.

The LaTeX file must include the following:

\input{./output.tex}

This imports the converted markdown files into the LaTeX document. You may adjust the behaviour by using custom preprocessors and custom processors.

For metadata, you can also import the metadata file, which TiefDown generates during the conversion process.

\input{./metadata.tex}

This file provides a macro for accessing metadata, the \meta{} command. The name of this command can be changed using the metadata settings.

There are preset templates available in the core library. These give a basic framework for extension and shouldn’t be taken as the only way to use LaTeX templates.

Typst Templates

Typst templates are conceptually identical to LaTeX templates. TiefDown generates a Typst document from the markdown files, usually output.typ, with Lua filters applied depending on the template, and then converts the template file to a PDF.

Importing the Typst file works similarly:

#include "output.typ"

Again, see custom preprocessors and custom processors for more information on customization.

Importing metadata is easier, as Typst has an object system.

#import "./metadata.typ": meta

One can then access metadata using the meta object.

There are also preset templates available in the core library. Again, these are just a basic suggestion.

EPUB Templates

EPUB templates are a specialized version of CustomProcessor templates. They let you add CSS and embed fonts through the templating system without listing the files in the processor.

When you add a CSS file to the template directory of an EPUB template, it is passed to the Pandoc conversion with the -c flag.

Additionally, you can add a fonts/ directory in the EPUB folder. Every file in this directory is passed to the conversion using the --epub-embed-font flag.

NOTE: This template type is somewhat deprecated and will likely not gain new features. It offers some shortcuts over CustomProcessor conversion, but for full control, use the CustomProcessor template type instead.

CustomPreprocessors Conversion

CustomPreprocessors replaces the older “Custom Pandoc Conversion” naming. It lets you define exactly how inputs are preprocessed (with Pandoc or any CLI) and where the combined output is written. There is no additional processing step after the combined output is produced.

Key points

Example: single-file HTML without a LaTeX/Typst step

[[templates]]
name = "HTML Article"
template_type = "CustomPreprocessors"
output = "article.html"

  [templates.preprocessors]
  preprocessors = ["md-to-html"]
  combined_output = "output.html"

[[custom_processors.preprocessors]]
name = "md-to-html"
cli = "pandoc"
cli_args = ["-f", "gfm", "-t", "html"]

CustomProcessor Conversion

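CustomProcessor is a two-phase pipeline: the inputs are first preprocessed into a combined Pandoc Native document, and Pandoc then runs once more with the configured processor arguments to produce the final artifact.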

Requirements and behavior

Example: produce a docx from the combined native document

[[templates]]
name = "Docx"
template_type = "CustomProcessor"
output = "book.docx"
processor = "docx"

  [templates.preprocessors]
  preprocessors = ["native"]
  combined_output = "output.pandoc_native"

[[custom_processors.preprocessors]]
name = "native"
cli = "pandoc"
cli_args = ["-t", "native"]

[[custom_processors.processors]]
name = "docx"
processor_args = ["--reference-doc", "resources/ref.docx"]

Smart Clean

Smart clean automatically removes old conversion folders. When enabled via smart_clean in manifest.toml, TiefDown keeps only a given number of recent folders. The number is specified with smart_clean_threshold and defaults to 5.

During conversion, the library checks the number of existing conversion folders and deletes the oldest ones once the threshold is exceeded.
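The threshold rule can be illustrated with a short Python sketch. It is not the library's code: the function name folders_to_delete is hypothetical, and the sketch assumes that folder names are timestamps which sort chronologically as plain strings.

```python
from typing import List


def folders_to_delete(folders: List[str], threshold: int = 5) -> List[str]:
    """Return the oldest folders that exceed the threshold.

    Assumption: folder names sort chronologically (oldest first) as strings.
    """
    ordered = sorted(folders)
    excess = len(ordered) - threshold
    return ordered[:excess] if excess > 0 else []


# Hypothetical folder names; the real naming scheme may differ.
existing = ["2025-01-0{}".format(day) for day in range(1, 8)]
print(folders_to_delete(existing))
```

With seven folders and the default threshold of 5, the two oldest folders are selected for removal; with five or fewer, nothing is removed.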

Profiles

A profile is a named list of templates that can be converted together. Defining profiles avoids having to pass a long list of template names every time you run the converter.

Profiles are stored in the project’s manifest.toml:

[[profiles]]
name = "Documentation"
templates = ["PDF Documentation LaTeX", "PDF Documentation"]

Use the --profile option with tiefdownconverter convert to select a profile. Markdown projects may also specify a default_profile; this profile is used if none is supplied on the command line.
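The precedence between the command line and default_profile can be sketched as follows. This is a hedged illustration in Python, not the converter's code: templates_for is a hypothetical helper, the manifest is modelled as a plain dictionary, and returning None when no profile applies is an assumption of this sketch.

```python
from typing import List, Optional


def templates_for(manifest: dict,
                  cli_profile: Optional[str] = None,
                  default_profile: Optional[str] = None) -> Optional[List[str]]:
    """Resolve the template list for the selected profile.

    The profile given on the command line wins; otherwise the markdown
    project's default_profile is used. Returns None if neither is set.
    """
    name = cli_profile if cli_profile is not None else default_profile
    if name is None:
        return None
    for profile in manifest.get("profiles", []):
        if profile["name"] == name:
            return profile["templates"]
    raise KeyError("unknown profile: " + name)


manifest = {
    "profiles": [
        {"name": "Documentation",
         "templates": ["PDF Documentation LaTeX", "PDF Documentation"]},
    ]
}
print(templates_for(manifest, default_profile="Documentation"))
```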

Lua Filters

Lua filters allow you to modify the document structure during Pandoc’s conversion step. They are attached to templates through the filters field. The value may be a single Lua file or a directory containing multiple filter scripts.

Pandoc executes the filters in the order they are listed. Filters can rename headers, insert custom blocks or otherwise transform the document before it reaches the template engine.

Example filter to adjust chapter headings:

-- Emit level-1 headers as raw LaTeX \chapter commands; other headers pass through unchanged.
function Header(el)
  if el.level == 1 then
    return pandoc.RawBlock("latex", "\\chapter{" .. pandoc.utils.stringify(el.content) .. "}")
  end
end

For more details on writing filters see the Pandoc documentation.

Note: Lua filters apply to the Pandoc preprocessing step(s). If a template uses a custom preprocessor whose CLI is not Pandoc, those filters have no effect on that preprocessor’s output.

Markdown Projects

A TiefDown project can contain multiple markdown projects. Each project defines where the source files live and where the converted results should be placed. The information is stored in [[markdown_projects]] entries in manifest.toml.

[[markdown_projects]]
name = "Book One"
path = "book_one/markdown"
output = "book_one/output"

A markdown project may define a default_profile used for conversion, a list of resources to copy into the conversion folder, and its own metadata.

Custom Resources

Resources are additional files that are copied from the markdown project directory to the conversion folder before processing. Typical examples are images, CSS files or fonts needed by a template. Specify them in the resources array:

resources = ["resources/cover.png", "resources/styles.css"]

Markdown Project Metadata

Project-specific metadata is stored under the metadata_fields table of a markdown project. These values are merged with the [shared_metadata] of the project during conversion. When keys collide, the markdown project metadata overrides the shared metadata.
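The override rule amounts to a dictionary merge in which project values win. A minimal Python sketch, assuming both metadata tables are flat key/value mappings (merge_metadata is a hypothetical name, not part of the library):

```python
def merge_metadata(shared: dict, project: dict) -> dict:
    """Merge shared and project metadata; project values override shared ones."""
    merged = dict(shared)   # copy so the shared table is left untouched
    merged.update(project)  # colliding keys take the project value
    return merged


shared = {"publisher": "Example Press", "author": "Shared Author"}
project = {"author": "Book Author", "title": "Book One"}
print(merge_metadata(shared, project))
```

Here the author comes from the markdown project, while the publisher is inherited from the shared metadata.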

Custom Processors

Custom processors let you change the commands used during conversion. They come in two forms: preprocessors and processors.

A preprocessor is defined under [[custom_processors.preprocessors]]:

[[custom_processors.preprocessors]]
name = "Enable Listings"
cli_args = ["-t", "latex", "--listings"]

A preprocessor can also define a command using the cli field. This replaces the Pandoc preprocessing step with a custom CLI command.

[[custom_processors.preprocessors]]
name = "Copy without modification"
cli = "cat"
cli_args = []
extension_filter = "typ"

Templates reference one or more preprocessors with their preprocessors field, which also has to define a combined_output field. The converter captures the stdout of each preprocessor run and writes it to this file, which your template then includes (\input{./output.tex} or #include "./output.typ") or copies to the output location for CustomPreprocessors templates.

Processors are specified similarly and referenced via the processor field:

[[custom_processors.processors]]
name = "Typst Font Directory"
processor_args = ["--font-path", "fonts/"]

Usage notes:

These mechanisms allow fine-grained control over the conversion pipeline when the defaults are not sufficient.

Defaults

TiefDown provides reasonable defaults per template type:

You can override a default for a particular extension by defining a preprocessor with a matching extension_filter; defaults for other extensions remain.

Preprocessors can be scoped by extension via extension_filter, which matches only the file extension (glob patterns such as t* are supported). If you omit the filter, the preprocessor acts as a fallback when no more specific filter matches. Finally, cli_args supports metadata substitution: any occurrence of {{key}} is replaced with the corresponding metadata value at conversion time.
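Both mechanisms can be sketched in Python. This is an illustration under stated assumptions, not TiefDown's implementation: the function names are hypothetical, preprocessors are modelled as plain dictionaries, the first matching filter wins, and unknown {{key}} placeholders are left unchanged.

```python
import re
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


def select_preprocessor(extension: str, preprocessors: List[dict]) -> Optional[dict]:
    """Return the first preprocessor whose extension_filter matches.

    A preprocessor without a filter is used only as a fallback.
    """
    fallback = None
    for pre in preprocessors:
        pattern = pre.get("extension_filter")
        if pattern is None:
            if fallback is None:
                fallback = pre
        elif fnmatchcase(extension, pattern):
            return pre
    return fallback


def substitute_metadata(args: List[str], metadata: Dict[str, str]) -> List[str]:
    """Replace every {{key}} in the arguments with its metadata value."""
    def replace(match):
        # Assumption: placeholders with unknown keys are kept as-is.
        return str(metadata.get(match.group(1), match.group(0)))

    return [re.sub(r"\{\{(\w+)\}\}", replace, arg) for arg in args]


pres = [{"name": "fallback"}, {"name": "typ-copy", "extension_filter": "t*"}]
print(select_preprocessor("typ", pres)["name"])
print(substitute_metadata(["--metadata", "title={{title}}"], {"title": "Book One"}))
```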

Shared Metadata

Shared metadata is defined once for the whole project and is available to every markdown project. It lives under [shared_metadata] in the manifest file and is merged with project-specific metadata at conversion time. Values defined in a markdown project override entries from the shared metadata.

Use shared metadata for information that stays the same across multiple books or documents, like the publisher or an overarching author list.

Metadata Settings

Metadata settings influence how metadata files are generated. The [metadata_settings] table currently supports the metadata_prefix option. This prefix determines the name of the macro or object used to access metadata in templates.

For example, with

[metadata_settings]
metadata_prefix = "book"

the generated LaTeX file defines a \book{} command, while Typst exposes a book object. In other words, the prefix fully replaces the default meta name. If no prefix is set, the command and object are called meta.

Injections

Injections are a project-driven way to insert input files at the top, within the body, or at the bottom of a document. These are called header, body, and footer injections, respectively.

Injections serve as a template-scoped way to add content to a conversion. An injection is defined once in the manifest and then assigned to a template as a header, body, or footer injection.

During conversion, the injected files are resolved and placed in the list of input files to be converted to the intermediary format.

Header injections are inserted at the top of the document and footer injections at the bottom, both in the order defined in the manifest. The first file of the first referenced injection is placed first, and the last file of the last referenced injection is placed last.

Body injections

Body injections are added to the list of input files before the main sorting step, just like any file in the input directory. They are therefore sorted according to the primary sorting algorithm.
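The placement rules for the three injection kinds can be summarized in a small Python sketch. It is a simplified model rather than the library's code: assemble_inputs is a hypothetical helper, files are plain name strings, and the sort key is passed in by the caller to stand in for the primary sorting algorithm.

```python
from typing import Callable, List


def assemble_inputs(project_files: List[str],
                    header: List[str],
                    body: List[str],
                    footer: List[str],
                    sort_key: Callable[[str], object]) -> List[str]:
    """Build the final input order for a conversion.

    Body injections are sorted together with the project files, while
    header and footer injections keep their manifest order.
    """
    main = sorted(list(project_files) + list(body), key=sort_key)
    return list(header) + main + list(footer)


def leading_number(name: str) -> int:
    """Stand-in sort key: the number before the first underscore."""
    return int(name.split("_")[0])


print(assemble_inputs(
    ["1_intro.md", "3_end.md"],
    header=["title.md"],
    body=["2_insert.md"],
    footer=["license.md"],
    sort_key=leading_number,
))
```

In this example the body injection lands between the two project files, whereas the header and footer files are never passed through the sort key.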

Tiefseetauchner © 2025