mdfusion

A Python module for merging and exporting Markdown files. Available via pip.

#Python

When I am studying for exams, I often end up with a lot of Markdown files in a folder hierarchy. For revising the content, I want to merge these files into one single file and have them in a nicely readable format.

When looking for a solution, I did not find a tool that would crawl the directory for me, merge the files and export them to a PDF or HTML file in one go. So I wrote a small Python module using Pandoc that does exactly that. Since I feel like this is a common use case, I decided to publish it on GitHub and PyPI.

I also added a feature for exporting the merged Markdown as a PowerPoint-like presentation. To do this, I use Pandoc's native support for the reveal.js framework. The presentation can then be viewed in a web browser or exported to a PDF file, which I implemented with the Python library Playwright. I figured this would come in handy if I ever need to prepare a presentation quickly.
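
As a rough illustration of the HTML-to-PDF step, here is a minimal sketch using Playwright's synchronous API. It is not taken from mdfusion's source: the function names `print_url` and `export_pdf` are hypothetical, and it assumes the deck is a local reveal.js file, which switches to its print layout when `print-pdf` appears in the query string.

```python
from pathlib import Path


def print_url(html_file: str) -> str:
    """Return a file:// URL that opens the deck in reveal.js print mode."""
    # reveal.js applies its print layout when "print-pdf" is in the query string.
    return Path(html_file).resolve().as_uri() + "?print-pdf"


def export_pdf(html_file: str, pdf_file: str) -> None:
    """Render the presentation to PDF with headless Chromium (hypothetical helper)."""
    # Imported here so print_url() can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        # Wait until assets have finished loading before printing.
        page.goto(print_url(html_file), wait_until="networkidle")
        # Let the page's own CSS decide the page size instead of a fixed paper format.
        page.pdf(path=pdf_file, prefer_css_page_size=True, print_background=True)
        browser.close()
```

Calling `export_pdf("my-presentation.html", "my-presentation.pdf")` would need both `pip install playwright` and `playwright install` to have been run first.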

The code is available on GitHub and on PyPI.

Project Setup

Even though this is a small project, I keep it in a Git repository and use:

  • pytest for unit tests
  • Python's build module and twine for uploading the package to PyPI
  • black for formatting code whenever a file is saved in VS Code

Documentation

mdfusion

Merge all Markdown files in a directory tree into a single PDF (formatted via Pandoc + XeLaTeX) or a reveal.js HTML presentation.


Features

  • Recursively collects and sorts all .md files under a directory (natural sort order)
  • Merges them into one document, rewriting image links to absolute paths (so images with the same name in different folders don't collide)
  • Optionally adds a title page with configurable title, author, and date
  • Supports both PDF (via Pandoc + XeLaTeX) and HTML presentations (via reveal.js)
  • Customizes output with your own LaTeX or HTML headers/footers
  • Configurable via TOML for repeatable builds (great for books, reports, or slides)
  • Bundles HTML presentations with all assets for easy sharing
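
The first two features can be sketched in a few lines of Python. This is a simplified illustration, not mdfusion's actual code: `natural_key`, `absolutize_images`, and `merge` are hypothetical names, and the regular expression only handles plain inline images of the form `![alt](path)` without a title.

```python
import re
from pathlib import Path

# Matches simple inline images: ![alt](target) with no title and no spaces in target.
IMAGE_LINK = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def natural_key(path: Path) -> list:
    """Sort key that compares digit runs as numbers, so '2-intro' precedes '10-end'."""
    parts = re.split(r"(\d+)", path.as_posix().lower())
    # With a capturing group, odd positions are always the digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


def absolutize_images(text: str, md_file: Path) -> str:
    """Rewrite relative image targets so they resolve from the source file's folder."""

    def fix(match: re.Match) -> str:
        alt, target = match.group(1), match.group(2)
        # Leave URLs (scheme:...) and already-absolute paths untouched.
        if re.match(r"[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("/"):
            return match.group(0)
        absolute = (md_file.parent / target).resolve()
        return f"![{alt}]({absolute.as_posix()})"

    return IMAGE_LINK.sub(fix, text)


def merge(root: str) -> str:
    """Concatenate all .md files under root in natural order."""
    files = sorted(Path(root).rglob("*.md"), key=natural_key)
    return "\n\n".join(
        absolutize_images(f.read_text(encoding="utf-8"), f) for f in files
    )
```

Because every image target becomes an absolute path, two files that both reference `img/diagram.png` in different folders keep pointing at their own image after merging.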

Installation

Requirements

You must have the following on your PATH:

  • Pandoc
  • XeLaTeX (for PDF output)

For HTML presentations and PDF export from HTML, you may also want to install:

  • Playwright (for HTML→PDF conversion) via pip install playwright and then playwright install

Install via pip

pip install mdfusion

Install from source

git clone https://github.com/ejuet/mdfusion.git
cd mdfusion
pip install .

Usage

mdfusion ROOT_DIR [OPTIONS]

Common options

  • -o, --output FILE: Output filename (default: <root_dir>.pdf or .html for presentations)
  • --no-toc: Omit table of contents
  • --title-page: Include a title page (PDF only)
  • --title TITLE: Set title for title page (default: directory name)
  • --author AUTHOR: Set author for title page (default: OS user)
  • --pandoc-args ARGS: Extra Pandoc arguments (whitespace-separated)
  • -c, --config FILE: Path to an mdfusion.toml config file (default: mdfusion.toml in the current directory)
  • --presentation: Output as a reveal.js HTML presentation (not PDF)
  • --footer-text TEXT: Custom footer for presentations

Example: Merge docs/ into a PDF with a title page

mdfusion --title-page --title "My Book" --author "Jane Doe" docs/

Example: Create a reveal.js HTML presentation

mdfusion --presentation --title "My Talk" --author "Speaker" --footer-text "My Conference 2025" slides/

Configuration file

You can create an mdfusion.toml file in your project directory to avoid long command lines. The [mdfusion] section supports the same options as the CLI, written with underscores instead of hyphens (for example, title_page for --title-page).

Example: Normal document (PDF)

[mdfusion]
root_dir = "docs"
output = "my-book.pdf"
no_toc = false
title_page = true
title = "My Book"
author = "Jane Doe"
pandoc_args = ["--number-sections", "--slide-level", "2"]
# header_tex = "header.tex"  # Optional: custom LaTeX header

Example: Presentation (HTML via reveal.js)

[mdfusion]
root_dir = "slides"
output = "my-presentation.html"
title = "My Talk"
author = "Speaker"
presentation = true
footer_text = "My Conference 2025"
pandoc_args = ["--slide-level", "6", "--number-sections", "-V", "transition=fade", "-c", "custom.css"]
# You can add more reveal.js or pandoc options as needed with ["-V", "option=value"]

Then just run:

mdfusion

How it works

  1. Finds and sorts all Markdown files under the root directory (natural order)
  2. Merges them into one file, rewriting image links to absolute paths
  3. Optionally adds a YAML metadata block for title/author/date
  4. Calls Pandoc with XeLaTeX (for PDF) or reveal.js (for HTML presentations)
  5. Optionally bundles HTML output with all assets for easy sharing
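
Steps 3 and 4 could look roughly like the sketch below. The helper names `metadata_block` and `pandoc_command` are hypothetical and the exact arguments mdfusion passes are not shown in this document; the Pandoc flags themselves (`--standalone`, `--toc`, `--pdf-engine=xelatex`, `-t revealjs`) are standard Pandoc options.

```python
import shlex


def metadata_block(title: str, author: str, date: str) -> str:
    """YAML block that Pandoc reads for the title page (no escaping of quotes here)."""
    return f'---\ntitle: "{title}"\nauthor: "{author}"\ndate: "{date}"\n---\n\n'


def pandoc_command(merged_md, output, presentation=False, toc=True, extra_args=""):
    """Build an argument list suitable for subprocess.run(cmd, check=True)."""
    cmd = ["pandoc", merged_md, "-o", output, "--standalone"]
    if presentation:
        cmd += ["-t", "revealjs"]  # HTML slides
    else:
        cmd += ["--pdf-engine=xelatex"]  # PDF via XeLaTeX
    if toc:
        cmd.append("--toc")
    # Whitespace-separated extra arguments, honouring shell-style quoting.
    cmd += shlex.split(extra_args)
    return cmd


if __name__ == "__main__":
    print(pandoc_command("merged.md", "my-book.pdf", extra_args="--number-sections"))
```

Passing the command as a list rather than a single shell string avoids quoting problems with titles or paths that contain spaces.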

Testing

Run all tests with:

pytest

Author

ejuet
