Pdf2epub3fixed
Convert PDF to fixed-layout EPUB, conserving the table of contents, inner cross-references and hyperlinks.
PDF2epub3fixed
This python script generates a fixed-layout EPUB3 e-book from a PDF file in two variants:
- your_file_html.epub : A rich text variant, with a table of contents, clickable cross-references and hyperlinks. The text body is selectable and searchable. Vector drawings are converted to EPUB-supported SVG. Positioning of all text boxes is 95% reliable and the resulting file is readable by most EPUB readers. For fine-tuning, use an EPUB editor like Sigil.
- your_file_pageimages.epub : A variant containing high-res image renderings of all your pages, with a table of contents, clickable cross-references and hyperlinks. The only HTML elements included in the EPUB are the links. This conversion is more bullet-proof but yields a larger file, with unselectable and unsearchable text.
Further, the script produces files and folders that can help you analyse the structure of your PDF file and understand possible conversion errors:
- your_file_pageimages.json : JSON object containing the positionings of your words, images and link-boxes.
- your_file_html/ : Folder containing all XML and other resources that correspond to the pre-zipped structure of your_file_html.epub
- your_file_pageimages/ : Folder containing all XML and other resources that correspond to the pre-zipped structure of your_file_pageimages.epub
(Yes, an EPUB is nothing but a zipped collection of XMLs.)
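Because an EPUB is a ZIP archive, its contents can be inspected with Python's standard zipfile module. The following is a minimal sketch, not part of PDF2epub3fixed itself; the helper name list_epub_entries and the path in the usage comment are illustrative assumptions.

```python
import zipfile


def list_epub_entries(epub_path):
    """Return the names of all files stored inside an EPUB (ZIP) archive."""
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()


# Usage (hypothetical path, replace with an EPUB produced by the script):
# for name in list_epub_entries("output/your_file_html.epub"):
#     print(name)
```

Comparing this listing with the contents of the your_file_html/ folder is one way to check that the archive matches its pre-zipped structure.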
This script is particularly suitable for the conversion of PDFs generated with LaTeX variants (XeLaTeX, LuaLaTeX etc.) as it reproduces the "link-boxes" that LaTeX usually generates for cross-refs and hyperlinks. Rendering of complex mathematical equations, nevertheless, is reliable only in the pageimages.epub variant.
Installation
Installing Git and Conda
The Python script requires dependencies and is best run in a managed environment. I advise using Git for download and Conda for environment management. If you already have them installed, skip this section.
Both Conda and Git are available for all major platforms (Linux, Mac, Windows); see their official websites for installation instructions.
On Mac, you can also use Homebrew:
brew install git
brew install --cask miniconda
Installing PDF2epub3fixed using Git and Conda
These lines can be executed in any terminal, including the Windows Console:
git clone https://github.com/aourednik/pdf2epub3fixed.git
cd pdf2epub3fixed
# Create and activate Conda environment
conda create -y -n pdf2epub3fixed python=3.13
conda activate pdf2epub3fixed
# Install dependencies
pip3 install pymupdf
conda install pillow
# shutil and zipfile are part of the Python standard library and need no installation
conda install pyyaml
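To confirm that the environment is ready before running a conversion, a short check like the following can be run inside the activated environment. This is a sketch under assumptions: the helper missing_modules is hypothetical, and the import names ("pymupdf" for PyMuPDF, "PIL" for Pillow, "yaml" for PyYAML) are those used by recent releases of these packages.

```python
import importlib.util


def missing_modules(module_names):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumed import names: pymupdf -> "pymupdf", pillow -> "PIL", pyyaml -> "yaml".
# shutil and zipfile ship with Python itself.
required = ["pymupdf", "PIL", "yaml", "shutil", "zipfile"]
print("Missing modules:", missing_modules(required) or "none")
```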
Use
If you have not already done so, activate the conda environment and navigate to where pdf2epub3fixed.py is located:
On Mac and Linux
conda activate pdf2epub3fixed
cd path/to/pdf2epub3fixed
On Windows
conda activate pdf2epub3fixed
cd path\to\pdf2epub3fixed
Execute with a configuration file
Prepare a configuration file (see the example in config.yml) and run this:
python pdf2epub3fixed.py --yaml_config=config.yml
For all undefined arguments, PDF2epub3fixed will fall back on default values.
Execute with inline arguments
You can also directly provide the arguments in the command line:
python pdf2epub3fixed.py --pdf_path=path/to/your/pdffile.pdf
The following additional arguments can be used:
- --output_folder="path/to/your/output/folder" (by default, this is set to the output subfolder of the folder from which you execute pdf2epub3fixed.py.)
- --epub_file_name="your_epub_file_name_without_extension"
- --title="Your title"
- --author="Monica Example"
- --language="en" ("fr-FR", "de" etc.)
- --publisher="Publishing House"
- --date="yyyy-mm-dd"
- --description="Your book abstract"
- --rights="All rights reserved."
- --font_folder="path/to/your/Fonts" (This folder should contain all the fonts used in your PDF, in TTF format. As fonts are embedded in the EPUB and increase its size, make sure you only include fonts you really need.)
- --cover_image="your_cover_image.png"
- --urn="12345678-1234-1234-1234-123456789abc"
For all undefined arguments, PDF2epub3fixed will fall back on default values.
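When converting several PDFs, the inline arguments can also be assembled from Python. The sketch below only builds the argument list from the flags documented above and prints it; build_command is a hypothetical helper, not part of PDF2epub3fixed, and it assumes the script is called from the folder containing pdf2epub3fixed.py.

```python
import subprocess
import sys


def build_command(pdf_path, **options):
    """Build the argument list for pdf2epub3fixed.py from keyword options."""
    command = [sys.executable, "pdf2epub3fixed.py", f"--pdf_path={pdf_path}"]
    for name, value in options.items():
        # Each keyword becomes one --name=value argument, as in the list above.
        command.append(f"--{name}={value}")
    return command


command = build_command(
    "path/to/your/pdffile.pdf",
    title="Your title",
    author="Monica Example",
    language="en",
)
print(" ".join(command))
# To actually run the conversion, uncomment:
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with values that contain spaces, such as titles and author names.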
Example files
This repository contains an example PDF and a cover image, taken from an excerpt of my English translation of my French book Robopoïèses. This translation is currently unpublished; rights can be discussed with my French editor, laurence.gudin@editions-baconniere.ch.
I use this for code testing, as the book has crosslinks, hyperlinks, a complex layout and contains text in several writing systems, including right-to-left scripts.
