Ebook
A template project for building an eBook, using Python, Pandoc and Markdown.
ebook
Note: For the old ebook-template code, see the
v0.8.0 tag.
FYI: That code is no longer maintained.
Overview
This repository contains an opinionated tooling framework that allows you to write an eBook (in ePub, PDF, Microsoft Word, and HTML formats) from [Markdown][] input files.
Basically, you write your book as a series of Markdown files, adhering to some
file naming conventions, and you run the ebook
command (see Building your book) to build your book in
one or more of the supported formats. ebook does some magic, and then it uses
[Pandoc][] to generate your book.
In addition to a simplified convention for laying out your book, ebook
supports extras, such as:
- Enhanced Markdown capabilities like YAML metadata, fenced code blocks, smart quote conversions, enhanced lists, examples, and other features.
- Additional non-standard markup to allow you to center-, left-, or right-justify paragraphs; create a three-bullet paragraph separator easily; and other goodies.
- Bibliographic references
There are sample files in this repository, in the book subdirectory,
so you can build a (completely pointless and utterly useless) eBook right
away. You can also use those sample files as templates for starting your
own book.
This tooling has been tested with [Pandoc][] version 3.1.7.
If you're impatient, jump to Getting Started.
Warnings
This code is a work in progress. It generally does what it's supposed to do,
though I haven't finished building out a Docker version yet. (What's in the
docker folder is old, from the previous version of this code. It doesn't
work; it's only there so I can use it as a reference.)
Warnings aside, I am actively using this tooling to work on an eBook, which is driving ongoing fixes and enhancements.
Supported output formats
ebook will generate your book in the following formats:
ePub
book.epub
ePub is the format used by Apple's iBooks and various free readers, including [Calibre][].
PDF
book.pdf is a single PDF document, generated from HTML via [Weasy Print][].
Limitations:
- There's no table of contents.
HTML
book.html is a single-page HTML, styled in a pleasant format.
Limitations:
- There's no table of contents.
- There's no real notion of a "page" in HTML, so level 1 headings don't start on new pages.
Microsoft Word
book.docx is a Microsoft Word version of your book. The
custom-reference.docx file in the etc/files directory is used
to style the document. This reference document is an augmented version of
the one shipped with Pandoc. You can get the Pandoc reference document by
running:
$ pandoc -o custom-reference.docx --print-default-data-file reference.docx
The one shipped with ebook adds support for left-, right- and center-justified
paragraphs, which you can create via the
additional non-standard markup added by
ebook.
Limitations:
- There's no table of contents, but it's straightforward to create
  your own in the generated Word document. In newer versions of Microsoft
  Word (e.g., the version you get with Office 365):
    - Insert a page break to create a new, blank page.
    - Select "References" from the menu bar.
    - Select "Table of Contents", and select your desired style.
- Paragraphs don't have their first lines indented. You can manually correct this in the document by putting your cursor within a paragraph and selecting Format > Style to style all similar paragraphs.
- Level 1 headings don't start on a new page. You can fix that throughout the entire document by putting your cursor within a level 1 heading and selecting Format > Style.
- The cover image may need to be scaled manually within Word.
Unsupported formats
Kindle (MOBI)
Pandoc can't generate books in Kindle format. However, there are several options for generating Kindle content:
- Haul the Microsoft Word version into Kindle Create.
- Use the free and open source [Calibre][] suite to convert the ePub format to Kindle format.
Getting started
Using Docker
A Docker image of this tool chain, with all appropriate dependencies, is in the works. Stay tuned.
Required software
You'll need to install a few tools on your local machine.
- Install pandoc.
- Install a Python distribution, version 3.10 or better.
- I recommend creating and activating a Python virtual environment, to keep the installed version of Python 3 more or less pristine.
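Before installing, a short script can confirm the two prerequisites above. This is only a sketch and is not part of ebook: the function name `check_prerequisites` and its parameters are hypothetical. It uses only `sys.version_info` and `shutil.which` from the Python standard library.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), pandoc_name="pandoc"):
    """Return a list of human-readable problems; an empty list means all good.

    Hypothetical helper for illustration; it is not part of ebook.
    """
    problems = []
    # Compare only (major, minor) against the required minimum version.
    if sys.version_info[:2] < tuple(min_python):
        needed = ".".join(str(n) for n in min_python)
        problems.append(f"Python {needed} or better is required")
    # shutil.which returns None when the executable is not on the PATH.
    if shutil.which(pandoc_name) is None:
        problems.append(f"{pandoc_name} was not found on the PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

Running the script prints nothing when both prerequisites are satisfied.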
Installation
Once you have your Python 3 environment set up (and activated, if you're
using a virtual environment), check out this repository and run the
install.py command. It will install an executable version of ebook
in $HOME/bin, and it will install its support files in $HOME/etc/ebook.
It will also attempt to install all necessary packages (except for Pandoc) in
the activated Python environment.
Note that you'll have to tell ebook where to find its etc directory.
You can specify it on the command line, like so:
$ ebook -e $HOME/etc/ebook
You can also simply set an environment variable (preferably in your shell's startup file):
export EBOOK_ETC=$HOME/etc/ebook
You don't have to be in the repo directory to run the install.py program.
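The two ways of locating the etc directory described above (the `-e` argument or the `EBOOK_ETC` environment variable) might be resolved along these lines. This is an illustration, not ebook's actual code: the function `resolve_etc_dir` is hypothetical, and the choice to let the command-line value win over the environment variable is an assumption.

```python
import os
from pathlib import Path


def resolve_etc_dir(cli_value=None, environ=None):
    """Pick the etc directory from a command-line value or EBOOK_ETC.

    Hypothetical helper: assumes the command-line value takes precedence.
    """
    env = os.environ if environ is None else environ
    candidate = cli_value or env.get("EBOOK_ETC")
    if not candidate:
        raise ValueError("Specify the etc directory with -e or set EBOOK_ETC")
    # Expand a leading "~" so values like "~/etc/ebook" also work.
    return Path(candidate).expanduser()
```

Passing the environment as a parameter keeps the function easy to test without touching the real `os.environ`.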
Uninstalling
Simply run
$ python install.py -u
NOTE: Uninstalling does not remove the third-party Python packages that pip installed.
Windows Support
There is none.
I don't do development or writing on Windows. I don't, and won't, test this software on Windows. If you insist on trying to use this program on a Windows system, you are entirely on your own. This is a hobby project for me, and I have no desire to make my life more miserable by supporting it on Windows.
Initial configuration
Create your cover image
In your book directory, create a cover image as a PNG file. The cover image
is optional, but you really want one, especially if you're generating an
ePub. If you haven't settled on a cover image yet, you can use the sample
book/cover.png file as a placeholder.
Fill in the metadata
Use this repo's book/metadata.yaml as an example, and fill in the relevant
pieces for your book. Both Pandoc and ebook use this metadata.
Note: This file contains Pandoc YAML Metadata, with some additional fields used by this build tooling.
The following elements require your consideration:
- title (Required): The book title.
- subtitle (Optional): Subtitle, if any.
- author (Required): A YAML list of authors. If there is only one author, use a single-element YAML list. For example:

      author:
      - Joe Horrid

  With more than one author:

      author:
      - Joe Horrid
      - Frances Horrid

- copyright (Required): A block with two required fields, owner and year. See the existing sample metadata.yaml for an example. These values are substituted into the copyright.md file, if it is present.
- publisher (Required): The publisher of the book.
- language (Required): The language in which the book is written. The value can be a 2-letter ISO 639-1 code, such as "en" or "fr". It can also be a 2-part string consisting of the ISO 639-1 language code and the 2-letter ISO 3166 country code, such as "en-US", "en-GB", "fr-CA", "fr-FR", etc.
- genre (Required): The book's genre. See https://wiki.mobileread.com/wiki/Genre for a list of genres.
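To catch mistakes early, the requirements listed above could be checked against the metadata once it has been parsed into a Python dictionary. The sketch below is not ebook's own validation: `validate_metadata`, `REQUIRED_FIELDS`, and `LANGUAGE_PATTERN` are hypothetical names, and the language check only tests the shape of the value ("xx" or "xx-YY"), not whether the codes actually exist. The title, publisher, genre, and year in the sample are placeholders.

```python
import re

# Fields the list above marks as required.
REQUIRED_FIELDS = ("title", "author", "copyright", "publisher", "language", "genre")
# Shape check only: two lowercase letters, optionally "-" and two uppercase letters.
LANGUAGE_PATTERN = re.compile(r"[a-z]{2}(-[A-Z]{2})?")


def validate_metadata(metadata):
    """Return a list of problems found in an already-parsed metadata dict."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not metadata.get(field):
            errors.append(f"missing required field: {field}")
    author = metadata.get("author")
    if author and not isinstance(author, list):
        errors.append("author must be a list, even for a single author")
    copyright_block = metadata.get("copyright")
    if copyright_block:
        if not isinstance(copyright_block, dict):
            errors.append("copyright must be a block with owner and year")
        else:
            for key in ("owner", "year"):
                if key not in copyright_block:
                    errors.append(f"copyright is missing: {key}")
    language = metadata.get("language")
    if language and not LANGUAGE_PATTERN.fullmatch(str(language)):
        errors.append(f"unrecognized language value: {language}")
    return errors


if __name__ == "__main__":
    sample = {
        "title": "My Book",
        "author": ["Joe Horrid"],
        "copyright": {"owner": "Joe Horrid", "year": 2024},
        "publisher": "Example Press",
        "language": "en-US",
        "genre": "Fiction",
    }
    print(validate_metadata(sample))  # prints []
```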
Supply copyright information
Use the book/copyright.md file in this repo as an example, and fill in the
copyright information for your book. As the sample copyright.md file
demonstrates, you can use special tokens to substitute values directly out of
the metadata. You're not required to use these tokens, but they can make things
easier, since you won't have to specify the values in multiple places. The
tokens are:
- %copyright-year% is replaced with the copyright "year" value from the metadata.
- %copyright-owner% is replaced with the copyright "owner" value from the metadata.
In truth, those tokens are supported in any of your Markdown source files,
though they make the most sense in the copyright.md file. See
Substitution Patterns for more details.
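The token replacement described above amounts to a plain string substitution driven by the metadata. Here is a minimal sketch, assuming the metadata is already available as a Python dictionary. The function `substitute_copyright_tokens` is hypothetical, and ebook's real implementation may differ. Leaving a token untouched when its value is missing is also an assumption.

```python
def substitute_copyright_tokens(text, metadata):
    """Replace %copyright-year% and %copyright-owner% using the metadata.

    Hypothetical helper; tokens whose values are missing are left as-is.
    """
    copyright_block = metadata.get("copyright") or {}
    tokens = {
        "%copyright-year%": copyright_block.get("year"),
        "%copyright-owner%": copyright_block.get("owner"),
    }
    for token, value in tokens.items():
        if value is not None:
            # Metadata values may be numbers (e.g. the year), so convert to str.
            text = text.replace(token, str(value))
    return text


if __name__ == "__main__":
    metadata = {"copyright": {"owner": "Joe Horrid", "year": 2024}}
    line = "Copyright © %copyright-year% %copyright-owner%"
    print(substitute_copyright_tokens(line, metadata))
    # prints: Copyright © 2024 Joe Horrid
```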
The {<} token in the sample copyright file forces left justification, as
described in Additional markup.
Note that copyright.md is not required, but it is highly recommended.
Markup notes
Enhanced Markdown
Your book will use Markdown, as interpreted by Pandoc. The following Pandoc extensions are enabled. See the [Pandoc User's Guide][] for full details.
- line_blocks: Use vertical bars to create lines that are formatted as is. See http://pandoc.org/MANUAL.html#line-blocks for details.
- escaped_line_breaks: A backslash followed by a newline is also a hard line break. See http://pandoc.org/MANUAL.html#extension-escaped_line_breaks for details.
- yaml_metadata_block: Allows metadata in the Markdown. See http://pandoc.org/MANUAL.html#extension-yaml_metadata_block for details.
- smart: Interprets straight quotes as curly quotes, "---" as em-dashes, "--" as en-dashes, and "..." as ellipses. Nonbreaking spaces are inserted after certain abbreviations, such as "Mr."
