Manubot
Python utilities for Manubot: Manuscripts, open and automated
Manubot is a workflow and set of tools for the next generation of scholarly publishing. This repository contains a Python package with several Manubot-related utilities, as described in the usage section below. Package documentation is available at https://manubot.github.io/manubot (auto-generated from the Python source code).
The manubot cite command-line interface retrieves and formats bibliographic metadata for user-supplied persistent identifiers like DOIs or PubMed IDs.
The manubot process command-line interface prepares scholarly manuscripts for Pandoc consumption.
The manubot process command is used by Manubot manuscripts, which are based on the Rootstock template, to automate several aspects of manuscript generation.
The manubot ai-revision command is used to automatically revise a manuscript based on a set of AI-generated suggestions.
See Rootstock's manuscript usage guide for more information.
Note: If you want to experience Manubot by editing an existing manuscript, see https://github.com/manubot/try-manubot. If you want to create a new manuscript, see https://github.com/manubot/rootstock.
To cite the Manubot project or for more information on its design and history, see:
Open collaborative writing with Manubot
Daniel S. Himmelstein, Vincent Rubinetti, David R. Slochower, Dongbo Hu, Venkat S. Malladi, Casey S. Greene, Anthony Gitter
PLOS Computational Biology (2019-06-24) https://doi.org/c7np
DOI: 10.1371/journal.pcbi.1007128 · PMID: 31233491 · PMCID: PMC6611653
The Manubot version of this manuscript is available at https://greenelab.github.io/meta-review/.
Installation
If you are using the manubot Python package as part of a manuscript repository, installation of this package is handled through Rootstock's environment specification.
For other use cases, this package can be installed via pip.
Install the latest release version from PyPI:
pip install --upgrade manubot
Or install from the source code on GitHub, using the version specified by a commit hash:
COMMIT=d2160151e52750895571079a6e257beb6e0b1278
pip install --upgrade git+https://github.com/manubot/manubot@$COMMIT
The --upgrade argument ensures pip updates an existing manubot installation if present.
Some functions in this package require Pandoc,
which must be installed separately on the system.
The pandoc-manubot-cite filter depends on Pandoc as well as panflute (a Python package).
Users must install a compatible version of panflute based on their Pandoc version.
For example, on a system with Pandoc 2.9,
install a compatible panflute release with pip install panflute==1.12.5.
Usage
Installing the Python package creates the manubot command-line program.
Here is the usage information as per manubot --help:
usage: manubot [-h] [--version] {process,cite,webpage,ai-revision} ...
Manubot: the manuscript bot for scholarly writing
options:
-h, --help show this help message and exit
--version show program's version number and exit
subcommands:
All operations are done through subcommands:
{process,cite,webpage,ai-revision}
process process manuscript content
cite citekey to CSL JSON command line utility
webpage deploy Manubot outputs to a webpage directory tree
ai-revision revise manuscript content with language models
Note that all operations are done through the following sub-commands.
Process
The manubot process program is the primary interface for using Manubot.
There are two required arguments: --content-directory and --output-directory, which specify the respective paths to the content and output directories.
The content directory stores the manuscript source files.
Files generated by Manubot are saved to the output directory.
One common setup is to create a directory for a manuscript that contains both the content and output directories.
Under this setup, you can run Manubot using:
manubot process \
--skip-citations \
--content-directory=content \
--output-directory=output
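The layout assumed by this command could be created as follows. This is a minimal sketch: the directory name my-manuscript is hypothetical, and a scratch location is used so that no real files are touched.

```shell
# Scratch parent directory so the sketch does not touch real files.
workdir="$(mktemp -d)"

# Hypothetical manuscript directory holding both content and output.
manuscript_dir="$workdir/my-manuscript"
mkdir -p "$manuscript_dir/content" "$manuscript_dir/output"

# List the two subdirectories that the relative paths in the command refer to.
ls "$manuscript_dir"
```

Once content holds the manuscript source files, running the manubot process command from inside the manuscript directory resolves content and output relative to it.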
See manubot process --help for documentation of all command line arguments:
usage: manubot process [-h] --content-directory CONTENT_DIRECTORY
--output-directory OUTPUT_DIRECTORY
[--template-variables-path TEMPLATE_VARIABLES_PATH]
--skip-citations [--cache-directory CACHE_DIRECTORY]
[--clear-requests-cache] [--skip-remote]
[--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
Process manuscript content to create outputs for Pandoc consumption. Performs
bibliographic processing and templating.
options:
-h, --help show this help message and exit
--content-directory CONTENT_DIRECTORY
Directory where manuscript content files are located.
--output-directory OUTPUT_DIRECTORY
Directory to output files generated by this script.
--template-variables-path TEMPLATE_VARIABLES_PATH
Path or URL of a file containing template variables
for jinja2. Serialization format is inferred from the
file extension, with support for JSON, YAML, and TOML.
If the format cannot be detected, the parser assumes
JSON. Specify this argument multiple times to read
multiple files. Variables can be applied to a
namespace (i.e. stored under a dictionary key) like
`--template-variables-path=namespace=path_or_url`.
Namespaces must match the regex
`[a-zA-Z_][a-zA-Z0-9_]*`.
--skip-citations Skip citation and reference processing. Support for
citation and reference processing has been moved from
`manubot process` to the pandoc-manubot-cite filter.
Therefore this argument is now required. If citation-
tags.tsv is found in content, these tags will be
inserted in the markdown output using the reference-
link syntax for citekey aliases. Appends
content/manual-references*.* paths to Pandoc's
metadata.bibliography field.
--cache-directory CACHE_DIRECTORY
Custom cache directory. If not specified, caches to
output-directory.
--clear-requests-cache
--skip-remote Do not add the rootstock repository to the local git
repository remotes.
--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set the logging level for stderr logging
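The namespace rule quoted in the help text can be checked before invoking Manubot. The following is a minimal sketch, not part of Manubot: the helper is_valid_namespace, the namespace analysis, and the file name variables.json are all hypothetical. Only the regex and the namespace=path_or_url form come from the help text.

```shell
# Hypothetical helper: succeed only if the whole name matches the namespace
# regex from the help text, [a-zA-Z_][a-zA-Z0-9_]*.
is_valid_namespace() {
  printf '%s\n' "$1" | grep -Eq '^[a-zA-Z_][a-zA-Z0-9_]*$'
}

namespace="analysis"      # hypothetical namespace
path="variables.json"     # hypothetical file of template variables

# Print the argument in the documented namespace=path_or_url form.
if is_valid_namespace "$namespace"; then
  echo "--template-variables-path=$namespace=$path"
else
  echo "invalid namespace: $namespace" >&2
fi
```

Names such as 2fast (leading digit) or my-ns (hyphen) are rejected by this check, since the regex allows only letters, digits, and underscores, and does not allow a digit in the first position.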
Manual references
Manubot has the ability to rely on user-provided reference metadata rather than generating it.
manubot process searches the content directory for files containing manually-provided reference metadata that match the glob manual-references*.*.
The paths to these files are stored in the Pandoc metadata bibliography field, such that they can be loaded by pandoc-manubot-cite.
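To see which file names the glob manual-references*.* selects, the pattern can be tried in a shell. This sketch uses shell globbing as a stand-in for Manubot's own matching, and all file names in it are hypothetical.

```shell
# Scratch content directory populated with hypothetical file names.
content_dir="$(mktemp -d)"
touch "$content_dir/manual-references.json" \
      "$content_dir/manual-references-lab.json" \
      "$content_dir/metadata.yaml"

# Expand the glob from the text and count the files it selects.
matches=0
for path in "$content_dir"/manual-references*.*; do
  [ -e "$path" ] || continue    # skip the literal pattern when nothing matches
  echo "matched: ${path##*/}"
  matches=$((matches + 1))
done
echo "$matches file(s) matched"
```

Here the two manual-references files match, while metadata.yaml does not.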
Cite
manubot cite is a command line utility to produce bibliographic metadata for citation keys.
The utility either outputs metadata as CSL JSON items or produces formatted references when --format selects a rendered format such as markdown.
Citation keys should be in the format prefix:accession.
For example, the following command generates Markdown-formatted references for four persistent identifiers:
manubot cite --format=markdown \
doi:10.1098/rsif.2017.0387 pubmed:29424689 pmc:PMC5640425 arxiv:1806.05726
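Each argument in this command follows the prefix:accession format. The sketch below splits such keys at the first colon to make the two parts visible. The function split_citekey is hypothetical and only illustrates the format; it is not how Manubot itself parses citation keys.

```shell
# Hypothetical helper: split a citation key of the form prefix:accession
# at the first colon and print the two parts separated by a tab.
split_citekey() {
  citekey="$1"
  case "$citekey" in
    *:*) ;;
    *) echo "not in prefix:accession form: $citekey" >&2; return 1 ;;
  esac
  prefix="${citekey%%:*}"      # text before the first colon
  accession="${citekey#*:}"    # text after the first colon
  printf '%s\t%s\n' "$prefix" "$accession"
}

# The four citation keys from the command above.
for key in doi:10.1098/rsif.2017.0387 pubmed:29424689 pmc:PMC5640425 arxiv:1806.05726; do
  split_citekey "$key"
done
```

Splitting at the first colon rather than the last keeps accessions intact even if they contain a colon themselves.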
The following terminal recording demonstrates the main features of manubot cite (for a slightly outdated version):

Additional usage information is available from manubot cite --help:
usage: manubot cite [-h] [--output OUTPUT]
[--format {csljson,cslyaml,plain,markdown,docx,html,jats} | --yml | --txt | --md]
[--csl CSL] [--bibliography BIBLIOGRAPHY]
