Biopython
Official git repository for Biopython (originally converted from CVS)
.. image:: https://img.shields.io/pypi/v/biopython.svg?logo=pypi
   :alt: Biopython on the Python Package Index (PyPI)
   :target: https://pypi.python.org/pypi/biopython

.. image:: https://img.shields.io/conda/vn/conda-forge/biopython.svg?logo=conda-forge
   :alt: Biopython on the Conda package conda-forge channel
   :target: https://anaconda.org/conda-forge/biopython

.. image:: https://results.pre-commit.ci/badge/github/biopython/biopython/master.svg
   :target: https://results.pre-commit.ci/latest/github/biopython/biopython/master
   :alt: pre-commit.ci status

.. image:: https://img.shields.io/circleci/build/github/biopython/biopython.svg?logo=circleci
   :alt: Linux testing with CircleCI
   :target: https://app.circleci.com/pipelines/github/biopython/biopython

.. image:: https://img.shields.io/appveyor/ci/biopython/biopython/master.svg?logo=appveyor
   :alt: Windows testing with AppVeyor
   :target: https://ci.appveyor.com/project/biopython/biopython/history

.. image:: https://img.shields.io/github/actions/workflow/status/biopython/biopython/ci.yml?logo=github-actions
   :alt: GitHub workflow status
   :target: https://github.com/biopython/biopython/actions

.. image:: https://img.shields.io/codecov/c/github/biopython/biopython/master.svg?logo=codecov
   :alt: Test coverage on CodeCov
   :target: https://codecov.io/github/biopython/biopython/

.. image:: https://depsy.org/api/package/pypi/biopython/badge.svg
   :alt: Research software impact on Depsy
   :target: https://depsy.org/package/python/biopython

.. image:: https://github.com/biopython/biopython/raw/master/Doc/images/biopython_logo_m.png
   :alt: The Biopython Project
   :target: https://biopython.org
Biopython README file
The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology.
This README file is intended primarily for people interested in working with the Biopython source code, either one of the releases from the https://biopython.org website, or from our repository on GitHub https://github.com/biopython/biopython
Our user-centric documentation, `The Biopython Tutorial and Cookbook, and API
documentation <https://biopython.org/docs/latest/>`_, is generated from our
repository using Sphinx.
The `NEWS <https://github.com/biopython/biopython/blob/master/NEWS.rst>`_
file summarises the changes in each release of Biopython, alongside the
`DEPRECATED <https://github.com/biopython/biopython/blob/master/DEPRECATED.rst>`_
file which notes API breakages.
The Biopython package is open source software made available under generous
terms. Please see the `LICENSE
<https://github.com/biopython/biopython/blob/master/LICENSE.rst>`_ file for
further details.
If you use Biopython in work contributing to a scientific publication, we ask that you cite our application note (below) or one of the module-specific publications (listed on our website):
Cock, P.J.A. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 2009 Jun 1; 25(11) 1422-3 https://doi.org/10.1093/bioinformatics/btp163 pmid:19304878
For the impatient
Python includes the package management system "pip" which should allow you to install Biopython (and its dependency NumPy if needed), upgrade or uninstall with just one terminal command::

    pip install biopython
    pip install --upgrade biopython
    pip uninstall biopython

Since Biopython 1.70 we have provided pre-compiled binary wheel packages on PyPI for Linux, macOS and Windows. This means pip install should be quick, and not require a compiler.
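To confirm which release pip actually installed, the package metadata can be queried from Python. This is a minimal sketch using only the standard library; the helper name ``installed_version`` is made up for illustration and is not part of Biopython.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="biopython"):
    """Return the installed version string for a distribution, or None.

    ``installed_version`` is a hypothetical helper for this README, not a
    Biopython function. It reads pip's metadata without importing the package.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("Biopython is not installed in this environment")
    else:
        print(f"Biopython {found} is installed")
```

If Biopython is installed, ``import Bio`` followed by ``print(Bio.__version__)`` should report the same version.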
As a developer or potential contributor, you may wish to download, build and install Biopython yourself. This is described below.
Python Requirements
We currently recommend using Python 3.13 from https://www.python.org
Biopython is currently supported and tested on the following Python implementations:
- Python 3.10, 3.11, 3.12, 3.13 and 3.14 -- see https://www.python.org

- PyPy3.10 v7.3.17 or later -- see https://www.pypy.org
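A script that depends on Biopython can check the interpreter version up front. The sketch below assumes that Python 3.10, the oldest release listed above, is the minimum; ``MINIMUM_PYTHON`` and ``is_supported`` are illustrative names, not part of Biopython.

```python
import sys

# Assumption for this sketch: the oldest Python release listed above.
MINIMUM_PYTHON = (3, 10)


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the assumed minimum."""
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    if not is_supported():
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in MINIMUM_PYTHON)
        sys.exit(f"Python {wanted} or later is needed, found {found}")
```

This only checks the language version. It does not distinguish CPython from PyPy, and it does not check the PyPy release number.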
Optional Dependencies
Biopython requires NumPy (see https://www.numpy.org) which will be installed automatically if you install Biopython with pip (see below for compiling Biopython yourself).
Depending on which parts of Biopython you plan to use, there are a number of other optional Python dependencies, which can be installed later if needed:
- ReportLab, see https://www.reportlab.com/opensource/ (optional)
  This package is only used in ``Bio.Graphics``, so if you do not need this
  functionality, you will not need to install this package.

- matplotlib, see https://matplotlib.org/ (optional)
  ``Bio.Phylo`` uses this package to plot phylogenetic trees.

- networkx, see https://networkx.github.io/ (optional) and
  pygraphviz or pydot, see https://pygraphviz.github.io/ and
  https://code.google.com/p/pydot/ (optional)
  These packages are used for certain niche functions in ``Bio.Phylo``.

- rdflib, see https://github.com/RDFLib/rdflib (optional)
  This package is used in the CDAO parser under ``Bio.Phylo``.

- psycopg2, see https://initd.org/psycopg/ (optional) or
  PyGreSQL (pgdb), see https://www.pygresql.org/ (optional)
  These packages are used by ``BioSQL`` to access a PostgreSQL database.

- MySQL Connector/Python, see https://dev.mysql.com/downloads/connector/python/
  This package is used by ``BioSQL`` to access a MySQL database, and is
  supported on PyPy too.

- mysqlclient, see https://github.com/PyMySQL/mysqlclient-python (optional)
  This is a fork of the older MySQLdb and is used by ``BioSQL`` to access a
  MySQL database. It is supported by PyPy.

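To see which of the optional packages above are already available in an environment, something like the following sketch may help. It only looks the modules up and does not import the packages themselves. The import names in the mapping are the ones these projects are commonly imported under, and the helper names are illustrative rather than part of Biopython.

```python
from importlib.util import find_spec

# Display name -> import name. Several differ from the PyPI project name.
OPTIONAL_MODULES = {
    "ReportLab": "reportlab",
    "matplotlib": "matplotlib",
    "networkx": "networkx",
    "pygraphviz": "pygraphviz",
    "pydot": "pydot",
    "rdflib": "rdflib",
    "psycopg2": "psycopg2",
    "PyGreSQL": "pgdb",
    "MySQL Connector/Python": "mysql.connector",
    "mysqlclient": "MySQLdb",
}


def is_importable(module_name):
    """Return True if the module can be located, without fully importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "mysql") raises ModuleNotFoundError.
        return False


def optional_dependency_report():
    """Map each optional dependency's display name to True or False."""
    return {name: is_importable(mod) for name, mod in OPTIONAL_MODULES.items()}


if __name__ == "__main__":
    for name, available in optional_dependency_report().items():
        print(f"{name:24} {'found' if available else 'missing'}")
```

A missing entry matters only if you plan to use the corresponding part of Biopython.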
In addition there are a number of useful third party tools you may wish to install such as standalone NCBI BLAST, EMBOSS or ClustalW.
Installation From Source
We recommend using the pre-compiled binary wheels available on PyPI using::

    pip install biopython

However, if you need to compile Biopython yourself, the following are required at compile time:
- Python including development header files like ``python.h``, which on Linux
  are often not installed by default (try looking for and installing a package
  named ``python-dev`` or ``python-devel`` as well as the ``python`` package).

- Appropriate C compiler for your version of Python, for example GCC on Linux,
  or MSVC on Windows. For Windows, you must install the 'Visual Studio Build
  Tools' and select the 'Desktop development with C++' workload. For macOS,
  use Apple's command line tools, which can be installed with the terminal
  command::

      xcode-select --install

  This will offer to install Apple's XCode development suite - you can, but it
  is not needed and takes a lot of disk space.

Then either download and decompress our source code, or fetch it using git. Now change directory to the Biopython source code folder and run::

    pip install -e . --group dev
    cd Tests
    python run_tests.py

Substitute ``python`` with your specific version if required, for example
``python3``, or ``pypy3``.

To exclude tests that require an internet connection (and which may take a
long time), use the ``--offline`` option::

    cd Tests
    python run_tests.py --offline

Testing
Biopython includes a suite of regression tests to check if everything is running correctly. To run the tests, go to the biopython source code directory and type::

    pip install -e . --group dev
    cd Tests
    python run_tests.py

If you want to skip the online tests (which is recommended when doing repeated testing), use::

    cd Tests
    python run_tests.py --offline

Do not panic if you see messages warning of skipped tests::

    test_DocSQL ... skipping. Install MySQLdb if you want to use Bio.DocSQL.

This most likely means that a package is not installed. You can ignore this if it occurs in the tests for a module that you were not planning on using. If you did want to use that module, please install the required dependency and re-run the tests.
Some of the tests may fail due to network issues; this is often down to
chance or a service outage. If the problem does not go away on
re-running the tests, you can use the ``--offline`` option.
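For repeated test runs, for example from a helper script, the commands above can be wrapped in Python. This is only a sketch: ``build_test_command`` and ``run_tests`` are hypothetical helpers, and the only things taken from this README are the ``Tests`` folder, ``run_tests.py`` and the ``--offline`` option.

```python
import subprocess
import sys
from pathlib import Path


def build_test_command(offline=True):
    """Build the argument list for running the test suite.

    sys.executable is used so the tests run under the same interpreter
    (python3, pypy3, ...) that runs this script.
    """
    command = [sys.executable, "run_tests.py"]
    if offline:
        command.append("--offline")
    return command


def run_tests(source_dir, offline=True):
    """Run the suite from the Tests folder and return the exit code."""
    tests_dir = Path(source_dir) / "Tests"
    completed = subprocess.run(build_test_command(offline), cwd=tests_dir)
    return completed.returncode


if __name__ == "__main__":
    # Assumes the current directory is the Biopython source code folder.
    sys.exit(run_tests(Path.cwd(), offline=True))
```

Defaulting to offline mode keeps repeated runs quick. Pass ``offline=False`` to include the tests that need an internet connection.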
There is more testing information in the Biopython Tutorial & Cookbook.
Experimental code
Biopython 1.61 introduced a new warning, Bio.BiopythonExperimentalWarning,
which is used to mark any experimental code included in the otherwise
stable Biopython releases. Such 'beta' level code is ready for wider
testing, but still likely to change, and should only be tried by early
adopters in order to give feedback via the biopython-dev mailing list.
We'd expect such experimental code to reach stable status within one or two releases, at which point our normal policies about trying to preserve backwards compatibility would apply.
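Because the warning is an ordinary Python warning category, the standard ``warnings`` module can silence it around a specific call once you have decided to rely on an experimental module. The sketch below is one way to do that. ``call_quietly`` is an illustrative helper, and the fallback class exists only so the example runs where Biopython is not installed.

```python
import warnings

try:
    from Bio import BiopythonExperimentalWarning
except ImportError:
    # Stand-in for illustration only, used when Biopython is not installed.
    class BiopythonExperimentalWarning(Warning):
        pass


def call_quietly(func, *args, **kwargs):
    """Call func while ignoring only BiopythonExperimentalWarning.

    Other warning categories pass through unchanged, and the previous
    warning filters are restored when the call returns.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", BiopythonExperimentalWarning)
        return func(*args, **kwargs)
```

Replacing ``"ignore"`` with ``"error"`` would instead turn any use of experimental code into an exception, which can be useful in a test suite.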
Bugs
While we try to ship a robust package, bugs inevitably pop up. If you are having problems that might be caused by a bug in Biopython, it is possible that it has already been identified. Update to the latest release if you are not using it already, and retry. If the problem persists, please search our bug database and our mailing lists to see if it has already been reported (and hopefully fixed), and if not please do report the bug. We can't fix problems we don't know about ;)
Issue tracker: https://github.com/biopython/biopython/issues
If you suspect the problem lies within a parser, it is likely that the data
format has changed and broken the parsing code. (The text BLAST and GenBank
formats seem to be particularly fragile.) Thus, the parsing code in
Biopython is sometimes updated faster than we can build Biopython releases.
You can get the most recent parser by pulling the relevant files (e.g. the
ones in Bio.SeqIO or Bio.Blast) from our git repository. However, be
careful when doing th
