Uproot3
ROOT I/O in pure Python and NumPy.
.. image:: docs/source/logo-300px.png
   :alt: uproot
   :target: http://uproot.readthedocs.io/en/latest/

This is a deprecated version of Uproot
See `scikit-hep/uproot4 <https://github.com/scikit-hep/uproot4>`__ for the latest version of Uproot. Old and new versions are available as separate packages,

.. code-block:: bash

    pip install uproot3    # old
    pip install uproot     # new

because the interface has changed.
You can adopt the new library gradually by importing both in Python and falling back to the old version as a contingency (for example, when you hit a missing feature or a bug in the new version). Note that Uproot 3 returns old-style `Awkward 0 <https://github.com/scikit-hep/awkward-0.x#readme>`__ arrays and Uproot 4 returns new-style `Awkward 1 <https://github.com/scikit-hep/awkward-1.0#readme>`__ arrays. (The new version of Uproot was motivated by the new version of Awkward, to make a clear distinction.)
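A side-by-side setup might look like the following minimal sketch. The helper ``import_if_available`` and the variable names are hypothetical, introduced only for illustration; the sketch relies on nothing beyond the two package names given above.

```python
import importlib


def import_if_available(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Both packages can be installed side by side because their names differ.
uproot_new = import_if_available("uproot")    # new interface
uproot_old = import_if_available("uproot3")   # old interface (this package)

# Prefer the new library; keep the old one at hand as a contingency.
default = uproot_new if uproot_new is not None else uproot_old
print("default reader:", getattr(default, "__name__", "none installed"))
```

Code that still depends on Uproot 3 behavior can keep calling ``uproot_old`` explicitly while the rest of an analysis moves to ``uproot_new``.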
uproot
.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.1173083.svg
   :target: https://doi.org/10.5281/zenodo.1173083

.. inclusion-marker-1-do-not-remove
ROOT I/O in pure Python and Numpy.
.. inclusion-marker-1-5-do-not-remove
uproot (originally μproot, for "micro-Python ROOT") is a reader and a writer of the `ROOT file format <https://root.cern/>`__ using only Python and Numpy. Unlike the standard C++ ROOT implementation, uproot is only an I/O library, primarily intended to stream data into machine learning libraries in Python. Unlike PyROOT and root_numpy, uproot does not depend on C++ ROOT. Instead, it uses Numpy to cast blocks of data from the ROOT file as Numpy arrays.
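The idea of casting a block of bytes as an array can be illustrated with plain Numpy. This is a minimal sketch, not uproot's actual code: the byte string below is a made-up stand-in for an uncompressed block of data, and it assumes big-endian 32-bit floats, which is how ROOT serializes floating-point numbers.

```python
import struct

import numpy as np

# Made-up stand-in for a block of data: four big-endian float32 values.
raw = struct.pack(">4f", 1.0, 2.5, -3.0, 4.0)

# Reinterpret the bytes as an array without looping over values in Python.
values = np.frombuffer(raw, dtype=">f4")

# Optionally convert to the machine's native float32 for downstream libraries.
native = values.astype(np.float32)

print(values.tolist())
print(native.dtype)
```

The cast itself touches no individual values in Python, which is why larger blocks amortize the interpreter overhead.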
Python does not necessarily mean slow. As long as the data blocks ("baskets") are large, this "array at a time" approach can even be faster than "event at a time" C++. Below, the rate of reading data into arrays with uproot is shown to be faster than C++ ROOT (left) and root_numpy (right), as long as the baskets are tens of kilobytes or larger (for a variable number of muons per event in an ensemble of different physics samples; higher is better).
.. inclusion-marker-replaceplots-start
.. raw:: html

    <table border="0"><tr><td><img src="https://raw.githubusercontent.com/scikit-hep/uproot3/master/docs/root-none-muon.png" width="100%"></td><td><img src="https://raw.githubusercontent.com/scikit-hep/uproot3/master/docs/rootnumpy-none-muon.png" width="100%"></td></tr></table>

.. inclusion-marker-replaceplots-stop
uproot is not maintained by the ROOT project team, so post bug reports here as `GitHub issues <https://github.com/scikit-hep/uproot3/issues>`__, not on a ROOT forum. Thanks!
.. inclusion-marker-2-do-not-remove
Installation
Install uproot like any other Python package:
.. code-block:: bash

    pip install uproot3    # maybe with sudo or --user, or in virtualenv

The pip installer automatically installs strict dependencies; the conda installer also installs optional dependencies (except for Pandas).
Strict dependencies:
- `numpy <https://scipy.org/install.html>`__ (1.13.1+)
- `Awkward Array 0.x <https://github.com/scikit-hep/awkward-0.x>`__
- `uproot3-methods <https://github.com/scikit-hep/uproot3-methods>`__
- `cachetools <https://pypi.org/project/cachetools>`__

Optional dependencies:
- `lz4 <https://pypi.org/project/lz4>`__ to read/write lz4-compressed ROOT files
- `xxhash <https://pypi.org/project/xxhash/>`__ to read/write lz4-compressed ROOT files
- `lzma <https://pypi.org/project/backports.lzma>`__ to read/write lzma-compressed ROOT files in Python 2
- `xrootd <https://anaconda.org/conda-forge/xrootd>`__ to access remote files through XRootD
- `requests <https://pypi.org/project/requests>`__ to access remote files through HTTP
- `pandas <https://pandas.pydata.org>`__ to fill Pandas DataFrames instead of Numpy arrays

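To see which optional dependencies are present in an environment, a check along these lines may help. It is a sketch only: ``missing_optional`` is a hypothetical helper, and the import names in the table are assumptions (in particular ``XRootD`` for the XRootD Python bindings, and the standard-library ``lzma`` module that replaces the backport on Python 3).

```python
import importlib.util

# Assumed import names for the optional dependencies, with what each enables.
OPTIONAL = {
    "lz4": "lz4-compressed ROOT files",
    "xxhash": "lz4-compressed ROOT files",
    "lzma": "lzma-compressed ROOT files",
    "XRootD": "remote files through XRootD",
    "requests": "remote files through HTTP",
    "pandas": "filling Pandas DataFrames",
}


def missing_optional(modules=OPTIONAL):
    """Return the sorted names of top-level modules that cannot be found."""
    return sorted(
        name for name in modules if importlib.util.find_spec(name) is None
    )


for name in missing_optional():
    print(f"{name} not found: needed for {OPTIONAL[name]}")
```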
Reminder: you do not need C++ ROOT to run uproot.
.. inclusion-marker-3-do-not-remove
Questions
If you have a question about how to use uproot that is not answered in the document below, I recommend asking your question on `StackOverflow <https://stackoverflow.com/questions/tagged/uproot>`__ with the ``[uproot]`` tag. (I get notified of questions with this tag.) Note that this tag is primarily intended for the new version of Uproot, so if you're using this version (Uproot 3.x), be sure to mention that.
.. raw:: html

    <p align="center"><a href="https://stackoverflow.com/questions/tagged/uproot"><img src="https://cdn.sstatic.net/Sites/stackoverflow/company/img/logos/so/so-logo.png" width="30%"></a></p>

If you believe you have found a bug in uproot, post it on the `GitHub issues tab <https://github.com/scikit-hep/uproot3/issues>`__.
Tutorial
Tutorial contents:
- `Introduction <#introduction>`__
- `What is uproot? <#what-is-uproot>`__
- `Exploring a file <#exploring-a-file>`__

  - `Compressed objects in ROOT files <#compressed-objects-in-root-files>`__
  - `Exploring a TTree <#exploring-a-ttree>`__
  - `Some terminology <#some-terminology>`__

- `Reading arrays from a TTree <#reading-arrays-from-a-ttree>`__
- `Caching data <#caching-data>`__

  - `Automatically managed caches <#automatically-managed-caches>`__
  - `Caching at all levels of abstraction <#caching-at-all-levels-of-abstraction>`__

- `Lazy arrays <#lazy-arrays>`__

  - `Lazy array of many files <#lazy-array-of-many-files>`__
  - `Lazy arrays with caching <#lazy-arrays-with-caching>`__
  - `Lazy arrays as lightweight skims <#lazy-arrays-as-lightweight-skims>`__
  - `Lazy arrays in Dask <#lazy-arrays-in-dask>`__

- `Iteration <#iteration>`__

  - `Filenames and entry numbers while iterating <#filenames-and-entry-numbers-while-iterating>`__
  - `Limiting the number of entries to be read <#limiting-the-number-of-entries-to-be-read>`__
  - `Controlling lazy chunk and iteration step sizes <#controlling-lazy-chunk-and-iteration-step-sizes>`__
  - `Caching and iteration <#caching-and-iteration>`__

- `Changing the output container type <#changing-the-output-container-type>`__
- `Filling Pandas DataFrames <#filling-pandas-dataframes>`__
- `Selecting and interpreting branches <#selecting-and-interpreting-branches>`__

  - `TBranch interpretations <#tbranch-interpretations>`__
  - `Reading data into a preexisting array <#reading-data-into-a-preexisting-array>`__
  - `Passing many new interpretations in one call <#passing-many-new-interpretations-in-one-call>`__
  - `Multiple values per event: fixed size arrays <#multiple-values-per-event-fixed-size-arrays>`__
  - `Multiple values per event: leaf-lists <#multiple-values-per-event-leaf-lists>`__
  - `Multiple values per event: jagged arrays <#multiple-values-per-event-jagged-arrays>`__
  - `Jagged array performance <#jagged-array-performance>`__
  - `Special physics objects: Lorentz vectors <#special-physics-objects-lorentz-vectors>`__
  - `Variable-width values: strings <#variable-width-values-strings>`__
  - `Arbitrary objects in TTrees <#arbitrary-objects-in-ttrees>`__
  - `Doubly nested jagged arrays (i.e. std::vector<std::vector<T>>) <#doubly-nested-jagged-arrays-ie-stdvectorstdvectort>`__

- `Parallel array reading <#parallel-array-reading>`__
- `Histograms, TProfiles, TGraphs, and others <#histograms-tprofiles-tgraphs-and-others>`__
- `Creating and writing data to ROOT files <#creating-and-writing-data-to-root-files>`__

  - `Writing histograms <#writing-histograms>`__
  - `Writing TTrees <#writing-ttrees>`__

Introduction
This tutorial is designed to help you start using uproot.
The `original tutorial has been archived <https://github.com/scikit-hep/uproot/blob/master/docs/old-tutorial.rst>`__;
this version was written in June 2019 in response to feedback from a
series of tutorials I presented earlier that year and common questions
in the `GitHub issues <https://github.com/scikit-hep/uproot3/issues>`__.
The new tutorial is `executable on Binder <https://mybinder.org/v2/gh/scikit-hep/uproot3/master?urlpath=lab/tree/binder%2Ftutorial.ipynb>`__
and may be read in any order, though it has to be executed from top to
bottom because some variables are reused.
What is uproot?
Uproot is a Python package; it is pip and conda-installable, and it only
depends on other Python packages. Although it is similar in function to
`root_numpy <https://pypi.org/project/root-numpy/>`__ and
`root_pandas <https://pypi.org/project/root_pandas/>`__, it is not
compiled against ROOT and therefore avoids issues in which the version
used in compilation differs from the version encountered at runtime.
In short, you should never see a segmentation fault.
.. raw:: html

    <p align="center"><img src="https://raw.githubusercontent.com/scikit-hep/uproot3/master/docs/abstraction-layers.png" width="75%"></p>

Uproot is strictly concerned with file I/O only; all other functionality is handled by other libraries:

- `uproot3-methods <https://github.com/scikit-hep/uproot3-methods>`__: physics methods for types read from ROOT files, such as histograms and Lorentz vectors. It is intended to be largely user-contributed (and is).
- `awkward-array <https://github.com/scikit-hep/awkward-0.x>`__: array manipulation beyond `Numpy <https://docs.scipy.org/doc/numpy/reference/>`__. Several of its array types are encountered in this tutorial, particularly lazy arrays and jagged arrays.

In the past year, uproot has become one of the most widely used Python packages made for particle physics, with users in all four LHC experiments, theory, neutrino experiments, XENON-nT (dark matter direct detection), MAGIC (gamma ray astronomy), and IceCube (neutrino astronomy).
.. raw:: html

    <p align="center"><img src="https://raw.githubusercontent.com/scikit-hep/uproot3/master/docs/all_file_project.png" width="75%"></p>

Exploring a file
uproot3.open is the entry point for reading a single file.
It takes a local filename path or a remote http://
