Pyexcel
Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
Install / Use
/learn @pyexcel/PyexcelREADME
================================================================================ pyexcel - Let you focus on data, instead of file formats
.. image:: https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png :target: https://www.patreon.com/chfw
.. image:: https://codecov.io/gh/pyexcel/pyexcel/branch/master/graph/badge.svg :target: https://codecov.io/gh/pyexcel/pyexcel
.. image:: https://badge.fury.io/py/pyexcel.svg :target: https://pypi.org/project/pyexcel
.. image:: https://anaconda.org/conda-forge/pyexcel/badges/version.svg :target: https://anaconda.org/conda-forge/pyexcel
.. image:: https://pepy.tech/badge/pyexcel/month :target: https://pepy.tech/project/pyexcel
.. image:: https://anaconda.org/conda-forge/pyexcel/badges/downloads.svg :target: https://anaconda.org/conda-forge/pyexcel
.. image:: https://img.shields.io/static/v1?label=continuous%20templating&message=%E6%A8%A1%E7%89%88%E6%9B%B4%E6%96%B0&color=blue&style=flat-square :target: https://moban.readthedocs.io/en/latest/#at-scale-continous-templating-for-open-source-projects
.. image:: https://img.shields.io/static/v1?label=coding%20style&message=black&color=black&style=flat-square :target: https://github.com/psf/black .. image:: https://readthedocs.org/projects/pyexcel/badge/?version=latest :target: http://pyexcel.readthedocs.org/en/latest/
Support the project
If your company uses pyexcel and its components in a revenue-generating product,
please consider supporting the project on GitHub or
Patreon <https://www.patreon.com/bePatron?u=5537627>_. Your financial
support will enable me to dedicate more time to coding, improving documentation,
and creating engaging content.
Known constraints
Fonts, colors and charts are not supported.
Nor to read password protected xls, xlsx and ods files.
Introduction
Feature Highlights
.. table:: A list of supported file formats
============ =======================================================
file format definition
============ =======================================================
csv comma separated values
tsv tab separated values
csvz a zip file that contains one or many csv files
tsvz a zip file that contains one or many tsv files
xls a spreadsheet file format created by
MS-Excel 97-2003
xlsx MS-Excel Extensions to the Office Open XML
SpreadsheetML File Format.
xlsm an MS-Excel Macro-Enabled Workbook file
ods open document spreadsheet
fods flat open document spreadsheet
json java script object notation
html html table of the data structure
simple simple presentation
rst rStructured Text presentation of the data
mediawiki media wiki table
============ =======================================================
.. image:: https://github.com/pyexcel/pyexcel/raw/dev/docs/source/_static/images/architecture.svg
-
One application programming interface(API) to handle multiple data sources:
- physical file
- memory file
- SQLAlchemy table
- Django Model
- Python data structures: dictionary, records and array
-
One API to read and write data in various excel file formats.
-
For large data sets, data streaming are supported. A genenerator can be returned to you. Checkout iget_records, iget_array, isave_as and isave_book_as.
Installation
You can install pyexcel via pip:
.. code-block:: bash
$ pip install pyexcel
or clone it and install it:
.. code-block:: bash
$ git clone https://github.com/pyexcel/pyexcel.git
$ cd pyexcel
$ python setup.py install
One liners
This section shows you how to get data from your excel files and how to export data to excel files in one line
Read from the excel files
Get a list of dictionaries
Suppose you want to process History of Classical Music <https://www.naxos.com/education/brief_history.asp>_:
History of Classical Music:
=============== ============= ==================================== Name Period Representative Composers Medieval c.1150-c.1400 Machaut, Landini Renaissance c.1400-c.1600 Gibbons, Frescobaldi Baroque c.1600-c.1750 JS Bach, Vivaldi Classical c.1750-c.1830 Joseph Haydn, Wolfgan Amadeus Mozart Early Romantic c.1830-c.1860 Chopin, Mendelssohn, Schumann, Liszt Late Romantic c.1860-c.1920 Wagner,Verdi Modernist 20th century Sergei Rachmaninoff,Calude Debussy =============== ============= ====================================
Let's get a list of dictionary out from the xls file:
.. code-block:: python
records = p.get_records(file_name="your_file.xls")
And let's check what do we have:
.. code-block:: python
for row in records: ... print(f"{row['Representative Composers']} are from {row['Name']} period ({row['Period']})") Machaut, Landini are from Medieval period (c.1150-c.1400) Gibbons, Frescobaldi are from Renaissance period (c.1400-c.1600) JS Bach, Vivaldi are from Baroque period (c.1600-c.1750) Joseph Haydn, Wolfgan Amadeus Mozart are from Classical period (c.1750-c.1830) Chopin, Mendelssohn, Schumann, Liszt are from Early Romantic period (c.1830-c.1860) Wagner,Verdi are from Late Romantic period (c.1860-c.1920) Sergei Rachmaninoff,Calude Debussy are from Modernist period (20th century)
Get two dimensional array
Instead, what if you have to use pyexcel.get_array to do the same:
.. code-block:: python
for row in p.get_array(file_name="your_file.xls", start_row=1): ... print(f"{row[2]} are from {row[0]} period ({row[1]})") Machaut, Landini are from Medieval period (c.1150-c.1400) Gibbons, Frescobaldi are from Renaissance period (c.1400-c.1600) JS Bach, Vivaldi are from Baroque period (c.1600-c.1750) Joseph Haydn, Wolfgan Amadeus Mozart are from Classical period (c.1750-c.1830) Chopin, Mendelssohn, Schumann, Liszt are from Early Romantic period (c.1830-c.1860) Wagner,Verdi are from Late Romantic period (c.1860-c.1920) Sergei Rachmaninoff,Calude Debussy are from Modernist period (20th century)
where start_row skips the header row.
Get a dictionary
You can get a dictionary too:
.. code-block:: python
my_dict = p.get_dict(file_name="your_file.xls", name_columns_by_row=0)
And let's have a look inside:
.. code-block:: python
from pyexcel._compact import OrderedDict isinstance(my_dict, OrderedDict) True for key, values in my_dict.items(): ... print(key + " : " + ','.join([str(item) for item in values])) Name : Medieval,Renaissance,Baroque,Classical,Early Romantic,Late Romantic,Modernist Period : c.1150-c.1400,c.1400-c.1600,c.1600-c.1750,c.1750-c.1830,c.1830-c.1860,c.1860-c.1920,20th century Representative Composers : Machaut, Landini,Gibbons, Frescobaldi,JS Bach, Vivaldi,Joseph Haydn, Wolfgan Amadeus Mozart,Chopin, Mendelssohn, Schumann, Liszt,Wagner,Verdi,Sergei Rachmaninoff,Calude Debussy
Please note that my_dict is an OrderedDict.
Get a dictionary of two dimensional array
Suppose you have a multiple sheet book as the following:
Top Violinist:
================= ========= ================ Name Period Nationality Antonio Vivaldi 1678-1741 Italian Niccolo Paganini 1782-1840 Italian Pablo de Sarasate 1852-1904 Spainish Eugene Ysaye 1858-1931 Belgian Fritz Kreisler 1875-1962 Astria-American Jascha Heifetz 1901-1987 Russian-American David Oistrakh 1908-1974 Russian Yehundi Menuhin 1916-1999 American Itzhak Perlman 1945- Israeli-American Hilary Hahn 1979- American ================= ========= ================
Noteable Violin Makers:
====================== ========= ================ Maker Period Country Antonio Stradivari 1644-1737 Cremona, Italy Giovanni Paolo Maggini 1580-1630 Botticino, Italy Amati Family 1500-1740 Cremona, Italy Guarneri Family 1626-1744 Cremona, Italy Rugeri Family 1628-1719 Cremona, Italy Carlo Bergonzi 1683-1747 Cremona, Italy Jacob Stainer 1617-1683 Austria ====================== ========= ================
Most Expensive Violins:
===================== =============== =================================== Name Estimated Value Location Messiah Stradivarious $ 20,000,000 Ashmolean Museum in Oxford, England Vieuxtemps Guarneri $ 16,000,000 On loan to Anne Akiko Meyers Lady Blunt $ 15,900,000 Anonymous bidder ===================== =============== ===================================
Here is the code to obtain those sheets as a single dictionary:
.. code-block:: python
book_dict = p.get_book_dict(file_name="book.xls")
And check:
.. code-block:: python
isinstance(book_dict, OrderedDict) True import json for key, item in book_dict.items(): ... print(json.dumps({key: item})) {"Most Expensive Violins": [["Nam
Related Skills
claude-opus-4-5-migration
84.6kMigrate prompts and code from Claude Sonnet 4.0, Sonnet 4.5, or Opus 4.1 to Opus 4.5
docs-writer
99.6k`docs-writer` skill instructions As an expert technical writer and editor for the Gemini CLI project, you produce accurate, clear, and consistent documentation. When asked to write, edit, or revie
model-usage
341.8kUse CodexBar CLI local cost usage to summarize per-model usage for Codex or Claude, including the current (most recent) model or a full model breakdown. Trigger when asked for model-level usage/cost data from codexbar, or when you need a scriptable per-model summary from codexbar cost JSON.
TrendRadar
50.1k⭐AI-driven public opinion & trend monitor with multi-platform aggregation, RSS, and smart alerts.🎯 告别信息过载,你的 AI 舆情监控助手与热点筛选工具!聚合多平台热点 + RSS 订阅,支持关键词精准筛选。AI 智能筛选新闻 + AI 翻译 + AI 分析简报直推手机,也支持接入 MCP 架构,赋能 AI 自然语言对话分析、情感洞察与趋势预测等。支持 Docker ,数据本地/云端自持。集成微信/飞书/钉钉/Telegram/邮件/ntfy/bark/slack 等渠道智能推送。
