Censusdis

censusdis is a Python package for discovering, loading, and analyzing U.S. Census demographic, economic, and geographic data and metadata. It is designed to be intuitive and Pythonic, giving users access to the full collection of data and maps the U.S. Census publishes via its APIs.

censusdis

Hippocratic License HL3-CL-ECO-EXTR-FFD-LAW-MIL-SV

censusdis is a package for discovering, loading, and analyzing U.S. Census demographic data, and for computing diversity, integration, and segregation metrics on it. It is designed

  • to support every dataset, every geography, and every year. It's not just about ACS data through the last time the software was updated and released;
  • to support all geographies, on and off-spine, not just states, counties, and census tracts;
  • to have integrated mapping capabilities that save you time and extra coding;
  • to be intuitive, Pythonic, and fast.

Click any of the thumbnails below to see the notebook that generated it.

<img src="../docs/_static/images/sample01.png" alt="Diversity in New Jersey" height=160> <img src="../docs/_static/images/sample02.png" alt="2020 Median Income by County in Georgia" height=160> <img src="../docs/_static/images/sample05.png" alt="Nationwide Integration at the Census Tract over Block Group Level" height=160> <img src="../docs/_static/images/sample03.png" alt="White Alone Population as a Percent of County Population" height=160> <img src="../docs/_static/images/sample06.png" alt="Urban Census Tracts in Illinois" height=160> <img src="../docs/_static/images/sample07.png" alt="NYC Area with Water Overlap Removed" height=160> <img src="../docs/_static/images/sample00.png" alt="Integration in SoMa Tracts" height=160> <img src="../docs/_static/images/sample04.png" alt="Average Age by Public Use Microdata Area in Massachusetts" height=160>

Installation and First Example

censusdis can be installed with pip:

pip install censusdis

Every censusdis query needs four things:

  1. What data set we want to query.
  2. What vintage, or year.
  3. What variables.
  4. What geographies.

Here is an example of how we can use censusdis to download data once we know those four things.

import censusdis.data as ced
from censusdis.datasets import ACS5
from censusdis import states

df_median_income = ced.download(
    # Data set: American Community Survey 5-Year
    dataset=ACS5,
    
    # Vintage: 2022
    vintage=2022, 
    
    # Variable: median household income
    download_variables=['NAME', 'B19013_001E'], 
    
    # Geography: All counties in New Jersey.
    state=states.NJ,
    county='*'
)
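
To make the four query components concrete, here is a minimal sketch of the raw U.S. Census API URL that corresponds to the query above. It uses only the Python standard library. The helper `build_census_url` and its parameter names are hypothetical and are not part of censusdis; censusdis handles this kind of request for you, and its internals may differ.

```python
from urllib.parse import urlencode

CENSUS_API_BASE = "https://api.census.gov/data"


def build_census_url(dataset, vintage, variables, for_clause, in_clause=None):
    """Assemble a raw Census API query URL (hypothetical helper, not censusdis)."""
    # The variables become a comma-separated `get` parameter, and the
    # geography is split into a `for` clause and an optional `in` clause.
    params = {"get": ",".join(variables), "for": for_clause}
    if in_clause is not None:
        params["in"] = in_clause
    # Leave ',', ':' and '*' unescaped so the URL stays readable.
    query = urlencode(params, safe=",:*")
    return f"{CENSUS_API_BASE}/{vintage}/{dataset}?{query}"


url = build_census_url(
    dataset="acs/acs5",                # 1. data set: ACS 5-Year
    vintage=2022,                      # 2. vintage
    variables=["NAME", "B19013_001E"], # 3. variables
    for_clause="county:*",             # 4. geography: all counties...
    in_clause="state:34",              #    ...in New Jersey (FIPS code 34)
)
print(url)
```

With censusdis you never assemble this string yourself. You pass the same four pieces of information as keyword arguments to `ced.download`, and the result comes back as a data frame.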

There are many more examples in the tutorial and in the sample notebooks.

Tutorial (A Great Place to Start!)

We presented a half-day tutorial on censusdis at SciPy '24. All the material covered in the tutorial is available in a GitHub repo at https://github.com/censusdis/censusdis-tutorial-2024. The tutorial consists of a series of five lessons, each with worked exercises, and two choices for a final project. If you really want to learn the ins and outs of what censusdis can do, from the most basic queries all the way through some relatively advanced topics, this is the tutorial for you.

An Older Tutorial

For an older tutorial that is shorter but does not include some of the newest features, please see the censusdis-tutorial repository. This tutorial was presented at PyData Seattle 2023. If you want to try it out for yourself, the README.md contains links that let you run the tutorial notebooks live on mybinder.org in your browser without needing to set up a local development environment or download or install any code.

Tutorial Video

We expect a video of the SciPy '24 tutorial to be available soon, hopefully by some time in August '24.

An 86-minute video of the older tutorial as presented at PyData Seattle 2023 is also available.

PyData Seattle Tutorial Video

Overview

censusdis is a package for discovering, loading, and analyzing U.S. Census demographic data, and for computing diversity, integration, and segregation metrics on it. It is designed to be intuitive and Pythonic while giving users access to the full collection of data and maps the U.S. Census publishes via its APIs. It also avoids hard-coding metadata about U.S. Census variables, such as their names, types, and hierarchies in groups. Instead, it queries this metadata from the U.S. Census API. This allows it to operate over a large set of datasets and years, likely including many that don't exist as of this writing. It also integrates downloading and merging the geometries of geographic areas, which makes plotting data and derived metrics simple. Finally, it interacts with the divintseg package to compute diversity and integration metrics.
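
To give a feel for the diversity and integration metrics mentioned above, here is a small standard-library sketch. It assumes the definitions as we understand them from divintseg: diversity is the probability that two residents chosen at random belong to different groups, and integration is the population-weighted average of the diversity of the smaller areas that make up a region. The function names and the counts are hypothetical and purely illustrative; this is not the divintseg API.

```python
def diversity(counts):
    """Chance that two residents drawn at random (with replacement) differ in group."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # 1 minus the probability that both draws land in the same group.
    return 1.0 - sum((c / total) ** 2 for c in counts)


def integration(sub_area_counts):
    """Population-weighted average of the diversity of each sub-area."""
    populations = [sum(counts) for counts in sub_area_counts]
    total = sum(populations)
    if total == 0:
        return 0.0
    weighted = sum(
        pop * diversity(counts)
        for pop, counts in zip(populations, sub_area_counts)
    )
    return weighted / total


# Hypothetical region made of two equally sized sub-areas and two groups.
separated = [[100, 0], [0, 100]]  # each group lives in its own sub-area
mixed = [[50, 50], [50, 50]]      # both groups are present in each sub-area

region_diversity = diversity([100, 100])  # the region as a whole
print(region_diversity, integration(separated), integration(mixed))
```

Both hypothetical regions have the same overall diversity, but only the mixed one is integrated at the sub-area level. That gap between regional diversity and local integration is what these metrics are meant to expose.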

The design goals of censusdis are discussed in more detail in design-goals.md.

I'm not sure I get it. Show me what it can do.

The Nationwide Diversity and Integration notebook demonstrates how we can download, process, and plot a large amount of US Census demographic data quickly and easily to produce compelling results with just a few lines of code.

I'm sold! I want to dive right in!

To get straight to installing and trying out code, hop over to our Getting Started guide.

censusdis lets you quickly and easily load US Census data and make plots like this one:

Median income by block group in GA

We downloaded the data behind this plot, including the geometry of all the block groups, with a single call:

import censusdis.data as ced
from censusdis import states

# This is a census variable for median household income.
# See https://api.census.gov/data/2020/acs/acs5/variables/B19013_001E.html
MEDIAN_HOUSEHOLD_INCOME_VARIABLE = "B19013_001E"

gdf_bg = ced.download(
    "acs/acs5",  # The American Community Survey 5-Year Data
    2020,
    ["NAME", MEDIAN_HOUSEHOLD_INCOME_VARIABLE],
    # Same state-constant style as the first example (states.NJ).
    state=states.GA,
    block_group="*",
    with_geometry=True,
)

Similarly, we can download data and geographies, do a little analysis on our own using familiar Pandas data frame operations, and plot graphs like these:

Percent of population identifying as white by county

Integration in SoMa

Modules

The public modules that make up the censusdis package are

| Module | Description |
|-----------------------|:------------------------------------------------|
| censusdis.geography | Code for managing geography hierarchies in which census data is organized. |
| censusdis.data | Code for fetching data from the US Census API, including managing datasets, groups, and variable hierarchies. |
| censusdis.maps | Code for downloading map data from the US, caching it locally, and using it to render maps. |
| censusdis.states | Constants defining the US States. Used by the other modules. |
| censusdis.counties | Constants defining counties in all of the US States. |

Demonstration Notebooks

There are several demonstration notebooks available to illustrate how censusdis can be used. They are found in the notebook directory of the source code.

The demo notebooks include

| Notebook Name | Description |
|-----------------------|:------------------------------------------------|
| [ACS Com
