Livelike: Vivid Synthetic Populations
This package provides a high-level wrapper for generating synthetic populations via Census APIs based on the American Community Survey (ACS) 5-Year Estimates. Synthetic populations are virtual representations of people and households produced for small census areas (block groups, tracts) and can be attributed by a variety of demographic, economic, social, worker, student, mobility, housing, health, and communication characteristics found in the ACS.
Installation
Conda-forge (recommended)
The livelike feedstock is available via the conda-forge channel.
$ conda install --channel conda-forge livelike
PyPI
livelike is available on the Python Package Index.
$ pip install livelike
Source
Directly via GitHub + pip
$ pip install git+https://github.com/likeness-pop/livelike.git@develop
Download + pip
Download the source distribution (.tar.gz) and decompress where desired. From that location:
$ pip install .
Usage
- See usage examples in ./notebooks/
Specifying a P-MEDM Problem
Synthetic populations are generated by allocating records from the ACS Public Use Microdata Sample (PUMS) from their native spatial resolution of Public-Use Microdata Areas (100,000+ people) to small census areas (typically <8000 people) such that the aggregate characteristics of people and households align closely with population profiles of the small census areas available in the ACS Summary File (SF). This is accomplished using Penalized Maximum-Entropy Dasymetric Modeling (P-MEDM), which seeks to recreate the error variances on each small-area variable estimate in the ACS SF. LiveLike makes it simple to design and solve P-MEDM problems by fetching all of the necessary P-MEDM inputs for a given PUMA via Census APIs.
The bulk of P-MEDM setup is handled automatically by the acs module via the Census Microdata API.
In a basic use-case, inputs are simply:
- The 2010 or 2020 PUMA ID (`<State FIPS> + <PUMA FIPS>`, as shown here)
- A Census API key (optional).
Examples are provided in the notebooks directory.
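As a minimal sketch of the PUMA ID format described above, the snippet below joins a 2-digit state FIPS code and a 5-digit PUMA code into a single zero-padded string. The helper `make_puma_id` is hypothetical and not part of LiveLike; whether `acs.puma` expects exactly this string form is an assumption, so check the notebooks for the authoritative usage.

```python
def make_puma_id(state_fips, puma_code):
    """Join a state FIPS code and a PUMA code into one 7-character ID.

    Hypothetical helper for illustration only (not a LiveLike function).
    State FIPS codes are 2 digits and PUMA codes are 5 digits, so both
    parts are left-padded with zeros before being concatenated.
    """
    state = str(state_fips).zfill(2)
    puma = str(puma_code).zfill(5)
    if not (state.isdigit() and puma.isdigit()):
        raise ValueError("state FIPS and PUMA codes must be numeric")
    if len(state) != 2 or len(puma) != 5:
        raise ValueError("expected a 2-digit state FIPS and a 5-digit PUMA code")
    return state + puma


# State FIPS 47 with PUMA code 01603
puma_id = make_puma_id(47, 1603)
print(puma_id)
```

Keeping the ID as a string preserves leading zeros, which matter for states with FIPS codes below 10.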
Supported Geographies
P-MEDM requires a target geography and an aggregate geography to account for error variances. The selected target geography determines the aggregate geography:
| Level | Code | Population (approx.) | Aggregate |
|-------|------------|-------------|-------|
| Block group | bg | 600 - 3000 | Tract |
| Tract | trt | 1200 - 8000 | Supertract |
LiveLike handles tracts, which have no sub-county aggregation level, using a regionalization approach to generate custom "supertracts" (see notebooks/tract_supertract_2019.ipynb for an example).
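The target-to-aggregate pairing in the table above can be written as a small lookup. This is an illustrative sketch only: the names `AGGREGATE_LEVEL` and `aggregate_for` are hypothetical, and only the codes `bg` and `trt` come from the table.

```python
# Target geography code -> aggregate geography, as listed in the table above.
# The mapping name and the helper below are illustrative, not LiveLike APIs.
AGGREGATE_LEVEL = {
    "bg": "tract",        # block groups aggregate to tracts
    "trt": "supertract",  # tracts aggregate to custom supertracts
}


def aggregate_for(target):
    """Return the aggregate geography implied by a target geography code."""
    try:
        return AGGREGATE_LEVEL[target]
    except KeyError:
        valid = ", ".join(sorted(AGGREGATE_LEVEL))
        raise ValueError(f"unknown target {target!r}; expected one of: {valid}") from None


print(aggregate_for("bg"))
print(aggregate_for("trt"))
```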
Supported ACS Years
The ACS 5-Year Estimates are a rolling 5% sample of the United States population weighted to be representative of the release year (vintage), with additional adjustments for factors like income. LiveLike uses the ACS 2019 5-Year Estimates as its default vintage.
| Year | Vintage | Available |
|------|---------|-----------|
| 2016 | ACS 2012 - 2016 5-Year Estimates | :white_check_mark: |
| 2017 | ACS 2013 - 2017 5-Year Estimates | :white_check_mark: |
| 2018 | ACS 2014 - 2018 5-Year Estimates | :white_check_mark: |
| 2019 | ACS 2015 - 2019 5-Year Estimates | :white_check_mark: |
| 2020 | ACS 2016 - 2020 5-Year Estimates | :x: |
| 2021 | ACS 2017 - 2021 5-Year Estimates | :x: |
| 2022 | ACS 2018 - 2022 5-Year Estimates | :x: |
| 2023 | ACS 2019 - 2023 5-Year Estimates | :white_check_mark: |
Currently, the 2016 through 2019 vintages and the 2023 vintage are supported. The 2020 - 2022 vintages are unavailable because they mix geography vintages in ways that P-MEDM cannot directly handle: 2020 and 2021 pair 2010 PUMAs with 2020 small areas, and 2022 pairs a mixture of 2010 and 2020 PUMAs with 2020 small areas.
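A caller could guard against an unavailable vintage before requesting data. The sketch below hard-codes the availability listed in this section; `SUPPORTED_VINTAGES`, `DEFAULT_VINTAGE`, and `check_vintage` are hypothetical names, and LiveLike may perform its own validation differently.

```python
# Availability as listed in this README; all names here are illustrative.
SUPPORTED_VINTAGES = frozenset({2016, 2017, 2018, 2019, 2023})
DEFAULT_VINTAGE = 2019  # the documented default vintage


def check_vintage(year=DEFAULT_VINTAGE):
    """Return the year if it is a supported ACS vintage, else raise ValueError."""
    if year not in SUPPORTED_VINTAGES:
        supported = ", ".join(str(y) for y in sorted(SUPPORTED_VINTAGES))
        raise ValueError(
            f"ACS vintage {year} is not supported; choose one of: {supported}"
        )
    return year


print(check_vintage())      # falls back to the default vintage
print(check_vintage(2023))
```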
P-MEDM Constraints
P-MEDM constraints are sets of residential and population characteristics common between the ACS SF and PUMS that can be used to design a P-MEDM model and attribute the synthetic population. LiveLike provides several configurations of prebuilt constraints:
- Base (default): Baseline modeling constraints representing population totals, routine daily activities (workers, students), and mobility characteristics, available in config.up_base_constraints_selection.
- Expanded: Baseline modeling constraints with a selection of demographic, social, economic, and housing characteristics, available in config.up_expanded_constraints_selection. The Base constraints can be overwritten by the Expanded ones using:

      from config import up_expanded_constraints_selection
      acs.puma(..., constraints_selection=up_expanded_constraints_selection)

Several additional constraint themes (health, communications) are available outside the prebuilt configurations and can be added onto a custom constraints selection.
| Theme | Description | Base | Expanded | Notes |
|-------------|---------------------------------------------------------------------------------------------------------------------------------------------------|------|----------|---------------------------------------------------|
| universe | Sampling universe totals (population, civilian noninstitutionalized population, group quarters population, housing units, occupied housing units). | x | x | |
| worker | Worker characteristics (employment, class of worker, industry, occupation, hours worked per week). | x | x | |
| student | Student characteristics (grade level attending, public/private school). | x | x | |
| mobility | Mobility characteristics (commute time/mode, vehicles available). | x | x | |
| demographic | Basic demographics (sex, age) and living arrangement characteristics. | | x | Expanded: Sex by age and household type only |
| social | Social characteristics (race/ethnicity, language, place of birth, veteran status). | | x | Expanded: Race/ethnicity only |
| economic | Economic characteristics (household income, poverty, educational attainment). | | x | Expanded: Household income and income to poverty ratio only |
| housing | Housing characteristics (tenure, dwelling type, year built, number of rooms, house heating fuel). | | x | Expanded: Dwelling type and year built only |
| health | Health insurance coverage type. | | | |
| communications | Household internet access. | | | |
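Since the health and communications themes sit outside both prebuilt configurations, one way to include them is to extend a copy of a Base-like selection. In the sketch below, `base_like_selection` is a stand-in built from the Base column of the table above; it is an assumption that the real `config.up_base_constraints_selection` has this shape.

```python
# Stand-in for the Base configuration, reconstructed from the table above.
# The actual contents of config.up_base_constraints_selection may differ.
base_like_selection = {
    "universe": True,
    "worker": True,
    "student": True,
    "mobility": True,
}

# Build a new dict so the stand-in Base selection is left unchanged.
custom_selection = {
    **base_like_selection,
    "health": True,          # health insurance coverage type
    "communications": True,  # household internet access
}

print(sorted(custom_selection))
```

Copying rather than mutating keeps the prebuilt configuration reusable for other runs.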
Custom Constraint Selection
Constraint selections are passed to acs.puma(constraint_selection=...) as a dict with keys representing ACS variable themes and values representing specific subjects (tables). If the value passed is a bool type, a True value will include variables for all subjects in the theme, while a False value will bypass that theme (the same as omitting the theme from the selection). If the value passed is a list type, only listed subjects will be included in the result.
Example:

    custom_constraints_selection = {
        "universe": True,
        "worker": True,
        "student": True,
        "mobility": True,
        "demographic": [
            "sex_age",
            "hhtype",
        ],
        "economic": [
            "hhinc",
            "ipr",
        ],
        "health": True,
        "communications": True,
    }

- Use all variables listed under the universe, worker, student, mobility, health, and communications themes.
- Use only sex by age (sex_age) and household type (hhtype) from the demographic theme.
- Use only household income (hhinc) and income to poverty ratio (ipr) from the economic theme.
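To make the bool/list semantics concrete, here is a rough sketch of how a selection could be resolved against a catalog of available subjects. This is not LiveLike's implementation: `resolve_selection` and the `catalog` contents are hypothetical, and only the subject names `hhinc` and `ipr` come from this README.

```python
def resolve_selection(selection, catalog):
    """Expand a constraints selection into explicit subjects per theme.

    True  -> every subject in the theme
    False -> theme skipped (same as omitting it)
    list  -> only the listed subjects
    """
    resolved = {}
    for theme, value in selection.items():
        if theme not in catalog:
            raise KeyError(f"unknown theme: {theme!r}")
        if value is True:
            resolved[theme] = list(catalog[theme])
        elif value is False:
            continue
        elif isinstance(value, list):
            unknown = [s for s in value if s not in catalog[theme]]
            if unknown:
                raise ValueError(f"unknown subjects in {theme!r}: {unknown}")
            resolved[theme] = list(value)
        else:
            raise TypeError(f"{theme!r} must map to a bool or a list")
    return resolved


# Hypothetical catalog; the "placeholder_*" subject names are invented.
catalog = {
    "economic": ["hhinc", "ipr", "placeholder_subject"],
    "health": ["placeholder_health_subject"],
    "social": ["placeholder_social_subject"],
}
selection = {"economic": ["hhinc", "ipr"], "health": True, "social": False}

print(resolve_selection(selection, catalog))
```

Raising on unknown themes or subjects, rather than silently ignoring them, is a design choice of this sketch that surfaces typos in a selection early.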
The Constraints File
The constraints file (`livelike/data/cons
