Optimaldistricts
Code and data for algorithmically drawing congressional districts using Sinkhorn algorithm
Install / Use
/learn @btengels/OptimaldistrictsREADME
#Optimal Districts This repo contains code/data to impliment the districting algorithm used the paper "A Political Redistricting Algorithm with Communities of Interest" as well as the website optimaldistricts.org. The algorithm draws district boundaries in order to minimize the distance between constituents and representatives while maintaining, as much as possible, demographic communities of interest.
The districting problem is difficult from a computational standpoint (NP-complete, actually). Composing an optimal set of districts from larger geographic units (census tracts or precincts) simplifies the problem somewhat, but many states have several thousand precincts which is enough to make the problem slow using standard software/techniques. Requirements such as nearly-equal population shares across districts, geographic contiguity, or generating a sufficient number of majority-minority districts, complicate the problem further.
We overcome this computational hurdle using some recent numerical tools from the Optimal Transport community (hence the name "optimal" districts). Given the location of district offices, we use a regularized penalty function and an iterative projection algorithm to quickly assign precincts to districts (Sinkhorn algorithm). Given this (new) set of districts, we find the best location for district offices, then repeat until convergence (Lloyd's algorithm/constrained-kmeans algorithm).
Code and Data
The precinct-level data are first taken from autoredistrict.org. The raw data are comprised of shapefiles (.shp) which we then pull into GeoPandas for our analysis. The data include demographics and presidential election results at the precinct level for all 50 states.
The code is written in Python and aims to be tidy/readable and simple as possible so that anyone can use and adapt the code as they like. Currently, there are two important scripts used to generate the maps/graphs/etc. for each state.
-
transport_plan_functions.py: This file contains functions used in computing the optimal districts. The primary function is
gradientDescentOT(), and most of the other functions in the file exist to support this main function. All of the functions have detailed doc strings describing their inputs/outputs and functionality. -
make_maps.py:
- Reads in the shapefiles and simplifies the precinct polygons for faster rendering, stores reuslts as a pickled Geopandas DataFrame
- Reads in the pickled DataFrames and for each state with 2 or more congressional districts...
- Makes a map of the current (2010) congresisonal districts (approximate since we keep precincts intact). Two kinds of maps are made, a static .png and a web-ready bokeh plot.
- Computes the optimal districts for a given value of alphaW.
- (beta) Checks to see if the resulting map is comprised of contiguous districts. If the districts are contiguous, the script computes new districts with highest possible value for alphaW that still respects contiguity.
- Makes .png and html maps of the new districts.
- Plots kernel density plots of district-level statistics
- Saves a list of dataframes used in making the maps...to be used in ongoing projects (ranking states by a gerrymandering index, looking at the number of competitive seats, etc.)
A work in progress
We have a long list of additional features to add and questions to answer using our algorithm. Some of our short-term TODOs include:
- Parallelize loops over states
- Improve/speed up checks for contiguity, figure out how to handle states like Michigan without hardcoding
- Optimally determine the regularization and demographic distance parameters
- Measure "wasted" voter shares (a la Wisconsin supreme court case)
- Generalize functions in
transport_plan_functions.pyso that an arbitrary Geopandas DataFrame with the necessary lat/lon variables can solve the districting problem (sometimes called the "warehousing problem").
Thanks for reading and check back for updates!
Related Skills
node-connect
349.2kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
109.5kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
349.2kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
349.2kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
