🍻 Open Brewery DB Dataset

This is the open-source dataset for Open Brewery DB. The data is served through a REST API built with Laravel.

🎯 Purpose

Provide an approval-based pipeline to update the dataset and API.

🗄 Data Formats

🚀 Getting Started

git clone git@github.com:openbrewerydb/openbrewerydb.git
cd openbrewerydb && npm install

⚙️ Scripts

The following npm scripts help maintain and manage the dataset:

Data Management

npm run validate
- Validates all CSV files against the JSON Schema
- Checks for required fields and data format consistency
- Reports any validation errors that need attention
npm run csv:combine
- Combines all individual CSV files from country/state-region folders into a single breweries.csv
- Useful when you've made changes to individual state files and need to update the main dataset
npm run csv:split
- Splits the main breweries.csv into separate files by country/state-region
- Helps maintain organized, manageable data files for each region
- Creates directories if they don't exist
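As a rough illustration of how splitting and combining relate, here is a minimal sketch that groups rows by region and then flattens them back. The helper names (`slugify`, `splitByRegion`, `combineRegions`) and the exact file naming (`data/<country>/<state_province>.csv`, lowercased with hyphens) are assumptions for this example, not the repository's actual implementation.

```javascript
const path = require("path");

// Hypothetical helper: turn "New York" into "new-york" for use in a path.
// The real repository may name its folders and files differently.
function slugify(value) {
  return String(value)
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Split: group brewery rows by the CSV file they would be written to.
function splitByRegion(rows, baseDir = "data") {
  const groups = new Map();
  for (const row of rows) {
    const file = path.posix.join(
      baseDir,
      slugify(row.country),
      `${slugify(row.state_province)}.csv`
    );
    if (!groups.has(file)) groups.set(file, []);
    groups.get(file).push(row);
  }
  return groups;
}

// Combine: flatten the per-region groups back into a single list.
function combineRegions(groups) {
  return [...groups.values()].flat();
}

// Made-up sample rows, for illustration only.
const rows = [
  { name: "Brewery A", city: "Portland", state_province: "Oregon", country: "United States" },
  { name: "Brewery B", city: "Bend", state_province: "Oregon", country: "United States" },
  { name: "Brewery C", city: "Austin", state_province: "Texas", country: "United States" },
];

const groups = splitByRegion(rows);
console.log([...groups.keys()]);
console.log(combineRegions(groups).length === rows.length);
```

Because the two operations are inverses, editing either the combined file or a regional file and then running the matching script should leave both views in agreement.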

Data Generation

npm run generate:ids
- Creates unique OBDB IDs for each brewery based on name and city
- Automatically updates breweries.csv with new IDs
- Ensures no duplicate IDs exist in the dataset
npm run generate:json
- Converts breweries.csv into JSON (breweries.json)
- Useful for applications that prefer working with JSON data
- Maintains data consistency across formats
npm run generate:sql
- Creates a PostgreSQL SQL file from breweries.csv
- Includes table creation and data insertion statements
- Perfect for database implementations
npm run generate:stats
- Generates comprehensive dataset statistics
- Shows brewery counts by state/city
- Displays brewery type distribution
- Reports data completeness metrics
npm run update:readme-stats
- Updates the Statistics section in README.md with latest data
- Automatically calculates and formats all statistics
- Includes last updated timestamp
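To make the idea behind `generate:ids` more concrete, here is a minimal sketch of deriving a deterministic ID from a brewery's name and city and then checking for duplicates. The hashing scheme is an assumption chosen for illustration; this README does not describe the actual OBDB ID format, and `makeId` and `findDuplicateIds` are hypothetical names.

```javascript
const crypto = require("crypto");

// Assumption: the ID is a short hash of the normalized name and city.
// The project's real ID format may be entirely different.
function makeId(brewery) {
  const key = [brewery.name, brewery.city]
    .map((part) => String(part).trim().toLowerCase())
    .join("|");
  return crypto.createHash("sha256").update(key).digest("hex").slice(0, 16);
}

// Return the IDs that occur more than once so duplicates can be reported.
function findDuplicateIds(breweries) {
  const seen = new Set();
  const duplicates = new Set();
  for (const brewery of breweries) {
    const id = makeId(brewery);
    if (seen.has(id)) duplicates.add(id);
    seen.add(id);
  }
  return [...duplicates];
}

// Made-up sample rows, for illustration only.
const sample = [
  { name: "Example Brewing", city: "Denver" },
  { name: "example brewing ", city: "DENVER" }, // same brewery, different spelling
  { name: "Example Brewing", city: "Boulder" },
];

console.log(sample.map(makeId));
console.log(findDuplicateIds(sample));
```

Normalizing case and whitespace before hashing means the same brewery entered twice with slightly different spelling collapses to one ID, which is what makes the duplicate check useful.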

Contributor Management

npm run contributors:add
- Interactive CLI tool to add new contributors
- Prompts for contributor information and contribution type
- Updates .all-contributorsrc file
npm run contributors:check
- Verifies if any contributors are missing from the list
- Helps maintain accurate recognition of all contributors
npm run contributors:generate
- Updates the Contributors section in README.md
- Generates contributor table with avatars and contribution types

Workflow

npm run workflow:maintain
- Comprehensive maintenance workflow that:
  1. Validates all CSV files
  2. Combines all CSV files
  3. Creates unique IDs for each brewery
  4. Splits back into individual state files
  5. Creates JSON and SQL files
  6. Updates README.md with latest statistics
- Run this after making any dataset updates
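The maintenance workflow above is essentially a fixed sequence of npm scripts that stops when one of them fails. A minimal sketch of that pattern follows. The script names come from this README, but their mapping to the numbered steps and the helper names (`MAINTAIN_STEPS`, `runNpmScript`, `runPipeline`) are assumptions, not the repository's actual code.

```javascript
const { execSync } = require("child_process");

// Script names are taken from this README; the mapping to the numbered
// steps of the workflow is an assumption.
const MAINTAIN_STEPS = [
  "validate",
  "csv:combine",
  "generate:ids",
  "csv:split",
  "generate:json",
  "generate:sql",
  "update:readme-stats",
];

// Default runner: execute one npm script and stream its output.
// execSync throws if the script exits with a non-zero status.
function runNpmScript(name) {
  execSync(`npm run ${name}`, { stdio: "inherit" });
}

// Run each step in order and stop at the first failure.
function runPipeline(steps = MAINTAIN_STEPS, runStep = runNpmScript) {
  const completed = [];
  for (const step of steps) {
    try {
      runStep(step);
    } catch (error) {
      return { ok: false, failedStep: step, completed };
    }
    completed.push(step);
  }
  return { ok: true, failedStep: null, completed };
}
```

Calling `runPipeline()` from the repository root would run the real scripts. Passing a custom `runStep` function makes it possible to dry-run the sequence without touching any files.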

🤝 Contributing

For information on contributing to this project, please see the contributing guide and our code of conduct.

1. Fork the repository
2. Add or update breweries in the CSV files (for example with Excel or Google Sheets)
3. Submit a Pull Request

Tips

First and foremost, don't worry about messing up! 🙂 Thank you so much for contributing! 🙌

- CSVs are organized by data/[country]/[state_province]
- Required fields/columns: name, brewery_type, city, state_province, and country
- When adding a brewery, do not include an id. This will be created after review.
- Please either add to breweries.csv (preferred if adding breweries for a new country) or the individual state/province CSV file. Adding to both at the same time may introduce duplicates/errors.
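If you want to sanity-check a new row before opening a pull request, a minimal sketch like the following applies the two rules from the tips above: the required columns must be filled in, and `id` must be left empty. `checkNewEntry` is a hypothetical helper written for this example and is not part of the repository's scripts.

```javascript
// Required columns as listed in the tips above.
const REQUIRED_FIELDS = ["name", "brewery_type", "city", "state_province", "country"];

// True when a value is absent or contains only whitespace.
function isBlank(value) {
  return value === undefined || value === null || String(value).trim() === "";
}

// Return a list of human-readable problems; an empty list means the row looks fine.
function checkNewEntry(row) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (isBlank(row[field])) {
      problems.push(`missing required field: ${field}`);
    }
  }
  // New breweries should not carry an id; it is assigned after review.
  if (!isBlank(row.id)) {
    problems.push("new entries should leave id empty");
  }
  return problems;
}

// Made-up rows, for illustration only.
console.log(
  checkNewEntry({
    name: "Example Brewing",
    brewery_type: "micro",
    city: "Denver",
    state_province: "Colorado",
    country: "United States",
  })
);
console.log(checkNewEntry({ id: "abc", name: "Example Brewing", city: "Denver" }));
```

This only covers presence checks. The project's own `npm run validate` remains the authoritative check against the JSON Schema.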

👾 Community

📫 Feedback

If you have any feedback, please email me.

Cheers! 🍻

📊 Project Status

- Status: Active
- Last Dataset Update: 2024
- Maintenance: Actively maintained through community contributions
- Dataset Size: 8,000+ breweries
- Coverage: United States, with growing international data

🔧 Requirements

- Node.js v22 or higher
- npm package manager
- Git

📚 Data Schema

Each brewery entry contains the following fields:

| Field          | Type   | Description                                      | Required |
| -------------- | ------ | ------------------------------------------------ | -------- |
| id             | String | Unique identifier                                | Yes      |
| name           | String | Name of the brewery                              | Yes      |
| brewery_type   | String | Type of brewery (micro, regional, brewpub, etc.) | Yes      |
| street         | String | Street address                                   | No       |
| city           | String | City                                             | Yes      |
| state_province | String | State/Province                                   | Yes      |
| postal_code    | String | Postal code                                      | Yes      |
| country        | String | Country                                          | Yes      |
| longitude      | String | Decimal longitude coordinate                     | No       |
| latitude       | String | Decimal latitude coordinate                      | No       |
| phone          | String | Phone number                                     | No       |
| website_url    | String | Website URL                                      | No       |
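Since `longitude` and `latitude` are stored as strings and are optional, consumers that need numeric coordinates have to convert and validate them. The following is a minimal sketch of one way to do that. `parseCoordinates` is a hypothetical helper, and the sample values are made up.

```javascript
// Convert one value to a number, treating missing or blank input as NaN.
// (Number("") would otherwise yield 0, which is a valid-looking coordinate.)
function toNumber(value) {
  if (value === undefined || value === null || String(value).trim() === "") {
    return NaN;
  }
  return Number(value);
}

// Convert the string coordinates of one brewery entry into numbers.
// Returns null when either value is missing, non-numeric, or out of range.
function parseCoordinates(brewery) {
  const latitude = toNumber(brewery.latitude);
  const longitude = toNumber(brewery.longitude);
  const valid =
    Number.isFinite(latitude) &&
    Number.isFinite(longitude) &&
    Math.abs(latitude) <= 90 &&
    Math.abs(longitude) <= 180;
  return valid ? { latitude, longitude } : null;
}

console.log(parseCoordinates({ latitude: "44.0575", longitude: "-121.3288" }));
console.log(parseCoordinates({ latitude: "", longitude: "" }));
```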

📖 Usage Examples

Python

import pandas as pd

# Read CSV
breweries_df = pd.read_csv('breweries.csv')

# Filter by state
california_breweries = breweries_df[breweries_df['state_province'] == 'California']

JavaScript/Node.js

const fs = require("fs");

// Read JSON
const breweries = JSON.parse(fs.readFileSync("breweries.json", "utf8"));

// Filter by type
const microBreweries = breweries.filter((b) => b.brewery_type === "micro");
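Beyond filtering, a common next step is counting entries per field value, similar in spirit to what `npm run generate:stats` reports. Here is a minimal sketch using a small inline sample instead of breweries.json. `countBy` is a hypothetical helper, and the rows are made up.

```javascript
// Count entries per value of a given field, e.g. brewery_type or state_province.
// Entries without a value for the field are counted under "(missing)".
function countBy(breweries, field) {
  const counts = {};
  for (const brewery of breweries) {
    const key = brewery[field] || "(missing)";
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Made-up sample rows, for illustration only.
const sampleBreweries = [
  { name: "Brewery A", brewery_type: "micro", state_province: "Oregon" },
  { name: "Brewery B", brewery_type: "brewpub", state_province: "Oregon" },
  { name: "Brewery C", brewery_type: "micro", state_province: "Texas" },
];

console.log(countBy(sampleBreweries, "brewery_type"));
console.log(countBy(sampleBreweries, "state_province"));
```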

SQL

-- After importing breweries.sql
SELECT name, city, state_province
FROM breweries
WHERE brewery_type = 'brewpub'
ORDER BY state_province, city;

🔄 Versioning

The dataset is updated regularly through community contributions. Each update goes through the following process:

1. Community members submit new breweries or updates via pull requests
2. Changes are reviewed and validated
3. Upon approval, changes are merged and new dataset files are generated
4. The API is automatically updated with the new data

Latest dataset version: 2024.1

Contributors ✨

Thanks go to these wonderful people (emoji key):

<table>
<tbody>
<tr>
<td align="center" valign="top" width="14.28%"><a href="https://theputnams.net/mike/"><img src="https://avatars3.githubusercontent.com/u/213371?v=4?s=100" width="100px;" alt="Mike Putnam"/> Mike Putnam</a> <a href="#data-mikeputnam" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://andrewbarber.me/"><img src="https://avatars0.githubusercontent.com/u/135927?v=4?s=100" width="100px;" alt="Andrew A. Barber"/> Andrew A. Barber</a> <a href="#data-AndrewBarber" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="http://www.therearefourmics.com/"><img src="https://avatars2.githubusercontent.com/u/39307371?v=4?s=100" width="100px;" alt="Jason Allen"/> Jason Allen</a> <a href="#data-jallend1" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/Juicob"><img src="https://avatars1.githubusercontent.com/u/68080175?v=4?s=100" width="100px;" alt="Juicob"/> Juicob</a> <a href="#data-Juicob" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/wkarney"><img src="https://avatars0.githubusercontent.com/u/35663282?v=4?s=100" width="100px;" alt="Will Karnasiewicz"/> Will Karnasiewicz</a> <a href="#data-wkarney" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://dvavs.github.io/"><img src="https://avatars0.githubusercontent.com/u/49594473?v=4?s=100" width="100px;" alt="Dylan T. Vavra"/> Dylan T. Vavra</a> <a href="#data-dvavs" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/amadisonm1209"><img src="https://avatars0.githubusercontent.com/u/44384309?v=4?s=100" width="100px;" alt="Madison Martinez"/> Madison Martinez</a> <a href="#data-amadisonm1209" title="Data">🔣</a></td>
</tr>
<tr>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/danieleremchuk"><img src="https://avatars0.githubusercontent.com/u/50344935?v=4?s=100" width="100px;" alt="Daniel Eremchuk"/> Daniel Eremchuk</a> <a href="#data-danieleremchuk" title="Data">🔣</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/alexchong"><img src="https://avatars2.githubusercontent.com/u/18007017?v=4?s=10
