STADATA - Simplified Access to WebAPI BPS



Introduction

STADATA is a Python package that simplifies access to statistical data provided by BPS - Statistics Indonesia, National Statistics Office of Indonesia. BPS offers a WebAPI - https://webapi.bps.go.id/developer/ that allows users to programmatically access various types of data, including Publications, Press Releases, static tables, and dynamic tables.

With STADATA, Python users can call this WebAPI directly from their scripts through a convenient, easy-to-use interface. The package aims to facilitate public access to the data generated by BPS - Statistics Indonesia and eliminate the need for manual data downloads from https://www.bps.go.id/.

The key features of STADATA include:

  • Access to WebAPI BPS: STADATA enables users to access the BPS official data and retrieve it using Python.
  • Easy Installation: The package can be easily installed using pip, making it accessible to Python users.
  • Convenient API Methods: STADATA offers simple and straightforward API methods for listing domains, static tables, dynamic tables, and viewing specific tables.
  • Language Support: Users can choose between Indonesian ('ind') and English ('eng') languages to display the retrieved data.

Installation

To install STADATA, use the following pip command:

pip install stadata

Requirements

STADATA is designed for Python 3.7 and above. To use the package, the following dependencies are required:

  • requests: A library used for making HTTP requests to the WebAPI BPS.
  • html: A standard-library module used for processing HTML content from the API response.
  • pandas: A library used for generating DataFrame output for data manipulation and analysis.
  • tqdm: A library used for adding progress bars to data retrieval operations.

With the necessary requirements in place, you can easily start utilizing STADATA to access the WebAPI BPS and retrieve statistical data from BPS - Statistics Indonesia directly in your Python scripts.

Usage

To begin using STADATA, you must first install the package and satisfy its requirements, as mentioned in the previous section. Once you have the package installed and the dependencies in place, you can start accessing statistical data from BPS - Statistics Indonesia through the WebAPI BPS.

Getting Started

To get started with STADATA, you will need an API token from WebAPI BPS. Once you have obtained your token, you can use it to set up the STADATA client in your Python script:

import stadata

# Replace 'token' with your actual API token obtained from WebAPI BPS - https://webapi.bps.go.id/developer/
client = stadata.Client('token')

Parameter:

  • token (str, required): Your personal API token provided by the WebAPI BPS Developer portal. This token is necessary to authenticate and access the API. Make sure to replace token with your actual API token.
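
Hard-coding the token in a script makes it easy to leak. As a minimal sketch, the token can be read from an environment variable before the client is built. The variable name BPS_API_TOKEN and both helper functions are hypothetical choices for this example; STADATA itself does not define them.

```python
import os


def read_token(env_var: str = "BPS_API_TOKEN") -> str:
    """Return the API token stored in an environment variable.

    The variable name is a hypothetical choice for this sketch;
    STADATA itself does not read any environment variable.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your WebAPI BPS token first.")
    return token


def make_client(env_var: str = "BPS_API_TOKEN"):
    """Build a STADATA client from the token found in the environment."""
    import stadata  # imported lazily so read_token works without the package

    return stadata.Client(read_token(env_var))
```

With the variable set in your shell, `client = make_client()` replaces the hard-coded `stadata.Client('token')` call.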

API Methods

The STADATA package provides the following API methods:

  • List Domain: This method returns a list of BPS website domains from the national level to the district/region level. Domains are used to specify the region from which data is requested.
  • List Static Table: This method returns a list of all static tables available on the BPS website.
  • List Dynamic Table: This method returns a list of all dynamic tables available on the BPS website.
  • List Press Release: This method returns a list of all press releases available on the BPS website.
  • List Publication: This method returns a list of all publications available on the BPS website.
  • View Static Table: This method returns data from a specific static table.
  • View Dynamic Table: This method returns data from a specific dynamic table.
  • View Press Release: This method returns the content of a specific press release.
  • View Publication: This method returns data from a specific publication.

List Domain

This method returns a list of BPS website domains from the national level to the district level. Domains are used to specify the region from which data is requested.

client.list_domain()

Returns:

  • domains: A list of domain IDs for different regions, e.g., provinces, districts, or national.

List Static Table

This method returns a list of all static tables available on the BPS website. You can specify whether to get all static tables from all domains or only from specific domains.

# Get all static tables from all domains
client.list_statictable(all=True)

# Get static tables from specific domains
client.list_statictable(all=False, domain=['domain_id-1', 'domain_id-2'])

Parameters:

  • all (bool, optional): A boolean indicating whether to get all static tables from all domains (True) or only from specific domains (False).
  • domain (list of str, required if all is False): A list of domain IDs which you want to retrieve static tables from.
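
The rule linking all and domain can be checked before any request is sent. The helper below is a hypothetical sketch, not part of STADATA. It only encodes the rule stated in the parameter list, and it additionally rejects a bare string for domain, which is an assumption about what callers usually intend.

```python
from typing import Optional, Sequence


def check_domain_args(fetch_all: bool, domain: Optional[Sequence[str]] = None) -> dict:
    """Validate the all/domain combination and return keyword arguments.

    Hypothetical helper for illustration; STADATA does not ship it.
    """
    if fetch_all:
        # Every domain is requested, so no domain list is needed.
        return {"all": True}
    if isinstance(domain, str):
        raise TypeError("domain must be a list of IDs, not a single string")
    if not domain:
        raise ValueError("domain is required when all is False")
    return {"all": False, "domain": list(domain)}
```

The result can be unpacked into a call such as `client.list_statictable(**check_domain_args(False, ['domain_id-1']))`.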

Returns:

  • data: A list of static tables with the following fields:

    table_id|title|subj_id|subj|updt_date|size|domain
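
As a sketch of working with this listing, the example below keeps the rows for one subject and sorts them by update date. It assumes the result has been converted to a list of dictionaries keyed by the field names above (for a pandas DataFrame, `to_dict('records')` produces that shape) and that updt_date holds ISO-style YYYY-MM-DD strings. The helper name and the sample rows are invented placeholders, not real BPS data.

```python
def tables_for_subject(rows, subj_id):
    """Return the rows for one subject, most recently updated first."""
    matches = [row for row in rows if row["subj_id"] == subj_id]
    # Assumes ISO-style date strings, which sort chronologically as text.
    return sorted(matches, key=lambda row: row["updt_date"], reverse=True)


# Invented placeholder rows that only mimic the field layout.
rows = [
    {"table_id": "1", "title": "Placeholder table A", "subj_id": "10",
     "subj": "Placeholder subject X", "updt_date": "2022-01-05",
     "size": "10 KB", "domain": "domain_id-1"},
    {"table_id": "2", "title": "Placeholder table B", "subj_id": "20",
     "subj": "Placeholder subject Y", "updt_date": "2022-03-01",
     "size": "12 KB", "domain": "domain_id-1"},
    {"table_id": "3", "title": "Placeholder table C", "subj_id": "10",
     "subj": "Placeholder subject X", "updt_date": "2022-06-30",
     "size": "8 KB", "domain": "domain_id-2"},
]

latest_first = tables_for_subject(rows, "10")
print([row["table_id"] for row in latest_first])  # ['3', '1']
```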
    

List Dynamic Table

This method returns a list of all dynamic tables available on the BPS website. You can specify whether to get all dynamic tables from all domains or only from specific domains.

# Get all dynamic tables from all domains
client.list_dynamictable(all=True)

# Get dynamic tables from specific domains
client.list_dynamictable(all=False, domain=['domain_id-1', 'domain_id-2'])

Parameters:

  • all (bool, optional): A boolean indicating whether to get all dynamic tables from all domains (True) or only from specific domains (False).
  • domain (list of str, required if all is False): A list of domain IDs from which to retrieve dynamic tables.

Returns:

  • data: A list of dynamic tables with the following fields:

    var_id|title|sub_id|sub_name|subcsa_id|subcsa_name|notes|vertical|unit|graph_id|graph_name|domain
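
Because each dynamic table carries a subject name, the listing can be grouped to see which variables belong to which subject. This is a minimal sketch under the assumption that the result is available as a list of dictionaries keyed by the field names above. The helper name and the sample rows are hypothetical and show only the three fields the helper uses.

```python
from collections import defaultdict


def titles_by_subject(rows):
    """Map each sub_name to the list of (var_id, title) pairs under it."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["sub_name"]].append((row["var_id"], row["title"]))
    return dict(grouped)


# Invented placeholder rows; real listings contain more fields.
rows = [
    {"var_id": "101", "title": "Placeholder variable A", "sub_name": "Subject X"},
    {"var_id": "102", "title": "Placeholder variable B", "sub_name": "Subject Y"},
    {"var_id": "103", "title": "Placeholder variable C", "sub_name": "Subject X"},
]

for subject, variables in titles_by_subject(rows).items():
    print(subject, [var_id for var_id, _ in variables])
```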
    

List Publication

This method returns a list of all publications available on the BPS website. You can specify whether to get all publications from all domains or only from specific domains. You can also specify the month and year of publication to narrow the results.

# Get all publications from all domains
client.list_publication(all=True)

# Get publications from specific domains
client.list_publication(all=False, domain=['domain_id-1', 'domain_id-2'])

# Get publications from specific domains, year, and month
client.list_publication(all=False, domain=['domain_id-1', 'domain_id-2'], month="4", year="2022")

Parameters:

  • all (bool, optional): A boolean indicating whether to get all publications from all domains (True) or only from specific domains (False).
  • domain (list of str, required if all is False): A list of domain IDs from which to retrieve publications.
  • month (str, optional): The month in which the publication was published.
  • year (str, required): The year in which the publication was published.

Returns:

  • data: A list of publications with the following fields:

    pub_id|title|issn|sch_date|rl_date|updt_date|size|domain
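
The month and year parameters are passed as strings, and the call shown earlier uses an unpadded month ("4"). A hypothetical helper, not part of STADATA, can derive both values from a `datetime.date`. It assumes the API expects exactly that unpadded string form.

```python
from datetime import date


def period_kwargs(day: date, include_month: bool = True) -> dict:
    """Build month/year keyword arguments as strings from a date."""
    kwargs = {"year": str(day.year)}
    if include_month:
        kwargs["month"] = str(day.month)  # unpadded, e.g. "4" for April
    return kwargs
```

The result can be unpacked into a listing call, for example `client.list_publication(all=False, domain=['domain_id-1'], **period_kwargs(date(2022, 4, 1)))`.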
    

List Press Release

This method returns a list of all press releases available on the BPS website. You can specify whether to get all press releases from all domains or only from specific domains. You can also specify the month and year of release to narrow the results.

# Get all press releases from all domains
client.list_pressrelease(all=True)

# Get press releases from specific domains
client.list_pressrelease(all=False, domain=['domain_id-1', 'domain_id-2'])

# Get press releases from specific domains, year, and month
client.list_pressrelease(all=False, domain=['domain_id-1', 'domain_id-2'], month="4", year="2022")

Parameters:

  • all (bool