
DatenguideR

R wrapper for the datengui.de GraphQL API to easily access German regional statistics


<!-- README.md is generated from README.Rmd. Please edit that file -->

datenguideR


Access and download German regional statistics from Datenguide (http://datengui.de). datenguideR wraps the Datenguide GraphQL API and also includes metadata for all available statistics and regions.

Overview

Usage

First, install datenguideR from GitHub:

devtools::install_github("CorrelAid/datenguideR")

Load package:

library(datenguideR)

Examples

Get IDs of all available NUTS-1 regions:

datenguideR::dg_regions %>%
  dplyr::filter(level == "nuts1") %>%
  knitr::kable()

| id | name                   | level | parent |
|:----|:-----------------------|:------|:-------|
| 01 | Schleswig-Holstein     | nuts1 | DG     |
| 02 | Hamburg                | nuts1 | DG     |
| 03 | Niedersachsen          | nuts1 | DG     |
| 04 | Bremen                 | nuts1 | DG     |
| 05 | Nordrhein-Westfalen    | nuts1 | DG     |
| 06 | Hessen                 | nuts1 | DG     |
| 07 | Rheinland-Pfalz        | nuts1 | DG     |
| 08 | Baden-Württemberg      | nuts1 | DG     |
| 09 | Bayern                 | nuts1 | DG     |
| 10 | Saarland               | nuts1 | DG     |
| 11 | Berlin                 | nuts1 | DG     |
| 12 | Brandenburg            | nuts1 | DG     |
| 13 | Mecklenburg-Vorpommern | nuts1 | DG     |
| 14 | Sachsen                | nuts1 | DG     |
| 15 | Sachsen-Anhalt         | nuts1 | DG     |
| 16 | Thüringen              | nuts1 | DG     |
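The region IDs in this table are what dg_call later expects as region_id. As a minimal sketch, the lookup from a region name to its ID could look like the following. It uses a hand-typed toy copy of three rows rather than the real dg_regions object, and region_id_for is a hypothetical helper that is not part of datenguideR. IDs are kept as character strings so that leading zeros such as "02" survive.

```r
# Toy copy of three rows of dg_regions, using the columns shown in the table.
regions <- data.frame(
  id     = c("02", "09", "11"),
  name   = c("Hamburg", "Bayern", "Berlin"),
  level  = c("nuts1", "nuts1", "nuts1"),
  parent = c("DG", "DG", "DG"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of datenguideR): return the ID for a region name.
region_id_for <- function(regions, region_name) {
  hit <- regions[regions$name == region_name, "id"]
  if (length(hit) != 1) {
    stop("Expected exactly one match for '", region_name, "', found ", length(hit))
  }
  hit
}

berlin_id <- region_id_for(regions, "Berlin")
print(berlin_id)
#> [1] "11"
```

Stopping on zero or multiple matches is a deliberate choice here: passing an ambiguous or empty ID on to an API call would fail later with a less obvious error.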

Get all available metadata on statistics, substatistics, and parameters:

datenguideR::dg_descriptions
#> # A tibble: 3,419 x 11
#>    stat_name stat_description   stat_description_~ substat_name substat_descrip~
#>    <chr>     <chr>              <chr>              <chr>        <chr>           
#>  1 AENW01    Entsorgte/behande~ "**Entsorgte/beha~ NA           NA              
#>  2 AENW02    Abgelagerte Abfal~ "**Abgelagerte Ab~ NA           NA              
#>  3 AENW03    Entsorg.u.Behandl~ "**Entsorg.u.Beha~ NA           NA              
#>  4 AENW04    Entsorgte/behande~ "**Entsorgte/beha~ NA           NA              
#>  5 AENW05    Abgelagerte Abfal~ "**Abgelagerte Ab~ NA           NA              
#>  6 AENW06    Entsorg.u.Behandl~ "**Entsorg.u.Beha~ NA           NA              
#>  7 AEW001    Entsorgungs- und ~ "**Entsorgungs- u~ NA           NA              
#>  8 AEW001    Entsorgungs- und ~ "**Entsorgungs- u~ EBANL1       Entsorgungs- un~
#>  9 AEW001    Entsorgungs- und ~ "**Entsorgungs- u~ EBANL1       Entsorgungs- un~
#> 10 AEW001    Entsorgungs- und ~ "**Entsorgungs- u~ EBANL1       Entsorgungs- un~
#> # ... with 3,409 more rows, and 6 more variables: param_name <chr>,
#> #   param_description <chr>, stat_description_en <chr>,
#> #   stat_description_full_en <chr>, substat_description_en <chr>,
#> #   param_description_en <chr>
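Because dg_descriptions holds one row per statistic, substatistic, and parameter combination, it can be filtered to find the parameters that belong to one statistic. The sketch below does this in base R on a toy stand-in with four of the eleven columns. The BETR08 rows reuse names from the dg_call example further down in this README; the layout of the toy rows is otherwise an assumption and not copied from the real table.

```r
# Toy stand-in for dg_descriptions (assumption: real rows may look different).
descriptions <- data.frame(
  stat_name         = c("AI0506", "BETR08", "BETR08"),
  substat_name      = c(NA, "TIERA8", "TIERA8"),
  param_name        = c(NA, "TIERART2", "TIERART3"),
  param_description = c(NA, "Rinder", "Schweine"),
  stringsAsFactors = FALSE
)

# Keep only the rows for one statistic, then list its distinct parameters.
betr08 <- descriptions[descriptions$stat_name == "BETR08", ]
params <- unique(betr08[, c("substat_name", "param_name", "param_description")])
print(params)
```

With the real package loaded, the same two steps would be applied to datenguideR::dg_descriptions instead of the toy data frame.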

dg_search

You can also use dg_search to look for a variable of interest. The function matches your search string against the text in the dg_descriptions data frame and returns only the rows that contain a match.

Looking for variables where the string “vote” appears somewhere in the documentation:

dg_search("vote")
#> # A tibble: 90 x 11
#>    stat_name stat_description   stat_description_~ substat_name substat_descrip~
#>    <chr>     <chr>              <chr>              <chr>        <chr>           
#>  1 AI0501    Zweitstimmenantei~ "**Zweitstimmenan~ NA           NA              
#>  2 AI0502    Zweitstimmenantei~ "**Zweitstimmenan~ NA           NA              
#>  3 AI0503    Zweitstimmenantei~ "**Zweitstimmenan~ NA           NA              
#>  4 AI0504    Zweitstimmenantei~ "**Zweitstimmenan~ NA           NA              
#>  5 AI0505    Zweitstimmenantei~ "**Zweitstimmenan~ NA           NA              
#>  6 AI0506    Wahlbeteiligung, ~ "**Wahlbeteiligun~ NA           NA              
#>  7 AI0601    Stimmenanteil CDU~ "**Stimmenanteil ~ NA           NA              
#>  8 AI0602    Stimmenanteil SPD~ "**Stimmenanteil ~ NA           NA              
#>  9 AI0603    Stimmenanteil FDP~ "**Stimmenanteil ~ NA           NA              
#> 10 AI0604    Stimmenanteil GRÜ~ "**Stimmenanteil ~ NA           NA              
#> # ... with 80 more rows, and 6 more variables: param_name <chr>,
#> #   param_description <chr>, stat_description_en <chr>,
#> #   stat_description_full_en <chr>, substat_description_en <chr>,
#> #   param_description_en <chr>
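To make the matching behaviour concrete, here is a rough base R approximation of a search across all columns of a description table. It is a sketch only and not the implementation of dg_search: search_rows is a hypothetical name, case-insensitive matching is an assumption, and the English texts in the toy data are invented placeholders.

```r
# Toy stand-in for dg_descriptions; the English descriptions are placeholders.
descriptions <- data.frame(
  stat_name           = c("AI0506", "BETR08"),
  stat_description    = c("Wahlbeteiligung, Bundestagswahl",
                          "Landwirtschaftliche Betriebe mit Tierhaltung"),
  stat_description_en = c("Voter turnout (placeholder text)",
                          "Farms with livestock (placeholder text)"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where any column contains the pattern.
search_rows <- function(df, pattern) {
  # One logical vector per column: TRUE where the cell contains the pattern.
  per_column <- lapply(df, function(col) {
    grepl(pattern, as.character(col), ignore.case = TRUE)
  })
  # Combine the columns with OR, so a hit in any column keeps the row.
  keep <- Reduce(`|`, per_column)
  df[keep, , drop = FALSE]
}

result <- search_rows(descriptions, "vote")
print(result$stat_name)
#> [1] "AI0506"
```

In this toy data the German description of AI0506 does not contain "vote"; the row is found through its English column.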

Note: Variable descriptions are now also available in English, translated via the googleLanguageR package.

dg_search("vote") %>% 
  dplyr::select(stat_name, dplyr::contains("_en"))
#> # A tibble: 90 x 5
#>    stat_name stat_descriptio~ stat_descriptio~ substat_descrip~ param_descripti~
#>    <chr>     <chr>            <chr>            <chr>            <chr>           
#>  1 AI0501    Second Vote Sha~ "** CDU / CSU s~ NA               NA              
#>  2 AI0502    SPD Second Vote~ "** SPD second ~ NA               NA              
#>  3 AI0503    FDP Second Vote~ "** Second vote~ NA               NA              
#>  4 AI0504    Second Vote Sha~ "** GREEN secon~ NA               NA              
#>  5 AI0505    Second Vote Sha~ "** Second vote~ NA               NA              
#>  6 AI0506    Voter Turnout, ~ "** Voter turno~ NA               NA              
#>  7 AI0601    CDU / CSU, Euro~ "** CDU / CSU v~ NA               NA              
#>  8 AI0602    SPD Vote Share,~ "** SPD vote sh~ NA               NA              
#>  9 AI0603    FDP Share of Vo~ "** FDP vote sh~ NA               NA              
#> 10 AI0604    Share of Votes ~ "** GREEN share~ NA               NA              
#> # ... with 80 more rows

dg_call

The main function of the package is dg_call. It gives access to all API endpoints.

Simply pick a statistic and pass it to dg_call() (the available statistics are listed in dg_descriptions).

For example:

  • stat_name: AI0506 (Wahlbeteiligung, Bundestagswahl)
  • region_id: 11 (stands for Berlin)

dg_call(region_id = "11",
        year = 2017,
        stat_name = "AI0506")
#> New names:
#> * name -> name...2
#> * name -> name...6
#> # A tibble: 1 x 9
#>   id    name...2  year value GENESIS_source  name...6 stat_name stat_description
#>   <chr> <chr>    <int> <dbl> <chr>           <chr>    <chr>     <chr>           
#> 1 11    Berlin    2017  75.6 Regionalatlas ~ 99910    AI0506    Wahlbeteiligung~
#> # ... with 1 more variable: stat_description_en <chr>
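The "New names" message in the output shows that the result contains two columns called name, which are made unique as name...2 and name...6. If more descriptive names are preferred, they can be renamed after the call. The sketch below works on a toy one-row data frame with a subset of the columns shown in the output. The meanings attached to the two columns (region name and an identifier of the source) are assumptions read off the printed values, not documented behaviour.

```r
# Toy result with a subset of the columns printed above.
result <- data.frame(
  id         = "11",
  `name...2` = "Berlin",
  year       = 2017L,
  value      = 75.6,
  `name...6` = "99910",
  stat_name  = "AI0506",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Assumed meanings: name...2 is the region name, name...6 a source identifier.
new_names <- c("name...2" = "region_name", "name...6" = "source_code")

# Rename only the columns that are present in the lookup.
to_rename <- names(result) %in% names(new_names)
names(result)[to_rename] <- unname(new_names[names(result)[to_rename]])

print(names(result))
```

Note that the numeric suffix depends on the column position, so a different call can produce other suffixes (the substatistic example in this README yields name...7).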

A slightly more complex call with substatistics:

  • stat_name: BETR08 (Landwirtschaftliche Betriebe mit Tierhaltung)
  • substat_name: TIERA8 (Landwirtschaftliche Betriebe mit Viehhaltung)
  • parameter:
    • TIERART2 (Rinder)
    • TIERART3 (Schweine)

dg_call(region_id = "11",
        year = c(2001, 2003, 2007), 
        stat_name = "BETR08", 
        substat_name = "TIERA8", 
        parameter = c("TIERART2", "TIERART3")) 
#> New names:
#> * name -> name...2
#> * name -> name...7
#> # A tibble: 6 x 15
#>   id    name...2  year TIERA8  value GENESIS_source           name...7 stat_name
#>   <chr> <chr>    <int> <chr>   <int> <chr>                    <chr>    <chr>    
#> 1 11    Berlin    2001 TIERAR~     8 Allgemeine Agrarstruktu~ 41120    BETR08   
#> 2 11    Berlin    2001 TIERAR~     7 Allgemeine Agrarstruktu~ 41120    BETR08   
#> 3 11    Berlin    2003 TIERAR~     9 Allgemeine Agrarstruktu~ 41120    BETR08   
#> 4 11    Berlin    2003 TIERAR~     7 Allgemeine Agrarstruktu~ 41120    BETR08   
#> 5 11    Berlin    2007 TIERAR~    11 Allgemeine Agrarstruktu~ 41120    BETR08   
#> 6 11    Berlin    2007 TIERAR~     5 Allgemeine Agrarstruktu~ 41120    BETR08   
#> # ... with 7 more variables: stat_description <chr>, substat_name <chr>,
#> #   substat_description <chr>, param_description <chr>,
#> #   stat_description_en <chr>, substat_description_en <chr>,
#> #   param_description_en <chr>
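The six rows in this result correspond to one row per combination of year and parameter. The short check below enumerates those combinations in base R. It only illustrates the arithmetic and does not call the API; a real query may return fewer rows when no value exists for a combination.

```r
# The years and parameters passed to dg_call in the example above.
years      <- c(2001, 2003, 2007)
parameters <- c("TIERART2", "TIERART3")

# Every year paired with every parameter.
combos <- expand.grid(year = years, parameter = parameters,
                      stringsAsFactors = FALSE)
combos <- combos[order(combos$year), ]

print(nrow(combos))
#> [1] 6
```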

If you do not specify any parameters for a substatistic, dg_call returns results for all of them by default.

dg_call(region_id = "11", 
        year = c(2001, 2003, 2007), 
        stat_name =  "BETR08", 
        substat_name = "TIERA8") 
#> New names:
#> * name -> name...2
#> * name -> name...7
#> # A tibble: 23 x 15
#>    id    name...2  year TIERA8   value GENESIS_source         name...7 stat_name
#>    <chr> <chr>    <int> <chr>    <int> <chr>                  <chr>    <chr>    
#>  1 11    Berlin    2001 TIER