Eurostat
Back-end code and website for eurostat.linked-statistics.org
Eurostat dataset
About
This project is about publishing EuroStat as Linked Data on the Web.
Design
-
Base URI would be http://eurostat.linked-statistics.org.
-
The VoID discovery mechanism would be delivered through http://eurostat.linked-statistics.org/.well-known/void. Upon discovery, the system will serve the VoID description, which would look like:
@prefix meta: <http://eurostat.linked-statistics.org/meta#> .
@prefix dss: <http://eurostat.linked-statistics.org/dss/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix void: <http://rdfs.org/ns/void#> .
meta:Eurostat a void:Dataset;
    dcterms:title "EuroStat";
    void:subset dss:ds_1, dss:ds_2 .
-
Upon de-referencing any dataset, we serve the DataSet Summary (DSS):
@prefix data: <http://eurostat.linked-statistics.org/data/> .
@prefix dss: <http://eurostat.linked-statistics.org/dss/> .
@prefix dsd: <http://eurostat.linked-statistics.org/dsd#> .
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix void: <http://rdfs.org/ns/void#> .
dss:ds_1 a qb:DataSet, void:Dataset;
    qb:structure dsd:dsd_1;
    void:dataDump data:ds_1.ttl .
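The VoID description shown above could be produced from a plain list of dataset identifiers. The Python sketch below is only an illustration of that idea: the function name `void_description` is hypothetical, `ds_1` and `ds_2` are the placeholder identifiers from the design, and nothing here is taken from the repository's code.

```python
BASE = "http://eurostat.linked-statistics.org"


def void_description(dataset_ids):
    """Return a Turtle VoID description that lists each dataset as a void:subset."""
    if not dataset_ids:
        # An empty object list would not be valid Turtle.
        raise ValueError("at least one dataset identifier is required")
    subsets = ", ".join(f"dss:{ds}" for ds in dataset_ids)
    lines = [
        f"@prefix meta: <{BASE}/meta#> .",
        f"@prefix dss: <{BASE}/dss/> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "",
        "meta:Eurostat a void:Dataset ;",
        '    dcterms:title "EuroStat" ;',
        f"    void:subset {subsets} .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(void_description(["ds_1", "ds_2"]))
```

A server could return this string for requests to `/.well-known/void`; how the response is actually served is left open here.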
Remarks
-
Use N-Quads format to generate dataset triples.
-
Reuse code from linked-eurostat for Dataset RDFication.
-
Use our own code for DSD RDFication.
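For readers unfamiliar with N-Quads, each line is a triple followed by a graph IRI. The following Python sketch shows the line shape only; the helper names and the choice of graph IRI are assumptions made for illustration, and the repository's RDFication code may name graphs differently.

```python
def escape_literal(value):
    """Escape the characters that must not appear raw in an N-Quads literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def nquad(subject, predicate, obj, graph, literal=False):
    """Format one statement as an N-Quads line.

    subject, predicate and graph are IRIs; obj is an IRI unless literal=True.
    """
    obj_term = f'"{escape_literal(obj)}"' if literal else f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} <{graph}> ."


if __name__ == "__main__":
    # Hypothetical graph IRI: here the data dump URI doubles as the graph name.
    print(
        nquad(
            "http://eurostat.linked-statistics.org/dss/ds_1",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
            "http://rdfs.org/ns/void#Dataset",
            "http://eurostat.linked-statistics.org/data/ds_1",
        )
    )
```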
Batch Scripts
A detailed description of each script can be found at this page
Steps to RDFize EuroStat data
The RDFication process can be found at this page
How to convert a single dataset to RDF
- The best way to test the RDFication process is to use the Main.sh script. You are required to download the *.zip file(s) into a directory before running the script. This can be achieved by running EurostatMirror.sh.
- Change the directory path variables in Main.sh to your desired directory paths. Make sure that the directories exist before running the script.
- How to run:
sh Main.sh -i ~/sdmx-code.ttl -l ~/logs/
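Because the script expects its directories to exist, a small wrapper can check this before launching it. The Python sketch below is an assumption-laden illustration: `main_sh_command` is a hypothetical helper, and the roles of `-i` (a file path) and `-l` (a log directory) are inferred only from the example invocation above.

```python
from pathlib import Path


def main_sh_command(sdmx_code_file, log_dir):
    """Build the argument list for Main.sh after checking the log directory."""
    log_path = Path(log_dir).expanduser()
    if not log_path.is_dir():
        raise FileNotFoundError(f"log directory does not exist: {log_path}")
    code_path = Path(sdmx_code_file).expanduser()
    # Same flags as the documented invocation: -i <file> -l <directory>.
    return ["sh", "Main.sh", "-i", str(code_path), "-l", str(log_path)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains Main.sh.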
Example Query
Here is an example query that joins two SDMX datasets:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX e: <http://ontologycentral.com/2009/01/eurostat/ns#>
PREFIX sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX g: <http://eurostat.linked-statistics.org/ontologies/geographic.rdf#>
PREFIX dataset: <http://eurostat.linked-statistics.org/data/>
SELECT ?nuts2
       (SUM(xsd:decimal(?pop)) AS ?population)
       ?wateruse
       (xsd:decimal(?wateruse)*1000000/SUM(xsd:decimal(?pop)) AS ?percapita)
WHERE {
  ?observation qb:dataset dataset:demo_r_pjanaggr3 ;
      e:time <http://eurostat.linked-statistics.org/dic/time#2007> ;
      e:age <http://eurostat.linked-statistics.org/dic/age#TOTAL> ;
      e:sex <http://eurostat.linked-statistics.org/dic/sex#F> ;
      e:geo ?ugeo ;
      sdmx-measure:obsValue ?pop .
  ?ugeo g:hasParentRegion ?parent .
  ?parent rdfs:label ?nuts2 .
  ?wuregion qb:dataset dataset:env_n2_wu ;
      e:geo ?parent ;
      e:cons <http://eurostat.linked-statistics.org/dic/cons#W18_2_7_2> ;
      e:time <http://eurostat.linked-statistics.org/dic/time#2007> ;
      sdmx-measure:obsValue ?wateruse .
}
GROUP BY ?nuts2 ?wateruse
ORDER BY DESC(?percapita)
The query above uses dataset demo_r_pjanaggr3, which contains Population by sex and age groups on 1 January - NUTS level 3 regions. We need populations for NUTS level 2 and we therefore aggregate the dataset by using the NUTS vocabulary to find the parent regions.
We only want data for 2007, and both sexes. We then join the data with the env_n2_wu dataset, which contains Water use (NUTS2) - mio m3. We can then find the regions with the highest domestic water use (code W18_2_7_2) per million inhabitants.
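The arithmetic behind the query can be followed offline with a few rows. The Python sketch below mirrors the two steps, summing NUTS level 3 populations per parent region and then dividing the water use by that sum. All region codes and numbers are invented for illustration and are not Eurostat values. Since water use is given in mio m3, multiplying by 1000000 and dividing by the population gives m3 per inhabitant, which is numerically the same as mio m3 per million inhabitants.

```python
from collections import defaultdict
from decimal import Decimal

# Invented sample rows: (NUTS3 code, parent NUTS2 label, population).
NUTS3_POPULATION = [
    ("XX011", "Region A", Decimal("400000")),
    ("XX012", "Region A", Decimal("600000")),
    ("XX021", "Region B", Decimal("500000")),
]

# Invented water use per NUTS2 region, in mio m3.
WATER_USE_MIO_M3 = {"Region A": Decimal("50"), "Region B": Decimal("40")}


def per_capita(population_rows, water_use):
    """Aggregate populations per parent region and rank regions by use per head."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for _code, parent, population in population_rows:
        totals[parent] += population
    rows = [
        (region, totals[region], use, use * 1000000 / totals[region])
        for region, use in water_use.items()
        if region in totals  # inner join, like the shared ?parent variable
    ]
    # Same ordering as ORDER BY DESC(?percapita).
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for region, population, use, ratio in per_capita(NUTS3_POPULATION, WATER_USE_MIO_M3):
        print(region, population, use, ratio)
```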
Another example query
Below is a SPARQL query that combines 24 Eurostat datasets. It is a combined query on all national statistics for Albacore. The idea is to see if a species needs further protection in the form of fishing quotas etc. A similar query is used here.
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX e: <http://ontologycentral.com/2009/01/eurostat/ns#>
PREFIX sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX g: <http://eurostat.linked-statistics.org/ontologies/geographic.rdf#>
PREFIX dataset: <http://eurostat.linked-statistics.org/data/>
PREFIX eunis: <http://eunis.eea.europa.eu/rdf/species-schema.rdf#>
SELECT ?country ?year ?presentation ?landed ?unit
FROM <http://eurostat.linked-statistics.org/data/fish_ld_be.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_bg.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_cy.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_de.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_dk.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_ee.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_es.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_fi.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_fr.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_gr.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_ie.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_is.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_it.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_lt.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_lv.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_mt.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_nl.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_no.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_pl.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_pt.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_ro.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_se.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_si.rdf>
FROM <http://eurostat.linked-statistics.org/data/fish_ld_uk.rdf>
FROM <http://semantic.eea.europa.eu/home/roug/eurostatdictionaries.rdf>
WHERE {
?obsUri e:species <http://eurostat.linked-statistics.org/dic/species#ALB>;
e:pres <http://eurostat.linked-statistics.org/dic/pres#P00>, ?upresentation;
e:dest <http://eurostat.linked-statistics.org/dic/dest#D0>;
e:natvessr <http://eurostat.linked-statistics.org/dic/natvessr#TOTAL>;
e:unit <http://eurostat.linked-statistics.org/dic/unit#TPW>, ?uunit;
e:geo ?ucountry;
e:time ?uyear;
sdmx-measure:obsValue ?landed.
?ucountry skos:prefLabel ?country.
?uunit skos:prefLabel ?unit.
?uyear skos:prefLabel ?year.
?upresentation skos:prefLabel ?presentation.
} ORDER BY ?country ?year ?presentation
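The 24 `FROM` clauses above differ only in the country code, so they can be generated rather than typed. A minimal Python sketch, assuming the `fish_ld_<code>.rdf` naming seen in the query holds for every listed country (the helper name `from_clauses` is hypothetical):

```python
DATA_NS = "http://eurostat.linked-statistics.org/data/"

# Country codes exactly as they appear in the query above.
COUNTRY_CODES = [
    "be", "bg", "cy", "de", "dk", "ee", "es", "fi", "fr", "gr", "ie", "is",
    "it", "lt", "lv", "mt", "nl", "no", "pl", "pt", "ro", "se", "si", "uk",
]


def from_clauses(codes):
    """Return one SPARQL FROM clause per national fish landings dataset."""
    return [f"FROM <{DATA_NS}fish_ld_{code}.rdf>" for code in codes]


if __name__ == "__main__":
    print("\n".join(from_clauses(COUNTRY_CODES)))
```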
URIs for Eurostat identities
-
The base URI for Eurostat is http://eurostat.linked-statistics.org.
-
The Data Structure Definition (DSD) can be found under http://eurostat.linked-statistics.org/dsd/
For example: http://eurostat.linked-statistics.org/dsd/bsbu_q.ttl
-
The SDMX data sets can be found under http://eurostat.linked-statistics.org/data/
For example: http://eurostat.linked-statistics.org/data/bsbu_q.rdf
-
The dictionaries can be found under http://eurostat.linked-statistics.org/dic/
For example: http://eurostat.linked-statistics.org/dic/geo.rdf
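The URI patterns above can be captured in three small helpers. This Python sketch assumes the file extensions shown in the examples (`.ttl` for DSDs, `.rdf` for data sets and dictionaries) apply to every resource; the function names are hypothetical.

```python
BASE = "http://eurostat.linked-statistics.org"


def dsd_uri(code):
    """URI of the Data Structure Definition for a dataset code."""
    return f"{BASE}/dsd/{code}.ttl"


def data_uri(code):
    """URI of the SDMX data set for a dataset code."""
    return f"{BASE}/data/{code}.rdf"


def dic_uri(name):
    """URI of a dictionary, for example the geo dictionary."""
    return f"{BASE}/dic/{name}.rdf"


if __name__ == "__main__":
    print(dsd_uri("bsbu_q"))
    print(data_uri("bsbu_q"))
    print(dic_uri("geo"))
```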
NameSpaces for Eurostat
-
Namespace for SDMX datasets :
@prefix data: <http://eurostat.linked-statistics.org/data/> .
-
Namespace for Data Structure Definition (DSD) :
@prefix dsd: <http://eurostat.linked-statistics.org/dsd/> .
-
Namespace for dictionaries :
@prefix cl: <http://eurostat.linked-statistics.org/dic/> .
-
Namespace for dataset summaries is :
@prefix dss: <http://eurostat.linked-statistics.org/dss#> .
-
Namespace for the concepts defined in DSDs :
@prefix concept: <http://eurostat.linked-statistics.org/concept#> .
-
Namespace for the properties defined in DSDs :
@prefix property: <http://eurostat.linked-statistics.org/property#> .
-
Namespace for titles of the datasets is :
@prefix title: <http://eurostat.linked-statistics.org/title#> .
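To see how these prefixes resolve, a prefixed name can be expanded by simple concatenation. The Python sketch below copies the namespace table from the list above; the `expand` helper is hypothetical, and it does not validate local names the way a full Turtle parser would.

```python
BASE = "http://eurostat.linked-statistics.org"

# Prefix table copied from the namespace list above.
PREFIXES = {
    "data": f"{BASE}/data/",
    "dsd": f"{BASE}/dsd/",
    "cl": f"{BASE}/dic/",
    "dss": f"{BASE}/dss#",
    "concept": f"{BASE}/concept#",
    "property": f"{BASE}/property#",
    "title": f"{BASE}/title#",
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'data:bsbu_q.rdf' into a full URI."""
    prefix, separator, local = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


if __name__ == "__main__":
    print(expand("data:bsbu_q.rdf"))
```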
License
The software provided in this repository is Open Source.
To Do
- Document the percentage of datasets that change per week, on average
- Explore how to generate a per-country subset of the data
- Load a .ie subset into data-gov.ie dataspace
- Interlinking
  - Regions: DBpedia, Geonames, LinkedGeoData
  - National regions: Data-gov.ie, GeoLinkedData.es, Ordnance Survey
  - Indicators: US Census
  - Topics/Subjects: DBpedia
Pitfalls
There are a few pitfalls in our current approach to converting Eurostat datasets into RDF:
- Some datasets have a time period associated with an observation that is not a date but is instead represented as LTAA (long-term annual average); see [dataset](http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing?sort=1&file=data%2Fenv_watq1a.tsv.gz). Currently we don't know how to deal with LTAA in RDF because we don't have any extra information provided in the SDMX and DSD of
