obistools: Tools for data enhancement and quality control.

Installation
Taxon matching
Check required fields
Plot points on a map
Identify points on a map
Check points on land
Check depth
Check eventID and parentEventID
Check eventID in an extension
Flatten event records
Flatten occurrence and event records
Calculate centroid and radius for WKT geometries
Map column names to Darwin Core terms
Check eventDate
Data quality report
Lookup XY

Installation

Installing obistools requires the devtools package:

install.packages("devtools")
devtools::install_github("iobis/obistools")

Taxon matching

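match_taxa() performs interactive taxon matching with the World Register of Marine Species (WoRMS). Duplicate input names are resolved only once, and the result contains one row per input name.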

names <- c("Abra alva", "Buccinum fusiforme", "Buccinum fusiforme", "Buccinum fusiforme", "hlqsdkf")
match_taxa(names)
3 names, 1 without matches, 1 with multiple matches
Proceed to resolve names (y/n/p)? y

  AphiaID     scientificname      authority     status match_type
1  531014 Buccinum fusiforme   Kiener, 1834 unaccepted      exact
2  510389 Buccinum fusiforme Broderip, 1830 unaccepted      exact

Multiple matches, pick a number or leave empty to skip: 2

        scientificName                          scientificNameID match_type
1            Abra alba urn:lsid:marinespecies.org:taxname:141433     near_1
2   Buccinum fusiforme urn:lsid:marinespecies.org:taxname:510389      exact
2.1 Buccinum fusiforme urn:lsid:marinespecies.org:taxname:510389      exact
2.2 Buccinum fusiforme urn:lsid:marinespecies.org:taxname:510389      exact
3                 <NA>                                      <NA>       <NA>
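
The row names in the result above (2, 2.1, 2.2) are what base R produces when a per-name result is expanded back to the original input vector. The following is a minimal sketch of that expansion step only, not the package's actual implementation; the `resolved` table is hand-copied from the output above and stands in for the WoRMS responses.

```r
names <- c("Abra alva", "Buccinum fusiforme", "Buccinum fusiforme", "Buccinum fusiforme", "hlqsdkf")

# Only unique names need to be looked up (3 here, as reported in the prompt above).
unique_names <- unique(names)

# Hypothetical lookup result, one row per unique name (assumption: copied by hand).
resolved <- data.frame(
  scientificName = c("Abra alba", "Buccinum fusiforme", NA),
  scientificNameID = c(
    "urn:lsid:marinespecies.org:taxname:141433",
    "urn:lsid:marinespecies.org:taxname:510389",
    NA
  ),
  match_type = c("near_1", "exact", NA),
  stringsAsFactors = FALSE
)

# Expand back to one row per input name; repeated rows get suffixed row names.
matched <- resolved[match(names, unique_names), ]
print(matched)
```

Because the result has one row per input name, it can be bound column-wise to the table the names came from.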

Check required fields

check_fields() checks whether all OBIS required fields are present in an occurrence table and whether any of their values are missing.

data <- data.frame(
  occurrenceID = c("1", "2", "3"),
  scientificName = c("Abra alba", NA, ""),
  locality = c("North Sea", "English Channel", "Flemish Banks"),
  minimumDepthInMeters = c("10", "", "5")
)

check_fields(data)

This function returns a data frame of errors (if any):

             field level                                       message row
1        eventDate error           Required field eventDate is missing  NA
2 decimalLongitude error    Required field decimalLongitude is missing  NA
3  decimalLatitude error     Required field decimalLatitude is missing  NA
4 scientificNameID error    Required field scientificNameID is missing  NA
5 occurrenceStatus error    Required field occurrenceStatus is missing  NA
6    basisOfRecord error       Required field basisOfRecord is missing  NA
7   scientificName error Empty value for required field scientificName   2
8   scientificName error Empty value for required field scientificName   3
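
To make the two kinds of error concrete, here is a rough base R sketch of the same idea. It is not the package's code: `check_required()` is a hypothetical helper, and the `required` vector is an assumption inferred from the fields named in the output above.

```r
# Assumption: required fields as they appear in the error output above.
required <- c("eventDate", "decimalLongitude", "decimalLatitude", "scientificName",
              "scientificNameID", "occurrenceStatus", "basisOfRecord")

# Hypothetical helper: report missing columns, then empty values in present columns.
check_required <- function(data, required) {
  missing_fields <- setdiff(required, names(data))
  errors <- data.frame(
    field = missing_fields,
    level = rep("error", length(missing_fields)),
    message = sprintf("Required field %s is missing", missing_fields),
    row = rep(NA_integer_, length(missing_fields)),
    stringsAsFactors = FALSE
  )
  for (field in intersect(required, names(data))) {
    values <- data[[field]]
    empty <- which(is.na(values) | values == "")  # NA and "" both count as empty
    if (length(empty) > 0) {
      errors <- rbind(errors, data.frame(
        field = field,
        level = "error",
        message = sprintf("Empty value for required field %s", field),
        row = empty,
        stringsAsFactors = FALSE
      ))
    }
  }
  errors
}

data <- data.frame(
  occurrenceID = c("1", "2", "3"),
  scientificName = c("Abra alba", NA, ""),
  locality = c("North Sea", "English Channel", "Flemish Banks"),
  minimumDepthInMeters = c("10", "", "5")
)

errors <- check_required(data, required)
print(errors)
```

Missing columns get `row = NA` because they do not belong to any single record, while empty values carry the position of the offending record.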

Plot points on a map

plot_map() generates a ggplot2 map of occurrence records; plot_map_leaflet() creates a Leaflet map instead.

plot_map(abra, zoom = TRUE)

https://raw.githubusercontent.com/iobis/obistools/master/images/abra.png

plot_map_leaflet(abra)

https://raw.githubusercontent.com/iobis/obistools/master/images/abra_2.png

Identify points on a map

Use identify_map() to identify points on a ggplot2 map. This function will return the record closest to where the mouse was clicked.

plot_map(abra, zoom = TRUE)
identify_map(abra)
            id decimalLongitude decimalLatitude    basisOfRecord           eventDate institutionCode
2078 384334009            29.51           43.97 HumanObservation 2010-05-20 10:00:00       GeoEcoMar
                                           collectionCode                            catalogNumber         locality
2078 GeoEcoMar BlackSea R/V Mare Nigrum Cruises 2010-2011 GeoEcoMar_BlackSeaCruises_2003_2011_3723 Constanta_10CT05
                                                                         datasetName   phylum    order    family
2078 Macrobenthos data from the Romanian part of the Black Sea between 2003 and 2011 Mollusca Cardiida Semelidae
     genus scientificName originalScientificName scientificNameAuthorship obisID resourceID yearcollected   species
2078  Abra      Abra alba              Abra alba          (W. Wood, 1802) 395450       4273          2010 Abra alba
            qc aphiaID speciesID continent coordinateUncertaintyInMeters       datasetID            modified
2078 859307135  141433    395450 Black Sea                          <NA> IMIS:dasid:5256 2015-12-27 00:00:00
                                 occurrenceID recordedBy                          scientificNameID    class
2078 GeoEcoMar_BlackSeaCruises_2003_2011_3723       <NA> urn:lsid:marinespecies.org:taxname:141433 Bivalvia
     lifestage  sex individualCount eventID depth minimumDepthInMeters maximumDepthInMeters fieldNumber
2078      <NA> <NA>              NA    <NA> 60.94                60.94                60.94           I
     occurrenceRemarks eventTime footprintWKT identifiedBy
2078              <NA>      <NA>         <NA>     Teaca A.
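
The "closest record" selection can be illustrated without a plotting device. The sketch below assumes plain Euclidean distance in degrees, which may differ from what identify_map() does internally; `closest_record()` and the three-row `records` table are hypothetical.

```r
# Hypothetical records; only the coordinate columns matter here.
records <- data.frame(
  id = c("a", "b", "c"),
  decimalLongitude = c(29.51, 2.90, -0.91),
  decimalLatitude = c(43.97, 51.30, 54.57),
  stringsAsFactors = FALSE
)

# Return the record closest to the position (x, y), e.g. a mouse click.
closest_record <- function(data, x, y) {
  squared_distance <- (data$decimalLongitude - x)^2 + (data$decimalLatitude - y)^2
  data[which.min(squared_distance), ]
}

clicked <- closest_record(records, x = 29.4, y = 44.1)
print(clicked)
```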

Check points on land

check_onland() uses the xylookup web service, which internally relies on land polygons from OpenStreetMap, to check whether any points are located on land. Other shapefiles can be used as well.

check_onland(abra)
          id decimalLongitude decimalLatitude basisOfRecord           eventDate
31 365512845       -0.9092748        54.57467    Occurrence 2011-09-03 10:00:00
                                      institutionCode collectionCode catalogNumber                      locality
31 Yorkshire Naturalists' Union Marine and Coastal Se          60051     261729389 Skinningrove. Cattersty Sands
                                                      datasetName   phylum    order    family genus scientificName
31 Yorkshire Naturalists Union Marine and Coastal Section Records Mollusca Cardiida Semelidae  Abra      Abra alba
   originalScientificName scientificNameAuthorship obisID resourceID yearcollected   species         qc aphiaID
31              Abra alba          (W. Wood, 1802) 395450       3083          2011 Abra alba 1073216639  141433
   speciesID continent coordinateUncertaintyInMeters       datasetID            modified
31    395450    Europe                         707.0 IMIS:dasid:3182 2014-04-16 16:16:43
                                                                     occurrenceID    recordedBy
31 urn:catalog:Yorkshire Naturalists' Union Marine and Coastal Se:60051:261729389 Adrian Norris
                            scientificNameID    class lifestage  sex individualCount eventID depth
31 urn:lsid:marinespecies.org:taxname:141433 Bivalvia      <NA> <NA>              NA    <NA>    NA
   minimumDepthInMeters maximumDepthInMeters fieldNumber occurrenceRemarks eventTime footprintWKT identifiedBy
31                   NA                   NA        <NA>              <NA>      <NA>         <NA>         <NA>

check_onland(abra, report = TRUE)
  field   level row                         message
1    NA warning  31 Coordinates are located on land
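
A report like the one above can be used to flag or drop records in the original table. This is a small sketch under the assumption that the `row` column holds row positions in the input table; the `occurrences` table and the `report` are written by hand here so that the example runs without the web service.

```r
# Hypothetical occurrence table with five records.
occurrences <- data.frame(
  occurrenceID = paste0("occ_", 1:5),
  decimalLongitude = c(2.9, 3.1, -0.91, 2.5, 3.0),
  decimalLatitude = c(51.3, 51.5, 54.57, 51.2, 51.4),
  stringsAsFactors = FALSE
)

# Hand-written report in the same shape as the output above.
report <- data.frame(
  field = NA,
  level = "warning",
  row = 3,
  message = "Coordinates are located on land",
  stringsAsFactors = FALSE
)

# Flag the reported rows, then keep only the records that were not flagged.
flagged_rows <- report$row[!is.na(report$row)]
occurrences$on_land <- seq_len(nrow(occurrences)) %in% flagged_rows
at_sea <- occurrences[!occurrences$on_land, ]
print(at_sea)
```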

Check depth

check_depth() uses the xylookup web service to identify records with potentially invalid depths. It performs the following checks:

  • missing depth column (warning)
  • empty depth column (warning)
  • depth values that can't be converted to numbers (error)
  • depth values that are larger than the depth value in the bathymetry layer, after applying the provided depthmargin (error)
  • depth values that are negative for offshore points, after applying the provided shoremargin (error)
  • minimum depth greater than maximum depth (error)

plot_map(check_depth(abra, depthmargin = 50), zoom = TRUE)

https://raw.githubusercontent.com/iobis/obistools/master/images/abra_check_depth_50.png

report <- check_depth(abra, report = TRUE, depthmargin = 50)
head(report)
                 field level  row                                                                                              message
1 minimumDepthInMeters error 1209 Depth value (52.9) is greater than the value found in the bathymetry raster (depth=-27.0, margin=50)
2 minimumDepthInMeters error 1226   Depth value (62.3) is greater than the value found in the bathymetry raster (depth=4.4, margin=50)
3 minimumDepthInMeters error 1232   Depth value (64.9) is greater than the value found in the bathymetry raster (depth=5.8, margin=50)
4 minimumDepthInMeters error 1235   Depth value (61.2) is greater than the value found in the bathymetry raster (depth=4.0, margin=50)
5 minimumDepthInMeters error 1249   Depth value (68.3) is greater than the value found in the bathymetry raster (depth=8.0, margin=50)
6 minimumDepthInMeters error 1250   Depth value (72.9) is greater than the value found in the bathymetry raster (depth=5.0, margin=50)
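
Three of the checks listed above can be sketched in base R. This is an illustration only: `flag_depths()` is a hypothetical helper, the bathymetry values are supplied by hand instead of coming from xylookup, the shore-distance check is left out, and the comparison rule (depth greater than bathymetry plus margin) is inferred from the messages in the report above.

```r
# Hypothetical helper: flag non-numeric depths, depths below the sea floor
# (after adding the margin), and minimum depths greater than maximum depths.
flag_depths <- function(min_depth, max_depth, bathymetry, depthmargin = 0) {
  min_num <- suppressWarnings(as.numeric(min_depth))
  max_num <- suppressWarnings(as.numeric(max_depth))
  data.frame(
    not_numeric = !is.na(min_depth) & min_depth != "" & is.na(min_num),
    too_deep = !is.na(min_num) & min_num > bathymetry + depthmargin,
    min_gt_max = !is.na(min_num) & !is.na(max_num) & min_num > max_num
  )
}

# The first two records reuse depth and bathymetry values from the report above;
# the last two are made up to trigger the other checks.
flags <- flag_depths(
  min_depth = c("52.9", "62.3", "10", "abc"),
  max_depth = c("52.9", "62.3", "5", "20"),
  bathymetry = c(-27.0, 4.4, 100, 50),
  depthmargin = 50
)
print(flags)
```

With a margin of 50, a depth of 52.9 is flagged against a bathymetry value of -27.0 because 52.9 exceeds -27.0 + 50 = 23.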

Check eventID and parentEventID

check_eventids() checks if both eventID() and `parentE
