SkillAgentSearch skills...

Chemspiderapi

R functionalites for ChemSpider's new API services

Install / Use

/learn @NIVANorge/Chemspiderapi

README

chemspiderapi

<!-- README.md is generated from README.Rmd. Please edit that file --> <!-- badges: start -->

R build
status CodeCov test
coverage

<!-- badges: end -->

R functionalities for ChemSpider’s API services

ChemSpider has introduced a new API syntax in late 2018, and the old ChemSpider API syntax will be shut down at the end of November 2018. {chemspiderapi} provides R wrappers around the new API services from ChemSpider.

The aim of this package is to:

  1. Translate the new ChemSpider API services into R-friendly functions.
  2. Include thorough quality checking before the query is send, to avoid using up the query quota on, e.g., spelling errors.
  3. Implement the R functionality in a way that is suitable for both {base}- and {tidyverse}-style programming.

{chemspiderapi} relies on API keys to access ChemSpider’s API services. For this, we recommend storing the ChemSpider API key using the {keyring} package.

To limit the rate of API queries, we recommend using the {ratelimitr} package or purrr::slowly() within the {tidyverse} package collection.

To handle PNG images, the {magick} package is recommended.

We furthermore recommend the {memoise} package to “remember” the results of API queries (i.e., to not ruin the API allowance).

Additionally, the superb {webchem} package provides access to a lot of additional chemistry-related API services and is highly recommended.

Installation

R package

Install the package from GitHub (using the {remotes} package; automatically installed as part of {devtools}):

# install.packages("devtools")
remotes::install_github("NIVANorge/chemspiderapi")

The development version can be installed with:

# install.packages("devtools")
remotes::install_github("NIVANorge/chemspiderapi", ref = "dev")

Currently the only tested continuous integration environment for {chemspiderapi} is Linux. It should install smoothly on Windows 10 and mac OS machines as well. Please open an issue if you run into any troubles.

Dependencies

{chemspiderapi} relies on two essential dependencies. The {curl} package is used to handle the API queries and the {jsonlite} package is necessary to wrap and unwrap information for the queries.

If not already installed, these packages will be installed automatically when installing {chemspiderapi}. Should this result in trouble, the dependency packages can be installed manually:

install.packages(c("curl", "jsonlite"))

If {curl} or {jsonlite} are missing, (almost) all functions of {chemspiderapi} will fail and throw an error.

Coverage

As of 2021-01-05, the following functionalities are implemented (100% functionality with 100% annotation):

FILTERING

| ChemSpider Compound API | chemspiderapi Wrapper | chemspiderapi Help File | | :--------------------------------------- | :------------------------------------ | :-----------------------: | | filter-element-post | post_element() | yes | | filter-formula-batch-post | post_formula_batch() | yes | | filter-formula-batch-queryId-results-get | get_formula_batch_queryId_results() | yes | | filter-formula-batch-queryId-status-get | get_formula_batch_queryId_status() | yes | | filter-formula-post | post_formula() | yes | | filter-inchi-post | post_inchi() | yes | | filter-inchikey-post | post_inchikey() | yes | | filter-intrinsicproperty-post | post_intrinsicproperty() | yes | | filter-mass-batch-post | post_mass_batch() | yes | | filter-mass-batch-queryId-results-get | get_mass_batch_queryId_results() | yes | | filter-mass-batch-queryId-status-get | get_mass_batch_queryId_status() | yes | | filter-mass-post | post_mass() | yes | | filter-name-post | post_name() | yes | | filter-queryId-results-get | get_queryId_results() | yes | | filter-queryId-results-sdf-get | get_queryId_results_sdf() | yes | | filter-queryId-status-get | get_queryId_status() | yes | | filter-smiles-post | post_smiles() | yes |

LOOKUPS

| ChemSpider Compound API | chemspiderapi Wrapper | chemspiderapi Help File | | :---------------------- | :---------------------- | :-----------------------: | | lookups-datasources-get | get_datasources() | yes |

RECORDS

| ChemSpider Compound API | chemspiderapi Wrapper | chemspiderapi Help File | | :-------------------------------------- | :---------------------------------- | :-----------------------: | | records-batch-post | post_batch() | yes | | records-recordId-details-get | get_recordId_details() | yes | | records-recordId-externalreferences-get | get_recordId_externalreferences() | yes | | records-recordId-image-get | get_recordId_image() | yes | | records-recordId-mol-get | get_recordId_mol() | yes |

TOOLS

| ChemSpider Compound API | chemspiderapi Wrapper | chemspiderapi Help File | | :--------------------------- | :------------------------- | :-----------------------: | | tools-convert-post | post_convert() | yes | | tools-validate-inchikey-post | post_validate_inchikey() | yes |

Best practices for ChemSpider’s Compound APIs

This section will be updated with practical examples in the future.

The basic workflow order for the above FILTERING queries is:

  1. POST Query

  2. GET Status

  3. GET Results (after GET Status returns "Complete")

In practice, this means the following possible workflows can be implemented (from left to right):

| POST Query | GET Status | GET Results | | :------------------------- | :----------------------------------- | :------------------------------------ | | post_element() | get_queryId_status() | get_queryId_results() | | post_formula() | get_queryId_status() | get_queryId_results() | | post_formula_batch() | get_formula_batch_queryId_status() | get_formula_batch_queryId_results() | | post_inchi() | get_queryId_status() | get_queryId_results() | | post_inchikey() | get_queryId_status() | get_queryId_results() | | post_intrinsicproperty() | get_queryId_status() | get_queryId_results() | | post_mass() | get_queryId_status() | get_queryId_results() | | post_mass_batch() | get_mass_batch_queryId_status() | get_mass_batch_queryId_results() | | post_mass() | get_queryId_status() | get_queryId_results() | | post_name() | get_queryId_status() | get_queryId_results() | | post_smiles() | get_queryId_status() | get_queryId_results() |

Typically, the result will be one or multiple ChemSpider IDs (recordId). They can be used as input into the above RECORDS queries, e.g., get_recordId_details().

Vignettes

As of 2021-01-05, the following five vignettes are available:

  • Storing and Accessing API Keys: A basic example on how to safely store and retrieve API keys using the keyring package.

  • Rate-Limiting API Queries: Multiple examples on how to slow down API queries. base, ratelimitr, and tidyverse workflows are provided.

  • Remembering API Queries: A basic example on how to “remember” the results of API queries using the memoise package. This can greatly reduce redundant API queries.

  • Saving MOL and SDF Files: Examples on how to save MOL and SDF files as returned by get_recordId_mol() and get_queryId_results_sdf().

  • Saving PNG Images: Examples on how to save PNG files, as returned by get_recordId_image(), using functionalities of the magick package.

Funding

This package was created at

Related Skills

View on GitHub
GitHub Stars6
CategoryDevelopment
Updated1y ago
Forks3

Languages

R

Security Score

75/100

Audited on Jan 24, 2025

No findings