nodbi

⁠ ⁠

nodbi is an R package that provides a single interface for several NoSQL databases and databases with JSON functionality, with the same function parameters and return values across all database backends. Last updated 2026-02-21.

| Currently, nodbi supports<br/>as database backends | for an R object of any<br/>of these data types | for these operations | |:---|:---|:---| | MongoDB | data.frame | List, Exists | | SQLite | list | Create | | PostgreSQL | JSON string | Get | | DuckDB | file name of NDJSON records | Query | | Elasticsearch | URL of NDJSON records | Update | | CouchDB | | Delete |

For speed comparisons of database backends, see benchmark and testing below.

Plans for developing package nodbi in 2026 include to add MariaDB and MySQL as database backends.

API overview

Parameters for docdb_*() functions are the same across all database backends. See walk-through below and the canonical testing in core-nodbi.R. “Container” is used as term to indicate where conceptually the backend holds the data, see Database connections below. The key parameter holds the name of a container.

| Purpose | Function call | |:---|:---| | Create database connection (see below) | src <- nodbi::src_{duckdb, postgres, mongo, sqlite, couchdb, elastic}(<see below for parameters>) | | Load my_data (a data frame, list, JSON string, or file name or URL pointing to NDJSON records) into database, container my_container | nodbi::docdb_create(src = src, key = "my_container", value = my_data) | | Get all documents back into a data frame | nodbi::docdb_get(src = src, key = "my_container") | | Get documents selected with query (as MongoDB-compatible JSON) into a data frame | nodbi::docdb_query(src = src, key = "my_container", query = '{"age": 20}') | | Get selected fields (in MongoDB compatible JSON) from documents selected by query into a data frame | nodbi::docdb_query(src = src, key = "my_container", query = '{"age": {"$gt": 20}}', fields = '{"friends.name": 1, "_id": 0, "age": 1}', limit = 2L) | | Update (patch) documents selected by query with new data my_data (in a data frame, list, JSON string, or file name or URL pointing to NDJSON records) | nodbi::docdb_update(src = src, key = "my_container", value = my_data, query = '{"age": 20}') | | Check if container exists | nodbi::docdb_exists(src = src, key = "my_container") | | List all containers in database | nodbi::docdb_list(src = src) | | Delete document(s) in container | nodbi::docdb_delete(src = src, key = "my_container", query = '{"age": 20}') | | Delete container | nodbi::docdb_delete(src = src, key = "my_container") | | Close and remove database connection manually (when restarting R, connections are automatically closed and removed by nodbi) | rm(src) |

Install

CRAN version

install.packages("nodbi")

Development version

remotes::install_github("ropensci/nodbi")

Load package from library

library("nodbi")

Database connections

The following subsections show those parameters and aspects that are specific to the database backend. These are only needed once, for src_*() to create a connection object. Any such connection object is subsequently used similarly across the docdb_* functions in package ‘nodbi’.

“Container” refers to how conceptually the database backend holds the data. Users specify the relevant container with parameter key = <container_name> in docdb_* functions, see Walk-through below).

Data types are mapped from JSON to R objects by jsonlite. Any root-level _id is extracted from the document(s) and used for an index column _id, otherwise an UUID is created as _id.

DuckDB

See also https://CRAN.R-project.org/package=duckdb. “Container” refers to a DuckDB table, with columns _id and json created and used by package nodbi, applying SQL functions and functions as per https://duckdb.org/docs/extensions/json to the json column. Each row in the table represents a JSON document.

src <- nodbi::src_duckdb(dbdir = ":memory:", ...)

MongoDB

“Container” refers to a MongoDB collection, in which nodbi creates JSON documents. See also https://jeroen.github.io/mongolite/. MongoDB but none of the other databases require to specify the container name already in the src_*() function; use the collection name for parameter key in docdb_* functions.

src <- nodbi::src_mongo(
  collection = "my_container", db = "my_database",
  url = "mongodb://localhost", ...
)

SQLite

“Container” refers to an SQLite table, with columns _id and json created and used by package nodbi, applying SQL functions and functions as per https://www.sqlite.org/json1.html to the json column. Each row in the table represents a JSON document. The table is indexed on _id. See also https://CRAN.R-project.org/package=RSQLite.

src <- nodbi::src_sqlite(dbname = ":memory:", ...)

PostgreSQL

“Container” refers to a PostgreSQL table, with columns _id and json created and used by package nodbi, applying SQL functions and functions as per https://www.postgresql.org/docs/current/functions-json.html to the json column. With PostgreSQL, a custom plpgsql function jsonb_merge_patch() is used for docdb_update(). The order of variables in data frames returned by docdb_get() and docdb_query() can differ from their order the input to docdb_create().

src <- nodbi::src_postgres(
  dbname = "my_database", host = "127.0.0.1", port = 5432L, ...
)

CouchDB

“Container” refers to a CouchDB database, in which nodbi creates JSON documents. See also https://CRAN.R-project.org/package=sofa. With CouchDB, function docdb_update() uses jqr to implement patching JSON, in analogy to functions available for the other databases.

src <- nodbi::src_couchdb(
  host = "127.0.0.1", port = 5984L, path = NULL,
  transport = "http", user = NULL, pwd = NULL, headers = NULL
)

Elasticsearch

“Container” refers to an Elasticsearch index, in which nodbi creates JSON documents. Opensearch can equally be used. Only lowercase is accepted for container names (in parameter key of docdb_* functions).

src <- nodbi::src_elastic(
  host = "127.0.0.1", port = 9200L, path = NULL,
  transport_schema = "http", user = NULL, pwd = NULL, ...
)

Walk-through

This example is to show how functional nodbi is at this time: With any of the six database backends, the functions work in the same way and return the same values.

# load nodbi
library(nodbi)

# name of container
key <- "my_container"

# connect any of these database backends
src <- src_duckdb()
src <- src_mongo(collection = key)
src <- src_sqlite()
src <- src_postgres()
src <- src_elastic()
src <- src_couchdb(
  user = Sys.getenv("COUCHDB_TEST_USER"),
  pwd = Sys.getenv("COUCHDB_TEST_PWD")
)

# check if container already exists
docdb_exists(src, key)
# [1] FALSE

# load data (here data frame, alternatively a list, JSON or file with NSJSON)
# into the container "my_container" specified in "key" parameter
docdb_create(src, key, value = mtcars)
# [1] 32

# load additionally 98 NDJSON records
docdb_create(src, key, "https://httpbin.org/stream/98")
# Note: container 'my_container' already exists
# [1] 98

# load additionally contacts JSON data, from package nodbi
docdb_create(src, key, contacts)
# Note: container 'my_container' already exists
# [1] 5

# get all documents, irrespective of schema
dplyr::tibble(docdb_get(src, key))
# # A tibble: 135 × 27
#    `_id` isActive balance   age eyeColor name  email about registered tags   friends
#    <chr> <lgl>    <chr>   <int> <chr>    <chr> <chr> <chr> <chr>      <list> <list>
#  1 5cd6… TRUE     $2,412…    20 blue     Kris… kris… "Sin… 2017-07-1… <chr>  <df>
#  2 5cd6… FALSE    $3,400…    20 brown    Rae … raec… "Nis… 2018-12-1… <chr>  <df>
#  3 5cd6… TRUE     $1,161…    22 brown    Pace… pace… "Eiu… 2018-08-1… <chr>  <df>
#  4 5cd6… FALSE    $2,579…    30 brown    Will… will… "Nul… 2018-02-1… <chr>  <df>
#  5 5cd6… FALSE    $3,808…    23 green    Lacy… lacy… "Sun… 2014-08-0… <chr>  <df>
#  6 69bc… NA       NA         NA NA       NA    NA     NA   NA         <NULL> <NULL>
#  7 69bc… NA       NA         NA NA       NA    NA     NA   NA         <NULL> <NULL>
#  8 69bc… NA       NA         NA NA       NA    NA     NA   NA         <NULL> <NULL>
#  9 69bc… NA       NA         NA NA       NA    NA     NA   NA         <NULL> <NULL>
# 10 69bc… NA       NA         NA NA       NA    NA     NA   NA         <NULL> <NULL>
# # ℹ 125 more rows
# # ℹ 16 more variables: url <chr>, args <df[,0]>, headers <df[,4]>, origin <chr>,
# #   id <int>, mpg <dbl>, cyl <int>, disp <dbl>, hp <int>, drat <dbl>, wt <dbl>,
# #   qsec <dbl>, vs <int>, am <int>, gear <int>, carb <int>
# # ℹ Use `print(n = ...)` to see more rows


# query some documents
docdb_query(src, key, query = '{"mpg": {"$gte": 30}}')
#              _id mpg cyl disp  hp drat  wt qsec vs am gear carb
# 1       Fiat 128  32   4   79  66  4.1 2.2   19  1  1    4    1
# 2    Honda Civic  30   4   76  52  4.9 1.6   19  1  1    4    2
# 3 To

Nodbi

Install / Use

README