ReadMDTable
R π¦ for reading markdown tables into tibbles
Install / Use
/learn @jrdnbradford/ReadMDTableREADME
readMDTable <a href="https://jrdnbradford.github.io/readMDTable/"><img src="man/figures/logo.png" align="right" height="139" alt="readMDTable website" /></a>
<!-- badges: start --> <!-- badges: end -->readMDTable helps convert raw markdown tables from a string, file, or URL to tibbles.
Many sites (like GitHub) convert markdown tables into HTML tables, making both available. See the vignette Benchmarking Against rvest to help determine if you should use readMDTable or rvest in those circumstances.
Installation
Install the latest CRAN release with:
install.packages("readMDTable")
Install the development version from GitHub using pak:
# install.packages("pak")
pak::pkg_install("jrdnbradford/readMDTable")
Usage
If you have a string, file, or URL whose entire content is just a
markdown table, you should use read_md_table which will return a
tibble.
If the string, file, or URL is a markdown file that has other
content besides just a table or tables, such as headings, paragraphs,
etc, you should use extract_md_tables which will parse the file and
return a tibble or list of tibbles.
From a File
Read in an example markdown table from the package:
mtcars_file <- read_md_table_example("mtcars.md")
read_md_table(mtcars_file)
#> Rows: 32 Columns: 12
#> ββ Column specification ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
#> Delimiter: "|"
#> chr (1): model
#> dbl (11): mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
#>
#> βΉ Use `spec()` to retrieve the full column specification for this data.
#> βΉ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 32 Γ 12
#> model mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Mazda RX4 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 Mazda RX4 β¦ 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 Datsun 710 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 Hornet 4 D⦠21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 5 Hornet Spo⦠18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 6 Valiant 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 7 Duster 360 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 8 Merc 240D 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 9 Merc 230 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 10 Merc 280 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> # βΉ 22 more rows
Read in an example markdown file that has multiple tables as well as headings and paragraphs:
mtcars_file <- read_md_table_example("mtcars-split.md")
extract_md_tables(mtcars_file, show_col_types = FALSE)
#> [[1]]
#> # A tibble: 4 Γ 12
#> model mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Mazda RX4 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 2 Mazda RX4 W⦠21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 3 Datsun 710 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 4 Hornet 4 Dr⦠21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#>
#> [[2]]
#> # A tibble: 4 Γ 12
#> model mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Hornet Spor⦠18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 2 Valiant 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 3 Duster 360 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 4 Merc 240D 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#>
#> [[3]]
#> # A tibble: 4 Γ 12
#> model mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Cadillac Fl⦠10.4 8 472 205 2.93 5.25 18.0 0 0 3 4
#> 2 Lincoln Con⦠10.4 8 460 215 3 5.42 17.8 0 0 3 4
#> 3 Chrysler Im⦠14.7 8 440 230 3.23 5.34 17.4 0 0 3 4
#> 4 Fiat 128 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1
#>
#> [[4]]
#> # A tibble: 6 Γ 12
#> model mpg cyl disp hp drat wt qsec vs am gear carb
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Porsche 914β¦ 26 4 120. 91 4.43 2.14 16.7 0 1 5 2
#> 2 Lotus Europa 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
#> 3 Ford Panter⦠15.8 8 351 264 4.22 3.17 14.5 0 1 5 4
#> 4 Ferrari Dino 19.7 6 145 175 3.62 2.77 15.5 0 1 5 6
#> 5 Maserati Bo⦠15 8 301 335 3.54 3.57 14.6 0 1 5 8
#> 6 Volvo 142E 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
From a String
read_md_table("| len | supp | dose |\n|---|---|---|\n| 4.2 | VC | 0.5 |\n")
#> Rows: 1 Columns: 3
#> ββ Column specification ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
#> Delimiter: "|"
#> chr (1): supp
#> dbl (2): len, dose
#>
#> βΉ Use `spec()` to retrieve the full column specification for this data.
#> βΉ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 1 Γ 3
#> len supp dose
#> <dbl> <chr> <dbl>
#> 1 4.2 VC 0.5
From a URL
read_md_table("https://raw.githubusercontent.com/jrdnbradford/readMDTable/main/inst/extdata/iris.md")
#> Rows: 150 Columns: 5
#> ββ Column specification ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
#> Delimiter: "|"
#> chr (1): variety
#> dbl (4): sepal.length, sepal.width, petal.length, petal.width
#>
#> βΉ Use `spec()` to retrieve the full column specification for this data.
#> βΉ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 150 Γ 5
#> sepal.length sepal.width petal.length petal.width variety
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 Setosa
#> 2 4.9 3 1.4 0.2 Setosa
#> 3 4.7 3.2 1.3 0.2 Setosa
#> 4 4.6 3.1 1.5 0.2 Setosa
#> 5 5 3.6 1.4 0.2 Setosa
#> 6 5.4 3.9 1.7 0.4 Setosa
#> 7 4.6 3.4 1.4 0.3 Setosa
#> 8 5 3.4 1.5 0.2 Setosa
#> 9 4.4 2.9 1.4 0.2 Setosa
#> 10 4.9 3.1 1.5 0.1 Setosa
#> # βΉ 140 more rows
extract_md_tables("https://raw.githubusercontent.com/jrdnbradford/readMDTable/main/inst/extdata/ToothGrowth.md")
#> Rows: 60 Columns: 4
#> ββ Column specification ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
#> Delimiter: "|"
#> chr (1): supp
#> dbl (3): rownames, len, dose
#>
#> βΉ Use `spec()` to retrieve the full column specification for this data.
#> βΉ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 60 Γ 4
#> rownames len supp dose
#> <dbl> <dbl> <chr> <dbl>
#> 1 1 4.2 VC 0.5
#> 2 2 11.5 VC 0.5
#> 3 3 7.3 VC 0.5
#> 4 4 5.8 VC 0.5
#> 5 5 6.4 VC 0.5
#> 6 6 10 VC 0.5
#> 7 7 11.2 VC 0.5
#> 8 8 11.2 VC 0.5
#> 9 9 5.2 VC 0.5
#> 10 10 7 VC 0.5
#> # βΉ 50 more rows
Warnings and Messy Data
read_md_table will throw warnings by default if there are potential
issues with the markdown table. In many cases it will still correctly
read in the messy data if you use force = TRUE:
read_md_table(
" | Name | Age | City | Date |
|-------|-----|-------------|------------|
| Alice | 30 | | 2021/01/08 |
| Bob | 25 | Los Angeles | 2023/07/22
| Carol | 27 | Chicago | |",
force = TRUE
)
#> War
Related Skills
feishu-drive
350.1k|
things-mac
350.1kManage Things 3 via the `things` CLI on macOS (add/update projects+todos via URL scheme; read/search/list from the local Things database)
clawhub
350.1kUse the ClawHub CLI to search, install, update, and publish agent skills from clawhub.com
postkit
PostgreSQL-native identity, configuration, metering, and job queues. SQL functions that work with any language or driver
