BioSampleParser
R tool for parsing .xml metadata obtained through NCBI BioSample into tabular format
Install / Use
/learn @angelolimeta/BioSampleParserREADME
BioSampleParser

A tool designed to provide users with easier access to study metadata.
BioSampleParser takes the following study identifiers as input:
- European Nucleotide Archive (EMBL-EBI) accession number, e.g. PRJNA397906.
- NCBI BioProject ID, e.g. 397906.
It then queries NCBI BioSample for any samples related to the study, downloads sample metadata in .xml format and parses it into a tabular format of choice (Data frame object in R, or .tsv file).
Setup and Usage
Donwload BioSampleParser.R and place it in your directory of choice.
Run the following command in an R session to source the function:
source("/path/to/BioSampleParser.R")
Call the function with the query argumnet set to your study identifier:
df_metadata <- BioSampleParser(query = "PRJNA397906")
List of all arguments
query
- character, ENA ID or NCBI BioProject ID for the study of interest
filePath
- character, path to local NCBI BioProject .xml file.
file.tsv
- character, filename for saving metadata as a .tsv file.
