<span style="color:mediumblue">SampleSheet.py</span>

Create an Illumina® Sample Sheet, the comma-separated text document required by Illumina® sequencing systems to specify (1) sequencing parameters and (2) sample-barcode relationships. Customize [Header], [Reads], and [Data] sections; in particular, draw on up to 192 8-bp barcodes (96x i7 & 96x i5 indices) to specify up to 9,216 sample-barcode relationships for multiplexed amplicon sequencing. <br/><br/>

<span style="color:mediumblue">Background</span>

<img src="SampleSheet_img/SampleSheet_thumbnail.png" align="left" width="600"> Sequencing by synthesis (SBS) collects millions to billions of DNA sequence reads *en masse*. DNA templates from tens to thousands of independent sample sources can be barcoded, pooled, and sequenced on a common flow cell. Unique indices (barcode sequences) allow pooled reads to be assigned to cognate sample sources (demultiplexed). <br/><br/>

This script automates creation of an Illumina® Sample Sheet, the comma-separated text document required by Illumina® sequencing systems to specify (1) sequencing parameters and (2) sample-barcode relationships. A Sample Sheet with up to 9,216 sample-barcode relationships can be generated in under 1 second from a single simplified list entered by the user. The list contains up to 96 sample prefixes, each assigned to an i7 index range and a unique i5 index. Each sample prefix is expanded into up to 96 individual samples, suffixed by the well ID (A01-H12) of a 96-well plate.
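The prefix expansion described above can be sketched as follows. This is a minimal illustration rather than the script's own code: the function names, the `-` separator between prefix and well ID, and the row-major well order (A01, A02, ... H12) are assumptions.

```python
from string import ascii_uppercase


def well_ids(rows=8, cols=12):
    """Return the well IDs of a 96-well plate in row-major order (A01 ... H12)."""
    return [
        f"{ascii_uppercase[r]}{c:02d}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]


def expand_prefix(prefix, separator="-"):
    """Expand one sample prefix into 96 sample names, one per well.

    The separator is a hypothetical choice made for this sketch.
    """
    return [f"{prefix}{separator}{well}" for well in well_ids()]


samples = expand_prefix("PlateA")
print(len(samples), samples[0], samples[-1])  # 96 PlateA-A01 PlateA-H12
```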

<span style="color:mediumblue">Features</span>

  • Automates data entry into 3 Illumina® Sample Sheet sections, based on command-line input provided by a user:
    • [Header] (InvestigatorName, ProjectName)
    • [Reads] (# of reads)
    • [Data] (Sample ID, i7 index, i5 index)
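As a rough picture of how the three sections fit together in one comma-separated file, the sketch below assembles them with Python's standard `csv` module. It is not taken from SampleSheet.py: the function name, the [Data] column names, the read values, and the barcode strings are placeholders chosen for illustration, and a real Sample Sheet may need additional fields.

```python
import csv
import io


def build_sample_sheet(investigator, project, reads, data_rows):
    """Assemble [Header], [Reads], and [Data] sections as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")

    writer.writerow(["[Header]"])
    writer.writerow(["InvestigatorName", investigator])
    writer.writerow(["ProjectName", project])

    writer.writerow(["[Reads]"])
    for value in reads:
        writer.writerow([value])

    writer.writerow(["[Data]"])
    # Column names below are assumptions for this sketch.
    writer.writerow(["Sample_ID", "i7_index", "i5_index"])
    writer.writerows(data_rows)
    return buffer.getvalue()


sheet = build_sample_sheet(
    investigator="Example Investigator",
    project="Example Project",
    reads=[151, 151],
    # Placeholder 8-bp strings, not real index sequences.
    data_rows=[("PlateA-A01", "AAAAAAAA", "CCCCCCCC")],
)
print(sheet)
```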

<span style="color:mediumblue">Requirements</span>

  • Python 3.7 or higher (installation instructions below)
  • PrettyTable, a Python library for the command-line script (suggested; installation instructions below)
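Both requirements can also be checked from within Python. The helper below is a hypothetical convenience sketch, not part of SampleSheet.py; it relies only on the standard library and on the fact that PrettyTable is imported under the module name `prettytable`.

```python
import importlib.util
import sys


def check_requirements(min_version=(3, 7)):
    """Return (python_ok, prettytable_ok) for the current interpreter."""
    python_ok = sys.version_info >= min_version
    # find_spec returns None when the module cannot be located.
    prettytable_ok = importlib.util.find_spec("prettytable") is not None
    return python_ok, prettytable_ok


python_ok, prettytable_ok = check_requirements()
print(f"Python >= 3.7: {python_ok}")
print(f"PrettyTable importable: {prettytable_ok}")
```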

<span style="color:mediumblue">Synopsis</span>

This script returns a Sample Sheet file compatible with Illumina® sequencing platforms.

Users are asked for the path to an output directory in which a Sample Sheet will be created, along with user-specific variables for Sample Sheet [Header], [Reads], and [Data] sections.

For [Data] relationships between sample names and i7+i5 indices, SampleSheet.py draws upon a set of 192 custom primers with unique 8-bp barcodes compatible with Illumina® sequencing platforms; these indices allow up to 9,216 samples to be arrayed in 96-well (or 384-well) format with unique barcodes for pooled sequencing (see 'Input notes' for details).

Note on index usage: In this script, each i7 index identifies an individual well within a 96-well plate format (each well is uniquely barcoded by a single i7 index), whereas a single i5 index defines all wells of a specific plate (up to 96 wells in a single plate are barcoded by a common i5 index). Primer sequences (and indices used by SampleSheet.py) can be found in associated files, i7_barcode_primers.xls and i5_barcode_primers.xls.
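The well/plate indexing scheme in the note above can be illustrated with a short sketch. The index labels (`i7_01`, `i5_01`, ...), the function name, and the assignment of i7 indices to wells in row-major order are assumptions made for this example; the actual sequences are in the associated primer files.

```python
from itertools import product
from string import ascii_uppercase

# Well IDs of a 96-well plate, A01 ... H12 (row-major order assumed).
WELLS = [f"{ascii_uppercase[r]}{c:02d}" for r in range(8) for c in range(1, 13)]

# Hypothetical labels standing in for the 96 i7 and 96 i5 indices.
I7_NAMES = [f"i7_{n:02d}" for n in range(1, 97)]
I5_NAMES = [f"i5_{n:02d}" for n in range(1, 97)]


def barcode_plate(prefix, i5_name, i7_names=I7_NAMES):
    """Pair each well with its own i7 index; the whole plate shares one i5 index."""
    return [
        (f"{prefix}-{well}", i7_name, i5_name)
        for well, i7_name in zip(WELLS, i7_names)
    ]


plate = barcode_plate("PlateA", I5_NAMES[0])
all_pairs = set(product(I7_NAMES, I5_NAMES))
print(len(plate), len(all_pairs))  # 96 9216
```

With 96 i7 indices per plate and 96 possible i5 indices, 96 x 96 = 9,216 distinct i7+i5 pairs are available, which matches the upper limit quoted above.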

For further usage details, please refer to the following manuscript:

Ehmsen, Knuesel, Martinez, Asahina, Aridomi, Yamamoto (2021)

Please cite usage as:

SampleSheet.py
Ehmsen, Knuesel, Martinez, Asahina, Aridomi, Yamamoto (2021)

<span style="color:mediumblue">System setup</span>

<span style="color:dodgerblue">Virtual machine</span>


<span style="color:dodgerblue">Alleles_and_altered_motifs.ova</span>

The programs are available for use either individually or packaged into a virtual machine which can be run on Mac, Linux, or Windows operating systems. The "Alleles_and_altered_motifs" virtual machine comes pre-installed with BLAST, MEME, the full hg38 genome BLAST database, test datasets, and all the external dependencies needed to run SampleSheet, CollatedMotifs, and Genotypes. Windows users are encouraged to use the virtual machine to run CollatedMotifs because the MEME suite software upon which CollatedMotifs relies is not natively supported on Windows OS.

  • Detailed instructions for virtual machine download and setup are available at <a href="https://doi.org/10.5281/zenodo.3406861">Download Alleles_and_altered_motifs virtual machine</a> on Zenodo (DOI 10.5281/zenodo.3406861).

  • Note: Running the virtual machine requires virtualization software, such as Oracle VM VirtualBox, available for download at <a href="https://www.virtualbox.org/">Download VirtualBox software</a> (https://www.virtualbox.org/).

Linux and Mac users can also follow the steps below to install SampleSheet, Genotypes, and CollatedMotifs. If you are running Windows, you can follow the steps below to install SampleSheet and Genotypes (without CollatedMotifs).

<span style="color:dodgerblue">Direct install</span>


<span style="color:dodgerblue">2.1. Python 3 setup</span>

<span style="color:dodgerblue"> First confirm that Python 3 (required) and Jupyter Notebook (optional) are available on your system, or download & install by following the steps below</span>

Mac and Linux OS generally come with Python pre-installed, but Windows OS does not. Check on your system for the availability of Python version 3.7 or higher by following guidelines below:

  • First open a console in Terminal (Mac/Linux OS) or PowerShell (Windows OS), to access the command line interface.

  • Check to see which version of Python your OS counts as default by issuing the following command (here, $ refers to your command-line prompt and is not a character to be typed):

    $ python --version

    • If the output reads Python 3.7.3 or any version >=3.7, you are good to go and can proceed to Jupyter Notebook (optional).

    • If the output reads Python 2.7.10 or anything below Python 3, this signifies that a Python version <3 is the default version, and you will need to check whether a Python version >=3.7 is available on your system.

      • To check whether a Python version >=3.7 is available on your system, issue the following command:

        $ python3 --version

      • If the output finds a Python version >=3.7 (such as Python 3.7.3), you are good to go and can proceed to Jupyter Notebook (optional).

      • If the output does not find a Python version >=3.7, use one of the following two options to download and install Python version >=3.7 on your computer:

<span style="color:dodgerblue">Python 3 (required)</span>

Option 1) Install Python 3 prior to Jupyter Notebook. This option is recommended for most users.

  • Go to the following website to download and install Python: https://www.python.org/downloads/
    • Select "Download the latest version for X", and then follow installation guidelines and prompts when you double-click the downloaded package to complete installation.

    • Once you have downloaded and installed a Python 3 version >=3.7, double-check in your command-line that Python 3 can be found on your system by issuing the following command:

      $ python3 --version

    • The output should signify the Python version you just installed. Proceed to Jupyter Notebook (optional).

<span style="color:dodgerblue">Anaconda (Optional: Python 3 with Jupyter Notebook in one)</span>

Option 2) Install Python 3 and Jupyter Notebook (together as part of Anaconda package)

  • Note: this method has only been tested for use of SampleSheet.py and Genotypes.py on Windows, and may not work on all Mac or Linux systems in conjunction with the use of Python Virtual Environments (virtualenv) to run CollatedMotifs.py.

  • Anaconda (with Jupyter Notebook) download and installation: https://jupyter.readthedocs.io/en/latest/install/notebook-classic.html

    • Download Anaconda with Python 3, and then follow installation guidelines and prompts when you double-click the downloaded package to complete installation.
