AffyPipe

an open-source pipeline for Affymetrix Axiom genotyping workflow on livestock species

AffyPipe: an open-source pipeline for Affymetrix Axiom genotyping workflow

ref: E.L. Nicolazzi (Fondazione Parco Tecnologico Padano) - Via Einstein, Loc. Cascina Codazza (26900) Lodi (Italy). email: ezequielluis [dot] nicolazzi [at] gmail [dot] com

IMPORTANT WARNING FOR AXIOM apt2 USERS

Please note that a new series of library files is being released for many species. Most of these files carry the extension "apt2.xml", and AffyPipe will not run with them. I have tried to contact Affymetrix's DevNet several times now, but their support has not been helpful. At all. I will keep trying to understand why on earth they keep changing their software, inputs and outputs, and how to make this new software work. Please be patient: this issue is not due to AffyPipe but to a sudden (and hardly documented) change in Affymetrix software. The Windows GUI software does work with these library files, so I'm going to write something I never thought I would: 'If you have a Windows computer at hand, please use it. It'll take you less time and mental energy to use the Windows GUI than to work out how to make the Linux/Mac versions work'.

I am truly sorry, but my hands are tied here.

Hope to get back to you with good news, but for the moment AffyPipe is in the garage.

Ezequiel L. Nicolazzi

What is AffyPipe?

The goal of this pipeline is to automate Affymetrix's standard and "best practice" genotyping workflows for Linux and Mac users: from Power Tools (APTools) to the SNPolisher R package. This is a one-step tool that combines all Affymetrix software and produces edited, user-friendly output files. In fact, AffyPipe allows you to edit SNP probe classes directly while exporting genotypes in PLINK format (Purcell et al., 2007). It was originally built for the International Buffalo Genome Consortium (Iamartino, 2013), but it can now handle all species (e.g. human, cow, chicken, fisheries). Users are strongly advised to read Affymetrix's "Axiom genotyping solution data analysis guide" and "Best practice supplement to Axiom genotyping solution data analysis user guide" carefully before using this tool.

0) AffyPipe publication & how to cite

The AffyPipe publication can be found in: http://www.ncbi.nlm.nih.gov/pubmed/25028724

If you use this pipeline for your analysis, please cite: Nicolazzi EL, Iamartino D, Williams JL (2014). AffyPipe: an open-source pipeline for Affymetrix Axiom genotyping workflow. Bioinformatics, DOI: 10.1093/bioinformatics/btu486

Thanks in advance!

1) Getting the pipeline, and requirements

The fastest and smartest way of getting this pipeline and all accessory files is to install git and clone this repository. Further information on how to install git on Linux and Mac can be found at: http://git-scm.com/book/en/Getting-Started-Installing-Git . An example clone command for the command line is:

% git clone --recursive https://github.com/nicolazzie/AffyPipe.git

The AffyPipe pipeline is for users running Linux/Unix and Mac operating systems, and only runs on 64-bit processors. Windows users should use the Genotyping Console (TM) Software, which already covers all of these functionalities!!! You should have Python (2.x) and R (any version?) already installed on your computer (Mac users have Python installed by default). The whole pipeline was thoroughly tested under Python 2.7.6 and R 3.0.
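The requirements above can be checked before a first run. What follows is only a minimal sketch, not part of AffyPipe: it is a standalone helper meant to be run with Python 3, the function name `check_environment` is made up for illustration, and the executable names it looks for (`python2.7`, `python2`, `R`) are assumptions about how these programs are named on your machine. The 64-bit test looks at the Python build, which is only a proxy for the processor.

```python
import platform
import shutil
import sys


def check_environment(system=None, is_64bit=None, which=shutil.which,
                      python2_names=("python2.7", "python2")):
    """Return a list of problems found; an empty list means every check passed.

    The arguments exist only to make the function easy to test; by default
    the values are read from the running machine.
    """
    if system is None:
        system = platform.system()
    if is_64bit is None:
        # sys.maxsize is larger than 2**32 only on 64-bit Python builds
        is_64bit = sys.maxsize > 2 ** 32

    problems = []
    if system not in ("Linux", "Darwin"):
        problems.append("unsupported operating system: %s" % system)
    if not is_64bit:
        problems.append("a 64-bit system is required")
    # Executable names are assumptions; adjust them to your installation.
    if not any(which(name) for name in python2_names):
        problems.append("no Python 2.x interpreter found on PATH")
    if which("R") is None:
        problems.append("R was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING: " + problem)
```

An empty result does not guarantee that the pipeline will run; it only rules out the most obvious missing pieces.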

IMPORTANT: Since Cygwin uses a twisted way of building Linux-like (?) paths, AffyPipe may not work properly. We strongly suggest using a virtual machine (e.g. VirtualBox) with Ubuntu (or similar) instead of Cygwin. A tip: if you really want to use Cygwin (why would you?!?!?), please know that you should use relative paths for all the folders and files involved. Absolute paths will not work.

2) Folders and files required

The Affymetrix genotyping workflow requires several Affymetrix files to run. For simplicity, all these files are expected to be placed in one folder. The folder names and values specified below are provided as examples. However, please note that these names and values are also the defaults in AffyPipe (see "Options" paragraph in Section 3).

All Affymetrix files are downloadable from their website (http://www.affymetrix.com). Please remember that you need to register to be able to download all the files below! NOTE: If you cloned or downloaded all the folders in this repository, you'll see example names of the files you need for the Buffalo species. All files are empty: i) to avoid copyright issues with Affymetrix and; ii) to force you to download the latest version of all the files and software.

  • 2.a.) AFFYTOOLS folder: All data downloaded from Affymetrix website is placed here. Essentially 4 different files are needed (2.a.1, 2.a.2.1 + 2.a.2.2, and 2.a.3).

  • 2.a.1) Go to: Products > Microarray Solutions > DNA Analysis Solutions > Agrigenomics Solutions > Arrays > Species > Buffalo/[other species], and download the file under the Library Files section. For the buffalo it is Axiom® Buffalo Analysis Files.r[X].zip, for cow Axiom_GW_bos_snp_1_r[X].zip, where [X] stands for the version of the Analysis Files. Please uncompress this file and you'll get a lot of "Axiom_[species].[X].[blablabla]" files.

  • 2.a.2) Go to: Partners & Programs > Developers' Network > DevNet Tools. Inside there you'll find:

    • 2.a.2.1) Affymetrix PowerTools (APTools): Here you should download the right file for your operating system (something like APT[X] Linux 64 bit x86 binaries or APT [X] Mac OS-Lion 64-bit Intel Binaries). Note: This pipeline was tested on both OSs. If you're using Mavericks OS, no worries, it will run ok. Please uncompress this folder (inside AFFYTOOLS, if you like).
    • 2.a.2.2) SNPolisher: An R package for post-processing the array's results. Please uncompress this folder (inside AFFYTOOLS, if you like). WARNING: (Sept. 2014) Affymetrix has updated its SNPolisher package to v1.5.0, deprecating some functions. It is compulsory that you update your SNPolisher (otherwise the program will stop!). If you have older versions of SNPolisher installed, please either delete them or install the new version on your own BEFORE running AffyPipe!
  • 2.a.3) Annotation file: You will receive an annotation file from Affymetrix along with your genotypes. However, since data is updated constantly, downloading the latest annotation file is STRONGLY recommended. By doing this, even if you analyse samples at different times, you'll be sure of using the latest map information! You can find this in: Products > Microarray Solutions > DNA Analysis Solutions > Agrigenomics Solutions > Arrays > Species > Buffalo[/Bovine or other species]. There, under the Current NetAffx Annotation Files section, download the file Axiom_Buffalo Annotations, CSV format (for buffalo) or Axiom_GW_Bos_SNP_1.na[X].annot, CSV format (for cow). Please remember to uncompress the file and put it inside the AFFYTOOLS folder. IMPORTANT: Please check that your annotation file contains the following variables in the header (i.e. the first row after a series of lines with a leading '#'): "Probe Set ID","Affy SNP ID","Chromosome","Physical Position","Allele A","Allele B". Note that if any of these is not present (or is written differently) the program will stop before running SNPolisher!!
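The header check described in 2.a.3 can be done by hand before launching the pipeline. The sketch below is not AffyPipe's own check, just an illustration of the rule as stated: skip the lines with a leading '#', read the first remaining row as the CSV header, and report which of the six required variables are missing. The function name `missing_annotation_columns` is hypothetical.

```python
import csv

# The six variables the text above requires in the annotation header.
REQUIRED_COLUMNS = ["Probe Set ID", "Affy SNP ID", "Chromosome",
                    "Physical Position", "Allele A", "Allele B"]


def missing_annotation_columns(lines):
    """Return the required columns absent from the first non-comment row.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    for line in lines:
        # Skip the leading '#' comment lines and any blank lines.
        if line.startswith("#") or not line.strip():
            continue
        header = next(csv.reader([line]))
        present = set(name.strip() for name in header)
        return [col for col in REQUIRED_COLUMNS if col not in present]
    # No header row at all: everything is missing.
    return list(REQUIRED_COLUMNS)
```

To use it on a real file, pass an open handle, for example `missing_annotation_columns(open(path_to_annotation_csv))`, where the path is whatever you named your uncompressed annotation file. An empty list means all six variables were found with exactly the expected spelling.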

  • 2.b) a CEL list file: All Affymetrix Power Tools programs need a file containing a list of CEL files (raw data) to be analysed. Fortunately, using AffyPipe you will have to do this just once! It is highly recommended that you also provide the full path to the .CEL files. Just remember that CEL list files need a compulsory header row: "cel_files". If this header is missing, AffyPipe will stop, since APTools programs cannot run without it! To help you here, a small bash program called "createcelfile.sh" is also provided. The program creates the CEL list file for you. You only have to place this program in the directory where the .CEL files are stored, and run it. This creates a "mycellistfile.txt" file (your CEL list file!), which you can rename and put wherever you want (e.g. in the main folder?). To run this program:

    % chmod 755 createcelfile.sh && ./createcelfile.sh

    A short explanation for those not used to the command line: "chmod 755 createcelfile.sh" gives all access rights (read/write/execute) to your user, plus read/execute rights to all other users. You only need to do this once! "&&" means something like "after you have succeeded doing the thing on the left, do the thing on the right". "./createcelfile.sh" is the command that actually launches the program.
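If you prefer to see what a CEL list file looks like, or cannot run the bash program, the format described in 2.b can be sketched in a few lines of Python. This is not the code of createcelfile.sh (its contents are not shown here); it is an illustration under the assumptions stated in the text: a first line reading "cel_files", followed by one full path per .CEL file. The function names are hypothetical.

```python
import os


def write_cel_list(cel_dir, out_path):
    """Write a CEL list file: the 'cel_files' header, then one full path per line.

    Returns the number of .CEL files listed.
    """
    cel_dir = os.path.abspath(cel_dir)
    # Match the extension case-insensitively (.CEL or .cel), in sorted order.
    names = sorted(n for n in os.listdir(cel_dir) if n.lower().endswith(".cel"))
    with open(out_path, "w") as out:
        out.write("cel_files\n")  # compulsory header row
        for name in names:
            out.write(os.path.join(cel_dir, name) + "\n")
    return len(names)


def has_cel_header(list_path):
    """Return True if the first line of the list file is exactly 'cel_files'."""
    with open(list_path) as handle:
        return handle.readline().strip() == "cel_files"
```

Note that this sketch writes absolute paths, as recommended above; Cygwin users would need relative paths instead (see the warning in Section 1).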

  • 2.c) a PARAM_species.inp file: This file is already provided, and you should NOT change the name of the file! However, you need to edit it, based on the array/species you're going to analyse. Thanks to this file, AffyPipe can be used for any species genotyped with Affymetrix Axiom technology! Please note that testing has been carried out only on Buffalo + Human Exome 319 and EUR Axiom datasets (GEO platforms: GPL18760 and GPL52691). In this file you need to edit 3 parameters:

    • SPEC_prefix= : This is the prefix of your species file. Search for the file "*apt-geno-qc.AxiomQC1.xml". The prefix is whatever comes before the first dot. For example: in "Axiom_Buffalo.r2.apt-geno-qc.AxiomQC1.xml", the prefix is "Axiom_Buffalo".
    • SPEC_version= : This is the release of the library. It is usually specified as "r[number]". For example: in "Axiom_Buffalo.r2.apt-geno-qc.AxiomQC1.xml", the version is "r2".
    • SPEC_annotation= : This is Affymetrix's annotation file for the selected species. PLEASE NOTE that this file should be placed in the AFFYTOOLS (or whatever you call it) directory!! E.g. if "MasterCsvAnnotationFil