jbwa

Java Bindings (JNI) for bwa

Author: Pierre Lindenbaum PhD. @yokofakun (Institut du Thorax, Nantes, France)

BWA is written by Heng Li (Broad Institute).

Motivation

BWA (http://bio-bwa.sourceforge.net/) contains a small C example (https://github.com/lh3/bwa/blob/master/example.c) for running bwa-mem as a library (bwamem-lite). I created these JNI bindings to see whether I could bind the C bwa library to Java and get the same output as bwamem-lite.

Compilation

I've tested this code under Linux with:

  • Oracle JDK 8
  • GNU Make 3.81
  • gcc 4.8.2
  • wget

The Apache2-licensed branch of BWA will be downloaded (https://github.com/lh3/bwa/tree/Apache2).

Typing make should download the BWA sources, compile them, and run some tests.

See also

  • https://github.com/broadinstitute/gatk/issues/1517

Contribute

  • Issue Tracker: http://github.com/lindenb/jbwa/issues
  • Source Code: http://github.com/lindenb/jbwa

License

The project is licensed under the Apache2 license.

Example (Two FASTQs)

System.loadLibrary("bwajni");
//load the index
BwaIndex index=new BwaIndex(new File(args[0]));
//load the bwa engine
BwaMem mem=new BwaMem(index);
//get reads from two fastqs
KSeq kseq1=new KSeq(new File(args[1]));
KSeq kseq2=new KSeq(new File(args[2]));
//buffers for the forward (L1) and reverse (L2) reads
List<ShortRead> L1=new ArrayList<ShortRead>();
List<ShortRead> L2=new ArrayList<ShortRead>();
//while something can be done
for(;;)
        {
        //read the pair of fastq
        ShortRead read1=kseq1.next();
        ShortRead read2=kseq2.next();
        //should we analyze and dump the data ?
        if(read1==null || read2==null || L1.size()>100)
                {
                if(!L1.isEmpty())
                        for(String sam:mem.align(L1,L2)) //get the SAM records
                                {
                                System.out.print(sam);
                                }
                if(read1==null || read2==null) break;
                L1.clear();
                L2.clear();
                }
        L1.add(read1);
        L2.add(read2);
        }
kseq1.dispose();
kseq2.dispose();
index.close();
mem.dispose();
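
The buffering logic of the loop above can be looked at separately from the JNI classes. The following is only a sketch: PairedBatcher and forEachBatch are hypothetical names that are not part of jbwa, and it assumes the reads are exposed as java.util.Iterator, whereas the example above relies on KSeq.next() returning null at the end of the file. Note also that the example above flushes once the buffer holds more than 100 pairs, while this sketch flushes at exactly batchSize pairs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical helper (not part of jbwa): groups two parallel streams into paired batches. */
final class PairedBatcher {
    private PairedBatcher() {
    }

    /**
     * Reads one item from each iterator at a time and passes the buffered pairs
     * to the consumer when batchSize pairs are buffered or when either input ends.
     * The lists passed to the consumer are reused and cleared after each call,
     * so the consumer must not keep a reference to them.
     *
     * @return the total number of pairs handed to the consumer
     */
    static <T> int forEachBatch(Iterator<T> first, Iterator<T> second, int batchSize,
                                BiConsumer<List<T>, List<T>> consumer) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<T> batch1 = new ArrayList<>(batchSize);
        List<T> batch2 = new ArrayList<>(batchSize);
        int total = 0;
        // Stop as soon as either stream is exhausted, like the null check in the example.
        while (first.hasNext() && second.hasNext()) {
            batch1.add(first.next());
            batch2.add(second.next());
            total++;
            if (batch1.size() == batchSize) {
                consumer.accept(batch1, batch2);
                batch1.clear();
                batch2.clear();
            }
        }
        // Flush the last, possibly smaller, batch.
        if (!batch1.isEmpty()) {
            consumer.accept(batch1, batch2);
        }
        return total;
    }
}
```

With such a helper, the consumer would be the place to call mem.align(L1,L2) and print the returned SAM records, while the dispose() and close() calls would stay after the loop as in the example.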

Testing

Here is the output of the Java version:

java  -Djava.library.path=src/main/native -cp src/main/java com.github.lindenb.jbwa.jni.Example2 \
	human_g1k_v37.fasta  tmp1.fq  tmp2.fq

HWI-1KL149:20:C1CU7ACXX:4:1101:13638:2192       121     1       229568362       37      13S87M  =       229568362       0       GCTCTTCCGATCTGGCACGTTGAAGGTCTCAAACATGATCTGGGTCATCTTCTCGCGGTTGGCCTTGGGATTGAGGGGGGCCTCGGTGAGCAGGGNGGGG       AB?DDDDDDDBDCDDDDDDDDDDCDDDDCCC>(DCDDDDDDBDDDCCCCBDDDFFEEJIHIJIIHJIJJJJJJIJJJJJJJJJJJJJHHHHHDA2#FCCC    NM:i:1  AS:i:85 XS:i:61
HWI-1KL149:20:C1CU7ACXX:4:1101:13638:2192       181     1       229568362       0       *       =       229568362       0       GCTCTTCCGATCTCCCCACCCTGCTCACCGAGGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCAGNNNNNNNNNNNNNNNNNNAACGTGCC       ?DDDDDDDDDDDDDDB?9BDDDDDDDBBB?8,,######################################?12##################FFFFFCCC    AS:i:0  XS:i:0  
HWI-1KL149:20:C1CU7ACXX:4:1101:1424:2423        69      X       16753128        0       *       =       16753128        0       AGATNGGAAGAGCACACGTCTGAACTCCAGTCACCAAGGAGCATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAACAAATACGGATGAGACATG       CCCF#2ADHHHHHJJJJJJJJJJJJJJ>9:1*1C3C8D600)0*0*/00-.8B)--5B().).=).?CFFFBBBDB########################    AS:i:0  XS:i:0  
HWI-1KL149:20:C1CU7ACXX:4:1101:1424:2423        137     X       16753128        0       58S34M8S        =       16753128        0       AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAAAAAAAACAAAAAAAGAGATGAACAAGCAAA       CCCFFFFFHHHHHJJJJJJJJJJJJJJJJJHIHIIJJJJJJJHJJIIJJJHFFFFEEEEEEEDDDD##################################    NM:i:0  AS:i:34    XS:i:29 
HWI-1KL149:20:C1CU7ACXX:4:1101:2908:2463        97      12      110765491       60      70M30S  =       110765491       70      AATTNGGGGAACAGCTTTCCAAAGTCATCTCCCTTATTTGCATTGCAGTCTGGATCATAAATATTGGGCAAGATCGGAAGAGCACACGTCTGAACTCCAG       CCCF#4BDHGHHHJJJJJJJJJIJHIJJJJJJJJJJJJJJJJIJJJJJJJJJJJJJIIIIHIJJJJIIIJIJJGHEHFFFEDDEEAA@BDDDCDDDD:C@    NM:i:1  AS:i:68 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:2908:2463        145     12      110765491       60      30S70M  =       110765491       -70     CTCTTTCCCTACACGACGCTCTTCCGATCTAATTTGGGGAACAGCTTTCCAAAGTCATCTCCCTTATTTGCATTGCAGTCTGGATCATAAATATTGGGCA       DDDDDDDDDCAB=DDBDEEFFFFHHHJJJGHHGGFJJJJJIIIIJJJJJIJJJJJIJIIIJJJJJJJJJJJJIJJHHHFHEEJJIJJHHHHHFFFFFCBC    NM:i:0  AS:i:70 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:4663:2297        81      4       114279632       60      100M    =       114279455       -277    GATTCCTACTGCACCCATGGAGAATGTGCCTTTTACTGAAAGCAAATCCAAAATTCCTGTAAGGACTATGCCCACTTCCACCCCAGCACCTCCATNTGCA       DCDDDDCACCDBCBCDDDCDDCCA?EEDDDFFDFFFHHHGHHHJJJJJJJJIJJIJIJJIJIJJJJJJJJJJIGJJIIHFIJJJJHGDHHDHDA2#FCCB    NM:i:1  AS:i:98 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:4663:2297        161     4       114279455       60      100M    =       114279632       277     CGTGCAAACGGGTGATATACCTCCTCTCTCTGGTGTAAAGCAGATATCCTGCCCCGACTCTTCTGAACCAGCTGTACAAGTCCAGTTAGATTTTTCCACA       CCBFFFFFHHHHFHIJJJJJIIJJJJJJJJJJJHIGIJIIJJJJJJJJJJHIJJJJJJJHHHHHHFDDDFDDEEEDDDADCCDDDCCDCCDEDDDCACCC    NM:i:0  AS:i:100  XS:i:0   
HWI-1KL149:20:C1CU7ACXX:4:1101:6872:2320        81      2       179597667       60      100M    =       179597628       -139    GGCTGTGCCTTCCACAAATGCTATCCTGTATCTGTCAGAAGCAGCTATTTCTTTGCCATCCTTAAACCAGGACACCCTCATGGGGAGGGAGCCTGNAATT       ABDDDDDBDDDDDDEDDEDDDEECEEFFFFFFHGHHHHJJIJJJJJIIJJIJJJJJJJJJIIJIHGJJJJJHHEJJIHJJJJJJJJJHHHHHDA2#FCCC    NM:i:1  AS:i:98 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:6872:2320        161     2       179597628       60      100M    =       179597667       139     CCCTGCATCATTCATGTCTACTCTGATGATCTCCAAAGAGGCTGTGCCTTCCACAAATGCTATCCTGTATCTGTCAGAAGCAGCTATTTCTTTGCCATCC       CCCFFFFFHHHHHJJJJJJJJJJJJIJJJJJJJJJJJJJJIIJJIIHJJJJIJJGIIIJJJIIJIIIHGIJJJJJIIEHHHHHHFBFFDEFECDECCDDA    NM:i:0  AS:i:100  XS:i:0   
HWI-1KL149:20:C1CU7ACXX:4:1101:9215:2408        97      2       220283746       60      100M    =       220283863       217     CAGCNGCTCAAGGCCAAGTGAGGGCCCGGCACCCCAGACTCCTCTTTCTGCGGGCAGGGCACAGGAGGCTAGGCCTGGGGGCTGGGGTCCCGCTGTCAGC       CCCF#2ADHHHHHFIJIIHIGIJJJJJJJJIIJJJJIJJJJJJJIIIJJIGFFFDDDDDDDBDDD?BDBDCBBDDCDDDDDBDDDBB>BBDDDDB@CDCD    NM:i:2  AS:i:93 XS:i:23
HWI-1KL149:20:C1CU7ACXX:4:1101:9215:2408        145     2       220283863       60      100M    =       220283746       -217    GCCCGGGACCCTCTCCTGCCCCATGTGGAGAAAGGGTCCTCCACCTGTGTGTTTCAAGGGGCCGTGACCTCCAGGTCTCTCCCCCTGCGATCCCATCTTG       BDDBDBC?DDDDDDDDDDDDDDDDDDDDDDDDDDDDBDDDDCADDDDDBEEEEEFFFFHHIJJJIHGJJJIJJJJJIIIIJIJJJJJHHHGHFFFFFCCC    NM:i:0  AS:i:100  XS:i:0   
HWI-1KL149:20:C1CU7ACXX:4:1101:9815:2325        97      22      46114322        60      100M    =       46114410        188     AAAGNCCGGAATTGGTACAAGCCATGTTTCCCAAACTGAACAATCAAGAAAGGTAACCCCCCAACCAGCGTGGTCTGGAGTATTTAGCATTCCATATAGG       CCCF#2ADHHHHHJJGHIJJJJJJJJIGJJJJJJJJJJJJJJJJJJJJGHIJJHIJJIIJJHFFFFDDCD?BDDDCCDCD>ACDEEDDDEDDEDCCCCCD    NM:i:1  AS:i:98 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:9815:2325        145     22      46114410        60      100M    =       46114322        -188    ATTCCATATAGGGTATTCGATGCACGTGACTGAAAAGCTGTGTGGTTTCTGAGTTGGCACAGAATCTCTAAATACATGTTTCTGTGTTGGTAATGGTTTT       DDCDEDCCDDDDCDDEEDEFFFFFHHHHIJJJJJJJIJJJJIIJJJIIGGJJJJJIJJJJJJJJIIHJJJJJIIJJJJJJJIIJIJIHFHHHFFFFFCCC    NM:i:0  AS:i:100  XS:i:0   
HWI-1KL149:20:C1CU7ACXX:4:1101:11401:2488       97      3       38763808        60      100M    =       38763855        147     CCACNATACGGTAGCAAGTCTTGCGCACCTGCCAGCCCACATCCCATGGACTCTTCGTGGTATCCAGTTTGCAGCAGGGACAGTGGCGAATGCATCCTGT       CCCF#4ADHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIJJJEIJJIJJJHHHFFFFFFFEEEEEEEDABBDDDBBCCDBD>BDDDDEDDDD>    NM:i:2  AS:i:93 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:11401:2488       145     3       38763855        60      100M    =       38763808        -147    GGACTCTTCGTGGTATCCAGTTTGCAGCAGGGACAGTGGCGAATGCATCCTGTGGGGAGAGGTGACTGATGGTGGGTGATGGCCAGTGGGCAAAGGGGAT       DDCDDDB?DCCCDECDDCDDDCDDEEDEFFFFFFHHHJJIJJJIJIIJIJJJIJJIJJJJJJJJIJJJJJJJJJJIJJJJJJJJJJJHHHHHFFFFFCCC    NM:i:1  AS:i:95 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:11658:2375       97      7       35293037        60      100M    =       35293129        192     CAGCNAGGGGCACAGACGGATGCGCAGCATCCCCAGTCCTCGGCGGACAGCCGGGTAGCCCAACTTACCCAGGGGTTTGATTGTGTTCTCCGTCGCCTCC       CCCF#2ADHHHHHJIIJJJJIJJJJJJJJJIJJJJJIJJJJJJJJDDDDDDDDDDBBDDDDDDDDDDDDDDDDDDDBBBDDDDDDDDCEDCB?ABDBDD1    NM:i:1  AS:i:98 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:11658:2375       145     7       35293129        60      100M    =       35293037        -192    TCGCCTCCTTCTCCTTAGAGCCGCCGCTCGACATGAGCGCGGCAATGGAGAAGGCGTTGGCCCGGGAGGAGAGTTGGGGCTTGGGGGACGCCGTGAACTC       DDBBBDDCA8DDDCC@DDDBDDDDDDDDDDEDDDDDDDDDDDDEDDDDCCDDDDFFFHHJJJJJJJJJHJJJJJJJJJJJJJJJJJJHHHHHFFFFDCBB    NM:i:1  AS:i:95 XS:i:20
HWI-1KL149:20:C1CU7ACXX:4:1101:12054:2300       97      2       40401764        60      100M    =       40401971        307     CAAGNTACATAAGATGTAGGTTTGGATTGATGGTTAAGGGTATTTGGGGAAAAATAAGGAACATTAAAAAAATAAGTCTTACCAAACAGGTATTTTCCTT       CCCF#4=DHHHHHIJJHIJJHIJJJHIJJIIJJEGHJJJJDGIJJJJJJGHHIJJIIJJJIIIIJIJJHHFDEDECDDEEDDDDDDDDDDCCDEEEDDCD    NM:i:1  AS:i:98 XS:i:0
HWI-1KL149:20:C1CU7ACXX:4:1101:12054:2300       145     2       40401971        60      100M    =       40401764        -307    TTGTGAAGCCACCTAAAAAAGAAAAAAACAACAACAAATGTTATAATTTGACACTCTACATAACAAATACCAGTGACATCAGACTGCCTGACAACCCACC       @CC@DDDDDDDDDDDDDDDDDDFHHHHEIIHIIIJJJIJJJJJJJJJIHDIJJJJJIIJJJJIJJJJHFJJJJJJIJJJJJJJJJJJHHHHHFFFFDBCB    NM:i:0  AS:i:100  XS:i:0   

And the output of the native C version:

bwa mem human_g1k_v37.fasta tmp1.fq tmp2.fq 2> /dev/null | grep -v -E '^@'

HWI-1KL149:20:C1CU7ACXX:4:1101:13638:2192       121     1    
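
Comparing the two outputs by eye is tedious, so the check could be automated. The following is a minimal sketch, assuming both outputs have been captured as lists of lines; SamCompare, normalize and firstDifference are hypothetical names that are not part of jbwa. It skips header lines starting with '@', as the grep command above does, and collapses runs of whitespace, on the assumption that no SAM field in these outputs contains a space.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of jbwa): compares two SAM outputs record by record. */
final class SamCompare {
    private SamCompare() {
    }

    /** Drops '@' header lines and blank lines, and collapses whitespace runs into one tab. */
    static List<String> normalize(List<String> lines) {
        List<String> records = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("@")) {
                continue;
            }
            records.add(trimmed.replaceAll("\\s+", "\t"));
        }
        return records;
    }

    /**
     * Returns the 0-based index of the first record that differs after normalization,
     * or -1 when both outputs contain the same records in the same order.
     * If one output is a strict prefix of the other, the length of the shorter one is returned.
     */
    static int firstDifference(List<String> javaOutput, List<String> nativeOutput) {
        List<String> a = normalize(javaOutput);
        List<String> b = normalize(nativeOutput);
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++) {
            if (!a.get(i).equals(b.get(i))) {
                return i;
            }
        }
        return a.size() == b.size() ? -1 : common;
    }
}
```

A result of -1 would indicate that the Java bindings and the native bwa mem produced the same alignment records for the input files.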