AMOS file format (.afg)


Page 1: AMOS file format (.afg)

{LIB
iid:453
eid:17000001585820
{DST
mea:3000.000
std:166.667
}
}

This is an insert “library” with a mean insert length of 3000 bp and a standard deviation of 166.667 bp. The library ID is 453.
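To make the nesting of the LIB and DST messages concrete, here is a minimal Python sketch of a reader for this kind of record. It assumes the usual on-disk layout with one field per line, a `{XXX` line opening a message and a lone `}` closing it, and multi-line fields (such as `seq:`) terminated by a line holding only `.`. The function names `parse_amos` and `_parse_message` are hypothetical and are not part of the AMOS package.

```python
def _parse_message(type_code, lines):
    """Read one message body from a shared line iterator (hypothetical helper)."""
    msg = {"type": type_code, "fields": {}, "children": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line == "}":
            return msg  # end of this message
        if line.startswith("{"):
            # Nested sub-message, e.g. the DST block inside LIB
            msg["children"].append(_parse_message(line[1:].strip(), lines))
            continue
        tag, _, value = line.partition(":")
        if value == "":
            # Assumed multi-line field: collect lines until a lone "."
            chunks = []
            for cont in lines:
                if cont.strip() == ".":
                    break
                chunks.append(cont.strip())
            value = "".join(chunks)  # fine for seq/qlt; drops line breaks in comments
        msg["fields"][tag] = value
    raise ValueError("message %s is missing its closing brace" % type_code)


def parse_amos(text):
    """Return a list of top-level messages as nested dicts."""
    lines = iter(text.splitlines())
    messages = []
    for line in lines:
        line = line.strip()
        if line.startswith("{"):
            messages.append(_parse_message(line[1:].strip(), lines))
    return messages


EXAMPLE = """{LIB
iid:453
eid:17000001585820
{DST
mea:3000.000
std:166.667
}
}
"""

library = parse_amos(EXAMPLE)[0]
distribution = library["children"][0]
print(library["type"], library["fields"]["iid"], float(distribution["fields"]["mea"]))
```

All values are kept as strings; converting `mea` and `std` to floats is left to the caller.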

Page 2: AMOS file format (.afg)

{FRG
iid:456
eid:90
lib:453
rds:88,89
typ:I
}

Its internal ID is 456

This “fragment” is a clone insert, from which both ends have been sequenced

It came from Library 453 (which has an insert length of 3000bp)

Its ends are identified by the two reads with internal IDs 88 and 89

Its ends face “Inward”, 5’ on the outside and 3’ on the inside
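As an illustration of how the library statistics and the “Inward” orientation might be used together, the Python sketch below checks whether two mated reads placed on a contig are consistent with their library. The function name `mates_consistent`, the coordinate convention, and the cutoff of three standard deviations are assumptions made for this example; they do not come from the slides or from AMOS itself.

```python
def mates_consistent(fwd_5prime, rev_5prime, mean, std, n_std=3.0):
    """Return True if an inward-facing mate pair fits the library.

    fwd_5prime: contig coordinate of the 5' end of the forward-strand read
    rev_5prime: contig coordinate of the 5' end of the reverse-strand read
    mean, std:  insert length statistics from the library (DST block)
    n_std:      tolerated deviation in standard deviations (assumed cutoff)
    """
    # For inward-facing mates the 5' ends are on the outside, so the
    # forward read must start to the left of the reverse read.
    separation = rev_5prime - fwd_5prime
    if separation <= 0:
        return False
    return abs(separation - mean) <= n_std * std


# Library 453: mean 3000 bp, standard deviation 166.667 bp
print(mates_consistent(1000, 4000, 3000.0, 166.667))
print(mates_consistent(1000, 5000, 3000.0, 166.667))
```

With these numbers, any separation between roughly 2500 and 3500 bp would pass the check.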

Page 3: AMOS file format (.afg)

{RED
iid:88
eid:17000001585880f
seq:
GCCACGTAGGCGTTTTGGATGGAAATTAGCCGCCTCGGGCGTCGCATTGCTCAAGGGACTAATTTCAGCGGCCCTGTGATGTGGCCTGTCGGTGGGGGTGTGGTGAGGAGTTCGCGAACCTGATCGTCGAGTAGATCTGTCCAACCGTCATCAAACGCGGATATCAATGGGTTGCGCACACCACATCGTAGGCTTCGTGCGATCTCACGGCCAGGCTGGCTGTTGGCCCGACCGGTATCGTGACAATTATTGATTTGGGGGGTCGAGCGGGTCTCGTGGCCCGTAAGTTACGGTACGGCGGCCGTCAGCATGCTGGCGCCGGTGGCTATGCCGTCATCGACGGGGGTCACGGTCCTGCCGTGTGGGTCGGCCGACGGTGCGCTTGCCCCTATACATCCGTTTGCATCGCATGAGTGCCACTGTCTCCTTGTCAATCACTCGTGCGAGTCAGCATCGGACGGGGCATTGTTGGGGTATTGAGGCCTTGGGTGGTGGTGTTGTG
.
qlt:
KKKKK7IK:KKKKKKA9KKKKKKKK5KKKKKKKT;KKKKQKLKKKKKKFKKKK<E<K:KKKKNKK9KK9=FK<KKK@KKLKOKKKKK:KKKKJK5?KKKKMLKKK8IKKTKKKKF@KKTK=KK5@UKBKKUADDKKEKH<EKDUKKK;KPKKKBKK9TKKPKK@?KKGKKKKKKKKTKKKKKUK9KKK>LK5KKKKK9KK8KFO;KKKQKKKKKKKKTKKK5FKKKKKKKKKKUKKKKKKKK8RKKKQTKKKFKKPSKKKKKK:KKKKKKKK<KKOKKK=KPKKKKKKKKKIHBKKKK<NKBKKKKK;KKKKK6DKKKKK=KKKKSKKKKKUKKEKKKKKKHKPIKRKKGKOKKMKKKKKKKK5K>KKOKS6:KKCKKSK<KKKN@TKKKKK?QKKK>PK>KLGKKKKKKKKMUKKDKKKKKKKKKK9KKKKKKK;KK7KTNKQKKKKKKKKJBKNKUKK7K99OKKKKK7KKKKKKDKKKKKPK7HAKKKKKKKKUTKUKK
.
frg:456
clr:0,502
}

This is read ID 88

It comes from fragment (pair) 456

The high-quality (“clear”) part of the read is from 0-502
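The clear range and the quality string can be handled with a few lines of Python. This is a sketch under two assumptions that the slides do not state: that `clr:0,502` is a 0-based, end-exclusive interval (so it maps directly onto a Python slice), and that each quality character encodes a score as its ASCII code minus that of `'0'`. The function names are hypothetical.

```python
def clear_range(seq, qlt, clr):
    """Trim a read and its qualities to the clear range given as 'start,end'."""
    start, end = (int(x) for x in clr.split(","))
    # Assumed convention: 0-based start, exclusive end
    return seq[start:end], qlt[start:end]


def decode_quality(qlt, offset=ord("0")):
    """Convert quality characters to integer scores (assumed offset of '0')."""
    return [ord(c) - offset for c in qlt]


# Toy read, much shorter than read 88 on the slide
seq = "ACGTACGT"
qlt = "KKK5KKKK"
trimmed_seq, trimmed_qlt = clear_range(seq, qlt, "2,6")
print(trimmed_seq, decode_quality(trimmed_qlt))
```

Under the assumed encoding, the `K` that dominates the quality string of read 88 would correspond to a score of 27.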

Page 4: AMOS file format (.afg)

Lab2: identifying the species

just BLAST it

http://www.ncbi.nlm.nih.gov/BLAST/

Suggestion: If you want a fast answer, set BLAST to use a word size of 15, and set “expect” to a small value such as 0.00001. Or use Megablast.

Page 5: AMOS file format (.afg)

Running AMOScmp

Input files: lab02.afg, lab02.1con

$ AMOScmp lab02
The log file is: lab02.runAmos.log
Doing step 10: Building AMOS bank
Doing step 20: Collecting clear range sequences
Doing step 30: Running nucmer
Doing step 40: Running layout
Doing step 50: Running consensus
Doing step 60: Outputting contigs
Doing step 70: Outputting fasta

Files created:
lab02.bnk (a directory)
lab02.conflict
lab02.delta (created by nucmer)
lab02.layout
lab02.seq
lab02.cluster (created by nucmer)
lab02.contig
lab02.fasta
lab02.runAmos.log
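Once the run has finished, a quick way to inspect the result is to list the contigs in the FASTA output and their lengths. The Python sketch below assumes an ordinary FASTA layout (a `>` header line followed by sequence lines); the function names are hypothetical, and the demo uses an in-memory toy file rather than real AMOScmp output.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name = header[0] if header else ""  # first word of the header
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def contig_lengths(handle):
    """Map each contig name to the length of its sequence."""
    return {name: len(seq) for name, seq in read_fasta(handle)}


# Toy input standing in for a real file
demo = io.StringIO(">1 toy contig\nACGT\nACGT\n>2\nGG\n")
print(contig_lengths(demo))

# For the lab output, something like:
#   with open("lab02.fasta") as fh:
#       print(contig_lengths(fh))
```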

Page 6: AMOS file format (.afg)

Human duplications

Arabidopsis thaliana duplications