On the ViPR homepage, choose a virus family or a Featured Virus to start.
1. Mouse over the “Analyze & Visualize” tab and click “Analyze Sequence Variation (SNP)”.
2. On the SNP landing page, use one of the three options to input sequences:
2.1 Upload a sequence file in FASTA format, OR
2.2 Paste sequences in FASTA format, OR
2.3 Use a working set from your Workbench.
Then click “Run” to run the analysis.
3. As soon as the analysis is finished, a report similar to the Sequence Variation Analysis Sample Report will be displayed on the screen.
ViPR is funded by the National Institute of Allergy and Infectious Diseases (NIH / DHHS) under Contract No. HHSN272200900041C and is a collaboration between Northrop Grumman Health IT, J. Craig Venter Institute and Vecna Technologies. Comments, questions, suggestions? Contact us at [email protected]
http://www.viprbrc.org/
Freely available Integrated datasets Bioinformatics tool suite
Sequence Conservation/Variation Analysis
• Analyze sequence polymorphism at the nucleotide or amino acid level.
• Calculate consensus sequence and polymorphism of ViPR sequences or your own sequences.
Sequence Variation Analysis Sample Report
The analysis report page shows the polymorphism score, consensus, and counts for each different base/amino acid at each position.
Consensus sequence and raw alignment are available for download.
At each position, the consensus is the allele with frequency greater than 50%. If no allele exceeds 50%, N (for nucleotide) or Xaa (for amino acid) is used to indicate ambiguity.
Download consensus sequence in FASTA format.
Download raw alignment of all sequences.
Score ranges from 0 (no polymorphism) to 232 (highest polymorphism).
Count for different nucleotides at each position.
Save SNP result to Workbench for future retrieval.
Cite ViPR Tutorials Report a Bug Request Web Training Contact Us Release Date: Sep 7, 2012
This project is funded by the National Institute of Allergy and Infectious Diseases (NIH / DHHS) under Contract No. HHSN272200900041C and is a collaboration between Northrop Grumman Health IT, J. Craig Venter Institute and Vecna Technologies. Virus images courtesy of CDC Public Health Image Library, Wellcome Images, U.S. Department of Veterans Affairs, Science of the Invisible and ViralZone, Swiss Institute of Bioinformatics.
INPUT SEQUENCES
Sequences can also be selected from search results or a working set in your Workbench.
• Upload a file containing my sequences in FASTA format.
• Paste sequences in FASTA format.
• Use working sets.
The minimum number of sequences is 2.
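Before uploading or pasting sequences, it can be useful to confirm locally that the input parses as FASTA and contains at least two sequences. The Python sketch below is not part of ViPR. The function names read_fasta and check_input are hypothetical, and the parser assumes plain FASTA text (a ">" defline followed by one or more sequence lines).

```python
def read_fasta(text):
    """Parse FASTA text into a list of (defline, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new defline closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_input(text, minimum=2):
    """Return the parsed records, or raise if fewer than `minimum` are present."""
    records = read_fasta(text)
    if len(records) < minimum:
        raise ValueError(
            f"found {len(records)} sequence(s); at least {minimum} are required"
        )
    return records


if __name__ == "__main__":
    sample = ">seq1\nACGT\n>seq2\nAC\nGT\n"
    for defline, sequence in check_input(sample):
        print(defline, len(sequence))
```

The default of 2 mirrors the minimum stated on the input page; everything else about the check is an assumption made for illustration.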
For polymorphism calculation, MUSCLE is used for multiple sequence alignment. A consensus sequence is created by "majority rule". At each position, the consensus is the allele with frequency greater than 50%, regardless of coverage. If no allele exceeds 50%, N (for nucleotide) or Xaa (for amino acid) indicates ambiguity. Sequences in the alignment are then compared to the consensus to identify polymorphisms.
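The majority rule can be sketched in a few lines of Python. This is an illustration only, not ViPR's implementation. The function name majority_consensus is hypothetical, the input is assumed to be already aligned (equal-length strings, for example MUSCLE output), and gap characters are assumed to count as an allele like any other symbol.

```python
from collections import Counter


def majority_consensus(aligned, ambiguity="N"):
    """Return the majority-rule consensus of equal-length aligned sequences.

    A symbol is used at a position only if its frequency is strictly
    greater than 50%; otherwise the ambiguity symbol is emitted.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned):  # iterate over alignment columns
        allele, count = Counter(column).most_common(1)[0]
        if count / len(column) > 0.5:
            consensus.append(allele)
        else:
            consensus.append(ambiguity)
    return "".join(consensus)


if __name__ == "__main__":
    alignment = ["ACGT", "ACGA", "ACTC"]
    print(majority_consensus(alignment))  # prints "ACGN"
```

In the example, the third column (G, G, T) has G at about 67% and yields G, while the last column (T, A, C) has no allele above 50% and yields N. For amino acid alignments, the ambiguity argument could be set to the symbol described in the text.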
To score polymorphism at each position, a formula modified from the one cited in Crooks et al. is used:
S = -100 * Sum (Pi * log Pi), where Pi is the frequency of the ith allele.
The score is the normalized entropy of the observed allele distribution. For nucleotides, scores can range from 0 (no polymorphism) to 232 (4 alleles and an indel, 20% frequency each). The stated maximum is consistent with a base-2 logarithm, since 100 * log2(5) ≈ 232.
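As a worked example of the scoring formula, the Python sketch below computes S for a single alignment column. It is a sketch under stated assumptions rather than ViPR's code. The function name polymorphism_score is hypothetical, the logarithm is assumed to be base 2 (which reproduces the 0 to 232 range given in the text), and a gap character is assumed to count as one more allele.

```python
import math
from collections import Counter


def polymorphism_score(column):
    """Score one alignment column as S = -100 * sum(p_i * log2(p_i))."""
    counts = Counter(column)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty column")
    entropy = 0.0
    for n in counts.values():
        p = n / total  # frequency of this allele in the column
        entropy -= p * math.log2(p)
    return 100 * entropy


if __name__ == "__main__":
    print(round(polymorphism_score("AAAA")))   # one allele only: 0
    print(round(polymorphism_score("AACC")))   # two alleles at 50% each: 100
    print(round(polymorphism_score("ACGT-")))  # five symbols at 20% each: 232
```

A column with a single allele scores 0, and a column with four nucleotides plus an indel at 20% each scores about 232, matching the range stated above.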
Option 1: Calculate consensus sequence and sequence variation of your own sequences
Screenshots for Option 1 (callout numbers match the steps):
1: The “Analyze & Visualize” menu, which lists Identify Similar Sequences (BLAST), Align Sequences (MSA), Visualize Aligned Sequences, Identify Short Peptides in Proteins, Genome Annotator (GATU), Analyze Sequence Variation (SNP), Metadata Sequence Analysis, and Generate Phylogenetic Tree.
2: Three options to input sequences.
2.1: Upload a file containing sequences in FASTA format.
2.2: Paste sequences in FASTA format, for example a defline such as >gb:FJ850072|Organism:Dengue virus DENV-2/BR/BID-V2376/2000|Subtype:2|Host:Human followed by the nucleotide sequence.
2.3: Use a working set. The “Choose Working Set” dialog lists each saved working set with its name, type, number of sequences, and date (for example, Dengue2_genome_human-1999-2000, Genome, 32 sequences, 08/05/2011).
Option 2: Calculate consensus sequence and sequence variation of ViPR sequences
Genome Search: Search for virus genomic sequences and related information. You can search the whole virus family or a specified genus, species, etc. You can also find your strain or genome record if you have its information, such as strain name or accession.
Search criteria shown on the Genome Search page: virus selection by taxonomy (with an option to exclude partially sequenced genomes), collection year, sample location, host selection, host attributes (gender, age range, immune status), sample attributes (sample source, passage history), virus attributes (virus type, disease course), and isolation event (submission site authors, sample collection authors, cohort (study) population).
Genome Search Result
Your search returned 32 genomes (32 items selected). Available buttons: Add to Working Set, Save Search, Download.
Strain Name Species Name Sequence Length Date Host GenBank Host Country Mol Type
DENV-2/BR/BID-V2376/2000 Dengue virus 2 10677 2000 Human Homo sapiens Brazil genomic RNA
DENV-2/CO/BID-V3369/1999 Dengue virus 2 10625 1999 Human Homo sapiens Colombia genomic RNA
DENV-2/NI/BID-V2344/2000 Dengue virus 2 FJ850060 10729 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2346/2000 Dengue virus 2 FJ850061 10690 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2362/2000 Dengue virus 2 FJ744745 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2363/2000 Dengue virus 2 FJ744705 10669 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2364/2000 Dengue virus 2 FJ744744 10664 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2657/2000 Dengue virus 2 FJ850117 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2658/2000 Dengue virus 2 FJ850118 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2659/2000 Dengue virus 2 FJ850062 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2660/2000 Dengue virus 2 FJ850063 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2662/2000 Dengue virus 2 FJ850119 10678 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2663/2000 Dengue virus 2 FJ850064 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2664/2000 Dengue virus 2 FJ850065 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2665/2000 Dengue virus 2 FJ850066 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2666/2000 Dengue virus 2 FJ873808 10741 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2683/1999 Dengue virus 2 GQ199895 10678 1999 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2923/2000 Dengue virus 2 FJ898477 10678 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/NI/BID-V2924/2000 Dengue virus 2 FJ898478 10679 2000 Human Homo sapiens Nicaragua genomic RNA
DENV-2/US/BID-V1048/1999 Dengue virus 2 EU482557 10639 1999 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1425/1999 Dengue virus 2 EU677142 10678 1999 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1426/1999 Dengue virus 2 EU677143 10678 1999 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1427/1999 Dengue virus 2 EU677144 10678 1999 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1428/1999 Dengue virus 2 EU677145 10678 1999 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1461/2000 Dengue virus 2 EU687222 10678 2000 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1462/2000 Dengue virus 2 EU687223 10678 2000 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1463/2000 Dengue virus 2 EU687224 10678 2000 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V1464/2000 Dengue virus 2 EU687225 10678 2000 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V598/1999 Dengue virus 2 EU482729 10678 1999 Human Homo sapiens USA genomic RNA
DENV-2/US/BID-V599/1999 Dengue virus 2 EU482730 10629 1999 Human Homo sapiens USA genomic RNA
DENV-2/VE/BID-V2942/2000 Dengue virus 2 FJ898466 10679 2000 Human Homo sapiens Venezuela genomic RNA
DF404 Dengue virus 2 FM210217 10685 1999 Human Homo sapiens Viet Nam genomic RNA
The “Run Analysis” menu lists Identify Similar Sequences (BLAST), Analyze Sequence Variation (SNP), Align Sequences (MSA), Metadata Genome Compare, and Generate Phylogenetic Tree.
Select sequences and add them to a working set for future analysis. You’ll need to register for a Workbench account to use this feature.
• Select display fields
• Custom-sort records
Click to view details of the record.
Processing... Data is still processing. Results will be shown when ready.
TICKET NUMBER: If you do not want to wait for the results, use your ticket number (SA_925272578970) to come back to the Retrieve Results by Ticket Number page at a later time and retrieve your results.
SAVE ANALYSIS TO WORKBENCH: Enter the name you want to use and click Save to Workbench if you want to save the analysis when the results are ready.
NOTIFICATION OF COMPLETION: Enter your email and click Request Notification if you want to receive a notification when the results are ready.
On the ViPR homepage, choose a virus family or a Featured Virus to start.
1. Search for nucleotide or protein sequences in ViPR by using the “Genomes” or “Genes & Proteins” search option available from the “Search Data” tab. For this example, we will use genome sequences.
2. Select search criteria on the Genome Search page and click the “Search” button to run your query.
3. On the search results page, select the desired sequences by clicking the checkboxes, mouse over the yellow “Run Analysis” button, and click “Analyze Sequence Variation (SNP)”.
If you want to include sequences that are not in this search result, or to use the sequences for further analysis, select the desired sequences and click “Add to Working Set”. You can add other sequences to the same working set later by repeating the process. Then click the “Workbench” tab and find the working set you saved. Click next to it to view the details of the working set, mouse over the yellow “Run Analysis” button, and click “Analyze Sequence Variation (SNP)”.
4. A “Select Sequence Type” lightbox will pop up. Select the appropriate sequence type and click “Continue”.
5. On the next page, you will see a brief description of the SNP tool. Click “Run” to proceed.
6. If you have a large number of long sequences to analyze, the analysis may take a few minutes to run. While it is running, you can choose to save it (upon completion) to your Workbench by entering a name for the analysis and clicking the “Save to Workbench” button. You can then move to other parts of the ViPR site and retrieve the SNP analysis result later from your Workbench.
7. As soon as the analysis is finished, a report similar to the Sequence Variation Analysis Sample Report will be displayed on the screen.