Page 1

Analysis: Tools for directly examining sequence

What follows is a simulation of the proposed sequence interface. A PC-based prototype exists, but the interface has not yet been ported to the web. As you go through the simulation please consider what capabilities you would want to serve your research and annotation interests.

A narrative to help you go through the simulation appears in a red-bordered box, such as the one below.

To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button

Continue

Scenario 6

Page 2

You’re intrigued by the motif you found in front of Anabaena PCC 7120 all4312 and its cyanobacterial orthologs (see Scenarios 1 and 5).

You’d like to look more deeply into it by examining the sequence near the orf. You’re not sure what you’re looking for, and you’re open to anything.

Continue

Scenario 6

Page 3

Anab7120:all4312

NostPunc:618.077

TricEryt:5.6053

Syny6803:sll1330

TherElon:tlr1330

Anabaena PCC 7120: all4312

Options | Annotate | Main Menu | History

Replicon: Chromosome

Coordinates: 5166997 (stop) <- 5167767 (start-GTG) [System]

Length = 256 amino acids

Strand: Complementary

Function: Two-component response regulator [System]

Syny6803:sll1330: Expression data (click to expand) [Experiment]

Mutant: None

Syny6803:sll1330: Failed to segregate [Experiment]

Cyanobacterial orthologs: NostPunc TricEryt Syny6803 TherElon
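As a quick check on the annotation page, the stated length can be recomputed from the two coordinates. This is only a sketch: it assumes the coordinates are 1-based, inclusive, and that the "stop" coordinate includes the stop codon. None of that is stated by the interface, and the function name is made up for illustration.

```python
# Coordinates as shown on the annotation page; treated here as 1-based and
# inclusive of the stop codon (an assumption, not stated in the interface).
STOP_COORD = 5166997
START_COORD = 5167767


def orf_lengths(start, stop):
    """Return (nucleotides, amino_acids) for an orf given its end coordinates."""
    nucleotides = abs(start - stop) + 1
    if nucleotides % 3 != 0:
        raise ValueError("orf length is not a whole number of codons")
    amino_acids = nucleotides // 3 - 1  # the stop codon adds no amino acid
    return nucleotides, amino_acids


print(orf_lengths(START_COORD, STOP_COORD))
```

Under those assumptions the orf spans 771 nucleotides, or 257 codons including the stop, which agrees with the 256 amino acids shown above.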

Scenario 1 left us with the provocative finding that all five cyanobacterial orthologs of all4312 are preceded by the same motif. What is that motif and what might it mean? To answer that question, click on the coordinates of all4312 to get to the sequence interface.

A Lawrence/Collier conserved motif set

Page 4

GCTGAGTTAGGAGTAAAAATCATTATTTTTCCTCCCTCTGCCTCCTCTACACCCTCTGCCCACTTAGAGTTGAGCGTTGGTTGCTAAATCTTTCTTTTGTTAACTTTGCTTGTGTTTGTGGAGGATTAGCATTCAAAATTTCCATGTTAAATCGGTATCCAACATTGCGGATAGTCTGAATGAGGCTAGGTTGGCGGGGATCAAGTTCTACTTTTTTACGTAACGATAGAACATGAGTGTCAATGGTACGCGGATTGTCGATAGCGTCAGGCCACGCACGACGTAGCAACTCTGATCGGCTCAAAGGTACTCCACCAGCTTGCGCCAAAACGTACAACAAACTAAATTCCTGTGGAGTCAGGTCGATAAACTCCCCTTGGAATCGTACACGGCGTTGGACTAAATCGATTTGCAAAGTACCATAATCCAAATAAGCAGGAGCAGTAGGTGTGCGCTTGCGGCGGATTAATGCCTCTACCCTAGCCAAAAACTCCTGCATCCCAAATGGTTTGCTCAAGTAATCATCAGCTCCCGCCTTCAACCCGGCAACGATATCAGCCTCATTAGTCCGAGCAGATAACATGAGAATTAGCGGCTGTTGCTGACGATGCAGCCAACGGCAAAATTCAATACCGTCACCATCTGGCAAATCAGCATCCAGAATCACTAGAGTTGGCTGATGGCTCAAAAAGGCTTCCCTTGCTTGATATATGCTGGCGGCTTGATGCACACGGTATTCCAATTGTTGCAAGTGCCAACCCAGCAACGACCTCAGATGGGGATTCCCCTCAACGATTTCAATACAAACCGAACCCACGGCGGCAAGACCCCTTAGCGTCGATGAATTTCAAAGTAACAAGCCCATGCCCAAGGTTTTGTAGTCTTTGTTACTCTAGCCACAATTACCTCTACGTTTTTTTATAACCTATGTTGGCAAAATGTTGAGTTTATGCCTTGTCCAGAGCGAAGTTGATAATATTATTAAATTTTTGTTATAGTT

all4312 two-component system

5166997 <- 5167767

Anabaena Chromosome (6413771 bp): 5166951-5167950

.........|.........|.........|.........|.........|

Contig GoTo Block Find Display PgUp/PgDn Help Quit

5166951 5167001 5167051 5167101 5167151 5167201 5167251 5167301 5167351 5167401 5167451 5167501 5167551 5167601 5167651 5167701 5167751 5167801 5167851 5167901

The interface places you in the Anabaena chromosome in the region surrounding all4312, with the orf highlighted as a block. Clicking on all4312 would get us back to the annotation page. Our goal was to look at the motif preceding the orf, so click on Display.
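The display appears to lay the sequence out in lines of 50 nucleotides, each labelled with the coordinate of its first base (the ruler has five groups of ten, and the labels step by 50). A minimal sketch of that layout follows; the function name and the placeholder sequence are illustrative, not part of the prototype.

```python
def layout_lines(seq, first_coord, width=50):
    """Split seq into display lines, pairing each with its first coordinate."""
    return [(first_coord + offset, seq[offset:offset + width])
            for offset in range(0, len(seq), width)]


# Hypothetical 120-nt window starting at the first coordinate on this page.
for coord, line in layout_lines("ACGT" * 30, 5166951):
    print(coord, line)
```

A 1000-nt window laid out this way gives 20 lines, the last one starting at 5167901.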

Page 5

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

We want to display the motif predicted by Lawrence/Collier, so click on Predicted features.

Display menu: Alternate starts | Annotated features | Predicted features | Private features | Tandem repeats | Inverted repeats | Base symbols | Invert display

Page 6

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

I was hoping to see sequences I recognized, but that’s made more difficult by the orf being on the wrong strand. I could invert the entire display, but instead I’ll just work on a segment. Click Block.

Page 7

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

The highlighted orf sequence could now be downloaded, or first translated and then downloaded, but for now I’m interested only in the region preceding the gene. Click Define to highlight a new block of sequence.

Block menu: Define | Invert | Translate | Save | Tools

Page 8

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

Define the beginning of the block by clicking on base 5167751 (4th line up). Then click on the last base on the page (lower right corner).
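Defining a block by two clicks amounts to turning two chromosome coordinates into a slice of the displayed window. The sketch below shows one way that could work, assuming inclusive coordinates; `extract_block` and the placeholder window are hypothetical, not the prototype's code.

```python
def extract_block(window, window_start, block_start, block_end):
    """Return the bases from block_start to block_end (inclusive coordinates)."""
    window_end = window_start + len(window) - 1
    if not window_start <= block_start <= block_end <= window_end:
        raise ValueError("block must lie inside the displayed window")
    return window[block_start - window_start: block_end - window_start + 1]


# Placeholder 1000-nt window standing in for the displayed sequence.
window = "ACGT" * 250
block = extract_block(window, 5166951, 5167751, 5167950)
print(len(block))
```

With the two clicks described above (5167751 and the last base of a window that ends at 5167950), the block is the bottom four lines of the display.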

Page 9

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

Now that the bottom four lines are blocked, click on Block and then Invert.

Page 10

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, with the bottom four lines blocked]

Block menu: Define | Invert | Translate | Save | Tools

Page 11

AACTATAACAAAAATTTAATAATATTATCAACTTCGCTCTGGACAAGGCATAAACTCAACATTTTGCCAACATAGGTTATAAAAAAACGTAGAGGTAATTGTGGCTAGAGTAACAAAGACTACAAAACCTTGGGCATGGGCTTGTTACTTTGAAATTCATCGACGCTAAGGGGTCTTGCCGCCGTGGGTTCGGTTTGTAT

all4312 two-component system

5167767 -> 5166997

Anabaena Chromosome (6413771 bp): 5167950-5167751 (inverted)

.........|.........|.........|.........|.........|

Contig GoTo Block Find Display PgUp/PgDn Help Quit

5167950 5167900 5167850 5167800

That’s more like it. Now a person attuned to such things can recognize the elements of a binding site for the transcriptional regulator NtcA, followed by the -10 region of a promoter, properly spaced. The gene comes shortly after that, now in the direct (blue) orientation. To get back to the full sequence, click on Block and then unInvert.
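Comparing the inverted block with the last four lines of the original display suggests that Invert produces the reverse complement of the block. A minimal sketch under that assumption follows; the function name `invert` is borrowed from the menu label and is not the prototype's code.

```python
# Complement table for unambiguous bases; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def invert(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# The last nine bases of the original display become the first nine
# bases of the inverted block.
print(invert("GTTATAGTT"))
```

Applying `invert` twice returns the original sequence, which is what unInvert relies on.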

Page 12

[Sequence display as on Page 11: inverted block 5167950-5167751, all4312 (5167767 -> 5166997)]

Block menu: Invert | Translate | Save | Tools | unInvert

Page 13

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

Had we suspected an NtcA site, we could have found this same site by a direct search for its consensus sequence (though there are better ways than this): click on Find, then Sequence, and type in the NtcA/promoter consensus sequence.

Page 14

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

Find menu: Gene name | Description | Sequence

Page 15

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

Find menu: Gene name | Description | Sequence

Sequence: GTA.{8}TAC.{20,24}TA...T

The NtcA binding sequence is flexible, like most sequences of biological interest. Search tools need to be similarly flexible. This search string says: Look for “GTA” followed by 8 nucleotides of any sort, followed by “TAC” followed by 20 to 24 nucleotides, followed by “TA”, three nucleotides, then a final “T”. Press Enter to find a matching sequence.
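The search string reads as an ordinary regular expression, so the same search can be sketched with Python's `re` module. Whether the prototype searches both strands is not stated; the sketch below does so as an assumption, because all4312 lies on the complementary strand. The excerpt is copied from the inverted block shown on Page 11, and the helper names are illustrative.

```python
import re

# The search string from the Find dialog, used verbatim as a regex.
NTCA_PROMOTER = re.compile(r"GTA.{8}TAC.{20,24}TA...T")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """Search both strands; return (strand, index, matched_text) tuples.

    For the "-" strand the index refers to the reverse-complemented text.
    """
    hits = []
    for strand, text in (("+", seq), ("-", reverse_complement(seq))):
        for match in NTCA_PROMOTER.finditer(text):
            hits.append((strand, match.start(), match.group()))
    return hits


# Excerpt of the region upstream of all4312, in the inverted orientation.
upstream = "GCTAGAGTAACAAAGACTACAAAACCTTGGGCATGGGCTTGTTACTTTGAAATTCATC"
for hit in find_sites(upstream):
    print(hit)
```

In the match on this excerpt, the GTA and TAC triplets are separated by 8 nucleotides and the TAC is followed 22 nucleotides later by TACTTT, which fits the 20 to 24 window in the search string.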

Page 16

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

It is sometimes easier to see patterns in DNA sequences if we can engage our visual recognition abilities. Click Display and then Base Symbols to try it out for yourself.

Page 17

[Sequence display as on Page 4: Anabaena Chromosome (6413771 bp): 5166951-5167950, all4312 (5166997 <- 5167767) highlighted]

Display menu: Alternate starts | Annotated features | Local features | Tandem repeats | Inverted repeats | Base symbols

Page 18

□■■□□□■■□□□□□■□□□□□■■□■■□■■■■■■■■■■■■■■□■■■■■■CTACACCCTCTGCCCACTTAGAGTTGAGCGTTGGTTGCTAAATCTTTCTTTTGTTAACTTTGCTTGTGTTTGTGGAGGATTAGCATTCAAAATTTCCATGTTAAATCGGTATCCAACATTGCGGATAGTCTGAATGAGGCTAGGTTGGCGGGGATCAAGTTCTACTTTTTTACGTAACGATAGAACATGAGTGTCAATGGTACGCGGATTGTCGATAGCGTCAGGCCACGCACGACGTAGCAACTCTGATCGGCTCAAAGGTACTCCACCAGCTTGCGCCAAAACGTACAACAAACTAAATTCCTGTGGAGTCAGGTCGATAAACTCCCCTTGGAATCGTACACGGCGTTGGACTAAATCGATTTGCAAAGTACCATAATCCAAATAAGCAGGAGCAGTAGGTGTGCGCTTGCGGCGGATTAATGCCTCTACCCTAGCCAAAAACTCCTGCATCCCAAATGGTTTGCTCAAGTAATCATCAGCTCCCGCCTTCAACCCGGCAACGATATCAGCCTCATTAGTCCGAGCAGATAACATGAGAATTAGCGGCTGTTGCTGACGATGCAGCCAACGGCAAAATTCAATACCGTCACCATCTGGCAAATCAGCATCCAGAATCACTAGAGTTGGCTGATGGCTCAAAAAGGCTTCCCTTGCTTGATATATGCTGGCGGCTTGATGCACACGGTATTCCAATTGTTGCAAGTGCCAACCCAGCAACGACCTCAGATGGGGATTCCCCTCAACGATTTCAATACAAACCGAACCCAC□□■□□■■■□■■■■■■■□□■□■■□□■□□□■■■■□□□□■□□■□□□■■■□■□■■■□□□□■■■■□■□□■■■■■□■■□■■■■□□■■□■□□■■□■■■■■□■□■■■■■■■□■□□■■■□■□■■□□■□□□□■□■■□□□■■■□■□■■■■□■■■□□□□■□□□□■■□□■□□■□■■□■■□□□■■■■■□■■□■□□■■

5166951 5167001 5167051 5167101 5167151 5167201 5167251 5167301 5167351 5167401 5167451 5167501 5167551 5167601 5167651 5167701 5167751 5167801 5167851 5167901

all4312 two-component system

5166997 <- 5167767

Anabaena Chromosome (6413771 bp): 5166951-5167950

.........|.........|.........|.........|.........|

Contig GoTo Block Find Display PgUp/PgDn Help Quit

Purines are represented as open symbols and pyrimidines as filled-in symbols. A and T are purple, G and C are green. Fortunately, you don’t have to remember any of this to recognize patterns. Look at the top line. It’s immediately evident (as it probably was not before) that all4312 is followed by a string of... pyrimidines and then a string of purines. Possibly a termination region? Let’s look beyond. Press the right arrow key to move the display one line down.
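The symbol scheme just described is a simple per-base mapping, sketched below. The square glyphs match those in the display above; the color names and function names are stand-ins for whatever the prototype actually uses.

```python
# Open square for purines (A, G), filled square for pyrimidines (C, T).
SYMBOL = {"A": "□", "G": "□", "C": "■", "T": "■"}
# A and T share one color, G and C the other, as described in the narrative.
COLOR = {"A": "purple", "T": "purple", "G": "green", "C": "green"}


def base_symbols(seq):
    """Replace each base with its purine/pyrimidine symbol."""
    return "".join(SYMBOL[base] for base in seq.upper())


def base_colors(seq):
    """Return the display color for each base."""
    return [COLOR[base] for base in seq.upper()]


# First 13 bases of the window shown on this page.
print(base_symbols("GCTGAGTTAGGAG"))
```

Run on the start of the window, this reproduces the first symbols of the top line of the display. Any character other than A, C, G, or T raises a `KeyError` in this sketch; how the prototype treats ambiguous bases is not shown.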

Page 19

AACCAAGCCGATGAAGAATGGAACTAA□■■■■■■■■■■■■■□■■■■■■□■ □■■□□□■■□□□□□■□□□□□■■□■■□■■■■■■■■■■■■■■□■■■■■■CTACACCCTCTGCCCACTTAGAGTTGAGCGTTGGTTGCTAAATCTTTCTTTTGTTAACTTTGCTTGTGTTTGTGGAGGATTAGCATTCAAAATTTCCATGTTAAATCGGTATCCAACATTGCGGATAGTCTGAATGAGGCTAGGTTGGCGGGGATCAAGTTCTACTTTTTTACGTAACGATAGAACATGAGTGTCAATGGTACGCGGATTGTCGATAGCGTCAGGCCACGCACGACGTAGCAACTCTGATCGGCTCAAAGGTACTCCACCAGCTTGCGCCAAAACGTACAACAAACTAAATTCCTGTGGAGTCAGGTCGATAAACTCCCCTTGGAATCGTACACGGCGTTGGACTAAATCGATTTGCAAAGTACCATAATCCAAATAAGCAGGAGCAGTAGGTGTGCGCTTGCGGCGGATTAATGCCTCTACCCTAGCCAAAAACTCCTGCATCCCAAATGGTTTGCTCAAGTAATCATCAGCTCCCGCCTTCAACCCGGCAACGATATCAGCCTCATTAGTCCGAGCAGATAACATGAGAATTAGCGGCTGTTGCTGACGATGCAGCCAACGGCAAAATTCAATACCGTCACCATCTGGCAAATCAGCATCCAGAATCACTAGAGTTGGCTGATGGCTCAAAAAGGCTTCCCTTGCTTGATATATGCTGGCGGCTTGATGCACACGGTATTCCAATTGTTGCAAGTGCCAACCCAGCAACGACCTCAGATGGGGATTCCCCTCAACGATTTCAATACAAACCGAACCCAC□□■□□■■■□■■■■■■■■□■□■■□■■□■■■■■■■□□□■□□■□□□■■■□■□■■■□□□□■■■■□■□□■■■■■□■■□■■■■□□■■□■□□■■□■■■■■□■□■■■■■■■□■□□■■■□■□■■□□■□□□□■□■■□□□■■■□

5166901 5166951 5167001 5167051 5167101 5167151 5167201 5167251 5167301 5167351 5167401 5167451 5167501 5167551 5167601 5167651 5167701 5167751 5167801 5167851

alr4311 ABC transporter

5166172 -> 5166927

all4312 two-component system

5166997 <- 5167767

Anabaena Chromosome (6413771 bp): 5166901-5167950

.........|.........|.........|.........|.........|

Contig GoTo Block Find Display PgUp/PgDn Help Quit

From the change in color from yellow to blue, we’ve evidently run into a gene on the other strand, this one also ending in a string of pyrimidines. Let’s look further by clicking on PgUp.

Page 20

CCAAAGCAAAACAGGTATAGACACCACTGATGTTCGCCCTTTAGCGCAACCGTGGATGTATTTGATTTTATTAGGATTTACACTATTACTACTTTTAATTGATGCTTGGGCGATCGCCACAGCTATAGCCATCTAA□■■■■□□■■■■□□□□■■■■□□□■■■□□□■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ATGACAGCCCAATTAAGGCTAGAACAAGTTAATCTGTTTGCCAAGCTAAAAACCCAGCTTCAGGGCTACCCAATATTGCAGGATATCTCTTTTGAGATTAACTCTGGCGATCGCCTAGCAATTATTGGCCCCTCCGGTGCTGGTAAAACTTCTTTACTACGTCTAATTAACCGCCTCAGTGAACCTAATAGCGGCAAAATTTTTTTAGAAAATCAAGAATATCCGCAAATTCCTGTTATCCAGTTGCGCCAGATAGTGACCCTGGTATTACAAGAGCCAAAGTTTCTGGGGATGACAGTCCAACAAGCCTTAGCTTACCCTTTAATTTTGCGCGGTTTGACCAAAGAGACGATTCAGCAGCGAGTCAGTCATTGGGCGGAACAGCTGCAAATCCCTGGTGATTGGTTAGGACGCACTGAGGTACAACTTTCGGCTGGACAGAGACAGCTCGTAGCGATCGCTCGTGCTTTAGTCATTCAACCGAAAATCCTCCTGTTAGATGAGCCAACCTCTCATCTAGATATTGGTATAGCCTCCCATCTTATCCAAGTCTTAACCCAGCTAACTCAAACTCATCACACAACAATTGTGATGGTAAACAGCCAGCTAGACTTCACTCAGATGTTTTGTAATCGGCTTTTGTATTTACAGCAAGGACGTTTATTGGTTAATCAAACAGCTTCTAACATCGACTGGATTGACTTACAAAAAAGGTTGATGCACGCCGAAAACCAAGCCGATGAAGAATGGAACTAA□■■■■■■■■■■■■■□■■■■■■□■

[Sequence display: coordinate ruler, ticks 5165961 to 5166901]

alr4310 (hypothetical protein): 5165532 -> 5166086

alr4311 (ABC transporter): 5166172 -> 5166927

Anabaena Chromosome (6413771 bp): 5165951-5966950

Menu bar: Contig | GoTo | Block | Find | Display | PgUp/PgDn | Help | Quit
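The gene coordinates shown on this screen are enough to locate the gap between alr4310 and alr4311. The sketch below works through that arithmetic in Python. It assumes the coordinates are 1-based and inclusive (the slides do not say), and the `Gene` type and `intergenic_region` function are hypothetical names introduced here.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int  # leftmost coordinate (assumed 1-based, inclusive)
    end: int    # rightmost coordinate (assumed 1-based, inclusive)

def intergenic_region(left, right):
    """Return (first, last, length) of the gap between two neighboring genes."""
    first = left.end + 1
    last = right.start - 1
    if last < first:
        raise ValueError("genes overlap or abut; there is no intergenic region")
    return first, last, last - first + 1

# Coordinates as listed on the screen above.
alr4310 = Gene("alr4310", 5165532, 5166086)
alr4311 = Gene("alr4311", 5166172, 5166927)

print(intergenic_region(alr4310, alr4311))  # (5166087, 5166171, 85)
```

Under that coordinate convention, the region between the two genes spans 85 bases.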

The intergenic region between alr4310 and alr4311 shows a remarkable pattern. I’ll give you a few seconds to try to find it yourself... a series of tandem repeats. Now that we see it by eye, we can ask the computer to find them in a more systematic fashion. Click on Display and then Tandem repeats.

Display menu: Alternate starts | Annotated features | Local features | Tandem repeats | Inverted repeats | Base symbols

Selected: Tandem repeats
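The slides do not describe how the Tandem repeats option searches the sequence. As a rough illustration of what a systematic search could look like, here is a naive Python sketch that reports exact, back-to-back copies of a short unit. The function name, the parameter defaults, and the toy sequence are assumptions; a real tool would probably also tolerate mismatches between copies.

```python
def find_tandem_repeats(seq, min_unit=2, max_unit=12, min_copies=3):
    """Scan left to right for exact tandem repeats.

    Returns a list of (start, unit, copies) with a 0-based start.
    At each position the unit covering the longest span is kept;
    on ties the shortest unit wins because shorter units are tried first.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None  # (unit, copies, unit_length)
        for k in range(min_unit, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit this long
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies and (best is None or copies * k > best[1] * best[2]):
                best = (unit, copies, k)
        if best is not None:
            unit, copies, k = best
            hits.append((i, unit, copies))
            i += copies * k  # skip past the reported repeat
        else:
            i += 1
    return hits

# Toy sequence (hypothetical): four copies of ACGTT and five copies of AT.
toy = "GGC" + "ACGTT" * 4 + "CCA" + "AT" * 5 + "G"
print(find_tandem_repeats(toy))  # [(3, 'ACGTT', 4), (26, 'AT', 5)]
```

A scan like this can pick up a second, shorter repeat family that is easy to miss by eye, but because it only matches exact copies it would overlook repeats that have diverged slightly.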

The machine saw more than we did! Not only are the repeats we saw more extensive, but there is also another set of repeats nearby. What do they mean? Hard to say, but certainly our chances of figuring them out are better if we can engage our visual imagination and if we can see them in a biological context.

End


Analysis: Tools for directly examining sequence

Summary

• (article of faith) The freshest insights and most fundamental discoveries require intimate contact with the basic phenomenon.
• In genomic analysis, the basic phenomenon is often the genome.
• The sequence interface makes it possible to view DNA features within a biological context.
• The interface provides tools to aid discovery of features within noncoding DNA.

Scenario 6

Software that does most of what you saw already exists, but it would need to be rewritten before it could serve as a web interface.