Analysis: Discovery of coregulated genes
What follows is a simulation of the proposed graphical interface. As you go through the simulation please consider what capabilities you would want to serve your research and annotation interests.
A narrative to help you go through the simulation appears in a red-bordered box, such as the one below.
To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button
Continue
Scenario 4
What genes are related to the adaptation to high light?
How do cells control response to light?
Look for:
• Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)
• Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)
• Ortholog present in Synechocystis PCC 6803 (the reason will become apparent in a moment)
• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)
Continue
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Click to start building a new set
Click to see set you or someone else made
Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help
Variable
Data
Function
Operation
Set operation
Display set
Build set
Click Build Set to begin finding orfs with the desired specifications
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help
Variable
Data
Function
Operation
Set operation
You want to go through all MED4 ORFs. Click Operation to see how to do that.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider X in set... (loop)
IF... THEN... OTHERWISEConsider X in set... (loop)
You want to consider each ORF in the set of all MED4 ORFs. Click
Consider X in set…
Click to go through each element of a set
Click to perform actions only under certain conditions
Click any red button to get help
Consider X in set... (loop)
IF... THEN... OTHERWISE
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All nucleotides of
All open reading frames of
All amino acid sequences of
All intergenic regions of
Human-annotated orfs of
Private set
Public set
All open reading frames of
Choose set type
Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium
Unicellular
Filamentous
All
Choose database
Choose ORFs as the set type and
Prochlorococcus MED4 as the database.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of
Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium
Unicellular
Filamentous
All
Prochlorococcus MED4
Choose database
Choose ORFs as the set type and
Prochlorococcus MED4 as the database.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click to see statements to manipulate sets
Variable
Data
Function
Operation
Set operation
You want to consider this MED4 ORF only if an
ortholog in MIT9313 doesn’t exist. Click Operation and
choose If… then…
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
Consider X in set... (loop)
IF... THEN... OTHERWISE
KEEP item
IF... THEN... OTHERWISE
You want to consider this MED4 ORF only if an ortholog in
MIT9313 doesn’t exist. Click Operation and choose
If… then…
Click to go through each element of a set
Click to perform actions only under certain conditions
Click to add item to set
Click any red button to get help
Consider X in set... (loop)
IF... THEN... OTHERWISE
Keep item
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF
Click Variable, Data, or Function to begin specifying the condition to be met
Your condition is that the ortholog of the item in MIT9313 doesn’t exist.
Ortholog of… is a function.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF
Ortholog of
Protein product of
Sequence of
Upstream region of
Downstream region of
Ortholog of
Your condition is that the ortholog of the item in MIT9313 doesn’t exist.
Ortholog of… is a function.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF ItemSet
Specify
Variable
Item( inOrtholog of Arthrobacter platensisGloeobacter violaceusMicrocystis aeruginosa
Nostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803Thermosynechococcus
Trichodesmium
Choose database
)
You want the ortholog of the item (the specific ORF
of MED4 being considered)…
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Arthrobacter platensisGloeobacter violaceusMicrocystis aeruginosa
Nostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803Thermosynechococcus
Trichodesmium
Choose database
)
Prochlorococcus MIT9313
... And you want the ortholog of the item in
Prochlorococcus MIT9313
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 )
=
exists
doesn’t exist
doesn’t exist
Click a specific operation to continue specifying the condition to be met,
or click Variable to save the results of the function
You want the ortholog of the item in
Prochlorococcus MIT9313 not to exist.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 ) doesn’t exist
AND
OR
BUT NOT
IF... THEN... OTHERWISE
Click a logical operation to continue specifying the condition to be met
or IF... THEN... to end the condition
Let’s pause to see where we are in the task at hand
(Click to proceed)
What genes are related to the adaptation to high light?
How do cells control response to light?
Look for:
• √ Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)
• √ Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)
• Ortholog present in Synechocystis PCC 6803 (the reason will become apparent in a moment)
• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)
Continue
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 ) doesn’t exist
AND
OR
BUT NOT
IF... THEN... OTHERWISE
AND
Click a logical operation to continue specifying the condition to be met
or IF... THEN... to end the condition
There are more conditions to fulfill, so click AND.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 ) doesn’t exist AND
Click Variable, Data, or Function to continue specifying the condition to be met
Your condition now is that the ortholog of the item in Synechocystis
does exist. Ortholog of… is a function.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 ) doesn’t exist AND
Ortholog of
Protein product of
Sequence of
Upstream region of
Downstream region of
Ortholog of
Your condition now is that the ortholog of the item in Synechocystis
does exist. Ortholog of… is a function.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
ItemSet
Specify
Variable
Item( inOrtholog of Arthrobacter platensisGloeobacter violaceusMicrocystis aeruginosa
Nostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803Thermosynechococcus
Trichodesmium
Choose database
)
You want the ortholog of the item (the specific ORF
of MED4 being considered)…
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Arthrobacter platensisGloeobacter violaceusMicrocystis aeruginosa
Nostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803Thermosynechococcus
Trichodesmium
Choose database
)
Synechocystis PCC 6803
You want the ortholog of the item in Synechocystis PCC 6803
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
=
exists
doesn’t exist
Click a specific operation to continue specifying the condition to be met,
or click Variable to save the results of the function
You want the ortholog of the item in Synechocystis PCC 6803, this time to
exist... but first you need to save the ortholog for later. To do this click Variable.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog
Type variable name
)
Give the ortholog a logical name so that you can refer to it later (for now, just click on
the box and I’ll do the typing)
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
Click a specific operation to continue specifying the condition to be met,
or click Variable to save the results of the function
=
exists
doesn’t exist
exists
Now you can demand that the ortholog of the item in Synechocystis PCC 6803
exists.
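As a rough illustration of the save-then-test step, here is a minimal Python sketch. It is not BioLingua and not what the interface would generate; the ortholog table, the gene names, and the helper `ortholog_of` are all made up for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical ortholog table: (gene, target organism) -> ortholog ID.
# A missing key stands for "ortholog doesn't exist".
ORTHOLOGS = {
    ("med4_gene_a", "Synechocystis PCC 6803"): "syn_gene_x",
    ("med4_gene_b", "Prochlorococcus MIT9313"): "mit_gene_y",
}


def ortholog_of(item: str, organism: str) -> Optional[str]:
    """Return the ortholog of `item` in `organism`, or None if there is none."""
    return ORTHOLOGS.get((item, organism))


def saved_orthologs(items: List[str]) -> List[Tuple[str, str]]:
    """Pair each item with its saved 6803 ortholog, skipping items without one."""
    kept = []
    for item in items:
        # Save the function result under a name first, then test that it exists.
        ortholog_6803 = ortholog_of(item, "Synechocystis PCC 6803")
        if ortholog_6803 is not None:
            kept.append((item, ortholog_6803))
    return kept
```

Saving the result under a name before testing it is what lets later conditions (and the final ADD TO) refer to the same ortholog without looking it up again.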
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists
Click a logical operation to continue specifying the condition to be met
or IF... THEN... to end the condition
AND
OR
BUT NOT
IF... THEN... OTHERWISE
AND
Still one more condition (2x expression) so click
AND.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
Click Variable, Data, or Function to continue specifying the condition to be met
This time the condition to be met concerns data from a microarray experiment.
Click Data.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for Item6803 ortholog
Specify
Variable
in Microcystis aeruginosaNostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803
Choose organism used
6803 ortholog
The data desired concerns the 6803 ortholog and an
experiment using Synechocystis PCC6803
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Microcystis aeruginosaNostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803
Choose organism used
Synechocystis PCC 6803
The data desired concerns the 6803 ortholog and an
experiment using Synechocystis PCC6803
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray2D gel
Choose type
microarray
It’s a microarray experiment…
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray
Hihara1
Suzuki1
Yoshimura1
Choose expt
Hihara1
High light vs low light experiment
It’s a microarray experiment, Hihara et al., you think…
Mouse over that experiment to see (here, click on it)
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
<
< or =
=
> or =
>
exists
doesn’t exist
>
You want the experimental condition
(high light) to be greater than the control…
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
Value
+2>
You want the experimental condition (high light) to be greater
than the control by a factor of 2 (I’ll type it).
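The "greater by a factor of 2" test can be sketched as a ratio of two signals. This Python fragment is only an illustration under assumptions: the function names and the idea of passing raw treatment and control intensities are invented here, and how the Hihara1 data set actually stores its values is not stated in this walkthrough.

```python
def fold_change(treatment: float, control: float) -> float:
    """Ratio of the experimental (high light) signal to the control signal."""
    if control <= 0:
        raise ValueError("control signal must be positive")
    return treatment / control


def is_induced(treatment: float, control: float, threshold: float = 2.0) -> bool:
    """True when the experimental signal exceeds the control by more than `threshold`-fold."""
    return fold_change(treatment, control) > threshold
```

The sketch uses a strict ">" to match the operator chosen in the interface; the equivalent script shown later in the simulation uses ">=" instead, so a gene at exactly 2-fold would be treated differently by the two.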
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
Value
+2>
Click a logical operation to continue specifying the condition to be met
or IF... THEN... to end the condition
AND
OR
BUT NOT
IF... THEN... OTHERWISE
(Let’s pause again)
What genes are related to the adaptation to high light?
How do cells control response to light?
Look for:
• √ Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)
• √ Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)
• √ Ortholog present in Synechocystis PCC 6803 (the reason will become apparent in a moment)
• √ Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)
Continue
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Consider each item in All open reading frames of Prochlorococcus MED4 :
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
Value
+2>
Click a logical operation to continue specifying the condition to be met
or IF... THEN... to end the condition
AND
OR
BUT NOT
IF... THEN... OTHERWISE
IF... THEN... OTHERWISE
That’s it! We’ve specified all the conditions, so end the IF segment and specify what to
do if all the conditions are met.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
+2> THEN
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click to see statements to manipulate sets
Variable
Data
Function
Operation
Set operation
And if they are met, what you want to do is to save the
gene in the set you’re building, a Set operation.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
+2> THEN
ADD TO
DELETE FROM
UNION OF
INTERSECTION OF
Arithmetic
ADD TO
In other words, you want to ADD the gene to the growing
set.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
+2> THEN
Item
6803 ortholog
Specify
Add 6803 ortholog to
Variable
Type name of set
Which gene? I could save the Prochlorococcus gene (the item), but I’ll instead save the gene from PCC 6803
(more is known about them).
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
IF Item( inOrtholog of Prochlorococcus MIT9313 doesn’t exist AND
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
+2> THEN
Add 6803 ortholog to Light-specific genes
Type name of set
I need to give the set a logical name (you click, I’ll type).
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
+2> THEN
Add 6803 ortholog to
Type name of set
Light-specific genes
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click to see statements to manipulate sets
Variable
Data
Function
Operation
Set operation
If I stop here, then all Prochlorococcus genes will
be considered, and if the conditions are met, the 6803
ortholog will be saved. That’s what I want, so click
Done.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Item( inOrtholog of Synechocystis PCC 6803 )
(assigned to 6803 ortholog )
exists AND
data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1
+2> THEN
Add 6803 ortholog to
Type name of set
Light-specific genes
Save results and script
Save only results
Save results and script
That was a complicated script, so I’ll save it in case I (or someone else) needs to run it again or modify it.
Equivalent script that bypasses interface
(loop for item in (#^Genes ProcMed4)
      as all-orthologous = (all-blast-orthologous-geneIDs item stdevalue)
      as 6803ortholog = (#^Genes Syny6803) :in all-orthologous)
      as light-specific-genes = nil
      when (and (there-are-not-any #'member-geneID-of-gene-frames (#^Genes slotv Proc9313) :in all-orthologous))
                (there-are-any #'member-geneID-of-gene-frames 6803ortholog)
                (>= ratio-value (select-matching-geneIDs-from-table Hihara1 item) 2))))
      collect light-specific-genes 6803ortholog)
This is the script that the interface would have produced. It looks for all the world like a computer program. In fact, it is a
computer program, written in BioLingua.
Continue
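For readers more at home in Python than in Lisp-style syntax, the same search can be sketched as an ordinary loop. This is an illustration of the logic only, not a translation of the BioLingua calls: the gene names, the ortholog table, and the ratio table below are toy data invented for the example.

```python
from typing import Dict, List, Tuple

# Toy stand-ins (all hypothetical) for the knowledge base used in the scenario.
MED4_ORFS = ["med4_a", "med4_b", "med4_c", "med4_d"]
ORTHOLOGS: Dict[Tuple[str, str], str] = {
    ("med4_a", "MIT9313"): "mit_1",
    ("med4_a", "PCC6803"): "syn_1",
    ("med4_b", "PCC6803"): "syn_2",
    ("med4_c", "PCC6803"): "syn_3",
}
# High-light / low-light expression ratios for the 6803 genes.
HIGH_LIGHT_RATIO: Dict[str, float] = {"syn_1": 5.0, "syn_2": 3.1, "syn_3": 1.2}


def light_specific_genes(
    orfs: List[str],
    orthologs: Dict[Tuple[str, str], str],
    ratios: Dict[str, float],
    threshold: float = 2.0,
) -> List[str]:
    """Collect 6803 orthologs of ORFs that meet all of the scenario's conditions."""
    result = []
    for item in orfs:  # Consider each item in the set
        if (item, "MIT9313") in orthologs:  # ortholog in MIT9313 must not exist
            continue
        ortholog_6803 = orthologs.get((item, "PCC6803"))  # save for later use
        if ortholog_6803 is None:  # ortholog in 6803 must exist
            continue
        if ratios.get(ortholog_6803, 0.0) > threshold:  # induced more than 2-fold
            result.append(ortholog_6803)  # ADD the 6803 ortholog to the set
    return result
```

With the toy data, only `med4_b` passes every test, so the resulting set holds its 6803 ortholog, `syn_2`.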
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Here are the results of the program. The genes meeting all the conditions are given along with a brief description and a graphic display of the regions
surrounding the genes.
(Click to proceed)
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
What can you do with this set? Certainly one thing of interest is
the function of the genes. Clicking on a gene name brings you to the annotation page. Try
clicking on slr1332.
Synechocystis PCC 6803: slr1332
Replicon: Chromosome
Coordinates: 1670650 (start-ATG) 1671864 (stop) Human
Length = 404 amino acids
Strand: Direct
Gene name(s): fabF or fabJ
Function: beta-ketoacyl-acyl carrier protein synthase Human
Activity: In vivo activity: exists Experiment
Cyanobacterial orthologs:
Syny6803
Nost7120
NostPun
Options Annotate Main Menu History
More
A
A
A
HELP
You can find out more about this kind of page from Scenarios 1 – 3.
For now, click to return to the Set Display page.
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Another interesting point of attack is the regulation of this set of genes (after all, they were selected as being coregulated by light). Perhaps the upstream regions
share a common motif.
(Click to continue)
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Unfortunately, in three cases, the genes don’t have an upstream region. Evidently
these genes are part of operons. We’d like to consider the upstream regions of the operons by adding the first genes of
the operon to the set. Add to set... that’s a Set operation. (Click on that)
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Operation
ADD TO
DELETE FROM
UNION OF
INTERSECTION OF
Arithmetic
ADD TO
Click ADD TO...
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Find specific gene
Add set of genes
Specify gene
Click on icon of gene
Add to set
Click ADD TO and click on the icon (the short black
arrow) upstream from sll0990, the first gene without an upstream
region.
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Find specific gene
Add set of genes
Specify gene
Click on icon of gene
Add to set
Syny6803:srl7009 trnR: tRNA Arg (UCU)
The short arrow is now named and part of the set.
Click on the icon (the long black arrow) upstream
from slr1332, the next gene without an upstream
region.
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Find specific gene
Add set of genes
Specify gene
Click on icon of gene
Add to set
Syny6803:srl7009 trnR tRNA Arg (UCU)
Syny6803:slr1331 Processing protease
slr1331 is now part of the set. Click on the icon
upstream from sll0789, the third gene without an
upstream region.
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Syny6803:srl7009 trnR tRNA Arg (UCU)
Syny6803:slr1331 Processing protease
Syny6803:sll0788 Hypothetical protein
Find specific gene
Add set of genes
Specify gene
Click on icon of gene
Add to set
Now the genes presumably at the start of the operons have been
added, and it’s time to remove the genes without upstream regions. Click on the radio buttons of the
three genes and click Set operation.
Build set Display set
Set: Light-specific genes
Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)
Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0789 Response regulator (OmpR)
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Syny6803:srl7009 trnR tRNA Arg (UCU)
Syny6803:slr1331 Processing protease
Syny6803:sll0788 Hypothetical protein
KEEP (delete others)
DELETE FROM
DELETE FROM
Click DELETE FROM to remove the three genes
from the set.
Build set Display set
Set: Light-specific genes
Syny6803:srl7009 trnR tRNA Arg (UCU)
Syny6803:sll0337 Sensor histidine kinase
Syny6803:sll0335 Hypothetical
Syny6803:sll0576 Putative epimerase/hydratase
DoneHELPSet operation
Syny6803:slr1331 Processing protease
Syny6803:sll0788 Hypothetical protein
Now you have what you want: a set of genes coregulated by light. The game now is to
extract their upstream regions and determine if that set
contains a common sequence motif. So you need a new set.
Click Build set.
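The hand edits just made (ADD TO, then DELETE FROM) amount to ordinary set arithmetic. Here is a small Python sketch using the gene names from this walkthrough; the function name is made up for the example, and the interface itself does not expose anything like it.

```python
# The set as first produced by the script.
light_specific = {"sll0990", "slr1332", "sll0337", "sll0335", "sll0789", "sll0576"}

# Genes presumed to start the operons, added by clicking their icons.
operon_first_genes = {"srl7009", "slr1331", "sll0788"}

# Genes without an upstream region of their own, removed afterwards.
no_upstream_region = {"sll0990", "slr1332", "sll0789"}


def swap_in_operon_starts(gene_set: set, to_add: set, to_delete: set) -> set:
    """ADD TO followed by DELETE FROM, expressed as union then difference."""
    return (gene_set | to_add) - to_delete


final_set = swap_in_operon_starts(light_specific, operon_first_genes, no_upstream_region)
```

The result holds the same six genes shown in the set display: srl7009, sll0337, sll0335, sll0576, slr1331 and sll0788.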
Click to see set you or someone else made
Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help
Variable
Data
Function
Operation
Set operation
Display set
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
The set will be based on the upstream regions of the set of light-specific genes.
Upstream region of... that’s a Function.
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Ortholog of
Protein product of
Sequence of
Upstream region of
Downstream region of
Common sequences(Meme) of
Upstream region of
The set will be based on the upstream regions of the set of light-specific genes.
Upstream region of... that’s a Function.
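To make "Upstream region of" concrete, here is a minimal Python sketch of what such a function might do. Everything in it is an assumption: the simulation does not say how the site defines an upstream region (it may well stop at the neighboring gene, which would explain why genes inside operons have none), so this version simply takes a fixed number of bases ahead of the start, using 0-based, end-exclusive coordinates and "D"/"R" for Direct and Reverse strands.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome: str, start: int, end: int, strand: str, length: int) -> str:
    """Return up to `length` bases immediately upstream of a gene.

    `start` and `end` are 0-based, end-exclusive positions on the direct strand.
    For a gene on the reverse strand, "upstream" lies past `end` and is
    reported as read from the reverse strand.
    """
    if strand == "D":
        return genome[max(0, start - length):start]
    if strand == "R":
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be 'D' or 'R'")
```

For example, with the toy genome "AACCGGTTAC" and a gene at positions 4 to 6, the 3-base upstream region is "ACC" on the direct strand and "TAA" on the reverse strand.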
Upstream region of
6803 ortholog
Set
Specify
Choose variable
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
Set( in Arthrobacter platensis
Gloeobacter violaceusMicrocystis aeruginosa
Nostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803Thermosynechococcus
Trichodesmium
Choose database
)
You want not the upstream region of a variable (6803 ortholog is the only one you’ve made so far) but
rather a set.
All open reading frames of
Human-annotated orfs of
Private set
Public set
Upstream region of
Choose set type
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
( Arthrobacter platensisGloeobacter violaceusMicrocystis aeruginosa
Nostoc punctiformeNostoc PCC 7120
Prochlorococcus MED4Prochlorococcus MIT9313
Prochlorococcus S120Synechococcus PCC6301Synechococcus PCC7942Synechococcus WH8102Synechocystis PCC 6803Thermosynechococcus
Trichodesmium
Choose database
)
Private set
You have a couple of premade sets available, but you want your own. Click
Private set.
Upstream region of
Light-specific genes
Choose set
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
assigned to
Type set name
Light-specific genes( )
You’ve made only one set. Choose it...
Upstream region of
Light-specific genes
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
assigned to Upstream light-sp genes
Type set name
You’ve made only one set. Choose it... and give it a
name (I’ll type it).
Upstream region of
Light-specific genes
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
assigned to Upstream light-sp genes
Type variable name
Click to see set you or someone else made
Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help
Variable
Data
Function
Operation
Set operation
Display set
You want to run this new set through a filter that will give you conserved motifs.
That’s a Function.
Upstream region of
Light-specific genes assigned to Upstream light-sp genes
Display set
Variable Data Operation Function Done
Build set CancelHELPSet operation
Ortholog of
Protein product of
Sequence of
Upstream region of
Downstream region of
Common sequences(Meme) of
Common sequences(Meme) of
The function you want, Meme, analyzes a set of
sequences for statistically overrepresented motifs.
Common sequences of
Light-specific genes
Upstream light-sp genes
Public set
Premade set
Choose set
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
assigned to
Type variable name
Upstream light-sp genes
Upstream region of
Light-specific genes assigned to Upstream light-sp genes
You now have two private sets. You want, of course, the set of upstream regions of
light-specific genes.
Common sequences of
Upstream light-sp genes
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
assigned to Memed light-sp genes
Type variable name
Upstream region of
Light-specific genes assigned to Upstream light-sp genes
You click and I’ll type in a logical name.
Common sequences of
Upstream light-sp genes
Display set
Variable Data
Build set
Operation Function Done
CancelHELPSet operation HELP
assigned to Memed light-sp genes
Type variable name
Upstream region of
Light-specific genes assigned to Upstream light-sp genes
Click to see set you or someone else made
Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help
Variable
Data
Function
Operation
Set operation
Display set
Click Done to put your plans into action and to
display the last defined set.
Build set Display set
Set: Memed upstream light-sp genes
Sequence                      Len  Orient.  Pos  E-val     Left flank  Motif     Right flank
Upstream of Syny6803:srl7009  638  D        509  6.38e-05  AAATATGGGA  GAATGGAA  TTGAGTAGCA
Upstream of Syny6803:sll0337  138  D        34   3.29e-05  AGCTTAAAAA  GACTAGAA  TTCAATGGGT
Upstream of Syny6803:sll0335  279  D        183  5.30e-05  AAGGGTTAGC  GACTGGAG  TTGGCAAAAC
Upstream of Syny6803:sll0576  79   D        28   1.31e-04  TTTTTTGCTT  TACTGGGA  ACGGATATTT
Upstream of Syny6803:slr1331  159  D        152  4.38e-05  GGCCCATGGG  GACTAGGA
Upstream of Syny6803:sll0788  221  D        184  1.09e-05  TGCTTTGCCA  GACTGGAA  TTAGAGAAGG
DoneHELPSet operation
And there you have it! A conserved octameric sequence found in front of
all six light-regulated genes (and, gratifyingly, all in the same
orientation: Direct rather than Reverse). Is the sequence involved in gene regulation? Only experiments
will tell...
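One quick way to summarize the six motif occurrences is a column-by-column majority vote. The Python sketch below does that with the octamers listed in the results; it is a simple illustration, not what Meme itself computes (Meme fits a statistical model rather than counting votes), and the function name is made up for the example.

```python
from collections import Counter
from typing import List

# The six motif occurrences reported in the results table.
MOTIF_SITES = [
    "GAATGGAA",  # upstream of srl7009
    "GACTAGAA",  # upstream of sll0337
    "GACTGGAG",  # upstream of sll0335
    "TACTGGGA",  # upstream of sll0576
    "GACTAGGA",  # upstream of slr1331
    "GACTGGAA",  # upstream of sll0788
]


def majority_consensus(sites: List[str]) -> str:
    """Return the most common base at each position of equal-length sites."""
    if len({len(site) for site in sites}) != 1:
        raise ValueError("all sites must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sites))


consensus = majority_consensus(MOTIF_SITES)  # "GACTGGAA"
```

Three positions (the second, fourth and sixth: A, T and G) are identical in all six occurrences, which is part of why the motif stands out.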
What have we done?
• Drawn on complex knowledge base
• Written a computer program
• Modified set by hand
• Combined multiple tools
Did it ourselves
Continue
Analysis: Discovery of coregulated genes
Summary
• The graphical interface facilitates searches using functions, loops and Boolean operations, with few of the complexities of most computer languages
• The graphical interface facilitates the searching through experimental data for orfs with desired properties
• The script interface permits interaction between the search and display capabilities of the web site and outside resources.
End
Scenario 4
Reminder: This was a simulation. The underlying language (BioLingua) exists, but not the interface that facilitates access.