Analysis: Discovery of coregulated genes


Page 1: Analysis:  Discovery of coregulated genes

Analysis: Discovery of coregulated genes

What follows is a simulation of the proposed graphical interface. As you go through the simulation please consider what capabilities you would want to serve your research and annotation interests.

A narrative to help you go through the simulation appears in a red-bordered box, such as the one below.

To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button

Continue

Scenario 4

Page 2: Analysis:  Discovery of coregulated genes

How do cells control response to light?

What genes are related to the adaptation to high light?

Look for:

• Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)

• Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)

• Ortholog present in Synechocystis PCC 6803 (reason will become apparent in a moment)

• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)

Continue

Page 3: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Click to start building a new set
Click to see set you or someone else made
Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help

Variable

Data

Function

Operation

Set operation

Display set

Build set

Click Build Set to begin finding orfs with the desired specifications

Page 4: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Click to see statements to manipulate sets
Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click any red button to get help

Variable

Data

Function

Operation

Set operation

You want to go through all MED4 ORFs. Click Operation to see how to do that.

Page 5: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider X in set... (loop)

IF... THEN... OTHERWISE

Consider X in set... (loop)

You want to consider each ORF in the set of all MED4 ORFs. Click Consider X in set…

Click to go through each element of a set
Click to perform actions only under certain conditions
Click any red button to get help

Consider X in set... (loop)

IF... THEN... OTHERWISE

Page 6: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All nucleotides of

All open reading frames of

All amino acid sequences of

All intergenic regions of

Human-annotated orfs of

Private set

Public set

All open reading frames of

Choose set type

Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium
Unicellular
Filamentous

All

Choose database

Choose ORFs as the set type and Prochlorococcus MED4 as the database.

Page 7: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of

Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium
Unicellular
Filamentous

All

Prochlorococcus MED4

Choose database

Choose ORFs as the set type and Prochlorococcus MED4 as the database.

Page 8: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click to see statements to manipulate sets

Variable

Data

Function

Operation

Set operation

You want to consider this MED4 ORF only if an ortholog in MIT9313 doesn’t exist. Click Operation and choose If… then…

Page 9: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

Consider X in set... (loop)

IF... THEN... OTHERWISE

KEEP item

IF... THEN... OTHERWISE

You want to consider this MED4 ORF only if an ortholog in MIT9313 doesn’t exist. Click Operation and choose If… then…

Click to go through each element of a set
Click to perform actions only under certain conditions
Click to add item to set
Click any red button to get help

Consider X in set... (loop)

IF... THEN... OTHERWISE

Keep item

Page 10: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF

Click Variable, Data, or Function to begin specifying the condition to be met

Your condition is that the ortholog of the item in MIT9313 doesn’t exist.

Ortholog of… is a function.

Page 11: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF

Ortholog of

Protein product of

Sequence of

Upstream region of

Downstream region of

Ortholog of

Your condition is that the ortholog of the item in MIT9313 doesn’t exist.

Ortholog of… is a function.

Page 12: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF

Item
Set
Specify
Variable

Ortholog of ( Item in

Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium

Choose database

)

You want the ortholog of the item (the specific ORF of MED4 being considered)…

Page 13: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in

Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium

Choose database

)

Prochlorococcus MIT9313

... And you want the ortholog of the item in Prochlorococcus MIT9313

Page 14: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 )

=
exists
doesn’t exist

doesn’t exist

Click a specific operation to continue specifying the condition to be met, or click Variable to save the results of the function

You want the ortholog of the item in Prochlorococcus MIT9313 not to exist.
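
The "doesn’t exist" test can be pictured outside the interface as a lookup that returns nothing. The following is only a sketch in Python, not the tool's own code: the table `ORTHOLOGS`, the helper names, and the gene IDs are all made up for illustration.

```python
from typing import Optional

# Hypothetical ortholog table: (gene, target organism) -> ortholog gene ID.
# The gene IDs are invented for this illustration.
ORTHOLOGS = {
    ("med4_gene_a", "Synechocystis PCC 6803"): "syn_gene_a",
    ("med4_gene_b", "Prochlorococcus MIT9313"): "mit_gene_b",
}


def ortholog_of(item: str, organism: str) -> Optional[str]:
    """Return the ortholog of `item` in `organism`, or None if there is none."""
    return ORTHOLOGS.get((item, organism))


def ortholog_missing(item: str, organism: str) -> bool:
    """The "doesn't exist" test used in the IF clause."""
    return ortholog_of(item, organism) is None


print(ortholog_missing("med4_gene_a", "Prochlorococcus MIT9313"))
```

Treating "no ortholog" as `None` keeps the function usable both for the existence test and for retrieving the ortholog itself later.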

Page 15: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist

AND

OR

BUT NOT

IF... THEN... OTHERWISE

Click a logical operation to continue specifying the condition to be met

or IF... THEN... to end the condition

Let’s pause to see where we are in the task at hand

(Click to proceed)

Page 16: Analysis:  Discovery of coregulated genes

How do cells control response to light?

What genes are related to the adaptation to high light?

Look for:

• Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)

• Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)

• Ortholog present in Synechocystis PCC 6803 (reason will become apparent in a moment)

• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)

Continue

Page 17: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist

AND

OR

BUT NOT

IF... THEN... OTHERWISE

AND

Click a logical operation to continue specifying the condition to be met

or IF... THEN... to end the condition

There are more conditions to fulfill, so click AND.

Page 18: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Click Variable, Data, or Function to continue specifying the condition to be met

Your condition now is that the ortholog of the item in Synechocystis does exist. Ortholog of… is a function.

Page 19: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of

Protein product of

Sequence of

Upstream region of

Downstream region of

Ortholog of

Your condition now is that the ortholog of the item in Synechocystis

does exist. Ortholog of… is a function.

Page 20: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Item
Set
Specify
Variable

Ortholog of ( Item in

Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium

Choose database

)

You want the ortholog of the item (the specific ORF of MED4 being considered)…

Page 21: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in

Arthrobacter platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium

Choose database

)

Synechocystis PCC 6803

You want the ortholog of the item in Synechocystis PCC 6803

Page 22: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

=
exists
doesn’t exist

Click a specific operation to continue specifying the condition to be met, or click Variable to save the results of the function

You want the ortholog of the item in Synechocystis PCC 6803, this time, to exist... but first you need to save the ortholog for later. To do this, click Variable.

Page 23: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

Type variable name

Give the ortholog a logical name so that you can refer to it later (for now, just click on the box and I’ll do the typing)

Page 24: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

Click a specific operation to continue specifying the condition to be met, or click Variable to save the results of the function

=
exists
doesn’t exist

exists

Now you can demand that the ortholog of the item in Synechocystis PCC 6803 exists.

Page 25: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists

Click a logical operation to continue specifying the condition to be met

or IF... THEN... to end the condition

AND

OR

BUT NOT

IF... THEN... OTHERWISE

AND

Still one more condition (2x expression), so click AND.

Page 26: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

Click Variable, Data, or Function to continue specifying the condition to be met

This time the condition to be met concerns data from a microarray experiment.

Click Data.

Page 27: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for

Item
6803 ortholog
Specify
Variable

in

Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803

Choose organism used

6803 ortholog

The data desired concerns the 6803 ortholog and an

experiment using Synechocystis PCC6803

Page 28: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in

Microcystis aeruginosa
Nostoc punctiforme
Nostoc PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803

Choose organism used

Synechocystis PCC 6803

The data desired concerns the 6803 ortholog and an

experiment using Synechocystis PCC6803

Page 29: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803

microarray
2D gel

Choose type

microarray

It’s a microarray experiment…

Page 30: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray

Hihara1
Suzuki1
Yoshimura1

Choose expt

Hihara1

High light vs low light experiment

It’s a microarray experiment, Hihara et al, you think…

Mouse over that experiment to see (here click on it)

Page 31: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1

<
< or =
=
> or =
>
exists
doesn’t exist

>

You want the experimental condition (high light) to be greater than the control…

Page 32: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 >

Value: +2

You want the experimental condition (high light) to be greater than the control by a factor of 2 (I’ll type it).
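
As a rough illustration of this condition, the test amounts to comparing the ratio of experimental signal to control signal against the typed value. The sketch below is in Python and assumes the microarray data are available as two positive signal values per gene; the function name `induced` and its parameters are hypothetical, not part of the interface.

```python
def induced(experimental: float, control: float, factor: float = 2.0) -> bool:
    """True when the experimental signal exceeds the control by more than `factor`.

    Assumes both values are raw (not log-transformed) signal intensities.
    """
    if control <= 0:
        raise ValueError("control signal must be positive")
    return experimental / control > factor


print(induced(5.0, 2.0))  # ratio 2.5
print(induced(4.0, 2.0))  # ratio exactly 2.0, not strictly greater
```

The sketch uses a strict comparison, matching the ">" chosen in the interface; a ratio of exactly 2 does not pass.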

Page 33: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 >

Value: +2

Click a logical operation to continue specifying the condition to be met

or IF... THEN... to end the condition

AND

OR

BUT NOT

IF... THEN... OTHERWISE

(Let’s pause again)

Page 34: Analysis:  Discovery of coregulated genes

How do cells control response to light?

What genes are related to the adaptation to high light?

Look for:

• Gene present in Prochlorococcus MED4 (MED4 is naturally adapted to grow in high light)

• Ortholog absent in Prochlorococcus MIT9313 (MIT9313 is naturally adapted to grow in low light)

• Ortholog present in Synechocystis PCC 6803 (reason will become apparent in a moment)

• Synechocystis PCC 6803 ortholog responds to high light (gene turns on by a factor > 2 in response to high light)

Continue

Page 35: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Consider each item in All open reading frames of Prochlorococcus MED4 :

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 >

Value: +2

Click a logical operation to continue specifying the condition to be met

or IF... THEN... to end the condition

AND

OR

BUT NOT

IF... THEN... OTHERWISE

IF... THEN... OTHERWISE

That’s it! We’ve specified all the conditions, so end the IF segment and specify what to do if all the conditions are met.

Page 36: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2 THEN

Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click to see statements to manipulate sets

Variable

Data

Function

Operation

Set operation

And if they are met, what you want to do is to save the

gene in the set you’re building, a Set operation.

Page 37: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2 THEN

ADD TO

DELETE FROM

UNION OF

INTERSECTION OF

Arithmetic

ADD TO

In other words, you want to ADD the gene to the growing

set.

Page 38: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2 THEN

Add 6803 ortholog to (Type name of set)

Item
6803 ortholog
Specify
Variable

Which gene? I could save the Prochlorococcus gene (the item), but I’ll instead save the gene from PCC 6803 (more is known about those genes).

Page 39: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

IF Ortholog of ( Item in Prochlorococcus MIT9313 ) doesn’t exist AND

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2 THEN

Add 6803 ortholog to Light-specific genes

Type name of set

I need to give the set a logical name (you click, I’ll type).

Page 40: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2 THEN

Add 6803 ortholog to Light-specific genes

Type name of set

Click to give a value to a new or old variable
Click to access results of experiments
Click to see list of available statements
Click to see list of available manipulations
Click to see statements to manipulate sets

Variable

Data

Function

Operation

Set operation

If I stop here, then all Prochlorococcus genes will be considered, and if the conditions are met, the 6803 ortholog will be saved. That’s what I want, so click Done.

Page 41: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set CancelHELPSet operation

Ortholog of ( Item in Synechocystis PCC 6803 )

(assigned to 6803 ortholog )

exists AND

data for 6803 ortholog in Synechocystis PCC 6803 microarray Hihara1 > +2 THEN

Add 6803 ortholog to Light-specific genes

Type name of set

Save results and script
Save only results

Save results and script

That was a complicated script, so I’ll save it in case I (or someone else) needs to run it again or modify it.

Page 42: Analysis:  Discovery of coregulated genes

Equivalent script that bypasses interface

(loop for item in (#^Genes ProcMed4)
      as all-orthologous = (all-blast-orthologous-geneIDs item stdevalue)
      as 6803ortholog = (#^Genes Syny6803) :in all-orthologous)
      as light-specific-genes = nil
      when (and (there-are-not-any #'member-geneID-of-gene-frames (#^Genes slotv Proc9313) :in all-orthologous))
                (there-are-any #'member-geneID-of-gene-frames 6803ortholog)
                (>= ratio-value (select-matching-geneIDs-from-table Hihara1 item) 2))))
      collect light-specific-genes 6803ortholog)

This is the script that the interface would have produced. It looks for all the world like a computer program. In fact, it is a computer program, written in BioLingua.

Continue
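
For readers who do not read Lisp, the logic built in the interface can be restated as an ordinary loop. The following Python sketch is not BioLingua and does not call any BioLingua function: the data structures (a dictionary of orthologs, a dictionary of Hihara1 ratios) and all gene IDs are assumptions made for illustration. It follows the conditions as specified in the interface (ratio looked up for the 6803 ortholog, strict ">" comparison).

```python
from typing import Dict, List, Tuple

MIT9313 = "Prochlorococcus MIT9313"
PCC6803 = "Synechocystis PCC 6803"


def build_light_specific_set(
    med4_orfs: List[str],
    orthologs: Dict[Tuple[str, str], str],
    hihara1_ratio: Dict[str, float],
    threshold: float = 2.0,
) -> List[str]:
    """Collect 6803 orthologs of MED4 ORFs that satisfy all three conditions."""
    light_specific: List[str] = []
    for item in med4_orfs:
        # Condition 1: no ortholog in MIT9313.
        if (item, MIT9313) in orthologs:
            continue
        # Condition 2: an ortholog exists in PCC 6803 (saved in a variable).
        ortholog_6803 = orthologs.get((item, PCC6803))
        if ortholog_6803 is None:
            continue
        # Condition 3: the ortholog is induced more than `threshold`-fold.
        ratio = hihara1_ratio.get(ortholog_6803)
        if ratio is not None and ratio > threshold:
            light_specific.append(ortholog_6803)
    return light_specific


# Made-up example data (not real gene IDs or measurements).
example_orfs = ["orf_a", "orf_b", "orf_c", "orf_d"]
example_orthologs = {
    ("orf_a", MIT9313): "mit_1",  # excluded: has an MIT9313 ortholog
    ("orf_a", PCC6803): "syn_1",
    ("orf_b", PCC6803): "syn_2",  # passes all three conditions
    ("orf_c", PCC6803): "syn_3",  # excluded: ratio too low
}
example_ratios = {"syn_1": 5.0, "syn_2": 3.2, "syn_3": 1.1}

print(build_light_specific_set(example_orfs, example_orthologs, example_ratios))
```

Saving the 6803 ortholog in a variable before testing it mirrors the "assigned to 6803 ortholog" step: the same value is reused for the expression test and for the ADD TO operation.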

Page 43: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Here are the results of the program. The genes meeting all the conditions are given along with a brief description and a graphic display of the regions surrounding the genes.

(Click to proceed)

Page 44: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

What can you do with this set? Certainly one thing of interest is the function of the genes. Clicking on a gene name brings you to the annotation page. Try clicking on slr1332.

Page 45: Analysis:  Discovery of coregulated genes

Synechocystis PCC 6803: slr1332

Replicon: Chromosome

Coordinates: 1670650 (start-ATG) 1671864 (stop) Human Length = 404 amino acids

Strand: Direct

Gene name(s): fabF or fabJ

Function: beta-ketoacyl-acyl carrier protein synthase Human

Activity: In vivo activity: exists Experiment

Cyanobacterial orthologs:

Syny6803

Nost7120

NostPun

Options
Annotate
Main Menu
History

More

A

A

A

HELP

You can find out more about this kind of page from Scenarios 1 – 3.

For now, click to return to the Set Display page.

Page 46: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Another interesting point of attack is the regulation of this set of genes (after all, they were selected as being coregulated by light). Perhaps the upstream regions share a common motif.

(Click to continue)

Page 47: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Unfortunately, in three cases, the genes don’t have an upstream region. Evidently these genes are part of operons. We’d like to consider the upstream regions of the operons by adding the first gene of each operon to the set. Add to set... that’s a Set operation. (Click on that)
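
In the simulation the first gene of each operon is picked by eye from the graphic display. One way the same choice might be automated is sketched below in Python, under a loudly stated assumption: neighboring genes on the same strand separated by at most `max_gap` bases are treated as one operon. That rule, the `Gene` record, the threshold of 50, and the coordinates in the example are all hypothetical and are not taken from the interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" (direct) or "-" (complementary)


def first_gene_of_operon(target: Gene, genes: List[Gene], max_gap: int = 50) -> Gene:
    """Walk upstream from `target` while neighbors look like the same operon.

    Assumption: same strand and an intergenic gap of at most `max_gap` bases
    means "same operon". `target` must be an element of `genes`.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    i = ordered.index(target)
    # Upstream is to the left for "+" genes and to the right for "-" genes.
    step = -1 if target.strand == "+" else 1
    current = target
    while 0 <= i + step < len(ordered):
        neighbor = ordered[i + step]
        if neighbor.strand != current.strand:
            break
        if step == -1:
            gap = current.start - neighbor.end
        else:
            gap = neighbor.start - current.end
        if gap > max_gap:
            break
        current = neighbor
        i += step
    return current


# Made-up coordinates for illustration only.
g1 = Gene("gene_1", 100, 400, "+")
g2 = Gene("gene_2", 420, 900, "+")
g3 = Gene("gene_3", 1500, 2000, "+")
print(first_gene_of_operon(g2, [g1, g2, g3]).name)
```

A gene that already starts its own transcription unit is returned unchanged, so the function can be applied to every member of the set without special cases.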

Page 48: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Operation

ADD TO

DELETE FROM

UNION OF

INTERSECTION OF

Arithmetic

ADD TO

Click ADD TO...

Page 49: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Find specific gene

Add set of genes

Specify gene

Click on icon of gene

Add to set

Click ADD TO and click on the icon (the short black arrow) upstream from sll0990, the first gene without an upstream region.

Page 50: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Find specific gene

Add set of genes

Specify gene

Click on icon of gene

Add to set

Syny6803:srl7009 trnR tRNA Arg (UCU)

The short arrow is now named and part of the set. Click on the icon (the long black arrow) upstream from slr1332, the next gene without an upstream region.

Page 51: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Find specific gene

Add set of genes

Specify gene

Click on icon of gene

Add to set

Syny6803:srl7009 trnR tRNA Arg (UCU)

Syny6803:slr1331 Processing protease

slr1331 is now part of the set. Click on the icon upstream from sll0789, the third gene without an upstream region.

Page 52: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

DoneHELPSet operation

Syny6803:srl7009 trnR tRNA Arg (UCU)

Syny6803:slr1331 Processing protease

Syny6803:sll0788 Hypothetical protein

Find specific gene

Add set of genes

Specify gene

Click on icon of gene

Add to set

Now the genes presumably at the start of the operons have been added, and it’s time to remove the genes without upstream regions. Click on the radio buttons of the three genes and click Set operation.

Page 53: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:sll0990 Formaldehyde dehydrogenase (glutathione dependent)

Syny6803:slr1332 fabF beta ketoacyl acyl carrier protein synthase

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0789 Response regulator (OmpR)

Syny6803:sll0576 Putative epimerase/hydratase

Done HELP Set operation

Syny6803:srl7009 trnR tRNA Arg (UCU)

Syny6803:slr1331 Processing protease

Syny6803:sll0788 Hypothetical protein

KEEP (delete others)

DELETE FROM

Click DELETE FROM to remove the three genes

from the set.
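The two set operations offered here behave like ordinary set algebra. The sketch below is only an illustration in Python, not the interface's own code: the gene identifiers are copied from the slides, while the variable names (`light_specific`, `operon_starts`, `selected`) are invented for the example.

```python
# Illustrative sketch only: identifiers come from the slides,
# variable names are hypothetical.

# The six genes in the set before any were added by hand
light_specific = {
    "sll0990", "slr1332", "sll0337", "sll0335", "sll0789", "sll0576",
}

# The three genes added by clicking on icons upstream of genes
# that lack an upstream region of their own
operon_starts = {"srl7009", "slr1331", "sll0788"}

# The three genes marked with the radio buttons
selected = {"sll0990", "slr1332", "sll0789"}

working_set = light_specific | operon_starts

# DELETE FROM: remove the selected genes from the set
after_delete = working_set - selected

# KEEP (delete others): retain only the selected genes
after_keep = working_set & selected

print(sorted(after_delete))
```

DELETE FROM leaves the six genes shown on the next slide; KEEP would instead have left only the three selected genes.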

Page 54: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Light-specific genes

Syny6803:srl7009 trnR tRNA Arg (UCU)

Syny6803:sll0337 Sensor histidine kinase

Syny6803:sll0335 Hypothetical

Syny6803:sll0576 Putative epimerase/hydratase

Done HELP Set operation

Syny6803:slr1331 Processing protease

Syny6803:sll0788 Hypothetical protein

Now you have what you want: a set of genes coregulated by light. The game now is to extract their upstream regions and determine if that set contains a common sequence motif. So you need a new set. Click Build set.

Page 55: Analysis:  Discovery of coregulated genes

Click to see set you or someone else made

Click to see statements to manipulate sets

Click to give a value to a new or old variable

Click to access results of experiments

Click to see list of available statements

Click to see list of available manipulations

Click any red button to get help

Variable

Data

Function

Operation

Set operation

Display set

Display set

Variable Data Operation Function Done

Build set Cancel HELP Set operation

The set will be based on the upstream regions of the set of light-specific genes.

Upstream region of... that’s a Function.
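As a rough idea of what a function like Upstream region of might compute, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the 0-based end-exclusive coordinates, the fixed region length, and the toy genome string. It treats the genome as linear and does not stop at a neighboring gene, which a real implementation might well do.

```python
# Hypothetical sketch: names, coordinate convention and toy data are assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome: str, start: int, end: int, strand: str, length: int) -> str:
    """Return up to `length` bases immediately upstream of a gene.

    `start` and `end` are 0-based, end-exclusive coordinates on the forward
    strand. For a '+' gene the upstream region precedes `start`. For a '-'
    gene it follows `end` and is reverse-complemented, so the result always
    reads 5'->3' toward the gene.
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    if strand == "-":
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be '+' or '-'")


# Toy genome, invented for the example
toy_genome = "AAACCCGGGTTTACGT"

forward_upstream = upstream_region(toy_genome, 6, 9, "+", 4)
reverse_upstream = upstream_region(toy_genome, 6, 9, "-", 4)
print(forward_upstream, reverse_upstream)
```

Applying such a function to every member of a gene set gives a set of sequences, which is the shape of the set being built in this step.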

Page 56: Analysis:  Discovery of coregulated genes

Display set

Variable Data Operation Function Done

Build set Cancel HELP Set operation

Ortholog of

Protein product of

Sequence of

Upstream region of

Downstream region of

Common sequences(Meme) of

Upstream region of

The set will be based on the upstream regions of the set of light-specific genes.

Upstream region of... that’s a Function.

Page 57: Analysis:  Discovery of coregulated genes

Upstream region of

6803 orthologSet

Specify

Choose variable

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

Set ( in

Choose database

Arthrobacter platensis

Gloeobacter violaceus

Microcystis aeruginosa

Nostoc punctiforme

Nostoc PCC 7120

Prochlorococcus MED4

Prochlorococcus MIT9313

Prochlorococcus S120

Synechococcus PCC6301

Synechococcus PCC7942

Synechococcus WH8102

Synechocystis PCC 6803

Thermosynechococcus

Trichodesmium

)

You want the upstream region not of a variable (6803 ortholog is the only one you’ve made so far) but of a set.

Page 58: Analysis:  Discovery of coregulated genes

All open reading frames of

Human-annotated orfs of

Private set

Public set

Upstream region of

Choose set type

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

(

Choose database

Arthrobacter platensis

Gloeobacter violaceus

Microcystis aeruginosa

Nostoc punctiforme

Nostoc PCC 7120

Prochlorococcus MED4

Prochlorococcus MIT9313

Prochlorococcus S120

Synechococcus PCC6301

Synechococcus PCC7942

Synechococcus WH8102

Synechocystis PCC 6803

Thermosynechococcus

Trichodesmium

)

Private set

You have a couple of premade sets available, but you want your own. Click Private set.

Page 59: Analysis:  Discovery of coregulated genes

Upstream region of

Light-specific genes

Choose set

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

assigned to

Type set name

Light-specific genes( )

You’ve made only one set. Choose it...

Page 60: Analysis:  Discovery of coregulated genes

Upstream region of

Light-specific genes

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

assigned to Upstream light-sp genes

Type set name

You’ve made only one set. Choose it... and give it a name (I’ll type it).

Page 61: Analysis:  Discovery of coregulated genes

Upstream region of

Light-specific genes

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

assigned to Upstream light-sp genes

Type variable name

Click to see set you or someone else made

Click to see statements to manipulate sets

Click to give a value to a new or old variable

Click to access results of experiments

Click to see list of available statements

Click to see list of available manipulations

Click any red button to get help

Variable

Data

Function

Operation

Set operation

Display set

You want to run this new set through a filter that will give you conserved motifs.

That’s a Function.

Page 62: Analysis:  Discovery of coregulated genes

Upstream region of

Light-specific genes assigned to Upstream light-sp genes

Display set

Variable Data Operation Function Done

Build set Cancel HELP Set operation

Ortholog of

Protein product of

Sequence of

Upstream region of

Downstream region of

Common sequences(Meme) of

Common sequences(Meme) of

The function you want, Meme, analyzes a set of

sequences for statistically overrepresented motifs.
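To give a feel for what searching a set of sequences for a shared motif involves, here is a deliberately simplified Python sketch. It is not the MEME algorithm, which fits a statistical model to the sequences. This stand-in just enumerates every k-mer in the input and counts how many sequences contain a copy within a given number of mismatches. The function names and the three toy sequences are invented for the example.

```python
# Simplified stand-in for motif discovery (NOT the MEME algorithm).
# Function names and toy sequences are hypothetical.


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def sequences_with_site(candidate: str, sequences: list[str], max_mismatches: int) -> int:
    """Count the sequences containing a window within `max_mismatches` of `candidate`."""
    k = len(candidate)
    count = 0
    for seq in sequences:
        windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
        if any(hamming(candidate, w) <= max_mismatches for w in windows):
            count += 1
    return count


def best_shared_kmers(sequences: list[str], k: int, max_mismatches: int):
    """Return the best sequence coverage and every k-mer that achieves it."""
    candidates = {seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)}
    scored = {c: sequences_with_site(c, sequences, max_mismatches) for c in candidates}
    top = max(scored.values())
    return top, sorted(c for c, n in scored.items() if n == top)


# Toy sequences, invented for the example
toy_sequences = ["TTGACTGGAACC", "CAGACTAGAATG", "GGTACTGGAAAT"]

top_count, top_kmers = best_shared_kmers(toy_sequences, k=8, max_mismatches=1)
print(top_count, top_kmers)
```

A count like this says nothing about statistical significance. Judging whether a motif is overrepresented relative to chance is the part that a tool such as MEME adds.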

Page 63: Analysis:  Discovery of coregulated genes

Common sequences of

Light-specific genes

Upstream light-sp genes

Public set

Premade set

Choose set

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

assigned to

Type variable name

Upstream light-sp genes

Upstream region of

Light-specific genes assigned to Upstream light-sp genes

You now have two private sets. You want, of course, the set of upstream regions of the light-specific genes.

Page 64: Analysis:  Discovery of coregulated genes

Common sequences of

Upstream light-sp genes

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

assigned to Memed light-sp genes

Type variable name

Upstream region of

Light-specific genes assigned to Upstream light-sp genes

You click and I’ll type in a logical name.

Page 65: Analysis:  Discovery of coregulated genes

Common sequences of

Upstream light-sp genes

Display set

Variable Data

Build set

Operation Function Done

Cancel HELP Set operation HELP

assigned to Memed light-sp genes

Type variable name

Upstream region of

Light-specific genes assigned to Upstream light-sp genes

Click to see set you or someone else made

Click to see statements to manipulate sets

Click to give a value to a new or old variable

Click to access results of experiments

Click to see list of available statements

Click to see list of available manipulations

Click any red button to get help

Variable

Data

Function

Operation

Set operation

Display set

Click Done to put your plans into action and to display the last defined set.

Page 66: Analysis:  Discovery of coregulated genes

Build set Display set

Set: Memed upstream light-sp genes

Sequence                      Len  Orient.  Pos  E-val     Left flank  Motif     Right flank
Upstream of Syny6803:srl7009  638  D        509  6.38e-05  AAATATGGGA  GAATGGAA  TTGAGTAGCA
Upstream of Syny6803:sll0337  138  D         34  3.29e-05  AGCTTAAAAA  GACTAGAA  TTCAATGGGT
Upstream of Syny6803:sll0335  279  D        183  5.30e-05  AAGGGTTAGC  GACTGGAG  TTGGCAAAAC
Upstream of Syny6803:sll0576   79  D         28  1.31e-04  TTTTTTGCTT  TACTGGGA  ACGGATATTT
Upstream of Syny6803:slr1331  159  D        152  4.38e-05  GGCCCATGGG  GACTAGGA
Upstream of Syny6803:sll0788  221  D        184  1.09e-05  TGCTTTGCCA  GACTGGAA  TTAGAGAAGG

Done HELP Set operation

And there you have it! A conserved octameric sequence found in front of

all six light-regulated genes (and, gratifyingly, all in the same

orientation: Direct rather than Reverse). Is the sequence involved in gene regulation? Only experiments

will tell...
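One way to summarize the six motif instances is to tally the bases at each position and take the most frequent one as a consensus. The Python sketch below does this for the eight-base motif column of the result. The sequences are copied from the slide, while the function names are invented for the example. A consensus is only a summary of the alignment and not evidence of function.

```python
# Illustrative sketch: motif instances are copied from the slide,
# function names are hypothetical.
from collections import Counter

motif_sites = [
    "GAATGGAA",  # upstream of srl7009
    "GACTAGAA",  # upstream of sll0337
    "GACTGGAG",  # upstream of sll0335
    "TACTGGGA",  # upstream of sll0576
    "GACTAGGA",  # upstream of slr1331
    "GACTGGAA",  # upstream of sll0788
]


def column_counts(sites: list[str]) -> list[Counter]:
    """Tally the bases observed at each aligned position."""
    return [Counter(column) for column in zip(*sites)]


def consensus(sites: list[str]) -> str:
    """Most frequent base at each position."""
    return "".join(c.most_common(1)[0][0] for c in column_counts(sites))


def invariant_positions(sites: list[str]) -> list[int]:
    """1-based positions at which every site carries the same base."""
    return [i for i, c in enumerate(column_counts(sites), start=1) if len(c) == 1]


motif_consensus = consensus(motif_sites)
conserved = invariant_positions(motif_sites)
print(motif_consensus, conserved)
```

For these six sites, three of the eight positions are identical in every instance, and the consensus coincides with the instance upstream of sll0788.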

Page 67: Analysis:  Discovery of coregulated genes

What have we done?

• Written a computer program

• Drawn on complex knowledge base

• Combined multiple tools

• Modified set by hand

Did it ourselves

Continue

Page 68: Analysis:  Discovery of coregulated genes

Analysis: Discovery of coregulated genes

Summary

• The graphical interface facilitates searches using functions, loops and Boolean operations, with few of the complexities of most computer languages

• The graphical interface facilitates searching through experimental data for orfs with desired properties

• The script interface permits interaction between the search and display capabilities of the web site and outside resources.

End

Scenario 4

Reminder: This was a simulation. The underlying language (BioLingua) exists, but not the interface that facilitates access.