busqueda de Promotores

Regulatory sequences and promoter search. Bq. Francisco Duarte, Ph.D. Biotechnology

Description

Promoter search, bioinformatics area

Transcript of busqueda de Promotores

Page 1: busqueda de Promotores

Regulatory sequences and promoter search

Bq. Francisco Duarte, Ph.D. Biotechnology

Page 2: busqueda de Promotores

Contents

1. Background

2. Representation of regulatory motifs

3. Promoter search algorithms

4. Databases related to promoter search

Page 3: busqueda de Promotores
Page 4: busqueda de Promotores
Page 5: busqueda de Promotores
Page 6: busqueda de Promotores
Page 7: busqueda de Promotores
Page 8: busqueda de Promotores
Page 9: busqueda de Promotores
Page 10: busqueda de Promotores
Page 11: busqueda de Promotores

Probability of occurrence of each nucleotide

-10 sequence:   T     A     T     A     A     T
                77%   76%   60%   61%   56%   82%

-35 sequence:   T     T     G     A     C     A
                69%   79%   61%   56%   54%   54%
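The frequencies above can be turned into a rough score for a candidate hexamer. The following is a minimal sketch, not the method of any particular promoter-finding tool. The function names are hypothetical, and it assumes that a mismatch contributes zero, because the slide gives only the frequency of the most common base at each position.

```python
# Consensus base and its frequency at each position, as listed on the slide.
MINUS_10 = [("T", 0.77), ("A", 0.76), ("T", 0.60), ("A", 0.61), ("A", 0.56), ("T", 0.82)]
MINUS_35 = [("T", 0.69), ("T", 0.79), ("G", 0.61), ("A", 0.56), ("C", 0.54), ("A", 0.54)]


def consensus_score(hexamer, model):
    """Sum the consensus frequencies at positions where the hexamer matches.

    Assumption: a mismatch scores 0, since the frequencies of the
    non-consensus bases are not given in the source.
    """
    hexamer = hexamer.upper()
    if len(hexamer) != len(model):
        raise ValueError("hexamer length must match the model length")
    return sum(freq for base, (cons, freq) in zip(hexamer, model) if base == cons)


def max_score(model):
    """Score of a perfect match to the consensus."""
    return sum(freq for _, freq in model)


# Relative score of a candidate -10 box with one mismatch (position 4).
candidate = "TATGAT"
print(consensus_score(candidate, MINUS_10) / max_score(MINUS_10))
```

Dividing by the perfect-match score gives a value between 0 and 1, which makes candidates for the -10 and -35 boxes comparable.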

Page 12: busqueda de Promotores

TRANSFAC: Structure of a eukaryotic gene

[Figure labels: contig; gene; regulatory elements; 5'-UTR; CDS; 3'-UTR; 5' and 3' ends; transcription; primary transcript; splicing; alternative exon; mRNA splice variants.]

Page 13: busqueda de Promotores

[Figure labels: promoter; enhancer 1; enhancer 2; TSS; TATA box; initiator (Inr); boxes A, B, C, A', E, D, D', F, G, A''; composite element.]

General scheme of the hierarchical structure of the transcriptional regulatory regions in eukaryotic genes

Page 14: busqueda de Promotores

What is a transcription factor?

A transcription factor is a protein that regulates transcription after nuclear translocation by specific interaction with DNA or by stoichiometric interaction with a protein that can be assembled into a sequence-specific DNA-protein complex.

http://www.gene-regulation.com/pub/databases/transfac/clSM.html

Page 15: busqueda de Promotores

Regulatory regions

Page 16: busqueda de Promotores

Gene regulation

• Virtually every cell in your body contains a complete set of genes

• But they are not all turned on in every tissue

• Each cell in your body expresses only a small subset of genes at any time

• During development different cells express different sets of genes in a precisely regulated fashion.

• Gene regulation occurs at the level of transcription or production of mRNA

• A given cell transcribes only a specific set of genes and not others

• Insulin is made by pancreatic cells

Page 17: busqueda de Promotores

Characteristics of regulatory regions

Check:
http://www.ccg.unam.mx/Computational_Genomics/PromoterTools/
http://molbiol-tools.ca/Promoters.htm
http://www.phisite.org/main/index.php?nav=tools&nav_sel=hunter
http://www.fruitfly.org/seq_tools/promoter.html
http://linux1.softberry.com/berry.phtml?topic=bprom&group=programs&subgroup=gfindb

Page 18: busqueda de Promotores

Central dogma

Genetic information flows from DNA to RNA to protein.

Gene regulation has been well studied in E. coli.

When a bacterial cell encounters a potential food source, it manufactures the enzymes necessary to metabolize that food.

Gene Regulation

In addition to sugars like glucose and lactose, E. coli cells also require amino acids. One essential amino acid is tryptophan.

When E. coli is swimming in tryptophan (milk and poultry), it absorbs the amino acid from the media. When tryptophan is not present in the media, the cell must manufacture its own.

Trp Operon

E. coli uses several proteins encoded by a cluster of 5 genes to manufacture the amino acid tryptophan.

All 5 genes are transcribed together as a unit called an operon, which produces a single long piece of mRNA for all the genes.

RNA polymerase binds to a promoter located at the beginning of the first gene and proceeds down the DNA, transcribing the genes in sequence.

Page 19: busqueda de Promotores
Page 20: busqueda de Promotores
Page 21: busqueda de Promotores

Gene regulation

In addition to amino acids, E. coli cells also metabolize sugars in their environment.

In 1959 Jacques Monod and François Jacob looked at the ability of E. coli cells to digest the sugar lactose.

In the presence of lactose, E. coli makes an enzyme called beta-galactosidase.

Beta-galactosidase breaks down lactose so that E. coli can digest it for food.

The lac Z gene in E. coli codes for the enzyme beta-galactosidase.

Page 22: busqueda de Promotores

Lac Z Gene

The tryptophan genes are turned on when there is no tryptophan in the media. That is when the cell needs to make its own tryptophan.

E. coli cells cannot make the sugar lactose. They can only have lactose when it is present in their environment. Then they turn on genes to break down lactose.

E. coli only needs beta-galactosidase if there is lactose in the environment to digest. There is no point in making the enzyme if there is no lactose to break down.

It is the combination of the promoter and the DNA that regulates when a gene will be transcribed.

This combination of a promoter and a gene is called an OPERON.

Page 23: busqueda de Promotores

THE OPERON

An operon is a cluster of genes encoding related enzymes that are regulated together.

An operon consists of:

• a promoter site where RNA polymerase binds and begins transcribing the message.

• a region that makes a repressor.

The repressor sits on the DNA at a spot between the promoter and the gene to be transcribed.

This site is called the operator.

Page 24: busqueda de Promotores

LAC Z GENE

• E. coli regulates the production of beta-galactosidase by using a regulatory protein called a repressor

• The repressor binds to the lac Z gene at a site between the promoter and the start of the coding sequence

• The site the repressor binds to is called the operator

Page 25: busqueda de Promotores
Page 26: busqueda de Promotores

LAC Z GENE

• Normally the repressor sits on the operator, repressing transcription of the lac Z gene

• In the presence of lactose the repressor binds to the sugar, and this allows the polymerase to move down the lac Z gene

Page 27: busqueda de Promotores

LAC Z GENE

This results in the production of beta-galactosidase, which breaks down the sugar.

When there is no sugar left, the repressor returns to its spot on the chromosome and stops the transcription of the lac Z gene.
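The on/off behavior described on these slides can be written as a small Boolean model. This is a deliberately simplified sketch with hypothetical function names. It encodes only the repressor logic stated in the slides and ignores everything else that affects transcription in a real cell.

```python
def lac_z_transcribed(lactose_present: bool) -> bool:
    """Simplified lac Z logic from the slides.

    Without lactose the repressor sits on the operator and blocks the
    polymerase; with lactose the repressor binds the sugar and releases
    the operator.
    """
    repressor_on_operator = not lactose_present
    return not repressor_on_operator


def trp_operon_transcribed(tryptophan_present: bool) -> bool:
    """Simplified trp logic: the genes are on only when tryptophan is absent."""
    return not tryptophan_present


for present in (False, True):
    print("lactose present:", present, "-> lac Z transcribed:", lac_z_transcribed(present))
    print("tryptophan present:", present, "-> trp operon transcribed:", trp_operon_transcribed(present))
```

The two functions make the contrast explicit: lac Z is switched on by the presence of its substrate, while the trp genes are switched on by the absence of their product.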

Page 28: busqueda de Promotores

Mechanism: operon off

Page 29: busqueda de Promotores
Page 30: busqueda de Promotores
Page 31: busqueda de Promotores

GENE REGULATION

• In eukaryotic organisms like ourselves there are several methods of regulating protein production

• Most regulatory sequences are found upstream from the promoter

• Genes are controlled by regulatory elements in the promoter region that act like on/off switches or dimmer switches

Page 32: busqueda de Promotores
Page 33: busqueda de Promotores

GENE REGULATION

• Specific transcription factors bind to these regulatory elements and regulate transcription.

• Regulatory elements may be tissue specific and will activate their gene only in one kind of tissue

• Sometimes the expression of a gene requires the function of two or more different regulatory elements

Page 34: busqueda de Promotores

INTRONS AND EXONS

• Eukaryotic DNA differs from prokaryotic DNA in that the coding sequences along the gene are interspersed with noncoding sequences.

• The coding sequences are called

– EXONS

• The noncoding sequences are called

– INTRONS

Page 35: busqueda de Promotores

INTRONS AND EXONS

• After the initial transcript is produced the introns are spliced out to form the completed message ready for translation

• Introns can be very large and numerous, so some genes are much bigger than the final processed mRNA

Page 36: busqueda de Promotores

INTRONS AND EXONS

• Muscular dystrophy

• DMD gene is about 2.5 million base pairs long

• Has more than 70 introns

• The final mRNA is only about 17,000 base pairs long
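A quick calculation shows how little of the gene survives splicing. The sketch below uses only the approximate figures given on this slide; the variable names are illustrative.

```python
# Approximate figures taken from the slide.
gene_length_bp = 2_500_000
mrna_length_bp = 17_000

# Fraction of the gene length that is still present in the final mRNA.
fraction_in_mrna = mrna_length_bp / gene_length_bp
print(f"{fraction_in_mrna:.2%} of the gene length remains in the mRNA")
```

With these figures, less than 1% of the gene length is retained in the processed message.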

Page 37: busqueda de Promotores

RNA Splicing

• Provides a point where the expression of a gene can be controlled

• Exons can be spliced together in different ways

• This allows a variety of different polypeptides to be assembled from the same gene

• Alternate splicing is common in insects and vertebrates, where 2 or 3 different proteins are produced from one gene
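To see how one gene can yield several messages, the sketch below enumerates the isoforms produced by exon skipping. It is an illustration under stated assumptions: the exon names are hypothetical, only whole-exon skipping is modeled, and exon order is always preserved.

```python
from itertools import combinations


def splice_variants(exons, optional):
    """List every isoform obtained by keeping or skipping the optional exons.

    exons:    exon names in genomic order
    optional: names of exons that may be skipped (assumed cassette exons)
    """
    skippable = [e for e in exons if e in optional]
    variants = []
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            variants.append([e for e in exons if e not in skipped])
    return variants


# Hypothetical gene with four exons, two of which can be skipped.
for isoform in splice_variants(["E1", "E2", "E3", "E4"], {"E2", "E3"}):
    print("-".join(isoform))
```

With n optional exons this simple model gives 2^n isoforms, so two cassette exons already allow four distinct messages.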

Page 38: busqueda de Promotores
Page 39: busqueda de Promotores

Protein domains in regulator sequences

Page 40: busqueda de Promotores

TFBS: transcription factor binding sites

Page 41: busqueda de Promotores

Motif representations: from alignments to motifs
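One common way to go from an alignment to a motif is a position frequency matrix: count each base in each column of the aligned sites and divide by the number of sites. The sketch below assumes ungapped sites of equal length. The example sites and the function names are hypothetical and are not taken from the slides.

```python
from collections import Counter


def position_frequency_matrix(sites):
    """Return one {base: frequency} dict per alignment column."""
    sites = [s.upper() for s in sites]
    if len({len(s) for s in sites}) != 1:
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({base: counts[base] / len(sites) for base in "ACGT"})
    return matrix


def consensus(matrix):
    """Most frequent base per column (ties go to the first of A, C, G, T)."""
    return "".join(max("ACGT", key=column.get) for column in matrix)


# Hypothetical aligned binding sites.
sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT"]
pfm = position_frequency_matrix(sites)
print(consensus(pfm))
for position, column in enumerate(pfm, start=1):
    print(position, column)
```

The consensus string keeps only the top base per column, while the matrix keeps the full distribution, which is why matrices are usually preferred for scanning sequences.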

Page 42: busqueda de Promotores
Page 43: busqueda de Promotores
Page 44: busqueda de Promotores
Page 45: busqueda de Promotores

Transcription factors

[Figure labels: sequence-specific; DNA binding; non-DNA binding; TF1, TF2, TF3, TF4; adapter; co-activator; HAT; DNA; Layer I, Layer II, Layer III.]

Page 46: busqueda de Promotores

Structure of transcription factors

[Figure: USF-1, dimer. Labeled domains: DNA-binding domain; activation domain; oligomerization domain; ligand-binding domain; protein-protein interaction domain.]

Page 47: busqueda de Promotores

N    Gene                               Positions of the CE       Factors (from the schema)   TRANSCompel accession number
1.   Scavenger receptor, Homo sapiens   Enhancer, -4500/-4100     AP-1, Ets                   C00080
2.   GM-CSF, Mus musculus               -53, -40                  AP-1, Ets                   C00081
3.   Collagenase, Homo sapiens          -89, -82, -72, -66        AP-1, Ets                   C00083
4.   IgH, Mus musculus                  Enhancer at 3' flank      AP-1, Ets                   C00133
5.   Interleukin 2, Homo sapiens        -283, -268                AP-1, NFAT                  C00109
6.   Interleukin 2, Homo sapiens        -167, -142                AP-1, NF-κB                 C00165
7.   2, Mus musculus                    -167, -142                AP-1, Oct-2                 C00158
8.   IgH, Homo sapiens                  (not shown)               Ets, CBF                    C00173
9.   A1, Rattus norvegicus              -117, -73                 NF-κB, C/EBP                C00101
10.  IRF-1, Mus musculus                -123, -113, -49, -40      NF-κB, STAT-1               C00192

Page 48: busqueda de Promotores

Ternary complex NFATp - AP1 - DNA

Page 49: busqueda de Promotores

[Figure: F1 alone, low level of transcription; F2 alone, low level of transcription; F1 and F2 together, synergistic activation of transcription.]

Composite elements

Minimal functional units where both protein-DNA and protein-protein interactions contribute to a highly specific pattern of gene expression and provide cross-coupling of different signal transduction pathways.

Page 50: busqueda de Promotores

[Figure labels. Cytoplasm: membrane receptor; Src; SH2; SH3; adaptors; Ras (GDP, GTP); PLC; PI3-K; PKB/Akt; phosphorylation; IP3; Ca2+; Ca2+-dependent channel; calcineurin; ERK; JNK; p38 MAPK; NFATp. Nucleus: NFATp; c-Fos; c-Jun; ATF-2; phosphate groups (P); composite element; IL-2.]

Integration of signals. Cross-coupling of signal transduction pathways

Page 51: busqueda de Promotores

Mouse IL-4 promoter

[Figure labels: positions -249, -180, -150, -114, -88, -60, -28 and +1 (ST); binding sites for AP-1, NFAT, HMG Y, STAT 6, NF-Y and c-MAF; TATA; composite elements (CE).]

Page 52: busqueda de Promotores

GM-CSF, Homo sapiens

[Figure labels: T-cell specific inducible enhancer at -3500 bp; promoter; positions -114, -88, -54 and +1 (ST); TATTT; AP-1 and NFAT sites forming composite elements (CE); NF-κB p50/p65; NF-κB c-Rel/p65; HMG Y(I); CD28 response element; CBF.]

Page 53: busqueda de Promotores

Recruitment of CIITA to MHC-II promoters. A prototypical MHC-II promoter (HLA-DRA) is represented schematically with the W, X, X2, and Y sequences conserved in all MHC-II, Ii, and HLA-DM promoters. RFX, X2BP, NF-Y, and an as yet undefined W-binding protein bind cooperatively to these sequences and assemble into a stable higher order nucleoprotein complex referred to here as the MHC-II enhanceosome. CIITA is tethered to the enhanceosome via multiple weak protein-protein interactions with the W, X, X2, and Y-binding factors. The octamer site found in the HLA-DRA promoter (O), and its cognate activators (Oct and OBF-1) are not required for recruitment of CIITA. CIITA is proposed to activate transcription (arrow) via its amino-terminal activation domains (AD), which contact the RNA polymerase II basal transcription machinery.

Masternak K et al., Genes Dev 2000 May 1;14(9):1156-66

Enhanceosome

Page 54: busqueda de Promotores
Page 55: busqueda de Promotores

[Figure labels: TFIIA; TFIIB; TFIID; TFIIE; TFIIF; TFIIH; RNA pol II; site-specific TF; co-activator; p300/CBP; PCAF; acetylase; acetylation; closed nucleosomes.]

Page 56: busqueda de Promotores

Databases on gene regulation

Page 57: busqueda de Promotores

http://regulondb.ccg.unam.mx/

Page 58: busqueda de Promotores

Exercise

Find the .gbk file and the 100 base pairs upstream
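Once the genome sequence and the gene coordinates have been read from the .gbk file, the upstream region can be cut out with a few lines of code. The sketch below assumes the sequence is already available as a plain string and that the coordinates are 0-based and end-exclusive on the forward strand. The function names and the toy genome are hypothetical, and parsing the GenBank file itself is not shown.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length=100):
    """Return up to `length` bases upstream of a gene, read 5'->3' on the gene's strand.

    start, end: 0-based, end-exclusive gene coordinates on the forward strand
    strand:     "+" or "-"
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    if strand == "-":
        # For a reverse-strand gene, "upstream" lies after `end` on the forward strand.
        return reverse_complement(genome[end:end + length])
    raise ValueError("strand must be '+' or '-'")


# Toy genome, for illustration only.
genome = "AAAACCCCGGGGTTTT"
print(upstream_region(genome, 8, 12, "+", length=4))
print(upstream_region(genome, 8, 12, "-", length=4))
```

Handling the strand explicitly matters: for a gene on the reverse strand, the upstream region must be taken from the other side of the gene and reverse-complemented before any promoter search.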

Page 59: busqueda de Promotores

BLASTp against NR to search for probable orthologs

Page 60: busqueda de Promotores
Page 61: busqueda de Promotores
Page 62: busqueda de Promotores
Page 63: busqueda de Promotores
Page 64: busqueda de Promotores
Page 65: busqueda de Promotores

>malE - 100 bases upstream

aaagaactacctgaatttcgagattaggcctt
gatcgcgccggggtgaaagcgttatact
gacgcgcaaacgtttgcgcaatttgggcacag
agggggtt

>malE - 100 bases upstream

aggaggatggaaagaggatgtcatagaaagaa
actaaagaccgttaagcgacctctgcgt
atccacgagcaatatacacaaatggaaaagga
cgggttat
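Upstream sequences like these can be scanned for windows that resemble the -10 consensus TATAAT. The sketch below is a naive sliding-window match count, not the algorithm used by any of the tools linked in this deck. The function name and the `min_matches` threshold are assumptions, and the reported offsets are relative to the end of the upstream sequence, not to the transcription start site.

```python
def scan_for_consensus(seq, consensus="TATAAT", min_matches=5):
    """Return (offset, window, matches) for every window close to the consensus.

    offset is negative and counts back from the end of `seq`,
    i.e. from the first base after the upstream region.
    """
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        matches = sum(a == b for a, b in zip(window, consensus))
        if matches >= min_matches:
            hits.append((i - len(seq), window, matches))
    return hits


# First malE upstream sequence from the slide, joined into one string.
male_upstream = (
    "aaagaactacctgaatttcgagattaggcctt"
    "gatcgcgccggggtgaaagcgttatact"
    "gacgcgcaaacgtttgcgcaatttgggcacag"
    "agggggtt"
)
for offset, window, matches in scan_for_consensus(male_upstream, min_matches=4):
    print(offset, window, matches)
```

Lowering `min_matches` returns more candidates at the cost of more false positives, which is the basic trade-off that dedicated promoter predictors address with weighted matrices and spacing constraints.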

Page 66: busqueda de Promotores

http://molbiol-tools.ca/Promoters.htm

Page 67: busqueda de Promotores

http://www.prodoric.de/vfp/vfp_promoter.php

Page 68: busqueda de Promotores

http://www.phisite.org/main/index.php?nav=tools&nav_sel=hunter

Page 69: busqueda de Promotores
Page 70: busqueda de Promotores