BEL language v1.0
Natalie Catlett, May 10, 2013
Language Overview
Contents
• Language Overview
  – Statements
  – Annotations
  – Terms
  – Functions
  – Relationships
• Knowledge Representation Examples
Language Overview
• BEL statements capture knowledge
• BEL annotations provide information about statements
  – Citation, experimental context, etc.
• BEL terms are composed using BEL functions applied to namespace values
• BEL relationships connect BEL terms
BEL Statements
• Basic statement types:
  – Term Expression
    complex(p(HGNC:CCND1), p(HGNC:CDK4))
  – Term Expression, Relationship, Term Expression
    p(HGNC:CCND1) directlyIncreases kin(p(HGNC:CDK4))
• Complex statement type:
  – A causal statement can be used as the target term of a causal statement
  – Term Expression, Causal Relationship, Causal Statement
    p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))
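The three statement shapes above can be told apart mechanically. The following is a minimal sketch, assuming that a relationship is always a whitespace-separated token outside any parentheses; `split_statement` and `RELATIONSHIPS` are hypothetical names for illustration, not part of any BEL toolkit.

```python
# Hypothetical helper for illustration only; not an official BEL parser.
RELATIONSHIPS = {
    "->", "=>", "-|", "=|",
    "increases", "directlyIncreases", "decreases", "directlyDecreases",
    "rateLimitingStepOf", "causesNoChange",
    "negativeCorrelation", "positiveCorrelation", "association",
    "biomarkerFor", "prognosticBiomarkerFor",
    "hasMember", "hasComponent", "hasMembers", "hasComponents",
    "isA", "subProcessOf",
    "transcribedTo", "translatedTo", "orthologousTo",
}


def split_statement(statement):
    """Split a statement into (subject, relationship, object).

    A bare term expression is returned as (term, None, None).
    Whitespace inside parentheses or quotes never splits a token.
    """
    depth = 0
    in_quotes = False
    tokens = []
    current = ""
    for ch in statement.strip():
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        if ch.isspace() and depth == 0 and not in_quotes:
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)

    # The first top-level token that is a known relationship splits the statement.
    for i, token in enumerate(tokens):
        if token in RELATIONSHIPS:
            return " ".join(tokens[:i]), token, " ".join(tokens[i + 1:])
    return " ".join(tokens), None, None


nested = split_statement(
    "p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))"
)
# The object of the outer statement is itself a statement in parentheses;
# stripping the outer parentheses lets the same function split it again.
inner = split_statement(nested[2][1:-1])
```

Here `nested[2]` is the parenthesized causal statement, and `inner` is its own subject, relationship, and object.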
BEL Annotations
• Annotations provide information about one or more BEL Statements
SET Citation = {"PubMed", "J Mol Med", "12682725", "2003-03-14","Limbourg FP|Liao JK",""}
SET Evidence = "high-dose steroid treatment decreases vascular inflammation and ischemic tissue damage after myocardial infarction and stroke through direct vascular effects involving the nontranscriptional activation of eNOS"
SET Species = "9606"
SET Tissue = "Vascular System"
SET Disease = "Stroke"
a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")
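As a rough illustration of how the SET lines above could be read by a program, here is a minimal sketch. It assumes only the two value shapes seen in this example (a quoted string, or a brace-delimited list of quoted strings); `parse_set_line` is a hypothetical helper, not an official BEL script parser.

```python
import re

# Matches lines of the form: SET <Name> = <value>
SET_LINE = re.compile(r"^SET\s+(\w+)\s*=\s*(.+)$")


def parse_set_line(line):
    """Return (annotation_name, value) for one SET line.

    A brace-delimited value becomes a list of strings;
    a quoted value becomes a plain string.
    """
    match = SET_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a SET line: " + line)
    name = match.group(1)
    raw = match.group(2).strip()
    if raw.startswith("{") and raw.endswith("}"):
        # Collect every quoted field, including empty ones.
        return name, re.findall(r'"([^"]*)"', raw)
    return name, raw.strip('"')


citation = parse_set_line(
    'SET Citation = {"PubMed", "J Mol Med", "12682725", '
    '"2003-03-14","Limbourg FP|Liao JK",""}'
)
species = parse_set_line('SET Species = "9606"')
```

With the lines from this slide, `citation[1][2]` is the PubMed ID `"12682725"` and `species` is `("Species", "9606")`.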
BEL Terms
• BEL terms have the following components:
  – Function
    • Required
    • Can be nested to create complex terms
  – Namespace Abbreviation
    • Optional
  – Value
    • Required
    • If a namespace is given, the value is found in that namespace
• BEL terms from different namespaces are unified during compilation using information in the BEL Namespace Equivalence documents
General form of a term: f(ns:value)
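The f(ns:value) pattern can be taken apart with a single regular expression. This is a minimal sketch for non-nested terms only, assuming values are either quoted strings or runs of characters without spaces, commas, or parentheses; `parse_simple_term` is a hypothetical name.

```python
import re

# function "(" optional-namespace ":" value ")"
TERM = re.compile(r'^(\w+)\((?:(\w+):)?("[^"]*"|[^\s(),"]+)\)$')


def parse_simple_term(term):
    """Return (function, namespace, value) for a non-nested BEL term.

    The namespace is None when the term has no namespace abbreviation.
    Nested terms such as kin(p(HGNC:CDK4)) are rejected with ValueError.
    """
    match = TERM.match(term.strip())
    if match is None:
        raise ValueError("not a simple f(ns:value) term: " + term)
    function, namespace, value = match.groups()
    return function, namespace, value.strip('"')


protein = parse_simple_term("p(HGNC:CCND1)")
process = parse_simple_term('bp(MESHD:"Inflammation")')
```

Here `protein` is `("p", "HGNC", "CCND1")` and `process` is `("bp", "MESHD", "Inflammation")`.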
BEL Functions
• Types of functions:
  – Abundances
  – Modifications of abundances
  – Processes
  – Activities
  – Transformations
• Abundances and processes are applied directly to namespace values
• All other functions are applied to abundance functions!
• BEL functions enable representation of different aspects of a value
  – E.g., AKT1 (EGID:207) can be represented as a
    • Gene
    • RNA
    • Protein
    • Modified Protein
    • Activity
Abundance Functions
• Most abundance functions take namespace values
  – complexAbundance() can take a namespace value OR a list of abundance terms
  – compositeAbundance() must take a list of abundance terms

Short Form | Long Form | Example | Example Description
a() | abundance() | a(CHEBI:water) | the abundance of water
p() | proteinAbundance() | p(HGNC:IL6) | the abundance of human IL6 protein
complex() | complexAbundance() | complex(NCH:"AP-1 Complex") | the abundance of the AP-1 complex
complex() | complexAbundance() | complex(p(MGI:Fos), p(MGI:Jun)) | the abundance of the complex composed of mouse Fos and Jun proteins
composite() | compositeAbundance() | composite(p(HGNC:IL6), a(CHEBI:dexamethasone)) | the abundances of IL6 protein and dexamethasone, together
g() | geneAbundance() | g(HGNC:ERBB2) | the abundance of the ERBB2 gene (DNA)
m() | microRNAAbundance() | m(MGI:Mir21) | the abundance of mouse Mir21 microRNA
r() | rnaAbundance() | r(HGNC:IL6) | the abundance of human IL6 RNA
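Because every abundance function has both a short and a long form, a term can be rewritten from one to the other by name substitution. The sketch below is one possible way to do that, assuming function names are always immediately followed by "(" and that quoted values contain no parentheses. The long form for m() is spelled here as microRNAAbundance, which is assumed to be the intended capitalization; `expand_function_names` is a hypothetical helper.

```python
import re

# Short form -> long form, taken from the abundance function table.
ABUNDANCE_LONG_FORMS = {
    "a": "abundance",
    "p": "proteinAbundance",
    "complex": "complexAbundance",
    "composite": "compositeAbundance",
    "g": "geneAbundance",
    "m": "microRNAAbundance",  # assumed capitalization
    "r": "rnaAbundance",
}


def expand_function_names(term):
    """Replace short abundance function names with their long forms.

    Function names that are not in the table (for example kin) are kept.
    """
    def replace(match):
        name = match.group(1)
        return ABUNDANCE_LONG_FORMS.get(name, name) + "("

    return re.sub(r"\b(\w+)\(", replace, term)


expanded = expand_function_names("complex(p(MGI:Fos), p(MGI:Jun))")
```

For the Fos/Jun complex this gives `complexAbundance(proteinAbundance(MGI:Fos), proteinAbundance(MGI:Jun))`.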
Modification Functions
• Modifications are functions used as arguments within abundance functions
  – Post-translational modifications
  – Sequence variants (mutations, polymorphisms)
Short Form | Long Form | Example | Example Description
pmod() | proteinModification() | p(HGNC:AKT1, pmod(P)) | the abundance of human AKT1 protein modified by phosphorylation
pmod() | proteinModification() | p(MGI:Rela, pmod(A, K)) | the abundance of mouse Rela protein acetylated at an unspecified lysine
pmod() | proteinModification() | p(HGNC:HIF1A, pmod(H, N, 803)) | the abundance of human HIF1A protein hydroxylated at asparagine 803
sub() | substitution() | p(HGNC:PIK3CA, sub(E, 545, K)) | the abundance of the human PIK3CA protein in which glutamic acid 545 has been substituted with lysine
trunc() | truncation() | p(HGNC:ABCA1, trunc(1851)) | the abundance of human ABCA1 protein that has been truncated at amino acid residue 1851 via introduction of a stop codon
fus() | fusion() | p(HGNC:BCR, fus(HGNC:JAK2, 1875, 2626)) | the abundance of a fusion protein of the 5' partner BCR and 3' partner JAK2, with the breakpoint for BCR at 1875 and JAK2 at 2626
fus() | fusion() | p(HGNC:BCR, fus(HGNC:JAK2)) | the abundance of a fusion protein of the 5' partner BCR and 3' partner JAK2
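The pmod() argument order (modification type, then optional residue, then optional position) can be illustrated with a small decoder. This is a sketch only: the code tables hold just the letters that appear in this deck's examples, and `describe_pmod` is a hypothetical helper.

```python
import re

# Only the codes used in this deck's examples; a full table would be longer.
MODIFICATION_TYPES = {
    "P": "phosphorylated",
    "A": "acetylated",
    "H": "hydroxylated",
    "G": "glycosylated",
}
AMINO_ACIDS = {"K": "lysine", "N": "asparagine", "S": "serine"}

# pmod(type) | pmod(type, residue) | pmod(type, residue, position)
PMOD = re.compile(
    r"pmod\(\s*([A-Z])\s*(?:,\s*([A-Z])\s*(?:,\s*(\d+)\s*)?)?\)"
)


def describe_pmod(pmod):
    """Turn a pmod() expression into a short English phrase."""
    match = PMOD.fullmatch(pmod.strip())
    if match is None:
        raise ValueError("unrecognized pmod expression: " + pmod)
    kind, residue, position = match.groups()
    text = MODIFICATION_TYPES[kind]
    if residue is None:
        return text
    if position is None:
        return text + " at an unspecified " + AMINO_ACIDS[residue]
    return text + " at " + AMINO_ACIDS[residue] + " " + position


phrases = [describe_pmod(p) for p in ("pmod(P)", "pmod(A, K)", "pmod(H, N, 803)")]
```

The three phrases correspond to the three pmod() rows of the table: "phosphorylated", "acetylated at an unspecified lysine", and "hydroxylated at asparagine 803".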
Process Functions
• Processes include biological phenomena that occur at the level of the cell or organism
Short Form | Long Form | Example | Example Description
bp() | biologicalProcess() | bp(GO:"cellular senescence") | the biological process cellular senescence
path() | pathology() | path(MESHD:"Pulmonary Disease, Chronic Obstructive") | the pathology COPD
Activity Functions
• Applied to protein and complex abundances to specify the frequency of events resulting from the molecular activity of the abundance
  – This distinction is useful for proteins whose activities are regulated by post-translational modification
Short Form | Long Form | Example | Example Description
cat() | catalyticActivity() | cat(p(RGD:Sod1)) | the catalytic activity of rat Sod1 protein
chap() | chaperoneActivity() | chap(p(HGNC:CANX)) | the events in which the human CANX (Calnexin) protein functions as a chaperone to aid the folding of other proteins
gtp() | gtpBoundActivity() | gtp(p(PFH:"RAS Family")) | the GTP-bound activity of RAS Family protein
kin() | kinaseActivity() | kin(complex(NCH:"AMP-activated protein kinase complex")) | the kinase activity of the AMP-activated protein kinase complex
act() | molecularActivity() | act(p(HGNC:TLR4)) | the ligand-bound activity of the human non-catalytic receptor protein TLR4; a more specific activity function is not applicable to TLR4 protein
pep() | peptidaseActivity() | pep(p(RGD:Ace)) | the peptidase activity of the rat angiotensin converting enzyme (ACE)
phos() | phosphataseActivity() | phos(p(HGNC:DUSP1)) | the phosphatase activity of human DUSP1 protein
ribo() | ribosylationActivity() | ribo(p(HGNC:PARP1)) | the ribosylation activity of human PARP1 protein
tscript() | transcriptionalActivity() | tscript(p(MGI:Trp53)) | the transcriptional activity of mouse TRP53 (p53) protein
tport() | transportActivity() | tport(complex(NCH:"ENaC Complex")) | the frequency of ion transport events mediated by the epithelial sodium channel (ENaC) complex
Transformation Functions
• Transformations are events in which one class of abundance is transformed or changed into a second class of abundance
Short Form | Long Form | Example | Example Description
deg() | degradation() | deg(r(HGNC:MYC)) | the degradation of human MYC RNA
sec() | cellSecretion() | sec(p(MGI:Il6)) | the secretion of mouse Il6 protein
surf() | cellSurfaceExpression() | surf(p(RGD:Fas)) | the translocation of rat Fas protein to the cell surface
tloc() | translocation() | tloc(p(HGNC:NFE2L2), MESHCL:Cytoplasm, MESHCL:"Cell Nucleus") | the event in which human NFE2L2 protein is translocated from the cytoplasm to the nucleus
rxn() | reaction() | rxn(reactants(a(CHEBI:phosphoenolpyruvate), a(CHEBI:ADP)), products(a(CHEBI:pyruvate), a(CHEBI:ATP))) | the event in which the reactants phosphoenolpyruvate and ADP are converted into the products pyruvate and ATP
BEL Relationships
• Causal relationships
  – increases, directlyIncreases, decreases, directlyDecreases, rateLimitingStepOf, causesNoChange
• Correlative relationships
  – negativeCorrelation, positiveCorrelation, association
• Biomarker relationships
  – biomarkerFor, prognosticBiomarkerFor
• Assignment to groups
  – hasMember, hasComponent, hasMembers, hasComponents
• Other
  – isA, subProcessOf
• Genomic relationships
  – transcribedTo, translatedTo, orthologousTo
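The relationship groups above can be kept in a lookup table so that a statement's relationship can be classified. The sketch below also maps the arrow forms used in this deck's examples to long names. That mapping is not stated on this slide; it is an assumption based on standard BEL usage and on how the examples pair "->" with "increased" and "-|" with "decreased" evidence. `relationship_group` is a hypothetical helper.

```python
# Groups and members as listed on the relationships slide.
RELATIONSHIP_GROUPS = {
    "causal": [
        "increases", "directlyIncreases", "decreases",
        "directlyDecreases", "rateLimitingStepOf", "causesNoChange",
    ],
    "correlative": ["negativeCorrelation", "positiveCorrelation", "association"],
    "biomarker": ["biomarkerFor", "prognosticBiomarkerFor"],
    "group assignment": ["hasMember", "hasComponent", "hasMembers", "hasComponents"],
    "other": ["isA", "subProcessOf"],
    "genomic": ["transcribedTo", "translatedTo", "orthologousTo"],
}

# Assumed arrow-to-name mapping (not given on the slide itself).
SHORT_FORMS = {
    "->": "increases",
    "=>": "directlyIncreases",
    "-|": "decreases",
    "=|": "directlyDecreases",
}


def relationship_group(relationship):
    """Return the group name for a relationship in short or long form."""
    long_form = SHORT_FORMS.get(relationship, relationship)
    for group, members in RELATIONSHIP_GROUPS.items():
        if long_form in members:
            return group
    raise ValueError("unknown relationship: " + relationship)


groups = [relationship_group(r) for r in ("=|", "positiveCorrelation", "orthologousTo")]
```

For these three inputs the groups are "causal", "correlative", and "genomic".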
Knowledge Representation Examples
Knowledge Capture – Example 1
• From a published paper describing the effects of Tnf in rat chondrocytes
SET Citation = {"PubMed","Arthritis Res Ther.","19144181"}
SET Species = "10116"
SET Cell = "Chondrocytes"
SET Evidence = "we identified the relative changes in transcript levels of the extracellular matrix components Agc1, Hapln1, and Col2a1, proteases Mmp-9 and Mmp-12, as well as the inflammatory cytokine macrophage Csf-1 (Figure 3). TNFα decreased Agc1 and Hapln1 (Figure 3a, b) and increased Mmp-9 and Mmp-12 (Figure 3e, f)"
p(RGD:Tnf) -> r(RGD:Mmp9)
p(RGD:Tnf) -> r(RGD:Mmp12)
p(RGD:Tnf) -| r(RGD:Acan) // Agc1 = Acan
p(RGD:Tnf) -| r(RGD:Hapln1)
• Reference: the Citation line
• Experimental context = rat chondrocytes (the Species and Cell lines)
• Text from paper supporting statements: the Evidence line
• Perturbation (source term) = Tnf protein
• Measurements (target terms) = RNA abundance
• In-line comment: // Agc1 = Acan
Knowledge Capture – Example 2
SET Citation = {"PubMed", "Anticancer Agents Med Chem. 2010 Oct 1;10(8):617-24.","21182469"}
SET Evidence = "One non-synonymous SNP 538G>A (Gly180Arg) has been found to greatly affect the function and stability of de novo synthesized ABCC11 (Arg180) variant protein. The SNP variant lacking N-linked glycosylation is recognized as a misfolded protein in the endoplasmic reticulum (ER) and readily undergoes proteasomal degradation. "
p(HGNC:ABCC11, sub(G,180,R)) =| p(HGNC:ABCC11, pmod(G,N))
p(HGNC:ABCC11, pmod(G,N)) =| deg(p(HGNC:ABCC11))
• First statement: Gly180Arg variant ABCC11 protein lacks glycosylation
• Second statement: ABCC11 glycosylation blocks degradation
• Protein variants and post-translational modifications
Knowledge Capture – Example 3
• Microarray data – can use probe set ID as identifier
SET Citation = {"PubMed","J Exp Med. 2006 Nov 27;203(12):2763-77.","17116732"}
SET Evidence = "Table S1. Affymetrix U133 Plus 2.0 GeneChip array data showing transcripts in HDLECs up- or down-regulated by a factor of at least twofold (P < 0.1) after stimulation with TNF-α."
SET Tissue = "Endothelium, Lymphatic"
p(HGNC:TNF) -> r(HGU133P2:205476_at)
p(HGNC:TNF) -> r(HGU133P2:215101_s_at)
p(HGNC:TNF) -> r(HGU133P2:214974_x_at)
p(HGNC:TNF) -> r(HGU133P2:203868_s_at)
p(HGNC:TNF) -| r(HGU133P2:235683_at)
p(HGNC:TNF) -| r(HGU133P2:235150_at)
p(HGNC:TNF) -| r(HGU133P2:205258_at)
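Statements of this kind follow a fixed pattern, so they could be generated from a results table rather than typed by hand. The following is a minimal sketch under that assumption; the input shape (a list of probe set ID and direction pairs) and the name `statements_from_probe_sets` are hypothetical, and "up"/"down" are mapped to the "->" and "-|" forms used in the example.

```python
def statements_from_probe_sets(source_term, namespace, changes):
    """Build one BEL statement per regulated probe set.

    changes is a list of (probe_set_id, direction) pairs, where
    direction is "up" (written as ->) or "down" (written as -|).
    """
    symbols = {"up": "->", "down": "-|"}
    statements = []
    for probe_set_id, direction in changes:
        target = "r(" + namespace + ":" + probe_set_id + ")"
        statements.append(source_term + " " + symbols[direction] + " " + target)
    return statements


# Two of the probe sets from the example, with their reported directions.
generated = statements_from_probe_sets(
    "p(HGNC:TNF)",
    "HGU133P2",
    [("205476_at", "up"), ("235683_at", "down")],
)
```

The two generated strings match the first up-regulated and first down-regulated statements of the example.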
Knowledge Capture – Example 4
• Protein modifications and activities
SET Citation = {"PubMed","Proc Natl Acad Sci U S A 2000 Oct 24 97(22) 11960-5","11035810"}
SET Evidence = "GSK-3 activity is inhibited through phosphorylation of serine 21 in GSK-3 alpha and serine 9 in GSK-3 beta."
SET Species = "9606"
p(HGNC:GSK3A,pmod(P,S,21)) =| kin(p(HGNC:GSK3A))
p(HGNC:GSK3B,pmod(P,S,9)) =| kin(p(HGNC:GSK3B))
SET Evidence = "These serine residues of GSK-3 have been previously identified as targets of protein kinase B (PKB/Akt)"
kin(p(PFH:"AKT Family")) => p(HGNC:GSK3A,pmod(P,S,21))
kin(p(PFH:"AKT Family")) => p(HGNC:GSK3B,pmod(P,S,9))
• The second SET Evidence line starts new evidence; Citation and Species still apply to the statements that follow
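The way annotations carry forward in Example 4 can be sketched as a running context that each SET line updates and each statement copies. This is an illustration of the behavior described above, not the actual BEL compiler logic; `attach_annotations` is a hypothetical helper, values are kept as raw text, and the Evidence strings are shortened stand-ins.

```python
def attach_annotations(lines):
    """Pair each statement with the annotations in effect when it appears.

    A SET line overwrites only its own annotation; all others persist.
    """
    context = {}
    annotated = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("SET "):
            name, _, value = line[4:].partition("=")
            context[name.strip()] = value.strip()
        else:
            # Copy the context so later SET lines do not alter this record.
            annotated.append((line, dict(context)))
    return annotated


script = [
    'SET Species = "9606"',
    'SET Evidence = "first evidence text"',
    "p(HGNC:GSK3A,pmod(P,S,21)) =| kin(p(HGNC:GSK3A))",
    'SET Evidence = "second evidence text"',
    'kin(p(PFH:"AKT Family")) => p(HGNC:GSK3A,pmod(P,S,21))',
]
records = attach_annotations(script)
```

Both records keep the same Species value, while each carries the Evidence that was set most recently before it.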