REVIEW

Systems biology analysis of protein–drug interactions

Jacques Colinge, Uwe Rix, Keiryn L. Bennett and Giulio Superti-Furga

Research Center for Molecular Medicine of the Austrian Academy of Sciences (CeMM), Jacques Colinge, Vienna,

Austria

Received: September 1, 2011

Revised: September 26, 2011

Accepted: September 27, 2011

Drugs induce global perturbations at the molecular machinery level because their cognate

targets are involved in multiple biological functions or because of off-target effects. The

analysis or the prediction of such systems level consequences of drug treatment therefore

requires the application of systems biology concepts and methods. In this review, we first summarize the methods of chemical proteomics that can measure unbiased and proteome-wide

drug protein target spectra, which is an obvious necessity to perform a global analysis. We then focus on the introduction of computational methods and tools to relate such target

spectra to global models such as pathways and networks of protein–protein interactions, and

to integrate them with existing protein functional annotations. In particular, we discuss how

drug treatment can be mapped onto likely affected biological functions, how this can help identify drug mechanisms of action, and how such mappings can be exploited to predict

potential side effects and to suggest new indications for existing compounds.

Keywords:

Bioinformatics / Chemical proteomics / Drugs / Personalized medicine / Statistics

1 Introduction

Our knowledge of drug protein target profiles is often

limited by practical difficulties in obtaining such information.

Consequently, numerous compounds in clinical use

are orphan ligands or have only one identified target. Continuous and substantial progress in proteomic technologies [1]

has made it possible to develop chemical proteomic – or

chemoproteomic – approaches, where the protein targets of a

drug are affinity purified and identified by MS [2, 3]. This methodology empowers researchers to measure

compound–protein interactions in a biological context as

opposed to in vitro binding assays. That is, drug–protein

interactions can be determined not only proteome wide, but

also in a tissue- or cell type-dependent manner. The strength

of this approach is that all the proteins are expressed at true physiological concentrations and bear correct posttranslational

modifications.

Understanding the mechanism-of-action (MoA) of

compounds and elucidating the origin of observed side effects are of great importance in drug discovery. Accessing

accurate and sensitive target spectra offers promising

perspectives that result in the derivation of more efficient

and safer compounds. Detrimental leads can be halted at an earlier stage of development, thus significantly reducing

costs [4]. Furthermore, the knowledge of multiple potent

targets creates boundless opportunities for drug repurpos-

ing. In general, this can potentially provide access to new

targets. It could also reveal unexpected synergistic effects

that explain the success of certain compounds [5].

Reaching the promises of chemical proteomics is not a

straightforward task since the compounds used in patient

therapy induce more global changes in the molecular

machinery of cells than simply regulating a selected protein

[6]. Obviously, a targeted protein is involved in biochemical reactions that take place within one or several biological

pathways. These in turn can interact with other pathways

Colour Online: See the article online to view Figs. 2, 4 and 5 in colour.

Abbreviations: ABPP, affinity-based protein profiling; CML,

chronic myeloid leukemia; FDR, false discovery rate; GO, gene

ontology; PPI, protein–protein physical interaction; TCM, tradi-

tional Chinese medicine

Correspondence: Dr. Jacques Colinge, Research Center for

Molecular Medicine of the Austrian Academy of Sciences

(CeMM), Jacques Colinge, AKH BT 25.3, Lazarettgasse 14,

A-1090 Vienna, Austria

E-mail: [email protected]

Fax: +43-1-40160-970030

© 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.clinical.proteomics-journal.com

102 Proteomics Clin. Appl. 2012, 6, 102–116 DOI 10.1002/prca.201100077



[7]. Therefore, modification of the activity of a single agent, which is part of a complex network, can have far-reaching

consequences on multiple biological functions. Moreover,

compounds often have more than one target and hence can

exhibit broad-ranging impact on cell biology. For instance,

the tyrosine kinase inhibitor imatinib, which is a hallmark

of targeted therapy against the chronic myeloid leukemia (CML)-causing fusion protein BCR-ABL, was found to have

at least five additional potent targets, including the nonkinase

NQO2 [8].

To fully exploit the potential of chemical proteomics, combination with computational methods is required. In

this way, it now becomes feasible to extend beyond an ideal

case where the protein target spectrum reveals valuable

information directly, and to consider complex and indirect implications. This combination of unbiased proteome-wide

target determination and computation naturally occurs within the paradigm of systems biology, where units of

functions are all regarded as interconnected [9, 10].

Computations can integrate available information on biological

pathways, protein and gene interaction networks, and protein functions to predict the biological processes that are

impacted by drug treatment. This assists in understanding

the mechanism of action of compounds and naturally

expands in the direction of identifying potential side effects

and new indications for existing molecules.

In this review, we briefly recall the general principles of

chemical proteomics before presenting the current methods of data analysis and available tools and algorithms. We will

also discuss how the presented computational methods can

be applied to the analysis of target spectra that are not

necessarily derived from chemical proteomics, e.g. current efforts in traditional Chinese medicine (TCM) research. We

will conclude by presenting attractive future perspectives

toward personalized medicine and digital patient models.

2 Methods of chemical proteomics

Different approaches to discover drug targets exist that can

be subdivided into two classes [2, 3]. The first is classical

compound affinity purification [8, 11, 12] that requires the immobilization of the compound of interest. The second is

affinity-based protein profiling (ABPP) [13, 14]. Via a generic

chemical probe, it profiles a whole class of compounds,

though in a biased manner. Additionally, either expression proteomic experiments or the global mapping of

chosen posttranslational modifications, e.g. acetylations for

histone deacetylase (HDAC) inhibitors, can be employed to

compare drug-treated versus untreated samples. Thereby,

information regarding drug-induced changes is obtained

[15–17]. Such broader methods do not measure compound targets directly and are hence not, properly speaking,

chemical proteomic methods. Their potential to partially

reveal target spectra is evident, although

direct drug influence is usually combined with downstream

regulation and the data obtained represent the global “integrated” impact of drugs.

In recent years, chemical proteomic applications have

aided in the elucidation of several important drug–protein

interactions [8, 13, 18–25]. The main steps of compound

immobilization-based and ABPP methods are introduced

below.

2.1 Compound immobilization approaches

Profiling the protein targets of a chosen compound

requires the immobilization of the compound on a matrix.

This operation is achieved through a functional group, e.g.

sulfhydryl, amino, hydroxyl, or carboxyl, that binds to an activated resin, e.g. sepharose or agarose beads. Small

molecules that do not contain an appropriate group must be chemically modified. The potency of the modified, linked

compound should be assayed to ensure that it is preserved

[26]. A cell or tissue extract is then incubated with the matrix

and washed extensively before elution. Finally, proteomic methods are applied to identify the proteins that bound the

linked molecule. Current strategies are to either further

reduce sample complexity via one- or two-dimensional SDS-

PAGE [8], or use gel-free methods [27]. Both approaches are

then followed by MS. The process is shown in Fig. 1 and is commonly named drug pulldown [3, 27].

2.2 Noise in the signal?

Similar to any affinity purification method, compound immobilization approaches suffer from the presence of

nonspecific interactions that consistently appear in the list

of proteins identified by MS. The causes of nonspecific

interactions are multiple, and appropriate solutions exist. In order to analyze the true target profile of a compound, it is

crucial to eliminate nonspecific binders without the loss of

any important target. This step constitutes the first stage of

bioinformatic analysis.

Some proteins might bind to the chemical linker between

the matrix support and the compound or even to the matrix itself. This category of nonspecific binders can be readily

identified in negative control experiments performed with

‘‘empty’’ blocked beads.

Abundant proteins that have a low affinity for either the immobilized compound or the true interactors of the

compound cannot be eliminated completely at the washing

step. These proteins contribute largely to the identified

nonspecific binders. They are not identified in negative

control experiments with blocked beads. Nonetheless,

different approaches can be implemented to identify them. It is possible to perform a parallel experiment with an

unrelated compound that, with some confidence, does not

share any target with the compound of interest. A chemically

related compound that is biologically inactive is an even

better tool for identifying nonspecific binders, but the risk of overlapping target profiles

is increased. One last variation of the same

approach is to compare a new drug pulldown with previous experiments and to exclude frequently found proteins.

Determining the frequency threshold might require statis-

tical analysis or empirical validation. We have obtained

satisfying results with geometric tests (unpublished data). In

general, comparisons with previous experiments increase the risk of discarding a correct target that was found to be

nonspecific in a different context. Thus, lists of ‘‘assumed’’

nonspecific binders, though useful as initial approxima-

tions, should be considered with care.
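The frequent-hitter filter described above can be sketched as follows. This is a minimal illustration only: the function names, the protein identifiers, and the 50% frequency threshold are assumptions made for the example, and, as noted in the text, a realistic threshold requires statistical analysis or empirical validation.

```python
from collections import Counter

def frequent_hitters(previous_pulldowns, min_fraction=0.5):
    """Return proteins seen in at least `min_fraction` of earlier pulldowns.

    `previous_pulldowns` is a list of protein collections, one per experiment.
    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    n_experiments = len(previous_pulldowns)
    # Count each protein at most once per experiment.
    counts = Counter(p for experiment in previous_pulldowns for p in set(experiment))
    return {p for p, c in counts.items() if c / n_experiments >= min_fraction}

def filter_pulldown(new_pulldown, previous_pulldowns, min_fraction=0.5):
    """Split a new pulldown into retained proteins and flagged frequent hitters."""
    hitters = frequent_hitters(previous_pulldowns, min_fraction)
    kept = [p for p in new_pulldown if p not in hitters]
    flagged = [p for p in new_pulldown if p in hitters]
    return kept, flagged
```

Returning the flagged proteins rather than silently dropping them keeps open the possibility of rescuing a correct target that happened to be frequent in unrelated experiments.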

It is also possible to compare the proteins identified from

a drug pulldown with the proteome of the cells on which the experiment was performed. Due to the nature of MS, an

analysis at the whole proteome level will primarily detect the

abundant proteins, the so-called core proteome. Conversely,

the drug pulldown is an enrichment process resulting in the detection of lower abundance proteins. As a first approximation,

proteins that are detected in both data sets are

considered suspicious and should be excluded from the list of true

interactors. The downside of this simplistic subtraction is that abundant true targets, as recently exemplified by Hsp90

[23], are excluded. Furthermore, such an approach has become questionable since the emergence of highly sensitive MS

instrumentation that can now routinely detect medium-abundance

and even some low-abundance proteins. To

circumvent this difficulty, semi-quantitative MS indicators such as spectral counts [28] can be exploited to detect

significant enrichments from a chemical proteomic experiment.

Identification of the minimum increase of spectral

count required in the pulldown versus the core proteome

can be achieved empirically, e.g. following known targets, or through proper statistical modeling [29, 30]. A final possibility

is to perform the desired pulldown again with a chosen

concentration of the free compound added to the cell

lysate prior to performing the drug pulldown. In this way,

true targets are bound by the free compound and are no

longer available in the lysate to interact with the immobilized drug (Fig. 2). Requiring the complete disappearance of

the targets or a significant reduction of the spectral counts

in the MS data, when the two data sets are compared,

identifies the correct proteins. In our hands, this conceptually simple method gives reproducible, sensitive, and

reliable results [31].
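As an illustration of the competition logic, the comparison of spectral counts between the original and the competition pulldown might be sketched as below. The function name, the count dictionaries, and both cutoffs (a maximum residual ratio of 0.5 and a minimum of 5 spectra in the original pulldown) are hypothetical choices for the example; in practice, the required reduction would be set empirically or by statistical modeling.

```python
def competed_proteins(pulldown_counts, competition_counts, max_ratio=0.5, min_count=5):
    """Return proteins whose spectral count drops under competition.

    pulldown_counts / competition_counts map protein -> spectral count.
    A protein absent from the competition data is treated as a count of 0,
    i.e. complete disappearance. Both cutoffs are illustrative assumptions.
    """
    competed = {}
    for protein, count in pulldown_counts.items():
        if count < min_count:
            continue  # too few spectra to judge a reduction reliably
        residual = competition_counts.get(protein, 0) / count
        if residual <= max_ratio:
            competed[protein] = residual
    return competed
```

Abundant low-affinity binders are expected to keep an essentially constant count (ratio close to 1) and are therefore not returned.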

Drug pulldowns retrieve complete or partial protein

complexes in certain cases and the elimination of nonspe-

cific binders leaves a list of proteins mixing direct and

indirect interactors (Fig. 3A). Depending on the compound, it is possible to recognize the direct binders with good

confidence simply using the knowledge of the binding

mechanism. For instance, kinases are very likely to be direct

interactors of a kinase inhibitor. More generally, protein–protein binary interaction data such as measured by yeast 2-

hybrid experiments [32], which are available from public

databases, might shed light on a mixture of direct/indirect

drug interactions (Fig. 3B). Similar help can be obtained

through the knowledge of protein interaction domains and

predicted physical interactions [33]. Ultimately, complementary protein–protein interaction experiments could be planned

to discover the structure of those interactions, and clarify the

position of the drug interactions in this network [29]. Table 1

provides a summary of noise elimination methods.
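The use of binary interaction data to delineate complexes among pulled-down proteins can be sketched as a connected-component search. This is only an illustration under simple assumptions: the function name and the input format (a list of protein pairs taken from some public database) are hypothetical, and, as discussed above, the grouping does not indicate which member of a complex binds the drug.

```python
def group_by_binary_interactions(proteins, binary_interactions):
    """Group pulled-down proteins connected by known binary interactions.

    `binary_interactions` is an iterable of (protein_a, protein_b) pairs;
    pairs involving proteins outside the pulldown are ignored.
    Returns a list of sets, one per putative complex or isolated protein.
    """
    proteins = set(proteins)
    neighbors = {p: set() for p in proteins}
    for a, b in binary_interactions:
        if a in proteins and b in proteins and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    groups, seen = [], set()
    for start in sorted(proteins):  # sorted for a reproducible output order
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            p = stack.pop()
            if p in component:
                continue
            component.add(p)
            stack.extend(neighbors[p] - component)
        seen |= component
        groups.append(component)
    return groups
```

With the labels of Fig. 3, proteins a to c would come out as singletons and d to g as one group, provided the database contains enough binary interactions to link them.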

Figure 1. Overview of the entire process of measuring drug–protein

interactions through chemical proteomic experiments

and of analyzing the generated data. A drug is coupled to

magnetic beads through a chemical linker and incubated with

the lysate of a biological sample. Different proteins bind the drug

with strong affinity (violet and orange) and some others might

have a low affinity for the compound (aqua). Additionally, some

proteins bind to drug strong binders (green). After washing and

elution, the strong binders are purified along with the indirect

binders and some abundant low-affinity proteins. MS detects the

purified proteins and provides the input data for the bioinfor-

matic analysis, which may exploit a wide range of additional

data sources to compute its results.

2.3 Miniaturization toward individual patient

profiling

Until recently, relatively large quantities of protein material

were necessary to perform chemical proteomic experiments

successfully. This constraint might have limited the wide-

spread use of this powerful methodology, particularly when

analyzing clinical samples. Due to progress in MS

instrumentation and the development of new experimental protocols involving more sensitive chromatography, it is

currently possible to perform drug pulldowns using as few

as 10^6 cells or even fewer [27]. These improvements provide

accessibility to individual patient sample analyses. Thus,

new opportunities both in research and, ultimately, in personalized medicine are created that are complementary

to next generation DNA-sequencing technologies.

2.4 Subproteome-focused chemical proteomics

To study an entire class of compounds that inhibit proteins

through a common binding mechanism, e.g. inhibitors that

interact with the ATP pocket of kinases, it is possible to

develop generic matrices that bind a large range of the targets with slightly reduced specificity. Through competition

experiments, where the same matrix is employed alone

or with the presence of compounds to profile, protein targets

can be identified reliably through the same logic discussed

above for immobilized compounds. Such generic matrices have been developed for kinases (Kinobeads®) [13] and

histone deacetylases [34] and offer powerful assay platforms

that do not need to be adapted to existing or future

compounds. On the other hand, only a limited range of all the possible targets bind to the generic matrices and true

targets not present in this subset cannot be detected. A

similar approach, termed affinity-based protein profiling

(ABPP), uses chemically reactive enzyme-specific probe

molecules to capture e.g. kinases [14] or proteases [35] and

purify them via a biotin tag [36]. While the ABPP methodology follows the same downstream concepts as the

generic compound matrices, it offers the advantage of

potentially being able to differentiate between active and

inactive enzymes. Taken together, these methods are broad-range, highly multiplexed assays but they are subject to a

strong bias as they focus on a predefined subproteome.

2.5 The application of quantitative proteomics

Stable isotope-labeling techniques, e.g. iTRAQ [37], TMT

[38], or SILAC [39], find a natural application in chemical

proteomics to render the competition experiment we

mentioned previously more precise. For instance, one drug

Figure 2. Competition with a free compound in a second

experiment sheds light on the nonspecific binders. In compar-

ison with the original pulldown (left), proteins have the oppor-

tunity to bind to the free compound in the competition pulldown

(right). As a result, direct high-affinity binders and their inter-

actors are found in much reduced abundance in the purified

sample. A comparison of spectral counts – or truly quantitative

measurements – identifies such reductions readily. Abundant

low-affinity proteins binding to the immobilized compound do

not find sufficient free compound copies to significantly reduce

their presence in the purified sample. These are identified by

essentially constant spectral counts.

Figure 3. Drug pulldowns retrieve protein complexes. (A) A drug

can bind to isolated proteins (a–c) but it frequently binds to a

protein (f) that is part of a protein complex (d–g). (B) The pull-

down experiment will identify the direct drug–protein interac-

tions as well as the indirect protein interactions through the

complexes (d, e, g). Without a priori knowledge on the binding

mechanism, it is essentially impossible to distinguish direct from

indirect interactions from such data. When available, informa-

tion on direct – binary – interactions between proteins might

delineate complexes but not indicate which complex member

binds to the drug.





pulldown can be performed in biological duplicates in two iTRAQ 4-plex channels with the corresponding two

competition pulldowns occupying the other two channels.

With such an experimental design, the less accurate spectral

counts are replaced by relative quantitative measures. The

selection of proteins that directly interact with the

compounds is then rather straightforward, particularly when combined with appropriate statistical models [40, 41].

In principle, a similar multiplexing approach could improve

the comparison of a drug pulldown with the corresponding

cell line core proteome as discussed above. In this situation, however, the highly complex core proteome would mask a

significant portion of the pulled down proteins and this

option should be disregarded.

In subproteome-focused applications, it is advantageous to perform competition pulldowns with increasing amounts

of the free compound, and to combine the pulldowns in a single iTRAQ or TMT experiment [13, 34] to obtain

dose–response curves. Via an innovative protocol, Sharma

et al. were able to determine the dissociation constant and

the IC50 of gefitinib, an EGFR kinase inhibitor in clinical

use for lung cancer [42]. Namely, comparing a first experiment with a subsequent pulldown performed on the

supernatant of the first one, they determined the immobi-

lized gefitinib dissociation constant. In parallel, they

performed competition experiments with different concentrations of the free compound to obtain the gefitinib IC50,

which, combined with the immobilized gefitinib dissociation constant, gave the free compound dissociation constant.

Experiments were performed using SILAC 3-plex.
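To make the dose–response idea concrete, the sketch below estimates an IC50 from a series of competition pulldowns by log-linear interpolation between the two concentrations that bracket 50% residual binding. This is a deliberately simple stand-in, assumed for illustration, and not the curve-fitting procedure of the cited studies; the function name and the input format (free compound concentrations, all strictly positive, and the matching fractions of residual binding relative to the uncompeted pulldown) are hypothetical. The conversion of IC50 into a dissociation constant is not attempted here.

```python
import math

def ic50_by_interpolation(concentrations, residual_binding):
    """Estimate IC50 as the concentration giving 50% residual binding.

    Interpolates linearly in log10(concentration) between the first pair of
    consecutive points that bracket 0.5. Returns None when the curve never
    crosses 0.5 within the measured range. Concentrations must be > 0.
    """
    points = sorted(zip(concentrations, residual_binding))
    for (c_low, b_low), (c_high, b_high) in zip(points, points[1:]):
        if b_low >= 0.5 >= b_high and b_low != b_high:
            fraction = (b_low - 0.5) / (b_low - b_high)
            log_ic50 = math.log10(c_low) + fraction * (
                math.log10(c_high) - math.log10(c_low)
            )
            return 10 ** log_ic50
    return None
```

With real quantitative data, fitting a sigmoidal model to all points at once would use the measurements more efficiently than this two-point interpolation.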

3 Computational methods

We introduce several computational techniques that are

useful in analyzing drug target lists, starting with rather simple methods that ignore important aspects of chemical

proteomic data, and expanding with more sophisticated

algorithms that integrate additional domain-specific knowl-

edge. For the sake of concision, we often refer to the iden-

tification of relevant biological pathways as a model problem

but, unless otherwise specified, the methods apply to other reference biological data sets as well. Moreover, we exemplify

several methods with the target profile analysis of the

tyrosine kinase inhibitors imatinib, dasatinib, bosutinib, and

bafetinib that are in clinical use or in development as therapeutic agents, e.g. against CML and other malignant

diseases. They provide convenient illustrations of common

difficulties.

3.1 Classical mapping and enrichment methods

Lists of drug protein targets can be difficult to interpret

directly, especially if they comprise more than a few familiar

entities. The classical bioinformatic solution to relate

protein lists with existing knowledge is to first map those proteins onto descriptions of biological functions and then

to look for significant associations by means of statistical

tests (Fig. 4A). A standard example is to search for biological

pathways that are likely to be modulated by a compound.

There exist databases that describe each pathway with a

graphical representation and a list of involved proteins, e.g. KEGG [43] or NCI-PID [44]. The analysis is performed

ignoring the graphical structure by comparing the number

of protein targets present in a pathway with the total

number of proteins in this pathway and in the human genome. A proportion of targets found in a pathway that is

larger than what is expected by chance indicates potential

pathway regulation (Fig. 4A). All the significant hits

obtained in a pathway database search are reported with an indication of statistical significance, e.g. a p-value. This

procedure is named enrichment analysis.

Depending on the research project, several databases can

be considered for enrichment analysis (Table 2). While

pathway databases are frequently employed in drug target

analysis, they can be complemented by gene ontology (GO) biological process (BP) descriptions [45], which document a

hierarchy of biological functions from metabolism to

signaling with some disease-related processes included as

well. As GO is not only a catalogue of sets of proteins

associated with a biological process but also comes with a hierarchy, i.e. it is an ontology, enrichment analyses can be

performed at various levels of detail. For instance, the GO

consortium has proposed slimmed ontologies that retain

rather general functions and certain tools propose their own

definitions of GO levels, e.g. DAVID [46]. If the interest is to

discover an unknown mechanism of binding, it is possible to perform enrichment analyses on the target protein

domains or to use the GO molecular function ontology

(GO MF). More generally, any database containing sets of

proteins that each share a specific characteristic can be used to perform enrichment analyses [47]. Table 2 contains

a list of databases and tools that can be applied for this

purpose.
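The enrichment test itself is short enough to be written out. The sketch below computes the hypergeometric probability and the one-sided p-value using only the Python standard library; the function names are chosen for this example, and the symbols follow the urn description given in the caption of Fig. 4A (N proteins in the database, R of them targets, a pathway of n proteins containing r targets).

```python
from math import comb

def hypergeom_pmf(k, n, R, N):
    """Probability of finding exactly k targets in a pathway of n proteins,
    when R of the N database proteins are targets."""
    return comb(R, k) * comb(N - R, n - k) / comb(N, n)

def enrichment_p_value(r, n, R, N):
    """One-sided p-value: probability of r or more targets by chance."""
    upper = min(n, R)  # cannot find more targets than exist or than fit
    return sum(hypergeom_pmf(k, n, R, N) for k in range(r, upper + 1))

# Numbers quoted for pathway P3 in Fig. 4A: N = 35, R = 5, n = 9, r = 3.
print(round(hypergeom_pmf(3, 9, 5, 35), 4))       # 0.0841
print(round(enrichment_p_value(3, 9, 5, 35), 4))  # 0.0946
```

Both values agree with those given in the figure caption, and the pathway is not significant at a 1% cutoff. Note that N here is the number of proteins found in the pathway database; using the whole genome as background instead changes the p-values and should be a deliberate choice.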

3.2 The use of target affinities and secondary

interactors

Chemical proteomics usually delivers lists of drug targets

with a notion of ‘‘weight,’’ which can be a direct measure of the target affinity [13, 42], or a rough indicator provided by

the spectral count or a related quantity (protein sequence

coverage, log-transformations, etc.) [26]. Such weights

inform on the importance of the targets and they obviously

have the potential to improve enrichment analyses

performed otherwise with all targets considered equal. In terms of effect strength, high-affinity targets are the best

candidates to be the mediators of biological response regu-

lation. Medium-affinity targets that are abundant might also

play an important role, provided the drug is present at





sufficient concentration. Spectral counts reflect a mixture

of target affinity and abundance and, although they are

less precise than affinity estimates, they therefore repre-

sent a convenient ad hoc indication of target importance.

In the case of available affinity estimates, more precise target weights can be determined. An obvious choice is

the affinity estimate itself, but it is also worth investigat-

ing the product of the affinity and a protein abundance

estimate.

As indicated above, drug pulldowns retrieve protein

complexes – completely or partially – and target lists can

contain proteins that do not interact with the drug directly.

One can argue that the important units of molecular func-

tion are in fact the protein complexes, and to have several

members of a complex in the target list should not perturb the bioinformatic analysis excessively. Indeed, secondary

interactors might even facilitate the analysis since knowl-

edge present in databases is incomplete, and it is possible

that association with a pathway or with any relevant biological

concept exists for a secondary interactor, but not for the corresponding drug target. Working with kinase inhibitors,

where nonkinases are likely secondary interactors after

nonspecific binder filtering, we reduced nonkinase weights

in the analysis by a factor of 0.25 [26, 48].

Enrichment analysis on the basis of a weighted protein

target list, possibly containing secondary interactor

weights, cannot be performed with standard tools because

the hypergeometric test would no longer be valid. It must be

substituted with another test that models random associa-

tion-weighted scores and, in general, there is no theoretical

Figure 4. General principle of enrichment analysis and its

extensions. (A) Drug protein targets are mapped to sets found in

a database, such as pathways or GO terms. Each pathway

(P1–P6) is composed of a certain number of proteins (small

circles). Determining whether a given pathway is significantly hit

by the target list requires a statistical null-model, i.e. a model

that allows us to compute the probability to find a given number

of targets in a pathway by random chance. Conceptually, the

situation is an experiment where from an urn containing N balls

in total, R of which are red, n balls are drawn randomly and we

want to compute the probability that they contain r red balls. The

urn is the set of all the proteins found in all the pathways of the

database (N = 35 in the figure). The R red balls are the targets

(five in the figure, unmapped targets are ignored). The n drawn

balls are the proteins found in a given pathway (9 for P3) and r is

the number of targets in this pathway (3). The probability to find

r targets in a pathway of size n by chance is given by the

hypergeometric probability density P(r | n, R, N), and summing

over all the possible values k ≥ r gives the pathway p-value.

When this p-value is below a chosen cutoff, say 1%, the pathway

is considered significantly enriched in the protein targets. This

method is equivalent to Fisher’s exact test, and with the numbers

above we find P(r | n, R, N) = 0.0841 and p < 0.0946, which is not

significant. (B) With weights w_i associated with drug targets, the

score s of a pathway P is the sum of the weights of the proteins

found in the pathway (in case a multiplicative score is preferred,

logarithms can be summed). To estimate the distribution of

scores observed by random chance, a large number of subsets P′ of the database proteins with the same size as P are generated.

For each, the weights of the targets in P′ are summed, and a

histogram of the null-distribution is obtained. The histogram can

be used as such or fitted with a theoretical distribution, e.g. Gamma, and

the p-value of s is estimated. (C) Individual pathway p-value thresholds

are adapted to control the FDR among the set of pathways

selected as significant. (D) In TopGO analysis [50], terms of a GO

that are found significant (lower red node) are excluded from

subsequent calculations, whereas nodes containing targets but not significant (indicated with a “-”) contribute to their parents

(upwards arrows). (E) Double filtering pathway selection by

requiring a target to be present in the pathway and coherent

downstream regulation measured in a complementary experi-

ment such as expression proteomics. Here, inhibiting the red

proteins with a drug should increase the expression of the

green protein. (F) Principle of local enrichment, i.e. as imple-

mented in the functional cloud method [58]. A group of adjacent

proteins in the interactome are found that share a common

functional annotation a. Such subnetworks can be scored and

those which obtain a score higher than what is expected by

chance represent biological functions that are likely to be

modulated by the drug.






statistical distribution to do this. However, nonparametric methods such as permutation tests provide a convenient

solution to model the null-distribution and they can work

with virtually any weighted scoring scheme (Fig. 4B).
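A minimal sketch of such a permutation test, following the description of Fig. 4B, is given below. The function names, the number of random subsets, and the use of the empirical histogram (rather than a fitted distribution) are assumptions made for the illustration; the weights could be affinities, spectral counts, or any of the schemes discussed above, with reduced weights for likely secondary interactors.

```python
import random

def pathway_score(pathway, weights):
    """Sum of the target weights of the proteins found in the pathway.
    Proteins that are not targets contribute a weight of 0."""
    return sum(weights.get(protein, 0.0) for protein in pathway)

def permutation_p_value(pathway, weights, universe, n_perm=10000, seed=0):
    """Empirical p-value of a weighted pathway score.

    Random subsets of the database proteins (`universe`) with the same size
    as the pathway are scored to build the null-distribution. The +1 terms
    avoid reporting a p-value of exactly zero.
    """
    rng = random.Random(seed)  # fixed seed for reproducible results
    universe = sorted(universe)
    observed = pathway_score(pathway, weights)
    size = len(pathway)
    hits = sum(
        pathway_score(rng.sample(universe, size), weights) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)
```

The resolution of the empirical p-value is limited by the number of random subsets, so small p-values require a large `n_perm` or a fitted theoretical distribution as mentioned in the figure caption.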

3.3 Multiple testing

As a drug target list is searched against a database of pathways, or any other option mentioned above, the comparison

of the target list with each individual pathway yields a

p-value. Imposing a maximum false-positive rate (FPR) α as

a threshold on individual p-values does not provide a

convenient way of controlling the false-positive rate in the

final selection of significant pathways. The p-values are obtained from a null-model that ignores the multiple

selections and, in particular, the number of true positives

present in the database. The most stringent solution is to

require individual p-values smaller than α/N, where N is the number of pathways described in the database. This solution

is named the Bonferroni correction and it ensures that

the probability to have one or more false positives among

the selected pathways is not more than α. The problem with

this approach (and related less strict ones such as the Sidak

correction) is that sensitivity is clearly reduced, and it ignores the number of selected pathways completely,

considering the total database size only.

A more appropriate and widely used solution is to control

the false discovery rate (FDR), which is defined as the rate of

false positives among the selection of significant pathways [49]. FDR is a much more natural concept that is related to

the selection size, which is readily understandable for the

user of a tool. There exist methods to automatically adjust p-value

thresholds on individual pathways such that a required FDR threshold is met (Fig. 4C) and many common

tools such as DAVID [46] offer this option.
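Both corrections are easy to state in code. The sketch below applies the Bonferroni threshold α/N and, for FDR control, the Benjamini–Hochberg step-up procedure. The latter is assumed here as one common choice; the text above does not prescribe a specific FDR method, and the function names are chosen for this example.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values smaller than alpha / N, N being the number of tests."""
    n_tests = len(p_values)
    return [p < alpha / n_tests for p in p_values]

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Selecting all pathways with an adjusted value <= q controls the FDR
    at level q (under the usual assumptions of the procedure).
    """
    n_tests = len(p_values)
    order = sorted(range(n_tests), key=lambda i: p_values[i])
    adjusted = [0.0] * n_tests
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n_tests, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n_tests / rank)
        adjusted[index] = running_min
    return adjusted
```

For four pathways with p-values 0.01, 0.04, 0.03, and 0.20, only the first passes the Bonferroni threshold at α = 0.05, whereas the adjusted values allow the first three to be selected at an FDR slightly above 5%.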

3.4 The use of structures in the enrichment

Different improvements over the enrichment analysis

methods have been proposed to increase specificity. In most

cases, these methods try to make better use of the structure

of the reference data described in the database. A first example is provided by the GO, which can be compared with

drug targets at different levels of details. Full detail analyses

might return very specific and not so relevant annotations as

significant hits. High-level analyses, with all the detailed GO terms mapped onto a few generic ones, might hide interesting

differences at higher levels of details. Several authors

have proposed methods to combine both options. An

interesting example is TopGO [50], where detailed GO terms

not found to be significant participate in the analysis of

more generic terms, whereas detailed significant terms areexcluded from further analysis (Fig. 4D). DAVID proposes

only an alternative which is to use the most detailed GO

terms and to group them a posteriori to reduce the output

complexity.
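
The bottom-up idea described for TopGO can be illustrated with a simplified Python sketch. This is not the TopGO implementation (an R/Bioconductor package); the two-term hierarchy, the protein identifiers, and the function names are hypothetical, and a plain hypergeometric test is assumed for each term.

```python
from math import comb


def hypergeom_tail(k, universe_size, n_annotated, n_targets):
    """P(X >= k): chance of k or more annotated proteins among the targets."""
    total = comb(universe_size, n_targets)
    upper = min(n_annotated, n_targets)
    return sum(
        comb(n_annotated, i) * comb(universe_size - n_annotated, n_targets - i)
        for i in range(k, upper + 1)
    ) / total


def eliminate_bottom_up(term_proteins, parents, order, targets, universe, alpha):
    """Test terms from detailed to generic; proteins of significant terms
    are removed from all their ancestors before those are tested."""
    proteins = {term: set(members) for term, members in term_proteins.items()}
    significant = []
    for term in order:  # 'order' lists detailed terms before generic ones
        annotated = proteins[term]
        hits = annotated & targets
        p = hypergeom_tail(len(hits), len(universe), len(annotated), len(targets))
        if hits and p < alpha:
            significant.append(term)
            stack = list(parents.get(term, []))
            while stack:
                ancestor = stack.pop()
                proteins[ancestor] -= annotated
                stack.extend(parents.get(ancestor, []))
    return significant


# Hypothetical toy data: 20 proteins, one detailed term nested in a generic one.
universe = {f"p{i}" for i in range(20)}
term_proteins = {
    "detailed_term": {"p0", "p1", "p2", "p3"},
    "generic_term": {"p0", "p1", "p2", "p3", "p4"},
}
parents = {"detailed_term": ["generic_term"]}
targets = {"p0", "p1", "p2", "p10"}

result = eliminate_bottom_up(
    term_proteins, parents, ["detailed_term", "generic_term"],
    targets, universe, alpha=0.05,
)
classic_generic_p = hypergeom_tail(3, 20, 5, 4)
print(result, classic_generic_p)
```

In this toy case the generic term would pass a classical test on its own, but it is no longer reported once the proteins explained by the detailed term have been removed.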

Table 1. Major sources of nonspecific binders and their solutions

| Source | Solution method | Limitations | Refs. a) |
|---|---|---|---|
| Binds to the beads or to the linker | Negative control with blocked beads | | |
| | Frequent hitter in previous pulldowns | Might eliminate correct targets | |
| | Better linker (with hydrophilic spacers) | | [83] |
| Low-affinity abundant proteins | Frequent hitter in previous pulldowns | Might eliminate correct targets | |
| | Negative control with another compound | Must ascertain there is no shared target; works better with a close chemical structure, which is not always available with no shared target | [84, 85] |
| | Subtract core proteome | Will not remove medium-abundant nonspecific binders completely; no chance to find abundant targets | [23, 86, 87] |
| | Pulldown enrichment versus core proteome | Will not remove medium-abundant nonspecific binders completely | [8] |
| | Competition with free compound | | [13, 31] |
| | Public database of frequent hitters | Might eliminate correct targets | [88] |
| Indirect interactions | Protein family, e.g. kinases only for kinase inhibitors | Excludes the possibility of unexpected targets | [8, 89] |
| | Binary protein interaction from databases | Reduces possibilities but does not indicate the direct binders clearly | [53] |
| | Binding domains, predicted protein interactions | Reduces possibilities but does not indicate the direct binders clearly; potential higher rate of false-positive binary interactions compared with measured binary interactions | [33] |

a) Relevant references to either illustrate the successful application of the solution method or some of its limitations.





As a second example of exploiting structures in the
reference data, it is possible to filter pathway hits by
integrating expression proteomics or gene microarray data,
where drug-treated cells are profiled. Pathways truly
modulated by a drug should contain upstream targets and
exhibit downstream regulation (Fig. 4E). There also exist
algorithms developed for gene microarray data that score the
global coherence of multiple hits on pathways, taking their
topology into account [51]. Weighted target lists can be
submitted to such programs.
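
The filtering criterion (a target upstream, regulation downstream) can be sketched as a simple reachability test. The example below is only one possible reading of this criterion: pathways are assumed to be given as directed edges, the node names are placeholders, and the function name is hypothetical.

```python
from collections import deque


def has_downstream_regulation(edges, targets, regulated):
    """True if a regulated protein is reachable downstream of a drug target."""
    successors = {}
    for source, destination in edges:
        successors.setdefault(source, set()).add(destination)
    queue = deque(t for t in targets if t in successors)
    seen = set(queue)
    while queue:  # breadth-first search along the pathway direction
        node = queue.popleft()
        for nxt in successors.get(node, ()):
            if nxt in regulated:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Placeholder pathways: 'T' is a drug target, 'R' a protein found regulated
# in the expression data of drug-treated cells.
pathway_1 = [("T", "A"), ("A", "R")]   # R lies downstream of T
pathway_2 = [("B", "T"), ("C", "R")]   # R is present but not downstream of T
targets, regulated = {"T"}, {"R"}

keep_1 = has_downstream_regulation(pathway_1, targets, regulated)
keep_2 = has_downstream_regulation(pathway_2, targets, regulated)
print(keep_1, keep_2)
```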

3.5 Integrating protein interactions – A first systems biology method

To a certain extent, definitions of pathways and biological
processes are arbitrary and, certainly, many relationships
between proteins and genes are not known [52]. Large
numbers of human protein–protein physical interactions
(PPIs) have been collected and stored in public repositories
such as IntAct, MINT, BioGRID, HPRD, and InnateDB
[53–57]. The interactome, i.e. the network of all the PPIs,
constitutes a valuable complementary approach to consider
drug targets in a broader context. It has been shown that
proteins sharing physical interactions often share a function
and, consequently, proximity in the interactome helps in
improving the specificity of enrichment analysis. This can
be explained by the frequent participation of proteins in
several functions or pathways depending on associations
with other proteins. Therefore, if a protein A can associate
with B for a certain function or with C for another one,
finding A and B in a drug pulldown indicates that the first
function is modulated and not the second one. This
further underlines the potential interest of secondary
binders to increase precision with regard to the context of
drug action.

Table 2. Selection of useful resources

| Name | Type a) | Description and references |
|---|---|---|
| Classical enrichment analysis | | |
| DAVID | W | Generic, simple, and rich tool (pathways, GO, domains, etc.) [46] |
| GO | W | The European Bioinformatics Institute website provides the GOs [45] with tools |
| Pathway databases | D, W | KEGG [43], NCI-PID [44], BioCarta (www.biocarta.com), NetPath [90], and WikiPathways [91] are commonly used databases whose websites provide mapping tools |
| Biological pathways | | |
| PPI databases | D, W | MINT [54], IntAct [53], DIP [92], HPRD [57], BioGRID [55], and InnateDB [56] are common repositories of PPIs |
| STRING, STITCH | D, W | STRING is an interaction database that complements PPI data with interactions inferred from text mining, coevolution, and simultaneous presence in pathways [93]; STITCH is a related project that compiles drug–protein interactions [94] |
| Diseases | | |
| OMIM | D, W | Database of gene–disease associations [95] |
| Tumors | D, W | COSMIC [96] and Oncomine (www.oncomine.org) compile genetic defects found in tumors |
| Drug databases | | |
| Compounds | D, W | DrugBank integrates information on drugs and their targets [97], SIDER compiles drug side effects [98], SMPDB provides drug metabolic pathways [99], MMsINC describes a vast collection of compounds [100], and ChEBI [101] lists active compounds |
| TCM database | D, W | A comprehensive database of active molecules in TCM [74] |
| Tools | | |
| Network display and analysis tools | T | Cytoscape is the most widely used tool with many plug-ins to perform network analyses and extensions [102], BiologicalNetworks is an alternative system also offering a rich set of functions [103] |
| R | T | R (www.r-project.org) is a data analysis environment and programming language that provides a comprehensive set of packages to analyze networks, perform enrichment analysis, etc., via its bioinformatics extension (www.bioconductor.org) |
| MeV | T | A generic tool to perform a multitude of gene expression profile analyses, also applicable to proteomic data [104] |
| SwissDock | W | Ligand–small molecule binding tool to identify true interactors [105] |

a) D, database; W, web site with query tools; T, stand alone tool.

A first approach to improve enrichment analyses consists
in finding interactome subnetworks that contain the drug
targets and are enriched for an annotated biological function
[58] (Fig. 4F). We found this method very useful in analyzing
the bafetinib target profile, which featured 33 kinases
that yielded a clear CML-relevant association with apoptosis.
As a comparison, direct GO enrichment analysis only yielded
three biological processes significant at the 1% level, none of
which was directly cancer related. KEGG pathway enrichment
did not find any significant hit at the 1% significance
level.

3.6 More global analysis of the impact on the protein interaction network

So far, we have presented methods where the target profile
was analyzed through its mapping onto existing biological
concepts to predict the impact of drug treatment. We already
exploited the interactome as a means of embedding target
profiles in a broader and more neutral context. Here, we go
one step further by trying to expand the target list with
functionally associated proteins. In this procedure, only the
topology of the interactome matters and existing functional
annotations are ignored.

In principle, adjacency within the interactome often
implies a related function due to the common participation
in a complex or a pathway. Directly expanding the target list
with immediate network neighbors is a first option
(Fig. 5A). We occasionally obtained satisfying results, but
this approach usually does not perform well, either because
more distant relevant neighbors are missed or because too many
additional proteins are added, resulting in dilution of the
fingerprint of relevant biological functions (Fig. 5B). As an
example of this difficulty, the bafetinib profile expanded with
direct interactions increased from 33 initial proteins to 831. GO
enrichment analysis yielded 676 biological processes significant
at the 1% level [58], a massive dilution of information.
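
The direct expansion with network neighbors, and the way a single highly connected protein inflates the list, can be illustrated on a toy network. The interactions and protein names below are invented for this example, and the function name is hypothetical.

```python
def expand_with_neighbors(proteins, interactions):
    """Add every direct interaction partner of the given proteins."""
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    expanded = set(proteins)
    for protein in proteins:
        expanded |= neighbors.get(protein, set())
    return expanded


# Toy interactome: two targets (T1, T2), a shared partner A, a partner B,
# and a hub protein HUB that interacts with five further proteins.
interactions = [("T1", "A"), ("T2", "A"), ("T1", "HUB"), ("T2", "B")]
interactions += [("HUB", f"X{i}") for i in range(1, 6)]
targets = {"T1", "T2"}

one_layer = expand_with_neighbors(targets, interactions)
two_layers = expand_with_neighbors(one_layer, interactions)
print(len(targets), len(one_layer), len(two_layers))
```

Even in this small example the list grows from 2 to 5 and then to 10 proteins, the last step being caused entirely by the hub.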

Figure 5. Target profile expansion. (A) Adjacent proteins (1) of a target (red) are likely to share a function. By adding one additional layer of
neighbors (2), functional specificity is often lost because of highly connected nodes. (B) To limit an explosion of the expanded target
profile, only proteins that are linked to at least two targets (1) are added. In a relaxed version, it is possible to consider proteins with one
link to a target only, provided they are directly linked to another such protein (2). (C) Diffusion over a small example network. The two
rectangle nodes indicated by gray arrows represent targets with identical weights. We selected the top 10 scores on the network after
diffusion and we observe synergistic effects that do not select nodes on the basis of their distance to the targets only. (D) We used the
bosutinib target profile [26] and selected its 5% expanded profile [48] to illustrate the application of diffusion methods. Triangles are
targets, diffusion scores are indicated by a color scale (red, strong), immune system-related proteins are shown with large nodes and their
names are followed in brackets by the numbers of interactions with other immune system proteins found within the subnetwork. This
subnetwork suggests a high risk of immunosuppressive side effects.

In general, PPI networks have a small-world or scale-free
topology [59], which means that their organization resembles
air traffic routes: every airport is at a distance of one or
two flights from a large hub that connects it to another hub,
which is close to the final destination. Therefore, adding
more than one layer of neighbors frequently encounters
highly connected proteins, and too many proteins are
included (Fig. 5A). The solution to this problem is
to constrain the expansion procedure such that proteins are
added only when sufficient evidence of a potential association
with the targets exists. Proteins with direct PPIs with at
least two targets constitute safe expansions (Fig. 5B), and
they usually increase the signal-to-noise ratio in the functional
analysis [16], i.e. submitting the expanded list to enrichment
analysis yields more relevant hits without generating additional
false positives, or even reduces false positives
through FDR control.
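
The constrained expansion, in which a protein is added only if it interacts directly with at least two targets, can be sketched as follows. The toy interactions are invented and the function name and `min_links` parameter are hypothetical.

```python
def expand_constrained(targets, interactions, min_links=2):
    """Add only the proteins that interact with at least min_links targets."""
    linked_targets = {}
    for a, b in interactions:
        for protein, partner in ((a, b), (b, a)):
            if partner in targets and protein not in targets:
                linked_targets.setdefault(protein, set()).add(partner)
    added = {p for p, linked in linked_targets.items() if len(linked) >= min_links}
    return set(targets) | added


# Toy interactome: A binds both targets, HUB and B bind one target each,
# and HUB has five further partners X1..X5.
interactions = [("T1", "A"), ("T2", "A"), ("T1", "HUB"), ("T2", "B")]
interactions += [("HUB", f"X{i}") for i in range(1, 6)]
targets = {"T1", "T2"}

expanded = expand_constrained(targets, interactions)
print(sorted(expanded))
```

Only A is added here; the hub and its many partners are left out because the hub is supported by a single target.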

There are cases where small target lists cannot be
augmented sufficiently this way and a second
layer of PPIs must be considered. Usually, the list of added proteins
explodes with a second layer and all the specificity of the target
spectrum is lost. The explosion can be limited by requiring that such
proteins have at least one PPI with a target and one PPI with another
such protein (Fig. 5B), but this is not a complete secondary layer of PPIs
and, in many cases, it is not even sufficient to control the
explosion of the expanded target list.

One elegant solution to the above limitations involves the
notion of diffusion over a network. The concept is rather
natural: starting from a set of seed proteins, in our case the
drug targets, their "influence" diffuses over the network to
give a score to all the other proteins (Fig. 5C). The interest
of this method is that the global network topology is
exploited and synergies between close protein targets confer
increased scores to linked proteins. Distantly related proteins
that are connected to drug targets through specific paths,
which do not contain highly connected proteins, can be
scored relatively high (Fig. 5C). In a neutral context, where it
is not known which protein interaction is crucial for a
treatment or a disease, diffusion methods efficiently
capture a notion of functional proximity that
follows biological intuition. The actual computation of
the diffusion and the weights can be implemented as the
asymptotic distribution of a random walk or first passage
times [48, 60, 61], or more precisely controlled through
diffusion kernels [60]. In every case, affinity weights can be
used to adjust the importance of the seed proteins. Interestingly,
World Wide Web HTML document hyperlinks
constitute a network that also has a small-world topology,
and random walk methods are at the heart of widely used
web search engines.

On the basis of the weights determined by the diffusion
method, it is possible to select either the top K proteins or,
through many repeated diffusions using random target
spectra, to determine a score threshold [48] and thus obtain
a subnetwork. Figure 5D shows the 5% significant bosutinib
subnetwork, which revealed a strong association with
immune system pathways [48]. This immunosuppressive
risk illustrates a plausible application of this method to the
prediction of side effects, since dasatinib, a related kinase
inhibitor in clinical use, has been documented previously to
cause such effects [62] and several proteins in the bosutinib
subnetwork, such as LYN [62], BTK [63], TBK1 [64], and SYK
[65, 66], are known to cause immunosuppression upon
inactivation.
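
A minimal sketch of one diffusion variant, a random walk with restart iterated until it approaches its asymptotic distribution, is given below, followed by a top K selection. The toy network, the restart probability of 0.3, and the function names are assumptions made for this illustration; the determination of a score threshold from random target spectra is not shown.

```python
def random_walk_with_restart(interactions, seed_weights, restart=0.3, n_iter=200):
    """Approximate the stationary scores of a walker that restarts at the seeds.

    seed_weights maps seed proteins (assumed present in the network) to
    weights, e.g. derived from drug-protein affinities.
    """
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    total = sum(seed_weights.values())
    start = {node: seed_weights.get(node, 0.0) / total for node in neighbors}
    scores = dict(start)
    for _ in range(n_iter):
        updated = {node: restart * start[node] for node in neighbors}
        for node, score in scores.items():
            # The walker moves to each neighbor with equal probability.
            share = (1.0 - restart) * score / len(neighbors[node])
            for neighbor in neighbors[node]:
                updated[neighbor] += share
        scores = updated
    return scores


def top_k(scores, seeds, k):
    """The k highest-scoring proteins that are not seeds themselves."""
    candidates = [node for node in scores if node not in seeds]
    return sorted(candidates, key=scores.get, reverse=True)[:k]


# Toy interactome: A is linked to both targets, HUB to one target and to
# five other proteins, B to one target only.
interactions = [("T1", "A"), ("T2", "A"), ("T1", "HUB"), ("T2", "B")]
interactions += [("HUB", f"X{i}") for i in range(1, 6)]
seed_weights = {"T1": 1.0, "T2": 1.0}

scores = random_walk_with_restart(interactions, seed_weights)
best = top_k(scores, seed_weights, k=1)
print(best, round(scores["A"], 3), round(scores["HUB"], 3), round(scores["X1"], 3))
```

Protein A receives contributions from both targets and therefore outranks the hub, whereas the hub's many partners each receive only a small share of its score.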

3.7 Toward drug efficacy predictions

The next step in the interpretation of chemical proteomic
target profiles consists in relating target profiles to diseases
and patient genetic backgrounds. The main tool to compute
this relationship is an intuitive notion of similarity over the
interactome. Through diffusion methods, we can compute
the influence of a drug profile, i.e. we compute a drug
treatment model. We can do the same starting with the
genes causing a disease and thus obtain a disease model.
Comparing the two models, we can estimate drug treatment
efficacy for a certain disease [48] (Fig. 6A). This method
has its origin in the work of researchers who tried to relate
phenotypes (diseases, patient records, etc.) with genetic
information (genes causing a disease, genetic defects, etc.)
through the exploitation of PPI networks. Such studies can
predict unknown important players in certain pathologies
[60, 67, 68], suggest new drug targets, or even,
closely related to our topic, suggest candidate molecules to
treat a disease [69].
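
One possible score, following the example given in the caption of Fig. 6 (multiply the two diffusion scores of each protein and sum over the proteins present in both models), can be sketched as follows. All numerical scores below are hypothetical and the function name is chosen for this example.

```python
def efficacy_score(drug_model, disease_model):
    """Sum of products of diffusion scores over proteins shared by both models."""
    shared = drug_model.keys() & disease_model.keys()
    return sum(drug_model[p] * disease_model[p] for p in shared)


# Hypothetical diffusion scores (protein -> score) for two drugs and one disease.
drug_a = {"ABL1": 0.5, "LYN": 0.25, "SRC": 0.25}
drug_b = {"ABL1": 0.5, "KIT": 0.5}
disease = {"ABL1": 0.5, "LYN": 0.5}

score_a = efficacy_score(drug_a, disease)
score_b = efficacy_score(drug_b, disease)
ranking = sorted({"drug_a": score_a, "drug_b": score_b}.items(),
                 key=lambda item: item[1], reverse=True)
print(ranking)
```

Adding a protein to the disease model, or increasing its weight, changes the score only of drugs whose treatment model reaches that protein.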

Direct KEGG pathway enrichment analysis of the imatinib
target profile yielded only two hits significant at the 5% level,
neither of which was CML relevant. Expanding the imatinib target
list through direct PPIs generated a list of 295 proteins.
When submitted to KEGG enrichment analysis, this extended
protein list returned 30 hits significant at the 5% level, with
apoptosis at rank 4. For comparison, the network diffusion
methods combined with drug efficacy scores returned CML
as the top hit [48]. Furthermore, we were able to show that
reasonable estimates of treatment efficacy can be obtained
through drug efficacy scores when comparing four BCR-ABL
kinase inhibitors. We also showed that, when the
disease model was modified to introduce the constitutive activation of the
LYN kinase observed in certain imatinib-resistant patients,
the computation yielded an increased score for the
second-generation compounds dasatinib, bosutinib, and
nilotinib, especially designed to target LYN in addition to
BCR-ABL. The imatinib score remained essentially constant,
thereby illustrating the potential to segregate patients with
these methods.

We compared the target profiles of all four inhibitors
with a list of diseases and proposed plausible additional
indications for each compound (Fig. 6B). For instance, lung
cancer was highly ranked for dasatinib and actual efficacy
was shown in another study [18]. Noonan syndrome was
ranked first for bosutinib, which makes sense since several
cancer-associated genes (KRAS, PTPN11, SOS1, RAF1) are
involved [70] and kinase inhibitor treatments are currently
considered for this syndrome. These examples illustrate
how target profiles can be analyzed to predict new drug
indications.

There is a lot of evidence that distinct pathologies might
share molecular mechanisms [71–73] and, looking forward,
this further increases the opportunities for drug repurposing:
drugs can be related to diseases and, through disease
associations, proposed as therapeutic agents for new diseases
(Fig. 6C). Drug treatment models can also be applied to
compare drugs and potentially assign documented side
effects or areas of application to new compounds (Fig. 6C).
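
How two drug treatment models are compared is not specified here; as one simple possibility, the sketch below uses the cosine similarity between their diffusion score vectors. The similarity measure, the scores, and the function name are assumptions made for illustration only.

```python
from math import sqrt


def cosine_similarity(model_1, model_2):
    """Cosine of the angle between two protein -> score vectors."""
    shared = model_1.keys() & model_2.keys()
    dot = sum(model_1[p] * model_2[p] for p in shared)
    norm_1 = sqrt(sum(v * v for v in model_1.values()))
    norm_2 = sqrt(sum(v * v for v in model_2.values()))
    if norm_1 == 0.0 or norm_2 == 0.0:
        return 0.0
    return dot / (norm_1 * norm_2)


# Hypothetical drug treatment models (protein -> diffusion score).
known_drug = {"ABL1": 0.6, "LYN": 0.8}
new_drug_close = {"ABL1": 0.6, "LYN": 0.8}
new_drug_far = {"KIT": 1.0}

sim_close = cosine_similarity(known_drug, new_drug_close)
sim_far = cosine_similarity(known_drug, new_drug_far)
print(sim_close, sim_far)
```

A new compound whose model is very similar to that of a well-characterized drug would then be a candidate for inheriting its documented side effects or indications, subject to experimental confirmation.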

3.8 Data sets not from chemical proteomics

One important aspect discussed is the difficulty of interpreting
complex and potentially large drug profiles in a
systems biology perspective. Other fields of research related
to human health deal with similar situations. For instance,
TCM research has accumulated a lot of information on
many active molecules present in a TCM drug and their
target proteins [74]. A TCM drug action can be analyzed with
the concepts presented here [75–77]. In fact, these were
pioneered by TCM researchers [78–80]. A similar consideration
can be made regarding nutritional science, where
multiple molecular changes can be induced in human gourmets
through the ingestion of food, suggesting that nutrigenomics
could benefit from such systems-wide impact
analyses as it is collecting evidence on the impact of specific
diets on human proteins [81, 82].

4 Perspective and concluding remarks

We have presented a brief overview of chemical proteomic
methods and introduced various bioinformatic methods to
analyze drug target profiles. The simplest methods
perform enrichment analyses, where sets of proteins, e.g.
biological pathways, are compared with drug profiles to
detect over-representation of the targets in those sets. Then,
modeling the specifics of chemical proteomics better, we
introduced methods that take into account estimations of
drug–protein affinities. Finally, expanding further, we
presented methods that implement systems-wide analyses
by embedding the drug profiles in global models of cell
biology such as an interactome.

The high degree of interconnectivity among the entities
involved in biological processes is a natural and strong
argument to investigate the positive and negative consequences
of drug treatment from the point of view of systems
biology. We have provided examples of kinase inhibitors that
have a broad spectrum of targets, e.g. bafetinib with 33
kinases, and whose target profiles cannot be analyzed
successfully without the contribution of the human interactome
data. Furthermore, we have shown how information
on the molecular causes of diseases can be correlated
with the consequences of drug treatment over the interactome
to obtain reasonable predictions of side effects,
additional indications, and match with individual patient
genetic background (Fig. 6A and B). More generally, the
measurement of target profiles provides an elegant way to
compare drugs with each other on the basis of their action
on pathways or the interactome (Fig. 6C) and, in combination
with similar comparisons realized with diseases, it
makes it possible to build a complex set of relationships that
allow transferring information from one well-characterized
compound to a new molecule or to postulate side effects.

It is well known that patients must be segregated to
improve treatment efficacy, limit treatment costs, and reduce new
compound attrition rates. The spectacular improvements
achieved in DNA-sequencing technologies open avenues in
obtaining very detailed information on patient genomes. We
believe that, at the other end of the spectrum, chemical
proteomics and the systems biology analysis of its data have a
fundamental role to play in order to link patient-specific
digital models to accurate models of drug action.

Figure 6. Scoring drug treatment efficacy. (A) On the basis of a
diffusion method and drug targets with their affinities, a drug
treatment model is built. Similarly, genes or proteins causing a
disease can be used to build a disease model, possibly integrating
patient-specific information such as genetic abnormalities.
At the intersection of the two models, a drug efficacy score
can be computed, e.g. multiplying the two diffusion scores given
by each model to individual proteins and summing over the
proteins in the intersection. (B) Two obvious applications of
treatment efficacy scores. Drugs can be compared for a chosen
disease or patient to predict an adequate treatment, i.e. to
implement personalized medicine. Conversely, a chosen
compound can be compared with disease models to find new
indications for this compound ("drug repurposing"). (C) Drugs
can not only be compared with diseases; relationships
between drugs can also be exploited to transfer knowledge available
for some compounds to a new compound and, identically,
shared molecular bases of distinct pathologies can be exploited
to predict side effects and repurpose drugs.

The authors thank Professor Shao Li and Professor Jing Zhao

for their help on TCM applications.

The authors have declared no conflict of interest.

5 References

[1] Domon, B., Aebersold, R., Mass spectrometry and protein

analysis. Science 2006, 312 , 212–217.

[2] Bantscheff, M., Scholten, A., Heck, A. J., Revealing

promiscuous drug-target interactions by chemical proteo-

mics. Drug Discov. Today 2009, 14 , 1021–1029.

[3] Rix, U., Superti-Furga, G., Target profiling of small molecules
by chemical proteomics. Nat. Chem. Biol. 2009, 5, 616–624.

[4] Booth, B., Zemmel, R., Prospects for productivity. Nat. Rev.

Drug Discov. 2004, 3 , 451–456.

[5] Csermely, P., Agoston, V., Pongor, S., The efficiency of

multi-target drugs: the network approach might help drug

design. Trends Pharmacol. Sci. 2005, 26 , 178–182.

[6] Araujo, R. P., Liotta, L. A., Petricoin, E. F., Proteins, drug

targets and the mechanisms they control: the simple truth

about complex networks. Nat. Rev. Drug Discov. 2007, 6 ,

871–880.

[7] Keith, C. T., Borisy, A. A., Stockwell, B. R., Multicomponent
therapeutics for networked systems. Nat. Rev. Drug
Discov. 2005, 4, 71–78.

[8] Rix, U., Hantschel, O., Durnberger, G., Remsing Rix, L. L.
et al., Chemical proteomic profiles of the BCR-ABL inhibitors
imatinib, nilotinib, and dasatinib reveal novel kinase
and nonkinase targets. Blood 2007, 110, 4055–4063.

[9] Hood, L., Heath, J. R., Phelps, M. E., Lin, B., Systems

biology and new technologies enable predictive and

preventative medicine. Science 2004, 306 , 640–643.

[10] Gavin, A.-C., Aloy, P., Grandi, P., Krause, R. et al.,

Proteome survey reveals modularity of the yeast cell

machinery. Nature 2006, 440 , 631–636.

[11] Harding, M. W., Galat, A., Uehling, D. E., Schreiber, S. L., A

receptor for the immunosuppressant FK506 is a cis-trans

peptidyl-prolyl isomerase. Nature 1989, 341, 758–760.

[12] Cuatrecasas, P., Wilchek, M., Anfinsen, C. B., Selective

enzyme purification by affinity chromatography. Proc.

Natl. Acad. Sci. USA 1968, 61, 636–643.

[13] Bantscheff, M., Eberhard, D., Abraham, Y., Bastuck, S.

et al., Quantitative chemical proteomics reveals mechan-

isms of action of clinical ABL kinase inhibitors. Nat.

Biotechnol. 2007, 25 , 1035–1044.

[14] Patricelli, M. P., Nomanbhoy, T. K., Wu, J., Brown, H.

et al., In situ kinase profiling reveals functionally

relevant properties of native kinases. Chem. Biol. 2011, 18 ,

699–710.

[15] Olsen, J. V., Blagoev, B., Gnad, F., Macek, B. et al., Global,

in vivo, and site-specific phosphorylation dynamics in

signaling networks. Cell 2006, 127 , 635–648.

[16] Hantschel, O., Gstoettenbauer, A., Colinge, J., Kaupe, I.

et al., The chemokine interleukin-8 and the surface acti-

vation protein CD69 are markers for Bcr-Abl activity in

chronic myeloid leukemia. Mol. Oncol. 2008, 2 , 272–281.

[17] Pan, C., Olsen, J. V., Daub, H., Mann, M., Global effects of

kinase inhibitors on signaling networks revealed by

quantitative phosphoproteomics. Mol. Cell. Proteomics

2009, 8 , 2796–2808.

[18] Li, J., Rix, U., Fang, B., Bai, Y. et al., A chemical and

phosphoproteomic characterization of dasatinib action in

lung cancer. Nat. Chem. Biol. 2010, 6 , 291–299.

[19] Winter, G. E., Rix, U., Lissat, A., Stukalov, A. et al., An

integrated chemical biology approach identifies specific

vulnerability of Ewing’s sarcoma to combined inhibition of

Aurora kinases A and B. Mol. Cancer Ther. 2011.

[20] Ramsden, N., Perrin, J., Ren, Z., Lee, B. D. et al.,

Chemoproteomics-based design of potent LRRK2-

selective lead compounds that attenuate Parkinson’s

disease-related toxicity in human neurons. ACS Chem.

Biol. 2011.

[21] Duncan, J. S., Gyenis, L., Lenehan, J., Bretner, M. et al., An

unbiased evaluation of CK2 inhibitors by chemopro-

teomics: characterization of inhibitor effects on CK2 and

identification of novel inhibitor targets. Mol. Cell. Proteo-

mics 2008, 7 , 1077–1088.

[22] Mercer, L., Bowling, T., Perales, J., Freeman, J. et al.,
2,4-Diaminopyrimidines as potent inhibitors of Trypanosoma
brucei and identification of molecular targets by a
chemical proteomics approach. PLoS Negl. Trop. Dis.
2011, 5, e956.

[23] Fadden, P., Huang, K. H., Veal, J. M., Steed, P. M. et al.,
Application of chemoproteomics to drug discovery: identification
of a clinical candidate targeting hsp90. Chem.
Biol. 2010, 17, 686–694.

[24] Kang, H. J., Yoon, T. S., Jeong, D. G., Kim, Y. et al.,

Identification of proteins binding to decursinol by

chemical proteomics. J. Microbiol. Biotechnol. 2008, 18 ,

1427–1430.

[25] Fleischer, T. C., Murphy, B. R., Flick, J. S., Terry-Lorenzo,

R. T. et al., Chemical proteomics identifies Nampt as the

target of CB30865, an orphan cytotoxic compound. Chem.

Biol. 2010, 17 , 659–664.

[26] Remsing Rix, L. L., Rix, U., Colinge, J., Hantschel, O. et al.,
Global target profile of the kinase inhibitor bosutinib in
primary chronic myeloid leukemia cells. Leukemia 2009,
23, 477–485.

[27] Fernbach, N. V., Planyavsky, M., Muller, A., Breitwieser,

F. P. et al., Acid elution and one-dimensional shotgun

analysis on an Orbitrap mass spectrometer: an application

to drug affinity chromatography. J. Proteome Res. 2009, 8 ,

4753–4765.

[28] Lundgren, D. H., Hwang, S. I., Wu, L., Han, D. K., Role of

spectral counting in quantitative proteomics. Expert Rev.

Proteomics 2010, 7 , 39–53.





[29] Brehme, M., Hantschel, O., Colinge, J., Kaupe, I. et al.,

Charting the molecular network of the drug target Bcr-Abl.

Proc. Natl. Acad. Sci. USA 2009, 106 , 7414–7419.

[30] Choi, H., Fermin, D., Nesvizhskii, A. I., Significance analysis

of spectral count data in label-free shotgun proteomics.

Mol. Cell. Proteomics 2008, 7 , 2373–2385.

[31] Rix, U., Remsing Rix, L. L., Terker, A. S., Fernbach, N. V.

et al., A comprehensive target selectivity survey of the

BCR-ABL kinase inhibitor INNO-406 by kinase profiling and

chemical proteomics in chronic myeloid leukemia cells.

Leukemia 2009, 24 , 44–50.

[32] Venkatesan, K., Rual, J. F., Vazquez, A., Stelzl, U. et al., An

empirical framework for binary interactome mapping. Nat.

Methods 2009, 6 , 83–90.

[33] Rhodes, D. R., Tomlins, S. A., Varambally, S., Mahavisno,

V. et al., Probabilistic model of the human protein-protein

interaction network. Nat. Biotechnol. 2005, 23 , 951–959.

[34] Bantscheff, M., Hopf, C., Savitski, M. M., Dittmann, A. et al.,

Chemoproteomics profiling of HDAC inhibitors reveals

selective targeting of HDAC complexes. Nat. Biotechnol.

2011, 29 , 255–265.

[35] Liu, Y., Patricelli, M. P., Cravatt, B. F., Activity-based

protein profiling: the serine hydrolases. Proc. Natl. Acad.

Sci. USA 1999, 96 , 14694–14699.

[36] Adam, G. C., Sorensen, E. J., Cravatt, B. F., Chemical

strategies for functional proteomics. Mol. Cell. Proteomics

2002, 1, 781–790.

[37] Ross, P. L., Huang, Y. N., Marchese, J. N., Williamson, B.

et al., Multiplexed protein quantitation in Saccharomyces

cerevisiae using amine-reactive isobaric tagging reagents.


[38] Thompson, A., Schafer, J., Kuhn, K., Kienle, S. et al.,

Tandem mass tags: a novel quantification strategy for

comparative analysis of complex protein mixtures by MS/

MS. Anal. Chem. 2003, 75 , 1895–1904.

[39] Ong, S. E., Blagoev, B., Kratchmarova, I., Kristensen, D. B.

et al., Stable isotope labeling by amino acids in cell culture,

SILAC, as a simple and accurate approach to expression

proteomics. Mol. Cell. Proteomics 2002, 1, 376–386.

[40] Breitwieser, F. P., Muller, A., Dayon, L., Kocher, T. et al.,

General statistical modeling of data from protein relative

expression isobaric tags. J. Proteome Res. 2011, 10 ,

2758–2766.

[41] Cox, J., Mann, M., MaxQuant enables high peptide iden-

tification rates, individualized p.p.b.-range mass accura-

cies and proteome-wide protein quantification. Nat.

Biotechnol. 2008, 26 , 1367–1372.

[42] Sharma, K., Weber, C., Bairlein, M., Greff, Z. et al.,

Proteomics strategy for quantitative protein interaction

profiling in cell extracts. Nat. Methods 2009, 6 , 741–744.

[43] Kanehisa, M., Araki, M., Goto, S., Hattori, M. et al., KEGG

for linking genomes to life and the environment. Nucleic

Acids Res. 2008, 36 , D480–D484.

[44] Schaefer, C. F., Anthony, K., Krupa, S., Buchoff, J. et al.,

PID: the Pathway Interaction Database. Nucleic Acids Res.

2009, 37 , D674–D679.

[45] Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D. et al.,

Gene ontology: tool for the unification of biology. Gene

Ontol. Consortium Nat. Genet. 2000, 25 , 25–29.

[46] Huang da, W., Sherman, B. T., Lempicki, R. A., Systematic

and integrative analysis of large gene lists using DAVID

bioinformatics resources. Nat. Protoc. 2009, 4 , 44–57.

[47] Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S.

et al., Gene set enrichment analysis: a knowledge-based

approach for interpreting genome-wide expression profiles.


[48] Colinge, J., Rix, U., Superti-Furga, G., in: Chen, L., Zhang,

X., Shen, B., Wu, L., Wang, Y. (Eds.), 4th International

Conference on Computational Systems Biology , World

Publishing Company, Suzhou, China 2010, pp. 305–313.

[49] Benjamini, Y., Hochberg, Y., Controlling the false discovery
rate: a practical and powerful approach to multiple
testing. J. R. Stat. Soc. B 1995, 57, 289–300.

[50] Alexa, A., Rahnenfuhrer, J., Lengauer, T., Improved scor-

ing of functional groups from gene expression data by

decorrelating GO graph structure. Bioinformatics 2006, 22 ,

1600–1607.

[51] Tarca, A. L., Draghici, S., Khatri, P., Hassan, S. S. et al., A

novel signaling pathway impact analysis. Bioinformatics

2009, 25 , 75–82.

[52] Glaab, E., Baudot, A., Krasnogor, N., Valencia, A.,
Extending pathways and processes using molecular
interaction networks to analyse cancer genome data.
BMC Bioinformatics 2010, 11, 597.

[53] Kerrien, S., Alam-Faruque, Y., Aranda, B., Bancarz, I. et al.,

IntAct – open source resource for molecular interaction

data. Nucleic Acids Res. 2007, 35 , D561–D565.

[54] Cesareni, G., Chatr-aryamontri, A., Licata, L., Ceol, A.,

Searching the MINT database for protein interaction

information. Curr. Protoc. Bioinformatics 2008. Chapter 8,

Unit 8.5.

[55] Breitkreutz, B. J., Stark, C., Reguly, T., Boucher, L. et al.,

The BioGRID Interaction Database: 2008 update. Nucleic

Acids Res. 2008, 36 , D637–D640.

[56] Lynn, D. J., Winsor, G. L., Chan, C., Richard, N. et al.,

InnateDB: facilitating systems-level analyses of the

mammalian innate immune response. Mol. Syst. Biol.

2008, 4 , 218.

[57] Prasad, T. S., Kandasamy, K., Pandey, A., Human Protein

Reference Database and Human Proteinpedia as discovery

tools for systems biology. Methods Mol. Biol. 2009, 577 ,

67–79.

[58] Burkard, T. R., Rix, U., Breitwieser, F. P., Superti-Furga, G.,

Colinge, J., A computational approach to analyze the

mechanism of action of the kinase inhibitor bafetinib. PLoS

Comput. Biol. 2010, 6 , e1001001.

[59] Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N., Barabasi,

A. L., The large-scale organization of metabolic networks.

Nature 2000, 407 , 651–654.

[60] Kohler, S., Bauer, S., Horn, D., Robinson, P. N., Walking the interactome for prioritization of candidate disease genes. Am. J. Hum. Genet. 2008, 82, 949–958.

[61] Berger, S. I., Iyengar, R., Network analyses in systems pharmacology. Bioinformatics 2009, 25, 2466–2472.

[62] Sillaber, C., Herrmann, H., Bennett, K., Rix, U. et al., Immunosuppression and atypical infections in CML patients treated with dasatinib at 140 mg daily. Eur. J. Clin. Invest. 2009, 39, 1098–1109.

[63] Hantschel, O., Rix, U., Schmidt, U., Burckstummer, T. et al., The Btk tyrosine kinase is a major target of the Bcr-Abl inhibitor dasatinib. Proc. Natl. Acad. Sci. USA 2007, 104, 13283–13288.

[64] Ishii, K. J., Kawagoe, T., Koyama, S., Matsui, K. et al., TANK-binding kinase-1 delineates innate and adaptive immune responses to DNA vaccines. Nature 2008, 451, 725–729.

[65] Hussain, S. F., Kong, L. Y., Jordan, J., Conrad, C. et al., A novel small molecule inhibitor of signal transducers and activators of transcription 3 reverses immune tolerance in malignant glioma patients. Cancer Res. 2007, 67, 9630–9636.

[66] Deuse, T., Velotta, J. B., Hoyt, G., Govaert, J. A. et al., Novel immunosuppression: R348, a JAK3- and Syk-inhibitor attenuates acute cardiac allograft rejection. Transplantation 2008, 85, 885–892.

[67] Berger, S. I., Ma’ayan, A., Iyengar, R., Systems pharmacology of arrhythmias. Sci. Signal. 2010, 3, ra30.

[68] Lage, K., Karlberg, E. O., Storling, Z. M., Olason, P. I. et al., A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat. Biotechnol. 2007, 25, 309–316.

[69] Li, S., Zhang, N., Zhang, B., in: Chen, L., Zhang, X., Shen, B., Wu, L., Wang, Y. (Eds.), 4th International Conference on Computational Systems Biology, World Publishing Company, Suzhou, China 2010, pp. 51–58.

[70] Gelb, B. D., Tartaglia, M., Noonan syndrome and related disorders: dysregulated RAS-mitogen activated protein kinase signal transduction. Hum. Mol. Genet. 2006, 15, R220–R226.

[71] Goh, K. I., Cusick, M. E., Valle, D., Childs, B. et al., The human disease network. Proc. Natl. Acad. Sci. USA 2007, 104, 8685–8690.

[72] Braun, P., Rietman, E., Vidal, M., Networking metabolites and diseases. Proc. Natl. Acad. Sci. USA 2008, 105, 9849–9850.

[73] Yildirim, M. A., Goh, K. I., Cusick, M. E., Barabasi, A. L., Vidal, M., Drug-target network. Nat. Biotechnol. 2007, 25, 1119–1126.

[74] Chen, C. Y., TCM Database@Taiwan: the world’s largest traditional Chinese medicine database for drug screening in silico. PLoS One 2011, 6, e15939.

[75] Zhao, J., Jiang, P., Zhang, W., Molecular networks for the study of TCM pharmacology. Brief. Bioinform. 2010, 11, 417–430.

[76] Li, S., Zhang, B., Zhang, N., Network target for screening synergistic drug combinations with application to traditional Chinese medicine. BMC Syst. Biol. 2011, 5, S10.

[77] Li, S., Zhang, B., Jiang, D., Wei, Y., Zhang, N., Herb network construction and co-module analysis for uncovering the combination rule of traditional Chinese herbal formulae. BMC Bioinformatics 2010, 11, S6.

[78] Zeng, H., Dou, S., Zhao, J., Fan, S. et al., The inhibitory activities of the components of Huang-Lian-Jie-Du-Tang (HLJDT) on eicosanoid generation via lipoxygenase pathway. J. Ethnopharmacol. 2011, 135, 561–568.

[79] Wang, L., Zhou, G. B., Liu, P., Song, J. H. et al., Dissection of mechanisms of Chinese medicinal formula Realgar-Indigo naturalis as an effective treatment for promyelocytic leukemia. Proc. Natl. Acad. Sci. USA 2008, 105, 4826–4831.

[80] Ung, C. Y., Li, H., Cao, Z. W., Li, Y. X., Chen, Y. Z., Are herb-pairs of traditional Chinese medicine distinguishable from others? Pattern analysis and artificial intelligence classification study of traditionally defined herbal properties. J. Ethnopharmacol. 2007, 111, 371–377.

[81] Kussmann, M., Role of proteomics in nutrigenomics and nutrigenetics. Expert Rev. Proteomics 2009, 6, 453–456.

[82] Kussmann, M., Affolter, M., Proteomics at the center of nutrigenomics: comprehensive molecular understanding of dietary health effects. Nutrition 2009, 25, 1085–1093.

[83] Shiyama, T., Furuya, M., Yamazaki, A., Terada, T., Tanaka, A., Design and synthesis of novel hydrophilic spacers for the reduction of nonspecific binding proteins on affinity resins. Bioorg. Med. Chem. 2004, 12, 2831–2841.

[84] Oda, Y., Owa, T., Sato, T., Boucher, B. et al., Quantitative chemical proteomics for identifying candidate drug targets. Anal. Chem. 2003, 75, 2159–2165.

[85] Wang, G., Shang, L., Burgett, A. W., Harran, P. G., Wang, X., Diazonamide toxins reveal an unexpected function for ornithine delta-amino transferase in mitotic cell division.

[86] Schirle, M., Heurtier, M. A., Kuster, B., Profiling core proteomes of human cell lines by one-dimensional PAGE and liquid chromatography-tandem mass spectrometry.

[87] Burkard, T. R., Planyavsky, M., Kaupe, I., Breitwieser, F. P. et al., Initial characterization of the human central proteome. BMC Syst. Biol. 2011, 5, 17.

[88] Trinkle-Mulcahy, L., Boulon, S., Lam, Y. W., Urcia, R. et al., Identifying specific protein interaction partners using quantitative mass spectrometry and bead proteomes. J. Cell. Biol. 2008, 183, 223–239.

[89] Winger, J. A., Hantschel, O., Superti-Furga, G., Kuriyan, J., The structure of the leukemia drug imatinib bound to human quinone reductase 2 (NQO2). BMC Struct. Biol. 2009, 9, 7.

[90] Kandasamy, K., Mohan, S. S., Raju, R., Keerthikumar, S. et al., NetPath: a public resource of curated signal transduction pathways. Genome Biol. 2010, 11, R3.

[91] Pico, A. R., Kelder, T., van Iersel, M. P., Hanspers, K. et al., WikiPathways: pathway editing for the people. PLoS Biol. 2008, 6, e184.

[92] Xenarios, I., Salwinski, L., Duan, X. J., Higney, P. et al., DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res. 2002, 30, 303–305.

[93] Szklarczyk, D., Franceschini, A., Kuhn, M., Simonovic, M. et al., The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2011, 39, D561–D568.

[94] Kuhn, M., Szklarczyk, D., Franceschini, A., Campillos, M. et al., STITCH 2: an interaction network database for small molecules and proteins. Nucleic Acids Res. 2010, 38, D552–D556.

[95] Wheeler, D. L., Barrett, T., Benson, D. A., Bryant, S. H. et al., Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2008, 36, D13–D21.

[96] Forbes, S. A., Tang, G., Bindal, N., Bamford, S. et al., COSMIC (the Catalogue of Somatic Mutations in Cancer): A resource to investigate acquired mutations in human cancer. Nucleic Acids Res. 2010, 38, D652–D657.

[97] Wishart, D. S., Knox, C., Guo, A. C., Cheng, D. et al., DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008, 36, D901–D906.

[98] Kuhn, M., Campillos, M., Letunic, I., Jensen, L. J., Bork, P., A side effect resource to capture phenotypic effects of drugs. Mol. Syst. Biol. 2010, 6, 343.

[99] Frolkis, A., Knox, C., Lim, E., Jewison, T. et al., SMPDB: The small molecule pathway database. Nucleic Acids Res. 2010, 38, D480–D487.

[100] Masciocchi, J., Frau, G., Fanton, M., Sturlese, M. et al., MMsINC: a large-scale chemoinformatics database. Nucleic Acids Res. 2009, 37, D284–D290.

[101] de Matos, P., Alcantara, R., Dekker, A., Ennis, M. et al., Chemical entities of biological interest: an update. Nucleic Acids Res. 2010, 38, D249–D254.

[102] Smoot, M. E., Ono, K., Ruscheinski, J., Wang, P. L., Ideker, T., Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 2011, 27, 431–432.

[103] Kozhenkov, S., Dubinina, Y., Sedova, M., Gupta, A. et al., BiologicalNetworks 2.0 – an integrative view of genome biology data. BMC Bioinformatics 2010, 11, 610.

[104] Saeed, A. I., Sharov, V., White, J., Li, J. et al., TM4: A free, open-source system for microarray data management and analysis. Biotechniques 2003, 34, 374–378.

[105] Grosdidier, A., Zoete, V., Michielin, O., SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Res. 2011, 39, W270–W277.