Workshop: Introduction to Cytoscape at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014)


Keiichiro Ono, UC San Diego, Trey Ideker Lab

Bioinformatics Summit 2014 4/11/2014

Cytoscape An Open Source Platform for Biological Network Analysis and Visualization


Thanks for Attending!

- Keiichiro Ono
- Cytoscape Core Developer since 2005
- Core module design & implementation
- Area of Interest: Data Integration & Visualization
- University of California, San Diego, Trey Ideker Lab

Made with Cytoscape

Agenda

- What is Cytoscape?

- Data Integration, Analysis, and Visualization with Cytoscape

- Cytoscape Ecosystem

- Cytoscape Future Plan

What is Cytoscape?

An Open Source Platform for Biological Network Data Integration, Analysis and Visualization

Cytoscape

- Open Source (LGPL)
- Free for both commercial and academic use
- Developed and maintained by universities, companies, and research institutions
- De facto standard software in the biological network research community
- Expandable by Apps
- This is why Cytoscape is a platform, not a simple desktop application

Network Data Analysis

Diagram: tools used for network data analysis

- Analysis
  - Graph Analysis: NetworkX, igraph, Cytoscape
  - Python: Pandas, NumPy, SciPy
  - Excel
- Visualization
  - Desktop: Gephi, Cytoscape, matplotlib
  - Web: Cytoscape.js, sigma.js, d3, NVD3, d3.chart, Google Charts
- Data Storage
  - Graph: Neo4j, GraphX
  - Document: MongoDB
  - Relational: MySQL
- Other labels: IPython, 3rd Party Apps, NetworkAnalyzer


Network?

Network = Nodes + Edges

Nodes and Edges in Biology

- Protein - Protein
- Protein - DNA
- Genetic (Epistasis)
  - Synthetic lethality
- Biochemical Reactions
  - Compound - Enzyme - Compound
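The idea that a network is just nodes plus edges can be sketched in a few lines of plain Python. This is only an illustration: the node names and interaction types below are placeholders, not curated interactions, and `build_adjacency` is a hypothetical helper, not part of Cytoscape.

```python
# Toy edge list: (source, interaction type, target).
# Node names and interactions are placeholders, not curated data.
EDGES = [
    ("geneA", "protein-protein", "geneB"),
    ("geneB", "protein-DNA", "geneC"),
    ("geneC", "genetic", "geneA"),
    ("geneC", "protein-protein", "geneD"),
]


def build_adjacency(edges, directed=False):
    """Return {node: set of neighbors} built from an edge list."""
    adjacency = {}
    for source, _interaction, target in edges:
        adjacency.setdefault(source, set()).add(target)
        # Make sure the target is registered as a node even if it has no outgoing edge.
        adjacency.setdefault(target, set())
        if not directed:
            adjacency[target].add(source)
    return adjacency


undirected = build_adjacency(EDGES)
directed = build_adjacency(EDGES, directed=True)

print(sorted(undirected["geneC"]))  # neighbors regardless of edge direction
print(sorted(directed["geneC"]))    # only targets of edges leaving geneC
```

The same edge list gives different neighbor sets depending on whether direction is respected, which is the difference between an undirected interaction network and a directed pathway.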

Cartoon representation of a complex between DNA and the protein p53

Diagram labels: Node, Edge, Node

Undirected Network

Figure: Protein-Protein Interactions (an undirected network whose nodes are labeled with gene symbols such as EP300, PPARG, SMARCA4, and HTT)

Directed Network

KEGG Pathway (TCA Cycle) visualized by Cytoscape KGMLReader

Human-Curated Pathways

KEGG Pathway Visualized by Cytoscape

KEGG Global Map Visualized by Cytoscape

The Challenge in Network Biology

C. elegans Interactome from BioGRID Database

Biological Networks

- Do not tell us anything by themselves
- Just a big hairball…

Module 1

Module 2

In other words…

Module 1

Need a tool to extract meaningful biological modules

Basic Use Case

Networks

Public Interaction Databases

List of Genes

Other Data

Cytoscape is NOT a...

- Cytoscape is a powerful tool, but it cannot do everything. It is not a:
  - Simulator
  - Fully-featured pathway diagram editor
  - Statistical network analysis tool suite
- Still, you can implement these as Apps

Our Focus

Large-Scale Network Analysis and Visualization

Human Interactome data from BioGRID visualized by Cytoscape

Agenda

- What is Cytoscape?
- Data Integration, Analysis, and Visualization with Cytoscape
- Cytoscape Ecosystem
- Cytoscape Future Plan

Introduction to Biological Network Analysis with Cytoscape

Basic Workflow

1. Data Integration (Load Networks and Tables)

2. Data Analysis

3. Visualization

4. Prepare for Publication

Network Data

Annotated Networks

Attributes

Analyzed Data

Protocol Paper

Cline, et al. “Integration of biological networks and gene expression data using Cytoscape”, Nature Protocols, 2, 2366-2382 (2007).

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
  <!-- Created by igraph -->
  <key id="degree" for="node" attr.name="degree" attr.type="double"/>
  <key id="betweenness" for="node" attr.name="betweenness" attr.type="double"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"> <data key="degree">79</data> <data key="betweenness">0</data> </node>
    <node id="n1"> <data key="degree">9</data> <data key="betweenness">167</data> </node>
    <node id="n2"> <data key="degree">18</data> <data key="betweenness">75</data> </node>
    <node id="n3"> <data key="degree">8</data> <data key="betweenness">12</data> </node>
    <node id="n4"> <data key="degree">26</data> <data key="betweenness">210</data> </node>
    <node id="n5"> <data key="degree">29</data> <data key="betweenness">320</data> </node>
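A GraphML file like the excerpt above stores node attributes (here `degree` and `betweenness`) as `<data>` elements keyed by the `<key>` declarations. As a rough sketch of how such a file could be read outside Cytoscape, the following uses only Python's standard `xml.etree.ElementTree`. The embedded document is a shortened, closed-off copy of the first two nodes, and `read_node_attributes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# GraphML elements live in this XML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Shortened copy of the slide's excerpt (first two nodes only), closed so it parses.
GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="degree" for="node" attr.name="degree" attr.type="double"/>
  <key id="betweenness" for="node" attr.name="betweenness" attr.type="double"/>
  <graph id="G" edgedefault="directed">
    <node id="n0"><data key="degree">79</data><data key="betweenness">0</data></node>
    <node id="n1"><data key="degree">9</data><data key="betweenness">167</data></node>
  </graph>
</graphml>
"""


def read_node_attributes(graphml_text):
    """Return {node id: {attribute key: value}} for every node in the graph."""
    root = ET.fromstring(graphml_text)
    # Remember the declared type of each key so numeric values can be converted.
    key_types = {k.get("id"): k.get("attr.type") for k in root.findall("g:key", NS)}
    table = {}
    for node in root.findall("g:graph/g:node", NS):
        row = {}
        for data in node.findall("g:data", NS):
            key = data.get("key")
            value = data.text
            if key_types.get(key) in ("double", "float"):
                value = float(value)
            row[key] = value
        table[node.get("id")] = row
    return table


node_table = read_node_attributes(GRAPHML)
print(node_table["n1"])
```

The result is effectively a node table: one row per node, one column per declared key.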

Import Networks

Network Data Formats

- SIF
- GML
- XGMML
- GraphML
- BioPAX
- PSI-MI
- SBML
- KGML (KEGG)
- Excel
- Text Table
  - CSV
  - Tab
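SIF is the simplest of these formats: each line names a source node, an interaction type, and one or more target nodes, and a line with a single name declares an isolated node. A minimal sketch of a reader follows. It assumes whitespace-delimited fields (so node names may not contain spaces), and `parse_sif` and the sample text are made up for illustration.

```python
def parse_sif(text):
    """Parse SIF text into (set of nodes, list of (source, interaction, target))."""
    nodes, edges = set(), []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            # One line may list several targets for the same source and type.
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Made-up sample: "pp" and "pd" are commonly used labels for
# protein-protein and protein-DNA interactions.
SIF_TEXT = """\
nodeA pp nodeB
nodeA pd nodeC nodeD
nodeE
"""

nodes, edges = parse_sif(SIF_TEXT)
print(sorted(nodes))
print(edges)
```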

Network Data Source

- From your own experiment
- Public database
  - Search known interactions by list of genes

Public Interaction Databases

Which Database?

- Protein - Protein: STRING, IntAct
- Genetic: BioGRID
- Protein - Compound: ChEMBL
- Human-Curated Pathways: KEGG, Reactome, WikiPathways, PathwayCommons

Import Tables

BRCA1

- NCBI Gene ID: 672
- On Chromosome 17
- GO Terms: DNA Repair, Cell Cycle, DNA Binding
- Ensembl ID: ENSG00000012048

Data Tables for Cytoscape

Examples:

- Numeric
  - Gene expression profiles
  - Network statistics calculated in other applications, such as R
  - Confidence scores for edges
- Text (or categorical)
  - GO annotation for genes
  - List of genes related to disease X
  - Targets for FDA approved drugs
  - Genes on KEGG Pathway Y
  - Clusters / groups / communities calculated in external programs
  - …

Your Data Sets

- Anything saved as a table can be loaded into Cytoscape
  - Excel
  - Tab Delimited Document
  - CSV
- As long as a proper mapping key is available, Cytoscape can map them to your networks.

Mapping Key in the Network

Mapping Key in the Table
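The role of the mapping key can be sketched as a simple join: a column in the table is matched against an attribute of the network's nodes, and matching rows are copied onto those nodes. The example below is a plain-Python illustration, not Cytoscape's importer. The node ids, the `gene_symbol` column, and the expression values are all invented.

```python
import csv
import io

# Hypothetical node table of a loaded network: node id -> attributes.
network_nodes = {
    "n1": {"name": "BRCA1"},
    "n2": {"name": "BRCA2"},
    "n3": {"name": "TP53"},
}

# Hypothetical data table (values are made up); XYZ1 is not in the network.
TABLE_CSV = """gene_symbol,expression
BRCA1,2.5
TP53,-1.2
XYZ1,0.3
"""


def map_table_to_nodes(nodes, table_text, network_key, table_key):
    """Copy table columns onto nodes whose network_key equals the row's table_key.

    Returns the set of table keys that matched no node.
    """
    rows = {row[table_key]: row for row in csv.DictReader(io.StringIO(table_text))}
    unmatched = set(rows)
    for attributes in nodes.values():
        row = rows.get(attributes.get(network_key))
        if row is None:
            continue  # no table row for this node
        unmatched.discard(row[table_key])
        for column, value in row.items():
            if column != table_key:
                attributes[column] = value
    return unmatched


unmatched_rows = map_table_to_nodes(network_nodes, TABLE_CSV, "name", "gene_symbol")
print(network_nodes)
print(unmatched_rows)
```

Rows without a matching node (and nodes without a matching row) are simply left alone, which is why choosing a key that both sides really share matters.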


Analysis

Network Analysis

- Filtering
- Calculate network statistics with NetworkAnalyzer
  - Degree distribution, centrality, etc.
- Advanced analysis by Apps
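To make "degree distribution" concrete, here is a small standard-library sketch of what such a statistic computes. It is not NetworkAnalyzer's implementation. The edge list is a toy example and the function names are hypothetical.

```python
from collections import Counter


def degrees(edges):
    """Count how many edge endpoints touch each node (undirected degree)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def degree_distribution(edges):
    """Return {degree value: number of nodes with that degree}."""
    return Counter(degrees(edges).values())


# Toy undirected network.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

node_degrees = degrees(edges)
distribution = degree_distribution(edges)
print(dict(node_degrees))
print(dict(distribution))
```

Nodes with no edges never appear in an edge list, so they are not counted here. A fuller version would take the node set as a separate input.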

Filtering (Selection)

- Simple but powerful feature to extract subnetworks from large data sets
- Select nodes and edges with specific conditions
  - Pick nodes with degree > 5
  - Select edges extracted from publication X
  - Find nodes on KEGG Pathway X
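Conceptually, a filter is a condition evaluated against each row of the node (or edge) table. The sketch below shows that idea in plain Python. The table contents, column names, and `select_nodes` helper are invented for illustration and do not reflect Cytoscape's filter API.

```python
def select_nodes(node_table, predicate):
    """Return the set of node names whose attributes satisfy the predicate."""
    return {node for node, attrs in node_table.items() if predicate(attrs)}


# Hypothetical node table: degree and pathway membership per node.
node_table = {
    "geneA": {"degree": 8, "pathway": "X"},
    "geneB": {"degree": 3, "pathway": "X"},
    "geneC": {"degree": 12, "pathway": "Y"},
}

# "Pick nodes with degree > 5"
hubs = select_nodes(node_table, lambda attrs: attrs["degree"] > 5)

# "Find nodes on Pathway X"
on_pathway_x = select_nodes(node_table, lambda attrs: attrs["pathway"] == "X")

# Conditions can be combined with ordinary set operations.
hubs_on_x = hubs & on_pathway_x
print(sorted(hubs), sorted(on_pathway_x), sorted(hubs_on_x))
```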

Other Selection Command: First Neighbors of Selected Nodes

Ctrl+6

Create New Sub-Network From Selection

Ctrl+N

Summary: Selection

- Create a filter from your biological question:
  - Select all nodes annotated with GO term “DNA repair” that have two or more known interactions
  - Select all genes that directly interact with BRCA1, BRCA2, etc.
- Extract the subnetwork with Ctrl+N
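The two selection steps summarized here, expanding a selection to its first neighbors and then extracting the subnetwork among the selected nodes, can be sketched as follows. This only illustrates the logic. The edges are placeholders rather than real BRCA1 interactions, and the function names are hypothetical.

```python
def first_neighbors(edges, seeds):
    """Return the seed nodes plus every node directly connected to a seed."""
    seeds = set(seeds)
    selected = set(seeds)
    for source, target in edges:
        if source in seeds:
            selected.add(target)
        if target in seeds:
            selected.add(source)
    return selected


def subnetwork(edges, nodes):
    """Keep only edges whose two endpoints are both in the selected node set."""
    return [(s, t) for s, t in edges if s in nodes and t in nodes]


# Placeholder edges, not curated interactions.
edges = [
    ("BRCA1", "geneA"),
    ("BRCA1", "geneB"),
    ("geneB", "geneC"),
    ("geneC", "geneD"),
]

selection = first_neighbors(edges, {"BRCA1"})
sub_edges = subnetwork(edges, selection)
print(sorted(selection))
print(sub_edges)
```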


Visualization

Data Visualization

- Goal: Help others to understand your data
- Emphasize what you want to tell
- Use color, shape, and size of objects effectively!
- Excellent resource for data visualization
  - Tamara Munzner’s Web Site: http://www.cs.ubc.ca/~tmm/

Map Attributes to Visual Properties

Visual Style

- Collection of mappings from Attributes to Visual Properties

Visual Style

- Example Mappings:
  - Expression profiles to node color
  - Object type (protein, compound, complex, etc.) to node shape
  - Interaction type (Protein-Protein, Protein-DNA, etc.) to edge line style
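A visual style can be thought of as a set of functions from attribute values to visual properties. The sketch below shows one continuous mapping (expression to a blue-to-red color) and one discrete mapping (object type to shape). The function names, the value range of -2 to 2, and the shape strings are assumptions chosen for illustration, not Cytoscape's own API or defaults.

```python
def interpolate_color(value, low, high, low_rgb=(0, 0, 255), high_rgb=(255, 0, 0)):
    """Continuous mapping: linearly blend two RGB colors, clamping out-of-range values."""
    if high == low:
        raise ValueError("low and high must differ")
    fraction = min(max((value - low) / (high - low), 0.0), 1.0)
    channels = [round(a + (b - a) * fraction) for a, b in zip(low_rgb, high_rgb)]
    return "#{:02X}{:02X}{:02X}".format(*channels)


# Discrete mapping: one shape per object type (shape names are placeholders).
SHAPE_BY_TYPE = {"protein": "ellipse", "compound": "diamond", "complex": "hexagon"}


def node_style(attrs):
    """Combine both mappings into the visual properties for one node."""
    return {
        "fill": interpolate_color(attrs["expression"], -2.0, 2.0),
        "shape": SHAPE_BY_TYPE.get(attrs["type"], "rectangle"),
    }


style = node_style({"expression": 2.0, "type": "compound"})
print(style)
```

Keeping the mappings separate from the data means the same style can be reapplied to any network that carries the same attribute columns.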

Layouts

Previous Example

Figure: two views of the example yeast network, with nodes labeled by gene name (e.g. GCN4, GAL4, GAL80, STE12)

Real World Examples

http://cytoscape-publications.tumblr.com

Apps

Cytoscape Apps

- Extension programs that add new features to Cytoscape (formerly called Plugins)
- Large App developer/user community
- This is why Cytoscape is so successful in the life science community!

(As of 4/5/2014)

APPS.CYTOSCAPE.ORG

Quick Overview of Apps

Rintaro Saito, Michael E Smoot, Keiichiro Ono, Johannes Ruscheinski, Peng-Liang Wang, Samad Lotia, Alexander R Pico, Gary D Bader, Trey Ideker. “A travel guide to Cytoscape plugins”, Nature Methods, 9 (11), 1069-1076 (2012).


Agenda

- What is Cytoscape?
- Data Integration, Analysis, and Visualization with Cytoscape
- Cytoscape Ecosystem
- Cytoscape Future Plan

Cytoscape Ecosystem

Cytoscape Family

- Version 2.x: Legacy version
  - The last release for 2.x is 2.8.3.
- Version 3.x: Current production version
  - Latest version: 3.1.0
- Important Note: Apps for 2.x are not compatible with 3.x (we have a problem similar to Python's…)

Cytoscape Family

- cytoscape.js: Library for web applications


Cytoscape 3.1.0


Cytoscape.js Network Visualization Library Running on Web Browsers

What is cytoscape.js?

A JavaScript library for network visualization, not a web application!

You need to write some code to use it in web browsers…

For Users

- Complete desktop application for network analysis and visualization
- Written in Java
- Expandable by Apps

For Developers

- A JavaScript library for network visualization, not a web application
- Written in JavaScript
- Expandable by Extensions

Diagram: Cytoscape Desktop (Data Integration, Analysis, Visualization) and Cytoscape.js (Visualization, Minimal Analysis)

Diagram: Cytoscape on Desktop and Web, each with Layout, Visual Style, and Visualization

Integration to Cytoscape

New in Cytoscape 3.1.0: Export Networks and Visual Styles to Cytoscape.js Format
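To give a feel for what a Cytoscape.js-style network document looks like, the sketch below builds the `elements` structure that Cytoscape.js accepts (nodes and edges, each wrapped in a `data` object with `id`, and `source`/`target` for edges). Whether the file written by the Cytoscape 3.1.0 exporter has exactly this shape is an assumption here, and `to_cytoscapejs` is a hypothetical helper, not part of either tool.

```python
import json


def to_cytoscapejs(nodes, edges):
    """Build a dict in the nodes/edges 'elements' layout used by Cytoscape.js."""
    return {
        "elements": {
            "nodes": [{"data": {"id": node}} for node in nodes],
            "edges": [
                {"data": {"id": f"{source}-{target}", "source": source, "target": target}}
                for source, target in edges
            ],
        }
    }


# Toy network with placeholder node ids.
document = to_cytoscapejs(["a", "b", "c"], [("a", "b"), ("b", "c")])
text = json.dumps(document, indent=2)
print(text)
```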


Export to HTML5 Session

Feature for users: you can view networks in web browsers with Cytoscape.js

Under development… We need early adopters for testing!

Future Plan

- Integration with external tools
  - Access from Python, R, Perl, etc.
- More integration with Cytoscape.js
- Cytoscape Cyberinfrastructure (Cytoscape CI)

Cytoscape CI: Background

- The size of the data biologists have to analyze is still growing
- Desktop machines are powerful, but not enough for large-scale data analysis
  - Clusters / Clouds
- Using multiple computing resources as external services is normal for biological data analysis…
- Scalability

Service 1

Service 2

Cytoscape Cyberinfrastructure

Internet

Service 1 Service 2

NDEx (DB)

Web Browser

Cytoscape Desktop

App Development

github.com/cytoscape

Developer Documents

- http://opentutorials.cgl.ucsf.edu/index.php/Portal:Cytoscape3

Collaboration

- Once you are ready to use Cytoscape for real-world problems, the National Resource for Network Biology (NRNB) is always open for collaboration!
- NRNB provides support for both:
  - Scientific Research
  - Application / Tool Development
- nrnb.org

Getting Help

- Two Google Groups
  - cytoscape-discuss@googlegroups.com
  - cytoscape-helpdesk@googlegroups.com
- ANY question is OK!

Part 1: General Introduction

Part 2: Advanced Topics

- Effective Visualization Techniques
- External Tools

Q1. How many of you have never used Cytoscape?

Q2. How many of you regularly use R/Python/MATLAB?

www.cytoscape.org

2014 Keiichiro Ono kono@ucsd.edu