JBrowse within the Arabidopsis Information Portal - PAG XXIII

Vivek Krishnakumar, J. Craig Venter Institute
GMOD, PAG XXIII, San Diego, CA
12/14/2015
araport.org, @araport


Overview

• About Araport

• Extensions to JBrowse

– Hybrid track selector plugin

– Sequence viewer widget

• Araport on GitHub

• Useful tidbits for JBrowse users


About Araport

• Objectives
– Develop community web interface
    • sustainable, fundable and community-extensible
    • hosts analysis modules, visualization tools, user data spaces
– Practice data federation
    • integrate diverse data sets from distributed sources
    • consume and expose data via RESTful web services
– Maintain “gold standard” Col-0 annotation
    • assemble tissue-specific transcripts from publicly available RNA-seq datasets
    • incorporate novel coding and non-coding genes


Araport (https://www.araport.org)

• Explore data

• ThaleMine

• JBrowse

• Science Apps

• Search data

• Quick Search

• BLAST

• Raw data downloads

• Community

• News & Events

• Ask a question

• Job Postings

• Useful Links


Araport Architecture

[Figure: architecture diagram. Recoverable labels: External programs; Portal (www.araport.org); API (api.araport.org) providing authentication, metering, logging, versioning, HTTPS and CORS; Agave Core (metadata, user profile, apps, jobs, systems); ADAMA (service manage, service enroll); computing, storage, databases; ThaleMine, JBrowse; InterMine, CoGe, others; CGI, SOAP, REST; Science Apps.]


Track selection in standard JBrowse installations

Hierarchical selector Faceted selector


Tracks at Araport

• Combination of local and remotely located datasets
• Local tracks are generated from flat files converted to JSON (in the near future, they will be served by InterMine JBrowse web services)
    – Assembly, annotation, expression, similarity and variation related data
    – Normally associated with only a single unit of metadata (e.g. citation or text describing how the data was generated)
• Remote tracks are federated from CoGe via their RESTful API infrastructure
    – Epigenomics datasets collated from the EPIC project
    – Each dataset is associated with a rich set of metadata tag/value pairs (such as experiment name, technique, conditions, etc.)
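The difference between the two kinds of track can be sketched as track configuration entries. The Python below is only an illustration, not the Araport loading code: the key names follow common JBrowse 1.x track configuration, while the labels, the example.org URL and the metadata values are hypothetical placeholders.

```python
import json


def local_track(label, key, citation):
    """Local track: flat files converted to JSON, one unit of metadata."""
    return {
        "label": label,
        "key": key,
        "type": "JBrowse/View/Track/CanvasFeatures",
        "storeClass": "JBrowse/Store/SeqFeature/NCList",
        # {refseq} is a placeholder that is filled in per reference sequence
        "urlTemplate": f"tracks/{label}/{{refseq}}/trackData.json",
        "metadata": {"citation": citation},
    }


def remote_track(label, key, base_url, metadata):
    """Remote track: served over REST, rich tag/value metadata."""
    return {
        "label": label,
        "key": key,
        "type": "JBrowse/View/Track/Wiggle/XYPlot",
        "storeClass": "JBrowse/Store/SeqFeature/REST",
        "baseUrl": base_url,
        "metadata": dict(metadata),
    }


tracks = [
    # Hypothetical local annotation track
    local_track("genes", "Gene models", "Example citation text"),
    # Hypothetical remote epigenomics track (URL is a placeholder)
    remote_track(
        "epic_example",
        "Example epigenomics dataset",
        "https://example.org/rest/experiment/1",
        {"experiment": "Example", "technique": "ChIP-seq", "conditions": "control"},
    ),
]
print(json.dumps({"tracks": tracks}, indent=2))
```

The richer metadata on the remote entry is what makes a faceted selector useful for those tracks, whereas the local entry carries only a single descriptive field.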


Using the JBrowse plugin architecture: ComboTrackSelector

Hierarchical selector Faceted selector


Configuring and enabling the plugin: ComboTrackSelector

Hierarchical selector

• Uses the standard configuration file, trackList.json
• Usually generated programmatically using loading scripts like flatfile-to-json.pl, etc.
• Adding the plugin config stanza below enables dual selectors:

    "plugins" : {
        "location" : "./plugins/ComboTrackSelector",
        "name" : "ComboTrackSelector"
    },

Faceted selector

• Relies on a new configuration file, trackList2.json
• Requires a supporting comma-separated configuration file with metadata about the data
• Both files above are either hand curated or programmatically generated
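Since both the stanza and the metadata file can be generated programmatically, a minimal Python sketch of that step might look as follows. It assumes the configuration has already been loaded into a dictionary; the CSV column names and the sample row are hypothetical, and the exact layout expected by trackList2.json is not shown in the slides.

```python
import csv
import io

# Plugin stanza exactly as given on the slide
PLUGIN_STANZA = {
    "location": "./plugins/ComboTrackSelector",
    "name": "ComboTrackSelector",
}


def enable_plugin(config):
    """Return a copy of the config with the plugin stanza set.

    Note: this overwrites any existing "plugins" entry.
    """
    updated = dict(config)
    updated["plugins"] = dict(PLUGIN_STANZA)
    return updated


def metadata_csv(rows, columns):
    """Render track metadata rows as comma-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


config = enable_plugin({"tracks": []})
# Hypothetical metadata columns and values
table = metadata_csv(
    [{"label": "epic_example", "technique": "ChIP-seq"}],
    ["label", "technique"],
)
print(config["plugins"]["name"])
print(table, end="")
```

In practice the dictionary would be read from and written back to trackList.json with the `json` module, and the CSV text saved next to it.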


Annotated sequence viewers

TAIR SeqViewer


Sequence viewer in JBrowse

• The JBrowse feature details window shows the region sequence and subfeatures separately

• Members of the Arabidopsis community were used to working with TAIR SeqViewer and GBrowse2, but Araport promotes the use of JBrowse over GBrowse2

• Users of our portal missed this functionality at Araport and requested it during the portal's “Preview” phase


Developing a sequence viewer for JBrowse

• Since JBrowse is a client-side application, the widget needed to be lightweight JavaScript for ease of integration
• BioJS (http://biojs.net), a library of reusable JS components, was the perfect fit
• It offers a wide variety of components, including but not limited to:
    – Sequence, Chromosome karyotype
    – Tree, KEGGViewer
    – Cytoscape, Interaction Table


Sequence viewer widget for JBrowse: SeqLighter

• Can be configured as a right click menu option for any track providing gene structure features

    "menuTemplate" : [
        {
            "label" : "View Sequence",
            "iconClass" : "dijitIconDatabase",
            "action" : "contentDialog",
            "content" : "function(track,feature,div ){ return SequenceViewer(track,feature,div)}"
        }
    ],

• Allows toggling annotation of various gene features (exon, intron, start/stop, UTR)
• Flanking region sequences can be added (0.5 to 4 kb)
• Results can be exported to SVG, PNG and JPG
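When many gene-structure tracks need the same menu entry, it can be added to their configuration in bulk. The Python below is a rough sketch of that idea, assuming each track is a dictionary taken from the track list; the helper name `add_view_sequence` and the sample track are hypothetical, while the menu item itself is copied from the slide.

```python
import copy

# Menu item as shown on the slide
VIEW_SEQUENCE_ITEM = {
    "label": "View Sequence",
    "iconClass": "dijitIconDatabase",
    "action": "contentDialog",
    "content": "function(track,feature,div ){ return SequenceViewer(track,feature,div)}",
}


def add_view_sequence(track):
    """Return a copy of the track config with the menu item appended once."""
    updated = copy.deepcopy(track)
    menu = updated.setdefault("menuTemplate", [])
    already_present = any(
        item.get("label") == VIEW_SEQUENCE_ITEM["label"] for item in menu
    )
    if not already_present:
        menu.append(dict(VIEW_SEQUENCE_ITEM))
    return updated


# Hypothetical gene model track without any menu customisation
gene_track = {"label": "genes", "key": "Gene models"}
once = add_view_sequence(gene_track)
twice = add_view_sequence(once)
print(len(once["menuTemplate"]), len(twice["menuTemplate"]))
```

Working on a deep copy keeps the original configuration untouched, and the label check makes the helper safe to run repeatedly over the same file.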


Araport on GitHub

• GitHub organization: https://www.github.com/Arabidopsis-Information-Portal

• Relevant repositories:
    – GMOD/jbrowse (fork)
    – jbrowse-contrib (track config)


Tidbit (1): Generate screenshots programmatically

• JBrowse does not currently have a feature analogous to gbrowse_img to generate an image of a desired region and set of tracks
• It is possible, however, to disable the nav/tracklist/ruler elements of the view by setting the appropriate URL parameters
• Example:
    http://path/to/jbrowse/?tracks=A,B,C&nav=0&tracklist=0&overview=0
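Such URLs are easy to assemble in a script. The following Python sketch builds one from a list of track labels; the function name is hypothetical, and the optional `loc` parameter (a region such as `Chr1:1..10000`) is not on the slide and is included here as an assumption about how a region would be selected.

```python
from urllib.parse import urlencode


def jbrowse_view_url(base_url, tracks, loc=None):
    """Build a JBrowse URL with nav, track list and overview switched off."""
    params = {}
    if loc is not None:
        params["loc"] = loc  # assumption: region parameter, not on the slide
    params["tracks"] = ",".join(tracks)
    params.update({"nav": 0, "tracklist": 0, "overview": 0})
    # Keep commas and colons readable instead of percent-encoding them
    return base_url + "?" + urlencode(params, safe=",:")


url = jbrowse_view_url("http://path/to/jbrowse/", ["A", "B", "C"])
print(url)
# → http://path/to/jbrowse/?tracks=A,B,C&nav=0&tracklist=0&overview=0
```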


• A screenshot of such a view can be captured with server-side JavaScript (Node.js, http://nodejs.org) and a headless WebKit browser, PhantomJS (http://phantomjs.org)
• Both are wrapped in a command-line toolkit, pageres
• Install via the Node package manager, npm:
    $ npm install pageres
• Run a simple command to generate PNG screenshots:
    $ pageres <url> <resolution>
    $ pageres -d 15 \
        "http://path/to/jbrowse/?tracks=A,B,C&nav=0&tracklist=0&overview=0" 1024x768
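To capture many regions in a batch, the pageres call can be driven from a script. This Python sketch only mirrors the command line shown above (delay, URL, resolution); the helper names are hypothetical, and it assumes a `pageres` executable is available on the PATH.

```python
import shutil
import subprocess


def pageres_command(url, resolution="1024x768", delay_seconds=15):
    """Build the argument list for: pageres -d <delay> <url> <resolution>."""
    return ["pageres", "-d", str(delay_seconds), url, resolution]


def capture(url, **options):
    """Run pageres for one URL; raises if the tool is missing or fails."""
    if shutil.which("pageres") is None:
        raise RuntimeError("pageres not found on PATH; install it with npm first")
    subprocess.run(pageres_command(url, **options), check=True)


command = pageres_command(
    "http://path/to/jbrowse/?tracks=A,B,C&nav=0&tracklist=0&overview=0"
)
print(command)
```

Passing the arguments as a list rather than a single shell string means the `&` characters in the URL need no quoting.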


Tidbit (2): Streaming data files from iPlant

• iPlant Data Store (iDS) provides cloud storage to researchers (initial allocation 100 GB)
• Several methods can be used to transfer data to and from iDS:
  https://pods.iplantcollaborative.org/wiki/display/DS/Storing+and+Accessing+Your+Data+in+the+Data+Store
    – Discovery Environment web interface (http://de.iplantc.org)
    – iDrop GUI toolkit
    – iCommands command line interface
• Instructions on a wiki page describe the method used to generate link(s) to BAM/BAI, VCF and GFF files on the iDS:
  https://pods.iplantcollaborative.org/wiki/display/DEmanual/Sending+Genome+Files+to+the+Genome+Browser
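Once a link to a BAM file has been generated, it can be referenced from a track configuration. The Python below is a hedged sketch of such an entry: the key names follow common JBrowse 1.x alignment track configuration, the helper name is hypothetical, and the example.org URLs stand in for real iDS links, which are not given in the slides.

```python
def remote_bam_track(label, bam_url, bai_url=None):
    """Build a track config entry that streams a BAM file from a URL."""
    track = {
        "label": label,
        "key": label,
        "type": "JBrowse/View/Track/Alignments2",
        "storeClass": "JBrowse/Store/SeqFeature/BAM",
        "urlTemplate": bam_url,
    }
    if bai_url is not None:
        # Only needed when the index is not stored next to the BAM file
        track["baiUrlTemplate"] = bai_url
    return track


# Placeholder links; real ones come from the iDS instructions above
track = remote_bam_track(
    "sample_alignments",
    "https://example.org/data/sample.bam",
    "https://example.org/data/sample.bam.bai",
)
print(track["urlTemplate"])
```

Streaming like this generally depends on the hosting server answering partial (byte-range) requests from the browser, so the link should be checked in JBrowse before sharing it.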


Tidbit (2): Streaming BAM files from iPlant


Acknowledgements

• JCVI Team Members

– Engineers: Maria Kim

– Analysts: Chia-Yi Cheng, Benjamin Rosen

• Awesome dev community!

• Matt Bomhoff, Eric Lyons from CoGe

• iPlant Collaborative

• Funding Agencies


Araport Team

• Chris Town, PI
• Lisa McDonald, Education and Outreach Coordinator
• Chris Nelson, Project Manager
• Jason Miller, Co-PI, JCVI Technical Lead
• Erik Ferlanti, Software Engineer
• Vivek Krishnakumar, Bioinf. Engineer
• Svetlana Karamycheva, Bioinf. Engineer
• Eva Huala, Project lead, TAIR
• Bob Muller, Technical lead, TAIR
• Gos Micklem, co-PI
• Sergio Contrino, Software Engineer
• Matt Vaughn, co-PI
• Steve Mock, Advanced Computing Interfaces
• Rion Dooley, Web and Cloud Services
• Matt Hanlon, Web and Mobile Applications
• Maria Kim, Bioinf. Engineer
• Ben Rosen, Bioinf. Analyst
• Joe Stubbs, API Developer, Platform
• Walter Moreira, API Developer, Federation
• Chris Jordan, Database Manager
• Eleanor Pence, Intern
• Chia-Yi Cheng, Bioinf. Analyst
• Seth Schobel, Bioinf. Engineer
• Irina Belyaeva, Software Engineer


THANK YOU!