Md. Zia Uddin Bio-Imaging Lab, Department of Biomedical Engineering Kyung Hee University.
UCSB Bio-imaging Infrastructure
description
Transcript of UCSB Bio-imaging Infrastructure
Center for Bioimaging Informatics
UCSB Bio-imaging Infrastructure
December 2006
Center for Bioimaging Informaticswww.bioimage.ucsb.eduSupported by NSF ITR Supported by NSF ITR #0331697#0331697
www.bioimage.ucsb.eduSupported by NSF ITR Supported by NSF ITR #0331697#0331697
December 2006 2
Center for Bioimaging Informatics Projects
• Bisque/OME– In use at UCSB
• Bisquik– Next generation system (in construction)
December 2006 3
Center for Bioimaging Informatics Motivation
• Analysis creates knowledge– Image analysis– Querying – Mining
• Available Storage systems are growing• Available Processing increasing
December 2006 4
Center for Bioimaging Informatics Challenges
• Diverse users– UCSB Neurosciences institute
• Retinal detachment : Fischer et al • Microtubule dynamics : Feinstein and Wilson
– CMU Murphy Lab– Flybase– Computable Plant
• Dataset challenges– Multiple types and collection techniques– Complex images (5D) and metadata– Acquisition and management of large sets– Security of private data – Sharing of experimental data
December 2006 5
Center for Bioimaging Informatics Challenges
• Metadata– Collection– Organization– Sharing/interpretation
• Dataset management– Personal collections– Multiple organizations– Privacy and security
• Analysis design• Analysis integration
December 2006 6
Center for Bioimaging Informatics Block System diagram
Image and metadata capture
DBStorage
ImageAnalysis
CollectionAnalysis
Semantic Analysis
Browse SearchContent + metadata
Interactive Enhancement
KnowledgeDiscovery
Ground truthCollection
December 2006 7
Center for Bioimaging Informatics
BISQUEBio-Image Semantic Query User Environment
• Current dataset collection• BISQUE functionality
– Data management – Ground truth acquisition– Analysis
• Architecture
December 2006 8
Center for Bioimaging Informatics Current collections
• UCSB Retinal Objects (confocal, EM)• UCSB Microtubule Objects(light, AFM)
Type Current Backlog
Rate/y
Expected 4Yrs
2D Images
Retinal EM 500 22000 500 23,000 23K
Retinal confocal P
3000 600 2400 10,000 10K
Retinal confocal Z
600 14000 12000 10,000 50K
Microtubule light 3000 2500 2500 13,000 400K
Microtubule AFM 0 500 1200 5000 30K
Flybase 0 125K 0 125K 125K• Total estimated size in TBs and growing for 1 Lab• Complexity and analysis are the main issues
December 2006 9
Center for Bioimaging Informatics
Data management capabilities
• Digital notebook– Direct import process– Reconfigurable (confocal, 4 x microtubule,
etc)
• Web – Web access and browsing– Organize images and metadata– Data sharing environment– Search by metadata or content– Integrated analysis
December 2006 10
Center for Bioimaging Informatics Screenshots (Digital Notebook)
December 2006 11
Center for Bioimaging Informatics
Data management capabilities
• Digital notebook– Capture experimental/image parameters– Direct Import process– Reconfigurable (confocal, 4 x microtubule,
etc)
• Web – Web access and browsing– Organize images and metadata– Data sharing environment– Search by metadata or content– Integrated analysis
December 2006 12
Center for Bioimaging Informatics Screenshots (browsing)
Dataset Browsing
December 2006 13
Center for Bioimaging Informatics Screenshots (search)
December 2006 14
Center for Bioimaging Informatics Screenshots (search)
December 2006 15
Center for Bioimaging Informatics Screenshots (search similar)
December 2006 16
Center for Bioimaging Informatics Screenshots (search similar)
December 2006 17
Center for Bioimaging Informatics
Screenshots (personal collection)
Data sharing
December 2006 18
Center for Bioimaging Informatics Screenshots (5D Viewer)
December 2006 19
Center for Bioimaging Informatics Screenshot (5D with tracks)
December 2006 20
Center for Bioimaging Informatics Image analysis
• Integrated image analysis– Image enhancement– Cell counter (ImageJ)– Quantify microtubule dynamics– Microtubule tracker (manual, automatic)– Track identification
• Integration in progress– Segmentation and classification– Modeling microtubule dynamics– Relevance feedback improving search
December 2006 21
Center for Bioimaging Informatics Screenshots (Image Analysis)
Image J Cell
counter
December 2006 22
Center for Bioimaging Informatics System description overview
• Current dataset collection• Current functionality
– Data management – Ground truth acquisition– Analysis
• Architecture
December 2006 23
Center for Bioimaging Informatics
Hardware/software infrastructure
• Hardware– 16 node Cluster
• dual Intel Xeon 3GHZ• Gigabit network switch• 2 TB Storage
• Software– Bisque – OME (Apache, Postgresql)– Linux
December 2006 24
Center for Bioimaging Informatics
Bisque ArchitectureBio-Image Semantic Query User Environment
Image/metadata
server
WEB
DigitalNotebook
image
search
external internal
BISQUE
XML
Research in image processingand indexing
Cell counterlug-in forImageJ
Research in biology
features
analysis
Distributed computing
cluster
Cell counterlug-in forImageJ
Cell counterlug-in forImageJ
Cell counterplug-in for
ImageJ
metadata XML
December 2006 25
Center for Bioimaging Informatics OME Base
• Open Microscopy Environment– “OME is an open source software
project to develop a database-driven system for the quantitative analysis of biological images. OME is a collaborative effort among academic labs and a number of commercial entities”.
• Provides base for image/metadata storage and analysis integration
• Boston: Sorger Lab• Baltimore: Goldberg Lab• Dundee: Swedlow Lab• Madison: LOCI
December 2006 26
Center for Bioimaging Informatics Bisque extensions
• Ongoing extensions with OME– Content based search– Analysis integration with OME
• Segmentation• Cell Counting• MT identification and Analysis • Etc.
– Front end dataset and analysis support– Schema additions for image types and
analysis– (Uncertainty modeling and queries)
December 2006 27
Center for Bioimaging Informatics Bisque
• Built 1st generation w/ >5000 5-D images– Integrated several useful analyses– Being used internally– External Interest
• Immediate future– Continued development
• analysis integration• Dataset integration
– Remote deployments– Integration with other projects
• Flybase : Integration of schema/datasets in progress• Computableplant.org
December 2006 28
Center for Bioimaging Informatics Projects
• Bisque/OME– In use at UCSB– http://hammer.ece.ucsb.edu/bisque
• guest/bioimage
• Bisquik– Next generation system (in construction)
December 2006 29
Center for Bioimaging Informatics Bisque/OME lessons learned
• Getting the “correct” metadata is hard– Correct may need to change or be
reinterpreted– Sql schema difficult to change
• Getting the “right” name is difficult– Different terms used in different labs make
data integration painful
• Analysis integration needs to be easy– Researcher balk at learning complex software
• Users expect a rich experience– Google, flickr, etc have raised the bar
December 2006 30
Center for Bioimaging Informatics New project: Bisquik
• DoughDB: Flexible metadata• Ontology support• Rich user experience
– Web 2.0 : Ajax, SVG – Web based tools: DN, Graphical Annotator
• Programming toolkit– Smaller is better
• Distributed data store• Support for large scale computation• Expected early 2007
December 2006 31
Center for Bioimaging Informatics Block Components
Blob/ImageServer
MetadataRenderer
DoughDB
MetadataAnnotation
Analysis Engine
Permission
Ontology
Remote Access
Web UI
December 2006 32
Center for Bioimaging Informatics Tagging Screen Shot
December 2006 33
Center for Bioimaging Informatics Motivation:
• Current metadata model is inflexible• Adding new experimental images
requires:– Changes to Digital notebook– Changes to Bisque interface– Changes to OME/postgres
• Shouldn’t this be easier?– “Find images tagged with rod-opsin”– “Create a region and specify the object”– “Add experimental metadata to this dataset”
December 2006 34
Center for Bioimaging Informatics Bisquik requirements
• Add new tag/value pairs to any db object– (Foo,2)– (visible-cell, rod)
• Allow multiple tags with same value– (visible-cell, rod)– (visible-cell, muller)
• Support fine-grained tag permission/visibility– Tags have creators and access control
• Support update semantics & preserve history– Timestamp tags– No deletes (except under certain conditions)
December 2006 35
Center for Bioimaging Informatics Bisquik: DoughDB
image OID1
Feature f1
[ … ]Name GH1020
Foo 2
pixels [OID2 OID3]
image OID1
data Server://…
Pixel-type raw
OID1OID4
OID2
December 2006 36
Center for Bioimaging Informatics Bisquik queries
• Find objects (images) with tag “foo”– Select a.foo
• Find images with name “GV100”– Select a.name = “gv100”
• Find images with cellcount < 100– Select a.cellcount < 100
• Find images with a region similar to r1 based on feature f1– Select r.image = i and l2(r.f1, r1[ … ]) < 10
December 2006 37
Center for Bioimaging Informatics DoughDB key features
• No classes or types only tag/value pairs– Open ended data model
• Pair values have owner and acl• Preserves history of annotations• SQL like query language • Simple keyword queries
December 2006 38
Center for Bioimaging Informatics Bisquik ontology support
• “ontology is a data model that represents a domain and is used to reason about the objects in that domain and the relations between them”
• Unstructured tag/value– Great for taggers– Unhappy searchers
• Different labs use different terms for the same object.
December 2006 39
Center for Bioimaging Informatics Bisquik ontology support
• Dictionary of terms and relations• Require (or strongly suggest) that tags
and value are defined before use• Drop into ontology editor when new
values and tags are encountered.• Integrated into search system
– Permit (or offer) ‘alias’ ‘part-of’ ‘related-to’ searches
December 2006 40
Center for Bioimaging Informatics Analysis Engine
• Analysis Components• Analysis Applications• Glue language: python• Component connection library
– Memory for single node– MPI for cluster based execution
• Execution engine– Automatic component placement– Resources and efficient communication
December 2006 41
Center for Bioimaging Informatics Analysis Engine
• Interfaced with DoughDB• All input/output are Pairs or objects• Application executions recorded for data
provenance
December 2006 42
Center for Bioimaging Informatics Bisquik interface
• http://oib.ece.ucsb.edu/bisquik• Some bisque functionality• Flickr-like interface for region tagging• Pair tagging and keyword tagging• Metadata renderers
– Textual (tag)– Graphical (regions, geometry, graph data)– Analysis oriented (histogram)
December 2006 43
Center for Bioimaging Informatics Region Tagging screenshot
December 2006 44
Center for Bioimaging Informatics Bisquik Metadata Annotation
• Unified offline (Digital Notebook) and online manipulation.
• Easy to build annotation forms• Allow “schema” modification “in field”• Permit annotation templates to be
shared between DN and Bisquik• Graphical geometry annotator
December 2006 45
Center for Bioimaging Informatics Blob Image server
• Extensible server for write-once objects– Pixels, Features
• Pluggable transforms/operations– Thumbnails, slices – pixel transforms (watermarks)– Graphical metadata renderers
December 2006 46
Center for Bioimaging Informatics Remote Access
• All basic services are web accessible:– Soap and WSDL– DoughDB, Image server,
• DoughDB pairs have unique 64 bit IDs– Split between machine ID/Pair ID– All pairs are addressable– Query engine processes foreign pairs
December 2006 47
Center for Bioimaging Informatics Bisquik new hardware
• 10 TB disk mirrored array (20TB)• Image Server• Database (backup) server• 8-16 Query/Analysis nodes
Image Server
DB/Backup Server
10TB 10TB
Query Index
1TB
Query Index
1TB
Query Index
1TB
Query Index
1TB
Query Index
1TB
Query Index
1TB
Query Index
1TB
Query Index
1TB
December 2006 48
Center for Bioimaging Informatics Status
• Web UI – Uploading, Tagging, simple searches
• DoughDB– Single node storage and queries
• Analysis Engine– In design
December 2006 49
Center for Bioimaging Informatics Conclusion
• Bisque Team: August, Jiejun, Melissa,Kris
• Bisquik Team:– Interface: August Jiejun Jaechok– Analysis Engine: Kris, Mellisa, Dmitry– Ontology: David(summer),
Verleen(summer), Kris– DoughDB: Kris,Vebjorn
December 2006 50
Center for Bioimaging Informatics Biowall
December 2006 51
Center for Bioimaging Informatics Biowall
• 5x4 1600x1200 displays – (8000x4800 pixels)– 38 MegaPixels
• 1 Head node + 5 slaves• Distributed multihead X (X proxy)• New simplified viewer
– OpenEV image server – Local client – Web/Database client
• 3D visualizations• Working with very large screen areas