John Darrell Van Horn, Ph.D. Associate Professor.

Big Data for Discovery Science (BDDS): Training Plans

What is Discovery Science?

• Discovery Science (also known as “discovery-based science”) is a scientific methodology that emphasizes the analysis of large volumes of experimental data with the goal of finding new patterns or correlations, leading to hypothesis formation and new scientific results.

• Discovery-based methodologies are often viewed in contrast to traditional scientific practice, where hypotheses are formed before close examination of experimental data.

• However, from a philosophical perspective in which all or most of the observable "low-hanging fruit" has already been picked, examining the phenomenological world more closely opens a new source of knowledge for hypothesis formation.

• Data mining is the most common tool used in discovery science, and is applied to data from diverse fields of study such as brain imaging, DNA analysis, and proteomics.

• The use of data mining in discovery science follows a general trend of increasing use of computers and computational theory in all fields of science. Following this trend further, the cutting edge of data mining employs specialized machine learning algorithms for automated hypothesis formation and automated theorem proving.

http://bd2k.ini.usc.edu

BDDS Training Aims

• Aim 1: Courses. Establish short course offerings focused on large-scale biomedical data informatics. These will be offered at the University of Southern California (USC), the University of Michigan (UM), and the University of Chicago (UC).

• Aim 2: Fellowships. Establish graduate and postdoctoral fellowships for trainees who elect to work and study in our BDDS Center(s).

• Aim 3: Visitors. Create a visiting professorship series providing space, training, and support to enable professors from various institutions to come to our BDDS Center institutions for training and experience.

• Aim 4: Seminars and Workshops. Create topic-specific and audience-appropriate hands-on workshops. Our goal is to create a suite of tutorials on best practices and Big Data solutions that illustrate hands-on use of BDDS workflows, data management tools, data resources, and expertise to address concrete biomedical problems, and to disseminate this suite to broad multidisciplinary audiences.

• Aim 5: Training Materials. Develop interactive training materials for Big Data informatics. These will include online software documentation and tutorials, test datasets, Big Data use cases, educational papers, books, videos, and webcasts with instructional aids to assist in training and spark interest in large-scale biomedical informatics.

Upcoming BDDS Events

• Big Biomedical Data Roundtable (August 4, 2015): Join leading experts in large-scale biomedicine and computer science for a round table discussion of what the future of medical science research looks like from the point of view of “big data”. Featured speakers will include Carl Kesselman (USC), Ian Foster (Chicago), Arthur Toga (USC), and Lee Hood (Inst. for Systems Biology). Held at the University Park Campus of the University of Southern California, this intimate discussion will offer new insights into 21st-century biomedicine and the computational resources needed for new understanding in brain research, genomics, and proteomics, and their combination for advancing science and curing disease.

• Big Data Analysis using the LONI Pipeline: Advanced Neuroimaging, Informatics, and Genomics Computing (September 11, 2015): This day-long event will include paired training and application demonstrations on using different graphical and script-based pipeline workflow architectures to manage, process, analyze, and visualize large volumes of neuroimaging and genetics data. Attendees will learn to use several concrete end-to-end pipeline workflow solutions for brain imaging (sMRI, fMRI, DTI), proteomics, and phenotypic (demographic, genetic, clinical) data in development, aging, and pathology.

• Proteomics Informatics Course in Vancouver, B.C., Canada (September 22-25, 2015, prior to the HUPO World Congress in Vancouver, B.C., Canada): Our BDDS partner, the Seattle Proteome Center (SPC), is pleased to offer a four-day intensive course in the use of a suite of open-source software tools designed for the analysis, validation, storage, and interpretation of data obtained from large-scale quantitative proteomics experiments using stable isotope labeling methods, multi-dimensional chromatography, and tandem mass spectrometry. This will include a detailed introduction to the LONI Pipeline, the construction of scientific workflows, and the use of the Proteomics Toolkit. Through daily lectures and tutorials, each course participant should become proficient in the use of these BDDS-supported tools. (http://www.proteomecenter.org/nav.course_2015-9.php)

California Big Data Brain Workshop 2015

Palm Springs, CA - October 9-11, 2015

• Big Data for Discovery Science (Toga, USC)*

• ENIGMA (Thompson, USC)*

• Center for Big Data in Translational Genomics (Haussler, Santa Cruz)

• Center for Expanded Data Annotation and Retrieval (Musen, Stanford)

• Mobility Data Integration to Insight (Delp, Stanford)

• Translate Protein Data to Knowledge (Ping, UCLA)

• BIOCADDIE (Ohno-Machado, San Diego)

• Integrated Active Learning Framework for Biomedical BD2K (Pevzner, UCSD)

• The BD2K Concept Network (Lee, UCLA)

Key Topics: Among the many topics relevant to BD2K to be discussed are:

• Identifying California-wide thematic linkages on brain research between BD2K research centers

• Organizing the computational needs for multi-site, large-scale brain research in CA

• Reviewing and exploring neuroinformatics concepts, tools, ontologies, challenges, and best practices

• Featuring examples of large-scale neuroscience applications, results, visualization, and clinical outcomes

• Examining the needs for graduate, post-doctoral, etc. training in large-scale biomedical data methods

BDDS Training Plans under Development

• As the breadth and depth of BDDS tools for Big Data increase, we plan to develop the following:

– “Boot camp”-style multi-day “experiences”

– TED-style talks and further Round Table events

– Short courses: University classes, workshops, and programs for conferences and satellite events