CyVerse Education, Outreach, and Training… · Transforming Science Through Data-driven...
Transforming Science Through Data-driven Discovery
CyVerse Education, Outreach, and Training
iDigBio Webinar
Jason Williams – Lead, CyVerse – Education, Outreach, Training
Cold Spring Harbor Laboratory
@JasonWilliamsNY
Webinar Outline
• CyVerse project history and description
• Education, Outreach, and Training
• DNA Subway: Cyberinfrastructure for Education
CyVerse Evolution
iPlant 2008: Empowering a New Plant Biology
iPlant 2013: Cyberinfrastructure for Life Science
CyVerse 2016: Transforming Science Through Data-Driven Discovery
CyVerse Evolution
CyVerse 2016: Transforming Science Through Data-Driven Discovery
Vision: Transforming science through data-driven discovery
Mission: Design, develop, deploy, and expand a national cyberinfrastructure for life science research, and train scientists in its use
More than 30K users, PB of data, and hundreds of publications, courses, and discoveries
What is Cyberinfrastructure?
Platforms, tools, datasets
Storage and compute
Training and support
What is Cyberinfrastructure?
• Data storage
• Software
• High-performance computing
• People
organized into systems that solve problems of size and scope that would not otherwise be solvable.
We are funded by the National Science Foundation
• We are your colleagues and collaborators!
• $100 Million in investment
• Freely available to the community
• Spur national/international collaboration
• Cite CyVerse:
CyVerse.org/acknowledge-cite-cyverse
DBI-0735191 and DBI-1265383
CyVerse Evolution
CyVerse supports all domains of life science
Plant/Microbial
Animal
Biomedical
Ecological/Climate
CyVerse is built for Data
CyVerse product stack
Ready to use Platforms
Foundational Capabilities
Established CI Components
Extensible Services
Ease of Use
How was CyVerse built?
CyVerse Institutions
• We strive to be the CI Lego blocks
• Danish 'leg godt' - 'play well'
• Also translates as 'I put together' in Latin
• If a solution is not available you can craft your own using CyVerse CI components
CyVerse Products
• Ability to access and manage data
• Software to analyze data
• Computing resources
• Skills and help to use software and interpret results
• Metadata management
• Ability to share data and workflows
• Open source sustainable tools
• High-performance and scalable computing
• Ability to automate and collaborate
• Funding spent on science, not software or hardware
Increase Productivity
Ensure Reproducibility
Get Science Done
CyVerse Features
Webinar Outline
• CyVerse project history and description
• Education, Outreach, and Training
• DNA Subway: Cyberinfrastructure for Education
93% of researchers are working with large datasets now (or will be within the next 6 months).
34% feel their institution provides adequate computational resources.
Most also feel they don’t have enough skills to do the bioinformatics their science requires.
– iPlant Surveys (2010-14)
How would you rate your bioinformatics skill level?
[Survey chart; response categories: Never use(d) bioinformatics tools, Beginner, Intermediate, Advanced]
Where are “Bench Biologists” at?
Gaps in needs and skills
Hands-on training
Workshops that reach users where they are
Tools & Services
• Two days; targeted to researchers
• Hands-on learning modules tailored to interests
• Individual consultations
• 1026 participants since 2011
Genomics in Education
• Two days; targeted to educators
• Pair bioinformatics with classroom labs
• Help for generating lesson plans
• 748 participants since 2010
Seminars
What graduate students should know about cyberinfrastructure
Life Science User
• Lab protocols are a primary expression of their research interest
• Accustomed to GUIs
• Writes “quick and dirty,” non-optimized scripts (Perl, Python, R)
• Data driven
• High Throughput
Traditional HPC User
• Writing code is a primary expression of their research interest
• Accustomed to the command line
• Writes performance conscious, parallel code (C, Fortran)
• Simulation based
• Massively Parallel
Hands-on training, workshops, seminars
Global Coverage
Major workshops and seminars at more than 58 academic institutions worldwide
Virtual Training
Reaching the widest audience
Webinars
• ‘Orientation’ for ~700 new users per month
• Quick demos, Q&A
• Published to YouTube
Livestream
• Double-dip on workshops
• Virtual attendees participate through chat
• Recording available instantly
Online learning materials
Documentation, tutorials, and help
Learning Center
• cyverse.org/learning-center
• Tutorials and quick-starts
Wiki
• wiki.cyverse.org
• Documentation
Forum
• ask.iplantcollaborative.org
• Community help/Q&A
Partnering
“Find the best, then share it”
• Partners are crucial to EOT success
• Leverage existing tools
• Formal and informal relationships
• Do what is right for the user
Webinar Outline
• CyVerse project history and description
• Education, Outreach, and Training
• DNA Subway: Cyberinfrastructure for Education
Genomics in Education
Challenge: bringing students into the fold
Research   Education
Students can work with the same data at the same time and with the same tools as research scientists.
DNA Subway
Educational workflows for Genomes, DNA Barcoding, RNA-Seq
✓ Commonly used bioinformatics tools in streamlined workflows
✓ Teach important concepts in biology and bioinformatics
✓ Inquiry-based experiments for novel discovery and publication of data
DNA Subway
Big data in biology
Image credits:
Genome sequencing costs: http://www.genome.gov/images/content/costpergenome2015_4.jpg
Oxford Nanopore sequencer: https://www.nanoporetech.com/
Agricultural drone: http://purdue.imodules.com/s/1461/images/gid1001/editor/alumnus/2014_mar/drones_main.jpg
Fitbit: http://www.fitbit.com/force
• 100K-fold decrease in sequencing costs
• Hand-held sequencers
• Drones
• Biological sensors
Biology is swimming in data
Can you navigate the tools?
What are your challenges in teaching bioinformatics in the classroom?
DNA Subway
Classroom friendly bioinformatics
Faculty identified guiding requirements that shaped the development of CyVerse educational platforms:
• Mix lecture and lab – have a wet bench “hook”
• Student-scientist partnerships – someone has to care about the data
• Co-investigation – projects should potentially lead to publications
• Scale – platforms should support projects multiple classrooms can join.
DNA Subway
Classroom friendly bioinformatics
More than 13,000 users
More than 28,000 student projects in 2015
DNA Subway
Red Line: Genome annotation
Red Line
• Analyze up to 150 KB of DNA sequence
• De novo gene prediction
• Construct evidence-based gene models
• Visualize genome sequence in browser
DNA Subway
Yellow Line: Genome prospecting
Yellow Line
• Analyze DNA or protein sequence
• Search plant genomes using TARGeT
• Explore gene duplications, transposons, and non-coding sequences not detectable in conventional BLAST searches
DNA Subway
Blue Line: DNA barcoding and phylogenetics
• Analyze DNA or protein sequence
• Search plant genomes using TARGeT
• Explore gene duplications, transposons, and non-coding sequences not detectable in conventional BLAST searches
Blue Line
DNA Subway
Green Line: Transcriptome analysis
Green Line
• Examine RNA-Seq data for differential expression
• Use high-performance computing to analyze complete datasets
• Generate lists of genes and fold-changes; add results to Red Line projects
DNA Subway
DNA Barcoding: The “Do Everything” Lab
• Iconic, simple laboratory
• Distributed projects
• Subsumes many important biological concepts
• Integrates genetics with ecology and conservation biology
• Combines lab experimentation with bioinformatics
• Opportunities for student discovery and publication of novel sequences
DNA Subway
DNA Barcoding: Inspiration
DNA Subway
DNA Barcoding: Workflow
Collect samples
Export sequences
Wet-lab Bioinformatics
DNA Subway
DNA Barcoding: Workflow
DNA is extracted using safe, cost-effective kits and equipment
DNA extraction: ~1 hour
PCR and electrophoresis: ~1.5-2 hours
DNA Subway
DNA Barcoding: Workflow
GENEWIZ provides low-cost sequencing
Sequence data available within 24-48 hours
DNA Subway
DNA Barcoding: Student Projects
www.urbanbarcodeproject.org
DNA Subway
DNA Barcoding: 2012 Winners – Content of Ginkgo products
Transforming Science Through Data-driven Discovery
CyVerse Executive Team
Parker Antin
Nirav Merchant
Eric Lyons
Matt Vaughn
Doreen Ware
Dave Micklos
CyVerse is supported by the National Science Foundation under Grant No. DBI-0735191 and DBI-1265383.