H3ABioNet Pan African Bioinformatics Network for H3Africa
H3ABioNet
Pan African Bioinformatics Network for H3Africa
H3ABioNet Project Goal
• To build H3ABioNet, a sustainable African Bioinformatics Network, to provide bioinformatics infrastructure and support for the H3Africa consortium.
• H3Africa: NIH and Wellcome Trust funded initiative to encourage genomic research in Africa on relevant health issues. Funding:
– Research projects (kidney disease, diabetes, cardio-metabolic disease, TB, Trypanosomiasis etc.)
– Collaborative centres
– Biorepositories
H3Africa data expected
[Figure: Types of Data To Be Generated by H3Africa Projects. Bar chart of the percentage of H3Africa projects (N=9) generating each type of data: SNP Arrays; NGS Data (Exome, RNA Seq, Full Genome); Other (Biorepository).]
[Figure: Sequencing Platform Used. Bar chart of the percentage of H3Africa projects (N=9) using each sequencing platform: Illumina; 454; Other.]
50% exome, 40% full genome, 10% metagenomics
Storage required for processed data for >30k samples: 52TB
Storage required for processed data for >12k samples: 2.5PB
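As a rough check on the scale these two figures imply, the average per-sample footprint can be worked out directly. The sketch below assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB) and treats the ">30k" and ">12k" sample counts as exactly 30,000 and 12,000, so the results are upper bounds on the average. The function name is illustrative only.

```python
# Rough per-sample storage implied by the two totals quoted above.
# Assumptions (not from the slides): decimal units, and the ">30k" / ">12k"
# sample counts taken as exactly 30,000 and 12,000.

GB_PER_TB = 1_000
GB_PER_PB = 1_000_000


def gb_per_sample(total_gb: float, samples: int) -> float:
    """Average storage per sample in GB."""
    return total_gb / samples


smaller = gb_per_sample(52 * GB_PER_TB, 30_000)
larger = gb_per_sample(2.5 * GB_PER_PB, 12_000)

print(f"52 TB over 30k samples: {smaller:.1f} GB per sample")
print(f"2.5 PB over 12k samples: {larger:.1f} GB per sample")
```

The two averages differ by roughly two orders of magnitude, which suggests the totals describe different kinds of data; the slides do not say which.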
H3Africa’s needs
• Large-scale data analysis for:
– Genotyping by arrays
– Next generation sequencing
– GWAS
• Support for analysis:
– General questions
– Access to computing resources
– Technical computing support
• Data access and visualization (public and new)
• Data storage, backup and transfer
• Data submission
• Training
Approach
• Formalize existing networks and connections between bioinformatics institutions
• Determine capacity at participating nodes, identify gaps
• Determine bioinformatics needs of funded projects
• Formulate plan to fill gaps and address needs through:
– Training
– Computing infrastructure development
– Research into new tools
– Effective communication
– Interaction/collaboration with foreign institutions
Network structure
Partner institutions
Administrative hub at UCT
34 partner institutions: 32 in 15 African countries, 2 in USA
Nodes are full, associate or development nodes
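The counts on this slide can be cross-checked against the acknowledgements list at the end of the deck. This is a minimal sketch, assuming each person named there represents one partner institution (the slides do not state this explicitly). The per-country numbers are tallied by hand from that list.

```python
# Partner counts per country, tallied from the acknowledgements list
# in this transcript (assumption: one entry per named partner).
partners_by_country = {
    "Botswana": 1, "Egypt": 1, "Ghana": 2, "Kenya": 1, "Malawi": 1,
    "Mali": 1, "Mauritius": 1, "Morocco": 8, "Niger": 1, "Nigeria": 2,
    "South Africa": 7, "Sudan": 1, "Tanzania": 3, "Tunisia": 1,
    "Uganda": 1, "USA": 2,
}

# Everything except the USA entries is an African partner.
african = {c: n for c, n in partners_by_country.items() if c != "USA"}

total_partners = sum(partners_by_country.values())
african_partners = sum(african.values())
african_countries = len(african)

print(total_partners, african_partners, african_countries)
```

Under that assumption the tally agrees with the slide: 34 partners in total, 32 of them in 15 African countries.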
Project activities
[Diagram: activities by project stage. Stages, in order: Sample collection, Data generation, Data analysis, Data submission. Each row below lists its entries in that stage order.]
• Research/tools: Patient database; Data storage & management; Pipelines and tools; Data submission scripts
• User support: SOPs, ontologies; Storage & transfer facilities; Running of pipelines, analysis; Data submission process
• Infrastructure: Database servers; Databases; Access to hardware; Storage area of EGA files
• Training: Training in patient DB; Training in data management; Training in data analysis; How to submit
Research and tools
• Data management and storage
– Patient databasing protocols
– BioMart for genotyping data
– Grid-based tool for data storage
• Data analysis tools
– Galaxy for NGS and Genotyping analysis
– Genome assembly pipeline
– Functional SNP calling pipeline
– Admixture mapping and network tools
– Data visualization tools
– Recombination tools
– Structural SNP analysis tool
• Joint research projects
User support
• Support provided to H3Africa research projects on:
– Small bioinformatics queries (scripts etc.)
– Data management
– Large-scale analysis
– Access to computing infrastructure (data & hardware)
– Submission to EGA
• Support provided through:
– Helpdesk
– Training
Infrastructure development
• Technical support: programming, pipeline development, sys admin
• Access to HPC, Cloud, eBioKits
• Access to public data
• Standard operating procedures and guidelines for data analysis
• Tools/support for data submission
• Storage, movement & management of data
H3ABioNet training
• Researcher training
• Train-the-trainer
• Graduate training MSc/PhD/Postdocs
– Shared course curriculum
– Co-supervision across nodes
– Specialised courses
• Technical staff
– Computing/sys admin courses
– HPC, Cloud, data management
• Internships with external partners
Course access & coordination
• Technical course webcast using USTREAM
• Bioinformatics course live Vidyo streaming to 2 sites, Tunisia and Nigeria
• All lectures are recorded
• Coordinate with other training programs (Wellcome Trust, EMBO, etc.)
• Will work with GOBLET on training objectives, quality etc.
Measuring our success
• Assess and track node capability
• 6-monthly reports to NIH
• Assess and track other metrics:
– Students graduated
– Publications
– Grants
– Infrastructure
– Node size
Node assessment & accreditation
• Assessing internet capacity at each node
• Node assessment exercises
• Set of workflows with simulated data covering:
– Analysis of NGS data (exome and full genome)
– Variant calling
– Genotyping array data analysis
– GWAS analysis
• Nodes can undertake exercise at any time
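One way to picture the assessment exercise is as a checklist per node over the four workflow areas listed above. The sketch below is purely illustrative: the `NodeAssessment` class, the node name, and the rule that a node is finished once all four areas are completed are assumptions, not details given in the slides.

```python
from dataclasses import dataclass, field

# The four workflow areas named on the slide.
REQUIRED_WORKFLOWS = (
    "NGS analysis (exome and full genome)",
    "Variant calling",
    "Genotyping array data analysis",
    "GWAS analysis",
)


@dataclass
class NodeAssessment:
    """Hypothetical record of one node's progress through the exercises."""
    node: str
    completed: set = field(default_factory=set)

    def record(self, workflow: str) -> None:
        # Reject names that are not one of the listed workflow areas.
        if workflow not in REQUIRED_WORKFLOWS:
            raise ValueError(f"unknown workflow: {workflow}")
        self.completed.add(workflow)

    def remaining(self) -> list:
        # Keep the slide's ordering of workflow areas.
        return [w for w in REQUIRED_WORKFLOWS if w not in self.completed]

    def is_complete(self) -> bool:
        # Assumption: a node is done once every area has been completed.
        return not self.remaining()


example = NodeAssessment(node="example-node")  # placeholder name
example.record("Variant calling")
example.record("GWAS analysis")
print(example.remaining())
print(example.is_complete())
```

Because exercises can be recorded in any order, this shape fits the point that nodes may undertake the exercise at any time.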
Internal database
How to ensure success
• Funding!
• Good management structure
• Effective and regular communication
• Keeping everyone active and feeling a part of the activities
• Providing access to new opportunities
• Facilitating new projects and funding opportunities
• Always keep an eye on milestones
Management structure
Communication
• Meetings:
– WG meetings fortnightly, MC monthly
– GA and SAB meet annually
• Web portal:
– Members' info and expertise
– Documents
– Tools and resources, etc.
• Mailing lists, monthly bulletin
H3ABioNet website (public)
H3ABioNet website (private)
Monthly Bulletin
Have we been successful?
• Too early to tell, many challenges
• Some success stories
– >30 new staff or students
– >100 people trained
– Communication structures established
– New projects and collaborations initiated
– Funding proposals written
– Egypt centre established
Acknowledgements
Funding: NIH/NHGRI
Name | Institution | Country
Simani Gaseitsiwe | Botswana Harvard AIDS Institute Partnership | Botswana
Ahmed Mansour Alzohairy | Zagazig University | Egypt
James Brandful | NMIMR | Ghana
Ellis Owusu-Dabo | KNUST | Ghana
Daniel Masiga | ICIPE | Kenya
Dean Everett | Malawi-Liverpool Wellcome Trust Clinical Research Programme | Malawi
Seydou Doumbia | University of Bamako | Mali
Yasmina Jaufeerally Fakim | SANBio | Mauritius
Hassan Ghazal | University Mohammed First | Morocco
Azedine Ibrahimi | Faculte de Medecine et de Pharmacie de Rabat | Morocco
Ahmed Moussa | Abdelamlek Essaadi University, Tangier | Morocco
Fouzia Radouani | Pasteur Institute Casablanca | Morocco
Fouad Seghrouchni | Institut National d'Hygiène, Rabat | Morocco
Fatima Gaboun | National Institute of Agronomic Research, Rabat | Morocco
Khalid Sadki | Mohammed V University, Rabat | Morocco
Alami Raouf | Centre National de Transfusion Sanguine, Rabat | Morocco
Odile Ouwe Missi | CERMES | Niger
Ezekiel Adebiyi | Covenant University Bioinformatics Research | Nigeria
Oyekanmi Nash | NADBA | Nigeria
Nicky Mulder | University of Cape Town | South Africa
Judit Kumuthini | CPGR | South Africa
Nicki Tiffin | SANBI, University of the Western Cape | South Africa
Ozlem Tastan Bishop | Rhodes University | South Africa
Scott Hazelhurst | Wits University | South Africa
Fourie Joubert | University of Pretoria | South Africa
Hugh Patterton | University of the Free State | South Africa
Faisal Fadlelmola | Future University | Sudan
Sylvester Lyantagaye | University of Dar es Salaam (UDSM) | Tanzania
Nzovu Ulenga | MDH | Tanzania
Julie Makani | MUHAS | Tanzania
Alia Benkahla | Institute Pasteur of Tunis | Tunisia
Jonathan Kayondo | UVRI | Uganda
Victor Jongeneel | NCSA | USA
Win Hide | Harvard School of Public Health | USA
Thanks: Sumir Panji, Project Manager