Basic Skills in Scientific Research and Publication
Ganesha Associates
Physio lecture 1: Introduction, Hypotheses and Search
13/08/2013
Publishing is an essential research skill
Preparation → Journal Selection → Writing → Submission → Peer Review → Publication → Success
• determining likelihood of acceptance
• citation management
• navigating a submission system in a second language
• writing an outline
• comparing journals
• assessing relevance to research topic
• understanding comments
• long decision timelines
• decision to re-submit, or try a different journal
• publication ethics
• writing in English
• formatting to guidelines
Me…
• BSc Physics 1971, PhD Neuroscience 1976, post doc 1975-1979
• Visiting Professor, UFPe 1978-79
• Editor, Publisher, Director at Elsevier Science 1979 – 2005
• Pubmed systems expert, NCBI, NIH 2006-2007
• STM business analyst, Outsell Inc, 2009-2011
• Visiting Professor UFPe, 2006, 2007, 2008, 2012, 2013
Ganesha Associates CC BY 3.0 5
The scientific process involves making models of how things work
• These evolving models are described in the scientific literature
• Sometimes the models are wrong, often they are incomplete
• Scientific progress is driven by the communication and publication of the results of new research, and the reinterpretation of older work
• The tool which makes all of this possible is the hypothesis
9 September 2013
Experimental and observational types of research
Experimental vs. Observational studies
Observational studies
• No modification of experimental variables
• Useful to discover trends and associations
• Cannot directly be used to infer causality
Experimental studies
• Compare responses to different treatments
• Designed to avoid misleading results, e.g. randomisation
• Can be used to infer cause and effect
Main learning points
• Student projects fall into three categories
– No hypothesis, i.e. observational
– Weak hypothesis
– Strong hypothesis
• The work will be published in a
– National journal
– Low impact factor journal
– High impact factor journal
• Starting with a strong hypothesis improves your chances of getting published in a good journal
What is a strong hypothesis ?
• A strong hypothesis is based on a series of premises – things that are already known with some certainty
• Each premise must be supported by references back to the (international) primary literature
• So a strong hypothesis will be backed by references to recent papers in high quality journals
Coin-tossing - an example
• I wonder how many heads or tails I will get if I toss this coin 100 times
– No model
• The frequency distribution of heads and tails will be approximated by a binomial distribution with n=100 and p=0.5
– Simple model, based on symmetry
• A detailed analysis of the dynamics reveals that the probability of a head is 0.51
– Complex model, based on asymmetry, aerodynamics, etc.
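The gap between the simple model (p=0.5) and the complex model (p=0.51) can be made concrete with a short calculation. The following is a minimal Python sketch using only the standard library. The function names and the choice of two standard errors as a separation threshold are illustrative assumptions, not part of the lecture.

```python
from math import ceil, comb, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mean_and_sd(n: int, p: float) -> tuple:
    """Mean and standard deviation of the number of heads in n tosses."""
    return n * p, sqrt(n * p * (1 - p))


def tosses_to_separate(p0: float, p1: float, z: float = 2.0) -> int:
    """Rough number of tosses at which the two expected proportions
    differ by z standard errors (normal approximation; z is an assumed threshold)."""
    sd_single_toss = sqrt(p0 * (1 - p0))  # standard error of a proportion is this / sqrt(n)
    return ceil((z * sd_single_toss / abs(p1 - p0)) ** 2)


if __name__ == "__main__":
    for p in (0.5, 0.51):
        mean, sd = mean_and_sd(100, p)
        print(f"p={p}: mean={mean:.1f} heads, sd={sd:.2f}")
    print("tosses needed to tell the models apart:", tosses_to_separate(0.5, 0.51))
```

With n=100 the two models predict almost the same outcome, and the estimate is on the order of ten thousand tosses before they can be told apart. This is why only a mechanism-based model makes such a small bias worth testing for.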
Coin-tossing – impact on CV
1. None, or possibly negative
2. R. A. Fisher and others did perform this experiment in the early days of biological statistics, before the advent of computers, as a proof that the binomial distribution tended towards a normal one at high levels of n. Interestingly they all found that the probability of a head p was usually slightly higher than 0.5, but this difference was ignored.
3. Persi Diaconis, Susan Holmes and Richard Montgomery (Stanford, 2004) publish a paper on the ‘Dynamical bias in the coin toss’ showing that the physics of a real toss is slightly biased: a coin is more likely to land the same way up as it started, with a probability of about 0.51.
Coin tossing - relevance
• I think that there will be an association (+ or -) between mutations in gene x and susceptibility to disease y
– No causal basis for a relationship given
• I predict that mutations in gene x will increase susceptibility to disease y because patients with disease y often have low levels of gene product x.
– Built-in control: patients with normal levels of the gene product should not have the disease.
• I predict that chemically non-neutral mutations in gene x will increase susceptibility to disease y in patients with low levels of gene product x.
– Second level of control: neutral mutations should be asymptomatic
Coin-tossing – moral of the story
• With a strong hypothesis, you:
– Avoid following leads which go nowhere – false positives, fail early
– Avoid ignoring unexpected observations that are of high interest – false negatives
– May need to do less work!
– Will get published in better journals!
Case study: Hummingbird territorial behaviour
Most hummingbird species demonstrate strong territorial behavior
If a bluffing charge attack does not work, the resident may engage the trespasser in a brief but intense physical battle
So why do hummingbirds defend territories ?
H0: Hummingbirds are randomly distributed in space and time.
Hummingbird territorial behaviour
Hummingbird territorial behaviour
H1
If territory = F(energy), then behavior not species-dependent
If territory = F(mating), then behavior should be species and sex dependent
If…
If…
Territorial behaviour in 1971
• Time, Energy, and Territoriality of the Anna Hummingbird (Calypte anna) Science 173 (1971) 818-821.
• When territory quality decreases defenders may switch to less expensive forms of defense because the energy savings outweigh the loss of resources
• Augmented territorial defense during the breeding season is made possible by increased feeding efficiency due to the availability at this time of very nectar-rich flowers.
• Individuals with large territories are more successful reproductively.
Hummingbird territoriality since
• Digestive physiology is a determinant of foraging bout frequency in hummingbirds. Nature. 1986 Mar 6-12;320(6057):62-3.
• Mitochondrial respiration in hummingbird flight muscles. Proc Natl Acad Sci U S A. 1991 Jun 1;88(11):4870-3.
• Cloning and analysis of the gene encoding hummingbird proinsulin. Gen Comp Endocrinol. 1993 Jul;91(1):25-30.
• Flight and size constraints: hovering performance of large hummingbirds under maximal loading. J Exp Biol. 1997 Nov;200(Pt 21):2757-63.
Hummingbird territoriality since
• Hovering performance of hummingbirds in hyperoxic gas mixtures. J Exp Biol. 2001 Jun;204(Pt 11):2021-7.
• Adipose energy stores, physical work, and the metabolic syndrome: lessons from hummingbirds. Nutr J. 2005 Dec 13;4:36.
• Neural specialization for hovering in hummingbirds: hypertrophy of the pretectal nucleus Lentiformis mesencephali. J Comp Neurol. 2007 Jan 10;500(2):211-21.
• Three-dimensional kinematics of hummingbird flight. J Exp Biol. 2007 Jul;210(Pt 13):2368-82.
Hypothesis lecture learning points
• Good hypotheses build directly onto previous work
• So they need to become technically more sophisticated over time moving from the general to the particular
• A given problem can be associated with a number of very different hypotheses – your experiments should include tests to exclude these alternative explanations
Hypothesis lecture learning points
• Hypotheses can be weak (observational) or strong (mechanism-based)
• For example, a hypothesis which predicts that a tossed coin will end up ‘heads’ 50% of the time is much weaker than one that can predict the exact sequence of ‘heads’ and ‘tails’
• So hypothesis ‘quality’ is important
Types of scientific output
• Abstracts
• Primary journal articles
– peer-reviewed interpretations of original research
• Reviews
• Book chapters, monographs
• Conference proceedings
• Lectures, seminars
• Sequences, data sets
• Patents, other forms of intellectual property
• Blogs, tweets…
14 May 2013 Ganesha Associates 27
Some sources of scientific content
• Google
• PubMed/Medline (NLM)
• Scopus (Elsevier)
• Web of Science (Thomson Reuters)
• Google Scholar
• PubMed Central, Europe PubMed Central
• SciELO, Biblioteca Virtual em Saude
• ScienceDirect, Ovid, SpringerLink, Wiley Online Library, BioMed Central, Public Library of Science, SwetsWise…
• CAPES Portal de Periódicos
Each source is different
• Free
– Google, Google Scholar, PubMed Central
• Subscription
– Scopus, ScienceDirect
• Abstracts and citations only
– PubMed, Web of Science
• Full text, single publisher
– SpringerLink
• Full text, many publishers
– PubMed Central, SwetsWise Online Content
Classify sources of content
[Grid: sources classified by content (abstract only vs. full text) and by access (free access vs. subscription)]
You can get access if…
• The journal is subscribed to by CAPES
• You have a personal subscription
• The journal is of the ‘Open Access’ type
– Note: some journals only make their content ‘Open Access’ after 6 months or longer. Some journals contain a mixture of OA and non-OA articles. See http://europepmc.org/journalList for more info.
• Journals in the ‘red’ categories are available anywhere.
• Most journals subscribed to by CAPES will be available from more than one source.
• CAPES journals are only available from computers within the University network unless you have remote access privileges.
So which sources should I use ?
• No single source contains all of the articles relevant to your research
• Google has the broadest coverage, but not all of the documents you find will be peer-reviewed articles
• Scopus, WoS and PubMed give you the best balance between quality and quantity, and, in theory, should link to all the content subscribed to by CAPES, plus OA content.
Indexing
• The purpose of an index is to optimize speed and performance in finding relevant documents for a search query.
• Without an index, the search engine would have to scan every document in the corpus, which would require considerable time and computing power.
• For example, while an index of 10,000 documents can be queried within milliseconds, a sequential scan of every word in 10,000 large documents could take hours.
• A common type of index used for document search is the “inverted index”
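As an illustration of the inverted index mentioned above, here is a minimal Python sketch: each word maps to the set of documents that contain it, so a query only touches the entries for its own words instead of scanning every document. The three-document corpus and the function names are hypothetical, and real search engines add tokenization, ranking and compression on top of this idea.

```python
from collections import defaultdict


def build_inverted_index(documents: dict) -> dict:
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return ids of documents containing every word of the query (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical three-document corpus, for illustration only.
DOCS = {
    1: "hummingbird territorial behaviour and energy",
    2: "mitochondrial respiration in hummingbird flight muscles",
    3: "energy stores and the metabolic syndrome",
}

if __name__ == "__main__":
    idx = build_inverted_index(DOCS)
    print(search(idx, "hummingbird energy"))
```

The index is built once, and every later query is answered by a few set lookups and intersections.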
24 August 2012
Search: how the result list is ranked
• Date of publication
• Relevance
– Frequency with which search terms occur in the document
– Proximity of search terms
• Google’s PageRank algorithm uses “link popularity” - a document is ranked higher if there are more links to it
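Relevance ranking by term frequency, the first criterion listed above, can be sketched in a few lines of Python. This is a deliberately simplified illustration with hypothetical documents and function names. It does not model proximity or link popularity, and it is not how any particular database actually scores results.

```python
def relevance(text: str, query: str) -> int:
    """Count how often the query words occur in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def rank(documents: dict, query: str) -> list:
    """Return ids of matching documents, highest term frequency first."""
    scored = [(relevance(text, query), doc_id) for doc_id, text in documents.items()]
    hits = [(score, doc_id) for score, doc_id in scored if score > 0]
    # sorted() is stable, so documents with equal scores keep their original order.
    return [doc_id for score, doc_id in sorted(hits, key=lambda pair: -pair[0])]


# Hypothetical documents, for illustration only.
DOCS = {
    "a": "the coin landed heads",
    "b": "coin toss bias in a coin toss experiment",
    "c": "hummingbird flight",
}

if __name__ == "__main__":
    print(rank(DOCS, "coin toss"))
```

Replacing the scoring function changes the order of the result list without changing which documents match, which is one reason the same query looks different in different databases.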
So…
• Using the same search terms will produce different results in different databases because:
– Content will be different
– Preparation of search terms will be different, e.g. only PubMed uses MeSH terms
– Indexing process, implementation of stemming, removal of stop words will be different
– Ranking algorithms will be different
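To see why the preparation of search terms matters, the sketch below applies stop word removal and a very crude stemmer to a query. The stop word list and the suffix rules are assumptions chosen for illustration. Each real database uses its own list and a more careful stemming algorithm, which is exactly why results differ.

```python
# Illustrative stop word list; every search engine uses its own.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for"}


def crude_stem(word: str) -> str:
    """Strip a few common English suffixes; real stemmers are far more careful."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def prepare_query(query: str, remove_stop_words: bool = True, stem: bool = True) -> list:
    """Turn a typed query into the list of terms the engine actually looks up."""
    terms = query.lower().split()
    if remove_stop_words:
        terms = [term for term in terms if term not in STOP_WORDS]
    if stem:
        terms = [crude_stem(term) for term in terms]
    return terms


if __name__ == "__main__":
    query = "side effects of aspirin"
    print(prepare_query(query))                                       # one engine's view
    print(prepare_query(query, remove_stop_words=False, stem=False))  # another engine's view
```

Two engines configured differently end up looking up different terms for the same typed query.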
The question behind the query
• Search engines think in terms of words, but users think in terms of sentences!
– How do you spell Bousfield?
– What do we know about BRCA1?
– Given these symptoms, what is the most likely diagnosis?
– What are the side effects of aspirin?
– Has this chemical structure been synthesized before?
• “Cancer causes X” vs. “Y causes cancer”
What real queries look like - Google
• pharmacogenomics and disorders
• bacteria growth casein media effect
• waal pseudomonas
• TRPM2 PCR mouse
• Chitinases in carnivorous plants
• glycerophosphoinositol 4-phosphate
• Dai N, Gubler C, Hengstler P, Meyenberger C, Bauerfeind P. Improved capsule endoscopy after bowel preparation. Gastrointest Endosc 2005;61(1) 28-31.
What real queries look like - PubMed
• ATR1 HAL2
• Fuzzy[ALL] AND Hanage[AU] AND 2005[DP]
• arndt and rhabdomyosarcoma
• "Vorster HH"[Author]
• (rotavirus infections[majr] OR rotavirus[majr]) AND english[la] AND humans[mh] NOT (editorial[pt] OR letter[pt])
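Fielded PubMed queries like the ones above can also be assembled programmatically. The sketch below rebuilds one of the example queries and wraps it in a URL for the NCBI E-utilities `esearch` endpoint. The helper names `field` and `build_esearch_url` are hypothetical, and the code only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field(term: str, tag: str) -> str:
    """Attach a PubMed field tag to a term, e.g. an author [AU] or date [DP] tag."""
    return f"{term}[{tag}]"


def build_esearch_url(query: str, max_results: int = 20) -> str:
    """Build (but do not send) an esearch request for the PubMed database."""
    params = {"db": "pubmed", "term": query, "retmax": max_results}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    query = " AND ".join(
        [field("Fuzzy", "ALL"), field("Hanage", "AU"), field("2005", "DP")]
    )
    print(query)
    print(build_esearch_url(query))
```

Building the query from tagged parts makes it easy to vary one element at a time, for example the author or the publication date, during a search session.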
Search terms - summary
• Make sure you understand the search term syntax used by your preferred site, e.g. AND, +, “ ”, etc.
• Search engines ‘see’ only certain words, not sentences
• Do not use ‘stop’ words, e.g. a, the, of, before, unless they are part of “a text string search”
• Try to think of different ways to search for the same subject
• Look beyond the first page of search results
A search session may involve many information types...
[Diagram: a session moving between Google searches, search results, AbstractPlus records, full text, BLAST results and an NCBI full_report]
...and sources
[Diagram: Google, Scopus, Web of Science, PubMed, SciELO, HighWire, ScienceDirect, SpringerLink, national literature, CAPES Portal, OA (BMC or PLoS), other databases, e.g. NCBI]
Quick tour
Break
Other types of database
• Some databases contain mainly text, but others contain image, sequence or structural data
• The technologies required to search and retrieve these different data types are very different.
• There is a growing amount of information in publicly available databases.
• For example, in 2013 the Nucleic Acids Research journal online Molecular Biology Database Collection listed 1512 databases.
• The National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) host some of the most important databases used for biomedical research.
Linking different data types is a challenge
[Diagram: a gene expression warehouse linking data types (protein, disease, SNP, enzyme, pathway, known gene, sequence cluster, Affy fragment, sequence, metabolite, NMR) to source databases (LocusLink, MGD, ExPASy SwissProt, PDB, OMIM, NCBI dbSNP, ExPASy Enzyme, KEGG, SPAD, UniGene, GenBank)]
Databases available at NCBI
Other ways to search – BLAST, PubChem, UCSC Genome Browser
By sequence – BLAST:
>DinoDNA from JURASSIC PARK p. 103 nt 1-1200
GAATTCCGGAAGCGAGCAAGAGATAAGTCCTGGCATCAGATACAGTTGGAGATAAGGACGGACGTGTGGCAGCTCCCGCAGAGGATTCACTGGAAGTGCATTACCTATCCCATGGGAGCCATGGAGTTCGTGGCGCTGGGGGGGCCGGATGCGGGCTCCCCCACTCCGTTCCCTGATGAAGCCGGAGCCTTCCTGGGGCTGGGGGGGGGCG
By structure – PubChem:
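Before a sequence such as the DinoDNA record is pasted into BLAST, it helps to check that it is a well-formed FASTA record: a header line starting with ">" followed by the bases. The sketch below is a minimal Python illustration that separates header from sequence and reports a simple property of the sequence. The function names are hypothetical, and only the first 30 bases of the record are used to keep the example short.

```python
def parse_fasta(text: str) -> tuple:
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with '>'")
    header = lines[0][1:]
    sequence = "".join(lines[1:]).upper()
    return header, sequence


def gc_content(sequence: str) -> float:
    """Fraction of bases that are G or C."""
    if not sequence:
        return 0.0
    return sum(base in "GC" for base in sequence) / len(sequence)


# First 30 bases of the DinoDNA record, for illustration only.
RECORD = """>DinoDNA from JURASSIC PARK p. 103 nt 1-1200
GAATTCCGGAAGCGAGCAAGAGATAAGTCC
"""

if __name__ == "__main__":
    header, sequence = parse_fasta(RECORD)
    print(header)
    print(len(sequence), "bases, GC fraction", gc_content(sequence))
```

A sequence search compares the string of bases itself, not words describing it, which is why it needs a different technology from text search.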
Example of BLAST search results
PC Compound Record
Learning points
• Google is a good place to start
• Learn to use several information resources
• Modify your search terms during the course of a search session
• Understand how the results are ranked and don’t just look on the first page