INFO 780 Final Project: Dataset Exploration and Analysis


Kathleen Padova, March 21, 2017

Contents:

Introduction and Problem Statement
Data Sources and Data Preparation
    Data Acquisition
    Data Preparation
    Software preparation
Analysis Process and Results
    Bibliometrix Tutorial
    Results
    Future Analysis
Discussion of Challenges and Lessons learned
    Data set size
    Multithreaded processing
    RStudio tips
What do you learn through this project and how you think this project might become useful in the real world
References
Appendix: Bibliometrix Output
    BiblioAnalysis
    Analysis of Cited References
    Authors’ Dominance ranking
    Authors’ h-index
    Lotka’s Law coefficient estimation
    Bibliographic network matrices
    Visualizing bibliographic networks


Introduction and Problem Statement

I would like to continue exploring R by applying it to my research area of knowledge discovery using bibliometric analysis. I have a dataset of ~75k Web of Science records that I had previously downloaded for use in CiteSpace. I found and installed an R package, Bibliometrix (Aria & Cuccurullo, 2017), that appears to do many of the bibliometric analyses and even visualizations that bibliometric researchers usually create. I am not certain that this tool will assist with the "knowledge discovery" part of my research, but that is something I will look for.

Data Sources and Data Preparation

Data Acquisition

I downloaded bibliographic records from Thomson Reuters Web of Science (WoS) (https://webofknowledge.com/). I had previously downloaded only the plain text format, as that can be used in both Bibliometrix and CiteSpace. After the data load with plain text files into R became excruciatingly long, I re-downloaded the records in both BibTeX and plain text formats. My Web of Science session was abruptly stopped and restarted twice during the plain text download, which may have contributed to some of the inconsistencies I experienced later on. I eventually decided to complete the tutorial using only the BibTeX files.

Web of Science Query (3/16/17)

Results: 75,193

(from Web of Science Core Collection)

You searched for: TOPIC: ("multiple sclerosis" and "multiple-sclerosis")

Refined by: DOCUMENT TYPES: ( ARTICLE OR PROCEEDINGS PAPER OR MEETING ABSTRACT )

Timespan: All years. Indexes: SCI-EXPANDED, SSCI, A&HCI, ESCI.

Downloading bibliographic records from WoS.

Since you can only download 500 records at a time, this resulted in 151 BibTeX and 151 plain text files, which took approximately four hours for both sets. I hear that it may be possible to write a script to automate this step and that someone may have done this already; but as this would violate the WoS terms of service, it’s difficult to confirm (or for me to build the script).
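The file count follows directly from the export limit. As a quick check (a Python sketch using only the two numbers stated above; the variable names are my own), 75,193 records at 500 per export gives 150 full files plus one partial file per format:

```python
import math

TOTAL_RECORDS = 75_193  # result count reported by the WoS query
BATCH_SIZE = 500        # maximum records per export, as stated in the text

# Round up so the final, partially filled export is counted as a file.
files_per_format = math.ceil(TOTAL_RECORDS / BATCH_SIZE)

# Number of records left over for the last export.
last_batch = TOTAL_RECORDS - (files_per_format - 1) * BATCH_SIZE

print(files_per_format)  # 151
print(last_batch)        # 193
```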

Data Preparation

I thought I needed to concatenate all of the plain text files into one text file to load into R, and used the cat command in Terminal (macOS) for this step. I understood later that the readFiles function in Bibliometrix could concatenate many source files together. I observed no difference between loading one text file with all records versus using the readFiles function to concatenate the files on loading. I was also having problems connecting to the directory in Box I had set up for all of my project files. I was able to get the direct path from the Box Sync folder I had running; but R did not like making the connection. I eventually copied the entire project folder onto a local drive and worked from there.
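For readers who do not have a Unix shell handy, the concatenation step can be sketched in Python. This is only an illustration of what cat does, not the procedure actually used for this project; the function name, the UTF-8 encoding, and the paths in the usage comment are assumptions.

```python
from pathlib import Path


def concatenate_exports(source_dir, pattern, target):
    """Append every file in `source_dir` matching `pattern` to `target`.

    Files are merged in sorted name order. Returns the number of files merged.
    """
    source_dir = Path(source_dir)
    target = Path(target)
    # Collect the inputs first and skip the output file if it matches the pattern.
    files = sorted(
        p for p in source_dir.glob(pattern) if p.resolve() != target.resolve()
    )
    with target.open("w", encoding="utf-8") as out:
        for path in files:
            text = path.read_text(encoding="utf-8")  # encoding is an assumption
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")  # keep records from running together
    return len(files)


# Hypothetical usage; these paths are placeholders, not the project's real folders:
# concatenate_exports("wos_plaintext", "*.txt", "all_records.txt")
```

Since readFiles can take many source files directly, this step is optional; it only helps when a single combined file is wanted for another tool.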

Creating the data frame in R is where the difference between plain text and BibTeX formats really stood out. The Bibliometrix documentation did warn that the BibTeX format would be faster. I observed 2 hours to convert the plain text files versus 20 minutes for the BibTeX files. I had tried to use the plain text file set for my analysis first. After waiting two hours (started at 4:42 pm, completed at 6:44 pm) to create the data frame, there were 50 or more warnings (which I didn’t understand) and I saw that only 74,716 articles (out of 75,193 WoS records) had been extracted, 477 fewer than the query returned. I didn’t know if I should have expected 75,193 articles or if coming up short was typical. I began my analysis with data frame M: 74,716 observations of 65 variables.

Despite Bibliometrix’s warning about the plain text files being slower to load, there was no indication that the format would affect any of the data processing downstream. I suspected that it wouldn’t, since all the data was now loaded into a data frame. The plain text data probably did affect my analysis, due to some anomalies I suspect had been caused by the session interruption while downloading from WoS. Partway through my analysis, I finally suspected that the problem was in the plain text dataset and restarted with the BibTeX files. When I loaded the BibTeX files, I received no errors and all 75,193 records were in the data frame.

Software preparation

I chose to use RStudio after learning about the Bibliometrix package. The Bibliometrix R package is accessible from within RStudio (Tools > Install Packages) and from http://www.bibliometrix.org. It is currently on version 1.5.

RStudio and Bibliometrix were both downloaded and run locally on an Apple MacBook Pro (15-inch, Late 2016, 2.9 GHz Intel Core i7 processor, 16 GB 2133 MHz memory), macOS Sierra v10.12.3. I mention all these system specs because I really did buy the best, newest, fastest MacBook Pro when they came out last November and it still struggled with the size of the data I was cramming through RStudio. Later, I learned that it doesn’t matter how many cores your system has when RStudio is using only one of them.


Analysis Process and Results

Bibliometrix Tutorial

The Bibliometrix website has a tutorial (http://htmlpreview.github.io/?https://github.com/massimoaria/bibliometrix/master/vignettes/bibliometrix-vignette.html) that ran through most of the functions. I ran through this tutorial several times. The tutorial used a dataset of bibliographic records (query = “bibliometrics”) that anyone using this package should be familiar with enough to understand the function being demonstrated.

Results

I summarize and highlight some of the results of this analysis in this section; all of the output can be found in the appendix.

Bibliographic analysis

I believe the Summary function gave a rather thorough overview of what my dataset contained. A summary is just that, no analysis; but I did get an idea of authors and publications to look out for and expect to see later on. I did encounter new concepts, such as Lotka’s Law and Local Citations, that I need to research later.

The most interesting results to me were the keyword frequencies and keyword networks, which helped my understanding of how the domain is organized. I do have a concern about using current domain structures to try to classify and model older content, which may not have understood or followed those same structures and concepts and has not necessarily been examined for fit within the current domain structure; but this is a rather long rant of mine which is causing all sorts of angst when I consider it.

This does reveal that topics that had not necessarily been associated with each other (NEUROLOGY and IMMUNOLOGY) now appear together very commonly; so commonly, in fact, that there is a sub-domain, NEUROIMMUNOLOGY. I am extremely curious about the keyword EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS that appears in several of the network matrices. From what I know about those three terms separately, I wouldn’t expect them to appear together and certainly not so frequently.

There were a few network matrices that would have been more revealing about the conceptual structure of this domain; but the conceptualStructure and couplingSimilarity functions were unfortunately two of the operations that caused RStudio to crash consistently. It was this repeated failure that inspires me to learn enough about this domain and dataset to apply some intelligent filters to trim the size down.


Visualizing bibliographic networks

The resulting images (which you can see in the appendix) were pretty to look at and at least showed the nodes; but they are not very practical without the corresponding edges. In many of the graphs, the lines between nodes are not rendering. I suspect it may be due to the number of nodes in my dataset overwhelming the renderer. This is definitely an issue to investigate further. I am disappointed that I could not generate several of the network graphs that would have given a better overview of how the domain is conceptually structured. The RStudio session would unexpectedly abort while running the function. I have been tweaking some of the arguments in these functions, but haven’t found the magic combinations yet. (Admittedly, when a function would take an hour or more, I should be looking at the rather large dataset I loaded to take some of the blame.)

Future Analysis

As I become more familiar with the functions, arguments, and output, I plan to manipulate different data and function parameters to better control the output and understand the analyses. I also plan to spend more time understanding the domain as it pertains to the literature this dataset represents. (No, I am not planning on pursuing additional education in neuroscience; but by examining the actual articles and papers behind the bibliographic records, I expect to gain enough understanding of the meaning behind these various analyses.)

Discussion of Challenges and Lessons learned

Data set size

I had quite the internal, still unresolved debate about the appropriate size for a dataset.

• Well, this course DOES talk about big data.

• Many scholarly articles I read in the bibliometrics and literature-based knowledge discovery sub-domains appear to use large datasets.

• Kostoff et al. (2008) felt it was critical that the query for creating the initial set of literature be as complete and comprehensive as possible, so that discovery can explore all possible pathways between literatures and no potential discoveries are missed because of gaps in the pool. In reviewing many other articles on literature-based discovery (LBD) methodologies, this expansive step may have been made more possible due to greater computing power and capacity that earlier researchers did not have access to. Note that Robert Kostoff was employed by the US Department of Defense and likely had access to systems not generally available to academic researchers.

• I do not work for the U.S. Department of Defense or Department of anything and have no access to that type of computing power.

• The size of this dataset proved challenging for the software and computer I have access to.

• I had previously blamed my computer; but now with a newer computer, it’s just plain ridiculous to not address this issue.

• I wanted to keep the query large and broad to familiarize myself with a dataset I hope to continue to use.

• If I intend to continue, I need a much more powerful computer, and I need to learn how to optimize performance and learn the nature of the domain/dataset so that I can refine and focus the query.

Multithreaded processing

Disclaimer: I really don’t understand this topic much (yet) and will most likely be describing it completely wrong.

Apparently RStudio is a single-threaded processing application. It was only using about 12% on one of the four cores in my brand new laptop. In my investigations, I learned that you can download one or more packages to enable parallel processing in R; but there were many warnings about knowing what you are doing when trying to execute this. I will add this to my list of things to learn more about. It also makes me wonder if this holds back the use of R in more business applications.

RStudio tips

• Save the workspace after every major data frame creation/modification. This file does get rather large and takes a bit of time to save and reload; but it saves time on recreating data frames and subsets in the event that the RStudio session unexpectedly aborts.

• The saved workspace doesn’t save the loaded libraries or the working directory. They have to be reset/reloaded every time.

• Capitalization matters. And typing, which I struggle with.

• Use the up and down arrow keys to cycle through previously executed code to re-execute and modify if necessary (e.g. when you’ve made a typo and don’t want to retype the entire line).

What do you learn through this project and how you think this project might become useful in the real world.

I am glad that I was able to explore this dataset using RStudio and the Bibliometrix package. I had previously tried unsuccessfully using CiteSpace; not through any failing of CiteSpace, but probably because the dataset was too large for suitable analysis in the tool and because of the insufficient computing power of my previous machine. Now that I have been able to perform various bibliometric functions on this dataset, I expect to learn more about the dataset itself, learn more about the bibliometric analyses, and then pursue the knowledge discovery goals I have for this dataset.


References

Aria M, Cuccurullo C. bibliometrix: Bibliometric and Co-Citation Analysis Tool. R package version 1.5. 2017. http://www.bibliometrix.org

Kostoff R, Briggs M, Solka J, Rushenberg R. Literature-related discovery (LRD): Methodology. Technol Forecast Soc Change. 2008;75(2):186-202. doi:10.1016/j.techfore.2007.11.010.

Appendix: Bibliometrix Output

BiblioAnalysis
    Summary
        Main Information about data
        Annual Scientific Production
        Annual Percentage Growth Rate 1.186925
        Most Productive Authors
        Top manuscripts per citations
        Most Productive Countries
        Total Citations per Country
        Most Relevant Sources
        Most Relevant Keywords
    Basic plots
Analysis of Cited References
    Most frequent cited manuscripts
    Most frequent cited first authors
    Most frequent local cited authors
Authors’ Dominance ranking
Authors’ h-index
    H-index of first 10 most productive authors in this dataset
Lotka’s Law coefficient estimation
    Author Productivity
    Beta coefficient estimate
    Constant
    Goodness of fit
    P-value of K-S sample test
    Plot Scientific productivity
Bibliographic network matrices
    Most relevant publication sources
    Citation network
    Author Network
    Country Network
    Author Keyword Network
    Keyword Plus Network
Visualizing bibliographic networks
    Country Collaboration
    Co-citation Network
    Keyword co-occurrences
    Historical co-citation

BiblioAnalysis

Summary

Main Information about data

Articles                              75193
Sources (Journals, Books, etc.)       3882
Keywords Plus (ID)                    50935
Author's Keywords (DE)                37400
Period                                1980-2017
Average citations per article         21.29
Authors                               71539
Author Appearances                    75193
Authors of single authored articles   71539
Authors of multi authored articles    0
Articles per Author                   1.05
Authors per Article                   0.951
Co-Authors per Articles               1
Collaboration Index                   NaN

Annual Scientific Production

Year   Articles
1980   327
1981   307
1982   333
1983   291
1984   358
1985   345
1986   394
1987   432
1988   401
1989   320
1990   317
1991   685
1992   689
1993   750
1994   846
1995   956
1996   1136
1997   1312
1998   1540
1999   1613
2000   1766
2001   1728
2002   1863
2003   1933
2004   2924
2005   2968
2006   3376
2007   3508
2008   3935
2009   4099
2010   3898
2011   4444
2012   4993
2013   5157
2014   4938
2015   5298
2016   4507
2017   506

Annual Percentage Growth Rate 1.186925
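The reported growth rate is consistent with a compound annual growth rate calculated from only the first and last years of the production table (327 articles in 1980, 506 in 2017). The Python sketch below reproduces the figure under that assumption. It is inferred from the numbers, not taken from the Bibliometrix source, and the variable names are my own.

```python
# Article counts for the first and last years in the Annual Scientific Production table.
first_year, first_count = 1980, 327
last_year, last_count = 2017, 506

# Compound annual growth rate over the yearly steps between the two endpoints.
steps = last_year - first_year
growth_rate = ((last_count / first_count) ** (1 / steps) - 1) * 100

print(round(growth_rate, 4))
```

Because only the endpoints enter the calculation, the partial 2017 count (the data were downloaded in March) pulls the rate down considerably.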

Most Productive Authors

     Authors            Articles   Authors            Articles Fractionalized
1    [ANONYMOUS]        99         [ANONYMOUS]        99
2    SANDYK,R           28         SANDYK,R           28
3    COMPSTON,A         23         COMPSTON,A         23
4    LASSMANN,H         23         LASSMANN,H         23
5    LAUER,K            21         LAUER,K            21
6    WARREN,KG;CATZI    19         WARREN,KG;CATZI    19
7    KURTZKE,JF         15         KURTZKE,JF         15
8    FILIPPI,M          13         FILIPPI,M          13
9    POSER,CM           13         POSER,CM           13
10   COMI,G             12         COMI,G             12

Top manuscripts per citations ...<truncated>

1 KURTZKE JF, (1983), NEUROLOGY

2 POSER CM; PATY DW; SCHEINBERG L; MCDONALD WI; DAVIS FA; EBERS GC; JOHNSON KP; SIBLEY WA; SILBERBERG DH; TOURTELLOTTE WW, (1983), ANN. NEUROL.

3 MCDONALD WI; COMPSTON A; EDAN G; GOODKIN D; HARTUNG HP; LUBLIN FD; MCFARLAND HF; PATY DW; POLMAN CH; REINGOLD SC; SANDBERG-WOLLHEIM M; SIBLEY W; THOMPSON AJ; VAN DEN NOORT S; WEINSHENKER BY; WOLINSKY JS, (2001), ANN. NEUROL.

4 POLMAN CH; REINGOLD SC; EDAN G; FILIPPI M; HARTUNG HP; KAPPOS L; LUBLIN FD; METZ LM; MCFARLAND HF; O'CONNOR PW; SANDBERG-WOLLHEIM M; THOMPSON AJ; WEINSHENKER BG; WOLINSKY JS, (2005), ANN. NEUROL.

5 TRAPP BD; PETERSON J; RANSOHOFF RM; RUDICK R; MORK S; BO L, (1998), N. ENGL. J. MED.

6 LANGRISH CL; CHEN Y; BLUMENSCHEIN WM; MATTSON J; BASHAM B; SEDGWICK JD; MCCLANAHAN T; KASTELEIN RA; CUA DJ, (2005), J. EXP. MED.

7 POLMAN CHRIS H; REINGOLD STEPHEN C; BANWELL BRENDA; CLANET MICHEL; COHEN JEFFREY A; FILIPPI MASSIMO; FUJIHARA KAZUO; HAVRDOVA EVA; HUTCHINSON MICHAEL; KAPPOS LUDWIG; LUBLIN FRED D; MONTALBAN XAVIER; O'CONNOR PAUL; SANDBERG-WOLLHEIM MAGNHILD; THOMPSON ALAN J; WAUBANT EMMANUELLE; WEINSHENKER BRIAN; WOLINSKY JERRY S, (2011), ANN. NEUROL.

8 KRUPP LB; LAROCCA NG; MUIRNASH J; STEINBERG AD, (1989), ARCH. NEUROL.

9JACOBSLD;COOKFAIRDL;RUDICKRA;HERNDONRM;RICHERTJR;SALAZARAM;FISCHERJS;GOODKINDE;GRANGERCV;SIMONJH;ALAMJJ;BARTOSZAKDM;BOURDETTEDN;BRAIMANJ;BROWNSCHEIDLECM;COATSME;COHANSL;DOUGHERTYDS;KINKELRP;MASSMK;MUNSCHAUERFE;PRIORERL;PULLICINOPM;SCHEROKMANBJ;WEINSTOCKGUTTMANB;WHITMANRH;BAIRDWC;FILLMOREM;BONALM;COLONRUIZME;NADINEBS;DONOVANA;BENNETTS;KIEFFERYM;UMHAUERMA;MILLERCE;KILICAK;SARGENTEL;SCHACHTERM;SHUCARDDW;WEIDERV;CATALANOBA;CERVIJM;CZEKAYC;FARRELLJL;FILIPPINIJS;MATYASRC;MICHIENZIKE;ITOM;OMALLEYJA;ZIELEZNYMA;BRUNJM;DAVIDSONAL;GREENLA;OREILLYKM;SHELTONJA;WENDEKE;BARILLADY;BOYLESL;PERKINSKK;PERRYMANJE;STIEBELINGBG;KONECSNIJF;ROSSJS;CHOIKS;GUSTAFSONCJ;QUANDTBJ;SCHERZINGERAL;GRIFFITHDA;HARRISJM;LEZAKMD;MIMICAI;SAUNDERSJA;COITWE;FORCECR;GILMOREFJ;HARRISLB;JONESMM;KAUFFMANJA;MARBERGERKE;MCBRIDEJW;MILLERLL;WRIGHTGK;BROOKSJA;BROWN...<truncated>

10 LUCCHINETTI C; BRUCK W; PARISI J; SCHEITHAUER B; RODRIGUEZ M; LASSMANN H, (2000), ANN. NEUROL.

     TC     TCperYear
1    7336   215.8
2    6308   185.5
3    3997   249.8
4    2847   237.2
5    2525   132.9
6    2219   184.9
7    2016   336.0
8    2007   71.7
9    1774   84.5
10   1643   96.6
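The TCperYear column appears to be total citations divided by the number of years between publication and 2017; for example, 7336 / (2017 - 1983) ≈ 215.8 for the first row. The Python sketch below checks a few rows under that assumption. The reference year and the function name are my own choices, not something stated in the Bibliometrix output.

```python
REFERENCE_YEAR = 2017  # assumed: the year the dataset was downloaded

# (first author, publication year, total citations) for rows 1-4 of the table.
papers = [
    ("KURTZKE JF", 1983, 7336),
    ("POSER CM", 1983, 6308),
    ("MCDONALD WI", 2001, 3997),
    ("POLMAN CH", 2005, 2847),
]


def citations_per_year(total_citations, pub_year, reference_year=REFERENCE_YEAR):
    """Average citations per year elapsed since publication."""
    return total_citations / (reference_year - pub_year)


for author, year, tc in papers:
    print(author, year, round(citations_per_year(tc, year), 1))
```

This normalization explains why the 2011 paper in row 7 has the highest TCperYear (336.0) despite ranking seventh by total citations.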

Most Productive Countries

     Country       Articles   Freq
1    USA           22379      0.3246
2    ITALY         5445       0.0790
3    GERMANY       4676       0.0678
4    ENGLAND       3913       0.0568
5    FRANCE        2511       0.0364
6    CANADA        2468       0.0358
7    SPAIN         2455       0.0356
8    NETHERLANDS   2217       0.0322
9    JAPAN         2035       0.0295
10   SWEDEN        1802       0.0261

Total Citations per Country

     Country       Total Citations   Average Article Citations
1    USA           698596            31.217
2    ENGLAND       118276            30.226
3    ITALY         96022             17.635
4    GERMANY       88779             18.986
5    CANADA        68772             27.865
6    NETHERLANDS   63019             28.425
7    FRANCE        43025             17.135
8    SWITZERLAND   42003             23.400
9    SWEDEN        41668             23.123
10   JAPAN         41387             20.338

Most Relevant Sources

     Sources                                 Articles
1    MULTIPLE SCLEROSIS JOURNAL              5754
2    MULTIPLE SCLEROSIS                      4498
3    NEUROLOGY                               4464
4    JOURNAL OF NEUROIMMUNOLOGY              3073
5    JOURNAL OF NEUROLOGY                    2133
6    JOURNAL OF THE NEUROLOGICAL SCIENCES    1753
7    ANNALS OF NEUROLOGY                     1476
8    EUROPEAN JOURNAL OF NEUROLOGY           1383
9    JOURNAL OF IMMUNOLOGY                   1368
10   PLOS ONE                                1235

Most Relevant Keywords

     Author Keywords (DE)                                         Articles   Keywords-Plus (ID)             Articles
1    MULTIPLE SCLEROSIS; MRI                                      6          MULTIPLE-SCLEROSIS             258
2    MULTIPLE SCLEROSIS                                           5          MS                             21
3    DEMYELINATING DISORDERS                                      4          DIAGNOSIS                      18
4    EPIDEMIOLOGY; INCIDENCE; MULTIPLE SCLEROSIS; PREVALENCE      4          DISEASE                        17
5    MULTIPLE SCLEROSIS; EPIDEMIOLOGY; PREVALENCE                 4          MULTIPLE-SCLEROSIS; DISEASE    17
6    MULTIPLE SCLEROSIS; OPTIC NEURITIS                           4          THERAPY                        16
7    MULTIPLE SCLEROSIS; QUALITY OF LIFE                          4          DYSFUNCTION                    15
8    MULTIPLE SCLEROSIS; REHABILITATION                           4          PREVALENCE                     15
9    MULTIPLE SCLEROSIS; VALIDITY; WALKING                        4          MULTIPLE-SCLEROSIS; MRI        14
10   DEPRESSION; MULTIPLE SCLEROSIS; SUICIDE                      3          QUALITY-OF-LIFE                14

Basic plots

Analysis of Cited References

Most frequent cited manuscripts:

Cited reference                                                                       Citations
KURTZKE JF, 1983, NEUROLOGY, V33, P1444                                               5977
POSER CM, 1983, ANN NEUROL, V13, P227, DOI 101002/ANA410130302                        5000
MCDONALD WI, 2001, ANN NEUROL, V50, P121, DOI 101002/ANA1032                          3043
POLMAN CH, 2005, ANN NEUROL, V58, P840, DOI 101002/ANA206703                          2125
LUBLIN FD, 1996, NEUROLOGY, V46, P907                                                 1684
TRAPP BD, 1998, NEW ENGL J MED, V338, P278, DOI 101056/NEJM199801293380502            1544
POLMAN CH, 2011, ANN NEUROL, V69, P292, DOI 101002/ANA22366                           1339
NOSEWORTHY JH, 2000, NEW ENGL J MED, V343, P938, DOI 101056/NEJM200009283431307       1216
JACOBS LD, 1996, ANN NEUROL, V39, P285, DOI 101002/ANA410390304                       1008
DUQUETTE P, 1993, NEUROLOGY, V43, P655                                                980

Mostfrequentcitedfirstauthors:KURTZKEJFPOSERCMFILIPPIMPOLMANCHMCDONALDWIMILLERDHEBERSGCRAOSM

91646179529045874099346430472957

KAPPOSLWINGERCHUKDM

29042580

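Most frequent local cited authors

Frequency table of most local cited authors (how many times authors in this dataset have been cited by other authors in this dataset):

| Author | Local citations |
| --- | --- |
| KURTZKE JF | 9040 |
| POSER CM | 6179 |
| FILIPPI M | 5265 |
| POLMAN CH | 4587 |
| MCDONALD WI | 4098 |
| MILLER DH | 3464 |
| EBERS GC | 3025 |
| KAPPOS L | 2885 |
| RAO SM | 2878 |
| WINGERCHUK DM | 2546 |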
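Frequency tables like the ones above amount to a tally over the cited-reference field of each record. The following is a minimal Python sketch of that idea, not the package's own implementation. It assumes each record stores its references as one semicolon-separated string whose entries start with the first author's name; the function name `tally_citations` and the three sample records are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

def tally_citations(cited_reference_fields, sep=";"):
    """Count cited manuscripts and cited first authors.

    cited_reference_fields: one string per record holding that record's
    reference list, separated by `sep` (an assumed layout).
    """
    manuscripts = Counter()
    first_authors = Counter()
    for field in cited_reference_fields:
        # Use a set so a reference repeated inside one record counts once.
        references = {ref.strip() for ref in field.split(sep) if ref.strip()}
        for ref in references:
            manuscripts[ref] += 1
            # The first comma-separated token is taken as the first author.
            first_authors[ref.split(",")[0].strip()] += 1
    return manuscripts, first_authors

# Hypothetical records, for illustration only.
records = [
    "KURTZKE JF, 1983, NEUROLOGY, V33, P1444; POSER CM, 1983, ANN NEUROL, V13, P227",
    "KURTZKE JF, 1983, NEUROLOGY, V33, P1444; POLMAN CH, 2005, ANN NEUROL, V58, P840",
    "POLMAN CH, 2011, ANN NEUROL, V69, P292; POLMAN CH, 2005, ANN NEUROL, V58, P840",
]
manuscripts, first_authors = tally_citations(records)
print(manuscripts.most_common(2))
print(first_authors.most_common(2))
```

Counting by first author merges all of an author's cited works, which is why an author can rank higher in the author table than any single manuscript does in the manuscript table.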

Authors’ Dominance ranking

Authors’ Dominance ranking (a ratio indicating the fraction of multi-authored articles in which a scholar appears as the first author, referenced in Kumar & Kumar, 2008, which I have to look up):

| Author | Dominance Factor | Multi Authored | First Authored | Rank by Articles | Rank by DF |
| --- | --- | --- | --- | --- | --- |
| GIOVANNONI, G | 0.15118790 | 463 | 70 | 10 | 1 |
| FILIPPI, M | 0.14731183 | 930 | 137 | 2 | 2 |
| KAPPOS, L | 0.13048780 | 820 | 107 | 3 | 3 |
| WEINSTOCK-GUTTMAN, B | 0.08333333 | 492 | 41 | 8 | 4 |
| HARTUNG, HP | 0.07231405 | 484 | 35 | 9 | 5 |
| COMI, G | 0.07206406 | 1124 | 81 | 1 | 6 |
| BARKHOF, F | 0.05546218 | 595 | 33 | 6 | 7 |
| MILLER, DH | 0.05052006 | 673 | 34 | 4 | 8 |
| MONTALBAN, X | 0.03908795 | 614 | 24 | 5 | 9 |
| POLMAN, CH | 0.03853955 | 493 | 19 | 7 | 10 |

Note that the highest ranked author, G Giovannoni, only has a dominance factor of 0.15. The highest possible value would be 1.000. In the example dataset used in the tutorial, Robert Kostoff had a dominance factor of 1.000, meaning he was the first named author in all 8 of his multi-authored articles in that dataset.
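The dominance factor is simple enough to compute by hand: it is the number of multi-authored articles on which the author is listed first, divided by the number of multi-authored articles the author appears on. For G Giovannoni that is 70 / 463, or about 0.1512. The sketch below illustrates the definition in Python; the function name `dominance_factor` and the sample author lists are hypothetical, and this is not the package's code.

```python
def dominance_factor(author, author_lists):
    """Return (dominance factor, multi-authored count, first-authored count).

    author_lists: one list of author names per article, in byline order.
    Single-author articles are ignored, following the definition above.
    """
    multi = [names for names in author_lists
             if len(names) > 1 and author in names]
    first = [names for names in multi if names[0] == author]
    if not multi:
        return 0.0, 0, 0
    return len(first) / len(multi), len(multi), len(first)

# Hypothetical bylines, for illustration only.
papers = [
    ["A", "B"],
    ["B", "A"],
    ["A"],            # single-authored, so it does not count
    ["A", "C", "B"],
    ["C", "A"],
]
print(dominance_factor("A", papers))

# The figure reported for G Giovannoni: 70 first-authored of 463 multi-authored.
print(round(70 / 463, 7))
```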

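Authors’ h-index:

This creates a vector for the specified author. So if we want to learn the h-index for our friend G Giovannoni:

| Row | Author | h_index | g_index | m_index | TC | NP |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | GIOVANNONI G | 47 | 89 | 2.136364 | 8973 | 463 |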
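For readers new to these indices, here is a minimal Python sketch of the standard definitions, written independently of the package. The h-index is the largest h such that h papers have at least h citations each; the g-index is the largest g such that the top g papers together have at least g² citations; the m-index divides h by the number of years since the first publication. The inclusive year count is an assumption, although 47 / 22 ≈ 2.136364 appears consistent with the value reported above if the span runs from 1996 to 2017. The sample citation counts are made up.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    g, running_total = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g

def m_index(citations, first_year, current_year):
    """h-index divided by career length in years (counted inclusively)."""
    years_active = current_year - first_year + 1
    return h_index(citations) / years_active

# Hypothetical citation counts, for illustration only.
sample = [10, 8, 5, 4, 3, 0]
print(h_index(sample), g_index(sample), m_index(sample, 2008, 2017))
```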

And now G Giovannoni’s citation list:

| Row | Authors | Journal | Year | TotalCitation |
| --- | --- | --- | --- | --- |
| 1 | GIOVANNONI, G; LAI, M; THO | NEUROLOGY | 1996 | 0 |
| 28 | GIOVANNONI, G; DUBOIS, BD; | JOURNAL OF NEUROLOGY NEUROSURG | 2003 | 0 |
| 36 | LIM, ET; LEARY, SM; ALTMAN | NEUROLOGY | 2004 | 0 |
| 37 | LIM, ET; SELLEBJERG, F; AL | NEUROLOGY | 2004 | 0 |
| 39 | PORTER, B; GIOVANNONI, G; | MULTIPLE SCLEROSIS | 2004 | 0 |
| 40 | LIM, ET; SELLEBJERG, F; SO | JOURNAL OF NEUROLOGY NEUROSURG | 2004 | 0 |
| 41 | REJDAK, K; LEARY, SM; NELI | EUROPEAN JOURNAL OF NEUROLOGY | 2004 | 0 |
| 42 | PRYCE, G; O'NEILL, JK; HAN | JOURNAL OF NEUROIMMUNOLOGY | 2004 | 0 |
| 49 | LIM, ET; SCHMIERER, K; GRA | JOURNAL OF NEUROLOGY NEUROSURG | 2005 | 0 |
| 55 | HAVRDOVA, E; O'CONNOR, P; | JOURNAL OF NEUROLOGY | 2005 | 0 |
| 56 | MILLER, D; O'CONNOR, P; HA | JOURNAL OF NEUROLOGY | 2005 | 0 |
| 58 | HAWKES, CH; GIOVANNONI, G; | MULTIPLE SCLEROSIS | 2005 | 0 |
| 59 | HUTCHINSON, M; O'CONNOR, PW; | MULTIPLE SCLEROSIS | 2005 | 0 |
| 61 | PETZOLD, A; EIKELENBOOM, M; | MULTIPLE SCLEROSIS | 2005 | 0 |
| 62 | SOON, D; ALTMANN, DR; BARK | MULTIPLE SCLEROSIS | 2005 | 0 |
| 65 | REJDAK, K; PETZOLD, A; CHA | EUROPEAN JOURNAL OF NEUROLOGY | 2005 | 0 |
| 68 | KAPPOS, L; O'CONNOR, PW; H | JOURNAL OF THE NEUROLOGICAL SC | 2005 | 0 |
| 69 | REJDAK, K; LEARY, SM; STEL | JOURNAL OF THE NEUROLOGICAL SC | 2005 | 0 |
| 70 | FISHER, E; O'CONNOR, PW; H | ANNALS OF NEUROLOGY | 2006 | 0 |
| 71 | GIOVANNONI, G; O'CONNOR, P; | CLINICAL IMMUNOLOGY | 2006 | 0 |
| 73 | HAWKES, CH; GIOVANNONI, G; | JOURNAL OF NEUROLOGY NEUROSURG | 2006 | 0 |
| 74 | CALABRESI, PA; GIOVANNONI, G | NEUROLOGY | 2006 | 0 |
| 75 | BALCER, L; GALETTA, SL; O' | NEUROLOGY | 2006 | 0 |
| 76 | RUDICK, RA; HUTCHINSON, M; | NEUROLOGY | 2006 | 0 |
| 79 | REJDAK, K; PETZOLD, A; BAR | JOURNAL OF NEUROLOGY | 2006 | 0 |
| 80 | GIOVANNONI, G; COOK, S; RI | JOURNAL OF NEUROLOGY | 2006 | 0 |
| 81 | PETZOLD, A; MONDIRA, T; KE | MULTIPLE SCLEROSIS | 2006 | 0 |
| 82 | PETZOLD, A; EIKELENBOOM, M; | MULTIPLE SCLEROSIS | 2006 | 0 |
| 83 | RUDICK, RA; MILLER, DM; HU | MULTIPLE SCLEROSIS | 2006 | 0 |
| 84 | KALLMANN, B; SPECH, E; KRO | MULTIPLE SCLEROSIS JOURNAL | 2006 | 0 |
| 86 | STELMASIAK, Z; REJDAK, K; | JOURNAL OF NEURAL TRANSMISSION | 2007 | 0 |
| 90 | GIOVANNONI, G; O'CONNOR, PW; | JOURNAL OF NEUROLOGY NEUROSURG | 2007 | 0 |
| 94 | FARRELL, RA; LEARY, S; RUD | NEUROLOGY | 2007 | 0 |
| 95 | RUDICK, R; MILLER, D; HUTC | NEUROLOGY | 2007 | 0 |
| 96 | SIMSARIAN, JP; PARDO, G; B | NEUROLOGY | 2007 | 0 |
| 100 | SORENSEN, P; BARBARASH, O; | JOURNAL OF NEUROLOGY | 2007 | 0 |
| 101 | ALI, E; FARRELL, RA; WORTH | JOURNAL OF NEUROLOGY NEUROSURG | 2007 | 0 |
| 102 | FARRELL, R; LEARY, S; KAPO | JOURNAL OF NEUROLOGY NEUROSURG | 2007 | 0 |
| 103 | REJDAK, K; PETZOLD, A; STE | EUROPEAN JOURNAL OF NEUROLOGY | 2007 | 0 |
| 107 | GIOVANNONI, G; COMI, G; CO | MULTIPLE SCLEROSIS | 2007 | 0 |
| 108 | GIOVANNONI, G; BARBARASH, O; | MULTIPLE SCLEROSIS | 2007 | 0 |
| 109 | HAVRDOVA, E; GIOVANNONI, G; | MULTIPLE SCLEROSIS | 2007 | 0 |
| 110 | HERBERT, J; KAPPOS, L; GIO | MULTIPLE SCLEROSIS | 2007 | 0 |
| 115 | FARRELL, R; ANTONY, D; CLA | NEUROLOGY | 2008 | 0 |
| 116 | FOX, R; MAN, SM; TUCKY, B; | NEUROLOGY | 2008 | 0 |
| 117 | GALETTA, S; CALABRESI, P; | NEUROLOGY | 2008 | 0 |
| 118 | HERBERT, J; KAPPOS, L; CAL | NEUROLOGY | 2008 | 0 |
| 120 | MILLER, D; RUDICK, R; CALA | NEUROLOGY | 2008 | 0 |
| 121 | SIMSARIAN, J; BARBARASH, O; | NEUROLOGY | 2008 | 0 |
| 122 | SOMERFIELD, J; GREEN, A; K | NEUROLOGY | 2008 | 0 |
| 124 | FARRELL, R; ANTONY, D; CLA | JOURNAL OF NEUROLOGY | 2008 | 0 |
| 125 | CONFAVREUX, C; GALETTA, SL; | JOURNAL OF NEUROLOGY | 2008 | 0 |
| 126 | GIOVANNONI, G; BARBARASH, O; | JOURNAL OF NEUROLOGY | 2008 | 0 |
| 127 | GIOVANNONI, G; BARBARASH, O; | JOURNAL OF NEUROLOGY | 2008 | 0 |
| 128 | GIOVANNONI, G; COOK, S; GR | JOURNAL OF NEUROLOGY | 2008 | 0 |
| 130 | AL-IZKI, S; PRYCE, G; JACK | JOURNAL OF NEUROIMMUNOLOGY | 2008 | 0 |
| 132 | MAGGIORE, C; TRILLO-PAZOS, G | MULTIPLE SCLEROSIS | 2008 | 0 |
| 133 | PARKES, H; SHORTER, S; SO, | MULTIPLE SCLEROSIS | 2008 | 0 |
| 134 | RAMAGOPLAN, S; VALDAR, W; | MULTIPLE SCLEROSIS | 2008 | 0 |
| 135 | HAVRDOVA, E; BATES, D; GAL | MULTIPLE SCLEROSIS JOURNAL | 2008 | 0 |
| 137 | FARRELL, R; ANTONY, D; CLA | MULTIPLE SCLEROSIS | 2008 | 0 |
| 142 | MUNSCHAUER, F; GIOVANNONI, G | NEUROLOGY | 2009 | 0 |
| 145 | GIOVANNONI, G; MUNSCHAUER, F | JOURNAL OF NEUROLOGY | 2009 | 0 |
| 146 | GIOVANNONI, G; COMI, G; CO | JOURNAL OF NEUROLOGY | 2009 | 0 |
| 148 | VERMERSCH, P; COMI, G; COO | JOURNAL OF NEUROLOGY | 2009 | 0 |
| 154 | COMI, G; COOK, S; GIOVANNO | JOURNAL OF NEUROLOGY | 2009 | 0 |
| 158 | HAWKES, CH; GIOVANNONI, G | MULTIPLE SCLEROSIS | 2009 | 0 |
| 159 | MAGHZI, AH; MEIER, U; HOLD | MULTIPLE SCLEROSIS | 2009 | 0 |
| 161 | PETZOLD, A; MONDRIA, T; KU | MULTIPLE SCLEROSIS | 2009 | 0 |
| 166 | GIOVANNONI, G; DESTEFANO, N | JOURNAL OF THE NEUROLOGICAL SC | 2009 | 0 |
| 167 | GIOVANNONI, G; COMI, G; CO | JOURNAL OF THE NEUROLOGICAL SC | 2009 | 0 |
| 168 | MUNSCHAUER, F; GIOVANNONI, G | JOURNAL OF THE NEUROLOGICAL SC | 2009 | 0 |
| 169 | BATOCCHI, AP; TAVAZZI, B; | MULTIPLE SCLEROSIS | 2009 | 0 |
| 172 | COMI, G; COOK, S; GIOVANNO | JOURNAL OF THE NEUROLOGICAL SC | 2009 | 0 |
| 173 | COOK, S; VERMERSCH, P; COM | JOURNAL OF THE NEUROLOGICAL SC | 2009 | 0 |
| 174 | TEUNISSEN, CE; PETZOLD, A; | LABORATORIUMSMEDIZIN-JOURNAL O | 2010 | 0 |
| 177 | HANDEL, AE; HANDUNNETTHI, L; | ANNALS OF NEUROLOGY | 2010 | 0 |
| 179 | CLIFFORD, DB; DELUCA, A; S | NEUROLOGY | 2010 | 0 |
| 180 | GIOVANNONI, G; CAMI, G; CO | NEUROLOGY | 2010 | 0 |
| 182 | MUNSCHAUER, F; GIOVANNONI, G | NEUROLOGY | 2010 | 0 |
| 183 | RAMMOHAN, K; COMI, G; COOK | NEUROLOGY | 2010 | 0 |
| 184 | SOELBERG, P; RIECKMANN, P; | NEUROLOGY | 2010 | 0 |
| 189 | GIOVANNONI, G; COMI, G; CO | JOURNAL OF NEUROLOGY | 2010 | 0 |
| 190 | RIECKMANN, P; COMI, G; COO | JOURNAL OF NEUROLOGY | 2010 | 0 |
| 191 | SORENSEN, PS; RIECKMANN, P; | JOURNAL OF NEUROLOGY | 2010 | 0 |
| 197 | RIECKMANN, P; COOK, S; COM | MULTIPLE SCLEROSIS JOURNAL | 2010 | 0 |
| 198 | RIECKMANN, P; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2010 | 0 |
| 199 | RIECKMANN, P; COMI, G; COO | EUROPEAN JOURNAL OF NEUROLOGY | 2010 | 0 |
| 200 | GIOVANNONI, G; COMI, G; CO | EUROPEAN JOURNAL OF NEUROLOGY | 2010 | 0 |
| 201 | COMI, G; COOK, S; GIOVANNO | EUROPEAN JOURNAL OF NEUROLOGY | 2010 | 0 |
| 202 | DOBSON, R; FELDMANN, M; TH | JOURNAL OF NEUROLOGY NEUROSURG | 2010 | 0 |
| 204 | RAMAGOPALAN, S; DISANTO, G; | NEUROLOGY | 2011 | 0 |
| 205 | RIECKMANN, P; SORENSEN, PS; | NEUROLOGY | 2011 | 0 |
| 207 | RAMMOHAN, K; COMI, G; COOK | NEUROLOGY | 2011 | 0 |
| 208 | VERMERSCH, P; COMI, G; COO | NEUROLOGY | 2011 | 0 |
| 209 | COOK, S; COMI, G; GIOVANNO | NEUROLOGY | 2011 | 0 |
| 210 | FARRELL, R; CALADO-MARTA, M; | NEUROLOGY | 2011 | 0 |
| 211 | RAMMOHAN, K; COMI, G; COOK | JOURNAL OF NEUROLOGY | 2011 | 0 |
| 212 | RIECKMANN, P; SORENSEN, PS; | JOURNAL OF NEUROLOGY | 2011 | 0 |
| 213 | RAMMOHAN, K; COMI, G; COOK | JOURNAL OF NEUROLOGY | 2011 | 0 |
| 214 | VERMERSCH, P; COMI, G; COO | JOURNAL OF NEUROLOGY | 2011 | 0 |
| 215 | BUTZKUEVEN, H; KAPPOS, L; | JOURNAL OF NEUROLOGY | 2011 | 0 |
| 216 | COOK, S; COMI, G; GIOVANNO | JOURNAL OF NEUROLOGY | 2011 | 0 |
| 227 | RAMAGOPALAN, S; EBERS, G; | MULTIPLE SCLEROSIS JOURNAL | 2011 | 0 |
| 229 | SORENSEN, PS; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2011 | 0 |
| 230 | THOMSON, A; ESPASANDIN, M; | MULTIPLE SCLEROSIS JOURNAL | 2011 | 0 |
| 231 | ARNOLD, D; GOLD, R; BAR-OR | MULTIPLE SCLEROSIS JOURNAL | 2011 | 0 |
| 232 | DOBSON, R; MEIER, UC; SCHM | MULTIPLE SCLEROSIS JOURNAL | 2011 | 0 |
| 233 | ESPASANDIN, M; BARBER, T; | MULTIPLE SCLEROSIS JOURNAL | 2011 | 0 |
| 237 | DOBSON, R; MEIER, UC; MART | CLINICAL MEDICINE | 2011 | 0 |
| 243 | RAMMOHAN, K; GIOVANNONI, G; | MULTIPLE SCLEROSIS AND RELATED | 2012 | 0 |
| 244 | DOBSON, R; MEIER, UC; SCHM | JOURNAL OF NEUROLOGY NEUROSURG | 2012 | 0 |
| 245 | SELMAJ, K; ARNOLD, D; BRIN | NEUROLOGY | 2012 | 0 |
| 246 | STUBINSKI, B; ROCAK, S; GI | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 250 | RAFFEL, J; DOBSON, R; GIOV | JOURNAL OF NEUROLOGY NEUROSURG | 2012 | 0 |
| 252 | AGARWAL, S; KAPPOS, L; GOL | NEUROLOGY | 2012 | 0 |
| 253 | BAR-OR, A; GOLD, R; KAPPOS | NEUROLOGY | 2012 | 0 |
| 254 | COLES, A; BRINAR, V; ARNOL | NEUROLOGY | 2012 | 0 |
| 255 | FOX, E; ARNOLD, D; BRINAR, | NEUROLOGY | 2012 | 0 |
| 256 | GIOVANNONI, G; GOLD, R; KA | NEUROLOGY | 2012 | 0 |
| 257 | GOLD, R; GIOVANNONI, G; SE | NEUROLOGY | 2012 | 0 |
| 258 | HAVRDOVA, E; ARNOLD, D; CO | NEUROLOGY | 2012 | 0 |
| 261 | KOVAROVA, I; ARNOLD, DL; C | JOURNAL OF NEUROLOGY | 2012 | 0 |
| 270 | TZARTOS, J; KHAN, G; CRUZ- | IMMUNOLOGY | 2012 | 0 |
| 272 | TZARTOS, JS; KHAN, G; MIDD | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 274 | BAR-OR, A; FOX, RJ; GOLD, | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 275 | RADUE, EW; GOLD, R; GIOVAN | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 276 | ARNOLD, DL; GOLD, R; KAPPO | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 277 | DOBSON, R; RAMAGOPALAN, S; | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 278 | PAKPOOR, J; DISANTO, G; ME | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 279 | MATTHEWS, LAE; OPPENHEIMER, | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 280 | GIOVANNONI, G; ARNOLD, DL; | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 281 | GIOVANNONI, G; STEFOSKI, D; | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 282 | DOBSON, R; TOPPING, J; DAV | MULTIPLE SCLEROSIS JOURNAL | 2012 | 0 |
| 288 | FURBY, J; GIOVANNONI, G; P | JOURNAL OF NEUROLOGY NEUROSURG | 2012 | 0 |
| 297 | GIOVANNONI, G; GOLD, R; SE | NEUROLOGY | 2013 | 0 |
| 298 | GIOVANNONI, G; GOLD, R; FO | NEUROLOGY | 2013 | 0 |
| 299 | GIOVANNONI, G; ARNOLD, D; | NEUROLOGY | 2013 | 0 |
| 300 | GIOVANNONI, G; COMI, G; CO | NEUROLOGY | 2013 | 0 |
| 301 | HAVRDOVA, E; GIOVANNONI, G; | NEUROLOGY | 2013 | 0 |
| 302 | HAVRDOVA, E; GOLD, R; FOX, | NEUROLOGY | 2013 | 0 |
| 303 | HUTCHINSON, M; GOLD, R; FO | NEUROLOGY | 2013 | 0 |
| 304 | KAPPOS, L; GIOVANNONI, G; | NEUROLOGY | 2013 | 0 |
| 305 | KAPPOS, L; BAR-OR, A; CREE | NEUROLOGY | 2013 | 0 |
| 306 | KITA, M; FOX, R; GOLD, R; | NEUROLOGY | 2013 | 0 |
| 307 | MEHTA, L; GIOVANNONI, G; R | NEUROLOGY | 2013 | 0 |
| 308 | MELTZER, L; SELMAJ, K; GOL | NEUROLOGY | 2013 | 0 |
| 310 | HUTCHINSON, M; BAR-OR, A; | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 314 | GIOVANNONI, G; GOLD, R; SE | JOURNAL OF NEUROLOGY | 2013 | 0 |
| 315 | GOLD, R; GIOVANNONI, G; PH | JOURNAL OF NEUROLOGY | 2013 | 0 |
| 316 | MAGGIORE, C; GIOVANNONI, G; | JOURNAL OF NEUROLOGY | 2013 | 0 |
| 327 | GOLD, R; GIOVANNONI, G; PH | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 328 | BERGVALL, N; NIXON, R; TOM | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 329 | BERGVALL, N; NIXON, R; TOM | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 330 | KITA, M; FOX, RJ; GOLD, R; | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 331 | GAFSON, A; WORTHINGTON, V; | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 332 | KUHLE, J; STITES, T; CHEN, | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 337 | GOLD, J; CHRISTENSEN, T; M | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 339 | PAKPOOR, J; PAKPOOR, J; DI | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 340 | ORCHARD, A; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 342 | LUBLIN, F; CUTTER, G; GIOV | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 344 | HUTCHINSON, M; GOLD, R; FO | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 345 | LEDDY, S; HADAVI, S; MCCAR | MULTIPLE SCLEROSIS JOURNAL | 2013 | 0 |
| 348 | DOBSON, R; TOPPING, J; RAM | JOURNAL OF NEUROLOGY NEUROSURG | 2013 | 0 |
| 349 | GAFSON, A; WORTHINGTON, V; | JOURNAL OF NEUROLOGY NEUROSURG | 2013 | 0 |
| 350 | HADAVI, S; SHRIBMAN, S; NA | JOURNAL OF NEUROLOGY NEUROSURG | 2013 | 0 |
| 360 | HAVRDOVA, E; GOLD, R; FOX, | EUROPEAN JOURNAL OF NEUROLOGY | 2014 | 0 |
| 361 | KAPPOS, L; FOX, RJ; GOLD, | EUROPEAN JOURNAL OF NEUROLOGY | 2014 | 0 |
| 365 | GIOVANNONI, G; HEESEN, C | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 368 | FERNANDEZ, O; GIOVANNONI, G; | JOURNAL OF NEUROLOGY | 2014 | 0 |
| 369 | GIOVANNONI, G; GREENBERG, S; | JOURNAL OF NEUROLOGY | 2014 | 0 |
| 370 | HAVRDOVA, E; GOLD, R; FOX, | JOURNAL OF NEUROLOGY | 2014 | 0 |
| 372 | FERNANDEZ, O; GIOVANNONI, G; | EUROPEAN JOURNAL OF NEUROLOGY | 2014 | 0 |
| 373 | GIOVANNONI, G; GREENBERG, S; | EUROPEAN JOURNAL OF NEUROLOGY | 2014 | 0 |
| 374 | SADOVNICK, AD; LEE, JD; RA | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 375 | KAPPOS, L; BAR-OR, A; CREE | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 376 | GOLD, R; GIOVANNONI, G; PH | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 380 | GIOVANNONI, G; GOLD, R; KA | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 381 | PAKPOOR, J; GOLDACRE, R; D | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 382 | DISANTO, G; DOBSON, R; PAK | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 383 | PAKPOOR, J; NYEIN, S; DISA | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 387 | GIOVANNONI, G; GOLD, R; FO | MULTIPLE SCLEROSIS JOURNAL | 2014 | 0 |
| 392 | SISAY, S; ROSELLO, A; HENS | JOURNAL OF NEUROIMMUNOLOGY | 2014 | 0 |
| 393 | FINE, D; DATTANI, A; MOREI | MULTIPLE SCLEROSIS AND RELATED | 2015 | 0 |
| 397 | GIOVANNONI, G | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 400 | WIENDL, H; ARNOLD, DL; HUP | EUROPEAN JOURNAL OF NEUROLOGY | 2015 | 0 |
| 401 | GIOVANNONI, G; HAVRDOVA, E; | EUROPEAN JOURNAL OF NEUROLOGY | 2015 | 0 |
| 405 | GONZALEZ, RA; MOREAU, T; C | EUROPEAN JOURNAL OF NEUROLOGY | 2015 | 0 |
| 406 | MONTALBAN, X; HEMMER, B; R | EUROPEAN JOURNAL OF NEUROLOGY | 2015 | 0 |
| 407 | FERNANDEZ, OF; HARTUNG, HP; | EUROPEAN JOURNAL OF NEUROLOGY | 2015 | 0 |
| 409 | GIOVANNONI, G; BUTZKUEVEN, H | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 410 | BENNETT, JL; CREE, BAC; LE | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 411 | GIOVANNONI, G; GOLD, R; PH | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 412 | GIOVANNONI, G; KAPPOS, L; | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 413 | HAVRDOVA, E; GOLD, R; FOX, | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 414 | GOLD, R; GIOVANNONI, G; PH | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 415 | BAR-OR, A; GOLD, R; FOX, R | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 416 | KAPPOS, L; BAR-OR, A; CREE | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 417 | MAGGIORE, C; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 418 | MOREAU, T; GONZALEZ, RA; H | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 419 | MONTALBAN, X; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 420 | VERMERSCH, P; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 421 | FERNANDEZ, O; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 422 | KAPPOS, L; HAVRDOVA, E; GI | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 423 | BARKHOF, F; COHEN, JA; COL | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 424 | PAKPOOR, J; WOTTON, CJ; GO | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 425 | GOLD, J; GIOVANNONI, G; MA | MULTIPLE SCLEROSIS JOURNAL | 2015 | 0 |
| 427 | CHATAWAY, J; CHANDRAN, S; | JOURNAL OF NEUROLOGY NEUROSURG | 2015 | 0 |
| 429 | RADUE, EW; SPRENGER, T; VO | EUROPEAN JOURNAL OF NEUROLOGY | 2016 | 0 |
| 433 | FERNANDEZ, O; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2016 | 0 |
| 434 | GIOVANNONI, G; BUTZKUEVEN, H | MULTIPLE SCLEROSIS JOURNAL | 2016 | 0 |
| 435 | INSHASI, JS; ALJUMAH, M; | MULTIPLE SCLEROSIS JOURNAL | 2016 | 0 |
| 436 | PHILLIPS, JT; BAR-OR, A; G | MULTIPLE SCLEROSIS JOURNAL | 2016 | 0 |
| 438 | GIOVANNONI, G | IRISH JOURNAL OF MEDICAL SCIEN | 2016 | 0 |
| 441 | GIOVANNONI, G | MULTIPLE SCLEROSIS AND RELATED | 2016 | 0 |
| 442 | GIOVANNONI, G; KAPPOS, L; | MULTIPLE SCLEROSIS AND RELATED | 2016 | 0 |
| 443 | CHITNIS, T; GHEZZI, A; BAJ | NEUROLOGY | 2016 | 0 |
| 445 | DOBSON, R; RAMAGOPALAN, S; | PLOS ONE | 2016 | 0 |
| 446 | ARNOLD, DL; FISHER, E; BRI | NEUROLOGY | 2016 | 0 |
| 448 | ALVAREZ-GONZALEZ, C; ALLEN-P | JOURNAL OF NEUROLOGY NEUROSURG | 2016 | 0 |
| 449 | CHATAWAY, J; CHANDRAN, S; | JOURNAL OF NEUROLOGY NEUROSURG | 2016 | 0 |
| 450 | DAVIS, A; TURNER, B; RAMAD | JOURNAL OF NEUROLOGY NEUROSURG | 2016 | 0 |
| 451 | RADUE, EW; GIOVANNONI, G; | JOURNAL OF NEUROLOGY NEUROSURG | 2016 | 0 |
| 452 | STAVROU, M; SMITH, D; SHAW | JOURNAL OF NEUROLOGY NEUROSURG | 2016 | 0 |
| 453 | WILLIAMS, T; PRYCE, G; GIO | JOURNAL OF NEUROLOGY NEUROSURG | 2016 | 0 |
| 455 | BAKER, D; ANANDHAKRISHNAN, A | MULTIPLE SCLEROSIS AND RELATED | 2016 | 0 |
| 456 | PAKPOOR, J; WOTTON, CJ; SC | MULTIPLE SCLEROSIS JOURNAL | 2016 | 0 |
| 457 | CARROLL, W; BUTZKUEVEN, H; | MULTIPLE SCLEROSIS JOURNAL | 2017 | 0 |
| 458 | KOLODNY, S; GIOVANNONI, G; | MULTIPLE SCLEROSIS JOURNAL | 2017 | 0 |
| 461 | GIOVANNONI, G; ZIEMSSEN, T; | MULTIPLE SCLEROSIS JOURNAL | 2017 | 0 |
| 462 | PHILLIPS, JT; BAR-OR, A; G | MULTIPLE SCLEROSIS JOURNAL | 2017 | 0 |
| 463 | GOLD, R; GIOVANNONI, G; PH | MULTIPLE SCLEROSIS JOURNAL | 2017 | 0 |
| 50 | MILLER, D; O'CONNOR, P; HA | NEUROLOGY | 2005 | 1 |
| 57 | FISHER, E; O'CONNOR, R; HA | JOURNAL OF NEUROLOGY | 2005 | 1 |
| 60 | LUBLIN, FD; O'CONNOR, PW; | MULTIPLE SCLEROSIS | 2005 | 1 |
| 67 | GALETTA, SL; O'CONNOR, PW; | JOURNAL OF THE NEUROLOGICAL SC | 2005 | 1 |
| 123 | MAGGIORE, C; TRILLO-PAZOS, G | JOURNAL OF NEUROLOGY | 2008 | 1 |
| 138 | FARRELL, R; WADHWA, M; WOR | MULTIPLE SCLEROSIS | 2008 | 1 |
| 147 | SOELBERG-SORENSEN, P; COMI, | JOURNAL OF NEUROLOGY | 2009 | 1 |
| 149 | GIOVANNONI, G; COMI, G; CO | NEUROLOGY | 2009 | 1 |
| 160 | MUNSCHAUER, F; GIOVANNONI, G | MULTIPLE SCLEROSIS | 2009 | 1 |
| 170 | COMI, G; COOK, S; GIOVANNO | MULTIPLE SCLEROSIS | 2009 | 1 |
| 234 | GIOVANNONI, G; GOLD, R; SE | MULTIPLE SCLEROSIS JOURNAL | 2011 | 1 |
| 236 | KAPPOS, L; GOLD, R; ARNOLD | MULTIPLE SCLEROSIS JOURNAL | 2011 | 1 |
| 260 | GIOVANNONI, G; ARNOLD, DL; | JOURNAL OF NEUROLOGY | 2012 | 1 |
| 273 | GIOVANNONI, G; RADUE, EW; | MULTIPLE SCLEROSIS JOURNAL | 2012 | 1 |
| 289 | HAVRDOVA, E; GOLD, R; FOX, | VALUE IN HEALTH | 2012 | 1 |
| 290 | KAPPOS, L; GOLD, R; ARNOLD | VALUE IN HEALTH | 2012 | 1 |
| 309 | GIOVANNONI, G; GOLD, R; FO | VALUE IN HEALTH | 2013 | 1 |
| 313 | GNANAPAVAN, S; HO, P; HEYW | JOURNAL OF NEUROCHEMISTRY | 2013 | 1 |

[reached getOption("max.print") -- omitted 213 rows]

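H-index of first 10 most productive authors in this dataset:

| Row | Author | h_index | g_index | m_index | TC | NP |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | [ANONYMOUS] | 332 | 505 | 8.7368421 | 1598275 | 75083 |
| 2 | SANDYK R | 11 | 14 | 0.4074074 | 278 | 34 |
| 3 | COMPSTON A | 44 | 114 | 1.2222222 | 13236 | 192 |
| 4 | LASSMANN H | 83 | 149 | 2.2432432 | 24188 | 359 |
| 5 | LAUER K | 15 | 24 | 0.4411765 | 662 | 54 |
| 6 | WARREN KG;CATZ I | 0 | 0 | 0.0000000 | 0 | 0 |
| 7 | KURTZKE JF | 26 | 54 | 0.6842105 | 9200 | 54 |
| 8 | FILIPPI M | 85 | 153 | 2.9310345 | 32731 | 959 |
| 9 | POSER CM | 21 | 39 | 0.5526316 | 7389 | 39 |
| 10 | COMI G | 82 | 145 | 2.4117647 | 28446 | 1148 |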

Lotka’s Law coefficient estimation:

(Lotka’s Law is new to me. The tutorial gave a very brief description of it, but I will have to follow the reference provided to learn more about it.)

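Author Productivity:

| Row | N.Articles | N.Authors | Freq |
| --- | --- | --- | --- |
| 1 | 1 | 69035 | 9.649981e-01 |
| 2 | 2 | 2031 | 2.839011e-02 |
| 3 | 3 | 276 | 3.858035e-03 |
| 4 | 4 | 79 | 1.104293e-03 |
| 5 | 5 | 45 | 6.290275e-04 |
| 6 | 6 | 26 | 3.634381e-04 |
| 7 | 7 | 15 | 2.096758e-04 |
| 8 | 8 | 10 | 1.397839e-04 |
| 9 | 9 | 6 | 8.387034e-05 |
| 10 | 10 | 1 | 1.397839e-05 |
| 11 | 11 | 4 | 5.591356e-05 |
| 12 | 12 | 2 | 2.795678e-05 |
| 13 | 13 | 2 | 2.795678e-05 |
| 14 | 15 | 1 | 1.397839e-05 |
| 15 | 19 | 1 | 1.397839e-05 |
| 16 | 21 | 1 | 1.397839e-05 |
| 17 | 23 | 2 | 2.795678e-05 |
| 18 | 28 | 1 | 1.397839e-05 |
| 19 | 99 | 1 | 1.397839e-05 |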

Beta coefficient estimate: [1] 2.605383

Constant: [1] 0.05230765

Goodness of fit: [1] 0.7900412

P-value of K-S sample test: [1] 0.02815547
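Lotka's Law says that the share of authors with n articles falls off roughly as C / n^beta. One common way to estimate beta and C is a least-squares line through the points on a log-log scale. The Python sketch below does that with the author-productivity table above. It is an illustration under that assumption, and the package's own fitting procedure may differ, so the numbers it prints need not match the reported estimates exactly. The function name `fit_lotka` is hypothetical.

```python
import math

def fit_lotka(n_articles, n_authors):
    """Fit log10(freq) = log10(C) - beta * log10(n) by least squares.

    Returns (beta, C, r_squared), where freq is each row's share of
    all authors.
    """
    total = sum(n_authors)
    xs = [math.log10(n) for n in n_articles]
    ys = [math.log10(count / total) for count in n_authors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return -slope, 10 ** intercept, r_squared

# Values copied from the author-productivity table in this appendix.
n_articles = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 19, 21, 23, 28, 99]
n_authors = [69035, 2031, 276, 79, 45, 26, 15, 10, 6, 1, 4, 2, 2, 1, 1, 1, 2, 1, 1]

beta, constant, r_squared = fit_lotka(n_articles, n_authors)
print(f"beta={beta:.4f}  C={constant:.4f}  R^2={r_squared:.4f}")
```

A beta near 2 is the classic form of the law; larger values mean productivity is concentrated in even fewer authors.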

Plot: Scientific productivity

It is very difficult to see that there are two plot lines, one in red and one in blue. I suspect this is because the theoretical distribution is a complex logarithmic calculation.

Bibliographic network matrices:

Most relevant publication sources:

| Source | Count |
| --- | --- |
| MULTIPLE SCLEROSIS JOURNAL | 5754 |
| MULTIPLE SCLEROSIS | 4498 |
| NEUROLOGY | 4464 |
| JOURNAL OF NEUROIMMUNOLOGY | 3073 |
| JOURNAL OF NEUROLOGY | 2133 |

Citation network:

| Cited reference | Count |
| --- | --- |
| KURTZKE JF, 1983, NEUROLOGY | 6071 |
| POSER CM, 1983, ANN NEUROL | 5063 |
| MCDONALD WI, 2001, ANN NEUROL | 3048 |
| POLMAN CH, 2005, ANN NEUROL | 2132 |
| LUBLIN FD, 1996, NEUROLOGY | 1764 |

Author Network:

| Author | Count |
| --- | --- |
| COMI, G | 1122 |
| FILIPPI, M | 930 |
| KAPPOS, L | 817 |
| MILLER, DH | 673 |
| MONTALBAN, X | 612 |

Country Network:

| Country field | Count |
| --- | --- |
| USA, USA, USA | 8806 |
| USA, USA | 3701 |
| ITALY, ITALY, ITALY | 3155 |
| GERMANY, GERMANY, GERMANY | 2813 |
| USA | 2795 |

Author Keyword Network:

| Author keywords | Count |
| --- | --- |
| MULTIPLE SCLEROSIS, EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS, | 46 |
| EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS, MULTIPLE SCLEROSIS, | 32 |
| MULTIPLE SCLEROSIS, PREVALENCE, INCIDENCE | 20 |
| MULTIPLE SCLEROSIS, EPIDEMIOLOGY, PREVALENCE | 17 |
| MULTIPLE SCLEROSIS (MS), EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS, (EAE) | 13 |

Keyword Plus Network:

| Keywords Plus | Count |
| --- | --- |
| EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS, CENTRAL-NERVOUS-SYSTEM, | 379 |
| EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS, EXPERIMENTAL ALLERGIC, ENCEPHALOMYELITIS | 347 |
| CENTRAL-NERVOUS-SYSTEM, EXPERIMENTAL AUTOIMMUNE ENCEPHALOMYELITIS, | 279 |
| MULTIPLE-SCLEROSIS | 258 |
| EXPERIMENTAL ALLERGIC ENCEPHALOMYELITIS, CENTRAL-NERVOUS-SYSTEM, | 224 |
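A bibliographic network matrix of this kind can be read as a co-occurrence table: for each item (author, keyword, country, cited reference) it records how many documents contain it, and for each pair of items how many documents contain both. The Python sketch below builds those two tallies from plain lists. It is a simplified illustration with made-up records and a hypothetical function name, not the package's matrix construction.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(records):
    """Return (item counts, pair counts) for a list of item lists.

    item counts correspond to the diagonal of a co-occurrence matrix,
    pair counts to its off-diagonal cells.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for items in records:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(items))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return item_counts, pair_counts

# Hypothetical author lists, for illustration only.
records = [
    ["COMI G", "FILIPPI M"],
    ["COMI G", "KAPPOS L", "FILIPPI M"],
    ["KAPPOS L"],
]
item_counts, pair_counts = cooccurrence_counts(records)
print(item_counts.most_common())
print(pair_counts.most_common())
```

If a field is not split on the right separator, whole strings such as "USA, USA, USA" end up being treated as single items, which is one possible reading of the country and keyword rows above.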

Visualizing bibliographic networks

In these graphs, the lines between nodes are not rendering. I suspect it may be due to the number of nodes in my dataset overwhelming the renderer. Definitely an issue to investigate further.

Country Collaboration

Co-citation Network

This may be a case of too many lines rendering as a cloud.
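One way to test the "too many nodes" hypothesis is to shrink the network before plotting it: keep only the most strongly connected nodes and, optionally, drop weak edges. The Python sketch below shows that pruning step on a small, made-up edge list. It does not explain why the edges failed to render here, and the function name `prune_network` and its parameters are hypothetical rather than part of any plotting library.

```python
def prune_network(edges, max_nodes, min_weight=1):
    """Keep edges among the `max_nodes` strongest nodes.

    edges: dict mapping (node_a, node_b) -> weight.
    A node's strength is the sum of the weights of its edges.
    Edges lighter than `min_weight` are dropped as well.
    """
    strength = {}
    for (a, b), weight in edges.items():
        strength[a] = strength.get(a, 0) + weight
        strength[b] = strength.get(b, 0) + weight
    # Sort by descending strength, breaking ties alphabetically.
    ranked = sorted(strength, key=lambda node: (-strength[node], node))
    keep = set(ranked[:max_nodes])
    return {
        (a, b): weight
        for (a, b), weight in edges.items()
        if a in keep and b in keep and weight >= min_weight
    }

# Hypothetical collaboration counts, for illustration only.
edges = {
    ("USA", "ITALY"): 5,
    ("USA", "GERMANY"): 4,
    ("ITALY", "GERMANY"): 3,
    ("USA", "SPAIN"): 1,
    ("SPAIN", "PERU"): 1,
}
print(prune_network(edges, max_nodes=3))
print(prune_network(edges, max_nodes=3, min_weight=4))
```

If the links appear once the graph is reduced to a few dozen nodes, the renderer was probably being overwhelmed; if they still do not appear, the cause lies elsewhere.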

Keyword co-occurrences

Historical co-citation

Legend:

| Label | Paper | Year | LCS |
| --- | --- | --- | --- |
| 1983-1 | POSER CM, 1983, ANN NEUROL | 1983 | 5063 |
| 1983-2 | KURTZKE JF, 1983, NEUROLOGY | 1983 | 5063 |
| 1989-3 | KRUPP LB, 1989, ARCH NEUROL-CHICAGO | 1989 | 852 |
| 1991-4 | RAO SM, 1991, NEUROLOGY | 1991 | 992 |
| 1993-5 | DUQUETTE P, 1993, NEUROLOGY | 1993 | 980 |
| 1996-6 | LUBLIN FD, 1996, NEUROLOGY | 1996 | 1764 |
| 1996-7 | JACOBS LD, 1996, ANN NEUROL | 1996 | 1764 |
| 1998-8 | EBERS GC, 1998, LANCET | 1998 | 852 |
| 1998-9 | TRAPP BD, 1998, NEW ENGL J MED | 1998 | 852 |
| 2000-10 | LUCCHINETTI C, 2000, ANN NEUROL | 2000 | 978 |
| 2000-11 | NOSEWORTHY JH, 2000, NEW ENGL J MED | 2000 | 978 |
| 2001-12 | MCDONALD WI, 2001, ANN NEUROL | 2001 | 3048 |
| 2005-13 | POLMAN CH, 2005, ANN NEUROL | 2005 | 2132 |
| 2008-14 | COMPSTON A, 2008, LANCET | 2008 | 954 |
| 2011-15 | POLMAN CH, 2011, ANN NEUROL | 2011 | 1349 |
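LCS in the legend stands for local citation score: the number of times a paper is cited by other papers inside the same collection, as opposed to its global citation count. The Python sketch below shows that count on a tiny, made-up collection. The paper identifiers and the function name `local_citation_scores` are hypothetical, and the sketch assumes cited references have already been matched to papers in the collection, which in practice is the hard part.

```python
from collections import Counter

def local_citation_scores(papers):
    """Count citations each paper receives from others in the collection.

    papers: dict mapping paper id -> list of cited reference ids.
    References to works outside the collection are ignored.
    """
    in_collection = set(papers)
    scores = Counter({paper_id: 0 for paper_id in papers})
    for citing, references in papers.items():
        # A set so a reference listed twice in one paper counts once.
        for cited in set(references):
            if cited in in_collection and cited != citing:
                scores[cited] += 1
    return scores

# Hypothetical collection, for illustration only.
papers = {
    "P1983": [],
    "M2001": ["P1983", "X1990"],           # X1990 is outside the collection
    "P2005": ["P1983", "M2001"],
    "P2011": ["M2001", "P2005", "P1983"],
}
scores = local_citation_scores(papers)
print(scores.most_common())
```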