Towards Semantometrics: A New Semantic Similarity Based Measure for Assessing Research Contribution
A New Semantic Similarity Based Measure for Assessing Research Contribution
Petr Knoth & Drahomira Herrmannova, Knowledge Media Institute, The Open University
1
/15
Current impact metrics
• Pros: simplicity, availability for evaluation purposes
• Cons: insufficient evidence of quality and research contribution
2
/15
Problems of current impact metrics
• Sentiment, semantics, context and motives [Nicolaisen, 2007]
• Popularity and size of research communities [Brumback, 2009; Seglen, 1997]
• Time delay [Priem and Hemminger, 2010]
• Skewness of the distribution [Seglen, 1992]
• Differences between types of research papers [Seglen, 1997]
• Ability to game/manipulate citations [Arnold and Fowler, 2010; Editors, 2006]
3
/15
Alternative metrics
• Alt-/Webo-metrics etc.
  – Impact still dependent on the number of interactions in a scholarly communication network
• Full-text (Semantometrics)
  – Contribution to the discipline dependent on the content of the manuscript
4
/15
Approach
Premise: Full text is needed to assess a publication's research contribution.
Hypothesis: The added value of publication p can be estimated based on the semantic distance from the publications cited by p to the publications citing p.
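
The slides do not say which similarity measure stands behind the semantic distance. As a minimal sketch, assuming cosine similarity over raw term counts (an illustrative choice, not one stated in the talk), the distance between two full texts could be computed roughly as follows. The function names `term_counts`, `cosine_similarity` and `semantic_distance` are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lower-case the text and count word tokens (a deliberately crude tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors, in [0, 1]."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def semantic_distance(text_a: str, text_b: str) -> float:
    """Distance defined as one minus similarity."""
    return 1.0 - cosine_similarity(term_counts(text_a), term_counts(text_b))
```

Any other similarity bounded by 0 and 1 could be swapped in without changing the rest of the approach.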
5
/15
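Contribution measure

[Diagram: publication p between set A (publications cited by p) and set B (publications citing p); edges labelled dist(a,b) between members of A and B, and dist(b1,b2) between members of B.]

$$\mathrm{Contribution}(p) = \frac{\overline{B}}{\overline{A}} \cdot \frac{1}{|B| \cdot |A|} \cdot \sum_{a \in A,\, b \in B,\, a \neq b} \mathrm{dist}(a,b)$$

$$\overline{X} = \begin{cases} 1 & |A| = 1 \lor |B| = 1 \\ \dfrac{1}{|X|\,(|X|-1)} \cdot \sum_{x_1 \in X,\, x_2 \in X,\, x_1 \neq x_2} \mathrm{dist}(x_1, x_2) & |A| > 1 \land |B| > 1 \end{cases}$$

$$\mathrm{dist}(a,b) = 1 - \mathrm{sim}(a,b)$$

\overline{X}: average distance of the set members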
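
A minimal Python sketch of the measure, read literally from the formula above. The function names, the use of document identifiers, and the error handling for empty sets or a zero average distance within A are assumptions of this sketch; the slide does not cover those cases. The distance function is passed in, so any dist(a,b) = 1 − sim(a,b) can be used.

```python
from itertools import permutations
from typing import Callable, Hashable, Sequence

Dist = Callable[[Hashable, Hashable], float]


def average_internal_distance(docs: Sequence[Hashable], dist: Dist) -> float:
    """Mean distance over all ordered pairs of distinct members of one set."""
    n = len(docs)
    if n < 2:
        raise ValueError("need at least two documents to average distances")
    total = sum(dist(x1, x2) for x1, x2 in permutations(docs, 2))
    return total / (n * (n - 1))


def contribution(cited: Sequence[Hashable], citing: Sequence[Hashable], dist: Dist) -> float:
    """Contribution(p), with A = cited (references of p) and B = citing."""
    if not cited or not citing:
        raise ValueError("both A (cited) and B (citing) must be non-empty")
    if len(cited) == 1 or len(citing) == 1:
        ratio = 1.0  # special case on the slide: both averages are set to 1
    else:
        avg_a = average_internal_distance(cited, dist)
        if avg_a == 0.0:
            raise ValueError("average distance within A is zero; ratio undefined")
        ratio = average_internal_distance(citing, dist) / avg_a
    # Cross-set sum over a in A, b in B, skipping a publication paired with itself.
    cross = sum(dist(a, b) for a in cited for b in citing if a != b)
    return ratio * cross / (len(citing) * len(cited))


# Toy usage: documents placed on a line, distance = absolute difference.
position = {"a1": 0.0, "a2": 0.25, "b1": 0.5, "b2": 1.0}
toy_dist = lambda x, y: abs(position[x] - position[y])
print(contribution(["a1", "a2"], ["b1", "b2"], toy_dist))  # 1.25
```

In the toy example the citing set is twice as spread out as the cited set (0.5 vs 0.25), so the average cross-set distance of 0.625 is scaled by 2.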
6
/15
Datasets
• Requirements
  – Availability of full-text
  – Density
  – Multidisciplinarity
  – (Availability of citations)
7
/15
Datasets

Dataset                Full-text   Density   Multidisciplinarity
CORE                   ✓           ✗         ✓
Open Citation Corpus   ✓           -         ✗
ACM Dataset            ✗           -         ✓
DBLP+Citation          ✗           -         ✓
iSearch Collection     ✓           ✗         ✗
8
/15
Our dataset
• 10 seed publications from CORE with varying levels of citations
• missing citing and cited publications downloaded manually
• only freely accessible English documents were downloaded
• in total 716 documents (~50% of the complete network)
• 2 days to gather the data
9
/15
Results

Publication no.   |B| (Citation score)   |A| (No. of references)   Contribution
1                 5 (9)                  6 (8)                     0.4160
2                 7 (11)                 52 (93)                   0.3576
3                 12 (20)                15 (31)                   0.4874
4                 14 (27)                27 (72)                   0.4026
5                 16 (30)                12 (21)                   0.5117
6                 25 (41)                8 (13)                    0.4123
7                 39 (71)                70 (128)                  0.4309
8                 53 (131)               3 (10)                    0.5197
9                 131 (258)              22 (32)                   0.5058
10                172 (360)              17 (20)                   0.5004
Total             474 (958)              232 (428)
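
The totals row can be cross-checked against the dataset slide with a few lines of arithmetic. This sketch assumes that each "x (y)" cell means x documents collected out of y reported, which the slides do not state explicitly; under that reading the figures reproduce the 716 documents and the roughly 50% coverage quoted for the dataset.

```python
# (collected, reported) counts per seed publication, copied from the results table.
# Reading "x (y)" as "collected (reported)" is an assumption of this sketch.
citing = [(5, 9), (7, 11), (12, 20), (14, 27), (16, 30),
          (25, 41), (39, 71), (53, 131), (131, 258), (172, 360)]
cited = [(6, 8), (52, 93), (15, 31), (27, 72), (12, 21),
         (8, 13), (70, 128), (3, 10), (22, 32), (17, 20)]
seeds = len(citing)  # the 10 seed publications themselves

# Collected documents: citing + cited + seeds.
collected = sum(c for c, _ in citing) + sum(c for c, _ in cited) + seeds
# Size of the complete network under the same counting.
complete = sum(t for _, t in citing) + sum(t for _, t in cited) + seeds
coverage = collected / complete

print(collected, complete, round(coverage, 3))
```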
10
/15
Results
11
/15
Current impact metrics vs Semantometrics

Unaffected by                                      Current impact metrics   Semantometrics
Citation sentiment, semantics, context, motives    ✗                        ✔
Popularity & size of res. communities              ✗                        ✔
Time delay                                         ✗                        ✗/✔*
Skewness of the citation distribution              ✗                        ✔
Differences between types of res. papers           ✗                        ✔
Ability to game/manipulate the metrics             ✗                        ✗/✔**

* reduced to 1 citation
** assuming that self-citations are not taken into account
12
/15
Conclusions
• Full-text necessary
• Semantometrics are a new class of methods
• We showed one method to assess the research contribution
13
/15
References
• Jeppe Nicolaisen. 2007. Citation Analysis. Annual Review of Information Science and Technology, 41(1):609-641.
• Douglas N Arnold and Kristine K Fowler. 2010. Nefarious numbers. Notices of the American Mathematical Society, 58(3):434-437.
• Roger A Brumback. 2009. Impact factor wars: Episode V--The Empire Strikes Back. Journal of Child Neurology, 24(3):260-2, March.
• The PLoS Medicine Editors. 2006. The impact factor game. PLoS Medicine, 3(6), June.
14
/15
References
• Jason Priem and Bradley M. Hemminger. 2010. Scientometrics 2.0: Toward new metrics of scholarly impact on the social Web. First Monday, 15(7), July.
• Per Omar Seglen. 1992. The Skewness of Science. Journal of the American Society for Information Science, 43(9):628-638, October.
• Per Omar Seglen. 1997. Why the impact factor of journals should not be used for evaluating research. BMJ: British Medical Journal, 314(February):498-502.
15