Towards Semantometrics: A New Semantic Similarity Based Measure for Assessing Research Contribution

Petr Knoth & Drahomira Herrmannova
Knowledge Media Institute, The Open University

Current impact metrics

•  Pros: simplicity, availability for evaluation purposes
•  Cons: insufficient evidence of quality and research contribution

Problems of current impact metrics

•  Sentiment, semantics, context and motives [Nicolaisen, 2007]
•  Popularity and size of research communities [Brumback, 2009; Seglen, 1997]
•  Time delay [Priem and Hemminger, 2010]
•  Skewness of the distribution [Seglen, 1992]
•  Differences between types of research papers [Seglen, 1997]
•  Ability to game/manipulate citations [Arnold and Fowler, 2010; Editors, 2006]

Alternative metrics

•  Alt-/Webo-metrics etc.
   –  Impact still dependent on the number of interactions in a scholarly communication network
•  Full-text (Semantometrics)
   –  Contribution to the discipline dependent on the content of the manuscript.

Approach

Premise: Full-text needed to assess publication’s research contribution.

Hypothesis: Added value of publication p can be estimated based on the semantic distance from the publications cited by p to publications citing p.

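Contribution measure

[Diagram: publication p with the sets A and B; dist(a,b) is marked between a member of A and a member of B, and dist(b1,b2) between two members of B]

$$\mathrm{Contribution}(p) = \frac{\overline{B}}{\overline{A}} \cdot \frac{1}{|B| \cdot |A|} \cdot \sum_{a \in A,\, b \in B,\, a \neq b} \mathrm{dist}(a, b)$$

$$\overline{X} = \begin{cases} 1 & |A| = 1 \lor |B| = 1 \\ \dfrac{1}{|X|\,(|X| - 1)} \cdot \sum_{x_1 \in X,\, x_2 \in X,\, x_1 \neq x_2} \mathrm{dist}(x_1, x_2) & |A| > 1 \land |B| > 1 \end{cases}$$

($\overline{X}$: average distance of the set members)

$$\mathrm{dist}(a, b) = 1 - \mathrm{sim}(a, b)$$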
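The contribution measure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the slides do not say how sim(a, b) is computed, so the `sim` function below (cosine similarity over raw term counts) is a stand-in, and the names `average_distance` and `contribution` are hypothetical. Following the column headers of the Results table, A is taken to be the set of publications cited by p and B the set of publications citing p. Documents are plain strings and are assumed to be distinct within each set.

```python
from collections import Counter
from itertools import permutations
from math import sqrt
from typing import Callable, Sequence

Dist = Callable[[str, str], float]


def sim(a: str, b: str) -> float:
    """Cosine similarity of raw term-frequency vectors.

    Assumption: a simple stand-in, since the slides leave sim(a, b) unspecified.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def dist(a: str, b: str) -> float:
    """dist(a, b) = 1 - sim(a, b)."""
    return 1.0 - sim(a, b)


def average_distance(X: Sequence[str], A: Sequence[str], B: Sequence[str], d: Dist) -> float:
    """Average distance of the members of X (the overlined term in the formula)."""
    if len(A) == 1 or len(B) == 1:
        return 1.0
    # All ordered pairs (x1, x2) taken from different positions in X:
    # |X| * (|X| - 1) pairs, matching the normalisation in the formula.
    total = sum(d(x1, x2) for x1, x2 in permutations(X, 2))
    return total / (len(X) * (len(X) - 1))


def contribution(A: Sequence[str], B: Sequence[str], d: Dist = dist) -> float:
    """Contribution(p) for cited set A and citing set B.

    Raises ZeroDivisionError if every pair in A has distance 0.
    """
    if not A or not B:
        raise ValueError("A and B must both be non-empty")
    ratio = average_distance(B, A, B, d) / average_distance(A, A, B, d)
    cross = sum(d(a, b) for a in A for b in B if a != b)
    return ratio * cross / (len(B) * len(A))
```

Calling `contribution(cited_texts, citing_texts)` with two lists of document texts returns a single score. When either set has exactly one member, the ratio term is 1 and the score reduces to the average distance between members of A and members of B. The distance function is passed as a parameter so that a different similarity measure can be swapped in.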

Datasets

•  Requirements
   –  Availability of full-text
   –  Density
   –  Multidisciplinarity
   –  (Availability of citations)

Datasets

Dataset                 Full-text   Density   Multidisciplinarity
CORE                    ✓           ✗         ✓
Open Citation Corpus    ✓           -         ✗
ACM Dataset             ✗           -         ✓
DBLP+Citation           ✗           -         ✓
iSearch Collection      ✓           ✗         ✗

Our dataset

•  10 seed publications from CORE with varying levels of citations
•  missing citing and cited publications downloaded manually
•  only freely accessible English documents were downloaded
•  in total 716 documents (~50% of the complete network)
•  2 days to gather the data

Results

Publication no.   |B| (Citation score)   |A| (No. of references)   Contribution
1                 5 (9)                  6 (8)                     0.4160
2                 7 (11)                 52 (93)                   0.3576
3                 12 (20)                15 (31)                   0.4874
4                 14 (27)                27 (72)                   0.4026
5                 16 (30)                12 (21)                   0.5117
6                 25 (41)                8 (13)                    0.4123
7                 39 (71)                70 (128)                  0.4309
8                 53 (131)               3 (10)                    0.5197
9                 131 (258)              22 (32)                   0.5058
10                172 (360)              17 (20)                   0.5004
Total             474 (958)              232 (428)
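The totals row and the "716 documents (~50% of the complete network)" figure can be cross-checked from the table itself. The sketch below rests on assumptions the slides do not state: the first number in each cell is read as the count of documents actually retrieved and the number in parentheses as the full count, the 10 seed publications are counted as part of the network, and no document is shared between the neighbourhoods of different seeds.

```python
# (retrieved, full count) per seed publication, copied from the Results table.
citing = [(5, 9), (7, 11), (12, 20), (14, 27), (16, 30),
          (25, 41), (39, 71), (53, 131), (131, 258), (172, 360)]   # |B| (Citation score)
cited = [(6, 8), (52, 93), (15, 31), (27, 72), (12, 21),
         (8, 13), (70, 128), (3, 10), (22, 32), (17, 20)]          # |A| (No. of references)


def totals(rows):
    """Sum the retrieved counts and the full counts separately."""
    return sum(got for got, _ in rows), sum(full for _, full in rows)


b_got, b_full = totals(citing)
a_got, a_full = totals(cited)
seeds = len(citing)  # assumption: the seed publications are counted too

collected = b_got + a_got + seeds     # documents gathered
complete = b_full + a_full + seeds    # size of the complete network
coverage = collected / complete

print(f"citing: {b_got} of {b_full}, cited: {a_got} of {a_full}")
print(f"collected {collected} of {complete} documents ({coverage:.1%})")
```

Under these assumptions the column sums match the totals row, and the collected count matches the 716 documents reported for the dataset.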

Current impact metrics vs Semantometrics

Unaffected by                                       Current impact metrics   Semantometrics
Citation sentiment, semantics, context, motives     ✗                        ✔
Popularity & size of res. communities               ✗                        ✔
Time delay                                          ✗                        ✗/✔*
Skewness of the citation distribution               ✗                        ✔
Differences between types of res. papers            ✗                        ✔
Ability to game/manipulate the metrics              ✗                        ✗/✔**

*  reduced to 1 citation
** assuming that self-citations are not taken into account

Conclusions

•  Full-text necessary
•  Semantometrics are a new class of methods
•  We showed one method to assess the research contribution

References

•  Jeppe Nicolaisen. 2007. Citation Analysis. Annual Review of Information Science and Technology, 41(1):609-641.
•  Douglas N Arnold and Kristine K Fowler. 2010. Nefarious numbers. Notices of the American Mathematical Society, 58(3):434-437.
•  Roger A Brumback. 2009. Impact factor wars: Episode V -- The Empire Strikes Back. Journal of Child Neurology, 24(3):260-2, March.
•  The PLoS Medicine Editors. 2006. The impact factor game. PLoS Medicine, 3(6), June.
•  Jason Priem and Bradley M. Hemminger. 2010. Scientometrics 2.0: Toward new metrics of scholarly impact on the social Web. First Monday, 15(7), July.
•  Per Ottar Seglen. 1992. The Skewness of Science. Journal of the American Society for Information Science, 43(9):628-638, October.
•  Per Ottar Seglen. 1997. Why the impact factor of journals should not be used for evaluating research. BMJ: British Medical Journal, 314(February):498-502.