The views expressed in this presentation are those of the presenter, not necessarily those of the CCHS or the CSIC.

repositories.webometrics.info

The July 2011 Webometrics repository ranking

Isidro F. Aguillo

Agenda

• Introduction to the Cybermetrics Lab

• Webometrics, an emerging discipline

• Webometrics, OA and repositories

• Ranking Web

– Preliminary results July 2011

• Final comments

• Open debate

The Cybermetrics Lab

• Scholars doing scientific research

– Researchers belonging to the National Research Council (CSIC)

– The largest public research organization in Spain

– Recognised by our peers

– 15 years of experience in quantitative analysis and evaluation of scholarly communication and academic institutions

– Papers in refereed scientific journals, contributions to international conferences, reports to governmental bodies

– Funded by public resources

– International cooperation projects funded by the European Commission

• Research Agenda

– Promote Open Access initiatives

– Global coverage, including developing countries

– Building Cybermetrics/Webometrics as an emerging discipline

Webometrics

• Activity

– Search Engines: size, geographical coverage, languages, biases, algorithms, updating frequency, operators

– Size: number of webpages, rich files, academic papers, media files, languages, age

– Web 2.0: social networks presence, blogmetrics, wikimetrics

• Impact

– Visibility: number of external inlinks, Web impact factor, g-factor, PageRank

– Networks: inter-linking, co-linking, clusters, similarity, network measurements

– Mentions: names of authors, papers, institutions, journals, hot topics

• Position

– Criteria: frequency, presence in selected HTML tags, title, URL, bad practices

– Presence: presence in search engines and directories

– Position: rank in search results

• Analytics (usage)

– Popularity: TrafficRank

– Behavior: patterns of visits, referrers, referrals

– Visits, visitors: number of visits, visitors, geographical and temporal distribution

Webometrics, OA and repositories

• Webometrics requires public Web

– Direct crawling

– OA Electronic Journals

– Repositories

– Indirect crawling: Search engines as proxies

– Link analysis

– Mention analysis

• Analytics

– Usage

– from log files

– Google Analytics or similar

• OpenAIRE WP8

– Combining Bibliometrics, Webometrics and Analytics indicators
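Indirect crawling means asking a search engine about a repository instead of visiting every page. The sketch below shows what such queries might look like, using the `site`, `filetype` and `linkdomain` operators and the rich-file formats named in this ranking's methodology. The function name, the returned layout and the `-site:` exclusion for external inlinks are assumptions made for illustration, not the Lab's actual tooling.

```python
# Rich-file formats counted in the July 2011 edition of the ranking.
RICH_FILE_TYPES = ["pdf", "doc", "docx", "ppt", "pptx", "ps", "eps"]


def build_queries(domain):
    """Return illustrative search-engine queries for one repository domain.

    The function name and the dictionary layout are hypothetical.
    """
    return {
        # Pages indexed under the repository domain.
        "size": f"site:{domain}",
        # One query per rich-file format.
        "rich_files": [f"site:{domain} filetype:{ext}" for ext in RICH_FILE_TYPES],
        # Inlinks to the domain, excluding self-links (assumed convention).
        "visibility": f"linkdomain:{domain} -site:{domain}",
    }


queries = build_queries("eprints.whiterose.ac.uk")
print(queries["size"])
print(queries["visibility"])
```

Operator support differs between search engines and changes over time, so any real data collection would need to check which of these queries a given engine still accepts.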

A few objectives and some problems

• Priorities in OA initiatives

– Populating the repositories

– Obtaining mandates

– Applying standards

– Increasing visibility

• Intellectual property issues

– Authors not transferring full rights to publishers

– Participation in repositories intended for:

– Increasing the number of citations

– Improving author (and institutional) prestige

– But … current OA practices mean some rights are being lost

– At the level of the repository

– At the level of the institution

Transfer of “institutional” rights

• Research results are the most important assets of the universities, but in a few cases the repository is outside the institutional web domain

• HAL Sciences de l'Homme et de la Société http://halshs.archives-ouvertes.fr/

• White Rose Consortium ePrints Repository http://eprints.whiterose.ac.uk/

• University of Arizona's Campus Repository http://arizona.openrepository.com/

• Paris Institute of Technology Pastel Theses http://pastel.archives-ouvertes.fr/

• Universidad de Chile Cybertesis http://www.cybertesis.cl/

• Open Access Server Woods Hole http://darchive.mblwhoilibrary.org/

• TeesRep Teesside University http://tees.openrepository.com/

• Auckland Univ Technology ScholarlyCommons http://aut.researchgateway.ac.nz/

• University of Wolverhampton Digital Repository http://wlv.openrepository.com/

• HAL Ecole Polytechnique http://hal-polytechnique.archives-ouvertes.fr/
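Whether a repository sits inside the institutional web domain can be checked mechanically by comparing host names. The following is a minimal sketch; the function name is hypothetical, and the institutional domains passed in (`arizona.edu`, `tees.ac.uk`) and the `repository.arizona.edu` address are assumptions supplied for the example, not data from the slides.

```python
from urllib.parse import urlparse


def is_within_domain(repository_url, institutional_domain):
    """Return True if the repository host is the institutional domain
    itself or one of its subdomains."""
    host = (urlparse(repository_url).hostname or "").lower()
    domain = institutional_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# Hosted repositories from the list above, checked against assumed
# institutional domains.
print(is_within_domain("http://arizona.openrepository.com/", "arizona.edu"))
print(is_within_domain("http://tees.openrepository.com/", "tees.ac.uk"))
# A hypothetical address inside the institutional domain.
print(is_within_domain("http://repository.arizona.edu/", "arizona.edu"))
```

Matching on the `"." + domain` suffix rather than on a plain substring avoids counting a host such as `arizona.openrepository.com` as institutional just because it contains the institution's name.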

A different point of view

• Regarding naming

– Institutional repository URL should be in the institutional web domain

– The relevant item is the full text file, not the webpage of the record

– It is recommended that the URL of the file includes:

– Institutional web domain

– Last name of (main) author

– Explicit file type (something.pdf)

• Regarding linking

– The item URL (not the record) should be easily linkable (citable): short, with no complex or long numerical codes

– Nothing against PURLs, but not as the main linking target

– http://dx.doi.org/

– http://hdl.handle.net/
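The naming and linking recommendations above can be read as a checklist applied to the URL of a full-text file. The sketch below is one possible reading; the function name, its parameters and the `max_digits` threshold are assumptions, since the slides do not define how long a "long numerical code" is.

```python
import re
from urllib.parse import urlparse


def check_item_url(url, institutional_domain, author_surname, max_digits=4):
    """Check a full-text URL against the naming and linking recommendations.

    Parameter names and the max_digits threshold are illustrative assumptions.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    domain = institutional_domain.lower()
    path = parsed.path
    filename = path.rsplit("/", 1)[-1]
    # A run of more than max_digits digits counts as a long numerical code.
    long_number = re.compile(r"\d{" + str(max_digits + 1) + r",}")
    return {
        "in_institutional_domain": host == domain or host.endswith("." + domain),
        "has_author_surname": author_surname.lower() in path.lower(),
        "has_explicit_file_type": re.search(r"\.[A-Za-z0-9]{2,4}$", filename) is not None,
        "no_long_numeric_codes": long_number.search(path) is None,
    }


# Hypothetical URL that follows every recommendation.
print(check_item_url("http://repository.example.edu/smith/webometrics.pdf",
                     "example.edu", "Smith"))
# A record URL of the kind shown in the slides; the surname is a placeholder.
print(check_item_url("http://dare.uva.nl/document/131441", "uva.nl", "Smith"))
```

The surname test is a plain substring match, so it will miss transliterated or abbreviated names; a production check would need a more careful comparison.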

Recommended URL

http://www.openstarts.units.it/dspace/bitstream/10077/2267/1/13.pdf

Discrepancies in record numbers

http://dare.uva.nl/document/131441

DOI recognises the publisher, not the author

http://digitalcommons.bolton.ac.uk/cmri_journalspr/48/

Complex URLs

http://doras.dcu.ie/15962/

http://doras.dcu.ie/15962/4/OPTICS-S-08-01522.pdf

Ranking Web of Repositories (July 2011)

July 2011 edition

• Repositories with their own domain or subdomain

– 1,222 repositories

– Including 1,154 institutional repositories

– Plus 49 “portals”

• Major changes from previous editions

– Sources

– Exalead data no longer collected

– Yahoo Site Explorer instead of Yahoo Search

– Only for Size

– New formats added: docx, pptx, eps

– Total number of rich files excluded from Size count

– Scholar full count (50%) + Scholar 2006-2010 (50%)

Methodology

Source | Operator | Normalization | Weight | Indicator
Google, Yahoo SE (1), Bing | site (2) | Log-normalization (3) | 20% | SIZE
Google, Yahoo, Bing | filetype (2): pdf, doc, docx, ppt, pptx, ps, eps | | 15% | RICH FILES
Google Scholar | site (at least summaries) | 50% total + 50% (2006-10) | 15% | SCHOLAR
Yahoo SE (1) | linkdomain | | 50% | VISIBILITY

(1) Yahoo is using the Bing database, except for Site Explorer (SE) and a few national mirrors (till mid 2012)

(2) Number of rich files excluded from the global size count

(3) $\ln(a_i + 1) / \ln(a_{\max} + 1)$
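The weights and the log-normalization formula in the methodology can be combined into a single score per repository. The sketch below is one way to do so under stated assumptions: each indicator is already reduced to a single raw count per repository (how counts from several search engines are merged, and how the two Scholar counts are combined, is not shown in the slides), log-normalization is applied to all four indicators, and the repository names and counts are invented for the example.

```python
import math

# Indicator weights from the July 2011 methodology.
WEIGHTS = {"size": 0.20, "rich_files": 0.15, "scholar": 0.15, "visibility": 0.50}


def log_normalize(value, max_value):
    """ln(a_i + 1) / ln(a_max + 1); returns 0.0 when the maximum is zero."""
    if max_value <= 0:
        return 0.0
    return math.log(value + 1) / math.log(max_value + 1)


def composite_scores(repositories):
    """Map each repository name to its weighted sum of normalized indicators.

    `repositories` maps a name to a dict of raw counts keyed like WEIGHTS.
    The function name and input layout are hypothetical.
    """
    maxima = {k: max(raw[k] for raw in repositories.values()) for k in WEIGHTS}
    return {
        name: sum(WEIGHTS[k] * log_normalize(raw[k], maxima[k]) for k in WEIGHTS)
        for name, raw in repositories.items()
    }


# Invented counts, for illustration only.
sample = {
    "repo_a": {"size": 9999, "rich_files": 9999, "scholar": 9999, "visibility": 9999},
    "repo_b": {"size": 99, "rich_files": 99, "scholar": 99, "visibility": 99},
    "repo_c": {"size": 0, "rich_files": 0, "scholar": 0, "visibility": 0},
}
scores = composite_scores(sample)
for name in sorted(scores, key=scores.get, reverse=True):
    print(name, round(scores[name], 3))
```

Because of the logarithm, a repository with one hundredth of the leader's counts still earns about half of the leader's score, which compresses the gap between very large and mid-sized repositories.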

Log-normalization

[Figure: SCORE against RANK for WR, QS, CWTS, ARWU and HEEACT; curve labels: log-norm, z-score]

Top Repositories

Top Institutional Repositories

Top “Portals”

Final comments

• Providers and end-users of repositories are scientists and their institutions

– For them, papers are the most important asset they produce

– Granting increased access and visibility is universally acknowledged

– But some practices are detaching deposited material from its authorship, making the papers difficult to cite (link) and penalizing the “prestige” of the scientists and their academic employers

• The Ranking Web of Repositories intends to promote OA initiatives and support best practices

– The current classification still does not reflect the diversity of repositories, but further efforts will be made in the future

– The methodology is also evolving, but overall results do not change abruptly between consecutive editions

Thank you!

Questions?