Local content in a Europeana cloud Alternative methods of ingestion for small institutions (Stein)...
Local content in a Europeana cloud
Alternative methods of ingestion for small institutions
(Stein) Runar Bergheim, Asplan Viak Internet AS
LoCloud is funded by the European Commission's ICT Policy Support Programme
Overview of Presentation
• Characteristics of Europeana content providers
• Present ingestion methods for Europeana
• Alternative ingestion methods “out there”
• Experiments that may be conducted as part of LoCloud
• 7 slides
• 284 words
• 1 858 characters
• 2 illustrations
• (A seemingly endless stream of words)
Characteristics of Europeana content providers
Those who are «in»
• Professional cultural heritage institutions
• Capacity for investment in infrastructure & projects
• Technical skills beyond what may be expected
• Entities that fit into a hierarchy of aggregators
• Patient
Those who are «out»
• Very small collections
  – Collections by individuals
  – (tens to hundreds of objects)
• Independent institutions with strained funding
• «Non-conforming» online content structure
  – 1 web page, 1 object
Present Europeana ingestion process
Weaknesses of present Europeana ingestion process
• Puts great demands on content providers
  – Partly mitigated by the excellent MINT-MORE tools
• Limited capacity at harvesting end
  – Partly mitigated by aggregator hierarchy
• Low frequency of updates: each iteration takes a long time
  – Partly mitigated by modified content/aggregation architecture of Europeana Cloud
Alternative ingestion methods «out there»
Considerations for alternative ingestion methods
• Difficult to create complete ESE/EDM from crawling
  – But... the typical Europeana record is not really all that «complete»
  – Schema.org, Microformats and other embedded semantics may help
• Deep-content URLs hidden from crawlers
  – Simple «sitemap» protocol may be applied
• Increases capacity for small content providers
• Decreases time-consumption of the content ingestion life-cycle
• Will serve more than one publishing channel
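The point about deep-content URLs can be illustrated with a minimal sketch, assuming the content provider publishes a sitemap in the sitemaps.org XML format. The sample sitemap, the `museum.example.org` URLs and the function name `list_object_pages` are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap of a small collection: one web page per object.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://museum.example.org/objects/1</loc>
    <lastmod>2013-05-01</lastmod>
  </url>
  <url>
    <loc>http://museum.example.org/objects/2</loc>
  </url>
</urlset>
"""


def list_object_pages(sitemap_xml):
    """Return (url, lastmod) pairs for every <url> entry in a sitemap."""
    root = ET.fromstring(sitemap_xml.strip())
    pages = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:  # skip entries without a location
            pages.append((loc, lastmod))
    return pages


if __name__ == "__main__":
    for loc, lastmod in list_object_pages(SAMPLE_SITEMAP):
        print(loc, lastmod)
```

A crawler on the aggregator side could use such a list to reach object pages that are not linked from the front page, and the optional `lastmod` value to skip pages that have not changed since the previous harvest.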
Experiments that may be conducted as part of LoCloud
• Content assessment
  – Assess quantity of «new» content that can be reached using alternative ingestion methods
• Technology experiments
  – HTML embedded semantics based on open standards
  – Creating a test-spider for auto-extraction of metadata from web pages
  – Transformation of data to ESE/EDM
• Design of processes
  – Embedding of spider into aggregator organizations' business processes
  – Ingestion + Quality assurance
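As a rough sketch of what the technology experiments could look like, the example below extracts Schema.org microdata (`itemprop` attributes) from a single object page and maps it to ESE-style field names. Everything here is an assumption made for illustration: the sample HTML, the class and function names, and the mapping table are hypothetical, and a real test-spider would need proper handling of nested and repeated properties.

```python
from html.parser import HTMLParser

# Illustrative (not official) mapping from schema.org properties to ESE-style fields.
ITEMPROP_TO_ESE = {
    "name": "dc:title",
    "creator": "dc:creator",
    "description": "dc:description",
    "dateCreated": "dc:date",
}

# Hypothetical object page with embedded schema.org microdata.
SAMPLE_PAGE = """
<div itemscope itemtype="http://schema.org/CreativeWork">
  <h1 itemprop="name">Wooden fishing boat</h1>
  <span itemprop="creator">Unknown photographer</span>
  <meta itemprop="dateCreated" content="1921">
  <p itemprop="description">Photograph of a wooden fishing boat.</p>
</div>
"""


class ItempropExtractor(HTMLParser):
    """Collect the first value of each itemprop (simplified: no nested markup)."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None or prop in self.properties:
            return
        if "content" in attrs:  # e.g. <meta itemprop="..." content="...">
            self.properties[prop] = attrs["content"] or ""
        else:
            self.properties[prop] = ""
            self._current = prop

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        self._current = None  # stop collecting at the first closing tag


def to_ese_record(html, page_url):
    """Build a flat ESE-style record (a dict) from one crawled web page."""
    parser = ItempropExtractor()
    parser.feed(html)
    parser.close()
    record = {"europeana:isShownAt": page_url}
    for prop, field in ITEMPROP_TO_ESE.items():
        value = parser.properties.get(prop, "").strip()
        if value:
            record[field] = value
    return record


if __name__ == "__main__":
    print(to_ese_record(SAMPLE_PAGE, "http://museum.example.org/objects/1"))
```

The resulting record is deliberately sparse, which fits the observation that a typical record is not really all that «complete»; the quality assurance step would decide whether such a record is good enough to ingest.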
Thank you for the attention!
[email protected]
The views and opinions expressed in this presentation are the sole responsibility of the
authors and do not necessarily reflect the views of the European Commission.