A Field Study of Subject Gateways on “Zeitgeschichte”.
Applied Historical Information Science
Diploma thesis (Diplomarbeit)
submitted for the academic degree of
Magister der Philosophie at the
Faculty of Philosophy and History (Philosophisch-Historische Fakultät) of the
Leopold-Franzens-Universität Innsbruck
Submitted at the Institut für Zeitgeschichte (Institute of Contemporary History)
Supervisor:
o. Univ.-Prof. Dr. Rolf Steininger
by Michael Kröll
Innsbruck, March 2006
Screenshot of the project homepage, March 5th 2006 http://www.pepl.info/papers/fieldstudy_sg_zeitgeschichte/
Table of contents
I Introduction
  1 What is a “subject gateway”?
  2 Brief history of the development of subject gateways for the field of Contemporary History
II Basic information about the three web-based subject gateways
III Generic technical web-page evaluation methods
  1 Syntactic standards
  2 Metadata standards
  3 Accessibility guidelines
  4 Usability evaluation
IV Setting up a framework for specific analyses
V Using the custom analysis framework
  1 Resource identifier validation
  2 Subject classification analysis
    2.1 Current subject indexing standards
    2.2 Semantic Web to the rescue?
    2.3 Subject classification of the three subject gateways
    2.4 Subject focuses as compared to the “offline world” of Contemporary History research and teaching
  3 Common resource identifiers analysis
  4 Duplicate pointers analysis
  5 Information network analysis
VI Interpreting the results of the analysis: Two theses on Contemporary History in the German language area
  1 Is Contemporary History the history of national socialism? Discussing the meaning and purpose of “Zeitgeschichte”
  2 Subject gateways are needed as hubs in the Contemporary History Network
VII Conclusions
VIII Bibliography
IX Appendix
  1 Web usability checklists
    1.1 Best practices for web interfaces of searchable databases
    1.2 Selected Nielsen web design mistakes
  2 Top 50 keywords in the aggregated ZIS-, VLZ-, ZOL-Link database
  3 Top 50 keywords related to Contemporary History - Innsbruck University Library OPAC database
  4 URLs common to all three subject gateways ZIS/ZOL/VLZ
  5 System setup and availability
  6 Database design
  7 Overview of crawler and import programs
  8 Overview of the analysis programs
I Introduction
In the German-speaking area1, three major web-based subject gateways focusing on
Contemporary History2 have been built up during the last ten years. These projects, which
essentially work in parallel, each claim to be a main reference of their kind. Nevertheless,
they differ substantially in certain project characteristics, such as their dates of
establishment and the resources available to them. This paper provides a comparative
analysis of www.zeitgeschichte-online.de, www.vl-zeitgeschichte.de and zis.uibk.ac.at as
three major examples of German-language web-based subject gateways on Contemporary
History, embedded in the context of applied Historical Information Science.
The principal historical methods and quality standards established over the last century
and a half by the discipline of history can be adapted to the online world of historical
content. However, new criteria for online historical content still have to be established for
“secondary aspects”3 such as the technical form and structure of content, metadata, and linking.
Only if such standards are well established within the online history community can the
online historical information space be covered efficiently, which in turn could enable
participation in the vision of the Semantic Web4 in the longer term.5
Tim Berners-Lee, the original “inventor” of the World Wide Web, envisioned the
Semantic Web as “an extension of the current web in which information is given well-defined
meaning, better enabling computers and people to work in cooperation.”6 This new generation
of the World Wide Web should overcome the limits of current search engines and provide a
completely new quality of knowledge development. Although Berners-Lee’s vision may
sound utopian in part, the technological foundation for the Semantic Web has already been
laid in recent years, and considerable effort is still being put into its further
development.7
1 For most parts of this study, the reference to the “German language area” covers Germany and Austria.
2 Unless a specific context is given, “Contemporary History” is used synonymously with “Zeitgeschichte” in this paper, bearing in mind that “Contemporary History” has a number of different meanings in diverse national and language-specific settings. An exemplary overview of the different “Contemporary Histories” in a number of European countries is provided by Gehler (2002) "Zeitgeschichte zwischen Europäisierung und Globalisierung" 25-32.
3 Enderle (2001a) "Der Historiker, die Spreu und der Weizen, zur Qualität und Evaluierung geschichtswissenschaftlicher Internet-Ressourcen" 62.
4 A general overview of the Semantic Web is provided by Miller and Swick (2003) "An Overview of W3C Semantic Web Activity".
5 Cf. Enderle (2001b) "Geschichtswissenschaft, Fachinformation und das Internet" 7.
6 Berners-Lee, et al. (2001) "The Semantic Web. A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities".
Before standards for the technical form and structure of content, metadata, and linking
of online historical content, which are among other things the basis for participation in the
Semantic Web8, can be put to widespread use, it is necessary to assess the current status.
For web-pages on Contemporary History in the German language area,
systematic evaluations already exist9. To complement these, this paper introduces aspects
of a methodological canon for evaluating subject gateways. Pursuing a higher-level empirical
and technical approach, the author will discuss the methodological potential, prospects, and
implications as an example of applied Historical Information Science. It will be shown
that approaching historical content by analyzing its “secondary”, i.e. formal,
aspects can prepare the ground for new insights about the content itself, and that such
analysis can also instigate discussion of the content focus of a whole discipline.10
In the context of this paper, Historical Information Science shall be defined “as an
extension of older notions of scientific historicism combined with modern Information
Science, with the application of modern Social Science research methodology and
state-of-the-art information technology”11. This follows the definition of the U.S. American
historian and librarian Lawrence McCrank. As a study in his sense of applied Historical
Information Science, this paper pursues focuses and objectives12 such as
• “exploration of methodologies”,
• “testing potential applications”, or
• “experimentation to develop processes and products for historical research which
may be broadly applicable or customized for problems in specific domains”.
The database, including the analysis data, and the software programs developed in the
course of this study are available on the project homepage at
http://www.pepl.info/papers/fieldstudy_sg_zeitgeschichte/.13
7 Cf. the W3C Semantic Web homepage: http://www.w3.org/2001/sw/, January 18th 2006.
8 Cf. Berners-Lee, et al.: “For the semantic web to function, computers must have access to structured collections of information and sets of inference rules that they can use to conduct automated reasoning”.
9 Cf. Wirtz (2005) "Marktanalyse. Deutschsprachige Online- und CD/DVD-Produktionen zum Thema Nationalsozialismus und Holocaust. Ein Projekt des Fritz Bauer Instituts im Auftrag der Bundeszentrale für politische Bildung" and Dornik (2003) "Zeitgeschichte und Internet".
10 Cf. the two theses on Contemporary History in the German language area in chapter VI.
11 McCrank (2002) "Historical Information Science. An emerging Unidiscipline" 593. He gives a more elaborate definition at 56f: “the scientific study of historical information and of information and communication technologies, and the techniques, methods, and intellectual frameworks by which we extract meaning from this sources. This includes the creation of sources and their use in original content, historical use, and current use in studying History. This broad, integrative and unifying super-discipline concerns records of all kinds but especially electronic sources and archives because of the application of modern information technology for their access and analysis; historical information access and retrieval and contemporary access to historical materials; meta-history and metadata in documentation; data-text-image analysis; forensics and computing applications; and information technologies applied in historical research, communication, and instruction”. Alternative definitions of Historical Information Science, “historische informatiekunde” and “Historische Fachinformatik und Dokumentation” respectively, are provided by Boonstra, et al. (2004) "Past, present and future of historical information science" 20 and Kropač (2004) "Was ist 'Historische Fachinformatik und Dokumentation'? Terminologisches, Inhalte, Aufgaben". In the German mainstream of historical research, “Historische Fachinformatik” is seen as a historical ancillary science; cf. Vogeler, et al. (2005) "Historische Hilfswissenschaften".
After defining the term “subject gateway”, a brief history of the development of
subject gateways for the field of Contemporary History will be given. Following basic
information about the three web-based subject gateways at issue, generic technical web-page
evaluation methods will be discussed. Subsequently, a framework for specific analyses of
web-pages and subject gateways will be introduced and applied to the three subject gateways
along several analysis vectors. Aspects of this specific analysis framework and the analysis
vectors have already been discussed in the author’s contribution to the XVI international
conference of the Association for History and Computing in Amsterdam 2005.14 Two theses
on Contemporary History in the German language area will then be derived from the
analysis. Finally, the findings of the study will be summed up in the “Conclusions”
chapter.
1 What is a “subject gateway”?
A subject gateway is an Internet Service. Although sometimes used synonymously
with “Internet Portal” or “Virtual Library”15, the term “subject gateway” has a distinctive
meaning. In the course of this study, “subject gateway” and “quality-controlled subject
gateway” will be used as defined by “Digital Library Scientist” Traugott Koch:
Subject gateways are thereby defined as
Internet services which support systematic resource discovery. They provide links to resources (documents, objects, sites or services), predominantly accessible via the Internet. The service is based on resource description. Browsing access to the resources via a subject structure is an important feature.16
Quality-controlled subject gateways are thereby defined as
Internet services which apply a rich set of quality measures to support systematic resource discovery. Considerable manual effort is used to secure a selection of resources which meet quality criteria and to display a rich description of these resources with standards-based metadata. Regular checking and updating ensure good collection management. A main goal is to provide a high quality of subject access through indexing resources using controlled vocabularies and by offering a deep classification structure for advanced searching and browsing.
12 Cf. McCrank "Historical Information Science. An emerging Unidiscipline" 594.
13 This meets another Historical Information Science focus: “The deposit of data, files, programming, shareware, etc. in appropriate archives and research centers so that one contributes to the cumulative resource base available to historians everywhere.” Ibid.
14 Kröll (2005) "Not ready for the Semantic Web: A field study of subject gateways on Contemporary History".
15 Complementary typologies of Subject Gateway-related Internet services are provided by Campbell, et al. (2003) "Definitions for Web-Based Services" and Nentwich (2003) "Cyberscience. Research in the Age of the Internet" 78-81.
In the following, the three subject gateways at issue will be referred to simply as “subject
gateways” rather than “quality-controlled subject gateways”, because none of the three fulfills
all seven criteria17 laid down for the latter by Koch.
2 Specifics of subject gateways for the field of Contemporary History in the German language area
Subject gateways for the field of History have been developed as a manifestation of
the institutionalization of systematic resource discovery for online historical information.
Prior to that, individual historians maintained more or less extensive annotated link lists,
some of which still exist today and continue to be of value to fellow historians.
In the Anglo-American language area, the development of subject gateways on
History did not include a single subject gateway specifically focused on Contemporary
History. To date, not a single one is listed in the main catalogues for subject
gateways18 and online resources on History. In those catalogues, only German-language subject
gateways on “Zeitgeschichte” are present. An answer to the question of why subject gateways on
Contemporary History are a domain of the German language area can be found in the
organizational structure of the respective academic research domains: Departments for
Contemporary History can only rarely be found outside the German language area. It can be
assumed that, without such an institutional background, it is difficult to establish and
maintain subject gateways. Rather than subject gateways covering the broad area of
Contemporary History, the Anglo-American language area offers web-pages with a more
specific thematic focus, such as the Holocaust, the Northern Ireland conflict, the Cold War,
or the Vietnam War19.
16 Koch (2000) "Quality-controlled subject gateways: definitions, typologies, empirical overview" 24f.
17 Ibid. 25f.
18 Cf. http://vlib.org/History, January 20th 2006 and http://www.history.ac.uk/ihr/Resources/Type/gateway.html, January 20th 2006. Homepages of institutions or projects and web-pages lacking an explicit author will subsequently be referred to by URL and last access date.
The development of subject gateways in the German language area shows that, besides
fulfilling their original purpose of providing resource descriptions, they increasingly
offer additional services, such as content in the form of articles or primary sources
and specialized communication platforms in the form of discussion fora and
mailing lists.
19 Exemplary: “Holocaust Cybrary remembering the Survivors” (http://www.remember.org/,
February 5th 2006), “CAIN: Northern Ireland Conflict, Politics, & Society. Information on 'the troubles'” (http://cain.ulst.ac.uk/, February 5th 2006), “Cold War” (http://www.cnn.com/SPECIALS/cold.war/, February 5th 2006), “Vietnam War Internet Project” (http://www.vwip.org/, February 5th 2006).
II Basic information about the three web-based subject gateways
The “Zeitgeschichte Information System” (ZIS), online since early 1995, is the
longest-running web-based subject gateway on Contemporary History among the three
projects examined. Maintained by the Institute for Contemporary History at the Leopold-
Franzens-University of Innsbruck, its main features include an annotated link database
comprising about 800 entries, primary sources on 20th century Austrian history, a
documentation of the history of South Tyrol, and a documentation on “Austria & Israel since
1945”. The most recent review of ZIS was published by Martin Gasteiner and Christian
Pape in 200520.
The “Virtual Library Zeitgeschichte” (VLZ), part of the WWW Virtual Library21,
resulted from the merger of the Virtual Library sections “Third Reich/World War II” and “20th
Century” in 2003. The VLZ is managed on a voluntary basis by a team of historians, Ralf Blank
and Stephanie Marra. Its main feature is a link database including about 700
entries. In November 2005 the VLZ was the target of a hacker attack, from which it is not
expected to recover before the planned relaunch in March 200622. The most recent
review of VLZ was published by Ingrid Böhler and Michael Gehler in 200423. The raw
data used for the comparative analysis was gathered in April 2005.
The “Zeitgeschichte-Online” (ZOL) project is a joint endeavor of the
“Zentrum für Zeithistorische Forschung” (ZZF), Potsdam, and the “Staatsbibliothek zu
Berlin – Preußischer Kulturbesitz” (SBB), Berlin, funded by the “Deutsche
Forschungsgemeinschaft”. The subject gateway went online in early 2004 and is supported in
close co-operation with what are probably the two most important subject gateways on History
in the German-speaking area, “Clio-Online”24 and “H-Soz-u-Kult”25. “Zeitgeschichte-Online”
features a database of institutions related to and persons working in the field of Contemporary
History, a sub-branch of the H-Net list H-Soz-u-Kult called “H-Soz-u-Kult/Zeitgeschichte”,
pertinent subject foci, subject-related online discussion fora, and a link database including
about 2,100 entries. The most recent review of ZOL was published by Dirk van Laak26 in
2004.
20 Gasteiner and Pape (2005) "Clio-online Guide Österreich".
21 http://vlib.org/, January 20th 2006.
22 According to an e-mail from Ralf Blank to the author of February 20th 2006.
23 Böhler and Gehler (2004) "Wendungen nach innen? Selektive Blicke auf die Zeitgeschichte".
24 http://www.clio-online.de/, January 20th 2006.
25 Cf. Hohls (2004) "H-Soz-u-Kult: Kommunikation und Fachinformation für die Geschichtswissenschaften" and http://hsozkult.geschichte.hu-berlin.de/, January 20th 2006.
Judging from the infrastructural background of the co-operation partners, the
“Zeitgeschichte-Online” project’s subject gateway could be expected to show by far the highest
degree of professionalism of the three subject gateways at issue.
III Generic technical web-page evaluation methods
The Internet could not exist without technical standards. However, judging by the
majority of web-pages currently available, one would be inclined to think that quite the
opposite is true. This lack of conformance to technical standards entails a
considerable lack of interoperability, accessibility, and usability.
1 Syntactic standards
The World Wide Web Consortium (W3C)27, which is mainly responsible for creating web
standards, provides validation services for syntactic web-page standards. Testing the start
pages of the three subject gateways for HTML and CSS validity with these validators28 showed
that all pages are invalid, with error counts ranging from 6 to 410. Despite being
syntactically invalid, the pages are still accessible with most browsers. It has to be
concluded that the creators of the HTML and CSS pages are either unaware of or unconcerned
about standards conformance.29
2 Metadata standards
Only if a web-page is syntactically formalized, i.e. marked up in valid
(X)HTML, can value-added processing by software tools be undertaken. Adding a formal and
explicit meaning to content by using metadata is one of the cornerstones of a future Semantic
Web. Implementing Dublin Core30 as the de facto formal metadata standard for one’s web
pages would be a first step towards that goal. Only one of the three subject gateways at issue,
the “Zeitgeschichte Informations System”, partly31 uses Dublin Core in its HTML pages.
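Whether a page embeds Dublin Core metadata can be checked mechanically. The following is a minimal sketch in Python, assuming the common convention of expressing Dublin Core elements as HTML meta elements whose name attribute starts with "DC." (for example DC.title). The class name, the helper function and the sample page are hypothetical illustrations and are not part of the software developed for this study.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collects <meta name="DC.*" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.elements = []  # list of (name, content) tuples in document order

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        # Assumption: Dublin Core elements are recognized by a "DC." prefix,
        # matched case-insensitively.
        if name.lower().startswith("dc."):
            self.elements.append((name, attributes.get("content") or ""))


def extract_dublin_core(html_text):
    """Return all Dublin Core meta elements found in the given HTML text."""
    parser = DublinCoreExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.elements


# Hypothetical sample page, used only to illustrate the check.
SAMPLE_PAGE = """<html><head>
<title>Example entry page</title>
<meta name="DC.title" content="Example entry page">
<meta name="DC.creator" content="Example Institute">
<meta name="keywords" content="history">
</head><body></body></html>"""

if __name__ == "__main__":
    for name, content in extract_dublin_core(SAMPLE_PAGE):
        print(f"{name}: {content}")
```

A page for which the function returns an empty list carries no Dublin Core metadata in this form. The sketch deliberately ignores other ways of publishing metadata, such as linked RDF documents.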
26 Van Laak (2004) "Rez. WWW: Zeitgeschichte-online".
27 http://www.w3.org/, January 20th 2006.
28 The validator used for HTML was http://validator.w3.org/, January 20th 2006; the one used for CSS validation was http://jigsaw.w3.org/css-validator/, January 20th 2006.
29 A comprehensive discussion of the resulting implications exceeds the scope of this paper.
30 http://www.dublincore.org/, January 20th 2006. A brief overview of and introduction to its usage is provided by Hillman (2003) "Using Dublin Core".
31 Dublin Core is used on the entry pages only.
3 Accessibility guidelines
Another generic factor for the quality of web-pages is their conformance to
accessibility guidelines like the W3C’s WAI32 or the U.S. Government’s Section 50833. Again,
the use of validators34 to check conformance shows that none of the three subject gateways
passes the tests. Unlike failures in the HTML and CSS validation tests, inaccessible pages
have severe consequences for people with disabilities; making web-pages accessible is
therefore an urgent task.
4 Usability evaluation
Usability engineering for web-pages has grown out of the software development
discipline of Human Computer Interaction (HCI) and faces a number of web-related
problems. The diversity of user configurations may cause a web page to be displayed or
loaded completely differently for individual users. Target audiences are difficult to
define because of the global nature of the Internet. In addition, the rapidly changing nature of
the Internet leads to short development cycles, making it difficult to incorporate the findings
of usability studies.
This short list shows only some of the difficulties that any generalization of
web-usability evaluation methods will typically face. It is therefore not surprising that
none of the pertinent web standardization bodies has yet
published generic web-page usability standards.
What has been published, however, are a number of web-usability guidelines35,
checklists and criteria36. The criteria contained in these guidelines can be used when
conducting the most prominent web-page evaluation method, called “heuristic
evaluation”37, where “a small set of evaluators examines the interface and judges its
compliance with recognized usability principles”. If the resources for such test settings are not
available, it is still possible to use the aforementioned guideline listings as checklists and
either test conformance automatically with an experimental tool like WebSat38 or uzReview39,
or do the testing manually. The latter approach has been applied for this paper.
32 http://www.w3.org/WAI/, January 20th 2006.
33 Implementation of Section 508 is legally binding for U.S. federal agencies. More information can be found at http://www.section508.gov/, January 20th 2006.
34 Validator used for WAI/WCAG and Section 508: http://www.contentquality.com/, January 20th 2006.
35 Cf. Nielsen (1996) "Original Top Ten Mistakes in Web Design", Nielsen (1999b) "The Top Ten New Mistakes of Web Design", Nielsen (1999a) "'Top Ten Mistakes' Revisited Three Years Later", Nielsen (2002b) "Top Ten Web Design Mistakes of 2002", Nielsen (2002a) "Top Ten Guidelines for Homepage Usability", Nielsen (2003b) "Top Ten Web Design Mistakes of 2003", Nielsen (2003a) "The Ten Most Violated Homepage Design Guidelines", Nielsen (2004) "Top Ten Mistakes in Web Design", Nielsen (2005b) "Top Ten Web Design Mistakes of 2005", and Koyani, et al. (2003) "Research-Based Web Design & Usability Guidelines".
36 Cf. Hennig and Quirion (2004) "Best practices for web interfaces of searchable databases".
37 Nielsen (2005a) "Ten Usability Heuristics".
In this analysis, two different sets of web-page usability guidelines have been used to
evaluate web-usability of the three subject gateways. To honor the specific context of the
subject gateways, the first guideline set was taken from “Best practices for web interfaces of
searchable databases”40 published by Nicole Hennig and Christine Quirion of MIT’s Web
Advisory Group. Generic web-usability guidelines have been assembled for the second
guideline set by selecting “Web Design Mistakes” from Jakob Nielsen’s “Alert Boxes”41. The
full list of guidelines and detailed results can be found in the Appendix. A summary of results
for the two sets of guidelines is shown in the following table:
Subject Gateway                      Passed Tests, Guideline Set 1        Weighted Mistakes42, Guideline Set 2
                                     (Hennig and Quirion 2004, IX 1.1)    (Nielsen, IX 1.2)
Zeitgeschichte Informationssystem    18 of 31                             21 of 119
Virtual Library Zeitgeschichte       20 of 31                             22 of 119
Zeitgeschichte-Online                22 of 31                             30 of 119
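Because the two guideline sets use different scales, the scores in the table above are easier to compare as proportions. The following Python sketch, whose variable and function names are illustrative only, converts the figures from the table into a pass rate for guideline set 1 and into the share of the maximum weighted mistake score of 119 for guideline set 2. The resulting percentages are derived arithmetic, not additional measurements.

```python
# Scores copied from the table: (passed tests out of 31, weighted mistakes out of 119).
SCORES = {
    "ZIS": (18, 21),
    "VLZ": (20, 22),
    "ZOL": (22, 30),
}
TOTAL_TESTS = 31             # size of guideline set 1
MAX_WEIGHTED_MISTAKES = 119  # worst possible score for guideline set 2


def normalize(passed, mistakes):
    """Return (pass rate, share of the worst possible mistake score) as fractions."""
    return passed / TOTAL_TESTS, mistakes / MAX_WEIGHTED_MISTAKES


if __name__ == "__main__":
    for gateway, (passed, mistakes) in SCORES.items():
        pass_rate, mistake_share = normalize(passed, mistakes)
        print(f"{gateway}: {pass_rate:.1%} of tests passed, "
              f"{mistake_share:.1%} of the maximum weighted mistake score")
```

A higher pass rate is better for guideline set 1, whereas a lower share is better for guideline set 2, so the two columns must be read in opposite directions.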
These results merit further discussion. Because the web-usability guidelines are not
standardized and therefore have no reference character, it is more important to examine how
the three subject gateways fare relative to one another than to examine the total number of
passed checks or total mistakes in general. That said, it can be noted that the statistical spread
of the results of the three subject gateways is rather low: the three gateways have very
similar scores. Regarding the searchable databases usability guideline set, the
“Zeitgeschichte Online (ZOL)” project fares best. As regards the generic web-usability
guideline set, the “Zeitgeschichte Informationssystem (ZIS)” project’s web-page does.
In both cases, the other two projects follow closely.
Why does the ZIS web-page design score best with regard to the generic usability guidelines
even though it has not been updated for over two years? The answer lies in the
question itself: The usability of web-pages does not necessarily depend on the date of
establishment of a homepage. The ZIS web-pages have a very simple layout with few graphics
and even fewer non-standard web-page elements. This alone leaves less room for mistakes
than a more complex layout would. The ZIS project has a strong focus on its link
database and does not provide the variety of content that ZOL does, for example. That makes it
easier for ZIS to use a simple and focused layout for its web-pages. Consequently, the
challenge of web-usability grows with the variety and complexity of content.
38 Cf. NIST (2002) "WebSAT".
39 Cf. Edmonds, et al. (2003) "uzReview 0.7.1".
40 Hennig and Quirion "Best practices for web interfaces of searchable databases".
41 Cf. footnote 35.
42 The 36 mistakes have been weighted on a scale from 1 to 5 because a mistake like “Overly detailed ALT Text” has less impact than “No Contact Information or Other Company Info”. The worst possible score is therefore the sum of the weights of all mistakes, which is 119.
Overall, the three subject gateways score satisfactorily with regard to the searchable
databases usability guideline set, and they score well with regard to the generic web-usability
guideline set. The web-pages provide a usable platform for the two prototypes of web-users,
the “link-dominant” and the “search-dominant” ones43, and meet another important usability
criterion in that they clearly provide context for the different web-pages: Where am I? What
can I do here? Where can I go from here?44
43 The two terms have been coined by Nielsen (1997) "Search and You May Find". The same prototypes have been identified, for example, by Krug (2000) "Don't Make Me Think. A Common Sense Approach to Web Usability" 54 or Kyunghye (2002) "A Model-based Approach to Usability Evaluation for Digital Libraries" (“scanning” vs. “searching”). Also interesting in that context are the results of a usability study by Mitchell, et al. (1999) "Testing the Design of a Library Information Gateway", where the majority of tested users turned out to be “search-dominant” and almost too spoiled by the “Google-Comfort”: “The finding that came out most forcefully was that students want a white box into which they can type their search terms. If students have to go beyond two screens to find such a box, they become frustrated and impatient”.
44 Cf. Krug "Don't Make Me Think. A Common Sense Approach to Web Usability" 87: “Trunk Test” and Theng, et al. (2000) "Purpose and usability of digital libraries" 239: “Feeling Lost”.
IV Setting up a framework for specific analyses
The generic technical web-page evaluation methods discussed above can only provide rather
generic answers. More specific questions, e.g. in quantitative analysis, require more specifically
tailored software. In the course of the comparative analysis of the three aforementioned
subject gateways, a crawler program was developed for harvesting the content
of each subject gateway’s link database. The crawler has to use heuristics to map the crawled
data into a common database. There are two reasons for this: First, none of the subject
gateways offers a formalized public interface to its database, for example
a custom Web Service. Therefore, the link databases have to be harvested by
parsing their HTML output. Second, it is necessary to map the harvested link metadata to a
common scheme. None of the three subject gateways states that it uses a common metadata
scheme, thus a specific conceptual mapping to the Dublin Core-based aggregate database had
to be set up. Analyzing the data to be harvested showed that these metadata mappings could
not be static. In the case of “Zeitgeschichte-Online”, for example, the fields “Autor” (“author”),
“Herausgeber” (“editor”), and “Veröffentlicht durch” (“published by”) could not be mapped
1:1 to DC-Creator and/or DC-Publisher, the only two DC-fields available for matching in that
case. Depending on which of the three fields contained data, a different semantic
meaning had to be applied: If data was present in the “published by” field, it was used for the
Dublin Core “publisher” field, and Dublin Core “creator” was filled with the “author” field or,
if the “author” field was empty, with the “editor” field. If no data was present in the
“published by” field but data was present in both the “author” and “editor” fields, the content of
the “author” field was used for Dublin Core “creator” and “editor” for Dublin Core “publisher”.45 Using
standardized means for providing access to one’s metadata or archival information, e.g. by
implementing an interoperable OAI-PMH46 data provider interface, could avoid potential
errors due to such ambiguities.47
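The conditional mapping described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the study's own harvester is a Perl program and is not reproduced here); the function name `map_to_dublin_core` and the dictionary keys are hypothetical, and the final fallback branch covers a case the text does not describe.

```python
from typing import Dict, Optional


def _clean(value: Optional[str]) -> Optional[str]:
    """Return the stripped field value, or None if the field is empty."""
    value = (value or "").strip()
    return value or None


def map_to_dublin_core(author: Optional[str],
                       editor: Optional[str],
                       published_by: Optional[str]) -> Dict[str, Optional[str]]:
    """Map the three source fields to Dublin Core creator/publisher."""
    author, editor, published_by = _clean(author), _clean(editor), _clean(published_by)

    if published_by:
        # "published by" present: it becomes the publisher; the creator is
        # the author or, if the author field is empty, the editor.
        return {"creator": author or editor, "publisher": published_by}

    if author and editor:
        # No "published by", but both author and editor are present:
        # the author becomes the creator, the editor the publisher.
        return {"creator": author, "publisher": editor}

    # Assumption (not described in the text): with only one of author or
    # editor present, use it as creator and leave the publisher empty.
    return {"creator": author or editor, "publisher": None}


if __name__ == "__main__":
    print(map_to_dublin_core("A. Author", "E. Editor", ""))
```

The point of the sketch is only that the target field of “editor” depends on which other fields are filled, which is why a static 1:1 mapping was not possible.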
45 The corresponding application logic can be found at the end of the process_content() function of harvester_zol.pl, http://pepl.info/viewcvs/trunk/harvester-zol.pl?view=log, February 22nd, 2006.
46 OAI-PMH stands for the Open Archives Initiative Protocol for Metadata Harvesting. See Caplan (2004) "OAI-PMH" for information about the protocol and Kelly (2004) "Interoperable Digital Library Programmes? We Must Have QA!" for general considerations on Digital Library interoperability.
47 Enderle "Der Historiker, die Spreu und der Weizen, zur Qualität und Evaluierung geschichtswissenschaftlicher Internet-Ressourcen" 60 also considers an OAI interface to be one of the quality criteria for a web subject gateway.
The common database storing the harvester results was implemented using the
PostgreSQL48 RDBMS. The crawlers were implemented using Perl49 and Perl CPAN50
modules. From the link databases, they retrieved the metadata assignable to the
Dublin Core Metadata Element Set; for each database item, they also stored the HTTP status
code and an MD5 checksum of the item’s content. In addition, each link database item’s homepage
was crawled recursively to three levels of depth, to store the out-links to other link database items.
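The depth-limited collection of out-links can be sketched as a breadth-first traversal. The following Python fragment is an illustration under stated assumptions, not the study's Perl crawler: the link structure is given as an in-memory dictionary instead of being fetched over HTTP, “three levels” is read as “at most three clicks from the homepage”, and the name `outlinks_within_depth` is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set


def outlinks_within_depth(start: str,
                          links: Dict[str, Iterable[str]],
                          items: Set[str],
                          max_depth: int = 3) -> Set[str]:
    """Return the database items reachable from `start` within `max_depth` clicks.

    `links` maps a page URL to the URLs it links to; `items` is the set of
    URLs registered in the link databases.
    """
    found: Set[str] = set()
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # pages at the depth limit are not expanded further
        for target in links.get(page, ()):
            if target in items and target != start:
                found.add(target)  # an out-link to another database item
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return found


if __name__ == "__main__":
    toy_links = {"a": ["a1"], "a1": ["a2"], "a2": ["b"], "b": ["b1"], "b1": ["c"]}
    print(outlinks_within_depth("a", toy_links, {"a", "b", "c"}))
```

In the toy example, item `b` is three clicks away from `a` and is therefore recorded, whereas `c` lies beyond the depth limit.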
V Using the custom analysis framework
The 3,646 crawled, interlinked items, each holding a number of attributes, provide
ample room for analysis. In the following, a selection of options for analysis will be
discussed.
1 Resource identifier validation
HTTP status codes tell us about the availability of a resource. Status codes of 400 and
above denote an invalid resource, which could inter alia be the result of either “404 Not Found”
or “500 Internal Server Error”. The following table provides an overview of invalid items in the
aggregated link database, grouped by subject gateway, as of April 14th, 2005.
Subject Gateway                     Total Items   Invalid Items   Invalid Items %
Zeitgeschichte Informationssystem           822             178              21 %
Virtual Library Zeitgeschichte              693              66               9 %
Zeitgeschichte-Online                     2,131              82               3 %
The disproportionately high percentage of invalid items in the ZIS database could be explained by
the fact that the last update of that database was performed on August 13th, 2003.
Unfortunately, the last update timestamps of the other databases could not be ascertained from
their homepages.
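The percentages in the table can be recomputed from the item counts. The short Python sketch below does so; it assumes that every status code of 400 and above counts as invalid, and the function names are hypothetical. Recomputing suggests that the published whole-number percentages were truncated rather than rounded (178 of 822, for instance, is about 21.65 %).

```python
import math


def is_invalid(status_code: int) -> bool:
    """Assumption: client and server errors (4xx, 5xx) mark an invalid item."""
    return status_code >= 400


def invalid_share(total: int, invalid: int) -> float:
    """Percentage of invalid items among all items of a gateway."""
    return 100.0 * invalid / total


# Counts taken from the table above: (total items, invalid items).
REPORTED = {
    "Zeitgeschichte Informationssystem": (822, 178),
    "Virtual Library Zeitgeschichte": (693, 66),
    "Zeitgeschichte-Online": (2131, 82),
}

if __name__ == "__main__":
    for name, (total, invalid) in REPORTED.items():
        share = invalid_share(total, invalid)
        print(f"{name}: {share:.2f} % (truncated: {math.floor(share)} %)")
```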
48 http://www.postgresql.org/, January, 20th 2006.
49 http://www.perl.org/, January, 20th 2006.
50 http://cpan.perl.org/, January, 20th 2006.
2 Subject classification analysis
2.1 Current subject indexing standards
Subject indexing is one of the most challenging and time-consuming tasks of metadata
classification. It is also one of the tasks most resistant to automation, as a recent report
on Automated Metadata Classification51 has shown.
The creation of all three subject gateways at issue coincided with a period of change
for the respective bibliographic standards for subject indexing in the German language area.
Even the mere existence of the subject area of “Bibliothekswissenschaft” (Library Science)
itself has been questioned in recent years.52 Since the mid-1990s, a
transition from the conventional standards for both formal and subject indexing to new and
international, or internationally oriented, ones has been under way. The assets and drawbacks
of a transition from the German formal cataloging rules RFK (RAK)53 to their Anglo-American
counterpart, AACR2 (“Anglo-American Cataloguing Rules, Second Edition”)54, have been discussed intensively55. Regarding subject indexing, the
situation is similar: The main German language subject headings, SWD56, are difficult to map
to their counterparts in other languages because of their inherent rules57, and the recently
translated Dewey Decimal Classification System (DDC)58 lacks the required granularity for
the purposes of a web subject gateway. What classification system should a German language
web subject gateway use for subject indexing then? The de-facto standard for subject
51 Cf. Greenberg, et al. (2005) "Final Report for the AMeGA (Automatic Metadata Generation Applications) Project".
52 Cf. Hauke, et al. (2005) "Library Science - quo vadis? (Re)Discovering „Bibliothekswissenschaft“" and Gradmann (2005) "Hat Bibliothekswissenschaft eine Zukunft? Abweichlerische Gedanken zur Zukunft einer Disziplin mit erodierendem Gegenstand".
53 Eversberg (2005) "Wie katalogisiert man ein Buch? Ein Leitfaden nicht nur für Einsteiger" provides an introduction to cataloging according to the „Regeln für die Formalkatalogisierung (RFK)“, formerly „Regeln für die alphabetische Katalogisierung (RAK)“.
54 AACR2, the “Anglo-American Cataloguing Rules, Second Edition”, http://www.aacr2.org/, January, 20th 2006.
55 An overview of the discussions is provided by Arbeitsgemeinschaft der Parlamentsbibliotheken und Behördenbibliotheken (2003) "Stellungnahmen, Materialien und Informationen zu dem Beschluss des Standardisierungsausschusses bei der Deutschen Bibliothek, einen Umstieg von den deutschen auf internationale Regelwerke und Formate (AACR und MARC) anzustreben".
56 SWD stands for “Schlagwortnormdatei”; cf. Deutsche Bibliothek (2005) "Schlagwortnormdatei (SWD)".
57 Cf. Eversberg (2004) "Eine seltene Sache. Erwartung und Ernüchterung bei der thematischen Katalogsuche". An introduction to the subject indexing rules RSWK is provided by Umlauf (2005) "Einführung in die Regeln für den Schlagwortkatalog RSWK".
58 Heiner-Freiling and Svensson (2005) "Dewey-Dezimalklassifikation".
indexing in the Anglo-American language area, the Library of Congress Authorities59, would
very likely provide the required granularity. For use on the web, however, it has been
considered too complicated, and the attempts to simplify the Library of Congress Authorities
for web resources using a concept of “Faceted Application of Subject Terminology”60 have
only just started. Even if the results of those projects were mature, the problem of how to
ensure interoperability of the different language- and domain-specific subject headings would
remain unresolved. In the near future there will be no shared international authority file and
no standard for automatically mapping between different subject headings.61
2.2 Semantic Web to the rescue?
As we have seen, traditional Library Science cannot offer an out-of-the-box solution
for subject classification for web subject gateways. If a new subject gateway were to be set up
today, a custom subject authority file would still have to be compiled for subject indexing. To
ensure interoperability and long-term compatibility, such an authority file could be created using
Semantic Web technologies62. For subject indexing, that would require the creation and use of
a domain-specific Ontology63; roughly: the equivalent of a thesaurus.
Unfortunately, building such Ontologies is rather costly, difficult, and technically still
experimental. To the author’s knowledge, only one History-related Ontology64 was
publicly available at the end of 2005. To take stock of the research interests and current status of
Ontologies in the Humanities, a workshop will take place in April 2006 in Hamburg65. That
fact alone shows that the whole topic of Ontologies in the Humanities, let alone History, is in
a very early development phase and not ready to be widely used.
59 Library of Congress Authorities (LCA): http://authorities.loc.gov/, January, 20th 2006. Formerly the LCA were called Library of Congress Subject Headings (LCSH).
60 Cf. Dean (2003) "FAST: Development of Simplified Headings for Metadata". This project also investigates how to formalize the subject headings using the Dublin Core standard.
61 Cf. Tillet (2003) "Authority Control: State of the Art and New Perspectives".
62 Cf. Enderle "Geschichtswissenschaft, Fachinformation und das Internet" 7: “Es sei daher von der These ausgegangen, daß künftige fachwissenschaftliche Erschließungsformen sich die Idee des Semantic Web zu eigen machen sollten.”
63 Boonstra, et al. "Past, present and future of historical information science" 102.
64 This Ontology has been developed by the VICODI Project (http://www.vicodi.org/, January, 20th 2006): “The objective of VICODI is to enhance human comprehension of the digital content on the Internet. This is reached by introducing novel visualisation and contextualisation environment for digital content.”
65 “Ontology Based Modelling in the Humanities”, 7-9 April 2006, University of Hamburg (http://www.c-phil.uni-hamburg.de/view/Main/OntologyWorkshop, January, 20th 2006).
2.3 Subject classification of the three subject gateways
From the last two chapters we know that during the creation of the three web subject
gateways at issue, no evident choice of specific subject indexing standards was available.
Even today it would be necessary to create a specific subject authority file, as none exists for
the field of Contemporary History yet. In addition, the formal representation of the authority
file, i.e. the way in which the authority terms and the file itself would be technically expressed,
would not be obvious, as standards are still at a development stage. As a consequence,
building an application on one particular formal representation, e.g. an XML file following a
specific schema, offers no guarantee that the standard eventually adopted will not require
something completely different, which even in a best-case scenario would mean additional
conversion effort.
Regardless of those difficulties, applications should at least be designed to
avoid free-form data entry, which seems to be in use at “Zeitgeschichte-Online”, where the
concept of “Arbeiterbewegung” (labor movement) can be found under four different
keywords: “Arbeiterbewegeung”, “Arbeiterbewegung”, “Arbeiterbewegungen”, and
“Arbeiterbewgung”. Similar spelling errors or redundant classifications can also be found in
the other two databases using duplicate-detection algorithms66.
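How an edit-distance measure groups such spelling variants can be sketched as follows. The Python fragment covers only the Levenshtein component (the study also used word stems and Soundex, which are not shown), and the threshold of two edits as well as the function names are assumptions chosen for illustration.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def cluster_keywords(keywords: Iterable[str], max_distance: int = 2) -> List[List[str]]:
    """Single-link clustering: a keyword joins every cluster that contains
    at least one member within `max_distance` edits (assumed threshold)."""
    clusters: List[List[str]] = []
    for keyword in sorted(set(keywords)):
        matching = [c for c in clusters
                    if any(levenshtein(keyword.lower(), m.lower()) <= max_distance
                           for m in c)]
        merged = [keyword]
        for cluster in matching:
            merged.extend(cluster)
            clusters.remove(cluster)
        clusters.append(sorted(merged))
    return clusters


if __name__ == "__main__":
    sample = ["Arbeiterbewegeung", "Arbeiterbewegung", "Arbeiterbewegungen",
              "Arbeiterbewgung", "Holocaust"]
    for cluster in cluster_keywords(sample):
        print(cluster)
```

With this threshold, the four variants of “Arbeiterbewegung” end up in one cluster, while “Holocaust” stays separate. A threshold this permissive would also merge short, genuinely distinct keywords, so candidate clusters would still need manual review.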
The distribution of keyword usage in the aggregated database can be interpreted as an
indication of the most popular research topics in web-present Contemporary History in the
German language area:
Top 4 Keywords (Number of Occurrences)67
425   Nationalsozialismus
288   Holocaust
243   Sozialgeschichte
203   Widerstand
Looking at the keywords’ distribution broken down by subject gateway, it can be
noticed that the most used keywords of all three gateways are related to the same subject area.
In addition to this well-defined subject area focus, to be discussed in the following chapter, an
outlier in keyword distribution can be noticed:
66 For this study, the word stem, Soundex, and Levenshtein Edit Distance of keywords have been used to identify duplicates. More information on that topic, addressing the Merge/Purge problem, is provided by Hernández and Stolfo (1998) "Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem".
67 A list of the 50 most used keywords is available in the corresponding table in chapter IX 2.
Subject Gateway                     Total items   Distinct keywords   KWs with only one occurrence   % of distinct keywords   Most used keyword
Zeitgeschichte Informationssystem           822                 166                             26                  15.66%   Holocaust
Virtual Library Zeitgeschichte              693                 126                             19                  15.08%   „Drittes Reich“
Zeitgeschichte-Online                     2,131               1,146                            502                  43.80%   Nationalsozialismus
Approximately 44% of the keywords used by “Zeitgeschichte-Online” are used only
once. This can be interpreted either as an indication of thematically wide-ranging content in
the link database or, worse, of a lack of stringent rules for subject classification. A
third possible interpretation of that outlier is a rather practical one: “Zeitgeschichte-Online”
aggregates link database entries from partner institutions, a factor that
potentially increases the diversity of content and classification. Whatever the
reasons why almost half of the keywords used by “Zeitgeschichte-Online”
are used only once, the question remains what purpose keywords that are
used only once fulfill.
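The figures in the table above (distinct keywords, keywords with a single occurrence, and their share) can be derived from per-item keyword lists as in the following Python sketch. The data structure, the function name `keyword_stats`, and the toy data are assumptions made for illustration; only the count 502 of 1,146 is taken from the table.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Union

Stats = Dict[str, Union[int, float, Optional[str]]]


def keyword_stats(items: Iterable[List[str]]) -> Stats:
    """Summarize keyword usage for one gateway.

    `items` holds one keyword list per link database item.
    """
    counts = Counter(keyword for keywords in items for keyword in keywords)
    distinct = len(counts)
    singletons = sum(1 for n in counts.values() if n == 1)
    return {
        "distinct": distinct,
        "singletons": singletons,
        "singleton_share": 100.0 * singletons / distinct if distinct else 0.0,
        "most_used": counts.most_common(1)[0][0] if counts else None,
    }


if __name__ == "__main__":
    toy_items = [["Holocaust", "DDR"], ["Holocaust"],
                 ["Holocaust", "Mauerbau"], ["DDR", "Berlin"]]
    print(keyword_stats(toy_items))
    # Share reported for "Zeitgeschichte-Online": 502 of 1,146 distinct keywords.
    print(f"{100.0 * 502 / 1146:.2f} %")
```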
2.4 Subject focuses as compared to the “offline world” of Contemporary History research and teaching.
The last chapter showed that “National Socialism” and “Holocaust” are the top
subjects in the aggregated database and with that, also in the web-present Contemporary
History in the German language area. To what extent can those subject focuses be compared
to the “offline world” of Contemporary History? To answer that question it will be necessary
to analyze the subject focuses of printed publications in the field of Contemporary History.
The most comprehensive overview of print publications on History in the German
language area is provided by the annually published Historische Bibliographie68.
Unfortunately, using the query interface of the online version, it is not possible to get an
overview of the most common subjects used for classification. E-Mail correspondence with
68 The “Historische Bibliographie” is edited by the Arbeitsgemeinschaft außeruniversitärer
historischer Forschungseinrichtungen in der Bundesrepublik Deutschland. Its homepage including a test-access is available at: http://www.ahf-muenchen.de/HistBib/, January, 20th 2006.
the publisher of the Historische Bibliographie69 did not yield the desired keyword
statistics either.
Alternative providers of bibliographic information, with an international focus and
only partially covering the publications on History in the German language area, are the Arts
& Humanities Citation Index70 and the Social Sciences Citation Index71. As with the
Historische Bibliographie, however, the query interfaces of both bibliometric tools could not
provide the answers sought72, and e-mail correspondence with the publisher73 remained
unanswered.
Since the trans-regional and trans-national sources of bibliographic information on
publications in the field of Contemporary History could not be used to answer the question
about subject focuses of printed publications, we have to resort to a bibliographic source with
a potentially regional as well as a specific language focus: the OPAC database of a
University Library74. In our case, it was possible to retrieve an excerpt of the history-related
subject classifications of publications indexed in the OPAC database of the University Library
of the Leopold-Franzens-University Innsbruck75. This University Library is not known to
have any specific History-related subject focus, so it can be assumed that the composition of
indexed publications is sufficiently representative of the German language area.
After importing the roughly 150,000 records into the aggregate database, it was
possible to sort the roughly 62,000 distinct keywords by occurrence. In a final step,
69 E-Mail to [email protected], June, 26th 2005 with subject “Anfrage Themenstatistik”, answer from Helmut Zedelmaier <[email protected]>, June, 28th 2005.
70 http://scientific.thomson.com/products/ahci/, December, 22nd 2005.
71 http://scientific.thomson.com/products/ssci/, December, 22nd 2005.
72 Thanks to Eveline Pipp and Heinz Hauffe of the University Library of the Leopold-Franzens-University Innsbruck for their help searching the A&HCI and SSCI using the DIALOG interface.
73 E-Mail to Philip Heller <[email protected]> and George Herzhoff <[email protected]>, June 23rd 2005 with subject “Request for Keyword Statistics”.
74 Potential alternatives to analyzing the contents of an OPAC database would have been the analysis of subject focuses of thesis papers or journal articles, both of which would have entailed a separate study of their own. Mattl (1983) "Bestandsaufnahme zeitgeschichtlicher Forschung in Österreich" 27-53, provides a brief analysis of subject focuses of roughly 280 thesis papers on Contemporary History published in Austria between 1975 and 1981. Because of its age and very strong focus on Austrian History it has not been used here. Mattl (2003) "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer 'dritten Generation'" 365, footnote 28, provides a very brief “provisional” analysis of the roughly 260 articles published in the journal zeitgeschichte until November 2003. Because of its tentative character and focus on a single journal it has only been put to “footnote use” for the comparison of “offline”- and “online-world” here at issue.
75 Thanks to Georg Stern-Erlebach of the University Library of the Leopold-Franzens-University Innsbruck for providing this list.
manually filtering the 300 most used keywords76 by their relation to Contemporary History
yielded the desired list of top keywords, comparable to the list of most used keywords in the
link databases of the three subject gateways:
Top 4 Keywords related to Contemporary History,
Innsbruck University Library OPAC Database (Number of Occurrences)77
1,335   Juden
1,157   Geschichte 1933-1945
1,075   Drittes Reich
  877   Nationalsozialismus
The most used keywords in the aggregated database, as presented in chapter 2.3, were
“Nationalsozialismus”, “Holocaust”, “Sozialgeschichte”, and “Widerstand”. The keyword
“Holocaust” does not exist in the OPAC database; possible equivalent mappings
are “Juden” (1,335), “Judenverfolgung” (293), and “Judenvernichtung” (271). All
occurrences of “Sozialgeschichte” with a chronological constraint to the 20th century can be
seen as equivalents of the keyword “Sozialgeschichte” in the aggregated database. Those
entries, e.g. “Sozialgeschichte 1945-1950”, have been used 97 times in the OPAC database,
putting them well behind the top 50 most used Contemporary History related keywords. The
4th most used keyword in the aggregated database, “Widerstand”, takes 11th place in the
OPAC top-keywords list, being used 367 times.
Although the keywords in the two databases are not literally identical, the focus on the
events in Europe between 1933 and 1945 remains the same. Thus, the most treated
research topics of Contemporary History in the German language area do not differ
substantially between the “offline” and the “online world”.78
3 Common resource identifiers analysis
When comparing three link databases with a common content focus, the question
of commonly shared URLs seems rather obvious. In the case of the present study, the
answer to that question turned out to be a surprise: Of the total of 3,370 distinct
76 From the 62,000 distinct keywords related to History, the 300 most used keywords have been selected, assuming that the most used keywords related to Contemporary History would be included in that list of the top 300.
77 A list of the 50 most used keywords related to Contemporary History is available in the corresponding table in chapter IX 2.
78 A similar, although “provisional”, finding is provided by Mattl "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer 'dritten Generation'" 365, footnote 28, where he shows that “Nationalsozialismus” and “History of Judaism” have been the most prominent subject focuses in the journal zeitgeschichte up to November 2003. See also footnote 74 in this paper.
normalized URLs, only 195 (5.79 %) are common to at least two subject gateways and only
24 (sic!)79 (0.71 %) are common to all three. The last number in particular makes a statistician
doubt his methods and findings. However, the result stayed the same after double-checking.
There are some possible interpretations of that very low number of shared URLs.
However, each interpretation only provides a partial explanation. The fact that no entry of the
ZIS database is newer than August 2003 could be one factor; another is that the “Virtual Library
Zeitgeschichte” has a relatively specific subject focus. Irrespective of the reasons for it and of
the gateways’ (at least original) claim to authority, the disproportionately small number
of shared URLs calls into question the comparability and the authority of the three subject
gateways.
Shared URLs by subject gateway      Virtual Library Zeitgeschichte   Zeitgeschichte-Online
Zeitgeschichte Informationssystem                               50                      77
Virtual Library Zeitgeschichte                                 n/a                     116
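The overlap figures can be cross-checked with elementary set arithmetic. Each pairwise count in the table includes the 24 URLs shared by all three gateways, so the number of URLs shared by at least two gateways is 50 + 77 + 116 − 2 × 24 = 195, which agrees with the figure stated in the text. The Python sketch below shows both the set operations and this check; the function names and toy URL sets are hypothetical, and URL normalization is assumed to have happened beforehand.

```python
from typing import Dict, Iterable, Set


def overlap_counts(zis: Set[str], vlz: Set[str], zol: Set[str]) -> Dict[str, int]:
    """Count normalized URLs shared between the three link databases."""
    pairwise_union = (zis & vlz) | (zis & zol) | (vlz & zol)
    return {
        "zis_vlz": len(zis & vlz),
        "zis_zol": len(zis & zol),
        "vlz_zol": len(vlz & zol),
        "all_three": len(zis & vlz & zol),
        "at_least_two": len(pairwise_union),
    }


def at_least_two_from_pairs(pairwise: Iterable[int], all_three: int) -> int:
    """Inclusion-exclusion: a URL shared by all three gateways is counted in
    each of the three pairwise intersections, so two copies are subtracted."""
    return sum(pairwise) - 2 * all_three


if __name__ == "__main__":
    # Reported pairwise counts and the 24 URLs common to all three gateways.
    print(at_least_two_from_pairs([50, 77, 116], 24))
```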
Having only 24 shared link database entries facilitates a classification analysis.
Following the general classification analysis undertaken above, the results are not
unexpected: Not a single keyword is used by all three subject gateways for any of the 24
items. For 16 of the 24 items, at least one identical keyword has been used by two of the three
subject gateways. In the remaining 8 cases, all the keywords used per database item were
different. As an example, the classification of the resource http://chronik-der-mauer.de/ is
shown in the following table:
Subject Gateway                     Keywords
Zeitgeschichte Informationssystem   Berliner Mauer, Bibliographie, DDR, Kalter Krieg
Virtual Library Zeitgeschichte      Mauerbau und Grenzbefestigung, Nachkriegszeit
Zeitgeschichte-Online               Berlin, Berlinpolitik, Grenzen, Mauerbau, Zeitzeuge
It can be noticed that some of the keywords used, e.g. “Berlin” and “Grenzen”
(borders), are formulated very broadly and are therefore very unspecific.
4 Duplicate pointers analysis
The web offers several possibilities to address the same resources under different
URLs. For example, host aliases or directory index files allow http://www.h-net.org/~german/
and http://www2.h-net.msu.edu/~german/ or http://www.icbh.ac.uk/icbh/ and
http://www.ihrinfo.ac.uk/icbh/welcome.html to point to the same documents. Identical
content can be identified by using message digest algorithms. After harvesting the link
79 This list of URLs can be found in the table in chapter IX 3.
databases of the three subject gateways at issue, the crawler stored MD5 checksums for each
document. By that means, it was possible to identify several documents stored under
different URLs. The ZIS database stores 9 distinct documents under 19 different URLs, the
ZOL database 11 under 26, and the VLZ database 4 distinct documents under 8 URLs. In
other words, in four cases the same document had been entered into the VLZ database under
two different URLs.
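Grouping URLs by the checksum of their content can be sketched in a few lines of Python. This is an illustration only: the document bodies are passed in as bytes instead of being fetched, the function names are hypothetical, and the page contents in the example are invented placeholders.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def md5_checksum(content: bytes) -> str:
    """Hex-encoded MD5 message digest of a document body."""
    return hashlib.md5(content).hexdigest()


def duplicate_pointers(documents: Dict[str, bytes]) -> List[List[str]]:
    """Return groups of URLs whose content has an identical MD5 checksum."""
    urls_by_checksum: Dict[str, List[str]] = defaultdict(list)
    for url, content in documents.items():
        urls_by_checksum[md5_checksum(content)].append(url)
    return [sorted(urls) for urls in urls_by_checksum.values() if len(urls) > 1]


if __name__ == "__main__":
    # Invented placeholder contents; only the URLs come from the text above.
    pages = {
        "http://www.h-net.org/~german/": b"<html>H-German</html>",
        "http://www2.h-net.msu.edu/~german/": b"<html>H-German</html>",
        "http://www.dhm.de/": b"<html>DHM</html>",
    }
    print(duplicate_pointers(pages))
```

A byte-level checksum only detects exact copies; pages that differ in a timestamp or a session identifier would not be recognized as duplicates by this approach.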
5 Information network analysis
As mentioned earlier, the crawler programs stored the outgoing hyperlinks of the
database items’ web-pages to the other items’ web-pages over a depth of three levels (clicks).
The resulting graph can be analyzed by using a variety of methods related to the field of
network analysis80.
The PageRank algorithm81, popularized since its use by the Google82 search engine,
can be used to help determine a page’s relevance in relation to other pages of a network.
Assuming that a page is casting a vote on another page by linking to it, the importance of a
page is determined by the number of votes cast for it. Also, the importance of the page that is
casting the vote determines how important the vote itself is. For our ZIS/ZOL/VLZ network,
the top-five ranked URLs are shown in the following table:
URL                                         In-Degree83   PageRank (ZIS/ZOL/VLZ network, n=2278, scale: 1-10)84
http://www.dhm.de/                                  123   8.87
http://www.wiesenthal.com/                          109   7.94
http://www.ubka.uni-karlsruhe.de/kvk.html           119   7.76
http://www.iwm.org.uk/                               24   7.60
http://www.iwmcollections.org.uk/                    19   7.41
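The voting idea behind PageRank can be made concrete with a small power-iteration sketch in Python. It is not the calculation used for the table above: the damping factor of 0.85, the uniform treatment of pages without out-links, and the toy graph are assumptions, and the rescaling to a logarithmic 1-10 scale is not reproduced.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 100) -> Dict[str, float]:
    """Power iteration over a directed link graph.

    `links` maps each page to the pages it links to. Pages without
    out-links spread their rank evenly over all pages (an assumption).
    """
    nodes = set(links) | {target for targets in links.values() for target in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for source, targets in links.items():
            if targets:
                # Each vote is weighted by the voter's own rank and split
                # evenly among the pages it links to.
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, value in sorted(pagerank(toy_graph).items(), key=lambda kv: -kv[1]):
        print(page, round(value, 4))
```

In the toy graph, page `a` has an in-degree of only one but still ranks far above `b` and `d`, because its single vote comes from the highly ranked page `c`. The same effect may explain why, in the table above, pages with a low in-degree appear next to pages with a much higher one.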
After converting the directed link graph to a binary asymmetric adjacency matrix, the
wide-ranging capabilities of network analysis software tools like UCINET85 or Pajek86 can be
put to use.
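The conversion to a binary asymmetric adjacency matrix, and the density measure reported for the network in this chapter, can be sketched as follows. The Python fragment assumes the link graph is available as a list of (source, target) pairs, ignores self-links, and computes density as the number of ties divided by the n(n−1) possible directed ties; the function names are hypothetical, and the import formats of UCINET or Pajek are not covered.

```python
from typing import List, Sequence, Tuple


def adjacency_matrix(edges: Sequence[Tuple[str, str]]) -> Tuple[List[str], List[List[int]]]:
    """Build a binary, directed (asymmetric) adjacency matrix.

    Repeated links collapse to a single 1; self-links are ignored.
    """
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        if source != target:
            matrix[index[source]][index[target]] = 1
    return nodes, matrix


def density(matrix: List[List[int]]) -> float:
    """Share of realized ties among all n * (n - 1) possible directed ties."""
    n = len(matrix)
    if n < 2:
        return 0.0
    ties = sum(sum(row) for row in matrix)
    return ties / (n * (n - 1))


if __name__ == "__main__":
    toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "b")]
    nodes, matrix = adjacency_matrix(toy_edges)
    print(nodes, matrix, density(matrix))
```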
80 An introduction to Social Network Analysis methods is provided by Hanneman (2001) "Introduction to Social Network Methods".
81 Cf. Page, et al. (1998) "The PageRank Citation Ranking: Bringing Order to the Web".
82 http://www.google.com/, January, 20th 2006.
83 The In-Degree indicates how often other pages of the network link to a page.
84 The logarithmic PageRank scale has been simplified in analogy to the Google toolbar here. The higher the value, the more important a page is considered to be. Details of the PageRank calculation can be found in the implementation of cagipch_pagerank.pl, http://pepl.info/viewcvs/trunk/cagipch_pagerank.pl?view=log, February, 22nd 2006.
Because the density of ties in the whole ZIS/ZOL/VLZ network is very low (0.3%),
we will partition the network matrix along keyword-based parameters in order to examine
degree, betweenness, and closeness centrality, as well as other network analysis concepts
like Bonacich Power Indices or cliques, for the subject-related sub-networks.
Network visualization software like NetDraw87 allows for a quick and
comprehensive overview of such sub-networks, as shown in the following exemplary diagram:
Diagram 1: Network based on keywords 'DDR' and 'Deutsche Demokratische Republik 1949-1990'. Nodes with a degree of one have been removed for better visibility.
At 4.73%, the density of this “GDR-Network” is much higher than that of the overall
ZIS/ZOL/VLZ network. www.bstu.de, www.17juni53.de, and www.zzf-pdm.de can be
identified as the three central web-pages at a glance from the diagram. The “GDR-Network”
only has 31 nodes, whereas the total number of distinct items sharing one of the
two keywords “DDR” or “Deutsche Demokratische Republik 1949-1990” in the aggregate
database is 59. That means that almost half of those items are not cross-linked by the others.
Because of this and the overall very low density of the network, the case for web subject
gateways filling those missing links can easily be made, assuming that it is not Google
that will take over this function in the pre-Semantic Web88 era.
85 Borgatti, et al. (1999) "UCINET 6.0 Version 1.00".
86 Batagelj (2005) "Pajek 1.04".
87 Borgatti (2002) "NetDraw: Graph Visualization Software".
88 In the context of this study, an overview of the Semantic Web from a Web-Mining perspective is given by Berendt, et al. (2004) "A Roadmap for Web-Mining: From Web to Semantic Web".
VI Interpreting the results of the analysis: Two theses on Contemporary History in the German language area
The results of the analysis obtained by using the custom framework have laid a
foundation for further discussion and interpretation. In the following, two theses based on the
results will be discussed.
1 Contemporary History is still the history of National Socialism –
Discussing the meaning and purpose of “Zeitgeschichte”
From the subject focus analysis in chapters V 2.3 and V 2.4, it has been concluded that
the events in Europe between 1933 and 1945 are the most treated research topics of
Contemporary History in the German language area. Gaining that insight in 2005,
one may ask how contemporary that version of Contemporary History can call itself, and
how inherent this subject focus may be in the discipline itself. In reflecting on the status of
“Zeitgeschichte” as an academic discipline in the German language area, it will be helpful to
discuss its establishment on the one hand and the definitions of “Zeitgeschichte” found in the
historiography on the other.
Two milestones in establishing Contemporary History as an academic discipline in
Germany were (i) the foundation in 1949 of the “Institut für Zeitgeschichte” (IfZ) in Munich89
and (ii) the publication of the journal Vierteljahreshefte für Zeitgeschichte by that institute
starting in 1953. In the first article of its first issue, the then director of the IfZ and
co-publisher of the journal, Hans Rothfels, introduced “Zeitgeschichte als Aufgabe”.
Subsequently, he narrowed “Zeitgeschichte” down to “Epoche der Mitlebenden”90, then
starting “etwa mit den Jahren 1917/18”91. Rothfels’ article has had great impact and is still
referenced today by most articles when it comes to defining the status and purpose of
“Zeitgeschichte” in the German language area. The current successor of Rothfels in his role as
head of the IfZ and, consequently, head of one of the largest92 academic institutions dealing
with “Zeitgeschichte” in the German language area, Horst Möller, defined “Zeitgeschichte” as
89 Cf. http://www.ifz-muenchen.de/das_ifz/geschichte.html, January, 23rd 2006.
90 Rothfels (1953) "Zeitgeschichte als Aufgabe" 2; “Epoch of the co-living”.
91 Ibid. 6 (emphasis added); “about the years 1917/18”.
92 Based on the list of staff members available on the homepage, the IfZ is currently the largest with 52 members (http://www.ifz-muenchen.de/mitarbeiter/index.html, February 5th, 2006). The “Zentrum für Zeithistorische Forschung”, established 1996 in Potsdam, follows closely with 51 members (http://www.zzf-pdm.de/mitarb/mtarbfr.html, February 5th, 2006).
the history of the 20th century from 1917 until 1989/91 in his latest publication,
“Einführung in die Zeitgeschichte”93. Has the start of the "Epoch of the Co-Living" not changed
in fifty years? Most other current German-speaking historians have taken a different view:
“Zeitgeschichte” can not94 and should not95 be exactly defined. Because “Zeitgeschichte” as
a term has been assigned to the long period of time starting from 1917, it has also been
proposed to introduce “neueste Zeitgeschichte”96 as a new term for the era of the most recent years.
One way or another: No single concept of “Zeitgeschichte” exists in the German language area,
but rather several “Zeitgeschichten”97.
Besides the meaning of the term “Zeitgeschichte” itself, the strong national focus of
historiography has also been a prevailing point at issue. Although Hans Rothfels already saw
in 1953 the need for dealing with “Zeitgeschichte” in an international context98,
Contemporary History in the German language area still means Contemporary History of the
German language area in 2005. Differences or comparisons on an international scale still do
not have a strong focus in mainstream historiography99. Also, initiatives against that
national focus100 have not had any decisive impact so far101.
Not only the temporal “Epochengrenzen” and the national focus but also the thematic focus
of the historiography on Contemporary History in the German language area have been and
still are subject to debate. In hindsight, it looks as if the postulate by Hans Rothfels from
93 Möller and Wengst (2003) "Einführung in die Zeitgeschichte" 11: “Als Zeitgeschichte gilt heute in Deutschland die Geschichte des 20. Jahrhunderts von 1917 bis 1989/91”. Möller relativizes this rather absolute definition later on page 25: Zeitgeschichte is said to be “ebenso fließend wie ihr Gegenstandsbereich”.
94 Cf. Gehler (2001) "Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und Globalisierung" 12: “Dilemma der Zeitgeschichtsschreibung, daß sie sich einer genauen Zuordnung entzieht, was in der Natur der Sache liegt”.
95 Cf. Hockerts (2001) "Zeitgeschichte in Deutschland. Begriff, Methoden, Themenfelder" 19: “Zeitgeschichte” should itself “niemals zwischen die Grenzpfähle exklusiver Definitionen sperren lassen”.
96 Cf. Schwarz (2003) "Die neueste Zeitgeschichte".
97 Cf. Mattl "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer 'dritten Generation'" for the specifics of the Austrian historiography on “Zeitgeschichte”.
98 Rothfels 7: “[…] daß Zeitgeschichte als Aufgabe im Prinzip einer Behandlung im internationalen Rahmen bedarf”.
99 Cf. Möller and Wengst "Einführung in die Zeitgeschichte" 11: In his introduction on “Zeitgeschichte” Möller writes: “wobei international Entwicklungen soweit möglich und erforderlich in die Betrachtung einbezogen werden” (emphasis added). Critical about this: Angerer (2004) "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung in die Zeitgeschichte' sein kann" 264.
100 For example: Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung" and Kleßmann (2002) "Zeitgeschichte als wissenschaftliche Aufklärung".
101 Cf. Jarausch (2004) "Zeitgeschichte zwischen Nation und Europa. Eine transnationale Herausforderung", 4.
27
over fifty years ago, namely to deal with the time of National Socialism as an obligation102,
has been dutifully fulfilled. In particular, the historiography on “Zeitgeschichte” of the last
fifty years has had a determining focal point in dealing with the time of National Socialism103.
Although that rather exclusive focus, mainly in combination with a focus on political
history104, has been strongly criticized105 and has also been seen as already being deteriorating
in 1991106, the results of the analysis as presented in chapters V 2.3 and V 2.4 of this study
showed that it is still very present.
The list of alternative or additional topics for Contemporary History in the German
language area is virtually endless. “Globalization”107 and the “History of European
Integration”108 are just two recently discussed potential topics of “Zeitgeschichte”.
Especially in Germany and Austria, with their specific historical backgrounds, it will
not be easy to have Contemporary History shift its topical focus to the period after 1945
more or less in passing, as the Anglo-American Journal of Contemporary History109 currently
does. The latter’s publisher announced the new chronological focus in an editorial. One can
only imagine the discussions and protest that such a modus operandi would provoke for a German
or Austrian journal on Contemporary History.
102 Cf. Rothfels 8: It is an “unabweisbare Verpflichtung gerade der deutschen Wissenschaft, die nationalsozialistische Phase mit aller Energie anzugehen”.
103 Möller and Wengst "Einführung in die Zeitgeschichte" 25: “entscheidende Prägung der deutschen Zeitgeschichtsforschung durch die NS-Thematik”. Also see: Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung" 33: “beispiellos untersuchtes ‘Drittes Reich’”.
104 Cf. Angerer "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung in die Zeitgeschichte' sein kann" 265f.
105 For example: Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung" 33: “Eine ausschließlich oder überwiegend mit dem Nationalsozialismus befasste Zeitgeschichte bleibt nicht nur rückwärtsgewandt, sondern auch rückständig”. Similarly, in a review of the Virtual Library Zeitgeschichte: Böhler and Gehler "Wendungen nach innen? Selektive Blicke auf die Zeitgeschichte": “Ein zeithistoriografischer Mauerfall im Sinne einer gegenwartsorientierten Zeitgeschichtsbetrachtung hat hier ebenso wenig stattgefunden, wie eine Zeitgeschichte im Stile eines ‘dynamischen Mehrebenensystems’ erkennbar wird. […] Fixiert auf ‘Drittes Reich’ und Holocaust bewegt sich nahezu alles um die Fluchtpunkte 1933 und 1945 und ihre ‘Rezeption’”.
106 Cf. Botz (1991) "Zeitgeschichte in einer politisierten Geschichtskultur. Historiographie zum 20. Jahrhundert in Österreich" 328: “das nur allmähliche Abklingen der ‘politischen’, ‘klassischen’ Zeitgeschichte und die langsame Heraufkunft der Gegenwartsgeschichte mit ihren epochenübergreifenden, dennoch spezifisch gegenwartsbezogenen humanwissenschaftlichen Momenten […] erscheint als wahrscheinlich”.
107 Gehler "Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und Globalisierung" 192-197. In Möller and Wengst "Einführung in die Zeitgeschichte", Globalization is only indirectly referred to as “Interdependenz der modernen Welt” on half of page 51. See Thomas Angerer’s critical comments on that in Angerer "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung in die Zeitgeschichte' sein kann" 264.
108 Cf. Jarausch, Gehler "Zeitgeschichte zwischen Europäisierung und Globalisierung", and Hockerts "Zeitgeschichte in Deutschland. Begriff, Methoden, Themenfelder" 19.
109 Cf. Marwick (1997) "A New Look, A New Departure: A Personal Comment on Our Changed Appearance", Journal of Contemporary History 32, 6 cited in Angerer (1997) "'Gegenwartsgeschichte'? Für eine Zeitgeschichte ohne Ausflüchte" 50. Also see ibid. 53: “Demnach wird sich das JCH künftig auf die Zeit nach 1945 konzentrieren”.
The seemingly unchangeable central subject focus in the German language area may
be further reinforced by the structure of academic organization itself, where departments
and chairs for “Zeitgeschichte” and “Neuere Geschichte” co-exist with blurry and partially
overlapping boundaries that are easily prone to turf wars. For researchers of “Zeitgeschichte”
it cannot be easy to move beyond the field of specialty for which they have been known for
years or even decades to the public, to academia and, not to forget, to the institutions
responsible for funding: university heads and the federal ministry.
Still, if the current focus of Contemporary History in the German language area
remains the same, should it then still be called “Zeitgeschichte”?
Although far from widespread, a discussion about the crisis of academic
“Zeitgeschichte” has already begun110. To prevent any such crisis from manifesting itself
further, a new and clear answer to Hans Rothfels’ original question about the task of
“Zeitgeschichte” is required. Such an answer would need to provide a clear profile of
“Zeitgeschichte” and could thereby explicitly denote the diversity of topics and methods as
inherent properties. Absolute definitions limiting “Zeitgeschichte” to the 20th century or to
a certain geographic area will be of very little help in that regard. Such definitions are
bound to become more and more unsatisfactory with the progress of time and the global
challenges of the future.
2 Subject gateways are needed as hubs in the Contemporary History Network
As has been demonstrated in chapter V 5, the network density of the ZIS/ZOL/VLZ
Contemporary History network is very low. Only a negligible number of websites link to one
another. Reasons for this include, inter alia, the still reluctant attitude of German
historians towards “Computing in the Humanities” in general111 and the World Wide Web in
particular. In the recent “Einführung in die Zeitgeschichte” by Horst Möller and Udo Wengst,
the only article related to the field of Computing in the Humanities is titled “Internet”112
and comprises six pages in a subsection called “practical tools”. The quality of that article
with regard to its usability for students of Contemporary History matches its rather meagre
length and furthermore mirrors the attitude towards the relevance of the Internet for the
field of Contemporary History in the German language area; a relevance that is even tersely
questioned per se in the preface of that book113, published by one of the largest academic
institutions dealing with “Zeitgeschichte” in the German language area.
110 Cf. Gehler "Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und Globalisierung", 197-204. Also see Falch, et al. (2003) "The neXt Generation - 7 Positionen" 383.
111 Cf. Enderle "Geschichtswissenschaft, Fachinformation und das Internet" 2.
The character of most of the web pages related to Contemporary History as indexed by
the three subject gateways ZIS/ZOL/VLZ can be seen as another reason for the very low
network density: many of the pages are merely representative in character and include neither
interactive elements such as discussion forums, nor sources, nor other material potentially
relevant for research or teaching. In addition, many of the web pages do not meet scientific
publishing standards114, revealing a striking discrepancy between the “offline world” and the
“online world” of scientific Contemporary History. Most web pages lack any link to
subject-related counterparts. One may therefore conclude that the original idea of hypertext
has not had its breakthrough in the Contemporary History network.
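To make the notion of network density concrete, the following minimal Python sketch computes the density of a small directed link network. It assumes the common definition for directed graphs without self-links (realized links divided by all possible ordered pairs of sites), which may differ in detail from the measure used in chapter V 5. The site names and links are invented for illustration and are not taken from the harvested data.

```python
def network_density(nodes, links):
    """Density of a directed network without self-links: |E| / (n * (n - 1))."""
    n = len(nodes)
    if n < 2:
        return 0.0
    # Count each distinct ordered pair once; ignore self-links and unknown sites.
    edges = {(s, t) for s, t in links if s != t and s in nodes and t in nodes}
    return len(edges) / (n * (n - 1))


# Hypothetical example: five sites, only two links between them.
sites = {"a.example", "b.example", "c.example", "d.example", "e.example"}
links = [("a.example", "b.example"), ("c.example", "a.example")]

density = network_density(sites, links)
print(f"density = {density:.2f}")
```

With five sites there are 20 possible directed links, so two realized links already yield a density of only 0.10; a network in which most pages link to no counterpart at all scores far lower.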
For Contemporary History research professionals in the German language area, the
H-Net mailing list H-Soz-u-Kult115 has established itself in a central role for
subject-specific communication and information116. Web pages, including subject gateways,
play only a secondary role117, although subject gateways could provide a unique service,
especially with regard to editorial quality filters.
Interested non-professionals and undergraduate students, on the other hand, will
primarily see the “online world” of Contemporary History through the World Wide Web.
Web-based subject gateways that fill the large gaps in the Contemporary History network are
therefore a necessity given the current state of the World Wide Web.
112 Möller and Wengst "Einführung in die Zeitgeschichte" 255-260.
113 Ibid. 12: “Ein Kapitel unter der Überschrift ‘Praktische Hilfsmittel’ [..] enthält [..] einige Basisinformationen über das Internet, soweit es für Zeithistoriker von Bedeutung ist” (emphasis added).
114 Cf. Wirtz 4 and Dornik "Zeitgeschichte und Internet" 166.
115 Cf. Footnote 25.
116 Cf. Enderle "Geschichtswissenschaft, Fachinformation und das Internet" 5.
117 Cf. the conclusion of Wirtz about the Internet as a publishing medium for Contemporary Historians, 4: “Offensichtlich wird das Internet bei ZeitgeschichtlerInnen immer noch nicht als wissenschaftliches Publikationsmittel ernst genommen”.
VII Conclusions
In the course of this field study, three major web-based subject gateways on
Contemporary History in the German language area have been analyzed as an example of
applied Historical Information Science.
Applying generic technical web-page evaluation methods first showed that the three
subject gateways’ support for formal standards is poor. Neither syntactic nor metadata nor
accessibility standards are applied to an adequate degree. A usability evaluation using two
sets of custom evaluation criteria revealed that the three subject gateways score
satisfactorily against a usability guideline set for searchable databases, and that they
score well against a generic web-usability guideline set.
For further analysis, specifically tailored crawlers had to be developed that use
heuristics to harvest the data of the link databases, because the subject gateways use no
standards-based interoperability framework. Analyzing the data in the harvested aggregate link
database demonstrated that the subject gateways’ management software neither prevents
duplicate resources from being entered nor takes measures to avoid spelling mistakes
during classification.
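How duplicate resources might be caught at entry time can be sketched as follows. This is an illustrative assumption, not the method used in this study or in the gateways’ software: URLs are reduced to a simple comparison key (lower-cased host, path without trailing slash, query string), and entries sharing a key are reported as likely duplicates. The function names and the sample entries are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a comparison key: lower-cased host, trimmed path, query."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/") or "/"
    key = host + path
    if parts.query:
        key += "?" + parts.query
    return key


def find_duplicates(entries):
    """Group (entry_id, url) pairs by normalized URL; keep groups with > 1 entry."""
    groups = defaultdict(list)
    for entry_id, url in entries:
        groups[normalize_url(url)].append(entry_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical link database entries.
entries = [
    (1, "http://www.example.org/archive/"),
    (2, "HTTP://WWW.EXAMPLE.ORG/archive"),
    (3, "http://www.example.org/sources"),
]

duplicates = find_duplicates(entries)
print(duplicates)
```

The normalization is deliberately conservative: it ignores differences in letter case of the host and trailing slashes, but treats different paths or query strings as different resources.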
Subject classification analysis has shown that a strong focus on “National Socialism”
and “Holocaust” is present in the “online world” of Contemporary History in the German
language area. A comparison with the subject classification focus of the “offline world”
revealed a similarly strong subject focus.
The three subject gateways’ link databases share less than 1% of their total sum
of URLs, which indicates that the overall network density of the ZIS/VLZ/ZOL network is
very low. Applying information network analysis methods further supports this indication.
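The overlap figure can be illustrated with a small Python sketch. It rests on one possible reading of “total sum of URLs”, namely the sum of the sizes of the individual databases; dividing by the number of distinct URLs across all databases would be an equally defensible choice. The URL sets below are invented and do not reproduce the harvested data.

```python
def shared_url_share(*databases):
    """Share of URLs present in every database, relative to the summed sizes."""
    sets = [set(db) for db in databases]
    if not sets:
        return 0.0
    shared = set.intersection(*sets)
    total = sum(len(s) for s in sets)
    return len(shared) / total if total else 0.0


# Hypothetical URL sets for three gateways.
zis = {"u1", "u2", "u3"}
zol = {"u1", "u4"}
vlz = {"u1", "u5", "u6", "u7", "u8"}

share = shared_url_share(zis, zol, vlz)
print(f"shared by all three: {share:.0%}")
```

In this toy example one URL out of ten entries is shared, giving 10%; the study reports a value below 1% for the real link databases.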
The field study has shown that the three subject gateways are not making use of the
potential of current web technologies and metadata standards. Especially in the light of the
development towards a Semantic Web, it is to be hoped that the subject gateways on
Contemporary History at issue will improve their standards compliance and interoperability, so
that they will not remain in the domain of a comparatively meaningless web.
The results of applying the custom analysis framework stimulated further
discussion on the meaning and purpose of “Zeitgeschichte” as a term per se on the one hand
and as an academic profession on the other. In addition, they inspired an interpretation of
the role of subject gateways in the Contemporary History Network of the World Wide Web.
This study has introduced aspects of a methodological canon for evaluating subject
gateways as an example of applied Historical Information Science. The author has shown that
approaching historical content by analyzing its “secondary”, i.e. formal, aspects can prepare
the ground for new insights about the content itself, and that such analysis can enrich the
discussion about the current status and prospects of a whole discipline.
VIII Bibliography
Angerer, T. (1997). "Gegenwartsgeschichte"? Für eine Zeitgeschichte ohne Ausflüchte.
Zeitgeschichte im Wandel. 3. Österreichische Zeitgeschichtetage 1997, Wien, Innsbruck -
Wien: Studienverlag 1998: 46-53.
Angerer, T. (2004). "'Eigene' deutsche Zeitgeschichte oder: wie veraltet eine neue 'Einführung
in die Zeitgeschichte' sein kann." zeitgeschichte 4 (31): 261-269.
Angerer, T. (2005). Zur Kritik an NS-Fixierungstendenzen der Österreichischen
Zeitgeschichtsforschung. Mit einem Blick auf den französischen Vergleichsfall.
Demokratie - Zivilgesellschaft - Menschenrechte. Österreichischer Zeitgeschichtetag
2001, Klagenfurt, 4. - 6. Oktober 2001, Klagenfurt, Innsbruck - Wien - München - Bozen:
Studienverlag 2005 (in print). Longer PDF-Version available at
http://www.univie.ac.at/igl.geschichte/angerer/IfG_homepage/Aufsaetze/Angerer_Zur_Kr
itik_an_NS-Fixierungstendenzen.pdf.
Apps, A. (2004). zetoc SOAP: A Web Service Interface for a Digital Library Resource.
Research and Advanced Technology for Digital Libraries, 8th European Conference,
ECDL 2004, Bath, UK.
Arbeitsgemeinschaft der Parlamentsbibliotheken und Behördenbibliotheken. (2003).
"Stellungnahmen, Materialien und Informationen zu dem Beschluss des
Standardisierungsausschusses bei der Deutschen Bibliothek, einen Umstieg von den
deutschen auf internationale Regelwerke und Formate (AACR und MARC) anzustreben."
Retrieved 2005-04-05, from http://www.apbb.de/aacr.html.
Baca, M., A. Gilliland-Swetland, T. Gill and M. Woodley. (2000). "Introduction to metadata:
Pathways to digital information." Retrieved 2005-04-20, from
http://www.getty.edu/research/conducting_research/standards/intrometadata/index.html.
Badre, A. N. (2002). Shaping Web Usability: Interaction Design in Context, Addison-Wesley
Professional.
Barton, M. R. and M. M. Waters. (2004). "Creating an Institutional Repository: LEADIRS
Workbook." Retrieved 2005-04-08, from http://dspace.org/implement/leadirs.pdf.
Batagelj, V. (2005). Pajek 1.04. Retrieved 2005-04-04, from http://vlado.fmf.uni-
lj.si/pub/networks/pajek/.
Benta, M. (2003). Agna 2.1.1. Retrieved 2005-03-02, from
http://www.geocities.com/imbenta/agna/.
Berendt, B., A. Hotho, D. Mladenic, M. van Someren, M. Spiliopoulou and G. Stumme
(2004). A Roadmap for Web-Mining: From Web to Semantic Web. Web Mining: From
Web to Semantic Web. First European Web Mining Forum, EWMF 2003, Cavtat-
Dubrovnik, Croatia.
Bergman, M. M. and A. P. M. Coxon (2005). "The Quality in Qualitative Methods." Forum:
Qualitative Social Research 6 (2). Retrieved 2005-12-06, from http://www.qualitative-
research.net/fqs-texte/2-05/05-2-34-e.htm.
Berners-Lee, T., J. Hendler and O. Lassila (2001). "The Semantic Web. A new form of Web
content that is meaningful to computers will unleash a revolution of new possibilities"
Scientific American (May 2001). Retrieved 2006-01-27, from
http://www.scientificamerican.com/article.cfm?articleID=00048144-10D2-1C70-
84A9809EC588EF21&catID=2.
Biste, B. and R. Hohls (2000). "Fachinformation und EDV-Arbeitstechniken für Historiker.
Einführung und Arbeitsbuch. Anhang: Online-Referenz." HSR-TRANS 6. Retrieved
2005-06-20, from http://hsr-trans.zhsf.uni-koeln.de/volume6.htm.
Blandford, A. and G. Buchanan (2002). Workshop report: Usability of Digital Libraries @
JCDL’02. Joint Conference on Digital Libraries 2002. Portland, Oregon, USA. Retrieved
2005-11-27, from http://www.uclic.ucl.ac.uk/annb/docs/SIGIR.pdf.
Bodoff, D., P. C. K. Hung and M. Ben-Menachem (2005). "Web metadata standards:
observations and prescriptions." IEEE Software 22 (1): 78-85. Retrieved 2005-03-01,
from
http://ieeexplore.ieee.org/iel5/52/30054/01377128.pdf?isnumber=30054&prod=JNL&arn
umber=1377128&arSt=+78&ared=+85&arAuthor=David+Bodoff%3B+Hung%2C+P.C.
K.%3B+Ben-Menachem%2C+M.
Böhler, I. (2001). "Zeitgeschichtsforschung und Internet. ZIS (Zeitgeschichte-Informations-
System) als Beispiel." e-forum Zeitgeschichte 1. Retrieved 2005-03-18, from
http://www.eforum-zeitgeschichte.at/1_01a5.html.
Böhler, I. and M. Gehler (2004). "Wendungen nach innen? Selektive Blicke auf die
Zeitgeschichte." Zeithistorische Forschungen/Studies in Contemporary History 1.
Retrieved 2005-03-01, from http://www.zeithistorische-forschungen.de/16126041-
Boehler-Gehler-1-2004.
Böhler, I., M. Kröll and E. Pfanzelter (1999). "Surfen in der Zeitgeschichte. ZIS: Das
österreichische Zeitgeschichte-Informations-System im Internet." medien & zeit.
Kommunikation in Geschichte und Gegenwart 14 (4): 43-50.
Böhler, I. and E. Pfanzelter (1997). ZIS: Das österreichische Zeitgeschichte-Informations-
System im Internet. Zeitgeschichte im Wandel. Österreichische Zeitgeschichtetage 1997,
Wien, Innsbruck - Wien: Studienverlag 1998: 449-458.
Boonstra, O., L. Breure and P. Doorn (2004). Past, present and future of historical
information science. Retrieved 2005-04-02, from
http://www.niwi.nl/nl/geschiedenis/medewerkers/peter_doorn_home_page/new_0_copy1/
past_present_future_of_historical_information_science/new/C%3A%5CDocuments+and+
Settings%5CPeterD%5CMy+Documents%5CNIWI%5CPPF%5CPPF+voor+web.pdf.
Borgatti, S. P. (2002). NetDraw: Graph Visualization Software, Analytic Technologies.
Retrieved 2005-03-18, from http://www.analytictech.com/ucinet.htm.
Borgatti, S. P., M. G. Everett and L. C. Freeman (1999). UCINET 6.0 Version 1.00, Analytic
Technologies. Retrieved 2005-03-18, from http://www.analytictech.com/ucinet.htm.
Botz, G. (1991). Zeitgeschichte in einer politisierten Geschichtskultur. Historiographie zum
20. Jahrhundert in Österreich. Geschichtswissenschaft vor 2000. Perspektiven der
Historiographiegeschichte, Geschichtstheorie, Sozial- und Kulturgeschichte. Festschrift
für Georg G. Iggers zum 65. Geburtstag. K. H. Jarausch, J. Rüsen and H. Schleier. Hagen:
299-328.
Botz, G. (1993). Zwölf Thesen zur Zeitgeschichte in Österreich. Österreichischer
Zeitgeschichtetag 1993, Innsbruck, Wien: Studienverlag 1995: 19-33.
Brandes, U., D. Wagner, M. Baur, M. Benkert, S. Cornelsen, M. Gaertler, B. Köpf, J. Lerner
and J. Ritter (2002). visone 1.1. Retrieved 2005-03-05, from http://www.visone.de/.
Brodersen, M., J. Danyel and J.-H. Kirsch (2003). "Zeitgeschichte-Online - Ein Fachportal
für die zeithistorische Forschung." Potsdamer Bulletin für Zeithistorische Studien (30/31):
12-16. Retrieved 2005-01-10, from http://www.zzf-pdm.de/bull/pdf/b3031/3031_zol.pdf.
Campbell, D., N. Van Kempen, L. Arkles and B. Rozmus. (2003). "Definitions for Web-
Based Services." Retrieved 2006-01-23, from
http://www.nla.gov.au/initiatives/sg/servicetypes.html.
Caplan, P. (2004). "OAI-PMH." Computers in Libraries 24 (2): 24.
Clio-online – Historisches Informationssystem. (2005). "Arbeits- und Ergebnisbericht des
DFG-Projektes Clio-online - Historisches Informationssystem (Bericht zur Projektphase
I)." Retrieved 2005-03-21, from http://www.clio-
online.de/rainbow/_Rainbow/documents/Clio_online_Endbericht_Web_20050211.pdf
Dean, R. J. (2003). FAST: Development of Simplified Headings for Metadata. Authority
Control: Definition and International Experiences. Florence, Italy. Retrieved 2005-12-13,
from http://www.sba.unifi.it/ac/relazioni/dean_eng.pdf.
Deutsche Bibliothek. (2005). "Schlagwortnormdatei (SWD)." Retrieved 2005-12-12, from
http://www.ddb.de/standardisierung/normdateien/swd.htm.
Dornik, W. (2003). Zeitgeschichte und Internet, Dissertation, University of Graz, 223 pages.
Edmonds, K. A., A. Stephenson and M. Ashmore (2003). uzReview 0.7.1. Retrieved 2005-12-
14, from http://uzilla.mozdev.org/heuristicreview.html.
Enderle, W. (2001). Der Historiker, die Spreu und der Weizen, zur Qualität und Evaluierung
geschichtswissenschaftlicher Internet-Ressourcen. Geschichte und Internet – Raumlose
Orte, geschichtslose Zeit? Geschichte und Informatik – Histoire et Informatique 12: 49-
64. Retrieved 2006-01-17, from http://www.hist.net/hs-
kurs/qualitaet/doku/enderle_qualitaet.pdf.
Enderle, W. (2001). "Geschichtswissenschaft, Fachinformation und das Internet." eforum
zeitGeschichte 3/4. Retrieved 2006-01-17, from http://www.eforum-
zeitgeschichte.at/3_01a7.pdf.
Eversberg, B. (1999). "AACR: Die 50 wichtigsten Begriffe." Retrieved 2005-04-05, from
http://www.allegro-c.de/formate/aacr-it.htm.
Eversberg, B. (2004). "Eine seltene Sache. Erwartung und Ernüchterung bei der thematischen
Katalogsuche." Retrieved 2005-12-13, from http://www.allegro-c.de/regeln/cosarara.htm.
Eversberg, B. (2005a). "Sachliche Erschließung. Eine Aufgabe mit vielen Facetten."
Retrieved 2005-12-13, from http://www.allegro-c.de/formate/se.htm.
Eversberg, B. (2005b). "Wie katalogisiert man ein Buch? Ein Leitfaden nicht nur für
Einsteiger." Retrieved 2005-12-10, from http://www.allegro-c.de/regeln/rak-einf.htm.
Eversberg, B. (2005c). "Zur Zukunft der Katalogisierung.jenseits RAK und AACR."
Retrieved 2005-12-12, from http://www.allegro-c.de/formate/zk.htm.
Falch, S., H.-C. Gruber, G. Lamprecht, L. Rettl, A. Schober, M. Sommer and R. Thumser
(2003). "The neXt Generation - 7 Positionen." Zeitgeschichte 30 (6): 376-386.
Gehler, M. (2001). Zeitgeschichte im dynamischen Mehrebenensystem: Zwischen
Regionalisierung, Nationalstaat, Europäisierung, internationaler Arena und
Globalisierung. Bochum.
Gehler, M. (2002). "Zeitgeschichte zwischen Europäisierung und Globalisierung." Aus Politik
und Zeitgeschichte 51-52: 23-35. Retrieved 2006-01-15, from
http://www.bpb.de/publikationen/0RIY7E,0,0,Zeitgeschichte_als_wissenschaftliche_Aufk
l%E4rung.html.
Gehringer, H. (2003). "Rez. WWW: Zeitgeschichte Informations System (ZIS)." Retrieved
2005-01-18, from http://hsozkult.geschichte.hu-
berlin.de/rezensionen/id=18&type=rezwww.
Göttingen, Niedersächsische Staats- und Universitätsbibliothek (1999). Das
Sondersammelgebiets-Fachinformationsprojekt (SSG-FI) Göttingen. Dokumentation – Teil 1.
dbi-materialien. Schriften der Deutschen Forschungsgemeinschaft, Deutsches
Bibliotheksinstitut. 185. Retrieved 2005-02-17, from
http://www.sub.uni-goettingen.de/ssgfi/projekt/ssgfi.pdf.
Gradmann, S. (2005). Hat Bibliothekswissenschaft eine Zukunft? Abweichlerische Gedanken
zur Zukunft einer Disziplin mit erodierendem Gegenstand. Bibliothekswissenschaft – quo
vadis? = Library Science – quo vadis? Eine Disziplin zwischen Traditionen und Visionen;
Programme – Modelle – Forschungsaufgaben. P. Hauke: 97-102. Retrieved 2005-12-11,
from http://www.rrz.uni-
hamburg.de/RRZ/S.Gradmann/Bibliothekswissenschaft_gradmann.pdf.
Greenberg, J. (2003a). "The Semantic Web: More than a Vision." Bulletin of the American
Society for Information Science and Technology 29 (4): 6-7. Retrieved 2005-04-09, from
http://www.asis.org/Bulletin/Apr-03/greenberg.html.
Greenberg, J. (2004). "Metadata Extraction and Harvesting: A Comparison of Two Automatic
Metadata Generation Applications." Journal of Internet Cataloging 6 (4): 59-82 Retrieved,
from http://www.ils.unc.edu/mrc/automatic.pdf.
Greenberg, J., K. Spurgin and A. Crystal (2005). Final Report for the AMeGA (Automatic
Metadata Generation Applications) Project, UNC School of Information and Library
Science. Retrieved 2005-03-06, from
http://www.loc.gov/catdir/bibcontrol/lc_amega_final_report.pdf.
Greenberg, J., S. Sutton and D. G. Campbell (2003). "Metadata: A Fundamental Component
of the Semantic Web." Bulletin of the American Society for Information Science and
Technology 29 (4): 16-18. Retrieved 2005-04-09, from http://www.asis.org/Bulletin/Apr-
03/greenbergetal.html.
Hackathorn, R. (2003a). "The Link is the Thing. Part I." DM Review (August). Retrieved
2005-04-12, from http://www.bolder.com/ALA/Link-DMR2003.pdf.
Hackathorn, R. (2003b). "The Link is the Thing. Part II." DM Review (September). Retrieved
2005-04-12, from http://www.bolder.com/ALA/Link-DMR2003.pdf.
Hackathorn, R. (2003c). "The Link is the Thing. Part III." DM Review (October). Retrieved
2005-04-12, from http://www.bolder.com/ALA/Link-DMR2003.pdf.
Hanneman, R. A. (2001). "Introduction to Social Network Methods." Retrieved 2005-04-10,
from http://faculty.ucr.edu/~hanneman/SOC157/NETTEXT.PDF.
Hauke, P., J. Grünewald, B. Kaden, A. Kaufmann and M. Kindling (2005). Library Science -
quo vadis? (Re)Discovering “Bibliothekswissenschaft”. World Library and Information
Congress: 71th IFLA General Conference and Council. "Libraries - A voyage of
discovery". Oslo, Norway. Retrieved 2005-12-11, from
http://www.ifla.org/IV/ifla71/papers/048e-Hauke.pdf.
Heiner-Freiling, M. and L. G. Svensson. (2005). "Dewey-Dezimalklassifikation." Retrieved
2005-12-13, from http://www.ddc-deutsch.de/.
Hennig, N. and C. Quirion. (2004). "Best practices for web interfaces of searchable
databases." Retrieved 2005-12-01, from
http://macfadden.mit.edu:9500/webgroup/heuristics/index.html.
Hernández, M. A. and S. J. Stolfo (1998). "Real-world Data is Dirty: Data Cleansing and The
Merge/Purge Problem." Data Mining and Knowledge Discovery 2 (1): 9-37. Retrieved
2005-05-01, from
http://springerlink.metapress.com/openurl.asp?genre=article&id=doi:10.1023/A:10097616
03038.
Hillman, D. (2003). "Using Dublin Core." Retrieved 2005-03-03, from
http://www.dublincore.org/documents/usageguide/.
Hockerts, H. G. (2001b). "Zugänge zur Zeitgeschichte: Primärerfahrung, Erinnerungskultur,
Geschichtswissenschaft." Aus Politik und Zeitgeschichte 28: 15-30. Retrieved 2006-01-
16, from
http://www.bpb.de/publikationen/JSE0YE,0,0,Zug%E4nge_zur_Zeitgeschichte:_Prim%E
4rerfahrung_Erinnerungskultur_Geschichtswissenschaft.html.
Hockerts, H. G. (2001a). "Zeitgeschichte in Deutschland. Begriff, Methoden, Themenfelder."
Aus Politik und Zeitgeschichte 29-30: 3-19.
Hohls, R. (2004). "H-Soz-u-Kult: Kommunikation und Fachinformation für die
Geschichtswissenschaften." Historical Social Research (HSR) 29 (1): 212-232.
Hohls, R. and P. Helmberger (1999). "H-Soz-u-Kult: Eine Bilanz nach drei Jahren." Historical
Social Research (HSR) 24 (3): 7-35.
Holzbauer, R. (2001). "Ein Historiker im Netz." e-forum Zeitgeschichte (3). Retrieved 2005-
03-18, from http://www.eforum-zeitgeschichte.at/set3_01a5.htm.
Jacob, E. K. (2003). "Ontologies and the Semantic Web." Bulletin of the American Society
for Information Science and Technology 29 (4): 19-22. Retrieved 2005-04-09, from
http://www.asis.org/Bulletin/Apr-03/jacob.html.
Jarausch, K. H. (2004). "Zeitgeschichte zwischen Nation und Europa. Eine transnationale
Herausforderung." Aus Politik und Zeitgeschichte 39. Retrieved 2006-01-15, from
http://www.bpb.de/publikationen/YZF6ZR,0,0,Zeitgeschichte_zwischen_Nation_und_Eur
opa.html.
Jenks, S. and S. Marra, Eds. (2001). Internet-Handbuch Geschichte. Köln - Weimar - Wien.
Jenks, S. and P. Tiedemann (2000). Internet für Historiker. Eine praxisorientierte Einführung.
Darmstadt.
Kelly, B. (2004). Interoperable Digital Library Programmes? We Must Have QA! Research
and Advanced Technology for Digital Libraries, 8th European Conference, ECDL 2004,
Bath, UK.
Kelly, B., A. Dunning, M. Guy and L. Phipps (2003). "Ideology Or Pragmatism? Open Standards And
Cultural Heritage Web Sites". ichim03. Retrieved 2005-03-06, from
http://www.ukoln.ac.uk/qa-focus/documents/papers/ichim03/.
Kieslinger, C. (2004). Historischer Content Online – Fachinformation in Österreich. ODOK
'03 Ein Jahrzehnt World Wide Web: Rückblick - Standortbestimmung - Ausblick.
Tagungsbericht vom 10. Österreichischen Online-Informationstreffen und 11.
Österreichischen Dokumentartag, 23.-26. September 2003, Universität Salzburg,
Naturwissenschaftliche Fakultät. Ed. E. Pipp, Biblos-Schriften 179: 237-247.
Kleßmann, C. (2002). "Zeitgeschichte als wissenschaftliche Aufklärung." Aus Politik und
Zeitgeschichte 51-52: 3-12. Retrieved 2006-01-15, from
http://www.bpb.de/publikationen/0RIY7E,0,0,Zeitgeschichte_als_wissenschaftliche_Aufk
l%E4rung.html.
Koch, T. (2000). "Quality-controlled subject gateways: definitions, typologies, empirical
overview." Online Information Review 24 (1): 24-34. Manuscript retrieved 2006-01-23,
from http://www.lub.lu.se/~traugott/OIR-SBIG.txt.
Koyani, S. J., R. W. Bailey, J. R. Nall, S. Allison, C. Mulligan, K. Bailey and M. Tolson
(2003). Research-Based Web Design & Usability Guidelines, U.S. Department of Health
and Human Services (HHS). Retrieved 2005-11-25, from
http://usability.gov/pdfs/guidelines_book.pdf.
Kröll, M. (2005). Not ready for the Semantic Web: A field study of subject gateways on
Contemporary History. XVI international conference of the Association for History and
Computing (AHC 2005). Amsterdam, Netherlands: 176-181. Retrieved 2006-01-10, from
http://www.knaw.nl/publicaties/pdf/20051064.pdf.
Kropač, I. (2004). "Was ist 'Historische Fachinformatik und Dokumentation'?
Terminologisches, Inhalte, Aufgaben." Retrieved 2006-01-23, from http://hfi.uni-
graz.at/hfi/allg/hfi.htm.
Krug, S. (2000). Don't Make Me Think. A Common Sense Approach to Web Usability.
Indianapolis, Indiana, USA.
Kyunghye, K. (2002). A Model-based Approach to Usability Evaluation for Digital Libraries.
JCDL'02 Workshop on Usability of Digital Libraries, Portland, Oregon, USA. Retrieved
2005-11-27, from http://www.uclic.ucl.ac.uk/annb/docs/Kim33.pdf.
Lagoze, C., H. Van de Sompel, M. Nelson and S. Warner. (2002). "The Open Archives
Initiative Protocol for Metadata Harvesting." Retrieved 2005-04-10, from
http://www.openarchives.org/OAI/openarchivesprotocol.html.
Longzhuang, L., Y. Shang and W. Zhang (2002). Improvement of HITS-based Algorithms on
Web Documents. The Eleventh International World Wide Web Conference. Honolulu,
Hawaii, USA. Retrieved 2005-09-01, from http://www2002.org/CDROM/refereed/643/.
Lorenz, B. "The Regensburg Classification: An introduction." Retrieved 2005-12-20, from
http://www.bibliothek.uni-regensburg.de/Systematik/RVK-Intro_en.pdf.
Mattl, S. (1983). Bestandsaufnahme zeitgeschichtlicher Forschung in Österreich,
Bundesministerium für Wissenschaft und Forschung, Vienna.
Mattl, S. (2003). "Nicht eine, sondern viele Zeitgeschichten. In Annahme einer ‘dritten
Generation’." Zeitgeschichte 30 (6): 357-366.
McCrank, L. (2002). Historical Information Science. An emerging Unidiscipline. Medford,
New Jersey, USA.
Miller, E. and R. Swick (2003). "An Overview of W3C Semantic Web Activity." Bulletin of
the American Society for Information Science and Technology 29 (4): 8-11. Retrieved
2005-04-09, from http://www.asis.org/Bulletin/Apr-03/millerswick.html.
Mirzaee, V., L. Iverson and B. Hamidzadeh (2004). Towards Ontological Modelling of
Historical Documents. 7th International Protégé Conference. Bethesda, Maryland, USA.
Retrieved 2005-10-01, from
http://protege.stanford.edu/conference/2004/abstracts/Mirzaee.pdf.
Mitchell, W. B., L. Davidson, R. Ziegler and A. Viles. (1999). "Testing the Design of a
Library Information Gateway." Retrieved 2005-04-02, from
http://library.georgiasouthern.edu/usability/acrlwebpapers5.pdf.
Möller, H. (1988). "Zeitgeschichte - Fragestellungen, Interpretationen, Kontroversen." Aus
Politik und Zeitgeschichte 2: 3-16.
Möller, H. and U. Wengst, Eds. (2003). Einführung in die Zeitgeschichte. Munich.
Mruck, K. (2005). "Providing (Online) Resources and Services for Qualitative Researchers:
Challenges and Potentials." Forum: Qualitative Social Research 6 (2). Retrieved 2005-12-06,
from http://www.qualitative-research.net/fqs-texte/2-05/05-2-38-e.htm.
Murray, G. and T. Costanzo. (1999). "Usability and the Web: An Overview." Retrieved
2005-11-26, from http://www.collectionscanada.ca/9/1/p1-260-e.html.
Nagypal, G. (2005). History ontology building: The technical view. XVI international
conference of the Association for History and Computing (AHC 2005). Amsterdam,
Netherlands: 207-214. Retrieved 2006-01-10, from
http://www.knaw.nl/publicaties/pdf/20051064.pdf.
Neiling, M. (2004). Identifizierung von Realwelt-Objekten in multiplen Datenbanken,
Brandenburgische Technische Universität. Retrieved 2005-12-10, from
http://www.cis.cs.tu-berlin.de/~mneiling/NEILING_DISS_MIRROR_BTU-COTTBUS/neiling_m.htm.
Nentwich, M. (2003). Cyberscience. Research in the Age of the Internet. Vienna.
Newman, M. E. J. (2003). "The structure and function of complex networks." SIAM Review
(45): 167-256. Retrieved 2005-04-14, from http://arxiv.org/pdf/cond-mat/0303516.
Nielsen, J. (1996). "Original Top Ten Mistakes in Web Design." Retrieved 2005-11-26, from
http://www.useit.com/alertbox/9605a.html.
Nielsen, J. (1997). "Search and You May Find." Retrieved 2005-11-26, from
http://www.useit.com/alertbox/9707b.html.
Nielsen, J. (1999a). "‘Top Ten Mistakes’ Revisited Three Years Later." Retrieved 2005-11-26,
from http://www.useit.com/alertbox/990502.html.
Nielsen, J. (1999b). "The Top Ten New Mistakes of Web Design." Retrieved 2005-11-26,
from http://www.useit.com/alertbox/990530.html.
Nielsen, J. (2002a). "Top Ten Guidelines for Homepage Usability." Retrieved 2005-11-26,
from http://www.useit.com/alertbox/20020512.html.
Nielsen, J. (2002b). "Top Ten Web Design Mistakes of 2002." Retrieved 2005-11-26, from
http://www.useit.com/alertbox/20021223.html.
Nielsen, J. (2003a). "The Ten Most Violated Homepage Design Guidelines." Retrieved
2005-11-26, from http://www.useit.com/alertbox/20031110.html.
Nielsen, J. (2003b). "Top Ten Web Design Mistakes of 2003." Retrieved 2005-11-26, from
http://www.useit.com/alertbox/20031222.html.
Nielsen, J. (2004). "Top Ten Mistakes in Web Design." Retrieved 2005-11-26, from
http://www.useit.com/alertbox/9605.html.
Nielsen, J. (2005a). "Ten Usability Heuristics." Retrieved 2005-11-26, from
http://useit.com/papers/heuristic/heuristic_list.html.
Nielsen, J. (2005b). "Top Ten Web Design Mistakes of 2005." Retrieved 2005-11-26, from
http://www.useit.com/alertbox/designmistakes.html.
NIST (2002). WebSAT, NIST. Retrieved 2005-12-01, from
http://zing.ncsl.nist.gov/WebTools/WebSAT/overview.html.
Online Computer Library Center (OCLC). (2005). "OCLC Bibliographic Formats and
Standards." 3rd. Retrieved 2005-04-04, from http://www.oclc.org/bibformats/.
Online Computer Library Center (OCLC). (2005). "FAST: Faceted Application of Subject
Terminology." Retrieved 2005-12-12, from
http://www.oclc.org/research/projects/fast/default.htm.
Online Computer Library Center (OCLC). (2005). "OCLC Research Software." Retrieved
2005-03-20, from http://www.oclc.org/research/software/.
Page, L., S. Brin, R. Motwani and T. Winograd (1998). The PageRank Citation Ranking:
Bringing Order to the Web, Stanford Digital Library Technologies Project. Retrieved
2005-04-04, from http://dbpubs.stanford.edu:8090/pub/1999-66.
Pierau, K. (2003). "Datenbank- und Informationsmanagement in der Historischen
Sozialforschung." HSR-TRANS 14. Retrieved 2005-06-20, from
http://hsr-trans.zhsf.uni-koeln.de/volume14.htm.
Powell, A. (2003). "Expressing Dublin Core in HTML/XHTML meta and link elements."
Retrieved 2005-03-03, from http://www.dublincore.org/documents/dcq-html/.
Ravindranathan, U., R. Shen, M. A. Gonçalves, W. Fan, E. A. Fox and J. W. Flanagan (2004).
Prototyping Digital Libraries Handling Heterogeneous Data Sources - The ETANA-DL
Case Study. Research and Advanced Technology for Digital Libraries, 8th European
Conference, ECDL 2004, Bath, UK.
Reamy, T. (2004). "To Metadata or Not To Metadata." EContent 27 (10): 34-39. Retrieved
2005-04-20, from
http://www.econtentmag.com/Articles/ArticleReader.aspx?ArticleID=7118.
Ridings, C. and M. Shishigin (2002). PageRank Uncovered. Retrieved from
http://www.chriseo.com/pagerank/PageRank.pdf.
Rothfels, H. (1953). "Zeitgeschichte als Aufgabe." Vierteljahreshefte für Zeitgeschichte 1: 1-8.
Sabrow, M., R. Jessen, et al., Eds. (2003). Zeitgeschichte als Streitgeschichte. Große
Kontroversen nach 1945. Munich.
Sanderson, R. (2004). "A Gentle Introduction to SRW." Retrieved 2005-04-18, from
http://www.loc.gov/z3950/agency/zing/srw/introduction.html.
Schmidt, A. (1996). "Sacherschließung nach BIBOS." Mitteilungen der Vereinigung
Österreichischer Bibliothekarinnen & Bibliothekare 48 (3/4). Retrieved from
http://www.uibk.ac.at/sci-org/voeb/vm48-34.html#alfred.
Schwarz, H.-P. (2003). "Die neueste Zeitgeschichte." Vierteljahreshefte für Zeitgeschichte 4:
5-29.
Sensch, J. (2002). "Statistische Modelle in der Historischen Sozialforschung I. Allgemeine
Grundlagen - Deskriptivstatistik." HSR-TRANS 8. Retrieved 2005-06-07, from
http://hsr-trans.zhsf.uni-koeln.de/hsr7/.
Shneiderman, B. (1997). "Designing information-abundant web sites: issues and
recommendations." International Journal of Human-Computer Studies 47 (1): 5-29.
Retrieved 2005-12-01, from http://ijhcs.open.ac.uk/shneiderman/shneiderman-nf.html.
Short, H. (2002). "The Role of Humanities Computing: Experiences and Challenges."
Historical Social Research (HSR) 24 (4): 282-301. Retrieved 2005-12-10, from
http://hsr-trans.zhsf.uni-koeln.de/hsrretro/docs/artikel/hsr/hsr2002_560.pdf.
Smith, M., R. Rodgers, J. Walker and R. Tansley (2004). DSpace: A Year in the Life of an
Open Source Digital Repository System. Research and Advanced Technology for Digital
Libraries, 8th European Conference, ECDL 2004, Bath, UK.
Smith, P. A., I. A. Newman and L. M. Parks (1997). "Virtual hierarchies and virtual networks:
some lessons from hypermedia usability research applied to the World Wide Web."
International Journal of Human-Computer Studies 47 (1): 67-95. Retrieved 2005-12-01,
from http://ijhcs.open.ac.uk/smith/smith-nf.html.
Stearns, S. (2004). "Automate Classification and Improve Information Discovery." EContent
27 (7/8): 18.
Stumpf, G. (2004). "Internet-Informationen zur Sacherschließung." Retrieved 2005-12-20,
from http://www2.bibliothek.uni-augsburg.de/allg/swk/sacher_allg.html.
Tauscher, L. and S. Greenberg (1997). "How people revisit web pages: empirical findings and
implications for the design of history systems." International Journal of Human-Computer
Studies 47 (1): 97-137. Retrieved 2005-12-01, from
http://ijhcs.open.ac.uk/tauscher/tauscher-nf.html.
Tennant, R. (2004). "The Expanding World of OAI." Library Journal 129 (3): 32.
Thaller, M. (1997). Virtuelle (Zeit-)Geschichte. Eine Disziplin zwischen Popularität,
Postmoderne und dem Post-Post-Positivismus. Zeitgeschichte im Wandel. 3.
Österreichische Zeitgeschichtetage 1997, Wien, Innsbruck - Wien: Studienverlag 1998:
407-421.
Theng, Y. L., G. Buchanan, H. Thimbleby and N. Mohd-Nasir (2000). Purpose and usability
of digital libraries. Fifth ACM Conference on Digital Libraries, ACM DL'2000, San
Antonio, Texas, USA. Retrieved 2005-12-01, from
http://www.uclic.ucl.ac.uk/harold/srf/dl00-purpose.pdf.
Theng, Y. L., N. Mohd-Nasir and H. Thimbleby (2000). A Usability Tool for Web Evaluation
applied to Digital Library Design. WWW9 Poster Proceedings, Amsterdam, Netherlands.
Retrieved 2005-12-01, from http://www.uclic.ucl.ac.uk/harold/srf/www9-tool.pdf.
Thome, H. (2001). "Grundkurs Statistik für Historiker. Teil I: Deskriptive Statistik."
HSR-TRANS 7. Retrieved 2005-10-12, from http://hsr-trans.zhsf.uni-koeln.de/hsr2/.
Tillett, B. B. (2003). Authority Control: State of the Art and New Perspectives. Authority
Control: Definition and International Experiences. Florence, Italy. Retrieved 2005-12-13,
from http://www.sba.unifi.it/ac/relazioni/tillett_eng.pdf.
Umlauf, K. (2005). "Einführung in die Regeln für den Schlagwortkatalog RSWK." Retrieved
2005-12-11, from http://www.ib.hu-berlin.de/~kumlau/handreichungen/h66/.
Urbaner, R. and G. Lamprecht (2003). "'eForum zeitGeschichte' – ein Erfahrungsbericht."
zeitenblicke 2 (3). Retrieved 2005-03-08, from
http://www.zeitenblicke.historicum.net/2003/02/pdf/urbaner.pdf.
Van Laak, D. (2004). "Rez. WWW: Zeitgeschichte-online." Retrieved 2005-01-18, from
http://hsozkult.geschichte.hu-berlin.de/rezensionen/id=48&type=rezwww.
Vogeler, G., P. Sahle, H. Enzensberger and T. Frenz. (2005). "Historische
Hilfswissenschaften." Retrieved 2006-01-20, from
http://www.vl-ghw.uni-muenchen.de/hw.html#sect31.
Wirtz, S. (2005). "Marktanalyse. Deutschsprachige Online- und CD/DVD-Produktionen zum
Thema Nationalsozialismus und Holocaust. Ein Projekt des Fritz Bauer Instituts im
Auftrag der Bundeszentrale für politische Bildung." Retrieved 2006-01-20, from
http://www.fritz-bauer-institut.de/forschung/medienstudie.htm.
World Wide Web Consortium (W3C). (2004). "Architecture of the World Wide Web,
Volume One." Retrieved 2005-06-23, from http://www.w3.org/TR/webarch/.
IX Appendix
1 Web usability checklists
1.1 Best practices for web interfaces of searchable databases [118]

Nr.  ZIS  VLZ  ZOL  Criterion

Category “Homepages”
 1   1    1    1    description of scope
 2   0    1    1    table of contents
 3   1    1    0    prominent search box
 4   1    1    1    visible browse categories
 5   1    1    1    links back to parent organization
 6   1    1    1    consistent site id/logo on top
 7   1    1    1    links to contact info, staff, projects, and related systems
 8   0    1    1    link to "about" the project or system
 9   1    1    1    news/spotlight/featured items

Category “Search Screens”
10   1    0    0    visible examples of search syntax
11   0    0    1    link to advanced/detailed search
12   0    0?   1    ability to search the whole vs. search particular collections
13   1    0?   1    ability to AND/OR/NOT across fields
14   1    0    1    ability to limit search to specific fields
15   1    1    1    search should be prominent on home page and other pages (results)
16   0    0    0    good "no hits" screen with ideas for how to modify search
17   0    0?   0    the fact that you got no results should stand out on the screen

Category “Browse Screens”
18   1    1    1    visible categories on top level
19   1    1    1    make sure links look clickable
20   0    1    0    show number of hits in each category (before you click)

Category “Results Screens”
21   1    1    1    show the number of hits on the top of the page, and what you searched for
22   1    1    1    ability to modify the search right on that page
23   0    1    1    ability to sort by different criteria
24   1    1    1    make the default sort be the most useful one
25   0    0    0    ability to set how many results per page
26   1    1    1    Forward and back should be clear
27   0    0    0    rows of table display, every other row opposite color, helps scan
28   0    1?   1    have a brief display that links to a full record
29   1    1    1    avoid pop-up windows
30   0    0    0    links to related items
31   0    0    0    links to related searches

Sum: 18   20   22

[118] Cf. Hennig and Quirion "Best practices for web interfaces of searchable databases".
1.2 Selected [119] Nielsen web design mistakes

Year = Nielsen reference (year); Severity = severity (1-5) [120]

Nr.  Year   Severity  ZIS  VLZ  ZOL  Mistake
 1   1996   5         0    0    0    Gratuitous Use of Bleeding-Edge Technology
 2   1996   5         0    0    0    Scrolling Text, Marquees, and Constantly Running Animations
 3   1996   5         1    1    0    Outdated Information
 4   1996   5         0    0    0    Overly Long Download Times
 5   1999   5         0    0    0    Breaking or Slowing Down the Back Button
 6   2005   5         0    0    0    No Contact Information or Other Company Info
 7   1996   4         1    1    1    Complex URLs
 8   1996   4         1    0    0    Lack of Navigation Support
 9   2004   4         0    0    1    PDF Files for Online Reading
10   2005   4         0    0    0    Legibility Problems
11   2005   4         0    0    1    Non-Standard Links
12   2005   4         0    0    0    Flash
13   2005   4         0    0    0    Bad Search
14   2005   4         0    0    0    Cumbersome Forms
15   2005   4         0    1    1    Frozen Layouts with Fixed Page Widths
16   1996a  3         1    0    1    Using Frames
17   1996   3         0    0    0    Orphan Pages
18   1999   3         1    1    1    Opening New Browser Windows
19   1999   3         0    0    0    Non-Standard Use of GUI Widgets
20   1999   3         0    0    0    Headlines That Make No Sense Out of Context
21   1999   3         0    0    0    Jumping at the Latest Internet Buzzword
22   1999   3         0    0    0    Anything That Looks Like Advertising
23   2002   3         0    1    1    Horizontal Scrolling
24   2003   3         0    0    0    Unclear Statement of Purpose
25   2003   3         0    0    0    Overly Restrictive Form Entry
26   2004   3         0    0    0    Non-Scannable Text
27   2004   3         0    0    1    Page Titles With Low Search Engine Visibility
28   2004   3         0    0    0    Violating Design Convention
29   2005   3         0    0    0    Browser Incompatibility
30   1996   2         0    1    0    Long Scrolling Pages
31   1999   2         1    0    0    Lack of Biographies
32   2002   2         0    0    1    JavaScript in Links
33   2003   2         0    0    0    Small Thumbnail Images of Big, Detailed Photos
34   2002   1         0    0    0    Mailto Links in Unexpected Locations
35   2003   1         0    0    0    Overly detailed ALT Text
36   2003   1         0    1    0    Pages That Link to Themselves

Weighted Sum:         21   22   30

[119] This selection is based on Nielsen’s “Web Design Mistakes” (cf. footnote 35). The main
selection criterion was that a “mistake” had to be applicable to all three subject gateways.
“Mistakes” like “No Prices” (Nielsen "Top Ten Web Design Mistakes of 2002") or “No ‘What-If’
support” (Nielsen "Top Ten Web Design Mistakes of 2003") failed that test and have been omitted.
[120] “1” means least and “5” means most severe. Mistakes 5, 18-22, and 31 have been scored in
analogy to Nielsen "'Top Ten Mistakes' Revisited Three Years Later".
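
The weighted sums can be checked by hand: every mistake a gateway is flagged for (value 1) contributes its severity, and the contributions are added up. The following minimal sketch recomputes the three totals from the table; the names SEVERITY, FLAGGED and weighted_sum are chosen for this illustration only and do not come from the project's source code.

```python
# Severity (1-5) of every mistake that at least one gateway was flagged for,
# keyed by the mistake number used in the table.
SEVERITY = {3: 5, 7: 4, 8: 4, 9: 4, 11: 4, 15: 4, 16: 3, 18: 3,
            23: 3, 27: 3, 30: 2, 31: 2, 32: 2, 36: 1}

# Mistake numbers carrying a "1" for each gateway, read off the table.
FLAGGED = {
    "ZIS": [3, 7, 8, 16, 18, 31],
    "VLZ": [3, 7, 15, 18, 23, 30, 36],
    "ZOL": [7, 9, 11, 15, 16, 18, 23, 27, 32],
}


def weighted_sum(flagged, severity):
    """Add up the severities of all flagged mistakes."""
    return sum(severity[nr] for nr in flagged)


scores = {gateway: weighted_sum(numbers, SEVERITY)
          for gateway, numbers in FLAGGED.items()}
print(scores)  # {'ZIS': 21, 'VLZ': 22, 'ZOL': 30}
```

A higher weighted sum means more, or more severe, violations; mistakes that no gateway was flagged for contribute nothing and are therefore left out of SEVERITY.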
2 Top 50 keywords in the aggregated ZIS-, VLZ-, ZOL-link database

Nr.  Occurrences  Keyword
 1   425          Nationalsozialismus
 2   288          Holocaust
 3   243          Sozialgeschichte
 4   203          Widerstand
 5   171          Zweiter Weltkrieg
 6   167          Politik
 7   167          Kulturgeschichte
 8   166          Medien
 9   163          Politikgeschichte
10   151          Gesellschaft
11   130          Zeitgeschichte
12   124          Dokumente
13   111          Wirtschaftsgeschichte
14   111          Didaktik
15   110          Biographie
16   97           Opposition
17   96           Kultur
18   95           "Drittes Reich"
19   93           US-Amerikanische Geschichte
20   90           Parteien
21   90           Institutionen
22   90           Europa
23   89           Landesgeschichte
24   88           Nachkriegszeit
25   88           Wirtschaft
26   88           Elektronische Publikationen
27   84           Linksammlung
28   82           Hilfsmittel
29   81           Film
30   80           Archive
31   79           Erinnerungskultur
32   79           Alltag
33   77           Kalter Krieg
34   77           Demokratie
35   76           Antisemitismus
36   74           Technikgeschichte
37   73           Archiv
38   71           Migration
39   71           Bibliographie
40   67           2. Weltkrieg
41   67           Zeitschriften
42   66           Außenpolitik
43   65           Konzentrationslager
44   65           Kommunismus
45   62           Erster Weltkrieg
46   60           Alltagsgeschichte
47   60           Bibliothek
48   58           Rechtsgeschichte
49   57           DDR
50   57           Militärgeschichte
3 Top 50 keywords related to Contemporary History - Innsbruck University Library OPAC database

Nr.  Occurrences  Keyword
 1   1335         Juden
 2   1157         Geschichte 1933-1945
 3   1075         Drittes Reich
 4   877          Nationalsozialismus
 5   869          Deutschland <DDR>
 6   620          Südtirol
 7   535          Judentum
 8   451          Geschichte 1900-2000
 9   449          Weltkrieg <1939-1945>
10   382          Geschichte 1939-1945
11   367          Widerstand
12   325          Antisemitismus
13   322          Weimarer Republik
14   309          Konzentrationslager
15   307          Weltkrieg <1914-1918>
16   294          Geschichte 1938-1945
17   293          Judenverfolgung
18   276          Geschichte 1900-1990
19   271          Judenvernichtung
20   244          Arbeiterbewegung
21   244          Nationalismus
22   236          Geschichte 1918-1933
23   232          Geschichte 1945-1990
24   225          Geschichte 1945-1995
25   222          Vergangenheitsbewältigung
26   220          Palästina
27   218          Europäische Integration
28   203          Geschichte 1945
29   178          Geschichte 1945-1955
30   178          Geschichte 1918-1938
31   166          Deutsche Frage
32   152          Geschichte 1980-1990
33   150          Wehrmacht
34   150          Faschismus
35   144          Geschichte 1918-1945
36   144          Geschichte 1914-1918
37   134          Gebirgskrieg
38   128          Geschichte 1940-1945
39   127          Geschichte 1943-1945
40   125          Geschichte 1945-2000
41   125          Geschichte 1945-1949
42   123          Geschichte 1915-1918
43   121          Geschichte 1945-1985
44   119          Besatzungspolitik
45   117          Geschichte 1990-2000
46   116          Vertreibung
47   115          Politische Verfolgung
48   115          Geschichte 1941-1945
49   110          Geschichte 1989
50   109          Ost-West-Konflikt
4 URLs common to all three subject gateways ZIS/ZOL/VLZ
Network PageRank = PageRank within the ZIS/ZOL/VLZ network (n=2278, scale: 1-10)

Nr.  Google PageRank  Network PageRank  URL
 1   8                7.04              http://www.ushmm.org/
 2   7                8.87              http://www.dhm.de/
 3   7                0.69              http://www.history-journals.de/
 4   6                3.33              http://www.hdg.de/
 5   6                1.53              http://www.17juni53.de/
 6   6                1.35              http://www.chronik-der-mauer.de/
 7   6                1.28              http://www.fritz-bauer-institut.de/
 8   6                1.09              http://www.bstu.de/
 9   6                0.94              http://www.doew.at/
10   6                0.84              http://www.his-online.de/
11   6                0.30              http://www.sehepunkte.historicum.net/
12   6                0.22              http://www.wienerlibrary.co.uk/
13   6                0.15              http://www.beutekunst.de/
14   5                0.65              http://www.querelles-net.de/
15   5                0.54              http://www.gedenkstaettesteinhof.at/
16   5                0.35              http://www.rrz.uni-hamburg.de/FZH/
17   5                0.28              http://www.nachkriegsjustiz.at/
18   5                0.24              http://www.eforum-zeitgeschichte.at/
19   5                0.24              http://www.salvator.net/salmat/pw/luft/blockade.html
20   5                0.19              http://www.hdg.de/zfl/
21   4                0.51              http://www.h-ref.de/
22   4                0.20              http://www.uni-kassel.de/fb1/infonsnh/
23   4                0.19              http://www.nfhdata.de/premium/literaturdatenbank_index.html
24   4                0.17              http://www.topographie.de/imt/
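
How a PageRank value arises from a link network can be illustrated with a small power iteration. The sketch below is not the program used for this study (that one relies on Algorithm::PageRank, see section 8); it assumes a damping factor of 0.85, works on three made-up nodes rather than the 2278 nodes of the ZIS/ZOL/VLZ network, and does not rescale the result to the 1-10 range shown in the table.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a dict {node: [link targets]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        # Every node passes an equal share of its rank to each link target.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy network: two pages link to a hub, the hub links back to one.
toy_links = {
    "a.example": ["hub.example"],
    "b.example": ["hub.example"],
    "hub.example": ["a.example"],
}
ranks = pagerank(toy_links)
for url, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{url}: {value:.3f}")
```

In the toy network the hub ranks highest because both other pages link to it, which mirrors why a URL can score high inside the gateway network while scoring lower on the web as a whole, and vice versa.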
5 System setup and availability
The crawler, import, and analysis programs were developed and run on a Debian 3.1
(Sarge) [121] Linux system running under coLinux [122] on a Windows host, which kept the
setup mobile and flexible. PostgreSQL 8.0.3 [123] was used as the RDBMS and
Subversion 1.2.0 [124] as the version control system. The custom programs were implemented
in Perl [125] with modules from CPAN [126], among them Class::DBI [127] as a simple
database-to-object mapping layer.
The database, including the analysis data, the analysis SQL queries, and the source code of
the programs, is available at: http://www.pepl.info/papers/fieldstudy_sg_zeitgeschichte/
[121] http://www.us.debian.org/releases/sarge/, January 20th, 2006.
[122] http://www.colinux.org/, January 20th, 2006.
[123] http://www.postgresql.org/, January 20th, 2006.
[124] http://subversion.tigris.org/, January 20th, 2006.
[125] http://www.perl.org/, January 20th, 2006.
[126] http://cpan.perl.org/, January 20th, 2006.
[127] http://search.cpan.org/~tmtm/Class-DBI-v3.0.13/, January 20th, 2006.
6 Database design
The table cagipch_portals was filled manually with the three subject gateways,
Clio-Online, and the Aleph OPAC “Geschichte” of the University Library of the
University of Innsbruck.
All other tables were filled by the crawler and analysis programs. The view
cagipch_identifierlinks_idmatrix was used by the information network analysis
programs.

The data diagram:

cagipch_portals
    id: serial
    name: character varying(128)
    url: character varying(256)

cagipch_identifier_normalized
    id: serial
    identifier_normalized: character varying(512)
    google_pagerank: double precision
    cagipch_pagerank: double precision

cagipch_identifierlinks
    id: serial
    identifier: character varying(512)
    links_to: character varying(512)

cagipch_identifierlinks_idmatrix (view)
    id: integer
    identifier_id: integer
    links_to_id: integer

cagipch_items
    id: serial
    dces_description: character varying(4096)
    dces_date: timestamp(0)
    dces_type: character varying(200)
    dces_format: character varying(64)
    dces_identifier: character varying(512)
    dces_relation: character varying(200)
    dces_coverage: character varying(200)
    dces_rights: character varying(200)
    portal: integer
    last_time_checked: timestamp(0)
    url_host_part: character varying(128)
    dces_title: character varying(400)
    dces_source: character varying(400)
    dces_language: character varying(32)
    identifier_ltc: timestamp(0) without time zone
    identifier_status_code: integer
    identifier_redirected_to: character varying(512)
    identifier_content_md5sum: character varying(32)
    psource: character varying(128)
    dces_creator: character varying(1024)
    dces_subject: character varying(1024)
    dces_publisher: character varying(1024)
    dces_contributor: character varying(1024)
    identifier_normalized: character varying(512)

cagipch_keywords
    id: serial
    item: integer
    name: character varying(200)

cagipch_keywords_levenshtein
    id: serial
    kw_from: character varying(200)
    kw_to: character varying(200)
    distance: integer

cagipch_keywords_normalized
    id: serial
    name: character varying(200)
    usage_count: integer
    stem: character varying(200)
    soundex: character varying(8)
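
The relation between cagipch_identifierlinks, which stores links as URL strings, and the view cagipch_identifierlinks_idmatrix, which exposes them as integer ids, can be sketched as follows. The actual view definition is part of cagipch.sql and is not reproduced here, so the join below is an assumption about how such a view may be built. SQLite stands in for PostgreSQL to keep the example self-contained, the column types are simplified, and the two URLs and the link between them are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cagipch_identifier_normalized (
    id INTEGER PRIMARY KEY,
    identifier_normalized TEXT UNIQUE
);
CREATE TABLE cagipch_identifierlinks (
    id INTEGER PRIMARY KEY,
    identifier TEXT,
    links_to TEXT
);
-- Assumed definition: resolve both URL columns to their numeric ids.
CREATE VIEW cagipch_identifierlinks_idmatrix AS
SELECT l.id AS id, src.id AS identifier_id, dst.id AS links_to_id
FROM cagipch_identifierlinks AS l
JOIN cagipch_identifier_normalized AS src
  ON src.identifier_normalized = l.identifier
JOIN cagipch_identifier_normalized AS dst
  ON dst.identifier_normalized = l.links_to;
""")

# Invented sample data: two normalized URLs and one link between them.
conn.executemany(
    "INSERT INTO cagipch_identifier_normalized (identifier_normalized) VALUES (?)",
    [("http://a.example/",), ("http://b.example/",)],
)
conn.execute(
    "INSERT INTO cagipch_identifierlinks (identifier, links_to) VALUES (?, ?)",
    ("http://a.example/", "http://b.example/"),
)

rows = conn.execute(
    "SELECT identifier_id, links_to_id FROM cagipch_identifierlinks_idmatrix"
).fetchall()
print(rows)  # [(1, 2)]
```

Working with integer pairs instead of URL strings keeps the edge list compact, which is convenient for network analysis tools that expect numbered vertices.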
7 Overview of crawler and import programs
• aleph-keywords_import.pl
Imports items into cagipch_items for the Aleph OPAC “portal” using dummy titles and
URLs. Assigns keywords accordingly from a plaintext import file.
• harvester-vlz.pl
Harvests items from www.vl-zeitgeschichte.de and imports them into cagipch_items for
the Virtual Library Zeitgeschichte portal. Uses a caching version of
WWW::Mechanize::Sleepy [128] for polite harvesting.
• harvester-zol.pl
Harvests items from www.zeitgeschichte-online.de and www.clio-online.de and imports
them into cagipch_items for the Zeitgeschichte Online and Clio-Online portals. Uses a
caching version of WWW::Mechanize::Sleepy for polite harvesting and handles mixed
Latin1 and UTF-8 encoded metadata.
• zis_import.pl
Imports items into cagipch_items for the ZIS portal from an XML encoded import file.
Uses XML::LibXML [129] for parsing and handling the XML file.
• check_status_codes.pl
Checks HTTP status codes of items in cagipch_items using LWP::UserAgent [130].
Follows possible redirects and stores the MD5 sum of the returned content using
Digest::MD5 [131].
[128] http://search.cpan.org/~esummers/WWW-Mechanize-Sleepy-0.5/, January 20th, 2006.
[129] http://search.cpan.org/~phish/XML-LibXML-1.58/, January 20th, 2006.
[130] http://search.cpan.org/~gaas/libwww-perl-5.805/, January 20th, 2006.
[131] http://search.cpan.org/~gaas/Digest-MD5-2.36/, January 20th, 2006.
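
Two of the steps named above lend themselves to a short illustration: handling metadata that arrives in a mix of Latin1 and UTF-8 (harvester-zol.pl) and fingerprinting the returned content with an MD5 sum (check_status_codes.pl). The sketch below shows one common way to do both. It is written in Python rather than Perl, the function names decode_mixed and content_md5 are made up for this example, and the "try UTF-8 first, then fall back to Latin1" rule is an assumption, not necessarily the rule the original programs apply.

```python
import hashlib


def decode_mixed(raw: bytes) -> str:
    """Decode bytes as UTF-8 and fall back to Latin-1 if that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


def content_md5(body: bytes) -> str:
    """Hex MD5 digest of a response body (always 32 characters)."""
    return hashlib.md5(body).hexdigest()


title = "Zeitgeschichte für Österreich"
title_utf8 = title.encode("utf-8")
title_latin1 = title.encode("latin-1")

# Both byte sequences decode to the same text.
print(decode_mixed(title_utf8) == decode_mixed(title_latin1))
# The digest length matches the 32-character identifier_content_md5sum column.
print(len(content_md5(b"<html>unchanged page</html>")))
```

Comparing the stored digest with a freshly computed one shows whether a page has changed between two checks without keeping the page content itself.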
8 Overview of the analysis programs
• populate_identifier_normalized.pl
Populates cagipch_items.identifier_normalized with the canonical version of the URL
stored in cagipch_items.dces_identifier so that cagipch_identifier_normalized can be
filled with the distinct URLs using the SQL insert statement at line number 100 of
cagipch.sql.
• populate_identifier_network.pl
Processes in- and outgoing links with a recursion depth of 3 levels for the items stored
in cagipch_identifier_normalized and stores the result in cagipch_identifierlinks. Stores
the Google PageRank of each crawled URL using WWW::Google::PageRank [132] in the
cagipch_identifier_normalized table. After the program has run, the view
cagipch_identifierlinks_idmatrix can be created using the SQL statement starting at line
number 128 of cagipch.sql.
• cagipch_pagerank.pl
Populates cagipch_identifier_normalized.cagipch_pagerank with the PageRank values
of a network defined by cagipch_identifierlinks_idmatrix using
Algorithm::PageRank [133]. Optionally limits the network nodes to a
subject-area-specific network when a comma-separated list of keywords is passed as
command line arguments.
• linkid_matrix.pl
Generates import files for either Pajek or UCINET network analysis programs out of
cagipch_identifierlinks_idmatrix. Accepts a comma-separated list of keywords as
command line arguments to limit the resulting matrix to a specific subject area.
• populate_keywords_normalized.pl
Populates cagipch_keywords_normalized.stem with the German stem of the respective
keyword using Lingua::Stem::Snowball [134] and cagipch_keywords_normalized.soundex
with the Soundex value of the keyword using Text::Soundex [135]. Both stem and Soundex
can be used to identify duplicates or spelling errors. populate_keywords_normalized.pl
should be run after cagipch_items has been completely filled and after
cagipch_keywords_normalized has been populated by the SQL insert statement at line
number 114 of cagipch.sql.
[132] http://search.cpan.org/~ykar/WWW-Google-PageRank-0.10/, January 20th, 2006.
[133] http://search.cpan.org/~xern/Algorithm-PageRank-0.08/, January 20th, 2006.
[134] http://search.cpan.org/~fabpot/Lingua-Stem-Snowball-0.93/, January 20th, 2006.
[135] http://search.cpan.org/~markm/Text-Soundex-3.02/, January 20th, 2006.
• populate_keywords_levenshtein.pl
Populates cagipch_keywords_levenshtein with the Levenshtein edit distances between
the keywords in cagipch_keywords_normalized using Text::LevenshteinXS [136]. The
Levenshtein edit distances can be used to identify duplicates or spelling errors.
• keywords_from_common_uris.pl
Dumps keyword information and statistics from a list of URLs in cagipch_items. Used
for analyzing the classification of the 24 shared URLs of the three subject gateways.
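
What a "canonical version" of a URL may look like, as produced by populate_identifier_normalized.pl, can be sketched with a few widely used normalization rules. The rules chosen here (lower-case scheme and host, drop the default port, use "/" for an empty path, drop the fragment and any user information) are assumptions for illustration; the rules actually applied are defined in the project's source code. The function name normalize_url is likewise made up.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return a canonical form of an http(s) URL (assumed rules, see text)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only if it is not the default one for the scheme.
    default_port = {"http": 80, "https": 443}.get(scheme)
    if parts.port is not None and parts.port != default_port:
        host = f"{host}:{parts.port}"
    # An empty path and "/" address the same resource.
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, host, path, parts.query, ""))


print(normalize_url("HTTP://WWW.Example.org"))
print(normalize_url("http://www.example.org:80/zfl/#top"))
```

Normalizing before comparison is what allows the same resource, listed with slightly different spellings by different gateways, to be counted as one distinct URL.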
[136] http://search.cpan.org/~jgoldberg/Text-LevenshteinXS-0.03/, January 20th, 2006.
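
The use of edit distances for spotting duplicate or misspelled keywords, as done by populate_keywords_levenshtein.pl, can be illustrated with a pure-Python version of the Levenshtein distance. This is a sketch, not the original implementation (which uses Text::LevenshteinXS); the helper near_duplicates and the threshold of one edit are assumptions for this example. The sample keywords are taken from the Top 50 list in section 2.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def near_duplicates(keywords, max_distance=1):
    """List keyword pairs that differ by at most max_distance edits."""
    pairs = []
    for index, first in enumerate(keywords):
        for second in keywords[index + 1:]:
            distance = levenshtein(first, second)
            if 0 < distance <= max_distance:
                pairs.append((first, second, distance))
    return pairs


sample = ["Archiv", "Archive", "Alltag", "Alltagsgeschichte"]
print(near_duplicates(sample))  # [('Archiv', 'Archive', 1)]
```

A small distance only flags candidates: "Archiv" and "Archive" are singular and plural of the same term, whereas other close pairs may be genuinely different keywords, so a manual check remains necessary.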