E : by Jamon Camisso · 2013. 11. 1. · ii Embedding metadata: exploring the ontology of hybrid...
-
Embedding metadata: exploring the ontology of hybrid digital and material objects
by
Jamon Camisso
A thesis submitted in conformity with the requirements
for the degree of Master of Information Studies
Graduate Department of Faculty of Information
University of Toronto
Copyright © 2011 by Jamon Camisso
Released under the WTFPL
(http://sam.zoy.org/wtfpl/COPYING)
-
Embedding metadata: exploring the ontology of hybrid digital and material objects
Jamon Camisso
Master of Information Studies
Faculty of Information
University of Toronto
2011
Abstract
This thesis discusses the design of three systems that were built using Critical
Making as an investigative method. The systems are: an RFID antenna that links
ISBNs to online metadata; metamash.org, which aggregates ISBN metadata; and
doitag.org, which allows users to associate tags with DOI numbers. Each system was
designed to interrogate issues related to identification, categorization and the
institutional foundations of, and individual practices surrounding, information
systems, providing levers to get at deeper ontological issues.
Each investigation points in its own way to a profound lack of understanding about
the ontology of digital, or hybrid material/digital objects. David Weinberger’s
ordering scheme for material and digital objects is used because it allows for a
discussion of ordering systems in general. However, focusing solely on categorization
systems masks more important questions about the ontology of such objects and how
building and using such objects fundamentally defines what they are.
-
Acknowledgements
Writing this thesis has been a thoroughly enjoyable exercise in academic meandering.
I would like to thank my supervisor Dr. Stephen Hockema for bearing with me and
challenging me throughout this process. While much of the content of our discussions
is reflected in these pages, our conversations about issues that are not raised here
have been equally inspiring and always lead me to new and exciting discoveries.
I would also like to thank my second reader Professor Matt Ratto. Without Critical
Making and his advocacy, this thesis would never have made it past the vetting stage
by the Committee on Standing. I would also like to extend my thanks and respect to
my external reviewer Jean-François Blanchette, whose questions managed to do more
to make me summarize and reorient my thinking about the topic in two weeks than I
could have managed in months on my own.
Finally, to my friends and family, who have constantly engaged themselves with this
project in equal portions of academic and extracurricular capacities, I would like to
express my sincerest thanks. Without their perspectives, encouragement, and
prodding, I would probably have taken yet another year to finish this thesis.
-
Contents
1 Introduction
1.1 Material Metadata
1.2 Critical Making
1.3 Thesis Outline
2 Background & Literature Review
2.1 Books, Texts, and Paratext
2.1.1 Work/Text Discussion
2.1.2 Paratext
2.2 Categorization and Ordering
2.2.1 First, Second, Third Order Organization Systems
2.2.2 Ordering Systems Discussion
2.3 Ontology and Intentionality
2.4 Boundary Objects
2.5 Discussion
2.6 Part 1 Summary
2.7 Background: Every Book is a Problem
2.7.1 Introduction
2.7.2 RFID
2.8 Keywords and Tagging
2.9 Part 2 Summary
2.10 Co-Citation and Latent Semantic Analysis
2.10.1 Introduction
2.10.2 Co-word, Co-citation, and Contextual Co-citation Analyses
2.10.3 Latent Semantic Indexes and Analysis
2.11 Tagging Revisited
2.11.1 Folksonomies
2.12 Chapter Summary
3 Method: Critical Making
3.1 Arduino and RFID system
3.2 metamash.org
3.3 doitag.org
3.4 Summary
4 Discussion: Just-in-time Dimensionality
4.1 Introduction
4.2 metamash.org
4.3 WorldCat
4.3.1 Outliers
4.3.2 Co-tag Analysis
4.4 U of T Catalogue
4.5 Chapter Summary
4.6 Conclusion
4.7 Summary
References
A Copyright Acknowledgements
B RWD-ICODE Email
C WorldCat Email
D metamash.org search terms
E LibraryThing HTML Scraper
F Sample WorldCat RSS
-
List of Figures
1.1 Library Receipt
2.1 Book History/Textual Studies/Sociology of the Text. Whereas Howsam’s diagram uses lines to link each subject, this version borrows most of her textual elements but uses circles to match the aesthetic used in Figures 2.7 and 2.2.
2.2 Weinberger’s Three Orders
2.3 Negative Tag Cloud
2.4 Term network for Cluster 3 (86 documents labelled as “Bibliometrics2”)
2.5 Overall framework of Janssens et al.’s analysis
2.6 LibraryThing’s tag cloud for “Bibliometrics”
2.7 Material and Digital systems. The items joining the horizontal circles (RFID, Tags, and DOIs) represent means of interrogating the boundaries of each set of ideas.
3.1 Photograph of the RFID System showing the Arduino, ID-12 Antenna, and handmade antenna coils
3.2 Arduino RFID System
3.3 Overview of metamash.org
3.4 LibraryThing tags as raw HTML
3.5 WorldCat RSS parser
3.6 Overview of doitag.org
4.1 LSI Visualization
4.2 Related Bibsonomy Tags and Articles for “latent” tag
4.3 UTL element
E.1 LibraryThing HTML scraper
F.1 WorldCat RSS for Knuth’s Art of Computer Programming
-
Chapter 1
Introduction
“Everywhere everything is ordered to stand by, to be immediately at hand, indeed to
stand there just so that it may be on call for a further ordering” (Heidegger, 1977,
p. 17).
In many ways, this thesis is a mashup. It discusses things like categorization
systems, institutional control over information, tagging, bibliometric methods, and
the design of material and digital web based mashups that were explicitly built for
this thesis. But this thesis is not just about categorization or mashups. Instead, all of
these topics taken together interrogate a much larger issue. Each topic, understood
within the larger context of the others, together with the insights into each, ultimately
hints at a profound lack of understanding about the ontology of digital, or hybrid
material/digital objects.
1.1 Material Metadata
To begin exploring this issue, it is necessary to ground the discussion in material
objects as metadata. Accordingly, this section will attempt to establish
the value of storing what appears to be useless material metadata by examining a
library receipt that was found in the trash at a library.
Consider the library receipt shown in Figure 1.1. This receipt was retrieved from a
garbage can in St. Michael’s Library, University of Toronto, for the simple reason that
it was there. However uninteresting it might appear at first glance, it is a record of a
particular library patron (or potentially patrons) having researched, located, and
checked out a number of items at a particular location and moment in time. Nothing
can be reliably deduced about the patron: their interests, personality, and browsing
habits are not captured by the receipt. Just as Levy (2001, p. 8)
notes when discussing a lunch receipt from Steve’s Deli & Catering, all that the
receipt in Figure 1.1 captures is metadata about a particular event. In this case, the
event was one in which a patron located and checked out three items, though even
the date that the event occurred is unknowable.
This thesis will explore some of the relationships that emerge when metadata about
sets of items like books and articles is aggregated and stored for future use. In
keeping with Heidegger’s idea that technology can be used to create further order in
the world out of underlying raw materials, this thesis will attempt to demonstrate the
ways in which storing such metadata—instead of throwing it away—is useful.
For example, the three particular items shown on the receipt in Figure 1.1 were
checked out concurrently with each other. This is a useful observation in that all
three items share a common author. What that meant to the patron is unknowable,
but the fact is, there is a commonality amongst the items. Looking carefully, the
items have relatively recent publication dates: 2006, 2009, and 2007 respectively.
Again, what this means is unknown, but Chapter 4 will demonstrate the value of such
apparently inconsequential information. Finally, still carefully examining the item
numbers, note that the last item of the three has a different Library of Congress
letter prefix, RC, which is reserved for items with the subject of Internal Medicine
(2010b), versus the first two items that use BQ, which is reserved for items about
Buddhism (2010a). Given the stack layout of the library in question (University of
Toronto, 2010), this means that the patron probably had to traverse multiple floors and
shelves to retrieve the items shown on the receipt.
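The observation about call number prefixes can be illustrated with a minimal sketch. The Python below assumes that a call number begins with a one-to-three letter Library of Congress class prefix. The two subject labels are the ones cited above; the function names and the call numbers are invented placeholders for illustration, not the items printed on the receipt.

```python
import re
from collections import defaultdict

# Only the two classes mentioned in the text; a real lookup table would be far larger.
LC_SUBJECTS = {"BQ": "Buddhism", "RC": "Internal Medicine"}


def lc_prefix(call_number):
    """Return the leading class letters of a call number, or None if there are none."""
    match = re.match(r"\s*([A-Za-z]{1,3})", call_number)
    return match.group(1).upper() if match else None


def group_by_subject(call_numbers):
    """Group call numbers by the subject associated with their class prefix."""
    groups = defaultdict(list)
    for number in call_numbers:
        subject = LC_SUBJECTS.get(lc_prefix(number), "Unknown")
        groups[subject].append(number)
    return dict(groups)


# Hypothetical call numbers, used only to exercise the functions.
receipt = ["BQ 1000 .A1 2006", "BQ 2000 .B2 2009", "RC 300 .C3 2007"]
print(group_by_subject(receipt))
```

Grouping by prefix is the kind of inexpensive inference the receipt permits: it says nothing about the patron, but it does show that the items sit in different parts of the classification and therefore, most likely, in different parts of the stacks.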
Figure 1.1: Library Receipt
If it had been found in the front cover of a book, or used as a bookmark, the receipt’s
embeddedness could have been used by a researcher to know what others have read
in relation to any referenced book on the receipt. That embedded metadata could
function in a manner similar to amazon.com’s recommendation system, which uses
“item-to-item collaborative filtering [that] matches each of the user’s purchased and
rated items to similar items, then combines those similar items into a
recommendation list” (2003). Moreover, other patrons could add their receipts (or any
information of their choosing for that matter) to a book and thus build up a set of
related texts for others to use.
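As a rough illustration of how embedded receipts could accumulate into a set of related texts, the sketch below counts how often pairs of items appear on the same receipt. This is plain co-occurrence counting under assumed data, not the item-to-item collaborative filtering algorithm quoted above; the function names, receipts, and item labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_checkout_counts(receipts):
    """Count how often each unordered pair of items appears on the same receipt."""
    counts = Counter()
    for items in receipts:
        # Deduplicate and sort so that each pair has one canonical form.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


def related_items(item, counts):
    """Rank the items that co-occur with `item`, most frequent first."""
    related = Counter()
    for (first, second), n in counts.items():
        if first == item:
            related[second] += n
        elif second == item:
            related[first] += n
    return related.most_common()


# Hypothetical receipts; the labels stand in for real item identifiers.
receipts = [
    ["book-A", "book-B", "book-C"],
    ["book-A", "book-B"],
    ["book-B", "book-D"],
]
counts = co_checkout_counts(receipts)
print(related_items("book-B", counts))
```

Each additional receipt only increments counters, so the set of related texts grows without anyone having to describe why the items belong together.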
This embedding could also lead to a whole host of issues within libraries. Questions
like who gets to embed receipts into books, can receipts be modified or reprinted, can
multiple copies of the same receipt be placed in multiple books, can they be removed
from a book, and the immense question about what such a practice would mean for
patron privacy are just a few of the questions that arise from even speculating about
what embedding such metadata into a text might entail.
Now consider Heidegger’s quote again: retrieving a piece of what was for all intents
and purposes garbage, and using it to discuss metadata about the objects that it
references reveals a good deal of explicit and latent information about the items, and
about a particular place and event. That information and the receipt upon which it is
printed was on the verge of being thrown out, but, once retrieved and discussed, has
now found a new use. It is evidence, used for this thesis to show that someone went
to the trouble of getting the items listed on the receipt, and to demonstrate that the
simple fact that they did so indicates a commonality, or commonalities between items
on the receipt, whatever they might be.
Indeed, for the sake of argument, the items on the receipt could be completely
unrelated in terms of their contents, authors, and subjects. The point is, the agency
that was exercised, specifically the work that was done in getting those items, entails
some degree of relation between the listed items beyond merely existing in the
University of Toronto catalogue. What any set of such items might have in common
is unknowable for certain, but demonstrating that there are meaningful, unexplored
material and abstract relationships between texts is a central pursuit that this thesis
will undertake.
To that end, the practical and philosophical implications of building systems to store
this type of metadata about texts will be explored in this thesis using Critical Making
as a method, a description of which follows.
1.2 Critical Making
Briefly, Critical Making is a method pioneered by Ratto (2009), whereby an object is
designed and built to serve as a material and technological lens through which to
understand a broader set of questions or issues about a subject.
Three different metadata linking and capturing systems were designed for this thesis
using Critical Making. First, a Radio Frequency Identification (RFID)1 system was built in
an experimental attempt to define, generate, and capture metadata. The goal of the
system was to place sets of RFID tagged books on a table that could correlate the
physical proximity of books on the table with subject matter, authors, and other
criteria.
Based on the technical limitations of the RFID system, the second system,
metamash.org, was designed to link material books to their online digital metadata
via RFID tags and ISBN numbers. Again the idea driving the design of metamash.org
was that storing aggregated metadata about sets of books instead of individual items
could be useful. Finally, doitag.org was designed as a system to collect and link
journal articles using Digital Object Identifiers and user generated tags.
1 Chapter 2 will describe RFID in more detail.
Each system was designed to initiate thinking about and to explore theoretical topics
that arise when objects are linked to each other via metadata. Each system attempts
to understand how the ontologies of objects like material books, journal articles, or
even wholly digital texts are alterable when metadata is embedded into the self-same
object. This thesis will attempt to demonstrate that embedding metadata into objects
has the potential to fundamentally change conceptions of what material and digital
object boundaries are.
Accordingly, the following questions inform and will be addressed by Critical Making
and the three systems that were designed during the course of this thesis:
1. What new practices might arise when material and digital objects are linked
together based on their embedded metadata?
2. What do those practices reveal about the ontology of material, digital, or
abstract objects when metadata about themselves is stored and embedded into
the self-same objects?
1.3 Thesis Outline
Chapter 2 reviews the various bodies of knowledge required to understand what is
meant by terms like text, categorization, ordering systems, and Boundary Objects.
The chapter also explains what Radio Frequency Identification (RFID) is and how it
can be used. Subsequently, the chapter describes two notable large scale
implementations of RFID to establish the overwhelmingly institutional orientation of
existing RFID standards and systems. In a similar vein, the chapter also gives
background information about tagging and folksonomy systems. In both cases, the
individual and institutional underpinnings of each system are briefly discussed to
demonstrate the tension between nodes as points of information versus points of
control. In all cases, the objects discussed are boundary objects that are used as
points of negotiation between parties.
The second half of Chapter 2 gives detailed information about bibliometric methods
that can be used to map out disciplines. First, co-word, co-citation, and contextual
co-citation are discussed as ways of mapping out relationships between textual nodes
that comprise a canon or discipline. Next, Latent Semantic Analysis (LSA) is
discussed as an alternate bibliometric method to the various co-word/citation methods.
LSA is used to group words that are found within a set of texts to indicate semantic
relationships between common words. The resulting set of words is then used as a
contextual container or list of categories to describe the relationships between texts
that were used for the analysis. Finally, Chapter 2 explains what socially generated
folksonomies are and how they can be used as alternatives to the previously
discussed bibliometric methods without requiring expert knowledge of a discipline. As
a part of that description, the notion of outliers and marginal tags is also discussed.
Chapter 3 explains how Critical Making as a method was used to construct the RFID
system that forms the basis of this thesis. The chapter explains alternate prototypes
that were considered but not constructed or seen through to completion and
demonstrates that, within the rubric of Critical Making, such incomplete
implementations are still useful tools to understand broader issues that the
prototyping process is designed to illustrate. Finally, the chapter goes on to show how
the decisions made in designing the RFID tagging prototype system, along with the
shortcomings of the system, led to the design and creation of metamash.org and
doitag.org. It also proposes that Critical Making does not necessarily need to rely on
material objects to still be a useful method.
Chapter 4 shows three surprising results that arose during the design of
metamash.org. First, a programming flaw in the metamash.org code demonstrates an
unexpected but useful perspective on how metamash.org could be used as a search
tool as well as a mashup generator. Second, one of the very first sets of texts that was
input into metamash.org is discussed. A common tag shared by many of the texts is
pubd502, which, when carefully researched, reveals a rich set of documents that are
related to the initial set of books that were input into metamash.org. Finally, the tag
#206, which is found on a mashup containing (B. Smith, 1996) is discussed. The tag
links to a unique social categorization and collection management system being used
by the University of Berne’s Software Composition Group.
Chapter 4 also concludes this thesis and discusses how the ideas of discreteness and
granularity challenge the idea that texts are not ontologically changed, by
demonstrating that the introduction of third order ordering systems fundamentally
changes how texts as objects are created and used. The conclusion discusses how
using a text is the most important aspect of determining what a text actually is,
notwithstanding definitions in various bodies of literature that are discussed and
problematized in Chapter 2.
-
Chapter 2
Background & Literature Review
This chapter is divided into seven main topics. The first three topics establish working
definitions and tools that will be used to analyze the subsequent topics and the objects
that are discussed in Chapter 3. The first of the three topics in the beginning of this
chapter revolves around ideas of what terms like “book” or “text” entail. In discussing
those ideas, the second topic of how and why to categorize such objects via
Weinberger’s (2007a) ordering framework is discussed. The third topic in this chapter
will describe how Bowker and Griesemer’s definition of “Boundary Objects” can be used
to understand some types of relationships between objects and people who use them.
The next sections establish background information about RFID, and keyword and
tagging practices. These two sections rely on the discussion of ordering systems and
boundary objects to demonstrate how RFID and tagging systems are useful examples
within the broader context of this thesis’ goal of understanding the issues
surrounding material and digital object ontologies.
The last two sections discuss two existing bibliometric methods, co-citation and Latent
Semantic Analysis, which are currently used to understand relationships between
documents as participants in a network of texts. Both suites of techniques rely on
lossy statistical analyses where the resulting data are based on one-way operations.
The final detailed discussion about tagging in the last section of this chapter is based
on the idea that tagging is a lossless practice and relationships between documents
(and objects in general) can evolve or be recorded over time via tags.
Thus to proceed further, a discussion about the complexity and multiple notions of
what words like “book” or “text” entail is required. This will demonstrate how books
are one useful tool for attempting to answer the questions posed above. Additionally,
this section begins moving away from what has been a solely materially based
discussion, to examining ideas about “paratextual” elements.
2.1 Books, Texts, and Paratext
A book is never simply a remarkable object. Like every other technology it is
invariably the product of human agency in complex and highly volatile contexts
which a responsible scholarship must seek to recover (McKenzie, 1999, p. 4)
The existence of Book History programs at institutions like the University of Toronto1,
the University of Edinburgh2, Texas Tech University3, and international organizations
like the Society for the History of Authorship, Reading and Publishing (SHARP)4,
points to an ongoing and fertile discussion about modern and historical textual
practices. Indeed, the University of Toronto’s Book History and Print Culture program
notes on its program website that “Histoire du livre, History of the Book, Textual
Studies, Print Culture, Sociology of the Text-all these names have been used to
describe a growing international academic movement.”
McKenzie (1999, p. 4) points out the obvious fact that it is not just books that are the
subject of discussion in these contexts. For McKenzie, films, sound recordings, images,
digital files, and oral texts are all interdependent artifacts that bear witness to
human experiences. Searle (1983, p. vii) even notes that “sentences—the sounds that
come out of one’s mouth or the marks that one makes on paper—are, considered in
one way, just objects in the world like any other objects,” a point that will reemerge in
the forthcoming discussion in the section on paratexts, and in the section on
categorization and ordering systems.
1 http://bookhistory.fis.utoronto.ca/about.html
2 www.hss.ed.ac.uk/chb/
3 www.english.ttu.edu/grad_degrees/BH_default.asp
4 http://www.sharpweb.org/index.php?option=com_content&view=article&id=20&Itemid=54&lang=en
However, to return to McKenzie, he chooses to focus his discussion on books, noting
that they have traditionally been the material primarily studied by practitioners in
the field. Overall, Figure 2.1, which is copied from a diagram and overview of Book
History by Howsam (2006), situates Bibliography as the area within the discipline of
Book History that focuses, in part, on the materiality of the book as an object.
However, as McKenzie notes in the quote that begins this section, books as objects of
study are problematic and do not easily fit into the single context of Bibliography as a
discipline. He even writes that “At best perhaps we can acknowledge the intricacies
of such a textual world and the almost insuperable problems of describing it
adequately” (1999, p. 4). Indeed, it is difficult to even enumerate the different material
contexts within which a book comes into existence.
For example, in the case of materiality, a book is a commitment to a particular set of
material constraints like page size, overall length, even ink colours, or typefaces.
Moreover, a book must exist in or transit through multiple social contexts and
processes like writing, editing, publication, and distribution. Thus understanding the
conditions within which a material codex was produced is but one set of problematic
contexts whose recovery, McKenzie claims, is the principal end of bibliography as a
scholarly practice (which for McKenzie includes sociology of the text, book history,
literature, and print culture).
Focusing on the materiality of a text also raises the more thorny issue of a text’s
immaterial existence. Though Sutherland specifically writes about electronic text, she
puts it best when she writes that textuality “requires that we consider the unfixity
[italics added] of the text, the promiscuity as opposed to integrity of its identity in an
age when the text has a diverse non-book existence” (1997, p. 5). Since the focus of
this thesis is partly on the electronic forms of and interactions with a text,
Sutherland’s point is especially relevant.
Figure 2.1: Book History/Textual Studies/Sociology of the Text. Whereas Howsam’s diagram uses lines to link each subject, this version borrows most of her textual elements but uses circles to match the aesthetic used in Figures 2.7 and 2.2.
Moreover, the discussion about considering a text both as a discrete material object
and as an abstract object is an old debate and is still ongoing. Sutherland initially
describes Roland Barthes’ description of a “work” as that which is held in the hand,
and his description of a “text”, which for Barthes is held in language (1997, p. 3).
However, according to Barwell (2005, p. 419), the “locus classicus of the distinction
between a “work” and an “expression” of it, which has been widely adopted by textual
critics, though equally strongly opposed by others” is provided by Tanselle (2001). To
confuse things, Tanselle’s definitions are the reverse of Barthes’: Tanselle’s “work” is
equivalent to Barthes’ text, and his “expression” is equivalent to Barthes’ work.
2.1.1 Work/Text Discussion
The varying definitions of a work or an expression or a text are distinctions that are
established as problematic within Book History. As Sutherland and McKenzie, and to
some extent even Tanselle argue, any act of reading or experience of a text is locally
and contextually situated such that the distinction barely holds together. This thesis
further problematizes the boundaries of material and digital texts. Thus for the time
being, with these problems looming in the background, this thesis will proceed
without endorsing any particular definition until the Conclusion chapter at which
point the discussion will be picked up again within the larger discussion of material
and digital object ontologies.
Indeed, as Tanselle (2001, p. 37) describes it, “The simple point is this: electronic texts
and hypermedia archives often allow one to do many desirable things more easily
than one could accomplish them using the codex form.” As such, avoiding the
discussion about works, expressions, and texts until the Conclusion chapter allows for
many desirable points of discussion to be raised through the course of this thesis
without specifically needing to focus on Tanselle’s ‘codex form.’
Indeed, the multiplicity of a text, be it from a readerly, writerly, editorial, historical,
sociological, economic, or political perspective (to name but a few), makes
the problem of accurately distinguishing between a work and a text difficult outside of a
specific role or discipline. Consider that even internally, a codex itself is problematic
in terms of what is and is not a part of the “text” as a whole as the next section about
paratextual elements will demonstrate.
2.1.2 Paratext
Indeed, regardless of where one stands on the previously discussed distinctions about
how to characterize texts, one idea about the nature of texts that bears exploring here
is what Genette (1997, p. 1) calls “paratext.” Genette argues that paratextual elements
are those that, while not belonging to a text per se, nevertheless serve to “present”
and “make present” an abstract text in the world. He claims that things like authors’
names, titles, tables of contents, headings, and chapters are all paratextual elements.
Genette’s argument is that any context is a paratext. He verges on, but does
not explicitly make, the point that any paratext itself can also be a text in the
Barthesian sense of the word. Genette argues that paratext consists of two main
varieties: “peritext,” like chapters and headings, which usually reside in or close to an
actual material text, and “epitext,” which “at least originally, are located outside the
book, generally with the help of the media (interviews, conversations) or under cover of
private communications (letters, diaries, and others)” (p. 5). Importantly, in making
the distinction between the two types of paratext, Genette acknowledges that on a
finer grained scale than that of a book as a container for a text, things like indices or
interviews can stand as texts on their own. He concedes that “Most often, then, the
paratext is itself a text: if it is not the text, it is already some text” (p. 7).
This distinction focuses attention on three important points that will inform the rest
of this thesis. The first is the idea that depending on the level of granularity chosen,
something can always be either paratext, or a text in and of itself. This idea that a
text can occupy multiple roles at once will become more apparent in the forthcoming
discussion of Weinberger’s (2007a) ordering systems. The second point, which Genette
makes explicit, is that paratext’s “existence alone, if made known to the public,
provides some commentary on the text and influences how the text is received” (p. 7).
Finally, Genette writes that there are “implicit contexts that surround a work and, to
a greater or lesser degree, modify its significance” (p. 7).
Taken together with Sutherland, McKenzie, and Tanselle’s characterizations of texts
as contextually specific objects or ideas, these three points highlight how problematic
texts are as objects of study. For example, extracting a discrete component of a larger
containing text entails that a new text is created or can stand on its own. That new
text itself can then be contextualized, either in terms of another larger text, or in
terms of other external texts. Though not writing about books per se, Gibson (?) puts
it best when he writes that “the unit you choose for describing the environment
depends on the level of the environment you choose to describe” (p. 9).
This idea of re-contextualizing discrete texts or textual elements warrants a
discussion about categorization and ordering systems, which follows. This section will
demonstrate the crucial importance of understanding categories and the manner
in which objects (including, but not limited to, texts) are organized. Fundamentally,
this section relies on the premise that an object can have some kind of easily
encapsulated identity. Whether or not this is truly the case is an unresolvable debate,
but this section will proceed under the assumption that just as pieces of texts like
paratextual elements can stand on their own as textual objects, so too can any
discrete component of any object or set of objects.
2.2 Categorization and Ordering
The frontiers of a book are never clear-cut: beyond the title, the first lines, and the last full stop, beyond its internal configuration and its autonomous form, it is caught up in a system of references to other books, other texts, other sentences: it is a node within a network (Foucault, 1972, p. 23).
According to Foucault then, and with the aforementioned discussion about texts,
contexts, and granularity in mind, a material book or article is a paradox. It is at
once both constrained and limited to its predefined physical location on a library
shelf, while at the same time existing in a network of relations to other similar or
referenced texts, ideas, and authors. This thesis attempts to interrogate both the
material and abstract nature of texts by employing metadata systems to explore the
relationships between any text’s material and digital forms. One aim here is to
demonstrate the existence of a rich set of latent relationships between the nodes—like
words, paragraphs, and chapters—that make up a text and another text or texts that
are not demonstrated explicitly by systems like library shelves or online catalogues.
Consider Foucault’s description of a book that begins this section—it is a dramatic
explanation of what a book is that goes beyond the material form that any particular
copy might inhabit. Foucault’s description situates the idea of “book” as something that,
while having material form and properties, can be understood as a tool to interrogate
the intellectual terrain and practices that surround the very creation of knowledge
itself.
This is a different approach to that of the Book History scholars cited in the previous
section in that the book or text is treated as a tool or means to investigating a deeper
set of questions than what a book is or is not, or what meanings it might provoke in
the reader. Foucault goes on to problematize the notion of an author’s oeuvre,
pointing out that the act of collecting and categorizing works into an oeuvre is both a
concealment and manifestation of some set of ideas that cannot encompass the
entirety of an author’s thoughts or the entire discourse about any particular idea.
In both cases, Foucault’s primary concern is with the idea that a text or set of works
has some kind of abstract characteristic that allows it to stand outside the systems that
produced it in the first place. By situating a text’s contents outside of a discussion
about material form or bibliographic textuality to explain what a book is and does,
Foucault can point to the interrelations that every text has with power and ordering
systems like language, authorship, and scholarly practice that collectively work to
comprise any particular text.
This idea that books exist in networks of relationships will prove useful in the latter
portion of this chapter that discusses Latent Semantic Analysis as a means of
discovering links between books (or articles) in a network of texts. Moreover, the idea
of a book existing simultaneously in a network of other texts, and as a codex on a
library shelf finds a 1-to-1 equivalence in the formulation of objects as both digital
and material at the same time.
In addition to Foucault’s argument that books function as nodes in networks of
discourse and power systems, Weinberger (2007a) also points out that “physical
limitations on how we have organized information have not only limited our vision,
they have also given the people who control the organization of information more
power than those who create the information.... because they get to decide what to
bring to the surface and what to ignore” (p. 89). Weinberger argues that
physical constraints are a limiting factor in organizing material information like
books, but his point also applies to the more abstract information systems that he
discusses.
2.2.1 First, Second, Third Order Organization Systems
Weinberger proposes a hierarchy of ordering systems comprised of first, second, and
third order constraints. Figure 2.2 is a visual representation of his different types of
orders. For example, Weinberger defines putting books on shelves or silverware into
drawers (to use his example) as first order systems. The only criteria are that they
are material-based and that they add order to the world. For Weinberger, being made
of atoms and arranging those atoms is all that is necessary for an organization system to be a
first order system.
The switch to a second order system occurs the moment that metadata is created
about physically arranged objects. According to Weinberger, said objects can be
organized according to an abstract classification scheme. An example would be
library shelves (first order) holding items that are organized by Dewey Decimal or
Library of Congress classification schemes. The scheme itself says little about the
material object in question; it only provides an abstract label or index characteristic
that can be used to group many similar objects (books in this case) on the same shelf
or set of shelves. Weinberger points out a crucially important aspect of second order
systems: they are necessarily lossy systems. A second order system gives less
information about the object in question than the object itself contains. Indeed,
Weinberger notes that first and second order archives cannot “know” everything that
they contain (2007a, p. 19).
Note, however, that as the section on co-citation and Latent Semantic Analysis will
demonstrate, this lossiness is not necessarily a problem. Indeed, those methods are
useful for revealing relationships between documents in a network (per Foucault’s
characterization) that might not otherwise be apparent. In this respect then, those
methods are ideal for examining material objects and their second-order relationships.
Finally, Weinberger’s third order systems include digital repositories like Flickr (a
photo sharing website), which rely on individual users to categorize and label their
submissions with metadata. The metadata that users generate does not need to have
any relationship to other items in a collection. The key is that with third order
systems, categorization can be done by an individual or individuals, and can be
deferred until the “last possible moment”5, or is done “on the fly” based on
combinations of various pieces of metadata, again supplied by an individual or
individuals. Weinberger calls this third order ordering “miscellaneous order” and
notes that “Traditional authorities cannot maintain themselves by insisting that we
have to go to them.... It is changing how we think the world itself is organized
and—perhaps more important—who we think has the authority to tell us so” (2007a,
p. 23).
Figure 2.2: Weinberger’s Three Orders
The idea that these “orders” can overlay each other is a fundamental assumption that
underlies this entire thesis. As Kirschenbaum (2005) points out, digital objects are
material objects (magnetic bits or electrons) that are just operated on abstractly
because they can be treated as formally identical despite being physically different,
or even existing in different locations. Kirschenbaum writes that:
this conundrum becomes the methodological lever with which to pry open the relentless symbolic cascade of computation and understand what is unique about computers as writing technologies: that they are material machines dedicated to propagating a behavioral illusion, or call it a working model, of immateriality (2005, p. 5).
Thus while material objects like books are constrained by things like editorial
processes, distribution requirements, or shelving space and categorization systems,
digital objects also exist in the material world, and are also subject to material
organizational requirements. A hard drive whose bits are organized into tracks and sectors,
or even treated as individual bits, is a first-order system. A file-system on said hard drive with a
metadata volume describing where to find specific groups of bits (files), and how files
are spread out across the entire hard drive is a second order system that necessarily
does not capture every piece of information about the underlying first order.
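The distinction between the first two orders in this example can be illustrated with a small sketch. The following Python fragment is a toy model written for this discussion and not a description of any real file system: the block contents, the file name, and the function names are all hypothetical. It treats a list of blocks as the first-order arrangement and a file table as the second-order metadata, and shows that the table says nothing about the blocks it does not index.

```python
# Toy model only: block contents and names are hypothetical illustrations.

# First order: the physical arrangement itself, a fixed sequence of blocks.
disk = [b"Arch", b"old!", b"ive ", b"Feve", b"r\x00\x00\x00", b"junk"]

# Second order: metadata about the arrangement. Each file name maps to the
# ordered block indices that hold its contents, plus its length in bytes.
file_table = {
    "title.txt": {"blocks": [0, 2, 3, 4], "size": 13},
}


def read_file(name):
    """Reassemble a file by following the second-order index into the blocks."""
    entry = file_table[name]
    data = b"".join(disk[i] for i in entry["blocks"])
    return data[: entry["size"]]


def unindexed_blocks():
    """Blocks that exist on the disk but that the file table does not describe."""
    indexed = {i for entry in file_table.values() for i in entry["blocks"]}
    return [i for i in range(len(disk)) if i not in indexed]
```

Under these assumptions, `read_file("title.txt")` reassembles the indexed blocks into a document, while `unindexed_blocks()` lists material that is present at the first order but invisible at the second, which is one way of picturing the lossiness that Weinberger describes.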
A set of user-defined directories and files that an individual computer user creates
and stores might appear to be a third-order system built on top of the first two
systems, but the idea of a file appearing to exist inside a single directory inside
another single directory (symbolic links notwithstanding) is strongly rooted in a first
and second-order conceptual model.
5 As Steve Hockema notes, this builds in an assumption that there is an “ultimate moment” when something is found, and a penultimate moment (the last possible one) in which categorization can happen. On a more basic level, the assumption that things are categorized in order to be found later is problematic in that it ignores other possible reasons for categorizing things.
Bolter and Grusin’s (2000) notion of “remediation” clarifies how the user defined files
and directories portion of this file-system example straddles the boundaries between
second and third-orders. They argue that “the logic of immediacy dictates that the
medium itself should disappear and leave us in the presence of the thing represented”
(p. 6). Where that thing to be represented is a document on a hard drive, representing
it as a file draws upon the old technologies of paper files, file folders, and filing
cabinets, and propagates the illusion that there is a seamless transition from material files
(paper), to material files (bits), to representation on a screen.
Weinberger’s schema of ordering systems is not without its flaws. For example,
Weinberger does not give enough consideration to how third order systems are
afforded by or built upon second and first order systems via remediation per the
discussion of Kirschenbaum (2005) and Bolter and Grusin (2000). Moreover,
Weinberger arbitrarily limits third order systems to the realm of digital objects. The
RFID-enabled surface that was to be designed for this thesis (which will be discussed
in Chapter 3) would be an example of a first order system that does many of the
things that a third order system does.
Indeed, at its core, Weinberger’s real point is that digitality simply allows leaving
objects loosely categorized until they are referred to or called upon for use. To frame
Weinberger’s system in terms of the already noted quote from Tanselle about
electronic objects (2001, p. 39), all that Weinberger really does is note that digital
items afford being quickly categorized and, as such, categorizing can be done before,
during, or after a digital object is created. But there is no reason that the same does
not apply to material objects, as the RFID surface example will demonstrate; it is just
that digital forms make it easier.
The point is, if Weinberger’s schema is to be adopted for this thesis, third-order
systems are not entirely unproblematic. For example, according to Weinberger, a user
does not need to understand or care about the first two underlying systems to be able
to use the third order system. However, as Kirschenbaum points out, that model is
illusory since it is predicated on two very strict underlying organizational models. As
such, combining Kirschenbaum’s more ontologically grounded discussion of what a
digital object physically is with Weinberger’s point that physical limitations restrict
what can be done with information means that Weinberger’s critique holds for
digital objects as well. As this thesis will demonstrate, the source of third order
information is just as problematic as the structure and arrangement of first and
second order information.
Notwithstanding the drawbacks to Weinberger’s system, it is nevertheless very useful
because it allows discussing hybrid material and digital objects in a particular
manner. Indeed, stretching Norman’s use of the word ‘affordance’, and despite
Weinberger’s system not being an object, the idea that Weinberger’s ordering system
affords discussing hybrid material/digital objects is important, a point which will be
discussed in the upcoming section on Ontology and Intentionality.
Finally, Foucault’s description of books as nodes in larger networks allows for a closer
reading of Weinberger’s criticism. Weinberger’s description of how systems limit
access to information and grant control over information to those who curate rather
than create it raises the question: how can those points of control be better
understood or worked around? In other words, how can Foucault’s notion of a text as
a system of nodes and networks of knowledge and power relations be realized in
material or digital form in such a way as to either circumvent, or at least draw
attention to a) the linkages between nodes that Foucault describes, and b) the
Foucauldian centres of control that exist in any of the networks or levels of order per
Weinberger’s characterization of ordering systems?
5 Norman’s use itself is an appropriation of James J. Gibson’s, who originally coined the term; see Chapter 3 of Gibson (1986).
2.2.2 Ordering Systems Discussion
The goal through all this is still ultimately that of describing how ordering and
categorizing material and digital objects points to the larger issue of what it means
for such objects to exist and be used. As Levy so eloquently writes, “It’s a curious
thing about documents: you can’t see them if you don’t look at them; but you also
can’t see them if you look only at them, ignoring the surroundings in which they
operate” (2001, p. 29). The broader perspective that Levy endorses of looking beyond
only books or specific types of documents is important and is one of the reasons that
Weinberger’s ordering framework is so useful.
Indeed, one of the key reasons to choose Weinberger’s ordering scheme is its
simplicity and ability to describe the contextual surroundings of documents and
objects in general. While this thesis specifically discusses texts in various forms, the
idea that any object or system can exist in first, second or third-order systems is a
broader point that this thesis attempts to explore. For example, while the RFID system
that will be described in Chapter 3 is built around specifically linking books to digital
metadata, the existence of sites like touchtag.com that are built around linking RFID
tags embedded in objects to online data makes the usefulness of Weinberger’s
scheme more apparent. The touchtag.com website gives numerous examples of
linking many types of objects to online metadata. Two examples are: “Link[ing]
souvenirs to the online photo albums” (2010b, np.) or linking “collectables directly to
online information” (2010a, np.).
These are broader examples of just two possible applications that arise when material
and digital objects or collections can be linked together. Building a similar system to
bridge the gaps between material and digital orders for books in Chapter 3 allows for
a meaningful discussion of these linking and embedding practices. The usefulness of
Weinberger’s divisions is that they help differentiate between types of systems. Thus
instead of building applications using RFID to do things like link photo albums to
material objects, Weinberger’s divisions lead to meaningful points of departure into
a broader discussion about bridging material and digital systems instead of glossing
over the divide, or focusing exclusively on material, abstract, or digital systems.
As a part of that broader discussion, it is appropriate to discuss ontology and
intentionality here, since these ideas have until now only been mentioned in passing.
Given the discussion in the previous section about books and texts, and in this section
about the nature of digital objects, it seems only appropriate to begin the following
section on ontology with a relevant quote from Tanselle.
2.3 Ontology and Intentionality
Printed and electronic renderings are thus not ontologically different; they may be made of different physical materials, but the conceptual status of the texts in each case is identical. The philosophical conundrum as to where texts reside is exactly the same as it always was (2006, np.).
As a key point of entry into the discussion about the nature of hybrid material and
digital objects that have embedded within themselves their own metadata, Tanselle’s
point is a good start. He notes that the particular renderings of any given text are
just that, renderings, which can take many forms but all fundamentally point to the
same text. However, the final sentence in Tanselle’s formulation avoids delving into a
much larger body of work on exactly the problem of what objects (not just texts) are
and where they reside.
For example, Smith (2002) attempts to succinctly describe objects in the following
manner, writing that “To be an object is to be a patch of the world that is successfully
abstracted.... The fundamental character of (what it is to be) an object is thus
intrinsically hooked into the intentional life practices of the objectifying subject” (p.
241).
Immediately then Smith’s characterization of objects requires three things. First,
there is a requirement that something be a “successfully abstracted” part of the
world. There is no specific endorsement of either material or digital media on Smith’s
part. Indeed, as Kirschenbaum’s (2005) analysis demonstrates, such a distinction does
not hold because at some level, all digital objects are material. However, the key to
that portion of Smith’s formulation is abstraction, in that it allows for
Kirschenbaum’s formalism, and Tanselle’s abstract idea that a “text” is the same no
matter what medium is used to represent it.
The other key piece of Smith’s formulation is the idea that intentional practices
stabilize objects. This idea is similar to what was discussed earlier with the library
receipt in Figure 1.1. In that example, part of the reason that conceptualizing the
books as objects is useful is because of the inherent intentionality (in the
philosophical sense of aboutness or directedness (Jacob, 2010)) that is evidenced by the
receipt, from the overlapping perspectives of the patron and the library as an
institution. Both take the receipt to refer to a transaction involving discrete books as
objects, and this unit is reinforced by the information systems that catalog and
enumerate the books involved in the transaction. But what of the status of the receipt
itself? It is hard to dispute that this is an object as well (given, for example, the use
of it in the library system as well as in this thesis). But it seems a special sort of object
that also embodies and represents “metadata” about other objects, and the significance
and chunking of this metadata differs based on the practicing subject.
Interestingly enough, discussing the ‘objectness’ of the receipt itself as a piece of, or
pieces of metadata that exist based on differing intentionalities raises a number of
questions about the object itself. Embedding such a receipt in a book, as was
previously discussed, raises even more questions when viewed as a means of
expressing intentionality related to an object. Note that the word used here is
intentionality, and not agency, per Smith’s quote above. It is useful to refer to
Searle’s (1983) usage of the word Intentionality, since he discusses it in a clear and
useful manner.
Searle first notes that Intentionality is not an ordinary relationship “like sitting on
top of something” (p. 4). Instead, for Searle “Intentional states represent objects and
states of affairs in the same sense of ‘represent’ that speech acts represent objects
and states of affairs” (p. 5). A discussion of his entire book and exploration of the topic
is beyond the scope of this thesis, but the key point to take away from Searle is that
Intentionality is a term that is used to describe the beliefs of a subject about an
object. Thus to have Intentionality about an object for Searle is to be committed to a
belief about something that can be represented by using an object in a particular way.
Consider an example of how books and articles are discussed within the scholarly
publishing community. Miller and Harris (2009) describe how individual scholars,
editors, publishers, and subscribers have conflicting agendas where any work is
involved. They describe the intent of publications for scientific researchers as key to
gaining credentials for academic survival (p. 13); they describe editors as intending to
“maintain and improve the quality of the journals they serve” (p. 14); they describe
publishers as intending to make money, with all other intents concomitant to that
main consideration; finally, they describe the intent of universities as “providing the
reference materials necessary to support the missions of the university” (p. 17).
This example demonstrates how something like a single article allows for multiple
dramatically different goals to be expressed or represented by such an object, all
within a very specific context of academic publishing. Each actor in the relationship
has their own set of uses for scientific scholarly articles, which do not necessarily
complement the others’. But the point is, to each participant, a given article might be
conceptualized in completely different ways based on how it can be put to
use—gaining credentials, gaining readership, monetary gain, or institutional mission
fulfillment respectively.
Going a step further, combining Searle’s (1983) ideas about intentionality with
Norman’s (1988) ideas about affordances gives a clear picture of how intentionality
can be expressed over an object. An object like an article can be designed such that it
affords scholars to easily read it, editors to easily publish it, publishers to easily sell it,
and universities to easily purchase it. Each participant can work with the affordances
of an article’s form, content, and portability to fulfill their own particular goals.
At the same time, intentionality informs the design and functionality of objects. The
two work in tandem with each other to inform how objects can be used. In so doing,
what an object affords in terms of different intended uses can inform what that object
is when it is used. Indeed, as was noted earlier, the discussion about the nature of
books as a work or text demonstrates how the multiple notions of a book as an object
still allow for different groups of scholars to (mostly) successfully interact with
books despite their differing scholarly intentions.
This description is admittedly a simplistic formulation of a set of very difficult and
contested problems that philosophers constantly struggle to define and resolve.
However, it is crucial to bear these ideas in mind through the course of this thesis,
since the attempt here is to demonstrate how embedding metadata into objects, be
they material or digital, not only affects the successful abstraction of some part of
the world, but also entails a difference in terms of what an object is. This argument is
based on both a hypothetical subject’s intent in using such an object, with the new
affordances provided by the embedded metadata, and the ways the metadata affects
the dynamic tension associated with how the object is registered and stabilized within
the system of stakeholders who all treat an object as an abstracted “patch of the
world” in their own ways.
One other important tool to understand the ontological issues surrounding hybrid
material/digital metadata objects, then, is to understand how linkages between nodes
function to either obscure or reveal boundaries between multiple ordering systems.
Thus another level of abstraction from material and digital objects to wholly abstract
objects is necessary. Accordingly, Star and Griesemer’s (1989) notion of “boundary
objects” is a useful conceptual tool that will be described in the following section.
2.4 Boundary Objects
As a tool for understanding the interplay between books, nodes, networks, and control
points over information, the idea that things can be boundary objects is a crucial
piece that ties all these ideas about ontology, power, categorization, and ordering
systems together. Star and Griesemer explain boundary objects as inhabiting multiple
social worlds while at the same time satisfying the informational requirements
of each of them (1989, p. 393). According to Star and Griesemer, boundary objects
can be either concrete or abstract objects that allow reuse in different contexts and
locations, but maintain an identity despite their varied uses. The authors describe
four types of boundary object, which follow:
1. “Repositories. These are ‘ordered’ piles of objects which are indexed in a standardized fashion. Repositories are built to deal with problems of heterogeneity caused by differences in unit of analysis. An example of a repository is a library or museum. It has the advantage of modularity. People from different worlds can use or borrow from the ‘pile’ for their own purposes without having directly to negotiate differences in purpose” (1989, p. 410).
2. Ideal type. An object such as a diagram, atlas, or classifier that is abstracted from
its domain, which does not describe any one item but is instead vague enough to
be adaptable to multiple sites of use. The example that Star and Griesemer use
is the term “species”.
3. Coincident boundaries. Star and Griesemer describe these boundary objects as
having the same external boundaries, but different internal contents. The
example they give is of separate maps of the state of California, with one map
displaying typical road-map like features, while another map of the same state
shows highly abstract ecological zones. In this case, the state itself is the
coincident boundary that is used for different purposes.
4. Standardized forms. These objects are described as “methods of common
communication across dispersed work groups.... The results of this type of
boundary object are standardized indexes” (1989, p. 411). Star and Griesemer go
on to describe standardized forms as useful for transmitting objects over
distances without losing or changing information, such that any local
uncertainties are ‘deleted’ to use their term.
For the purposes of this thesis, repositories, ideal types, and standardized forms are
the boundary object types that will be used. As Star and Griesemer point out, libraries
are a specific example of a repository. Just as repositories are designed to deal with
differing units of analysis, here the unit of analysis will be shifted. Instead of
focusing on libraries as repositories, this thesis treats books as repositories,
specifically of ideas that can be used for multiple purposes, as Star and Griesemer
point out.
Additionally, in terms of ideal types, “books” taken abstractly are an ideal example of
the boundary object type. Foucault’s characterization is explicit in describing books as
nodes; essentially, he treats books abstractly as ideal objects in order to adapt the
idea of a book to his overall discussion of knowledge. This lack of distinction between
objects and abstract ideas as boundary objects is useful. It meshes together objects
that are treated immaterially (Kirschenbaum’s illusion of immateriality and
Weinberger’s third-order systems) with material objects like actual “books” and helps
defer the previously noted work, text, representation discussion until the conclusion
of this thesis.
2.5 Discussion
As an initial attempt at building boundary objects to explore and discuss the research
questions stated at the beginning of this chapter, the Critical Making Lab at the
Faculty of Information, University of Toronto, was used to build a prototype RFID
system. The system uses RFID tags to link books to their online metadata via a user’s
web browser. As the preceding section mentioned, the method used to build the
prototype system was Critical Making, wherein the object that is designed (the RFID
system) is not the actual focus of the thesis.
Instead, the RFID antenna and the decisions that were made in designing the system
serve as material and technological lenses through which to understand the broader
issues of object identity and ontology, categorization systems, and the individual and
institutional practices that underlie each. Two other systems were designed based on
building the RFID system, metamash.org and doitag.org, which will be discussed later
in this thesis.
Note that the lack of precision or accuracy in cataloging a text is not in itself a
problem that this thesis attempts to discuss or resolve. Indeed, given sufficient
knowledge of how a library is laid out according to Library of Congress or Dewey
Decimal systems, or any other for that matter—see Chapter 6 for an interesting
example of an alternative classification system—people can and do find the items that
they seek in libraries.
The problem put forward here is that it is difficult on individual, social, and even
institutional levels to exercise agency over institutions like the local branch or
the Library of Congress, which categorize items according to their internal needs. For
example, an outside practitioner would have an extremely difficult time arguing for
a change to the Library of Congress subject headings for a book like Archive Fever
(Derrida, 1996) so that they include “Archives” (cf. 1. Memory (Philosophy)
2. Psychoanalysis 3. Freud, Sigmund, 1856-1939).
Even here, the problem is still not quite so clear. It is not that there is a tension
between objective and subjective classifications, readings, or descriptions of a text;
the problem lies elsewhere and encompasses Weinberger’s problematic power dynamic
and Foucault’s distributed network. The issue is one of scale and abstraction of
objects and the methods used to organize them. As the discussion about texts and
paratexts revealed, so too do objects simultaneously inhabit multiple local material and
abstract distributed contexts, on multiple scales of granularity. It is fairly clear that
this is the case for books, but it also highlights the looming question of whether or how
this applies to hybrid material and digital objects.
For example, metamash.org uses tags from LibraryThing that are all submitted to
that site6 by individuals based on their individual book collections. At the same time,
metamash.org also uses bibliographic data from WorldCat7, which is a site built
around multiple libraries sharing their catalogue data about books. A book entered
into metamash.org becomes a boundary object between these two systems and in so
doing can highlight the tensions between individual and institutional uses and
sources of data. More importantly, as the RFID system will demonstrate, a book with
an embedded RFID tag can also straddle first and third order ordering systems and in
so doing, interrogate what it means for such an object to exist in the first place.
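The way a single book record can sit between these two kinds of sources can be sketched in a few lines of Python. This is only an illustrative sketch and not the actual metamash.org implementation: the names BookRecord and merge_sources, the placeholder ISBN, and the user tags are all assumptions made for the example, while the subject headings are the ones quoted earlier for Archive Fever. No real LibraryThing or WorldCat API is called.

```python
from dataclasses import dataclass, field


@dataclass
class BookRecord:
    """One book described by two kinds of sources, keyed by ISBN."""
    isbn: str
    institutional: dict = field(default_factory=dict)    # shared catalogue data
    individual_tags: dict = field(default_factory=dict)  # tag -> number of users


def merge_sources(isbn, catalogue_record, user_tag_lists):
    """Combine a catalogue record with tags from individual collections.

    The two sources are stored in separate fields so that the provenance
    of each description of the book stays visible.
    """
    record = BookRecord(isbn=isbn, institutional=dict(catalogue_record))
    for tags in user_tag_lists:
        # Normalize case and count each tag once per user.
        for tag in {t.strip().lower() for t in tags}:
            record.individual_tags[tag] = record.individual_tags.get(tag, 0) + 1
    return record


# Hypothetical data: the ISBN is a placeholder and the user tags are invented.
catalogue = {
    "title": "Archive Fever",
    "subjects": ["Memory (Philosophy)", "Psychoanalysis",
                 "Freud, Sigmund, 1856-1939"],
}
user_tags = [["archives", "Derrida"], ["Archives", "philosophy"]]

book = merge_sources("ISBN-PLACEHOLDER", catalogue, user_tags)
print(book.institutional["subjects"])
print(book.individual_tags)
```

Keeping the institutional and individual descriptions side by side, rather than flattening them into one list, is what lets the record act as a point of negotiation: in this invented data, “archives” appears among the individual tags but not among the institutional subject headings.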
Additionally, books are ideal boundary objects to begin attempting to resolve across
scales and practices, since as Star and Griesemer point out, repository boundary
objects are those that explicitly straddle different units of analysis or abstraction
(1989, p. 410). Whereas Star and Griesemer’s example focuses on libraries, this thesis
focuses on books, which are finer-grained units of analysis that are functionally similar
to libraries in that they are points of negotiation between different social worlds.
Whereas the Library of Congress can bring its full authority and credibility to bear
on the abstract and necessarily limited classification of Archive Fever, it does not
account for the multiplicity of specific, local, and contextual readings that individual
readers, or even groups of readers, or other institutions might derive from the book.
Indeed for this thesis, the primary subject heading for Archive Fever could easily be
Archives instead of Memory. Thus the material book is an information resource that
6 http://librarything.org
7 http://worldcat.org
is cited in this individual thesis for a particular purpose; at the same time, the book
exists in an institutional catalogue that has different objectives than this thesis. Thus
the volume itself is a point of negotiation between individual and institutional needs,
never mind the possible different interpretations of the ideas it contains.
Examining the discrete nature of books and articles by building the RFID system,
metamash.org, and doitag.org, all of which are described in detail in Chapter 3, will
better characterize the tension between institutional metadata systems and individual
agency over how those systems are used. For each system, there are multiple sources
of information (Foucault’s nodes and points of control), each system sits on the
boundary between ordering systems (Weinberger), and each system both is and outputs
boundary objects that fit into either repository or ideal type categories, which
highlight both the Foucauldian and Weinbergerian aspects of each system.
2.6 Part 1 Summary
This chapter serves as an introduction to a number of key ideas that inform the rest
of this thesis. First is that storing metadata about a text or set of texts can be used
to discover relationships between texts based on the agency of an unknown but
intelligent person(s). Second is that any text can be broken down into smaller
components than the containing form.
Next is Foucault’s notion of the book as a node situated in an abstract network of
other books and nodes like institutions, authors, and readers. Crucially important is
Foucault’s idea that the scale or unit of analysis for any network can extend to
individual words and sentences, or beyond the material form of a book to the
discourses in which it exists. This idea that objects can be understood as components
in interrelated scaleless networks is fundamental to the rest of this thesis.
Another important idea, which is complementary to Foucault’s, is Weinberger’s
formulation of first, second, and third order organization systems. The characteristics
of each underlie the RFID system, metamash.org, and doitag.org. More importantly,
the idea that each type of system can be overlaid on top of the preceding systems
informs the design of all three metadata systems that are discussed in this thesis.
Foucault’s idea about scale of analysis also serves as a useful conceptual tool with
which to interrogate the power relations embedded in the joints between ordering
systems.
Finally, Star and Griesemer’s characterization of boundary objects is another useful
conceptual tool. Each of the objects used or created by the RFID, metamash.org, and
doitag.org systems is a boundary object and can be used to interrogate joints between
ordering systems, points of control over information, sources of said information, and
communities of interest and practice that surround each object or set of objects.
Deploying each of these ideas in conjunction with the others via Critical Making
raises more questions than it answers about identification and categorization.
However, the process of designing and building the systems that will be discussed in
this thesis draws attention to a boundary region between material and digital objects
and related practices that warrants in-depth examination.
2.7 Background: Every Book is a Problem
Standards, categories, technologies, and phenomenology are increasingly converging in large-scale information infrastructure.... this convergence poses both political and ethical questions (Bowker & Star, 1999, p. 47).
2.7.1 Introduction
The convergence of various types of systems into large scale infrastructures that
Bowker and Star describe in the preceding quote can be understood in one way as
effacing some of the gaps between ordering systems. For example, despite the
criticisms of second order metadata systems discussed in the Introduction of this
thesis, it is appropriate to point out that the Library of Congress metadata for Archive
Fever was organized enough that it was suitably catalogued and located exactly
where it was supposed to be within the University of Toronto’s library system. This is
no small feat indeed considering that the University of Toronto library system
contained 18,985,932 items in April 2009 (2009).
The ability of any user with enough knowledge of the organizational schema (Library
of Congress in this case) to place, store, cross-reference, and locate a single volume
within such an immense system bears testament to the fact that however large,
difficult, inefficient, or problematic library metadata systems may or may not be, the
University of Toronto system (at least) worked as intended for the particular text in
question. In other words, the large scale first and second order UTL systems and the
second order Library of Congress system work well together to achieve one particular
user-oriented goal: finding a book on a library shelf.
However, in searching for Archive Fever, the seed idea for this thesis arose. What
might happen if users could embed their own metadata into material books on
shelves? One need look no further than the folded pages of a library book or
handwritten marginalia to see that the very medium of a material book all but
encourages users to leave their thoughts and notes embedded in a book itself. Thus,
the initial attempt of this thesis was to build a prototype RFID system that could
enable embedding digital user generated metadata into material library books,
essentially overlaying a third order system on top of first and second order systems,
without altering the underlying ordering infrastructures. The goal was to augment
existing library catalogue systems in order to question the standards, categories, and
technologies governing such systems, in a manner similar to that of Bowker and
Star whose quote opens this chapter.
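The idea of overlaying user metadata without altering the underlying ordering infrastructure can be illustrated with a small Python sketch. It is a hypothetical illustration rather than the prototype described in this thesis: the class name TagOverlay, the placeholder tag UID, and the reader name are assumptions, and the catalogue values are the subject headings quoted earlier for Archive Fever.

```python
from types import MappingProxyType

# Second-order layer: a read-only view of a catalogue record.
CATALOGUE = MappingProxyType({
    "title": "Archive Fever",
    "subjects": ("Memory (Philosophy)", "Psychoanalysis",
                 "Freud, Sigmund, 1856-1939"),
})


class TagOverlay:
    """Third-order layer: user notes keyed by the UID of a book's RFID tag."""

    def __init__(self):
        self._notes = {}  # tag UID -> list of (user, note) pairs

    def annotate(self, tag_uid, user, note):
        """Attach a user's note to the physical book identified by tag_uid."""
        self._notes.setdefault(tag_uid, []).append((user, note))

    def view(self, tag_uid, catalogue):
        """Return catalogue data and user notes side by side, unmerged."""
        return {
            "catalogue": dict(catalogue),
            "user_notes": list(self._notes.get(tag_uid, [])),
        }


# Hypothetical usage: the UID and the reader name are placeholders.
overlay = TagOverlay()
overlay.annotate("UID-PLACEHOLDER", "reader-1", "Archives")
combined = overlay.view("UID-PLACEHOLDER", CATALOGUE)
print(combined["user_notes"])
print(CATALOGUE["subjects"])  # unchanged by the annotation
```

Because the catalogue is exposed only through a read-only mapping, a reader can add “Archives” to the overlay while the institutional subject headings remain exactly as they were, which is the kind of layering described above.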
By using RFID to create boundary objects, the points of negotiation between parties
and between Weinberger’s first and second order ordering systems, and second and
third order ordering systems become slightly more apparent. While augmenting
many institutional systems (libraries, supply chains, international travel and the
like), using RFID in this thesis to interrogate the transition points between levels of
ordering reveals that the needs of institutions take precedence over those of
individuals. As such, to understand why RFID, with its largely transactional nature,
is important to this thesis, some background information is required.
2.7.2 RFID
One particularly important augmentation of books started with the invention of
barcodes in 1949 (Wikipedia, 2010a). Historically however, Mai (2003) points out that
13th-century monks built a shared catalogue of 183 English monastery libraries to
keep track of items, so augmenting books with second-order metadata is not a new
practice by any means. The Online Computer Library Center (OCLC) notes that Dewey
Decimal has been around since the 1870s (2010, np.). However, the barcode
specifically marks a turning point for books because it represents McLuhan’s
“stepping-up of speed from the mechanical [e.g. punch cards] to the instant electric
form” (1964, p. 47). In this case, the electric form resides in pulse-width modulated
signals as read from barcodes applied to books. Carrying that notion of an instant
electric form forward from barcodes to the present, myriad systems exist that operate
on the same basic principle of using digital signals to quickly identify objects. RFID
tags comprise one such system, which informs the rest of this thesis.
The foundational theoretical and applied research paper (Stockman, 1948) upon which
modern RFID systems are built was published in 1948, a year before research into
barcode systems had truly begun. Stockman’s research involved using various types
of transceivers to modulate power emitted from a receiver. The receiver would then
read the retransmitted signal from the remote transceiver. Crucially, the difference
in Stockman’s system versus traditional radar systems of his day was that signals
could be modulated over time instead of signaling binary on or off conditions.
Stockman pointed out that possible civilian applications of his reflected power system
included, amongst other things, “automatic pin-pointing... and simplified means for
identification and navigation” (1948, pp. 1196, 1204).
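The principle of modulating reflected power over time, rather than signaling a simple on or off condition, can be illustrated with a toy simulation. The following Python sketch is a deliberately simplified amplitude model and not a reconstruction of Stockman’s apparatus or of any RFID standard: the constants, the function names reflected_signal and decode, and the two reflection strengths are all assumptions chosen for the example.

```python
import math

SAMPLES_PER_BIT = 40        # assumed number of samples per bit period
CARRIER_CYCLES_PER_BIT = 5  # assumed carrier cycles within one bit period
REFLECT_HIGH = 1.0          # assumed reflection strength encoding a 1
REFLECT_LOW = 0.3           # assumed reflection strength encoding a 0


def reflected_signal(bits):
    """Carrier emitted by the reader, scaled by the tag's reflection per bit."""
    samples = []
    for i, bit in enumerate(bits):
        strength = REFLECT_HIGH if bit else REFLECT_LOW
        for n in range(SAMPLES_PER_BIT):
            t = (i * SAMPLES_PER_BIT + n) / SAMPLES_PER_BIT  # time in bit periods
            carrier = math.sin(2 * math.pi * CARRIER_CYCLES_PER_BIT * t)
            samples.append(strength * carrier)
    return samples


def decode(samples):
    """Recover bits from the mean reflected amplitude in each bit period."""
    # The mean of |sin| over whole cycles is about 2/pi, so the threshold
    # sits midway between the two expected mean amplitudes.
    threshold = (REFLECT_HIGH + REFLECT_LOW) / 2 * (2 / math.pi)
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        window = samples[start:start + SAMPLES_PER_BIT]
        mean_amplitude = sum(abs(s) for s in window) / len(window)
        bits.append(1 if mean_amplitude > threshold else 0)
    return bits


payload = [1, 0, 1, 1, 0, 0, 1, 0]  # arbitrary example identifier bits
received = decode(reflected_signal(payload))
print(received == payload)
```

In this model the tag never transmits on its own; it only changes how strongly it reflects the reader’s carrier, and the reader recovers a sequence of bits from those changes over time.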
2.7.2.1 ISO/IEC 14443
Sixty years after Stockman’s initial research into the theory of measuring reflected
and modulated sound, light, and radar systems, there are multiple codified ISO
standards for specific types of RFID implementations. ISO/IEC 14443 is a specific
example of a global RFID standard that warrants further examination. It “is one of a
series of International Standards describing the parameters for identification cards as
defined in ISO 7810 and the use of such cards for international interchange.... part of
ISO/IEC 14443 describes the physical characteristics of proximity cards. This
International Standard does not preclude the incorporation of other standard
technologies on the card” (Joint Technical Committee ISO/IEC/JTC1, 2008). From the
outset then, it is important to note that the standard governing the form and intended
uses of RFID tags is institutionally motivated by other ISO/IEC standards, and by the
intended use case of “international interchange.”
Roussos (2008) gives a remarkably typical account of the intended international uses
of RFID transponders. He details the use of RFID in electronic machine-readable
travel documents (e-MRTDs) like passports and the international standards that
govern their implementation and use as defined by the International Civil Aviation
Organization (ICAO). He notes that e-MRTDs must adhere to the ISO/IEC 14443
standard which “provides specifications for iris scans and fingerprints for future use.”
He continues and writes that “Millions of e-passports are already in use, and
thousands of MRTD-capable immigration control facilities have been deployed at
disembarkation points in several countries” (2008, p. 12).
At every level of the discussion, the intended user of the device is not an individual.
e-MRTDs are intended for use by border security agents who are employed by
enormous institutions like border agencies and their parent countries. Indeed, “Their
development [e-MRTDs] is seen by ICAO and its member countries as a significant
improvement over manual inspection of travel documents at border control points in
terms of efficiency and data entry precision” (2008, p. 11). Thus to the holder of the
e-MRTD, there is little direct benefit apart from increased efficiency when crossing
through border checkpoints. Said benefit may of course be substantial and desirable,
but the point here is that the stated use of RFID tags in “international interchange”
and control over the devices themselves resides within institutions and not
individuals. At no point do individuals get to exercise agency over the devices or the
information that they contain.
Indeed the ISO/IEC 14443 compliant RFID transponders privilege the authority of
institutions over the document holders, as demonstrated by the ICAO’s description of
e-MRTD Assisted Border Clearance. Their 2008 Guidelines document states that an
e-MRTD Assisted Border Clearance system is one “[that] assists the border control
officer to authenticate the eMRTD via the use of a suitable document reader, establish
that the passenger is the rightful holder of the document and query border control
records. The officer himself determines eligibility for border crossing” (2008, p. 24).
The language of the guidelines is telling when combined with the earlier extract
from the ISO/IEC 14443 standard: the document itself is the most important piece of
the border crossing, which is exactly the use case encompassed by the ISO/IEC’s term
“international interchange.” Thus the passenger is relegated to the role of being
identified as an element in an exchange of information, not as a particular human
being, but as the “rightful holder of the document.” The e-MRTD as a document
authoritatively represents a person’s identity and legitimacy in such an institutional
transaction.
Roussos goes on to describe the benefits of RFID in large metropolitan transportation
systems. He writes that
“RFID offers distinct advantages due to the superior durability of tickets....
Ticket inspection at the gates is also facilitated by the far higher read accuracy of RFID compared with magnetic, which helps maintain the steady flow of commuters.... Finally, RFID tickets can hold considerably more data, which allows the use of personalized unique identifiers that can be used to virtually eliminate fare evasion (2008, p. 16).”
Roussos does not explicitly describe the institutional desirability of RFID for transit
systems. In the case of transit systems, individuals also benefit from the advantages
that he lists. However, the described advantages of ticket inspection and reduced fare
evasion are rooted in a transactional model that privileges the needs of the
institution as a service provider over those of the individual as a transit rider who
wishes to move from point A to B in the most expedient manner.
2.7.2.2 RFID Discussion
These two examples, while lengthy and in many ways banal, are remarkable for
exactly the fact that they are unremarkable. They are important because they
demonstrate the institutional standards that underlie RFID technology and many
practical implementations. Such foundations are not necessarily problematic either,
as described in the transit system example. Rather, they demonstrate how commercial
and institutional uses for RFID have determined current standards that govern how
information is stored on RFID tags, best practices for programming and reading tags,
and the actual physical size and contents of tags themselves, as is the case with ISO
14443 tags used in e-MRTDs.
ISO/IEC 14443 is strongly oriented towards second-order ordering in that it only uses
key pieces of indexical information about a passport holder or passenger, without
directly physically constraining its holder. In other words, ISO/IEC 14443 does not
regulate how people are physically categorized; it only indicates enough about a
person to identify them using, for example, a combination
of name, gender, eye colour, birth country and the like.
Additionally, e-MRTDs are excellent examples of boundary objects, as the transit
example demonstrates. A transit authority has distinctly different goals than a
transit user, but both rely on the same physical objects and infrastructure to
accomplish their different tasks. On the one hand transit authorities would like to
eliminate fare evasion and achieve a steady flow of passengers, while on the other,
transit users would like to get from place to place with a minimum amount of work and
time spent. By abstractly identifying passengers via their second order RFID passes,
which are based on first order ordering of electromagnetic pulses emitted from
transponders, all parties can complete their tasks.
A key point here is that RFID technologies bridge Weinberger’s first two ordering
systems. In discussing how it is used in both first and second order systems, and how
it is a boundary object between classes of users, the larger issues of identity, and
personal versus institutional goals come into focus. As Chapter 3 will demonstrate, a
new set of issues arises when RFID is used to bridge first and third-order systems.
2.8 Keywords and Tagging
While RFID is institutionally deployed and uses internationally agreed upon
standards to bridge first and second level ordering systems, the tagging and
folksonomy systems that this thesis will discuss are largely loosely structured and
mostly third-order in nature. Krauss (2010) gives an excellent summary of how RFID
systems and Library Management Systems (LMS) can be linked to allow users to
participate in “Library 2.0” type interactions. Indeed, the initial attempt at building
an RFID system for this thesis was intended to allow users to use passive RFID tags
and off the shelf hand-held components to leave tags on RFID t