Embedding metadata: exploring the ontology of hybrid digital and material objects

by

Jamon Camisso

A thesis submitted in conformity with the requirements for the degree of Master of Information Studies, Graduate Department of Faculty of Information

University of Toronto

Copyright © 2011 by Jamon Camisso. Released under the WTFPL

(http://sam.zoy.org/wtfpl/COPYING)

Embedding metadata: exploring the ontology of hybrid digital and material objects

Jamon Camisso

Master of Information Studies

Faculty of Information

University of Toronto

2011

Abstract

This thesis discusses the design of three systems that were built using Critical

Making as an investigative method. The systems are: an RFID antenna that links

ISBNs to online metadata; metamash.org, which aggregates ISBN metadata; and

doitag.org, which allows users to associate tags with DOI numbers. Each system was

designed to interrogate issues related to identification, categorization and the

institutional foundations of, and individual practices surrounding, information

systems, providing levers to get at deeper ontological issues.

Each investigation points in its own way to a profound lack of understanding about

the ontology of digital, or hybrid material/digital objects. David Weinberger’s

ordering scheme for material and digital objects is used because it allows for a

discussion of ordering systems in general. However, focusing solely on categorization

systems masks more important questions about the ontology of such objects and how

building and using such objects fundamentally defines what they are.

Acknowledgements

Writing this thesis has been a thoroughly enjoyable exercise in academic meandering.

I would like to thank my supervisor Dr. Stephen Hockema for bearing with me and

challenging me throughout this process. While much of the content of our discussions

is reflected in these pages, our conversations about issues that are not raised here

have been equally inspiring and always lead me to new and exciting discoveries.

I would also like to thank my second reader Professor Matt Ratto. Without Critical

Making and his advocacy, this thesis would never have made it past the vetting stage

by the Committee on Standing. I would also like to extend my thanks and respect to

my external reviewer Jean-François Blanchette, whose questions managed to do more

to make me summarize and reorient my thinking about the topic in two weeks than I

could have managed in months on my own.

Finally, to my friends and family, who have constantly engaged themselves with this

project in equal portions of academic and extracurricular capacities, I would like to

express my sincerest thanks. Without their perspectives, encouragement, and

prodding, I would probably have taken yet another year to finish this thesis.

Contents

1 Introduction

1.1 Material Metadata

1.2 Critical Making

1.3 Thesis Outline

2 Background & Literature Review

2.1 Books, Texts, and Paratext

2.1.1 Work/Text Discussion

2.1.2 Paratext

2.2 Categorization and Ordering

2.2.1 First, Second, Third Order Organization Systems

2.2.2 Ordering Systems Discussion

2.3 Ontology and Intentionality

2.4 Boundary Objects

2.5 Discussion

2.6 Part 1 Summary

2.7 Background: Every Book is a Problem

2.7.1 Introduction

2.7.2 RFID

2.8 Keywords and Tagging

2.9 Part 2 Summary

2.10 Co-Citation and Latent Semantic Analysis

2.10.1 Introduction

2.10.2 Co-word, Co-citation, and Contextual Co-citation Analyses

2.10.3 Latent Semantic Indexes and Analysis

2.11 Tagging Revisited

2.11.1 Folksonomies

2.12 Chapter Summary

3 Method: Critical Making

3.1 Arduino and RFID system

3.2 metamash.org

3.3 doitag.org

3.4 Summary

4 Discussion: Just-in-time Dimensionality

4.1 Introduction

4.2 metamash.org

4.3 WorldCat

4.3.1 Outliers

4.3.2 Co-tag Analysis

4.4 U of T Catalogue

4.5 Chapter Summary

4.6 Conclusion

4.7 Summary

References

A Copyright Acknowledgements

B RWD-ICODE Email

C WorldCat Email

D metamash.org search terms

E LibraryThing HTML Scraper

F Sample WorldCat RSS

List of Figures

1.1 Library Receipt

2.1 Book History/Textual Studies/Sociology of the Text. Whereas Howsam’s diagram uses lines to link each subject, this version borrows most of her textual elements but uses circles to match the aesthetic used in Figures 2.7 and 2.2.

2.2 Weinberger’s Three Orders

2.3 Negative Tag Cloud

2.4 Term network for Cluster 3 (86 documents labelled as “Bibliometrics2”)

2.5 Overall framework of Janssens et al.’s analysis

2.6 LibraryThing’s tag cloud for “Bibliometrics”

2.7 Material and Digital systems. The items joining the horizontal circles (RFID, Tags, and DOIs) represent means of interrogating the boundaries of each set of ideas.

3.1 Photograph of the RFID System showing the Arduino, ID-12 Antenna, and handmade antenna coils

3.2 Arduino RFID System

3.3 Overview of metamash.org

3.4 LibraryThing tags as raw HTML

3.5 WorldCat RSS parser

3.6 Overview of doitag.org

4.1 LSI Visualization

4.2 Related Bibsonomy Tags and Articles for “latent” tag

4.3 UTL element

E.1 LibraryThing HTML scraper

F.1 WorldCat RSS for Knuth’s Art of Computer Programming

Chapter 1

Introduction

“Everywhere everything is ordered to stand by, to be immediately at hand, indeed to stand there just so that it may be on call for a further ordering” (Heidegger, 1977, p. 17).

In many ways, this thesis is a mashup. It discusses things like categorization

systems, institutional control over information, tagging, bibliometric methods, and

the design of material and digital web-based mashups that were explicitly built for

this thesis. But this thesis is not just about categorization or mashups. Instead, all of

these topics taken together interrogate a much larger issue. Each topic, understood

within the larger context of the others, and the insights into each, ultimately hint at

a profound lack of understanding about the ontology of digital, or hybrid

material/digital objects.

1.1 Material Metadata

To begin exploring this issue, it is necessary to ground the discussion in material

objects as metadata. Accordingly, this section will attempt to establish

the value of storing what appears to be useless material metadata by examining a

library receipt that was found in the trash at a library.

Consider the library receipt shown in Figure 1.1. This receipt was retrieved from a

garbage can in St. Michael’s Library, University of Toronto, for the simple reason that

it was there. However uninteresting it might appear at first glance, it is a record of a

particular library patron (or potentially patrons) having researched, located, and

checked out a number of items at a particular location and moment in time. Nothing

can be reliably deduced about the patron: their interests, personality, and browsing

habits are not captured by the receipt. Just as Levy (2001, p. 8)

notes when discussing a lunch receipt from Steve’s Deli & Catering, all that the

receipt in Figure 1.1 captures is metadata about a particular event. In this case, the

event was one in which a patron located and checked out three items, though even

the date that the event occurred is unknowable.

This thesis will explore some of the relationships that emerge when metadata about

sets of items like books and articles is aggregated and stored for future use. In

keeping with Heidegger’s idea that technology can be used to create further order in

the world out of underlying raw materials, this thesis will attempt to demonstrate the

ways in which storing such metadata—instead of throwing it away—is useful.

For example, the three particular items shown on the receipt in Figure 1.1 were

checked out concurrently with each other. This is a useful observation in that all

three items share a common author. What that meant to the patron is unknowable,

but the fact is, there is a commonality amongst the items. Looking carefully, the

items have relatively recent publication dates: 2006, 2009, and 2007 respectively.

Again, what this means is unknown, but Chapter 4 will demonstrate the value of such

apparently inconsequential information. Finally, still carefully examining the item

numbers, note that the last item of the three has a different Library of Congress

letter prefix, RC, which is reserved for items with the subject of Internal Medicine

(2010b), versus the first two items that use BQ, which is reserved for items about

Buddhism (2010a). Given the stack layout of the library in question (University of

Toronto, 2010), this means that the patron probably had to traverse multiple floors and

shelves to retrieve the items shown on the receipt.
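
The comparison of class prefixes described above can be sketched in a few lines of Python. This is only an illustration: the call numbers below are hypothetical placeholders rather than the ones printed on the receipt in Figure 1.1, the names `lc_prefix` and `subjects_on_receipt` are invented for this example, and the subject table contains only the two prefixes named in the text.

```python
import re

# Class letters named in the text; the subject labels are the ones given there.
LC_SUBJECTS = {"BQ": "Buddhism", "RC": "Internal Medicine"}


def lc_prefix(call_number: str) -> str:
    """Return the leading class letters of a Library of Congress call number."""
    match = re.match(r"\s*([A-Z]{1,3})", call_number.upper())
    if match is None:
        raise ValueError(f"no class letters in {call_number!r}")
    return match.group(1)


def subjects_on_receipt(call_numbers):
    """Group call numbers by their class prefix, preserving receipt order."""
    groups = {}
    for number in call_numbers:
        groups.setdefault(lc_prefix(number), []).append(number)
    return groups


# Hypothetical call numbers (assumption), NOT the ones on the actual receipt.
receipt = ["BQ4570 .P76 2006", "BQ5612 .K33 2009", "RC489 .M55 2007"]
groups = subjects_on_receipt(receipt)
for prefix, numbers in groups.items():
    print(prefix, LC_SUBJECTS.get(prefix, "unknown"), len(numbers))
```

More than one key in `groups` is the signal discussed above: the items sit in different classification ranges, and so probably in different parts of the stacks.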

Figure 1.1: Library Receipt

If it had been found in the front cover of a book, or used as a bookmark, the receipt’s

embeddedness could have been used by a researcher to know what others have read

in relation to any referenced book on the receipt. That embedded metadata could

function in a manner similar to amazon.com’s recommendation system, which uses

“item-to-item collaborative filtering [that] matches each of the user’s purchased and

rated items to similar items, then combines those similar items into a

recommendation list” (2003). Moreover, other patrons could add their receipts (or any

information of their choosing for that matter) to a book and thus build up a set of

related texts for others to use.
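
To make the comparison concrete, the following is a minimal sketch of how embedded receipts could feed an item-to-item comparison. It is a simplification under stated assumptions, not amazon.com's actual implementation: each receipt is treated as a set of items, similarity is the cosine of binary co-occurrence, and the item names and the functions `item_similarities` and `related` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(receipts):
    """Cosine similarity between items, based on which receipts list them."""
    appearances = defaultdict(int)  # item -> number of receipts listing it
    together = defaultdict(int)     # (item_a, item_b) -> receipts listing both
    for receipt in receipts:
        items = sorted(set(receipt))
        for item in items:
            appearances[item] += 1
        for a, b in combinations(items, 2):
            together[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), count in together.items():
        score = count / sqrt(appearances[a] * appearances[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def related(item, sims, limit=3):
    """Items most similar to `item`, best first; ties broken alphabetically."""
    ranked = sorted(sims.get(item, {}).items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:limit]]


# Hypothetical receipts (assumption): each inner list is one checkout event.
receipts = [
    ["book_a", "book_b", "book_c"],
    ["book_a", "book_b"],
    ["book_b", "book_d"],
]
sims = item_similarities(receipts)
print(related("book_a", sims))
```

Every receipt added to a book extends the `receipts` list, so the set of related texts grows as patrons contribute, which is the accumulation described above.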

This embedding could also lead to a whole host of issues within libraries. Questions

like who gets to embed receipts into books, can receipts be modified or reprinted, can

multiple copies of the same receipt be placed in multiple books, can they be removed

from a book, and the immense question about what such a practice would mean for

patron privacy are just a few questions that arise from even speculating about what

embedding such metadata into a text might entail.

Now consider Heidegger’s quote again: retrieving a piece of what was for all intents

and purposes garbage, and using it to discuss metadata about the objects that it

references reveals a good deal of explicit and latent information about the items, and

about a particular place and event. That information and the receipt upon which it is

printed was on the verge of being thrown out, but, once retrieved and discussed, has

now found a new use. It is evidence, used for this thesis to show that someone went

to the trouble of getting the items listed on the receipt, and to demonstrate that the

simple fact that they did so indicates a commonality, or commonalities between items

on the receipt, whatever they might be.

Indeed, for the sake of argument, the items on the receipt could be completely

unrelated in terms of their contents, authors, and subjects. The point is, the agency

that was exercised, specifically the work that was done in getting those items, entails

some degree of relation between the listed items beyond merely existing in the

University of Toronto catalogue. What any set of such items might have in common

is unknowable for certain, but demonstrating that there are meaningful, unexplored

material and abstract relationships between texts is a central pursuit that this thesis

will undertake.

To that end, the practical and philosophical implications of building systems to store

this type of metadata about texts will be explored in this thesis using Critical Making

as a method, a description of which follows.

1.2 Critical Making

Briefly, Critical Making is a method pioneered by Ratto (2009), whereby an object is

designed and built to serve as a material and technological lens through which to

understand a broader set of questions or issues about a subject.

Three different metadata linking and capturing systems were designed for this thesis

using Critical Making. First, a Radio Frequency Identification (RFID)1 system was built in

an experimental attempt to define, generate, and capture metadata. The goal of the

system was to place sets of RFID tagged books on a table that could correlate the

physical proximity of books on the table with subject matter, authors, and other

criteria.

Based on the technical limitations of the RFID system, the second system,

metamash.org, was designed to link material books to their online digital metadata

via RFID tags and ISBN numbers. Again the idea driving the design of metamash.org

was that storing aggregated metadata about sets of books instead of individual items

could be useful. Finally, doitag.org was designed as a system to collect and link

journal articles using Digital Object Identifiers and user generated tags.

1 Chapter 2 will describe RFID in more detail.

Each system was designed to initiate thinking about and to explore theoretical topics

that arise when objects are linked to each other via metadata. Each system attempts

to understand how the ontologies of objects like material books, journal articles, or

even wholly digital texts are alterable when metadata is embedded into the self-same

object. This thesis will attempt to demonstrate that embedding metadata into objects

has the potential to fundamentally change conceptions of what material and digital

object boundaries are.

Accordingly, the following questions inform and will be addressed by Critical Making

and the three systems that were designed during the course of this thesis:

1. What new practices might arise when material and digital objects are linked

together based on their embedded metadata?

2. What do those practices reveal about the ontology of material, digital, or

abstract objects when metadata about themselves is stored and embedded into

the self-same objects?

1.3 Thesis Outline

Chapter 2 reviews the various bodies of knowledge required to understand what is

meant by terms like text, categorization, ordering systems, and Boundary Objects.

The chapter also explains what Radio Frequency Identification (RFID) is and how it

can be used. Subsequently, the chapter describes two notable large scale

implementations of RFID to establish the overwhelmingly institutional orientation of

existing RFID standards and systems. In a similar vein, the chapter also gives

background information about tagging and folksonomy systems. In both cases, the

individual and institutional underpinnings of each system are briefly discussed to

demonstrate the tension between nodes as points of information versus points of

control. In all cases, the objects discussed are boundary objects that are used as

points of negotiation between parties.

The second half of Chapter 2 gives detailed information about bibliometric methods

that can be used to map out disciplines. First, co-word, co-citation, and contextual

co-citation are discussed as ways of mapping out relationships between textual nodes

that comprise a canon or discipline. Next, Latent Semantic Analysis (LSA) is

discussed as an alternate bibliometric method to the various co-word/citation methods.

LSA is used to group words that are found within a set of texts to indicate semantic

relationships between common words. The resulting set of words is then used as a

contextual container or list of categories to describe the relationships between texts

that were used for the analysis. Finally, Chapter 2 explains what socially generated

folksonomies are and how they can be used as alternatives to the previously

discussed bibliometric methods without requiring expert knowledge of a discipline. As

a part of that description, the notion of outliers and marginal tags is also discussed.
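
As a rough illustration of the counting step behind co-citation analysis, the sketch below tallies how often two works appear together in the same reference list. It is a bare-bones sketch under stated assumptions, not the procedure used by any of the studies reviewed in Chapter 2: the paper and work identifiers are hypothetical, and the normalization and clustering that real bibliometric analyses apply afterwards are omitted.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of works, how many citing documents cite both."""
    counts = Counter()
    for references in reference_lists.values():
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(set(references)), 2):
            counts[pair] += 1
    return counts


# Hypothetical citing papers and the works they cite (assumption).
reference_lists = {
    "paper_1": ["work_x", "work_y", "work_z"],
    "paper_2": ["work_x", "work_y"],
    "paper_3": ["work_y", "work_z"],
}
counts = cocitation_counts(reference_lists)
for pair, count in sorted(counts.items()):
    print(pair, count)
```

Passing a mapping from texts to their tags instead of reference lists yields co-tag counts from the same function, which is one reason folksonomies can stand in for citation data.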

Chapter 3 explains how Critical Making as a method was used to construct the RFID

system that forms the basis of this thesis. The chapter explains alternate prototypes

that were considered but not constructed or seen through to completion and

demonstrates that, within the rubric of Critical Making, such incomplete

implementations are still useful tools to understand broader issues that the

prototyping process is designed to illustrate. Finally, the chapter goes on to show how

the decisions made in designing the RFID tagging prototype system, along with the

shortcomings of the system led to the design and creation of metamash.org and

doitag.org. It also proposes that Critical Making does not necessarily need to rely on

material objects to still be a useful method.

Chapter 4 shows three surprising results that arose during the design of

metamash.org. First, a programming flaw in the metamash.org code demonstrates an

unexpected but useful perspective on how metamash.org could be used as a search

tool as well as a mashup generator. Second, one of the very first sets of texts that was

input into metamash.org is discussed. A common tag shared by many of the texts is

pubd502, which, when carefully researched, reveals a rich set of documents that are

related to the initial set of books that were input into metamash.org. Finally, the tag

#206, which is found on a mashup containing (B. Smith, 1996), is discussed. The tag

links to a unique social categorization and collection management system being used

by the University of Berne’s Software Composition Group.

Chapter 4 also concludes this thesis and discusses how the ideas of discreteness and

granularity challenge the idea that texts are not ontologically changed by

demonstrating that the introduction of third order ordering systems fundamentally

changes how texts as objects are created and used. The conclusion discusses how

using a text is the most important aspect of determining what a text actually is,

notwithstanding definitions in various bodies of literature that are discussed and

problematized in Chapter 2.

Chapter 2

Background & Literature Review

This chapter is divided into seven main topics. The first three topics establish working

definitions and tools that will be used to analyze the subsequent topics and the objects

that are discussed in Chapter 3. The first of the three topics in the beginning of this

chapter revolves around ideas of what terms like “book” or “text” entail. In discussing

those ideas, the second topic of how and why to categorize such objects via

Weinberger’s (2007a) ordering framework is discussed. The third topic in this chapter

will describe how Star and Griesemer’s definition of “Boundary Objects” can be used

to understand some types of relationships between objects and the people who use them.

The next sections establish background information about RFID, and keyword and

tagging practices. These two sections rely on the discussion of ordering systems and

boundary objects to demonstrate how RFID and tagging systems are useful examples

within the broader context of this thesis’ goal of understanding the issues

surrounding material and digital object ontologies.

The last two sections discuss two existing bibliometric methods, co-citation and Latent

Semantic Analysis, which are currently used to understand relationships between

documents as participants in a network of texts. Both suites of techniques rely on

lossy statistical analyses where the resulting data are based on one-way operations.

The final detailed discussion about tagging in the last section of this chapter is based

on the idea that tagging is a lossless practice and relationships between documents

(and objects in general) can evolve or be recorded over time via tags.

Thus to proceed further, a discussion about the complexity and multiple notions of

what words like “book” or “text” entail is required. This will demonstrate how books

are one useful tool for attempting to answer the questions posed above. Additionally,

this section begins moving away from what has been a solely materially based

discussion, to examining ideas about “paratextual” elements.

2.1 Books, Texts, and Paratext

A book is never simply a remarkable object. Like every other technology it is invariably the product of human agency in complex and highly volatile contexts which a responsible scholarship must seek to recover (McKenzie, 1999, p. 4).

The existence of Book History programs at institutions like the University of Toronto1,

the University of Edinburgh2, Texas Tech University3, and international organizations

like the Society for the History of Authorship, Reading and Publishing (SHARP)4,

points to an ongoing and fertile discussion about modern and historical textual

practices. Indeed, the University of Toronto’s Book History and Print Culture program

notes on its program website that “Histoire du livre, History of the Book, Textual

Studies, Print Culture, Sociology of the Text-all these names have been used to

describe a growing international academic movement.”

McKenzie (1999, p. 4) points out the obvious fact that it is not just books that are the

subject of discussion in these contexts. For McKenzie, films, sound recordings, images,

digital files, and oral texts are all interdependent artifacts that bear witness to

human experiences. Searle (1983, p. vii) even notes that “sentences—the sounds that

come out of one’s mouth or the marks that one makes on paper—are, considered in

one way, just objects in the world like any other objects,” a point that will reemerge in

the forthcoming discussion in the section on paratexts, and in the section on

categorization and ordering systems.

1 http://bookhistory.fis.utoronto.ca/about.html

2 www.hss.ed.ac.uk/chb/

3 www.english.ttu.edu/grad_degrees/BH_default.asp

4 http://www.sharpweb.org/index.php?option=com_content&view=article&id=20&Itemid=54&lang=en

However, to return to McKenzie, he chooses to focus his discussion on books, noting

that they have traditionally been the material primarily studied by practitioners in

the field. Figure 2.1, which is adapted from a diagram and overview of Book

History by Howsam (2006), situates Bibliography within the discipline of Book History

as the area that focuses, in part, on the materiality of the book as an object.

However, as McKenzie notes in the quote that begins this section, books as objects of

study are problematic and do not easily fit into the single context of Bibliography as a

discipline. He even writes that “At best perhaps we can acknowledge the intricacies

of such a textual world and the almost insuperable problems of describing it

adequately” (1999, p. 4). Indeed, it is difficult to even enumerate the different material

contexts within which a book comes into existence.

For example, in the case of materiality, a book is a commitment to a particular set of

material constraints like page size, overall length, even ink colours, or typefaces.

Moreover, a book must exist in or transit through multiple social contexts and

processes like writing, editing, publication, and distribution. Thus understanding the

conditions within which a material codex was produced is but one set of problematic

contexts whose recovery, McKenzie claims, is the principal end of bibliography as a

scholarly practice (which for McKenzie includes sociology of the text, book history,

literature, and print culture).

Focusing on the materiality of a text also raises the thornier issue of a text’s

immaterial existence. Though Sutherland specifically writes about electronic text, she

puts it best when she writes that textuality “requires that we consider the unfixity

[italics added] of the text, the promiscuity as opposed to integrity of its identity in an

age when the text has a diverse non-book existence” (1997, p. 5). Since the focus of

this thesis is partly on the electronic forms of and interactions with a text,

Sutherland’s point is especially relevant.

Figure 2.1: Book History/Textual Studies/Sociology of the Text. Whereas Howsam’s diagram uses lines to link each subject, this version borrows most of her textual elements but uses circles to match the aesthetic used in Figures 2.7 and 2.2.

Moreover, the discussion about considering a text both as a discrete material object

and as an abstract object is an old debate and is still ongoing. Sutherland begins with

Roland Barthes’ description of a “work” as that which is held in the hand,

and his description of a “text”, which for Barthes is held in language (1997, p. 3).

However, according to Barwell (2005, p. 419), the “locus classicus of the distinction

between a “work” and an “expression” of it, which has been widely adopted by textual

critics, though equally strongly opposed by others” is provided by Tanselle (2001). To

confuse matters, Tanselle’s terms run opposite to Barthes’: Tanselle’s “work” is

equivalent to Barthes’ text, and his “expression” is equivalent to Barthes’ work.

2.1.1 Work/Text Discussion

The varying definitions of a work or an expression or a text are distinctions that are

established as problematic within Book History. As Sutherland and McKenzie, and to

some extent even Tanselle argue, any act of reading or experience of a text is locally

and contextually situated such that the distinction barely holds together. This thesis

further problematizes the boundaries of material and digital texts. Thus for the time

being, with these problems looming in the background, this thesis will proceed

without endorsing any particular definition until the Conclusion chapter at which

point the discussion will be picked up again within the larger discussion of material

and digital object ontologies.

Indeed, as Tanselle (2001, p. 37) describes it, “The simple point is this: electronic texts

and hypermedia archives often allow one to do many desirable things more easily

than one could accomplish them using the codex form.” As such, avoiding the

discussion about works, expressions, and texts until the Conclusion chapter allows for

many desirable points of discussion to be raised through the course of this thesis

without specifically needing to focus on Tanselle’s ‘codex form.’

Indeed, the multiplicity of a text, be it from a readerly, writerly, editorial, historical,

sociological, economic, or political perspective (to name but a few), makes

the problem of accurately distinguishing between a work and a text difficult outside of a

specific role or discipline. Consider that even internally, a codex itself is problematic

in terms of what is and is not a part of the “text” as a whole, as the next section about

paratextual elements will demonstrate.

2.1.2 Paratext

Indeed, regardless of where one stands on the previously discussed distinctions about

how to characterize texts, one idea about the nature of texts that bears exploring here

is what Genette (1997, p. 1) calls “paratext.” Genette argues that paratextual elements

are those that, while not belonging to a text per se, nevertheless serve to “present”

and “make present” an abstract text in the world. He claims that things like author’s

names, titles, tables of contents, headings, and chapters are all paratextual elements.

Genette’s argument is that any context is a paratext. He verges on, but does

not explicitly make, the point that any paratext itself can also be a text in the

Barthesian sense of the word. Genette argues that paratext consists of two main

varieties: “peritext,” such as chapters and headings, which usually resides in or close to an

actual material text, and “epitext,” which “at least originally, are located outside the

book, generally with the help of the media (interviews, conversations) or under cover of

private communications (letters, diaries, and others)” (p. 5). Importantly, in making

the distinction between the two types of paratext, Genette acknowledges that on a

finer grained scale than that of a book as a container for a text, things like indices or

interviews can stand as texts on their own. He concedes that “Most often, then, the

paratext is itself a text: if it is not the text, it is already some text” (p. 7).

This distinction focuses attention on three important points that will inform the rest

of this thesis. The first is the idea that depending on the level of granularity chosen,

something can always be either paratext, or a text in and of itself. This idea that a

text can occupy multiple roles at once will become more apparent in the forthcoming

discussion of Weinberger’s (2007a) ordering systems. The second point, which Genette

makes explicit, is that paratext’s “existence alone, if made known to the public,

provides some commentary on the text and influences how the text is received” (p. 7).

Finally, Genette writes that there are “implicit contexts that surround a work and, to

a greater or lesser degree, modify its significance” (p. 7).

Taken together with Sutherland, McKenzie, and Tanselle’s characterizations of texts

as contextually specific objects or ideas, these three points highlight how problematic

texts are as objects of study. For example, extracting a discrete component of a larger

containing text entails that a new text is created or can stand on its own. That new

text itself can then be contextualized, either in terms of another larger text, or in

terms of other external texts. Though not writing about books per se, Gibson (?) puts

it best when he writes that “the unit you choose for describing the environment

depends on the level of the environment you choose to describe” (p.9).

This idea of re-contextualizing discrete texts or textual elements warrants a

discussion about categorization and ordering systems, which follows. This section will

demonstrate the crucial importance of understanding categories and the manner

in which objects (including, but not limited to, texts) are organized. Fundamentally,

this section relies on the premise that an object can have some kind of easily

encapsulated identity. Whether or not this is truly the case is an unresolvable debate,

but this section will proceed under the assumption that just as pieces of texts like

paratextual elements can stand on their own as textual objects, so too can any

discrete component of any object or set of objects.

2.2 Categorization and Ordering

The frontiers of a book are never clear-cut: beyond the title, the first lines, and the last full stop, beyond its internal configuration and its autonomous form, it is caught up in a system of references to other books, other texts, other sentences: it is a node within a network (Foucault, 1972, p. 23).

According to Foucault then, and with the aforementioned discussion about texts,

contexts, and granularity in mind, a material book or article is a paradox. It is at

once both constrained and limited to its predefined physical location on a library

shelf, while at the same time existing in a network of relations to other similar or

referenced texts, ideas, and authors. This thesis attempts to interrogate both the

material and abstract nature of texts by employing metadata systems to explore the

relationships between any text’s material and digital forms. One aim here is to

demonstrate the existence of a rich set of latent relationships between the nodes—like

words, paragraphs, and chapters—that make up a text and another text or texts that

are not demonstrated explicitly by systems like library shelves or online catalogues.

Consider Foucault’s description of a book that begins this section—it is a dramatic

explanation of what a book is that goes beyond the material form that any particular

copy might inhabit. Foucault’s description situates the idea of “book” as a tool that,

while having material form and properties, can be understood as a tool to interrogate

the intellectual terrain and practices that surround the very creation of knowledge

itself.

This is a different approach to that of the Book History scholars cited in the previous

section in that the book or text is treated as a tool or means to investigating a deeper

set of questions than what a book is or is not, or what meanings it might provoke in

the reader. Foucault goes on to problematize the notion of an author’s oeuvre,

pointing out that the act of collecting and categorizing works into an oeuvre is both a

concealment and manifestation of some set of ideas that cannot encompass the

entirety of an author’s thoughts or the entire discourse about any particular idea.

In both cases, Foucault’s primary concern is with the idea that a text or set of works

has some kind of abstract characteristic that allows it to stand outside the systems that

produced it in the first place. By situating a text’s contents outside of a discussion

about material form or bibliographic textuality to explain what a book is and does,

Foucault can point to the interrelations that every text has with power and ordering

systems like language, authorship, and scholarly practice that collectively work to

comprise any particular text.

This idea that books exist in networks of relationships will prove useful in the latter

portion of this chapter that discusses Latent Semantic Analysis as a means of

discovering links between books (or articles) in a network of texts. Moreover, the idea

of a book existing simultaneously in a network of other texts, and as a codex on a

library shelf finds a 1-to-1 equivalence in the formulation of objects as both digital

and material at the same time.

In addition to Foucault’s argument that books function as nodes in networks of

discourse and power systems, Weinberger (2007a) also points out that “physical

limitations on how we have organized information have not only limited our vision,

they have also given the people who control the organization of information more

power than those who create the information.... because they get to decide what to

bring to the surface and what to ignore” (2007a, p. 89). Weinberger argues that

physical constraints are a limiting factor in organizing material information like

books, but his point also applies to the more abstract information systems that he

discusses.

2.2.1 First, Second, Third Order Organization Systems

Weinberger proposes a hierarchy of ordering systems comprised of first, second, and

third order constraints. Figure 2.2 is a visual representation of his different types of

orders. For example, Weinberger defines putting books on shelves or silverware into

drawers (to use his example) as first order systems. Their only criteria are that they

are material based and that they add order to the world. For Weinberger, consisting

of and arranging atoms is all that is necessary for an organization system to be a

first order system.

The switch to a second order system occurs the moment that metadata is created

about physically arranged objects. According to Weinberger, said objects can be

organized according to an abstract classification scheme. An example would be

library shelves (first order) holding items that are organized by Dewey Decimal or

Library of Congress classification schemes. The scheme itself says little about the

material object in question; it only provides an abstract label or index characteristic

that can be used to group many similar objects (books in this case) on the same shelf

or set of shelves. Weinberger points out a crucially important aspect of second order

systems: they are necessarily lossy systems. A second order system gives less

information about the object in question than the object itself contains. Indeed,

Weinberger notes that first and second order archives cannot “know” everything that

they contain (2007a, p. 19).

Note however that as the section on co-citation and Latent Semantic Analysis will

demonstrate, this lossiness is not necessarily a problem. Indeed, those methods are

useful for revealing relationships between documents in a network (per Foucault’s

characterization) that might not otherwise be apparent. In this respect then, those

methods are ideal for examining material objects and their second-order relationships.

Finally, Weinberger’s third order systems include digital repositories like Flickr (a

photo sharing website), which rely on individual users to categorize and label their

submissions with metadata. The metadata that users generate does not need to have

any relationship to other items in a collection. The key is that with third order

systems, categorization can be done by an individual or individuals, and can be

deferred until the “last possible moment”5, or is done “on the fly” based on

Figure 2.2: Weinberger’s Three Orders

combinations of various pieces of metadata, again supplied by an individual or

individuals. Weinberger calls this third order ordering “miscellaneous order” and

notes that “Traditional authorities cannot maintain themselves by insisting that we

have to go to them.... It is changing how we think the world itself is organized

and—perhaps more important—who we think has the authority to tell us so” (2007a,

p. 23).
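
The deferral of categorization that Weinberger describes can be illustrated with a minimal sketch. It is offered only as an illustration of the idea, not as a description of how Flickr or any other site is implemented. The item names, the tags, and the helper functions `build_index` and `find` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical user-supplied tags. No shared classification scheme governs
# them, and no item needs to relate to any other item in the collection.
user_tags = {
    "photo-001": {"toronto", "library", "winter"},
    "photo-002": {"library", "books"},
    "photo-003": {"toronto", "books"},
}


def build_index(tags_by_item):
    """Invert the item -> tags mapping into a tag -> items mapping."""
    index = defaultdict(set)
    for item, tags in tags_by_item.items():
        for tag in tags:
            index[tag].add(item)
    return index


def find(index, *wanted):
    """Group items 'on the fly': return the items that carry every wanted tag."""
    if not wanted:
        return set()
    candidate_sets = [index.get(tag, set()) for tag in wanted]
    return set.intersection(*candidate_sets)


index = build_index(user_tags)
print(sorted(find(index, "library")))
print(sorted(find(index, "toronto", "books")))
```

In this sketch no grouping exists until a query is made: the same three items fall into different sets depending on which combination of tags is requested at the moment of lookup.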

The idea that these “orders” can overlay each other is a fundamental assumption that

underlies this entire thesis. As Kirschenbaum (2005) points out, digital objects are

material objects (magnetic bits or electrons) that are just operated on abstractly

because they can be treated as formally identical despite being physically different,

or even existing in different locations. Kirschenbaum writes that:

this conundrum becomes the methodological lever with which to pry open the relentless symbolic cascade of computation and understand what is unique about computers as writing technologies: that they are material machines dedicated to propagating a behavioral illusion, or call it a working model, of immateriality (2005, p. 5).

Thus while material objects like books are constrained by things like editorial

processes, distribution requirements, or shelving space and categorization systems,

digital objects also exist in the material world, and are also subject to material

organizational requirements. A hard drive with bits, organized into tracks and sectors,

or even individual bits is a first-order system. A file-system on said hard drive with a

metadata volume describing where to find specific groups of bits (files), and how files

are spread out across the entire hard drive is a second order system that necessarily

does not capture every piece of information about the underlying first order.

A set of user defined directories and files that an individual computer user creates

and stores might appear to be a third-order system built on top of the first two

5 As Steve Hockema notes, this builds in an assumption that there is an “ultimate moment” when something is found, and a penultimate moment (the last possible one) in which categorization can happen. On a more basic level, the assumption that things are categorized in order to be found later is problematic in that it ignores other possible reasons for categorizing things.

systems, but the idea of a file appearing to exist inside a single directory inside

another single directory (symbolic links notwithstanding) is strongly rooted in a first

and second-order conceptual model.
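
The layering in this file-system example can be made concrete with a toy model. The following is a minimal sketch under stated assumptions: the block size, the file name, the allocation table layout, and the directory path are all hypothetical and do not correspond to any real file-system format.

```python
# First order: the raw medium, modelled as a list of fixed-size blocks.
# (Hypothetical 4-byte blocks; real disks use far larger sectors.)
blocks = [b"Embe", b"ddin", b"g me", b"tada", b"ta\x00\x00", b"junk"]

# Second order: an allocation table recording which blocks make up a file
# and how many bytes of them are valid. It describes the blocks without
# capturing everything that is physically present on the medium.
allocation = {
    "thesis.txt": {"blocks": [0, 1, 2, 3, 4], "length": 18},
}

# User-facing layer: a directory tree in which each file appears to live
# in exactly one place, much like a paper file in a single folder.
directories = {"/home/user/thesis": ["thesis.txt"]}


def read_file(name):
    """Reassemble a file from its blocks, trimming the padding bytes."""
    entry = allocation[name]
    raw = b"".join(blocks[i] for i in entry["blocks"])
    return raw[: entry["length"]]


def unreferenced_blocks():
    """List block numbers that exist on the medium but that no file claims."""
    used = {i for entry in allocation.values() for i in entry["blocks"]}
    return [i for i in range(len(blocks)) if i not in used]


print(read_file("thesis.txt"))
print(unreferenced_blocks())
```

The unreferenced block shows the lossiness of the second order in miniature: the bytes are materially present, but the metadata layer says nothing about them, and the directory layer hides the block arrangement altogether.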

Bolter and Grusin’s (2000) notion of “remediation” clarifies how the user defined files

and directories portion of this file-system example straddles the boundaries between

second and third-orders. They argue that “the logic of immediacy dictates that the

medium itself should disappear and leave us in the presence of the thing represented”

(p. 6). Where that thing to be represented is a document on a hard drive, representing

it as a file draws upon the old technologies of paper files, file folders, and filing

cabinets and propagates the illusion that there is a seamless transition from material files

(paper), to material files (bits), to representation on a screen.

Weinberger’s schema of ordering systems is not without its flaws. For example,

Weinberger does not give enough consideration to how third order systems are

afforded or built upon second and first order systems via remediation per the

discussion of Kirschenbaum (2005) and Bolter and Grusin (2000). Moreover,

Weinberger arbitrarily limits third order systems to the realm of digital objects. The

RFID enabled surface that was to be designed for this thesis (which will be discussed

in Chapter 3) would be an example of a first order system that does many of the

things that a third order system does.

Indeed, at its core, Weinberger’s real point is that digitality simply allows leaving

objects loosely categorized until they are referred to or called upon for use. To frame

Weinberger’s system in terms of the already noted quote from Tanselle about

electronic objects (2001, p.39), all that Weinberger really does is note that digital

items afford being quickly categorized and as such, categorizing can be done before,

during, or after a digital object is created. But there is no reason that the same does

not apply to material objects, as the RFID surface example will demonstrate; it is just

that digital forms make it easier.

The point is, if Weinberger’s schema is to be adopted for this thesis, third-order

systems are not entirely unproblematic. For example, according to Weinberger, a user

does not need to understand or care about the first two underlying systems to be able

to use the third order system. However, as Kirschenbaum points out, that model is

illusory since it is predicated on two very strict underlying organizational models. As

such, combining Kirschenbaum’s more ontologically grounded discussion of what a

digital object physically is with Weinberger’s point that physical limitations restrict

what can be done with information means that Weinberger’s critique holds for

digital objects as well. As this thesis will demonstrate, the source of third order

information is just as problematic as the structure and arrangement of first and

second order information.

Notwithstanding the drawbacks to Weinberger’s system, it is nevertheless very useful

because it allows discussing hybrid material and digital objects in a particular

manner. Indeed, stretching Norman’s use of the word ‘affordance’, and despite

Weinberger’s system not being an object, the idea that Weinberger’s ordering system

affords discussing hybrid material/digital objects is important, a point which will be

discussed in the upcoming section on Ontology and Intentionality.

Finally, Foucault’s description of books as nodes in larger networks allows for a closer

reading of Weinberger’s criticism. Weinberger’s description of how systems limit

access to information and grant control over information to those who curate rather

than create it raises the question: how can those points of control be better

understood or worked around? In other words, how can Foucault’s notion of a text as

a system of nodes and networks of knowledge and power relations be realized in

material or digital form in such a way as to either circumvent, or at least draw

attention to a) the linkages between nodes that Foucault describes, and b) the

Foucauldian centres of control that exist in any of the networks or levels of order per

5 Norman’s use itself is an appropriation of Jerome Gibson’s, who originally coined the term; see Chapter 3 of Gibson (1986).

Weinberger’s characterization of ordering systems?

2.2.2 Ordering Systems Discussion

The goal through all this is still ultimately that of describing how ordering and

categorizing material and digital objects points to the larger issue of what it means

for such objects to exist and be used. As Levy so eloquently writes, “It’s a curious

thing about documents: you can’t see them if you don’t look at them; but you also

can’t see them if you look only at them, ignoring the surroundings in which they

operate” (2001, p. 29). The broader perspective that Levy endorses of looking beyond

only books or specific types of documents is important and is one of the reasons that

Weinberger’s ordering framework is so useful.

Indeed, one of the key reasons to choose Weinberger’s ordering scheme is its

simplicity and ability to describe the contextual surroundings of documents and

objects in general. While this thesis specifically discusses texts in various forms, the

idea that any object or system can exist in first, second or third-order systems is a

broader point that this thesis attempts to explore. For example, while the RFID system

that will be described in Chapter 3 is built around specifically linking books to digital

metadata, the existence of sites like touchtag.com that are built around linking RFID

tags embedded in objects to online data makes the usefulness of Weinberger’s

scheme more apparent. The touchtag.com website gives numerous examples of

linking many types of objects to online metadata. Two examples are: “Link[ing]

souvenirs to the online photo albums” (2010b, np.) or linking “collectables directly to

online information” (2010a, np.).

These are broader examples of just two possible applications that arise when material

and digital objects or collections can be linked together. Building a similar system to

bridge the gaps between material and digital orders for books in Chapter 3 allows for

a meaningful discussion of these linking and embedding practices. The usefulness of

Weinberger’s divisions is that they help differentiate between types of systems. Thus

instead of building applications using RFID to do things like link photo albums to

material objects, Weinberger’s divisions lead to meaningful points of departure into

a broader discussion about bridging material and digital systems instead of glossing

over the divide, or focusing exclusively on material, abstract, or digital systems.

As a part of that broader discussion, it is appropriate to discuss ontology and

intentionality here, since these ideas have until now only been mentioned in passing.

Given the discussion in the previous section about books and texts, and in this section

about the nature of digital objects, it seems only appropriate to begin the following

section on ontology with a relevant quote from Tanselle.

2.3 Ontology and Intentionality

Printed and electronic renderings are thus not ontologically different; they may be made of different physical materials, but the conceptual status of the texts in each case is identical. The philosophical conundrum as to where texts reside is exactly the same as it always was (2006, np.).

As a key point of entry into the discussion about the nature of hybrid material and

digital objects that have embedded within themselves their own metadata, Tanselle’s

point is a good start. He notes that the particular renderings of any given text are

just that, renderings, which can take many forms but all fundamentally point to the

same text. However, the final sentence in Tanselle’s formulation avoids delving into a

much larger body of work on exactly the problem of what objects (not just texts) are

and where they reside.

For example, Smith (2002) attempts to succinctly describe objects in the following

manner, writing that “To be an object is to be a patch of the world that is successfully

abstracted.... The fundamental character of (what it is to be) an object is thus

intrinsically hooked into the intentional life practices of the objectifying subject” (p.

241).

Immediately then Smith’s characterization of objects requires three things. First,

there is a requirement that something be a “successfully abstracted” part of the

world. There is no specific endorsement of either material or digital media on Smith’s

part. Indeed, as Kirschenbaum’s (2005) analysis demonstrates, such a distinction does

not hold because at some level, all digital objects are material. However, the key to

that portion of Smith’s formulation is abstraction, in that it allows for

Kirschenbaum’s formalism, and Tanselle’s abstract idea that a “text” is the same no

matter what medium is used to represent it.

The other key piece of Smith’s formulation is the idea that intentional practices

stabilize objects. This idea is similar to what was discussed earlier with the library

receipt in Figure 1.1. In that example, part of the reason that conceptualizing the

books as objects is useful is because of the inherent intentionality (in the

philosophical sense of aboutness or directedness (Jacob, 2010)) that is evidenced by the

receipt, from the overlapping perspectives of the patron and the library as an

institution. Both take the receipt to refer to a transaction involving discrete books as

objects, and this unit is reinforced by the information systems that catalog and

enumerate the books involved in the transaction. But what of the status of the receipt

itself? It is hard to dispute that this is an object as well (given, for example, the use

of it in the library system as well as in this thesis). But it seems a special sort of object

that also embodies and represents “metadata” about other objects, and the significance

and chunking of this metadata differs based on the practicing subject.

Interestingly enough, discussing the ‘objectness’ of the receipt itself as a piece of, or

pieces of metadata that exist based on differing intentionalities raises a number of

questions about the object itself. Embedding such a receipt in a book, as was

previously discussed, raises even more questions when viewed as a means of

expressing intentionality related to an object. Note that the word used here is

intentionality, and not agency, per Smith’s quote above. It is useful to refer to

Searle’s (1983) usage of the word Intentionality, since he discusses it in a clear and

useful manner.

Searle first notes that Intentionality is not an ordinary relationship “like sitting on

top of something” (p.4). Instead, for Searle “Intentional states represent objects and

states of affairs in the same sense of ‘represent’ that speech acts represent objects

and states of affairs” (p.5). A discussion of his entire book and exploration of the topic

is beyond the scope of this thesis, but the key point to take away from Searle is that

Intentionality is a term that is used to describe the beliefs of a subject about an

object. Thus to have Intentionality about an object for Searle is to be committed to a

belief about something that can be represented by using an object in a particular way.

Consider an example of how books and articles are discussed within the scholarly

publishing community. Miller and Harris (2009) describe how individual scholars,

editors, publishers, and subscribers have conflicting agendas where any work is

involved. They describe the intent of publications for scientific researchers as key to

gaining credentials for academic survival (p.13); they describe editors as intending to

“maintain and improve the quality of the journals they serve” (p.14); they describe

publishers as intending to make money, with all other intents concomitant to that

main consideration; finally, they describe the intent of universities as “providing the

reference materials necessary to support the missions of the university” (p.17).

This example demonstrates how something like a single article allows for multiple

dramatically different goals to be expressed or represented by such an object, all

within a very specific context of academic publishing. Each actor in the relationship

has their own set of uses for scientific scholarly articles, which do not necessarily

complement the others’. But the point is, to each participant, a given article might be

conceptualized in completely different ways based on how it can be put to

use—gaining credentials, gaining readership, monetary gain, or institutional mission

fulfillment respectively.

Going a step further, combining Searle’s (1983) ideas about intentionality with

Norman’s (1988) ideas about affordances gives a clear picture of how intentionality

can be expressed over an object. An object like an article can be designed such that it

affords scholars to easily read it, editors to easily publish it, publishers to easily sell it,

and universities to easily purchase it. Each participant can work with the affordances

of an article’s form, content, and portability to fulfill their own particular goals.

At the same time, intentionality informs the design and functionality of objects. The

two work in tandem with each other to inform how objects can be used. In so doing,

what an object affords in terms of different intended uses can inform what that object

is when it is used. Indeed, as was noted earlier, the discussion about the nature of

books as a work or text demonstrates how the multiple notions of a book as an object

still allow different groups of scholars to (mostly) successfully interact with

books despite their differing scholarly intentions.

This description is admittedly a simplistic formulation of a set of very difficult and

contested problems that philosophers constantly struggle to define and resolve.

However, it is crucial to bear these ideas in mind through the course of this thesis,

since the attempt here is to demonstrate how embedding metadata into objects, be

they material or digital, not only affects the successful abstraction of some part of

the world, but also entails a difference in terms of what an object is. This argument is

based on both a hypothetical subject’s intent of using such an object, with the new

affordances provided by the embedded metadata, and the ways the metadata affects

the dynamic tension associated with how the object is registered and stabilized within

the system of stakeholders who all treat an object as an “abstracted patch” of the

world in their own ways.

One other important tool to understand the ontological issues surrounding hybrid

material/digital metadata objects, then, is to understand how linkages between nodes

function to either obscure or reveal boundaries between multiple ordering systems.

Thus another level of abstraction from material and digital objects to wholly abstract

objects is necessary. Accordingly, Star and Griesemer’s (1989) notion of “boundary

objects” is a useful conceptual tool that will be described in the following section.

2.4 Boundary Objects

As a tool for understanding the interplay between books, nodes, networks, and control

points over information, the idea that things can be boundary objects is a crucial

piece that ties all these ideas about ontology, power, categorization, and ordering

systems together. Star and Griesemer explain boundary objects as inhabiting multiple

social worlds while at the same time satisfying the informational requirements

of each of them (1989, p. 393). According to Star and Griesemer, boundary objects

can be either concrete or abstract objects that allow reuse in different contexts and

locations, but maintain an identity despite their varied uses. The authors describe

four types of boundary object, which follow:

1. “Repositories. These are ‘ordered’ piles of objects which are indexed in a standardized fashion. Repositories are built to deal with problems of heterogeneity caused by differences in unit of analysis. An example of a repository is a library or museum. It has the advantage of modularity. People from different worlds can use or borrow from the ‘pile’ for their own purposes without having directly to negotiate differences in purpose” (1989, p. 410).

2. Ideal type. An object such as a diagram, atlas, or classifier that is abstracted from

its domain, which does not describe any one item but is instead vague enough to

be adaptable to multiple sites of use. The example that Star and Griesemer use

is the term “species”.

3. Coincident boundaries. Star and Griesemer describe these boundary objects as

having the same external boundaries, but different internal contents. The

example they give is of separate maps of the state of California, with one map

displaying typical road-map like features, while another map of the same state

shows highly abstract ecological zones. In this case, the state itself is the

coincident boundary that is used for different purposes.

4. Standardized forms. These objects are described as “methods of common

communication across dispersed work groups.... The results of this type of

boundary object are standardized indexes” (1989, p. 411). Star and Griesemer go

on to describe standardized forms as useful for transmitting objects over

distances without losing or changing information, such that any local

uncertainties are ‘deleted’ to use their term.

For the purposes of this thesis, repositories, ideal types, and standardized forms are

the boundary object types that will be used. As Star and Griesemer point out, libraries

are a specific example of a repository. Just as repositories are designed to deal with

differing units of analysis, here the unit of analysis will be shifted. Instead of

focusing on libraries as repositories, this thesis treats books as repositories,

specifically of ideas that can be used for multiple purposes, as Star and Griesemer

point out.

Additionally, in terms of ideal types, “books” taken abstractly are an ideal example of

the boundary object type. Foucault’s characterization is explicit in describing books as

nodes; essentially, he treats books abstractly as ideal objects in order to adapt the

idea of a book to his overall discussion of knowledge. This lack of distinction between

objects and abstract ideas as boundary objects is useful. It meshes together objects

that are treated immaterially (Kirschenbaum’s illusion of immateriality and

Weinberger’s third-order systems) with material objects like actual “books” and helps

defer the previously noted work, text, representation discussion until the conclusion

of this thesis.

2.5 Discussion

As an initial attempt at building boundary objects to explore and discuss the research

questions stated at the beginning of this chapter, the Critical Making Lab at the

Faculty of Information, University of Toronto, was used to build a prototype RFID

system. The system uses RFID tags to link books to their online metadata via a user’s

web browser. As the preceding section mentioned, the method used to build the

prototype system was Critical Making, wherein the object that is designed (the RFID

system) is not the actual focus of the thesis.

Instead, the RFID antenna and the decisions that were made in designing the system

serve as material and technological lenses through which to understand the broader

issues of object identity and ontology, categorization systems, and the individual and

institutional practices that underlie each. Two other systems were designed based on

building the RFID system, metamash.org and doitag.org, which will be discussed later

in this thesis.

Note that the lack of precision or accuracy in cataloging a text is not in itself a

problem that this thesis attempts to discuss or resolve. Indeed, given sufficient

knowledge of how a library is laid out according to Library of Congress or Dewey

Decimal systems, or any other for that matter—see Chapter 6 for an interesting

example of an alternative classification system—people can and do find the items that

they seek in libraries.

The problem put forward here is that it is difficult on individual, social, and even

institutional levels to exercise agency over the institutions like the local branch or

the Library of Congress, which categorize items according to their internal needs. For

example, an outside practitioner would have an extremely difficult time arguing for

and effecting a change to the Library of Congress subject headings for a book like

Archive Fever (Derrida, 1996) so that they include “Archives” (cf. the existing headings: 1.

Memory (Philosophy) 2. Psychoanalysis 3. Freud, Sigmund, 1856-1939).

Even here, the problem is still not quite so clear. It is not that there is a tension

between objective and subjective classifications, readings, or descriptions of a text;

the problem lies elsewhere and encompasses Weinberger’s problematic power dynamic

and Foucault’s distributed network. The issue is one of scale and abstraction of

objects and the methods used to organize them. As the discussion about texts and

paratexts revealed, so too do objects simultaneously inhabit multiple local material and

abstract distributed contexts, on multiple scales of granularity. It is fairly clear that

this is the case for books, but it also highlights the looming question of whether or how

this applies to hybrid material and digital objects.

For example, metamash.org uses tags from LibraryThing that are all submitted to

that site⁶ by individuals based on their individual book collections. At the same time,

metamash.org also uses bibliographic data from WorldCat⁷, which is a site built

around multiple libraries sharing their catalogue data about books. A book entered

into metamash.org becomes a boundary object between these two systems and in so

doing can highlight the tensions between individual and institutional uses and

sources of data. More importantly, as the RFID system will demonstrate, a book with

an embedded RFID tag can also straddle first and third order ordering systems and in

so doing, interrogate what it means for such an object to exist in the first place.
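
As a rough illustration of how a single book record can sit between individual and institutional sources, the following sketch combines one catalogue record with tags submitted by several individuals. It is not metamash.org's implementation: the `BookRecord` structure, the function names, the placeholder ISBN, and the sample tags are all hypothetical. Only the subject headings are taken from the Archive Fever example above.

```python
from dataclasses import dataclass, field


@dataclass
class BookRecord:
    """One book seen from two sides: institutional catalogue and individual tags."""
    isbn: str
    catalogue: dict = field(default_factory=dict)  # institutional metadata
    user_tags: dict = field(default_factory=dict)  # individual tags: tag -> count


def aggregate(isbn, catalogue_record, tag_lists):
    """Combine one catalogue record with tags submitted by several individuals."""
    record = BookRecord(isbn=isbn, catalogue=dict(catalogue_record))
    for tags in tag_lists:
        for tag in tags:
            key = tag.strip().lower()
            if key:
                record.user_tags[key] = record.user_tags.get(key, 0) + 1
    return record


def tags_missing_from_catalogue(record):
    """Tags that individuals use but that match no institutional subject heading."""
    headings = {h.lower() for h in record.catalogue.get("subjects", [])}
    return sorted(t for t in record.user_tags if t not in headings)


# Hypothetical sample data: the ISBN and the tags are invented placeholders.
catalogue = {
    "title": "Archive Fever",
    "subjects": [
        "Memory (Philosophy)",
        "Psychoanalysis",
        "Freud, Sigmund, 1856-1939",
    ],
}
record = aggregate(
    "9780000000002",
    catalogue,
    [["archives", "psychoanalysis"], ["Archives", "derrida"]],
)
missing = tags_missing_from_catalogue(record)
```

With this invented data, `missing` holds the tags "archives" and "derrida", which is one simple way of making the tension between individual and institutional descriptions visible in a single record.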

Additionally, books are ideal boundary objects to begin attempting to resolve across

scales and practices, since as Star and Griesemer point out, repository boundary

objects are those that explicitly straddle different units of analysis or abstraction

(1989, p. 410). Instead of focusing on libraries per Star and Griesemer’s example,

books are finer grained units of analysis that are functionally similar to libraries in

that they are points of negotiation between different social worlds.

Whereas the Library of Congress can bring its full authority and credibility to bear

on the abstract and necessarily limited classification of Archive Fever, it does not

account for the multiplicity of specific, local, and contextual readings that individual

readers, or even groups of readers, or other institutions might derive from the book.

Indeed for this thesis, the primary subject heading for Archive Fever could easily be

Archives instead of Memory. Thus the material book is an information resource that

⁶ http://librarything.org

⁷ http://worldcat.org

is cited in this individual thesis for a particular purpose; at the same time, the book

exists in an institutional catalogue that has different objectives than this thesis. Thus

the volume itself is a point of negotiation between individual and institutional needs,

never mind the possibly different interpretations of the ideas it contains.

Examining the discrete nature of books and articles by building the RFID system,

metamash.org, and doitag.org, all of which are described in detail in Chapter 3, will

better characterize the tension between institutional metadata systems and individual

agency over how those systems are used. For each system, there are multiple sources

of information (Foucault’s nodes and points of control), each system sits on the

boundary between ordering systems (Weinberger), and each system is and outputs

boundary objects that fit into either repository or ideal type categories, which

highlight both the Foucauldian and Weinbergerian aspects of each system.

2.6 Part 1 Summary

This chapter serves as an introduction to a number of key ideas that inform the rest

of this thesis. First is that storing metadata about a text or set of texts can be used

to discover relationships between texts based on the agency of an unknown but

intelligent person(s). Second is that any text can be broken down into smaller

components than the containing form.

Next is Foucault’s notion of the book as a node situated in an abstract network of

other books and nodes like institutions, authors, and readers. Crucially important is

Foucault’s idea that the scale or unit of analysis for any network can extend to

individual words and sentences, or beyond the material form of a book to the

discourses in which it exists. This idea that objects can be understood as components

in interrelated scaleless networks is fundamental to the rest of this thesis.

Another important idea, which is complementary to Foucault’s, is Weinberger’s

formulation of first, second, and third order organization systems. The characteristics

of each underlie the RFID system, metamash.org, and doitag.org. More importantly,

the idea that each type of system can be overlaid on top of the preceding systems

informs the design of all three metadata systems that are discussed in this thesis.

Foucault’s idea about scale of analysis also serves as a useful conceptual tool with

which to interrogate the power relations embedded in the joints between ordering

systems.

Finally, Star and Griesemer’s characterization of boundary objects is another useful

conceptual tool. Each of the objects used or created by the RFID, metamash.org, and

doitag.org systems is a boundary object and can be used to interrogate joints between

ordering systems, points of control over information, sources of said information, and

communities of interest and practice that surround each object or set of objects.

Deploying each of these ideas in conjunction with the others via Critical Making

raises more questions than it answers about identification and categorization.

However, the process of designing and building the systems that will be discussed in

this thesis draws attention to a boundary region between material and digital objects

and related practices that warrants in-depth examination.

2.7 Background: Every Book is a Problem

Standards, categories, technologies, and phenomenology are increasingly converging in large-scale information infrastructure.... this convergence poses both political and ethical questions (Bowker & Star, 1999, p. 47).

2.7.1 Introduction

The convergence of various types of systems into large scale infrastructures that

Bowker and Star describe in the preceding quote can be understood in one way as

effacing some of the gaps between ordering systems. For example, despite the

criticisms of second order metadata systems discussed in the Introduction of this

thesis, it is appropriate to point out that the Library of Congress metadata for Archive

Fever was organized enough that it was suitably catalogued and located exactly

where it was supposed to be within the University of Toronto’s library system. This is

no small feat indeed considering that the University of Toronto library system

contained 18,985,932 items in April 2009 (2009).

The ability of any user with enough knowledge of the organizational schema (Library

of Congress in this case) to place, store, cross-reference, and locate a single volume

within such an immense system bears testament to the fact that however large,

difficult, inefficient, or problematic library metadata systems may or may not be, the

University of Toronto system (at least) worked as intended for the particular text in

question. In other words, the large scale first and second order UTL systems and the

second order Library of Congress system work well together to achieve one particular

user-oriented goal: finding a book on a library shelf.

However, the seed idea for this thesis arose while searching for Archive Fever. What

might happen if users could embed their own metadata into material books on

shelves? One need look no further than the folded pages of a library book or

handwritten marginalia to see that the very medium of a material book all but

encourages users to leave their thoughts and notes embedded in a book itself. Thus,

the initial attempt of this thesis was to build a prototype RFID system that could

enable embedding digital user generated metadata into material library books,

essentially overlaying a third order system on top of first and second order systems,

without altering the underlying ordering infrastructures. The goal was to augment

existing library catalogue systems in order to question the standards, categories, and

technologies governing such systems, in a manner similar to that of Bowker and

Star whose quote opens this chapter.

By using RFID to create boundary objects, the points of negotiation between parties

and between Weinberger’s first and second order ordering systems, and second and

third order ordering systems become slightly more apparent. While RFID augments

many institutional systems (libraries, supply chains, international travel and the

like), using it in this thesis to interrogate the transition points between levels of

ordering reveals that the needs of institutions take precedence over those of

individuals. To understand why the largely transactional nature of RFID is

important to this thesis, some background information is required.

2.7.2 RFID

One particularly important augmentation of books started with the invention of

barcodes in 1949 (Wikipedia, 2010a). Historically however, Mai (2003) points out that

13th-century monks built a shared catalogue of 183 English monastery libraries to

keep track of items, so augmenting books with second-order metadata is not a new

practice by any means. The Online Computer Library Center (OCLC) notes that Dewey

Decimal has been around since the 1870s (2010, np.). However, the barcode

specifically marks a turning point for books because it represents McLuhan’s

“stepping-up of speed from the mechanical [e.g. punch cards] to the instant electric

form” (1964, p. 47). In this case, the electric form resides in pulse-width modulated

signals as read from barcodes applied to books. Carrying that notion of an instant

electric form forward from barcodes to the present, myriad systems exist that operate

on the same basic principle of using digital signals to quickly identify objects. RFID

tags comprise one such system, which informs the rest of this thesis.

The foundational theoretical and applied research paper (Stockman, 1948) upon which

modern RFID systems are built was published in 1948, a year before research into

barcode systems had truly begun. Stockman’s research involved using various types

of transceivers to modulate power emitted from a receiver. The receiver would then

read the retransmitted signal from the remote transceiver. Crucially, the difference

in Stockman’s system versus traditional radar systems of his day was that signals

could be modulated over time instead of signaling binary on or off conditions.

Stockman pointed out that possible civilian applications of his reflected power system

included, amongst other things, “automatic pin-pointing... and simplified means for

identification and navigation” (1948, pp. 1196, 1204).

2.7.2.1 ISO/IEC 14443

Sixty years after Stockman’s initial research into the theory of measuring reflected

and modulated sound, light, and radar systems, there are multiple codified ISO

standards for specific types of RFID implementations. ISO/IEC 14443 is a specific

example of a global RFID standard that warrants further examination. It “is one of a

series of International Standards describing the parameters for identification cards as

defined in ISO 7810 and the use of such cards for international interchange.... part of

ISO/IEC 14443 describes the physical characteristics of proximity cards. This

International Standard does not preclude the incorporation of other standard

technologies on the card” (Joint Technical Committee ISO/IEC/JTC1, 2008). From the

outset then, it is important to note that the standard governing the form and intended

uses of RFID tags is institutionally motivated by other ISO/IEC standards, and by the

intended use case of “international interchange.”

Roussos (2008) gives a remarkably typical account of the intended international uses

of RFID transponders. He details the use of RFID in electronic machine-readable

travel documents (e-MRTDs) like passports and the international standards that

govern their implementation and use as defined by the International Civil Aviation

Organization (ICAO). He notes that e-MRTDs must adhere to the ISO/IEC 14443

standard which “provides specifications for iris scans and fingerprints for future use.”

He continues and writes that “Millions of e-passports are already in use, and

thousands of MRTD-capable immigration control facilities have been deployed at

disembarkation points in several countries” (2008, p. 12).

At every level of the discussion, the intended user of the device is not an individual.

e-MRTDs are intended for use by border security agents who are employed by

enormous institutions like border agencies and their parent countries. Indeed, “Their

development [e-MRTDs] is seen by ICAO and its member countries as a significant

improvement over manual inspection of travel documents at border control points in

terms of efficiency and data entry precision” (2008, p. 11). Thus to the holder of the

e-MRTD, there is little direct benefit apart from increased efficiency when crossing

through border checkpoints. Said benefit may of course be substantial and desirable,

but the point here is that the stated use of RFID tags in “international interchange”

and control over the devices themselves resides within institutions and not

individuals. At no point do individuals get to exercise agency over the devices or the

information that they contain.

Indeed, the ISO/IEC 14443 compliant RFID transponders privilege the authority of

institutions over the document holders, as demonstrated by the ICAO’s description of

e-MRTD Assisted Border Clearance. Their 2008 Guidelines document states that an

e-MRTD Assisted Border Clearance system is one “[that] assists the border control

officer to authenticate the eMRTD via the use of a suitable document reader, establish

that the passenger is the rightful holder of the document and query border control

records. The officer himself determines eligibility for border crossing” (2008, p. 24).

The language of the guidelines is telling when combined with the earlier extract

from the ISO/IEC 14443 standard: the document itself is the most important piece of

the border crossing, which is exactly the use case encompassed by the ISO/IEC’s term

“international interchange.” Thus the passenger is relegated to the role of being

identified as an element in an exchange of information, not as a particular human

being, but as the “rightful holder of the document.” The e-MRTD as a document

authoritatively represents a person’s identity and legitimacy in such an institutional

transaction.

Roussos goes on to describe the benefits of RFID in large metropolitan transportation

systems. He writes that

“RFID offers distinct advantages due to the superior durability of tickets.... Ticket inspection at the gates is also facilitated by the far higher read accuracy of RFID compared with magnetic, which helps maintain the steady flow of commuters.... Finally, RFID tickets can hold considerably more data, which allows the use of personalized unique identifiers that can be used to virtually eliminate fare evasion” (2008, p. 16).

Roussos does not explicitly describe the institutional desirability of RFID for transit

systems. In the case of transit systems, individuals also benefit from the advantages

that he lists. However, the described advantages of ticket inspection and reduced fare

evasion are rooted in a transactional model that privileges the needs of the

institution as a service provider over those of the individual as a transit rider who

wishes to move from point A to B in the most expedient manner.

2.7.2.2 RFID Discussion

These two examples, while lengthy and in many ways banal, are remarkable for

exactly the fact that they are unremarkable. They are important because they

demonstrate the institutional standards that underlie RFID technology and many

practical implementations. Such foundations are not necessarily problematic either,

as described in the transit system example. Rather, they demonstrate how commercial

and institutional uses for RFID have determined current standards that govern how

information is stored on RFID tags, best practices for programming and reading tags,

and the actual physical size and contents of tags themselves, as is the case with ISO

14443 tags used in e-MRTDs.

ISO/IEC 14443 is strongly oriented towards second-order ordering in that it only uses

key pieces of indexical information about a passport holder or passenger, without

directly physically constraining its holder. In other words, ISO/IEC 14443 does not

regulate how people are physically categorized; it only indicates enough about a

person to identify them using categories such as a combination of

name, gender, eye colour, birth country and the like.

Additionally, e-MRTDs are excellent examples of boundary objects, as the transit

example demonstrates. A transit authority has distinctly different goals than a

transit user, but both rely on the same physical objects and infrastructure to

accomplish their different tasks. On the one hand transit authorities would like to

eliminate fare evasion and achieve a steady flow of passengers, while on the other,

transit users would like to get from place to place with a minimum amount of work and

time spent. By abstractly identifying passengers via their second order RFID passes,

which are based on first order ordering of electromagnetic pulses emitted from

transponders, all parties can complete their tasks.

A key point here is that RFID technologies bridge Weinberger’s first two ordering

systems. In discussing how it is used in both first and second order systems, and how

it is a boundary object between classes of users, the larger issues of identity, and

personal versus institutional goals come into focus. As Chapter 3 will demonstrate, a

new set of issues arises when RFID is used to bridge first and third-order systems.

2.8 Keywords and Tagging

While RFID is institutionally deployed and uses internationally agreed upon

standards to bridge first and second level ordering systems, the tagging and

folksonomy systems that this thesis will discuss are largely loosely structured and

mostly third-order in nature. Krauss (2010) gives an excellent summary of how RFID

systems and Library Management Systems (LMS) can be linked to allow users to

participate in “Library 2.0” type interactions. Indeed, the initial attempt at building

an RFID system for this thesis was intended to allow users to use passive RFID tags

and off the shelf hand-held components to leave tags on RFID t
