E : by Jamon Camisso · 2013. 11. 1. · ii Embedding metadata: exploring the ontology of hybrid...


  • Embedding metadata: exploring the ontology of hybrid digital and material objects

    by

    Jamon Camisso

    A thesis submitted in conformity with the requirements

    for the degree of Master of Information Studies

    Graduate Department of Faculty of Information

    University of Toronto

    Copyright © 2011 by Jamon Camisso

    Released under the WTFPL

    (http://sam.zoy.org/wtfpl/COPYING)

  • ii

    Embedding metadata: exploring the ontology of hybrid digital and material objects

    Jamon Camisso

    Master of Information Studies

    Faculty of Information

    University of Toronto

    2011

    Abstract

    This thesis discusses the design of three systems that were built using Critical

    Making as an investigative method. The systems are: an RFID antenna that links

    ISBNs to online metadata; metamash.org, which aggregates ISBN metadata; and

    doitag.org, which allows users to associate tags with DOI numbers. Each system was

    designed to interrogate issues related to identification, categorization and the

    institutional foundations of, and individual practices surrounding, information

    systems, providing levers to get at deeper ontological issues.

    Each investigation points in its own way to a profound lack of understanding about

    the ontology of digital, or hybrid material/digital objects. David Weinberger’s

    ordering scheme for material and digital objects is used because it allows for a

    discussion of ordering systems in general. However, focusing solely on categorization

    systems masks more important questions about the ontology of such objects and how

    building and using such objects fundamentally defines what they are.

  • iii

    Acknowledgements

    Writing this thesis has been a thoroughly enjoyable exercise in academic meandering.

    I would like to thank my supervisor Dr. Stephen Hockema for bearing with me and

    challenging me throughout this process. While much of the content of our discussions

    is reflected in these pages, our conversations about issues that are not raised here

    have been equally inspiring and always lead me to new and exciting discoveries.

    I would also like to thank my second reader Professor Matt Ratto. Without Critical

    Making and his advocacy, this thesis would never have made it past the vetting stage

    by the Committee on Standing. I would also like to extend my thanks and respect to

    my external reviewer Jean-François Blanchette, whose questions managed to do more

    to make me summarize and reorient my thinking about the topic in two weeks than I

    could have managed in months on my own.

    Finally, to my friends and family, who have constantly engaged themselves with this

    project in equal portions of academic and extracurricular capacities, I would like to

    express my sincerest thanks. Without their perspectives, encouragement, and

    prodding, I would probably have taken yet another year to finish this thesis.

  • Contents

    1 Introduction

    1.1 Material Metadata

    1.2 Critical Making

    1.3 Thesis Outline

    2 Background & Literature Review

    2.1 Books, Texts, and Paratext

    2.1.1 Work/Text Discussion

    2.1.2 Paratext

    2.2 Categorization and Ordering

    2.2.1 First, Second, Third Order Organization Systems

    2.2.2 Ordering Systems Discussion

    2.3 Ontology and Intentionality

    2.4 Boundary Objects

    2.5 Discussion

    2.6 Part 1 Summary

    2.7 Background: Every Book is a Problem

    2.7.1 Introduction

    2.7.2 RFID

    2.8 Keywords and Tagging

    2.9 Part 2 Summary

    2.10 Co-Citation and Latent Semantic Analysis

    2.10.1 Introduction

    2.10.2 Co-word, Co-citation, and Contextual Co-citation Analyses

    2.10.3 Latent Semantic Indexes and Analysis

    2.11 Tagging Revisited

    2.11.1 Folksonomies

    2.12 Chapter Summary

    3 Method: Critical Making

    3.1 Arduino and RFID system

    3.2 metamash.org

    3.3 doitag.org

    3.4 Summary

    4 Discussion: Just-in-time Dimensionality

    4.1 Introduction

    4.2 metamash.org

    4.3 WorldCat

    4.3.1 Outliers

    4.3.2 Co-tag Analysis

    4.4 U of T Catalogue

    4.5 Chapter Summary

    4.6 Conclusion

    4.7 Summary

    References

    A Copyright Acknowledgements

    B RWD-ICODE Email

    C WorldCat Email

    D metamash.org search terms

    E LibraryThing HTML Scraper

    F Sample WorldCat RSS

  • List of Figures

    1.1 Library Receipt

    2.1 Book History/Textual Studies/Sociology of the Text. Whereas Howsam’s diagram uses lines to link each subject, this version borrows most of her textual elements but uses circles to match the aesthetic used in Figures 2.7 and 2.2.

    2.2 Weinberger’s Three Orders

    2.3 Negative Tag Cloud

    2.4 Term network for Cluster 3 (86 documents labelled as “Bibliometrics2”)

    2.5 Overall framework of Janssens et al.’s analysis

    2.6 LibraryThing’s tag cloud for “Bibliometrics”

    2.7 Material and Digital systems. The items joining the horizontal circles (RFID, Tags, and DOIs) represent means of interrogating the boundaries of each set of ideas.

    3.1 Photograph of the RFID System showing the Arduino, ID-12 Antenna, and handmade antenna coils

    3.2 Arduino RFID System

    3.3 Overview of metamash.org

    3.4 LibraryThing tags as raw HTML

    3.5 WorldCat RSS parser

    3.6 Overview of doitag.org

    4.1 LSI Visualization

    4.2 Related Bibsonomy Tags and Articles for “latent” tag

    4.3 UTL element

    E.1 LibraryThing HTML scraper

    F.1 WorldCat RSS for Knuth’s Art of Computer Programming

  • Chapter 1

    Introduction

    “Everywhere everything is ordered to stand by, to be immediately at hand, indeed to stand there just so that it may be on call for a further ordering” (Heidegger, 1977, p. 17).

    In many ways, this thesis is a mashup. It discusses things like categorization

    systems, institutional control over information, tagging, bibliometric methods, and

    the design of material and digital web based mashups that were explicitly built for

    this thesis. But this thesis is not just about categorization or mashups. Instead, all of

    these topics taken together interrogate a much larger issue. Each topic understood

    within the larger context of the others, and the insights into each, ultimately hints at

    a profound lack of understanding about the ontology of digital, or hybrid

    material/digital objects.

    1.1 Material Metadata

    To begin exploring this issue, it is necessary to ground the discussion in material

    objects as metadata. Accordingly, this section will attempt to establish

    the value of storing what appears to be useless material metadata by examining a

    library receipt that was found in the trash at a library.

    Consider the library receipt shown in Figure 1.1. This receipt was retrieved from a

    garbage can in St. Michael’s Library, University of Toronto, for the simple reason that

    it was there. However uninteresting it might appear at first glance, it is a record of a

    particular library patron (or potentially patrons) having researched, located, and

    checked out a number of items at a particular location and moment in time. Nothing

    can be reliably deduced about the patron: their interests, personality, and browsing

    habits are not captured by the receipt. Just as Levy (2001, p. 8)

    notes when discussing a lunch receipt from Steve’s Deli & Catering, all that the

    receipt in Figure 1.1 captures is metadata about a particular event. In this case, the

    event was one in which a patron located and checked out three items, though even

    the date that the event occurred is unknowable.

    This thesis will explore some of the relationships that emerge when metadata about

    sets of items like books and articles is aggregated and stored for future use. In

    keeping with Heidegger’s idea that technology can be used to create further order in

    the world out of underlying raw materials, this thesis will attempt to demonstrate the

    ways in which storing such metadata—instead of throwing it away—is useful.

    For example, the three particular items shown on the receipt in Figure 1.1 were

    checked out concurrently with each other. This is a useful observation in that all

    three items share a common author. What that meant to the patron is unknowable,

    but the fact is, there is a commonality amongst the items. Looking carefully, the

    items have relatively recent publication dates: 2006, 2009, and 2007 respectively.

    Again, what this means is unknown, but Chapter 4 will demonstrate the value of such

    apparently inconsequential information. Finally, still carefully examining the item

    numbers, note that the last item of the three has a different Library of Congress

    letter prefix, RC, which is reserved for items with the subject of Internal Medicine

    (2010b), versus the first two items that use BQ, which is reserved for items about

    Buddhism (2010a). Given the stack layout of the library in question (University of

    Toronto, 2010), this means that the patron probably had to traverse multiple floors and

    shelves to retrieve the items shown on the receipt.
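
    The prefix comparison described above can be sketched in a few lines of Python. This is only an illustration: the lookup table holds just the two subclasses named in the text, the call numbers used with it are made up, and the helper names lc_prefix and subjects_on_receipt are hypothetical.

```python
import re

# Only the two subclasses named in the text; a real table would be far larger.
LC_SUBJECTS = {"BQ": "Buddhism", "RC": "Internal Medicine"}


def lc_prefix(call_number):
    """Return the leading letter prefix of a Library of Congress call number."""
    match = re.match(r"\s*([A-Za-z]{1,3})", call_number)
    return match.group(1).upper() if match else None


def subjects_on_receipt(call_numbers):
    """Group call numbers by the subject their letter prefix maps to."""
    groups = {}
    for number in call_numbers:
        subject = LC_SUBJECTS.get(lc_prefix(number), "unknown")
        groups.setdefault(subject, []).append(number)
    return groups
```

    Given two call numbers beginning with BQ and one beginning with RC, subjects_on_receipt groups the first two under Buddhism and the third under Internal Medicine, which is the split visible on the receipt.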

    Figure 1.1: Library Receipt

    If it had been found in the front cover of a book, or used as a bookmark, the receipt’s

    embeddedness could have been used by a researcher to know what others have read

    in relation to any referenced book on the receipt. That embedded metadata could

    function in a manner similar to amazon.com’s recommendation system, which uses

    “item-to-item collaborative filtering [that] matches each of the user’s purchased and

    rated items to similar items, then combines those similar items into a

    recommendation list” (2003). Moreover, other patrons could add their receipts (or any

    information of their choosing for that matter) to a book and thus build up a set of

    related texts for others to use.
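
    As a rough illustration of how a pile of saved receipts could support this kind of item-to-item matching, the following Python sketch counts how often two items appear on the same receipt and ranks related items by that count. It is a plain co-occurrence count, not Amazon's actual algorithm, and the item identifiers and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(receipts):
    """Count how often each pair of items appears on the same receipt."""
    related = defaultdict(Counter)
    for items in receipts:
        # set() ignores an item listed twice on one receipt
        for a, b in combinations(sorted(set(items)), 2):
            related[a][b] += 1
            related[b][a] += 1
    return related


def recommend(related, item, limit=3):
    """Return the items most often checked out alongside `item`."""
    return [other for other, _ in related[item].most_common(limit)]


# Hypothetical receipts; the identifiers are invented for this sketch.
receipts = [
    ["BQ-001", "BQ-002", "RC-003"],
    ["BQ-001", "BQ-002"],
    ["BQ-002", "RC-003", "Z-004"],
]
related = build_cooccurrence(receipts)
```

    Here recommend(related, "BQ-001", 1) returns the item that shares the most receipts with BQ-001. Every additional receipt changes the counts, which is the sense in which patrons could build up a set of related texts over time.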

    This embedding could also lead to a whole host of issues within libraries. Questions

    like who gets to embed receipts into books, can receipts be modified or reprinted, can

    multiple copies of the same receipt be placed in multiple books, can they be removed

    from a book, and the immense question about what such a practice would mean for

    patron privacy are just a few questions that arise from even speculating about

    what embedding such metadata into a text might entail.

    Now consider Heidegger’s quote again: retrieving a piece of what was for all intents

    and purposes garbage, and using it to discuss metadata about the objects that it

    references reveals a good deal of explicit and latent information about the items, and

    about a particular place and event. That information and the receipt upon which it is

    printed was on the verge of being thrown out, but, once retrieved and discussed, has

    now found a new use. It is evidence, used for this thesis to show that someone went

    to the trouble of getting the items listed on the receipt, and to demonstrate that the

    simple fact that they did so indicates a commonality, or commonalities between items

    on the receipt, whatever they might be.

    Indeed, for the sake of argument, the items on the receipt could be completely

    unrelated in terms of their contents, authors, and subjects. The point is, the agency

    that was exercised, specifically the work that was done in getting those items, entails

    some degree of relation between the listed items beyond merely existing in the

    University of Toronto catalogue. What any set of such items might have in common

    is unknowable for certain, but demonstrating that there are meaningful, unexplored

    material and abstract relationships between texts is a central pursuit that this thesis

    will undertake.

    To that end, the practical and philosophical implications of building systems to store

    this type of metadata about texts will be explored in this thesis using Critical Making

    as a method, a description of which follows.

    1.2 Critical Making

    Briefly, Critical Making is a method pioneered by Ratto (2009), whereby an object is

    designed and built to serve as a material and technological lens through which to

    understand a broader set of questions or issues about a subject.

    Three different metadata linking and capturing systems were designed for this thesis

    using Critical Making. First, a Radio Frequency Identification (RFID)¹ system was built in

    an experimental attempt to define, generate, and capture metadata. The goal of the

    system was to place sets of RFID tagged books on a table that could correlate the

    physical proximity of books on the table with subject matter, authors, and other

    criteria.

    Based on the technical limitations of the RFID system, the second system,

    metamash.org, was designed to link material books to their online digital metadata

    via RFID tags and ISBN numbers. Again the idea driving the design of metamash.org

    was that storing aggregated metadata about sets of books instead of individual items

    could be useful. Finally, doitag.org was designed as a system to collect and link

    journal articles using Digital Object Identifiers and user generated tags.
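
    The idea of linking articles through identifiers and user generated tags can be sketched as a small in-memory index. This is not the doitag.org implementation, which this chapter does not describe; the class name, the tag normalization rule, and the example DOIs are assumptions made for illustration.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory store associating identifiers (e.g. DOIs) with user tags."""

    def __init__(self):
        self.tags_by_id = defaultdict(set)
        self.ids_by_tag = defaultdict(set)

    def add(self, identifier, tag):
        # Assumed normalization so that "Bibliometrics" and "bibliometrics" match
        tag = tag.strip().lower()
        self.tags_by_id[identifier].add(tag)
        self.ids_by_tag[tag].add(identifier)

    def linked(self, identifier):
        """Map every other identifier sharing a tag to the set of shared tags."""
        result = {}
        for tag in self.tags_by_id.get(identifier, ()):
            for other in self.ids_by_tag[tag]:
                if other != identifier:
                    result.setdefault(other, set()).add(tag)
        return result


# Hypothetical DOIs and tags
index = TagIndex()
index.add("10.1000/example.1", "Bibliometrics")
index.add("10.1000/example.2", "bibliometrics")
index.add("10.1000/example.2", "rfid")
index.add("10.1000/example.3", "rfid")
```

    In this sketch an article is linked to another only through tags that users have applied to both, so the relationships are stored for sets of items rather than for any single item.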

    ¹ Chapter 2 will describe RFID in more detail.

    Each system was designed to initiate thinking about and to explore theoretical topics

    that arise when objects are linked to each other via metadata. Each system attempts

    to understand how the ontologies of objects like material books, journal articles, or

    even wholly digital texts are alterable when metadata is embedded into the self-same

    object. This thesis will attempt to demonstrate that embedding metadata into objects

    has the potential to fundamentally change conceptions of what material and digital

    object boundaries are.

    Accordingly, the following questions inform and will be addressed by Critical Making

    and the three systems that were designed during the course of this thesis:

    1. What new practices might arise when material and digital objects are linked

    together based on their embedded metadata?

    2. What do those practices reveal about the ontology of material, digital, or

    abstract objects when metadata about themselves is stored and embedded into

    the self-same objects?

    1.3 Thesis Outline

    Chapter 2 reviews the various bodies of knowledge required to understand what is

    meant by terms like text, categorization, ordering systems, and Boundary Objects.

    The chapter also explains what Radio Frequency Identification (RFID) is and how it

    can be used. Subsequently, the chapter describes two notable large scale

    implementations of RFID to establish the overwhelmingly institutional orientation of

    existing RFID standards and systems. In a similar vein, the chapter also gives

    background information about tagging and folksonomy systems. In both cases, the

    individual and institutional underpinnings of each system are briefly discussed to

    demonstrate the tension between nodes as points of information versus points of

    control. In all cases, the objects discussed are boundary objects that are used as

    points of negotiation between parties.

    The second half of Chapter 2 gives detailed information about bibliometric methods

    that can be used to map out disciplines. First, co-word, co-citation, and contextual

    co-citation are discussed as ways of mapping out relationships between textual nodes

    that comprise a canon or discipline. Next, Latent Semantic Analysis (LSA) is

    discussed as an alternate bibliometric method to the various co-word/citation methods.

    LSA is used to group words that are found within a set of texts to indicate semantic

    relationships between common words. The resulting set of words is then used as a

    contextual container or list of categories to describe the relationships between texts

    that were used for the analysis. Finally, Chapter 2 explains what socially generated

    folksonomies are and how they can be used as alternatives to the previously

    discussed bibliometric methods without requiring expert knowledge of a discipline. As

    a part of that description, the notion of outliers and marginal tags is also discussed.

    Chapter 3 explains how Critical Making as a method was used to construct the RFID

    system that forms the basis of this thesis. The chapter explains alternate prototypes

    that were considered but not constructed or seen through to completion and

    demonstrates that, within the rubric of Critical Making, such incomplete

    implementations are still useful tools to understand broader issues that the

    prototyping process is designed to illustrate. Finally, the chapter goes on to show how

    the decisions made in designing the RFID tagging prototype system, along with the

    shortcomings of the system led to the design and creation of metamash.org and

    doitag.org. It also proposes that Critical Making does not necessarily need to rely on

    material objects to still be a useful method.

    Chapter 4 shows three surprising results that arose during the design of

    metamash.org. First, a programming flaw in the metamash.org code demonstrates an

    unexpected but useful perspective on how metamash.org could be used as a search

    tool as well as a mashup generator. Second, one of the very first sets of texts that was

    input into metamash.org is discussed. A common tag shared by many of the texts is

    pubd502, which, when carefully researched, reveals a rich set of documents that are

    related to the initial set of books that were input into metamash.org. Finally, the tag

    #206, which is found on a mashup containing (B. Smith, 1996), is discussed. The tag

    links to a unique social categorization and collection management system being used

    by the University of Berne’s Software Composition Group.

    Chapter 4 also concludes this thesis and discusses how the ideas of discreteness and

    granularity challenge the idea that texts are not ontologically changed, by

    demonstrating that the introduction of third order ordering systems fundamentally

    changes how texts as objects are created and used. The conclusion discusses how

    using a text is the most important aspect of determining what a text actually is,

    notwithstanding definitions in various bodies of literature that are discussed and

    problematized in Chapter 2.

  • Chapter 2

    Background & Literature Review

    This chapter is divided into seven main topics. The first three topics establish working

    definitions and tools that will be used to analyze the subsequent topics and the objects

    that are discussed in Chapter 3. The first of the three topics in the beginning of this

    chapter revolves around ideas of what terms like “book” or “text” entail. In discussing

    those ideas, the second topic of how and why to categorize such objects via

    Weinberger’s (2007a) ordering framework is discussed. The third topic in this chapter

    will describe how Bowker and Griesemer’s definition of “Boundary Objects” can be used

    to understand some types of relationships between objects and people who use them.

    The next sections establish background information about RFID, and keyword and

    tagging practices. These two sections rely on the discussion of ordering systems and

    boundary objects to demonstrate how RFID and tagging systems are useful examples

    within the broader context of this thesis’ goal of understanding the issues

    surrounding material and digital object ontologies.

    The last two sections discuss two existing bibliometric methods, co-citation and Latent

    Semantic Analysis, which are currently used to understand relationships between

    documents as participants in a network of texts. Both suites of techniques rely on

    lossy statistical analyses where the resulting data are based on one-way operations.

    The final detailed discussion about tagging in the last section of this chapter is based

    on the idea that tagging is a lossless practice and relationships between documents

    (and objects in general) can evolve or be recorded over time via tags.

    Thus to proceed further, a discussion about the complexity and multiple notions of

    what words like “book” or “text” entail is required. This will demonstrate how books

    are one useful tool for attempting to answer the questions posed above. Additionally,

    this section begins moving away from what has been a solely materially based

    discussion, to examining ideas about “paratextual” elements.

    2.1 Books, Texts, and Paratext

    A book is never simply a remarkable object. Like every other technology it is invariably the product of human agency in complex and highly volatile contexts which a responsible scholarship must seek to recover (McKenzie, 1999, p. 4).

    The existence of Book History programs at institutions like the University of Toronto¹,

    the University of Edinburgh², Texas Tech University³, and international organizations

    like the Society for the History of Authorship, Reading and Publishing (SHARP)⁴,

    points to an ongoing and fertile discussion about modern and historical textual

    practices. Indeed, the University of Toronto’s Book History and Print Culture program

    notes on its program website that “Histoire du livre, History of the Book, Textual

    Studies, Print Culture, Sociology of the Text-all these names have been used to

    describe a growing international academic movement.”

    McKenzie (1999, p. 4) points out the obvious fact that it is not just books that are the

    subject of discussion in these contexts. For McKenzie, films, sound recordings, images,

    digital files, and oral texts are all interdependent artifacts that bear witness to

    human experiences. Searle (1983, p. vii) even notes that “sentences—the sounds that

    come out of one’s mouth or the marks that one makes on paper—are, considered in

    one way, just objects in the world like any other objects,” a point that will reemerge in

    the forthcoming discussion in the section on paratexts, and in the section on

    categorization and ordering systems.

    ¹ http://bookhistory.fis.utoronto.ca/about.html

    ² www.hss.ed.ac.uk/chb/

    ³ www.english.ttu.edu/grad_degrees/BH_default.asp

    ⁴ http://www.sharpweb.org/index.php?option=com_content&view=article&id=20&Itemid=54&lang=en

    However, to return to McKenzie, he chooses to focus his discussion on books, noting

    that they have traditionally been the material primarily studied by practitioners in

    the field. Overall, Figure 2.1, which is copied from a diagram and overview of Book

    History by Howsam (2006), situates Bibliography within Book History as a discipline

    as the area that focuses, in part, on the materiality of the book as an object.

    However, as McKenzie notes in the quote that begins this section, books as objects of

    study are problematic and do not easily fit into the single context of Bibliography as a

    discipline. He even writes that “At best perhaps we can acknowledge the intricacies

    of such a textual world and the almost insuperable problems of describing it

    adequately” (1999, p. 4). Indeed, it is difficult to even enumerate the different material

    contexts within which a book comes into existence.

    For example, in the case of materiality, a book is a commitment to a particular set of

    material constraints like page size, overall length, even ink colours, or typefaces.

    Moreover, a book must exist in or transit through multiple social contexts and

    processes like writing, editing, publication, and distribution. Thus understanding the

    conditions within which a material codex was produced is but one set of problematic

    contexts whose recovery, McKenzie claims, is the principal end of bibliography as a

    scholarly practice (which for McKenzie includes sociology of the text, book history,

    literature, and print culture).

    Focusing on the materiality of a text also raises the more thorny issue of a text’s

    immaterial existence. Though Sutherland specifically writes about electronic text, she

    puts it best when she writes that textuality “requires that we consider the unfixity

    [italics added] of the text, the promiscuity as opposed to integrity of its identity in an

    age when the text has a diverse non-book existence” (1997, p. 5). Since the focus of

    this thesis is partly on the electronic forms of and interactions with a text,

    Sutherland’s point is especially relevant.

    Figure 2.1: Book History/Textual Studies/Sociology of the Text. Whereas Howsam’s diagram uses lines to link each subject, this version borrows most of her textual elements but uses circles to match the aesthetic used in Figures 2.7 and 2.2.

    Moreover, the discussion about considering a text both as a discrete material object

    and as an abstract object is an old debate and is still ongoing. Sutherland initially

    describes Roland Barthes’ description of a “work” as that which is held in the hand,

    and his description of a “text”, which for Barthes is held in language (1997, p. 3).

    However, according to Barwell (2005, p. 419), the “locus classicus of the distinction

    between a “work” and an “expression” of it, which has been widely adopted by textual

    critics, though equally strongly opposed by others” is provided by Tanselle (2001). To

    confuse things, Tanselle’s definition is opposite to Barthes’, where a “work” is

    equivalent to Barthes’ text, and an expression is equivalent to Barthes’ work.

    2.1.1 Work/Text Discussion

    The varying definitions of a work or an expression or a text are distinctions that are

    established as problematic within Book History. As Sutherland and McKenzie, and to

    some extent even Tanselle argue, any act of reading or experience of a text is locally

    and contextually situated such that the distinction barely holds together. This thesis

    further problematizes the boundaries of material and digital texts. Thus for the time

    being, with these problems looming in the background, this thesis will proceed

    without endorsing any particular definition until the Conclusion chapter at which

    point the discussion will be picked up again within the larger discussion of material

    and digital object ontologies.

    Indeed, as Tanselle (2001, p. 37) describes it, “The simple point is this: electronic texts

    and hypermedia archives often allow one to do many desirable things more easily

    than one could accomplish them using the codex form.” As such, avoiding the

    discussion about works, expressions, and texts until the Conclusion chapter allows for

    many desirable points of discussion to be raised through the course of this thesis

    without specifically needing to focus on Tanselle’s ‘codex form.’

    Indeed, the multiplicity of a text, be it from a readerly, writerly, editorial, historical,

    sociological, economic, or political perspective (to name but a few), makes

    the problem of accurately distinguishing between a work and a text difficult outside of a

    specific role or discipline. Consider that even internally, a codex itself is problematic

    in terms of what is and is not a part of the “text” as a whole as the next section about

    paratextual elements will demonstrate.

    2.1.2 Paratext

    Indeed, regardless of where one stands on the previously discussed distinctions about

    how to characterize texts, one idea about the nature of texts that bears exploring here

    is what Genette (1997, p. 1) calls “paratext.” Genette argues that paratextual elements

    are those that, while not belonging to a text per se, nevertheless serve to “present”

    and “make present” an abstract text in the world. He claims that things like author’s

    names, titles, tables of contents, headings, and chapters are all paratextual elements.

    Genette’s argument is that any context is a paratext. He verges on, but does

    not explicitly make, the point that any paratext itself can also be a text in the

    Barthesian sense of the word. Genette argues that paratext consists of two main

    varieties, “peritext” like chapters, and headings, which usually reside in or close to an

    actual material text, and “epitext,” which “at least originally, are located outside the

    book, generally with the help of the media (interviews, conversations) or under cover of

    private communications (letters, diaries, and others)” (p. 5). Importantly, in making

    the distinction between the two types of paratext, Genette acknowledges that on a

    finer grained scale than that of a book as a container for a text, things like indices or

    interviews can stand as texts on their own. He concedes that “Most often, then, the

    paratext is itself a text: if it is not the text, it is already some text” (p. 7).

    This distinction focuses attention on three important points that will inform the rest

    of this thesis. The first is the idea that depending on the level of granularity chosen,

    something can always be either paratext, or a text in and of itself. This idea that a

    text can occupy multiple roles at once will become more apparent in the forthcoming

    discussion of Weinberger’s (2007a) ordering systems. The second point, which Genette

    makes explicit, is that paratext’s “existence alone, if made known to the public,

    provides some commentary on the text and influences how the text is received” (p. 7).

    Finally, Genette writes that there are “implicit contexts that surround a work and, to

    a greater or lesser degree, modify its significance” (p. 7).

    Taken together with Sutherland, McKenzie, and Tanselle’s characterizations of texts

    as contextually specific objects or ideas, these three points highlight how problematic

    texts are as objects of study. For example extracting a discrete component of a larger

    containing text entails that a new text is created or can stand on its own. That new

    text itself can then be contextualized, either in terms of another larger text, or in

    terms of other external texts. Though not writing about books per se, Gibson (?) puts

    it best when he writes that “the unit you choose for describing the environment

    depends on the level of the environment you choose to describe” (p.9).

    This idea of re-contextualizing discrete texts or textual elements warrants a

    discussion about categorization and ordering systems, which follows. This section will

    demonstrate the crucial importance of understanding categories and the manner

    in which objects (including, but not limited to, texts) are organized. Fundamentally,

    this section relies on the premise that an object can have some kind of easily

    encapsulated identity. Whether or not this is truly the case is an unresolvable debate,

    but this section will proceed under the assumption that just as pieces of texts like

    paratextual elements can stand on their own as textual objects, so too can any

    discrete component of any object or set of objects.

    2.2 Categorization and Ordering

    The frontiers of a book are never clear-cut: beyond the title, the first lines, and the last full stop, beyond its internal configuration and its autonomous form, it is caught up in a system of references to other books, other texts, other sentences: it is a node within a network (Foucault, 1972, p. 23).

    According to Foucault then, and with the aforementioned discussion about texts,

    contexts, and granularity in mind, a material book or article is a paradox. It is at

    once both constrained and limited to its predefined physical location on a library

    shelf, while at the same time existing in a network of relations to other similar or

    referenced texts, ideas, and authors. This thesis attempts to interrogate both the

    material and abstract nature of texts by employing metadata systems to explore the

    relationships between any text’s material and digital forms. One aim here is to

    demonstrate the existence of a rich set of latent relationships between the nodes—like

    words, paragraphs, and chapters—that make up a text and another text or texts that

    are not demonstrated explicitly by systems like library shelves or online catalogues.

    Consider Foucault’s description of a book that begins this section—it is a dramatic

    explanation of what a book is that goes beyond the material form that any particular

    copy might inhabit. Foucault’s description situates the idea of “book” as a tool that,

    while having material form and properties, can be understood as a tool to interrogate

    the intellectual terrain and practices that surround the very creation of knowledge

    itself.

    This is a different approach to that of the Book History scholars cited in the previous

    section in that the book or text is treated as a tool or means to investigating a deeper

    set of questions than what a book is or is not, or what meanings it might provoke in

    the reader. Foucault goes on to problematize the notion of an author’s oeuvre,

    pointing out that the act of collecting and categorizing works into an oeuvre is both a

    concealment and manifestation of some set of ideas that cannot encompass the

    entirety of an author’s thoughts or the entire discourse about any particular idea.

    In both cases, Foucault’s primary concern is with the idea that a text or set of works

    has some kind of abstract characteristic that allows it to stand outside the systems that

    produced it in the first place. By situating a text’s contents outside of a discussion

    about material form or bibliographic textuality to explain what a book is and does,

    Foucault can point to the interrelations that every text has with power and ordering

    systems like language, authorship, and scholarly practice that collectively work to

    comprise any particular text.

    This idea that books exist in networks of relationships will prove useful in the latter

    portion of this chapter that discusses Latent Semantic Analysis as a means of

    discovering links between books (or articles) in a network of texts. Moreover, the idea

    of a book existing simultaneously in a network of other texts, and as a codex on a

    library shelf finds a 1-to-1 equivalence in the formulation of objects as both digital

    and material at the same time.

    In addition to Foucault’s argument that books function as nodes in networks of

    discourse and power systems, Weinberger (2007a) also points out that “physical

    limitations on how we have organized information have not only limited our vision,

    they have also given the people who control the organization of information more

    power than those who create the information.... because they get to decide what to

    bring to the surface and what to ignore” (p. 89). Weinberger argues that

    physical constraints are a limiting factor in organizing material information like

    books, but his point also applies to the more abstract information systems that he

    discusses.

    2.2.1 First, Second, Third Order Organization Systems

    Weinberger proposes a hierarchy of ordering systems comprised of first, second, and

    third order constraints. Figure 2.2 is a visual representation of his different types of

    orders. For example, Weinberger defines putting books on shelves or silverware into

    drawers (to use his example) as first order systems. The only criteria are that they

    are material based and that they add order to the world. For Weinberger, consisting

    of and arranging atoms is all that is necessary for an organization system to be a

    first order system.

    The switch to a second order system occurs the moment that metadata is created

    about physically arranged objects. According to Weinberger, said objects can be

    organized according to an abstract classification scheme. An example would be

    library shelves (first order) holding items that are organized by Dewey Decimal or

    Library of Congress classification schemes. The scheme itself says little about the

    material object in question; it only provides an abstract label or index characteristic

    that can be used to group many similar objects (books in this case) on the same shelf

    or set of shelves. Weinberger points out a crucially important aspect of second order

    systems: they are necessarily lossy systems. A second order system gives less

    information about the object in question than the object itself contains. Indeed,

    Weinberger notes that first and second order archives cannot “know” everything that

    they contain (2007a, p. 19).

    Note however that as the section on co-citation and Latent Semantic Analysis will

    demonstrate, this lossiness is not necessarily a problem. Indeed, those methods are

    useful for revealing relationships between documents in a network (per Foucault’s

    characterization) that might not otherwise be apparent. In this respect then, those

    methods are ideal for examining material objects and their second-order relationships.

    Finally, Weinberger’s third order systems include digital repositories like Flickr (a

    photo sharing website), which rely on individual users to categorize and label their

    submissions with metadata. The metadata that users generate does not need to have

    any relationship to other items in a collection. The key is that with third order

    systems, categorization can be done by an individual or individuals, and can be

    deferred until the “last possible moment”5, or is done “on the fly” based on

    Figure 2.2: Weinberger’s Three Orders

    combinations of various pieces of metadata, again supplied by an individual or

    individuals. Weinberger calls this third order ordering “miscellaneous order” and

    notes that “Traditional authorities cannot maintain themselves by insisting that we

    have to go to them.... It is changing how we think the world itself is organized

    and—perhaps more important—who we think has the authority to tell us so” (2007a,

    p. 23).
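
    The contrast between second and third order systems described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item names, call numbers, users, and tags are invented placeholders, and the functions `shelf`, `tag`, and `find` are hypothetical helpers rather than part of any system discussed in this thesis. The second order catalogue assigns one label per item in advance, while the third order tag store lets groupings be assembled at the moment of the query.

```python
from collections import defaultdict

# Second order: one authoritative label per item, assigned in advance.
# The call numbers are illustrative placeholders, not real catalogue data.
catalogue = {
    "book-a": "025.3",
    "book-b": "025.3",
    "book-c": "770",
}


def shelf(call_number):
    """Return every item filed under a single, fixed class."""
    return sorted(item for item, label in catalogue.items() if label == call_number)


# Third order: any user may attach any tag at any time; nothing is filed in advance.
tags = defaultdict(set)  # item -> set of (user, tag) pairs


def tag(user, item, label):
    """Record that a user attached a label to an item."""
    tags[item].add((user, label))


def find(*labels):
    """Group items on the fly: keep those carrying every requested tag."""
    wanted = set(labels)
    return sorted(
        item
        for item, pairs in tags.items()
        if wanted <= {label for _, label in pairs}
    )


tag("alice", "book-a", "metadata")
tag("bob", "book-a", "rfid")
tag("bob", "book-c", "metadata")
tag("alice", "book-c", "photography")

print(shelf("025.3"))            # grouping fixed by the scheme
print(find("metadata"))          # grouping decided at query time
print(find("metadata", "rfid"))  # a different grouping from the same tags
```

    In this sketch "book-a" and "book-c" sit on different shelves in the second order catalogue, yet a tag query can still bring them together, which is the deferral of categorization that the paragraph above attributes to third order systems.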

    The idea that these “orders” can overlay each other is a fundamental assumption that

    underlies this entire thesis. As Kirschenbaum (2005) points out, digital objects are

    material objects (magnetic bits or electrons) that are just operated on abstractly

    because they can be treated as formally identical despite being physically different,

    or even existing in different locations. Kirschenbaum writes that:

    this conundrum becomes the methodological lever with which to pry open the relentless symbolic cascade of computation and understand what is unique about computers as writing technologies: that they are material machines dedicated to propagating a behavioral illusion, or call it a working model, of immateriality (2005, p. 5).

    Thus while material objects like books are constrained by things like editorial

    processes, distribution requirements, or shelving space and categorization systems,

    digital objects also exist in the material world, and are also subject to material

    organizational requirements. A hard drive with bits, organized into tracks and sectors,

    or even individual bits is a first-order system. A file-system on said hard drive with a

    metadata volume describing where to find specific groups of bits (files), and how files

    are spread out across the entire hard drive is a second order system that necessarily

    does not capture every piece of information about the underlying first order.
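
    The hard drive example can be made concrete with a toy model. The following is a minimal sketch, not a description of any real file-system: the sector size, the `disk` byte array, the `file_table` dictionary, and both helper functions are assumptions made purely for illustration. It shows a second order table that reassembles files correctly while remaining unaware of bytes that are still physically present in the first order arrangement.

```python
SECTOR_SIZE = 8  # bytes per sector; an arbitrary toy value

# First order: the raw arrangement of bytes on the medium.
disk = bytearray(SECTOR_SIZE * 4)

# Second order: a table recording which sectors hold each file and its length.
file_table = {}


def write_file(name, data, sectors):
    """Scatter data across the given sectors and record where it went."""
    if len(data) > len(sectors) * SECTOR_SIZE:
        raise ValueError("not enough sectors for this file")
    for i, sector in enumerate(sectors):
        chunk = data[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE]
        start = sector * SECTOR_SIZE
        disk[start:start + len(chunk)] = chunk
    file_table[name] = {"sectors": list(sectors), "length": len(data)}


def read_file(name):
    """Reassemble a file using only what the table knows about it."""
    entry = file_table[name]
    raw = b"".join(
        bytes(disk[s * SECTOR_SIZE:(s + 1) * SECTOR_SIZE]) for s in entry["sectors"]
    )
    return raw[:entry["length"]]


write_file("old.txt", b"A" * 16, [3, 1])  # fills sectors 3 and 1, out of order
write_file("new.txt", b"thesis", [3])     # reuses sector 3
del file_table["old.txt"]                 # removes the table entry only

print(read_file("new.txt"))                          # the table's view
print(bytes(disk[3 * SECTOR_SIZE:4 * SECTOR_SIZE]))  # first order view of sector 3
print(bytes(disk[1 * SECTOR_SIZE:2 * SECTOR_SIZE]))  # sector 1, unknown to the table
```

    After the old entry is deleted, the table reports only the six bytes of the new file, while leftover bytes from the old file remain in sectors 1 and 3. The second order description is therefore lossy with respect to the first order arrangement it sits on.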

    A set of user defined directories and files that an individual computer user creates

    and stores might appear to be a third-order system built on top of the first two

    5 As Steve Hockema notes, this builds in an assumption that there is an “ultimate moment” when something is found, and a penultimate moment (the last possible one) in which categorization can happen. On a more basic level, the assumption that things are categorized in order to be found later is problematic in that it ignores other possible reasons for categorizing things.

    systems, but the idea of a file appearing to exist inside a single directory inside

    another single directory (symbolic links notwithstanding) is strongly rooted in a first

    and second-order conceptual model.

    Bolter and Grusin’s (2000) notion of “remediation” clarifies how the user defined files

    and directories portion of this file-system example straddles the boundaries between

    second and third-orders. They argue that “the logic of immediacy dictates that the

    medium itself should disappear and leave us in the presence of the thing represented”

    (p. 6). Where that thing to be represented is a document on a hard drive, representing

    it as a file draws upon the old technologies of paper files, file folders, and filing

    cabinets, and propagates the illusion that there is a seamless transition from material files

    (paper), to material files (bits), to representation on a screen.

    Weinberger’s schema of ordering systems is not without its flaws. For example,

    Weinberger does not give enough consideration to how third order systems are

    afforded or built upon second and first order systems via remediation per the

    discussion of Kirschenbaum (2005) and Bolter and Grusin (2000). Moreover,

    Weinberger arbitrarily limits third order systems to the realm of digital objects. The

    RFID enabled surface that was to be designed for this thesis (which will be discussed

    in Chapter 3) would be an example of a first order system that does many of the

    things that a third order system does.

    Indeed, at its core, Weinberger’s real point is that digitality simply allows leaving

    objects loosely categorized until they are referred to or called upon for use. To frame

    Weinberger’s system in terms of the already noted quote from Tanselle about

    electronic objects (2001, p.39), all that Weinberger really does is note that digital

    items afford being quickly categorized and as such, categorizing can be done before,

    during, or after a digital object is created. But there is no reason that the same does

    not apply to material objects, as the RFID surface example will demonstrate; it is just

    that digital forms make it easier.

    The point is, if Weinberger’s schema is to be adopted for this thesis, third-order

    systems are not entirely unproblematic. For example, according to Weinberger, a user

    does not need to understand or care about the first two underlying systems to be able

    to use the third order system. However, as Kirschenbaum points out, that model is

    illusory since it is predicated on two very strict underlying organizational models. As

    such, combining Kirschenbaum’s more ontologically grounded discussion of what a

    digital object physically is with Weinberger’s point that physical limitations restrict

    what can be done with information means that Weinberger’s critique holds for

    digital objects as well. As this thesis will demonstrate, the source of third order

    information is just as problematic as the structure and arrangement of first and

    second order information.

    Notwithstanding the drawbacks to Weinberger’s system, it is nevertheless very useful

    because it allows discussing hybrid material and digital objects in a particular

    manner. Indeed, stretching Norman’s use of the word ‘affordance’, and despite

    Weinberger’s system not being an object, the idea that Weinberger’s ordering system

    affords discussing hybrid material/digital objects is important, a point which will be

    discussed in the upcoming section on Ontology and Intentionality.

    Finally, Foucault’s description of books as nodes in larger networks allows for a closer

    reading of Weinberger’s criticism. Weinberger’s description of how systems limit

    access to information and grant control over information to those who curate rather

    than create it raises the question: how can those points of control be better

    understood or worked around? In other words, how can Foucault’s notion of a text as

    a system of nodes and networks of knowledge and power relations be realized in

    material or digital form in such a way as to either circumvent, or at least draw

    attention to a) the linkages between nodes that Foucault describes, and b) the

    Foucauldian centres of control that exist in any of the networks or levels of order per

    5 Norman’s use itself is an appropriation of James J. Gibson’s, who originally coined the term; see Chapter 3 of Gibson (1986).

    Weinberger’s characterization of ordering systems?

    2.2.2 Ordering Systems Discussion

    The goal through all this is still ultimately that of describing how ordering and

    categorizing material and digital objects points to the larger issue of what it means

    for such objects to exist and be used. As Levy so eloquently writes, “It’s a curious

    thing about documents: you can’t see them if you don’t look at them; but you also

    can’t see them if you look only at them, ignoring the surroundings in which they

    operate” (2001, p. 29). The broader perspective that Levy endorses of looking beyond

    only books or specific types of documents is important and is one of the reasons that

    Weinberger’s ordering framework is so useful.

    Indeed, one of the key reasons to choose Weinberger’s ordering scheme is its

    simplicity and ability to describe the contextual surroundings of documents and

    objects in general. While this thesis specifically discusses texts in various forms, the

    idea that any object or system can exist in first, second or third-order systems is a

    broader point that this thesis attempts to explore. For example, while the RFID system

    that will be described in Chapter 3 is built around specifically linking books to digital

    metadata, the existence of sites like touchtag.com that are built around linking RFID

    tags embedded in objects to online data makes the usefulness of Weinberger’s

    scheme more apparent. The touchtag.com website gives numerous examples of

    linking many types of objects to online metadata. Two examples are: “Link[ing]

    souvenirs to the online photo albums” (2010b, np.) or linking “collectables directly to

    online information” (2010a, np.).
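
    The kind of linking described above, from a tag embedded in an object to an online record, can be sketched as a two-step lookup. Everything named here is an assumption for illustration: the tag UID, the `tag_to_isbn` and `metadata_by_isbn` dictionaries, and the placeholder record are hypothetical, and the local dictionary merely stands in for an online metadata source. The only part taken from an external standard is the ISBN-13 check digit rule, in which digit weights alternate between 1 and 3 and the weighted sum must be divisible by 10.

```python
def isbn13_is_valid(isbn):
    """Check the ISBN-13 check digit (weights alternate 1, 3 across the digits)."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0


# Hypothetical tag UIDs mapped to the identifier of the tagged object.
tag_to_isbn = {
    "04:A2:19:5B": "978-0-306-40615-7",
}

# Stand-in for an online metadata source; the record is a placeholder.
metadata_by_isbn = {
    "9780306406157": {"title": "Example title", "tags": ["example"]},
}


def lookup(tag_uid):
    """Resolve a scanned tag to a metadata record, or None if it cannot be resolved."""
    isbn = tag_to_isbn.get(tag_uid)
    if isbn is None or not isbn13_is_valid(isbn):
        return None
    key = "".join(c for c in isbn if c.isdigit())
    return metadata_by_isbn.get(key)


print(lookup("04:A2:19:5B"))
print(lookup("unknown-tag"))
```

    The material object carries only the tag, and everything else is resolved through the identifier at the moment of scanning. In this sketch, that is where a first order object is joined to third order metadata.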

    These are broader examples of just two possible applications that arise when material

    and digital objects or collections can be linked together. Building a similar system to

    bridge the gaps between material and digital orders for books in Chapter 3 allows for

    a meaningful discussion of these linking and embedding practices. The usefulness of

    Weinberger’s divisions is that they help differentiate between types of systems. Thus

    instead of building applications using RFID to do things like link photo albums to

    material objects, Weinberger’s divisions lead to meaningful points of departure into

    a broader discussion about bridging material and digital systems instead of glossing

    over the divide, or focusing exclusively on material, abstract, or digital systems.

    As a part of that broader discussion, it is appropriate to discuss ontology and

    intentionality here, since these ideas have until now only been mentioned in passing.

    Given the discussion in the previous section about books and texts, and in this section

    about the nature of digital objects, it seems only appropriate to begin the following

    section on ontology with a relevant quote from Tanselle.

    2.3 Ontology and Intentionality

    Printed and electronic renderings are thus not ontologically different; they may be made of different physical materials, but the conceptual status of the texts in each case is identical. The philosophical conundrum as to where texts reside is exactly the same as it always was (2006, np.).

    As a key point of entry into the discussion about the nature of hybrid material and

    digital objects that have embedded within themselves their own metadata, Tanselle’s

    point is a good start. He notes that the particular renderings of any given text are

    just that, renderings, which can take many forms but all fundamentally point to the

    same text. However, the final sentence in Tanselle’s formulation avoids delving into a

    much larger body of work on exactly the problem of what objects (not just texts) are

    and where they reside.

    For example, Smith (2002) attempts to succinctly describe objects in the following

    manner, writing that “To be an object is to be a patch of the world that is successfully

    abstracted.... The fundamental character of (what it is to be) an object is thus

    intrinsically hooked into the intentional life practices of the objectifying subject” (p.

    241).

    Immediately then Smith’s characterization of objects requires three things. First,

    there is a requirement that something be a “successfully abstracted” part of the

    world. There is no specific endorsement of either material or digital media on Smith’s

    part. Indeed, as Kirschenbaum’s (2005) analysis demonstrates, such a distinction does

    not hold because at some level, all digital objects are material. However, the key to

    that portion of Smith’s formulation is abstraction, in that it allows for

    Kirschenbaum’s formalism, and Tanselle’s abstract idea that a “text” is the same no

    matter what medium is used to represent it.

    The other key piece of Smith’s formulation is the idea that intentional practices

    stabilize objects. This idea is similar to what was discussed earlier with the library

    receipt in Figure 1.1. In that example, part of the reason that conceptualizing the

    books as objects is useful is because of the inherent intentionality (in the

    philosophical sense of aboutness or directedness (Jacob, 2010)) that is evidenced by the

    receipt, from the overlapping perspectives of the patron and the library as an

    institution. Both take the receipt to refer to a transaction involving discrete books as

    objects, and this unit is reinforced by the information systems that catalog and

    enumerate the books involved in the transaction. But what of the status of the receipt

    itself? It is hard to dispute that this is an object as well (given, for example, the use

    of it in the library system as well as in this thesis). But it seems a special sort of object

    that also embodies and represents “metadata” about other objects, and the significance

    and chunking of this metadata differs based on the practicing subject.

    Interestingly enough, discussing the ‘objectness’ of the receipt itself as a piece of, or

    pieces of metadata that exist based on differing intentionalities raises a number of

    questions about the object itself. Embedding such a receipt in a book, as was

    previously discussed, raises even more questions when viewed as a means of

    expressing intentionality related to an object. Note that the word used here is

    intentionality, and not agency, per Smith’s quote above. It is useful to refer to

    Searle’s (1983) usage of the word Intentionality, since he discusses it in a clear and

    useful manner.

    Searle first notes that Intentionality is not an ordinary relationship “like sitting on

    top of something” (p.4). Instead, for Searle “Intentional states represent objects and

    states of affairs in the same sense of ‘represent’ that speech acts represent objects

    and states of affairs” (p.5). A discussion of his entire book and exploration of the topic

    is beyond the scope of this thesis, but the key point to take away from Searle is that

    Intentionality is a term that is used to describe the beliefs of a subject about an

    object. Thus to have Intentionality about an object for Searle is to be committed to a

    belief about something that can be represented by using an object in a particular way.

    Consider an example of how books and articles are discussed within the scholarly

    publishing community. Miller and Harris (2009) describe how individual scholars,

    editors, publishers, and subscribers have conflicting agendas where any work is

    involved. They describe the intent of publications for scientific researchers as key to

    gaining credentials for academic survival (p.13); they describe editors as intending to

    “maintain and improve the quality of the journals they serve” (p.14); they describe

    publishers as intending to make money, with all other intents concomitant to that

    main consideration; finally, they describe the intent of universities as “providing the

    reference materials necessary to support the missions of the university” (p.17).

    This example demonstrates how something like a single article allows for multiple

    dramatically different goals to be expressed or represented by such an object, all

    within a very specific context of academic publishing. Each actor in the relationship

    has their own set of uses for scientific scholarly articles, which do not necessarily

    complement the others’. But the point is, to each participant, a given article might be

    conceptualized in completely different ways based on how it can be put to

    use—gaining credentials, gaining readership, monetary gain, or institutional mission

    fulfillment respectively.

    Going a step further, combining Searle’s (1983) ideas about intentionality with

    Norman’s (1988) ideas about affordances gives a clear picture of how intentionality

    can be expressed over an object. An object like an article can be designed such that it

    affords scholars to easily read it, editors to easily publish it, publishers to easily sell it,

    and universities to easily purchase it. Each participant can work with the affordances

    of an article’s form, content, and portability to fulfill their own particular goals.

    At the same time, intentionality informs the design and functionality of objects. The

    two work in tandem with each other to inform how objects can be used. In so doing,

    what an object affords in terms of different intended uses can inform what that object

    is when it is used. Indeed, as was noted earlier, the discussion about the nature of

    books as a work or text demonstrates how the multiple notions of a book as an object

    still allow different groups of scholars to (mostly) successfully interact with

    books despite their differing scholarly intentions.

    This description is admittedly a simplistic formulation of a set of very difficult and

    contested problems that philosophers constantly struggle to define and resolve.

    However, it is crucial to bear these ideas in mind through the course of this thesis,

    since the attempt here is to demonstrate how embedding metadata into objects, be

    they material or digital, not only affects the successful abstraction of some part of

    the world, but also entails a difference in terms of what an object is. This argument is

    based on both a hypothetical subject’s intent of using such an object, with the new

    affordances provided by the embedded metadata, and the ways the metadata affects

    the dynamic tension associated with how the object is registered and stabilized within

    the system of stakeholders who all treat an object as an abstracted “patch” of the

    world in their own ways.

    One other important tool to understand the ontological issues surrounding hybrid

    material/digital metadata objects, then, is to understand how linkages between nodes

    function to either obscure or reveal boundaries between multiple ordering systems.

    Thus another level of abstraction from material and digital objects to wholly abstract

    objects is necessary. Accordingly, Star and Griesemer’s (1989) notion of “boundary

    objects” is a useful conceptual tool that will be described in the following section.

    2.4 Boundary Objects

    As a tool for understanding the interplay between books, nodes, networks, and control

    points over information, the idea that things can be boundary objects is a crucial

    piece that ties all these ideas about ontology, power, categorization, and ordering

    systems together. Star and Griesemer explain boundary objects as inhabiting multiple

    social worlds while at the same time satisfying the informational requirements

    of each of them (1989, p. 393). According to Star and Griesemer, boundary objects

    can be either concrete or abstract objects that allow reuse in different contexts and

    locations, but maintain an identity despite their varied uses. The authors describe

    four types of boundary object, which follow:

    1. “Repositories. These are ‘ordered’ piles of objects which are indexed in a standardized fashion. Repositories are built to deal with problems of heterogeneity caused by differences in unit of analysis. An example of a repository is a library or museum. It has the advantage of modularity. People from different worlds can use or borrow from the ‘pile’ for their own purposes without having directly to negotiate differences in purpose” (1989, p. 410).

    2. Ideal type. An object such as a diagram, atlas, or classifier that is abstracted from

    its domain, which does not describe any one item but is instead vague enough to

    be adaptable to multiple sites of use. The example that Star and Griesemer use

    is the term “species”.

    3. Coincident boundaries. Star and Griesemer describe these boundary objects as

    having the same external boundaries, but different internal contents. The

    example they give is of separate maps of the state of California, with one map

    displaying typical road-map like features, while another map of the same state

    shows highly abstract ecological zones. In this case, the state itself is the

    coincident boundary that is used for different purposes.

    4. Standardized forms. These objects are described as “methods of common

    communication across dispersed work groups.... The results of this type of

    boundary object are standardized indexes” (1989, p. 411). Star and Griesemer go

    on to describe standardized forms as useful for transmitting objects over

    distances without losing or changing information, such that any local

    uncertainties are ‘deleted’ to use their term.

    For the purposes of this thesis, repositories, ideal types, and standardized forms are

    the boundary object types that will be used. As Star and Griesemer point out, libraries

    are a specific example of a repository. Just as repositories are designed to deal with

    differing units of analysis, here the unit of analysis will be shifted. Instead of

    focusing on libraries as repositories, this thesis treats books as repositories,

    specifically of ideas that can be used for multiple purposes, as Star and Griesemer

    point out.

    Additionally, in terms of ideal types, “books” taken abstractly are an ideal example of

    the boundary object type. Foucault’s characterization is explicit in describing books as

    nodes - essentially he treats books abstractly as ideal objects in order to adapt the

    idea of a book to his overall discussion of knowledge. This lack of distinction between

    objects and abstract ideas as boundary objects is useful. It meshes together objects

    that are treated immaterially (Kirschenbaum’s illusion of immateriality and

    Weinberger’s third-order systems) with material objects like actual “books” and helps

    defer the previously noted work, text, representation discussion until the conclusion

    of this thesis.

    2.5 Discussion

    As an initial attempt at building boundary objects to explore and discuss the research

    questions stated at the beginning of this chapter, the Critical Making Lab at the

    Faculty of Information, University of Toronto, was used to build a prototype RFID

    system. The system uses RFID tags to link books to their online metadata via a user’s

    web browser. As the preceding section mentioned, the method used to build the

    prototype system was Critical Making, wherein the object that is designed (the RFID

    system) is not the actual focus of the thesis.
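
    The linking step described above can be sketched in a few lines of Python. This is
    only an illustrative sketch, not the prototype itself: the tag identifier, the lookup
    table, the ISBN (a commonly used sample number), and the example.org URL pattern are
    all assumptions made for the example.

```python
import webbrowser

# Hypothetical lookup table: RFID tag UID (hex string) -> ISBN.
# Both values are placeholders invented for this sketch.
TAG_TO_ISBN = {
    "04A1B2C3": "9780306406157",
}

# Hypothetical URL pattern for a metadata page; not a real service.
METADATA_URL = "https://example.org/isbn/{isbn}"


def metadata_url(tag_uid, table=TAG_TO_ISBN):
    """Return the metadata URL for a scanned tag, or None if the tag is unknown."""
    isbn = table.get(tag_uid.upper())
    if isbn is None:
        return None
    return METADATA_URL.format(isbn=isbn)


def open_metadata(tag_uid):
    """Open the metadata page for a tag in the user's default web browser."""
    url = metadata_url(tag_uid)
    if url is None:
        return False
    return webbrowser.open(url)
```

    In this sketch the tag carries nothing but an identifier; everything that describes
    the book lives elsewhere and is reached through the browser.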

    Instead, the RFID antenna and the decisions that were made in designing the system

    serve as material and technological lenses through which to understand the broader

    issues of object identity and ontology, categorization systems, and the individual and

    institutional practices that underlie each. Two other systems were designed based on

    building the RFID system, metamash.org and doitag.org, which will be discussed later

    in this thesis.

    Note that the lack of precision or accuracy in cataloging a text is not in itself a

    problem that this thesis attempts to discuss or resolve. Indeed, given sufficient

    knowledge of how a library is laid out according to Library of Congress or Dewey

    Decimal systems, or any other for that matter—see Chapter 6 for an interesting

    example of an alternative classification system—people can and do find the items that

    they seek in libraries.

    The problem put forward here is that it is difficult on individual, social, and even

    institutional levels to exercise agency over institutions like the local library branch or

    the Library of Congress, which categorize items according to their internal needs. For

    example, an outside practitioner would have an extremely difficult time explaining

    and changing the Library of Congress subject headings for a book like Archive Fever

    (Derrida, 1996) to include “Archives” (cf. the existing headings: 1. Memory

    (Philosophy) 2. Psychoanalysis 3. Freud, Sigmund, 1856-1939).

    Even here, the problem is still not quite so clear. It is not that there is a tension

    between objective and subjective classifications, readings, or descriptions of a text;

    the problem lies elsewhere and encompasses Weinberger’s problematic power dynamic

    and Foucault’s distributed network. The issue is one of scale and abstraction of

    objects and the methods used to organize them. As the discussion about texts and

    paratexts revealed, objects simultaneously inhabit multiple local material and

    abstract distributed contexts, on multiple scales of granularity. It is fairly clear that

    this is the case for books, but it also highlights the looming question of whether or how

    this applies to hybrid material and digital objects.

    For example, metamash.org uses tags from LibraryThing that are all submitted to

    that site6 by individuals based on their individual book collections. At the same time,

    metamash.org also uses bibliographic data from WorldCat7, which is a site built

    around multiple libraries sharing their catalogue data about books. A book entered

    into metamash.org becomes a boundary object between these two systems and in so

    doing can highlight the tensions between individual and institutional uses and

    sources of data. More importantly, as the RFID system will demonstrate, a book with

    an embedded RFID tag can also straddle first and third order ordering systems and in

    so doing, interrogate what it means for such an object to exist in the first place.
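
    To make the idea of a book sitting between individual and institutional sources
    concrete, the following Python sketch holds both kinds of description in one record
    and reports which user tags have no counterpart in the institutional headings. It is
    a minimal illustration under stated assumptions, not the metamash.org code: the
    class, the matching rule (a simple substring test), and the user tags are invented
    for the example, while the subject headings are those quoted earlier for Archive
    Fever.

```python
from dataclasses import dataclass, field


@dataclass
class BookRecord:
    """One book described by an institutional source and by individual users."""
    title: str
    subject_headings: list = field(default_factory=list)  # institutional source
    user_tags: list = field(default_factory=list)          # individual source

    def unmatched_tags(self):
        """Return user tags that appear in none of the subject headings."""
        headings = [heading.lower() for heading in self.subject_headings]
        return [tag for tag in self.user_tags
                if not any(tag.lower() in heading for heading in headings)]


record = BookRecord(
    title="Archive Fever",
    subject_headings=[
        "Memory (Philosophy)",
        "Psychoanalysis",
        "Freud, Sigmund, 1856-1939",
    ],
    user_tags=["archives", "memory", "Freud"],  # hypothetical tags for illustration
)

print(record.unmatched_tags())
```

    The tags left over by such a comparison mark the places where individual readings
    and institutional classification part ways.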

    Additionally, books are ideal boundary objects to begin attempting to resolve across

    scales and practices, since as Star and Griesemer point out, repository boundary

    objects are those that explicitly straddle different units of analysis or abstraction

    (1989, p. 410). Instead of focusing on libraries per Star and Griesemer’s example,

    books are finer grained units of analysis that are functionally similar to libraries in

    that they are points of negotiation between different social worlds.

    Whereas the Library of Congress can bring its full authority and credibility to bear

    on the abstract and necessarily limited classification of Archive Fever, it does not

    account for the multiplicity of specific, local, and contextual readings that individual

    readers, or even groups of readers, or other institutions might derive from the book.

    Indeed for this thesis, the primary subject heading for Archive Fever could easily be

    Archives instead of Memory. Thus the material book is an information resource that

    6 http://librarything.org; 7 http://worldcat.org

    is cited in this individual thesis for a particular purpose; at the same time, the book

    exists in an institutional catalogue that has different objectives than this thesis. Thus

    the volume itself is a point of negotiation between individual and institutional needs,

    never mind the possible different interpretations of the ideas it contains.

    Examining the discrete nature of books and articles by building the RFID system,

    metamash.org, and doitag.org, all of which are described in detail in Chapter 3, will

    better characterize the tension between institutional metadata systems and individual

    agency over how those systems are used. For each system, there are multiple sources

    of information (Foucault’s nodes and points of control), each system sits on the

    boundary between ordering systems (Weinberger), and each system both is and outputs

    boundary objects that fit into either repository or ideal type categories, which

    highlight both the Foucauldian and Weinbergerian aspects of each system.

    2.6 Part 1 Summary

    This chapter serves as an introduction to a number of key ideas that inform the rest

    of this thesis. First is that storing metadata about a text or set of texts can be used

    to discover relationships between texts based on the agency of an unknown but

    intelligent person(s). Second is that any text can be broken down into smaller

    components than the containing form.

    Next is Foucault’s notion of the book as a node situated in an abstract network of

    other books and nodes like institutions, authors, and readers. Crucially important is

    Foucault’s idea that the scale or unit of analysis for any network can extend to

    individual words and sentences, or beyond the material form of a book to the

    discourses in which it exists. This idea that objects can be understood as components

    in interrelated scaleless networks is fundamental to the rest of this thesis.

    Another important idea, which is complementary to Foucault’s, is Weinberger’s

    formulation of first, second, and third order organization systems. The characteristics

    of each underlie the RFID system, metamash.org, and doitag.org. More importantly,

    the idea that each type of system can be overlaid on top of the preceding systems

    informs the design of all three metadata systems that are discussed in this thesis.

    Foucault’s idea about scale of analysis also serves as a useful conceptual tool with

    which to interrogate the power relations embedded in the joints between ordering

    systems.

    Finally, Star and Griesemer’s characterization of boundary objects is another useful

    conceptual tool. Each of the objects used or created by the RFID, metamash.org, and

    doitag.org systems is a boundary object and can be used to interrogate joints between

    ordering systems, points of control over information, sources of said information, and

    communities of interest and practice that surround each object or set of objects.

    Deploying each of these ideas in conjunction with the others via Critical Making

    raises more questions than it answers about identification and categorization.

    However, the process of designing and building the systems that will be discussed in

    this thesis draws attention to a boundary region between material and digital objects

    and related practices that warrants in-depth examination.

    2.7 Background: Every Book is a Problem

    Standards, categories, technologies, and phenomenology are increasingly converging in large-scale information infrastructure.... this convergence poses both political and ethical questions (Bowker & Star, 1999, p. 47).

    2.7.1 Introduction

    The convergence of various types of systems into large scale infrastructures that

    Bowker and Star describe in the preceding quote can be understood in one way as

    effacing some of the gaps between ordering systems. For example, despite the

    criticisms of second order metadata systems discussed in the Introduction of this

    thesis, it is appropriate to point out that the Library of Congress metadata for Archive

    Fever was organized enough that it was suitably catalogued and located exactly

    where it was supposed to be within the University of Toronto’s library system. This is

    no small feat indeed considering that the University of Toronto library system

    contained 18,985,932 items in April 2009 (2009).

    The ability of any user with enough knowledge of the organizational schema (Library

    of Congress in this case) to place, store, cross-reference, and locate a single volume

    within such an immense system bears testament to the fact that however large,

    difficult, inefficient, or problematic library metadata systems may or may not be, the

    University of Toronto system (at least) worked as intended for the particular text in

    question. In other words, the large scale first and second order UTL systems and the

    second order Library of Congress system work well together to achieve one particular

    user oriented goal - finding a book on a library shelf.

    However, in searching for Archive Fever, the seed idea for this thesis arose. What

    might happen if users could embed their own metadata into material books on

    shelves? One need look no further than the folded pages of a library book or

    handwritten marginalia to see that the very medium of a material book all but

    encourages users to leave their thoughts and notes embedded in a book itself. Thus,

    the initial attempt of this thesis was to build a prototype RFID system that could

    enable embedding digital user generated metadata into material library books,

    essentially overlaying a third order system on top of first and second order systems,

    without altering the underlying ordering infrastructures. The goal was to augment

    existing library catalogue systems in order to question the standards, categories, and

    technologies governing such systems, in a manner similar to that of Bowker and

    Star whose quote opens this chapter.

    By using RFID to create boundary objects, the points of negotiation between parties

    and between Weinberger’s first and second order ordering systems, and second and

    third order ordering systems become slightly more apparent. While augmenting

    many institutional systems (libraries, supply chains, international travel and the

    like), using RFID in this thesis to interrogate the transition points between levels of

    ordering reveals that the needs of institutions take precedence over those of

    individuals. As such, to understand why RFID, with its largely transactional nature,

    is important to this thesis, some background information is required.

    2.7.2 RFID

    One particularly important augmentation of books started with the invention of

    barcodes in 1949 (Wikipedia, 2010a). Historically however, Mai (2003) points out that

    13th Century monks built a shared catalogue of 183 English monastery libraries to

    keep track of items, so augmenting books with second-order metadata is not a new

    practice by any means. The Online Computer Library Center (OCLC) notes that Dewey

    Decimal has been around since the 1870s (2010, np.). However, the barcode

    specifically marks a turning point for books because it represents McLuhan’s

    “stepping-up of speed from the mechanical [e.g. punch cards] to the instant electric

    form” (1964, p. 47). In this case, the electric form resides in pulse-width modulated

    signals as read from barcodes applied to books. Carrying that notion of an instant

    electric form forward from barcodes to the present, myriad systems exist that operate

    on the same basic principle of using digital signals to quickly identify objects. RFID

    tags comprise one such system, which informs the rest of this thesis.

    The foundational theoretical and applied research paper (Stockman, 1948) upon which

    modern RFID systems are built was published in 1948, a year before research into

    barcode systems had truly begun. Stockman’s research involved using various types

    of transceivers to modulate power emitted from a receiver. The receiver would then

    read the retransmitted signal from the remote transceiver. Crucially, the difference

    in Stockman’s system versus traditional radar systems of his day was that signals

    could be modulated over time instead of signaling binary on or off conditions.

    Stockman pointed out that possible civilian applications of his reflected power system

    included, amongst other things, “automatic pin-pointing... and simplified means for

    identification and navigation” (1948, pp. 1196, 1204).
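
    The principle of modulating reflected power over time can be illustrated with a toy
    model in Python. This is a deliberately simplified sketch and not a description of
    Stockman’s apparatus: the two reflection coefficients, the number of samples per bit,
    and the threshold rule are assumptions chosen only to show how a passive reflector can
    carry an identifier rather than merely signal its presence.

```python
def reflect(bits, incident_power=1.0, strong=0.8, weak=0.2, samples_per_bit=4):
    """Model a transponder that varies how much incident power it reflects per bit."""
    samples = []
    for bit in bits:
        coefficient = strong if bit else weak
        samples.extend([incident_power * coefficient] * samples_per_bit)
    return samples


def read(samples, incident_power=1.0, strong=0.8, weak=0.2, samples_per_bit=4):
    """Recover bits by averaging each bit period and comparing it to a midpoint."""
    threshold = incident_power * (strong + weak) / 2
    bits = []
    for start in range(0, len(samples), samples_per_bit):
        window = samples[start:start + samples_per_bit]
        bits.append(1 if sum(window) / len(window) > threshold else 0)
    return bits


identifier = [1, 0, 1, 1, 0, 0, 1, 0]  # arbitrary example identifier
assert read(reflect(identifier)) == identifier
```

    The reader supplies all of the power in this model; the transponder only shapes what
    is sent back, which is what makes it usable for identification.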

    2.7.2.1 ISO/IEC 14443

    Sixty years after Stockman’s initial research into the theory of measuring reflected

    and modulated sound, light, and radar systems, there are multiple codified ISO

    standards for specific types of RFID implementations. ISO/IEC 14443 is a specific

    example of a global RFID standard that warrants further examination. It “is one of a

    series of International Standards describing the parameters for identification cards as

    defined in ISO 7810 and the use of such cards for international interchange.... part of

    ISO/IEC 14443 describes the physical characteristics of proximity cards. This

    International Standard does not preclude the incorporation of other standard

    technologies on the card” (Joint Technical Committee ISO/IEC/JTC1, 2008). From the

    outset then, it is important to note that the standard governing the form and intended

    uses of RFID tags is institutionally motivated by other ISO/IEC standards, and by the

    intended use case of “international interchange.”

    Roussos (2008) gives a remarkably typical account of the intended international uses

    of RFID transponders. He details the use of RFID in electronic machine-readable

    travel documents (e-MRTDs) like passports and the international standards that

    govern their implementation and use as defined by the International Civil Aviation

    Organization (ICAO). He notes that e-MRTDs must adhere to the ISO/IEC 14443

    standard which “provides specifications for iris scans and fingerprints for future use.”

    He continues and writes that “Millions of e-passports are already in use, and

    thousands of MRTD-capable immigration control facilities have been deployed at

    disembarkation points in several countries” (2008, p. 12).

    At every level of the discussion, the intended user of the device is not an individual.

    e-MRTDs are intended for use by border security agents who are employed by

    enormous institutions like border agencies and their parent countries. Indeed, “Their

    development [e-MRTDs] is seen by ICAO and its member countries as a significant

    improvement over manual inspection of travel documents at border control points in

    terms of efficiency and data entry precision” (2008, p. 11). Thus to the holder of the

    e-MRTD, there is little direct benefit apart from increased efficiency when crossing

    through border checkpoints. Said benefit may of course be substantial and desirable,

    but the point here is that the stated use of RFID tags in “international interchange”

    and control over the devices themselves resides within institutions and not

    individuals. At no point do individuals get to exercise agency over the devices or the

    information that they contain.

    Indeed, ISO/IEC 14443 compliant RFID transponders privilege the authority of

    institutions over the document holders, as demonstrated by the ICAO’s description of

    e-MRTD Assisted Border Clearance. Their 2008 Guidelines document states that an

    e-MRTD Assisted Border Clearance system is one “[that] assists the border control

    officer to authenticate the eMRTD via the use of a suitable document reader, establish

    that the passenger is the rightful holder of the document and query border control

    records. The officer himself determines eligibility for border crossing” (2008, p. 24).

    The language of the guidelines is telling when combined with the earlier extract

    from the ISO/IEC 14443 standard - the document itself is the most important piece of

    the border crossing, which is exactly the use case encompassed by the ISO/IEC’s term

    “international interchange.” Thus the passenger is relegated to the role of being

    identified as an element in an exchange of information, not as a particular human

    being, but as the “rightful holder of the document.” The e-MRTD as a document

    authoritatively represents a person’s identity and legitimacy in such an institutional

    transaction.

    Roussos goes on to describe the benefits of RFID in large metropolitan transportation

    systems. He writes that

    “RFID offers distinct advantages due to the superior durability of tickets.... Ticket inspection at the gates is also facilitated by the far higher read accuracy of RFID compared with magnetic, which helps maintain the steady flow of commuters.... Finally, RFID tickets can hold considerably more data, which allows the use of personalized unique identifiers that can be used to virtually eliminate fare evasion” (2008, p. 16).

    Roussos does not explicitly describe the institutional desirability of RFID for transit

    systems. In the case of transit systems, individuals also benefit from the advantages

    that he lists. However, the described advantages of ticket inspection and reduced fare

    evasion are rooted in a transactional model that privileges the needs of the

    institution as a service provider over those of the individual as a transit rider who

    wishes to move from point A to B in the most expedient manner.

    2.7.2.2 RFID Discussion

    These two examples, while lengthy and in many ways banal, are remarkable for

    exactly the fact that they are unremarkable. They are important because they

    demonstrate the institutional standards that underlie RFID technology and many

    practical implementations. Such foundations are not necessarily problematic either,

    as described in the transit system example. Rather, they demonstrate how commercial

    and institutional uses for RFID have determined current standards that govern how

    information is stored on RFID tags, best practices for programming and reading tags,

    and the actual physical size and contents of tags themselves, as is the case with the

    ISO/IEC 14443 tags used in e-MRTDs.

    ISO/IEC 14443 is strongly oriented towards second-order ordering in that it only uses

    key pieces of indexical information about a passport holder or passenger, without

    directly physically constraining its holder. In other words, ISO/IEC 14443 does not

    regulate how people are physically categorized; it only records enough categorical

    information to identify a person using, for example, a combination of name, gender,

    eye colour, birth country and the like.

    Additionally, e-MRTDs are excellent examples of boundary objects as the transit

    example demonstrates. A transit authority has distinctly different goals than a

    transit user, but both rely on the same physical objects and infrastructure to

    accomplish their different tasks. On the one hand transit authorities would like to

    eliminate fare evasion and achieve a steady flow of passengers, while on the other,

    transit users would like to get from place to place with a minimum amount of work and

    time spent. By abstractly identifying passengers via their second order RFID passes,

    which are based on first order ordering of electromagnetic pulses emitted from

    transponders, all parties can complete their tasks.

    A key point here is that RFID technologies bridge Weinberger’s first two ordering

    systems. In discussing how it is used in both first and second order systems, and how

    it is a boundary object between classes of users, the larger issues of identity, and

    personal versus institutional goals come into focus. As Chapter 3 will demonstrate, a

    new set of issues arises when RFID is used to bridge first and third-order systems.

    2.8 Keywords and Tagging

    While RFID is institutionally deployed and uses internationally agreed upon

    standards to bridge first and second level ordering systems, the tagging and

    folksonomy systems that this thesis will discuss are largely loosely structured and

    mostly third-order in nature. Krauss (2010) gives an excellent summary of how RFID

    systems and Library Management Systems (LMS) can be linked to allow users to

    participate in “Library 2.0” type interactions. Indeed, the initial attempt at building

    an RFID system for this thesis was intended to allow users to use passive RFID tags

    and off the shelf hand-held components to leave tags on RFID t