Project CLiMB: Computational Linguistics for Metadata Building
Project CLiMB
Computational Linguistics for
Metadata Building
Using Computational Linguistic Techniques
to Harvest Image Descriptors
Columbia University
Funded by the Andrew W. Mellon Foundation
2002-2004
Photograph courtesy of the Council of Industrial Design's Design Archive.
CLiMB: Interdisciplinary Research at
Columbia University
• Libraries
• Computer Science Department
• Center for Research on Information Access (CRIA)
Funded by the Andrew W. Mellon Foundation, 2002-2004
CLiMB Project Members
Judith Klavans, PI
Stephen Davis
Angela Giral
Patricia Renfro
Bob Wolven
Roberta Blitz
Rebecca Passonneau
Veronika Horvath
David Elson
Problems in Image Access
Traditional approach: labor intensive, expensive
Project CLiMB
Help image catalogers provide subject access?
Harvest image descriptors
from existing literature?
Can we harvest image descriptors?
CLiMB will identify and extract
• proper nouns
• terms and phrases
from text related to an image:
By September 14, 1908, the basis of the Greenes' final design had been worked out. It featured a radically informal, V-shaped plan (that maintained the original angled porch) and interior volumes of various heights, all under a constantly changing roofline that echoed the rise and fall of the mountains behind it. The chimneys and foundation would be constructed of the sandstone boulders that comprised the local geology, and the exterior of the house would be sheathed in stained split-redwood shakes.
— Edward R. Bosley. Greene & Greene. London: Phaidon, 2000. p.127.
CLiMB Technical Contribution
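As a rough illustration of the proper-noun half of this idea, the sketch below pulls runs of capitalized words out of the opening of the Bosley passage. It is not the CLiMB implementation: the function name, the regular expression, and the small list of sentence-opening words are all assumptions made for this example. Term and phrase extraction usually relies on part-of-speech tagging or chunking and is left out here.

```python
import re

# Assumption: a run of capitalized words is a proper-noun candidate.
CAP_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# Assumption: a tiny, hand-picked list of words that are capitalized
# only because they open a sentence.
SENTENCE_OPENERS = {"By", "It", "The", "When", "For"}

SAMPLE = (
    "By September 14, 1908, the basis of the Greenes' final design "
    "had been worked out. It featured a radically informal, V-shaped plan."
)


def proper_noun_candidates(text):
    """Return capitalized word runs, in order of first appearance."""
    candidates = []
    for match in CAP_RUN.finditer(text):
        words = match.group(0).split()
        # Drop leading sentence openers such as "By" or "It".
        while words and words[0] in SENTENCE_OPENERS:
            words = words[1:]
        if words:
            phrase = " ".join(words)
            if phrase not in candidates:
                candidates.append(phrase)
    return candidates


if __name__ == "__main__":
    print(proper_noun_candidates(SAMPLE))
```

A heuristic like this over-generates (month names count as candidates), which is one reason a human review step remains useful.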
CLiMB Overall Goals
The essence of CLiMB:
• Use scholars themselves as “catalogers” by employing scholarly publications
• Enhance existing descriptive metadata
The CLiMB project:
• Research: Development of richer retrieval through increased numbers of descriptors
• Practice: Development of CLiMB ToolKit
• Image collection
• Associated text
• Target object identification (TOI)
• CLiMB ToolKit
Squeezing Metadata out of Scholarly Texts
Greene & Greene Architectural Records and Papers Collection
Drawings and Archives
Avery Architectural and Fine Arts Library
Columbia University Libraries
NYDA.1960.001.00023
All Saints Episcopal Church (Pasadena, Calif.). Alterations
1902-1903
Greene & Greene Catalog Record
Author: Greene & Greene.
Title: [Mrs. Dudley P. Allen house, 1188 Hillcrest Avenue (Pasadena, Calif.). Alterations.]
Residence of Mrs. Dudley P. Allen, 1188 Hillcrest Ave., Pasadena, Cal. [graphic] : Alteration / Greene & Greene, Architects.
Published: [1917]
Physical Details: 4 sheets : various media ; 87.8 x 57.3 cm. (34 5/8 x 22 5/8 in.)
Location: Columbia University, Avery Architectural Drawings
Other Authors: Greene, Charles Sumner, 1868-1957. Greene, Henry Mather, 1870-1954.
Subjects: Houses; Alterations; Architecture--Designs and plans--United States; Mrs. Dudley P. Allen house, 1188 Hillcrest Avenue (Pasadena, Calif.)
Component Item: [1] Item no. NYDA.1960.001.03224. [AVERYimage]. Electric lighting -- floor plan, part plan of basement : Sheet no.
Component Item: [2] Item no. NYDA.1960.001.00073. [AVERYimage]. [Electric lighting] floor plan, part plan of basement.
• Bosley, Edward R. Greene & Greene. London : Phaidon, 2000.
• Current, William R. Greene & Greene: architects in the residential style. Fort Worth [Tex.] : Amon Carter Museum of Western Art, [1974]
• Makinson, Randell L. Greene & Greene: architecture as fine art. Salt Lake City : Peregrine Smith, c1977.
• Makinson, Randell L. Greene & Greene: the passion and the legacy. Salt Lake City : Gibbs and Smith, c1998.
• Smith, Bruce. Greene & Greene masterworks. San Francisco : Chronicle Books, c1998.
• Strand, Janann. A Greene & Greene guide [Pasadena, Calif. : G. Dahlstrom, 1974]
Greene & Greene Bibliography (associated texts)
• Image collection
• Associated text
• Target object identification (TOI)
• CLiMB ToolKit
Squeezing Metadata out of Scholarly Texts
Target Object Identification (TOI)
• “Authority” list
• Varies from collection to collection
– Greene & Greene – Project Names
– North Carolina Museum – Creator/Title
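One simple way to picture target object identification is a lookup of authority-list names in the associated text. The sketch below assumes plain case-insensitive substring matching, which is a stand-in rather than the method CLiMB used. The function name is hypothetical, and the sample sentences are invented for the example using project names from the Greene & Greene records.

```python
def find_target_objects(paragraphs, authority_list):
    """Map each authority-list name to the paragraphs that mention it.

    Assumption: a case-insensitive substring match counts as a mention.
    Names that never occur are left out of the result.
    """
    hits = {}
    for name in authority_list:
        needle = name.casefold()
        for index, paragraph in enumerate(paragraphs):
            if needle in paragraph.casefold():
                hits.setdefault(name, []).append(index)
    return hits


# Project names taken from the catalog records; sentences are invented.
AUTHORITY_LIST = [
    "All Saints Episcopal Church",
    "Mrs. Dudley P. Allen house",
]
PARAGRAPHS = [
    "The drawings for All Saints Episcopal Church show alterations.",
    "Sheets for the Mrs. Dudley P. Allen house include a floor plan.",
    "The exterior would be sheathed in redwood shakes.",
]

if __name__ == "__main__":
    for name, indexes in find_target_objects(PARAGRAPHS, AUTHORITY_LIST).items():
        print(name, indexes)
```

Exact matching misses variant names, so a creator/title authority list such as the museum's would likely need looser matching than this.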
North Carolina Museum of Art
Museum Catalog (Associated Text)
Images (Catalog Records)
North Carolina Museum of Art: Handbook of the Collections. Ed. Rebecca Martin Nagy. Raleigh, NC: North Carolina Museum of Art, Hudson Hills Press, 1998.
Georgia O'Keeffe (American, 1887-1986)
Cebolla Church, 1945
Oil on canvas, 20 1/16 x 36 1/4 in. (51.1 x 92.0 cm.) Purchased with funds from the North Carolina Art Society (Robert F. Phifer Bequest), in honor of Joseph C. Sloane, 72.18.1
North Carolina Museum of Art
<http://ncartmuseum.org/collections/highlights/20thcentury/20th/1910-1950/038_lrg.shtml>
MARC format
100 O’Keeffe, Georgia, ≠d 1887-1986.
245 Cebolla church ≠h [slide] / ≠c Georgia O’Keeffe.
260 ≠c 2003
300 1 slide : ≠b col.
500 Object date: 1945.
500 Oil on canvas.
500 20 x 36 in.
535 North Carolina Museum of Art ≠b Raleigh, N.C.
650 Painting, American ≠y 20th century.
650 Women artists ≠z United States
650 Church buildings in art.
Cebolla Church, 1945
Oil on canvas, 20 1/16 x 36 1/4 in. (51.1 x 92.0 cm.)
Purchased with funds from the North Carolina Art Society (Robert F. Phifer Bequest), in honor of Joseph C. Sloane, 72.18.1
Driving through the New Mexican highlands near her home, Georgia O'Keeffe would often pass through the village of Cebolla with its rude adobe Church of Santo Niño. The artist was moved by the poignancy of the little building: its sagging, sun-bleached walls and rusted tin roof seemed so typical of the difficult life of the people.
When O'Keeffe came to paint the church she addressed it directly, emphasizing its isolation and stark simplicity. Literally formed out of the earth, the building affirms the permanence and the hard, defiant patience of the people. For O’Keeffe, it symbolized human endurance and aspiration. "I have always thought it one of my very good pictures", she wrote, "though its message is not as pleasant as many others".
And the question remains: What is that in the window?
MARC format with CLiMB subject terms
100 O’Keeffe, Georgia, ≠d 1887-1986.
245 Cebolla church ≠h [slide] / ≠c Georgia O’Keeffe.
260 ≠c 2003
300 1 slide : ≠b col.
500 Object date: 1945.
500 Oil on canvas.
500 20 x 36 in.
535 North Carolina Museum of Art ≠b Raleigh, N.C.
650 Painting, American ≠y 20th century.
650 Women artists ≠z United States
650 Church buildings in art.
CLiMB New Mexican highlands
CLiMB village of Cebolla
CLiMB adobe Church of Santo Niño
CLiMB sagging, sun-bleached walls
CLiMB rusted tin roof
CLiMB isolation
CLiMB human endurance
CLiMB window
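The enrichment step shown in this record can be sketched as appending harvested terms to an existing list of fields. The representation below, a list of (tag, value) pairs with a "CLiMB" label standing in for a field tag, mirrors the slide's notation and is an assumption of this example. It is neither a real MARC encoding nor the ToolKit's data model, and the function name is hypothetical.

```python
def add_climb_terms(record, terms):
    """Return a copy of the record with new CLiMB terms appended.

    Assumption: a record is a list of (tag, value) pairs, and harvested
    terms are stored under the label "CLiMB" as on the slide.
    Terms already present in the record are skipped.
    """
    existing = {value for tag, value in record if tag == "CLiMB"}
    enriched = list(record)
    for term in terms:
        if term not in existing:
            enriched.append(("CLiMB", term))
            existing.add(term)
    return enriched


# A few fields from the Cebolla Church record above.
RECORD = [
    ("245", "Cebolla church"),
    ("500", "Oil on canvas."),
    ("650", "Church buildings in art."),
]

if __name__ == "__main__":
    harvested = ["rusted tin roof", "isolation", "rusted tin roof"]
    for tag, value in add_climb_terms(RECORD, harvested):
        print(tag, value)
```

Returning a copy leaves the original catalog record untouched, so a cataloger can compare the two versions before accepting the additions.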
• Image collection
• Associated text
• Target object identification (TOI)
• CLiMB ToolKit
Squeezing Metadata out of Scholarly Texts
The CLiMB ToolKit
• Software prototype
• For large image collections
• Semi-automated metadata
– Subject access terms
– Human intervention at all steps
• Iterative development cycle
The CLiMB ToolKit
• Web Browser
• Help Menus
• Projects
A Graphical User Interface (GUI)
CLiMB TOOLKIT: Process Flow
1. Load Text
2. Load TOI List
3. Analyze Text
4. Select Subject Access Terms
5. Review
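The five steps can be read as a small pipeline. The sketch below is only an illustration of that flow, not the ToolKit's code: every function name is hypothetical, sentence splitting on periods and substring matching are simplifying assumptions, the candidate vocabulary stands in for real linguistic analysis, and the sample sentence is assembled from phrases on the Cebolla Church slide.

```python
def analyze_text(text, toi_list):
    """Step 3: pair each target object with the sentences mentioning it.

    Assumption: sentences end at periods and a mention is a
    case-insensitive substring match.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return {
        name: [s for s in sentences if name.casefold() in s.casefold()]
        for name in toi_list
    }


def select_terms(sentences, vocabulary):
    """Step 4: propose vocabulary terms that occur in those sentences."""
    joined = " ".join(sentences).casefold()
    return [term for term in vocabulary if term.casefold() in joined]


def review(candidates, approve):
    """Step 5: keep only the terms a human reviewer approves."""
    return [term for term in candidates if approve(term)]


def run_pipeline(text, toi_list, vocabulary, approve):
    """Steps 1 and 2 (loading) are assumed done; inputs arrive as values."""
    matched = analyze_text(text, toi_list)
    return {
        name: review(select_terms(sentences, vocabulary), approve)
        for name, sentences in matched.items()
    }


# Invented sample inputs for the example.
TEXT = (
    "Cebolla Church has sagging, sun-bleached walls and a rusted tin roof. "
    "The museum is in Raleigh."
)
TOI_LIST = ["Cebolla Church"]
VOCABULARY = ["rusted tin roof", "sun-bleached walls", "Raleigh"]

if __name__ == "__main__":
    print(run_pipeline(TEXT, TOI_LIST, VOCABULARY, lambda term: True))
```

Passing the reviewer in as a function keeps the human decision separate from the automated steps, in line with the ToolKit's emphasis on human intervention.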
CLiMB DocViewer
http://www1.cs.columbia.edu/~delson/cni/
Thank you!
Any further questions?
www.columbia.edu/cu/cria/climb