Lipstick on a Pig: Integrated Library Systems

30 November 2010

Tip of the week: Communication

• Writing for your library colleagues is not like writing academic papers.

• No required page counts, and no points for prolixity.

• Nobody cares about your erudition. (Lit review only to make a serious point, or to leverage peer pressure.)

• Get to the point. Fast.

• “Executive summary” for people who won’t read it all (which is most of them!).

• Bullet points are good. Compound-complex sentences and ten-dollar words are bad.

• Always ask: What is the purpose of this thing I am writing? Who cares?

• Communicate? Persuade? Train?

• Upbeat matters. Even if you’re frustrated, tired, fed up.

Tool of the week: Creative Commons

• Creators who license their creations for reuse; no need to ask permission

• But observe any conditions on the license!

• Images: http://flickr.com/creativecommons/

• or Compfight: http://compfight.com/

• or Flickr Storm: http://zoo-m.com/flickr-storm

• Music: search for “podsafe music”

• Brilliant for decorating presentations and podcasts

• Remember to give back!

Tool of the week: Online presence

• Yes, you need to worry about your egoGoogle.

• You also need to worry if you don’t have one.

• Are you missing a chance to stand out in a competitive field?

• Once you’re on the job market, THEY ALL HAVE MLS-ES.

• You’ll have to take the bad with the good... and only you can decide what’s worth it. Consider:

• Professional network (assists, job leads, conference buddies)

• What fits your way of presenting yourself (blog, Flickr, YouTube, Ravelry, FriendFeed, Facebook, LinkedIn, SlideShare)

• Whether and how to pull all of it together into a portfolio.

Weekly reflection (for next week’s discussion)

• Google yourself. Could you find yourself? Do you have Googlegangers?

• Any surprises, good or bad?

• Is this what you want a potential employer knowing about you?

• If not, what do you plan to do about it?

• Are you satisfied with your own online privacy?

Software development models: why you care

• How your software was built affects:

• how much you pay for it, up-front and ongoing

• which chunk of budget those costs come from

• how much you can do with and to it

• how much it will cost to support and train people on it

• how much control you have over your data and how your data are presented to your patrons

• how good it is

• There is no one right answer. There are only tradeoffs, which you need to understand.

Building it yourself

• Some libraries deliberately and intentionally develop their own software. Go them!

• Some libraries do it by accident!

• One bright tinkerer whomps something up.

• The library comes to depend on it.

• ... and then the tinkerer leaves. Oops.

• ... or the computing world changes such that the whomped-up thing no longer works. Oops.

• Tinkerers are great. But make them document. And have a plan for transitioning off the whomped-up thing!

Off-the-shelf software

• What you buy in the TechStore

• Made by for-profit companies

• Though small developers and shareware makers are still out there!

• Certain expectations of performance, stability, polish, documentation

• May vary somewhat depending on customer base

• May rely on proprietary file formats for customer lock-in

• Pricing: usually “per seat” or “site licensed”

Vendor software

• Usually springs up in niches where off-the-shelf software can’t sell enough seats

• ... e.g. ILS software for libraries! Also learning-management systems!

• You pay to run the software AND for a certain level of customer service

• Installation help

• Employee training, user groups, conferences

• Technical support (up to and including vendor-run servers)

• You’ll still need local tech staff, often!

• Installing and customizing these things is a HASSLE.

• But there will be strict limits on what you can do.

Use the source, Luke!

• “Source code” = the instructions that humans write for computers to follow

• “Compiled code” or “binary code” = source code that has been munged to be directly understandable by the computer

• Not interpretable by humans any more!

• This is the only form in which proprietary software is distributed (usually), and why you can’t peek under its hood.

• “Compiler,” “interpreter,” and “virtual machine” are all bits and pieces of the source-code-to-compiled-code transformation.
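
To make the source/compiled distinction concrete, here is a minimal sketch in Python. It is an analogy only: Python compiles to bytecode for its own virtual machine rather than to the machine code a vendor would ship, and the variable names are made up for illustration.

```python
import dis

# Source code: text a human can read and edit.
source = "total = price * quantity"

# Compile it. The result is a code object whose instructions are raw bytes.
code_object = compile(source, "<example>", "exec")
bytecode = code_object.co_code

print(source)    # readable
print(bytecode)  # a run of bytes, not meant for human eyes

# A disassembler can turn the bytes back into named instructions,
# but that listing is still a long way from the original source text.
dis.dis(code_object)

# The virtual machine runs the compiled form, not the source text.
namespace = {"price": 3, "quantity": 4}
exec(code_object, namespace)
print(namespace["total"])
```

Proprietary software usually ships only the equivalent of `bytecode`; open-source software ships `source` as well.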

Open-source software

• The source code is open!

• You can (legally) download and install it without paying.

• You can (legally) read it.

• You can (legally) change it.

• You can (legally) resell it (sometimes with caveats).

• Developers “license” their code under one of a number of open-source licenses

• Commonest: GNU General Public License (GPL), which has a sting in its tail

• Also notable: BSD license, Artistic License

• OSI maintains a vetted list of open-source licenses.

Brief digression: open source, open standard, open access

• Open source: refers to SOFTWARE

• Open standard: refers to RULES for protocols, file formats, software specs, etc.

• “Reference implementation:” software that shows how software that complies with a particular standard should work

• Example: W3C’s Amaya browser

• Open access: refers to the SCHOLARLY LITERATURE

I’m not a programmer. Why should I care about the source?

• Do you benefit when other people hack on the software?

• With open source, quite possibly yes.

• If there’s a good API, quite possibly yes.

• With API-less proprietary software, rarely and only indirectly.

• What happens when a software company goes out of business? Or kills a product?

• Proprietary software: decay and obsolescence.

• Open-source software: new companies, forks, options.

• Security

• Security-through-obscurity doesn’t work. No software is perfectly secure, but OSS has a good track record of fast patches.

Should I use open-source or proprietary software, Dorothea?

• It depends. There are tradeoffs.

• $$$ vs. staff time/expertise: “free as in kittens”

• Ease of use/installation vs. control

• Professional support vs. ad-hoc online communities

• You can’t always know what your experience will be.

• Some vendor support is horrible. Some is great. Some online communities are horrible. Some are great.

• Some open-source projects move fast. Some don’t. Some vendors move fast. Most don’t (most can’t!).

• Only you understand your library’s situation.

• ASK AROUND before you invest, either way.

The worst of OSS: DSpace

• Few developers (and until recently, all volunteers), so change is slow.

• Arrogant developers, so change is out-of-touch with actual user needs.

• Why did publicly-accessible statistics take YEARS?

• This has gotten better of late. It’s still not perfect.

• Architecture deeply hostile to casual hacking, so innovation is slow.

• APIs? What APIs? Plugins? Who needs plugins? And why should we have a space to share code?

• Usability? This is open-source software! We don’t need no stinkin’ usability!

The worst of vendor software: ILSes

• Migration is a huge hassle, so vendors lock in customers and have little further incentive to serve them.

• Heinous hardware-price markup

• Totally opaque data models; few APIs; licenses that forbid tinkering

• Horrendous customer support

• Stunningly slow to innovate (partly our fault!)

What’s an ILS?

• Integrated Library System

• THE system that handles library operations.

• “Modules”

• Acquisitions

• Cataloguing

• OPAC

• Circulation/patron management

• Also: serials, metasearch, e-resource managers (sometimes), link resolvers... separately or bundled

• Underneath: heap big relational database!
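
As a rough illustration of the relational database under the modules, here is a toy schema using Python's built-in sqlite3. The table and column names are invented for this sketch and do not come from any real ILS, whose data model will be far larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: one per kind of thing the modules manage.
conn.executescript("""
CREATE TABLE bib    (bib_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE item   (item_id INTEGER PRIMARY KEY,
                     bib_id INTEGER NOT NULL REFERENCES bib(bib_id),
                     barcode TEXT UNIQUE);
CREATE TABLE patron (patron_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE loan   (item_id INTEGER NOT NULL REFERENCES item(item_id),
                     patron_id INTEGER NOT NULL REFERENCES patron(patron_id),
                     due_date TEXT NOT NULL);
""")

# Cataloguing adds a title, acquisitions adds two copies,
# circulation checks one of them out. All values are made up.
conn.execute("INSERT INTO bib VALUES (1, 'Example Title')")
conn.execute("INSERT INTO item VALUES (10, 1, 'B0001')")
conn.execute("INSERT INTO item VALUES (11, 1, 'B0002')")
conn.execute("INSERT INTO patron VALUES (100, 'Example Patron')")
conn.execute("INSERT INTO loan VALUES (10, 100, '2010-12-14')")

# OPAC-style question: how many copies of each title are on the shelf?
rows = conn.execute("""
SELECT bib.title, COUNT(item.item_id) - COUNT(loan.item_id) AS available
FROM bib
JOIN item ON item.bib_id = bib.bib_id
LEFT JOIN loan ON loan.item_id = item.item_id
GROUP BY bib.bib_id
""").fetchall()

print(rows)
```

The "integrated" in ILS is essentially this: every module reads and writes the same tables, so a checkout in circulation changes what the OPAC reports.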

State of the market

• Big consolidations in mid-decade

• Players: Endeavor (Voyager), Ex Libris, Sirsi/Dynix (Horizon)

• Up-and-coming open-source packages

• Koha: geared toward public libraries

• Evergreen: geared toward library consortia, is building code for academic libraries (e.g. serials management)

• eXtensible Catalog Project: University of Rochester

• Some service innovation

• WorldCat Local

• LibraryThing for Libraries

• Typical ILS replacement cycle: 5 to 10 years

Lipsticking the pig

• Libraries turned to outside vendors, homegrown solutions

• NCSU: adopted Endeca, who are a web-commerce firm

• UVa: Solr/Flare/Blacklight (ha ha ha)

• Scriblio, VuFind, etc.

• What were they looking for?

• USABILITY!

• Faceted searching/browsing

• Better associations among records (quasi-FRBRization)

• Better correlation between user language and controlled vocabularies

• Generally: making the data work harder!
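
Faceted searching is a good example of making the data work harder. The sketch below, with made-up records and field names, counts the values of a field across a result set and then narrows the set when the user clicks one. Real discovery layers such as Solr do this at index level; this only shows the idea.

```python
from collections import Counter

# Hypothetical search results; fields are already in the catalog data.
RESULTS = [
    {"title": "Example Atlas", "format": "Map", "language": "English"},
    {"title": "Example Novel", "format": "Book", "language": "English"},
    {"title": "Exemple de roman", "format": "Book", "language": "French"},
]

def facet_counts(records, field):
    """Count how many records carry each value of one field."""
    return Counter(record[field] for record in records)

def narrow(records, field, value):
    """Keep only the records matching the facet value the user picked."""
    return [record for record in records if record[field] == value]

print(facet_counts(RESULTS, "format"))
books = narrow(RESULTS, "format", "Book")
print(facet_counts(books, "language"))
```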

More pieces: Link resolvers and OpenURL

• You have a citation. How do you find out if the library has the article among its e-resources?

• OpenURL: protocol for checking citation information against a library’s list of vendor-provided e-journals and article databases

• Pack citation info into a URL or a teeny XML document

• Link resolver: gizmo that takes in an OpenURL and returns list of available copies.

• SFX (Ex Libris) current market leader
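
Packing a citation into a URL can be sketched as follows. The key names follow the OpenURL 1.0 (Z39.88-2004) key/value format for journal articles; the resolver address, the citation, and the holdings table are all invented, and the `resolve` function is a toy stand-in for what a real link resolver does against its knowledge base.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical link-resolver address; every library has its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# A made-up citation, for illustration only.
citation = {
    "url_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    "rft.genre": "article",
    "rft.atitle": "An Example Article",
    "rft.jtitle": "Journal of Examples",
    "rft.issn": "1234-5679",
    "rft.volume": "12",
    "rft.spage": "34",
    "rft.date": "2010",
}

def build_openurl(base, fields):
    """Pack citation fields into the query string of a URL."""
    return base + "?" + urlencode(fields)

# Toy holdings list: ISSN -> (e-resource name, first year, last year).
HOLDINGS = {
    "1234-5679": [("Example Database A", 2001, 2010)],
}

def resolve(openurl):
    """Unpack the citation and list the e-resources that cover it."""
    query = parse_qs(urlparse(openurl).query)
    issn = query.get("rft.issn", [""])[0]
    year = int(query.get("rft.date", ["0"])[0][:4])
    return [name for name, start, end in HOLDINGS.get(issn, [])
            if start <= year <= end]

openurl = build_openurl(RESOLVER_BASE, citation)
print(openurl)
print(resolve(openurl))
```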

Still more pieces: e-resource management

• You just bought a Big Deal. How do you update holdings and URLs in your OPAC? How do you update your link resolver?

• How do you keep track of who bought what out of which fund? Or who to call when something breaks? Or usage stats?

• Market leader: Serials Solutions

• Service (auto-holdings-updating), not just product.

• Open-source (though dependent on MS Access) entrant: ERMes

Catalog vs. “resource discovery”

• What’s actually in an OPAC?

• Print books, maps, sheet music

• Title-level serials

• Maybe govdocs, theses/dissertations, collection records for stuff in special collections

• What’s not?

• The rest of the world! Including digital collections, stuff on the web, article-level access to journals, finding aids...

• The information world is bigger than it used to be!

• So is the ILS/OPAC an INVENTORY tool, or a DISCOVERY tool?

• And what is our inventory, really?

First-cut solution: Metasearch

• How many databases are you willing to search? With all their different interfaces?

• Metasearch to the rescue! or something.

• Single search interface presented to the user.

• Sends user’s query to various databases; receives, processes (deduping, relevance ranking), and presents the results.

• Some databases use search protocols like Z39.50 and SRU/SRW. Others have to be screenscraped.

• Lousy solution.

• Slow, not always good at processing results, coverage not always the best, search bells and whistles gone.
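
The send/receive/process loop can be sketched like this. The two "databases" are in-memory lists with made-up records, and the dedupe and ranking rules are deliberately crude assumptions; a real metasearch engine speaks Z39.50 or SRU (or screenscrapes) over the network, which is where the slowness comes from.

```python
import re

# Hypothetical record sets standing in for two remote databases.
CATALOG = [
    {"title": "Cataloging Basics", "year": 2005},
    {"title": "Metadata for Librarians", "year": 2008},
]
ARTICLE_DB = [
    {"title": "Metadata for librarians.", "year": 2008},
    {"title": "Metadata Quality", "year": 2009},
]

def make_source(records):
    """Wrap a record list in a search function, one per 'database'."""
    def search(query):
        return [r for r in records if query.lower() in r["title"].lower()]
    return search

SOURCES = {"catalog": make_source(CATALOG), "articles": make_source(ARTICLE_DB)}

def dedupe_key(record):
    """Normalize title and year so near-identical records collapse."""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower())
    return (" ".join(title.split()), record["year"])

def metasearch(query, sources):
    merged = {}
    for name, search in sources.items():      # send the query everywhere
        for record in search(query):          # receive the results
            entry = merged.setdefault(
                dedupe_key(record), {"record": record, "found_in": []})
            entry["found_in"].append(name)    # dedupe
    # Crude relevance: records found in more sources come first.
    return sorted(merged.values(),
                  key=lambda e: (-len(e["found_in"]), e["record"]["title"]))

results = metasearch("metadata", SOURCES)
for entry in results:
    print(entry["record"]["title"], entry["found_in"])
```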

Next try: Building local index for search

• Tricky to do!

• Which data sources can you legally build your index from?

• Of those, how many have an API? Or will you be stuck screenscraping HTML?

• Or do you have to work with your link resolver?

• See also: Google Scholar

• Essentially this is what GS does. They make special arrangements to crawl publisher sites, even behind firewalls.
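
The difference from metasearch is that the harvesting happens ahead of time, so a search only touches the local index. A minimal sketch of such an index, assuming the records have already been legally harvested (the identifiers and titles are invented):

```python
import re
from collections import defaultdict

# Hypothetical harvested records: identifier -> searchable text.
RECORDS = {
    "opac:1": "Introduction to cataloging",
    "digital:7": "Photographs of campus buildings",
    "articles:42": "Cataloging digital photographs",
}

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(records):
    """Inverted index: each word points at the records containing it."""
    index = defaultdict(set)
    for record_id, text in records.items():
        for word in tokenize(text):
            index[word].add(record_id)
    return index

def search(index, query):
    """Return records containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits

index = build_index(RECORDS)
print(search(index, "cataloging"))
print(search(index, "digital photographs"))
```

One query now covers the OPAC, digital collections, and articles at once, without waiting on anyone else's server.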

Now: “web-scale” discovery

• OPAC layers (or ILS replacements, or ILS add-ins) that purport to offer one-stop shopping: OPAC, digital collections, serials, etc.

• Serials Solutions: Summon

• WorldCat Local

• Ex Libris: Primo Central

• EBSCO: EBSCO Discovery Service (EDS)

• First question: is this a SEARCH TOOL or a CONTENT DATABASE or both?

• Next question: coverage?

• Players VERY close-mouthed about serials coverage.

The future of MARC

• Bluntly: it doesn’t have one.

• As a file format, it’s LONG past its sell-by date.

• Does not fit into the mashup universe at all.

• Making it work with current-gen technology is a tremendous resource drain.

• In hindsight, decisions made so that MARC could easily output human-readable catalog cards are hurting us badly now that catalog cards aren’t what we want any more.

• That said, we have a lot of data in it.

• If you become a cataloger, you will be involved in a mass data migration. Have fun! (Believe me, I feel your pain.)

• Migration to what? Well, that’s the question.

• The answer is probably multiple. But RDA is part of the answer.

What is RDA?

• Resource Description and Access

• the next analogue to AACR2

• Does not assume MARC or ISBD underneath!

• Diane Hillmann, others actively working on linked-data/RDF expressions.

• Claimed benefits

• Expand the universe of what is describable

• Spend less time on rules pilpul, punctuation, and other cruft

• Less emphasis on “record,” more on linkages

• Ability to make our records work with/for outside world

• FRBRization

Right, so what’s FRBR?

• Functional Requirements for Bibliographic Records

• Relational data model for catalog records.

• Recognizes that not all parts of a bibliographic record describe the same thing

• Author: of a “work”

• Page count: of an “edition”

• “FRBRizing” a catalog means drawing all those relationship arrows between records, and then doing something with them for patrons.

• We can do this mechanically. Sort of. Some of it.
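
Here is a minimal sketch of the "mechanical" part, using the slide's simplified two-level model (work and edition) rather than full FRBR. Flat records are grouped into works by a normalized author/title key. The publishers, years, and page counts are invented, and the matching rule is an assumption that shows why this only works "sort of": anything the normalization misses stays unlinked.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Edition:
    publisher: str
    year: int
    pages: int        # page count belongs to the edition

@dataclass
class Work:
    author: str       # author belongs to the work
    title: str
    editions: list = field(default_factory=list)

# Flat catalog records; publication details are made up.
FLAT_RECORDS = [
    {"author": "Austen, Jane", "title": "Pride and Prejudice",
     "publisher": "Example Press", "year": 1995, "pages": 352},
    {"author": "Austen, Jane", "title": "Pride and prejudice.",
     "publisher": "Sample Books", "year": 2003, "pages": 410},
    {"author": "Austen, Jane", "title": "Emma",
     "publisher": "Example Press", "year": 1996, "pages": 474},
]

def normalize(text):
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def work_key(record):
    """Assumed matching rule: same normalized author and title."""
    return (normalize(record["author"]), normalize(record["title"]))

def frbrize(records):
    works = {}
    for r in records:
        work = works.setdefault(work_key(r), Work(r["author"], r["title"]))
        work.editions.append(Edition(r["publisher"], r["year"], r["pages"]))
    return list(works.values())

works = frbrize(FLAT_RECORDS)
for work in works:
    print(work.title, "-", len(work.editions), "edition(s)")
```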

Next problem: Who owns our records?

• OCLC controls union catalog in the US.

• But OCLC didn’t author most of the records!

• Huge, ongoing flap about who can use/remix those records, with or without permission.

• Open-records initiatives springing up

• Open Library

• Michigan: http://blog.okfn.org/2010/11/29/open-bibliographic-data-how-should-the-ecosystem-work/

• To be clear: legal restrictions on reuse and mashups damage librarianship’s presence online. We can’t afford not to settle this.

Last problem: How does our data fit into the Web?

• This is not entirely a catalog problem.

• What about our digitized collections? Born-digital holdings? Finding aids? Usage data? Authority data?

• What are our APIs?

• To what extent do we NEED local catalogs?

• Uncomfortable but necessary question! Do we need to reinvent Google? If so, how do we exchange records for stuff that isn’t in our ILS?

• Are we overinvested in the ILS?

• How do we facilitate appropriate reuse of our data? Do we/can we bar inappropriate reuse?
