PIDs Go To The Movies - EIDR I wanted • Coverage of a wide range of works from global sources...

PIDs Go To The Movies
Raymond Drewry, MovieLabs/EIDR
PIDapalooza 2019, Dublin
pidapalooza.org, eidr.org





What I wanted

• Coverage of a wide range of works from global sources (excluding user-generated content for now)
• Appropriate granularity of identification (covering the abstract concept of an underlying work and all its many variations)
• Reliable, free access to the identifiers and their metadata (i.e. the identifiers are resolvable, and anyone can use them)
• Connection to other data sources (information from multiple sources is more powerful than information from any single source)
• A knowledgeable, engaged user community to help populate the database (crowd-sourced with a curated crowd)
• Ease of integration with and use by other pieces of software: databases aren’t very useful if no one uses them. A UI is just another application.
• Economic viability: cheap is good, and persistence requires longevity, which requires money


What I Didn’t Want

• Rights information: context dependent, varies by time and country, and some of it is secret and wrapped up in complicated legal documents
• People thinking they ‘own’ the metadata or the ID: metadata for identifying audiovisual works is freely available in lots of inconvenient ways, but it is available, so why should a record be ‘owned’?
• Restrictions on read access to the metadata: it’s only useful if people can use it as they wish
• Uncontrolled additions and modifications: when dealing with uniqueness, sloppy metadata is death
• Only commercially important works: that would give a woefully incomplete picture of the world of film and TV, and preclude lots of interesting projects and applications
• Too much legalese: the user agreement is only a couple of pages long, which is remarkable for something that is the offspring of Hollywood and Silicon Valley


What is EIDR – spiffy marketing version

EIDR Technology Summary

• Interoperable, standards-based infrastructure

• Built on ISO Digital Object Identifier (DOI) standard

• Application integration through public APIs and schemas, freely available SDK for members

• Efficient infrastructure for new and existing applications

EIDR Purpose

• Make digital distribution competitive
• Help reduce costs
• Improve collaboration and automation across multiple application domains & platforms
• Enable new businesses and create new efficiencies

What EIDR is

• Global registry for unique identification of movie and TV content
• Designed for automated machine-to-machine communication
• Flexible data hierarchy down to the product & SKU level, incl. edits, clips, composites, encodings, and relationships

What EIDR is Not

• Profit-making
• Rich commercial metadata
• Ownership or rights information
• US-only



Example EIDR movie hierarchy with multiple versions


Example EIDR episodic hierarchy

[Diagram: episodic hierarchy]

• Levels: Series (Abstractions), Seasons (Abstractions), Episodes (Abstractions), Edits (Performances), Manifestations (Digital)
• Relationships: IsSeasonOf, IsEpisodeOf, IsEditOf, IsManifestationOf, IsClipOf, IsPromotionFor
• Example nodes: Series; Season 1, Season 2 (similar hierarchy under Season 2); Episode 1, Episode 2, … Episode N; Broadcast Edit; Retail EST (EN, FR); Season 2 Trailer; Promotional Clip; Social Upload; UGC Upload
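The levels and relationship names in the diagram can be sketched as a small parent-linked data structure. This is only an illustration: the record IDs are placeholders, the `Record` class and `chain_to_root` helper are hypothetical names, and the particular chain (manifestation, edit, episode, season, series) is one assumed reading of the diagram rather than EIDR's actual data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    record_id: str                   # placeholder, not a real EIDR ID
    label: str
    relation: Optional[str] = None   # relationship to the parent record
    parent_id: Optional[str] = None


# One assumed path through the diagram, from the series down to a manifestation.
EXAMPLE: Dict[str, Record] = {
    r.record_id: r
    for r in [
        Record("series", "Series"),
        Record("season-1", "Season 1", "IsSeasonOf", "series"),
        Record("episode-1", "Episode 1", "IsEpisodeOf", "season-1"),
        Record("edit-1", "Broadcast Edit", "IsEditOf", "episode-1"),
        Record("manifestation-1", "Retail EST (EN, FR)",
               "IsManifestationOf", "edit-1"),
    ]
}


def chain_to_root(registry: Dict[str, Record],
                  record_id: str) -> List[Tuple[str, Optional[str]]]:
    """Walk parent links upward, returning (label, relation) pairs."""
    chain: List[Tuple[str, Optional[str]]] = []
    seen = set()
    current = registry[record_id]
    while True:
        if current.record_id in seen:
            raise ValueError("cycle in parent links")
        seen.add(current.record_id)
        chain.append((current.label, current.relation))
        if current.parent_id is None:
            return chain
        current = registry[current.parent_id]


if __name__ == "__main__":
    for label, relation in chain_to_root(EXAMPLE, "manifestation-1"):
        print(label, relation or "(root)")
```

Storing only the link to the parent keeps each record self-describing; clips and promotions would need additional link types beyond this single-parent sketch.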



What Metadata Is Included

• Enough descriptive metadata to distinguish one work from another
  • Different episodes of a Series, remakes, director’s cuts, dubbed languages, etc.
  • Different requirements at different levels of the hierarchy
  • Factual, not interpretive = no genres or plot summaries, e.g.
    • Some exceptions (e.g. early Actualities)
• Links to other EIDR IDs
  • Containing Series, original abstract work, items included in a retail compilation, etc.
  • We try pretty hard to make this concrete, not marketing or opinion (so we don’t do ‘franchises’, for example)
• Note on EIDR IDs
  • Standard form is 10.5240/C1B5-3BA1-8991-A571-8472-W
  • Lots of other formats defined for various applications
  • Resolution and content negotiation through the DOI proxy
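As a rough illustration of the standard form, the sketch below checks the shape of an ID and builds the DOI proxy URL for it. The pattern is inferred only from the IDs shown on these slides (prefix 10.5240, five groups of four hex digits, one trailing check character); it does not verify the check character, and the function names `parse_eidr_id` and `resolver_url` are made up for this example.

```python
import re

# Shape inferred from the IDs on the slides; this is an assumption, not the
# official grammar, and the check character is NOT validated here.
EIDR_CONTENT_ID = re.compile(r"^10\.5240/((?:[0-9A-F]{4}-){5})([0-9A-Z])$")

DOI_PROXY = "https://doi.org/"


def parse_eidr_id(text: str) -> dict:
    """Split an EIDR content ID in standard form into its visible parts."""
    candidate = text.strip().upper()
    # Accept the resolvable URL form as well as the bare DOI name.
    if candidate.startswith(DOI_PROXY.upper()):
        candidate = candidate[len(DOI_PROXY):]
    match = EIDR_CONTENT_ID.match(candidate)
    if match is None:
        raise ValueError(f"not in the standard EIDR form: {text!r}")
    return {
        "prefix": "10.5240",
        "groups": match.group(1).rstrip("-").split("-"),
        "check": match.group(2),
        "canonical": candidate,
    }


def resolver_url(text: str) -> str:
    """Return the DOI proxy URL for an ID given in either form."""
    return DOI_PROXY + parse_eidr_id(text)["canonical"]


if __name__ == "__main__":
    print(resolver_url("10.5240/C1B5-3BA1-8991-A571-8472-W"))
```

Content negotiation would be a matter of sending an `Accept` header to that URL; which media types the proxy honors for EIDR records is not stated on the slide, so it is left out here.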


What’s in an external identifier?

• The ID, in whatever its basic, native form happens to be
  • Currently only for other audiovisual works
  • Significant interest in covering other things (posters, scripts, reviews, miscellaneous ephemera)
  • …but none of those have good identifiers (yet)
• A type (IMDB, ISAN, DOI, etc.)
  • If it’s not a ‘first-class type’, the type is ‘Proprietary’
  • Proprietary types have a domain within which they are valid, e.g. bfi.org.uk, bfi.org.uk/gifford, Disney.com, Disney.com/MPM, etc.
• A resolvable URL for that ID, if we know how to make one
  • Generated on the fly from a published list of patterns
  • Some ID types have none, some have more than one
• A relationship (optional)
  • IsSameAs, IsDerivedFrom, ContainsPartOf, etc.
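The four parts of an external identifier listed above map naturally onto a small record type. The sketch below is a hedged illustration only: the `AlternateID` class, the `URL_PATTERNS` table, the `example.org/catalogue` domain, and the ID values are all hypothetical, and the real published pattern list may look quite different.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical pattern table, keyed by first-class type or proprietary domain.
# The entries are illustrative; they are not EIDR's published list.
URL_PATTERNS: Dict[str, List[str]] = {
    "IMDB": ["https://www.imdb.com/title/{id}/"],
    "DOI": ["https://doi.org/{id}"],
    # A made-up proprietary domain with more than one pattern.
    "example.org/catalogue": [
        "https://example.org/catalogue/{id}",
        "https://example.org/catalogue/{id}.json",
    ],
}


@dataclass(frozen=True)
class AlternateID:
    value: str                      # the ID in its native form
    id_type: str                    # first-class type, or "Proprietary"
    domain: Optional[str] = None    # required for Proprietary types
    relation: Optional[str] = None  # e.g. IsSameAs, IsDerivedFrom

    def __post_init__(self) -> None:
        if self.id_type == "Proprietary" and not self.domain:
            raise ValueError("a Proprietary ID needs a domain")

    def urls(self) -> List[str]:
        """Generate resolvable URLs on the fly; may be empty or several."""
        key = self.domain if self.id_type == "Proprietary" else self.id_type
        return [p.format(id=self.value) for p in URL_PATTERNS.get(key or "", [])]


if __name__ == "__main__":
    # "tt0000000" is a placeholder value, not a real title ID.
    print(AlternateID("tt0000000", "IMDB", relation="IsSameAs").urls())
```

Keeping the patterns in a table rather than on each record matches the "generated on the fly" bullet: when a source changes its URL scheme, only the table needs updating.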


IDs in EIDR

[Chart: number of IDs over time, Dec-10 to Dec-18, on a vertical axis from 0 to 4,000,000. Series: EIDR Content IDs, Alternate IDs]


Top Alternate IDs in EIDR

Alt. ID Type                    2018 (000)   Chg.
TiVo ID                         483          39
Webedia/AlloCiné                404          11
IMDb                            314          69
CITWF                           290          30
Sony                            275          13
ISAN                            209          20
ITV                             174          14
Wikidata*                       130          123
Warner                          112          10
Flixster - Rotten Tomatoes      95           0
The Movie Database*             93           93
NBCU                            92           38
ČSFD*                           87           87
Turner Classic Movies (TCM)*    79           79
Baseline                        65           0
European AVO                    57           2
Google Play                     54           4

* New top ID this year


Quick examples

• Le voyage dans la lune [Voyage to the Moon] (Méliès, 1902)
  • 10.5240/C79D-4B8B-36AF-B214-03C0-H
• Frankenstein (Edison, 1910)
  • 10.5240/4375-1EB2-6CCD-DC73-B77F-V
• Game of Thrones
  • 10.5240/C1B5-3BA1-8991-A571-8472-W
• Blade Runner
  • https://doi.org/10.5240/EA73-79D7-1B2B-B378-3A73-M


First Application: Digital Supply Chain (simplified)

[Diagram labels: Studio, Retailer, Avail, Metadata, Fulfillment Metadata, Media Files, Entitlement Database+, Buy/Rent, Library Browse/Fulfill]


Second application: Getting it right – double-shot movies

• Some movies aren’t dubbed
  • Scenes are re-shot at more or less the same time in a different language, sometimes with some of the actors different
• These aren’t edits of the same movie: they meet the definition of a separate work
• Common in the 1930s
  • Still done for some Indian productions (Tamil/Hindi, for example)


Double-shot examples

• https://doi.org/10.5240/BD8D-8F89-8F75-FE28-7010-M
  • Murder! (1930, GB): double-shot in English (this version) and German.
• https://doi.org/10.5240/C264-EC88-AFA1-2EC2-9B28-Z
  • Mary (1931, GB/DE): double-shot in English and German (this version).
• https://doi.org/10.5240/1388-8D7E-42D2-7147-D5DB-L
  • S.O.S. Iceberg (1933, US/DE): double-shot in German and English (this version).
• https://doi.org/10.5240/9162-6940-4DC3-ABF9-A67B-0
  • S.O.S. Eisberg (1933, DE/US): double-shot in English and German (this version).
• https://doi.org/10.5240/C97B-42DD-BF23-B3FF-4A8B-9
  • Raavan (6/18/2010, IN): shot simultaneously with the Tamil-language version, Raavanan.
• https://doi.org/10.5240/F635-4E44-475B-158B-9FF4-Z
  • Raavanan (6/18/2010, IN): shot simultaneously with the Hindi-language version, Raavan.
• https://doi.org/10.5240/13D5-090F-CA5A-A590-CE47-5
  • Mumbai Express (4/15/2005, IN): double-shot in Hindi (this version) and Tamil.
• https://doi.org/10.5240/2492-A8ED-46AE-7631-40F1-H
  • Mumbai Express (4/15/2005, IN): double-shot in Hindi and Tamil (this version).


Third Application: getting all of it - This Modern Age

• The Rank Organisation’s answer (1946-1954) to The March of Time (by Time, Inc.)
  • Not the 1931 MGM film (that’s https://doi.org/10.5240/DE76-BA99-3701-6237-6BCE-I)
• ITV has most of the Rank catalogue; BFI has some of it; CITWF has metadata for some of it
  • All three had partial data
  • Combining data from all the sources in the EIDR records gives better overall information
• All parties can update their records when they want to
  • And they can also talk to each other about possible collaborations
    • Re-release with supplemental material from BFI, e.g.
  • Should also make later researchers’ lives easier
  • See https://doi.org/10.5240/E051-49A0-94DB-28CC-9F5F-Z for the results


Fourth Application: Linked Data

• Insert Tim Berners-Lee’s definition of linked open data here
  • Problem is the world doesn’t work that way
  • But the core of it is following/resolving identifiers to get to more stuff
• So MovieLabs built an ontology
  • Much more data than the basic stuff in EIDR
  • Treat provenance as an essential item
  • Heavy emphasis on which country or region the data applies to/came from
• Put EIDR data into the form the ontology wants, then…
• Follow alternate IDs in EIDR to other sources, and add that data
  • Wikidata (ontological, but difficult) and IMDB/S3 (not an ontology but very well structured and documented)
  • BFI (not an ontology, but an API, an XSD schema, and mappable controlled vocabulary)
  • Then several more sources
• Build some applications
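A minimal sketch of what "treat provenance as an essential item" could look like when data from several sources is combined: every value keeps the source and region it came from, so disagreements stay visible instead of being overwritten. Everything here is assumed for illustration (the `Claim` type, the helper names, the source names, and the sample values); it is not the MovieLabs ontology.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Claim:
    attribute: str  # e.g. "title" or "release_year"
    value: str
    source: str     # where the statement came from
    region: str     # country or region the data applies to / came from


Merged = Dict[str, Dict[str, List[Tuple[str, str]]]]


def merge_claims(claims: Iterable[Claim]) -> Merged:
    """Group claims by attribute and value, keeping (source, region) pairs."""
    merged = defaultdict(lambda: defaultdict(list))
    for claim in claims:
        merged[claim.attribute][claim.value].append((claim.source, claim.region))
    return {attr: dict(values) for attr, values in merged.items()}


def conflicts(merged: Merged) -> List[str]:
    """Attributes for which the sources disagree."""
    return sorted(attr for attr, values in merged.items() if len(values) > 1)


if __name__ == "__main__":
    # Made-up sample data from two hypothetical sources.
    sample = [
        Claim("title", "Example Title", "source-a", "GB"),
        Claim("title", "Example Title", "source-b", "US"),
        Claim("release_year", "1950", "source-a", "GB"),
        Claim("release_year", "1951", "source-b", "US"),
    ]
    merged = merge_claims(sample)
    print(conflicts(merged))
```

A release year that differs by country is often not a conflict at all, which is one reason to keep the region attached to each value.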


Extras

• Links to non-audiovisual ‘stuff’
  • Books, plays, theme park rides, …
• Editorial comments (reviews, ratings, rankings)
• Expanded person records
• Lots of locations (production, filming, setting)
• Artwork
• Reviews
• Consumption


Demo, if time and the fact that it’s a prototype permit…

• GraphQL queries
• UI
• A bit of machine learning or clever math (depending on your buzzword breaking point)


Why EIDR is Like Every Other PID

• Building a community of practice
  • Or ‘Why can’t our users get their metadata act together?’ and ‘They want to do WHAT?’
  • Connect communities that may not know what they have in common
• Format wars
  • Where does the identifier end and the application begin?
• Shared terms and vocabulary
  • What does ‘version’ mean? What do ‘the same’ and ‘derived from’ mean?
• Identifiers for people and organizations
• What happens when two subcommunities disagree (vigorously) on the ‘best’ way to deal with a corner case?
• How do we talk to people about this stuff?
  • Explain why, not how
  • Deal with misconceptions and myths


Two messages from the past

…we are able to be the Secretarys, the interpreters and preservers of the memorials of our ancestors.

-- William Stukeley, 27 May 1723 (Bodl. MS Eng. Misc. c. 401, fols. 22-23, published in Ayres, Classical Culture and the Idea of Rome, Cambridge 1997)

…I had the greatest intimacy with … the whole sett of learned men … and by having recourse to their librarys I arriv’d to a considerable degree of knowledge & equal reputation

-- William Stukeley, journal entry, Christmas Eve, 1725 (Published in The Family Memoirs of William Stukeley, 3 vols., Durham 1882-7, cited in Ayres)