PIDs Go To The Movies
Raymond Drewry, MovieLabs/EIDR
PIDapalooza 2019 - Dublin
pidapalooza.org  eidr.org
What I wanted
• Coverage of a wide range of works from global sources (excluding user-generated content for now)
• Appropriate granularity of identification (covering the abstract concept of an underlying work and all its many variations)
• Reliable, free access to the identifiers and their metadata (i.e. the identifiers are resolvable, and anyone can use them)
• Connection to other data sources (information from multiple sources is more powerful than information from any single source)
• A knowledgeable, engaged user community to help populate the database (crowd-sourced with a curated crowd)
• Ease of integration with and use by other pieces of software - databases aren’t very useful if no one uses them. A UI is just another application.
• Economic viability - cheap is good, and persistence requires longevity, which requires money
What I Didn’t Want
• Rights information - context dependent, varies by time and country, and some of it is secret and wrapped up in complicated legal documents
• People thinking they ‘own’ the metadata or the ID - metadata for identifying audiovisual works is freely available in lots of inconvenient ways, but it is available, so why should a record be ‘owned’?
• Restrictions on read access to the metadata - it’s only useful if people can use it as they wish
• Uncontrolled additions and modifications - when dealing with uniqueness, sloppy metadata is death
• Only commercially important works - that would give a woefully incomplete picture of the world of film and TV, and preclude lots of interesting projects and applications
• Too much legalese - the user agreement is only a couple of pages long, which is remarkable for something that is the offspring of Hollywood and Silicon Valley.
What is EIDR – spiffy marketing version
EIDR Technology Summary
• Interoperable, standards-based infrastructure
• Built on ISO Digital Object Identifier (DOI) standard
• Application integration through public APIs and schemas, freely available SDK for members
• Efficient infrastructure for new and existing applications
EIDR Purpose
• Make digital distribution competitive
• Help reduce costs
• Improve collaboration and automation across multiple application domains & platforms
• Enable new businesses and create new efficiencies
What EIDR is
• Global registry for unique identification of movie and TV content
• Designed for automated machine-to-machine communication
• Flexible data hierarchy down to the product & SKU level, incl. edits, clips, composites, encodings, and relationships
What EIDR is Not
• Profit-making
• Rich commercial metadata
• Ownership or rights information
• US-only
Example EIDR movie hierarchy with multiple versions
Example EIDR episodic hierarchy
[Diagram not reproduced. Levels shown: Series (Abstractions); Seasons (Abstractions); Episodes (Abstractions); Edits (Performances); Manifestations (Digital). Relationships shown: IsSeasonOf, IsEpisodeOf, IsEditOf, IsManifestationOf, IsClipOf, IsPromotionFor. Example nodes: Series; Season 1; Season 2 (similar hierarchy); Episode 1, Episode 2, … Episode N; Broadcast Edit; Retail EST (EN, FR); Season 2 Trailer; Promotional Clip; UGC Upload; Social Upload.]
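The episodic hierarchy can be pictured as a chain of records, each pointing at its parent through a named relationship. The following is a minimal sketch, not EIDR's data model or API: the `Record` class and `lineage` function are hypothetical, and the pairing of nodes with relationships is one plausible reading of the diagram labels.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for one node in the hierarchy."""
    title: str
    level: str                          # e.g. "Series", "Season", "Episode", "Edit"
    relation: Optional[str] = None      # named relationship to the parent
    parent: Optional["Record"] = None


def lineage(record: Record) -> List[Tuple[str, str]]:
    """Walk parent links up to the root, returning (relation, parent title) steps."""
    steps = []
    while record.parent is not None:
        steps.append((record.relation, record.parent.title))
        record = record.parent
    return steps


# Assumed reading of the diagram: manifestation -> edit -> episode -> season -> series.
series = Record("Series", "Series")
season1 = Record("Season 1", "Season", "IsSeasonOf", series)
episode1 = Record("Episode 1", "Episode", "IsEpisodeOf", season1)
broadcast = Record("Broadcast Edit", "Edit", "IsEditOf", episode1)
retail = Record("Retail EST (EN, FR)", "Manifestation", "IsManifestationOf", broadcast)

for relation, parent_title in lineage(retail):
    print(f"{relation} -> {parent_title}")
```

Walking upward from any manifestation recovers the abstract work it ultimately belongs to, which is the point of keeping every level separately identified.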
What Metadata Is Included
• Enough descriptive metadata to distinguish one work from another
  • Different episodes of a Series, remakes, director’s cuts, dubbed languages, etc
  • Different requirements at different levels of the hierarchy
  • Factual, not interpretive = no genres or plot summaries, e.g.
  • Some exceptions (e.g. early Actualities)
• Links to other EIDR IDs
  • Containing Series, original abstract work, items included in a retail compilation, etc
  • We try pretty hard to make this concrete, not marketing or opinion (so we don’t do ‘franchises’, for example)
• Note on EIDR IDs
  • Standard form is 10.5240/C1B5-3BA1-8991-A571-8472-W
  • Lots of other formats defined for various applications
  • Resolution and content negotiation through the DOI proxy
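The standard form lends itself to a quick sanity check in code. The sketch below is a hedged illustration rather than an official validator: the function names are invented, and it assumes the final character is an ISO 7064 Mod 37,36 check character computed over the 20 hexadecimal digits of the suffix, an assumption that agrees with the example IDs quoted in this deck.

```python
import re
from typing import Optional

ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# 10.5240/ followed by five groups of four hex digits and one check character.
EIDR_RE = re.compile(r"^10\.5240/((?:[0-9A-F]{4}-){5})([0-9A-Z])$")
DOI_PROXY = "https://doi.org/"


def check_char(hex_digits: str) -> str:
    """ISO 7064 Mod 37,36 check character for a string of digits/letters."""
    p = 36
    for ch in hex_digits:
        s = (p + ALPHABET.index(ch)) % 36 or 36
        p = (s * 2) % 37
    return ALPHABET[(37 - p) % 36]


def parse_eidr(text: str) -> Optional[str]:
    """Return the standard-form ID if text is well formed, else None.

    Accepts either the bare ID or its doi.org URL form.
    """
    candidate = text.strip()
    if candidate.startswith(DOI_PROXY):
        candidate = candidate[len(DOI_PROXY):]
    candidate = candidate.upper()
    match = EIDR_RE.match(candidate)
    if match is None:
        return None
    digits = match.group(1).replace("-", "")
    if check_char(digits) != match.group(2):
        return None  # shape is right but the check character does not match
    return candidate


print(parse_eidr("10.5240/C1B5-3BA1-8991-A571-8472-W"))
print(parse_eidr("10.5240/C1B5-3BA1-8991-A571-8472-X"))
```

Rejecting an ID with a bad check character catches most single-character typos before any network lookup is attempted.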
What’s in an external identifier?
• The ID, in whatever its basic, native form happens to be
  • Currently only for other audiovisual works
  • Significant interest in covering other things (posters, scripts, reviews, miscellaneous ephemera)
    • …but none of those have good identifiers (yet)
• A type (IMDB, ISAN, DOI, etc)
  • If it’s not a ‘first-class type’, the type is ‘Proprietary’
  • Proprietary types have a domain within which they are valid, e.g. bfi.org.uk, bfi.org.uk/gifford, Disney.com, Disney.com/MPM, etc
• A resolvable URL for that ID, if we know how to make one
  • Generated on the fly from a published list of patterns
  • Some ID types have none, some have more than one
• A relationship (optional)
  • IsSameAs, IsDerivedFrom, ContainsPartOf, etc
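The four parts of an external identifier map naturally onto a small record type. This is an illustrative sketch only: the `AlternateID` class is hypothetical, the pattern table is a stand-in and not EIDR's published list, and the ID values are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-in pattern table keyed by (type, domain). Illustrative only;
# a type may have no pattern or several.
URL_PATTERNS = {
    ("IMDB", None): ["https://www.imdb.com/title/{id}/"],
    ("Wikidata", None): ["https://www.wikidata.org/wiki/{id}"],
    ("Proprietary", "example.com/catalog"): [],  # hypothetical domain, no known URL
}


@dataclass(frozen=True)
class AlternateID:
    value: str                       # the ID in its native form
    type: str                        # first-class type, or "Proprietary"
    domain: Optional[str] = None     # required for Proprietary types
    relation: Optional[str] = None   # e.g. "IsSameAs" (optional)

    def __post_init__(self):
        if self.type == "Proprietary" and not self.domain:
            raise ValueError("Proprietary IDs need a domain")

    def urls(self) -> List[str]:
        """Build resolvable URLs on the fly from the pattern table."""
        patterns = URL_PATTERNS.get((self.type, self.domain), [])
        return [p.format(id=self.value) for p in patterns]


imdb = AlternateID("tt0000000", "IMDB", relation="IsSameAs")   # placeholder value
house = AlternateID("ABC-123", "Proprietary", domain="example.com/catalog")
print(imdb.urls())
print(house.urls())
```

Keeping URLs out of the stored record and deriving them from patterns means a source can change its URL scheme without every record needing an update.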
IDs in EIDR
[Chart not reproduced: EIDR Content IDs and Alternate IDs over time, Dec-10 through Dec-18; vertical axis from 0 to 4,000,000.]
Top Alternate IDs in EIDR

Alt. ID Type                    2018 (000)   Chg.
TiVo ID                         483          39
Webedia/AlloCiné                404          11
IMDb                            314          69
CITWF                           290          30
Sony                            275          13
ISAN                            209          20
ITV                             174          14
Wikidata*                       130          123
Warner                          112          10
Flixster - Rotten Tomatoes      95           0
The Movie Database*             93           93
NBCU                            92           38
ČSFD*                           87           87
Turner Classic Movies (TCM)*    79           79
Baseline                        65           0
European AVO                    57           2
Google Play                     54           4

* New top ID this year
Quick examples
• Le voyage dans la lune [Voyage to the Moon] (Méliès, 1902)
  • 10.5240/C79D-4B8B-36AF-B214-03C0-H
• Frankenstein (Edison, 1910)
  • 10.5240/4375-1EB2-6CCD-DC73-B77F-V
• Game of Thrones
  • 10.5240/C1B5-3BA1-8991-A571-8472-W
• Blade Runner
  • https://doi.org/10.5240/EA73-79D7-1B2B-B378-3A73-M
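Any of these IDs can be turned into a DOI proxy request. The sketch below only builds the request and does not send it; the function name is invented, and the `application/xml` media type is an assumption used to show where content negotiation would go, not a statement of which types are actually served.

```python
from urllib.request import Request


def build_resolution_request(eidr_id: str, accept: str = "application/xml") -> Request:
    """Build (but do not send) a GET request to the DOI proxy for an EIDR ID.

    The Accept header is where content negotiation happens; the default
    media type here is an assumption for illustration.
    """
    return Request(f"https://doi.org/{eidr_id}", headers={"Accept": accept})


req = build_resolution_request("10.5240/EA73-79D7-1B2B-B378-3A73-M")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the lookup; it is left out here so the sketch runs without network access.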
First Application: Digital Supply Chain (simplified)
[Diagram not reproduced: simplified digital supply chain between Studio and Retailer. Labels shown: Avail; Metadata; Fulfillment Metadata; Media Files; Buy/Rent; Library Browse/Fulfill; Entitlement Database+.]
Second application: Getting it right - double-shot movies
• Some movies aren’t dubbed
  • Scenes are re-shot at more or less the same time in a different language, sometimes with some of the actors different
• These aren’t edits of the same movie - they meet the definition of a separate work
• Common in the 1930s
• Still done for some Indian productions (Tamil/Hindi, for example)
Double-shot examples
• https://doi.org/10.5240/BD8D-8F89-8F75-FE28-7010-M
  • Murder! (1930, GB). Double-shot in English (this version) and German.
• https://doi.org/10.5240/C264-EC88-AFA1-2EC2-9B28-Z
  • Mary (1931, GB, DE). Double-shot in English and German (this version).
• https://doi.org/10.5240/1388-8D7E-42D2-7147-D5DB-L
  • S.O.S. Iceberg (1933, US, DE). Double-shot in German and English (this version).
• https://doi.org/10.5240/9162-6940-4DC3-ABF9-A67B-0
  • S.O.S. Eisberg (1933, DE, US). Double-shot in English and German (this version).
• https://doi.org/10.5240/C97B-42DD-BF23-B3FF-4A8B-9
  • Raavan (6/18/2010, IN). Shot simultaneously with the Tamil-language version, Raavanan.
• https://doi.org/10.5240/F635-4E44-475B-158B-9FF4-Z
  • Raavanan (6/18/2010, IN). Shot simultaneously with the Hindi-language version, Raavan.
• https://doi.org/10.5240/13D5-090F-CA5A-A590-CE47-5
  • Mumbai Express (4/15/2005, IN). Double-shot in Hindi (this version) and Tamil.
• https://doi.org/10.5240/2492-A8ED-46AE-7631-40F1-H
  • Mumbai Express (4/15/2005, IN). Double-shot in Hindi and Tamil (this version).
Third Application: getting all of it -- This Modern Age
• The Rank Organisation’s answer (1946-1954) to The March of Time (by Time, Inc.)
  • Not the 1931 MGM film (that’s https://doi.org/10.5240/DE76-BA99-3701-6237-6BCE-I)
• ITV has most of the Rank catalogue; BFI has some of it; CITWF has metadata for some of it
  • All three had partial data
• Combining data from all the sources in the EIDR records gives better overall information
  • All parties can update their records when they want to
  • And they can also talk to each other about possible collaborations
    • Re-release with supplemental material from BFI, e.g.
  • Should also make later researchers’ lives easier
• See https://doi.org/10.5240/E051-49A0-94DB-28CC-9F5F-Z for the results
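Combining partial data from several contributors can be sketched as a merge that remembers who supplied each value. This is a hypothetical illustration, not the EIDR registration process: the function names are invented and the field values are placeholders, not real catalogue data.

```python
from typing import Dict, List, Optional, Tuple

Partial = Tuple[str, Dict[str, Optional[str]]]


def merge_records(partials: List[Partial]) -> Dict[str, Dict[str, List[str]]]:
    """Merge partial records into field -> {value: [sources that supplied it]}."""
    merged: Dict[str, Dict[str, List[str]]] = {}
    for source, fields in partials:
        for name, value in fields.items():
            if value is None:
                continue  # this source has nothing for the field
            merged.setdefault(name, {}).setdefault(value, []).append(source)
    return merged


def conflicts(merged: Dict[str, Dict[str, List[str]]]) -> Dict[str, Dict[str, List[str]]]:
    """Fields where sources disagree and a human should take a look."""
    return {name: values for name, values in merged.items() if len(values) > 1}


# Placeholder values only; the source names come from the slide above.
partials = [
    ("ITV", {"title": "title-1", "release_year": "year-1", "running_time": None}),
    ("BFI", {"title": "title-1", "release_year": "year-2", "running_time": "time-1"}),
    ("CITWF", {"title": "title-1", "release_year": "year-1"}),
]

merged = merge_records(partials)
print(merged["running_time"])   # only one source had this field
print(conflicts(merged))        # release_year needs review
```

No single source fills every field, but the merged view does, and disagreements surface as items for curation rather than being silently overwritten.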
Fourth Application: Linked Data
• Insert Tim Berners-Lee’s definition of linked open data here
• Problem is the world doesn’t work that way
• But the core of it is following/resolving identifiers to get to more stuff
• So MovieLabs built an ontology
  • Much more data than the basic stuff in EIDR
  • Treat provenance as an essential item
  • Heavy emphasis on which country or region the data applies to/came from
• Put EIDR data into the form the ontology wants, then….
• Follow alternate IDs in EIDR to other sources, and add that data
  • Wikidata (ontological, but difficult) and IMDB/S3 (not an ontology but very well structured and documented)
  • BFI (not an ontology, but an API, an XSD schema, and mappable controlled vocabulary)
  • Then several more sources
• Build some applications
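The "follow alternate IDs, add that data, keep provenance" loop can be sketched in a few lines. This is not the MovieLabs ontology or any real source API: the `Statement` type, the `enrich` function, and the stub fetchers with their return values are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Fact = Tuple[str, str, Optional[str]]   # (predicate, value, region)


@dataclass(frozen=True)
class Statement:
    subject: str             # EIDR ID the data is attached to
    predicate: str
    value: str
    source: str              # provenance: which source said this
    region: Optional[str]    # country/region the data applies to, if known


def enrich(eidr_id: str,
           alt_ids: Dict[str, str],
           fetchers: Dict[str, Callable[[str], List[Fact]]]) -> List[Statement]:
    """Follow each alternate ID to its source and record what comes back."""
    statements = []
    for id_type, id_value in alt_ids.items():
        fetch = fetchers.get(id_type)
        if fetch is None:
            continue  # no connector for this source yet
        for predicate, value, region in fetch(id_value):
            statements.append(Statement(eidr_id, predicate, value, id_type, region))
    return statements


# Stub fetchers standing in for real source connectors; values are placeholders.
fetchers = {
    "Wikidata": lambda _id: [("label", "label-1", None)],
    "BFI": lambda _id: [("release_date", "date-1", "GB")],
}
alt_ids = {"Wikidata": "Q0", "BFI": "bfi-0", "ISAN": "isan-0"}

for st in enrich("10.5240/C1B5-3BA1-8991-A571-8472-W", alt_ids, fetchers):
    print(st)
```

Because every statement carries its source and region, an application can later prefer one source over another, or show only data that applies to a given country.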
Extras
• Links to non-audiovisual ‘stuff’
  • Books, plays, theme park rides, ….
• Editorial comments (reviews, ratings, rankings)
• Expanded person records
• Lots of locations (production, filming, setting)
• Artwork
• Reviews
• Consumption
Demo, if time and the fact that it’s a prototype permit….
• GraphQL queries
• UI
• A bit of machine learning or clever math (depending on your buzzword breaking point)
Why EIDR is Like Every Other PID
• Building a community of practice
• Or ‘Why can’t our users get their metadata act together?’ and ‘They want to do WHAT?’
• Connect communities that may not know what they have in common
• Format wars
• Where does the identifier end and the application begin?
• Shared terms and vocabulary.
• What does ‘version’ mean? What do ‘the same’ and ‘derived from’ mean?
• Identifiers for people and organizations
• What happens when two subcommunities disagree (vigorously) on the ‘best’ way to deal with a corner case?
• How do we talk to people about this stuff?
• Explain why, not how
• Deal with misconceptions and myths
Two messages from the past
…we are able to be the Secretarys, the interpreters and preservers of the memorials of our ancestors.
-- William Stukeley, 27 May 1723 (Bodl. MS Eng. Misc. c. 401, fols. 22-23, published in Ayres, Classical Culture and the Idea of Rome, Cambridge 1997)
…I had the greatest intimacy with … the whole sett of learned men … and by having recourse to their librarys I arriv’d to a considerable degree of knowledge & equal reputation
-- William Stukeley, journal entry, Christmas Eve, 1725 (Published in The Family Memoirs of William Stukeley, 3 vols., Durham 1882-7, cited in Ayres)