IPTC News in JSON Spring 2013
News in JSON
Stuart Myles * Associated Press * 11th March 2013
News in JSON Activity
http://www.flickr.com/photos/jondresner/5789254800/
• Determined the priority properties to be expressed
– By examining G2, rNews, NITF and existing implementations
• Created 2-3 candidate JSON representations
– Places
– Subjects
– Text markup
• Wrote experimental code
– To try out the candidate structures
Remind Me: What is JSON?
JSON = JavaScript Object Notation http://json.org/
Name / value pairs (NVPs): a field name in quotes, a colon, then a value (string values go in quotes)
"givenname" : "Stuart"
Objects: written inside curly braces, may contain multiple NVPs
{"givenname" : "Stuart", "familyname" : "Myles"}
Arrays: written inside square brackets, may contain multiple objects
{"iptcdelegates": [
  { "givenname": "Dave", "familyname": "Compton" },
  { "givenname": "Stuart", "familyname": "Myles" },
  { "givenname": "Robert", "familyname": "Schmidt-Nia" }
]}
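As a small illustration of how these three building blocks map onto a programming language, the sketch below parses the delegates example with Python's standard json module. The variable names are chosen for this example only.

```python
import json

# The array example from the slide, as one JSON document.
document = """
{"iptcdelegates": [
    {"givenname": "Dave", "familyname": "Compton"},
    {"givenname": "Stuart", "familyname": "Myles"},
    {"givenname": "Robert", "familyname": "Schmidt-Nia"}
]}
"""

# JSON objects become dicts, JSON arrays become lists.
data = json.loads(document)
delegates = data["iptcdelegates"]

# Each array entry is an object made of name / value pairs.
full_names = [f"{d['givenname']} {d['familyname']}" for d in delegates]
print(full_names)
```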
© 2013 IPTC (www.iptc.org) All rights reserved 4
Things We Considered But Decided Against
• Translating from an existing XML standard into JSON
– Not all IPTC standards are XML
– Not all publishers use the same IPTC standards
– Not all publishers use any IPTC standards
• “Mechanically” translating from XML into JSON
– There are many libraries that can do this
– Different choices for how to represent certain XML features
– So each technique results in slightly different JSON
– We felt that a more “natural” JSON would be more valuable
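To make the "slightly different JSON" point concrete, here is a minimal sketch of two hand-written conversion conventions applied to the same XML fragment. Both the fragment and the two conventions are assumptions made up for this illustration; they are not taken from any IPTC standard or from a specific conversion library.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML fragment, invented for this example.
XML = '<subject type="person"><name>Stuart Myles</name></subject>'

def convert_attr_prefix(elem):
    """Convention A: attributes become '@name' keys, text becomes '#text'."""
    node = {f"@{k}": v for k, v in elem.attrib.items()}
    for child in elem:
        node[child.tag] = convert_attr_prefix(child)
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text
    return node

def convert_flat(elem):
    """Convention B: attributes and children share keys; leaf elements
    collapse to plain strings. Text on elements that also carry
    attributes or children is ignored in this sketch."""
    if not elem.attrib and len(elem) == 0:
        return (elem.text or "").strip()
    node = dict(elem.attrib)
    for child in elem:
        node[child.tag] = convert_flat(child)
    return node

root = ET.fromstring(XML)
json_a = {root.tag: convert_attr_prefix(root)}
json_b = {root.tag: convert_flat(root)}
print(json.dumps(json_a))
print(json.dumps(json_b))
```

The same input yields two structurally different documents, so code written against one convention will not read the other.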
News in JSON Properties
• We reviewed existing sets of news properties including
– NewsML-G2
– NewsML 1
– rNews
– NITF
• We selected a set of priority properties to represent
– https://docs.google.com/spreadsheet/ccc?key=0AvnUbLxJqDwBdGxOQXdYeTRPM2k3WFhiNGRuMWR2M1E
• We could add more later...
• ...but we wanted to start somewhere
Let’s draft a News in JSON white paper!
Representing Places in JSON
Geographic metadata such as
• Display Name: Brooklyn (NYC)
• ID: http://id.example.org/5110302
• Centroid: 40.6501038, -73.9495823
• Bounding Box: 40.453216826620995, -73.68930777156369, 40.846990773379, 1.0, -74.20985682843632
• Hierarchy: Kings County > New York > NY > USA
• Type: Second order administrative division
Several non-publishing JSON implementations, such as
• GeoJSON http://www.geojson.org/
• Geonames API http://www.geonames.org/export/JSON-webservices.html
Two Ways to Represent Places
Approach #1: The geonames way
http://api.geonames.org/getJSON?geonameId=5110302&username=kansandhaus&format=json
Approach #2: With a bit more structure
https://gist.github.com/jays0n/5032774
We wrote some code to test them out
The app selects a few fields and prints out the objects created
Note the different nesting that the two approaches produce in the two BO classes.
http://tech.groups.yahoo.com/group/iptc-news-in-json-dev/files/jayson-json-geo-tests.tar.gz
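The difference in nesting can be sketched as follows. This is not the test code from the archive above; the field names in both styles are assumptions for illustration, loosely inspired by a flat geonames-style response and by a more nested layout, and the values come from the Brooklyn example.

```python
import json

# Style 1: flat layout (field names assumed for this sketch).
flat_place = json.loads("""
{"name": "Brooklyn (NYC)",
 "id": "http://id.example.org/5110302",
 "lat": 40.6501038,
 "lng": -73.9495823,
 "type": "Second order administrative division"}
""")

# Style 2: a bit more structure (field names assumed for this sketch).
structured_place = json.loads("""
{"name": "Brooklyn (NYC)",
 "id": "http://id.example.org/5110302",
 "geometry": {"centroid": {"latitude": 40.6501038,
                           "longitude": -73.9495823}},
 "type": "Second order administrative division"}
""")

def centroid_flat(place):
    # Coordinates sit directly on the place object.
    return (place["lat"], place["lng"])

def centroid_structured(place):
    # Coordinates sit two levels down, grouped under "geometry".
    centroid = place["geometry"]["centroid"]
    return (centroid["latitude"], centroid["longitude"])

print(centroid_flat(flat_place))
print(centroid_structured(structured_place))
```

Either accessor is a line or two of code, which is consistent with the finding that there was little to choose between the two styles in practice.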
What We Learnt
• Simpler JSON results in simpler code
– Avoid arrays if they will normally only contain a single object
• Ensure property labels start with lower case letters
– Some parsers (e.g. Jackson) assume this convention
• The main conclusion: there wasn’t much to choose between the two styles in practice
• Proposal: adopt the slightly more structured approach
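The advice about single-element arrays can be illustrated with a short sketch. The "headline" and "location" property names are hypothetical and are used only to show how the reading code changes.

```python
import json

# The same item, once with the location wrapped in an array, once without.
with_array = json.loads(
    '{"headline": "Example", "location": [{"name": "Brooklyn (NYC)"}]}')
without_array = json.loads(
    '{"headline": "Example", "location": {"name": "Brooklyn (NYC)"}}')

def location_name_from_array(item):
    # The consumer has to handle the empty case and pick an index.
    locations = item.get("location", [])
    return locations[0]["name"] if locations else None

def location_name_from_object(item):
    # A single object can be read with one chained lookup.
    return item.get("location", {}).get("name")

print(location_name_from_array(with_array))
print(location_name_from_object(without_array))
```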
Some Useful Tools
• http://jsonlint.com/
– helpful for finding syntax errors
• http://jackson.codehaus.org/
– nice JSON support in Java
• http://goessner.net/articles/JsonPath/
– like XPath for JSON
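To give a feel for the "XPath for JSON" idea, the sketch below resolves a simple path against parsed JSON. It is a hand-rolled helper written for this example, not the JsonPath library, and it only understands dotted keys and [n] indexes; real JsonPath expressions support much more.

```python
import re

def lookup(data, path):
    """Resolve a simple path such as 'iptcdelegates[1].familyname'.

    Hypothetical helper: supports only dotted keys and [n] indexes.
    """
    current = data
    # Each match is either a key (group 1) or a numeric index (group 2).
    for key, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", path):
        current = current[key] if key else current[int(index)]
    return current

data = {"iptcdelegates": [
    {"givenname": "Dave", "familyname": "Compton"},
    {"givenname": "Stuart", "familyname": "Myles"},
]}

print(lookup(data, "iptcdelegates[1].familyname"))
```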
Subjects in JSON
• Subjects
– People, companies, organizations, abstract concepts
– Keywords, categories
• A single structure for all
– Like NITF http://www.iptc.org/std/NITF/3.6/documentation/nitf-3-6.html#Link19
– For example https://gist.github.com/kansandhaus/5049159
• Each subject type has its own structure
– For example https://gist.github.com/anonymous/5049220
Subjects in JSON
• A single structure leaves no room for error in selecting the “bucket” to use to represent a given concept
• However, the code to access these anonymous buckets is much more complex
• To select documents which are marked as having a location=San Francisco in MongoDB
– Single structure: mnc.queryDB("{\"keywords\" : {\"$elemMatch\" : {\"type\" : \"location\", \"name\" : \"San Francisco (CA)\"}}}");
– Specific buckets: mnc.queryDB("{ 'locations.name' : 'San Francisco (CA)' }");
• Proposal: Adopt the “specific buckets” structure
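The same trade-off shows up when filtering documents in application code rather than in MongoDB. The sketch below is an in-memory illustration only; the sample documents and the "keywords", "locations", "people" and "organizations" property names are assumptions for this example.

```python
# Single structure: every subject goes into one typed "keywords" array.
docs_single = [
    {"headline": "A", "keywords": [
        {"type": "location", "name": "San Francisco (CA)"},
        {"type": "person", "name": "Stuart Myles"}]},
    {"headline": "B", "keywords": [
        {"type": "organization", "name": "San Francisco (CA)"}]},
]

# Specific buckets: each subject type has its own property.
docs_buckets = [
    {"headline": "A",
     "locations": [{"name": "San Francisco (CA)"}],
     "people": [{"name": "Stuart Myles"}]},
    {"headline": "B",
     "organizations": [{"name": "San Francisco (CA)"}]},
]

def by_location_single(docs, name):
    # Type and name must match on the same array element,
    # which is the job $elemMatch does in the MongoDB query.
    return [d["headline"] for d in docs
            if any(k.get("type") == "location" and k.get("name") == name
                   for k in d.get("keywords", []))]

def by_location_buckets(docs, name):
    # The bucket already says "location", so only the name is checked.
    return [d["headline"] for d in docs
            if any(loc.get("name") == name
                   for loc in d.get("locations", []))]

print(by_location_single(docs_single, "San Francisco (CA)"))
print(by_location_buckets(docs_buckets, "San Francisco (CA)"))
```

Document B carries the same name under a different type, so the single-structure filter has to test both fields together to avoid a false match.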
Text Markup in JSON
• How to represent richly marked up text in JSON?
– A sweet spot for document-oriented XML
– Could be HTML, XHTML, NITF ...
• We experimented with two existing text markup examples
– NITF: http://www.iptc.org/std/NITF/3.2/examples/nitf-fishing.xml
– HTML: http://dev.iptc.org/Implementation-Guide-HTML-5-Microdata-in-IPTC-namespace
Text Markup Options in JSON
• Plain text, stripped of markup
• Preserved but escaped markup
– HTML: https://gist.github.com/anonymous/4996653
– XML: https://gist.github.com/anonymous/4996676
– See http://stackoverflow.com/questions/993970/what-do-i-need-to-escape-in-my-html-json-response for a discussion of how to escape markup in JSON
• Mechanically create JSON structures to mimic the original markup
– We used JSONML as an example http://www.jsonml.org/
– NITF: https://gist.github.com/anonymous/4996697
– HTML: https://gist.github.com/anonymous/4996720
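The three options can be compared side by side on one small fragment. The sketch below uses a made-up piece of markup rather than the NITF or HTML examples linked above, and the JsonML-style converter is a simplified approximation of the [tag, {attributes}, children...] layout, not the JsonML reference implementation.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical well-formed fragment, invented for this example.
MARKUP = '<p class="lead">Fishing is <em>fun</em> &amp; relaxing.</p>'

root = ET.fromstring(MARKUP)

# Option 1: plain text, stripped of markup.
plain_text = "".join(root.itertext())

# Option 2: markup preserved as a JSON string value;
# json.dumps escapes the double quotes inside the markup.
escaped = json.dumps({"body": MARKUP})

# Option 3: JsonML-style arrays: [tag, {attributes}, child, ...].
def to_jsonml(elem):
    node = [elem.tag]
    if elem.attrib:
        node.append(dict(elem.attrib))
    if elem.text:
        node.append(elem.text)
    for child in elem:
        node.append(to_jsonml(child))
        if child.tail:
            # Text following a child element belongs to the parent.
            node.append(child.tail)
    return node

jsonml = to_jsonml(root)

print(plain_text)
print(escaped)
print(json.dumps(jsonml))
```

Option 2 round-trips exactly: parsing the JSON string returns the original markup, ready to hand to a browser or an XML parser.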
What We Learnt
• Both plain text (no markup) and escaped markup have clear use cases
– Plain text can be useful for search, for example
– Escaped markup works well for direct display on a webpage
• Translated markup (like JSONML) works OK if you have a library to implement the rules
– But what is the added benefit beyond just working directly with XML or HTML?
– Who will write and maintain the libraries for every language?
• Proposal: Let providers use both plain and escaped text
News in JSON Road Map
• Evaluate more structures
– Such as links to binaries
• Write a white paper on our initial recommendations
– Publish and seek out feedback within IPTC and beyond
• Create a News in JSON 1.0 recommendation
– Present it for a vote at the Paris meeting
– Consider an experimental phase
• You can help by joining the News in JSON group
– [email protected]
Date and Place of Next Meeting
Paris 24 - 26 June, 2013
http://www.flickr.com/photos/anirudhkoul/3536413126/
Thank you and goodbye!