The Newsroom of Things by BBC News Labs - for ISKOUK "Taming the News Beast"

Post on 09-May-2015

description

This presentation outlines how BBC News Labs is working on entity extraction from BBC News content. It looks at the challenge of how BBC News is going to leverage its USP of storytelling and its famous purpose, "to INFORM, EDUCATE & ENTERTAIN", globally. With the millions and millions of "things" in our content, how do we discover and connect these things? It also mentions two key BBC News Labs projects, "JUICER" and "#newsVANE", and promotes the http://newshack.co.uk/newshack-ii/ event on May 1st in Dublin & Glasgow.

Transcript of The Newsroom of Things by BBC News Labs - for ISKOUK "Taming the News Beast"

Powered by BBC Connected Studio

The Newsroom of Things

ISKO UK “Taming the News Beast”, April 2014

Matt Shearer – innovation manager, @BBC_News_Labs

ABOUT US

“Driving Innovation in News”

NEW TECH AND DATA OPPORTUNITIES

NEW JOURNALISM FORMATS

EXPLORE VIA PROTOTYPING

Part of BBC Connected Studio.

The BBC’s open innovation programme.

WHAT ARE WE* DOING?

* BBC NEWS

Focussing on the next 8 yrs.

INFORM, EDUCATE & ENTERTAIN

DIFFERENTIATOR?

Storytelling with BBC (+) content, past & present

GLOBAL CURATION

HOW?

MACHINES DO HEAVY LIFTING:

Discovery

JOURNALISTS: Amazing Creative Efficiency.

100% time on curation. 0% time on digging and filing…

MORE MACHINE WORK:

Connecting and Surfacing

“juicer”: Stuff & Things

“#newsVANE”: Trends & Signals

SCALE!!

“Stuff & Things”

(and stuff)

The Newsroom of THINGS

30 Languages + 24/7 + Global + Radio + Online + TV = a lot of things

HOW DO WE GET THE THINGS?

THE MACHINES DISCOVER THE THINGS

It’s not NEW ROCKET SCIENCE

Things in TEXT… well explored

Things in AV… is a bit more exciting

Stuff & Things (all 6 types = holy grail)

1. Verbatim transcript (to time) “…where she says ‘damnit!’”
2. Contributors (face and voice) “who’s in this segment?”
3. Objects (audio & image recog) “tank or an elephant?”
4. Scene geolocation “this looks like Bangor”
5. Topics mentioned (people, places, orgs, … Storylines*)
6. Actions & Events (non verbal) “people laughing, kissing”

* Jeremy is telling you in a few mins…
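As a rough illustration of the six types listed above, one way to hold them for a single AV segment is a simple record with one field per type. This is only a sketch: the class name, field names, and the `missing_types` helper are hypothetical and not taken from any BBC News Labs system.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentAnnotations:
    """Hypothetical record for one AV segment, one field per type in the list."""
    start_s: float
    end_s: float
    transcript: str = ""                               # 1. verbatim transcript
    contributors: list = field(default_factory=list)   # 2. faces and voices
    objects: list = field(default_factory=list)        # 3. recognised objects
    geolocation: str = ""                              # 4. scene geolocation
    topics: list = field(default_factory=list)         # 5. topics mentioned
    actions: list = field(default_factory=list)        # 6. actions and events

    def missing_types(self):
        """Names of the six annotation types that are still empty."""
        six = ["transcript", "contributors", "objects",
               "geolocation", "topics", "actions"]
        return [name for name in six if not getattr(self, name)]


# Illustrative values borrowed from the examples on the slide.
segment = SegmentAnnotations(
    start_s=12.0,
    end_s=18.5,
    transcript="damnit!",
    objects=["tank"],
    geolocation="Bangor",
)
print(segment.missing_types())
```

A segment only reaches the "all 6 types" state when `missing_types()` returns an empty list.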

Then we get real relevance from combinations of this stuff & things, and from connecting this stuff & things

Since 2012: R&D JUICER

What is it?

• News Content

• Tagged with Linked Data concepts

The Juicer

1. Get Content
2. Extract Concepts
3. Match to DBpedia
4. Annotate Content
5. Push to Triplestore
6. Expose via API
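The six Juicer steps can be sketched end to end with a toy, in-memory example. This is not the Juicer's actual code: the sample articles, the capitalised-word extractor, the `DBPEDIA` lookup table, the `MENTIONS` predicate, and the use of a Python set as the "triplestore" are all assumptions made for illustration. Only the DBpedia resource URI pattern is real.

```python
from dataclasses import dataclass, field

# Hypothetical lookup table standing in for a real DBpedia matching service.
DBPEDIA = {
    "Glasgow": "http://dbpedia.org/resource/Glasgow",
    "Dublin": "http://dbpedia.org/resource/Dublin",
    "BBC": "http://dbpedia.org/resource/BBC",
}

# Made-up predicate URI, for illustration only.
MENTIONS = "http://example.org/vocab/mentions"


@dataclass
class Article:
    url: str
    text: str
    tags: list = field(default_factory=list)


def get_content():
    """Step 1: fetch content (here, two hard-coded sample articles)."""
    return [
        Article("http://example.org/news/1", "BBC hosts a hack event in Glasgow."),
        Article("http://example.org/news/2", "Academics gather in Dublin and Glasgow."),
    ]


def extract_concepts(text):
    """Step 2: naive extraction, keeping capitalised tokens as candidates."""
    words = [w.strip(".,!?") for w in text.split()]
    return [w for w in words if w[:1].isupper()]


def match_to_dbpedia(candidates):
    """Step 3: keep only candidates that resolve to a known DBpedia URI."""
    return [DBPEDIA[c] for c in candidates if c in DBPEDIA]


def annotate(article):
    """Step 4: attach the matched concept URIs to the article."""
    article.tags = match_to_dbpedia(extract_concepts(article.text))
    return article


def push_to_triplestore(store, article):
    """Step 5: record one (subject, predicate, object) triple per tag."""
    for uri in article.tags:
        store.add((article.url, MENTIONS, uri))


def articles_about(store, concept_uri):
    """Step 6: a query function standing in for the API."""
    return sorted(s for s, p, o in store if p == MENTIONS and o == concept_uri)


def run_pipeline():
    store = set()  # in-memory stand-in for a triplestore
    for article in get_content():
        push_to_triplestore(store, annotate(article))
    return store


store = run_pipeline()
print(articles_about(store, DBPEDIA["Glasgow"]))
```

Because every tag is a shared concept URI rather than a free-text keyword, both sample articles are found by a single query for the Glasgow concept.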

“THINGS”

AT PRESENT:

Topics: PEOPLE, PLACES, ORGANISATIONS & themes.

NEAR FUTURE:

Storylines: curated “Story” – an editorial THING

What’s the content?

Unique Things

…used how many times?

Summary figures

• 680,000 articles tagged with 5,700,000 tags

• ~8 tags per article
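The "~8 tags per article" figure follows directly from the two totals above. A quick check, using only the numbers quoted on the slide (the variable names are illustrative):

```python
# Totals as quoted in the summary figures.
articles = 680_000
tags = 5_700_000

# Average number of tags attached to each article.
tags_per_article = tags / articles
print(f"{tags_per_article:.1f} tags per article")
```

The exact quotient is a little above 8, which rounds to the slide's approximate value.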

We save a lot of manual tag time

NB – this is rough, and just for illustration.

Next: Window on the Newsroom

(+AV with transcript generation and speaker recognition)

and after that:

GLOBAL JUICER

spaces we are in:

Journalism as Data

MetaJournalism

Object-based Broadcast

Come explore with us! See: newsHACK.co.uk

1st & 2nd May.

#newsHACK II. Glasgow & Dublin

(News Orgs and Academics)

@BBC_News_Labs

Thanks