Seminar 3: Creative use of archive


Oct 21, 2013




‘Digital Public Space prototype demonstration’

Jake Berger, Programme Manager, Digital Public Space, BBC

Bill Thompson, Chair and Head of Partnership Development, BBC Archives

Now I’d like to ask Jake Berger from the ... to come and talk to us about the
digital public space data model and what we’ve been working on.

Jake Berger, Programme Manager, Digital Public Space, BBC

It’s, er... hopefully it’ll be a little bit more exciting than...

Bill Thompson

The digital public space data model is exciting. You and I love metadata.

Jake Berger

We do, yeah.

Bill Thompson

These people just don’t understand the sheer glory of triples...

Jake Berger:


Bill Thompson

And linked open data

Jake Berger

And I promise I won’t mention the word metadata, or triples, in this entire
talk. But erm, it’s quite interesting: I’m the fourth man in a dark suit to
stand up in front of you today, but you can tell we’re from the creative sector
’cause none of us are actually wearing a tie.

That’s me, that’s what I do, that’s who I work for.

I’d like you to imagine that every museum, archive, gallery, library, theatre,
and studio in the country could all be found next to each other. And that they
each had every single item in their collections on display. And imagine if the
smallest organisation’s archives and objects had exactly the same level of
visibility and accessibility as the big nationals. And imagine that all of this
material and information were linked together. Now, hold that thought for a
moment. This is a picture of the internet.

Now, the web finds us stuff, it shows us loads of stuff, and it links to loads
of other stuff. So it’s great at linking things together, but it’s not yet great
at making real meaningful connections between all of the things. That’s left up
to the humans to do. It’s not very good at saying that this thing is like this
other thing, or this thing is different to this other thing. These are not the
same things.

Paris Hilton is not the same as the Paris Hilton. But, if you try and find that
picture by typing in ‘Paris Hilton’ or ‘Hilton in Paris’, you have to work your
way, as I found a couple of days ago, through about a hundred thousand of these
before you get that. Most people are probably looking for the one on the left,
but I was looking for the one on the right. We need to do something about this.
It shouldn’t just be what’s popular that is always first. But all of this is
possible. This is about as technical as I will get: this is the vision of linked
data and the semantic web.

It’s a bit hard to see here but that says

Now if we can tell the web what each thing is, and what each thing isn’t... rule
number one: never let René Magritte do the...

[Audience laughter]

And if we tell it what set, if you remember back to your Venn diagrams at school,
or group of things it’s part of, and how all these sets relate to each other, then we
can ask new kinds of questions and we can get new kinds of answers.

Such as [shows audience picture] or [shows audience picture]. Probably get zero
results for that, but it’d be worth a try.

Or more simply

[shows audience picture].

So you should get the idea.
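
The idea of telling the web what each thing is, and which sets it belongs to,
can be sketched in a few lines of Python. This is only an illustration under
stated assumptions, not the Digital Public Space implementation: the entity
names, the `is_a` and `located_in` relations, and the `query` and `places_in`
helpers are all hypothetical.

```python
# Hypothetical facts, each stored as a (subject, relation, object) statement.
FACTS = {
    ("paris_hilton_person", "is_a", "person"),
    ("hilton_paris_hotel", "is_a", "place"),
    ("hilton_paris_hotel", "located_in", "paris"),
    ("paris", "is_a", "place"),
}


def query(facts, relation, obj):
    """Return, sorted, every subject linked to `obj` by `relation`."""
    return sorted(s for (s, r, o) in facts if r == relation and o == obj)


def places_in(facts, location):
    """Combine two sets: things that are places AND are located in `location`."""
    places = set(query(facts, "is_a", "place"))
    inside = set(query(facts, "located_in", location))
    return sorted(places & inside)


print(places_in(FACTS, "paris"))
```

Because the person and the hotel are declared as different kinds of thing, a
question about places in Paris no longer has to wade through results about the
person.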

So, how can we tell the web what these things are? Well let’s call them
entities. Now I think, and I’m very happy to be proved wrong, that we can
understand every entity in the world as being either a person, a place, a
collection, an event or a thing. Now, things are the kind of ‘get out of jail
free’ card because it captures everything that doesn’t fit into all the previous
ones. These entities are often associated with a time, a moment or period in the
past, the present, or the future, and assertions that are made about these
things by people and by machines. These asserters are all assigned various
levels of authority and perspective, so a curator, an expert, a witness, a
creator. I’m sure some of you will be used to making assertions, and I’m sure
some people will actually believe them. But all entities have some sort of
physical representation in the real world, whether that’s a statue, a recording,
a video, some ones and zeros on a memory stick, or a server, or a flash card.
And some of these entities are going to have emotional states associated with
them. These can be very different depending on which character you play in the
story.
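
A minimal sketch of how the five entity types, their associated time, their
physical representations and the assertions made about them might be written
down as data. Everything here is an assumption made for illustration: the class
names `Kind`, `Entity` and `Assertion`, their fields, and the `classify` helper
are hypothetical and are not taken from the data model shown in the talk.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    PERSON = "person"
    PLACE = "place"
    COLLECTION = "collection"
    EVENT = "event"
    THING = "thing"  # catch-all for whatever fits none of the others


@dataclass
class Assertion:
    statement: str
    asserter: str
    role: str  # e.g. "curator", "expert", "witness", "creator"
    by_machine: bool = False


@dataclass
class Entity:
    label: str
    kind: Kind
    time: Optional[str] = None  # a moment or period; free text in this sketch
    representations: List[str] = field(default_factory=list)
    assertions: List[Assertion] = field(default_factory=list)


def classify(label: str, kind_name: str) -> Entity:
    """Build an entity, falling back to THING when the kind is not recognised."""
    try:
        kind = Kind(kind_name)
    except ValueError:
        kind = Kind.THING
    return Entity(label=label, kind=kind)


statue = classify("a statue", "sculpture")  # unknown kind, so it becomes a THING
statue.assertions.append(Assertion("made of bronze", "a visitor", "witness"))
print(statue.kind, len(statue.assertions))
```

Keeping the asserter and their role on each assertion means two people can say
different things about the same entity without either statement being lost.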

Each physical representation is held somewhere (that’s the Amazon ..., by the
way, if you wonder where all your stuff comes from) or is displayed somewhere.
And all of these entities will sit somewhere on a spectrum of availability and
affordability, somewhere between free and open, or closed and ...

How can we make this vision of connected availability happen?

Starting with the material that we have in our own archives, in our collections,
and the data, if we can classify or tag all of it, if we can digitise it, do
this in a semantic web friendly manner, following some very basic, simple rules
and ... (there’s nothing more complex than the grammar you would learn in your
first years of secondary school), and make them available and open, then people
can find our stuff. They can make their own assertions about it; they can relate
it to other things. They can tell us things that we don’t know about our own
material, which then adds to the findability, the interestingness and the
usefulness of it. It’s a positive feedback loop, it’s a positive cycle. You’re
probably thinking this, and quite rightly so.

So what are we doing? We, the BBC in conjunction with partners, many of whom are
represented in this audience, are trying to create a framework that makes all of
this feasible for any organisation, small or large. We’ve drafted an overarching
data model in conjunction with a number of organisations (this lot at the
moment, but we have many more who are interested). The data model simply brings
together a whole load of different catalogues, classifies and identifies them in
a consistent way, picks out themes, types, sets and relationships within them,
and maps out those connections.

Now this next slide. If you are of a nervous disposition I’d ask you to look
away now, but I’ll only keep it up there for a couple of seconds. This is the
data model, which you can’t see there, thank you lights.

[Audience laughter]


...’s actually got it tattooed on his inner thigh if you’re interested, and I’m selling
posters at the end at very reasonable prices, so come and see me.

So, turning this vision into something that’s usable and interesting: well,
we’ve created a prototype system that aggregates all of these data sets,
translates them into the categories of people, places, collections, events and
things, and starts to make connections between them. It will eventually enable
all of the other things that I’ve talked about. But at the moment it’s
relatively basic.


I’d like to show you this system, but I’m afraid I can’t because my ... broke it
last week while ingesting 10 million records of the National Archives. I’ll have
something that I can show you soon. I can show you a slightly shaky version of
it in the break, or come and talk to us afterwards.

But actually the visible bit isn’t the important bit of this; the important bit
is bringing all of these data sets together, being able to translate them. The
really clever bit of it is a few algorithms that create and associate all of
these different things in a way that a human being could do if they had, I don’t
know, 10 million years at their disposal.

What I was going to show you is a couple of example interfaces that we’ve built
over the last few months that demonstrate the kind of thing you can do on this
platform. They would have looked like this, so here’s the view of the Royal
Opera; it shows a few things you can explore. Don’t know why Southend Pier’s up
there, but there you go.

This is a person page for Winston Churchill. You see it’s just pulling in
information from other sources.

A place. A thing. An event.

And if you can see at the top, it’s beginning to group these things together, so
this event is part of tourism ceremonies, trade events, Royal Festival Hall.

None of this has been hand created or ...; all of this is linked, structured
data and algorithms that are saying, ‘this thing here is probably like that
thing, and if that thing’s related to those things then this thing’s probably
related to those other things.’

I’ll draw you a diagram.
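
The ‘this thing is probably like that thing’ reasoning can be illustrated with a
small sketch. The talk does not say which algorithms the prototype uses, so this
example swaps in a simple, well-known stand-in: Jaccard overlap between the sets
of things two entities are already linked to. The entity names, the `LINKS`
table, the `probably_related` helper and the 0.5 threshold are all hypothetical.

```python
def jaccard(a, b):
    """Share of links two entities have in common, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical links: each entity maps to the things it is already related to.
LINKS = {
    "event_a": {"royal_festival_hall", "ceremonies", "trade_events"},
    "event_b": {"royal_festival_hall", "ceremonies"},
    "event_c": {"southend_pier"},
}


def probably_related(links, entity, threshold=0.5):
    """List the other entities whose links overlap enough with `entity`."""
    return sorted(
        other
        for other in links
        if other != entity and jaccard(links[entity], links[other]) >= threshold
    )


print(probably_related(LINKS, "event_a"))
```

A real system would weigh many more signals than shared links, but the shape of
the inference is the same: overlap in what two things are related to is treated
as evidence that the things themselves are related.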

We also wanted to have a bit more of a kind of video-friendly version of it, so
here’s an interface which finds a whole load of videos related to Enid Blyton,
and when we’ve got this hooked up with the British Library’s collection it will
show you books.

This is a time view which lets you jump from millennium to century, to decade,
to year, to month, to day, pulling bits of information from everyone’s
collections that relate to that particular moment in time.
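
One way to picture the time view is as a path of ever narrower labels computed
from a date, with records selected by matching a label at a chosen level. This
is a sketch under assumptions, not the prototype’s code: the `time_path` and
`records_at` functions, the label formats and the sample records are all
hypothetical.

```python
from datetime import date


def time_path(d: date):
    """Return labels from millennium down to day for a date (year 1 onwards)."""
    millennium = (d.year - 1) // 1000 + 1
    century = (d.year - 1) // 100 + 1
    decade = d.year // 10 * 10
    return [
        f"millennium {millennium}",
        f"century {century}",
        f"{decade}s",
        str(d.year),
        f"{d.year}-{d.month:02d}",
        d.isoformat(),
    ]


def records_at(records, level, label):
    """Pick the titles of (title, date) records matching `label` at `level`."""
    return [title for title, d in records if time_path(d)[level] == label]


# Hypothetical records drawn from different collections.
RECORDS = [
    ("item_a", date(1965, 1, 24)),
    ("item_b", date(1969, 7, 20)),
    ("item_c", date(1874, 11, 30)),
]

print(time_path(date(1965, 1, 24)))
print(records_at(RECORDS, 2, "1960s"))
```

Level 0 is the millennium and level 5 is the single day, so jumping between
levels is just a matter of comparing a different position in the path.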

Here are the results from the first database for Swan Lake, breaking them down
into things, events, collections, places. This has only got about probably half
of 1% of the amount of material that it will eventually have. So if we can get
some interesting connections across different people’s collections with the 1%,
imagine what happens when we multiply that by a factor of one hundred.

And then this just lets you kind of create your own view of it, or see what
other people are interested in, in a kind of ‘my f...


But this can only work if it’s much, much bigger and broader than the BBC. All
we’re really trying to do is create standards, frameworks and tools for other
people to use. We can do this because we are funded by you and, you know, 60
million other people. We should do this because we have engineers, we have
archivists, we have producers, and they’re all generally pretty busy and
there’ll be a few less of them after today’s announcement, but we feel it’s a
fundamental thing that the BBC should be doing,

in the same way that we made sure that your radio would work from the peak of
the highest Scottish mountain to the lowest valley, maybe.

It must work for everyone, for the smallest organisation or individual, you
know, down, or up, up to the biggest behemoth. So we want people to contribute
data and media, to make it available; we can help you understand easy ways to do
that. We need people to play with what we’re creating, try and break it, tell us
how to make it better. Tell us, ‘Ooh, if only it did this thing, suddenly that
would fit my ...’
And we want people to think about how they could use what we’re creating to
supplement the stuff that you’re already creating. Everything that we would pull
together here we would like to be usable by, you know, small websites, by small
exhibitions, by school kids, through to massive national projects.

If you’re still interested, then come and talk to us. I didn’t realise Tony was
gonna be here, otherwise I wouldn’t have used that picture, but sorry Tony, it
was the second one that came up on Google.

If you’re really important, then you can talk to ...

Thank you for listening.