@danbri
Created December 21, 2023 21:26
# ETech 2004: Foaf session
#
# Dan Brickley - fear of a foaf planet
#
# abstract: http://conferences.oreillynet.com/cs/et2004/view/e_sess/4757
#
# [a pretty verbatim transcript, dropped some of the 'kindas' and suchlike]
# thanks to dav for the movie.
#
# danbri@rdfweb.org
#
# todo: transcribe Q+A, Edd's foafbot talk, link to slides etc.
#
#######################################################################
Ok so i'm talking about this FOAF project -- friend of a friend project --
It's been going behind the scenes since around 2000. I'm here with
2 hats on really ... here as a W3C representative, where i spend my
dayjob working on formal web standards... but i'm talking about a side
project i've been working on with friends: "friend of a friend".
[audience: try talking louder! etc.]
friend of a friend project
i just want a show of hands thing at the start
how many of you have heard of xml?
[lots]
...of rdf?
[respectable]
of foaf?
interesting. how many of you have got foaf files?
...so, i'm surprised, that's a lot of files.
So, what I really want to do is too many things at once here.
I want to talk a bit about FOAF as a technology... and most of its
technical characteristics are inherited from its use of RDF, and its use
of XML. I want to talk a bit about its social impact, and probably not
enough about either to really get into the interesting details.
So... what is this foaf project? what can we do with FOAF files? and
why do so many people seem to find it interesting? how do the
technical and social aspects interrelate...? and where are we up to?
The basic idea is pretty simple. It's machine readable homepages.
Initially homepages for people, but also companies, organisations,
anything that uses the Web. It's an exploration of the idea of a
Semantic Web, a Web of machine-readable pages. It's an experiment in
'just doing it' really. A few years ago... the first FOAF file was my
homepage... the second FOAF file was my friend Libby's home page, and
its just gone from there. Last week I heard that LiveJournal were
about to switch on FOAF export, another 2 million FOAF pages
about to be on the Web. So... it's interesting times...
The two core concepts I think with FOAF are this notion of FOAF files or
FOAF profiles, RDF files in the Web. And the notion of FOAF as a
vocabulary, as a kind of dictionary of terms you can use to say things
in FOAF files. So FOAF gives you markup for saying things like -- if
you're talking about People -- their mailboxes, their homepages... For
saying of a person what their workplaceHomepage is, which is a kind of
nerdy way of saying where they work. A way of saying where they work
that is quite easy for computers to deal with, so you can run queries
against it to say, 'find me the weblogs of people who work for the place
whose homepage is w3.org'.
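
A minimal sketch of that query in Python, assuming the harvested data is held as plain (subject, property, value) tuples rather than in a real RDF store. The people, the example.org URLs and the `weblogs_for_workplace` helper are made up for illustration; only the property names (`foaf:workplaceHomepage`, `foaf:weblog`) come from the FOAF vocabulary.

```python
# Hypothetical harvested statements as (subject, property, value) triples.
TRIPLES = [
    ("_:dan", "foaf:name", "Dan"),
    ("_:dan", "foaf:workplaceHomepage", "http://www.w3.org/"),
    ("_:dan", "foaf:weblog", "http://example.org/dan/weblog/"),
    ("_:libby", "foaf:name", "Libby"),
    ("_:libby", "foaf:workplaceHomepage", "http://example.org/ilrt/"),
    ("_:libby", "foaf:weblog", "http://example.org/libby/weblog/"),
]


def weblogs_for_workplace(triples, workplace_homepage):
    """Return the weblogs of everyone whose workplaceHomepage matches."""
    # Step 1: find the people who work at the place with this homepage.
    workers = {s for s, p, o in triples
               if p == "foaf:workplaceHomepage" and o == workplace_homepage}
    # Step 2: collect the weblogs stated for those people.
    return sorted(o for s, p, o in triples
                  if p == "foaf:weblog" and s in workers)


print(weblogs_for_workplace(TRIPLES, "http://www.w3.org/"))
```

The workplace is identified by its homepage URL, not by a name string, which is what makes the match easy for a machine.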
The really interesting thing with FOAF is the really interesting thing
with the Web: connectivity. That my FOAF file has a reference to
Libby's FOAF file, which has a reference to Edd's,
and so on and so on. So you can feed these things to a harvester, just
in the same way that traditional harvesters traverse from page to page,
but harvesters are indexing machine-readable assertions about the world,
rather than just a list of words from human language.
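
The harvesting idea can be sketched roughly as a breadth-first crawl. This is an illustration only: the `WEB` dictionary stands in for real HTTP fetching and RDF parsing, and the URLs, claims and the `harvest` function are all hypothetical.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (claims in that file, links to other files).
WEB = {
    "http://example.org/dan.rdf": (
        [("Dan", "foaf:knows", "Libby")],
        ["http://example.org/libby.rdf"],
    ),
    "http://example.org/libby.rdf": (
        [("Libby", "foaf:knows", "Edd")],
        ["http://example.org/edd.rdf", "http://example.org/dan.rdf"],
    ),
    "http://example.org/edd.rdf": (
        [("Edd", "foaf:name", "Edd")],
        [],
    ),
}


def harvest(start_url, fetch=WEB.get):
    """Breadth-first crawl: collect claims, visiting each linked file once."""
    seen = {start_url}
    queue = deque([start_url])
    claims = []
    while queue:
        url = queue.popleft()
        doc = fetch(url)
        if doc is None:  # missing or unreachable file: skip it
            continue
        doc_claims, links = doc
        # Keep the source URL with each claim, so we know who said what.
        claims.extend((url, claim) for claim in doc_claims)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return claims


for source, claim in harvest("http://example.org/dan.rdf"):
    print(source, claim)
```

Recording the source of each claim matters because, as with any crawl of the open Web, nothing guarantees the claims are true.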
So just to give a quick example... It's a bunch of angle brackets. It's
some XML. I took this from a very nice article on XML.com that
Leigh Dodds wrote last week... it says, this person, Peter Parker, [...]
has a foaf:gender property of 'male', a foaf:title of 'Mr',
it has given name and family name. It gives a hash of Peter
Parker's mailbox, which is a kind of sneaky way of identifying this person
in the absence of a planet-wide identifier scheme, so, this person, with
these details... they have a homepage, weblog, and they know another
person... this markup at the bottom here. Peter Parker knows Harry
Osborn; Harry Osborn has such'n'so homepage, and with respect to Harry,
see-also this RDF file over here. And it's this little bit in bold
that's really made this an interesting project to me. Once you've got
that, that capacity for linking amongst machine-readable files, you've
got the basis for scooping it all up and pushing it into a database.
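
The slide itself is not reproduced in this transcript. The following is a rough sketch of markup in that spirit, not the example from the article: the email address and the see-also URL are invented, and the `friends_with_links` helper is hypothetical. It also shows how a mailbox hash can be derived, on the understanding that `foaf:mbox_sha1sum` is the SHA-1 of the `mailto:` URI. The parsing uses Python's standard `xml.etree.ElementTree` and only handles this one nesting pattern; RDF/XML allows many equivalent spellings, so a real harvester would want a proper RDF parser.

```python
import hashlib
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"

EMAIL = "peter.parker@example.org"  # hypothetical address


def mbox_sha1sum(email):
    """SHA-1 hex digest of the mailto: URI for an email address."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


# Illustrative FOAF-style RDF/XML (an assumption, not the slide's content).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:foaf="{FOAF}">
  <foaf:Person>
    <foaf:name>Peter Parker</foaf:name>
    <foaf:mbox_sha1sum>{mbox_sha1sum(EMAIL)}</foaf:mbox_sha1sum>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Harry Osborn</foaf:name>
        <rdfs:seeAlso rdf:resource="http://example.org/harry/foaf.rdf"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def friends_with_links(xml_text):
    """Return (person, friend, see-also URL or None) for each foaf:knows."""
    root = ET.fromstring(xml_text)
    results = []
    for person in root.findall(f"{{{FOAF}}}Person"):
        name = person.findtext(f"{{{FOAF}}}name")
        for friend in person.findall(f"{{{FOAF}}}knows/{{{FOAF}}}Person"):
            see_also = friend.find(f"{{{RDFS}}}seeAlso")
            link = None
            if see_also is not None:
                link = see_also.get(f"{{{RDF}}}resource")
            results.append((name, friend.findtext(f"{{{FOAF}}}name"), link))
    return results


print(friends_with_links(DOC))
```

The see-also link is the part a harvester would follow to find the next file.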
And what do you get when you do that? This [foafnaut slide] this is one
of several visualisations and user interfaces that people have built
with FOAF. Edd's going to talk about a textmode interface later. Just to
give you a sense of the kind of data we've got. People, connected to
people, by a variety of named relationships. People described in a
variety of machine-readable files, and then arbitrary other
characteristics of those people. And this is the really fun thing in
terms of the structure of FOAF... we didn't in the FOAF specification
nail down once and for all, what it is you can say about people. So,
each of the people described can say as much or as little about
themselves, about their colleagues and friends, as they choose to.
So what we're really trying to do here, was to take a thought experiment
and roll it out and see what happens. What would it be like if machines
could read our homepages? And, not wanting to wait for Artificial
Intelligence or Natural Language Processing and so on, we took the
approach of "dumbing down" self-descriptions to a level that is well
suited to machine processing. So the next obvious question there is:
well, what do we lose when we dumb down the subtleties of interpersonal
relationships, the subtleties of self-description, into a
machine-friendly form. This debate has come up again in the "social
networking" discussions, where people present themselves through
Friendster, through Orkut, for example.
And from a more geeky perspective, what might search engines evolve
into, as we move from indexing the words in a page to indexing claims
about the world. So we're trying to sneak up on the big hard problems,
through a very simple technology. It is at heart just a file you put on
a Web site, or a file that someone exposes from a service on your
behalf.
I've got some kinda wordy examples here, of the kind of things you
can say in FOAF. I can say... simplistic atomic statements about myself
and other people. I can say, "I'm Dan, and I work at W3C, I know Libby
who works at ILRT, and her FOAF profile is blah-blah-blah over here.
And here are the titles and descriptions of some documents we wrote
together". Or I can say, "here's a photo I use on my homepage.
Here's a bunch of other photos. And here are the people in those
photos."
I can use any FOAF extension at all in these descriptions. And "FOAF
extension" in this sense is any other RDF vocabulary that people have
created for use within the RDF framework. So there are... maybe some of
you here have heard of MusicBrainz, or Dublin Core, or Creative
Commons. Each of those RDF applications gives you terms that you can
plug into these descriptions. So if I want to talk about rights over
these documents, use Creative Commons. If I want to talk about music,
eg. to say that "I really like Massive Attack", I use MusicBrainz. We
try not to do too much in FOAF, and to come up with an architecture
where we can plug in other people's work.
So... my dayjob is acronyms. This is the acronym view. FOAF uses XML as
a data format. It uses RDF as a data model, a set of conventions over
the top of XML. It also uses this thing called OWL, the Web Ontology
Language. I'm very pleased to say that both OWL and RDF became W3C
recommendations yesterday, which is [clapping!] ... it's so nice to be
able to say that, it's been years...
So, this is a practical application of RDF and a practical application
of OWL. The current FOAF spec defines 50 or so terms for talking about
the world, for making very simple claims. And we use OWL [...]
for the following reasons. RDF, in a sense, guarantees the freedom of
independent extension. Because we're describing people, because people are
such interesting, complex, political beasts, there is no way that a
single spec, written by a bunch of primarily technically oriented
people, is really going to ever do a complete job of capturing the
things you'd want to say of people. So what we try to do instead, was
find a way of using RDF, so that other people's descriptive concerns
could be plugged in there. So RDF guarantees that. It guarantees that a
FOAF file can have arbitrary other ways of talking about people in the
file.
OWL provides us also with something pretty important. Algorithms for
data merging. If you think about the problem of independent parties
scattered around the Web, trying to describe each other. There is no
planet-wide identification system for people. There is no planet-wide
identification system for companies, ... Despite that, what we need to
do is to be able to fold data together from multiple sources, and figure
out when they're talking about the same things. Without getting into the
detail, that's really what OWL gives us a lot of off-the-shelf tools
for.
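
One way to picture the merging step, as a hedged sketch and not a description of any particular OWL tool: treat some properties as uniquely identifying (OWL calls these inverse functional properties), and fold together any descriptions that share a value for one of them. The choice of `foaf:mbox_sha1sum` and `foaf:homepage` below, the sample data and the `merge_descriptions` function are assumptions made for illustration.

```python
# Properties assumed here to identify exactly one person.
IFPS = {"foaf:mbox_sha1sum", "foaf:homepage"}


def merge_descriptions(descriptions):
    """Merge descriptions (dicts of property -> set of values) sharing an IFP value."""
    parent = list(range(len(descriptions)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # (property, value) -> index of first description using it
    for i, desc in enumerate(descriptions):
        for prop in IFPS.intersection(desc):
            for value in desc[prop]:
                j = first_seen.setdefault((prop, value), i)
                parent[find(i)] = find(j)  # same identifying value: same person

    merged = {}
    for i, desc in enumerate(descriptions):
        target = merged.setdefault(find(i), {})
        for prop, values in desc.items():
            target.setdefault(prop, set()).update(values)
    return list(merged.values())


SOURCES = [
    {"foaf:name": {"Dan"}, "foaf:mbox_sha1sum": {"abc123"}},
    {"foaf:mbox_sha1sum": {"abc123"}, "foaf:homepage": {"http://example.org/dan/"}},
    {"foaf:name": {"Libby"}, "foaf:mbox_sha1sum": {"def456"}},
]
for person in merge_descriptions(SOURCES):
    print(person)
```

Note that this trusts every source equally; anyone can publish a file claiming someone else's mailbox hash, so a real aggregator would also need to track where each statement came from.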
A quick recap. It's a new kind of Web page; it's a Web page for
machines. We try to echo the freedom and flexibility you have in your
own homepage, in machine-readable form... where you might intuitively
think that because it's machine-readable it's going to be kind of
stilted, static, rigid, and ugly. It's a very low-tech approach, it's
like RSS in the sense
that, in its simplest form, it's just a page you put on a Web site. And
we've got quite a long way with just putting pages on Web sites. But we
run into issues that allow us to explore the harder things. So we've
been PGP-signing these pages, for example, or encrypting them. It's a
Web of files describing people, and webs of people.
I don't have time to get into the nitty-gritty of these issues, but the
kind of things we've run into include the need to plan for lying, to
plan for people being mischievous... If you remember the fuss at
Friendster about "Fakesters". I don't know how many of you followed
this, but on the Friendster site, people were creating playful cartoon
characters, and they were being deleted because they weren't true
descriptions of the world. In the FOAF universe, we just don't have that
control. If you create a FOAF file, out there on the Web, and it
describes Peter Parker, or Mr Benn, I can't delete that file, 'cos it's
on your Web server. So we need an architecture that allows us to survive
with lies, survive with half-truths. Anyone can say anything about
anyone, using any RDF vocabulary. So there are etiquette issues there,
there are privacy and politics issues there. We primarily use FOAF for
self-description, but some of the lightning talks later will address its
use in activism, where for example we might be talking about
politicians.
FOAF increasingly gets lumped with this Social Networking thing. When we
first started the FOAF project, a few years back, the big social
networking site was Six Degrees. Last year the big social networking
site was Friendster. Last week it was Orkut. Who knows what it'll be in
6 months time. The driving ethic behind FOAF and a lot of this Semantic
Web work is this sense that people want their data back, that they want
control of their data, they want to be able to migrate it between
hosting sites, to be able to host it themselves...
A good friend of mine was copying and pasting her profile from
Friendster into Orkut last week. There's gotta be a better way,
there really has. These sites they also partially describe the world,
and present that as a full description of the world... so, "he's on
Friendster, she's on Orkut...", they don't show up in each other's
friends lists. To my mind, that's just wrong. There's one world, and we
kind of stumble towards describing it. We sorely need import and export
between these sites. And the way FOAF was designed, it gives you the
basics for doing that. What we don't have in FOAF is a representation
yet of all the nitty-gritty ratings, profiles, and "fan"
characteristics that these sites use. We don't have a "dating"
vocabulary. I'm kind of scared of adding one, but I'm sure it'll happen
eventually.
Where are we up to today? I should have come here with stats. There is
a growing number of self-hosted foaf-a-matic generated files.
Foaf-a-matic is a simple Javascript interface for describing yourself
and linking to your friends. If you see these funny little FOAF faces on
a Web site that link to a FOAF file... there are a surprising number of
them. People have gone to the foaf-a-matic site, created a self
description, uploaded that to their site. On top of that there are an
increasing number of services generating FOAF. So Ecademy, a business
networking site in the UK. TypePad and Cocolog, the hosted version of
Movable Type, they both produce and consume FOAF descriptions. Some
interesting characteristics of what they've done there, which maybe we
can talk about later. LiveJournal, I heard last week, are about to
switch on 2 million FOAF files. And we'll hear later about Tribe.
So... getting the data... it's _almost_ easy, a few lines of Perl code
and suddenly there's 2 million more FOAF descriptions in the world.
Consuming the data is an interesting problem, and again this
is something that distinguishes the approach we've taken from the
approach of the monolithic aggregator sites. So you see FOAF user
interfaces that use HTML, that use SVG for zooming around loads of
people... Edd's going to talk about text-mode interfaces to it. It's a
very interesting data set. There's a lot of it out there, a lot of it is
public, and it's there for anyone to play with. It's under nobody's
control. You don't have to wait for the hosting site to hire some user
interface designers. You can do a good job (or a bad job) yourself.
The other thing is... FOAF terminology... the little bits of language
that we defined in the FOAF spec, are being used in other RDF data
formats. They describe people adequately enough, and the RDF design...
tries to have a division of labour such that we don't duplicate things.
If you're creating an RDF format, say for calendar exchange, and you
want to talk about people, you can just plug in FOAF, you don't need to
duplicate it.
There's a couple of styles of using FOAF.
There's a couple of styles of social networking sites. And we tried to
architect FOAF to be neutral between them, although I think there are
some cultural biases in the FOAF crowd towards one of them.
So, you can be very explicit in a FOAF file. I could say: "Edd's my
friend", or I could say "Edd's my _best_ friend". Or I could say "Edd's my
arch-nemesis". I could plug in any set of interpersonal relationships
that someone else out there decides to make available.
That's a very... articulated, social networking site style of talking
about sociality.
There's also, and my biases lean this way, a more kind of implicit,
evidence-based approach. So we talk about: Liam and I work for the same
organisation. Or... Libby and I went to the same school. The two of us
co-authored a document, and so on. So you describe facts about the world
which have associated with them implicit information about your
relationships to someone.
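
As a small sketch of the evidence-based style, the snippet below groups people by the facts they share and reports the overlap for each pair. The facts themselves are invented to mirror the examples just given, and `implicit_links` is a hypothetical helper; the property names (`foaf:workplaceHomepage`, `foaf:schoolHomepage`, `foaf:made`) are FOAF terms.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical facts: nobody here states "X is my friend".
FACTS = [
    ("Dan", "foaf:workplaceHomepage", "http://www.w3.org/"),
    ("Liam", "foaf:workplaceHomepage", "http://www.w3.org/"),
    ("Dan", "foaf:schoolHomepage", "http://example.org/school/"),
    ("Libby", "foaf:schoolHomepage", "http://example.org/school/"),
    ("Dan", "foaf:made", "http://example.org/docs/paper"),
    ("Libby", "foaf:made", "http://example.org/docs/paper"),
]


def implicit_links(facts):
    """Map each pair of people to the (property, value) facts they share."""
    by_fact = defaultdict(set)
    for person, prop, value in facts:
        by_fact[(prop, value)].add(person)

    links = defaultdict(list)
    for fact, people in sorted(by_fact.items()):
        for pair in combinations(sorted(people), 2):
            links[pair].append(fact)
    return dict(links)


for pair, shared in implicit_links(FACTS).items():
    print(pair, shared)
```

The relationship is never asserted directly; it falls out of the shared facts, and a reader (or a program) can decide how much weight to give each kind of evidence.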
There's quite a lot of work in the FOAF crowd on image metadata, I think
because of this bias towards trying to humanise the
machine-representation of interpersonal relationships. So we talk about
'co-depiction', y'know, Edd and I are in the same photo. A lot of this
stuff, it's processable by machines, you have to look at the picture to
get the point.
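
Co-depiction can be computed from very little data. A minimal sketch, assuming each photo simply lists the people it depicts (in FOAF this would be expressed with `foaf:depicts`); the photo URLs and the `codepicted_with` helper are hypothetical.

```python
# Hypothetical image metadata: photo URL -> people depicted in it.
DEPICTS = {
    "http://example.org/photos/etech1.jpg": {"Dan", "Edd"},
    "http://example.org/photos/etech2.jpg": {"Edd", "Libby"},
    "http://example.org/photos/solo.jpg": {"Dan"},
}


def codepicted_with(person, depicts):
    """Return {other person: sorted list of photos showing both of them}."""
    result = {}
    for photo, people in depicts.items():
        if person in people:
            for other in people - {person}:
                result.setdefault(other, []).append(photo)
    return {other: sorted(photos) for other, photos in result.items()}


print(codepicted_with("Edd", DEPICTS))
```

The machine can list the photos two people share, but what the photo actually shows about them is left for a person to judge.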
Applications... we're going to hear about in more detail in the
lightning sessions, but there are a couple of favourites I have.
The connection of this stuff to locality... so... having the ability to
scan the room to see who's there, with bluetooth or there's another
application called FoafFinger that uses Rendezvous to get the profiles,
to get the weblogs, to get the most recent weblog articles written by
people who are with you in the room.
If you've ever sat at the station, or been wandering around town
wondering who all these people are, but not really wanting to ask them,
there's quite an interesting application there.
The activist side of things is my pet hobbyhorse, I think there's a
whole separate talk there. One of our cuter apps is a dataset taken from
theyrule.net, which describes webs of connections between boards of
directors in the USA. I also wanted to mention DeanLink recently, the
database of activists on the Howard Dean campaign. There's an RDF view
of that dataset, so you can scoop that up and see connections amongst
people and collaborators there.
People ask what's FOAF's _for_... I can give use cases. The standard
cheesy business traveller use case is: I'm a vegetarian, I always forget
to tell the airport, tell the airlines that I'm vegetarian. You should
be able to check in at the airport and have their machine read your
homepage and go "Hey! You appear to be a vegetarian, but we don't have a
special meal listed for you - is there something up here?" That's not
rocket science.
Basically, what is FOAF for? What is data for? Any FOAF file is just
some RDF, it just contains statements about the world, whatever you
might want to use that data for. That's why it's up there. So the
vegetarian thing, is an arbitrarily picked use case. Whenever you want
data, that's what FOAF's for. So anything you can say in FOAF, you can
ask of a crawl of harvested data. Whether you believe that data is a
whole other topic...
Just to recap. FOAF is designed really to be a freeform platform.
If you think back to the homepage, your homepage, it's a blank slate,
page... you can write what you like there. No-one tells you what you can
write, no-one tells you which words you can use. We're trying to
reproduce that for machines. It's out of control by design. We don't
want to say what words you can use. We don't want to say where it has to
be hosted. We don't want to own your data.
I think the technical machinery, after a few years' work in the RDF
community, is reasonably there. There are RDF crawlers, there are RDF
data stores, there are query systems. There is no single way to deploy
FOAF. We're really feeling our way around the options here. Whether it's
hosted. Whether it's exported from something like LiveJournal.
We're on the borders of going mainstream. A few lines of code separate
us from there being millions of these files. That seems kinda scary.
Although I think we've got the technological aspects sorted, what we
don't really have is a sense of the legal, privacy, etiquette issues.
Within the closed world of say Orkut, you get this awkwardness of
someone saying they're a fan of you, someone giving you 4 stars for
sexiness... it's kinda unsettling for a lot of people. It's been
unsettling for people to see the marital styles and sexual biases of
their business colleagues. But it's been within the scope of a
particular site. Now what happens when we take that and we scale it to
the Web? We make it possible for you to say that about anyone, you don't
have to log in on Orkut.... All the stuff we didn't have time to talk
about...
[end]
Questions?
[transcribe later]