How does the emergence of the semantic web change the way we think about information architecture?

Posted on September 18, 2010
Filed Under Semantic Web, information architecture, journalism, semantic information architecture

Transcript

How does the emergence of the semantic web and its associated technologies change the way we approach user experience design and, more specifically, information architecture?

In Tim Berners-Lee’s original proposal for the web, he gave us the basic ingredients to build the web of documents as we experience it today. This gave us an easy means to publish documents, refer to them with URLs and point from one document to another with a hyperlink.

In many ways the web became a victim of its own success. The simplicity with which we could publish documents meant we were soon overwhelmed. At this point information architects were employed to group documents together into manageable piles.

The process would often take the form of a content audit: grouping an organisation’s documents into similar types, giving these groups labels and arranging them into small hierarchies. If you were lucky you might also do some user testing, for example a card sort to see whether a user of your site would expect to find a document in the place you had grouped it.

The problem with this approach is that if we start out focusing on documents, our sites turn out document-centric. For example, we end up with navigation that includes things like pictures, news and features, or opinion and archive.

If we step back and think about it, the user coming to our site does not have a mental image of a document but rather of the player or team they are interested in. People are interested in things, not documents.

This leads us to move away from a document-orientated approach to web development towards a thing-focused one, and with this move comes the need for new tools and approaches to information architecture.

One approach at the BBC has been to use Domain Driven Design (DDD). Before you have written a line of code or drawn a wireframe, DDD encourages you to collectively understand the things, and the relationships between them, in the problem space you are working in. This model becomes the ubiquitous language used by all members of the project. At this point we can also test our model against the mental model of the user, ensuring that the user’s mental model is built into the very core of the site.

Let us look at an example. When the BBC wanted to open up its archive of wildlife clips, instead of beginning by publishing pages for the clips, it first published pages for the things of interest and the links between these things. That meant publishing a page for every species and linking each one to its habitats and behaviours.

Each of these pages then links back to the species that it relates to, so you soon start to build up a dense network of links between these things. The emphasis in this approach is to shift focus from the content to the model. The assets are associated with the things in the model, but the model provides the context.
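The idea of things, links between things and assets attached to things can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `ThingGraph` class, the identifiers such as `species/lion` and the clip names are all hypothetical and are not the BBC's actual implementation.

```python
from collections import defaultdict


class ThingGraph:
    """Hypothetical in-memory model: things, links between them, attached clips."""

    def __init__(self):
        self._links = defaultdict(set)   # thing -> related things
        self._clips = defaultdict(list)  # thing -> clips about that thing

    def link(self, a, b):
        # Record the link in both directions so each page can link back.
        self._links[a].add(b)
        self._links[b].add(a)

    def attach_clip(self, thing, clip):
        self._clips[thing].append(clip)

    def related(self, thing):
        return sorted(self._links[thing])

    def clips_for(self, thing):
        # Clips attached to the thing itself, then clips attached to linked things.
        found = list(self._clips[thing])
        for other in self.related(thing):
            found.extend(self._clips[other])
        return found


# Illustrative data only.
graph = ThingGraph()
graph.link("species/lion", "habitats/grassland")
graph.link("species/lion", "behaviours/hunting")
graph.link("species/zebra", "habitats/grassland")
graph.attach_clip("species/lion", "clip-001")
graph.attach_clip("species/zebra", "clip-002")

print(graph.related("habitats/grassland"))    # species linked to this habitat
print(graph.clips_for("habitats/grassland"))  # clips reached through the model
```

No clip is attached to the habitat directly in this sketch; the habitat page gets its clips from the links in the model, which is the sense in which the model provides the context.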

Anyone who has been involved in building even a modest taxonomy for a site will understand the maintenance overhead that this introduces. Given the rich relationships that an ontology-like approach introduces, it would not be feasible for the BBC to build and manage this product alone.

Instead, the model was populated by sourcing data from the web and stitching it together with common web identifiers, in this case from DBpedia. Different sources of data can therefore provide the concepts, and the links between concepts, at no extra cost to the BBC. Where a concept is missing, for example in Wikipedia, the team of editorial experts at the BBC will edit or create the concept in Wikipedia. This means that not only are we reusing what is already available on the web, but where it is wrong or missing we correct it at source so that others benefit.
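As a rough sketch of what stitching data together with common identifiers means, the Python below merges records from two sources keyed by the same DBpedia-style URI. The `stitch` function, the two source dictionaries and their values are hypothetical placeholders of mine, not real BBC or DBpedia data.

```python
def stitch(*sources):
    """Merge records from several sources that share the same identifier."""
    merged = {}
    for source in sources:
        for identifier, record in source.items():
            # Records with the same identifier describe the same thing.
            merged.setdefault(identifier, {}).update(record)
    return merged


# A shared web identifier acts as the join key (placeholder data below).
LION = "http://dbpedia.org/resource/Lion"

encyclopedia = {LION: {"label": "Lion", "habitats": ["grassland"]}}
clip_archive = {LION: {"clips": ["clip-001"]}}

pages = stitch(encyclopedia, clip_archive)
print(pages[LION])
```

Because both sources use the same identifier for the concept, neither needs to know about the other for the data to be combined.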

One of the outcomes of focusing on publishing URLs for things and creating a dense network of links between them is that it has had great benefits in terms of Google ranking. In the case of Wildlife Finder, some species pages are being placed above their equivalent page in Wikipedia on UK Google searches.

This approach has not been restricted to wildlife but has been used across the BBC, including for the World Cup.

The BBC football site as it exists today consists of a limited number of editorially managed indexes. This means that editorial resource dictates the types of aggregation the site has. So we have no index for the England team or Brazil, but rather a general and slightly meaningless index called internationals. In addition, the BBC purchases sports stats from an external provider, but at the moment these are not brought together to tell a coherent story.

So, in order to solve these challenges, the starting point was to think about the things of importance to the World Cup as opposed to the documents.

The approach was to focus on the model and then associate content with the things in the model. As the model is device-agnostic, the views that provide the user experience on top of it can be tailored to be the best we have to offer for a given device.

The starting point of the modelling was to recognise the importance of the event to sport. If we can handle events, we can represent the majority of sports. Building upon the existing events ontology, we then set about specialising it so that we could represent the complex structure of a sports competition.

For instance, the World Cup is a multistage event made up of a group stage and a knockout stage, each of which contains rounds, and those rounds contain matches.
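The nesting described above can be sketched with a few Python dataclasses. The class names, the `all_matches` helper and the placeholder teams are my own assumptions for illustration; they are not the classes of the ontology the BBC actually built.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Match:
    home: str
    away: str


@dataclass
class Round:
    name: str
    matches: List[Match] = field(default_factory=list)


@dataclass
class Stage:
    name: str
    rounds: List[Round] = field(default_factory=list)


@dataclass
class Competition:
    name: str
    stages: List[Stage] = field(default_factory=list)

    def all_matches(self):
        # Walk the hierarchy: stages -> rounds -> matches.
        return [m for s in self.stages for r in s.rounds for m in r.matches]


# Placeholder teams and fixtures, not real results.
world_cup = Competition("World Cup", [
    Stage("Group stage", [
        Round("Group 1, matchday 1", [Match("Team A", "Team B"),
                                      Match("Team C", "Team D")]),
    ]),
    Stage("Knockout stage", [
        Round("Final", [Match("Team A", "Team C")]),
    ]),
])

print(len(world_cup.all_matches()))
```

Because every level is just an event containing smaller events, the same shape should stretch to other competitions with different stages and rounds.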

Once we had developed a model, we decided upon the views that we would want to show the user on a variety of devices. For example, the HTML web views would include, amongst other things, teams, players and groups.

Once we knew the views we wanted to create, we could be sure that if journalists annotated content with a small number of tag types, the model could handle the rest. So we asked them to tag with player, team, competition and venue. By keeping the tagging simple we ensured it would be of high quality.

Here we have a team page for Italy. You can see how the approach starts to bring stories and data together to tell stories in a more coherent way. Though aggregating assets with tags is not particularly novel, when we look at the group pages we can see how this approach is different.

You might remember that we did not ask journalists to tag with group, but we are still able to construct this view for users because the model knows which teams played in which group and which players played for which team. No additional editorial intervention was needed to generate these additional views.
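A minimal sketch of how a group view can be derived without a group tag is shown below. The lookup tables, tag identifiers and story names are hypothetical placeholders, and the real system presumably queried a much richer model than two dictionaries.

```python
# Facts held by the model (placeholder data).
TEAM_GROUP = {"team/a": "group/1", "team/b": "group/1", "team/c": "group/2"}
PLAYER_TEAM = {"player/x": "team/a", "player/y": "team/c"}

# Journalists tag stories with players and teams only, never with groups.
STORIES = {
    "story-1": {"team/a"},
    "story-2": {"player/x"},
    "story-3": {"player/y"},
    "story-4": {"team/b", "team/c"},
}


def groups_for_tag(tag):
    # A player tag resolves to the player's team; a team tag is used as is.
    team = PLAYER_TEAM.get(tag, tag)
    group = TEAM_GROUP.get(team)
    return {group} if group else set()


def stories_for_group(group):
    # A story belongs on the group page if any of its tags leads to the group.
    return sorted(
        story for story, tags in STORIES.items()
        if any(group in groups_for_tag(tag) for tag in tags)
    )


print(stories_for_group("group/1"))  # ['story-1', 'story-2', 'story-4']
print(stories_for_group("group/2"))  # ['story-3', 'story-4']
```

The second story reaches the group page only through the player-to-team and team-to-group relationships, which is the part that plain tag aggregation could not do.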

Focusing on the model allowed us to easily integrate a variety of data sources and pull them together to provide a coherent user experience. In addition, by tagging content with concepts from the model we increase the benefit we get for the cost of tagging content. A tag that has a web-scale identifier enables the content to be contextualised in previously impossible ways.

In summary, we have looked at a number of ways that semantic-web-like thinking changes the way we work. Firstly, we start developing sites with processes that encourage us to focus on things and the relationships between them, as opposed to documents. Secondly, it introduces a culture of building with open vocabularies to add context and links that would never be possible otherwise. Thirdly, it enables us to maximise the value we get out of tagging our content, as in the World Cup group page example.

All this thinking is heavily influenced by the work of my colleagues at the BBC, notably Michael Smethurst, Tom Scott, Chris Sizemore and Michael Atherton. For further reading on this approach and the work of the BBC, there is no better starting point than Michael’s posts on the BBC internet blog and Tom Scott’s personal blog.

