Many sci-fi fans will recognise that this week’s title and the image above both come from Terminator 2, a film in which, in a dystopian future, an Artificial Intelligence defence system called Skynet has waged war against humanity. The film’s great fun, but it does touch on concerns which some people have about AI. After watching it last week I posted a jokey tweet about how this could be linked with a worry about the development of a Semantic Web.
Terminator 2 still stands up as one of the greatest sci-fi films of all time. #citylis
— Steve Mishkin (@SteveMishkin) November 30, 2014
I was a little surprised, therefore, when just a few days later an image of the Terminator was used to accompany an article about Stephen Hawking, in which he too (less jokily) raised concerns about the development of AI.
Obviously, Artificial Intelligence and the Semantic Web are two different things (AI is essentially the creation of a computer which can ‘think’, whereas the Semantic Web is more concerned with allowing a computer to ‘know’), but what they have in common is the desire to make computers more responsive.
The Semantic Web was the theme of this week’s lecture and lab work. As mentioned earlier, the idea behind the Semantic Web is to advance to a situation where data is not only machine-readable but machine-understandable. This has been the ambition of Tim Berners-Lee and his World Wide Web Consortium (W3C) for many years now, but the goal is still some way off. The solution lies in the way data is encoded on the web. For a Semantic Web to exist, a far greater depth of metadata needs to be added to data and documents. This would make Information Retrieval far more efficient: a search engine would no longer have to make good ‘guesses’, because everything would be unambiguously tagged. Consequently, information processing and discovery could become automated.
To try to facilitate this, the W3C has helped develop the Resource Description Framework (RDF), which aims to provide a grammar for how things are described on the web. Taxonomies can be created to identify things, and ontologies to create logical rules for the inferences which can be made about them. Together these form the Semantic Web stack, the building blocks of a Semantic Web: RDF adds metadata to web resources; an RDF Schema is used to create a taxonomy for it; and the Web Ontology Language (OWL) creates an ontology to add a greater sense of meaning. The W3C wants everyone who wishes to add this depth of metadata to their resources to use the same language. Only if a uniform way of doing things is agreed and applied can a Semantic Web stand a chance of coming about.
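To give a flavour of how these layers fit together, here is a minimal sketch in RDF/XML. The W3C namespace URIs are the real ones, but the `ex:` namespace and its terms are invented purely for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:ex="http://example.org/terms#">

  <!-- RDF Schema layer: a small taxonomy of classes -->
  <rdfs:Class rdf:about="http://example.org/terms#ArtistsBook">
    <rdfs:subClassOf rdf:resource="http://example.org/terms#Book"/>
  </rdfs:Class>

  <!-- OWL layer: a logical rule a machine can reason with -->
  <owl:ObjectProperty rdf:about="http://example.org/terms#wrote">
    <owl:inverseOf rdf:resource="http://example.org/terms#writtenBy"/>
  </owl:ObjectProperty>

  <!-- Plain RDF layer: a statement using those terms -->
  <rdf:Description rdf:about="http://example.org/works/SomeBook">
    <rdf:type rdf:resource="http://example.org/terms#ArtistsBook"/>
  </rdf:Description>
</rdf:RDF>
```

The point of the OWL rule is that a machine which knows A `wrote` B can infer, without being told, that B was `writtenBy` A.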
This is a massive simplification of what is a very complex topic. The reality is that only a small proportion of web developers are even considering it as an issue. In the lab we looked at how www.artistsbooksonline.org uses the Text Encoding Initiative (TEI) to mark up its text. The site is a web repository for art works which adds a great depth of metadata in order to improve the experience for the user.
Each work has related metadata provided under three hierarchical headings: Work > Edition(s) > Object(s).
The site describes the headings as follows:
The site uses RDF triples to try to bring a greater sense of meaning to the data it presents, as in the following example:
| Subject | Predicate | Object |
|---|---|---|
| Johanna Drucker | wrote | Dark, the bat elf banquets the pupae |
| Dark, the bat elf banquets the pupae | was published in | 1972 |
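The same two triples could be written out in RDF/XML along the following lines. This is a sketch only: the example.org URIs and property names are invented, and the site’s actual identifiers will differ:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">

  <!-- Johanna Drucker / wrote / Dark, the bat elf banquets the pupae -->
  <rdf:Description rdf:about="http://example.org/people/JohannaDrucker">
    <ex:wrote rdf:resource="http://example.org/works/DarkTheBatElf"/>
  </rdf:Description>

  <!-- Dark, the bat elf banquets the pupae / was published in / 1972 -->
  <rdf:Description rdf:about="http://example.org/works/DarkTheBatElf">
    <ex:publishedIn>1972</ex:publishedIn>
  </rdf:Description>
</rdf:RDF>
```

Because both the subject and the work are identified by URIs rather than by strings, a machine can join these statements together and answer a question like “what did Johanna Drucker write, and when was it published?” without any guessing.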
Below, you can see how the site displays the work alongside its metadata.
The TEI mark-up which the site uses requires a Document Type Definition (DTD), which defines the elements and attributes of each page of XML.
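A hypothetical fragment in the spirit of the site’s mark-up might look something like the following, reflecting the Work > Edition > Object hierarchy. The element names and the DTD filename here are illustrative only, not the site’s actual schema:

```xml
<!-- The DTD declares which elements and attributes are allowed -->
<!DOCTYPE work SYSTEM "artistsbook.dtd">
<work>
  <title>Dark, the bat elf banquets the pupae</title>
  <author>Johanna Drucker</author>
  <edition n="1">
    <date>1972</date>
    <object n="1">
      <description>A physical copy of this edition</description>
    </object>
  </edition>
</work>
```

Validating the XML against the DTD is what guarantees that every work on the site is described in the same uniform way.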
This XML metadata isn’t available to view for the individual works on the site; rather, we just view it as human-readable data. Nevertheless, the fact that this mark-up has been added means that the site is very navigable and has a high level of functionality, something it shares with the Old Bailey Online site I discussed in my previous post. These two sites give some indication of what it’s possible to achieve when one attempts to attach a high degree of metadata to a resource. It’s clear that an awful lot of work has gone into making this possible, and here lies the problem. For a Semantic Web to exist, this level of coding needs to be universal, but for a range of reasons, primarily time and lack of knowledge, we’re unlikely to see its widespread adoption in the near future.
However, we must remember that we are still witnessing the infancy of the World Wide Web, and no one knows for certain how things will develop. But one thing we do know: there is no fate but what we make.
Cue closing credits.
Not everyone’s cup of tea, but in keeping with this week’s theme, I present to you Metal Heads – Terminator, Goldie’s 1992 classic, acknowledged by many as the first ever drum and bass track.