Friday, April 13, 2007

Excuse me, where do I find the Semantic Web - did I miss a turn back there?


Until now I've had a won't-touch-that attitude towards the soft semantic markup, but that might be about to change.

Sorry, I'm a data-head, so almost by instinct I've been skeptical about all that talk about semantics. In the world of XML, people often find themselves in either the data or the document camp, based on what kind of business they're doing. I haven't thought of it before, but what about those who work with the semantic stuff and feel comfortable with terms like taxonomies, thesauri and ontologies? I think I'll put them in the doc camp by default.

I haven't worked with semantics or knowledge management, nor have I followed the standards. The last time I looked at RDF was a couple of years ago and it frightened me, but I've decided to look at it again. As for the Semantic Web, it didn't work out quite the way some originally presented it, and the speed of adoption has been relatively slow, at least in my un-knowledgeable wisdom. Things take time, and a parallel to the web service hype is not that far off: it takes time, and during that time it twists and turns while we get wiser and the big players fight it out. On the other hand, I'm sure that the Semantic Web is a reality for some people working on knowledge management and artificial intelligence, and probably other lines of work as well.

At the Semantic Web overview page at W3C, which does give both a quick and good overview, the introduction states:

The Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources, where on the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real world objects. That allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing.

Hey, hey man, I do work on integration and combination of data drawn from diverse sources, and I also tend to work with how the data relates to real world objects. Could this mean that it's all different than I thought? Maybe. In some ways I'm certainly wrong and in others I'm probably not that far off - time will tell.
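To make the "connected by being about the same thing" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the example.org URIs, the two "databases" and the describe helper are assumptions, not something taken from the W3C page. The point is only that when two sources use the same identifier for the same thing, combining them can be as simple as a set union of (subject, predicate, object) statements.

```python
# Hypothetical identifier for one real-world thing, shared by both sources.
PERSON = "http://example.org/people/42"

# Source 1: a made-up HR database that knows the person's name.
hr_database = {
    (PERSON, "http://example.org/terms/name", "Jane Doe"),
}

# Source 2: a made-up project database that knows what the person works on.
project_database = {
    (PERSON, "http://example.org/terms/worksOn", "http://example.org/projects/7"),
    ("http://example.org/projects/7", "http://example.org/terms/title", "Data integration"),
}

# Because both sources use the same URI for the person, merging is a set union.
merged = hr_database | project_database


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, PERSON):
    print(predicate, "->", obj)
```

Nothing here is specific to RDF tooling; it only shows why shared identifiers matter more than a shared document structure.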

As mentioned, the last time I looked into this was some time ago, but I did keep printed versions of two great guides from Stefano Mazzocchi's blog:

There is no way around RDF, and as Stefano has written in the presentation A no-nonsense introduction to "semantic web" technologies:

• Resource Description Framework
• W3C Recommendation since 1998 (as old as XML!)
• Misunderstood for years as a very complicated way of embedding metadata into XML documents

Why misunderstood?
• Original specification was extremely formal
• “what is this good for” was nowhere to be found (somewhat taken for granted by the people that designed it)
• The RDF/XML serialization obscured the value of the graph data model
• XML seemed to solve the same problems and was much easier to understand

The real problem is that RDF was conceived as a solution to a problem people didn’t have:
data interoperability at a world-wide scale

The SW page points to Dave Beckett's Resource Description Framework (RDF) Resource Guide (planetrdf), where there's a link to an attempt by TBL to explain XML versus RDF from both sides, which is part of his Design Issues - Architectural and philosophical points:

I'll start with the RDF Primer and twist the example so that it concerns me:

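<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcel="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dcel:creator rdf:resource="" />
  </rdf:Description>
</rdf:RDF>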

The definition of the Dublin Core element creator is in the Dublin Core Metadata Element Set, Version 1.1.
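To see the graph hiding behind the XML, here is a minimal sketch in Python that reads an RDF Primer-style document with the standard library and pulls out the single statement as a (subject, predicate, object) triple. The example.org URIs are placeholders I've filled in for illustration, and extract_triples is a made-up helper that only understands this one simple rdf:Description / rdf:resource pattern; it is not a real RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Placeholder URIs (assumptions for illustration, not taken from the post).
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcel="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dcel:creator rdf:resource="http://www.example.org/staffid/85740"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples.

    Handles only rdf:Description elements whose child properties
    point to another resource via rdf:resource.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; concatenating
            # the two parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, namespace + local, obj))
    return triples


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

The predicate comes out as the full Dublin Core URI for creator, which is the whole trick: the prefix chosen in the XML (dcel here) doesn't matter, only the URI it expands to.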

There is no XML schema for RDF (there is a strange version at the W3C website, written in an old version of XML Schema), and I haven't figured out if one is even possible or practical, since it would require many wildcards (xsd:any) to allow for all kinds of custom elements like creator. But RDF files can be validated (and visualized) with the W3C RDF Validation Service.

That's all for now, and it's all too clear to me how long and troublesome my journey into this will be.