
It is thesis time, and most of the computer engineering department here at Polimi is focused on Semantic Web applications. To help decide which area to choose for my thesis, I have been reading tutorials about this new extension of the Web for about a week.
What is the Semantic Web?
W3C defines the Semantic Web as “the Web with a meaning”.
“If HTML and the Web made all the online documents look like one huge book, RDF, schema, and inference languages will make all the data in the world look like one huge database.” The word “semantic” means “relating to meaning”: the semantics of something is what it means.
The Semantic Web is a web that describes things in a way that computers can understand. It is not a fast-growing technology, and one reason is the learning curve: RDF was developed by people with an academic background in logic and artificial intelligence, so traditional developers do not find it easy to understand.
Semantic Search
We have seen a number of real Semantic Web applications so far. Semantic search engines like Hakia, Powerset and hodo are good examples. These engines index RDF data stored on the Web and provide an interface to search through the crawled data. Rather than using ranking algorithms such as Google’s PageRank to predict relevancy, semantic search uses semantics, the science of meaning in language, to produce highly relevant search results. In most cases, the goal is to deliver the information a user asked for rather than have the user sort through a list of loosely related keyword results.
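To make the difference from keyword search more concrete, here is a minimal sketch of what "searching through crawled data" could look like when the data is a set of subject, predicate, object statements. It is not how any of the engines above actually work; the `match` helper, the `index` list and all the `ex:` names are made up for illustration.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Hypothetical crawled statements (all names are invented for this example).
index = [
    ("ex:Milan", "ex:locatedIn", "ex:Italy"),
    ("ex:Polimi", "ex:locatedIn", "ex:Milan"),
    ("ex:Polimi", "rdf:type", "ex:University"),
    ("ex:Jaguar", "rdf:type", "ex:Animal"),
]

# "Which things are universities?" is answered by structure, not by keywords.
universities = [s for (s, _, _) in match(index, predicate="rdf:type", obj="ex:University")]
print(universities)  # ['ex:Polimi']
```

The query returns the thing that is stated to be a university, rather than every page that happens to contain the word.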
Markup
Currently, the World Wide Web is based mainly on documents written in HTML, a markup convention that is used for coding a body of text interspersed with multimedia objects such as images and interactive forms. Metadata tags, for example, provide a method by which computers can categorise the content of web pages.
The Semantic Web takes the concept further: it involves publishing data in a language designed specifically for data, the Resource Description Framework (RDF), so that it can be categorized the way people perceive it and be “understood” by computers. The data is not only stored, but also filed and handled in a structured way.
Resource Description Framework
The Resource Description Framework (RDF) is a W3C standard for describing Web resources, for example the title, author, modification date, content, and copyright information of a Web page. More information about building Semantic Web apps can be found on W3Schools.
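As a rough illustration of such a description, the sketch below writes a few statements about a Web page as plain Python tuples and prints them in the N-Triples text format. The page URL and the values are invented, `to_ntriples` is a hypothetical helper, and I am assuming the Dublin Core element vocabulary for the property names. It only handles plain string values and does not try to cover the whole N-Triples grammar.

```python
# Dublin Core element namespace (a vocabulary commonly used for page metadata).
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical page and values, invented for illustration only.
PAGE = "http://example.org/semantic-web-intro"

triples = [
    (PAGE, DC + "title", "What is the Semantic Web?"),
    (PAGE, DC + "creator", "Jane Doe"),
    (PAGE, DC + "date", "2009-01-15"),
    (PAGE, DC + "rights", "Copyright Jane Doe"),
]

def to_ntriples(triples):
    """Render (subject URI, predicate URI, string literal) triples as N-Triples lines."""
    lines = []
    for subject, predicate, literal in triples:
        # Escape backslashes and double quotes inside the literal.
        escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{predicate}> "{escaped}" .')
    return "\n".join(lines)

print(to_ntriples(triples))
```

Each printed line is one statement: the page, a property of the page, and the value of that property.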
There are also a number of APIs for building semantic apps, such as the Jena RDF API for Java. So far, everything seems fine.
Does Semantic Web search really help yet?
Teaching the web a meaning, so that it understands that “cat” is not just a word made of the letters c, a and t but “an animal with 4 legs”, is a nice idea. However, effective semantic search needs websites that actually use semantic technology. That means all the content on a site has to be tagged hierarchically, and the site has to implement an RDF schema.
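Here is a small sketch of why the hierarchical tagging matters. Assuming a site states that a cat is a kind of mammal and a mammal is a kind of animal, a program can conclude that a particular cat is an animal without anyone writing that down. The facts and the `infer_types` helper are invented for this example; the rule it applies is the RDF Schema subclass rule (if x has type C and C is a subclass of D, then x has type D).

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical statements a tagged site might publish.
facts = {
    ("ex:Cat", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:Tom", TYPE, "ex:Cat"),
}

def infer_types(facts):
    """Repeat the subclass rule until no new type statements appear."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(inferred):
            if p1 != TYPE:
                continue
            for (c2, p2, d) in list(inferred):
                if p2 == SUBCLASS and c2 == c and (x, TYPE, d) not in inferred:
                    inferred.add((x, TYPE, d))
                    changed = True
    return inferred

closure = infer_types(facts)
print(("ex:Tom", TYPE, "ex:Animal") in closure)  # True
```

Without the two subclass statements, the program would only know that Tom is a cat, which is exactly the gap that untagged websites leave.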
I will keep writing about the Semantic side of the web as I dive further. Meanwhile, here are a bunch of nice “hello Semantic Web” kind of articles you may want to check out: