CS835 - Data and Document Representation & Processing

Lecture 9 - Data: Semantic Web, Ontologies, RDF

 

References:

  1. The Semantic Web: An Introduction http://infomesh.net/2001/swintro/
  2. http://www.w3.org/2001/12/semweb-fin/w3csw

 

The Semantic Web?

·       The Semantic Web is worldwide information linked in such a way as to be easily understandable by machines

·       Idea created by Tim Berners-Lee, inventor of the WWW, URIs, HTTP, and HTML.

 

·       Problem: Most data on the Web is in a form that is difficult to use on a large scale

·       There is no global system for publishing data in a way that can be easily processed by anyone.

·       Solution - Semantic Web

 

What is the Problem?

Take this webpage

 

 

What we see:

Siggraph 2005

Sketches

 

Everything you can imagine is real.

PABLO PICASSO

The Sketches program is one of the most dynamic programs of the annual SIGGRAPH conference, providing a forum for ideas, techniques, and uses of computer graphics and interactive techniques.

Every year, an eclectic mix of researchers, artists, animators, and programmers share their experiences and show their ideas. Sketches cover a broad spectrum of topics in art, design, science, and engineering, and include provocative speculation, academic research, industrial development, practical tools, and behind-the-scenes explanations of commercial and artistic works.

Sketches are presentations, about 20 minutes long, followed by about five minutes for questions. Sketch sessions are organized by topic in groups of four or five sketches and chaired by a member of the review committee.

We welcome submissions from artists, animators, developers, and researchers. Whether you are developing new techniques or using existing ones in novel ways, we want to hear from you. We encourage submissions from academia and industry, as well as independent work. International contributions are especially welcome.

Be a part of this exciting exchange of ideas and techniques. Contribute your sketch proposal to SIGGRAPH 2005!

Juan Buhler
PDI/DreamWorks
SIGGRAPH 2005 Sketches Chair

 

What the Computer Sees:

SKETCHES INFORMATION

 

New for SIGGRAPH 2005

 

Implementation Sketches

 

Frequently Asked Questions

 

Submission Guidelines

 

Submission Procedure Checklist

 

Review and Upon Acceptance

 

GENERAL INFORMATION

 

New for SIGGRAPH 2005

 

Deadlines

 

How to Submit Your Work

 

Online Submission

 

Uploading Files

 

Presenter Information

 

Award Nominations

 

Conference Volunteer Application



 

Need to Add “Semantics”

E.g., Dublin Core

Problems with this approach

Ontologies provide a vocabulary of terms

New terms can be formed by combining existing ones

Meaning (semantics) of such terms is formally specified

Can also specify relationships between terms in multiple ontologies

 

A Semantic Web — First Steps

 

Ontologies

 

·       Ontology in Computer Science

·       An ontology is an engineering artifact:

It is constituted by a specific vocabulary used to describe a certain reality,

o      plus a set of explicit assumptions regarding the intended meaning of the vocabulary.

 

Thus, an ontology describes a formal specification of a certain domain:

Shared understanding of a domain of interest

Formal and machine manipulable model of a domain of interest

 

Structure of an Ontology

o      Ontologies typically have two distinct components:

Names for important concepts in the domain

Elephant is a concept whose members are a kind of animal

Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants

Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years

 

Background knowledge/constraints on the domain

Adult_Elephants weigh at least 2,000 kg

All Elephants are either African_Elephants or Indian_Elephants

No individual can be both a Herbivore and a Carnivore
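
The two components above (concept definitions and background constraints) can be sketched in Python. This is only an illustration: the Individual class, the sample animals, and the closed-world style of checking are assumptions, not part of any ontology language.

```python
from dataclasses import dataclass

@dataclass
class Individual:
    name: str
    concepts: set            # asserted concept names, e.g. {"Elephant"}
    age_years: int = 0
    weight_kg: float = 0.0

def is_adult_elephant(x):
    # Definition: exactly those elephants whose age is greater than 20 years.
    return "Elephant" in x.concepts and x.age_years > 20

def constraint_violations(x):
    """Check one individual against the three background constraints."""
    problems = []
    if is_adult_elephant(x) and x.weight_kg < 2000:
        problems.append("Adult_Elephant weighs less than 2,000 kg")
    if "Elephant" in x.concepts and not ({"African_Elephant", "Indian_Elephant"} & x.concepts):
        problems.append("Elephant is neither African_Elephant nor Indian_Elephant")
    if {"Herbivore", "Carnivore"} <= x.concepts:
        problems.append("individual is both Herbivore and Carnivore")
    return problems

# Hypothetical sample data.
dumbo = Individual("Dumbo", {"Elephant", "African_Elephant", "Herbivore"},
                   age_years=25, weight_kg=3500)
odd = Individual("Odd", {"Elephant", "Herbivore", "Carnivore"},
                 age_years=30, weight_kg=1500)

print(constraint_violations(dumbo))
print(constraint_violations(odd))
```

A real ontology reasoner works from formal semantics rather than hand-written checks; the point here is only the split between definitions and constraints.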

 

Semantic Web Main Principles

Principle 1: Everything can be identified by URIs

Principle 2: Resources and links can have types

Principle 3: Partial information is tolerated

Principle 4: There is no need for absolute truth

Principle 5: Evolution is supported

Principle 6: Minimalist design

Semantic Web Layers

 

URI - Uniform Resource Identifier

·       The Semantic Web is built on syntaxes that use URIs to represent data; these are called "Resource Description Framework" (RDF) syntaxes.

 

RDF - Resource Description Framework

·        The Resource Description Framework (RDF) is a language for representing information about resources in the World Wide Web.

·        Intended for representing metadata about Web resources, such as the title, author, and modification date of a Web page, copyright and licensing information about a Web document, or the availability schedule for some shared resource.

·        By generalizing the concept of a "Web resource", RDF can be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web.

·        Each RDF statement can be written using three URIs (Uniform Resource Identifiers)

·        In RDF, information is a collection of statements, each with a subject, verb and object - and nothing else.

·        Once information is in RDF form, it can be processed

 

The RDF Data Model

o      Statements are <subject, predicate, object> triples:

         

<Frank,hasColleague,Richard>

Can be represented as a graph:

 

 

o      Statements describe properties of resources

A resource is any object that can be pointed to by a URI:

a document, a picture, a paragraph on the Web;

http://www.cs.man.ac.uk/index.html

a book in the library, a real person (?)

isbn://5031-4444-3333

Properties themselves are also resources (URIs)

 

Linking Statements

o      The subject of one statement can be the object of another

Such collections of statements form a directed, labeled graph

 

 

Note that the object of a triple can also be a “literal” (a string)
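
A minimal Python sketch of the data model above: triples stored as tuples, indexed as a directed, labeled graph, with a literal as one object. The hasName statement and the Literal wrapper are hypothetical additions for illustration.

```python
from collections import defaultdict, namedtuple

# Marks objects that are plain strings rather than resources.
Literal = namedtuple("Literal", "value")

triples = [
    ("Frank", "hasColleague", "Richard"),
    ("Richard", "hasName", Literal("Richard")),   # hypothetical statement
]

# Index the statements as a graph: subject -> [(predicate, object), ...]
graph = defaultdict(list)
for s, p, o in triples:
    graph[s].append((p, o))

def objects(subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return [o for p, o in graph[subject] if p == predicate]

def follow(start, *predicates):
    """Walk a chain of predicates, using each object as the next subject."""
    frontier = [start]
    for predicate in predicates:
        frontier = [o for node in frontier for o in objects(node, predicate)]
    return frontier

print(follow("Frank", "hasColleague", "hasName"))
```

The follow call works because the object of the first statement (Richard) is the subject of the second, which is exactly how statements link into a graph.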

 

RDF Syntax

o      RDF has an XML syntax that has a specific meaning:

Every Description element describes a resource

Every attribute or nested element inside a Description  is a property of that Resource

We can refer to resources by using URIs

 

           

<Description about="some.uri/person/ian_horrocks">

           

   <hasColleague resource="some.uri/person/uli_sattler"/>

           

</Description>

           

<Description about="some.uri/person/uli_sattler">

           

   <hasHomePage>http://www.cs.man.ac.uk/~sattler</hasHomePage>

           

</Description>

           

<Description about="some.uri/person/carole_goble">

           

   <hasColleague resource="some.uri/person/uli_sattler"/>

           

</Description>
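
The reading rules above (each Description is a subject, each nested element a property) can be sketched with Python's standard XML parser. This assumes the simplified, namespace-free form used in these notes wrapped in a hypothetical root element; real RDF/XML uses rdf:Description, rdf:about and rdf:resource inside an rdf:RDF element.

```python
import xml.etree.ElementTree as ET

doc = """
<root>
  <Description about="some.uri/person/ian_horrocks">
    <hasColleague resource="some.uri/person/uli_sattler"/>
  </Description>
  <Description about="some.uri/person/uli_sattler">
    <hasHomePage>http://www.cs.man.ac.uk/~sattler</hasHomePage>
  </Description>
  <Description about="some.uri/person/carole_goble">
    <hasColleague resource="some.uri/person/uli_sattler"/>
  </Description>
</root>
"""

def to_triples(xml_text):
    """Turn each nested property element into a (subject, predicate, object) triple."""
    result = []
    for description in ET.fromstring(xml_text).iter("Description"):
        subject = description.get("about")
        for prop in description:
            resource = prop.get("resource")
            # A resource attribute gives a URI object; otherwise the text is a literal.
            obj = resource if resource is not None else (prop.text or "").strip()
            result.append((subject, prop.tag, obj))
    return result

triples = to_triples(doc)
for triple in triples:
    print(triple)
```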

 

 

Notation3: RDF Made Easy

 

<http://xyz.org/#a> <http://xyz.org/#b> <http://xyz.org/#c> .

 

 

<http://xyz.org/#Sean> <http://xyz.org/#name> "Sean" .

 

The above reads as subject, verb and object – Sean has the name “Sean”

 

 

_:a1 <http://xyz.org/#name> "Sean" .

This may be read as "there is something that has the name Sean", or "a1 has the name Sean, for some value of a1".

 

·        These are called anonymous nodes, because they don't have a URI.

 

·        Given: <http://xyz.org/#a> <http://xyz.org/#b> <http://xyz.org/#c> .

o      Declaring a prefix for the shared namespace lets us abbreviate the URIs. This gives:

@prefix xyz: <http://xyz.org/#> .

xyz:a xyz:b xyz:c .

 

@prefix blargh: <http://xyz.org/#> .

blargh:a blargh:b blargh:c .

 

@prefix blargh: <http://xyz.org/#> .

@prefix xyz: <http://xyz.org/#> .

blargh:a xyz:b blargh:c .
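
The prefix examples above show that the prefix name is arbitrary; only the namespace URI it stands for matters. A small Python sketch of the expansion (the expand helper is hypothetical and handles only prefixed names, not literals or blank nodes):

```python
def expand(qname, prefixes):
    """Expand a prefixed name such as 'xyz:a' into a full URI in angle brackets."""
    prefix, _, local = qname.partition(":")
    return "<" + prefixes[prefix] + local + ">"

# Both prefixes are bound to the same namespace, as in the example above.
prefixes = {
    "blargh": "http://xyz.org/#",
    "xyz": "http://xyz.org/#",
}

statement = "blargh:a xyz:b blargh:c"
expanded = " ".join(expand(term, prefixes) for term in statement.split())
print(expanded + " .")
```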

 

@prefix : <#> .

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

@prefix daml: <http://www.daml.org/2001/03/daml+oil#> .

@prefix log: <http://www.w3.org/2000/10/swap/log#> .

@prefix dc: <http://purl.org/dc/elements/1.1/> .

@prefix foaf: <http://xmlns.com/foaf/0.1/> .

 

 

CWM: An XML RDF And Notation3 Inference Engine

python cwm.py a.n3 -rdf > a.rdf

 

RDF Schema

·       RDF gives a formalism for metadata annotation, and a way to write it down in XML, but it does not give any special meaning to vocabulary such as subClassOf or type

 

RDF Schema allows you to define vocabulary terms and the relations between those terms

it gives “extra meaning” to particular RDF predicates and resources

this “extra meaning”, or semantics, specifies how a term should be interpreted

 

·        RDF Schema (also: RDF Schema Candidate Recommendation) was designed to be a simple datatyping model for RDF.

·        Using RDF Schema, we can say:

"Fido" is a type of "Dog"

"Dog" is a sub class of animal.

·        Can create properties and classes, and create ranges and domains for properties.

 

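·        All terms for RDF Schema start with "http://www.w3.org/2000/01/rdf-schema#" (prefix rdfs:); rdf:Property comes from the RDF namespace (prefix rdf:). The three most important concepts are:

1.     "Resource" (rdfs:Resource)

2.     "Class" (rdfs:Class)

3.     "Property" (rdf:Property)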

rdfs:Resource rdf:type rdfs:Class .

rdfs:Class rdf:type rdfs:Class .

rdf:Property rdf:type rdfs:Class .

rdf:type rdf:type rdf:Property .

o      To create a new class called "Dog":

:Dog rdf:type rdfs:Class .

o      Now we can say that "Fido is a type of Dog":

:Fido rdf:type :Dog .

o      Can create properties by saying a term is a type of rdf:Property, and then use those properties in the RDF:

:name rdf:type rdf:Property .

:Fido :name "Fido" .

·       Why say that Fido's name is "Fido"?

·       The term ":Fido" is just a URI; any URI could have been chosen for Fido, including ":Squiggle" or ":n508s0srh"

·       The URI ":Fido" is simply easier to remember.

·       Machines cannot guess the name from the URI, so we must tell them explicitly that his name is Fido

 

·       More properties: rdfs:subClassOf and rdfs:subPropertyOf.

o      Can say that one class or property is a sub class or sub property of another.

o      e.g., "Dog" is a sub class of the class "Animal":

:Dog rdfs:subClassOf :Animal .

o      Also say that there are other sub classes of Animal:-

:Human rdfs:subClassOf :Animal .

:Duck rdfs:subClassOf :Animal .

o      Create new instances of those classes:-

:Bob rdf:type :Human .

:Quacky rdf:type :Duck .

Can invent another property, use that, and build up more information...

:owns rdf:type rdf:Property .

:Bob :owns :Fido .

:Bob :owns :Quacky .

:Bob :name "Bob Fleming" .

:Quacky :name "Quacky" .
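
What rdfs:subClassOf buys us is inference: every instance of a sub class is also an instance of its super class. A minimal Python sketch of that rule over the statements above, assuming a simple fixed-point loop (the function names are hypothetical, and only two RDFS rules are covered):

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

triples = {
    (":Dog", SUBCLASS, ":Animal"),
    (":Human", SUBCLASS, ":Animal"),
    (":Duck", SUBCLASS, ":Animal"),
    (":Fido", RDF_TYPE, ":Dog"),
    (":Bob", RDF_TYPE, ":Human"),
    (":Quacky", RDF_TYPE, ":Duck"),
    (":Bob", ":owns", ":Fido"),
    (":Bob", ":owns", ":Quacky"),
}

def rdfs_closure(triples):
    """Repeatedly apply the two subClassOf rules until nothing new is inferred."""
    inferred = set(triples)
    changed = True
    while changed:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))    # subClassOf is transitive
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))    # instances move up to the super class
        changed = not new <= inferred
        inferred |= new
    return inferred

def instances_of(cls, triples):
    return sorted(s for s, p, o in rdfs_closure(triples) if p == RDF_TYPE and o == cls)

print(instances_of(":Animal", triples))
```

None of the input statements says that Fido, Bob or Quacky is an Animal; the closure derives it.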

 

·       RDF Schema provides ranges and domains

o      The domain of a property specifies the class that its subjects belong to; the range specifies the class that its objects belong to.

o      e.g., to constrain ":bookTitle" to apply to a book and take a literal value:

:Book rdf:type rdfs:Class .

:bookTitle rdf:type rdf:Property .

:bookTitle rdfs:domain :Book .

:bookTitle rdfs:range rdfs:Literal .

:MyBook rdf:type :Book .

:MyBook :bookTitle "My Book" .
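
In RDF Schema, domain and range statements are typically used to infer types rather than to reject data. A rough Python sketch, assuming at most one domain and one range per property (the infer_types helper is hypothetical):

```python
def infer_types(triples):
    """Derive rdf:type statements from rdfs:domain and rdfs:range declarations."""
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            inferred.add((s, "rdf:type", domains[p]))   # the subject is in the domain
        if p in ranges:
            inferred.add((o, "rdf:type", ranges[p]))    # the object is in the range
    return inferred

# Note: :MyBook is deliberately NOT declared to be a :Book here.
triples = [
    (":bookTitle", "rdfs:domain", ":Book"),
    (":bookTitle", "rdfs:range", "rdfs:Literal"),
    (":MyBook", ":bookTitle", '"My Book"'),
]

for statement in sorted(infer_types(triples)):
    print(statement)
```

Using :bookTitle on :MyBook is enough for a processor to conclude that :MyBook is a :Book.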

 

·       RDF Schema contains a set of properties for annotating schemata, providing comments, labels, and the like.

·       Two properties for doing this are rdfs:label and rdfs:comment

·       e.g.:

:bookTitle rdfs:label "bookTitle";

   rdfs:comment "the title of a book" .

 

Problems with RDFS

·       RDFS too weak to describe resources in sufficient detail

No localized range and domain constraints

Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants

No existence/cardinality constraints

Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents

No transitive, inverse or symmetrical properties

Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical

 

Difficult to provide reasoning support

No “native” reasoners for non-standard semantics

 

Web Ontology Language Requirements

Desirable features identified for Web Ontology Language:

 

Extends existing Web standards

Such as XML, RDF, RDFS

Easy to understand and use

Should be based on familiar KR idioms

Formally specified

Of “adequate” expressive power

Possible to provide automated reasoning support

 

From RDF to OWL

·       Two languages developed to satisfy above requirements

OIL: developed by group of (largely) European researchers (several from EU OntoKnowledge project)

DAML-ONT: developed by group of (largely) US researchers (in DARPA DAML program)

Efforts merged to produce DAML+OIL

Development was carried out by “Joint EU/US Committee on Agent Markup Languages”

Extends (“DL subset” of) RDF

DAML+OIL submitted to W3C as basis for standardisation

Web-Ontology (WebOnt) Working Group formed

WebOnt group developed OWL language based on DAML+OIL

OWL language now a W3C Candidate Recommendation

Will soon become Proposed Recommendation

 

 

DAML

·       DAML, the DARPA Agent Markup Language, is a language created by DARPA

·       It aims to provide a language and toolset that enables the Web to transform from a platform that focuses on presenting information to a platform that focuses on understanding and reasoning with information.

·       DAML gives RDF Schema more in-depth properties and classes.

·       DAML provides simple terms for creating inferences.

 

DAML+OIL

·       DAML+OIL is a language for describing ontologies, building on RDF Schema and XML Schema.

·       It can be used to describe types of objects and the kinds of relationships expected between them.

·       It uses references to XML Schema datatypes to describe integers, dates and other datatypes.

·       DAML provides a method of saying things such as inverses, unambiguous properties, unique properties, lists, restrictions, cardinalities, pairwise disjoint lists, datatypes, and so on.

 

·       DAML construct - daml:inverseOf 

·       Can say that one property is the inverse of another.

·       The rdfs:range and rdfs:domain values of daml:inverseOf are rdf:Property.

·       example of daml:inverseOf:

:hasName daml:inverseOf :isNameOf .

:Sean :hasName "Sean" .

"Sean" :isNameOf :Sean .

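The daml:inverseOf example can be read as a rule: for every statement using a property, emit the same statement with subject and object swapped under the inverse property. A minimal Python sketch (apply_inverse is a hypothetical helper, not part of any DAML toolkit):

```python
def apply_inverse(triples):
    """Add the swapped statement for every property that has a declared inverse."""
    inverse = {}
    for s, p, o in triples:
        if p == "daml:inverseOf":
            inverse[s] = o
            inverse[o] = s      # inverseOf works in both directions
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result

triples = {
    (":hasName", "daml:inverseOf", ":isNameOf"),
    (":Sean", ":hasName", '"Sean"'),
}

for statement in sorted(apply_inverse(triples)):
    print(statement)
```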
 

·       DAML construct - daml:UnambiguousProperty class.

·       Saying that a Property is a daml:UnambiguousProperty means that if the object of the property is the same, then the subjects are equivalent.

·       example:

foaf:mbox rdf:type daml:UnambiguousProperty .

:x foaf:mbox <mailto:…> .

:y foaf:mbox <mailto:…> .

where both statements have the same mailbox as their object, implies that:-

:x daml:equivalentTo :y .
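
A short Python sketch of the daml:UnambiguousProperty rule: subjects that share the same object for such a property are reported as equivalent. The mailbox address is a made-up placeholder, and the equivalents helper is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def equivalents(triples):
    """Pair up subjects that share an object under an unambiguous property."""
    unambiguous = {s for s, p, o in triples
                   if p == "rdf:type" and o == "daml:UnambiguousProperty"}
    by_value = defaultdict(set)
    for s, p, o in triples:
        if p in unambiguous:
            by_value[(p, o)].add(s)
    result = set()
    for subjects in by_value.values():
        for a, b in combinations(sorted(subjects), 2):
            result.add((a, "daml:equivalentTo", b))
    return result

triples = [
    ("foaf:mbox", "rdf:type", "daml:UnambiguousProperty"),
    (":x", "foaf:mbox", "mailto:someone@example.org"),   # placeholder address
    (":y", "foaf:mbox", "mailto:someone@example.org"),
]

print(equivalents(triples))
```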

 

Inference

·       Inference is one of the driving principles of the Semantic Web

·       Example:

:MyCar de:macht "160KW" .

·       A German Semantic Web processor may understand "de:macht"

·       An English processor may not

·       Here is a piece of inference data that makes things clearer to the processor:

de:macht daml:equivalentTo en:power .

·       The DAML "equivalentTo" property is used to say that "macht" in the German system is equivalent to "power" in the English system.

·       Using an inference engine, a Semantic Web client could successfully determine that:

:MyCar en:power "160KW" .
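
The inference step above can be sketched in a few lines of Python. This is a single-pass illustration that only rewrites predicates; it is not how CWM or any real inference engine is implemented, and apply_equivalence is a hypothetical name.

```python
from collections import defaultdict

def apply_equivalence(triples):
    """Restate each triple under every predicate declared equivalent to its own."""
    equivalent = defaultdict(set)
    for s, p, o in triples:
        if p == "daml:equivalentTo":
            equivalent[s].add(o)
            equivalent[o].add(s)    # equivalence holds in both directions
    result = set(triples)
    for s, p, o in triples:
        for other in equivalent.get(p, ()):
            result.add((s, other, o))
    return result

triples = {
    (":MyCar", "de:macht", '"160KW"'),
    ("de:macht", "daml:equivalentTo", "en:power"),
}

for statement in sorted(apply_equivalence(triples)):
    print(statement)
```

The same idea underlies merging databases: one equivalence statement lets statements from either vocabulary be queried through the other.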

 

·       Merging databases becomes a matter of recording in RDF somewhere that "Person Name" in your database is equivalent to "Name" in my database, and then throwing all of the information together and getting a processor to think about it.

·       CWM can do this

·       Great levels of inference can only be provided using "First Order Predicate Logic" languages, and DAML is not a FOPL language entirely.

 

OWL Language

·       Three species of OWL

OWL Full is union of OWL syntax and RDF

OWL DL restricted to FOL fragment (≈ DAML+OIL)

OWL Lite is “easier to implement” subset of OWL DL

Semantic layering

OWL DL ≈ OWL Full within DL fragment

DL semantics officially definitive

OWL DL based on SHIQ Description Logic

In fact it is equivalent to SHOIN(Dn) DL

OWL DL Benefits from many years of DL research

Well defined semantics

Formal properties well understood (complexity, decidability)

Known reasoning algorithms

Implemented systems (highly optimised)

 

OWL Class Constructors

 

XMLS datatypes as well as classes in ∀P.C and ∃P.C

E.g., ∃hasAge.nonNegativeInteger

Arbitrarily complex nesting of constructors

E.g., Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor)
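
The nested example can be evaluated over a small set of individuals to see what the constructors mean. This Python sketch reads it as Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor), the nesting used in the RDF/XML version of the same class. All individuals and facts are invented, and the evaluation is closed-world, unlike real OWL reasoning.

```python
# Hypothetical individuals and facts, invented for illustration.
domain = {"alice", "bob", "carol", "dave", "erin", "rex"}
person = {"alice", "bob", "carol", "dave", "erin"}     # rex is not a Person
doctor = {"bob"}
has_child = {("alice", "bob"), ("carol", "dave"), ("carol", "erin"), ("dave", "bob")}

def fillers(relation, x):
    """Everything that x is related to through the relation."""
    return {o for s, o in relation if s == x}

def some_values(relation, cls):
    # Existential restriction: at least one filler belongs to cls.
    return {x for x in domain if fillers(relation, x) & cls}

def all_values(relation, cls):
    # Universal restriction: every filler belongs to cls (vacuously true with no fillers).
    return {x for x in domain if fillers(relation, x) <= cls}

doctor_or_doctor_parent = doctor | some_values(has_child, doctor)    # union
result = person & all_values(has_child, doctor_or_doctor_parent)     # intersection

print(sorted(result))
```

carol is excluded because her child erin is neither a Doctor nor the parent of one; bob and erin qualify vacuously because they have no children.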

RDFS Syntax

e.g., Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor):

<owl:Class>

  <owl:intersectionOf rdf:parseType="Collection">

    <owl:Class rdf:about="#Person"/>

    <owl:Restriction>

      <owl:onProperty rdf:resource="#hasChild"/>

      <owl:allValuesFrom>

        <owl:Class>

          <owl:unionOf rdf:parseType="Collection">

            <owl:Class rdf:about="#Doctor"/>

            <owl:Restriction>

              <owl:onProperty rdf:resource="#hasChild"/>

              <owl:someValuesFrom rdf:resource="#Doctor"/>

            </owl:Restriction>

          </owl:unionOf>

        </owl:Class>

      </owl:allValuesFrom>

    </owl:Restriction>

  </owl:intersectionOf>

</owl:Class>

 

OWL Axioms

Axioms (mostly) reducible to inclusion (⊑)

C ≡ D  iff  both C ⊑ D and D ⊑ C

 

Ontology Editors

Swoop http://www.mindswap.org/2004/SWOOP/

OilEd http://oiled.man.ac.uk/

Protégé http://protege.stanford.edu/

OntoEdit http://www.ontoprise.de/products/ontoedit_en

 

Developing an Ontology using Protégé