Tag Archives: CTM

Serendipitous reuse and representations with basic ontological commitments

Steve Vinoski published a very interesting article, Serendipitous reuse, and provided additional comments on his blog. He explores the benefits of RESTful uniform interfaces based on the HTTP “verbs” GET, PUT, POST and DELETE for building extensible distributed systems. He also compares the RESTful approach with traditional SOA implementations based on strongly typed, operation-centric interfaces.

Serendipitous reuse is one of the main goals of Subject-centric computing. In addition to uniform interfaces, Subject-centric computing promotes the use of uniform representations with basic ontological commitments (as one of the possible representations).

One of the fundamental principles of the Resource-Oriented Architecture is support for multiple representations of the same resource. For example, if we have a RESTful service that collects information about people, a GET request can return any of several representations.

Example using JSON:

{
	"id":          "John_Smith",
	"type":        "Person",
	"first_name":  "John",
	"last_name":   "Smith",
	"born_in":     {
		"id": "Boston_MA_US",
		"name": "Boston"
	}
}

Example using one of the “domain specific” XML vocabularies:

<person id="John_Smith">
	<born_in ref="Boston_MA_US">Boston</born_in>
</person>

Example using one of the “domain independent” XML vocabularies:

<object obj_id="John_Smith">
        <property prop_id="first_name" prop_name="first name">John</property>
        <property prop_id="last_name" prop_name="last name">Smith</property>
        <property prop_id="born_in" prop_name="born in" val_ref="Boston_MA_US">Boston</property>
</object>

Example using HTML:

<div class="object">
	<div class="data-property-value">
		<div class="property">first name</div>
		<div class="value">John</div>
	</div>
	<div class="data-property-value">
		<div class="property">last name</div>
		<div class="value">Smith</div>
	</div>
	<div class="object-property-value">
		<div class="property">born in</div>
		<div class="value">
			<a href="/Boston_MA_US">Boston</a>
		</div>
	</div>
</div>

Example using text:

John Smith was born in Boston

These five formats are examples of data-centric representations without built-in ontological commitments. They do not define any relationship between the representation and things in the “real world”. Programs that communicate using JSON, for example, do not “know” what “first_name” means; it is just a string used as a key in a hash table.
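
A minimal sketch of this point in Ruby, using only the standard `json` library. The payload is a shortened copy of the JSON example above; the misspelled key `firstName` is an assumption added for illustration.

```ruby
require 'json'

# Shortened copy of the JSON representation shown earlier.
payload = '{"id": "John_Smith", "type": "Person", "first_name": "John", "last_name": "Smith"}'

record = JSON.parse(payload)

# The parser returns an ordinary Hash with String keys.
# Nothing in it says that "first_name" is a name or that "id" identifies a subject.
first   = record["first_name"]
missing = record["firstName"]   # a differently spelled key is simply absent

puts record.class
puts first
puts missing.inspect
```

Any meaning attached to the keys lives in the code and documents around the format, not in the format.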

Creators of RESTful services typically define additional constraints and a default interpretation for the corresponding data-centric representations. For example, we can agree to use the “id” string in a JSON-based representation as an object identifier, and we can publish a human-readable document that describes and clarifies this agreement. The key point, however, is that this agreement is not part of the JSON format.

Even for a representation based on a domain-specific XML vocabulary, the semantic interpretation lies outside the vocabulary: it is part of an informal schema description (comments or annotations).

Interestingly enough, the level of usefulness differs between representations. Plain text, for example, can be shown by a computer “as is”. It is also possible to index it and to implement simple full-text search.

HTML-based representations add some structure, the ability to use styles, and links between resources. Link analysis can help to improve the results of basic full-text search.

With representations based on Topic Maps, the situation is different. Topic Maps is a knowledge representation formalism, and it embeds a set of ontological commitments. Topic Maps-based representations, for example, commit to such categories as topics, subject identifiers, subject locators, names, occurrences (properties) and associations between topics. There is also a commitment to two association types: “instance-type” and “subtype-supertype”. Topic Maps also support contextual assertions (using scope).
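
These categories can be sketched as plain Ruby structures. This is a simplified illustration, not the Topic Maps Data Model itself: the struct names and fields are assumptions, and details such as variants, reification and datatypes are left out.

```ruby
# Hypothetical, simplified structures for the categories named above.
Name        = Struct.new(:value, :type, :scope)
Occurrence  = Struct.new(:value, :type, :scope)
Role        = Struct.new(:type, :player)
Association = Struct.new(:type, :roles, :scope)
Topic       = Struct.new(:subject_identifiers, :subject_locators,
                         :names, :occurrences, :types)

person = Topic.new(["http://psi.example.com/Person"], [],
                   [Name.new("Person", nil, [])], [], [])

# "instance-type" is modelled here by the :types field of a topic.
john = Topic.new(["http://psi.example.com/JohnSmith"], [],
                 [Name.new("John Smith", nil, []),
                  Name.new("John", :first_name, [])],
                 [], [person])

boston = Topic.new(["http://psi.example.com/Boston_MA_US"], [],
                   [Name.new("Boston", nil, [])], [], [])

# An association links topics through typed roles; scope is empty here.
born_in = Association.new(:born_in,
                          [Role.new(:person, john), Role.new(:location, boston)],
                          [])

puts john.names.map(&:value).inspect
puts born_in.roles.map { |role| role.player.names.first.value }.inspect
```

Unlike the hash produced by a JSON parser, every consumer of such a structure knows in advance which parts are identifiers, which are names, and which are relationships.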

In addition, Topic Maps promote the use of Published Subject Identifiers (PSIs) as a universal mechanism for identifying “things”.

Topic Maps-based representations are optimized for information merging. For example, computers can _automatically_ merge fragments produced by different RESTful services:

Fragment 1 (based on a draft of the Compact Syntax for Topic Maps, CTM):

p:John_Smith
   isa po:person; 
   - "John Smith"; 
   - "John" @ po:first_name; 
   - "Smith" @ po:last_name.

g:Boston_MA_US - "Boston"; isa geo:city. 

po:born_in(p:John_Smith : po:person, g:Boston_MA_US : geo:location)

Fragment 2:

g:Paris_FR - "Paris"; isa geo:city. 

po:likes(p:John_Smith : po:person, g:Paris_FR : o:object)

Result of automatic merging:

p:John_Smith
   isa po:person; 
   - "John Smith"; 
   - "John" @ po:first_name; 
   - "Smith" @ po:last_name.

g:Boston_MA_US - "Boston"; isa geo:city. 

g:Paris_FR - "Paris"; isa geo:city. 

po:born_in(p:John_Smith : po:person, g:Boston_MA_US : geo:location)

po:likes(p:John_Smith : po:person, g:Paris_FR : o:object)
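
The merge above can be sketched in a few lines of Ruby. This is a simplified illustration under stated assumptions: each fragment is a hash keyed by subject identifier, the prefixed names stand in for fully resolved identifiers, and `merge_fragments` is a hypothetical helper, not part of any Topic Maps engine.

```ruby
# Merge fragments by subject identifier: statements about the same subject
# are unioned, so repeated names, types or associations appear only once.
def merge_fragments(*fragments)
  merged = Hash.new { |hash, key| hash[key] = { names: [], types: [], associations: [] } }
  fragments.each do |fragment|
    fragment.each do |subject, data|
      target = merged[subject]
      [:names, :types, :associations].each do |key|
        target[key] |= data.fetch(key, [])
      end
    end
  end
  merged
end

fragment1 = {
  "p:John_Smith"   => { names: ["John Smith"], types: ["po:person"],
                        associations: [["po:born_in", "g:Boston_MA_US"]] },
  "g:Boston_MA_US" => { names: ["Boston"], types: ["geo:city"] }
}

fragment2 = {
  "p:John_Smith" => { associations: [["po:likes", "g:Paris_FR"]] },
  "g:Paris_FR"   => { names: ["Paris"], types: ["geo:city"] }
}

merged = merge_fragments(fragment1, fragment2)
merged.each { |subject, data| puts "#{subject}: #{data.inspect}" }
```

The two services never coordinated with each other; the merge works only because both used the same identifier for John Smith.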

Like any other representation formalism, Topic Maps are not ideal. But Topic Maps enthusiasts think that Topic Maps capture a “robust set” of ontological commitments that can drastically improve our ability to organize and manage information and to achieve real reuse of information with added value.

Authoring topic maps using Ruby-based DSL: CTM, the way I like it

Designing and using Domain Specific Languages (DSLs) is a popular programming style in the Ruby community. I am experimenting with a Ruby-based DSL for authoring topic maps. Surprisingly, the result is very close to my view of the “ideal” CTM (Compact Topic Maps syntax).

I would like to share a sample that demonstrates the main ideas of this approach. It is a piece of Ruby code that generates topic maps (behind the scenes).

The first topic map defines a simple ontology.

# some definitions to support DSL
# should be included

topic_map :ontology_tm do
  tm_base "http://www.example.com/topic_maps/people/"

  topic(:person) {
    sid   "http://psi.example.com/Person"
    name  "Person"
    isa :topic_type
  }

  topic(:first_name) {
    sid   "http://psi.example.com/first_name"
    name  "first name"
    isa :name
  }

  topic(:last_name) {
    sid   "http://psi.example.com/last_name"
    name  "last name"
    isa :name
  }

  topic(:web_page) {
    sid   "http://psi.example.com/web_page"
    name  "web page"
    isa :occurrence
    datatype :uri
  }

  topic(:age) {
    sid   "http://psi.example.com/age"
    name  "age"
    isa :occurrence
    datatype :integer
  }

  topic(:description) {
    sid   "http://psi.example.com/description"
    name  "description"
    isa :occurrence
    datatype :string
  }

  topic(:works_for) {
    sid   "http://psi.example.com/works_for"
    name  "works for"
    isa :property
    association :employment
    first_role :employee
    second_role :employer
    third_role :position_type
    third_role_prefix :as
  }

  topic(:likes) {
    sid   "http://psi.example.com/likes"
    name  "likes"
    isa [:property, :association]
    association :likes
    first_role :person
    second_role :object
  }
end
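
What happens “behind the scenes” is not shown in this post. A minimal sketch of how such a DSL could be built with `instance_eval` follows. The class names `TopicBuilder` and `TopicMapBuilder` are hypothetical, and only a few keywords (`sid`, `name`, `isa`, `datatype`) are supported.

```ruby
# Hypothetical builder classes; they are not part of any existing library.
class TopicBuilder
  KEYWORDS = [:sid, :name, :isa, :datatype].freeze

  attr_reader :id, :properties

  def initialize(id)
    @id = id
    @properties = {}
  end

  # Generate one method per keyword; each call records a property value.
  KEYWORDS.each do |keyword|
    define_method(keyword) { |value| @properties[keyword] = value }
  end
end

class TopicMapBuilder
  attr_reader :id, :base, :topics

  def initialize(id)
    @id = id
    @topics = {}
  end

  def tm_base(uri)
    @base = uri
  end

  # Evaluate the block in the context of a new TopicBuilder,
  # so that bare calls such as `sid "..."` reach the builder.
  def topic(id, &block)
    builder = TopicBuilder.new(id)
    builder.instance_eval(&block) if block
    @topics[id] = builder
  end
end

def topic_map(id, &block)
  map = TopicMapBuilder.new(id)
  map.instance_eval(&block)
  map
end

ontology = topic_map :ontology_tm do
  tm_base "http://www.example.com/topic_maps/people/"

  topic(:first_name) {
    sid  "http://psi.example.com/first_name"
    name "first name"
    isa  :name
  }
end

puts ontology.topics[:first_name].properties.inspect
```

A fuller implementation would also need to handle role definitions, scoped values and includes; the sketch only shows why the authoring syntax can stay free of receivers and boilerplate.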

The second topic map includes the ontology and asserts some facts.

topic_map :facts_tm do  
  tm_base "http://www.example.com/topic_maps/people/john_smith"

  tm_include :ontology_tm
  topic :john_smith do
      sid "http://psi.example.com/JohnSmith"
      name  "John Smith"
      name  "Johnny", :scope => :alt_name
      first_name "John" ; last_name  "Smith"
      web_page "http://a.example.com/JohnSmith.htm"
      works_for topic(:example_dot_com) {
                  sid "http://www.example.com"
                  name "example.com"; isa :company
                },
                :as => :program_manager,
                :scope => :date_2008_02_28
      likes [:italian_opera, :new_york]
      age 35
      description <

Subject-centric programming language or what was good about COBOL

I gave a short presentation (3 slides) about requirements for a new subject-centric programming language at TMRA 2007. I referred to COBOL as a language that had built-in, high-level support for defining and manipulating “business data”. Many modern programming languages “outsourced” data handling to relational databases and lost transparency and simplicity in the manipulation of data.

Object-oriented programming languages (starting with Simula) help to model various “things” in our computers. But object-oriented languages are not optimized for representing our knowledge about these “things”. For example, it is quite easy to define a class that represents people with properties such as ‘first_name’ and ‘last_name’ (Ruby notation):

	class Person
	   attr_accessor :first_name,:last_name
	end

It is easy to create a new instance and assign some values:
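
A minimal sketch of that step, assuming the `Person` class above. The variable name and the values are illustrative.

```ruby
class Person
  attr_accessor :first_name, :last_name
end

# Create an instance and assign values through the generated accessors.
person = Person.new
person.first_name = "John"
person.last_name  = "Smith"

puts "#{person.first_name} #{person.last_name}"
```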


But what if we need to represent a situation in which names can change over time? What if different sources have different information about the names of the same person? What if information about names is not available directly but can be inferred or calculated from existing data and various inference rules? How can we specify that some object properties can be viewed or modified only by specific user groups?

What if we need to persist information about an object in some data store and retrieve it later? What if we need to access and modify this information on a laptop when we do not have connectivity? How can we synchronize this information with a desktop that is connected to the network all the time?

These “basic”, everyday requirements easily break the simplicity of the traditional interpretation of the object-oriented paradigm and introduce complicated “frameworks” and APIs. Instead of using objects to model “things”, we need a lot of objects to deal with infrastructure and to model technical aspects of our knowledge about “things”.

Dynamic languages such as Lisp, Prolog, Python and Ruby allow us to “hide” technicalities using metaprogramming. These languages make it possible to build the required “technical” constructs behind the scenes and help programmers to work with domain-level objects.

For example, we can use “metaproperties” to define domain specific associations, names, occurrences and types:

	class KnowsPerson < ActiveTopic::Association
	    psi           'http://psi.ontopedia.net/knows_person'
	    historical    true
	    symmetrical   true

	    role :person, :player_type  => :person, 
	                  :as_property => :knows_person, 
	                  :card_min => 2,
	                  :card_max => 2
	end

	class FirstName < ActiveTopic::Name
	    psi          'http://psi.ontopedia.net/first_name'
	    historical   true
	    card_max     1
	    domain       :person
	end

	class LastName < ActiveTopic::Name
	    psi          'http://psi.ontopedia.net/last_name'
	    historical   true
	    card_max     1
	    domain       :person
	end

	class DateOfBirth < ActiveTopic::Occurrence
	    psi          'http://psi.ontopedia.net/date_of_birth'
	    domain       :person
	    card_max     1
	    data_type    :date
	end

	class Person <  ActiveTopic::Topic
	  psi          'http://psi.ontopedia.net/Person'
	  sub_type     :thing 
	  name         :first_name
	  name         :last_name
	  occurrence   :date_of_birth
	  association  :knows_person
	end

We can use the defined constructs to manipulate domain-specific objects:

	a=Person.create(:psi=>'JohnSmith',:at_date => '2007-10-01')

	b=Person.create(:psi=>'JoeSmith',:at_date => '2007-10-02')
	a=Person.find(:psi=>'JohnSmith',:at_date => '2007-10-03')
	b=Person.find(:psi=>'JoeSmith',:at_date => '2007-10-03')
	b.knows_person << a
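
`ActiveTopic` itself is not shown in this post. As a rough sketch of the metaprogramming involved, the following shows how a class-level macro could generate time-aware accessors for a “historical” property. The module `Historical` and the generated method names (`set_last_name`, `last_name_at`) are assumptions for illustration and do not reproduce the `ActiveTopic` API.

```ruby
require 'date'

# Hypothetical class macro: generates a writer that records the assertion
# date and a reader that answers "what was the value on this date?".
module Historical
  def historical_property(name)
    define_method("set_#{name}") do |value, at_date|
      @history ||= {}
      entries = (@history[name] ||= [])
      entries << [Date.parse(at_date), value]
      entries.sort_by!(&:first)   # keep assertions in date order
      value
    end

    define_method("#{name}_at") do |at_date|
      date = Date.parse(at_date)
      entries = (@history || {}).fetch(name, [])
      # Latest assertion made on or before the requested date, if any.
      match = entries.select { |asserted_on, _value| asserted_on <= date }.last
      match && match.last
    end
  end
end

class Person
  extend Historical
  historical_property :last_name
end

person = Person.new
person.set_last_name("Smith", "2007-10-01")
person.set_last_name("Jones", "2007-10-05")

puts person.last_name_at("2007-10-03")
puts person.last_name_at("2007-10-06")
```

The domain class stays two lines long; the bookkeeping for dated assertions is generated once, in the macro.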

Ruby and other dynamic languages allow us to go quite far in defining domain-specific languages. But if we look at CTM, TMQL and TMCL, we can imagine an even more advanced programming language that natively supports various assertion contexts (time, source, etc.), metadata, information provenance and various inference/calculation mechanisms.


- Larsblog: TMRA 2007, day 1

- Larsblog: TMRA 2007, day 2

Short presentation on TMRA 2007, COBOL and Topic Maps?