
Subject-centric programming language or what was good about COBOL

I gave a short presentation (3 slides) at TMRA 2007 about requirements for a new subject-centric programming language. I referred to COBOL as a language that had built-in, high-level support for defining and manipulating “business data”. Many modern programming languages have “outsourced” data handling to relational databases and, in doing so, lost transparency and simplicity in the manipulation of data.

Object-oriented programming languages (starting with Simula) help us model various “things” in our computers. But object-oriented languages are not optimized for representing our knowledge about these “things”. For example, it is quite easy to define a class that represents people with several properties such as ‘first_name’ and ‘last_name’ (Ruby notation):


	class Person
	   attr_accessor :first_name,:last_name
	end

It is easy to create a new instance and assign some values:


	p=Person.new
	p.first_name='John'
	p.last_name='Smith'

But what if we need to represent a situation in which names can change over time? What if different sources have different information about the names of the same person? What if information about names is not available directly but can be inferred or calculated from existing data and various inference rules? How can we specify that some object properties can be viewed or modified only by specific user groups?

What if we need to persist information about an object in some data store and retrieve this information later? What if we need to access and modify this information on a laptop when we do not have connectivity? How can we synchronize this information with a desktop that is connected to the network all the time?

These “basic”, everyday requirements easily break the simplicity of the traditional interpretation of the object-oriented paradigm and introduce complicated “frameworks” and APIs. Instead of using objects to model “things”, we need a lot of objects to deal with infrastructure and to model technical aspects of our knowledge about “things”.
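To make the first two questions concrete, here is a minimal sketch of what the simple Person class turns into once a last name carries a validity date and a source. The NameAssertion struct, the method names and the :hr_system source are assumptions made up for this illustration.

```ruby
require 'date'

# One assertion about a name: the value, the date it became valid, and who said so.
# (Hypothetical structure, for illustration only.)
NameAssertion = Struct.new(:value, :valid_from, :source)

class Person
  def initialize
    @last_names = []
  end

  # Record a new last name instead of overwriting the previous one.
  def assert_last_name(value, valid_from:, source:)
    @last_names << NameAssertion.new(value, Date.parse(valid_from), source)
  end

  # Most recent last name valid at the given date, optionally restricted to one source.
  # Returns nil when nothing is known for that date.
  def last_name(at:, source: nil)
    date = Date.parse(at)
    candidates = @last_names.select do |assertion|
      assertion.valid_from <= date && (source.nil? || assertion.source == source)
    end
    latest = candidates.max_by(&:valid_from)
    latest && latest.value
  end
end

person = Person.new
person.assert_last_name('Smith', valid_from: '2000-01-01', source: :hr_system)
person.assert_last_name('Jones', valid_from: '2007-06-15', source: :hr_system)

puts person.last_name(at: '2005-01-01')
puts person.last_name(at: '2007-10-01')
```

Even this small version already needs a helper structure and query logic that have nothing to do with the person being modeled, and it still ignores inference, access control, persistence and synchronization.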

Dynamic languages such as Lisp, Prolog, Python and Ruby allow us to “hide” these technicalities using meta-programming. They let us build the required “technical” constructs behind the scenes and help programmers work with domain-level objects.

For example, we can use “metaproperties” to define domain-specific associations, names, occurrences and types:


	class KnowsPerson < ActiveTopic::Association
	    psi           'http://psi.ontopedia.net/knows_person'
	    historical    true
	    symmetrical   true

	    role :person, :player_type  => :person, 
	                  :as_property => :knows_person, 
	                  :card_min => 2,
	                  :card_max => 2
	end

	class FirstName < ActiveTopic::Name
	    psi          'http://psi.ontopedia.net/first_name'
	    historical   true
	    card_max     1
	    domain       :person
	end

	class LastName < ActiveTopic::Name
	    psi          'http://psi.ontopedia.net/last_name'
	    historical   true
	    card_max     1
	    domain       :person
	end

	class DateOfBirth < ActiveTopic::Occurrence
	    psi          'http://psi.ontopedia.net/date_of_birth'
	    domain       :person
	    card_max     1
	    data_type    :date
	end

	class Person <  ActiveTopic::Topic
	  psi          'http://psi.ontopedia.net/Person'
	  sub_type     :thing 
	  name         :first_name
	  name         :last_name
	  occurrence   :date_of_birth
	  association  :knows_person
	end

We can then use the defined constructs to manipulate domain-specific objects:


	a=Person.create(:psi=>'JohnSmith',:at_date => '2007-10-01')
	a.first_name='John'
	a.last_name='Smith'
	a.save

	b=Person.create(:psi=>'JoeSmith',
	                :first_name=>'Joe',
	                :last_name=>'Smith',
	                :at_date => '2007-10-02')
	b.save
	a=Person.find(:psi=>'JohnSmith',:at_date => '2007-10-03')
	b=Person.find(:psi=>'JoeSmith',:at_date => '2007-10-03')
	b.knows_person << a
	a.save
	b.save
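Class-level declarations such as psi and name in the definitions above can be built with ordinary Ruby meta-programming. The following is only a sketch of the general technique, not the ActiveTopic implementation: the MiniTopic module is a hypothetical stand-in, and the macro is called name_property here so that it does not override Ruby's built-in Class#name.

```ruby
# Hypothetical stand-in for a base class like ActiveTopic::Topic.
module MiniTopic
  class Topic
    class << self
      # Class-level macro: stores the subject identifier when given one, and reads it back.
      def psi(value = nil)
        @psi = value if value
        @psi
      end

      # Class-level macro: declares a name property and generates its reader and writer.
      def name_property(prop)
        name_properties << prop
        define_method(prop) { @values[prop] }
        define_method("#{prop}=") { |value| @values[prop] = value }
      end

      # Properties declared so far on this class.
      def name_properties
        @name_properties ||= []
      end
    end

    def initialize(attrs = {})
      @values = {}
      attrs.each { |key, value| public_send("#{key}=", value) }
    end
  end
end

class Person < MiniTopic::Topic
  psi 'http://psi.ontopedia.net/Person'
  name_property :first_name
  name_property :last_name
end

john = Person.new(first_name: 'John')
john.last_name = 'Smith'
puts Person.psi
puts "#{john.first_name} #{john.last_name}"
```

A real implementation would hang history, cardinality checks and persistence on the same hooks; the point is only that the domain class itself stays declarative.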

Ruby and other dynamic languages allow us to go quite far in defining domain-specific languages. But if we look at CTM, TMQL and TMCL, we can imagine an even more advanced programming language that natively supports various assertion contexts (time, source, etc.), metadata, information provenance and various inference/calculation mechanisms.

Links:

- Larsblog: TMRA 2007, day 1

- Larsblog: TMRA 2007, day 2

Short presentation on TMRA 2007, COBOL and Topic Maps?

Resource-oriented architecture and Subject-centric computing: what is the difference?

I just finished reading RESTful Web Services. It is an amazing book and I think it will play a very important role in defining the main principles of the next generation of the Web. The authors of the book introduce the Resource-Oriented Architecture (ROA) as an architecture for building the resource-centric programmable Web. “Resource” is a fundamental concept in this architecture.

“A resource is anything that’s important enough to be referenced as a thing in itself… What makes a resource a resource? It has to have at least one URI. The URI is the name and address of the resource…”

“… A resource can be anything a client might want to link to: a work of art, a piece of information, a physical object, a concept, or a grouping of references to other resources… The client cannot access resources directly. A [ROA-based] web service serves representations of a resource: documents in specific data formats that contain information about the resource…”

ROA defines principles for organizing data sets as resources, approaches to designing representations of these resources, and the main operations on these representations.

The key concept of Subject-centric computing (SCC) is a “Subject”, which is defined as “anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever”. This definition is very close to the definition of a “Resource” in ROA.

But there are important differences between the main goals of ROA and SCC. Subject-centric computing is less concerned with managing resource/subject representations and with using universal HTTP operations such as GET, POST, PUT and DELETE to manipulate resources. SCC assumes that there are (at least potentially) many different data sets and documents which describe or reference the same subject. With SCC, our main concern is identifying subjects reliably and bringing together different pieces of information related to the same subject.

As with ROA, we use (resolvable) URIs to identify Resources/Subjects. But in the case of SCC, we promote the use of Published Subject Identifiers (PSIs). If we have a subject that is not a “digital information item”, its PSI should resolve to a special kind of “document”, a Published Subject Descriptor (PSD). Each PSD provides a human-readable description of a subject which is sufficient to distinguish this subject from other subjects. In ROA terminology, a PSD is a special kind of representation that is introduced to convey “identification” information about a subject.

Many other “documents” and data sets which contain various assertions about the same subject can exist on the Web. SCC is concerned with providing the ability to collect these various assertions into a 360° view of the subject. PSIs are one of the main mechanisms for achieving this goal.

With SCC, we do not have the luxury of doing point-to-point data integration each time we have a new data set. That’s why we rely on a universal representation formalism, which is an important part of the ISO Topic Maps standard. Topic Maps also provide a universal merging mechanism that takes care of integrating various data sets published using an interchange syntax such as XTM.
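The core idea of PSI-based merging can be sketched in a few lines of Ruby. This is a simplified illustration, not the full merging procedure of the Topic Maps standard: topics are plain hashes, only subject identifiers trigger a merge, and the example.org PSIs and the :crm and :blog sources are invented for the example.

```ruby
require 'set'

# Merge topics that share at least one PSI.
# Each topic is a hash with :psis, :names and :occurrences (illustrative structure).
def merge_topics(topics)
  merged = []
  topics.each do |topic|
    combined = {
      psis: Set.new(topic[:psis]),
      names: Set.new(topic[:names]),
      occurrences: topic[:occurrences].dup
    }
    # Split what we have so far into topics about the same subject and the rest.
    matches, merged = merged.partition { |m| m[:psis].intersect?(combined[:psis]) }
    matches.each do |match|
      combined[:psis].merge(match[:psis])
      combined[:names].merge(match[:names])
      combined[:occurrences] = match[:occurrences] + combined[:occurrences]
    end
    merged << combined
  end
  merged
end

john_psi = 'http://psi.example.org/person/john-smith'
from_crm = { psis: [john_psi], names: ['John Smith'],
             occurrences: [{ type: :email, value: 'john@example.org', source: :crm }] }
from_blog = { psis: [john_psi], names: ['J. Smith'],
              occurrences: [{ type: :homepage, value: 'http://example.org/~john', source: :blog }] }
joe = { psis: ['http://psi.example.org/person/joe-smith'], names: ['Joe Smith'], occurrences: [] }

result = merge_topics([from_crm, from_blog, joe])
puts result.size
puts result.first[:names].to_a.sort.inspect
```

Because each occurrence keeps its :source, the merged topic still shows where each piece of information came from.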

One of the main goals of SCC is to support the “associative nature” of human thinking. ROA is quite often satisfied with “shallow” representations of associations (with the “a” HTML tag, for example). SCC is targeted more at semantically rich representations of relationships between subjects. Topic Maps help to represent and manage such relationships as “instance-type”, “supertype-subtype” and thousands of domain-specific association types. Representations of these relationships are available for processing at the semantic level. This makes it possible to implement integration scenarios which are “unthinkable” with HTML-like representations.

But in general, ROA and SCC are complementary architectures and can be used together successfully to build exciting applications and environments.

Instant messaging, subject centric group chats, topic maps and … goodbye email (… almost)

I recently saw a presentation of Parlano MindAlign for the Microsoft Live Communications Server instant messaging platform.

I have enjoyed using IM, IRC and enterprise group chats for many years. I use them for person-to-person communication, for getting quick answers from peers and providing them, and for notifications about important events. MindAlign introduces a new trend, I think: real-time subject-centric communication. The MindAlign smart client makes it possible to manage and participate effectively in hundreds of subject-centric channels at the same time. It also lets users see the history of all conversations and search message archives. None of these features is new. But effective support for hundreds of channels on the client side changes the rules of the game and moves group chats to a new level.

There is something extremely powerful in the combination of real-time subject-centric communication, access to message history, and search. I think that this kind of system can replace about 80% of emails in the future.

What is the next step? We can connect channel topics with a topic map and allow users to reference subtopics in a real-time conversation using an analog of WikiWords. In this case we have a topic map which is modified by users in real time. This topic map has references to group chat messages. But like any other topic map, it can also hold information about associations between topics and references to other resources.

We can give users the ability to provide Wiki-like occurrences in this topic map and to add links to resources (an analog of the social bookmark manager del.icio.us).

The result is a live topic map which integrates summary information about subjects, associations between subjects, real-time messages, and links to resources connected with subjects.
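A rough sketch of the WikiWord idea: each posted message is attached to its channel topic and to every WikiWord it mentions. The LiveTopicMap class, the WikiWord pattern and the channel and user names are assumptions for illustration; this has nothing to do with MindAlign's actual API.

```ruby
# A WikiWord here means two or more capitalized parts run together, e.g. "StagingServer".
WIKI_WORD = /\b(?:[A-Z][a-z0-9]+){2,}\b/

# Hypothetical in-memory "live" topic map: topic name => messages that reference it.
class LiveTopicMap
  attr_reader :topics

  def initialize
    @topics = Hash.new { |hash, name| hash[name] = [] }
  end

  # Store a message under the channel topic and under every WikiWord in its text.
  def post(channel, author, text)
    message = { channel: channel, author: author, text: text }
    ([channel] + text.scan(WIKI_WORD)).uniq.each { |name| @topics[name] << message }
    message
  end
end

map = LiveTopicMap.new
map.post('ProjectApollo', 'anna', 'Deployment to StagingServer is done')
map.post('ProjectApollo', 'boris', 'StagingServer needs a restart, see ReleaseNotes')

puts map.topics.keys.inspect
puts map.topics['StagingServer'].size
```

Subtopics such as StagingServer come into existence the first time somebody mentions them, which is what makes the topic map “live”.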

Apple’s Spotlight, what do we search for and … topic maps

I recently enjoyed watching the “Tiger” presentation, and specifically the presentation of Apple’s new search technology, “Spotlight”.

Like many other people, I would like to have this kind of search now on OS X, Windows and Linux computers. I would also like to have this kind of search for enterprise document repositories.

What I cannot find in this demonstration is an explicit concept of “subjects” or “topics”. If I select the name of a person in an email, for example, I can find all emails, presentations, calendar entries, documents, images etc. which reference this name in a file name, in metatags or in document content. But can I find all projects which I manage? Can I find all applications which I am responsible for? Can I find all servers which I have to check from time to time, or all technologies which I am interested in? Projects, applications, servers and technologies are subjects in my area of interest.

When I search, I would like to search not only for resources which reference my favorite subjects, but also for other subjects which are connected with the subject in focus.

So I will probably add a topic map engine to Spotlight on my OS X computer as soon as Tiger is available. How will I use the topic map engine? I will use it to define subjects which are not covered by standard OS X applications. I will use it to manage relationships between subjects in my area of interest. I will also write a script which creates pseudo-documents (in HTML format?) for each subject. Each pseudo-document will contain all names, inline occurrences and associations. I can also create document proxies for external resources which are not located on my hard drive (if Spotlight/Safari do not allow custom metatags to be attached to bookmarked URIs).

It seems that Spotlight allows custom document categories/types to be defined. So I can define pseudo-document types for my subject classes, such as “projects”, “applications”, “people”, “servers”, “companies”, “technologies” etc. Then I can use the standard system-wide Spotlight engine to search subjects and resources. And I can use Safari to navigate between different subjects.
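The pseudo-document script could look roughly like this. It is only a sketch: the subject hash layout, the subject-type meta tag and the file paths are my own assumptions, and whether Spotlight would actually pick up such a tag is untested.

```ruby
# Minimal HTML escaping so that names and values cannot break the markup.
def h(text)
  text.to_s.gsub(/[&<>"]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;')
end

# Build one HTML pseudo-document with all names, inline occurrences and associations.
def pseudo_document(subject)
  title = h(subject[:names].first)
  names = subject[:names].map { |name| "<li>#{h(name)}</li>" }.join("\n")
  occurrences = subject[:occurrences].map do |type, value|
    "<dt>#{h(type)}</dt><dd>#{h(value)}</dd>"
  end.join("\n")
  associations = subject[:associations].map do |assoc|
    %(<li>#{h(assoc[:type])}: <a href="#{h(assoc[:href])}">#{h(assoc[:target])}</a></li>)
  end.join("\n")

  <<~HTML
    <html>
    <head>
    <title>#{title}</title>
    <meta name="subject-type" content="#{h(subject[:type])}">
    </head>
    <body>
    <h1>#{title}</h1>
    <ul class="names">
    #{names}
    </ul>
    <dl class="occurrences">
    #{occurrences}
    </dl>
    <ul class="associations">
    #{associations}
    </ul>
    </body>
    </html>
  HTML
end

project = {
  type: 'project',
  names: ['Topic Map Engine & Spotlight'],
  occurrences: { description: 'Index subjects as pseudo-documents' },
  associations: [
    { type: 'uses technology', target: 'Topic Maps', href: 'technologies/topic-maps.html' }
  ]
}

html = pseudo_document(project)
puts html
```

Writing each result to a file (for example with File.write) in a folder that the search engine indexes would complete the script; links between pseudo-documents then make it possible to navigate from subject to subject in a browser.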

It is time for “save as xtm” initiative

More and more applications can produce an XML representation of their internal information and save it to shared storage. This helps users synchronize information across several computers. XML representations also help to create user communities based on sharing information. Think about shared calendars, music and picture mixes, blogs, recipes. It’s nice, but it can be much better… with topic maps.

Topic Maps provide “out of the box” support for information sharing and merging. This support is based on the ability to represent subjects explicitly and to connect any piece of information with subjects.

If we have a blog entry, for example, we have a standard mechanism to express that this entry is related to specific subjects. And we have a standard way to merge information from several blogs. As a result, we can easily find all blog entries related to the same subject.

“Pure” XML solutions can encode relationships between pieces of information and subjects. But these solutions are based on custom schemas. Each time, we need to define custom merging rules, which may also include transformations between various XML schemas.

It is time… it is time to promote the XTM format as a “save as” option for various applications. Applications can use optimized internal data models to implement a specific set of functions. But applications can also publish Topic Map-based representations of their internal information to shared storage. Other applications can “subscribe” to external topic maps and merge external and internal information. Of course, applications should remember the source of information so that users can keep track of “who said what”.

With “save as XTM” support it will be possible to use “universal topic map browsers” to explore information from different applications. Users will also be able to rely on specific applications with optimized views.
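As an illustration, a “save as XTM” function for an application's internal data might look like the sketch below. The element names are modelled loosely on the XTM 2.0 syntax and the output is not validated against the schema, so treat the exact markup as an assumption to check against the specification. The topic hash layout and the example.org PSI are invented for the example.

```ruby
# Escape text for use in XML content and attribute values.
def xml_escape(text)
  text.to_s.gsub(/[&<>"]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;')
end

# Serialize one topic: identifiers first, then names, then inline occurrences.
def topic_to_xtm(topic)
  lines = [%(  <topic id="#{xml_escape(topic[:id])}">)]
  topic.fetch(:psis, []).each do |psi|
    lines << %(    <subjectIdentifier href="#{xml_escape(psi)}"/>)
  end
  topic.fetch(:names, []).each do |name|
    lines << "    <name><value>#{xml_escape(name)}</value></name>"
  end
  topic.fetch(:occurrences, []).each do |occurrence|
    # The occurrence type points at another topic in the same map by its id.
    type_ref = '#' + occurrence[:type].to_s
    lines << '    <occurrence>'
    lines << %(      <type><topicRef href="#{xml_escape(type_ref)}"/></type>)
    lines << "      <resourceData>#{xml_escape(occurrence[:value])}</resourceData>"
    lines << '    </occurrence>'
  end
  lines << '  </topic>'
  lines.join("\n")
end

# Wrap all topics in a topicMap element (simplified, XTM 2.0 style).
def save_as_xtm(topics)
  [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<topicMap xmlns="http://www.topicmaps.org/xtm/" version="2.0">',
    *topics.map { |topic| topic_to_xtm(topic) },
    '</topicMap>'
  ].join("\n")
end

topics = [
  { id: 'blog-entry', names: ['Blog entry'] },
  { id: 'topic-maps',
    psis: ['http://psi.example.org/topic-maps'],
    names: ['Topic Maps'],
    occurrences: [{ type: 'blog-entry', value: 'It is time for "save as xtm"' }] }
]

xtm = save_as_xtm(topics)
puts xtm
```

The application keeps its own optimized data model and only maps it to topics, names and occurrences at export time; an importing application can then merge on the subject identifiers.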