Tag Archives: Resource-Oriented Architecture

The new version of Ontopedia PSI server

The new version of Ontopedia PSI server is out now. It is possible to represent various types of assertions related to subjects (names, occurrences, associations). The new PSI server allows also to record and integrate opinions of different users. Its internal knowledge representation is optimized for paraconsistent reasoning.

I started to play with some topics that I am interested in. For example, Subject-centric Computing , Apple Inc .
As with typical Topic Maps-based system, we can easily add new subject and assertion types, we are not limited by fixed domain models. In addition, the new PSI server supports recording of assertion provenance and five truth values.

We also tried to follow the Resource-Oriented Architecture: each subject, each assertion, each subject-centric group of assertions of the same type has own Uri and “page”.

The main goal of this version is to experiment with assertion level subject-centric representations vs. more traditional portal-based approach.

2008 Semantic Technology Conference: random observations

I am back from Semantic Technology Conference. It is becoming bigger and bigger each year. This year there were more than hundred sessions, full day of tutorials, product exhibition. It was quite crowded and energizing.

Just some random observations:

Oracle improves RDF / OWL support in 11g database, considers RDF/OWL as strategic/enabling technologies which will be leveraged in future versions of Oracle products.

Yahoo uses RDF to organize content on various web sites. It also introduced SearchMonkey – extension to Yahoo search platform which allows to provide more detailed information about information resources.

– Consumer oriented web sites powered by semantic technologies are here. Twine, Freebase, Powerset are good examples, more to come.

Resource Oriented Architecture and RDF could be a very powerful combination. More and more people understand the value of exposing data through URIs in the form of information resources.
Linked Data initiative looks quite interesting.

– Some advanced semantic applications use knowledge representation formalisms that go beyond basic RDF/OWL model.
But RDF/OWL can be used to surface/exchange information based on W3C standards. Lots of discussions about
information provenance, trust, “semantic spam”.

– It looks like there is a workable solution (compromise) for ’Web’s Identity Crisis’. The idea is to reserve HTTP 303 (“See Other”) code for indication of “Concept URIs”. 303 response should include an additional URI for “See Other” information resource. This approach combined
with new PURL -like servers allows to keep RDF “as is” and to implement something close to the idea of Published Subject Identifiers

Franz demonstrated a new version of AllegroGraph 64-bit RDFStore. Franz implemented support for Named Graphs (can be used for representing weights, trust factors, provenance)
and incorporated geospatial and temporal libraries. Named Graphs allow to deal with contexts using RDF.

– Text analysis tools become better and better. Interesting example is AllegroGraph.
Incorporating natural language processors allows to extract entities and relationships with reasonable level of precision (News Portal sample).

Doug Lenat did a great presentation on the conference about the history of Cyc project. It looks like in 5-10 years we can expect “artificial intelligent assistants” with quite sophisticated abilities to reason.

Serendipitous reuse and representations with basic ontological commitments

Steve Vinoski published a very interesting article: Serendipitous reuse. He also provided additional comments in his blog. The author explores benefits of RESTful uniform interfaces based on HTTP “verbs” GET, PUT, POST and DELETE for building expansible distributed systems. He also compares RESTful approach with traditional SOA implementations based on strongly typed operation-centric interfaces.

Serendipitous reuse is one of the main goals of Subject-centric computing. In addition to uniform interfaces, Subject-centric computing promotes usage of uniform representations with basic ontological commitments (as one of the possible representations).

One of the fundamental principles of the Resource-Oriented Architecture is the support for multiple representations for the same resource. For example, if we have a RESTful service which collects information about people, GET request can return multiple representations.

Example using JSON:


{
	"id":          "John_Smith",
	"type":        "Person",
	"first_name":  "John",
	"last_name":   "Smith",	
	"born_in":      {
			   "id": "Boston_MA_US", 
			   "name": "Boston"
			}
} 

Example using one of the “domain specific” XML vocabularies:


<person id="John_Smith">
	<first_name>John</first_name>
	<last_name>Smith</last_name>
	<born_in ref="Boston_MA_US">Boston</born_in>
</person>	

Example using one of the “domain independent” XML vocabularies:


<object obj_id="John_Smith">
        <property prop_id="first_name" prop_name="first name">John</property>
        <property prop_id="last_name" prop_name="last name">Smith</property>
        <property prop_id="born_in" prop_name="born in" val_ref="Boston_MA_US">
                 Boston
        </property>
</object>	

Example using HTML:


<div class="object">
	<div class="data-property-value">
		<div class="property">first name</div>
		<div class="value">John</div>
	</div>	
	<div class="data-property-value">
		<div class="property">last name</div>
		<div class="value">Smith</div>
	</div>	
	<div class="object-property-value">
		<div class="property">born in</div>
		<div class="value">
			<a href="/Boston_MA_US">Boston</a>
		</div>
	</div>	
</div>	

Example using text:


John Smith was born in Boston

These five formats are examples of data-centric representations without built-in ontological commitments. These formats do not define any relationship between representation and things in the “real world”. Programs which communicate using JSON, for example, do not “know” what “first_name” means. It is just a string that is used as a key in a hash table.

Creators of RESTful services typically define additional constraints and default interpretation for corresponding data-centric representations. For example, we can agree to use “id” string in JSON-based representation as an object identifier and we can publish some human readable document which describes and clarifies this agreement. But the key moment is that this agreement is not a part of JSON format.

Even if we are talking about a representation based on a domain specific XML vocabulary, semantic interpretation is outside of this vocabulary and is a part of an informal schema description (using comments or annotations).

Interestingly enough, level of usefulness is different for various representations. In case of a text, for example, computer can show text “as is”. It is also possible to do full-text indexing and to implement simple full-text search.

HTML-based representations add some structure, ability to use styles and linking between resources. Some links analysis can help to improve results of basic full-text search.

If we look at representations based on Topic Maps, situation is different. Topic Maps technology is a knowledge representation formalism and it embeds a set of ontological commitments. Topic Maps-based representations, for example, commit to such categories as topics, subject identifiers, subject locators, names, occurrences (properties) and associations between topics. There is also the commitment to two association types: “instance-type” and “subtype-supertype”. Topic Maps also support contextual assertions (using scope).

In addition, Topic Maps promote usage of Published Subject Identifiers (PSIs) as a universal mechanism for identifying “things”.

Topic Maps – based representations are optimized for information merging. For example, computers can _automatically_ merge fragments produced by different RESTful services:

Fragment 1 (based on draft of Compact Syntax for Topic Maps: CTM):


p:John_Smith
   isa po:person; 
   - "John Smith"; 
   - "John" @ po:first_name; 
   - "Smith" @ po:last_name
.

g:Boston_MA_US - "Boston"; isa geo:city. 

po:born_in(p:John_Smith : po:person, g:Boston_MA_US : geo:location)

Fragment 2:


g:Paris_FR - "Paris"; isa geo:city. 

po:likes(p:John_Smith : po:person, g:Paris_FR : o:object)

Result of automatic merging:


p:John_Smith
   isa po:person; 
   - "John Smith"; 
   - "John" @ po:first_name; 
   - "Smith" @ po:last_name
.

g:Boston_MA_US - "Boston"; isa geo:city. 

g:Paris_FR - "Paris"; isa geo:city. 

po:born_in(p:John_Smith : po:person, g:Boston_MA_US : geo:location)

po:likes(p:John_Smith : po:person, g:Paris_FR : o:object)

As any other representation formalism, Topic Maps are not ideal. But Topic Maps enthusiasts think that Topic Maps capture a “robust set” of ontological commitments which can drastically improve our ability to organize and manage information and to achieve real reuse of information with added value.

Resource-Oriented Architecture and Subject-centric computing vs. traditional SOA: modeling business transactions

If we look at traditional SOA, business transactions are modeled typically as service operations that are part of a service contract. Operation invocations in traditional SOA are not treated as first class “objects”. Operation invocations do not have own identity. Components/processes inside of a service and service clients cannot reference individual operation calls. Situation is different if we look at subject-centric and RESTFul services.

If a client of some subject-centric (or RESTFul) service needs to start a transaction, this client should create a new subject “request for a transaction” with own identity and internal state. Subject-centric service processes this request and some other subjects can be created/updated/deleted as a result of this operation. Service clients have direct access to subjects that represent transactions. Clients can check status of any initiated transaction. It is also possible to use a general query/search interface for finding various subsets of transactions.

Service invocation results can be presented as a special kind of subjects which are linked to original requests for transactions. Subject-centric services also can record “cause and effect” relationships that connect a request for a transaction and results of implementing this transaction as a network of related “events”. Subject-centric computing promotes (and helps) to build transparent services.

It is true that subject-centric services can generate many more subjects (and assertions about subjects) in comparison with modern SOA/object-oriented systems. But Subject-centric computing is in a unique position to leverage available hardware parallelism and distributed storage. Subject-centric services model changes in time differently from traditional computing systems. Subject-centric services do not do “updates”, they just add new assertions about subjects in a new time-specific context. Subject-centric services also have a built-in mechanism for merging assertions from multiple sources so new assertions can be created on different physical storage devices. Computations in subject-centric world can be described using data flow abstractions which allow perfect (natural) parallelism.

Resource-oriented architecture and Subject-centric computing: what is the difference?

I just finished reading RESTful Web Services. It is an amazing book and I think it will play a very important role in defining main principles of the next generation of the Web. The authors of the book introduce the Resource-Oriented Architecture (ROA) as an architecture for building the resource-centric programmable Web. “Resource” is a fundamental concept in this architecture.

“A resource is anything that’s important enough to be referenced as thing in itself… What makes a resource a resource? It has to have at least one URI. The URI is the name and address of the resource…”

“… A resource can be anything a client might want to link to: a work of art, a piece of the information, a physical object, a concept, or a grouping of references to other resources… The client cannot access resources directly. A [ROA-based] web service serves representations of a resource: documents in a specific data formats that contain information about the resource…”

ROA defines principles of organizing data sets as resources, approaches to designing representations of these resources and main operations on these representations.

The key concept of the Subject-centric computing (SCC) is a “Subject” which is defined as “anything whatsoever, regardless of whether it exists or has any other specific characteristics, about which anything whatsoever may be asserted by any means whatsoever”. This definition is very close to the definition of a “Resource” in ROA.

But there are important differences between ROA and SCC main goals. The Subject-centric computing is less concerned with managing resource/subject representations and using universal HTTP operations such as GET, POST, PUT and DELETE to manipulate resources. SCC assumes that there are a lot of different data sets/documents (at least potentially) which describe or reference the same subject. With SCC, our main concern is in identifying subjects reliably and in bringing together different pieces of information related to the same subject.

As with ROA, we use (resolvable) URIs to identify Resources/Subjects. But in the case of SCC, we promote usage of Published Subject Identifiers (PSIs). If we have a subject that is not a “digital information item”, its PSI should be resolvable to a special kind of a “document” – Published Subject Descriptor (PSD). Each PSD provides a human readable description of a subject which is enough for distinguishing this subject from other subjects. Using ROA terminology, PSD is a special kind of a representation that is introduced to convey “identification” information about a subject.

Many other “documents” and data sets which contain various assertions about the same subject can exist on the Web. SCC is concerned with providing ability to collect these various assertions into the 360° view of the subject. PSIs are one of the main mechanisms to achieve this goal.

With SCC, we do not have a luxury of doing point-to-point data integration each time when we have a new data set. That’s why we rely on universal representation formalism which is an important part of ISO Topic Maps standard. Topic Maps provide also a universal merging mechanism that takes care of integration of various data sets published using an interchange syntax such as XTM.

One of the main goals of SCC is to support “associative nature” of human thinking. ROA is satisfied quite often with “shallow” representations of associations (with the “a” HTML tag, for example). SCC is more targeted to semantically rich representations of relationships between subjects. Topic Maps help to represent and manage such relationships as “instance-type”, “supertype-subtype” and thousands of domain-specific association types. Representations of these relationships are available for processing at the semantic level. It makes possible to implement integration scenarios which are “unthinkable” with HTML-like representations.

But in general, ROA and SCC are complementary architectures and can be successfully used together to build exciting applications and environments