Category Archives: Thoughts

Subject-centric applications: toward subject-centric computing

Recently we added a small framework that allows us to build and use subject-centric applications in Ontopedia. Within the traditional paradigm of application-centric computing, we first start an application (or go to a domain- or function-specific website) and only then set the context within that application/website. Within a subject-centric environment, we select a subject first and then have access to various applications/functions that can be used with the subject in context.

As an example, we implemented a basic subject-centric application, Company Headquarters Map. It can be used with various companies: the app tries to find the geo point of a company’s headquarters and maps it using the Google Maps service.

Live example: Tibco Software Headquarters on a map

Subject-centric apps can use information recorded in the Ontopedia knowledge map and provide rich, user-friendly interactivity. They can also use external data sources and submit pieces of information back to the Ontopedia knowledge map.

If we have just one application, this is not too exciting. But we are talking about a framework that lets us integrate multiple subject-centric apps relevant to various subjects. All these applications have one very important feature in common: they can ‘sense’ the current subject context. This approach can be used on the Web or on a new generation of desktops.

More about my perspective on knowledge maps/grids, subject-centric pages, widgets, and apps can be found in these references:

Enterprise Search, Faceted Navigation and Subject-Centric Portals; Topic Maps, 2008

Ruby, Topic Maps and Subject-centric Blogging: Tutorial; Topic Maps, 2008

Enterprise Knowledge Map; Topic Maps, 2007

Topic Maps Grid; Extreme Markup, 2006

The fundamentals and possibilities are described perfectly in Steve Pepper’s presentation at Topic Maps 2008 (PPT format):

Everything is a Subject

“Creating Linked Data” by Jeni Tennison

Jeni Tennison has published several excellent blog entries describing the process of creating Linked Data. If you are interested in semantic technologies, you will find lots of important ideas in these postings.

From Jeni’s Musings:

Creating Linked Data – Part I: Analysing and Modelling

Creating Linked Data – Part II: Defining URIs

Creating Linked Data – Part III: Defining Concept Schemes

Creating Linked Data – Part IV: Developing RDF Schemas

Creating Linked Data – Part V: Finishing Touches

We have more and more Linked Data (published as RDF). I think it is very important to find a way of using RDF data in Linked Topic Maps, and vice versa. The traditional approach to RDF/Topic Maps mapping is based on the idea of mapping RDF object properties to associations. I have been playing with the idea of mapping object properties to occurrences for quite some time. In Ontopedia, it is possible to represent RDF object properties directly as “occurrences”. Mapping between subject-centric occurrences and associations can be done later, as a result of inference. In Ontopedia, from a user-experience point of view, there is no big difference between associations and subject-centric occurrences: identifiers, locators, names, associations, and occurrences are all rendered as subject “properties”. I think that with this approach, RDF data can be “naturally” integrated with Topic Maps-based data sets.
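
As a minimal sketch of the “object properties as occurrences” idea, assuming the Python rdflib library (the topic/occurrence structures here are simplified stand-ins for illustration, not Ontopedia’s internal model):

from collections import defaultdict
from rdflib import Graph, URIRef

def properties_as_occurrences(rdf_xml):
    """Map every RDF object property to a URI-valued occurrence
    on the subject topic, instead of creating an association."""
    g = Graph()
    g.parse(data=rdf_xml, format="xml")
    topics = defaultdict(list)  # subject URI -> [(occurrence type, value)]
    for s, p, o in g:
        if isinstance(o, URIRef):  # an object property, not a literal
            topics[str(s)].append((str(p), str(o)))
    return topics

An inference step could later promote any of these occurrences to a full association between two topics, which is exactly the “mapping as a result of inference” mentioned above.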

Modeling Time and Identity

I really like the ideas described in this presentation by Rich Hickey.

Link on InfoQ: http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey

One of the fundamental requirements for future computing is the explicit representation of “values” and of changes in time.

“… The future is a function of the past, it doesn’t change it …”

“… We associate identities with a series of causally related values …”

The ideas presented are very close to my own understanding of a new subject-centric programming model.
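
To make the idea concrete for myself, here is a tiny Python sketch of this model (my own illustration, not code from the presentation): an identity is a series of immutable, time-stamped values, and updating it never changes the past.

import time

class Identity:
    """An identity: a series of causally related, immutable values."""
    def __init__(self):
        self._history = []  # list of (timestamp, value); the past never changes

    def update(self, value):
        # A new value is appended; older values are never mutated.
        self._history.append((time.time(), value))

    def current(self):
        return self._history[-1][1]

    def as_of(self, t):
        # The newest value observed at or before time t (None if none yet).
        past = [(ts, v) for ts, v in self._history if ts <= t]
        return max(past)[1] if past else None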

Paraconsistent Reasoning in Ontopedia

I gave a short presentation about paraconsistent reasoning in Ontopedia at TMRA 2009.

Summary:

  • Paraconsistent reasoning allows us to collect assertions from
    various sources and “safely” infer new information (a toy
    sketch follows below)
  • Paraconsistent reasoning works well together with constraint
    languages (such as TMCL)
  • Paraconsistent reasoning supports an evolutionary approach to
    building large assertion systems
  • We do not need to “fix all errors” before we can reason
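
As a toy illustration of the first point (my own simplification in the spirit of Belnap’s four-valued logic, not Ontopedia’s actual engine): opinions from several sources are merged per statement, so a contradiction stays local and no ‘explosion’ occurs.

from collections import defaultdict
from enum import Enum

class TV(Enum):
    UNKNOWN = 0  # no evidence
    TRUE = 1     # evidence for
    FALSE = 2    # evidence against
    BOTH = 3     # contradictory evidence

def merge(a, b):
    # Join of evidence, encoded as bit flags: TRUE + FALSE = BOTH.
    return TV(a.value | b.value)

def evaluate(opinions):
    """opinions: iterable of (source, statement, TV); returns statement -> TV."""
    result = defaultdict(lambda: TV.UNKNOWN)
    for _source, statement, tv in opinions:
        result[statement] = merge(result[statement], tv)
    return dict(result)

opinions = [
    ("site_a", "X presented P", TV.TRUE),
    ("site_b", "X presented P", TV.FALSE),   # contradicts site_a
    ("site_a", "P is a Presentation", TV.TRUE),
]
print(evaluate(opinions))
# 'X presented P' -> BOTH (a local contradiction);
# 'P is a Presentation' -> TRUE (unaffected: no explosion)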

iPhone OS 3.0 – ready for Subject-centric computing

Apple has just introduced iPhone OS 3.0 (beta) and the 3.0 SDK. There are lots of improvements and new features. The iPhone is a great platform for developing mobile applications, and OS 3.0 makes it even more compelling for building Subject-centric solutions. One of my favorite features is the Push Notification Service.

We introduced Subject-centric RSS feeds on the Ontopedia PSI server some time ago. With RSS feeds in place, we can subscribe to and monitor information about subjects we are interested in using RSS aggregators (including mobile ones). As an Ontopedia user, I can submit an assertion, for example, that I am thinking about Blogging Vocabulary. Everyone with an RSS subscription to Blogging Vocabulary or to my PSI will be notified about this new assertion.

But, of course, existing RSS aggregators and the pull model do not allow us to realize the full potential of Subject-centric micro-blogging. Services like the iPhone Push Notification Service are game changers. I wrote this blog post about Subject-centric real-time messaging many years ago. Now is the right time to implement it, and with the new Apple iPhone SDK it should be fun.

Finding “facts” (without scanning millions of documents)

The new version of the Ontopedia PSI server is out. There are several interesting features in this release. We introduced auto-reification of all assertions: “everything is a subject” now. In the new version, the preferable and recommended way to model web resources is to model them as first-class “subjects”. Another interesting feature is the ability to search for ‘facts’ related to various subjects.

Every assertion created in Ontopedia’s knowledge map is automatically reified as a ‘subject’. Starting from the moment of ‘creation’, assertion-based PSIs have a regular ‘life cycle’. Users can change a PSI’s default name and description. It is also possible to deprecate PSIs and introduce new PSIs for the same subject. And, of course, users can make assertions about other assertions. This feature is quite helpful, for example, for modeling changes in time (combined with time-interval scoping).
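
A toy sketch of what auto-reification means in practice (the names and PSI pattern are invented; this is not Ontopedia’s actual API): creating an assertion also mints a PSI for the assertion itself, so the assertion immediately becomes a subject that other assertions can point at.

import itertools

_ids = itertools.count(1)
knowledge_map = {}  # assertion PSI -> assertion record

def assert_fact(subject_psi, predicate, object_psi):
    # Every assertion is automatically reified: it gets its own PSI.
    assertion_psi = "http://psi.ontopedia.net/assertion_%d" % next(_ids)
    knowledge_map[assertion_psi] = {
        "subject": subject_psi, "predicate": predicate, "object": object_psi,
    }
    return assertion_psi

a1 = assert_fact("http://psi.ontopedia.net/Apple_Inc", "is_a",
                 "http://psi.ontopedia.net/Company")
# The assertion is itself a subject, so we can assert things about it:
a2 = assert_fact(a1, "asserted_by", "http://psi.ontopedia.net/Some_User")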

Speaking about modeling resources on the web: we continue to support URI-based properties/occurrences, but the main modeling practice moving forward is based on creating explicit subjects for web resources and using associations to connect resources with other subjects. Ontopedia’s generic user interface will be optimized in the next releases to support the dual nature of web resources (as “subjects” and as “links”).

The next feature is related to improving ‘findability’. I think that in many cases we are looking for ‘facts’, not documents, so we are taking a first step toward providing direct access to the ‘facts’ collected in Ontopedia’s knowledge map. We use basic faceted search/navigation with three main facets: ‘Concepts’, ‘Web Resources’, and ‘Assertions’. For example, if we type ‘apple’ in Ontopedia’s search box, we find information items in all three tabs on the front search page. The most interesting tab is probably ‘Assertions’: it provides direct access to facts which include a reference to ‘apple’. Future versions of the ‘Assertions’ tab will include additional facets that allow users to ‘slice and dice’ assertions.
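
A rough sketch of the three-facet search just described (the data and function names are invented for illustration):

def faceted_search(query, items):
    """items: iterable of (facet, text), where facet is one of
    'Concepts', 'Web Resources' or 'Assertions'."""
    q = query.lower()
    results = {"Concepts": [], "Web Resources": [], "Assertions": []}
    for facet, text in items:
        if q in text.lower():
            results[facet].append(text)
    return results

items = [
    ("Concepts", "Apple Inc"),
    ("Web Resources", "http://www.apple.com"),
    ("Assertions", "Apple Inc is a Company"),
]
print(faceted_search("apple", items))  # hits show up under all three facets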

With this new feature, our goal is to demonstrate that Subject-centric computing can change the ‘search paradigm’ by providing direct, reliable access to ‘facts’. Of course, it will take a lot of effort to make this approach scalable. But recent enhancements in commercial and open source faceted search engines, and achievements in creating “knowledge maps”/“smart indices”, make me believe that we are not that far from being able to directly find the ‘facts’ we are interested in.

Subject-centric micro-blogging and Ontopedia’s knowledge map

Traditionally, when we think about the subject-centric approach to organizing information, we have in mind the equivalent of “master data”: main entities, their properties, and relationships. This type of information is relatively static. Of course, the subject-centric approach also works well for representing and organizing information about “transactions” and “events”.

“Master data” (PSIs for people, places, companies, products, etc.) forms the conceptual frame/“endoskeleton” of Ontopedia’s knowledge map. For example, http://psi.ontopedia.net/Apple_Inc is a core, “master” entity.

Assertions such as “Apple Inc is a Company” and “Apple’s product line includes Mac Mini, iPhone, …” are also part of this core knowledge map.

But Ontopedia’s knowledge map is not limited to this relatively static information. It also has PSIs for events, such as
http://psi.ontopedia.net/Apple_reports_financial_results_Q4_2008
and http://psi.ontopedia.net/Apple_Event_October_14th_2008

“Master data” combined with “events” creates an amazingly powerful conceptual framework for mapping our knowledge.

Ontopedia’s knowledge map has an explicit concept of time and focuses on the “current moment on Earth, at the human scale of the (real) world”, while recording history and the results of forecasting. History does not disappear from the knowledge map. For example, Ontopedia can “remember” that Apple Inc was called “Apple Computer Inc” at some point and that the eMac was once in Apple’s product line. History is available for referencing and continues to play an essential role in organizing information.

Explicit modeling of time has helped us introduce even more intriguing features, such as Subject-centric micro-blogging.
We are experimenting with “dynamic” associations and properties such as “Currently Reading [Person, Book]”, “Currently Located At [Person, City]”, “Currently Thinking About [Person, Subject]”, “My favorite link of the day”, etc.
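
A small sketch of how such time-scoped “dynamic” assertions can work (names are illustrative, not Ontopedia’s API): each assertion carries a timestamp, and the “current” value of a dynamic property is simply the most recent one.

import time

assertions = []  # (timestamp, subject, property, value) tuples

def assert_now(subject, prop, value):
    # Record a new time-stamped assertion; nothing is overwritten.
    assertions.append((time.time(), subject, prop, value))

def currently(subject, prop):
    # The "current" value of a dynamic property is the newest one.
    matching = [a for a in assertions if a[1] == subject and a[2] == prop]
    return max(matching)[3] if matching else None

assert_now("Person_X", "Currently_Located_At", "Oslo")
assert_now("Person_X", "Currently_Located_At", "Leipzig")
print(currently("Person_X", "Currently_Located_At"))  # -> Leipzig
# Older assertions stay in the list: history is preserved, not overwritten.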

To support this “dynamic” perspective on Ontopedia’s knowledge map, we recently added subject-centric RSS feeds. Each subject page in Ontopedia’s knowledge map has its own RSS feed, which provides quick access to all assertions about that specific subject. Each assertion has associated timestamps, which make it possible to track changes in the knowledge map and report them in RSS feeds.
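
Rendering such timestamped assertions as a feed is straightforward. A minimal sketch (element names follow RSS 2.0; everything else is simplified for illustration):

from email.utils import formatdate

def subject_feed(subject_psi, assertion_items):
    """assertion_items: iterable of (unix_timestamp, text) about one subject."""
    items = "\n".join(
        "<item><title>%s</title><pubDate>%s</pubDate></item>"
        % (text, formatdate(ts))
        for ts, text in sorted(assertion_items, reverse=True)  # newest first
    )
    return ('<rss version="2.0"><channel>'
            "<title>Assertions about %s</title><link>%s</link>%s"
            "</channel></rss>" % (subject_psi, subject_psi, items))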

In addition to traditional “source-centric” RSS feeds, my RSS aggregator now has folders like People, Companies, etc. with subject-centric RSS feeds from Ontopedia’s knowledge map. These feeds are available on my laptop, but I also have a synchronized RSS aggregator on my mobile phone. The mobile RSS aggregator and mobile browser allow me to work with Ontopedia’s knowledge map whenever I need it. It makes me feel like Subject-centric computing is (almost) here…

Carl Hewitt – Actor model, OWL, knowledge inconsistency and paraconsistent logic

ITConversations recently published Jon Udell’s interview with Carl Hewitt. In this interview, “Interdependent Message-Passing ORGs”, Carl Hewitt shares his ideas about distributed computation, the Actor model, inconsistent knowledge, paraconsistent logic, and the semantic web.

Carl Hewitt’s work has been an inspiration to me for more than 20 years. Knowledge inconsistency is a fundamental reality of our life. When we build computer systems, we can ignore it and try to create artificial boundaries: artificial worlds with “guaranteed” knowledge consistency. The alternative approach is to accept from the beginning that we have to deal with inconsistency, and to create systems that can represent inconsistent knowledge, reason within inconsistent knowledge bases, and utilize mechanisms which help to keep inconsistency “under control”.

I made the choice many years ago in favor of this alternative approach and have used it in building many computer systems over the years. Our recent project, the Ontopedia PSI server, is no exception. The Ontopedia PSI server can represent opinions from various sources, including contradictory opinions. Ontopedia’s reasoning engine is justification-based (and, as everything in Ontopedia, a work in progress :), which means that the decision about each assertion is based on a comparison between various opinions and their justifications. Reasoning inside the Ontopedia PSI server is paraconsistent. The inference engine can find contradictory assertions in some areas of Ontopedia’s knowledge base, but local contradictions do not prevent it from inferring reasonable assertions in other areas, and there is no ‘explosion of assertions’.

Reasoning in the Ontopedia PSI server is also ‘adaptive’. We anticipate that when various sources ‘see’ the results of comparing different opinions, and ‘see’ the consequences of their statements several ‘steps ahead’, they may change their original opinions.

The Ontopedia PSI server actually ‘likes’ contradictions. Contradictions are starting points for identifying errors, for negotiation, for improving knowledge models, and, as a result, for knowledge evolution.

Resources:

Interdependent Message-Passing ORGs, interview on ITConversations

Watching an interview about Powerset

InfoQ published an interview with Tom Preston-Werner on Powerset, GitHub, Ruby, and Erlang. I really like projects that try to analyze text/resources on the web and implement “smart search”; Powerset is one of these projects. But what I like even more is the approach in which we explicitly represent facts/information items using open knowledge representation standards such as Topic Maps or RDF.

Topic Maps can play the role of “knowledge middleware” that helps to integrate the various components of the “smart search puzzle”. A topic map-based index makes it possible to represent and connect subjects and resources. Explicit representation of even a relatively small number of relationships (“facts”, “assertions”) between resources and subjects can dramatically change the world of smart search.

Topic Maps-based knowledge middleware is a disruptive technology: it replaces proprietary knowledge organization schemas and modules, and it allows multiple players to build various solutions that help to create or use a smart index.

The Topic Maps-based Ontopedia PSI server, for example, can represent assertions that are manually created by users or generated by algorithms. We do not have our own text analysis infrastructure, but I hope that in the future we can leverage services on the web (such as OpenCalais) which can perform text analysis on an “as needed” basis. The core capability of the Ontopedia PSI server is maintaining explicit representations of subjects that are important to people, together with assertions about these subjects.

The new version of the Ontopedia PSI server can play the role of an aggregator that extracts assertions from existing topic maps/fragments hosted on other websites. Assertions from multiple sources are aggregated into one assertion set/information map/semantic index. The Ontopedia PSI server keeps track of information provenance and supports multiple truth values. The server can, for example, handle a situation where one source on the web asserts that Person X gave a Presentation P and someone else makes the opposite assertion.

I think that natural language processing can play a huge role in improving search. An ideal text analysis tool should let us provide ‘clues’ about the subjects in a text. I am looking for the equivalent of the kind of ‘binding’ that is used so often in programming these days. I would love to have the ability to provide a list of the main subjects, in the form of PSIs, to a text analysis tool (using embedded markup or attached external assertions). If I do so, I expect much more precise results. If I do not have an initial list of subjects, I expect some kind of suggestions from the text analysis tool that I can check against an existing information map.
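
Here is what such a ‘binding’ could look like as an interface. This is purely hypothetical: analyze, subject_hints, and the naive matching are all invented to illustrate the desired contract, not any real service API.

def analyze(text, subject_hints=None):
    # Hypothetical contract: confirm hinted PSIs found in the text and
    # (in a real tool) suggest candidates for unhinted mentions.
    hints = subject_hints or []
    confirmed = [
        psi for psi in hints
        if psi.rsplit("/", 1)[-1].replace("_", " ").lower() in text.lower()
    ]
    return {"confirmed_subjects": confirmed, "suggested_subjects": []}

result = analyze(
    "Slides from TMRA 2008 are now online.",
    subject_hints=["http://psi.ontopedia.net/TMRA_2008"],
)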

Ontopedia (like many other Topic Maps-based projects) promotes the usage of Published Subject Identifiers (PSIs) for “all thinkable” subjects. For example, there is an identifier for the TMRA 2008 conference: http://psi.ontopedia.net/TMRA_2008.
There are identifiers for each presenter and presentation. Basic relationships between various subjects are also “mapped”/explicitly represented. Each basic resource, such as a blog post, can have a small assertion set that describes metadata (using the Dublin Core metadata vocabulary, for example) and maybe some main assertions. Traditional websites can provide combined assertion sets in XTM or RDF which can be consumed by semantic aggregators such as the Ontopedia PSI server. Text analysis is great (when it is good enough), but even simple (semi-)manual “mapping” of subjects, resources, and relationships can change the search game.
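
As a sketch of such a per-resource assertion set, assuming the Python rdflib library (the blog post URL and author name are placeholders): a few Dublin Core statements about a blog post, serialized to RDF/XML that an aggregator could consume.

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

g = Graph()
post = URIRef("http://example.org/blog/tmra-2008-report")  # placeholder URL

g.add((post, DCTERMS.title, Literal("TMRA 2008 conference report")))
g.add((post, DCTERMS.creator, Literal("Example Author")))  # placeholder name
g.add((post, DCTERMS.subject, URIRef("http://psi.ontopedia.net/TMRA_2008")))

print(g.serialize(format="xml"))  # RDF/XML ready for a semantic aggregator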

When we manually try to “map” an existing resource, such as a conference website, for the first time, it can look like a complicated and time-consuming task. Mapping a website for another conference will take much less time. And, of course, in many cases it is possible to reverse the traditional website building/assertion extraction paradigm.

It is possible to build nice-looking and functional websites based on “assertion sets”. Topicmaps.com is a great example of this approach: it is driven by a topic map. Humans can enjoy the HTML-based representation of this site, while aggregators like the Ontopedia PSI server can consume the raw XTM-based representation and aggregate it with other assertion sets, such as the TMRA 2008 conference assertion set.

References

Interview link on InfoQ

Extending Ontopedia PSI server to handle PURLs: support for RDF, step one

I have been thinking about RDF support on the Ontopedia PSI server for quite some time. The Semantic Technology Conference that I attended this spring gave me some new ideas in this direction. I decided to follow the recommendations regarding PURLs (Persistent Uniform Resource Locators) from Eric Miller’s and David Wood’s presentation “Persistent Identifiers for the ‘Real Web’”. The Ontopedia PSI server was extended to handle PURLs.

Each Published Subject Identifier (PSI) on http://psi.ontopedia.net has an equivalent PURL on http://purl.ontopedia.net. For example, http://psi.ontopedia.net/TMRA_2008 has the corresponding PURL http://purl.ontopedia.net/TMRA_2008. What happens when we type the PURL http://purl.ontopedia.net/TMRA_2008 into our browser? The Ontopedia PURL server returns HTTP code 303 “See Other” with the “Location” header set to http://psi.ontopedia.net/TMRA_2008.

For RDF-based applications, code 303 is an indication that the URI does not correspond to a “digital resource”. Web browsers will automatically jump to http://psi.ontopedia.net/TMRA_2008, which provides a nice subject/resource description.
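
The redirect behavior itself is simple. A minimal sketch using only the Python standard library (the host and port are illustrative; this is not the actual Ontopedia implementation):

from http.server import BaseHTTPRequestHandler, HTTPServer

PSI_HOST = "http://psi.ontopedia.net"

class PurlHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. a GET for /TMRA_2008 on purl.ontopedia.net answers
        # 303 "See Other", Location: http://psi.ontopedia.net/TMRA_2008
        self.send_response(303)
        self.send_header("Location", PSI_HOST + self.path)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), PurlHandler).serve_forever()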

When we need to export RDF assertions from Ontopedia, we can do something like this:

<rdf:Description rdf:about="http://purl.ontopedia.net/TMRA_2008">
    <rdfs:label>
        TMRA 2008 (Topic Maps Research and Applications Conference)
    </rdfs:label>
    <rdfs:comment>
        Fourth International Conference on
        Topic Maps Research and Applications
    </rdfs:comment>
    <rdf:type rdf:resource="http://purl.ontopedia.net/Conference"/>
</rdf:Description>

In the Topic Maps-based version we can have:

<topic id="id_98c49a0d3d87f067a4ba13b6d2f6d086">
    <subjectIdentifier href="http://psi.ontopedia.net/TMRA_2008"/>
    <instanceOf>
        <topicRef href="http://psi.ontopedia.net/Conference"/>
    </instanceOf>
    <name>
        <value>
            TMRA 2008 (Topic Maps Research and Applications Conference)
        </value>
    </name>
    <occurrence>
        <type>
            <topicRef href="http://psi.ontopedia.net/Description"/>
        </type>
        <resourceData>
            Fourth International Conference on
            Topic Maps Research and Applications
        </resourceData>
    </occurrence>
</topic>

The RDF-based version uses PURLs and the Topic Maps-based version uses PSIs for the identification of subjects/resources.

Reference:

Persistent Identifiers for the ‘Real Web’; David Wood and Eric Miller; May 2008 (PDF)