Experimenting with Intelligent Personal Assistant Platform

Traditionally, the term Intelligent Personal Assistant is associated with Siri, Google Now or Cortana and with voice-based communication. For me, this term means something different: an extensible, open, modular platform that does useful things for the user, with user privacy at the centre. I have been experimenting with various ways to build something like this for the last few years in my spare time. I consider it a research hobby, although I would love to see some Personal Assistant Platforms in existence and in use. In this post I would like to share some ideas and observations.

My approach is closer to the one used in platforms which have become quite popular in recent years because of the ChatOps movement. The idea is to use a chat-based user interface and “bots” which listen for patterns and commands in chat rooms and react to messages using coded “skills”. ChatOps solutions typically concentrate on coordination of human activities and task automation with bots. The same ideas (and software) can be used for building Personal Assistants. I have been interested in exploring the core assistant architecture, so I decided (after trying a few different bot frameworks) to use JASON [1]. JASON is not a typical bot platform; it is a framework for building multi-agent systems. JASON is written in Java, but agent behaviours and declarative knowledge are coded in AgentSpeak [2], a high-level language with some similarity to Prolog.

The current architecture is presented below.


I use Slack as the chat platform. Slack provides user apps for various devices, allows messages with rich content and has APIs for bot integration. There are a couple of API variations, but the most attractive from the Personal Assistant perspective is the WebSocket-based Real Time Messaging (RTM) API. With this API, the bot (assistant) can run in a cloud, on an appliance at your home, or just on your desktop computer (if you still have one). Slack group capabilities are also helpful: it is possible to communicate with various versions and types of assistants. I use a node.js-based gateway to decouple Slack from the JASON-based infrastructure. I also use Redis as a simple messaging medium. There is nothing special about node.js and Redis in this architecture except that both are easy to work with for implementing “plumbing”. JASON AgentSpeak does not have capabilities to talk to external services directly, but it can be extended through Java components. I coded a basic integration with Redis pubsub inside my JASON solution, and for advanced scenarios (when I need services with lots of plumbing) I use node.js or other service frameworks.
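The decoupling between the chat gateway and the agents can be illustrated with a few lines of code. This is only a sketch, not the actual gateway: it is written in Python rather than node.js or Java, it uses an in-memory stand-in instead of Redis pubsub, and the channel names `chat.in` and `chat.out` and the message fields are assumptions made up for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MessageBus:
    """In-memory stand-in for a pub/sub medium such as Redis (assumption)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many got it."""
        handlers = list(self._subscribers.get(channel, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
sent_to_chat: List[dict] = []

# Gateway side: whatever arrives on "chat.out" is forwarded to the chat
# platform (here it is simply collected in a list).
bus.subscribe("chat.out", sent_to_chat.append)


# Assistant side: reacts to incoming chat messages and publishes a reply.
def on_chat_message(message: dict) -> None:
    reply = {"user": message["user"], "text": "You said: " + message["text"]}
    bus.publish("chat.out", reply)


bus.subscribe("chat.in", on_chat_message)

# The gateway receives a message from the chat platform and hands it to the bus.
bus.publish("chat.in", {"user": "alice", "text": "hello"})
```

Neither side knows about the other; each only knows the channel names, which is what makes it easy to swap the gateway or run several assistants side by side.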

The Personal Assistant is implemented as a multi-agent system with additional components in node.js (and other frameworks as needed) communicating through Redis. At the centre of the multi-agent system are the Assistant agent, the User agent and various Expert agents. The User agent is responsible for communicating with the user, the Assistant agent plays the role of the coordinator, and Expert agents implement specific skills. Expert agents can delegate low-level “plumbing” details to services. The system is highly concurrent and runs well on multicore computers. JASON also allows a multi-agent system to run on several computers, but this is outside of my current experiments. Agents can create other agents and can use direct message-based communication (sync and async) or pubsub-based message broadcasting. Agents have local storage implemented as Prolog-like associative term memory, logical rules and scripts. AgentSpeak follows the BDI (Belief–Desire–Intention) software model [3], which makes it possible to code quite sophisticated behaviours in a compact way. Prior to JASON I tried to implement the same agents with traditional software stacks such as node.js and Akka Actors. I also looked briefly at Elixir and Azure Actors. They are all good; it just takes more time and more lines of code to get to the essence of interesting behaviours. I actually tried JASON earlier but made a mistake: I started by implementing the Personal Assistant as a “very smart” singleton agent. Bad idea! Currently the JASON-based assistant is a multi-agent system with many very specialized “experts”.
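The division of roles between the User agent, the Assistant agent and the Expert agents can be sketched as follows. This is a deliberately simplified, synchronous Python illustration and not AgentSpeak code: the class names, the keyword matching and the sample experts are all assumptions, and the real system runs its agents concurrently.

```python
from typing import List, Optional


class ExpertAgent:
    """Implements one narrow skill, selected here by simple keyword matching."""

    def __init__(self, name: str, keywords: List[str], reply: str) -> None:
        self.name = name
        self.keywords = keywords
        self.reply = reply

    def can_handle(self, text: str) -> bool:
        lowered = text.lower()
        return any(keyword in lowered for keyword in self.keywords)

    def handle(self, text: str) -> str:
        return f"[{self.name}] {self.reply}"


class AssistantAgent:
    """Coordinator: hands the message to the first expert that claims it."""

    def __init__(self, experts: List[ExpertAgent]) -> None:
        self.experts = experts

    def dispatch(self, text: str) -> Optional[str]:
        for expert in self.experts:
            if expert.can_handle(text):
                return expert.handle(text)
        return None


class UserAgent:
    """Owns the conversation with the user and keeps a transcript of replies."""

    def __init__(self, assistant: AssistantAgent) -> None:
        self.assistant = assistant
        self.transcript: List[str] = []

    def receive(self, text: str) -> str:
        reply = self.assistant.dispatch(text)
        if reply is None:
            reply = "No expert can help with that yet."
        self.transcript.append(reply)
        return reply


assistant = AssistantAgent([
    ExpertAgent("weather", ["weather", "rain"], "Looking up the forecast."),
    ExpertAgent("calendar", ["meeting", "schedule"], "Checking your calendar."),
])
user_agent = UserAgent(assistant)
weather_reply = user_agent.receive("Will it rain tomorrow?")
unknown_reply = user_agent.receive("Tell me a joke")
```

Adding a new skill means adding one more small expert to the list, which is the point of avoiding a single “very smart” agent.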

Let’s look at the “Greeting” expert agent, for example. This agent can send variations of the “Hi” message to the user and will wait (for some time) for a greeting response. The same agent listens for variations of the “Hi” message from the user. The Greeting agent implements bi-directional, mixed-initiative communication with the user: the agent can initiate a greeting and wait for the user’s response, or it can respond to the user’s greeting. In addition, the Greeting agent reacts to changes in the user’s online presence and remembers the last greeting exchange. After a successful greeting exchange it broadcasts a “successful greeting” message, which can initiate additional micro conversations.
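The mixed-initiative behaviour of the Greeting agent can be approximated by a small state machine. The sketch below is in Python rather than AgentSpeak, and everything specific in it is an assumption: the 60-second timeout, the set of recognized greetings, the message texts and the event name `successful_greeting`.

```python
from dataclasses import dataclass, field
from typing import List, Optional

GREETINGS = {"hi", "hello", "hey"}  # assumed variations of "Hi"


@dataclass
class GreetingAgent:
    timeout: float = 60.0                    # seconds to wait for a reply
    waiting_since: Optional[float] = None    # set while a reply is awaited
    last_exchange: Optional[float] = None    # time of the last greeting exchange
    outbox: List[str] = field(default_factory=list)      # messages to the user
    broadcasts: List[str] = field(default_factory=list)  # events for other agents

    def initiate(self, now: float) -> None:
        """The agent takes the initiative: greet and start waiting."""
        self.outbox.append("Hi!")
        self.waiting_since = now

    def on_presence_change(self, online: bool, now: float) -> None:
        """Greet the user when they come online, unless already waiting."""
        if online and self.waiting_since is None:
            self.initiate(now)

    def on_user_message(self, text: str, now: float) -> None:
        if text.strip().lower().rstrip("!.") not in GREETINGS:
            return  # not a greeting, some other expert may handle it
        answered_in_time = (
            self.waiting_since is not None
            and now - self.waiting_since <= self.timeout
        )
        if not answered_in_time:
            # The user took the initiative (or answered too late): greet back.
            self.outbox.append("Hello!")
        self.waiting_since = None
        self.last_exchange = now
        self.broadcasts.append("successful_greeting")


# Case 1: the agent greets first and the user answers within the timeout.
agent_first = GreetingAgent()
agent_first.on_presence_change(online=True, now=0.0)
agent_first.on_user_message("hello", now=10.0)

# Case 2: the user greets first and the agent responds.
user_first = GreetingAgent()
user_first.on_user_message("Hi!", now=5.0)
```

In both cases the exchange ends with the same broadcast, so other agents do not need to know who started the conversation.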

Reactivity, autonomy, and expertise are the primary properties of the Assistant multi-agent system. Various expert agents run in parallel: they have goals, react to events, develop and execute plans, notify other agents about events, and compete for the user’s attention. Although many agents run simultaneously, there is a concept of the “focus agent” that drives communication with the user. Even without requests from the user, the Assistant is active: it may generate notifications to the user, ask questions, or suggest initiating or continuing micro conversations.
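One possible way to arbitrate between agents competing for the user’s attention is sketched below. The policy is an assumption invented for illustration (highest priority wins, and the current focus agent wins ties so that an ongoing micro conversation is not interrupted by an equally important one); the names `Bid` and `AttentionCoordinator` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    agent: str      # name of the expert agent asking for attention
    priority: int   # higher means more urgent (assumed scale)
    message: str


class AttentionCoordinator:
    def __init__(self) -> None:
        self.focus: Optional[str] = None   # current focus agent, if any
        self.pending: List[Bid] = []

    def submit(self, bid: Bid) -> None:
        self.pending.append(bid)

    def next_message(self) -> Optional[Bid]:
        """Pick the winning bid and make its agent the focus agent."""
        if not self.pending:
            return None
        best = max(
            self.pending,
            key=lambda bid: (bid.priority, bid.agent == self.focus),
        )
        self.pending.remove(best)
        self.focus = best.agent
        return best


coordinator = AttentionCoordinator()
coordinator.submit(Bid("weather", 1, "Rain is expected at 5 pm."))
coordinator.submit(Bid("calendar", 3, "You have a meeting in 10 minutes."))
first = coordinator.next_message()

# Two bids with equal priority: the focus agent keeps the conversation.
coordinator.submit(Bid("greeting", 3, "Hi!"))
coordinator.submit(Bid("calendar", 3, "Shall I send a note that you are late?"))
second = coordinator.next_message()
```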

The JASON multi-agent platform directly supports implementation of manually encoded reactive and goal-oriented behaviours (which can be quite sophisticated). It can also be extended with more advanced approaches such as machine learning, simulation of emotions, ethical decision making and self-modelling. My current interest is mostly in experimenting with various expert agents, coordination of activities and user experience, but I am looking forward to other topics.

  1. JASON: https://en.wikipedia.org/wiki/Jason_(multi-agent_systems_development_platform)
  2. AgentSpeak: https://en.wikipedia.org/wiki/AgentSpeak
  3. Belief–Desire–Intention software model: https://en.wikipedia.org/wiki/Belief–desire–intention_software_model

Home Automation and Intelligent Personal Assistants

Home automation has become a hot topic again. We have more sensors and automation devices, more connectivity options, and better managing apps.

The legacy approach to home automation typically involves using several mobile controlling apps, each dedicated to a specific device. This approach is not sustainable as the number of devices increases. It does not provide an optimal user experience: we cannot see the “big picture”, the status of a “home” at any moment in time. We are forced to jump between various controlling apps with different user interfaces. There is no simple way to see correlations between the performance of different devices. Optimization and predictive modelling are limited to the scope of a single device.

A more advanced approach includes the idea of a “home automation hub”. In this case, the user can have just one app which controls all home automation devices. The hub unifies various physical connectivity protocols, creates an integrated “big picture” and allows advanced optimization. Many consumers will probably be quite happy with this level of automation and integration (at least for now).

As the natural next step, I see potential for integrating the “home automation hub” into Intelligent Personal Assistants. “Home” is an important concern for many, but not the only one. The same type of consolidation is going on in other areas. We may soon deal with a “personal health hub”, “personal finance hub”, “personal investment hub”, “car hub”, “transportation hub”, “travel hub” etc. Each of these hubs will be active: monitoring the current situation, identifying possible/recommended actions, and “fighting” for our attention. Some mediation will be required, and this is one of the main roles that Intelligent Personal Assistants can play: optimization of user experience based on an integrated view of a highly automated world.

Google Glass and Intelligent Personal Assistants

I have been investigating the Google Glass Mirror API, and this investigation has generated some thoughts about Google Glass and Intelligent Personal Assistants that I would like to discuss.

I am quite enthusiastic about Google Glass, mostly because it creates a framework for context-aware, real-time, user-centric services. I am very interested in the Mirror API service created and maintained by Google and in the capabilities it gives developers to deliver services based on the Glass platform.

The Mirror API allows developers to manipulate the Glass timeline and react to changes in the timeline and to other events; it also allows controlled information sharing between apps/services. In principle, it is not that different from the app-centric model that we already have with smartphones. We just do not need to check our smartphones from time to time; important information is always available. But with Glass, there is a big difference from my perspective: various services become integrated in one unified timeline with a unified interface and user experience. Glass implements basic information and service integration, literally “on the glass”. I also consider Glass a new generation of notification centre, with information delivered proactively to Glass owners.

Mirror API service

How can we incorporate Intelligent Personal Assistants into this picture? A Personal Assistant can be viewed as a specialized service, one of whose interfaces is Glass-aware.

Mirror API and Intelligent Personal Assistant

An Intelligent Personal Assistant provides mediated communication between its owner and various service providers. It manages personal data/event clouds and provides an integrated view of “things” and events important to its user. Personal Assistants will have multiple user interfaces including smart glasses, watches, phones, tablets, car dashboards, TVs, etc.

There are currently examples of closed services called “Personal Assistants” that are tightly coupled with specific vendor solutions, but I am more interested in an open, extensible platform with various components and options for deployment. I am looking forward to something like “WordPress for Personal Assistants”. My understanding of Intelligent Personal Assistants is very close to classic FIPA Personal Assistants [1] and to the idea of a “Personal Data Locker” introduced by David Siegel in his book “Pull” [2] and his vision video [3].


  1. FIPA Personal Assistant Specification
  2. Pull: The Power of the Semantic Web to Transform Your Business by David Siegel
  3. Personal Data Locker vision video by David Siegel

Reviving the blog

I decided to revive the Subject-centric blog on the WordPress platform and will try to re-publish some of my old posts soon. The main topic will be the same but with some new categories such as “cognitive computing”, “agent technology”, “personal intelligent agents” and “moral machines”. Many old posts have references to the Ontopedia research project (active 2007-2012) and the Ontopedia PSI server (currently offline). New systems/projects/services have become available since Ontopedia started (such as Google’s “knowledge graph” and Wikidata), but many research topics are still relevant, and I am thinking about relaunching the Ontopedia PSI server on an updated technical platform.

Google acquired Metaweb (company that maintains Freebase): good news for Subject-centric computing

“Google and Metaweb plan to maintain Freebase as a free and open database for the world. Better yet, we plan to contribute to and further develop Freebase and would be delighted if other web companies use and contribute to the data…” (Google blog)


* Deeper understanding with Metaweb

* Google Buys Metaweb to Boost Semantic Search

TIBCO and Subject-centric computing

I attended TIBCO’s TUCON 2010 conference this year. It gave me a great opportunity to explore Event-driven Architecture, SOA, BPM and Cloud computing. I also had a chance to listen/think/talk about the future of computing. And this future looks very subject-centric.

Let’s take a look at TIBCO’s entry into the world of enterprise social software – tibbr.
It is a microblogging platform: it allows people to submit/receive short messages. But it is subject-centric at its core. tibbr allows users to define subjects with a name and description. Users can submit (“tib”) messages into “subjects” and subscribe to subjects that they are interested in. The user experience is amazing!

It is very close to the ideas described in my previous posts related to Subject-centric microblogging:

iPhone OS 3.0 – ready for Subject-centric computing, Mar 21, 2009

Subject-centric microblogging and Ontopedia’s knowledge map, Feb 21, 2009

Another interesting product is “TIBCO BusinessEvents”. It is a complex event processing platform. Under the hood there is a powerful domain modeling infrastructure that allows defining “concepts”, “events” and “business rules”. “Concepts” help to define what I like to call the “Enterprise Knowledge Map”. “Events” define what can happen outside and inside of an enterprise. “Business rules” connect events and concepts. As a result, we can create dynamic enterprise models which facilitate decision making in real time.

TIBCO’s founder and CEO Vivek Ranadivé, in his presentation “Two-second Advantage”, mentioned a future “triple store”. But it is not just the static triple store that we typically mean in relation to Linked Data. It is a triple store that is integrated and updated by a stream of events, with the ability to reason about concepts and events in time. TIBCO BusinessEvents 4.0 is a great introduction to these ideas.

Carl Hewitt’s Direct Logic, inconsistency tolerant reasoning and Subject-centric computing

For many years I have been fascinated by the idea of building computer systems which are inconsistency tolerant. I usually address this problem from a practical perspective: I just try to write code that demonstrates the behavior that I would like to model. But I always thought that it would be beneficial to have some kind of formal logic that can provide foundations for my heuristic approach. I have followed Carl Hewitt’s work for many years, and it seems that his inconsistency tolerant Direct Logic can play this foundational role. First, let’s take a look at how traditional logic handles contradictions.

In traditional logic, anything can be inferred from an inconsistency. If we have a contradiction about a proposition P, then we can infer any proposition Q:

	P, ⌉P ├ Q	

This is not very helpful if we want to talk about inconsistency tolerant reasoning…

In Carl Hewitt’s Direct Logic [1] the situation is different. The formula

	P, ⌉P

describes a totally normal situation, but the meaning of this situation is different from that in traditional logic.

“Direct Logic supports direct inference ( ├τ ) for an inconsistency theory T. ├τ does not support either general proof by contradiction or disjunction introduction. However, ├τ does support all other rules of natural deduction…” [1]

In traditional logic if we have

	P ├τ Q, ├τ ⌉Q

then we can infer ⌉P in theory T. In Direct Logic we cannot do this kind of inference.

“Since truth is out the window for inconsistent theories, we have the following reformulation: Inference in a theory T (├τ) carries argument from antecedents to consequents in chains of inference…” [1]

So if we have

	P,  ⌉P

it does not mean that P is true and not true at the same time. It means that we have arguments both for and against the proposition P.
Direct inference ├τ does not “propagate truth”, it propagates arguments.

Even if we have

	P, ⌉P

in Direct Logic we can still continue making inferences based on these propositions (of course, we can generate new contradictory propositions).

So if we have, for example,

	P,  ⌉P,  P ├ Q,  ⌉P ├  ⌉Q

we can infer

	Q, ⌉Q

which means that we have arguments both for and against the proposition Q. Because we are talking here not about “truth” but about argumentation, it makes total sense.
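The example above can be replayed with a tiny forward-chaining sketch. It is not an implementation of Direct Logic; it only illustrates the one property discussed here, namely that arguments are carried from antecedents to consequents along the given rules and a contradiction does not license arbitrary conclusions. The representation of literals and rules is an assumption chosen for brevity.

```python
from typing import List, Set, Tuple

# ("P", True) stands for P, ("P", False) stands for not-P.
Literal = Tuple[str, bool]
# A rule (premise, conclusion) stands for "premise ├ conclusion".
Rule = Tuple[Literal, Literal]


def propagate(literals: Set[Literal], rules: List[Rule]) -> Set[Literal]:
    """Apply the rules until nothing new can be derived."""
    derived = set(literals)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def contested(derived: Set[Literal]) -> Set[str]:
    """Propositions that have arguments both for and against them."""
    return {name for name, polarity in derived if (name, not polarity) in derived}


start = {("P", True), ("P", False)}
rules = [
    (("P", True), ("Q", True)),    # P ├ Q
    (("P", False), ("Q", False)),  # not-P ├ not-Q
]
result = propagate(start, rules)
```

Here `result` contains arguments for and against both P and Q, while an unrelated proposition such as R is never derived, because no rule leads to it.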

How is it related to inference in Ontopedia (described here)?

If we have a foundational inconsistency tolerant logic, we can implement reasoning engines that support various inference heuristics on top of this logic. We can think about the Ontopedia inference engine as a specific heuristic layer on top of an inconsistency tolerant logic such as Direct Logic.

We do have a concept of “truth” in Ontopedia, but it exists only at the heuristic layer as a convenient metaphor. At the foundation layer, the inference engine just generates propositions with argumentation. In Ontopedia we try to assign a truth value to each proposition, but this attempt can fail if we have contradictory arguments of the same strength. We also use contradiction detection for managing/controlling inference. In Ontopedia we do not want to infer lots of assertions based on contradictory propositions, so we suppress this kind of inference. Ontopedia is built to be a tool that helps Ontopedia’s user community to create/evolve/improve Ontopedia’s community knowledge map. Identification and resolution of knowledge conflicts is a fundamental activity.
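A heuristic layer of this kind could look roughly like the sketch below. It is not Ontopedia’s code: the numeric strengths, the rule “the strongest argument on each side counts” and the value names are assumptions, since the text does not say how argument strength is measured.

```python
from typing import List, Tuple

# (supports the proposition?, strength of the argument)
Argument = Tuple[bool, float]


def assess(arguments: List[Argument]) -> str:
    """Try to assign a truth value; report a conflict when the attempt fails."""
    if not arguments:
        return "unknown"
    strength_for = max(
        (strength for supports, strength in arguments if supports), default=0.0
    )
    strength_against = max(
        (strength for supports, strength in arguments if not supports), default=0.0
    )
    if strength_for > strength_against:
        return "true"
    if strength_against > strength_for:
        return "false"
    return "conflict"  # contradictory arguments of the same strength


def may_infer_from(arguments: List[Argument]) -> bool:
    """Suppress further inference from unknown or conflicting propositions."""
    return assess(arguments) in ("true", "false")


settled = assess([(True, 0.9), (False, 0.4)])
stuck = assess([(True, 0.5), (False, 0.5)])
```

A proposition assessed as “conflict” is a natural candidate for a knowledge conflict record that the user community is asked to resolve.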

It is important to mention that we do not assume that a single discovered contradiction refutes Ontopedia’s entire knowledge map. We assume that Ontopedia’s knowledge map can have multiple contradictions at any time. Ontopedia is an open system. We can have new users and we can collect new information about new subjects. Openness naturally introduces new inconsistencies (and also helps to resolve some). But we do not just stand by and watch the growing number of knowledge conflicts. We embed system mechanisms that allow us to monitor the level of Ontopedia’s inconsistency. Our assumption is that Ontopedia’s user community will try to resolve some of the existing knowledge conflicts and can thereby improve knowledge quality and maintain a reasonable level of inconsistency.

Ontopedia’s inference engine makes it possible to ignore contradictory arguments and select one of the options as a “decision”. This means that the engine makes inferences based only on the “decision” and suppresses inferences based on the alternatives. Probably not all users will agree with the decisions recorded in Ontopedia’s knowledge map. We consider the possibility of “forking” the knowledge map into separate maps with their own user communities. The same assertions can be assimilated by some communities and ignored/rejected by others. In general, inconsistency tolerant reasoning allows various communities to exchange and utilize information even when they disagree with it.

I am looking forward to exploring more in the areas of inconsistency tolerant logic and inconsistency tolerant reasoning.


  1. Common sense for concurrency and inconsistency tolerance using Direct Logic, Carl Hewitt, 2010

iPad, Multi-touch interaction and Subject-centric computing

I am very excited about the iPad. It makes multi-touch interaction mainstream. The iPad revives the idea of “direct object manipulation”, one of the key concepts of Subject-centric computing, and introduces it to a new generation of developers and users.

On the surface it looks like the iPad (and iPhone) application-centric model (with thousands of various apps) contradicts the application-less model of Subject-centric computing. In reality, applications developed for the iPad and iPhone can often be considered “functions” which can be combined to provide a subject-centric experience.

What is missing? Integration with a global knowledge map that can be used from various apps, and a mechanism for passing subject context so that applications can be launched or continued in a specific subject context.

Apple provides support for embedding geo maps into any application using the MapKit framework. Let’s assume for a minute that Apple creates SubjectKit. This (imaginary) framework provides access to information collected in global sources such as Freebase and Wikipedia, the same way MapKit provides access to Google Maps. In this case iPad and iPhone applications can leverage information collected in the global knowledge map. SubjectKit can also allow applications to record the current subject context in some shared storage available to all apps. When a user launches an app, this app can read the current subject context and use it to provide a subject-centric experience.

Let’s take a look at iTunes, for example. It simplifies the “buying experience”, but it is not currently integrated with a global knowledge map. We often need to launch a browser and search to get additional info about subjects that we are interested in (movies, actors, directors, tracks, groups, …). iTunes has some reference data, but it is quite minimal in comparison with what we can get from a global knowledge map, and this reference data is limited to the iTunes app.

With the (imaginary) SubjectKit, iTunes (and any other app) on the iPad can leverage information available in the global knowledge map directly, without manual search.

What about leveraging subject context between various applications? Let’s say I open my Weather app and check the weather in New York City. The Weather app can record that one of my currently active subjects has the identifier http://en.wikipedia.org/wiki/New_York_City. Let’s assume that as a next step I launch iTunes. iTunes (in my imaginary scenario) can retrieve this subject identifier from the shared context storage. If I click on the “Movies” tab, then iTunes can suggest, for example, the movie “Sleepless in Seattle” in a new “Related to your active subjects” section. iTunes can do it because (in my imaginary scenario) it can leverage the current subject context, the internal iTunes database and the global knowledge map.

Sharing subject context between apps raises the issue of protecting user privacy. The SubjectKit framework can maintain a “white list” of apps which are allowed to save into and restore from the current subject context. SubjectKit can block applications that try to access the subject context if they are not allowed to see it. SubjectKit can also prevent applications from saving active subjects into the shared subject context if they are not allowed to do so.
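Since SubjectKit is imaginary, the following is only a sketch of how such a shared, white-listed subject context might behave. It is written in Python rather than as an iOS framework, and every name in it (`SubjectContextStore`, `save`, `active_subjects`, the app identifiers) is hypothetical.

```python
from typing import Iterable, List


class SubjectContextStore:
    """Shared list of active subjects guarded by read and write white lists."""

    def __init__(self, readers: Iterable[str], writers: Iterable[str]) -> None:
        self._readers = set(readers)
        self._writers = set(writers)
        self._active: List[str] = []

    def save(self, app_id: str, subject_id: str) -> None:
        if app_id not in self._writers:
            raise PermissionError(f"{app_id} may not save active subjects")
        if subject_id not in self._active:
            self._active.append(subject_id)

    def active_subjects(self, app_id: str) -> List[str]:
        if app_id not in self._readers:
            raise PermissionError(f"{app_id} may not read the subject context")
        return list(self._active)


store = SubjectContextStore(readers={"itunes"}, writers={"weather"})

# The Weather app records the subject the user has just looked at.
store.save("weather", "http://en.wikipedia.org/wiki/New_York_City")

# iTunes reads the shared context when it is launched.
subjects_seen_by_itunes = store.active_subjects("itunes")

# An app that is not on the white list is blocked.
try:
    store.active_subjects("some_game")
    game_was_blocked = False
except PermissionError:
    game_was_blocked = True
```

Keeping separate white lists for reading and writing matches the two cases described above: an app may be trusted to contribute subjects without being allowed to see what other apps have recorded.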

Of course, this approach is not limited to OS X and the iPad. It is just that the combination of the iPad’s design, a powerful multi-touch interface and the strength of OS X creates a winning platform for Subject-centric computing.

Inference in Ontopedia

I just finished reading “Semantic Web for the Working Ontologist: Modeling in RDF, RDFS and OWL”. Great book! Lots of examples and deep exploration of Semantic Web fundamentals. It inspired me… not to use OWL, no… but to describe how we approach inference/reasoning in Ontopedia.

There are several fundamental principles that we try to follow when developing inference capabilities in Ontopedia.


We assume that Ontopedia’s knowledge base can have contradictions at any time. We try to develop a system that can make “reasonable” inferences within an inconsistent knowledge base.

Non-monotonic, Adaptive Knowledge Base

People change opinions, modify existing and create new theories (formal and informal). People learn new things, they can be convinced and taught (sometimes). There is evolution in general and personal knowledge. We would like to support these “subject-centric” features.

In simple cases, Ontopedia users can change their factual assertions. This can trigger truth maintenance processes and revision of other assertions. Ontopedia also keeps a history of assertions: “Who asserted What and When”.

Minimization of Inconsistency

We try to create a system that can operate/reason within an inconsistent knowledge base. But this is not enough. We embed mechanisms that identify inconsistencies and help to resolve them. Knowledge conflicts are “first class objects” in Ontopedia and are organized by conflict type. Each knowledge conflict type has conditions that describe how conflicts of this type can be identified. There are also some recommendations for resolving conflicts. In general, Ontopedia’s user community tries to minimize knowledge inconsistency, and the Ontopedia system “tries” to help users achieve this goal.

Inference Transparency and Information Provenance

Inference is not an easy process, contradictions are tough, and knowledge evolution is challenging. We do not try to create an illusion that these things are easy and that a user can just “click a button” and magically get “all” inferred assertions. We do not try to “virtualize” inference and hide it behind a query language, for example. We think about inference as a process that can be time and resource consuming and can include multiple steps. Ontopedia provides facilities to record the various steps of the inference process. Inference tracing is an important part of Information Provenance in Ontopedia. We keep track of “Who asserted What and When”, and we also keep track of “What was Inferred based on What and Why”.
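The two kinds of provenance records can be sketched as simple data structures. This is an illustration in Python, not Ontopedia’s actual data model: the class and field names and the sample statements are assumptions, and the `explain` helper assumes that recorded inferences do not form cycles.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Assertion:
    """Who asserted What and When."""
    statement: str
    asserted_by: str
    asserted_at: str  # ISO date, kept as a string for simplicity


@dataclass(frozen=True)
class InferenceStep:
    """What was Inferred based on What and Why."""
    conclusion: str
    premises: Tuple[str, ...]
    rule: str


class ProvenanceLog:
    def __init__(self) -> None:
        self.assertions: List[Assertion] = []
        self.steps: List[InferenceStep] = []

    def explain(self, statement: str) -> List[str]:
        """Trace a statement back to inference steps and user assertions."""
        lines: List[str] = []
        for assertion in self.assertions:
            if assertion.statement == statement:
                lines.append(
                    f"{assertion.asserted_by} asserted '{statement}' "
                    f"on {assertion.asserted_at}"
                )
        for step in self.steps:
            if step.conclusion == statement:
                premises = ", ".join(step.premises)
                lines.append(f"'{statement}' inferred from [{premises}] by {step.rule}")
                for premise in step.premises:
                    lines.extend(self.explain(premise))
        return lines


log = ProvenanceLog()
log.assertions.append(Assertion("Rex is a dog", "alice", "2010-01-05"))
log.steps.append(InferenceStep("Rex is an animal", ("Rex is a dog",), "subclass rule"))
trace = log.explain("Rex is an animal")
```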

Multiple Inference Modules and Decision Procedure

RDFS inference rules are useful, and RDFS+ adds some new tricks. OWL 1.0 inference looks interesting in many contexts and OWL 2.0 looks even better. What about Common Logic? What about Cyc-like inference?… Ontopedia’s system architecture supports various inference modules. Each module can generate proposals based on the current state of Ontopedia’s knowledge base. These proposals are recorded in the knowledge base, but they do not automatically become Ontopedia’s “visible” assertions. Different inference modules can produce contradictory proposals; that is OK. The various proposals are considered by Ontopedia’s decision procedure, which calculates the “final” assertion that becomes “visible” on Ontopedia’s web site. The decision procedure can be invoked on any assertion at any time. Ontopedia’s knowledge base is not limited to “true”-only statements. We utilize multi-valued truth, including “unknown”.
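The separation between recorded proposals and “visible” assertions can be sketched as follows. This is a toy illustration, not Ontopedia’s architecture: the two modules and their rules are made up, and the decision procedure uses a placeholder policy (a value becomes visible only when all proposals agree, otherwise the result is “unknown”).

```python
from typing import Callable, Dict, List, Set, Tuple

Proposal = Tuple[str, str, str]  # (module name, statement, proposed value)
Module = Callable[[Set[str]], List[Tuple[str, str]]]


class KnowledgeBase:
    def __init__(self, facts: Set[str]) -> None:
        self.facts = set(facts)
        self.proposals: List[Proposal] = []  # recorded, not yet visible
        self.visible: Dict[str, str] = {}    # "final" assertions

    def run(self, name: str, module: Module) -> None:
        """Let a module add proposals based on the current facts."""
        for statement, value in module(self.facts):
            self.proposals.append((name, statement, value))

    def decide(self, statement: str) -> str:
        """Placeholder decision procedure: unanimous value or 'unknown'."""
        values = {value for _, stmt, value in self.proposals if stmt == statement}
        decision = values.pop() if len(values) == 1 else "unknown"
        self.visible[statement] = decision
        return decision


def bird_module(facts: Set[str]) -> List[Tuple[str, str]]:
    if "Tweety is a bird" in facts:
        return [("Tweety can fly", "true"), ("Tweety is an animal", "true")]
    return []


def penguin_module(facts: Set[str]) -> List[Tuple[str, str]]:
    if "Tweety is a penguin" in facts:
        return [("Tweety can fly", "false")]
    return []


kb = KnowledgeBase({"Tweety is a bird", "Tweety is a penguin"})
kb.run("bird_module", bird_module)
kb.run("penguin_module", penguin_module)
visible_before_decisions = dict(kb.visible)

animal = kb.decide("Tweety is an animal")
can_fly = kb.decide("Tweety can fly")
```

Because proposals are kept separately, the decision procedure can be re-run later with a better policy without re-running the inference modules.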

Loosely Coupled Inference, Decision Making, and Truth Maintenance

Ontopedia is a system that can “infer a little bit something here”, “find some knowledge conflicts there”, “make some new decisions”, “infer a little bit more”, “review some decisions” etc. All these activities can be performed in any order by various components.
All activities are recorded in the “activity log” (it is available as an RSS/ATOM feed). Various modules can explore the activity log and use it for managing their own activities and for cooperation between modules. In general, modules can run in parallel in various areas of Ontopedia’s knowledge base. These activities can be directed by the user community through the user interface or by “controller” software components.

I developed this approach and the related system architecture in the late 80s and have used it successfully in various projects with relatively small data sets during the last 20 years. What is exciting about 2010? It is the availability of huge data sets on the Web. There is also experience in building the Social Web and a much better understanding of Collective Intelligence. In 1990 I could only envision that it would be possible to build large, stable, evolving, paraconsistent, open knowledge-based systems. In 2010, I am pretty sure about this.