Conversational Interfaces Must Be Interoperable

How can we invest the time to teach our devices about ourselves without getting locked into a single vendor's solutions? Before conversational interfaces can become widespread, they need to be interoperable and centered on the user, with strong support for exporting preferences and personalizations.

The various operating systems are all moving towards conversational interfaces. There is a very natural reason for this: more and more powerful computers can be designed and packaged in smaller and smaller form factors, and any constraint imposed by the human body slows this process down. The introduction of touchscreens not only made interfaces more intuitive, it also eliminated the need for a full-size keyboard, removing one of these size constraints. How can you also eliminate screens, making computers potentially microscopic? If the computer doesn’t have a keyboard, and it doesn’t have a screen, then it is natural for it to listen to what you say and respond to you, in what becomes a natural language dialogue. This is the reason why we are moving towards conversational interfaces.

Apple with Siri, Google with Google Now, and Microsoft with Cortana are doing exactly this. However, there is a fundamental problem with their approach: a lack of interoperability and of user-centered design.

At the birth of personal computers, cryptic command lines required specific knowledge to operate. Only when these were replaced by graphical user interfaces did we come to fully understand the necessity of ease of use, and build computers that were intuitive to use regardless of the specific vendor. If you want to print something, you know there will be an option for it, and you’ll search with your mouse, finding it pretty fast whether you are using a Macintosh or a computer based on Microsoft Windows.

Our interactions with the various natural language systems, personified by their software agents, cannot be tied to a given operating system vendor. These agents learn a lot about us, and we will expect this knowledge to travel with us as we move from device to device, regardless of the make, the model, and the brand of operating system. For their long-term success, it is going to be fundamental that they can transparently export and import user preferences, interaction histories, and every other kind of knowledge about us that will put us at ease with the next version of the operating system, or the next natural language agent that we meet, regardless of its vendor.
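
To make the idea of exporting and importing concrete, here is a minimal sketch in Python of what a portable profile could look like. Everything in it is an assumption made for illustration: the `AgentProfile` record, its two fields, and the `export_profile` and `import_profile` helpers are hypothetical names, and no vendor is known to offer such a format today.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class AgentProfile:
    """Hypothetical vendor-neutral record of what an agent has learned about a user."""
    preferences: dict = field(default_factory=dict)
    interaction_history: list = field(default_factory=list)


def export_profile(profile: AgentProfile) -> str:
    """Serialize the profile to plain JSON that any other agent could read."""
    return json.dumps(asdict(profile), sort_keys=True)


def import_profile(payload: str) -> AgentProfile:
    """Rebuild a profile from exported JSON, tolerating missing fields."""
    data = json.loads(payload)
    return AgentProfile(
        preferences=data.get("preferences", {}),
        interaction_history=data.get("interaction_history", []),
    )


# Round trip: what one agent exports, another agent can import unchanged.
profile = AgentProfile(
    preferences={"units": "metric", "language": "en"},
    interaction_history=["set a timer for ten minutes"],
)
restored = import_profile(export_profile(profile))
assert restored == profile
```

The point of the sketch is only that the data lives in an open, self-describing format owned by the user; the hard part in practice would be getting vendors to agree on what the fields mean.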

Until such time, the investment that we individually make in training these agents on one hand, and in learning about them on the other, is going to yield only temporary advantages, and we are not going to see a return on it at the level that we should be able to expect.