ROCKIT is a strategic roadmapping project for research and innovation in the area of natural conversational interaction. The primary scientific focus concerns interactive agents which are proactive, multimodal, social, and autonomous. A second focus concerns systems which can extract and exploit rich context and knowledge from heterogeneous data sources.
The main goal of ROCKIT is to develop a Research and Innovation Roadmap that integrates the vision and innovation agendas of organisations concerned with R&D and exploitation in the field across Europe, with broad coverage across sectors. A key goal is to bring together public-sector research organisations with commercial organisations at all scales, with a particular focus on the SMEs that make up the majority of Europe's fragmented commercial activity in this area.
A key aspect of ROCKIT will be to organise a European research and innovation community in the area of conversational interaction technologies, integrating a wide range of commercial organisations whose applications and uses link to the area. ROCKIT will be structured around a set of sector-based clusters including mobile applications, healthcare, education, games, broadcast media, robotics, law enforcement, and security.
Research and Innovation Scenarios
As part of the strategic roadmapping action in the area of multimodal conversational interaction technologies, ROCKIT has arrived at a set of five target research and innovation scenarios, presented here. These scenarios represent a number of common themes arising from the workshops organised during the process: accessibility, multilinguality, the importance of design, privacy by design, systems for all of human–human, human–machine, and human–environment interactions, robustness, security, potentially ephemeral interactions, and using the technology to enable fun.
MIT reports that a computer which binge-watched YouTube videos and TV shows such as "The Office," "The Big Bang Theory," and "Desperate Housewives" learned to predict whether the actors were about to hug, kiss, shake hands, or exchange high fives, advances that could eventually help the next generation of artificial intelligence function less clumsily. Lead researcher Carl Vondrick sees potential health-care applications: "If you can predict that someone's about to fall down or start a fire or hurt themselves, it might give you a few seconds' advance notice to intervene."
Within enterprises, ABI Research predicts that natural language processing will prove particularly beneficial in use cases and verticals that demand hands-free operation, such as healthcare, oil and gas, factory floors, and construction. Enterprise voice adoption often requires customised dictionaries, applications, and tools; while the major players in voice recognition offer APIs that extend into these domains, specialists such as Nuance Communications are developing industry-specific voice packages. On the consumer side, voice control and conversational interaction are a natural fit for smart glasses and AR devices, whose primary purpose is to offer hands-free, efficient data display and interaction. Smart home devices like Amazon Echo and Google Home will drive consumer use cases, with growth in AI-powered personal assistants enabling natural and rewarding interaction.
We plan to build an agent that can perform a complex task specified in language, and that asks for clarification when the task is ambiguous. Today there are promising algorithms for supervised language tasks such as question answering, syntactic parsing, and machine translation, but none yet for more advanced linguistic goals, such as carrying a conversation, fully understanding a document, or following complex instructions in natural language. We expect to develop new learning algorithms and paradigms to tackle these problems.
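The clarification behaviour described above can be illustrated with a minimal sketch. This is not a ROCKIT deliverable or a learned system: the toy world, the substring-based grounding, and all names (find_referents, act) are invented here purely to show the interaction pattern of acting when an instruction grounds to exactly one referent and asking a clarifying question otherwise.

```python
# Toy sketch of an instruction-following agent that asks for clarification
# when an instruction is ambiguous. All rules here are illustrative only.

def find_referents(instruction, world):
    """Return every object in the world whose kind is mentioned in the instruction."""
    return [obj for obj in world if obj["kind"] in instruction]

def act(instruction, world):
    """Execute the instruction, or return a clarifying question if it is ambiguous."""
    candidates = find_referents(instruction, world)
    if not candidates:
        return "I can't find anything matching that. Could you rephrase?"
    if len(candidates) > 1:
        # Ambiguous: more than one possible referent, so ask rather than guess.
        options = " or ".join(f"the {c['colour']} {c['kind']}" for c in candidates)
        return f"Which one do you mean: {options}?"
    # Unambiguous: exactly one referent, so act on it.
    return f"Picking up the {candidates[0]['colour']} {candidates[0]['kind']}."

world = [
    {"kind": "cup", "colour": "red"},
    {"kind": "cup", "colour": "blue"},
    {"kind": "book", "colour": "green"},
]

print(act("pick up the cup", world))   # two cups match, so the agent asks
print(act("pick up the book", world))  # one book matches, so the agent acts
```

A real system would replace the substring matching with learned language understanding, but the control flow (ground the instruction, count the candidate interpretations, and either act or ask) is the behaviour the paragraph above describes.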