Integrating logical reasoning and probabilistic graphical models for spoken dialog system
Recently, partially observable Markov decision processes (POMDPs), a framework for probabilistic sequential decision making under partial observability, have been used to build spoken dialog systems (SDSs). These systems use the inherent capabilities of POMDPs to model, and account for, common causes of speech recognition errors such as background noise and variations in user input. However, POMDP-based solutions can become computationally intractable as the number of possible states grows large, which is typically the case for dialog in many practical application domains. These systems also fail to exploit contextual commonsense knowledge about the domain, or about the sequence of possible utterances in a dialog in any specific domain. This thesis is a significant step towards addressing the challenges in designing SDSs for practical applications. The proposed architecture improves the tractability of SDSs by exploiting contextual domain knowledge and the complementary strengths of commonsense inference and probabilistic sequential decision making. Specifically, we use P-log, a probabilistic extension of the declarative language Answer Set Prolog (ASP), to represent and reason with domain knowledge. We also use a POMDP to select the actions executed by the SDS, i.e., the verbal utterances of the system and the conclusions drawn, in response to the verbal inputs of the user. Commonsense inference with the domain knowledge is used to identify the subset of the state space that the POMDP needs to consider, and to revise the probabilistic beliefs currently encoded by the POMDP. The updated probabilistic belief distributions are, in turn, used to revise the domain knowledge for subsequent inference. The architecture's capabilities are illustrated and evaluated in the context of a system that accepts verbal instructions from a human to move objects between specific locations.
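The interplay between commonsense pruning and probabilistic belief revision described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the location states, and the likelihood values are hypothetical, the set of allowed states stands in for the output of commonsense inference, and the belief update assumes the hidden user goal does not change during the dialog (so no transition model is applied).

```python
from typing import Dict, Set

Belief = Dict[str, float]


def restrict_belief(belief: Belief, allowed: Set[str]) -> Belief:
    """Drop states ruled out by domain knowledge and renormalize the rest."""
    kept = {s: p for s, p in belief.items() if s in allowed}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("domain knowledge ruled out every plausible state")
    return {s: p / total for s, p in kept.items()}


def observe(belief: Belief, likelihood: Dict[str, float]) -> Belief:
    """Bayes update b'(s) proportional to P(o | s) * b(s) for a static state."""
    unnorm = {s: likelihood.get(s, 0.0) * p for s, p in belief.items()}
    total = sum(unnorm.values())
    if total == 0:
        raise ValueError("observation has zero probability under the belief")
    return {s: v / total for s, v in unnorm.items()}


# Hypothetical example: the hidden state is the target location of an object.
prior = {"kitchen": 0.25, "office": 0.25, "lab": 0.25, "garage": 0.25}

# Assumed result of commonsense inference: the garage is not a valid target.
allowed_states = {"kitchen", "office", "lab"}
pruned = restrict_belief(prior, allowed_states)

# Assumed noisy speech observation that sounds most like "office".
heard_office = {"kitchen": 0.15, "office": 0.7, "lab": 0.15}
posterior = observe(pruned, heard_office)
```

Pruning first keeps the belief vector small, which is the source of the tractability gain the abstract refers to; the posterior could then be fed back as updated probabilities for the next round of inference.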