Monday, April 9, 2018

Integrating Neural Networks with LL(*) Parsers and Natural Language Processing


Natural language processing is a long-standing area of Artificial Intelligence. Transition networks and augmented transition networks (ATNs) have been used to define language grammars since 1970.

Java has taken grammars further with support for the Java Speech Grammar Format (JSGF). The advantage of JSGF is that it makes it simple to parse results back in for further analysis.

Like the awk and sed text-processing tools in Unix, JSGF simplifies the problem of stating a complex grammar by allowing a straightforward text syntax to represent it. This opens the possibility of defining a self-changing grammar.
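To make the self-changing idea concrete, here is a minimal sketch in Python. It assumes a simplified, JSGF-like rule notation (not the full JSGF syntax, which also has a header and public rule declarations), and the names parse_rules, match and accepts are hypothetical helpers invented for this illustration. Because the grammar is only text turned into data, a program can add a rule while it is running.

```python
def parse_rules(text):
    """Turn lines like '<name> = alt1 | alt2 ;' into {name: [[symbol, ...], ...]}."""
    rules = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        name, body = line.rstrip(";").split("=", 1)
        rules[name.strip()] = [alt.split() for alt in body.split("|")]
    return rules


def match(rules, symbol, words, pos=0):
    """Return every position reachable after matching `symbol` starting at `pos`."""
    if symbol not in rules:  # anything without a rule is treated as a literal word
        return {pos + 1} if pos < len(words) and words[pos] == symbol else set()
    ends = set()
    for alternative in rules[symbol]:
        positions = {pos}
        for part in alternative:
            positions = {e for p in positions for e in match(rules, part, words, p)}
        ends |= positions
    return ends


def accepts(rules, start, sentence):
    """True if the whole sentence can be derived from the start rule."""
    words = sentence.lower().split()
    return len(words) in match(rules, start, words)


# Assumed toy grammar, written as plain text in the simplified notation.
GRAMMAR_TEXT = """
<command> = <action> <object> ;
<action> = open | close ;
<object> = the door | the window ;
"""

rules = parse_rules(GRAMMAR_TEXT)
print(accepts(rules, "<command>", "open the door"))  # True
print(accepts(rules, "<command>", "lock the door"))  # False

# The grammar changes itself at runtime: a new alternative is added as data.
rules["<action>"].append(["lock"])
print(accepts(rules, "<command>", "lock the door"))  # True
```

This sketch does not handle left-recursive rules, which would recurse forever. A real system would need a proper parser, but the point stands that a text grammar is easy to rewrite on the fly.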

The premier parser generator today is ANTLR4. It is written in Java and can generate parsers in several target languages, including Java, C++, C#, and Python. Another possibility is Apache OpenNLP, which is much more primitive but provides access to its source and some minimal trainability; it is mostly old-style NLP of the part-of-speech tagging type.

Then there are LL(*) grammars, which are more useful in processing continuous speech. There are other techniques to help decide initial branch choices, such as Probabilistic Recursive Transition Networks (the A for Augmented now changed to R, to reflect more accurately that the definitions are recursive).
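As a rough illustration of how branch probabilities can guide the initial choice, here is a small Python sketch of a recursive network whose alternatives carry weights and are tried likeliest first. The grammar, the weights and the function names (parse, sequence, recognize) are all assumptions made up for this example, not taken from any particular system. It counts how many word tests the recognizer performs, so two weightings of the same grammar can be compared.

```python
def parse(rules, symbol, words, pos, stats):
    """Yield each end position for `symbol` at `pos`, trying likeliest branches first."""
    if symbol not in rules:  # a symbol with no rule is a literal word
        stats["tests"] += 1
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for _, alternative in sorted(rules[symbol], key=lambda branch: -branch[0]):
        yield from sequence(rules, alternative, words, pos, stats)


def sequence(rules, parts, words, pos, stats):
    """Yield each end position after matching all `parts` in order from `pos`."""
    if not parts:
        yield pos
        return
    for middle in parse(rules, parts[0], words, pos, stats):
        yield from sequence(rules, parts[1:], words, middle, stats)


def recognize(rules, start, sentence):
    """Return (accepted, number_of_word_tests) for the sentence."""
    words = sentence.lower().split()
    stats = {"tests": 0}
    for end in parse(rules, start, words, 0, stats):
        if end == len(words):
            return True, stats["tests"]
    return False, stats["tests"]


def make_rules(p_question):
    """Build the toy grammar with an assumed probability for the QUESTION branch."""
    return {
        "S": [(p_question, ["QUESTION"]), (1.0 - p_question, ["COMMAND"])],
        "QUESTION": [(1.0, ["what", "is", "NP"])],
        "COMMAND": [(1.0, ["play", "NP"])],
        # NP refers to itself, which is the recursive part of the network.
        "NP": [(0.7, ["the", "N"]), (0.3, ["the", "N", "of", "NP"])],
        "N": [(0.4, ["weather"]), (0.3, ["capital"]),
              (0.2, ["song"]), (0.1, ["country"])],
    }


question_first = recognize(make_rules(0.8), "S", "play the song")
command_first = recognize(make_rules(0.2), "S", "play the song")
print(question_first, command_first)
```

Both weightings accept the sentence, but the one that favors the command branch reaches the answer with fewer word tests, because it never tries the question branch. That is the kind of saving that matters for real time speech.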

But one issue is how to produce a grammar that is more like how humans learn language. Could you define an infrastructure that is itself self-learning and changes to pick up any language, whether German, Swahili, or abstract? That is still quite far off in the future.

But one area where neural networks can augment these systems is by providing learning and reinforcement training for the weights and modifications of grammars and parse trees.
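Here is one hedged way to picture reinforcement training of branch weights, again in Python. It is a single softmax layer over the alternatives of one rule, updated with a REINFORCE-style rule: the chosen branch's score moves up or down in proportion to a reward. This is a stand-in for a real neural network, and the branch names, the learning rate and the reward values are assumptions for illustration only.

```python
import math


def softmax(scores):
    """Convert raw branch scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def reinforce(scores, chosen, reward, rate=0.5):
    """Return new scores after rewarding (or penalizing) the chosen branch.

    Each score moves by rate * reward * (indicator(chosen) - probability),
    which is the gradient of log p(chosen) for a softmax, scaled by the reward.
    """
    probs = softmax(scores)
    return {
        name: value + rate * reward * ((1.0 if name == chosen else 0.0) - probs[name])
        for name, value in scores.items()
    }


# Start with no preference between two hypothetical branches of a rule.
scores = {"question": 0.0, "command": 0.0}

# Suppose the "command" branch led to a successful parse ten times in a row.
for _ in range(10):
    scores = reinforce(scores, "command", reward=1.0)

print(softmax(scores))
```

After the positive rewards the command branch holds most of the probability, and a negative reward would push it back down. The resulting probabilities are the kind of weights a parser could use to order its branch choices.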

One particular area that should see progress is the ability to determine a response, especially to an interrogative, but also in the realm of general speech.

One problem with systems like Siri and Alexa is that they are obviously rigid and stupid. Many of these systems have been built simply from collections of questions and statistics. Try to ask a reasoning question and they barf. Try even to ask them to convert your weather results to Fahrenheit and they barf. Why? Because they are not truly modeling language.

There are two primary issues involved. The first is that it will require true semantic network modeling to provide more intelligent responses, rather than a simple grammar with no knowledge representation. This is, however, a large effort, and it is going to require a revolution away from image-based neural network paradigms like TensorFlow. This is why I invented the Neural Cube, which is the next generation of Fukushima's Neocognitron, adding concepts from Karl Pribram, Gerald Edelman, and John Koza.

Integrative systems will provide processing that can affect path choice weighting, which can improve the efficiency of parsing grammars for real time speech or add deeper classifications to grammars - e.g. temporal location vs. noun. They will also be able to tie in parallel neural networks to inject things like emotion, mode, task participation and state, and to tailor the response based on the relationship with the person. We don't want a cold robotic response like Siri; we want an empathetic response which shows understanding of the situation. This will become the principal feature when developing cybertrons for common human tasks such as working a hotel reception desk. People can be forgiving if the language or the process isn't perfect, as long as they form the empathetic connection. People become frustrated quickly if the effect is that of dealing with a cold machine, which, as I hurl my Alexa device into the wall, becomes ever so apparent.

Cybernetics and Cognitive Science stress building in adaptability and learning in many ways, to make systems work closer to how our brains work. While we have so much to learn in processing language, this is still an area where basic techniques can greatly improve the state of the art.

Noonean calls this "More natural language". Issues like emotion, goals, variability, and language that humans find pleasant are often more important than being able to answer complex questions correctly. That complex reasoning will come. But the stoic, rigid language of Siri and Alexa needs to fade into history as a silly first attempt. It is still revolutionary, almost magic, but what is coming next will be language that doesn't annoy. Language that makes humans comfortable. This is the promise of cognitive linguistics and cybernetic language development.

