What is Natural Language Processing or NLP?
NLP is the process by which your spoken or written language is translated into an intent for action, or a result, that a computer can understand.
Natural Language Processing works somewhat like a graphical interface.
That’s because, like a graphical interface, the heart of NLP is the process of determining user intent, but through interpreting voice instead of visual engagement or cues.
A graphical user interface (GUI) uses visually oriented objects and interactive layouts that a user can engage with. Those points of engagement can be things like the menu on a website, the buttons on a remote control, or the visual menu in a product like an entertainment console or a streaming service. In those instances, your finger does the job that your voice does in NLP.
That’s because you don’t speak to the graphic interface.
Instead, you look at the visual menu and, using your finger via button or mouse, you communicate your intent for information or action to the computer by clicking something in a way that the system understands.
For example, clicking on the home button means you want to go home, and clicking on the play button means you want the computer to play content.
In a GUI, UX and UI designers have to create and display pre-set visual options and actions so users can see and select intents that make their desired task execute. Because the interface is visually oriented, a GUI is something users usually need to look at and learn how to use in order to know how to express intent and get content from the system.
A pure voice user interface (VUI) is mostly the opposite. A VUI works as much as possible to get information and understand the user’s intent from what is contained in their voice and how a person naturally speaks. It does that through the exchange and translation of language.
That process of managing the exchange and translation of language is called Natural Language Processing, or NLP for short. In practice, it’s software (often written in Python) that takes captured text or transcribed human language and tries to determine meaning from it.
NLP is used all around you. It’s part of the predictive text on your phone. It’s used in social media to mine text in order to assess emotions and topic interests in users’ posts on platforms like Facebook and Twitter.
It’s also in use at companies like J.P. Morgan, which mine the texts of President Trump and some market sectors to quickly assess volatility in the marketplace due to government policies. And it’s used as a processing layer in voice assistants to take the words you say and try to identify what you mean or what task you wish to accomplish (user intent).
Let’s focus on voice assistants.
As I mentioned above, a VUI is an interface that’s trying to understand you, particularly what you are saying. While it uses a form of AI (weak AI), it does not have a sentient or conscious ability to understand words, meaning, and context. Instead, after a separate process captures the words as text, NLP uses rules humans have developed around language to make a reasonable prediction of the user’s meaning or intent in using those words.
NLP does this through these primary processes:
Breaking a sentence that the system hears into its separate words. If you listen to your own speech, you’ll notice how we blend one word into another. Humans can usually tell where one word stops and another begins. That’s much harder for a computer. To parse the words, the system analyzes the sound wave pattern of a sentence, including the amplitude of the waves.
So let’s say NLP found a word. Say “running.” As a human, you know what that means. However, the present participle ending “ing” adds time context to the word, which makes it a little more complex for a robot to understand.
So NLP often looks for the stem, or the root word and its primary meaning. In this case, the root word of running is “run.” Finding the stem of the word allows the NLP to more easily assess the meaning: “to use legs to move quickly” or “to execute.” Based on the other words and parts of speech spoken in the utterance, the NLP will calculate the most likely meaning intended by the word “run.”
Since language has rules and parts of speech, the software looks to group words that likely go together, like “the book.” The word “the” is not likely to be used by itself, so the software looks for the word or words it is likely related to and “chunks” them together.
Some words don’t really add to the meaning of the sentence, so the NLP does not seek to find meaning from them. These are often function words like prepositions and articles (e.g., at, of, the).
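To make those processes a little more concrete, here is a minimal sketch in Python. It works on text that has already been captured, so it skips the sound wave analysis entirely. The word lists, the function names, and the very naive “ing” rule are all assumptions made up for illustration; real NLP libraries handle these steps with far more care.

```python
import re

# Hypothetical word lists for illustration only; real systems use much larger ones.
STOP_WORDS = {"a", "an", "the", "at", "of", "is"}
DETERMINERS = {"a", "an", "the"}


def tokenize(sentence):
    """Break a sentence into its separate lowercase words."""
    return re.findall(r"[a-z']+", sentence.lower())


def stem(word):
    """Toy stemmer: strip a trailing 'ing' and undo a doubled final consonant."""
    if word.endswith("ing") and len(word) > 5:
        root = word[:-3]
        if len(root) >= 2 and root[-1] == root[-2]:
            root = root[:-1]  # "runn" becomes "run"
        return root
    return word


def chunk(tokens):
    """Group a determiner like 'the' with the word that follows it."""
    chunks = []
    i = 0
    while i < len(tokens):
        if tokens[i] in DETERMINERS and i + 1 < len(tokens):
            chunks.append((tokens[i], tokens[i + 1]))
            i += 2
        else:
            chunks.append((tokens[i],))
            i += 1
    return chunks


def remove_stop_words(tokens):
    """Drop words that add little to the meaning of the sentence."""
    return [t for t in tokens if t not in STOP_WORDS]


tokens = tokenize("The dog is running at the park")
print(chunk(tokens))
print([stem(t) for t in remove_stop_words(tokens)])  # ['dog', 'run', 'park']
```

What is left at the end is a short list of root words that carry most of the meaning, which is what the next stage works from.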
Invocation, Intent, and Entities
With the structure, parts of speech, and grouping set, the system then looks for keywords in the sentence to triangulate meaning and understand the user’s intent.
Let’s say we are a voice assistant named Robbie and we hear a request from a user…
“Robbie, turn on the bedroom lights.”
This is how it breaks down for NLP.
Robbie is the Invocation or “wake” word.
This is the word or words that help a service understand and identify which app or resource a user wants to act on the request. In this case, that’s Robbie. In your own life, that may be “Hey Siri” or “Alexa.”
“Turn on” is the phrase that helps the system understand the Intent, or what the user is trying to do. In this case, that’s changing the state of something from off to on.
Objects or defined concepts that are related to the action are called entities or variables. In this case, the action “turn on” will be applied to the variable “lights,” and “bedroom” tells the system which lights.
Once the NLP has determined the intent of the spoken language, it will attempt to match your expression to an action it has in its library of actions and will perform that function with the parameters expressed (in this case, bedroom lights). If it finds a match, it gives you the response.
While it takes data scientists, programmers, linguists, and conversational professionals to do this, that is NLP in a nutshell: turning language into intent and, through a voice assistant, turning intent into action.
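Here is a rough sketch of how that breakdown might look in Python. It uses simple keyword matching, which is only a stand-in for what a real assistant does. The wake word list, the intent names (TurnOn, TurnOff), and the device and room lists are all hypothetical and exist only for this example.

```python
import re

# Hypothetical vocabularies, invented for this example.
WAKE_WORDS = ("robbie",)
INTENT_PHRASES = {"turn on": "TurnOn", "turn off": "TurnOff"}
KNOWN_DEVICES = ("lights", "fan")
KNOWN_ROOMS = ("bedroom", "kitchen")


def parse_utterance(utterance):
    """Split a request into wake word, intent, and entities by keyword matching."""
    text = re.sub(r"[^a-z\s]", "", utterance.lower()).strip()
    words = text.split()
    result = {"wake_word": None, "intent": None, "entities": {}}

    # The invocation (wake) word is expected at the start of the request.
    if words and words[0] in WAKE_WORDS:
        result["wake_word"] = words[0]
        words = words[1:]

    # The phrase that follows the wake word signals the intent.
    rest = " ".join(words)
    for phrase, intent in INTENT_PHRASES.items():
        if rest.startswith(phrase):
            result["intent"] = intent
            break

    # Any remaining known words become entities (variables).
    for word in words:
        if word in KNOWN_DEVICES:
            result["entities"]["device"] = word
        elif word in KNOWN_ROOMS:
            result["entities"]["room"] = word
    return result


print(parse_utterance("Robbie, turn on the bedroom lights."))
```

For the Robbie request, the sketch picks out “robbie” as the wake word, TurnOn as the intent, and “bedroom” and “lights” as the entities.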
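The matching step can be pictured as a lookup in a table of actions. This is a minimal sketch that assumes the intent and entities have already been worked out; the intent names, the functions, and the response wording are hypothetical, not how any particular assistant is built.

```python
def turn_on(device, room):
    """Pretend to switch a device on and return the spoken response."""
    target = f"{room} {device}".strip()
    return f"Turning on the {target}."


def turn_off(device, room):
    """Pretend to switch a device off and return the spoken response."""
    target = f"{room} {device}".strip()
    return f"Turning off the {target}."


# Hypothetical library of actions, keyed by intent name.
ACTIONS = {"TurnOn": turn_on, "TurnOff": turn_off}


def handle(intent, entities):
    """Match an intent to an action and run it with the expressed parameters."""
    action = ACTIONS.get(intent)
    if action is None:
        return "Sorry, I don't know how to do that."
    device = entities.get("device")
    if device is None:
        return "Which device do you mean?"
    return action(device, entities.get("room", ""))


print(handle("TurnOn", {"device": "lights", "room": "bedroom"}))
# Turning on the bedroom lights.
```

If the intent is not in the library, or a needed parameter is missing, the sketch falls back to a short reply instead of acting, which is roughly what you experience when an assistant says it didn’t understand.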