Top 3 Exciting Ideas in NLP in 2018

“Machine learning and natural language are the foundation to any AI system, just in the ability to communicate with us in a human way and to automate that learning process, what you build on top of that, whether it’s predictive, prescriptive analytics, forecasting, optimization, wherever you want to go, that foundation always comes back to these technologies that have been around for decades.” So says SAS Artificial Intelligence and Language Analytics Strategist Mary Beth Moore. Building on that foundation, several research breakthroughs in 2018 brought astonishing improvements in NLP. In this article we take a quick look at three of the most sophisticated new language models and approaches in NLP.

1. BERT

BERT is short for Bidirectional Encoder Representations from Transformers. It is a pre-trained, cutting-edge NLP model that achieved impressive new results on NLP tasks such as question answering, named entity recognition, and natural language inference. Unlike other pre-trained language models such as OpenAI GPT and ELMo, BERT is built on a bidirectional Transformer that conditions on each word's context from both the left and the right. The following figure shows the difference among the three architectures.

BERT's bidirectional design avoids the problem of a word indirectly "seeing itself" during training: instead of predicting the next word, it randomly masks a percentage of the input tokens and learns to predict the masked words from their surrounding context. It also learns relationships between sentences through a next-sentence prediction task during pre-training. BERT is widely seen as opening a new era in NLP and can be used in applications such as chatbots and customer review analysis.
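To make the masked-word objective concrete, here is a minimal sketch, assuming the Hugging Face transformers library and its pre-trained bert-base-uncased checkpoint (neither is part of the original BERT release described above), that asks BERT to fill in a masked token:

```python
# Minimal sketch: masked-word prediction with a pre-trained BERT model.
# Assumes the Hugging Face "transformers" library is installed (pip install transformers).
from transformers import pipeline

# Load a fill-mask pipeline backed by the pre-trained bert-base-uncased checkpoint.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the [MASK] token using context from both the left and the right.
for prediction in unmasker("He started his car and [MASK] away."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Because the model reads the whole sentence at once, the words on both sides of the mask influence the prediction, which is exactly the bidirectional behaviour described above.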

2. SWAG

When a person reads “He started his car”, he or she can anticipate the rest of the sentence, which might be “and he drove away”. Unlike humans, machines cannot complete this obvious, easy sentence, because doing so requires reasoning and common sense. SWAG is short for Situations With Adversarial Generations, and its goal is to advance research in Natural Language Inference (NLI) grounded in commonsense reasoning. SWAG is introduced as a large-scale dataset containing 113k multiple-choice questions about a wide range of commonsense situations, collected from video captions. The dataset is built by following these steps (a code sketch of the resulting multiple-choice format follows the list):

  1. Extracting a context sentence from a video caption.
  2. Extracting the correct answer from the next video caption.
  3. Generating wrong answers: a large set of candidate wrong endings is generated, the most plausible ones are selected statistically, and endings that look machine-generated are filtered out and replaced with more human-like ones. This way of generating answers is called Adversarial Filtering (AF).
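To illustrate what a SWAG-style item looks like in practice, here is a minimal sketch, assuming the Hugging Face transformers library and its BertForMultipleChoice head (not part of the SWAG paper itself), that scores four candidate endings for one context sentence. The context and endings below are made-up examples, not actual dataset entries:

```python
# Minimal sketch: scoring SWAG-style endings with a multiple-choice BERT head.
# Assumes the Hugging Face "transformers" library and PyTorch are installed.
import torch
from transformers import BertTokenizer, BertForMultipleChoice

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMultipleChoice.from_pretrained("bert-base-uncased")  # untuned head, for illustration only

# One SWAG-style item: a context sentence and four candidate endings.
context = "He started his car"
endings = [
    "and drove away.",
    "and ate a sandwich.",
    "and the ocean froze.",
    "and the car turned into a bird.",
]

# Pair the context with every ending, then add a batch dimension of size 1.
encoding = tokenizer([context] * len(endings), endings, return_tensors="pt", padding=True)
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 4), one score per ending

print("Predicted ending:", endings[logits.argmax(dim=-1).item()])
```

In the real benchmark the multiple-choice head is fine-tuned on the training split before being evaluated; the sketch only shows the shape of the task.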

The previous figure gives an example of how SWAG works. Model accuracy on SWAG is relatively high, reaching 86.2%, while human accuracy is about 88%. Work like this can improve commonsense reasoning in question-answering systems and chatbots.

3. LISA

LISA is short for Linguistically-Informed Self-Attention. It is a neural network model designed for semantic role labelling, combining deep learning with linguistic formalisms. For example, given the sentence “Matt gave the instructions to Kim”, the model should recognise “gave” as the predicate, “Matt” as the agent (the person who gave the instructions), “the instructions” as the theme, and “Kim” as the recipient.
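As a concrete picture of what such an analysis looks like, the following sketch spells out the roles for the example sentence as a plain Python structure, using standard PropBank-style labels. This is only an illustration of the task's output, not LISA's actual API or data format:

```python
# Illustrative only: the predicate-argument structure a semantic role labeller
# should recover for "Matt gave the instructions to Kim".
# Role labels follow the usual PropBank convention (ARG0 = agent, ARG1 = theme, ARG2 = recipient).
srl_frame = {
    "sentence": "Matt gave the instructions to Kim",
    "predicate": "gave",
    "arguments": {
        "ARG0": "Matt",              # the agent / giver
        "ARG1": "the instructions",  # the theme / thing given
        "ARG2": "Kim",               # the recipient
    },
}

for role, span in srl_frame["arguments"].items():
    print(f"{role}: {span}")
```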

The neural network takes word embeddings as input, along with task-specific learned parameters, and trains with multi-head self-attention and multi-task learning. Unlike previous semantic role labelling models, LISA requires only trivial pre-processing: it can incorporate syntax from raw token input alone, encode the sequence once, and then perform parsing, predicate detection, and role labelling. LISA can be used in automatic summarization, machine translation, and Q&A systems, and performs very well when analysing writing styles in newswire, journal, and fictional text.
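To give a feel for the multi-head self-attention at the core of this architecture, here is a minimal sketch using PyTorch's built-in nn.MultiheadAttention over a toy sequence of token embeddings. The dimensions are arbitrary, and this is not LISA's actual implementation, which additionally injects syntactic supervision into the attention mechanism:

```python
# Minimal sketch: multi-head self-attention over a sequence of token embeddings.
# Assumes PyTorch is installed; dimensions are arbitrary toy values.
import torch
import torch.nn as nn

embed_dim, num_heads = 64, 8
attention = nn.MultiheadAttention(embed_dim=embed_dim, num_heads=num_heads, batch_first=True)

# A batch of 1 sentence with 6 tokens, each represented by a 64-dimensional embedding.
tokens = torch.randn(1, 6, embed_dim)

# Self-attention: queries, keys, and values all come from the same token sequence.
contextualised, attn_weights = attention(tokens, tokens, tokens)

print(contextualised.shape)  # torch.Size([1, 6, 64])
print(attn_weights.shape)    # torch.Size([1, 6, 6]) — weights averaged over heads
```

Each attention head learns to relate every token to every other token in the sentence, which is what lets a single model share one encoding across parsing, predicate detection, and role labelling.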

References:

1- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

2- SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

3- Linguistically-Informed Self-Attention for Semantic Role Labeling

4- 14 NLP Research Breakthroughs
