In this post we will describe and demystify the most relevant components of the paper “Attention Is All You Need” (Vaswani et al., 2017). This paper was a great advance in the use of the attention mechanism, which is the main improvement in a model called the Transformer. The best-known models now emerging in NLP tasks, such as GPT-2 and BERT, are built by stacking dozens of Transformer blocks or variants of them.
We will describe the components of this model, analyze how they operate, and build a simple model that we will apply to a small-scale NMT (Neural Machine Translation) problem. To read more about the problem we will address and to learn how the basic attention mechanism works, I recommend reading my previous post, “A Guide on the Encoder-Decoder Model and the Attention Mechanism”. …
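As a first taste of what we will build, here is a minimal NumPy sketch of the scaled dot-product attention at the heart of the Transformer, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V; the shapes and toy inputs below are illustrative, not the paper’s reference code:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)        # similarity of every query with every key
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                     # weighted sum of the values

# Toy example: 3 query positions attending over 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```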
Part 1 of a series about text summarization using machine learning techniques
While working on my capstone project for the Machine Learning Engineer Nanodegree at Udacity, I studied the problem of text summarization in some depth. For that reason, I am going to write a series of articles about it, from the definition of the problem to some approaches to solving it, showing some basic implementations and algorithms and describing and testing some more advanced techniques. It will span several posts over the next few weeks or months. …
In this post we will describe the most relevant steps to start training a custom algorithm in Amazon SageMaker without using a custom container, showing how to handle experiments and solving some of the common problems that arise when working with custom models in SageMaker script mode. Some basic SageMaker concepts will not be detailed, in order to keep the focus on the relevant ones.
The following steps will be explained:
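Before going through those steps, here is a minimal sketch of what launching a script-mode training job looks like with the SageMaker Python SDK; the entry-point script, hyperparameters, instance settings, and S3 path are illustrative placeholders:

```python
# Minimal script-mode sketch: SageMaker runs your own train.py inside a
# pre-built framework container, so no custom container is needed.
import sagemaker
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",              # your own training script
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.xlarge",
    framework_version="2.4",
    py_version="py37",
    hyperparameters={"epochs": 10},      # passed to train.py as CLI arguments
)

# The channel name "train" becomes the SM_CHANNEL_TRAIN directory in the job.
estimator.fit({"train": "s3://my-bucket/train-data"})
```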
Today, we’ll continue our journey through the world of NLP. In this article, we’re going to describe the basic architecture of an encoder-decoder model that we’ll apply to a neural machine translation problem, translating texts from English to Spanish.
Later, we’ll introduce a technique that has been a great step forward in NLP tasks: the attention mechanism. We’ll detail a basic attention computation applied to a sequence-to-sequence model with a many-to-many approach.
But for the moment, it’ll be a simple attention model. We won’t comment (yet) on the more complex models that’ll be discussed in future articles, such as when we address transformers. …
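To make the architecture concrete before diving in, here is a minimal Keras sketch of an encoder-decoder model for translation; the vocabulary sizes, dimensions, and choice of LSTM layers are illustrative assumptions rather than the exact model we will build:

```python
from tensorflow.keras import layers, Model

src_vocab, tgt_vocab, emb_dim, units = 5000, 5000, 64, 128

# Encoder: embeds the English sentence and summarizes it in its final states.
enc_in = layers.Input(shape=(None,), name="source_tokens")
enc_emb = layers.Embedding(src_vocab, emb_dim)(enc_in)
_, state_h, state_c = layers.LSTM(units, return_state=True)(enc_emb)

# Decoder: generates the Spanish sentence conditioned on the encoder states.
dec_in = layers.Input(shape=(None,), name="target_tokens")
dec_emb = layers.Embedding(tgt_vocab, emb_dim)(dec_in)
dec_out, _, _ = layers.LSTM(units, return_sequences=True, return_state=True)(
    dec_emb, initial_state=[state_h, state_c])
probs = layers.Dense(tgt_vocab, activation="softmax")(dec_out)

model = Model([enc_in, dec_in], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```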
Today, we’ll continue our journey through the fascinating world of natural language processing (NLP) by introducing the operation and use of recurrent neural networks to generate text from a small initial text. This type of problem is known as language modeling and is used when we want to predict the next word or character in an input sequence of words or characters.
But in language-modeling problems, what matters isn’t only which words are present but also their order, i.e., where they appear in the text sequence. …
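To show how a recurrent network captures that order, here is a minimal character-level language-model sketch in Keras; the toy corpus, sequence length, and layer sizes are illustrative assumptions:

```python
import numpy as np
from tensorflow.keras import layers, models

text = "hello world, hello nlp"           # toy corpus for illustration
chars = sorted(set(text))
char2idx = {c: i for i, c in enumerate(chars)}
seq_len = 5

# Build (input sequence, next character) training pairs: the model must learn
# which character follows each ordered window of characters.
X = np.array([[char2idx[c] for c in text[i:i + seq_len]]
              for i in range(len(text) - seq_len)])
y = np.array([char2idx[text[i + seq_len]] for i in range(len(text) - seq_len)])

model = models.Sequential([
    layers.Embedding(len(chars), 16),
    layers.LSTM(64),                      # the recurrent state tracks character order
    layers.Dense(len(chars), activation="softmax"),  # distribution over the next char
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=5, verbose=0)
```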
Explaining the concepts and use of word embeddings in NLP, applied to text classification.
In this blog post we are going to explain the concepts and use of word embeddings in NLP, using GloVe as an example. Then we will try to apply pre-trained GloVe word embeddings to solve a text classification problem using this technique. As in other notebooks, we will follow the notebook from the great NLP course by LazyProgrammer, “Natural Language Processing in Python”.
On my personal blog you can find a blog post and notebook with the text and code from this post. …
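As a preview of the technique, here is a minimal sketch of loading pre-trained GloVe vectors and building an embedding matrix for a classifier; glove.6B.100d.txt is one of the standard pre-trained GloVe files, and the toy word index is illustrative:

```python
import numpy as np

# Load the pre-trained vectors: each line is a word followed by its coefficients.
word2vec = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        values = line.split()
        word2vec[values[0]] = np.asarray(values[1:], dtype="float32")

# Map our own vocabulary to rows of an embedding matrix (0 is reserved for padding).
vocab = {"movie": 1, "great": 2, "boring": 3}   # toy word index
embedding_dim = 100
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, idx in vocab.items():
    vector = word2vec.get(word)
    if vector is not None:                       # out-of-vocabulary words stay all-zero
        embedding_matrix[idx] = vector
```

This matrix can then initialize a (typically frozen) Keras Embedding layer in the classifier.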
Applying basic NLP techniques for text classification on Tweets: Real or Fake?
In this post we continue describing some traditional methods to address a Natural Language Processing task: text classification.
This is an easy and fast-to-build text classifier, based on a traditional approach to NLP problems. The steps to follow are:
The blog post and the code are available on my fastai pages blog. …
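As a sketch of this traditional approach, a bag-of-words pipeline with TF-IDF features and logistic regression in scikit-learn might look as follows; the example tweets and labels are illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = ["Forest fire near La Ronge", "I love this sunny day",
          "Earthquake reported downtown", "Just baked some cookies"]
labels = [1, 0, 1, 0]                     # 1 = real disaster, 0 = not

# TF-IDF turns each tweet into a sparse word-frequency vector; the linear
# model then learns which terms signal a real disaster.
clf = make_pipeline(TfidfVectorizer(stop_words="english"), LogisticRegression())
clf.fit(tweets, labels)
print(clf.predict(["Flood warning issued for the coast"]))
```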
To develop this ML model we will use a very powerful platform based on Microsoft Azure: Azure Machine Learning Services. It is a cloud-based framework for designing, training, deploying, and managing machine learning models, providing a toolkit where you can manage and supervise all the steps of machine learning model development.
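For illustration, a minimal sketch of connecting to a workspace and logging a training run with the azureml-core SDK might look like this; the workspace configuration file and experiment name are illustrative assumptions:

```python
from azureml.core import Workspace, Experiment

# Reads the config.json you can download from the Azure portal.
ws = Workspace.from_config()
exp = Experiment(workspace=ws, name="my-first-experiment")

run = exp.start_logging()                # interactive (notebook-style) run
run.log("accuracy", 0.87)                # any metric you want to supervise
run.complete()
```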
Voice recognition software enables our devices to be responsive to our speech. We see it in our phones, cars, and home appliances. But for people with accents — even the regional lilts, dialects and drawls native to various parts of the same country — the artificially intelligent speakers can seem like very different devices: inattentive, unresponsive, even isolating. Researchers found that smart speakers made about 30 percent more errors in parsing the speech of non-native speakers compared to native speakers. Other research has shown that voice recognition software often works better for men than women.
Algorithmic biases often stem from the datasets on which the algorithms are trained. One way to improve non-native speakers’ experiences with voice recognition software is to train the algorithms on a diverse set of speech samples. Accent detection on existing speech samples can help generate these training datasets, an important step toward closing the “accent gap” and eliminating biases in voice recognition software. …
This is part 2 of “Predicting mortgage approvals”. You can find part 1 here.
The next step is to develop a predictive model that allows us to determine, with an acceptable degree of confidence, whether a loan request or application will be accepted or not. This is a two-class classification problem: accepted or not accepted. We will try to define a model starting from a simple approach; this model will then be refined while avoiding added complexity as much as possible.
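As a sketch of that simple starting point, a logistic-regression baseline on a couple of tabular features might look like this; the feature names and toy data are illustrative assumptions, not the real dataset:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "loan_amount":      [120, 250, 90, 310, 180, 60],
    "applicant_income": [50, 80, 30, 95, 60, 25],
    "accepted":         [1, 1, 0, 1, 1, 0],    # two classes: accepted or not
})
X_train, X_test, y_train, y_test = train_test_split(
    df[["loan_amount", "applicant_income"]], df["accepted"],
    test_size=0.33, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```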
After the analysis we have made, we can take some decisions: