In this post we will describe and demystify the relevant components of the paper “Attention Is All You Need” (Vaswani, Ashish & Shazeer, Noam & Parmar, Niki & Uszkoreit, Jakob & Jones, Llion & Gomez, Aidan & Kaiser, Lukasz & Polosukhin, Illia. (2017)). This paper was a major advance in the use of the attention mechanism, which is the core building block of the model it introduced: the Transformer. The most prominent models emerging in NLP today, such as GPT-2 and BERT, are built by stacking dozens of Transformer blocks or variants of them.
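The central operation the paper builds on, scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, can be sketched in a few lines of NumPy. This is a minimal illustrative toy (shapes and random inputs are made up, not from the paper’s experiments):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention from the paper:
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity between queries and keys
    # softmax over the key dimension (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# toy example: 3 query vectors attending over 4 key/value vectors of size 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8); each row of w sums to 1
```

The √d_k scaling keeps the dot products from growing too large, which would push the softmax into regions with vanishing gradients.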
Part 1 of a series about text summarization using machine learning techniques
During the execution of my capstone project in the Machine Learning Engineer Nanodegree at Udacity, I studied the problem of text summarization in some depth. For that reason, I am going to write a series of articles about it, from the definition of the problem to some approaches to solve it, showing basic implementations and algorithms and describing and testing more advanced techniques. This will span several posts over the next few weeks or months. …
In this post we will describe the most relevant steps to start training a custom algorithm on Amazon SageMaker without using a custom container, showing how to handle experiments and how to solve some of the common problems that arise when working with custom models in SageMaker script mode. Some basic SageMaker concepts will not be detailed, in order to focus on the relevant ones.
The following steps will be explained:
Today, we’ll continue our journey through the world of NLP. In this article, we’re going to describe the basic architecture of an encoder-decoder model that we’ll apply to a neural machine translation problem, translating texts from English to Spanish.
Later, we’ll introduce a technique that has been a great step forward in NLP tasks: the attention mechanism. We’ll detail a basic attention computation applied to a sequence-to-sequence model with a many-to-many approach.
But for the moment, it’ll be a simple attention model. We won’t comment (yet) on the more complex models that’ll be discussed…
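In the spirit of that simple attention model, here is a minimal sketch of how a decoder state can attend over encoder states to build a context vector. It assumes plain dot-product scoring (the article’s model may use a different scoring function), and all inputs below are random toys:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_context(decoder_state, encoder_states):
    """Simple dot-product attention: score each encoder state against the
    current decoder state, then build a weighted context vector."""
    scores = encoder_states @ decoder_state  # one score per source position
    weights = softmax(scores)                # attention distribution over source
    context = weights @ encoder_states       # weighted sum of encoder states
    return context, weights

# toy example: 5 source positions, hidden size 16
rng = np.random.default_rng(1)
enc = rng.normal(size=(5, 16))
dec = rng.normal(size=(16,))
ctx, w = attention_context(dec, enc)
print(ctx.shape)  # (16,); w sums to 1 over the 5 source positions
```

At each decoding step the context vector is recomputed, letting the decoder focus on different source words as the translation progresses.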
Today, we’ll continue our journey through the fascinating world of natural language processing (NLP) by introducing the operation and use of recurrent neural networks to generate text from a small initial text. This type of problem is known as language modeling and is used when we want to predict the next word or character in an input sequence of words or characters.
But in language-modeling problems, what matters isn’t only which words are present but also their order, i.e., where they appear in the text sequence. …
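The article applies recurrent networks to this; as a much simpler stand-in that shows the next-word prediction setup itself, here is a tiny bigram count model (the corpus below is made up):

```python
from collections import Counter, defaultdict

def train_bigram_lm(text):
    """Train a tiny bigram language model: estimate which word follows
    which from raw counts. Illustrates the next-word prediction setup."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the word most frequently observed after `word`."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept on the mat"
lm = train_bigram_lm(corpus)
print(predict_next(lm, "on"))  # 'the'
```

An RNN plays the same role as the count table here, but learns a dense representation of the whole preceding sequence instead of conditioning on only the last word.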
Applying basic NLP techniques for text classification on Tweets: Real or Fake?
In this post we continue describing traditional methods to address a Natural Language Processing task: text classification.
This is an easy and fast-to-build text classifier, based on a traditional approach to NLP problems. The steps to follow are:
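As a hedged illustration of this kind of traditional pipeline, here is a self-contained bag-of-words Naive Bayes sketch. The training texts and labels below are made up for the Real-or-Fake tweet setting, not taken from the actual competition data:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesClassifier:
    """Tiny bag-of-words multinomial Naive Bayes, a classic traditional
    baseline for text classification."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for w in text.lower().split():
                self.word_counts[label][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, text):
        best, best_lp = None, float("-inf")
        total_docs = sum(self.class_counts.values())
        for label in self.class_counts:
            lp = math.log(self.class_counts[label] / total_docs)  # class prior
            total_words = sum(self.word_counts[label].values())
            for w in text.lower().split():
                # Laplace smoothing to avoid zero probabilities
                p = (self.word_counts[label][w] + 1) / (total_words + len(self.vocab))
                lp += math.log(p)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

# hypothetical example data, invented for illustration
train_texts = ["earthquake hits the city", "fire spreads downtown",
               "i love this sunny day", "what a great movie night"]
train_labels = ["real", "real", "fake", "fake"]
clf = NaiveBayesClassifier().fit(train_texts, train_labels)
print(clf.predict("earthquake in the city"))  # 'real'
```

In practice, the same pipeline shape (tokenize, count, fit a linear or probabilistic model) is what libraries like scikit-learn package behind vectorizer and classifier objects.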
In this development of an ML model we will use a very powerful platform from Microsoft Azure: Azure Machine Learning Services. It is a cloud-based framework for designing, training, deploying, and managing machine learning models, providing a toolkit where you can manage and supervise all the steps of model development.
Voice recognition software enables our devices to be responsive to our speech. We see it in our phones, cars, and home appliances. But for people with accents — even the regional lilts, dialects and drawls native to various parts of the same country — the artificially intelligent speakers can seem very different: inattentive, unresponsive, even isolating. Researchers found that smart speakers made about 30 percent more errors in parsing the speech of non-native speakers compared to native speakers. Other research has shown that voice recognition software often works better for men than women.
Algorithmic biases often stem from the datasets…
This is part 2 of “Predicting mortgage approvals”. You can find part 1 here.
The next step is to develop a predictive model that allows us to determine, with an acceptable degree of confidence, whether a loan application is accepted or not. This is a two-class classification problem: accepted or not accepted. We will start by defining a model from a simple approach; this model will then be refined while avoiding increases in complexity as much as possible.
Based on the analysis performed, we can make some decisions:
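As a sketch of the kind of simple starting point described above, here is a from-scratch logistic regression for a two-class problem. The feature names and data are invented for illustration, not taken from the FFIEC dataset:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=2000):
    """Logistic regression trained with batch gradient descent: a simple
    baseline for the accepted / not-accepted classification."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)          # predicted acceptance probability
        grad_w = X.T @ (p - y) / len(y)  # gradient of the log loss
        grad_b = (p - y).mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# made-up features: [loan_to_income_ratio, credit_flag]
X = np.array([[0.2, 1], [0.3, 1], [0.8, 0], [0.9, 0], [0.4, 1], [0.7, 0]])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = accepted, 0 = not accepted
w, b = train_logreg(X, y)
preds = (sigmoid(X @ w + b) > 0.5).astype(int)
print(preds)  # should recover the training labels on this separable toy data
```

A simple linear model like this gives an interpretable baseline; refinements (regularization, richer features, tree ensembles) can then be justified only if they clearly beat it.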
This is part 1 of “Predicting mortgage approvals”; in part 2 I show the classification model in Azure Machine Learning Studio.
This post is an introductory example of an EDA (Exploratory Data Analysis). Our goal is to explore and analyze the data and, finally, to design a model that predicts whether a mortgage application is accepted or denied, using a dataset adapted from the Federal Financial Institutions Examination Council (FFIEC). The model will be described in the next post.
The data contains variables about applicants, loan characteristics and amounts, location and population, etc. Basically, we explored, analyzed…