Getting Started

A detailed implementation of a Transformer model in TensorFlow

Picture by Vinson Tan from Pixabay

In this post we will describe and demystify the relevant components of the paper “Attention Is All You Need” (Vaswani, Ashish & Shazeer, Noam & Parmar, Niki & Uszkoreit, Jakob & Jones, Llion & Gomez, Aidan & Kaiser, Lukasz & Polosukhin, Illia. (2017))[1]. This paper was a major advance in the use of the attention mechanism, which is the core building block of a model called the Transformer. The most famous models currently emerging in NLP, for example GPT-2 and BERT, are built by stacking dozens of Transformer blocks or variants of them.

We will describe the components of this model, analyze…
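To give a flavor of what the post covers, here is a minimal sketch of the scaled dot-product attention at the core of the Transformer, written in TensorFlow 2; the function and tensor names are illustrative, not taken from the post itself:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    """Scaled dot-product attention as described in 'Attention Is All You Need'.

    q, k, v: tensors of shape (..., seq_len, depth).
    mask: optional tensor broadcastable to (..., seq_len_q, seq_len_k).
    """
    # Similarity scores between queries and keys.
    matmul_qk = tf.matmul(q, k, transpose_b=True)      # (..., seq_len_q, seq_len_k)

    # Scale by sqrt(d_k) to keep the softmax in a well-behaved range.
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_logits = matmul_qk / tf.math.sqrt(dk)

    # Mask out padded or future positions, if a mask is given.
    if mask is not None:
        scaled_logits += (mask * -1e9)

    # Attention weights sum to 1 over the key dimension.
    attention_weights = tf.nn.softmax(scaled_logits, axis=-1)

    # Weighted sum of the values.
    output = tf.matmul(attention_weights, v)            # (..., seq_len_q, depth_v)
    return output, attention_weights
```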


Part 1 of a series about text summarization using machine learning techniques

Photo by Romain Vignes on Unsplash

During my capstone project for the Machine Learning Engineer Nanodegree at Udacity, I studied the problem of text summarization in some depth. For that reason, I am going to write a series of articles about it, from the definition of the problem to some approaches to solve it, showing some basic implementations and algorithms and describing and testing some more advanced techniques. The series will span several posts over the next few weeks or months. …


Create an experiment, train and use checkpoints for a Transformer model

In this post we will describe the most relevant steps to start training a custom algorithm in Amazon SageMaker without using a custom container, showing how to handle experiments and how to solve some of the common problems that come up when working with custom models in SageMaker script mode. Some basic SageMaker concepts will not be detailed, in order to focus on the relevant ones. A minimal script-mode sketch appears right after the step list below.

The following steps will be explained:

  1. Create an Experiment and Trial to keep track of our experiments
  2. Load the training data to our training instance
  3. Create the scripts to train our custom model, a Transformer.
  4. Create…
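As an illustration of what those steps look like in code, here is a minimal, hypothetical sketch of launching a script-mode training job with the SageMaker Python SDK; the script name, S3 path, instance type, hyperparameters and experiment names are placeholders, not the exact values used in the post:

```python
import sagemaker
from sagemaker.tensorflow import TensorFlow

role = sagemaker.get_execution_role()

# Script-mode estimator: our own train.py runs inside the managed TensorFlow container.
estimator = TensorFlow(
    entry_point="train.py",          # custom training script (placeholder name)
    source_dir="source",             # folder with the script and its dependencies
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    framework_version="2.3",
    py_version="py37",
    hyperparameters={"epochs": 10, "batch_size": 64},
)

# The channel name ('train') is exposed inside the container as SM_CHANNEL_TRAIN;
# experiment_config links the job to a SageMaker Experiment and Trial (names are placeholders).
estimator.fit(
    {"train": "s3://my-bucket/transformer/train"},
    experiment_config={
        "ExperimentName": "transformer-experiment",
        "TrialName": "transformer-trial-1",
        "TrialComponentDisplayName": "Training",
    },
)
```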


Create and train a neural machine translation model with attention in TF2

Photo by Alireza Attari on Unsplash

Today, we’ll continue our journey through the world of NLP. In this article, we’re going to describe the basic architecture of an encoder-decoder model that we’ll apply to a neural machine translation problem, translating texts from English to Spanish.

Later, we’ll introduce a technique that has been a great step forward for NLP tasks: the attention mechanism. We’ll walk through a basic application of attention to a sequence-to-sequence (many-to-many) model.
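As a hint of what that looks like in code, here is a minimal sketch of an additive (Bahdanau-style) attention layer in TensorFlow 2; the class and variable names are illustrative, not necessarily those used later in the article:

```python
import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    """Additive (Bahdanau) attention, a simple form of the mechanism discussed here."""

    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects the decoder state
        self.W2 = tf.keras.layers.Dense(units)  # projects the encoder outputs
        self.V = tf.keras.layers.Dense(1)       # produces one score per source position

    def call(self, query, values):
        # query: decoder hidden state, shape (batch, hidden_size)
        # values: encoder outputs, shape (batch, src_len, hidden_size)
        query_with_time_axis = tf.expand_dims(query, 1)

        # Score each source position against the current decoder state.
        score = self.V(tf.nn.tanh(self.W1(query_with_time_axis) + self.W2(values)))

        # Normalize the scores into attention weights over the source sequence.
        attention_weights = tf.nn.softmax(score, axis=1)

        # Context vector: weighted sum of the encoder outputs.
        context_vector = tf.reduce_sum(attention_weights * values, axis=1)
        return context_vector, attention_weights
```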

But for the moment, it’ll be a simple attention model. We won’t comment (yet) on the more complex models that’ll be discussed…


Train and deploy a PyTorch model in Amazon SageMaker

Photo by Clint Adair on Unsplash

Today, we’ll continue our journey through the fascinating world of natural language processing (NLP) by introducing the operation and use of recurrent neural networks to generate text from a small initial text. This type of problem is known as language modeling and is used when we want to predict the next word or character in an input sequence of words or characters.

But in language-modeling problems, what matters is not only which words are present but also their order, that is, where they appear in the text sequence. …
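To make the idea concrete, here is a minimal sketch (not the exact model trained and deployed in the post) of a character-level language model in PyTorch, where an LSTM keeps track of the order of the characters:

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """A small character-level language model: predict the next character at every step."""

    def __init__(self, vocab_size, embed_dim=64, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The LSTM captures the order in which characters appear in the sequence.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden=None):
        # x: (batch, seq_len) of character indices.
        emb = self.embedding(x)
        out, hidden = self.lstm(emb, hidden)
        # One next-character distribution per position.
        logits = self.fc(out)                  # (batch, seq_len, vocab_size)
        return logits, hidden

# Usage sketch: shift inputs and targets by one character and minimize cross-entropy, e.g.
#   logits, _ = model(x)
#   loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size), y.reshape(-1))
```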


Applying basic NLP techniques for text classification on Tweets: Real or Fake?

In this post we continue to describe some traditional methods to address a Natural Language Processing task: text classification.

This is an easy and fast-to-build text classifier, based on a traditional approach to NLP problems. The steps to follow are (a minimal sketch appears right after the list):

  • describe the process of tokenization
  • build a term-document matrix (using methods such as word counts and TF-IDF) as the numericalization step
  • apply a machine learning classifier to predict or classify a tweet as real or fake.
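For reference, a minimal sketch of that pipeline with scikit-learn could look like the following; the example tweets, labels and hyperparameters are purely illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical data: a list of tweet texts and 0/1 labels (fake / real).
tweets = ["Forest fire near La Ronge Sask. Canada", "I love fruits"]
labels = [1, 0]

# TfidfVectorizer tokenizes the tweets and builds the term-document matrix;
# the classifier then learns from that numerical representation.
classifier = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=1000)),
])

classifier.fit(tweets, labels)
print(classifier.predict(["There is a fire downtown"]))
```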

The blog post and the…



To develop this ML model we will use a very powerful platform from Microsoft Azure: Azure Machine Learning Services. It is a cloud-based framework for designing, training, deploying, and managing machine learning models, and it provides a toolkit to manage and supervise every step of model development.
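As a small, hypothetical example of what working with the service looks like, the following sketch uses the azureml-core Python SDK to connect to a workspace and submit a training script; the workspace configuration, experiment name, script and compute target are placeholders:

```python
from azureml.core import Workspace, Experiment, ScriptRunConfig

# Connect to an existing workspace (expects a config.json downloaded from the portal).
ws = Workspace.from_config()

# Group related runs under an experiment.
experiment = Experiment(workspace=ws, name="example-experiment")

# Describe what to run and where: script and compute names are placeholders.
config = ScriptRunConfig(
    source_directory="./src",
    script="train.py",
    compute_target="cpu-cluster",
)

# Submit the run and stream its logs.
run = experiment.submit(config)
run.wait_for_completion(show_output=True)
```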



A TF Keras convolutional model in an ML competition. Part 1.

Description

Voice recognition software enables our devices to be responsive to our speech. We see it in our phones, cars, and home appliances. But for people with accents — even the regional lilts, dialects and drawls native to various parts of the same country — the artificially intelligent speakers can seem very different: inattentive, unresponsive, even isolating. Researchers found that smart speakers made about 30 percent more errors in parsing the speech of non-native speakers compared to native speakers. Other research has shown that voice recognition software often works better for men than women.

Algorithmic biases often stem from the datasets…


This is part 2 of “Predicting mortgage approvals”. You can find part 1 here.

A description of a machine learning model in Azure Machine Learning Studio

The next step is to develop a predictive model that allows us to determine, with an acceptable degree of confidence, whether a loan application will be accepted or not. This is a two-class classification problem: accepted or not accepted. We will start from a simple approach and then refine the model, avoiding increases in complexity as much as possible.

After the analysis, we can make some decisions:

  • The Row_id variable, as is well known, should be discarded…

Part 1

Image by Alexander Stein from Pixabay

This is part 1 of “Predicting mortgage approvals”; in part 2 I show the classification model in Azure Machine Learning Studio.

A simple and quick EDA for a beginner data scientist in Python

This post is an introductory example of an EDA (exploratory data analysis). Our goal is to explore and analyze the data. Finally, we will design a model to predict whether a mortgage application is accepted or denied, based on the given dataset, which is adapted from Federal Financial Institutions Examination Council (FFIEC) data. The model will be described in the next post.

The data contains variables about the applicants, loan characteristics and amounts, location and population, etc. Basically, we explored, analyzed…
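A minimal sketch of the kind of first-pass exploration described here, using pandas, might look like this; the file name and column names are assumptions, not the actual dataset fields:

```python
import pandas as pd

# Hypothetical file name; the real dataset is adapted from FFIEC data.
df = pd.read_csv("mortgage_applications.csv")

# First look at the structure: shape, column types, missing values, basic statistics.
print(df.shape)
df.info()
print(df.describe(include="all").T)

# How balanced is the target? (the column name 'accepted' is an assumption)
print(df["accepted"].value_counts(normalize=True))

# Missing values per column, sorted from worst to best.
print(df.isnull().sum().sort_values(ascending=False).head(10))
```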

Eduardo Muñoz

An experienced software engineer, a machine learning practitioner and enthusiastic data scientist. Learning every day.
