Estimator API


So far in this course we have been using tf.keras as an API to build our machine learning models; tf.keras is considered the simpler API for building models in TensorFlow 2.0. There is another way to build models in TensorFlow, which is through the Estimator API.

Estimator is TensorFlow's high-level representation of a complete model, and it has been designed for easy scaling and asynchronous training. In this module we will build machine learning models with the TensorFlow Estimator API, using the iris classification problem to demonstrate how this is done.

Let us begin by importing TensorFlow and the other libraries that we need. We will mainly use Pandas for manipulating structured data; we have used Pandas in some of our earlier exercises as well. Now that we have the required packages, let us get into building the model with the TensorFlow Estimator API.
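The import step might look like the following sketch, assuming a TensorFlow 2.x environment in which `tf.estimator` is still available (it has been removed from the newest TensorFlow releases):

```python
# TensorFlow for the Estimator API, Pandas for the structured iris data.
import tensorflow as tf
import pandas as pd

# Printing versions is a quick sanity check of the environment.
print(tf.__version__)
print(pd.__version__)
```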

In this exercise we will build and test a model for classifying iris flowers into three different species based on the size of their sepals and petals. The iris dataset has four features and one label. The four features capture the following botanical characteristics of individual iris flowers: sepal length, sepal width, petal length, and petal width. So there are four feature columns, and there are three species in our dataset, Setosa, Versicolor, and Virginica; we have to classify each incoming flower into one of these three categories.

A flower is thus represented by four features, so the input file has five columns: sepal length, sepal width, petal length, petal width, and the name of the species. Let us download and parse the iris data using Keras and Pandas: we fetch the training and test files and read each CSV into a Pandas data frame.
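The download-and-parse step might look like this sketch; the URLs and column names come from the public TensorFlow iris tutorial, not from the transcript, so treat them as assumptions:

```python
import pandas as pd
import tensorflow as tf

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']
SPECIES = ['Setosa', 'Versicolor', 'Virginica']

# tf.keras.utils.get_file downloads each file once and caches it locally.
train_path = tf.keras.utils.get_file(
    "iris_training.csv",
    "https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv")
test_path = tf.keras.utils.get_file(
    "iris_test.csv",
    "https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv")

# The CSVs carry a non-standard header row, so we skip it and supply our own names.
train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)
test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)
```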

The training data is saved in the train data frame and the test data in the test data frame. Let us examine the first five examples of the training data: you can see that we have four feature columns, and the last column is the label that we want to learn. Here we want to learn the mapping from these four features to the species.

Let us pop the species column, which is the label, out of the data frame and store it in train_y; we do the same for the test data and store the result in test_y. Looking at the first five rows of the training data again, you can see that the species column has been removed because of the pop command.
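The effect of `pop` can be seen on a tiny made-up frame (the values here are illustrative, not from the real dataset):

```python
import pandas as pd

df = pd.DataFrame({
    'SepalLength': [6.4, 5.0],
    'SepalWidth':  [2.8, 2.3],
    'PetalLength': [5.6, 3.3],
    'PetalWidth':  [2.2, 1.0],
    'Species':     [2, 1],
})

# pop removes the column from the frame and returns it as a Series,
# separating the labels from the features in one step.
labels = df.pop('Species')

print(list(df.columns))  # the Species column is gone
print(list(labels))      # [2, 1]
```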

Everything else remains the same. To use an estimator there are three steps: create one or more input functions that define how the data will be fed to the estimator; define the feature columns; and instantiate an estimator, specifying the feature columns and various hyperparameters, then call the appropriate method on the estimator object.

Let us understand how these steps are implemented for iris classification. First, let us create the input function. An input function is a function that returns a tf.data.Dataset object which outputs a two-element tuple: features and labels. Let us look at what the dataset object looks like.

The features are a dictionary mapping each feature name to a list of values. Here, for sepal length there are two values, 6.4 and 5.0; features are defined similarly for sepal width, petal length, and petal width with their respective values; and the label is an array with exactly two values, 2 and 1. So this particular input corresponds to two flowers, and the input function returns the feature dictionary and the label array.
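The two-flower batch described above can be written out literally; this is the shape of the tuple the input function must yield. The sepal length values and the labels are from the lecture, while the remaining values are illustrative:

```python
import numpy as np

# A batch of two flowers: a dict mapping feature names to value arrays,
# paired with an array of integer class labels.
features = {
    'SepalLength': np.array([6.4, 5.0]),
    'SepalWidth':  np.array([2.8, 2.3]),
    'PetalLength': np.array([5.6, 3.3]),
    'PetalWidth':  np.array([2.2, 1.0]),
}
labels = np.array([2, 1])

batch = (features, labels)
```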

This is exactly what the dataset object contains. To keep things simple, we load the data with Pandas and build an input pipeline from this in-memory data. The Dataset API is very powerful: it can read records from a large collection of files in parallel and join them into a single stream. For the iris dataset, however, this functionality is not required, so we will take the Pandas data frame and create the dataset using the from_tensor_slices function.

We take the dictionary of features and the array of labels to create a dataset object. During training we shuffle the dataset, and we return it batched at the specified batch size. Next we define feature columns corresponding to the features; since all the features here are numeric, we use numeric feature columns.
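The input function and feature columns described above might be sketched as follows (function and variable names are my own; `tf.feature_column` is assumed to be available, as in pre-2.16 TensorFlow):

```python
import tensorflow as tf

def input_fn(features, labels, training=True, batch_size=32):
    """Build a tf.data.Dataset yielding (feature-dict, label) batches."""
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    if training:
        # Shuffle and repeat only while training.
        dataset = dataset.shuffle(1000).repeat()
    return dataset.batch(batch_size)

# One numeric feature column per feature, since all four features are numeric.
feature_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
my_feature_columns = [tf.feature_column.numeric_column(key=k) for k in feature_names]
```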

Now that we have defined our input function and created the feature columns, the next task is to instantiate an estimator. There are several premade classifier estimators defined in TensorFlow. DNNClassifier is used for deep models performing multiclass classification. DNNLinearCombinedClassifier is used for wide-and-deep models: the wide part works on a large number of features, such as a very large one-hot encoding, while the deep part works with features that come from embeddings.

So DNNLinearCombinedClassifier is used for wide-and-deep models, and LinearClassifier is based on a linear model. For the iris problem we will use DNNClassifier, which performs multiclass classification. Let us see how to instantiate this estimator. Here we define a DNNClassifier with two hidden layers of 30 and 10 nodes respectively. We also specify that there are three classes, so that the corresponding output layer can be constructed, and we pass our feature columns as input to the DNNClassifier.
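The instantiation described above might look like this sketch, assuming `tf.estimator` is available in the installed TensorFlow version:

```python
import tensorflow as tf

feature_names = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
my_feature_columns = [tf.feature_column.numeric_column(key=k) for k in feature_names]

# Two hidden layers of 30 and 10 units; 3 output classes, one per species.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    hidden_units=[30, 10],
    n_classes=3)
```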

Let us instantiate the DNNClassifier and train the model by calling the estimator's train method, where we specify the input function and the number of steps for which the training loop should run. Once the model is trained, we evaluate it with the test data: we use the same input function, but instead of the training data we pass the arguments corresponding to the test data.

We also set training to False, as opposed to True at training time. You can see that we achieve a test accuracy of 66 percent on iris classification. Let us now use this model for making predictions on unseen data, that is, for inference. Here we first specify the expected outputs and then the features.
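Putting the pieces together, training and evaluation might look like the sketch below. To stay self-contained it runs only a handful of steps on a tiny made-up sample rather than the downloaded files, so its accuracy number is not meaningful; the 66 percent above came from the real dataset:

```python
import pandas as pd
import tensorflow as tf

# Tiny illustrative stand-ins for the real train/test data frames.
train = pd.DataFrame({
    'SepalLength': [5.1, 5.9, 6.9, 6.4, 5.0, 5.5],
    'SepalWidth':  [3.3, 3.0, 3.1, 2.8, 2.3, 2.6],
    'PetalLength': [1.7, 4.2, 5.7, 5.6, 3.3, 4.4],
    'PetalWidth':  [0.5, 1.5, 2.1, 2.2, 1.0, 1.2],
    'Species':     [0, 1, 2, 2, 1, 1],
})
test = pd.DataFrame({
    'SepalLength': [5.7, 6.2, 5.0],
    'SepalWidth':  [2.8, 2.9, 3.4],
    'PetalLength': [4.1, 4.3, 1.5],
    'PetalWidth':  [1.3, 1.3, 0.2],
    'Species':     [1, 1, 0],
})
train_y = train.pop('Species')
test_y = test.pop('Species')

def input_fn(features, labels, training=True, batch_size=4):
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    if training:
        dataset = dataset.shuffle(100).repeat()
    return dataset.batch(batch_size)

columns = [tf.feature_column.numeric_column(k) for k in train.columns]
classifier = tf.estimator.DNNClassifier(
    feature_columns=columns, hidden_units=[30, 10], n_classes=3)

# Train for a few steps, then evaluate on the (finite, unshuffled) test set.
classifier.train(input_fn=lambda: input_fn(train, train_y, training=True), steps=20)
result = classifier.evaluate(input_fn=lambda: input_fn(test, test_y, training=False))
print(result['accuracy'])
```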

The feature vectors are specified as a dictionary where each key is the name of a feature and its value is a list of values. Here we specify three examples, with the values of each feature given in a list. For example, the sepal lengths are 5.1, 5.9, and 6.9 for the three examples, and the sepal widths are 3.3, 3.0, and 3.1.

So the flower with a SepalLength of 5.1 has a SepalWidth of 3.3, a PetalLength of 1.7, and a PetalWidth of 0.5. This is how to interpret an example, though it is specified in a transposed form. We define an input function that builds a dataset from tensor slices, pass predict_x as a dictionary to it, and apply the estimator's predict method to the resulting dataset.
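The prediction step might be sketched as follows. The three examples and their transposed layout are as described above; note that the classifier created here is freshly initialized rather than trained (the lecture uses the trained one), so the printed predictions are arbitrary and only the mechanics are illustrated:

```python
import tensorflow as tf

SPECIES = ['Setosa', 'Versicolor', 'Virginica']
expected = ['Setosa', 'Versicolor', 'Virginica']

# Three examples in transposed form: one list of values per feature.
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth':  [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.7],
    'PetalWidth':  [0.5, 1.5, 2.1],
}

def pred_input_fn(features, batch_size=32):
    # No labels at inference time: the dataset yields feature dicts only.
    return tf.data.Dataset.from_tensor_slices(dict(features)).batch(batch_size)

columns = [tf.feature_column.numeric_column(k) for k in predict_x]
classifier = tf.estimator.DNNClassifier(
    feature_columns=columns, hidden_units=[30, 10], n_classes=3)

predictions = list(classifier.predict(input_fn=lambda: pred_input_fn(predict_x)))
for pred, exp in zip(predictions, expected):
    class_id = pred['class_ids'][0]
    prob = pred['probabilities'][class_id]
    print(f"Predicted {SPECIES[class_id]} ({prob:.1%}), expected {exp}")
```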

Let us run this and see what predictions come out; we will look at the predictions and the expected results, and we will also print the probabilities. We can see that the first prediction is Setosa, where the actual label was also Setosa, and this prediction is made with quite good confidence of 82 percent.

The second prediction is Virginica, where the actual label was Versicolor, but you can see that the probability of this prediction is less than 50 percent. In the third case the prediction is Virginica, which matches the actual label of Virginica, with a probability of 60.5 percent.

In this module we learnt how to use the tf.estimator API and applied it to iris classification. We saw that using the tf.estimator API involves three main steps: we have to specify one or more input functions, we have to define the feature columns, and we have to instantiate an estimator with the appropriate configuration.

In the next session we will use the tf.estimator API to build a linear model. Hope to see you in the next session.
