Google Professional Data Engineer – TensorFlow and Machine Learning Part 4
August 2, 2023

7. Tensors

The concepts of the rank and the shape of a tensor are both really important ones. So here is a question that I’d like you to keep in mind as we progress through this video. The rank of a tensor is an integer; for instance, a scalar has rank zero, and so on. But the shape of a tensor is what? What, for instance, is the data type of the shape of a scalar? For that matter, what is the data type of the shape of a vector? While discussing computation graphs in TensorFlow, it’s important for us to understand the distinction between building a graph and running a graph. These are two distinct steps in any TensorFlow program, and it’s important for us to realize what’s going on in each of them.

As the name would suggest, the process of building a graph is all about defining the computation graph: specifying the operations and the data, in a sense defining the edges and the vertices which constitute that graph. But as with other tools where transformations are modeled using DAGs, the step of actually running a graph is separate. This involves actually executing the graph to get a final output. This idea of a sharp distinction between building a data representation and executing that data representation is once again not new or unique to TensorFlow. Consider, for instance, how RDDs are evaluated in Spark. RDDs, or Resilient Distributed Datasets, know their lineage.

They know the exact set of transformations which resulted in that particular RDD coming into being. And so RDDs are lazily evaluated, and only as a result of an action, not as a result of a transformation. In any case, let’s now double-click a little bit and understand in detail what building a graph and running a graph entail. As an aside, you should be aware that TensorBoard, which is a great visualization facility available in TensorFlow, helps with both of these parts of the computation process: both the shape of the computation graph and the intermediate and final results can be viewed in TensorBoard. We’ll check out TensorBoard when we get to the demos.
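Before that, here is a minimal sketch of the build and run steps, using the session-based TensorFlow 1.x API that this course works with; the constants and node names are made up for illustration:

```python
import tensorflow as tf

# Build step: this only defines the graph. No addition happens here;
# a, b, and total are just nodes and edges in the computation graph.
a = tf.constant(3.0, name='a')
b = tf.constant(4.0, name='b')
total = tf.add(a, b, name='total')

# Run step: executing the graph inside a session produces actual values.
with tf.Session() as sess:
    print(sess.run(total))  # 7.0
```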

Let’s now turn our attention to lazy evaluation and the distributed nature of computation graphs. To understand how this is possible, let’s zoom in on a couple of nodes. Virtually any node in a directed acyclic graph can be thought of as the output node of a smaller directed acyclic graph. That smaller directed acyclic graph consists of all of the dependencies of that output node. Now, given two different unrelated nodes in the larger DAG, each of these can be evaluated separately. Clearly, there is no intersection between the dependencies of these two nodes, and this serves as an indication to us that the computation of these two smaller subgraphs can be parallelized.

Another implication of this is that there is no need to calculate the entire DAG simply for this one output result. TensorFlow can easily figure out and calculate only that portion of a DAG which is required for the evaluation of a particular node, exactly as in other dataflow technologies like Spark, or Dataflow on the Google Cloud Platform. Lazy evaluation and distributed processing go hand in hand. As soon as one is able to create logically distinct subunits of our original DAG, these can then be evaluated in a lazy manner. And what’s more, that evaluation can be distributed: it can be performed in parallel across machines in a cluster. So keep in mind both of these aspects of TensorFlow’s performance. It relies on lazy evaluation to evaluate only that subpart of a directed computational graph which is required for a particular output, and, in addition, it will distribute the calculation of the computation graph over many machines in a cluster.
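Here is a minimal sketch of the lazy, partial evaluation just described, again in TensorFlow 1.x style, with made-up node names:

```python
import tensorflow as tf

x = tf.constant(2.0)
y = tf.constant(5.0)
doubled = tf.multiply(x, 2.0)  # depends only on x
squared = tf.multiply(y, y)    # depends only on y

with tf.Session() as sess:
    # Requesting 'doubled' evaluates only its dependency subgraph (x);
    # the unrelated 'squared' branch is never computed in this call.
    print(sess.run(doubled))  # 4.0
```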

Let’s now double-click on the term tensor. Clearly, tensors are extremely important in TensorFlow; that’s how TensorFlow got its name. Let’s understand what they are. Tensors are the edges in our computation graph. They connect the nodes, which represent different data transformations. At heart, tensors are just data items. Let’s look at the docs. The docs tell us that a tensor is the central unit of data in TensorFlow. Indeed, all data items in TensorFlow are tensors, and tensors form the edges of the computation graph.

Now, a tensor consists of a set of primitive values shaped into an array of any number of dimensions. Let’s parse this; every word here is important. The primitive values could be ints or floats or any other basic types, i.e., not containers. These primitive values have been collected into some container, which is an array. And that array could have any number of dimensions and any shape. Let’s understand all of the properties of tensors. Scalars can be thought of as zero-dimensional tensors. So, for instance, the integer 3, the double 6.7, the single character ‘a’: each of these represents a zero-dimensional tensor, which is a scalar. We can extend this to vectors.

Vectors are one-dimensional tensors. These are defined by one square bracket at each end. This is important: vectors are defined by one pair of square brackets, that is, by one square bracket at each end. Let’s extend this and increase the number of dimensions. For instance, a matrix can be thought of as a two-dimensional tensor. It has rows and columns. In order to define a matrix, you would use two square brackets at each end. So matrices are two-dimensional tensors, which are defined by two pairs of square brackets. Generalizing, the number of dimensions is equal to the number of pairs of square brackets.
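To make the bracket-counting rule concrete, here is a small, hypothetical set of constants of increasing rank:

```python
import tensorflow as tf

scalar = tf.constant(3)                       # rank 0: no square brackets
vector = tf.constant([1, 2, 3])               # rank 1: one pair of brackets
matrix = tf.constant([[1, 2], [3, 4]])        # rank 2: two pairs of brackets
cube = tf.constant([[[1], [2]], [[3], [4]]])  # rank 3: three pairs of brackets
```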

So n-dimensional tensors require n pairs of square brackets to open and close the tensor’s data, respectively. Tensors have defining characteristics; in fact, they have three defining characteristics. These are the rank, which is the number of dimensions in a tensor; the shape, which is the number of elements along each dimension; and lastly, the data type, which represents the type of each element in the tensor. The type is pretty simple, but let’s understand the rank a little more clearly. We’ve already seen that the rank of a tensor can be thought of as the number of pairs of square brackets that you would use while defining the tensor.

So scalars are tensors of rank zero, vectors are tensors of rank one, and matrices are tensors of rank two. Tensors which have a rank higher than two are just called n-dimensional. The shape is a property of a tensor which tells us how many elements exist along each dimension. Now, in reality, there can be a couple of types of shapes, the static and the dynamic shape, which correspond to the maximum possible number of elements in each dimension and the current number of elements in each dimension, respectively. It’s also important to note that the shape itself is a vector. It’s going to have one element corresponding to each dimension in the tensor.

So, for instance, the shape vector of a scalar is the empty vector: just a pair of square brackets. The shape vector of a vector is going to consist of just one element, that is, the number of elements in the vector. The shape vector of a matrix is going to consist of two elements, that is, the number of elements along each dimension of the two-dimensional matrix. And for an n-dimensional tensor, the shape vector will have n elements. That’s as far as the shape goes. The last defining characteristic is the data type. The data types of tensors are pretty familiar: ints, floats, strings, booleans, all of the usual suspects. So do keep in mind that every tensor is a multidimensional array of primitive values.
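Pulling these characteristics together in code, with hypothetical tensors; note how the placeholder’s static shape leaves a dimension unknown, while tf.shape reports the dynamic shape once data is fed in:

```python
import tensorflow as tf
import numpy as np

scalar = tf.constant(3)
vector = tf.constant([1.0, 2.0, 3.0])
matrix = tf.constant([[1, 2, 3], [4, 5, 6]])
p = tf.placeholder(tf.float32, shape=[None, 3])  # static shape: (?, 3)

with tf.Session() as sess:
    print(sess.run(tf.shape(scalar)))  # []    -- the empty shape vector
    print(sess.run(tf.shape(vector)))  # [3]
    print(sess.run(tf.shape(matrix)))  # [2 3]
    print(matrix.dtype)                # <dtype: 'int32'>
    # The dynamic shape is only known once actual data is supplied.
    print(sess.run(tf.shape(p), feed_dict={p: np.zeros((5, 3))}))  # [5 3]
```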

Every tensor has three defining characteristics, which are the rank, the shape, and the data type. Let’s come back to the question we posed at the start of the video. The rank of a tensor is an integer; this is something we saw. The shape of a tensor is a vector; this also is something that we saw. For instance, consider a scalar. A scalar has rank zero, and the shape vector of a scalar is going to be an empty list or an empty vector. Consider a two-dimensional matrix: there, the rank is two, and the shape vector has two numbers within a list, because the shape vector tells us how many elements are present in each dimension. So again, the shape of a tensor is represented by a vector.

8. Lab: Tensors

At the end of this lecture, you should be able to tell me the rank of a tensor which is just a list of numbers, or a vector. In the last demo, we used constants to build up our computation graph. In this demo, we’ll use tensors. Tensors are n-dimensional matrices made up of primitive values such as integers, floats, booleans, strings, et cetera. A tensor is the central unit of data within a TensorFlow program. It flows through the nodes, which act on it and produce new tensors. Tensors are the edges in our TensorFlow computation graphs. The code for this demo is in the simple math with tensors IPython notebook.

To start off any TensorFlow program, we import the TensorFlow library and alias it as tf. One thing to note here is that Datalab comes with TensorFlow installed; there is no additional installation step that you need to perform before you use TensorFlow. We set up two constants, x and y, but the values for these constants are one-dimensional arrays, or vectors. These are 1D tensors. The rank of a tensor is equal to the number of dimensions that it has, so each of these is a tensor of rank one. Both these constants are tensors of rank one. TensorFlow has math operations that can operate on the entire tensor in one go.

For example, the reduce_sum operation will sum up all the individual primitives within our x tensor. Similarly, the reduce_prod operation will find the product of all the primitives within the y tensor. final_div is the next computation that we specify. Notice that the inputs to final_div are the outputs of sum_x and prod_y: we divide sum_x by prod_y. The reduce_mean operation finds the average value of the primitives in the tensor specified. Here we find the average of sum_x and prod_y.
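Reconstructed as a sketch, the graph probably looks something like this; I am taking the node names sum_x, prod_y, and final_div from the narration, and the constant values are stand-ins for whatever the notebook actually uses:

```python
import tensorflow as tf

x = tf.constant([100, 200, 300], name='x')
y = tf.constant([1, 2, 3], name='y')

sum_x = tf.reduce_sum(x, name='sum_x')     # 600: sum of all elements of x
prod_y = tf.reduce_prod(y, name='prod_y')  # 6: product of all elements of y
final_div = tf.div(sum_x, prod_y, name='final_div')  # 600 / 6 = 100
# reduce_mean averages the two scalar results: (600 + 6) / 2
final_mean = tf.reduce_mean([sum_x, prod_y], name='final_mean')
```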

Now that we’ve built our computation graph, let’s instantiate a session so that we can run this graph. You can also instantiate a session by simply assigning it to a variable, in this way. This is not recommended, though, because it’s very easy to forget to close the session, as I have done in this program. In TensorFlow, in order to find the value of any computation, you need to call session.run and pass in that computation node. Only then will you get the results. Let’s say you were to execute print x, or print y, or print sum_x. That will give you the metadata of the tensor or the computation that you performed: in the case of a tensor, it’ll give you its rank, perhaps, its shape, its data type, and so on. It won’t give you the value of the tensor. To get the value of a tensor, you have to compute it using session.run. This is true even if the tensors are just constants.
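A short sketch of the difference, with a made-up constant:

```python
import tensorflow as tf

x = tf.constant([100, 200, 300], name='x')
sum_x = tf.reduce_sum(x)

# Printing a node shows only metadata: its name, shape, and data type.
print(sum_x)  # e.g. Tensor("Sum:0", shape=(), dtype=int32)

# Running the node in a session yields its actual value.
with tf.Session() as sess:
    print(sess.run(sum_x))  # 600
```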

Run these lines of code and you’ll see the results printed on screen. There are the values of the constants x and y, and of sum_x and prod_y. Notice that sum_x sums up all the individual elements of x, while prod_y multiplies all the individual elements of y. We want to view this computation graph with tensors, so we write it out using tf.summary.FileWriter, this time to a different directory: simple math with tensors.
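The write-out step presumably looks something like this sketch; the directory name follows the narration, so adjust it to your own setup:

```python
import tensorflow as tf

with tf.Session() as sess:
    # Write the graph definition to a log directory for TensorBoard.
    writer = tf.summary.FileWriter('./simple_math_with_tensors', sess.graph)
    # ... sess.run(...) calls go here ...
    writer.close()
```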

Run this bit of code, and it will show you where TensorBoard has been launched; you can click on that link to view your TensorBoard. And here, under the Graphs tab, is the computation graph for this TensorFlow program. Notice that we added tensors and a few more operations, such as reduce_sum, reduce_mean, and so on, and the graph has already grown in complexity. If you look closely at this graph, some of the nodes have been shaded in gray and have a plus sign within them. That’s because each such node represents a logical grouping of several computations. You can set up this logical grouping using something called named scopes. We won’t cover that in depth right here, but you should know that it exists, and that the computations within named scopes are grouped into one node, as the sketch below illustrates.
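A quick illustrative sketch of a named scope, with made-up names:

```python
import tensorflow as tf

with tf.name_scope('aggregates'):
    # Everything defined inside this block collapses into a single
    # expandable 'aggregates' node in TensorBoard's graph view.
    x = tf.constant([1, 2, 3])
    sum_x = tf.reduce_sum(x)
    mean_x = tf.reduce_mean(x)
```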

You can expand this node and see what the individual computations are. Another thing that might have struck you here is that we didn’t explicitly compute the rank of any tensor, so how does this rank operation show up? The reduce_mean operation, under the hood, needs to find the rank of the individual matrices, or tensors, in order to compute the mean. That’s where this rank comes from. And the rank of a tensor which is represented as a list, a single-dimensional array, or a vector, is exactly one: the number of dimensions is equal to the rank of a tensor.
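You can verify that claim directly with a tiny sketch; the vector here is made up:

```python
import tensorflow as tf

v = tf.constant([10, 20, 30])  # a vector: a single list of numbers
with tf.Session() as sess:
    print(sess.run(tf.rank(v)))  # 1 -- one dimension, so rank one
```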

9. Linear Regression Intro

Here is a question that I would like us to keep in mind as we go through the contents of this video. Linear regression is which of the following: a machine learning algorithm, a representation learning algorithm, a deep learning algorithm, or none of the above? We now have a decent understanding of the fundamentals of TensorFlow, so let’s dig a little deeper and take a look at how TensorFlow can be used to actually build a machine learning model. Here we will work through the logic of a specific machine learning problem. We’ll start with linear regression, which is really simple, and see how the TensorFlow approach to regression is pretty different from other cookie-cutter approaches available in Python or R.

We will also understand the ideas of placeholders, feed dictionaries, and variables. We’ll run programs with different inputs, making use of these concepts, and then get to the really crucial idea of variables to hold values which our TensorFlow program is going to update during the training process. And along the way, we’ll also use TensorBoard to visualize our computation graph, and we’ll see how this can be done in a nice fashion by making use of named scopes. Let’s start with linear regression. We use linear regression as an example because it is maybe the simplest machine learning problem out there. The basic idea of regression is to measure or quantify causal relationships.

Given a statement like “X causes Y”, we have two variables, X and Y. X is the causal, or independent, variable; Y is the effect, or dependent, variable. This relationship between X and Y is going to be quantified in the form of a straight line; that’s the whole point of linear regression. Take an idea like this one: that wealth increases life expectancy. This is a hypothesis which we might want to test. Here we are saying that the cause is the wealth of individuals, and the effect is how long they are expected to live. Linear regression would be used in a situation where this hypothesis is true and where the link is linear.

So increasing the wealth by one unit, whatever that unit is, will also cause the life expectancy to increase by a proportionate constant. This linear relationship between cause and effect is captured by linear regression. In a little bit, we’ll also discuss other types of regression, such as logistic regression. Another example of regression could be a hypothesis that home prices fall as one moves away from a city center. So the causal variable here is the distance in miles from the city center, and the effect will be the price per square foot of a home. And once again, linear regression only ought to be used if we already have a reason to believe that there is indeed a linear relationship between X and Y.

In general, the idea is that X, the cause or the explanatory variable, does not depend on Y, but Y depends on X. And that’s why we will be able to represent these in a two-dimensional frame, assuming that there is only one X variable: we can do so by plotting the X variable on the x-axis and the Y variable on the y-axis. Notice that this is only possible in two dimensions if we have only one X variable. In multiple linear regression, we will need more than two axes; that is called multiple regression. In any case, let’s now see how a linear relationship would be found given data for a large number of homes: given information about how far from the city center they are, and also about their price per square foot.

Linear regression is the procedure, the algorithm, which will try to find the best straight line to pass through all of these points; we’ll get back to the definition of “best” in a moment. The key insight here is that that regression line is a line: it has an equation of the form y = a + bx. And hopefully this immediately signals to us that this is a use case for machine learning, because, after all, the regression equation, the values of those constants a and b, are going to change on the basis of the data points that we consider. Any machine learning algorithm changes its output, changes its parameters, based on the corpus of training data.

And our regression line satisfies that condition. This is why regression is indeed the simplest example of machine learning possible. It also turns out that linear regression is an example of the simplest function that can be learned by just one neuron; we’ll get back to that idea in a little bit. Let’s keep going with a brief exploration of simple regression. In simple regression, we need to find a line of the form y = a + bx. Given the x and y coordinates of all of these points, in an ideal world we would want to find just two constants, a and b, such that all of the equations on screen now are perfectly satisfied.

In other words, in a case of perfect linear regression, the same two constants a and b ought to be enough to predict the value of y given any value of x. That’s what this set of equations represents, and this boils down to finding the best regression line. Now, the question that’s sure to arise is: what is the best line? What is the best-fitting line? To make this real, let’s consider a pair of lines: line one, y = a1 + b1x, and line two, y = a2 + b2x. Let’s compare these two lines. Visually, it’s pretty obvious that line two is not a good fit; it’s nowhere near the points which we are seeking to fit.

But the question is, how do we mathematically express that intuition? How do we know mathematically that line two is not as good as line one? This is done using a procedure called minimizing the least square error. But before we talk about this estimation method for finding the best values of a and b, I’d like us to first focus on the significance of the constants a and b in each of these lines. You might remember from high school coordinate geometry that it’s possible to express the equation of a line in something known as the slope-intercept form. That’s exactly what we’ve done. Line one has y-intercept a1, which measures the length, or the distance, between the origin and the point where line one meets the y-axis.

That explains the significance of the constant a. Let’s understand the significance of the other constant, b. Given any pair of points on this line, what the second constant b tells us is: if x increases by one unit, how much is y going to change by? Notice here that these lines are downward sloping, and so we can interpret the slope as follows: we can state that if x increases by one unit, y is going to decrease by b1 units. We can go ahead and calculate this exact same set of quantities for the second line, using the constants a2 and b2. Okay, let’s now turn our attention to defining the best-fitting line.

And it turns out that the best-fitting line is the one which minimizes the least square error. The least square error of what, you ask? The answer is: the least square error of the residuals. Given a set of data points, the residuals are represented by those dashed lines which you now see on screen. These are the differences between the actual and the fitted values for each point. So given any candidate line, whether it’s line one or line two, we can find the residuals with respect to that line by dropping vertical dashed lines from each point to the corresponding line. The best-fitting regression line is the one which minimizes the sum of the squares of all of these dashed lines.

And this now helps us to quantify our intuition. It’s pretty obvious that the residuals, or the dashed lines, to the second line, line two, are really long; they are much longer than those to line one. And this explains why line one is the better-fitting line: it minimizes the sum of the squares of the lengths of all of these dashed lines. This is the theory of linear regression. Those dashed lines, which are the residuals, need to be as small as possible, and that sum of the squares is called the error. Now, this is bringing us very close to a machine learning implementation, because what we would like to do is set up a machine learning algorithm which calculates the values of a and b such that the error is minimized.

The way we are going to accomplish this is by starting with some initial estimates of the values of a and b, then finding the errors of the regression line defined by those estimates for our data points, and then feeding those errors back, reducing them by updating the values of a and b. And when this process ends, we will have found the best-fitting line using a machine learning approach.
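Here is a minimal, hypothetical TensorFlow 1.x sketch of that training loop, fitting y = a + bx by gradient descent on the squared error; the toy data, learning rate, and iteration count are all made up:

```python
import tensorflow as tf

# Toy data that roughly follows y = 1 + 2x (made up for illustration).
x_data = [0.0, 1.0, 2.0, 3.0]
y_data = [1.1, 2.9, 5.2, 6.8]

# a and b are variables: values the training process will update.
a = tf.Variable(0.0)
b = tf.Variable(0.0)

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

prediction = a + b * x
# Sum of squared residuals: the quantity least squares seeks to minimize.
loss = tf.reduce_sum(tf.square(y - prediction))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train_step, feed_dict={x: x_data, y: y_data})
    print(sess.run([a, b]))  # should approach roughly [1, 2]
```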

Let’s return to the question we posed at the start of this video. Now, to be really pedantic about this, linear regression is none of the above, because linear regression is a problem rather than an algorithm: it is a problem which seeks to fit a line through a set of points. That said, linear regression can be solved using a machine learning approach, and to that extent it can indeed be represented, or solved, using machine learning. It is not a representation learning algorithm, because representation learning refers to the case where the algorithm picks up on its own which features are important. It’s also not a deep learning algorithm, at least in the sense of the common usage of the term deep learning, because we do not require a multilayer neural network in order to solve it, or to reverse-engineer this problem. So technically, linear regression is a problem, not an algorithm, but it can be solved using a traditional machine learning based approach.
