Course Lesson 2

Course:

Selection of simple algorithms for Machine Learning applications.

Lesson 2: Classifying Data Patterns.

A reminder: if the point that represents your data lies on, or close to (with minimum error), the line that represents the pattern your model analyzes, or if that point sits inside a «cluster» of data that is clearly separate from the other groups, then, my dear friend, you can try to predict the future or reconstruct the past. But please be very careful. For example, do you remember our school graph?

Our model, or machine learning algorithm, is nothing more than our humble mathematical function, the equation of a straight line (Y = 2x + 1 is the line that matches the two values below). If you evaluate it at x = 5, you will know the future, because the value of Y will be 11. If, on the contrary, you evaluate it at x = -1, your model will tell you that the value is (or was) Y = -1, and in this way you will see the past in front of you. When a single independent variable «x» determines a dependent variable «Y», we are in the presence of a simple linear regression, of the form Y = f(x).
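If you prefer to see it in code, here is a minimal sketch of that straight line in Python. The slope 2 and the intercept 1 are assumptions inferred from the two values quoted above (Y = 11 at x = 5 and Y = -1 at x = -1), and the function name `line` is just an illustrative choice.

```python
def line(x, slope=2.0, intercept=1.0):
    """Evaluate the straight line Y = slope * x + intercept.

    The default slope and intercept are assumed values, chosen so that
    the line passes through the two points mentioned in the lesson.
    """
    return slope * x + intercept


print(line(5))   # the "future": 11.0
print(line(-1))  # the "past": -1.0
```

Any other value of x can be evaluated the same way; the "model" is only the rule that turns x into Y.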

Step 1: Let’s continue with more visual examples of data groupings. These points seem to belong together because they are «a family» that shares interests or “labels”, in this case “Blue” and “Red”. Each such group is called a «cluster». We will study why they are grouped like this later in the course. For now, I can only tell you that we will call the algorithms that work with this kind of data “classification algorithms”. Do you remember this image?
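To make the idea of “Blue” and “Red” families concrete, here is a small sketch that assigns a new point to the group whose center is closest. This nearest-center rule is only one simple way to classify, chosen here for illustration; it is not necessarily the method the course will use later. The coordinates are made-up sample data.

```python
import math


def centroid(points):
    """Return the average (x, y) position of a list of 2D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def classify(point, clusters):
    """Return the label of the cluster whose centroid is closest to point."""
    return min(
        clusters,
        key=lambda label: math.dist(point, centroid(clusters[label])),
    )


# Made-up sample data: two well-separated families of points.
clusters = {
    "Blue": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)],
    "Red": [(6.0, 6.0), (7.0, 6.5), (6.5, 7.5)],
}

print(classify((2.0, 2.0), clusters))  # Blue
print(classify((6.0, 7.0), clusters))  # Red
```

When the two families are far apart, as here, the decision is easy. When they overlap, this simple rule starts to make mistakes.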

But sometimes, grouping data is not as easy as it seems:

Step 2: And these other points, which seem to form lines or the graphs of functions we learned about in school, are the kind of data handled by regression algorithms.

Step 3: I am sorry, I forgot to use “a line” as a visual representation of the data. In the future we will become wizards who can see the future; that is to say, this line will represent our «model», and with it we will be able to know the behavior of the data over time, past and future. But, as you can see, with real-world data we will probably never have a «perfect» straight line that joins all the points. The «straight line» is just a mathematical model that helps us do our calculations, because we only have to find the «mathematical function» that generates that line.

Step 4: How do we select the best straight line to try to join all the points? This question will be studied in the next lesson. For now, I only want you to remember the term “error” and a very special procedure called “gradient descent”: the more “error”, the less accurate our model will be. Don’t worry about this for now.
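As a small preview of what “error” can mean, here is a sketch that measures how far a candidate line is from a handful of points. The lesson does not say which error measure it will use; the mean squared error below is one common choice and is an assumption here, as are the sample points and the two candidate lines.

```python
def mean_squared_error(points, slope, intercept):
    """Average of the squared vertical distances between points and a line."""
    total = 0.0
    for x, y in points:
        predicted = slope * x + intercept
        total += (y - predicted) ** 2
    return total / len(points)


# Made-up sample data that roughly follows Y = 2x + 1.
points = [(0, 1.2), (1, 2.9), (2, 5.1), (3, 6.8)]

good = mean_squared_error(points, slope=2.0, intercept=1.0)
bad = mean_squared_error(points, slope=0.5, intercept=3.0)

print(good < bad)  # True: the first line is the better fit
```

The line with the smaller error is the one that passes closer to the points, which is exactly what “best straight line” means in this lesson.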

Step 5: Enjoy this GIF animation, and try to understand how we can select the “best straight line”:

Step 6: Our line will not always be a straight line; it could be a curve. I hope you enjoy cartoons:

Step 7: If you are asking whether we can predict who will win the fight, my answer is: yes, if our model is good. PS: Goku wins!

Step 8: In real life, data sets do not always lie in a 2D plane as neat clusters or regressions; sometimes they look like art paintings!

Step 9: The difference is that here, getting a large majority of the points to be joined by the same graph is almost an impossible mission, so a mathematical method called «gradient descent» is used to minimize the error of leaving out some points, and thus to recreate the best image, that is, our machine learning model.
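For the curious, here is a minimal sketch of gradient descent on the simplest possible case, a straight line, rather than on the complex images above. It starts from a flat line and repeatedly nudges the slope and intercept in the direction that lowers the mean squared error. The error measure, the learning rate, the number of steps and the sample points are all assumptions made for illustration.

```python
def gradient_descent(points, learning_rate=0.01, steps=5000):
    """Fit Y = slope * x + intercept by minimizing the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        grad_slope = 0.0
        grad_intercept = 0.0
        for x, y in points:
            residual = (slope * x + intercept) - y
            # Partial derivatives of the mean squared error.
            grad_slope += 2 * residual * x / n
            grad_intercept += 2 * residual / n
        # Move a small step against the gradient to reduce the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Sample points chosen to lie exactly on Y = 2x + 1.
points = [(-1, -1), (0, 1), (2, 5), (5, 11)]
slope, intercept = gradient_descent(points)
print(round(slope, 3), round(intercept, 3))
```

With real data the points will not sit exactly on a line, so the error never reaches zero; gradient descent simply stops at the line that leaves the least error.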

Step 10: All these points represent the data set with which we «feed» our machine learning model. When two or more independent variables (X, W, Z…N) influence a dependent variable «Y», we call it «Multiple Linear Regression», of the form Y = f(x, w, z…n).
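As a sketch of what Y depending on several variables looks like in code, the function below adds up each independent variable multiplied by its own weight, plus an intercept. The weights, the intercept and the input values are invented for illustration; in a real model they would be learned from the data.

```python
def predict(features, weights, intercept):
    """Multiple linear regression prediction: intercept + sum(weight * feature)."""
    return intercept + sum(w * v for w, v in zip(weights, features))


# Hypothetical model with three independent variables (x, w, z).
weights = (2.0, -1.0, 0.5)
intercept = 1.0

y = predict((3, 2, 4), weights, intercept)
print(y)  # 1 + 2*3 - 1*2 + 0.5*4 = 7.0
```

With only one feature and one weight, this is again the simple straight line from the beginning of the lesson.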

Step 11: Keep in mind that the patterns will not always be clear to our brain. For this reason, it is not always easy to know which algorithm to use on that grouped data (now converted into information) so that our machine learning application can make a simple decision for us. Sometimes models and algorithms are even used that combine both classification and regression methods.

Step 12: But don’t worry; for now, it is important that you remember this simple definition:

Step 13: Let’s look at the simplest ones, which will help us prepare for what’s to come.

Selecting “the best possible machine learning algorithm” is a lot like trying to fit a sports jersey (the algorithm) to a specific type of athlete (the data). Often we have shirts in several sizes, and with a simple glance at the athlete we can select the most appropriate one.

Step 14: Other times it is not so easy because, for computability reasons, we have only a few algorithms available, and the data are very sparse or complex and do not seem to fit those algorithms well. This is why some machine learning algorithms are more effective (and produce less error) than others.
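To see “one shirt fits better than another” in numbers, here is a sketch that tries two candidate models on the same curved data and keeps the one with the smaller error. The data, the two candidate functions and the use of mean squared error are all assumptions made for illustration.

```python
def mean_squared_error(model, points):
    """Average squared difference between the model's output and the data."""
    return sum((y - model(x)) ** 2 for x, y in points) / len(points)


# Made-up sample data that follows a curve, not a straight line.
points = [(-2, 4.1), (-1, 0.9), (0, 0.2), (1, 1.1), (2, 3.8)]

# Two hypothetical "shirts" to try on the same "athlete".
candidates = {
    "straight line": lambda x: 0.0 * x + 2.0,
    "parabola": lambda x: x ** 2,
}

errors = {name: mean_squared_error(f, points) for name, f in candidates.items()}
best = min(errors, key=errors.get)
print(best)  # parabola
```

Here the curve wins because the data are curved; on data that follow a straight line, the result would be the opposite.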

Step 15: In short, let’s go back to our sports shirt example. If we had only one shirt, which athlete do you think it would fit better?

Step 16: Indeed, I think maybe the second player is a little more comfortable 😊

Step 17: Using real data, remember these examples:

Step 18: An interesting detail: later, in the advanced course, you will learn mathematical and statistical techniques and procedures to “adapt or modify our algorithms” for the available data, or to “redistribute the data” so that they can be used with the algorithms we already have available. Imagine, then, that our sports shirt is made of a special material that shrinks or expands, or that our sports hero can go on great diets or gain many kilograms eating delicious foods.

Step 19: From this PowerPoint template, drag and drop onto the grouped data the machine learning algorithm you think is best for each case, and explain why you made that selection. Do not forget that your student identification number must be visible somewhere in the image. You will then have to upload that image to a free image repository or to a page of your blog. Then place the link to your image here, or embed it as a comment, together with your explanation. I also ask you to respond to the comment of one of your classmates, arguing why you agree or disagree with their examples. Use this blog forum to show your work. I recommend that you wait for the response of your facilitator instructor before continuing with the next lesson.

Good luck!

Note for workshop:

https://www.investopedia.com/ask/answers/062215/how-can-i-run-linear-and-multiple-regressions-excel.asp

