Bonus Project – Obesity Predictor – ML Course

Hey guys! I am back with another freshly brewed article for you, and today it is a practical project, so enjoy! This is Manas from csopensource.com, "Your one stop destination for everything computer science". Today we will be building a project on obesity classification.

Prelude

Today I am back with a new project based on a machine learning model. Until now we have only covered the theory behind machine learning models, along with some simple Python code examples. But learning only theory isn't much fun, is it? So today we will take the concepts from the previous posts and put them to good use.

Objective: Obesity Predictor

We will be making an obesity classifier which takes gender, height and weight as inputs and returns a value that corresponds to a class label, i.e. the type of obesity: for example Normal, Overweight, Obese or Extremely Obese. We will build a simple machine learning model that finds the class to which a new observation belongs.

Plan of Attack

For this problem we could use any type of classification model, such as a Support Vector Machine, but I would like to use the K-Nearest Neighbors classifier as we have recently learnt about it! We will load our dataset using the pandas module in Python and then follow the usual machine learning steps: prepare the data, split it, train the model and test it.

Resources

This is where I got the data and the idea: https://www.kaggle.com/yersever/500-person-gender-height-weight-bodymassindex/home

Download the DataSet and Python Code ( Google Drive )

Code

Input Features

  • Gender
  • Height
  • Weight

Output Features

  • Class Value [ label ]

Python Code

Here we import all the necessary modules.

We create 2 helper functions which reduce repetitive code later in the program. One normalizes the training values, i.e. scales them to between 0 and 1 to aid in training, and the other is a switch function that saves us from writing multiple if-else statements.
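A minimal sketch of what these two helpers might look like, assuming pandas. The names `normalize`, `switch` and `LABELS` are hypothetical, and the label names are an assumption based on the Kaggle dataset description.

```python
import pandas as pd


def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Min-max scale each column into [0, 1]: (x - min) / (max - min)."""
    # A column holding one repeated value would divide by zero and yield NaN.
    return (df - df.min()) / (df.max() - df.min())


# Assumed names for the six class values (0 to 5).
LABELS = {
    0: "Extremely Weak",
    1: "Weak",
    2: "Normal",
    3: "Overweight",
    4: "Obesity",
    5: "Extreme Obesity",
}


def switch(value, cases, default="Unknown"):
    """Dictionary lookup standing in for a chain of if-else statements."""
    return cases.get(value, default)


demo = pd.DataFrame({"Height": [150, 175, 200], "Weight": [50, 80, 110]})
print(normalize(demo))    # every column now runs from 0.0 to 1.0
print(switch(4, LABELS))  # looks up the name for class 4
```

A dictionary lookup is the usual Python stand-in for a switch statement, and the `default` argument covers any value the model is not expected to return.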

Next, we import the dataset from our current directory with the separator specified as ',' since it is a .csv file.
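Loading could look roughly like this. So that the snippet runs without the download, it reads a few illustrative rows from an in-memory string instead of the real file; the column names Gender, Height, Weight and Index follow the post, while `SAMPLE_CSV` is a hypothetical stand-in.

```python
import io

import pandas as pd

# Illustrative rows in the same four-column layout (not the full dataset).
SAMPLE_CSV = """Gender,Height,Weight,Index
Male,174,96,4
Male,189,87,2
Female,185,110,4
Female,195,104,3
Male,149,61,3
"""

# With the downloaded file, pass its path instead of the StringIO object.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), sep=",")
print(df.head())
```

A comma is already the default separator for `read_csv`, so `sep=","` only makes the choice explicit.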

Here we convert the gender into numbers { 1: 'Male', 0: 'Female' } to help in training, as ML models can't process strings directly.
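One way to do this conversion is the pandas `map` method. This is a sketch on a tiny hand-made frame, and the variable name `gender_codes` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Height": [174, 185, 195, 149],
    "Weight": [96, 110, 104, 61],
})

gender_codes = {"Male": 1, "Female": 0}
df["Gender"] = df["Gender"].map(gender_codes)

# Any spelling missing from the mapping becomes NaN, so check for that.
assert df["Gender"].notna().all()
print(df)
```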

Next we generate the X (features) and Y (labels) variables from the dataset. Notice that X uses every column except the Index column, while Y uses only the Index column, which holds the class label for each row. This is how we create our features and labels.
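In pandas this split is two lines. Here is a sketch on a small hand-made frame in which the gender has already been converted to numbers.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": [1, 0, 0, 1],
    "Height": [174, 185, 195, 149],
    "Weight": [96, 110, 104, 61],
    "Index": [4, 4, 3, 3],
})

X = df.drop(columns=["Index"])  # features: everything except the label column
y = df["Index"]                 # labels: only the Index column

print(X.columns.tolist())
print(y.tolist())
```

`drop` returns a new DataFrame, so `df` itself still contains the Index column afterwards.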

Our data contains values in several different ranges: gender is either 0 or 1, while height and weight are much larger numbers. To make it easier for our model to process, we need to scale every feature to between 0 and 1. To do this we use the helper function that we defined before as our scaler. We pass in the DataFrame variable, in this case X, and scale it between 0 and 1.

Notice that we don't do this for the Y values, as they are merely labels and are not involved in the actual training computation.
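To see why scaling matters for a distance-based model like K-Nearest Neighbors, here is a small sketch comparing distances before and after scaling. The feature ranges in `mins` and `maxs` are assumed values chosen for illustration, not taken from the dataset.

```python
import numpy as np

# Assumed (min, max) ranges for Gender, Height and Weight.
mins = np.array([0.0, 140.0, 50.0])
maxs = np.array([1.0, 200.0, 160.0])


def scale(row):
    """Min-max scale one observation using the assumed ranges."""
    return (np.asarray(row, dtype=float) - mins) / (maxs - mins)


def dist(a, b):
    """Euclidean distance, the default metric for nearest neighbors."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))


query = [1, 170, 60]
same_gender_heavier = [1, 170, 90]   # same gender, 30 units heavier
other_gender_similar = [0, 170, 62]  # other gender, 2 units heavier

raw_a = dist(query, same_gender_heavier)
raw_b = dist(query, other_gender_similar)
scaled_a = dist(scale(query), scale(same_gender_heavier))
scaled_b = dist(scale(query), scale(other_gender_similar))

print(f"raw:    {raw_a:.3f} vs {raw_b:.3f}")
print(f"scaled: {scaled_a:.3f} vs {scaled_b:.3f}")
```

On the raw values the weight difference dominates and the gender column barely counts. After scaling, every feature contributes on the same 0 to 1 scale, so the nearest neighbor can change.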

Here we create the variables that we need later in order to train and test our classifier. We create 4 variables, i.e. 2 sets of X and Y: one set contains 90% of the original data, which we will use as the training set, and the other contains the remaining 10%, which we will use to validate our model.
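A 90/10 split can be sketched with scikit-learn's `train_test_split`, assuming that library is used. The data below is random filler with 500 rows, and `random_state=42` is an arbitrary choice that only makes the split repeatable.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Random placeholder data with the same shape as the real features and labels.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "Gender": rng.integers(0, 2, size=500),
    "Height": rng.random(500),
    "Weight": rng.random(500),
})
y = pd.Series(rng.integers(0, 6, size=500), name="Index")

# test_size=0.1 keeps 10% of the rows aside for validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=42
)

print(len(X_train), len(X_test))
```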

This part of the program deals with the training of our model. First we create a classifier of type K-Nearest Neighbors and pass it a K value of 7. Next, we use the fit method to feed our training data into the classifier. We then compute the accuracy on the validation set by using the score method.
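The training step might look like the sketch below, assuming scikit-learn's `KNeighborsClassifier`. To keep it runnable without the download, it trains on synthetic people whose labels come from hypothetical BMI bands, taking height in cm and weight in kg, so the accuracy it prints says nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the dataset (assumed units: cm and kg).
rng = np.random.default_rng(0)
n = 500
data = pd.DataFrame({
    "Gender": rng.integers(0, 2, size=n),
    "Height": rng.integers(140, 200, size=n),
    "Weight": rng.integers(50, 161, size=n),
})
bmi = (data["Weight"] / (data["Height"] / 100) ** 2).to_numpy()
labels = np.digitize(bmi, [16, 18.5, 25, 30, 40])  # hypothetical bands, classes 0..5

# Scale the features to [0, 1], then hold out 10% for validation.
X = (data - data.min()) / (data.max() - data.min())
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.1, random_state=42
)

clf = KNeighborsClassifier(n_neighbors=7)  # K = 7, as in the post
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)       # fraction of correct validation predictions
print(f"Validation accuracy: {accuracy:.2f}")
```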

This is the part of the program where we perform inference, that is, actually using our model to predict on real-life data.

Most of the code here is basic Python which is easy to understand. We run a continuous while loop, which lets us use the program multiple times instead of running it again and again. We can exit by entering 'X' when the prompt asks us to.

First we accept the gender as a string and convert it into 1 or 0 { 1: 'Male', 0: 'Female' }, then we accept the height and weight. Next we create a numpy array which contains the scaled values of our gender, height and weight. We then reshape it to (-1, 3), which means we leave the number of rows for numpy to work out but we are sure that there are 3 columns, namely "Gender", "Height" and "Weight". Next, we perform prediction on this new data and store the result in a variable.
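Here is a sketch of this prediction step, assuming scikit-learn and the same synthetic training data idea as a stand-in for the real dataset. The function name `predict_class` is hypothetical, and scaling the new observation with the training minimum and maximum is an assumption about how the scaled values are produced.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic training data: columns are Gender, Height (cm), Weight (kg).
rng = np.random.default_rng(1)
n = 500
raw = np.column_stack([
    rng.integers(0, 2, size=n),
    rng.integers(140, 200, size=n),
    rng.integers(50, 161, size=n),
]).astype(float)
bmi = raw[:, 2] / (raw[:, 1] / 100) ** 2
labels = np.digitize(bmi, [16, 18.5, 25, 30, 40])  # hypothetical bands, classes 0..5

# Remember the training range so new inputs are scaled the same way.
mins, maxs = raw.min(axis=0), raw.max(axis=0)
clf = KNeighborsClassifier(n_neighbors=7).fit((raw - mins) / (maxs - mins), labels)


def predict_class(gender: str, height: float, weight: float) -> int:
    """Encode, scale and reshape one observation, then predict its class."""
    gender_code = 1 if gender.strip().lower() == "male" else 0
    row = np.array([gender_code, height, weight], dtype=float)
    scaled = (row - mins) / (maxs - mins)
    sample = scaled.reshape(-1, 3)  # -1 lets numpy infer the row count (1 here)
    return int(clf.predict(sample)[0])


print(predict_class("Male", 175, 95))
```

The reshape is needed because `predict` expects a 2D array with one row per observation, even when there is only a single observation.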

Finally, we print the result by using the switch function that we defined at the beginning of our program. This saves us from writing multiple if-else statements, thereby reducing the amount of code that we need to write. Then we create a prompt to ask if the user wants to run the program again!

And we are done! It took us around 80 lines to write this whole program.
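The loop described above can be sketched as follows. The function `run_session` and its parameters are hypothetical; the input and output functions are passed in so that the demo can run with scripted answers and a dummy predictor instead of a live prompt and a trained model.

```python
def run_session(predict, labels, read=input, write=print):
    """Keep asking for people until the user types X at the final prompt."""
    while True:
        gender = read("Gender (Male/Female): ")
        height = float(read("Height: "))
        weight = float(read("Weight: "))
        # Dictionary lookup plays the role of the switch function.
        write("Prediction:", labels.get(predict(gender, height, weight), "Unknown"))
        if read("Press Enter to continue or X to exit: ").strip().upper() == "X":
            break


# Scripted demo: one person, then exit.
answers = iter(["Male", "175", "95", "X"])
outputs = []
run_session(
    predict=lambda gender, height, weight: 4,  # dummy stand-in for the classifier
    labels={4: "Obesity"},
    read=lambda prompt: next(answers),
    write=lambda *parts: outputs.append(" ".join(map(str, parts))),
)
print(outputs)
```

Called as `run_session(my_predict, my_labels)` with the defaults, it reads from the keyboard and prints to the screen.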

Output

This is the output that the model has predicted for us! We entered the gender, weight and height and it gave us feedback saying that the person is Obese. This is amazing!

Final thoughts

  • It is currently about 70 percent accurate at best, which is quite low for a medical-grade algorithm. But we can't complain much, as our dataset contains only around 500 data points, which is far too few for an ML model. With more data our accuracy would likely increase.
  • It is arguably an unnecessary use of AI, since with height and weight anybody can compute the BMI (Body Mass Index) and look up the obesity level. What is special is that I am not computing the BMI at all! The model learns purely from the gender, height and weight data and maps it to the classes in its own way.
  • Many improvements can be made, such as changing the type of classifier! Currently we are using K-Nearest Neighbors as our model with a K value of 7, but maybe changing the K value can yield greater accuracy. I leave that up to you guys.
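If you want a starting point for the K experiment, here is a sketch that tries several odd K values with 5-fold cross-validation, assuming scikit-learn. It runs on synthetic data with hypothetical BMI-based labels, so the best K it reports will not carry over to the real dataset.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in: Gender, Height (cm), Weight (kg).
rng = np.random.default_rng(0)
n = 500
raw = np.column_stack([
    rng.integers(0, 2, size=n),
    rng.integers(140, 200, size=n),
    rng.integers(50, 161, size=n),
]).astype(float)
bmi = raw[:, 2] / (raw[:, 1] / 100) ** 2
labels = np.digitize(bmi, [16, 18.5, 25, 30, 40])  # hypothetical bands, classes 0..5

# Min-max scale every column to [0, 1].
X = (raw - raw.min(axis=0)) / (raw.max(axis=0) - raw.min(axis=0))

# Mean cross-validated accuracy for each candidate K.
scores = {}
for k in range(1, 16, 2):
    clf = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(clf, X, labels, cv=5).mean()

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"K={k:2d}  accuracy={score:.3f}")
print("Best K on this synthetic data:", best_k)
```

Cross-validation averages over five different splits, which gives a steadier comparison between K values than a single 10% validation set of about 50 rows.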

I decided to make a quick, fun project so as not to bore you guys with constant theory. Sometimes it's okay to have fun! Please work on more challenges and try to solve them on your own! In case you get stuck, the internet is always there!

I hope you learnt something valuable from today's post. I'll be back with another one; until then, enjoy Machine Learning!

-MANAS HEJMADI

 


 

Manas Hejmadi

I am a boy who studies in 9th grade at Bangalore! I have a good knowledge of computer programming, AI and UI Design. I aspire to create a tech startup of my own!
