Machine Intelligence Sharing KNN Classification

KNN for prediction using machine intelligence sharing

"A BABY LEARNS TO CRAWL, WALK AND THEN RUN. WE ARE IN THE CRAWLING STAGE WHEN IT COMES TO APPLYING MACHINE LEARNING.OR WHEN WE ARE SUPERVISING MACHINE" -DAVE WATERS. Creating a Machine Learning Model is like introducing your child to a new environment and teaching him to adapt likewise. Then comes up the second stage when your child starts exploring on his own using the information that you gave him before. That's what machine learning is, that’s what classification models are. For Most of the people around the world buying a car, a house, or investment requires an expert.There machine learning and KNN algorithms can guide us like an expert.

AARK Technology Hub works on building the environment so that machines and humans work and reside together for a better future.

“Best use of technology comes from best minds”

The KNN classification model helps break the data down into the classes we need. Using it with the MIS (Machine Intelligence Sharing) model helps achieve optimized results with better machine efficiency. For a brief overview of intelligence sharing, see our previous blog, AARK ML intelligence sharing.

Below we show a case study conducted on Machine Intelligence Sharing.

Case Study

To build a simple KNN classification model that predicts the quality of a car from a few other car attributes.

Sources

Title: Car Evaluation Database

The dataset is available at “http://archive.ics.uci.edu/ml/datasets/Car+Evaluation”

Creator: Marko Bohanec

Donors: Marko Bohanec

Data Set Information

Number of Attributes : 6

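Names Of Attributes (the uppercase names are the target and intermediate concepts of the original evaluation model; the six lowercase names are the input attributes)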

CAR -> car acceptability

PRICE -> overall price

buying -> buying price

maint -> price of the maintenance

TECH -> technical characteristics

COMFORT -> comfort

doors -> number of doors

persons -> capacity in terms of persons to carry

lug_boot -> the size of luggage boot

safety -> estimated safety of the car

Attributes Values:

buying -> v-high, high, med, low

maint -> v-high, high, med, low

doors -> 2, 3, 4, 5-more

persons -> 2, 4, more

lug_boot -> small, med, big

safety -> low, med, high

The Car Evaluation Database contains examples with the structural information removed, i.e., directly relates CAR to the six input attributes: buying, maint, doors, persons, lug_boot, safety.

Because of known underlying concept structure, this database may be particularly useful for testing constructive induction and structure discovery methods.

Stepwise Implementation:

Step 1: Import the necessary modules from specific libraries.

Step 2: Load the data set from MongoDB into a pandas DataFrame
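A minimal sketch of this step, assuming the car records were already inserted into MongoDB as one document per row. The helper name `collection_to_frame`, the database name `aark`, and the collection name `car_data` are hypothetical. The function relies only on the `find(filter, projection)` method that a pymongo collection provides.

```python
import pandas as pd


def collection_to_frame(collection):
    """Read every document of a MongoDB collection into a DataFrame.

    `collection` is expected to behave like a pymongo Collection,
    i.e. to expose find(filter, projection).
    """
    # Exclude MongoDB's internal _id field so only the car attributes remain.
    cursor = collection.find({}, {"_id": 0})
    return pd.DataFrame(list(cursor))


# Typical use against a local server (all names are placeholders):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   df = collection_to_frame(client["aark"]["car_data"])
```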

Step 3: Check some basic information about the data set

Step 4: Identify the target variable

The target variable is stored in the class column of the DataFrame. Its values are strings, but the algorithm requires them to be encoded as integer codes. We can convert the categorical string values into integer codes using the factorize method of the pandas library.
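As a small illustration of `pandas.factorize` on the target column, here is a sketch using a few made-up rows rather than the real data. The column name `class` follows the text above; the sample values are only for demonstration.

```python
import pandas as pd

# Illustrative rows only; the real data set has 1728 rows.
df = pd.DataFrame({
    "safety": ["low", "high", "med", "high"],
    "class": ["unacc", "acc", "unacc", "vgood"],
})

# factorize returns the integer codes plus the unique labels,
# numbered in order of first appearance.
codes, class_names = pd.factorize(df["class"])
df["class"] = codes

print(df["class"].tolist())  # [0, 1, 0, 2]
print(list(class_names))     # ['unacc', 'acc', 'vgood']
```

Keeping `class_names` around makes it possible to translate predicted codes back into readable labels later.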

Step 5: Identify the predictor variables and encode any string variables to equivalent integer codes
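One possible way to encode the predictors is to apply the same factorize call to every column except the target. The sketch below uses three of the six attributes and invented rows. The `mappings` dictionary is an addition for illustration, not something the original case study is known to keep.

```python
import pandas as pd

df = pd.DataFrame({
    "buying": ["vhigh", "low", "med"],
    "doors": ["2", "5more", "4"],
    "safety": ["low", "high", "low"],
    "class": [0, 1, 0],  # target assumed to be encoded already (Step 4)
})

predictors = [col for col in df.columns if col != "class"]
mappings = {}  # column name -> original labels, indexed by integer code

for col in predictors:
    codes, uniques = pd.factorize(df[col])
    df[col] = codes
    mappings[col] = list(uniques)

print(df[predictors].values.tolist())  # [[0, 0, 0], [1, 1, 1], [2, 2, 0]]
```

Note that factorize numbers the labels in order of first appearance, so the codes do not necessarily follow the natural order low < med < high. Since KNN works on distances between codes, an explicit ordered mapping may be worth trying as well.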

Step 6: Select the predictor features and the target variable

Step 7: Calculate the value of k, store it in MongoDB, and read it back
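The text does not say how k is calculated, so the sketch below assumes one common approach: pick the candidate with the best mean cross-validated accuracy. The helper names `choose_k`, `save_k`, and `load_k`, the document id `knn_k`, and the synthetic data are all hypothetical. The two storage helpers expect an object with pymongo's `update_one` and `find_one` methods.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def choose_k(X, y, candidates=(1, 3, 5, 7, 9), folds=5):
    """Return the candidate k with the highest mean cross-validated accuracy."""
    scores = {}
    for k in candidates:
        model = KNeighborsClassifier(n_neighbors=k)
        scores[k] = cross_val_score(model, X, y, cv=folds).mean()
    return max(scores, key=scores.get), scores


def save_k(collection, k):
    # Upsert a single settings document so later runs can reuse the value.
    collection.update_one({"_id": "knn_k"}, {"$set": {"k": int(k)}}, upsert=True)


def load_k(collection):
    doc = collection.find_one({"_id": "knn_k"})
    return doc["k"] if doc else None


# Synthetic integer-coded data standing in for the encoded car attributes.
rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(200, 6))
y = (X[:, 0] + X[:, 5] > 3).astype(int)

best_k, cv_scores = choose_k(X, y)
print("chosen k:", best_k)
```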

Step 8: Train test split
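A short sketch of the split with scikit-learn's `train_test_split`. The 30% test size, the fixed `random_state`, and the use of `stratify` are assumptions chosen for illustration, and the arrays are placeholders for the encoded features and target.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 10 rows with 6 integer-coded features each.
X = np.arange(60).reshape(10, 6)
y = np.array([0, 1] * 5)

# stratify=y keeps the class proportions similar in both parts,
# which can help when one class dominates the data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (7, 6) (3, 6)
```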

Step 9: Training / model fitting
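Fitting the classifier could look like the sketch below. It uses toy two-feature data with two clearly separated groups instead of the car attributes, and `n_neighbors=3` is an arbitrary choice here (in the case study it would be the k from Step 7).

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated groups of integer-coded points (toy data).
X_train = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 2], [2, 3]])
y_train = np.array([0, 0, 0, 1, 1, 1])

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Each query point is labeled by a majority vote of its 3 nearest neighbors.
print(model.predict([[0, 0], [3, 3]]))  # [0 1]
```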

Step 10: Model parameters study

The algorithm achieved a classification accuracy of 94% on the held-out set; only 32 samples were misclassified. This is not surprising, since it is a very simple data set with distinctly separable classes. That’s how to implement K-Nearest Neighbors with scikit-learn. Load your favorite data set and give it a try.
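Accuracy and the number of misclassified samples can be computed along the following lines. The label arrays here are invented placeholders, so the numbers they produce are unrelated to the 94% reported above. In practice `y_pred` would come from `model.predict(X_test)`.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Placeholder labels standing in for the held-out targets and predictions.
y_test = np.array([0, 0, 1, 1, 2, 2, 0, 1])
y_pred = np.array([0, 0, 1, 2, 2, 2, 0, 1])

accuracy = accuracy_score(y_test, y_pred)
misclassified = int((y_test != y_pred).sum())
matrix = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(accuracy)       # 0.875
print(misclassified)  # 1
print(matrix)
```

The confusion matrix shows which classes get mixed up with each other, which a single accuracy figure hides.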

INCREASING ACCURACY USING MACHINE INTELLIGENCE SHARING MODEL :

Step 1: Expand your dataset (labeled data).

With this, our trained model is stored in MongoDB. Any new data we collect is also stored in MongoDB and marked as NEW, since we haven't used it in our model yet.
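One way to prepare the model and the new rows for storage is sketched below. It assumes the model is serialized with `pickle` and that new rows carry a `status` field. The field names, the identifier `car_knn`, and both helper names are hypothetical; the case study's actual schema is not shown in this post.

```python
import pickle
from datetime import datetime, timezone

import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def model_to_document(model, accuracy):
    """Wrap a fitted model in a dict that MongoDB can store."""
    return {
        "name": "car_knn",             # hypothetical identifier
        "model": pickle.dumps(model),  # pymongo stores bytes as BSON binary
        "accuracy": float(accuracy),
        "created_at": datetime.now(timezone.utc),
    }


def tag_new_rows(rows):
    """Mark freshly collected rows so they can be found before retraining."""
    return [{**row, "status": "NEW"} for row in rows]


# Tiny stand-in model so the example runs on its own.
model = KNeighborsClassifier(n_neighbors=1).fit(np.array([[0], [1]]), np.array([0, 1]))
doc = model_to_document(model, accuracy=0.94)
new_rows = tag_new_rows([{"buying": "low", "safety": "high", "class": "acc"}])

# With pymongo collections (placeholder names) these would be written as:
#   models.insert_one(doc)
#   car_data.insert_many(new_rows)
```

A KNN model keeps its training data, so the pickled bytes grow with the data set. MongoDB limits a single document to 16 MB, and GridFS is the usual option for anything larger.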

Step 2: Using the combined new and old data from MongoDB, we can run multiple retraining rounds (epochs) on our stored model and check for an improvement in accuracy. If the accuracy drops, we can discard the new model and mark the new data as bad data. If the accuracy increases, we can store the new model as a new row in MongoDB without affecting the old model.
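The accept-or-reject logic of this step might be sketched as follows. The function name `retrain_if_better` and the toy one-feature data are hypothetical, and the sketch assumes a fixed test set is used to compare the old and the retrained model.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier


def retrain_if_better(old_model, X_old, y_old, X_new, y_new, X_test, y_test):
    """Refit on old + new data and keep the result only if accuracy improves.

    Returns (model_to_use, accepted, old_accuracy, new_accuracy).
    """
    old_acc = accuracy_score(y_test, old_model.predict(X_test))

    candidate = KNeighborsClassifier(n_neighbors=old_model.n_neighbors)
    candidate.fit(np.vstack([X_old, X_new]), np.concatenate([y_old, y_new]))
    new_acc = accuracy_score(y_test, candidate.predict(X_test))

    accepted = bool(new_acc > old_acc)
    return (candidate if accepted else old_model), accepted, old_acc, new_acc


# Toy data: one feature, two classes.
X_old = np.array([[0.0], [1.0], [6.0]])
y_old = np.array([0, 0, 1])
X_test = np.array([[3.0], [5.0], [0.0]])
y_test = np.array([1, 1, 0])
old_model = KNeighborsClassifier(n_neighbors=1).fit(X_old, y_old)

# Helpful new sample: accuracy rises, so the new model would be stored.
good = retrain_if_better(old_model, X_old, y_old,
                         np.array([[3.5]]), np.array([1]), X_test, y_test)

# Misleading new sample: accuracy drops, so the data would be marked as bad.
bad = retrain_if_better(old_model, X_old, y_old,
                        np.array([[5.2]]), np.array([0]), X_test, y_test)

print(good[1], bad[1])  # True False
```

When `accepted` is true, the new model would be inserted as an additional document. Otherwise the new rows would be flagged as bad data and the old model stays in use.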

Step 3: Machine Intelligence Sharing. The KNN model stored in MongoDB can now be used by other Python programs trying to solve the same problem. This can save significant training time.
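A consuming program could fetch and reuse the shared model roughly like this. The sketch assumes each stored model is a document with `name`, `accuracy`, and `model` (pickled bytes) fields; that layout, the name `car_knn`, and the helper `load_best_model` are assumptions. `InMemoryCollection` is only a stand-in so the example runs without a database. With pymongo, a real collection would be passed instead.

```python
import pickle

import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def load_best_model(collection, name="car_knn"):
    """Fetch the stored model with the highest recorded accuracy."""
    doc = collection.find_one({"name": name}, sort=[("accuracy", -1)])
    if doc is None:
        raise LookupError(f"no stored model named {name!r}")
    # Only unpickle data from a database you trust: pickle can run arbitrary code.
    return pickle.loads(doc["model"])


class InMemoryCollection:
    """Stand-in that mimics the one pymongo method used above."""

    def __init__(self, docs):
        self.docs = list(docs)

    def find_one(self, flt, sort=None):
        hits = [d for d in self.docs if all(d.get(k) == v for k, v in flt.items())]
        if sort:
            key, direction = sort[0]
            hits.sort(key=lambda d: d[key], reverse=direction < 0)
        return hits[0] if hits else None


# Two stored versions of the model; the second has the better accuracy.
X = np.array([[0], [1], [4], [5]])
y = np.array([0, 0, 1, 1])
v1 = KNeighborsClassifier(n_neighbors=1).fit(X[:2], np.array([0, 1]))
v2 = KNeighborsClassifier(n_neighbors=1).fit(X, y)
models = InMemoryCollection([
    {"name": "car_knn", "accuracy": 0.80, "model": pickle.dumps(v1)},
    {"name": "car_knn", "accuracy": 0.94, "model": pickle.dumps(v2)},
])

shared = load_best_model(models)
print(shared.predict([[5]]))  # [1]
```

Pickled scikit-learn models are generally only reliable when the loading program uses the same scikit-learn version as the program that stored them.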

This case study may not look like much, but let's think about the possibilities, such as:

A centralized fraud detection system that collects data from all agencies and keeps getting better with time, preventing fraud without human intervention.

A loan approval system that connects with all the banks of a nation and predicts the loan risk percentage of an applicant.

and many more.

AARK Technology Hub always focuses on providing efficient and innovative solutions. This AARK Technology Hub case study is freely available on our GitHub, and you can easily download and use it anywhere.