Machine Intelligence Sharing KNN Classification

KNN for prediction using machine intelligence sharing

"A BABY LEARNS TO CRAWL, WALK AND THEN RUN. WE ARE IN THE CRAWLING STAGE WHEN IT COMES TO APPLYING MACHINE LEARNING, OR WHEN WE ARE SUPERVISING MACHINES." - DAVE WATERS

Creating a machine learning model is like introducing your child to a new environment and teaching them to adapt to it. Then comes the second stage, when the child starts exploring on their own using the information you gave them earlier. That is what machine learning is, and that is what classification models do. For most people around the world, buying a car or a house, or making an investment, requires an expert. This is where machine learning and KNN algorithms can guide us like an expert.

AARK Technology Hub works on building the environment so that machines and humans work and reside together for a better future.

“Best use of technology comes from best minds”

A KNN classification model helps break down the data as needed for classification. Using it with the MIS (Machine Intelligence Sharing) model helps achieve optimized results with better machine efficiency. Our previous blog, AARK ML intelligence sharing, gives a brief overview of intelligence sharing.

Below we show a case study conducted on Machine Intelligence Sharing.

Case Study

The goal is to build a simple KNN classification model that predicts the quality of a car from a few other car attributes.


Data Set Information

Number of Input Attributes : 6
Names Of Attributes (uppercase names are concepts; the lowercase entries indented under them are the six input attributes)
  • CAR -> car acceptability
    • PRICE -> overall price
      • buying -> buying price
      • maint -> price of the maintenance
    • TECH -> technical characteristics
      • COMFORT -> comfort
        • doors -> number of doors
        • persons -> capacity in terms of persons to carry
        • lug_boot -> the size of luggage boot
      • safety -> estimated safety of the car

Attributes Values:
  • buying -> v-high, high, med, low
  • maint -> v-high, high, med, low
  • doors -> 2, 3, 4, 5-more
  • persons -> 2, 4, more
  • lug_boot -> small, med, big
  • safety -> low, med, high
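
The value lists above can be written down as a small lookup table, which makes it easy to check a record before encoding it. This is only a sketch: the names ATTRIBUTE_VALUES, is_valid_record and count_combinations are hypothetical helpers, and the labels are copied from the list above, so they may need adjusting if your copy of the raw data spells them differently.

```python
from itertools import product

# Attribute domains copied from the list above (spelling may differ in raw data files).
ATTRIBUTE_VALUES = {
    "buying": ["v-high", "high", "med", "low"],
    "maint": ["v-high", "high", "med", "low"],
    "doors": ["2", "3", "4", "5-more"],
    "persons": ["2", "4", "more"],
    "lug_boot": ["small", "med", "big"],
    "safety": ["low", "med", "high"],
}


def is_valid_record(record):
    """Return True if the record has exactly the six attributes, each with a listed value."""
    return set(record) == set(ATTRIBUTE_VALUES) and all(
        record[name] in values for name, values in ATTRIBUTE_VALUES.items()
    )


def count_combinations():
    """Count the distinct attribute combinations the six domains allow."""
    return sum(1 for _ in product(*ATTRIBUTE_VALUES.values()))


example = {"buying": "low", "maint": "med", "doors": "4",
           "persons": "more", "lug_boot": "big", "safety": "high"}
print(is_valid_record(example))  # True
print(count_combinations())      # 1728 (4 * 4 * 4 * 3 * 3 * 3)
```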

The Car Evaluation Database contains examples with the structural information removed, i.e., directly relates CAR to the six input attributes: buying, maint, doors, persons, lug_boot, safety.
Because of known underlying concept structure, this database may be particularly useful for testing constructive induction and structure discovery methods.

Step Wise Implementation :

  • Step 1: Import the necessary modules from specific libraries.
  • Step 2: Load the data set from MongoDB and read it into a pandas DataFrame.
  • Step 3: Check some basic information about the data set.
  • Step 4: Identify the target variable.
    The target variable is marked as class in the dataframe. Its values are stored as strings, but the algorithm requires them to be converted to equivalent integer codes. We can convert the string categorical values into integer codes using the factorize method of the pandas library.
  • Step 5: Identify the predictor variables and encode any string variables as equivalent integer codes.
  • Step 6: Select the predictor features and the target variable.
  • Step 7: Calculate the value of k, store it in MongoDB, and read it back when needed.
  • Step 8: Train test split
  • Step 9: Training / model fitting
  • Step 10: Model parameters study
    The algorithm achieved a classification accuracy of 94% on the held-out set. Only 32 samples were misclassified, since this is a very simple data set with distinctly separable classes. That’s how to implement K-Nearest Neighbors with scikit-learn. Load your favorite data set and give it a try.
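
To make the encoding, splitting and fitting steps above concrete, here is a rough plain-Python sketch. It is a stand-in, not the case-study code: factorize imitates the first-appearance integer coding of the pandas factorize method, and knn_predict is a hand-written majority vote rather than scikit-learn's classifier. The toy rows, the labeling rule, and the choice k = 5 are all made up for illustration, and MongoDB loading is left out entirely.

```python
import random
from collections import Counter
from itertools import product


def factorize(values):
    """Map labels to integer codes in order of first appearance."""
    index, codes = {}, []
    for value in values:
        codes.append(index.setdefault(value, len(index)))
    return codes, list(index)


def train_test_split(X, y, test_size=0.3, seed=0):
    """Shuffle row indices reproducibly and hold out a fraction for testing."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    cut = len(order) - int(round(len(order) * test_size))
    train, test = order[:cut], order[cut:]
    return ([X[i] for i in train], [X[i] for i in test],
            [y[i] for i in train], [y[i] for i in test])


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training rows closest to x (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(X_train, y_train, X_test, y_test, k):
    """Fraction of held-out rows whose predicted class matches the true class."""
    hits = sum(knn_predict(X_train, y_train, x, k) == t
               for x, t in zip(X_test, y_test))
    return hits / len(y_test)


# Hypothetical toy rows (safety, persons, lug_boot, doors); NOT the case-study data.
rows = list(product(["low", "med", "high"], ["2", "4", "more"],
                    ["small", "med", "big"], ["2", "3", "4", "5-more"]))
# Made-up labeling rule, used only so the sketch has something to learn.
labels = ["unacc" if safety == "low" or persons == "2" else "acc"
          for safety, persons, _, _ in rows]

# Steps 4-6: encode the target and every predictor column as integer codes.
y, class_names = factorize(labels)
X = [list(row) for row in zip(*(factorize(col)[0] for col in zip(*rows)))]

# Step 8: train/test split. Steps 9-10: KNN "fitting" just keeps the training rows.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, seed=0)
k = 5  # arbitrary value for illustration; the case study computes and stores its own k
score = accuracy(X_train, y_train, X_test, y_test, k)
print(f"held-out accuracy: {score:.2f}")
```

Because KNN keeps the training rows and defers all work to prediction time, the "model" that gets stored and shared is essentially the encoded data plus the value of k.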


Machine Intelligence Sharing Steps :

  • Step 1: Expand your dataset (labeled data).
    At this point our trained model is stored in MongoDB. Any new data we collect is also stored in MongoDB and marked as NEW, since we haven't used it in our model yet.
  • Step 2: Using the combined new and old data from MongoDB, we can run multiple epochs on our stored model and check whether the accuracy improves. If the accuracy drops, we can discard the new model and mark the new data as bad data. If the accuracy increases, we can store the new model as a new row in MongoDB without affecting the old model.
  • Step 3: Machine Intelligence Sharing. The KNN model stored in MongoDB can now be used by other Python programs trying to solve the same problem. This saves significant training time.
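
The store, retrain and share cycle above can be sketched as follows. Everything here is an assumption made for illustration: InMemoryStore is a hypothetical stand-in for the MongoDB collections (no database is contacted), the model is serialized with pickle, a tie in accuracy is treated as acceptable, and the sample points are invented.

```python
import pickle
from collections import Counter


def knn_predict(model, x):
    """Majority vote among the k stored points closest to x (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(model["X"], model["y"])
    )[: model["k"]]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(model, X_val, y_val):
    """Fraction of validation points the model classifies correctly."""
    return sum(knn_predict(model, x) == t for x, t in zip(X_val, y_val)) / len(y_val)


class InMemoryStore:
    """Hypothetical stand-in for two MongoDB collections: models and labeled data."""

    def __init__(self):
        self.models = []  # one row per stored model version
        self.data = []    # one row per labeled sample, with a status flag

    def add_data(self, X, y):
        """Store freshly collected samples, marked NEW until a model has used them."""
        for x, label in zip(X, y):
            self.data.append({"x": x, "y": label, "status": "NEW"})

    def save_model(self, model, acc):
        """Append the model as a new row; older versions are left untouched."""
        self.models.append({"version": len(self.models) + 1,
                            "accuracy": acc, "blob": pickle.dumps(model)})

    def latest_model(self):
        """What another program would call to reuse the shared model."""
        return pickle.loads(self.models[-1]["blob"])


def retrain_with_new_data(store, X_val, y_val, k=3):
    """Build a candidate from old + NEW data; keep it only if accuracy does not drop."""
    usable = [d for d in store.data if d["status"] in ("USED", "NEW")]
    candidate = {"X": [d["x"] for d in usable], "y": [d["y"] for d in usable], "k": k}
    new_acc = accuracy(candidate, X_val, y_val)
    old_acc = store.models[-1]["accuracy"] if store.models else -1.0
    accepted = new_acc >= old_acc  # assumption: a tie counts as acceptable
    for d in store.data:
        if d["status"] == "NEW":
            d["status"] = "USED" if accepted else "BAD"
    if accepted:
        store.save_model(candidate, new_acc)
    return accepted


# Invented two-cluster points, used only to exercise the cycle.
X_val, y_val = [[0, 0], [0, 1], [5, 5], [5, 6]], [0, 0, 1, 1]
store = InMemoryStore()

store.add_data([[0, 0], [1, 0], [5, 5], [6, 5]], [0, 0, 1, 1])
first = retrain_with_new_data(store, X_val, y_val)   # initial model is stored

store.add_data([[0, 1], [1, 1], [0, 2]], [1, 1, 1])  # mislabeled samples
second = retrain_with_new_data(store, X_val, y_val)  # accuracy drops, data marked BAD

store.add_data([[0, 1], [5, 6]], [0, 1])             # correctly labeled samples
third = retrain_with_new_data(store, X_val, y_val)   # stored as a new model row

shared = store.latest_model()                        # Step 3: reuse without retraining
print(first, second, third, len(store.models), knn_predict(shared, [5, 5]))
```

With a real MongoDB deployment, the two lists would become collections and the pickled bytes would be stored in a document field. Since unpickling runs arbitrary code, a shared model should only be loaded from a database you trust.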

This case study may not look like much, but let's think about the possibilities, such as:
  • A centralized fraud detection system that collects data from all agencies and keeps getting better over time, helping to prevent fraud without human intervention.
  • A loan approval system that connects with all the banks of a nation and predicts the loan risk percentage of an applicant.
and many more.

AARK Technology Hub always focuses on providing efficient and innovative solutions. This AARK Technology Hub case study is freely available on our GitHub, and you can easily download it and use it anywhere.