"A BABY LEARNS TO CRAWL, WALK AND THEN RUN. WE ARE IN THE CRAWLING STAGE WHEN IT COMES TO APPLYING MACHINE LEARNING, OR WHEN WE ARE SUPERVISING MACHINES." - DAVE WATERS
Creating a machine learning model is like introducing your child to a new environment and teaching them to adapt to it. Then comes the second stage, when your child starts exploring on their own using the information you gave them earlier. That's what machine learning is, and that's what classification models are. For most people around the world, buying a car, buying a house, or making an investment requires an expert. This is where machine learning and the KNN algorithm can guide us like an expert.
AARK Technology Hub works on building the environment so that machines and humans work and reside together for a better future.
“Best use of technology comes from best minds”
A KNN classification model sorts data into the classes we need. Using it with MIS (the Machine Intelligence Sharing model) helps achieve optimized results with better machine efficiency. Our previous blog post, AARK ML intelligence sharing, gives a brief overview of intelligence sharing.
Below is a case study conducted on Machine Intelligence Sharing.
The goal is to build a simple KNN classification model that predicts the quality of a car from a few other car attributes.
Title: Car Evaluation Database
The dataset is available at “http://archive.ics.uci.edu/ml/datasets/Car+Evaluation”
Creator: Marko Bohanec
Donors: Marko Bohanec
Data Set Information
Number of Attributes : 6
Names Of Attributes
CAR -> car acceptability
    PRICE -> overall price
        buying -> buying price
        maint -> price of the maintenance
    TECH -> technical characteristics
        COMFORT -> comfort
            doors -> number of doors
            persons -> capacity in terms of persons to carry
            lug_boot -> the size of luggage boot
        safety -> estimated safety of the car
Attribute Values
buying -> v-high, high, med, low
maint -> v-high, high, med, low
doors -> 2, 3, 4, 5-more
persons -> 2, 4, more
lug_boot -> small, med, big
safety -> low, med, high
The Car Evaluation Database contains examples with the structural information removed, i.e., directly relates CAR to the six input attributes: buying, maint, doors, persons, lug_boot, safety.
Because of known underlying concept structure, this database may be particularly useful for testing constructive induction and structure discovery methods.
Stepwise Implementation:
Step 1: Import the necessary modules from specific libraries.
Step 2: Load the data set from MongoDB and read it into a pandas DataFrame
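A minimal sketch of this step, assuming the car records were already inserted into a MongoDB collection with one document per car. The connection URI, the database name "aark", the collection name "cars", and the helper names are all hypothetical; only the column names come from the attribute list above.

```python
import pandas as pd

# Column names follow the attribute list above; "class" is the target column.
COLUMNS = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]


def get_collection(uri="mongodb://localhost:27017", db="aark", name="cars"):
    """Return a pymongo collection. URI, database and collection names are assumptions."""
    from pymongo import MongoClient  # imported lazily; only needed when a server is used
    return MongoClient(uri)[db][name]


def load_cars(collection):
    """Read every document from the collection into a DataFrame."""
    # The projection drops MongoDB's internal _id field so only car attributes remain.
    records = list(collection.find({}, {"_id": 0}))
    return pd.DataFrame(records, columns=COLUMNS)
```

With a running MongoDB instance, `df = load_cars(get_collection())` would produce the DataFrame used in the following steps.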
Step 3: Check some basic information about the data set
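One way to do this quick check is sketched below. The helper name `summarize` is hypothetical; it only uses standard pandas calls and assumes the target column is called "class".

```python
import pandas as pd


def summarize(df, target="class"):
    """Collect a few quick facts about the data set."""
    return {
        "shape": df.shape,                                     # (rows, columns)
        "missing": int(df.isna().sum().sum()),                 # total missing cells
        "class_counts": df[target].value_counts().to_dict(),   # rows per class label
    }
```

Since the data set covers every combination of the six attributes (4 x 4 x 4 x 3 x 3 x 3), the full file should report 1728 rows.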
Step 4: Identify the target variable
The target variable is stored in the "class" column of the DataFrame. Its values are strings, but the algorithm requires variables to be encoded as integers. We can convert the categorical string values into integer codes using the factorize method of the pandas library.
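As a small illustration of factorize, the snippet below encodes a toy stand-in for the "class" column. The six rows are made up for the example; the four labels (unacc, acc, good, vgood) are the classes listed in the UCI documentation.

```python
import pandas as pd

# Toy stand-in for the "class" column of the car data.
target = pd.Series(["unacc", "acc", "unacc", "vgood", "good", "acc"], name="class")

# factorize returns integer codes plus the unique labels in order of first appearance.
codes, labels = pd.factorize(target)

print(codes)         # [0 1 0 2 3 1]
print(list(labels))  # ['unacc', 'acc', 'vgood', 'good']
```

The position of a label in `labels` is its integer code, so `labels[codes]` recovers the original strings when predictions need to be translated back.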
Step 5: Identify the predictor variables and encode any string variables to equivalent integer codes
Step 6: Select the predictor features and the target variable
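Steps 5 and 6 could be combined roughly as follows. This is a sketch: the helper name `encode_features` is hypothetical, and it assumes the DataFrame uses the column names from the attribute list.

```python
import pandas as pd

FEATURES = ["buying", "maint", "doors", "persons", "lug_boot", "safety"]


def encode_features(df, features=FEATURES, target="class"):
    """Return (X, y, mappings) with every string column replaced by integer codes."""
    encoded = pd.DataFrame(index=df.index)
    mappings = {}
    for col in list(features) + [target]:
        codes, uniques = pd.factorize(df[col])
        encoded[col] = codes
        mappings[col] = list(uniques)  # code i corresponds to mappings[col][i]
    X = encoded[list(features)]        # predictor features
    y = encoded[target]                # target variable
    return X, y, mappings
```

Note that factorize assigns codes in order of first appearance, not in the natural order low < med < high. Because KNN works on distances between codes, an explicit ordered mapping may be worth trying as an alternative.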
Step 7: Calculate the value of k, store it in MongoDB, and read it back
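The article does not say how k is calculated, so the sketch below uses a common rule of thumb (the square root of the number of samples, bumped to an odd number) purely as an assumption. The document key "knn_k" and the helper names are hypothetical; `update_one` and `find_one` are standard pymongo collection methods.

```python
import math


def choose_k(n_samples):
    """Rule-of-thumb k: round sqrt(n), then move to the next odd number if even.

    This heuristic is an assumption, not necessarily the method used in the case study.
    """
    k = max(1, round(math.sqrt(n_samples)))
    return k if k % 2 == 1 else k + 1


def save_k(collection, k, key="knn_k"):
    """Upsert k under a fixed _id so repeated runs overwrite the same document."""
    collection.update_one({"_id": key}, {"$set": {"k": int(k)}}, upsert=True)


def load_k(collection, key="knn_k"):
    """Read k back; returns None if it was never stored."""
    doc = collection.find_one({"_id": key})
    return doc["k"] if doc else None
```

Choosing k by cross-validation is a reasonable alternative; the store and read-back helpers stay the same either way.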
Step 8: Train test split
Step 9: Training / model fitting
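Steps 8 and 9 can be sketched with scikit-learn as shown here. The 30% test size, the random seed, and the stratified split are assumptions chosen for illustration; the helper name `split_and_fit` is hypothetical.

```python
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def split_and_fit(X, y, k, test_size=0.3, seed=42):
    """Hold out a test set, then fit KNN on the remaining rows."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y,
        test_size=test_size,
        random_state=seed,   # fixed seed so the split is reproducible
        stratify=y,          # keep class proportions similar in both parts
    )
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    return model, X_test, y_test
```

Stratifying matters for this data because the classes are unevenly sized; without it, a small class could end up almost entirely in one part of the split.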
Step 10: Model parameters study
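A sketch of how the fitted model might be examined on the held-out set. The helper name `evaluate` is hypothetical; `accuracy_score` and `confusion_matrix` are standard scikit-learn metrics.

```python
from sklearn.metrics import accuracy_score, confusion_matrix


def evaluate(model, X_test, y_test):
    """Return accuracy, the number of misclassified rows, and the confusion matrix."""
    y_pred = model.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    # Off-diagonal cells of the confusion matrix are the misclassified samples.
    misclassified = int(cm.sum() - cm.trace())
    return accuracy_score(y_test, y_pred), misclassified, cm
```

The confusion matrix is worth reading alongside the accuracy, because it shows which classes are being confused with each other rather than just how many rows were wrong.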
The algorithm achieved a classification accuracy of 94% on the held-out set. Only 32 samples were misclassified, which is not surprising since this is a very simple data set with distinctly separable classes. That's how to implement K-Nearest Neighbors with scikit-learn. Load your favorite data set and give it a try.
INCREASING ACCURACY USING THE MACHINE INTELLIGENCE SHARING MODEL:
Step 1: Expand your dataset (labeled data).
With this, our trained model is stored in MongoDB. Any new data we collect is also stored in MongoDB and marked as NEW, since it has not yet been used in our model.
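One possible way to store the model and flag new data is sketched below. The document layout (fields "name", "version", "accuracy", "model", "status"), the model name "car_knn", and the helper names are hypothetical; pickling is one common serialization choice, not necessarily the one used in the case study.

```python
import pickle
from datetime import datetime, timezone


def save_model(collection, model, accuracy, name="car_knn", version=1):
    """Insert the pickled model as one document (one 'row') in the collection."""
    collection.insert_one({
        "name": name,
        "version": version,
        "accuracy": float(accuracy),
        "created_at": datetime.now(timezone.utc),
        "model": pickle.dumps(model),  # bytes are stored by pymongo as BSON binary
    })


def add_new_samples(collection, rows):
    """Store freshly collected labeled rows, flagged as not yet used for training."""
    docs = [dict(row, status="NEW") for row in rows]
    if docs:
        collection.insert_many(docs)
    return len(docs)
```

A fitted KNN model keeps its training data, and a single MongoDB document is limited to 16 MB, so a model trained on a much larger data set may need GridFS or external storage instead.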
Step 2: Using the combined new and old data from MongoDB, we can run multiple epochs on our stored model and check whether the accuracy improves. If the accuracy decreases, we discard the new model and mark the new data as bad data. If the accuracy increases, we store the new model as a new row in MongoDB without affecting the old model.
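The accept-or-reject decision might look like the sketch below. KNN has no iterative training loop, so the sketch simply refits on the combined data instead of running epochs. The status labels "USED" and "BAD", the helper name, and the choice to accept only a strict improvement are assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def retrain_if_better(old_model, X_old, y_old, X_new, y_new, X_val, y_val):
    """Refit on old + new data; keep the candidate only if validation accuracy improves."""
    X_all = np.vstack([X_old, X_new])
    y_all = np.concatenate([y_old, y_new])
    candidate = KNeighborsClassifier(n_neighbors=old_model.n_neighbors)
    candidate.fit(X_all, y_all)

    old_acc = old_model.score(X_val, y_val)
    new_acc = candidate.score(X_val, y_val)
    if new_acc > old_acc:
        return candidate, new_acc, "USED"   # store as a new model version
    return old_model, old_acc, "BAD"        # keep the old model, flag the new data
```

Both models are scored on the same validation set, which should stay separate from the old and new training rows so the comparison is fair.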
Step 3: Machine Intelligence Sharing. Other Python programs trying to solve the same problem can now use the KNN model stored in MongoDB. This can save significant training time.
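A consuming program could fetch the shared model along these lines. The sketch assumes each stored model document has "name", "version", and "model" fields, with "model" holding pickled bytes; those field names and the helper name are hypothetical.

```python
import pickle


def load_latest_model(collection, name="car_knn"):
    """Fetch the newest stored version of the model and unpickle it."""
    # Sorting by version in descending order makes find_one return the latest row.
    doc = collection.find_one({"name": name}, sort=[("version", -1)])
    if doc is None:
        raise LookupError(f"no stored model named {name!r}")
    # Only unpickle data from a database you trust: pickle can execute code on load.
    return pickle.loads(doc["model"]), doc["version"]
```

The returned model can call `predict` immediately, so the consuming program skips training entirely. It does need a scikit-learn version compatible with the one that pickled the model.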
This case study may not look like much, but let's think about the possibilities, such as:
A centralized fraud detection system that collects data from all agencies and keeps getting better over time at preventing fraud without human interaction.
A loan approval system that connects with all the banks of a nation and predicts an applicant's loan risk percentage.
And many more.
AARK Technology Hub always focuses on providing efficient and innovative solutions. This AARK Technology Hub case study is freely available on our GitHub, and you can easily download and use it anywhere.