RM World 2014: Similarity assessment and resume analysis
RapidMiner World 2014
Similarity Assessment and Resume Analysis using Clustering and Cosine Similarity Measures in RapidMiner
Surabhi Lodha
Santosh Vishwakarma
PROBLEM STATEMENT
• Hiring new people is a main challenge for every company
• The pool of resumes a company receives for a job opening is far larger than the number of people assigned to review them
SOLUTION
• NEED FOR A TEXT MINING MODEL
• SORTING AND FILTERING OF KEYWORDS
• CATEGORISING RESUMES FOR BETTER PROCESSING
WHY RAPIDMINER
• RapidMiner is an open source software package for predictive analytics.
• It is a solid and complete package with flexible and affordable support options.
• Enterprise-ready performance and scalability for big data analytics.
• Innovative analyst support.
• Processes are built by piping components together in graphical ETL workflows.
• RapidMiner is very powerful because of its learning operators and operator framework, which allow nearly arbitrary processes to be formed.
DATASET
• RESUMES OF GRADUATE STUDENTS OF VARIOUS STREAMS
– CSE 300
– CIVIL ENGG 225
– ELECTRICAL 200
– MECHANICAL 250
OUR APPROACH
PREPROCESSING OF RESUME DATASET
• TOKENISING
• STEMMING
• REMOVAL OF STOP WORDS
• INVERTED INDEX
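The four preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the RapidMiner process used in the talk: the stop-word list, the crude suffix-stripping `stem` function (a stand-in for a real stemmer such as Porter's), and the two sample resumes are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "with", "for", "to"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in preprocess(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical resume snippets, invented for this example.
resumes = {
    "r1": "Designed circuits and tested motors",
    "r2": "Programming in Java and testing software",
}
index = build_inverted_index(resumes)
print(index["test"])  # ids of the resumes that mention "tested" or "testing"
```

Because "tested" and "testing" stem to the same term, both resumes end up under one index entry, which is the point of stemming before indexing.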
PERFORM CLUSTERING USING K-MEANS
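As a rough illustration of what k-means does with resume vectors, here is a small pure-Python sketch. It is not RapidMiner's k-Means operator: the function names, the fixed initial centroids, and the toy term-count vectors over a hypothetical three-word vocabulary are assumptions chosen to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    centroids = [list(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = mean(members)
    return assignment, centroids

# Toy term counts over a hypothetical vocabulary ["java", "circuit", "motor"].
vectors = [[3, 0, 0], [2, 1, 0], [0, 3, 1], [0, 2, 2]]
labels, centers = k_means(vectors, initial_centroids=[vectors[0], vectors[2]])
print(labels)
```

In practice the initial centroids are usually picked at random and the run is repeated several times; fixing them here only makes the result reproducible.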
RESULT ANALYSIS
COMPARISONS BETWEEN CLUSTERS
COMPARISONS AMONG CLUSTERS
DATA SIMILARITY BETWEEN RESUMES
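Similarity between two resumes can be measured with the cosine of the angle between their term-count vectors. The sketch below is a minimal pure-Python version, assuming each resume has already been reduced to a list of preprocessed tokens; the function name and the sample token lists are invented for illustration and do not come from the talk.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the term-count vectors of two token lists."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical preprocessed resumes.
cse_resume = ["java", "test", "software"]
other_cse_resume = ["java", "software", "database"]
civil_resume = ["concrete", "survey", "bridge"]

print(cosine_similarity(cse_resume, other_cse_resume))
print(cosine_similarity(cse_resume, civil_resume))
```

The score ranges from 0 (no shared terms) to 1 (identical term proportions), so resumes from the same stream should score higher against each other than against resumes from other streams.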
CONCLUSIONS
• Reduces the workload of HR
• The project focuses on resume analysis by applying a clustering algorithm to a resume dataset using the RapidMiner tool
• Selection of the best resume in minimum time
THANKS