career guidance using ml and python for college students projects
hamed0432
237 views
28 slides
Jun 15, 2024
Intelligent Career Guidance System
Contents
  Introduction
  Algorithms
  Feature Selection Techniques
  Dataset
  Expected Output Format
  Proposed Project Pipeline
  Feasibility Analysis
  System Environment
  System Design
  Results and Discussions
  Model Deployment
  Git History
  Conclusion
  References
Introduction
The proposed machine learning system helps students in the computer science stream choose the right career path, based on their scores in various subjects and their skills in areas such as communication and coding.
Literature Review
Paper 1
Title of the paper: Tanya V Yadalam, Vaishnavi M Gowda, Vanditha Shiva Kumar, Disha Girish, “Career Recommendation Systems using Content based Filtering”, Proceedings of the Fifth International Conference on Communication and Electronics Systems (ICCES 2020).
Area of work: Career recommendation for registered computer-stream students.
Dataset: 17 columns and 20,000 entries.
Methodology/Strategy: The recommender takes form-based input from the user and recommends the three best-matching career options. Prediction is based on the data the user fills in: a cosine similarity function measures the similarity between the user's previous preferences and the available jobs, and the jobs with the highest similarity scores are recommended. The system also aims to collect feedback and comments and to apply NLP to classify each one as positive, negative, or neutral, so that the recommender can give students better results.
Algorithm: NLP, cosine similarity, content-based filtering.
Result/Accuracy: Recommends the top three jobs by cosine similarity. NLP determines whether a comment or feedback is positive, negative, or neutral.
Advantages: Introduces security, reliability, and transparency features. Recommends the top three jobs. Users can give feedback and comments.
Limitations: Mainly for engineering students. No data encryption mechanism.
Future Proposal: The system could be extended to other branches such as business and arts, and could be implemented using a collaborative approach.
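The cosine similarity step described for Paper 1 can be sketched as follows. This is only an illustration, not the paper's implementation: the skill axes, the job profiles, and the function names cosine_similarity and top_jobs are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_jobs(user_vector, job_vectors, k=3):
    """Return the k job names whose profiles are most similar to the user."""
    scored = [(cosine_similarity(user_vector, vec), job)
              for job, vec in job_vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [job for _, job in scored[:k]]


# Hypothetical skill axes: [coding, communication, maths, design]
JOBS = {
    "Software Developer": [5, 2, 4, 1],
    "Business Analyst": [2, 5, 3, 1],
    "UX Designer": [2, 3, 1, 5],
    "Data Scientist": [3, 2, 5, 1],
}

student = [5, 2, 5, 1]  # made-up answers from the input form
recommended = top_jobs(student, JOBS)
print(recommended)
```

Because cosine similarity compares direction rather than magnitude, a student who rates every skill low but in the same proportions as a job profile still matches that job closely.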
Paper 2
Title of the paper: Vignesh S, Shivani Priyanka C, Shree Manju H, Mythili K, “An Intelligent Career Guidance System using Machine Learning”, 2021 7th International Conference on Advanced Computing & Communication Systems (ICACCS).
Area of work: Engineering department prediction for students after Plus Two.
Dataset: The dataset for the machine learning model was developed manually. It has five target labels, each representing a specific department, and seven features. It contains more than 500 rows, that is, more than 500 unique records.
Methodology/Strategy: The framework consists of three modules. The first is the skill set assessment module. The second is the prediction module, where the prediction is made from the scores obtained by the candidate. The third and final module is the result analysis module, which presents a detailed analysis of the candidate's performance in various formats.
Algorithm: KNN, SVM, Naïve Bayes, K-Means Clustering.
Result/Accuracy: KNN: 0.9410. SVM: 0.8632. Naïve Bayes: 0.8714.
Advantages: Lowers the chance that a candidate selects a department in which they have a higher chance of failure.
Limitations: The skill set analysis may not reflect the students' exact knowledge. The system is for Plus Two students, whose knowledge is limited to their academics.
Future Proposal: In the near future the framework's accuracy will be improved, additional features can be used for recommending a suitable department, and the outliers of the framework will be removed gradually.
Paper 3
Title of the paper: Manar Qamhieh, Haya Sammaneh, Mona Nabil Demaidi, “PCRS: Personalized Career-Path Recommender System for Engineering Students”. Supported by An-Najah National University, Palestine, Research Project, under grant ANNU-1920-Sc004. Published in 2020.
Area of work: Guidance system that helps high school students in the Palestinian community choose an engineering discipline.
Dataset: The data was mainly collected through a research survey, stored in a database, and analysed to create an association between personality types and engineering disciplines in Palestine. The MBTI personality test consists of 21 questions randomly chosen from a set of 70 questions.
Methodology/Strategy: There are 4 main phases:
  1. Obtaining the student's personal information, including gender, high school grades in STEM courses, and a list of extra-curricular interests.
  2. Determining the student's personality type based on a self-administered personality test.
  3. Processing the input data to construct a personal and academic profile for each student.
  4. Building a fuzzy recommender system to provide students with a personalized, user-specific ranking of engineering disciplines.
Algorithm: Fuzzy logic.
Result/Accuracy: The output of the PCRS application is a bar plot showing the suitability rates of the engineering disciplines after the system's fuzzy logic has been applied.
Advantages: Helpful for high school students in developing countries, where educational and professional guidance in schools is limited. The bar plot presents clear results in a simple way. Users can compare the suitability rates of the engineering disciplines.
Limitations: Time consuming, because for each student the processed data for a specific engineering discipline is entered into the fuzzy logic, which then determines a personalized rate for it. This must be repeated for all seven engineering disciplines considered in PCRS.
Future Proposal: In the future, PCRS can be extended to consider more university departments and disciplines other than engineering. The recommendation can be enhanced to consider socio-economic factors such as employment rates, the economic situation, and parents' background, especially in developing countries such as Palestine.
Dataset
The dataset contains 20,000 records.
Students answer 24 questions related to their ability in academics, personality, and coding.
These 24 questions form the feature list.
The suggested job role is the class label.
Expected Output format
Algorithms
Decision Tree: A classification technique that represents its classification rules as a tree diagram.
Attribute Selection Measures:
  Gini Index: a cost function used to evaluate splits in the dataset. It is calculated by subtracting the sum of the squared probabilities of each class from one.
  Entropy: a measure of the randomness in the information being processed.
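As a small worked example of the two attribute selection measures, the sketch below computes the Gini index (one minus the sum of squared class probabilities) and the entropy of a set of class labels. The function names and the job-role labels are made up for illustration.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Fraction of the records that belongs to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_index(labels):
    """One minus the sum of the squared class probabilities."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the classes."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


pure_node = ["Developer"] * 4                     # only one class present
mixed_node = ["Developer", "Developer", "Analyst", "Analyst"]

print(gini_index(pure_node), gini_index(mixed_node))
print(entropy(mixed_node))
```

Both measures are zero for a node that contains a single class and largest when the classes are evenly mixed, which is why a decision tree prefers splits that lower them.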
XGBoost
XGBoost is based on the gradient boosting algorithm, which in turn follows the basic principle of gradient descent. The model is built from tree-based learners (decision trees).
Gradient boosting prediction:
  Final prediction = Base value + LR*w1 + LR*w2 + ... + LR*wn
where
  Base value = the starting prediction from the basic decision tree
  LR = learning rate (eta)
  w1 = residual value predicted by the 1st residual model
  wn = residual value predicted by the nth residual model
XGBoost differs from other gradient boosting implementations in its tuning parameters. The main ones are:
  1) Regularisation parameter (lambda)
  2) Threshold that defines automatic pruning (gamma)
  3) Learning rate (eta)
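The additive formula on this slide (base value plus learning rate times each residual model's prediction) can be illustrated with a minimal gradient boosting loop. This is plain gradient boosting for squared error, not XGBoost itself: it leaves out the lambda and gamma regularisation, it uses the mean of the targets as the base value (an assumption made for simplicity), and each residual model is a one-split stump. The helper names and the data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Return predict(x) = base value + sum of learning_rate * stump(x)."""
    base_value = sum(ys) / len(ys)  # assumption: start from the mean target
    stumps = []
    predictions = [base_value] * len(xs)
    for _ in range(n_rounds):
        # Each new model is trained on what the current ensemble still gets wrong.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]

    def predict(x):
        return base_value + sum(learning_rate * stump(x) for stump in stumps)

    return predict


xs = [1, 2, 3, 4, 5, 6]                 # made-up single feature
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]     # made-up targets
model = boost(xs, ys)

base = sum(ys) / len(ys)
mse_base = sum((y - base) ** 2 for y in ys) / len(ys)
mse_model = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(mse_base, mse_model)
```

A smaller learning rate shrinks each correction, so more rounds are needed to reach the same training error.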
SVM
Performs classification by finding the hyperplane that best separates the data points of different classes.
  Searches for the linear optimal separating hyperplane (the decision boundary).
  Finds the hyperplane using support vectors and margins.
  The training tuples that fall on the margin boundaries, the hyperplanes that define the sides of the margin, are the support vectors.
  The farther the hyperplane is from the nearest data points, the larger its margin; the hyperplane with the largest margin is optimal.
  Parameters: kernel, gamma, C (regularization parameter), random_state
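The margin idea can be checked numerically. The sketch below does not train an SVM; it only measures the margin of two hand-picked candidate hyperplanes on made-up 2D points and lists the points that sit exactly at the margin distance. All names and values are illustrative assumptions.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points, labels):
    """Smallest distance from a correctly signed point to the hyperplane."""
    return min(label * signed_distance(w, b, x)
               for x, label in zip(points, labels))


points = [(3, 3), (4, 3), (1, 1), (0, 1)]  # made-up training tuples
labels = [1, 1, -1, -1]                    # class of each tuple

candidate_a = ((1, 1), -4)     # hyperplane x1 + x2 - 4 = 0
candidate_b = ((1, 0), -1.5)   # hyperplane x1 - 1.5 = 0

margin_a = margin(*candidate_a, points, labels)
margin_b = margin(*candidate_b, points, labels)

# Points lying exactly at the margin distance act as the support vectors.
w, b = candidate_a
support_vectors = [
    x for x, label in zip(points, labels)
    if math.isclose(label * signed_distance(w, b, x), margin_a)
]
print(margin_a, margin_b, support_vectors)
```

Both candidates separate the two classes, but the first one keeps the nearest points farther away, so it is the better decision boundary in the sense described on the slide.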
Feature Selection Techniques
Chi-Squared Test: Tests whether a pair of categorical variables is independent by quantifying how far the observed counts depart from the counts expected under independence.
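A minimal sketch of the chi-squared statistic for a contingency table of counts follows (rows are the values of one categorical variable, columns the values of the other). The tables are made up, and the function name is hypothetical.

```python
def chi_squared(table):
    """Chi-squared statistic: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: a feature's two values; columns: two job roles (made-up counts).
independent_table = [[10, 10], [20, 20]]
dependent_table = [[30, 10], [10, 30]]

print(chi_squared(independent_table))
print(chi_squared(dependent_table))
```

A feature whose table gives a larger statistic against the class label is more strongly associated with it, which is the basis for ranking features in chi-squared feature selection.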
Proposed Project Pipeline
Data collection: Find an appropriate dataset with suitable parameters such as academic scores, specialization programs, analytic capabilities, and personal details like hobbies, workshops, certifications, and books of interest.
Data Pre-processing: Put the acquired dataset into an organized format by cleaning null values, invalid data values, and unwanted data.
OneHot Encoding: Convert categorical values in the data into a numerical or ordinal format so that they can be provided to machine learning algorithms.
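The encoding step can be sketched in plain Python as below. This is an illustration of one-hot encoding in general, not the project's own preprocessing code; the function name and the sample hobby values are assumptions.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with a single 1."""
    categories = sorted(set(values))  # fixed column order for reproducibility
    column_of = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[column_of[value]] = 1
        encoded.append(row)
    return categories, encoded


hobbies = ["chess", "music", "chess", "reading"]  # made-up column values
categories, encoded = one_hot_encode(hobbies)
print(categories)
print(encoded)
```

One-hot encoding avoids implying an order between categories, whereas an ordinal or label encoding maps each category to a single integer and is more compact.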
Feasibility Analysis
Technical Feasibility: The application is technically feasible because all the technical resources required to develop and run it are easily available and reliable. The code is written in Google Colab, where the commonly used libraries come preinstalled, so they only need to be imported, not installed.
Economic Feasibility: The code runs on Google Colab, which requires only an internet connection. Developing the system does not need a large amount of money, so it is economically feasible.
Operational Feasibility: Because the code runs on Google Colab, the required libraries do not have to be installed separately. A new user needs no special skills to open and use the application.
System Environment
Software Environment: The software used to develop this application: Python, Pandas, NumPy, Matplotlib, scikit-learn (LabelEncoder, OrdinalEncoder, SelectKBest), Google Colab, Visual Studio, HTML & CSS, Flask, GitHub.
Hardware Environment:
  Processor: 2 GHz or faster (dual-core or quad-core will be much faster)
  Memory: 8 GB RAM or greater
  Disk space: 40 GB or greater
  Good internet connectivity
System Design
Model Planning
Model Training
Testing
Results and Discussions
The first calculated accuracy for XGBoost was 5.75%. After applying label encoding and ordinal encoding to the categorical values and selecting features with the Chi-Squared test, the accuracy increased to 80.683%. The saved model, xgboost.sav, was loaded using the pickle package.
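Saving and reloading a model with pickle can be sketched as follows. A plain dictionary stands in for the trained XGBoost model here, and the file is written to a temporary folder; only the file name xgboost.sav comes from the slide, everything else is an assumption for illustration.

```python
import os
import pickle
import tempfile

# Stand-in for the trained classifier; a real model object is pickled the same way.
model = {
    "description": "placeholder for the trained model",
    "classes": ["Developer", "Analyst"],  # made-up job roles
}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "xgboost.sav")

    # Save the model to disk in binary mode.
    with open(path, "wb") as handle:
        pickle.dump(model, handle)

    # Load it back, as a deployed application would at start-up.
    with open(path, "rb") as handle:
        loaded_model = pickle.load(handle)

print(loaded_model == model)
```

Pickle files should only be loaded from a trusted source, since unpickling can execute arbitrary code.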
Model Deployment
Git History
Conclusion
Using this system, we predicted the job best suited to a student based on similarity score and gain of branch.
The proposed system gives students insight into their careers and helps them choose the one that suits them.
References
1. Vignesh S, Shivani Priyanka C, Shree Manju H, Mythili K, “An Intelligent Career Guidance System using Machine Learning”, 2021 7th International Conference on Advanced Computing & Communication Systems (ICACCS).
2. Tanya V Yadalam, Vaishnavi M Gowda, Vanditha Shiva Kumar, Disha Girish, “Career Recommendation Systems using Content based Filtering”, Proceedings of the Fifth International Conference on Communication and Electronics Systems (ICCES 2020).
3. Manar Qamhieh, Haya Sammaneh, Mona Nabil Demaidi, “PCRS: Personalized Career-Path Recommender System for Engineering Students”. Supported by An-Najah National University, Palestine, Research Project, under grant ANNU-1920-Sc004. Published in 2020.