Predicting Students' Career Aspirations
An end-to-end machine learning pipeline predicting students' likely career paths from academic and personal data — Random Forest classifier achieving 80% accuracy, deployed as an interactive Streamlit dashboard.
Problem Statement
Career guidance systems in universities are typically manual and reactive — a student must seek out an advisor, often too late. This project explores whether a student's academic performance, interests, and personal attributes can predict their career aspiration early enough to deliver proactive, data-driven guidance.
Exploratory Data Analysis
The dataset was profiled before modelling. Seaborn heatmaps revealed correlations between academic subjects and career clusters. Random Forest's built-in impurity-based (Gini) feature importances identified the top predictors: STEM subject grades, extracurricular participation, and self-reported interests.
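As a rough illustration of the feature importance step, the sketch below ranks features by scikit-learn's impurity-based importances and computes the correlation matrix that a heatmap would display. The data is synthetic and the column names (`stem_grade`, `extracurricular_hours`, `interest_score`, `noise`) and labels are hypothetical stand-ins, not the project's actual dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: column names and values are hypothetical.
rng = np.random.default_rng(0)
n_students = 300
features = pd.DataFrame({
    "stem_grade": rng.uniform(40, 100, n_students),
    "extracurricular_hours": rng.uniform(0, 10, n_students),
    "interest_score": rng.uniform(0, 1, n_students),
    "noise": rng.normal(size=n_students),
})

# Hypothetical label that depends only on the STEM grade, so the
# ranking below has a known answer to compare against.
labels = np.where(features["stem_grade"] > 70, "engineering", "arts")

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(features, labels)

# Impurity-based (Gini) importances, normalised to sum to 1.
importances = pd.Series(
    forest.feature_importances_, index=features.columns
).sort_values(ascending=False)

# Pairwise correlations; this matrix is what a heatmap would visualise.
correlations = features.corr()

print(importances)
```

Impurity-based importances can favour high-cardinality features, so a ranking like this is usually worth cross-checking against another method before drawing conclusions.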
Model Comparison
Four classifiers were trained and evaluated — Logistic Regression, K-Nearest Neighbours, Support Vector Machine, and Random Forest. Random Forest outperformed all others at 80% accuracy, benefiting from ensemble variance reduction and its robustness to correlated features in the dataset.
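A comparison of this kind could be sketched as follows. This is a minimal example under stated assumptions: the data comes from `make_classification` rather than the student dataset, the hyperparameters are scikit-learn defaults, and 5-fold cross-validation is an assumed evaluation protocol, since the source does not say how accuracy was measured.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic multi-class data standing in for the student dataset.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    n_classes=3, random_state=0,
)

# The distance- and margin-based models get a scaler; the forest does not need one.
candidates = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "K-Nearest Neighbours": make_pipeline(
        StandardScaler(), KNeighborsClassifier()
    ),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Mean accuracy over 5 folds for each candidate (assumed protocol).
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

for name, accuracy in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.3f}")
```

Which model comes out ahead on synthetic data says nothing about the real dataset; the point is only that all four candidates are scored with the same folds and the same metric.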
Interactive Dashboard
The final model was deployed as a Streamlit web app. Users can input academic and personal attributes through an intuitive form and receive an instant career prediction with a confidence score and brief recommendation. The dashboard also includes visualisations of feature importance and class distributions.
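One plausible way to produce a prediction with a confidence score is to take the highest class probability from `predict_proba`. The sketch below assumes that approach; the helper name `predict_with_confidence`, the toy training data, and the class labels are hypothetical, and the source does not state how the dashboard computes its score.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def predict_with_confidence(model, feature_row):
    """Return the most probable class and its probability for one input row."""
    probabilities = model.predict_proba([feature_row])[0]
    best = int(np.argmax(probabilities))
    return model.classes_[best], float(probabilities[best])


# Toy model trained on synthetic data; the real app would load the trained model.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 3))
y = np.where(X[:, 0] > 50, "engineering", "arts")
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

career, confidence = predict_with_confidence(model, [90.0, 20.0, 30.0])
print(f"Predicted career: {career} (confidence {confidence:.0%})")
```

In a Streamlit app, the feature row would typically come from form widgets such as `st.number_input` inside an `st.form`, with the result shown through something like `st.metric`. For a Random Forest, this probability is the average of the trees' predicted class probabilities, so it is better read as a rough score than as a calibrated likelihood.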
Tech Stack
Python · Scikit-learn · Random Forest · Pandas · NumPy · Seaborn · Matplotlib · FastAPI · Streamlit · Streamlit Cloud