Vikas Kumar

Hi, I'm Vikas Kumar

Machine Learning & Data Science

I build end-to-end machine learning systems with a strong focus on data understanding, robust evaluation, and real-world decision making. My work emphasizes reproducibility, thoughtful metrics, and practical problem-solving.

✨ About Me

Building reliable machine learning systems with purpose

I am a Computer Science undergraduate specializing in Machine Learning and Data Science. I enjoy working across the complete machine learning lifecycle, from exploratory data analysis and data preprocessing through model training, evaluation, and iterative improvement.

My projects emphasize clean, reproducible code and thoughtful evaluation using metrics beyond accuracy, especially in scenarios involving class imbalance and real-world constraints. Currently, I am expanding my expertise toward Natural Language Processing, Deep Learning, and Transformer-based architectures.

Experience

Summer Intern

IBM · Delhi, India

  • Helped build and deploy AI-powered agents as part of project-based learning initiatives.
  • Gained hands-on exposure to Agentic AI frameworks and multi-agent system concepts.
  • Applied Python-based workflows to experiment with intelligent agents and decision-making pipelines.
  • Strengthened understanding of real-world AI system design, collaboration, and iterative development.

Skills

Programming & Tools

  • Python
  • Jupyter Notebook
  • Git & GitHub

Data Collection & Storage

  • SQL (Data Querying)
  • MongoDB (NoSQL Databases)
  • Selenium (Web Automation)
  • Scrapy (Web Scraping)

Data Analysis & Visualization

  • Pandas, NumPy
  • Matplotlib
  • Seaborn
  • Exploratory Data Analysis (EDA)

Machine Learning

  • Linear Regression
  • Logistic Regression
  • K-Nearest Neighbors (KNN)
  • Scikit-learn
  • Feature Engineering
  • Model Evaluation

Applied ML Techniques

  • Class Imbalance Handling (SMOTE)
  • Hyperparameter Tuning (GridSearchCV)
  • Cross-Validation (k-fold)
  • ROC-AUC & Precision–Recall Analysis
  • Threshold Tuning

NLP & Computer Vision (Foundational Knowledge)

  • Natural Language Processing (NLP)
  • Text Preprocessing & Vectorization
  • OpenCV (Computer Vision Basics)
  • Image Processing Fundamentals

Currently Learning

  • Deep Learning
  • Transformer Architectures
  • Advanced NLP Techniques

Projects

Titanic Survival Prediction

End-to-End Machine Learning Project

Designed and implemented a complete machine learning pipeline to predict passenger survival on the Titanic dataset, with emphasis on data understanding, class imbalance handling, and robust evaluation using metrics beyond accuracy.

  • Performed detailed exploratory data analysis to identify key survival drivers
  • Handled missing values and categorical features using structured preprocessing
  • Addressed class imbalance by applying SMOTE to the training data only, leaving the test set untouched
  • Optimized Logistic Regression using GridSearchCV
  • Evaluated models using Accuracy, ROC-AUC, Precision–Recall curves, and k-fold cross-validation
  • Achieved ~80% test accuracy and ~0.85 mean cross-validated ROC-AUC
  • Developed environment-independent scripts runnable on local systems and Google Colab
View Project on GitHub
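
The "SMOTE on training data only" step can be illustrated with a small, dependency-free sketch. This is not the project's code (the project presumably uses a library implementation such as imbalanced-learn's SMOTE); the function names `smote` and `split_then_oversample`, the toy two-feature data, and the convention that label 1 is the minority class are all assumptions made for illustration. The point it shows is the ordering: split first, then synthesize minority samples from the training split alone, so no synthetic or test-derived information reaches the evaluation set.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea).
    Assumes at least two minority samples are available."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def split_then_oversample(X, y, test_fraction=0.25, seed=0):
    """Split first, then balance the training split only.
    Assumes label 1 is the minority class (hypothetical convention)."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    # Synthetic samples are generated from training rows only
    minority = [x for x, label in zip(X_train, y_train) if label == 1]
    majority_count = len(y_train) - len(minority)
    new_points = smote(minority, majority_count - len(minority), seed=seed)

    return X_train + new_points, y_train + [1] * len(new_points), X_test, y_test


# Toy imbalanced data: 80 majority points and 20 minority points
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(80)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(20)]
y = [0] * 80 + [1] * 20

X_train, y_train, X_test, y_test = split_then_oversample(X, y)
```

After the call, the training labels are balanced while every test row is still an original, unmodified observation, which keeps the reported test metrics honest.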

Contact

I am open to entry-level opportunities, internships, and collaborative projects in Machine Learning and Data Science. Feel free to reach out.