Anurag Tiwari

Work Experience

Software Developer — Itedium

Oct 2024 – Present

Worked on a production-grade, batch-driven healthcare data system, handling user eligibility, entitlements, and pricing logic where accuracy was critical.

  • Tech & Development: Built and maintained internal platforms using Symfony 1.x and Laravel 7. Improved 834 EDI → JSON pipelines with Go and Omniparser, enhancing data processing efficiency. Implemented UI and backend updates, fixed bugs, and handled urgent production changes using Go, PHP, HTML, CSS, JavaScript, and AJAX. Managed and optimized high-volume healthcare data pipelines, occasionally processing up to 500,000 records with zero critical errors.
  • Challenges: Tackled invalid or duplicate records, incorrect state transitions, and false validation errors that slowed manual reviews. Refactored legacy logic to handle edge cases and complex rules, keeping downstream data accurate.
  • Contributions: Led analysis, development, and testing of critical fixes. Added defensive validations to prevent bad data from persisting. Worked closely with the Agile team, using Git and JIRA for smooth coordination.
  • Impact: Stopped invalid/duplicate data from spreading, reduced manual review workload, minimized operational noise, and improved overall system stability. Delivered time-sensitive fixes with zero critical post-release issues.

Skills: Data validation, legacy system refactoring, production debugging, root cause analysis, cross-team collaboration.
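
The defensive validations described above (rejecting duplicates and incorrect state transitions before they persist) can be illustrated with a minimal Python sketch. The record fields (`member_id`, `status`, `effective_date`), the `ALLOWED_TRANSITIONS` table, and the `validate_batch` helper are all hypothetical, chosen only for illustration; the production system used Go and PHP, and its actual rules are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical state machine: which status may follow which.
ALLOWED_TRANSITIONS = {
    None: {"pending", "active"},
    "pending": {"active", "terminated"},
    "active": {"terminated"},
    "terminated": {"active"},
}


@dataclass(frozen=True)
class Record:
    member_id: str
    status: str
    effective_date: str  # ISO date, e.g. "2024-10-01"


def validate_batch(records, current_status):
    """Split a batch into accepted records and (record, reason) rejections.

    current_status maps member_id -> last persisted status (hypothetical).
    """
    accepted, rejected = [], []
    seen = set()
    status = dict(current_status)  # work on a copy; do not mutate the input
    for rec in records:
        if not rec.member_id:
            rejected.append((rec, "missing member_id"))
            continue
        if rec in seen:
            rejected.append((rec, "duplicate record"))
            continue
        seen.add(rec)
        previous = status.get(rec.member_id)
        if rec.status not in ALLOWED_TRANSITIONS.get(previous, set()):
            rejected.append((rec, f"invalid transition {previous} -> {rec.status}"))
            continue
        status[rec.member_id] = rec.status
        accepted.append(rec)
    return accepted, rejected
```

Returning the rejection reason alongside each record is one way to keep manual review focused on genuine problems rather than on re-deriving why a record was flagged.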

Educational Qualifications

Level     Institution                   Grade
SSC       Pancham English High School   84%
HSC       Kanchan Junior College        82.15%
B.Sc IT   Annasaheb Vartak College      9.68 / 10 CGPA

Entrance Exams: JEE Main – 62.56 percentile | MHT-CET – 87.28 percentile

Projects

🚀 AI & Machine Learning Projects

GitWise AI — GitHub Repository Q&A System

 

Built a Retrieval-Augmented Generation (RAG) system to answer natural language questions on GitHub repositories.

  • Core System: Designed pipeline for repository ingestion, code chunking, and embedding generation.
  • Retrieval: Implemented hybrid search using MiniLM embeddings with BGE reranker for high-quality results.
  • LLM Integration: Used Groq LLM to generate context-aware answers from retrieved code snippets.
  • Database: Used Qdrant vector database for efficient semantic search with metadata filtering.
  • Architecture: Built modular system with separate ingestion, retrieval, and response layers.
  • Evaluation: Created evaluation pipeline to measure retrieval accuracy and response quality.
  • Tech Stack: Python, FastAPI, LangChain, SentenceTransformers, Qdrant, Groq LLM, Streamlit, Docker
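
The ingestion and retrieval layers listed above can be sketched in a few lines of standard-library Python. This is a toy illustration, not the project's code: the token-count `embed` function stands in for MiniLM embeddings, an in-memory list stands in for Qdrant, there is no reranking step, and the names `chunk_lines`, `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_lines(text, size=3):
    """Split source text into chunks of `size` consecutive lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def embed(text):
    """Toy embedding: lowercase token counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a full pipeline the retrieved chunks would be passed to the LLM as context; keeping chunking, embedding, and ranking as separate functions mirrors the modular layering described in the list.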

Power Consumption Prediction

 

Predicts household electricity consumption using historical sensor data and time-based features.

  • Performance: MAE: ~0.0177 | RMSE: ~0.0329 | R²: ~0.9987
  • Features: Historical sensor readings (Global reactive power, Voltage, Current, Sub-metering 1-3), time features (hour, day of week, month), and lag features to capture temporal dependencies.
  • Tech Stack: Python, Pandas, NumPy, scikit-learn, Streamlit, Docker
  • Highlights:
    • Preprocessing pipeline handles missing values and ensures numeric types for all features.
    • Lag features allow the model to capture short-term dependencies in power consumption.
    • Random Forest Regressor with feature selection (SelectKBest) to focus on the most important predictors and reduce overfitting.
    • Interactive Streamlit UI supports both single and batch predictions, with metrics displayed when actual values are available.
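
The time and lag features mentioned above can be sketched as follows. This is a simplified standard-library version under stated assumptions: readings arrive as ordered `(timestamp, value)` pairs, and the function name `build_features`, the column names, and the default of two lags are hypothetical rather than taken from the project.

```python
from datetime import datetime


def build_features(readings, n_lags=2):
    """Turn ordered (timestamp, value) readings into rows of time and lag features.

    The first `n_lags` readings are dropped because their lags are undefined.
    """
    rows = []
    for i in range(n_lags, len(readings)):
        ts, value = readings[i]
        row = {
            "hour": ts.hour,
            "day_of_week": ts.weekday(),  # Monday = 0
            "month": ts.month,
            "target": value,
        }
        # lag_1 is the previous reading, lag_2 the one before it, and so on.
        for lag in range(1, n_lags + 1):
            row[f"lag_{lag}"] = readings[i - lag][1]
        rows.append(row)
    return rows
```

Each row contains only values observed before its own timestamp plus calendar fields, so the lag columns do not leak the target into the features.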

Telco Customer Churn Prediction

 

Predicts whether a telecom customer is likely to churn based on demographic, account, and service usage features.

  • Performance: Accuracy: 82.1% | ROC-AUC: 0.87
  • Features: Tenure, MonthlyCharges, TotalCharges, gender, SeniorCitizen, Partner, Dependents, PhoneService, MultipleLines, InternetService, OnlineSecurity, OnlineBackup, DeviceProtection, TechSupport, StreamingTV, StreamingMovies, Contract, PaperlessBilling, PaymentMethod
  • Tech Stack: Python, Pandas, NumPy, scikit-learn, Streamlit, Docker
  • Highlights:
    • Preprocessing pipeline handles missing values, scales numeric features, and one-hot encodes categorical features.
    • Random Forest classifier with class balancing, controlled tree depth, and optimized splits to reduce overfitting.
    • Interactive Streamlit UI supports single and batch predictions with churn probability and churn rate metrics.
    • Correlation analysis of numeric features to improve model robustness.
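
Two of the steps above, class balancing and one-hot encoding, can be illustrated with a small standard-library sketch. The weighting formula `n_samples / (n_classes * class_count)` is the one scikit-learn documents for `class_weight="balanced"`; whether the project used exactly that setting is an assumption, and the helper names `balanced_class_weights` and `one_hot` are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 list over the known categories."""
    return [1 if value == category else 0 for category in categories]
```

With these weights the minority class (churners) counts for more per sample during training, which offsets its smaller share of the data.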

Titanic Survival Predictor

 

Predicts Titanic passenger survival using demographic and travel information. Supports single predictions via UI and batch predictions via CSV upload.

  • Performance: Accuracy: 82.3% | ROC-AUC: 0.87 | Cross-Validation: 81.5% ± 2.1%
  • Features: Age, Fare, Pclass, Sex, Embarked, Siblings/Spouses, Parents/Children, Title, FamilySize, IsAlone
  • Tech Stack: Python, Pandas, NumPy, scikit-learn, Streamlit, Docker
  • Highlights:
    • Interactive Streamlit app for single predictions with survival probability.
    • Batch prediction support for CSV files, generating predictions and probabilities for multiple passengers.
    • Custom preprocessing pipeline handling missing values, feature engineering, skewness correction, scaling, and one-hot encoding.
    • Decision Tree classifier with controlled depth and leaf size to reduce overfitting.
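
The engineered features Title, FamilySize, and IsAlone can be derived roughly as shown below. This sketch assumes the common Titanic dataset layout, where names look like "Surname, Title. Given names" and family counts come from the siblings/spouses and parents/children columns; the function name `engineer_features` and the convention `FamilySize = siblings/spouses + parents/children + 1` are assumptions, not confirmed details of the project.

```python
import re


def engineer_features(name, sibsp, parch):
    """Derive Title, FamilySize, and IsAlone from raw passenger fields."""
    # The title is the text between the comma and the first period.
    match = re.search(r",\s*([^.]+)\.", name)
    title = match.group(1).strip() if match else "Unknown"
    family_size = sibsp + parch + 1  # the passenger plus relatives aboard
    return {
        "Title": title,
        "FamilySize": family_size,
        "IsAlone": int(family_size == 1),
    }
```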

💻 Other Projects

AuthVerse — Three Level Authentication System

Built AuthVerse, a secure multi-level authentication system with three layers:

  • Text Password: Standard password login.
  • Color Password: Users select a combination of colors as a password.
  • Image Password: Users tap specific areas on a cropped image to log in.

Tech Stack: PHP, MySQL, HTML, CSS, JavaScript
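
The three-layer check can be sketched as below. The project itself was written in PHP; this Python version is only an illustration, and the user record layout, the 20-pixel tap tolerance, the PBKDF2 parameters, and the names `hash_password`, `taps_match`, and `authenticate` are all hypothetical.

```python
import hashlib
import hmac
import math


def hash_password(password, salt):
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def taps_match(taps, targets, tolerance=20.0):
    """Each tap must land within `tolerance` pixels of its target, in order."""
    return len(taps) == len(targets) and all(
        math.dist(tap, target) <= tolerance for tap, target in zip(taps, targets)
    )


def authenticate(user, password, colors, taps):
    """Pass only if the text, color, and image layers all succeed."""
    text_ok = hmac.compare_digest(
        hash_password(password, user["salt"]), user["password_hash"]
    )
    color_ok = list(colors) == list(user["colors"])
    image_ok = taps_match(taps, user["tap_targets"])
    return text_ok and color_ok and image_ok
```

Evaluating all three layers before returning, rather than stopping at the first failure, avoids revealing which layer rejected the attempt.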

E-learning App

Developed an E-learning app with three modules:

  • Admin: Approves user accounts and manages access.
  • Teacher: Uploads lectures and notes for students.
  • Student: Accesses learning materials and connects with teachers.

Tech Stack: Java, Android Studio, Retrofit, Firebase, Glide, PHP, MySQL

Certificates

AI Certification

Issued by Responsible AI for Youth

HTML Essentials Certification

Issued by LinkedIn Learning

Programming Fundamentals Certification

Issued by LinkedIn Learning

Database Certification

Issued by LinkedIn Learning

Full Stack Certification

Issued by LinkedIn Learning

Linux Certification

Issued by Vartak College

Contact Information

+91 8689828704

princeanu27@gmail.com

LinkedIn

GitHub

Telegram