Data science roles in India have evolved dramatically. The days of landing a job on the strength of pandas and scikit-learn alone are over. In 2026, companies expect candidates who can handle end-to-end ML pipelines, explain model decisions to business stakeholders, and, increasingly, integrate AI and LLM workflows into products.
This guide covers what you need to crack DS/ML interviews in India, whether you are a fresher targeting analytics roles at startups or an experienced candidate eyeing FAANG India, Razorpay, or PhonePe.
The India Data Science Hiring Landscape
| Company Tier | Companies | What They Hire For |
|---|---|---|
| FAANG India | Google, Amazon, Microsoft, Meta | ML Research, Applied Science, ML Engineering |
| Product Unicorns | Flipkart, Razorpay, PhonePe, Swiggy, Zomato | Applied ML, Recommendation, Fraud, Search |
| Fintech | Paytm, Groww, CRED, Zerodha | Risk Models, Credit Scoring, Fraud Detection |
| IT Services (Analytics) | TCS, Wipro, Infosys (AI units), HCL | Data Analytics, BI, Machine Learning COE |
| Consulting | McKinsey QuantumBlack, BCG Gamma, Deloitte AI | Analytics Strategy, Model Development |
| Startups | 100s across Bangalore, Mumbai, Hyderabad | Full-stack DS: ETL to model to dashboard |
Interview Round Structure (Typical)
| Round | Focus | Duration |
|---|---|---|
| Resume / JD Screen | Skill match, experience fit | N/A (automated) |
| Phone Screen / HR | Background, motivation, CTC | 20–30 min |
| Take-Home Assignment | EDA, model building, communication | 2–5 hours |
| Statistics / Probability | Foundational theory | 45–60 min |
| SQL / Data Manipulation | Querying skills | 45–60 min |
| ML Concepts & Theory | Algorithms, trade-offs, tuning | 60 min |
| Case Study / Business Problem | Applying ML to real business problem | 60–90 min |
| Coding (Python / DSA) | Pandas, NumPy, algorithms | 45–60 min |
| System Design (for senior roles) | ML System Architecture | 60 min |
| Managerial / Leadership Fit | Stakeholder management, communication | 45 min |
Section 1: Statistics and Probability (Non-Negotiable)
These topics are asked at every level, from fresher to Principal DS.
Must-Know Concepts:
| Topic | Key Questions to Practise |
|---|---|
| Probability basics | Bayes’ Theorem, conditional probability, independence |
| Distributions | Normal, Binomial, Poisson, Exponential — when and why |
| Hypothesis testing | p-value, Type I/II errors, z-test vs. t-test |
| A/B Testing | Sample size calculation, statistical significance, business framing |
| Confidence Intervals | Interpretation, margin of error |
| Central Limit Theorem | Why it matters for ML |
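The sample size calculation listed under A/B testing can be sketched with the standard normal-approximation formula for comparing two proportions. This is a minimal illustration, not a full power analysis: the 5% baseline, the 5.3% target, the 5% significance level, and the 80% power are all assumed values chosen for the example, and `sample_size_per_group` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    # Sum of the Bernoulli variances of the two groups
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control               # minimum detectable difference
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Assumed inputs: 5% baseline conversion, hoping to detect a lift to 5.3%.
n = sample_size_per_group(0.05, 0.053)
print(f"Users needed per group: {n}")
```

The useful interview point is the scaling: halving the detectable effect roughly quadruples the required sample, which is why small lifts need long-running tests.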
Classic India Interview Question:
“You run an A/B test on Swiggy’s checkout page. Group A shows a 5% conversion, Group B shows 5.3%. The p-value is 0.04. Is the result statistically significant? Would you ship it?”
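The right answer discusses: statistical significance (yes at the conventional 5% level, since 0.04 < 0.05), practical significance (the lift is only 0.3 percentage points, or 6% relative), business trade-offs, whether the test ran long enough, and segment-level analysis.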
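To make the significance part of that answer concrete, here is a sketch of a pooled two-proportion z-test using only the standard library. The question does not state sample sizes, so the 50,000 users per group below is an assumption, and the resulting p-value will not exactly match the 0.04 quoted in the question.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null hypothesis
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


# Assumed traffic: 50,000 users per group, 5.0% vs. 5.3% conversion.
z, p_value = two_proportion_z_test(2500, 50_000, 2650, 50_000)
absolute_lift = 2650 / 50_000 - 2500 / 50_000
relative_lift = absolute_lift / (2500 / 50_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
print(f"absolute lift = {absolute_lift:.2%}, relative lift = {relative_lift:.1%}")
```

Reporting the absolute and relative lift next to the p-value keeps the conversation on whether the change is worth shipping, not just whether it is detectable.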
Section 2: Machine Learning Algorithms
| Algorithm | Key Interview Questions |
|---|---|
| Linear Regression | Assumptions, multicollinearity, regularisation (L1/L2) |
| Logistic Regression | Log odds, decision boundary, threshold selection |
| Decision Trees | Gini vs. Entropy, overfitting, pruning |
| Random Forest | Bagging, feature importance, out-of-bag error |
| XGBoost / LightGBM | Boosting mechanics, hyperparameter tuning (a favourite in India interviews) |
| K-Means | Elbow method, limitations, distance metrics |
| SVM | Kernel trick, margin, when to use |
| Neural Networks | Backpropagation, activation functions, gradient descent |
| LLMs (2026 essential) | Fine-tuning, RAG, embeddings, prompt engineering basics |
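For the "Gini vs. Entropy" question in the Decision Trees row, it helps to be able to compute both by hand. The sketch below is a plain-Python illustration; the labels and the candidate split are made-up data, and the helper names are hypothetical.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def weighted_impurity(left, right, impurity):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * impurity(left) + len(right) / n * impurity(right)


# Made-up node with 4 fraud and 6 genuine transactions, and one candidate split.
parent = ["fraud"] * 4 + ["ok"] * 6
left = ["fraud"] * 3 + ["ok"] * 1
right = ["fraud"] * 1 + ["ok"] * 5

for name, fn in (("gini", gini), ("entropy", entropy)):
    gain = fn(parent) - weighted_impurity(left, right, fn)
    print(f"{name}: parent = {fn(parent):.3f}, gain from split = {gain:.3f}")
```

Both criteria reward the same kind of split (purer children); they differ mainly in scale and in how strongly they penalise mixed nodes.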
Framework for any ML question:
1. Clarify the problem type (classification / regression / clustering / ranking)
2. Define the target variable and evaluation metric
3. Discuss data: features, missing values, imbalance
4. Choose algorithm with justification
5. Discuss trade-offs (interpretability vs. accuracy)
6. Describe deployment and monitoring considerations
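Steps 2 and 3 of the framework interact: with imbalanced data, the choice of evaluation metric decides whether a model looks good or useless. The sketch below uses made-up predictions on 1,000 transactions, of which 10 are fraud, to show why accuracy alone misleads. The data and function names are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up data: 10 fraud cases followed by 990 genuine transactions.
y_true = [1] * 10 + [0] * 990
always_negative = [0] * 1000                              # never flags anything
model_pred = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970     # 8 caught, 20 false alarms

baseline = classification_metrics(y_true, always_negative)
model = classification_metrics(y_true, model_pred)
print("always negative:", baseline)
print("model:          ", model)
```

The do-nothing baseline wins on accuracy while catching no fraud at all, which is the argument for naming precision, recall, or a cost-based metric up front.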
Section 3: SQL — The Non-Negotiable Skill
Level by role:
- Fresher / Analyst: Basic SELECT, WHERE, GROUP BY, ORDER BY, JOINs
- Mid-level: Window functions, CTEs, subqueries, optimisation
- Senior: Query optimisation, indexing, explain plans, partitioning
Top SQL Questions in India DS Interviews:
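-- Q1: Find the top 3 products by revenue per category
-- A window function cannot be filtered in the WHERE clause of the same SELECT, so rank in a subquery first
SELECT category, product, revenue
FROM (
  SELECT category, product, revenue,
         RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rnk
  FROM sales_table
) ranked
WHERE rnk <= 3;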
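-- Q2: Month-over-month retention rate
-- Assumes prev_month_active is a boolean flag marking users who were also active in the previous month
SELECT month,
       COUNT(DISTINCT user_id) AS active_users,
       -- Multiply by 1.0 to avoid integer division
       COUNT(DISTINCT CASE WHEN prev_month_active THEN user_id END) * 1.0 /
       LAG(COUNT(DISTINCT user_id)) OVER (ORDER BY month) AS retention_rate
FROM user_activity
GROUP BY month;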
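-- Q3: Median order value (no MEDIAN function in most DBs)
-- Average the middle row (odd count) or the two middle rows (even count)
SELECT AVG(order_value) AS median_order_value
FROM (
  SELECT order_value,
         ROW_NUMBER() OVER (ORDER BY order_value) AS rn,
         COUNT(*) OVER () AS total
  FROM orders
) t
-- Divide by 2.0 so integer division does not collapse both bounds to the same row
WHERE rn IN (FLOOR((total + 1) / 2.0), CEIL((total + 1) / 2.0));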
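One way to sanity-check the median pattern is to run it against a tiny in-memory table and compare with Python's own median. The sketch below uses `sqlite3` from the standard library and assumes a SQLite build with window-function support (3.25 or later). Because `FLOOR` and `CEIL` are not available in every SQLite build, this variant picks the middle rows with integer arithmetic instead; the `orders` data is made up.

```python
import sqlite3
import statistics

# Variant of the median query: (total + 1) / 2 and (total + 2) / 2 use
# SQLite integer division, so they select one middle row for odd counts
# and the two middle rows for even counts.
MEDIAN_SQL = """
SELECT AVG(order_value) AS median_order_value
FROM (
    SELECT order_value,
           ROW_NUMBER() OVER (ORDER BY order_value) AS rn,
           COUNT(*) OVER () AS total
    FROM orders
) t
WHERE rn IN ((total + 1) / 2, (total + 2) / 2);
"""


def sql_median(values):
    """Load values into a throwaway in-memory table and run the median query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (order_value REAL)")
        conn.executemany(
            "INSERT INTO orders (order_value) VALUES (?)", [(v,) for v in values]
        )
        return conn.execute(MEDIAN_SQL).fetchone()[0]
    finally:
        conn.close()


even_orders = [120, 250, 310, 480]   # made-up order values, even count
odd_orders = [120, 250, 310]         # made-up order values, odd count
for orders in (even_orders, odd_orders):
    print(orders, "SQL:", sql_median(orders), "Python:", statistics.median(orders))
```

Testing both an odd and an even row count is the quickest way to catch off-by-one mistakes in hand-written median queries.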
Section 4: The Case Study / Business Problem Round
This is where most candidates struggle, not because they lack technical skills but because they forget to anchor the model to business outcomes.
The 6-Step Business ML Framework:
1. CLARIFY → What’s the business problem? What’s the cost of error?
2. DEFINE → What does success look like? (Metric + threshold)
3. DATA → What data do we have? What’s the quality?
4. MODEL → What approach? What trade-offs?
5. EVALUATE → How do you measure model performance? Business KPI?
6. DEPLOY → How do you monitor drift? How often do you retrain?
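The "cost of error" question in the CLARIFY step can be tied directly to a model decision. The sketch below picks a classification threshold by minimising total misclassification cost on a labelled sample. The scores, labels, and rupee costs are made-up values, and the function names are hypothetical.

```python
def total_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost of false positives and false negatives at a given threshold."""
    cost = 0.0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 0:
            cost += cost_fp      # e.g. a genuine customer wrongly blocked
        elif predicted == 0 and label == 1:
            cost += cost_fn      # e.g. a fraud or default that slipped through
    return cost


def best_threshold(y_true, scores, cost_fp, cost_fn, candidates):
    """Return the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda t: total_cost(y_true, scores, t, cost_fp, cost_fn),
    )


# Made-up validation sample: 1 = bad outcome, scores are model probabilities.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
candidates = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Assumed costs: a missed bad case costs 1000, a false alarm costs 50.
t_miss_expensive = best_threshold(y_true, scores, 50, 1000, candidates)
# Reversed assumption: false alarms are the expensive error.
t_alarm_expensive = best_threshold(y_true, scores, 1000, 50, candidates)
print("threshold when misses are expensive:", t_miss_expensive)
print("threshold when false alarms are expensive:", t_alarm_expensive)
```

The same model yields different operating points under different cost assumptions, which is why the cost question belongs at the start of the case study rather than the end.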
Real India-Style Case Studies to Practise:
- “Build a credit scoring model for first-time borrowers on Paytm who have no credit history”
- “Predict churn for Hotstar Premium subscribers”
- “Design a fraud detection system for PhonePe UPI transactions”
- “Build a recommendation system for Zomato Gold members”
- “Forecast demand for Ola driver supply during IPL season in 5 cities”
Section 5: ML System Design (Senior Roles)
For Senior DS / ML Engineer roles at product companies, expect a system design round.
| Topic | Key Concepts |
|---|---|
| Feature Store | Online vs. offline features, latency, consistency |
| Model Serving | REST API, gRPC, batch vs. real-time inference |
| Monitoring | Data drift, concept drift, model performance decay |
| Retraining Pipelines | Trigger-based vs. scheduled, shadow deployment |
| Data Pipelines | Kafka, Spark, Airflow for ML workflows |
| Experiment Tracking | MLflow, Weights & Biases, DVC |
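For the Monitoring row, one widely used data-drift check is the Population Stability Index (PSI), which compares the bucketed distribution of a feature at training time with its live distribution. The table does not name PSI, so treat it as one possible technique rather than the required answer. The bucket edges and transaction amounts below are made up, and the 0.25 cut-off mentioned in the comment is a commonly quoted rule of thumb, not a fixed standard.

```python
import math


def bucket_shares(values, edges, eps=1e-6):
    """Share of values in each bucket defined by sorted edges; eps avoids log(0)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        # The bucket index is the number of edges the value has reached
        counts[sum(1 for e in edges if v >= e)] += 1
    return [max(c / len(values), eps) for c in counts]


def psi(reference, live, edges):
    """Population Stability Index: sum((live - ref) * ln(live / ref)) over buckets."""
    ref = bucket_shares(reference, edges)
    cur = bucket_shares(live, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Made-up transaction amounts at training time and in production.
training_amounts = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
live_amounts = [600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500]
edges = [250, 500, 750]

print("PSI, same data:   ", psi(training_amounts, training_amounts, edges))
# Values above roughly 0.25 are often read as a significant shift.
print("PSI, shifted data:", psi(training_amounts, live_amounts, edges))
```

A check like this covers data drift only; concept drift and performance decay still need labelled outcomes to detect.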
Typical System Design Question:
“Design a real-time fraud detection system for 10M UPI transactions per day.”
Hit these points: data ingestion (Kafka), feature engineering (real-time and batch), model serving latency (<100ms), feedback loop, monitoring, and fallback logic.
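Two of those points lend themselves to a quick sketch: the back-of-the-envelope throughput estimate and the fallback logic. The peak multiplier, the amount limit in the rule, and the function names are all assumptions for illustration. A real system would enforce the latency budget in the serving client rather than with a bare `try`/`except`.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_tps(transactions_per_day):
    """Average transactions per second implied by a daily volume."""
    return transactions_per_day / SECONDS_PER_DAY


def rule_based_score(txn, amount_limit=50_000):
    """Hypothetical fallback rule: flag unusually large amounts."""
    return 1.0 if txn["amount"] > amount_limit else 0.0


def score_transaction(txn, model_fn):
    """Use the model when it answers; otherwise fall back to the rule."""
    try:
        return model_fn(txn), "model"
    except Exception:  # e.g. a timeout raised by the serving client
        return rule_based_score(txn), "fallback"


def healthy_model(txn):
    """Stand-in for a real model call; returns a fixed fraud probability."""
    return 0.12


def timed_out_model(txn):
    """Stand-in for a model call that exceeds its latency budget."""
    raise TimeoutError("model did not respond within the latency budget")


avg = average_tps(10_000_000)
peak = avg * 5  # assumed peak-to-average ratio of 5x
print(f"average TPS ~ {avg:.1f}, assumed peak TPS ~ {peak:.1f}")

txn = {"amount": 75_000}
print(score_transaction(txn, healthy_model))
print(score_transaction(txn, timed_out_model))
```

Stating the average (roughly 116 TPS) and an explicit peak assumption early gives the interviewer something concrete to push back on before you discuss Kafka partitions or serving replicas.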
30-Day Preparation Plan
WEEK 1: Foundations
☐ Statistics: Complete StatQuest YouTube series (free)
☐ SQL: Mode Analytics SQL tutorials + LeetCode SQL 50
☐ Python: Review pandas, NumPy, scikit-learn docs
WEEK 2: Algorithms and Modelling
☐ ML Algorithms: Hands-On ML (Géron) — 3 key chapters
☐ XGBoost: Kaggle course (free, 4 hours)
☐ Build 1 end-to-end project (Kaggle dataset, full pipeline)
WEEK 3: Case Studies and Business Thinking
☐ Practise 5 business ML case studies (use framework above)
☐ Learn A/B testing thoroughly (Udacity course, free)
☐ Review 3 real ML case studies from Indian companies (Swiggy, Flipkart engineering blogs)
WEEK 4: Mock Interviews and System Design
☐ 3 mock interviews (Pramp, Interviewing.io, or peer mock)
☐ ML System Design: Chip Huyen’s “Designing ML Systems” (first 3 chapters)
☐ Review your target company’s tech blog — match your answers to their stack
India-Specific Resources
| Resource | What It Covers | Cost |
|---|---|---|
| Kaggle | Datasets, competitions, courses | Free |
| Analytics Vidhya | India-focused DS tutorials, hackathons | Free/Paid |
| Towards Data Science | Practical ML articles | Free |
| StatQuest (YouTube) | Statistics and ML intuition | Free |
| LeetCode (SQL section) | SQL interview prep | Free/Paid |
| Chip Huyen’s blog | ML Systems | Free |
| IIMB / ISB online courses | Business + Data Analytics | Paid |
References
- NASSCOM (2024) — India Data Science and AI Talent Report — [nasscom.in](https://nasscom.in)
- Analytics Vidhya (2024) — India DS Interview Trends — [analyticsvidhya.com](https://www.analyticsvidhya.com)
- LinkedIn India (2024) — Top Skills for Data Science Roles in India — [linkedin.com/business/talent](https://business.linkedin.com/talent-solutions)
- Glassdoor India (2024) — Data Science Interview Questions — India Companies — [glassdoor.co.in](https://www.glassdoor.co.in)
- Chip Huyen (2022) — Designing Machine Learning Systems — [oreilly.com](https://www.oreilly.com)
