Top 25 Machine Learning Interview Questions With Python Code Examples 2026
Prepare with 25 real machine learning interview questions, complete with Python code examples. Covers supervised learning, evaluation metrics, overfitting, and practical implementation.
Ravi Vohra
19 Jun 2026
Top 25 Machine Learning Interview Questions (With Python Code Examples): The Real Ones
I bombed a machine learning interview once. Not because I did not know the algorithms. Because the interviewer asked me to write code. On a shared screen. While he watched. I had practiced theory for weeks. Bias-variance tradeoff. Cross-validation. Gradient descent. I could explain them all beautifully. Then he said, "Write a function to calculate the Gini impurity of a dataset." I froze. I knew what Gini impurity was. I had read about it three times. I had never coded it from scratch.
So here are twenty-five real machine learning interview questions. The ones that come up repeatedly. Each one with a Python code example where it makes sense. Not pseudocode. Real, runnable Python using scikit-learn, NumPy, and Pandas. The kind you can type into a shared editor while someone watches.
A quick note. Do not memorize the code. Understand it. Change it. Break it. Fix it. The interviewer is not checking your memory. They are checking whether you understand what the code is doing and why.
What is the bias-variance tradeoff? Write code to demonstrate overfitting and underfitting.
This is one of the most important concepts in machine learning. Bias is error from oversimplification. High bias underfits. The model is too simple to capture patterns. Variance is error from oversensitivity to training data. High variance overfits. The model memorizes noise.
Here is code that demonstrates both using polynomial regression.
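The listing below is a minimal sketch, not a definitive version. It assumes synthetic data (a noisy sine curve), and the degrees 1, 4, and 15 are illustrative choices. Exact numbers depend on the random seed, so treat the printed errors as something to inspect rather than a fixed result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit polynomials of increasing degree and record train/test error
results = {}
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[degree] = (train_mse, test_mse)
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Compare the two columns. Training error keeps falling as the degree grows. Test error is the one that tells you when the model has stopped learning signal and started fitting noise.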
Low degree underfits. High degree overfits. The sweet spot is somewhere in the middle where test error is minimized. This is the bias-variance tradeoff in action.
Write a function to split data into train and test sets without using scikit-learn.
The interviewer wants to see you understand the mechanics.
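Python
import numpy as np

def train_test_split_manual(X, y, test_size=0.2, random_state=None):
    # Seed only when a value is given; "is not None" keeps random_state=0 valid
    if random_state is not None:
        np.random.seed(random_state)
    # Shuffle row indices so the split does not depend on the original order
    indices = np.arange(X.shape[0])
    np.random.shuffle(indices)
    # The first (1 - test_size) share of shuffled rows goes to training
    split_idx = int(X.shape[0] * (1 - test_size))
    train_indices = indices[:split_idx]
    test_indices = indices[split_idx:]
    return X[train_indices], X[test_indices], y[train_indices], y[test_indices]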
The key details. Shuffling prevents order bias. The split point respects the test ratio. Mention that in a real project you would use scikit-learn. This is about demonstrating understanding.
What is cross-validation? Write code for k-fold cross-validation.
Cross-validation splits data into k folds. Train on k minus one, validate on the remaining one. Rotate which fold validates. It gives a more reliable performance estimate than a single split.
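A minimal sketch of the rotation described above, written as an explicit loop so the mechanics are visible. The synthetic dataset, the logistic regression model, and k=5 are assumptions for illustration. In practice scikit-learn's cross_val_score wraps this same loop.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Synthetic data (an assumption for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, val_idx in kf.split(X):
    # Train a fresh model on k-1 folds, score it on the held-out fold
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))

scores = np.array(scores)
print(f"Fold accuracies: {np.round(scores, 3)}")
print(f"Mean: {scores.mean():.3f}, std: {scores.std():.3f}")
```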
The standard deviation matters. High variance across folds suggests the model is sensitive to the specific training data. Low variance suggests stability.
How do you handle missing values? Write code.
The approach depends on why values are missing. Show you understand the options.
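A minimal sketch of the common options using Pandas. The tiny DataFrame and its column names are made up for illustration, and the choice of median, mean, and mode per column is an assumption, not a rule.

```python
import numpy as np
import pandas as pd

# Toy data with gaps (made up for illustration)
df = pd.DataFrame({
    "age": [25, np.nan, 40, 31, np.nan],
    "income": [50000, 62000, np.nan, 58000, 45000],
    "city": ["Pune", None, "Delhi", "Pune", "Pune"],
})

# Step 1: see how much is missing in each column
print(df.isna().sum())

# Option A: drop any row that has a missing value
dropped = df.dropna()

# Option B: fill numeric gaps with a summary statistic,
# and categorical gaps with the most frequent value
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["income"] = filled["income"].fillna(filled["income"].mean())
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])

print(f"Rows before: {len(df)}, after dropna: {len(dropped)}")
print(filled)
```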
Explain that dropping rows loses data. Filling with mean preserves sample size but can distort distributions. The choice depends on how much data is missing and why.
What is the difference between fit, transform, and fit_transform?
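Fit learns parameters from the data, such as the mean and standard deviation for a scaler. Transform applies those learned parameters to modify the data. The fit_transform method does both in one call. Use fit_transform on training data, then only transform on test data. Calling fit_transform on test data re-learns the parameters from the test set, which leaks test information into preprocessing, so the evaluation no longer reflects truly unseen data.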
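A minimal sketch with StandardScaler, assuming a tiny hand-made array so the numbers are easy to follow. The variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[4.0]])

scaler = StandardScaler()
scaler.fit(X_train)                          # learns mean_ and scale_ from training data
X_train_scaled = scaler.transform(X_train)   # applies the learned parameters
X_test_scaled = scaler.transform(X_test)     # reuses the training mean and scale

# Shortcut for the training data only: fit and transform in one call
X_train_scaled_again = StandardScaler().fit_transform(X_train)

print("Learned mean:", scaler.mean_, "learned scale:", scaler.scale_)
print("Scaled test row:", X_test_scaled)
```

The test row is scaled with the training mean of 2.0, not with its own value. That is the whole point of calling transform instead of fit_transform on it.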
What is regularization? Write code comparing L1 and L2.
Regularization penalizes model complexity to prevent overfitting. L1 can zero out coefficients, performing feature selection. L2 shrinks coefficients but keeps all features.
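A minimal sketch comparing the two, assuming synthetic regression data where only 5 of 20 features carry signal. The alpha value of 1.0 is an arbitrary illustrative choice, and the exact count of zeroed coefficients depends on the data and on alpha.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (an assumption): 20 features, only 5 informative
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=42
)

lasso = Lasso(alpha=1.0).fit(X, y)   # L1 penalty
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty

# Count coefficients that were driven exactly to zero
lasso_zeros = int(np.sum(lasso.coef_ == 0))
ridge_zeros = int(np.sum(ridge.coef_ == 0))

print(f"Lasso (L1) zero coefficients: {lasso_zeros} of {lasso.coef_.size}")
print(f"Ridge (L2) zero coefficients: {ridge_zeros} of {ridge.coef_.size}")
```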
The diagonal is correct predictions. Off-diagonal are errors. They tell you which classes are being mistaken for which others.
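A small illustration of reading a confusion matrix, assuming made-up labels for three classes. In scikit-learn's confusion_matrix, rows are true classes and columns are predicted classes, in the order given by labels.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration
y_true = ["cat", "cat", "dog", "dog", "dog", "bird", "bird", "cat"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird", "cat", "cat"]
labels = ["bird", "cat", "dog"]

cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Correct predictions sit on the diagonal
correct = cm.trace()
print(f"Correct: {correct} of {cm.sum()}")
```

Here the row for bird shows one bird predicted as cat, and the row for dog shows one dog predicted as cat. That is the kind of pattern the off-diagonal cells reveal.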
What is feature scaling and why does it matter? Write code to standardize features.
Algorithms using distance, gradient descent, or regularization are sensitive to feature scales. Standardization gives features zero mean and unit variance.
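A minimal sketch that standardizes by hand with NumPy and checks the result against scikit-learn's StandardScaler. The small array is made up, with two columns on very different scales.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales (made-up values)
X = np.array([[1.0, 1000.0], [2.0, 2000.0], [3.0, 6000.0]])

# Manual standardization: subtract the column mean, divide by the column
# standard deviation (population std, ddof=0, which StandardScaler also uses)
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

X_sklearn = StandardScaler().fit_transform(X)

print("Column means:", X_manual.mean(axis=0).round(6))
print("Column stds: ", X_manual.std(axis=0).round(6))
print("Matches StandardScaler:", np.allclose(X_manual, X_sklearn))
```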
This is simplified. A full implementation would build the tree recursively. But this core logic, finding the split that minimizes impurity, is the heart of a decision tree.
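A minimal sketch of that core logic for a single numeric feature, written from scratch with NumPy. The function names gini_impurity and best_split are illustrative, and trying midpoints between sorted unique values as thresholds is one common convention, assumed here.

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / labels.size
    return float(1.0 - np.sum(p ** 2))

def best_split(x, y):
    """Find the threshold on one feature that minimizes weighted Gini."""
    x, y = np.asarray(x), np.asarray(y)
    best_threshold, best_score = None, float("inf")
    values = np.unique(x)
    # Candidate thresholds: midpoints between consecutive unique values
    for threshold in (values[:-1] + values[1:]) / 2:
        left, right = y[x <= threshold], y[x > threshold]
        # Weight each side's impurity by its share of the samples
        score = (
            left.size * gini_impurity(left) + right.size * gini_impurity(right)
        ) / y.size
        if score < best_score:
            best_threshold, best_score = float(threshold), score
    return best_threshold, best_score

print(gini_impurity([0, 0, 1, 1]))
print(best_split([1, 2, 3, 4], [0, 0, 1, 1]))
```

A perfectly mixed two-class node has impurity 0.5 and a pure node has 0. On the toy data, splitting at 2.5 separates the classes completely.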
What is ensemble learning? Write code comparing a single model to a random forest.
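Ensemble methods combine multiple models. A random forest trains many decision trees, each on a bootstrap sample of the rows with a random subset of features considered at each split, then combines their predictions.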
Python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Synthetic classification data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

tree = DecisionTreeClassifier(random_state=42)
forest = RandomForestClassifier(n_estimators=100, random_state=42)

# 5-fold cross-validated accuracy for each model
tree_scores = cross_val_score(tree, X, y, cv=5)
forest_scores = cross_val_score(forest, X, y, cv=5)

print(f"Decision Tree accuracy: {tree_scores.mean():.3f} (+/- {tree_scores.std():.3f})")
print(f"Random Forest accuracy: {forest_scores.mean():.3f} (+/- {forest_scores.std():.3f})")
The random forest usually outperforms a single tree. Averaging many trees that make different mistakes reduces variance.
What is gradient descent? Write code for a simple implementation.
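Gradient descent is an optimization algorithm that iteratively adjusts parameters to minimize a loss function. Each step moves the parameters a small amount, set by the learning rate, in the direction opposite to the gradient of the loss.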
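A minimal sketch of batch gradient descent for linear regression with mean squared error. The function name gradient_descent, the learning rate of 0.1, the iteration count, and the synthetic data (true slope 3, true intercept 2) are all assumptions for illustration.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iters=1000):
    """Fit y ~ X @ w + b by minimizing mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        error = X @ w + b - y
        # Gradients of MSE with respect to the weights and the intercept
        grad_w = (2 / n_samples) * (X.T @ error)
        grad_b = (2 / n_samples) * error.sum()
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data (an assumption): y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.1, 100)

w, b = gradient_descent(X, y)
print(f"Learned slope: {w[0]:.2f}, learned intercept: {b:.2f}")
```

If the learning rate is too large the updates diverge. If it is too small, convergence is slow. Being able to say that out loud matters as much as the loop itself.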
RMSE penalizes large errors more than MAE. R-squared tells you how much variance your model explains.
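A minimal sketch computing the three metrics on made-up predictions, where one prediction is off by 3 so the effect of a large error is visible.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up values; the last prediction is off by 3
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 12.0])

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # square root of MSE
r2 = r2_score(y_true, y_pred)

print(f"MAE:  {mae:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"R^2:  {r2:.3f}")
```

The single large miss pulls RMSE above MAE because errors are squared before averaging.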
What is the difference between bagging and boosting? Write code for both.
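Bagging trains models independently on bootstrap samples of the data, so they can run in parallel. It mainly reduces variance. Boosting trains models sequentially, each one correcting the errors of the previous ones. It mainly reduces bias. The code below uses a random forest as the bagging-style ensemble and gradient boosting as the boosting one.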
Python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# Synthetic classification data
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

bagging = RandomForestClassifier(n_estimators=100, random_state=42)
boosting = GradientBoostingClassifier(n_estimators=100, random_state=42)

# Mean 5-fold cross-validated accuracy for each ensemble
print("Bagging (Random Forest):", cross_val_score(bagging, X, y, cv=5).mean())
print("Boosting (Gradient Boosting):", cross_val_score(boosting, X, y, cv=5).mean())
What is a hyperparameter? Write code using GridSearchCV.
Hyperparameters are set before training. They control model behavior. Grid search tries combinations to find the best ones.
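A minimal sketch with GridSearchCV, assuming a random forest on synthetic data. The grid values are deliberately small and illustrative so the search finishes quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data (an assumption for illustration)
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# Every combination is tried: 2 x 2 = 4 candidates
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```

Grid search cost grows multiplicatively with each added hyperparameter, which is worth mentioning if the interviewer asks about alternatives such as randomized search.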
For production, you would wrap the model in an API using Flask or FastAPI. But the saving and loading is the first step.
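A minimal sketch of the save-and-load step using Python's built-in pickle module. This is one common option and an assumption here; joblib is another. The model, the data, and the file name model.pkl are illustrative. Only load pickle files from sources you trust.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a small model on synthetic data (an assumption for illustration)
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Save the fitted model to disk (a temporary directory in this sketch)
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back and confirm it predicts exactly as before
with open(model_path, "rb") as f:
    loaded = pickle.load(f)

same = np.array_equal(model.predict(X), loaded.predict(X))
print("Loaded model matches original predictions:", same)
```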
A Quick Preparation Checklist
One. Write code without autocomplete. Practice on a plain text editor or a whiteboard. Autocomplete is a crutch that will not be there in many interview settings.
Two. Explain your code out loud as you write it. The interviewer wants to hear your thought process, not just see correct syntax.
Three. Know scikit-learn but also understand what is happening underneath. Be able to code a simple version of core concepts.
Four. Have a project ready to discuss. A real model you built, deployed, and can talk about in detail. The bug stories, the design decisions, the lessons learned.
Five. Review the fundamentals before the advanced stuff. Most interviewers probe linear regression, logistic regression, and decision trees more deeply than neural networks.
The Honest Closing
Twenty-five questions is a lot. You will not get all of them. But if you understand the concepts and can write the code, you can handle whatever comes. The interviewer wants to see that you have trained real models, that you understand why they work, and that you can implement them without relying entirely on library magic.
If you are still building these skills, structured practice helps. SkillsYard's Data Science and AI program covers machine learning implementation with live mentorship and real projects. A free demo class is available. No commitment. Just a session to see if the teaching style clicks.