Top 40 Data Analytics Interview Questions (SQL, Excel, Power BI) 2026

Prepare with 40 real data analytics interview questions on SQL, Excel, and Power BI. Honest, practical answers from someone who has hired analysts. No fluff.

Ravi Vohra

13 Jun 2026

46 min read

Top 40 Data Analytics Interview Questions (SQL, Excel, Power BI): The Real Ones

I once interviewed a candidate for a data analyst role. His resume was polished. His certifications were listed in a neat table. I asked him to write a simple SQL query. A JOIN with a GROUP BY. Nothing tricky. He stared at the screen for a long moment. Then he said, "I usually just export the data from the analytics tool. I do not write queries from scratch."

I did not hire him. Not because he was unintelligent. Because the job required pulling data from databases that do not have friendly export buttons. The tools are conveniences. The fundamentals are the job.

That experience shaped how I think about data analytics interview questions. The real ones test whether you have done the work or just watched someone else do it in a video. SQL. Excel. Power BI. The three tools that quietly run the business world.

I have put together forty questions: thirty-five across these tools, plus five scenario questions at the end. Do not memorize them. Use them to check your understanding. If you can explain a concept clearly and build a small example, you know it. If you cannot, that is the gap to fill.

The SQL Questions

SQL is the most important tool in analytics. Without it, you cannot get data. If you cannot get data, you cannot analyze it. These questions test whether you have written real queries or just watched tutorials.

What is SQL and why is it essential for data analysts?

Structured Query Language. It is how you talk to databases. Analysts use it to pull data, clean it, join tables, and prepare data for analysis. Without SQL, you are dependent on someone else to give you data. That dependency makes you slower and less valuable.

What is the difference between SELECT and SELECT DISTINCT?

SELECT returns all rows. SELECT DISTINCT returns only unique rows, removing duplicates. It is useful for finding unique values in a column, like all the different product categories. But on multiple columns, it looks for unique combinations, not unique values per column.
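Here is a minimal sketch of the multi-column behavior, run through Python's built-in sqlite3 module. The products table and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical products table, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (category TEXT, brand TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Shoes", "Acme"), ("Shoes", "Acme"), ("Shoes", "Zenith"), ("Hats", "Acme")],
)

# One column: unique values of that column.
categories = conn.execute(
    "SELECT DISTINCT category FROM products ORDER BY category"
).fetchall()

# Two columns: unique (category, brand) combinations,
# so 'Shoes' shows up once per brand.
pairs = conn.execute(
    "SELECT DISTINCT category, brand FROM products ORDER BY category, brand"
).fetchall()

print(categories)  # [('Hats',), ('Shoes',)]
print(pairs)       # [('Hats', 'Acme'), ('Shoes', 'Acme'), ('Shoes', 'Zenith')]
```

Only the exact repeat of (Shoes, Acme) is dropped in the second query. Shoes still appears twice because the pairs differ.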

Explain WHERE versus HAVING.

WHERE filters individual rows before grouping. HAVING filters groups after GROUP BY. You cannot use aggregate functions like SUM in WHERE. That is what HAVING is for. WHERE is for row-level filtering. HAVING is for group-level filtering.
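A small sketch that uses both clauses in one query, assuming a hypothetical sales table with a status column. WHERE drops refunded rows first. HAVING then keeps only the regions whose remaining total is above 300.

```python
import sqlite3

# Hypothetical sales data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", 100, "paid"),
        ("North", 250, "paid"),
        ("North", 500, "refunded"),
        ("South", 80, "paid"),
        ("South", 90, "paid"),
        ("East", 400, "paid"),
    ],
)

big_regions = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE status = 'paid'        -- row-level filter, applied before grouping
    GROUP BY region
    HAVING SUM(amount) > 300     -- group-level filter, applied after grouping
    ORDER BY region
    """
).fetchall()

print(big_regions)  # [('East', 400), ('North', 350)]
```

North's 500 refund never reaches the SUM, and South's total of 170 is removed by HAVING.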

Write a query to find the second highest order value from a sales table.

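There are a few ways. One is ORDER BY with LIMIT and OFFSET: SELECT DISTINCT order_value FROM sales ORDER BY order_value DESC LIMIT 1 OFFSET 1. The DISTINCT matters, because without it a tie for first place would come back as the second row. LIMIT and OFFSET are MySQL, PostgreSQL, and SQLite syntax; SQL Server uses OFFSET ... FETCH instead. Or use a window function like DENSE_RANK and filter for rank two. The interviewer wants to see you can think beyond basic SELECT-FROM-WHERE.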
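Both approaches can be sketched against a tiny, made-up sales table with a tie at the top. This runs on SQLite through Python; the window-function version assumes SQLite 3.25 or newer, which recent Python builds ship with.

```python
import sqlite3

# Hypothetical sales table with two orders tied for the highest value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (order_id INTEGER, order_value INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 500), (2, 500), (3, 300), (4, 200)],
)

# Approach 1: distinct values, sorted descending, skip one, take one.
second_by_offset = conn.execute(
    "SELECT DISTINCT order_value FROM sales "
    "ORDER BY order_value DESC LIMIT 1 OFFSET 1"
).fetchone()[0]

# Approach 2: DENSE_RANK gives tied values the same rank with no gaps.
second_by_rank = conn.execute(
    """
    SELECT DISTINCT order_value
    FROM (
        SELECT order_value,
               DENSE_RANK() OVER (ORDER BY order_value DESC) AS rnk
        FROM sales
    ) AS ranked
    WHERE rnk = 2
    """
).fetchone()[0]

print(second_by_offset, second_by_rank)  # 300 300
```

The DENSE_RANK version generalizes more easily, for example to the second highest order per customer by adding PARTITION BY.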

What is a JOIN and which types have you used?

A JOIN combines rows from two or more tables based on a related column. INNER JOIN returns only matching rows. LEFT JOIN returns all rows from the left table and matching rows from the right, with NULLs where there is no match. RIGHT JOIN is the reverse. FULL OUTER JOIN returns all rows from both tables, with NULLs on whichever side has no match. I use LEFT JOIN the most. Most real-world queries involve some kind of JOIN.
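A quick sketch of INNER versus LEFT JOIN, using hypothetical customers and orders tables. RIGHT and FULL OUTER JOIN are left out here because older SQLite versions do not support them.

```python
import sqlite3

# Hypothetical tables: Chen has no orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben"), (3, "Chen")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 50), (11, 1, 70), (12, 2, 20)])

# INNER JOIN: only customers that have at least one order.
inner_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name, o.amount"
).fetchall()

# LEFT JOIN: every customer, with NULL (None in Python) where no order matches.
left_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name, o.amount"
).fetchall()

# A common LEFT JOIN pattern: find rows with no match on the right.
no_orders = conn.execute(
    "SELECT c.name FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "WHERE o.id IS NULL"
).fetchall()

print(inner_rows)  # [('Asha', 50), ('Asha', 70), ('Ben', 20)]
print(left_rows)   # [('Asha', 50), ('Asha', 70), ('Ben', 20), ('Chen', None)]
print(no_orders)   # [('Chen',)]
```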

What is a subquery?

A query nested inside another query. Used in WHERE, FROM, or SELECT clauses. You might use one to filter based on a calculation, like finding customers whose average order value exceeds the overall average. They are powerful but can be slow if written poorly. Sometimes a JOIN or a CTE is cleaner.
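The customer example above can be sketched like this. The orders table and its values are assumptions made for the demo.

```python
import sqlite3

# Hypothetical orders: overall average is 900 / 5 = 180.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("A", 100), ("A", 300), ("B", 50), ("B", 50), ("C", 400)],
)

# The inner SELECT computes the overall average once;
# the outer query compares each customer's average against it.
above_average = conn.execute(
    """
    SELECT customer, AVG(amount) AS avg_order
    FROM orders
    GROUP BY customer
    HAVING AVG(amount) > (SELECT AVG(amount) FROM orders)
    ORDER BY customer
    """
).fetchall()

print(above_average)  # [('A', 200.0), ('C', 400.0)]
```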

What are aggregate functions? Name a few.

Functions that perform a calculation on a set of rows and return a single value. COUNT, SUM, AVG, MIN, MAX. They are almost always paired with GROUP BY. The GROUP BY defines the groups. The aggregate function calculates the summary for each group.

What is GROUP BY and why is it important?

It groups rows sharing the same values in specified columns. It is how you summarize data. Sales by region. Average order value by customer segment. Count of users by signup month. Most analytical queries use it.

What is the difference between COUNT(*) and COUNT(column)?

COUNT(*) counts all rows, including rows with NULLs. COUNT(column) counts only rows where that specific column is not NULL. If your data has missing values, these will give different results. The distinction matters.
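A short sketch of the difference, assuming a hypothetical customers table where one email is missing and one is repeated.

```python
import sqlite3

# Hypothetical customers: one NULL email, one duplicated email.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (3, "b@example.com"), (4, "a@example.com")],
)

counts = conn.execute(
    """
    SELECT COUNT(*)              AS all_rows,        -- every row
           COUNT(email)          AS rows_with_email, -- NULLs skipped
           COUNT(DISTINCT email) AS unique_emails    -- NULLs and repeats skipped
    FROM customers
    """
).fetchone()

print(counts)  # (4, 3, 2)
```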

How do you handle NULL values in SQL?

You cannot use = to check for NULL. NULL is the absence of a value, not a value itself, so a comparison like column = NULL never evaluates to true. Use IS NULL or IS NOT NULL. COALESCE replaces NULLs with a default value. These tools are essential for data cleaning.
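A minimal sketch of the three behaviors, on a made-up orders table where one discount is missing.

```python
import sqlite3

# Hypothetical orders: order 2 has no discount recorded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, discount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, None), (3, 0)])

# Comparing with = NULL matches nothing, even though a NULL row exists.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE discount = NULL"
).fetchone()[0]

# IS NULL is the correct test.
is_null = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE discount IS NULL"
).fetchone()[0]

# COALESCE returns its first non-NULL argument, so NULL becomes 0 here.
cleaned = conn.execute(
    "SELECT id, COALESCE(discount, 0) FROM orders ORDER BY id"
).fetchall()

print(equals_null, is_null)  # 0 1
print(cleaned)               # [(1, 10), (2, 0), (3, 0)]
```

Whether replacing a missing discount with 0 is the right call depends on what the NULL means in your data. That is a business question, not a syntax question.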

What is a Common Table Expression (CTE)?

A temporary named result set you can reference within a larger query. It makes complex queries readable. Instead of nesting five subqueries, you define each step as a CTE and combine them. It breaks a big problem into smaller, understandable pieces.
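As a sketch, here is a two-step CTE that finds months whose sales total beats the average monthly total. The sales table, the CTE names monthly and benchmark, and the figures are all invented for illustration.

```python
import sqlite3

# Hypothetical sales rows: monthly totals are 300, 50, and 500.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2026-01", 100), ("2026-01", 200), ("2026-02", 50),
     ("2026-03", 400), ("2026-03", 100)],
)

strong_months = conn.execute(
    """
    WITH monthly AS (              -- step 1: total per month
        SELECT month, SUM(amount) AS total
        FROM sales
        GROUP BY month
    ),
    benchmark AS (                 -- step 2: average of those totals
        SELECT AVG(total) AS avg_total FROM monthly
    )
    SELECT monthly.month, monthly.total
    FROM monthly
    CROSS JOIN benchmark
    WHERE monthly.total > benchmark.avg_total
    ORDER BY monthly.month
    """
).fetchall()

print(strong_months)  # [('2026-01', 300), ('2026-03', 500)]
```

Each step has a name and can be read top to bottom, which is the main reason to prefer this over nested subqueries.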

Explain window functions with an example.

Functions that perform calculations across a set of rows related to the current row. Unlike GROUP BY, they do not collapse rows. ROW_NUMBER, RANK, LAG, LEAD, SUM with OVER. Example. Calculating a running total of sales by month. Or ranking customers by order value within each region.
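Both examples can be sketched as follows. The tables and numbers are hypothetical, and the queries assume SQLite 3.25 or newer for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical monthly sales for the running total.
conn.execute("CREATE TABLE monthly_sales (month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO monthly_sales VALUES (?, ?)",
    [("2026-01", 100), ("2026-02", 150), ("2026-03", 50)],
)

# Every input row is kept; each one gains a cumulative sum.
running = conn.execute(
    "SELECT month, amount, SUM(amount) OVER (ORDER BY month) AS running_total "
    "FROM monthly_sales ORDER BY month"
).fetchall()

# Hypothetical customer orders for ranking within each region.
conn.execute("CREATE TABLE customer_orders (region TEXT, customer TEXT, order_value INTEGER)")
conn.executemany(
    "INSERT INTO customer_orders VALUES (?, ?, ?)",
    [("North", "A", 500), ("North", "B", 300), ("South", "C", 200), ("South", "D", 700)],
)

# PARTITION BY restarts the ranking for every region.
ranked = conn.execute(
    """
    SELECT region, customer,
           RANK() OVER (PARTITION BY region ORDER BY order_value DESC) AS rnk
    FROM customer_orders
    ORDER BY region, rnk
    """
).fetchall()

print(running)  # [('2026-01', 100, 100), ('2026-02', 150, 250), ('2026-03', 50, 300)]
print(ranked)   # [('North', 'A', 1), ('North', 'B', 2), ('South', 'D', 1), ('South', 'C', 2)]
```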

What is an index and why does it matter?

A database structure that speeds up data retrieval. Like a book index. Without it, the database scans every row. With it, the database knows where to look. Indexes make queries faster but slow down inserts and updates. It is a tradeoff.
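One way to see the effect is to ask the database for its plan before and after creating an index. This sketch uses SQLite's EXPLAIN QUERY PLAN on a made-up orders table. The exact plan wording varies between SQLite versions, and other databases have their own EXPLAIN syntax, so treat the printed text as indicative only.

```python
import sqlite3

# Hypothetical orders table with 1,000 generated rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [(f"cust{i % 100}", i) for i in range(1000)],
)

def plan(sql):
    """Return SQLite's query plan as one string (the detail text is column 3)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

query = "SELECT id, amount FROM orders WHERE customer = 'cust7'"

plan_before = plan(query)   # no index yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = plan(query)    # now a search using idx_orders_customer

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the whole table. The second names the index. The cost is that every later insert or update on orders must also maintain idx_orders_customer.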

How would you optimize a slow SQL query?

Check the execution plan. It shows where time is spent. Avoid SELECT *. Pull only needed columns. Filter early with WHERE. Avoid wrapping indexed columns in functions in WHERE, because the database usually cannot use the index when the column sits inside a function call. Consider if an index is missing. Query optimization is a practical skill learned by doing.
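The advice about functions on indexed columns can be sketched in SQLite. Both queries below return the same 2026 rows from a hypothetical orders table, but only the range filter lets the planner use the index on order_date. Plan text differs across versions and databases, so this is a demonstration under those assumptions, not a benchmark.

```python
import sqlite3

# Hypothetical orders table with ISO-format date strings across 2024 to 2026.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [(f"{2024 + i % 3}-{1 + i % 12:02d}-15", i) for i in range(600)],
)
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")

def plan(sql):
    """Return SQLite's query plan as one string (the detail text is column 3)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Function wrapped around the indexed column: the index cannot be used.
where_function = "WHERE substr(order_date, 1, 4) = '2026'"
# Same filter written as a range on the bare column: the index can be used.
where_range = "WHERE order_date >= '2026-01-01' AND order_date < '2027-01-01'"

plan_function = plan("SELECT id, amount FROM orders " + where_function)
plan_range = plan("SELECT id, amount FROM orders " + where_range)

count_function = conn.execute("SELECT COUNT(*) FROM orders " + where_function).fetchone()[0]
count_range = conn.execute("SELECT COUNT(*) FROM orders " + where_range).fetchone()[0]

print(plan_function)
print(plan_range)
print(count_function, count_range)  # 200 200
```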

Write a query to find duplicate records.

GROUP BY the columns that should be unique. Add HAVING COUNT(*) > 1. This returns groups appearing more than once. This task comes up constantly during data cleaning. Every analyst should write this without thinking.
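A minimal sketch, assuming a hypothetical users table where email is supposed to be unique.

```python
import sqlite3

# Hypothetical users: one email appears twice, another three times.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",), ("a@example.com",),
     ("c@example.com",), ("c@example.com",), ("c@example.com",)],
)

duplicates = conn.execute(
    """
    SELECT email, COUNT(*) AS occurrences
    FROM users
    GROUP BY email            -- the column(s) that should be unique
    HAVING COUNT(*) > 1       -- keep only groups that repeat
    ORDER BY email
    """
).fetchall()

print(duplicates)  # [('a@example.com', 2), ('c@example.com', 3)]
```

If uniqueness depends on several columns together, list all of them in both SELECT and GROUP BY.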

The Excel Questions

Excel refuses to die. Stakeholders live in spreadsheets. Reports are shared as Excel files. Quick analyses happen in Excel. Knowing it well is a quiet advantage.

What is VLOOKUP and what are its limitations?

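It searches for a value in the first column of a range and returns a value from a specified column in the same row. Limitations. It only looks to the right of the lookup column, never left. It defaults to an approximate match unless you pass FALSE as the last argument, which can silently return wrong values on unsorted data. It can be slow on large data. It breaks if columns are inserted because it uses column index numbers. XLOOKUP fixes most of these problems. Mention XLOOKUP and the interviewer knows you are current.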

What is the difference between VLOOKUP and INDEX-MATCH?

INDEX-MATCH is more flexible. It can look left or right. It is not dependent on column position. It can be faster on large datasets. Slightly harder to learn, which is why VLOOKUP persists despite being worse. Knowing both signals real experience.

What is a PivotTable and how do you use it?

A tool for summarizing large datasets interactively. Drag fields into rows, columns, values, and filters. It creates summaries without formulas. Sales by region. Count of orders by month. It is fast and flexible. The first thing I use when exploring new data in Excel.

How do you remove duplicates in Excel?

Data tab, Remove Duplicates button. Select the columns to check. Always make a copy of your data first. Removing duplicates is destructive. Understand what Excel considers a duplicate. It checks the selected columns, not necessarily the entire row.

What is conditional formatting and how have you used it?

It changes cell formatting based on rules. Highlight cells above a threshold. Color scales. Data bars. I use it to flag outliers, highlight values needing attention, and make tables scannable at a glance.

Name some Excel functions you use regularly.

SUMIF, COUNTIF, AVERAGEIF for conditional calculations. Text functions like LEFT, RIGHT, and CONCATENATE (or its newer replacements CONCAT and TEXTJOIN) for cleaning data. Date functions like EOMONTH for date math. IFERROR to handle errors cleanly. Name them and briefly explain when you use each.

How do you create a chart and what makes it effective?

Select data, insert chart. An effective chart has a clear title, labeled axes, a readable scale, and minimal decoration. Remove gridlines if they do not help. Use color intentionally. The insight should be obvious within seconds.

What is the difference between relative and absolute cell references?

Relative references change when you copy a formula. Absolute references, with dollar signs, stay fixed. A1 is relative. $A$1 is absolute. Not knowing this leads to broken formulas and wrong results.

How do you handle a large dataset that slows Excel down?

Remove unnecessary formatting. Use tables instead of raw ranges. Avoid volatile functions like INDIRECT. Consider Power Query to load and transform data without bringing it all into the sheet. If it is millions of rows, Excel is simply the wrong tool: a worksheet tops out at 1,048,576 rows. Use SQL or Python.

What is Power Query and why is it useful?

A data connection and transformation tool built into Excel and Power BI. It imports data from various sources, cleans it, reshapes it, and loads it. The steps are recorded and repeatable. Refresh the query and the data updates without redoing the work.

The Power BI Questions

Power BI is the most common visualization tool in the Microsoft ecosystem. Knowing it means you can build dashboards that stakeholders actually use rather than ignore.

What is Power BI and how does it differ from Excel?

Power BI is for building interactive dashboards and reports. Excel is a spreadsheet tool that also does analysis and charts. Power BI handles larger data volumes. It connects to more data sources natively. It is designed for sharing. Excel is designed for individual analysis. They complement each other.

What are the main components of Power BI?

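Power Query for data import and transformation. The data model, which Excel exposes as Power Pivot, for relationships and DAX. Report view in Power BI Desktop for visualization. Older lists call this Power View, but that was an Excel add-in Microsoft has since retired. Power BI Service for sharing online. Understanding the components helps you know which tool for which task.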

What is DAX? Give an example.

Data Analysis Expressions. A formula language used in Power BI, and also in Excel Power Pivot and Analysis Services tabular models. Used for calculated columns, measures, and tables. Example. Total Sales = SUM(Sales[Amount]), where Sales is the table and Amount is the column. Or something more complex like Year-over-Year Growth using CALCULATE with DATEADD. DAX is the hardest part of Power BI to learn well.

What is the difference between a calculated column and a measure?

A calculated column is computed row by row during data refresh and stored in the model. Its values do not change when a user applies filters or slicers. A measure is computed at query time and responds dynamically to filters and slicers. Use calculated columns for static categorizations. Use measures for aggregations that need to be interactive.
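If the distinction feels abstract, a plain Python analogy may help. This is not DAX and not how Power BI is implemented. It only mirrors the idea: one value is stored per row at load time, the other is recomputed over whatever rows the current filter leaves. The rows, the size_band column, and the total_sales function are all invented for illustration.

```python
# Hypothetical sales rows.
rows = [
    {"region": "North", "amount": 120},
    {"region": "North", "amount": 40},
    {"region": "South", "amount": 300},
]

# "Calculated column": computed once per row at load time and stored.
# It stays the same no matter what the user filters later.
for row in rows:
    row["size_band"] = "large" if row["amount"] >= 100 else "small"

# "Measure": a calculation evaluated on the rows in the current filter context.
def total_sales(filtered_rows):
    return sum(r["amount"] for r in filtered_rows)

all_total = total_sales(rows)
north_total = total_sales([r for r in rows if r["region"] == "North"])

print([r["size_band"] for r in rows])  # ['large', 'small', 'large']
print(all_total, north_total)          # 460 160
```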

What is a star schema and why is it important?

A data model with a central fact table surrounded by dimension tables. Fact table contains transactions or events. Dimension tables contain descriptive attributes like customer details or product info. It makes models faster, simpler, and easier to understand. It is the standard approach.
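The shape is easy to sketch outside Power BI. Below is a tiny star schema in SQLite with one fact table and two dimension tables. The table names, column names, and figures are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dimension tables: one row per product or customer, descriptive attributes only.
conn.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT)")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, segment TEXT)")
# Fact table: one row per sale, holding keys to the dimensions plus the number.
conn.execute("CREATE TABLE fact_sales (product_id INTEGER, customer_id INTEGER, amount INTEGER)")

conn.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "Shoes"), (2, "Hats")])
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "Retail"), (2, "Wholesale")])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, 250), (2, 1, 30), (1, 1, 20)],
)

# Slice the same facts by an attribute from either dimension.
by_category = conn.execute(
    "SELECT p.category, SUM(f.amount) FROM fact_sales f "
    "JOIN dim_product p ON p.product_id = f.product_id "
    "GROUP BY p.category ORDER BY p.category"
).fetchall()

by_segment = conn.execute(
    "SELECT c.segment, SUM(f.amount) FROM fact_sales f "
    "JOIN dim_customer c ON c.customer_id = f.customer_id "
    "GROUP BY c.segment ORDER BY c.segment"
).fetchall()

print(by_category)  # [('Hats', 30), ('Shoes', 370)]
print(by_segment)   # [('Retail', 150), ('Wholesale', 250)]
```

Each dimension joins to the fact table on a single key, which is the one-to-many relationship Power BI expects between a dimension and a fact table.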

How do you handle relationships between tables in Power BI?

Relationships are created in Model view. One-to-many is most common. The dimension table is on the one side, the fact table on the many side. Filter direction matters. Usually filters flow from dimension to fact. Understanding relationships is fundamental to accurate reports.

What visualizations do you use most often and why?

Bar and column charts for comparisons. Line charts for trends. Card visuals for key metrics. Tables and matrices for detail. Slicers for interactivity. The choice depends on the question the visual is answering. A good dashboard has a purposeful mix.

How do you improve the performance of a slow Power BI report?

Reduce data volume. Import only needed columns. Filter rows at the source. Optimize DAX. Avoid complex calculated columns when measures work. Use a star schema. Check the performance analyzer to identify slow visuals.

What is the difference between Power BI Desktop and Power BI Service?

Desktop is the free authoring tool installed locally. Service is the cloud platform for sharing, collaborating, and consuming reports. You build in Desktop. You publish to Service. Service enables scheduled data refreshes and dashboard creation.

How do you ensure data accuracy in a Power BI report?

Validate against source data. Check totals against known benchmarks. Test filters and slicers. Document assumptions. Have someone else review before it reaches stakeholders. There is no substitute for thorough, manual checking.

The Scenario and Behavioral Questions

These test judgment and communication. I have seen technically strong candidates lose offers here.

You are given a messy dataset and asked to find insights. Walk me through your approach.

First, understand the business context. What question are we trying to answer? Then explore the data. Columns, types, missing values. Clean as needed. Do exploratory analysis. Trends, distributions, relationships. Form hypotheses. Test them. Summarize findings clearly, tied back to the original business question.

How do you explain a technical analysis to a non-technical stakeholder?

Start with the conclusion. Not the methodology. "We found that customers who contact support in week one are three times more likely to cancel." Then explain the evidence simply. Use analogies. Use visuals. Avoid jargon. Check for understanding. Be patient.

Tell me about a time your analysis led to a real decision.

Have a specific example ready. What was the problem. What data did you analyze. What did you find. What decision was made. What was the result. Even if from a project rather than a job, the structure matters. Problem, analysis, recommendation, outcome.

How do you prioritize when you have multiple data requests?

Consider business impact and urgency. Talk to requesters to understand context. Communicate timelines clearly. Do not overpromise. If everything is urgent, escalate for prioritization. Transparency prevents disappointment later.

What do you do if the data you need does not exist or is unreliable?

First, verify it truly does not exist. Check all sources. If it does not, look for a proxy metric that correlates with what you need. Communicate the limitation clearly. If data exists but is unreliable, document the issues and proceed with caveats. Never present bad data as good.

A Quick Prep Checklist

One. Write SQL queries by hand. On paper or a plain text editor. No autocomplete. This forces you to actually know syntax.

Two. Open Excel. Build a PivotTable. Write a VLOOKUP. Create a chart. Do not watch. Do.

Three. Download Power BI Desktop. It is free. Build a dashboard with a public dataset. Publish it. A live dashboard in your portfolio is impressive.

Four. Practice explaining your analysis out loud. Record yourself. Are you clear? Are you using jargon? Would a non-technical person understand?

Five. Have stories ready for scenario questions. Real examples from projects. Specifics matter more than generalities.

The Honest Closing

Forty questions. You will not be asked all of them. But if you understand the concepts behind them, you can handle whatever comes up. The interviewer wants to see that you have done the work. That you have written queries that broke and you fixed them. That you have built dashboards someone actually used. That you can take a messy question and turn it into a clear answer.

If you are still building these skills, structured practice helps. SkillsYard's Data Analytics program covers SQL, Excel, Power BI, and the business context that ties them together. Live projects. Mentors who have worked as analysts. Placement support. A free demo class is available if you want to see the teaching style before committing anything.
