Brainalyst's SQL Interview Guide
Questions & Answers
ABOUT BRAINALYST
Brainalyst is a pioneering data-driven company dedicated to transforming data into actionable insights and
innovative solutions. Founded on the principles of leveraging cutting-edge technology and advanced analytics,
Brainalyst has become a beacon of excellence in the realms of data science, artificial intelligence, and machine
learning.
OUR MISSION
At Brainalyst, our mission is to empower businesses and individuals by providing comprehensive data solutions
that drive informed decision-making and foster innovation. We strive to bridge the gap between complex data and
meaningful insights, enabling our clients to navigate the digital landscape with confidence and clarity.
WHAT WE OFFER
• Data Strategy Development: Crafting customized data strategies aligned with your business
objectives.
• Advanced Analytics Solutions: Implementing predictive analytics, data mining, and statistical
analysis to uncover valuable insights.
• Business Intelligence: Developing intuitive dashboards and reports to visualize key metrics and
performance indicators.
• Machine Learning Models: Building and deploying ML models for classification, regression,
clustering, and more.
• Natural Language Processing: Implementing NLP techniques for text analysis, sentiment analysis,
and conversational AI.
• Computer Vision: Developing computer vision applications for image recognition, object detection,
and video analysis.
• Workshops and Seminars: Hands-on training sessions on the latest trends and technologies in
data science and AI.
• Customized Training Programs: Tailored training solutions to meet the specific needs of
organizations and individuals.
2021-2024
4. Generative AI Solutions
As a leader in the field of Generative AI, Brainalyst offers innovative solutions that create new content and
enhance creativity. Our services include:
• Content Generation: Developing AI models for generating text, images, and audio.
• Creative AI Tools: Building applications that support creative processes in writing, design, and
media production.
• Generative Design: Implementing AI-driven design tools for product development and
optimization.
OUR JOURNEY
Brainalyst’s journey began with a vision to revolutionize how data is utilized and understood. Founded by
Nitin Sharma, a visionary in the field of data science, Brainalyst has grown from a small startup into a renowned
company recognized for its expertise and innovation.
KEY MILESTONES:
• Inception: Brainalyst was founded with a mission to democratize access to advanced data analytics and AI
technologies.
• Expansion: Our team expanded to include experts in various domains of data science, leading to the
development of a diverse portfolio of services.
• Innovation: Brainalyst pioneered the integration of Generative AI into practical applications, setting new
standards in the industry.
• Recognition: We have been acknowledged for our contributions to the field, earning accolades and
partnerships with leading organizations.
Throughout our journey, we have remained committed to excellence, integrity, and customer satisfaction.
Our growth is a testament to the trust and support of our clients and the relentless dedication of our team.
Choosing Brainalyst means partnering with a company that is at the forefront of data-driven innovation. Our
strengths lie in:
• Expertise: A team of seasoned professionals with deep knowledge and experience in data science and AI.
• Customer Focus: A dedication to understanding and meeting the unique needs of each client.
• Results: Proven success in delivering impactful solutions that drive measurable outcomes.
JOIN US ON THIS JOURNEY TO HARNESS THE POWER OF DATA AND AI. WITH BRAINALYST, THE FUTURE IS
DATA-DRIVEN AND LIMITLESS.
BRAINALYST - SQL INTERVIEW QUESTIONS
SQL Roadmap
[Figure: SQL learning roadmap. Topics recoverable from the diagram: Introduction (What are Relational
Databases?, RDBMS Benefits and Limitations, SQL vs NoSQL Databases); Basic SQL Syntax (SQL Keywords,
Data Types, Operators); Data Definition Language (DDL): Create Table, Alter Table, Truncate Table;
Data Manipulation Language (DML): SELECT, FROM; Correlated Subqueries; Dynamic SQL.]
Tips for problem-solving:
1. Understand the Problem: Before attempting a solution, fully understand the problem and clarify what
output is expected.
2. Identify Required Tables: Determine which tables are needed for the problem. Assess if any joins are
necessary.
3. Evaluate Filter Conditions: Check if any filtering conditions are required to refine the data.
4. Consider Aggregation and Grouping: Determine if the problem requires any aggregation or grouping of
data.
5. Develop the Logic: After considering the above steps, think about the logic that will yield the desired
output.
6. Leverage Others’ Solutions for Learning: If you’re a beginner, don’t hesitate to look at others’ solutions
for understanding different approaches.
7. Avoid Stubbornness: Initially, don’t be stubborn about solving the problem alone. Feel free to check
others’ solutions to gain perspective.
8. Seek Help When Stuck: If you’re still unable to understand a solution, you can ask ChatGPT to explain it
step-by-step according to the flow of execution.
9. Practice with Real Data: For a better understanding of each step’s output, consider creating the table in
a database and examining the results of each step yourself.
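The steps above can be traced on a small example. The sketch below is only an illustration: it uses Python's built-in sqlite3 module, and the Customers and Orders tables, their columns, and the sample rows are assumptions made up for this walk-through. The question it answers is "total 2022 spend per customer, for customers who spent more than 100".

```python
import sqlite3

# Hypothetical tables and data, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, amount REAL);
INSERT INTO Customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
INSERT INTO Orders VALUES
  (1, 1, '2022-01-10', 80), (2, 1, '2022-03-05', 70),
  (3, 2, '2022-02-01', 40), (4, 2, '2021-12-30', 500),
  (5, 3, '2022-06-15', 120);
""")

query = """
SELECT c.customer_name, SUM(o.amount) AS total_spend   -- step 4: aggregation
FROM Customers c                                        -- step 2: required tables
JOIN Orders o ON o.customer_id = c.customer_id          -- step 2: join
WHERE o.order_date >= '2022-01-01'                      -- step 3: filter rows
  AND o.order_date <  '2023-01-01'
GROUP BY c.customer_name                                -- step 4: grouping
HAVING SUM(o.amount) > 100                              -- filter on the aggregate
ORDER BY total_spend DESC;
"""
result = conn.execute(query).fetchall()
print(result)
```

Running each clause on its own (first the join, then the join plus WHERE, and so on) is an easy way to follow tip 9 and inspect the intermediate output of every step.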
“Explain about your recent project.” Sounds familiar, right? It’s a common question in job interviews,
yet many of us struggle to answer it effectively. Why is that?
Takeaway:
Be prepared, clear, and impact-focused to master this question. Your project’s story reflects your
professional journey.
1. Understanding Various Types of Joins: Familiarize yourself with the outputs of different joins,
including non-equi joins.
2. Window Functions & Their Variations: Differentiate and grasp the nuances between various window
functions available in SQL.
3. Distinguishing ‘WHERE’ vs ‘HAVING’: Understand the disparity between these SQL clauses and their
appropriate usage.
4. Query Order of Execution: Know the sequence in which SQL queries are executed.
5. Creating and Utilizing CTEs (Common Table Expressions): Learn how and when to implement CTEs
effectively.
6. Aggregation Functions as Window Functions: Comprehend using aggregate functions within window
functions for more advanced operations.
7. Commonly Asked Queries: Be well-versed in solving frequently asked questions like finding the
2nd/3rd highest salary, cumulative/running total queries, employing lead & lag functions, and
utilizing self-joins and other join types.
8. Subqueries and their Application: Nested queries within a main SQL query, useful for complex
filtering or operations.
9. Indexing and its Importance in Query Performance: Enhances data retrieval speed by creating
efficient pathways for searching data.
10. Handling NULL Values in SQL: Managing and treating NULL values effectively within SQL queries.
11. Joins vs. Subqueries: Differentiating when to use joins or subqueries for better query efficiency.
Note:
The Essence of Window Functions
Window functions allow users to carry out calculations across a set of table rows that are related to the
current row. This is akin to a moving window that shifts through rows, and as it moves, calculations are
performed for each row relative to the position of the window.
3. Offset Functions:
• LEAD(): Fetches the value of a given expression for the next row in the result set.
• LAG(): Fetches the value of a given expression for the previous row in the result set.
• FIRST_VALUE(): Provides the value of the specified expression for the first row of the window
frame.
• LAST_VALUE(): Provides the value of the specified expression for the last row of the window
frame.
4. Statistical Functions: Examples include CUME_DIST() and PERCENT_RANK(), which are used to calculate
cumulative distributions and relative rank respectively.
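As a rough illustration of the offset functions listed above, the sketch below runs them through Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The daily_sales table and its rows are hypothetical. It also shows a common surprise: with only ORDER BY in the window, the default frame ends at the current row, so LAST_VALUE() returns the current row's value unless the frame is widened explicitly.

```python
import sqlite3

# Hypothetical table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_sales (day INTEGER, amount INTEGER);
INSERT INTO daily_sales VALUES (1, 10), (2, 30), (3, 20);
""")

rows = conn.execute("""
SELECT day, amount,
       LAG(amount)         OVER (ORDER BY day) AS prev_amount,
       LEAD(amount)        OVER (ORDER BY day) AS next_amount,
       FIRST_VALUE(amount) OVER (ORDER BY day) AS first_amount,
       -- Default frame stops at the current row, so this is just the current amount.
       LAST_VALUE(amount)  OVER (ORDER BY day) AS last_default,
       -- Widening the frame to the whole partition gives the true last value.
       LAST_VALUE(amount)  OVER (
           ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS last_amount
FROM daily_sales
ORDER BY day;
""").fetchall()

for row in rows:
    print(row)
```

LAG() on the first row and LEAD() on the last row return NULL (None in Python) because there is no neighbouring row to read from.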
Conclusion:
Window functions breathe life into SQL queries, making them dynamic, concise, and efficient. They allow
us to look at data in context, providing insights that would be difficult to extract otherwise. As the world
of data continues to grow and evolve, mastery of such tools is invaluable.
Note:
Window Functions: Learn how to use OVER() for advanced analytics tasks. They are crucial for calculating
running totals, rankings, and lead-lag analysis in datasets.
CTEs and Temp Tables: Common Table Expressions (CTEs) and temporary tables can simplify complex
queries, especially when dealing with large datasets.
Dynamic SQL: Understand how to construct SQL queries dynamically to increase the flexibility of your
database interactions.
Optimizing Queries for Performance: Explore how indexing, query restructuring, and understanding
execution plans can drastically improve your query performance.
Using PIVOT and UNPIVOT: These operations are key for converting rows to columns and vice versa,
making data more readable and analysis-friendly.
• HAVING: The HAVING clause filters groups from the result set based on the specified
conditions.
• SELECT: The columns specified in the SELECT clause are evaluated to produce the result set.
• ORDER BY: The rows in the result set are sorted based on the columns specified in the
ORDER BY clause.
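The clause order described above (row filters first, then grouping, then HAVING on the groups, then SELECT and ORDER BY) can be checked on a toy table. This is a minimal sketch using Python's built-in sqlite3 module; the emp table and its values are assumptions for illustration.

```python
import sqlite3

# Hypothetical table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (dept TEXT, salary INTEGER);
INSERT INTO emp VALUES ('A', 100), ('A', 300), ('B', 150), ('B', 50), ('C', 500);
""")

rows = conn.execute("""
SELECT dept, COUNT(*) AS n, SUM(salary) AS total
FROM emp
WHERE salary >= 100          -- row-level filter, applied before GROUP BY
GROUP BY dept
HAVING SUM(salary) > 200     -- group-level filter, applied after aggregation
ORDER BY total DESC;         -- sorting is applied last
""").fetchall()
print(rows)
```

The row ('B', 50) is dropped by WHERE before any group is formed, so department B is left with a single row totalling 150, and HAVING then removes the whole group.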
Q6. Explain all types of window functions? (Mainly rank, row_num, dense_rank, lead & lag)
Answer:
Window functions are used to perform calculations across a set of rows related to the current row.
Common window functions include:
• ROW_NUMBER(): Assigns a unique integer to each row within a partition of a result set.
• RANK(): Assigns a rank to each row within a partition; tied rows share the same rank, and the
ranks after a tie are skipped, leaving gaps in the ranking sequence.
• DENSE_RANK(): Like RANK(), but without gaps in the ranking sequence.
• LEAD(): Accesses data from subsequent rows within the same result set.
• LAG(): Accesses data from previous rows within the same result set.
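The difference between the three ranking functions is easiest to see on data with a tie. The sketch below is an illustration only, run through Python's built-in sqlite3 module (SQLite 3.25 or newer); the scores table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical table with a tie at the top, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (name TEXT, score INTEGER);
INSERT INTO scores VALUES ('a', 90), ('b', 90), ('c', 80), ('d', 70);
""")

rows = conn.execute("""
SELECT name, score,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,   -- always unique
       RANK()       OVER (ORDER BY score DESC)       AS rnk,       -- ties share, then a gap
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk  -- ties share, no gap
FROM scores
ORDER BY score DESC, name;
""").fetchall()

for row in rows:
    print(row)
```

Here 'a' and 'b' both get rank 1; RANK() then jumps to 3 for 'c', while DENSE_RANK() continues with 2.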
Q9. What are aggregate functions and when do we use them? Explain with a few examples.
Answer:
Aggregate functions are used to perform calculations on sets of values and return a single value.
Examples include SUM, AVG, COUNT, MAX, MIN. They are typically used with the GROUP BY clause
to calculate summary statistics for groups of rows.
Q11. What are window functions in SQL and how are they used?
Answer:
Window functions in SQL are special functions that operate on a set of rows called a “window” within
a query result. They allow you to perform calculations across a set of rows related to the current row.
Q13. What is the difference between a regular aggregate function and a window function?
Answer:
A regular aggregate function, such as SUM or AVG, computes a single result value from a set of input
values, typically grouped together. In contrast, a window function performs a calculation across a set
of rows related to the current row, without collapsing them into a single result.
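To make the contrast concrete, the following sketch computes the same average twice, once with GROUP BY and once as a window function. It is an illustration only, using Python's built-in sqlite3 module (SQLite 3.25 or newer) and a hypothetical emp table.

```python
import sqlite3

# Hypothetical table, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER);
INSERT INTO emp VALUES ('Ann', 'A', 100), ('Bob', 'A', 300), ('Cy', 'B', 500);
""")

# Regular aggregate: one output row per group, detail rows are collapsed.
grouped = conn.execute("""
SELECT dept, AVG(salary) AS dept_avg
FROM emp
GROUP BY dept
ORDER BY dept;
""").fetchall()

# Window aggregate: every input row is kept and carries its group's average.
windowed = conn.execute("""
SELECT name, dept, salary,
       AVG(salary) OVER (PARTITION BY dept) AS dept_avg
FROM emp
ORDER BY name;
""").fetchall()

print(grouped)
print(windowed)
```

The grouped query returns two rows (one per department), while the windowed query returns all three employees, each alongside the department average.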
Q14. Can you give an example of a problem that can be solved using a window function?
Answer:
Calculating a moving average over time-series data, where each value in the series is replaced by the
average of itself and its neighbouring values.
Q18. How do you specify the order of the rows in a window function?
Answer:
You can specify the order of rows in a window function using the ORDER BY clause within the OVER()
clause. This determines the sequence in which the rows are processed by the window function.
Q19. Can you give an example of a query that uses the ROW_NUMBER() window function?
Answer:
SELECT column1, column2, ..., ROW_NUMBER() OVER (ORDER BY column1) AS row_num
FROM table_name;
Q20. Can you give an example of a query that uses the RANK() window function?
Answer:
SELECT column1, column2, ..., RANK() OVER (PARTITION BY column3 ORDER BY column2 DESC) AS rank_num
FROM table_name;
Q31. Display the top 5 customers with the highest total revenue:
Answer:
SELECT customer_id, SUM(total_revenue) AS total_revenue
FROM Orders
GROUP BY customer_id
ORDER BY total_revenue DESC
LIMIT 5;
Q32. List the number of orders placed by each customer in the year 2022:
Answer:
SELECT customer_id, COUNT(order_id) AS num_orders
FROM Orders
WHERE YEAR(order_date) = 2022
GROUP BY customer_id;
Q33. Find the average salary of employees for each department in an “Employees” table:
Answer:
SELECT department_id, AVG(salary) AS avg_salary
FROM Employees
GROUP BY department_id;
Q34. Find the names of all customers who have placed at least one order for more than $1000:
Answer:
SELECT DISTINCT customer_name
FROM Customers
WHERE customer_id IN (SELECT customer_id FROM Orders WHERE total_amount > 1000);
Q35. List the top 3 most popular products (by the number of orders) in the year 2022:
Answer:
SELECT product_id, COUNT(order_id) AS num_orders
FROM Order_Details
WHERE YEAR(order_date) = 2022
GROUP BY product_id
ORDER BY num_orders DESC
LIMIT 3;
Q36. Find the names of all employees who have worked for more than 5 years:
Answer:
SELECT employee_name
FROM Employees
WHERE hire_date < DATE_SUB(CURDATE(), INTERVAL 5 YEAR);
Q37. List all orders that have at least one product with a price greater than $100:
Answer:
SELECT *
FROM Orders
WHERE order_id IN (SELECT order_id FROM Order_Details WHERE unit_price > 100);
Q38. Find the number of customers who have not placed any orders in the year 2022:
Answer:
SELECT COUNT(customer_id) AS num_customers
FROM Customers
WHERE customer_id NOT IN (SELECT customer_id FROM Orders
                          WHERE YEAR(order_date) = 2022 AND customer_id IS NOT NULL);
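A caveat worth knowing for this kind of question: NOT IN returns no rows at all if the subquery produces even one NULL, because a comparison with NULL is never true. The sketch below illustrates this with Python's built-in sqlite3 module; the sample rows, including the order with a NULL customer_id, are hypothetical, and the year filter is written as a date range because SQLite has no YEAR() function.

```python
import sqlite3

# Hypothetical data: order 11 has no customer_id recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY);
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
INSERT INTO Customers VALUES (1), (2), (3);
INSERT INTO Orders VALUES
  (10, 1,    '2022-05-01'),
  (11, NULL, '2022-06-01'),
  (12, 2,    '2021-01-01');
""")

# NOT IN: the NULL in the subquery result makes every comparison unknown.
not_in_count = conn.execute("""
SELECT COUNT(*) FROM Customers
WHERE customer_id NOT IN (
    SELECT customer_id FROM Orders
    WHERE order_date >= '2022-01-01' AND order_date < '2023-01-01');
""").fetchone()[0]

# NOT EXISTS: unaffected by NULLs, counts customers 2 and 3.
not_exists_count = conn.execute("""
SELECT COUNT(*) FROM Customers c
WHERE NOT EXISTS (
    SELECT 1 FROM Orders o
    WHERE o.customer_id = c.customer_id
      AND o.order_date >= '2022-01-01' AND o.order_date < '2023-01-01');
""").fetchone()[0]

print(not_in_count, not_exists_count)
```

Filtering NULLs out of the subquery or switching to NOT EXISTS both avoid the problem; which one reads better is a matter of taste.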
Q40. How do you maintain data quality and data quantity in an analysis? Here are a few pointers you
can follow:
Answer:
• Define your research question: Before you begin collecting data, make sure you have a clear
research question in mind. This will help you determine the type and amount of data you need
to collect. It will also help you identify potential sources of bias that could affect the
quality of your data.
• Use appropriate data collection methods: Choose data collection methods that suit your research
question. For instance, surveys, interviews, and observational studies can be used to gather
qualitative data, while experiments and randomized controlled trials can be used to gather
quantitative data.
• Consider the sample size: When collecting data, it is important to consider the sample size.
A large sample size can improve the statistical power of your analysis and increase the
precision of your estimates, but it can also raise the risk of collecting low-quality data.
A small sample size, on the other hand, may be less representative of the population you are
studying; however, it can make it easier to collect high-quality data.
• Ensure data quality: To ensure the quality of your data, consider the following factors:
  • Validity: The data you collect should measure what it is meant to measure.
  • Reliability: The data should be consistent over time and across different observers.
  • Objectivity: The data should be collected without bias or prejudice.
  • Accuracy: The data should be free from errors and mistakes.
• Use appropriate data analysis techniques: Once you have collected your data, use suitable
analysis techniques to answer your research question. This could include descriptive
statistics, inferential statistics, or machine learning algorithms.
Q41. Writing SQL queries is simple, but writing them efficiently is what sets you apart in a data
analyst interview.
Answer:
Here are some ways to write an efficient SQL query:
• Select only the columns you need; avoid SELECT * when a few columns are enough.
• Use a WHERE clause to filter data as early as possible.
• Use LIMIT or TOP to retrieve only a specific number of rows.
• Index the columns used in joins and filters.
• Avoid wrapping indexed columns in functions in WHERE clauses, as this can prevent the index
from being used.
• Avoid wildcard characters at the beginning of LIKE patterns, as a leading wildcard usually
prevents an index lookup.
• Use UNION ALL instead of UNION when duplicates are acceptable, since UNION ALL skips the
duplicate-removal step; use UNION only when you need duplicates removed.
• Use INNER JOIN instead of OUTER JOIN if you only need to retrieve matching records.
• Avoid unnecessary subqueries, and correlated subqueries in particular, as they can be slow;
where a subquery is needed, a derived table in the FROM clause is often cheaper than a subquery
evaluated per row in the WHERE clause.
• Consider EXISTS instead of IN or NOT IN; it is often faster and, unlike NOT IN, it behaves
predictably when the subquery returns NULLs.
• Let the database do the calculations: use aggregate functions such as SUM, AVG, and COUNT with
GROUP BY instead of pulling raw rows into the application.
• Use CASE expressions to perform conditional logic inside a single query.
• Use proper data types to optimize storage and comparisons.
• Use EXPLAIN to analyse query performance and identify bottlenecks.
• Use stored procedures to reduce network traffic and the number of round trips to the database.
• Use views to simplify complex queries and reduce the amount of code needed.
• Use temp tables to break up complex queries; this can also improve performance when an
intermediate result is reused.
• Use table aliases to simplify query code and improve readability.
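Two of the tips above (checking the plan with EXPLAIN, and keeping functions off indexed columns) can be seen together in a small experiment. This is a sketch under assumptions: it uses SQLite through Python's built-in sqlite3 module, where the command is EXPLAIN QUERY PLAN, and a hypothetical Orders table with an index on order_date. Other databases print different plans, and the exact wording of SQLite's plan text varies between versions.

```python
import sqlite3

# Hypothetical table and index, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                     order_date TEXT, amount REAL);
CREATE INDEX idx_orders_date ON Orders (order_date);
""")

def plan(sql):
    """Return SQLite's query plan as one string (the detail text is the last column)."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Function applied to the indexed column: the index cannot be used for a lookup.
plan_function = plan(
    "SELECT * FROM Orders WHERE strftime('%Y', order_date) = '2022'")

# The same filter written as a range on the bare column: the index can be used.
plan_range = plan(
    "SELECT * FROM Orders "
    "WHERE order_date >= '2022-01-01' AND order_date < '2023-01-01'")

print(plan_function)
print(plan_range)
```

In SQLite, a plan line starting with SCAN means the whole table is read, while SEARCH ... USING INDEX means the index narrowed the rows first.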
Q50. Write a SQL query to remove duplicate job listings within the same company.
Answer:
table name - job_listings
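One possible approach, sketched under assumptions: the source names only the table, so the columns job_id, company_id, title, and description are hypothetical, and two listings are treated as duplicates when they share the same company, title, and description. The sketch numbers the rows inside each duplicate group with ROW_NUMBER() and deletes everything except the first. It runs through Python's built-in sqlite3 module (SQLite 3.25 or newer).

```python
import sqlite3

# Hypothetical schema and data; listings 1 and 2 are duplicates within company 10.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_listings (job_id INTEGER PRIMARY KEY, company_id INTEGER,
                           title TEXT, description TEXT);
INSERT INTO job_listings VALUES
  (1, 10, 'Data Analyst',  'SQL and dashboards'),
  (2, 10, 'Data Analyst',  'SQL and dashboards'),
  (3, 10, 'Data Engineer', 'Pipelines'),
  (4, 20, 'Data Analyst',  'SQL and dashboards');
""")

conn.execute("""
DELETE FROM job_listings
WHERE job_id IN (
    SELECT job_id
    FROM (
        SELECT job_id,
               ROW_NUMBER() OVER (PARTITION BY company_id, title, description
                                  ORDER BY job_id) AS rn
        FROM job_listings
    ) AS ranked
    WHERE rn > 1          -- keep the lowest job_id in each duplicate group
);
""")

remaining = [row[0] for row in conn.execute(
    "SELECT job_id FROM job_listings ORDER BY job_id")]
print(remaining)
```

Listing 4 survives because it belongs to a different company, even though its title and description match listings 1 and 2.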
Q51. Write a SQL query to find the second highest salary from the ‘emp’ table. (Columns: id, salary)
Answer:
SELECT MAX(salary) AS second_highest_salary
FROM emp
WHERE salary < (SELECT MAX(salary) FROM emp);
Q52. Write a SQL query to find numbers that consecutively occur 3 times in the ‘id’ column of a table.
(Columns: id, numbers)
Answer:
SELECT DISTINCT a.numbers
FROM table_name a, table_name b, table_name c
WHERE a.numbers = b.numbers AND b.numbers = c.numbers
AND a.id = b.id - 1 AND b.id = c.id - 1;
Q53. Write a SQL query to find the days when the temperature was higher than on the previous day.
(Columns: Days, Temp)
Answer:
SELECT Days, Temp
FROM table_name t1
WHERE Temp > (SELECT Temp FROM table_name t2 WHERE t2.Days = t1.Days - 1);
Q55. Write a SQL query for the cumulative sum of salary for each employee from January to July.
(Columns: Emp_id, Month, Salary)
Answer:
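-- Assumes Month is stored as a number (1 = January ... 12 = December);
-- month names would be compared and sorted alphabetically, not chronologically.
SELECT Emp_id, Month,
       SUM(Salary) OVER (PARTITION BY Emp_id ORDER BY Month) AS Cumulative_Salary
FROM salary_table
WHERE Month BETWEEN 1 AND 7;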
Q56. Write a SQL query to display year-on-year growth for each product. (Columns: transaction_id,
Product_id, transaction_date, spend, Output: year, product_id, yoy_growth)
Answer:
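SELECT year, Product_id,
       (spend - prev_spend) / NULLIF(prev_spend, 0) * 100 AS yoy_growth  -- NULL for the first year
FROM (SELECT EXTRACT(YEAR FROM transaction_date) AS year, Product_id,
             SUM(spend) AS spend,
             LAG(SUM(spend)) OVER (PARTITION BY Product_id
                                   ORDER BY EXTRACT(YEAR FROM transaction_date)) AS prev_spend
      FROM transactions
      GROUP BY EXTRACT(YEAR FROM transaction_date), Product_id) AS yearly;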
Q57. Write a SQL query to find the rolling average of posts on a daily basis for each user_id. (Columns:
user_id, date, post_count)
Answer:
SELECT user_id,
       date,
       ROUND(AVG(post_count) OVER (PARTITION BY user_id ORDER BY date
                                   ROWS BETWEEN 2 PRECEDING AND CURRENT ROW), 2) AS rolling_average
FROM posts_table;
Q58. Write a SQL query to get emp id and department for each department, considering employees who
recently joined the organization and are currently working. (Columns: emp id, first name, last name, date
of join, date of exit, department)
Answer:
SELECT e.emp_id, e.first_name, e.last_name, e.date_of_join, e.date_of_exit, d.department
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE e.date_of_join > DATE_SUB(CURRENT_DATE, INTERVAL 1 YEAR)
AND e.date_of_exit IS NULL;
Q59. Write a query to get mean, median, and mode for earning? (Columns: Emp_id, salary)
Answer:
SELECT
AVG(salary) AS mean_earning,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_earning,
MODE() WITHIN GROUP (ORDER BY salary) AS mode_earning
FROM earnings_table;
Q60.
1. Second Highest Salary
SQL -
python-
prev_same = table_name['numbers'].eq(table_name['numbers'].shift())
next_same = table_name['numbers'].eq(table_name['numbers'].shift(-1))
consecutives = table_name.loc[prev_same & next_same, 'numbers'].unique()
SQL -
-- Assumes one row per product per year; the first row of each product yields NULL.
SELECT EXTRACT(YEAR FROM transaction_date) AS year,
       Product_id,
       (spend - LAG(spend) OVER (PARTITION BY Product_id ORDER BY transaction_date)) /
       NULLIF(LAG(spend) OVER (PARTITION BY Product_id ORDER BY transaction_date), 0) * 100 AS yoy_growth
FROM transactions;
python-
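# Sort first so that shift(1) really picks up the previous transaction of each product.
transactions = transactions.sort_values(['Product_id', 'transaction_date'])
transactions['year'] = transactions['transaction_date'].dt.year
transactions['lag_spend'] = transactions.groupby('Product_id')['spend'].shift(1)
transactions['yoy_growth'] = (
    (transactions['spend'] - transactions['lag_spend']) / transactions['lag_spend'] * 100
)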
-----------------------------------------------------------------------------------------------------------------------------------
Practical Questions:
1. Combine Two Tables
Write a SQL query for a report that provides the following information for each person in the Person
table, regardless of if there is an address for each of those people: FirstName, LastName, City, State
For example, given the above Employee table, the query should return 200 as the second highest
salary. If there is no second highest salary, then the query should return null.
-----------------------------------------------------------------------------------------------------------------------------------------
For example, given the above Employee table, the nth highest salary where n = 2 is 200. If there is no nth
highest salary, then the query should return null.
4. Write a SQL query to rank scores. If there is a tie between two scores, both should have the same
ranking. Note that after a tie, the next ranking number should be the next consecutive integer value.
In other words, there should be no “holes” between ranks.
For example, given the above Scores table, your query should generate the following report (order
by highest score):
5. Write a SQL query to find all numbers that appear at least three times consecutively.
SELECT DISTINCT Num
FROM (
SELECT Num,
ROW_NUMBER() OVER (ORDER BY Id) AS rn,
ROW_NUMBER() OVER (PARTITION BY Num ORDER BY Id) AS rn_id
FROM your_table_name
) AS subquery
GROUP BY Num, (rn - rn_id)
HAVING COUNT(*) >= 3;
6. The Employee table holds all employees including their managers. Every employee has an Id, and there
is also a column for the manager Id.
Given the Employee table, write a SQL query that finds out employees who earn more than their
managers. For the above table, Joe is the only employee who earns more than his manager.
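A self-join is the usual way to compare each employee with their manager. The sketch below is an illustration only, run through Python's built-in sqlite3 module: the column names Name, Salary, and ManagerId and the sample rows are assumptions (the source only says that each employee has an Id and a manager Id), chosen so that Joe is the only employee who earns more than his manager.

```python
import sqlite3

# Hypothetical columns and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                       Salary INTEGER, ManagerId INTEGER);
INSERT INTO Employee VALUES
  (1, 'Joe',   70000, 3),
  (2, 'Henry', 80000, 4),
  (3, 'Sam',   60000, NULL),
  (4, 'Max',   90000, NULL);
""")

rows = conn.execute("""
SELECT e.Name AS Employee
FROM Employee e
JOIN Employee m ON e.ManagerId = m.Id    -- m is the same table, read as "manager"
WHERE e.Salary > m.Salary;
""").fetchall()
print(rows)
```

Employees without a manager (ManagerId is NULL) drop out of the inner join, which is the desired behaviour here.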
-----------------------------------------------------------------------------------------------------------------------------------
8. Suppose that a website contains two tables, the Customers table and the Orders table. Write a SQL
query to find all customers who never order anything.
9. Write a SQL query to delete all duplicate email entries in a table named Person, keeping only unique
emails based on its smallest Id.
-----------------------------------------------------------------------------------------------------------------------------------
11. Game Play Analysis I
-----------------------------------------------------------------------------------------------------------------------------------
SELECT player_id,
event_date,
SUM(games_played) OVER (PARTITION BY player_id ORDER BY event_date) AS games_played_so_far
FROM Activity
ORDER BY player_id, event_date;
-----------------------------------------------------------------------------------------------------------------------------------------------
22. Salesperson
select actor_id, director_id from ActorDirector group by actor_id, director_id having count(*) >= 3;
-----------------------------------------------------------------------------------------------------------------------------------------------
# select seller_id
# from Sales
# group by seller_id
# having sum(price) >= all(
# select sum(price)
# from Sales
# group by seller_id
# );
-----------------------------------------------------------------------------------------------------------------------------------------------
# sum(case when month = 'Nov' then revenue else null end) as Nov_Revenue,
# sum(case when month = 'Dec' then revenue else null end) as Dec_Revenue
# from Department
# group by id;
select id,
       sum(if(month = 'Jan', revenue, null)) as Jan_Revenue,
       sum(if(month = 'Feb', revenue, null)) as Feb_Revenue,
       sum(if(month = 'Mar', revenue, null)) as Mar_Revenue,
       sum(if(month = 'Apr', revenue, null)) as Apr_Revenue,
       sum(if(month = 'May', revenue, null)) as May_Revenue,
       sum(if(month = 'Jun', revenue, null)) as Jun_Revenue,
       sum(if(month = 'Jul', revenue, null)) as Jul_Revenue,
       sum(if(month = 'Aug', revenue, null)) as Aug_Revenue,
       sum(if(month = 'Sep', revenue, null)) as Sep_Revenue,
       sum(if(month = 'Oct', revenue, null)) as Oct_Revenue,
       sum(if(month = 'Nov', revenue, null)) as Nov_Revenue,
       sum(if(month = 'Dec', revenue, null)) as Dec_Revenue
from Department
group by id;
-----------------------------------------------------------------------------------------------------------------------------------------------
# from Submissions
# ) as ds
# where parent_id is not null
# group by parent_id
# ) as s2
# on s1.post_id = s2.parent_id
# order by post_id;
-----------------------------------------------------------------------------------------------------------------------------------------------
Part 2:
Part 3:
Case Study-Based Questions:
Q: 1
Schema
Imagine a table named “Movies” with columns: MovieID, Title, ReleaseDate, GenreID. There’s
another table “Genres” with columns: GenreID, GenreName.
Write a SQL query to fetch the genres that don’t have any movies associated with them.
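One way to approach this is an anti-join: LEFT JOIN Genres to Movies and keep only the genres for which no movie row was found. The sketch below is an illustration, not a reference answer; it uses the schema from the question with hypothetical sample rows, run through Python's built-in sqlite3 module.

```python
import sqlite3

# Schema from the question; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Genres (GenreID INTEGER PRIMARY KEY, GenreName TEXT);
CREATE TABLE Movies (MovieID INTEGER PRIMARY KEY, Title TEXT,
                     ReleaseDate TEXT, GenreID INTEGER);
INSERT INTO Genres VALUES (1, 'Drama'), (2, 'Comedy'), (3, 'Western');
INSERT INTO Movies VALUES
  (1, 'Movie A', '2020-01-01', 1),
  (2, 'Movie B', '2021-05-05', 1),
  (3, 'Movie C', '2022-03-03', 2);
""")

rows = conn.execute("""
SELECT g.GenreName
FROM Genres g
LEFT JOIN Movies m ON m.GenreID = g.GenreID
WHERE m.MovieID IS NULL;     -- no matching movie was found for this genre
""").fetchall()
print(rows)
```

A NOT EXISTS subquery would give the same result; the LEFT JOIN form is shown because it makes the "unmatched rows" idea explicit.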
Q:2
Schema
You are given a table named “Attendance” with columns: StudentID, ClassDate, IsPresent (a boolean
where 1 indicates presence and 0 indicates absence).
Write a SQL query to identify students who have missed more than 3 consecutive classes.
Q: 3
Schema
Consider a table named “Elections” with columns: CandidateID, VoterID, VoteDate.
Write a SQL query to calculate the candidate who received the highest number of votes each
month.
Q: 4
Schema
You have a table named “ProductSales” with columns: ProductID, SaleDate, UnitsSold.
Write a SQL query to find the top 3 products that have shown the most significant sales growth
month-over-month.
Q: 5
Schema
You are provided with a table named “LibraryBooks” with columns: BookID, BorrowerID,
BorrowDate, ReturnDate.
Write a SQL query to find out which books are currently borrowed and have passed their return
date without being returned.
Q: 6
Schema
Consider a table named “OnlineCourses” with columns: CourseID, EnrollmentDate, StudentID,
CompletionDate.
Write a SQL query to determine the courses which have the highest drop rate (i.e., students enrolling
but not completing).
Q: 7
Schema
You have a table named “EmployeeFeedback” with columns: EmployeeID, FeedbackDate, Rating
(from 1 to 10).
Write a SQL query to identify employees whose rating has declined over their past 3 consecutive
feedback entries.
Q: 8
Schema
There are two tables: “BlogPosts” and “Comments”.
The “BlogPosts” table has columns: PostID, Title, PostDate, AuthorID.
The “Comments” table has columns: CommentID, PostID, CommentDate, Text.
Write a SQL query to fetch the blog posts that have not received any comments within a week of their
posting.
Q: 9
Schema
You are given a table named “Subscription” with columns: UserID, SubscriptionDate, ExpiryDate.
Write a SQL query to count the number of active subscriptions on the first day of each month in the
past year.
Q: 10
Consider a table named “TouristSpots” with columns: SpotID, SpotName, VisitorID, VisitDate. Write a
SQL query to find the least visited tourist spots in the last summer.
Q: 11
Schema:
There are two tables: “Books” and “Authors”.
The “Books” table has columns: BookID, BookName, AuthorID, SoldCopies.
The “Authors” table has columns: AuthorID, AuthorName.
Write a SQL query to find authors whose books, on average, have sold more than 10,000 copies, but
who have written fewer than 3 books.
Q: 12
Schema
You have a table named “FlightBookings” with columns: BookingID, FlightDate, PassengerID,
Destination.
Write a SQL query to determine which destination has seen a steady month-on-month increase in
bookings over the last year.
Q13
Schema
Table: PatientRecords
Columns: PatientID, VisitDate, DiagnosisCode
Write an SQL query to identify the most frequent diagnosis code for each month. (Note: Patients may
have multiple visits and diagnoses in a month.)
Q:14
Schema
Table: ClinicStaff
Columns: StaffID, Name, HireDate
Craft an SQL query to find the names of staff members who are celebrating their 5th, 10th, and 15th
work anniversaries in 2023.
Q:15
Schema
Table X:
Columns: ids with values 10, 20, 30, 40, 50, null, 60, 70, 80, null, 90, 100
Table Y:
Columns: ids with values 20, null, 40, 60, null, 80, 100, null
Task: Use SQL to demonstrate the results of an inner join, left join, right join, and full outer join
between these tables.
Q:16
Schema
Table X:
Columns: ids with values 1, 1, 1, 1
Table Y:
Columns: ids with values 1, 1, 1, 1, 1, 1, 1, 1
Q 17: Task: Determine the count of rows in the output of the following queries:
Select * From X join Y on X.ids != Y.ids
Select * From X left join Y on X.ids != Y.ids
Select * From X right join Y on X.ids != Y.ids
Select * From X full outer join Y on X.ids != Y.ids
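The row counts can be reasoned out and then checked. Every value in both tables is 1, so the condition X.ids != Y.ids is never true and no pair of rows matches: the inner join is empty, and each outer join returns only the unmatched rows of the preserved side, padded with NULLs. The sketch below verifies this with Python's built-in sqlite3 module. As an assumption for portability, the RIGHT JOIN is written as a LEFT JOIN with the tables swapped, and the FULL OUTER JOIN count is derived from the other three, because SQLite versions before 3.39 do not support those two join types.

```python
import sqlite3

# Tables X (four 1s) and Y (eight 1s), as given in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE X (ids INTEGER);
CREATE TABLE Y (ids INTEGER);
INSERT INTO X VALUES (1), (1), (1), (1);
INSERT INTO Y VALUES (1), (1), (1), (1), (1), (1), (1), (1);
""")

def count_rows(select_sql):
    """Count the rows that a SELECT statement returns."""
    return conn.execute(f"SELECT COUNT(*) FROM ({select_sql})").fetchone()[0]

cols = "X.ids AS x_ids, Y.ids AS y_ids"
inner_rows = count_rows(f"SELECT {cols} FROM X JOIN Y ON X.ids != Y.ids")
left_rows = count_rows(f"SELECT {cols} FROM X LEFT JOIN Y ON X.ids != Y.ids")
# X RIGHT JOIN Y keeps every row of Y, which is the same as Y LEFT JOIN X.
right_rows = count_rows(f"SELECT {cols} FROM Y LEFT JOIN X ON X.ids != Y.ids")
# FULL OUTER JOIN = matched rows + unmatched X rows + unmatched Y rows.
full_rows = left_rows + right_rows - inner_rows

print(inner_rows, left_rows, right_rows, full_rows)
```

With an equality condition instead (X.ids = Y.ids), every row of X would match every row of Y, and all four join types would return 4 × 8 = 32 rows.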
Q18. Employee Project Allocation - Consider two tables, Employees and Projects:
Schema
CREATE TABLE Employees ( employee_id INT PRIMARY KEY, employee_name VARCHAR(255),
department VARCHAR(255) );
CREATE TABLE Projects ( project_id INT PRIMARY KEY, lead_employee_id INT, project_name
VARCHAR(255), start_date DATE, end_date DATE );
Assume necessary INSERT statements are already executed.
Task: The goal is to write an SQL query to find the names of employees who have led more than 3
projects in the last year. The result should be ordered by the number of projects led.
Q19.
Tables:
Orders (Order_id, Customer_id, Order_Date, Total_Amount)
Order_Details (Order_Detail_id, Order_id, Product_id, Quantity, Unit_Price)
Products (Product_id, Product_Name, Category)
Task: Write an SQL query to find the top-selling product (highest revenue) in each category for the
last quarter.
Q20.
Table:
Sales (Sale_id, Sale_Date, Amount)
Question: Write an SQL query to calculate last 7-day rolling average of sales amounts.
Q21. Tables:
Customers (Customer_id, Name, Join_Date)
Orders (Order_id, Customer_id, Order_Date, Amount)
Question: Write an SQL query to list customers who have not placed an order in the last 6 months but
have placed more than 5 orders in total.
Q22. Tables:
Sales (Sale_id, Salesperson_id, Sale_Date, Amount)
Salesperson (Salesperson_id, Name, Region)
Question: Write an SQL query using window functions to rank salespersons in each region by their
total sales amount.
Q23. Customer Purchase Patterns - You have two tables, Customers and Purchases:
Schema
CREATE TABLE Customers ( customer_id INT PRIMARY KEY, customer_name VARCHAR(255) );
CREATE TABLE Purchases ( purchase_id INT PRIMARY KEY, customer_id INT, product_id INT,
purchase_date DATE );
Assume necessary INSERT statements are already executed.
Task: Write an SQL query to find the names of customers who have purchased more than 5 different
products within the last month. Order the result by customer_name.
Q24. Call Log Analysis - Suppose you have a CallLogs table:
Schema
CREATE TABLE CallLogs ( log_id INT PRIMARY KEY, caller_id INT, receiver_id INT, call_start_time
TIMESTAMP, call_end_time TIMESTAMP );
Assume necessary INSERT statements are already executed.
Task: Write a query to find the average call duration per user. Include only users who have made more
than 10 calls in total. Order the result by average duration descending.
Table:
CREATE TABLE Person (
id INT PRIMARY KEY,
name VARCHAR(255),
phone_number VARCHAR(255)
);
--
CREATE TABLE Country (
name VARCHAR(255),
country_code VARCHAR(3) PRIMARY KEY
);
--
CREATE TABLE Calls (
caller_id INT,
callee_id INT,
duration INT
);
--
INSERT INTO Person (id, name, phone_number) VALUES
(3, 'Jonathan', '051-1234567'),
(12, 'Elvis', '051-7654321'),
(1, 'Monce', '972-5432110'),
(2, 'Maroua', '212-4321098'),
(7, 'Meir', '972-2211111'),
(9, 'Rachel', '972-2221111');
--
INSERT INTO Country (name, country_code) VALUES
('Peru', '051'),
('Israel', '972'),
('Morocco', '212'),
('Germany', '049'),
('Ethiopia', '251');
--
INSERT INTO Calls (caller_id, callee_id, duration) VALUES
(1, 9, 33),
(2, 9, 4),
(1, 2, 59),
(3, 12, 102),
(3, 12, 330),
(12, 3, 5),
(7, 9, 13),
(7, 1, 3),
(9, 7, 1),
(1, 7, 7);
Q28. A telecommunications company wants to invest in new countries. The company intends to
invest in countries where the average call duration of the calls in that country is strictly greater than
the global average call duration. Write an SQL query to find the countries where this company can
invest.
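One way to approach Q28 with the Person, Country and Calls data above, sketched in SQLite. It assumes the first three characters of phone_number are the country code, that each call counts toward the country of both the caller and the callee, and that the global average is taken over the Calls table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INT PRIMARY KEY, name VARCHAR(255), phone_number VARCHAR(255));
CREATE TABLE Country (name VARCHAR(255), country_code VARCHAR(3) PRIMARY KEY);
CREATE TABLE Calls (caller_id INT, callee_id INT, duration INT);
INSERT INTO Person VALUES
 (3,'Jonathan','051-1234567'),(12,'Elvis','051-7654321'),(1,'Monce','972-5432110'),
 (2,'Maroua','212-4321098'),(7,'Meir','972-2211111'),(9,'Rachel','972-2221111');
INSERT INTO Country VALUES
 ('Peru','051'),('Israel','972'),('Morocco','212'),('Germany','049'),('Ethiopia','251');
INSERT INTO Calls VALUES
 (1,9,33),(2,9,4),(1,2,59),(3,12,102),(3,12,330),
 (12,3,5),(7,9,13),(7,1,3),(9,7,1),(1,7,7);
""")

QUERY = """
SELECT co.name AS country
FROM Person p
JOIN Country co ON co.country_code = substr(p.phone_number, 1, 3)
JOIN Calls c ON p.id IN (c.caller_id, c.callee_id)   -- caller or callee
GROUP BY co.name
HAVING AVG(c.duration) > (SELECT AVG(duration) FROM Calls)
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

With this data the global average is 55.7, while the per-country averages are roughly 145.67 for Peru, 31.5 for Morocco and 14.75 for Israel, so only Peru qualifies.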
Q31.
Tables:
CREATE TABLE Customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(255)
);
--
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
customer_id INT,
product_name VARCHAR(255)
);
--
INSERT INTO Customers (customer_id, customer_name) VALUES
(1, 'Daniel'),
(2, 'Diana'),
(3, 'Elizabeth'),
(4, 'Jhon');
--
INSERT INTO Orders (order_id, customer_id, product_name) VALUES
(10, 1, 'A'),
(20, 1, 'B'),
(30, 1, 'D'),
(40, 1, 'C'),
(50, 2, 'A'),
(60, 3, 'A'),
(70, 3, 'B'),
(80, 3, 'D'),
(90, 4, 'C');
Q32. Write an SQL query to report the customer_id and customer_name of customers who bought
products “A”, “B” but did not buy the product “C”. Return the result table ordered by customer_id.
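A sketch of one possible answer to Q32, run against the Customers and Orders data above in SQLite. It uses conditional aggregation: a customer qualifies when the counts of 'A' and 'B' orders are both positive and the count of 'C' orders is zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INT PRIMARY KEY, customer_name VARCHAR(255));
CREATE TABLE Orders (order_id INT PRIMARY KEY, customer_id INT, product_name VARCHAR(255));
INSERT INTO Customers VALUES (1,'Daniel'),(2,'Diana'),(3,'Elizabeth'),(4,'Jhon');
INSERT INTO Orders VALUES
 (10,1,'A'),(20,1,'B'),(30,1,'D'),(40,1,'C'),(50,2,'A'),
 (60,3,'A'),(70,3,'B'),(80,3,'D'),(90,4,'C');
""")

QUERY = """
SELECT c.customer_id, c.customer_name
FROM Customers c
JOIN Orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name
HAVING SUM(CASE WHEN o.product_name = 'A' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN o.product_name = 'B' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN o.product_name = 'C' THEN 1 ELSE 0 END) = 0
ORDER BY c.customer_id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Daniel bought A and B but also C, so he is excluded; Elizabeth is the only customer who meets all three conditions.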
call_details table:
create table call_details (
call_type varchar(10),
call_number varchar(12),
call_duration int
);
--
insert into call_details
values ('OUT','181868',13),('OUT','2159010',8)
,('OUT','2159010',178),('SMS','4153810',1),('OUT','2159010',152),('OUT','9140152',18)
,('SMS','4162672',1),('SMS','9168204',1),('OUT','9168204',576),('INC','2159010',5)
,('INC','2159010',4),('SMS','2159010',1),('SMS','4535614',1),('OUT','181868',20)
,('INC','181868',54),('INC','218748',20),('INC','2159010',9),('INC','197432',66)
,('SMS','2159010',1),('SMS','4535614',1);
Q33. Write a query to retrieve the numbers that have both incoming and outgoing calls and whose total
outgoing call duration is greater than their total incoming call duration.
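A possible answer to Q33, sketched in SQLite against the call_details data above. It assumes 'OUT' marks outgoing calls, 'INC' marks incoming calls, and 'SMS' rows are ignored. Each number is aggregated once, and the HAVING clause checks both that each call type is present and that the outgoing total is larger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE call_details (call_type VARCHAR(10), call_number VARCHAR(12), call_duration INT);
INSERT INTO call_details VALUES
 ('OUT','181868',13),('OUT','2159010',8),('OUT','2159010',178),('SMS','4153810',1),
 ('OUT','2159010',152),('OUT','9140152',18),('SMS','4162672',1),('SMS','9168204',1),
 ('OUT','9168204',576),('INC','2159010',5),('INC','2159010',4),('SMS','2159010',1),
 ('SMS','4535614',1),('OUT','181868',20),('INC','181868',54),('INC','218748',20),
 ('INC','2159010',9),('INC','197432',66),('SMS','2159010',1),('SMS','4535614',1);
""")

QUERY = """
SELECT call_number
FROM call_details
GROUP BY call_number
HAVING SUM(CASE WHEN call_type = 'OUT' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN call_type = 'INC' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN call_type = 'OUT' THEN call_duration ELSE 0 END)
     > SUM(CASE WHEN call_type = 'INC' THEN call_duration ELSE 0 END)
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Number 181868 has both call types but only 33 outgoing against 54 incoming, so it is filtered out; 2159010 has 338 outgoing against 18 incoming.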
Purchase_date DATE
);
Challenge: Write a SQL query to calculate the YoY growth rate of the amount spent by each customer.
Q35: Inventory Threshold Analysis
- Scenario: Identify days when product stock falls below a critical level.
- Table Schema:
CREATE TABLE supplier_inventory (
Supplier_id VARCHAR,
Product_id VARCHAR,
Stock_quantity INT,
Record_date DATE
);
Challenge: Find the periods when stock quantity was below 50 units for more than two consecutive
days.
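This challenge is a classic "gaps and islands" problem. The sketch below assumes SQLite 3.25 or later, one record per supplier and product per day, and ISO-formatted dates. Low-stock rows that fall on consecutive days share the same value of (day number minus row number), which serves as the group key. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier_inventory (Supplier_id VARCHAR, Product_id VARCHAR,
                                 Stock_quantity INT, Record_date DATE);
-- Invented sample data: a 3-day low run, one normal day, then a 2-day low run
INSERT INTO supplier_inventory VALUES
 ('S1','P1',40,'2024-01-01'),('S1','P1',45,'2024-01-02'),('S1','P1',30,'2024-01-03'),
 ('S1','P1',80,'2024-01-04'),('S1','P1',20,'2024-01-05'),('S1','P1',10,'2024-01-06');
""")

QUERY = """
WITH low AS (
    SELECT Supplier_id, Product_id, Record_date,
           CAST(julianday(Record_date) AS INTEGER)
             - ROW_NUMBER() OVER (PARTITION BY Supplier_id, Product_id
                                  ORDER BY Record_date) AS grp
    FROM supplier_inventory
    WHERE Stock_quantity < 50
)
SELECT Supplier_id, Product_id,
       MIN(Record_date) AS period_start,
       MAX(Record_date) AS period_end,
       COUNT(*) AS low_days
FROM low
GROUP BY Supplier_id, Product_id, grp
HAVING COUNT(*) > 2          -- more than two consecutive days
ORDER BY Supplier_id, Product_id, period_start
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

The two-day run on January 5 and 6 is dropped because the challenge asks for more than two consecutive days.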
Q36: Employee Sales Performance Ranking
- Scenario: Rank employees in each store by their sales in the current year.
- Table Schema:
CREATE TABLE employee_sales (
Employee_id VARCHAR,
Store_id VARCHAR,
Sale_amount DECIMAL(10,2),
Sale_year INT
);
Challenge: Select Employee_id, Store_id, and rank based on Sale_amount for 2023.
Q37: Frequent Flyers from the Same Airport
- Scenario: Find passengers with frequent flights from the same airport over the last year.
- Table Schema:
CREATE TABLE passenger_flights (
Passenger_id VARCHAR,
Flight_id VARCHAR,
Departure_date DATE
);
CREATE TABLE flight_details (
Flight_id VARCHAR PRIMARY KEY,
Departure_airport_code VARCHAR,
Arrival_airport_code VARCHAR
);
Challenge: Identify passengers with more than 10 flights from the same airport since last year.
Q38. Product Sales Analysis - Consider two tables, Products and Sales:
Table Schemas -
CREATE TABLE Products ( product_id INT PRIMARY KEY, product_name VARCHAR(255), category
VARCHAR(255) );
CREATE TABLE Sales ( sale_id INT PRIMARY KEY, product_id INT, sale_date DATE, quantity_sold INT
);
Assume necessary INSERT statements are already executed.
Task: Write an SQL query to find the names of products that have sold more than 100 units in the last
quarter. The result should be ordered by product_name.
Q39. Employee Attendance Tracking - Imagine you have an EmployeeAttendance table:
Table Schemas -
CREATE TABLE EmployeeAttendance ( attendance_id INT PRIMARY KEY, employee_id INT,
clock_in_time TIMESTAMP, clock_out_time TIMESTAMP );
Assume necessary INSERT statements are already executed.
Task: Write a query to find the average working hours per day for each employee. Include only
employees who have clocked more than 20 days. Order the results by average working hours
ascending.
Q40. Customer Feedback Analysis - You have two tables, Customers and Feedback:
Table Schemas -
CREATE TABLE Customers ( customer_id INT PRIMARY KEY, customer_name VARCHAR(255), email
VARCHAR(255) );
CREATE TABLE Feedback ( feedback_id INT PRIMARY KEY, customer_id INT, feedback_date DATE,
rating INT );
Assume necessary INSERT statements are already executed.
Task: Write an SQL query to find the names of customers who have given a rating of 5 more than 3
times in the last 6 months. Order the result by customer_name.
Q41.
Write an SQL query to select Product_id for products that had a stock quantity of less than 20 units
for each of the last 5 recorded days.
Q42.
Q1. Given a sales_data table, write a query to display the year-over-year change in total sales for each month.
Table structure:
CREATE TABLE sales_data (
Sale_id VARCHAR PRIMARY KEY,
Product_id VARCHAR,
Sale_amount DECIMAL(10,2),
Sale_date DATE
);
Required: Write an SQL query to calculate the percentage change in total sales for each month
compared to the same month in the previous year.
Q2. You have a customer_orders table that tracks every order placed by customers. Find customers
who placed orders in the first quarter of 2022 but did not place any orders in the first quarter of
2023.
Table structure:
CREATE TABLE customer_orders (
Order_id VARCHAR PRIMARY KEY,
Customer_id VARCHAR,
Order_amount DECIMAL(10,2),
Order_date DATE
);
Required: Write an SQL query to select the Customer_id of those who were active in Q1 of 2022
but inactive in Q1 of 2023.
Q3. Given an employee_tasks table, rank employees based on the number of tasks completed this
month.
Table structure:
CREATE TABLE employee_tasks (
Task_id VARCHAR PRIMARY KEY,
Employee_id VARCHAR,
Task_status VARCHAR, -- Values can be 'Completed', 'In Progress', or 'Not Started'
Task_date DATE
);
Required: Write an SQL query to select Employee_id and a rank based on the count of completed
tasks for the current month, with 1 being the employee with the most tasks completed.
Q4. Using the product_inventory table, identify products that have consistently low stock (below 20
units) for the last 5 days.
Table structure:
CREATE TABLE product_inventory (
Product_id VARCHAR,
Stock_quantity INT,
Record_date DATE);
Q43.
#Data#
import pandas as pd
import numpy as np
data = {
    'Date': pd.date_range(start='2023-01-01', periods=180, freq='D'),
    'CompanyX_StockPrice': np.random.randint(50, 150, 180),
    'CompanyY_Sales': np.random.randint(20000, 50000, 180),
    'CompanyZ_StockPrice': np.random.randint(70, 200, 180)
}
df = pd.DataFrame(data)
Q44:
Question 1: Calculate the average stock price for Company X over the last 6 months.
Question 2: Identify the month with the highest total sales for Company Y using their monthly sales
data.
Question 3: Find the maximum and minimum stock price for Company Z on any given day in the last
year.
Question 4: Create a column in the DataFrame showing the percentage change in stock price from
the previous day for Company X.
Question 5: Determine the number of days when the sales of Company Y were above its 30-day
moving average.
Question 6: Compare the average stock price of Companies X and Z in the first quarter of the year.
Q45:
#Data#
import pandas as pd
import numpy as np
data = {
    'Date': pd.date_range(start='2023-01-01', periods=180, freq='D'),
    'CompanyX_StockPrice': np.random.randint(50, 150, 180),
    'CompanyY_Sales': np.random.randint(20000, 50000, 180),
    'CompanyZ_StockPrice': np.random.randint(70, 200, 180)
}
df = pd.DataFrame(data)
Task: Write a SQL query to calculate the average stock price of CompanyX for each month, along
with the total sales of CompanyY for each month. Display the results with the month and year as
the grouping criteria.
Q46:
A. SQL-:
1 - You have two tables - Employee & Department
Columns of Employee Table - Emp_id, Emp_Name, Salary, Dept_id
Columns of Department Table - Dept_id, Dept_Name
Write an SQL query to fetch the Dept_Name that has the second-highest average salary.
2 - You have two tables - Orders & Delivery_Status
Columns of Orders Table - order_id, timestamp, product_id, unit_price
Columns of Delivery_Status Table - order_id, timestamp, status
FYI - the status column has these entries - New, Cancelled, RTO, Returned
NMV = sum of unit_price (orders with status = New) - sum of unit_price (orders with status =
Cancelled) - sum of unit_price (orders with status = RTO) - sum of unit_price (orders with
status = Returned)
Q47:
Write an SQL query to calculate the NMV value month-on-month. The output table should contain these
two columns - Month, NMV.
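A sketch of one way to compute NMV per month in SQLite. It assumes the month is taken from the order's own timestamp (not the status timestamp), that an order can have several status rows, and that each status row contributes +unit_price for New and -unit_price for Cancelled, RTO or Returned, which follows the NMV formula above. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (order_id INT, "timestamp" TEXT, product_id INT, unit_price INT);
CREATE TABLE Delivery_Status (order_id INT, "timestamp" TEXT, status TEXT);
-- Invented sample data
INSERT INTO Orders VALUES
 (1,'2024-01-05 10:00:00',11,100),(2,'2024-01-20 09:00:00',12,50),
 (3,'2024-02-03 12:00:00',13,80),(4,'2024-02-10 15:00:00',14,30);
INSERT INTO Delivery_Status VALUES
 (1,'2024-01-05 10:00:00','New'),
 (2,'2024-01-20 09:00:00','New'),(2,'2024-01-21 09:00:00','Cancelled'),
 (3,'2024-02-03 12:00:00','New'),(3,'2024-02-15 12:00:00','Returned'),
 (4,'2024-02-10 15:00:00','New');
""")

QUERY = """
SELECT strftime('%Y-%m', o."timestamp") AS Month,
       SUM(CASE
             WHEN d.status = 'New' THEN o.unit_price
             WHEN d.status IN ('Cancelled', 'RTO', 'Returned') THEN -o.unit_price
             ELSE 0
           END) AS NMV
FROM Orders o
JOIN Delivery_Status d ON d.order_id = o.order_id
GROUP BY Month
ORDER BY Month
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

In this sample, January nets 100 + 50 - 50 = 100 and February nets 80 - 80 + 30 = 30. Other engines would replace strftime with their own month-truncation function.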
Q48.
1. You have two tables: Product and Supplier.
- Product Table Columns: Product_id, Product_Name, Supplier_id, Price
- Supplier Table Columns: Supplier_id, Supplier_Name, Country
Write an SQL query to find the name of the product with the highest price in each country.
2. You have two tables: Customer and Transaction.
- Customer Table Columns: Customer_id, Customer_Name, Registration_Date
- Transaction Table Columns: Transaction_id, Customer_id, Transaction_Date, Amount
Write an SQL query to calculate the total transaction amount for each customer for the current
year. The output should contain Customer_Name and the total amount.
------------------------------------------------------------------------------------------------------------------------------------------------------