Google SQL Interview Questions

Mohammed Enayat ur Rahman

SQL Interview Questions for Data Analysts

linkedin.com/in/enayat-ur-rahman
1. Median Number of Searches

Problem Statement: For an IPL ad
campaign, you need to determine the
median number of searches made by fans
last year. The data is stored in a summary
table with columns searches (the number
of searches) and num_users (the number
of users who performed that many
searches). Because the data is
pre-aggregated, taking the median of the
searches column directly would give the
wrong answer: each row stands for
num_users users, not one.

Table: search_frequency (columns: searches, num_users)

How to Solve:

1. Expand Data:
Create a detailed list where each
search count is repeated according to
the number of users. For example, if
10 users made 5 searches each, the
list should include 10 entries of 5
searches.

2. Calculate Median:
Use the expanded list to find the
median value. The median is the
middle value when all entries are
ordered. If the number of entries is
even, the median is the average of the
two middle values.
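
The two steps above can be sketched as follows. This is a minimal illustration rather than a definitive solution: it runs the SQL through Python's built-in sqlite3 module (SQLite 3.25 or later is assumed for window functions), expands the rows with a recursive CTE, and uses made-up sample data. The names MEDIAN_SQL, median_searches, expanded, and ranked are introduced here for illustration only.

```python
import sqlite3

# Expand each (searches, num_users) row into num_users rows, then take the
# middle row (odd count) or the average of the two middle rows (even count).
MEDIAN_SQL = """
WITH RECURSIVE expanded(searches, remaining) AS (
    SELECT searches, num_users
    FROM search_frequency
    WHERE num_users > 0
    UNION ALL
    SELECT searches, remaining - 1
    FROM expanded
    WHERE remaining > 1
),
ranked AS (
    SELECT searches,
           ROW_NUMBER() OVER (ORDER BY searches) AS rn,
           COUNT(*) OVER () AS n
    FROM expanded
)
SELECT AVG(searches) AS median
FROM ranked
WHERE rn IN ((n + 1) / 2, (n + 2) / 2)
"""


def median_searches(rows):
    """Return the median searches per user for (searches, num_users) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE search_frequency (searches INTEGER, num_users INTEGER)"
    )
    conn.executemany("INSERT INTO search_frequency VALUES (?, ?)", rows)
    (median,) = conn.execute(MEDIAN_SQL).fetchone()
    conn.close()
    return median


# Hypothetical sample: expands to 1, 1, 2, 2, 3, 3, 3, 4 (eight entries).
sample = [(1, 2), (2, 2), (3, 3), (4, 1)]
print(median_searches(sample))  # 2.5, the average of the 4th and 5th entries
```

Expanding every row is easy to follow but can be slow when num_users is large. A running total of num_users (SUM(...) OVER (ORDER BY searches)) is one possible alternative that finds the middle position without materializing the full list.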

2. Sum of Odd and Even Measurements
Problem Statement: You need to calculate the
sum of measurements taken at various cricket
matches, where measurements are
categorized by odd and even row numbers.
You have a table measurements with columns
measurement_time and measurement_value.

Table: measurements (columns: measurement_time, measurement_value)

How to Solve:

1. Assign Row Numbers:
Use the ROW_NUMBER() function to
assign a row number to each
measurement, partitioned by match
date (the date part of
measurement_time) and ordered by
measurement time, so that numbering
restarts at 1 for each date.

2. Calculate Sums:
Use conditional aggregation to sum
measurements based on whether their
row number is odd or even.
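
A possible sketch of these two steps, again using Python's sqlite3 module (SQLite 3.25 or later assumed). It assumes measurement_time is stored as ISO-formatted text so that DATE() can extract the match date; the sample rows and the names ODD_EVEN_SQL, odd_even_sums, and numbered are illustrative, not from the original problem.

```python
import sqlite3

# Number the measurements within each date, then sum odd and even rows
# separately with conditional aggregation.
ODD_EVEN_SQL = """
WITH numbered AS (
    SELECT DATE(measurement_time) AS match_date,
           measurement_value,
           ROW_NUMBER() OVER (
               PARTITION BY DATE(measurement_time)
               ORDER BY measurement_time
           ) AS rn
    FROM measurements
)
SELECT match_date,
       SUM(CASE WHEN rn % 2 = 1 THEN measurement_value ELSE 0 END) AS odd_sum,
       SUM(CASE WHEN rn % 2 = 0 THEN measurement_value ELSE 0 END) AS even_sum
FROM numbered
GROUP BY match_date
ORDER BY match_date
"""


def odd_even_sums(rows):
    """Return (match_date, odd_sum, even_sum) tuples for the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements (measurement_time TEXT, measurement_value REAL)"
    )
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    result = conn.execute(ODD_EVEN_SQL).fetchall()
    conn.close()
    return result


# Hypothetical sample: three measurements on one date, two on the next.
sample = [
    ("2024-04-01 09:00:00", 10),
    ("2024-04-01 11:00:00", 20),
    ("2024-04-01 13:00:00", 30),
    ("2024-04-02 10:00:00", 5),
    ("2024-04-02 12:00:00", 7),
]
for row in odd_even_sums(sample):
    print(row)
```

On 2024-04-01 the first and third measurements (10 and 30) are the odd rows and the second (20) is the even row, so that date yields sums of 40 and 20.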

3. Google Maps - Most Off-Topic UGC
Problem Statement: As a Data Analyst on
the Google Maps User Generated Content
team, you and your Product Manager are
investigating user-generated content
(UGC) – photos and reviews that
independent users upload to Google
Maps.

Identify which venue type (e.g.,
Restaurant, Bar) has the highest amount
of "off-topic" user-generated content
(UGC). You have two tables: place_info
(with place categories) and
maps_ugc_review (with UGC details).

How to Solve:

1. Count Off-Topic UGC:
Join place_info with maps_ugc_review
on place ID. Filter UGC to include only
those tagged as "Off-topic". Count the
occurrences of off-topic UGC for each
venue category.

2. Find Top Venue Category:
Determine which category has the
highest count of off-topic UGC.
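
One way these steps might look, sketched with Python's sqlite3 module. The slides do not show the table schemas, so the column names place_id, place_category, content_id, and content_tag are assumptions, as are the sample rows and the helper name top_off_topic_category.

```python
import sqlite3

# Join the two tables on place ID, keep only off-topic UGC, count per
# category, and return the category with the largest count.
TOP_CATEGORY_SQL = """
SELECT p.place_category, COUNT(*) AS off_topic_count
FROM place_info AS p
JOIN maps_ugc_review AS r ON r.place_id = p.place_id
WHERE r.content_tag = 'Off-topic'
GROUP BY p.place_category
ORDER BY off_topic_count DESC
LIMIT 1
"""


def top_off_topic_category(places, reviews):
    """Return (place_category, off_topic_count) for the top category."""
    conn = sqlite3.connect(":memory:")
    # Assumed schemas; the real tables may have different columns.
    conn.execute("CREATE TABLE place_info (place_id INTEGER, place_category TEXT)")
    conn.execute(
        "CREATE TABLE maps_ugc_review "
        "(content_id INTEGER, place_id INTEGER, content_tag TEXT)"
    )
    conn.executemany("INSERT INTO place_info VALUES (?, ?)", places)
    conn.executemany("INSERT INTO maps_ugc_review VALUES (?, ?, ?)", reviews)
    result = conn.execute(TOP_CATEGORY_SQL).fetchone()
    conn.close()
    return result


# Hypothetical sample data.
places = [(1, "Restaurant"), (2, "Bar"), (3, "Restaurant")]
reviews = [
    (101, 1, "Off-topic"),
    (102, 1, "Relevant"),
    (103, 2, "Off-topic"),
    (104, 3, "Off-topic"),
]
print(top_off_topic_category(places, reviews))  # ('Restaurant', 2)
```

LIMIT 1 returns a single row even when two categories tie for first place. If ties should all be reported, ranking the counts with RANK() and keeping rank 1 is one option.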

4. Popular Search Categories

Problem Statement: Find the total
number of searches per category for the
year 2024, and group the results by
month. You have two tables: searches
(with search details) and categories (with
category names).

Tables: categories, searches

How to Solve:

1. Join Tables:
Combine the searches table with the
categories table to include category
names.

2. Count Searches:
Aggregate the number of searches by
category and month for the year 2024.
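
A minimal sketch of the join and aggregation, run through Python's sqlite3 module. The schemas are not shown in the slides, so the columns category_id, category_name, search_id, and search_date (ISO-formatted text) are assumptions, as are the sample rows and the helper name searches_per_month.

```python
import sqlite3

# Join searches to categories, keep only 2024, and count per category
# and calendar month.
MONTHLY_SQL = """
SELECT c.category_name,
       STRFTIME('%m', s.search_date) AS month,
       COUNT(*) AS total_searches
FROM searches AS s
JOIN categories AS c ON c.category_id = s.category_id
WHERE s.search_date >= '2024-01-01'
  AND s.search_date < '2025-01-01'
GROUP BY c.category_name, STRFTIME('%m', s.search_date)
ORDER BY c.category_name, month
"""


def searches_per_month(categories, searches):
    """Return (category_name, month, total_searches) rows for 2024."""
    conn = sqlite3.connect(":memory:")
    # Assumed schemas; the real tables may have different columns.
    conn.execute("CREATE TABLE categories (category_id INTEGER, category_name TEXT)")
    conn.execute(
        "CREATE TABLE searches "
        "(search_id INTEGER, category_id INTEGER, search_date TEXT)"
    )
    conn.executemany("INSERT INTO categories VALUES (?, ?)", categories)
    conn.executemany("INSERT INTO searches VALUES (?, ?, ?)", searches)
    result = conn.execute(MONTHLY_SQL).fetchall()
    conn.close()
    return result


# Hypothetical sample data; the last two rows fall outside 2024.
categories = [(1, "Sports"), (2, "News")]
searches = [
    (1, 1, "2024-01-05"),
    (2, 1, "2024-01-20"),
    (3, 1, "2024-02-03"),
    (4, 2, "2024-01-11"),
    (5, 2, "2023-12-31"),
    (6, 2, "2025-01-01"),
]
for row in searches_per_month(categories, searches):
    print(row)
```

Filtering with a half-open date range (on or after 1 January 2024, before 1 January 2025) keeps the year boundary exact and, in many databases, lets an index on the date column be used.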

5. What is Database Denormalization?
Problem Statement: Explain the concept
of denormalization in database design.

Denormalization is a database design
approach where tables are combined to
simplify the schema and improve query
performance.

This process deliberately introduces
redundancy by merging tables or copying
columns, which reduces the need for
complex joins and can speed up read
operations.

The trade-off is that redundant data
takes more storage and must be kept in
sync on every write, so denormalization
suits read-heavy workloads better than
write-heavy ones.
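
To make the idea concrete, here is a small sketch using Python's sqlite3 module. The customers, orders, and orders_denorm tables and their rows are hypothetical, chosen only to show the same report with and without a join, and the extra rows a rename touches in the denormalized copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each customer name is stored once.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);

    -- Denormalized: the customer name is copied onto every order row.
    CREATE TABLE orders_denorm AS
    SELECT o.order_id, o.customer_id, c.name AS customer_name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id;
""")

# Same report, with a join (normalized) and without one (denormalized).
joined = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
flat = conn.execute("""
    SELECT customer_name, SUM(amount)
    FROM orders_denorm
    GROUP BY customer_name
    ORDER BY customer_name
""").fetchall()
print(joined == flat)  # both queries return the same totals

# Cost of redundancy: renaming one customer touches one row in the
# normalized schema but every copied row in the denormalized table.
normalized_rows = conn.execute(
    "UPDATE customers SET name = 'Asha K' WHERE customer_id = 1"
).rowcount
denorm_rows = conn.execute(
    "UPDATE orders_denorm SET customer_name = 'Asha K' WHERE customer_id = 1"
).rowcount
print(normalized_rows, denorm_rows)  # 1 2
conn.close()
```

The read query against orders_denorm needs no join, while the rename has to update two rows instead of one; missing either copy would leave the data inconsistent.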

Thank you