DP-203T00 Microsoft Azure Data Engineering-03
Notebook experience
Read and write data simply by writing code in a shared notebook
Read data in Azure Databricks
Working with Select in Azure Databricks
show(…): The show(…) command is part of the core Spark API and simply prints the results to the console.
display(…): The display(…) command provides more flexibility than show(…), such as downloading results as CSV, rendering charts, and showing up to 100 rows.
limit(…): The limit(…) command can be used to control the number of records that are returned in a DataFrame.
Optimize DataFrames in Azure Databricks
DateTime manipulation
Applying different date and time techniques across DataFrames
Aggregate Functions
Using the groupBy() function with the sum(), count(), avg(), min(), and max() functions
Deduplication of Data
Removing duplicates so that only one record is kept from each set of duplicate records
Review questions
Q03 – You need to find the average of sales transactions by storefront. Which of
the following aggregates would you use?
A03 – df.groupBy(col("storefront")).avg("completedTransactions")
Lab: Data Exploration and Transformation in Azure Databricks
Lab overview
This lab teaches you how to use various Apache Spark DataFrame methods to explore and transform data in
Azure Databricks. You will start with standard DataFrame methods for exploring and transforming data, and then
move on to more advanced tasks, such as removing duplicate data, manipulating date/time values, renaming
columns, and aggregating data.
Lab objectives
After completing this lab, you will be able to:
Use DataFrames in Azure Databricks to explore and filter data
Q01 – What is a function that allows you to print data to the console in Azure
Databricks?
Q02 – How can you transform a DataFrame called wholedatasetDF by selecting only two
columns?
Next steps
After the course, consider visiting the [Azure Databricks concepts] and architectures website, where the
associated documentation goes into more depth about the architectures and concepts related to Azure Databricks.
© Copyright Microsoft Corporation. All rights reserved.