
AP-CSA-Data-Lab

Uploaded by

sarika
Copyright
© © All Rights Reserved

AP CSA Data Lab

Students are motivated if they feel connected to the material they are studying. One way to make
computer science content relevant, meaningful, relatable, and exciting is to incorporate real-world data
sets into hands-on programming assignments.
Data are values. Values by themselves are not knowledge; knowledge is what you gain from studying data.
This lab is designed to encourage students to derive knowledge about a topic of interest to them by
examining a set of freely available data from the web. In order to gain knowledge from data in this lab,
students will pose a question, search for a data set on that topic, and then process that data to determine
the answer to their question.

Real-World Connection
Being able to access and process data is an important part of many industries. Whether it is determining
the best allocation of resources or deciding what new product to create, analyzing data is a necessary
component of those conversations. Luckily there are tools available to help read and process data, and this
lab will introduce one such tool to students and ask them to utilize existing real-world data to answer a
question of their own interest.

Week 1
Lesson 1: Data, Data, Data!
Lesson 2: Designing and Implementing a Custom Class
Lesson 3: Putting It All Together

Week 2
Lesson 4: Open-Ended Activity

Key: ■ Instructional Lesson, Assessment, ✀ Unplugged Lesson

 Lesson 1: Data, Data, Data!


The ease with which data can be stored, cataloged, and searched in an ever more-connected
society is an important point to understand. Data storage has both positive and negative
implications, and being able to recognize these, as well as how students' actions can contribute to
the data being stored, are addressed in this activity.

1-2 File Types
3-4 Posing Questions
5 Refining Your Questions
6-7 Finding Data Sets
8-9 Check for Understanding

 Lesson 2: Designing and Implementing a Custom Class


Students have likely created a class from a given description or specification several times. This is an
important skill, but equally important is the ability to determine essential information to include when
creating a class. What is "essential" can vary based on perspective, or can be determined by a
question that is being asked or a problem to be solved. This activity will give
students an opportunity to practice making this type of determination.

1 Cereal Data Set
2-4 Designing the Cereal Class
5-6 Implementing the Cereal Class
7-9 Check for Understanding

 Lesson 3: Putting It All Together


Existing code in the form of libraries is incredibly useful and a powerful aspect of object-oriented
programming. Beyond the standard Java library, users around the world have created and published
libraries to perform countless tasks. One such library, which students will be using in this activity and
the one that follows, is the Sinbad library. This library allows students to create a Java program that
can connect to a data source, read in data, and then process this data. The goal of this activity is to
provide practice working with the Sinbad library.

1-2 Getting Started
3-5 Arrays and Lists of Objects
6-7 Check for Understanding

 Lesson 4: Open-Ended Activity


The goal of this activity is to allow students to demonstrate their knowledge of class design and data
structures in a way that is interesting and engaging to them.

1-5 Check for Understanding

Lesson 1: Data, Data, Data!
45 minutes

Overview
The ease with which data can be stored, cataloged, and searched in an ever more-connected
society is an important point to understand. Data storage has both positive and negative
implications, and being able to recognize these, as well as how students' actions can contribute to
the data being stored, are addressed in this activity.

Links
Heads Up! Please make a copy of any documents you plan to share with students.
For the students
cereal.csv - Resource
cereal.xlsx - Resource

Agenda
Suggested Reading
Warm Up (10 minutes)
Activity (25 minutes)
File Types
Posing Questions
Finding Data Sets
Wrap Up (10 minutes)

Teaching Guide
Suggested Reading
Although they are not assessed, the ethical and social implications of computing systems are covered in
this course. This lab provides a real-world premise to discuss these topics. Students should be given some
pre-reading assignments, and class discussions should ensue to debate issues brought up in the articles.
Ideas include:
An article about data science in general (including job opportunities).
An article about DNA testing to determine ancestry and the idea that a company has collected a lot of
information about its participants without disclosing how it might be used in the future.
An article debating the points about smart home listening devices.

Warm Up (10 minutes)


Do This: Spend a few minutes discussing privacy policies. Regardless of whether a service is offered for
free or at a cost, the companies that students interact with on a regular basis collect data about them.
Privacy policies are commonplace everywhere, from social media to your local doctor's office. By agreeing
to use a service, and often just by visiting a website, you are agreeing to the privacy policy.
Do This: Have students list sites that they typically visit and then compile a class list. In groups, have
students find the privacy policy for a specific site. Ask them to identify two or three pieces of information
that are collected. This information should include information about the device used to access the site,
such as type or IP address, as well as information about specific content viewed.
 Teaching Tip 

Although it is not assessed, students should spend time discussing the ethical implications of
computing, and this lab provides an appropriate scenario to have this discussion. The ease with which
data can be stored, cataloged, and searched in an ever more-connected society is an important point
for students to discuss.

Activity (25 minutes)


File Types
Do This: From within Microsoft Excel or Google Sheets, open both Cereal.csv and Cereal.xlsx. They
look the same and contain the same data, although the file type is different. Open the .csv file in a text
editor and point out the comma-separated values. Each row of the original file is on its own line. Then try
to open the .xlsx file in a text editor and notice that the contents are illegible.
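To make the idea of comma-separated values concrete, the following is a minimal Java sketch of how a program might break one line of a .csv file into fields. The class name, method name, and sample rows are made up for illustration and are not part of the lab materials. The sketch assumes that no field contains a comma or quotation marks, which real data sets do not always guarantee.

```java
import java.util.Arrays;
import java.util.List;

class CsvLineDemo {
    // Splits one line of a simple .csv file into its fields.
    // Assumption: no field contains a comma or quotation marks.
    static String[] splitLine(String line) {
        // The -1 limit keeps trailing empty fields, so "a,b," has 3 fields.
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // Made-up rows in the style of a cereal data file.
        List<String> lines = Arrays.asList(
                "name,type,sugars",
                "Sample Flakes,C,6",
                "Sample Oats,H,1");
        for (String line : lines) {
            String[] fields = splitLine(line);
            System.out.println(fields.length + " fields: " + Arrays.toString(fields));
        }
    }
}
```

Each line of the text file becomes one array of strings, which mirrors what students see when they view the .csv file in a text editor.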
 Teaching Tip 

Although not a part of the required curriculum, it is important for students to understand different file
types, what certain file extensions mean, and how to open, view, or access data within different types
of files. It may be difficult to have students explore different file types on their own depending on
classroom setup, so if students don't all have individual access to Microsoft Excel, it might be best to
display it to the class.
Another file type that you or your students might run into when completing this lab is JSON files
(.json). JSON stands for JavaScript Object Notation, and although files of this type look harder to
read than a .csv or .xlsx file, the data is formatted specifically to be read and utilized by computer
programs. It is possible to complete the activities of this lab without using or interacting with JSON files;
however, there are options for differentiation/modification that incorporate the use of this file type.

Do This: Direct students to Level 1 on Code Studio to complete Levels 1 and 2 by responding to the
prompts.

 1-2 File Types


Posing Questions
Do This: Direct students to Level 3 on Code Studio to complete Levels 3 and 4 by responding to the
prompts.

 3-4 Posing Questions


Do This: Have students discuss their two questions with a partner and refine their questions based on the
partner's feedback.
 Teaching Tip 

Although this lab is not a traditional research project, there are components of various lab activities that
mimic what you might see in a research project. Being able to write, or ask, an appropriate question for
investigation is something that takes practice, and students will likely need help throughout the process.
This activity is meant to get students thinking about what question they might like to answer and what
data exists to help them answer that question, so that when they get to the final activity they have the
tools needed to explore the topic of their choice. Whenever possible, encourage students to ask
questions about a topic that they find meaningful or interesting.
When asked to share their questions with a partner, encourage students to be critical of the questions,
and to offer constructive feedback on how the questions might be improved.

Do This: Direct students to Level 5 on Code Studio to write their updated questions.

 5 Refining Your Questions

Finding Data Sets


Do This: Direct students to Level 6 on Code Studio to complete Levels 6 and 7 by responding to the
prompts.

 6-7 Finding Data Sets


 Teaching Tip 

Provide students with a list of places to find data. Here are a few links:
https://www.kaggle.com (This site requires a free account, which can be an existing Google
account, in order to access and search data sets.)
https://data.gov
https://datasetsearch.research.google.com
All three allow searching of their data catalog (the collection of data sets), and students should spend a
few minutes looking at the various topics of data available. Examples of topics include education,
finance, nutrition, government, athletics, and technology.
Students will identify at least two data sets that relate to one of the questions that they generated and
answer questions about the file type and number of records.
Option for Differentiation: Students who have experience with data or who are interested in doing
more can investigate the idea of "cleaning data." Cleaning data is a process that makes the data
uniform without changing its meaning, for example, replacing the different abbreviations, spellings, and
capitalizations of a value with one consistent form. Students can investigate techniques for detecting
inaccurate data, correcting data, and determining whether it is most appropriate to correct or remove such records.
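For students who take up this option, here is a minimal Java sketch of one possible cleaning step: mapping several spellings of the same value to a single form. The lookup table, class name, and sample values are hypothetical and chosen only for illustration. A real data set would need its own table, built after inspecting the data.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CleanDataDemo {
    // Hypothetical lookup table: lowercase variant -> one consistent form.
    static final Map<String, String> CANONICAL = new HashMap<>();
    static {
        CANONICAL.put("ny", "New York");
        CANONICAL.put("n.y.", "New York");
        CANONICAL.put("new york", "New York");
    }

    // Returns the consistent form of a value, or the trimmed original
    // when the value is not in the lookup table.
    static String clean(String raw) {
        String trimmed = raw.trim();
        String key = trimmed.toLowerCase(Locale.ROOT);
        return CANONICAL.getOrDefault(key, trimmed);
    }

    public static void main(String[] args) {
        String[] samples = {" NY", "n.y.", "New York ", "Boston"};
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + clean(s));
        }
    }
}
```

Values that are not in the table pass through unchanged apart from trimming, which leaves the decision to correct or remove unusual records to a separate step.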

Wrap Up (10 minutes)


Do This: With a partner, have students discuss one way that user data captured by a site (whether
knowingly or unknowingly) has contributed to an improvement in the service provided. Have there been
any positive impacts of this data outside of the service or website?
Do This: Direct students to Level 8 on Code Studio to complete Levels 8 and 9 by responding to the
prompts.

 8-9 Check for Understanding


 Teaching Tip 

In addition to understanding the types of data that are collected about individuals, this activity exposes
students to the different ways that this data can be stored, and also how to find appropriate data to
answer questions of interest. Although students do not program in this activity, it sets them up to be
able to complete the open-ended activity. Students are typically able to list privacy concerns with data
storage, such as security breaches that might result in identity theft, but data also allows for
improvements to be made that can affect the lives of millions.
The purpose of the Check for Understanding questions in this activity is to get students thinking about
some of the positive aspects of data collection that might not be as apparent as the negative aspects.
The final question could be answered in a number of ways, and the focus should be on the explanation
or justification more than the "yes" or "no."

This work is available under a Creative Commons License (CC BY-NC-SA 4.0).
If you are interested in licensing Code.org materials for commercial purposes contact us.
Lesson 2: Designing and Implementing a Custom Class
45 minutes

Overview
Students have likely created a class from a given description or specification several times. This is an
important skill, but equally important is the ability to determine essential information to include when
creating a class. What is "essential" can vary based on perspective, or can be determined by a
question that is being asked or a problem to be solved. This activity will give
students an opportunity to practice making this type of determination.

Agenda
Warm Up (10 minutes)
Activity (30 minutes)
Wrap Up (5 minutes)

Teaching Guide
Warm Up (10 minutes)
Do This: Direct students to Level 1 on Code Studio to review the table. Have students discuss the following
questions:
What do you think the table is describing?
What do the letters 'C' and 'H' represent in the Type column?
Each row of the table represents an instance of an object. What is the best name for that object?

 1 Cereal Data Set

Do This: In groups or pairs, have students answer a specific question about the cereals on the list, for
example: Which is the cereal with the most sugar on the list? Which cereal is the highest in fiber on the list?
 Teaching Tip 

Highlight the fact that this table includes only the first 21 rows of a much larger table. While it was
relatively easy to answer the questions while only looking through 21 rows of data, briefly discuss the
time it would take to determine the answers given 100 rows, then 1,000 rows, and 10,000 rows.
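The following Java sketch illustrates why a program scales where a manual search does not: finding the largest value takes one pass over the rows, whether there are 21 or 10,000. The class name, method name, and sample values are made up for illustration and are not taken from the lab's cereal file.

```java
class MostSugarDemo {
    // Returns the index of the largest value by checking every row once.
    // Assumption: the array has at least one element.
    static int indexOfMax(int[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not values from the lab's data set.
        String[] names = {"Sample Flakes", "Sample Oats", "Sample Puffs"};
        int[] sugars = {6, 1, 12};
        int i = indexOfMax(sugars);
        System.out.println("Most sugar: " + names[i]);
    }
}
```

The same loop works unchanged for a larger array, so the work grows with the number of rows rather than requiring a new approach.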

Activity (30 minutes)


Do This: Direct students to Level 2 on Code Studio to complete Levels 2, 3, and 4 by responding to the
prompts.

 2-4 Designing the Cereal Class


 Teaching Tip 

Keep students in their groups or pairs and have them answer the questions in terms of their response
to the specific question that they were asked during the warm up based on the 21 rows of data. Each
row contains several pieces of data about cereal, and not all of the data is necessary to represent in a
class. Although not required, students should use the column header as the variable name when they
are determining instance variables. This helps with readability and maintainability of their code, and
using different variable names adds unneeded complexity.
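As a point of reference for discussion, here is one possible sketch of a small Cereal class in Java. The choice of instance variables (name, type, sugars, fiber) is an assumption made for illustration. Students' classes will differ depending on the question they were asked, and the actual column headers in the data file should guide their variable names.

```java
class Cereal {
    // Instance variables named after assumed column headers.
    private final String name;
    private final char type;    // 'C' or 'H', as in the Type column
    private final int sugars;
    private final double fiber;

    Cereal(String name, char type, int sugars, double fiber) {
        this.name = name;
        this.type = type;
        this.sugars = sugars;
        this.fiber = fiber;
    }

    String getName() { return name; }
    char getType() { return type; }
    int getSugars() { return sugars; }
    double getFiber() { return fiber; }

    @Override
    public String toString() {
        return name + " (type " + type + ", sugars " + sugars + ", fiber " + fiber + ")";
    }
}

class CerealDemo {
    public static void main(String[] args) {
        // Made-up values used only to test the class.
        Cereal c = new Cereal("Sample Flakes", 'C', 6, 2.5);
        System.out.println(c);
    }
}
```

A group that was asked only about sugar might reasonably omit the fiber field, which is the kind of design decision this activity asks students to justify.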

Do This: Direct students to Level 5 on Code Studio to complete Levels 5 and 6 by implementing and testing
the Cereal class.

 5-6 Implementing the Cereal Class


Wrap Up (5 minutes)
Do This: Direct students to Level 7 on Code Studio to complete Levels 7, 8, and 9 by responding to the
prompts.

 7-9 Check for Understanding


 Teaching Tip 

When creating a class, there is a tradeoff between the complexity of the object created (the more data
that is stored, the more complex the class) and the questions that can be answered. More questions will
be able to be answered if there is more data to examine. It is important for students to realize that
decisions they make early on about what data to store could limit the questions that they can answer
with the data.
The provided questions will start to get at this idea, but it isn't explicitly stated. The answers that
students will have to these questions will depend on their implementation of the Cereal class.
Lesson 3: Putting It All Together
90 minutes

Overview
Existing code in the form of libraries is incredibly useful and a powerful aspect of object-oriented
programming. Beyond the standard Java library, users around the world have created and published
libraries to perform countless tasks. One such library, which students will be using in this activity and
the one that follows, is the Sinbad library. This library allows students to create a Java program that
can connect to a data source, read in data, and then process this data. The goal of this activity is to
provide practice working with the Sinbad library.

Links
Heads Up! Please make a copy of any documents you plan to share with students.
For the students
WeatherStation.java - Resource
Welcome01.java - Resource
Welcome02_Object.java - Resource
Welcome03_List.java - Resource

Agenda
Setup
Activity (80 minutes)
Wrap Up (10 minutes)

Teaching Guide
Setup
This lab utilizes a third-party library called Sinbad, which is not supported within Java Lab. This activity and
the next in this lab make use of the Sinbad library for data collection and processing. There are installation
tips for installing the Sinbad library for DrJava, Eclipse, and Processing provided here:
http://berry-cs.github.io/sinbad/install-java. If you are using a different IDE, you will need to determine how
to utilize an external Java library within your IDE of choice.
This activity will utilize the tutorials for the library that can be found at
https://github.com/berry-cs/sinbad/tree/master/tutorials/java. Specifically, students will complete the
welcome01.md, welcome02-obj.md, and welcome03-objs.md tutorials. It is recommended to give students the Welcome01.java file to
ensure that the library is set up properly and it will compile, although the first tutorial could be given as a
homework assignment before beginning this activity. If you plan on having students work through the
tutorial, you are encouraged to work through it yourself ahead of time. The tutorial also contains possible
extensions, which could be assigned to the entire class or individual students if desired.
While using the library, you or your students might notice a pop-up box asking for permission to collect
data on the usage of the Sinbad library. Whether you allow data to be collected or not, this provides an
opportunity to discuss the ethics of data collection as a class, including potential benefits of such data
collection.
Activity (80 minutes)
Do This: Direct students to Level 1 on Code Studio to complete Levels 1 and 2 by responding to the
prompts.

 1-2 Getting Started

Do This: Once students have verified that their IDE is set up appropriately to use the Sinbad library (either
through compiling and running Welcome01.java or by working through the first tutorial), have students
work through the Fetching Objects (welcome02-obj.md) tutorial. This tutorial focuses on creating objects
from a data source.
Do This: Using the location from earlier prompts, have students modify Welcome02_Object.java that they
completed in the tutorial to create a third Observation object for their identified location, and then write
the code to determine the coldest location between all three Observation objects.
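The comparison logic for this step can be sketched without the Sinbad library, as shown below. This Observation class is a simplified stand-in written for illustration, not the class from the tutorial, and the locations and temperatures are made up. In the actual exercise, the objects would be created from a data source as described in the tutorial.

```java
class Observation {
    private final String location;
    private final double temp; // temperature; unit assumed to match across objects

    Observation(String location, double temp) {
        this.location = location;
        this.temp = temp;
    }

    String getLocation() { return location; }
    double getTemp() { return temp; }

    // True when this observation is strictly colder than the other one.
    boolean colderThan(Observation other) {
        return this.temp < other.temp;
    }
}

class ColdestDemo {
    // Returns the coldest of three observations; the earlier argument wins a tie.
    static Observation coldest(Observation a, Observation b, Observation c) {
        Observation result = a;
        if (b.colderThan(result)) {
            result = b;
        }
        if (c.colderThan(result)) {
            result = c;
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up values standing in for data fetched from a source.
        Observation first = new Observation("Location A", 50.0);
        Observation second = new Observation("Location B", 31.5);
        Observation third = new Observation("Location C", 42.0);
        System.out.println("Coldest: " + coldest(first, second, third).getLocation());
    }
}
```

Keeping a running "coldest so far" reference avoids writing out every pairwise comparison, and the same pattern extends naturally to a list of observations.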
Do This: Have students work through the Arrays and Lists of Objects (welcome03-objs.md) tutorial. This
tutorial combines the use of objects with a data structure to store all records of the data file and mimics
what students will be doing in the open-ended activity.
 Teaching Tip 

It is recommended that you have students work through the ArrayLists tutorial. The tutorials for
arrays and ArrayLists are almost identical, so it is not necessary to have students complete both. The
goal of this activity is to give them practice working with the Sinbad library so that they can complete
the open-ended activity.
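The pattern that students practice here, storing every record as an object in an ArrayList and then looping over the list, can be sketched as follows. This WeatherStation class is a simplified stand-in written for illustration and is not the provided WeatherStation.java. The station names and states are made up, and in the tutorial the list would be filled from a data source by the Sinbad library rather than by hand.

```java
import java.util.ArrayList;
import java.util.List;

class WeatherStation {
    private final String name;
    private final String state;

    WeatherStation(String name, String state) {
        this.name = name;
        this.state = state;
    }

    String getName() { return name; }
    String getState() { return state; }
}

class StationListDemo {
    // Returns a new list holding only the stations in the given state.
    static List<WeatherStation> inState(List<WeatherStation> all, String state) {
        List<WeatherStation> matches = new ArrayList<>();
        for (WeatherStation ws : all) {
            if (ws.getState().equals(state)) {
                matches.add(ws);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up records standing in for data read from a file or URL.
        List<WeatherStation> all = new ArrayList<>();
        all.add(new WeatherStation("Station One", "GA"));
        all.add(new WeatherStation("Station Two", "WA"));
        all.add(new WeatherStation("Station Three", "GA"));

        for (WeatherStation ws : inState(all, "GA")) {
            System.out.println(ws.getName());
        }
    }
}
```

Once the records are in a list, answering a different question usually means changing only the condition inside the loop.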

Do This: Direct students to Level 3 on Code Studio to complete Levels 3, 4, and 5 by responding to the
prompts.

 3-5 Arrays and Lists of Objects


Wrap Up (10 minutes)


Do This: Direct students to Level 6 on Code Studio to complete Levels 6 and 7 by responding to the
prompts.

 6-7 Check for Understanding


 Teaching Tip 
The purpose of the Check for Understanding questions is to help students realize that once they can
read in data from a data set, they can answer countless questions about that data with very little
additional code. When they first start discussing the types of questions that they can answer, they will
probably be very similar to what they have already done. Allow them to spend time in groups, or even
as a class, discussing the types of questions they can answer.

Lesson 4: Open-Ended Activity
180 minutes

Overview
The goal of this activity is to allow students to demonstrate their knowledge of class design and data
structures in a way that is interesting and engaging to them.

Links
Heads Up! Please make a copy of any documents you plan to share with students.
For the students
Open-Ended Activity Requirements - Resource
Open-Ended Activity Scoring Guidelines - Resource

Agenda
Activity (135 minutes)
Wrap Up (45 minutes)

Teaching Guide
Activity (135 minutes)
While it is possible to create additional constraints or requirements, it is best to provide students with as
much freedom as possible. The use of any existing classes is intentionally not included in the list of
requirements.
All requirements are based on the identified question, so spend time with students talking about what
makes a good question. Students should also have a good understanding of the processing that will be
required to answer their question.
Be sure to indicate how much time students will have to work on this activity. The suggested time is only
four class periods, and students often underestimate the time it will take them to implement an idea, so
you may want to review their project ideas beforehand. It is possible to ask students to complete work
outside of class in order to stay within the class periods allotted; however, it's important to provide
students with at least some in-class time to answer the associated questions.
There are several things that you can do to help facilitate student implementation of this activity:
Allow and encourage collaboration during the development of the program. Either in pairs or groups,
students should work together to design, implement, and test their chosen project.
If using pair programming, consider how students should rotate roles and whether they will be notified
as to when to switch roles or if they should determine when to switch on their own.
Assist in resolving technical problems that impede work.
Determine before work begins how students should handle non-technical questions, whether students
should resolve problems within their collaborative groups, with another group, or with you.
Do not allow students to begin answering the Check for Understanding questions while the program is
being implemented. The questions should be answered independently after the program is complete.
 Teaching Tip 

Option for Differentiation: Students can choose to use a data set that comes from JSON data as
opposed to .csv data. Another option for more-advanced students would be to answer a question
using data accessed through an API, such as Twitter or Facebook.

Wrap Up (45 minutes)


Do This: Direct students to Level 1 on Code Studio to complete Levels 1 through 5 by responding to the
prompts.

 1-5 Check for Understanding


 Teaching Tip 

For students to show that they understand the concepts, it's important for them to describe their work.
Have them answer the questions individually.
The scoring guidelines are provided in order to assist in the assessment of student work. There are a
variety of ways that these guidelines could be used, including:
provide students with a blank copy of the guidelines and have them score their own program; or
provide students with a blank copy and ask them to evaluate the program of another student or
group; or
use these when grading programs, and provide students with a printed or electronic copy with
notes about each point, whether they earned or did not earn the point, and why.
This list is not comprehensive, and the guidelines may be used in ways that are not listed here.
Additional rows may be added, or focus may be on just a few rows. These guidelines are meant to be
flexible.
Ask additional questions to better understand the development process and provide students with
opportunities to improve their communication skills. One option is to ask students to identify one
programming difficulty they had when completing this activity, and describe how they overcame this
difficulty.

