CIT-137-W1HB FA19 Syllabus
CIT-137-M1
Introduction to Big Data
With R and RStudio
COURSE DESCRIPTION: This course provides foundation-level training for students who want to learn the
programming language R. The course provides grounding in basic and moderate analytic methods along with
an introduction to the field of data science and some of the tools used. Students will learn the language
through labs that show how it is applied to real-world business
challenges. Students will learn some of the more popular libraries in R (dplyr, ggplot2, lubridate, and tidyr)
and will be able to prepare data for future analysis. The course takes an "Open", or technology-neutral,
approach and includes a final lab that addresses a data science challenge by applying the concepts taught in
the course with an open-source database. Prerequisite: Information Technology Problem Solving (CIT113) or
equivalent (CIT110, CIT120, CIT182 or department chair approval).
COURSE LEARNING OBJECTIVES: By the end of this course students will be able to:
Use the R programming language to write functions and loops, examine and explore data, and use
libraries that add data-analysis functionality, such as dplyr, ggplot2, lubridate, and tidyr.
Use basic statistical parameters and show how data from various distributions can be analyzed
and modeled.
Demonstrate how to turn unstructured data (messy data) into structured data (tidy data).
Demonstrate how to search online databases and find open data sources on the internet.
Utilize resiliency skills, improve communication, and learn to overcome obstacles in a rapidly changing
environment while working on a complex, multistage project.
Show how to scrape data from the web, clean it, and present it to a user in a readable, often visual,
format using the tools and techniques learned throughout the course.
INSTRUCTOR: The instructor for this course is Professor Michael Harris.
Office: D123E
Email address: [email protected]
Telephone: Office: (617) 228-2486, Cell: (617) 480-3003
Office Hours: M 1:00-2:15, W 2:30-3:45, Th 1:00-2:15
STUDENT REQUIREMENTS: To complete this course and receive a final grade and full credit, each student must:
STUDENT EVALUATION: A letter grade will be awarded at the completion of the course according to the following
weighted average:
COURSE ASSIGNMENT GRID
Wk | Topic                       | Datacamp Work           | Discussion                  | Programming Assignment
3  | Subsetting                  | Importing Data Datacamp | Facebook Data Policies      |
4  | Loops                       | Intermediate R:         | Google data policies        |
5  | dplyr                       | Dplyr Datacamp          | Apple data policies         |
8  | Ggplot2 cont                |                         | Cambridge Analytica         | Tourism Project
9  | Intermediate Data Wrangling | Data Cleaning Datacamp  | Ethical Concerns part 1     |
10 | Tidy Data                   | Datacamp Tidy Data      | Ethical Concerns part 2     |
14 | Tableau                     | Project work            | What is a good data policy? |
GRADING INFORMATION AND CRITERIA:
Datacamp ASSIGNMENTS: Lab assignments are to be completed on the Datacamp.com website. Students
will receive an e-mail from Datacamp at the beginning of the semester and will sign up for the classroom on
Datacamp via the e-mail. Students can also use the BHCC computer lab.
The College’s Computer Lab is open five (5) days per week during the summer, and its fall/spring
schedule is as follows:
o Charlestown Campus, Room D111
Fall and Spring Semester Hours:
Monday - Thursday, 7:00 am to 9:45 pm
Friday, 8:00 am to 9:45 pm
Saturday - Sunday, 9:00 to 3:45
PROGRAMMING ASSIGNMENTS: Students will receive two different programming assignments during the
semester. The programming assignments test depth of knowledge of the software and are comprehensive
in design. Students will submit their code to Moodle as a .R or a .Rmd file by the required due date. Plan
on spending at least 10 hours on each programming assignment; they take 10-15 hours on average to
complete.
DISCUSSIONS: There will be weekly discussions on various topics throughout the semester. Students are
expected to post an initial answer to the discussion topic in the discussion forum by Tuesday at 11:59 pm, and
they are expected to respond to at least two of their classmates as well. Each follow-up post should do one of
the following:
Extend or add to the classmate's point(s).
Ask a clarifying question.
Disagree (with reasoning and evidence, if possible).
Otherwise add to understanding of the topic.
FINAL PROJECT: This course does not have a final examination; instead, there will be a final project of the
student’s choosing. The final project will consist of a data project in which the student will examine, clean,
and explore the data and run a prediction algorithm on it. The final project will be graded according to a rubric
that will be handed out when the discussion of the project starts, around the middle of the
semester (week 8 or 9).
DISCUSSION POLICY: Students must be active in all class discussion sessions. The Student Services Office
(617.228.2000) should be notified if a student will be absent for an extended period of time. See the Student
Handbook for more details.
TEACHING METHODOLOGY: This is an online class which will be taught through a problem-based
learning methodology. This means your grade will be determined not by exams, but by how well you do on
your homework, discussions, and class projects.
ATTENDEES: Only registered students are allowed in the classroom, and the door must be kept closed during
class time. A student who wishes to leave the room during class time must close the door behind them and
will be let back in upon their return.
STUDENT CODE OF BEHAVIOR: Students found guilty of violating the code of ethics will be subject to
the rules set out in BHCC policy. Below is a statement from the BHCC catalog:
“If it is proven that a student in any course in which he or she is enrolled has knowingly cheated or
plagiarized, this may result in a failing grade for an exam or assignment, withdrawal from the course or a
failing grade for the course. The student would also be subject to disciplinary proceedings as outlined in
the Student Handbook for violation of the Student Code of Conduct.”
POLICY FOR INDIVIDUALS WITH DISABILITIES: BHCC is committed to providing equal access to the
educational experience of all students in compliance with Section 504 of the Rehabilitation Act of 1973 and the
Americans with Disabilities Act of 1990. A student with a documented disability, who has not already done so,
should schedule an appointment at the Office for Students with Disabilities (Room D106) in order to obtain
appropriate services.
Please Note: The above schedule is subject to change.
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To
view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to Creative
Commons, PO Box 1866, Mountain View, CA 94042, USA.