CIT-137-M1 SP19 Syllabus
CIT-137-M1
Spring 2020, Monday 6:00-9:45
Introduction to Big Data
With R and RStudio
COURSE DESCRIPTION: This course provides practical foundation level training that enables immediate and
effective participation in big data and other analytics projects. It includes an introduction to big data and the
Data Analytics Lifecycle to address business challenges that leverage big data. The course provides grounding
in basic and advanced analytic methods and an introduction to big data analytics technology and tools. Labs
offer opportunities for students to understand how these methods and tools may be applied to real world
business challenges by a practicing data scientist. The course takes an "Open", or technology-neutral
approach, and includes a final lab which addresses a data science challenge by applying the concepts taught in
the course with an open source database. Prerequisite: Information Technology Problem Solving (CIT113) or
equivalent (CIT110, CIT120, CIT182 or department chair approval).
COURSE OBJECTIVES: Students should be able to do the following after completing this course.
Describe the role of Data Science in society, and state how data is used in a real world environment.
Describe various tools a data scientist uses and demonstrate how to use RStudio, an open source
GUI (graphical user interface) for the CLI (command line interface) software R.
Utilize R to write functions and loops, examine and explore data, and use libraries that add functionality
for data analysis, such as tidyverse, dplyr, ggplot2, lubridate, tidyr, stringr, and reshape2.
Utilize basic statistical parameters related to distributions and show how data can be used and analyzed
from distributions.
Demonstrate how to turn unstructured data (messy and clean data) into structured data (tidy data).
Demonstrate how to live link R, Excel and Tableau to a database, and update the software as the
database updates in real time.
Demonstrate how to search for online databases, find open data sources, and search the data for
answers to questions.
Utilize resiliency skills, improve communication, and learn to overcome obstacles in a rapidly changing
environment while working on a complex, multistage group project.
Show how to web scrape data, clean it, and present the data to a user in a readable, often visual,
format which utilizes tools and techniques learned throughout the course.
INSTRUCTOR: The instructor for this course is: Professor <Michael Harris>
E-mail Address: <[email protected]>
Desk Location: <D123E>
Telephone: <617-480-3003>
Office Hours: <W 4:00-5:45, Th 11:30-2:15 >
REQUIRED COURSE MATERIAL:
1. R and RStudio software
2. DataCamp online course content (MOOC)
3. Course website: https://sites.google.com/site/cit137sp19
STUDENT REQUIREMENTS: To complete this course and receive a final grade and full credit, each student must:
1. Attend classes
2. Complete all homework assignments
3. Complete all required Lab Projects
4. Complete a final project and give a presentation on the project
STUDENT EVALUATION: A letter grade will be awarded at the completion of the course according to the following
weighted average:
1 Course Introduction; Introduction to R
2 Introduction to RStudio; Chapter 1-2; Introduction to R
3 Subsetting; Chapter 3-4
4 Loops; Intermediate R
14 Data Vis
15 Tufte
GRADING INFORMATION AND CRITERIA:
ATTENDANCE POLICY: Each student is required to attend all class sessions. The Student Services Office
(617.228.2000) should be notified if a student will be absent for an extended period of time. See the Student
Handbook for more details.
MOBILE DEVICES: Cellphones are not to be used during class time. If you need to take a call, leave the
room in a manner that does not disrupt the class. Laptops and cellphones must be muted at all times.
TEACHING METHODOLOGY: This class will be taught through a problem-based learning methodology, so
your grade will be determined not by exams, but by how well you do on your homework and the class projects.
ATTENDEES: Only registered students are allowed in the classroom, and the door must be kept closed during
class time. A student who wishes to leave the room during class time must close the door behind them and
will be let back in upon their return.
STUDENT CODE OF BEHAVIOR: Students found guilty of violating the code of ethics will be subject to
the rules listed in BHCC policy. Below is a statement from the BHCC catalog:
“If it is proven that a student in any course in which he or she is enrolled has knowingly cheated or
plagiarized, this may result in a failing grade for an exam or assignment, withdrawal from the course or a
failing grade for the course. The student would also be subject to disciplinary proceedings as outlined in
the Student Handbook for violation of the Student Code of Conduct.”
POLICY FOR INDIVIDUALS WITH DISABILITIES: BHCC is committed to providing equal access to the
educational experience of all students in compliance with Section 504 of the Rehabilitation Act of 1973 and the
Americans with Disabilities Act of 1990. A student with a documented disability, who has not already done so,
should schedule an appointment at the Office for Students with Disabilities (Room D106) in order to obtain
appropriate services.
LAB ASSIGNMENTS: Documentation will be assigned for each lab. The documentation will include screen captures of
the lab work and answers to the questions contained in the lab. If there are any problems understanding any part
of the lab, this should also be noted. This class does not require a laptop or home computer. If you do not have
either a laptop or computer, the college has resources available to you, such as the following.
The College’s Computer Lab is open five (5) days per week during the summer, and its schedule is as
follows:
o Charlestown Campus, Room D111
Fall and Spring Semesters Hours:
Monday - Thursday, 7am to 9:45pm
Friday, 8am to 9:45pm
Saturday – Sunday, 9:00 – 3:45
HOMEWORK ASSIGNMENTS: Homework assignments vary and may include reading, studying, preparing questions, and
writing papers. All are to be handed in on time. If a problem should arise, please use my contact information.
EXAMINATIONS: This course does not have examinations. Instead, the student’s grade will be based on the
homework and lab work completed during the semester. Some of the lab work will be collaborative, project-based
assignments and some will be solo assignments.
The tools shown in this class are for educational purposes only and the instructor/BHCC is not responsible for
any of my actions. The VMs are not to be cloned from the classroom unless explicitly instructed to do so. Also,
if I choose to bring in my laptop, the instructor/BHCC is not responsible for any theft or malfunctions of the
device. Sending a reply to this email constitutes my understanding and agreement to what I have just read.
Please type “I agree with the syllabus and accept the terms therein” if you agree with these terms.
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To
view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to Creative
Commons, PO Box 1866, Mountain View, CA 94042, USA.