UCR Syllabus


UCR Extension

Syllabus
Introduction to Data Science with Python

I. Course Information

Course Title & number: Introduction to Data Science with Python


Format of the course: Online
Course location: Online
Course start and end dates: 6/23/2017 – 7/15/2017
Number of Units and hours: 3 Units

II. Instructor Contact Information

Name: Kemal Oflus


Time Zone: PST – California, USA
Email address: [email protected]
Phone number: (949) 292-8389

III. Course Description and Purpose

Course Description:
This class introduces students to data science and its methods, starting with
essential exploratory data analysis techniques using the Python programming
language. These techniques are typically applied before formal modeling commences and
can help inform the development of more complex statistical models. Exploratory
techniques are also important for eliminating or sharpening potential hypotheses about
the world that can be addressed by the data. We will cover in detail how to create
informative plots, as well as some of the basic principles of constructing data graphics,
and then move on to common multivariate statistical techniques used to visualize and
understand high-dimensional data and to build models.

In the class, we will apply the fundamentals of statistical inference in a practical manner
for getting things done. After investigating correlations and trends, we will move to building
interactive models of the kind used in a wide variety of industries, to give students a “real life”
use case for the application of data science. After taking this course, students will be familiar
with using Python for data science tasks and will be able to use this knowledge to make
informed choices when analyzing data.

Learning outcomes and course objectives


By the end of the class, students will be familiar with the underlying statistics for data
science; how to approach different data types; appropriate analysis and visualization
methods; how to present analysis results; and, finally, how to build high-fidelity models
using various techniques in the Python language.

Instructional methods
The course will be taught via Moodle.

Assignments
There will be hands-on homework and project assignments to give students an
opportunity to apply what they learn in the class.

IV. Course Prerequisites


Students should be familiar with Microsoft Office tools (Excel, Word, PowerPoint)
at a basic level of use and understanding. A basic understanding of statistics and math
is also expected.

V. Required course materials


Jupyter notebook, Python software, lecture notes, and online lectures

VI. Course Organization

The course is organized as follows:

Basics of the Python Language


- Variables
- Functions
- Control Loops
- Bring in data from various sources

SciPy, NumPy and Basic Statistics


- SciPy
- NumPy
- Basic Statistics
o Distributions
o Statistical Inference
o Variance

Pandas and Use Cases


- Pandas description and basics
- Examples
- Take-home work

Visualization
- Matplotlib
- Seaborn
- Bokeh

Scikit-learn – Machine Learning


- One-to-one relationships
o Correlations and Trends
o ANOVA
o Categorical Analysis

- Multivariate Analysis
o Correlations
o Dimensionality Reduction
o Principal Components Analysis
o Clustering

- Regression
o Assumptions
o Linear Regression
o Logistic Regression

- Decision Trees
o Assumptions
o Single Decision Tree
o A forest of trees
o Gradient Boosted Trees

- Neural Networks
o Assumptions
o Sigmoid, Linear and Stochastic Components
o Building a multilayer, multi-node neural network using Keras

The relevant datasets and IPython notebooks will be posted on the Moodle site
before class so that students can become familiar with the material before the
concepts and examples are introduced.

VII. Course Attendance / Participation

The course will be delivered online, and students are expected to participate. Attendance
and participation will count toward the final grade.

VIII. Grading Policy and Grade Scale

The final grade will be based on a data science project. This will be a Kaggle-style project
in which students turn in their predicted results for a test dataset along with the Python
code used to create the predictions. Students may select any method, from something as
simple as a linear fit to something as complicated as a multi-layer, multi-node neural
network. The final grade will be determined by the performance of the model on the
test dataset.
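
As a rough illustration of the simplest end of that range, the sketch below fits a straight line to training data, predicts values for a held-out test set, and scores the predictions. The toy data, the helper names (`fit_line`, `predict`, `rmse`), and the choice of root-mean-square error as the metric are all assumptions made for illustration; the syllabus does not specify the project dataset or the scoring metric. It uses only the Python standard library so that it runs without extra installs.

```python
# Hypothetical toy data; the actual project dataset is not specified in the syllabus.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.0, 4.0, 6.0, 8.0, 10.0]
test_x = [6.0, 7.0]
test_y = [12.0, 14.0]  # true values, used only for scoring


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


def rmse(actual, predicted):
    """Root-mean-square error (an assumed metric, chosen for illustration)."""
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5


slope, intercept = fit_line(train_x, train_y)
predictions = predict(test_x, slope, intercept)
score = rmse(test_y, predictions)
print("predictions:", predictions)
print("RMSE on test set:", score)
```

In practice the same workflow (fit on training data, predict on the test set, submit predictions with the code) would more likely use NumPy or scikit-learn, both of which appear in the course outline.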

IX. Student Email Accounts

Your email account is an important tool for your participation in this course. Make sure that
your mailbox has enough room to accept messages and attachments. If you are using an
email account provided by your employer, check to see that your account can receive
email from outside your local network. School districts frequently reject emails from our
server because of filtering software and many students never receive course
announcements or other materials. Additionally, do not use an automated responder with
the email account you are using with your course. If you have concerns about getting
unwanted emails because your email account is visible to others in your course, set up an
account specifically for your online course using a free service (Google, Yahoo,
Hotmail).

X. Plagiarism

All written work must be the product of the student submitting the work. While students
may be permitted by the instructor to work together on in-class assignments, all work
done outside the classroom must be done by the student without collaboration or sharing
with other students or non-students. Credit must be given for any material used which is
not created by the student, including images. If a student is determined to have violated
this policy, he/she will receive a zero for the assignment and be reported to the Program
Director. A second finding of plagiarism or cheating will result in the student being
withdrawn from the course by the instructor and reported to the Registrar.

Academic Integrity at UCR:
http://conduct.ucr.edu/policies/academicintegrity.html

UCR Policy on Plagiarism and Academic Integrity:
http://senate.ucr.edu/bylaws/?action=read_bylaws&code=app&section=06%20)

For Online Courses:

Introduction

This course uses the Moodle course management system. To participate in the course,
you will need Internet access and a web browser which works with Moodle. You may
visit the following website for current information about which web browsers are
compatible with Moodle: http://www.delhi.edu/cis/moodle/browsercheck.php

Necessary Technical Skills

In order to complete this course, you should know how to:

- Access websites and search for material online
- Create and send documents as email attachments
- Download and open files on your computer
- Save files in required formats (MS Word, PDF) and upload them to your class

If you need additional instruction you may access the Moodle site tutorial at:
http://docs.moodle.org/20/en/Student_tutorials

Security

If you access the course from a public computer, be sure to log out of the course and
completely close the browser when you are done. This will prevent others from accessing
the course using your student identification. Do not share your NetID and password with
others.

Participation Guidelines

- Check the forums frequently.
- Use the email subscription feature to receive email alerts when someone posts to
the forum.
- Keep your responses on the topic of discussion.
- Use informative titles with your forum posts.
- Use capitalization to highlight a point, but don't post messages in all caps. This is
usually interpreted as shouting.
- Think about what you have written before you post it to the forum. Moodle allows
30 minutes for you to reconsider and edit your message before others will see it.
- Cite all quotes, references, and sources.
- Keep your postings brief, but when you need to write something longer, you can
warn others at the start of your post that it is lengthy.
- Be careful how you use humor online. It's not as easy to tell that something is a
joke as it might be in face-to-face communication. Use emoticons such as the
smiley face :-) to indicate humor.

Your online presence is an important part of the class. You should log on at least twice a
week and make contributions to the online forums. Responding to someone's forum post
with "Yes, that's a good point" or "I agree with that" doesn't count as adding to the
discussion. Start a new topic or make a substantive contribution to the existing
discussion.

Participation in the online forums each week is required to earn a passing grade in this
course.

Support

If you have any questions related to Moodle, please email [email protected] or
contact your instructor. Problems with your NetID and course access should be sent to
[email protected].
