Assignment 1 - Introduction To Data Science


Assignment 1: Introduction to Data Science

Lecture 1: Understanding Data Science, Data Engineering, and Data Analysis

Instructions:

1. Essay (20 points):
○ Write a detailed essay (1000-1500 words) on the differences and
interconnections between Data Science, Data Engineering, and Data Analysis.
○ Include the following sections:
■ Introduction: Define each field and explain their importance in the
context of modern data-driven decision-making.
■ Core Responsibilities: Outline the primary responsibilities and tasks
associated with each role.
■ Tools and Technologies: Discuss the common tools and technologies
used by professionals in each field.
■ Case Studies: Provide real-world examples or case studies illustrating
how these roles collaborate in a project.
■ Conclusion: Summarize the key points and reflect on future trends in
these fields.
2. Research Project (30 points):
○ Choose a specific industry (e.g., healthcare, finance, retail) and conduct research
on how data science, data engineering, and data analysis are applied within that
industry.
○ Create a report (1500-2000 words) that includes:
■ Introduction: Briefly describe the chosen industry and its relevance.
■ Applications: Detail how each of the three fields contributes to solving
industry-specific problems.
■ Challenges: Identify the key challenges faced by professionals in these
roles within the industry.
■ Innovations: Highlight any innovative solutions or emerging trends.
■ Interviews: Conduct interviews with at least two professionals working in
these fields and include their insights in your report.
■ Conclusion: Reflect on the insights gained from the research and
interviews.

Lecture 2: Introduction to Python

Instructions:

1. Python Coding Assignment (25 points):
○ Write Python code to solve the following problems. Submit your code along with
detailed comments explaining your logic.
○ a. Data Types and Operations:
■ Create a script that accepts user input for two numbers and performs the
following operations: addition, subtraction, multiplication, division, and
modulus. Display the results.
○ b. Data Structures:
■ Implement a Python program that creates a dictionary of 5 students, each
with a nested dictionary containing their name, age, and grades in three
subjects. Write functions to:
■ Calculate the average grade of each student.
■ Find the student with the highest average grade.
■ Sort the students by their average grades in descending order and
print the sorted list.
○ c. Control Flow and Functions:
■ Develop a function that takes a list of integers and returns a list of prime
numbers from the input list. Include a main block to test your function with
various inputs.
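As a rough illustration of the shape a submission for parts a to c might take, here is a minimal sketch, not a model answer. The function names (arithmetic, average_grade, rank_students, is_prime, filter_primes), the student IDs, and the sample records are assumptions made for this example; the assignment does not prescribe them. It stores grades as a nested dictionary keyed by subject, uses two students rather than the required five, and leaves out the input() prompts of part a so the functions can be tested directly.

```python
def arithmetic(a, b):
    """Return the five results asked for in part a.

    Division and modulus are reported as None when b is zero,
    instead of letting ZeroDivisionError stop the script.
    """
    return {
        "addition": a + b,
        "subtraction": a - b,
        "multiplication": a * b,
        "division": a / b if b != 0 else None,
        "modulus": a % b if b != 0 else None,
    }


def average_grade(student):
    """Average of one student's grades (assumed layout: subject -> grade)."""
    grades = student["grades"].values()
    return sum(grades) / len(grades)


def rank_students(students):
    """Return (student_id, average) pairs sorted by average, highest first."""
    averages = {sid: average_grade(info) for sid, info in students.items()}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


def is_prime(n):
    """Trial division up to the square root of n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def filter_primes(numbers):
    """Keep only the primes, preserving the input order."""
    return [n for n in numbers if is_prime(n)]


if __name__ == "__main__":
    # Hypothetical sample data: two students instead of the required five.
    students = {
        "s1": {"name": "Ada", "age": 20,
               "grades": {"math": 90, "physics": 80, "english": 70}},
        "s2": {"name": "Lin", "age": 21,
               "grades": {"math": 85, "physics": 95, "english": 90}},
    }
    print(arithmetic(7, 2))
    ranking = rank_students(students)
    print("Highest average:", ranking[0])
    print("Sorted:", ranking)
    print(filter_primes([-3, 0, 1, 2, 9, 11, 15, 17]))
```

The first element of the sorted ranking is the student with the highest average, so one helper covers both the "highest average" and "sort in descending order" tasks. For part a, the two numbers would be read with input() and converted with float() before being passed to arithmetic.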
2. Exploratory Data Analysis Project (25 points):
○ Use Python and libraries such as Pandas, NumPy, and Matplotlib to perform an
exploratory data analysis (EDA) on a provided dataset.
○ Dataset: [Provide a link to a dataset or include a sample dataset]
○ Tasks:
■ Load the dataset and display the first 10 rows.
■ Perform basic data cleaning (e.g., handling missing values, correcting
data types).
■ Generate summary statistics for numerical and categorical columns.
■ Visualize the distribution of key variables using appropriate plots
(histograms, box plots, bar charts, etc.).
■ Identify and visualize any correlations between variables.
■ Write a report (800-1000 words) summarizing your findings, including the
insights gained from the EDA and any interesting patterns or anomalies
discovered.
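The EDA tasks above could be organized along the following lines. This is only a sketch under stated assumptions: no dataset is specified here, so the column names (age, income, city), the in-memory sample table, the output file name, and the helper names (sample_data, clean, summarize, plot_distributions) are all hypothetical. With the provided dataset, the sample table would be replaced by a call such as pd.read_csv on the actual file.

```python
import numpy as np
import pandas as pd


def sample_data():
    """Hypothetical stand-in for the provided dataset."""
    return pd.DataFrame({
        "age": ["25", "31", None, "forty", "52"],      # numbers stored as text
        "income": [30000.0, 42000.0, 39000.0, np.nan, 80000.0],
        "city": ["Oslo", "Lima", None, "Oslo", "Lima"],
    })


def clean(df):
    """Basic cleaning: correct data types, then handle missing values."""
    df = df.copy()
    # Text that cannot be parsed as a number becomes NaN.
    df["age"] = pd.to_numeric(df["age"], errors="coerce")
    # Fill numeric gaps with the column median (one common choice among several).
    df["age"] = df["age"].fillna(df["age"].median())
    df["income"] = df["income"].fillna(df["income"].median())
    # Label missing categories explicitly and use a categorical dtype.
    df["city"] = df["city"].fillna("unknown").astype("category")
    return df


def summarize(df):
    """Summary statistics for numerical and categorical columns, plus correlations."""
    numeric = df.describe()
    categorical = df.describe(include=["category"])
    correlations = df.select_dtypes(include="number").corr()
    return numeric, categorical, correlations


def plot_distributions(df, path="eda_plots.png"):
    """Save a histogram and a bar chart; path is a hypothetical file name."""
    # Imported here so the helpers above also work without Matplotlib installed.
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    df["income"].plot.hist(ax=axes[0], bins=10, title="income")
    df["city"].value_counts().plot.bar(ax=axes[1], title="city")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


if __name__ == "__main__":
    raw = sample_data()
    print(raw.head(10))            # first 10 rows (fewer if the table is shorter)
    cleaned = clean(raw)
    numeric, categorical, correlations = summarize(cleaned)
    print(numeric)
    print(categorical)
    print(correlations)
```

Calling plot_distributions(cleaned) writes both plots to a single image file that can be embedded in the report. Filling gaps with the median is just one option; dropping incomplete rows may suit some datasets better, and the report should state which choice was made and why.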

Submission Guidelines:

● Submit all written assignments as PDF documents.
● Submit Python code files (.py) and Jupyter Notebooks (.ipynb) as applicable.
● Include all visualizations and results within your reports.
● Ensure that all submissions are properly formatted and free of grammatical errors.

Grading Criteria:

● Clarity and depth of explanations
● Completeness and correctness of code
● Insightfulness of analysis and reflections
● Creativity and originality in approach
● Proper use of tools and technologies

Deadline:

● 23 August at 23:00

Note: Collaboration is encouraged, but each student must submit their own work. Plagiarism will
not be tolerated.
