NeedForDS FACETS


INTRODUCTION

Dr. S. Angel Deborah


Assistant Professor, Dept. of CSE
Outline

• Need for data science


• Benefits and uses
• Facets of data
• Big data ecosystem
Need for data science
• With the aid of data science, businesses can understand their customers better and in finer detail.
• Data science enables products to tell their story in a strong and engaging way.
• Its results may be used in every industry, including travel, healthcare, and
education.
• There is a tremendous amount of data in the world today that, depending on
how it is used, can determine whether a product succeeds or fails.
Benefits and uses

• Data science and big data are used almost everywhere in both commercial and
noncommercial settings.
• Commercial companies in almost every industry use them to gain insights into their customers, processes, staff, competition, and products.
• Many governmental organizations not only rely on internal data scientists to discover valuable information, but also share their data with the public.
• Nongovernmental organizations use them to raise money and defend their causes.
Facets of data
• Structured
• Unstructured
• Natural language
• Machine-generated
• Graph-based
• Audio, video, and images
• Streaming
Structured data
• Structured data is data that depends on a data model and resides in a fixed field within a record.
• It is often easy to store structured data in tables within databases or Excel files.
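
As a minimal sketch of what "a fixed field within a record" can look like in practice, the example below stores a few records in an in-memory SQLite table using Python's standard library. The table name, column names, and sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical table and sample rows, used only to illustrate fixed fields.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers (id, name, age) VALUES (?, ?, ?)",
    [(1, "Asha", 34), (2, "Ravi", 41), (3, "Mina", 29)],
)

# Every record has the same fields, so a query can address them by name.
rows = conn.execute(
    "SELECT name FROM customers WHERE age > 30 ORDER BY id"
).fetchall()
older_than_30 = [name for (name,) in rows]
print(older_than_30)
conn.close()
```

Because each record follows the same data model, filtering by a field such as age needs no interpretation of the content.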
Unstructured data
• Unstructured data is data that isn’t easy to fit into a data model because the content is context-specific or varying.
Natural language
• Natural language is a special type of unstructured data.
• It’s challenging to process because it requires knowledge of specific data science
techniques and linguistics.
• The natural language processing community has had success in entity recognition, topic
recognition, summarization, text completion, and sentiment analysis.
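
To illustrate why natural language is hard to process, here is a deliberately naive sketch of sentiment scoring that only counts words from two small word lists. The word lists and the function name `naive_sentiment` are assumptions made for this example; this is not how production sentiment analysis works.

```python
import re

# Tiny hypothetical word lists; real lexicons are far larger.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}

def naive_sentiment(text):
    """Return (#positive words - #negative words) for the given text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives

clear_case = naive_sentiment("I love this phone, the camera is great")
negated_case = naive_sentiment("The battery is not good")
print(clear_case, negated_case)
```

The second sentence is scored as positive because word counting ignores the negation "not", which is the kind of linguistic detail that makes this data type challenging.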

Machine-generated data
• Machine-generated data is information that’s automatically created by a computer,
process, application, or other machine without human intervention.
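
A web server access log is one common kind of machine-generated data. The sketch below parses a single made-up log line, assumed to resemble the Common Log Format, into named fields with a regular expression. The sample line and the field names are illustrative assumptions.

```python
import re

# Made-up log line resembling the Common Log Format (illustrative only).
LOG_LINE = (
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326'
)

PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)

match = PATTERN.match(LOG_LINE)
# groupdict() turns the named groups into a field -> value mapping.
record = match.groupdict() if match else {}
print(record)
```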
Graph-based or network data
• “Graph data” can be a confusing term because any data can be shown in a graph.
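
The sketch below takes "graph" in the sense of nodes connected by edges, as the "network data" in the heading suggests; that reading is an assumption here. It stores a small, made-up friendship network as an adjacency list and uses breadth-first search to count the hops between two people. The names and the function `hops` are hypothetical.

```python
from collections import deque

# Made-up friendship network: each person maps to their direct friends.
FRIENDS = {
    "ana": ["ben", "cy"],
    "ben": ["ana", "dee"],
    "cy": ["ana"],
    "dee": ["ben"],
}

def hops(graph, start, goal):
    """Return the fewest edges between start and goal, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

distance = hops(FRIENDS, "cy", "dee")
print(distance)
```

Here the relationships between records, rather than the fields of each record, carry the information of interest.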

Audio, image, and video



Streaming data
• While streaming data can take almost any of the previous forms, it has an extra property.
• The data flows into the system when an event happens instead of being loaded into a data store
in a batch.
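
The difference between event-driven and batch processing can be sketched with Python generators: the consumer below updates a running mean as each value arrives instead of waiting for the full data set. The sensor readings and the function names are made up for this illustration, and a real stream would come from a network source or message queue rather than a list.

```python
def event_stream():
    """Stand-in for a live source; yields one hypothetical reading per event."""
    for reading in [21.0, 22.5, 19.5, 23.0]:
        yield reading

def running_mean(stream):
    """Yield the mean of all values seen so far, updated on every event."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count

means = list(running_mean(event_stream()))
print(means)
```

Only the count and the running total are kept in memory, so the result is available after every event without storing the whole stream.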
The big data ecosystem
Check your understanding
• What is the need for data science?
• What are the facets of data?
Summary
• Need for data science
• Benefits and uses
• Facets of data
• Big data ecosystem
Reference
• Davy Cielen, Arno D. B. Meysman, Mohamed Ali, “Introducing Data Science: Big Data, Machine Learning, and More, Using Python Tools”, Manning Publications Co., 2016.
