Unit-05 SSB DBMS


Noida Institute of Engineering and Technology, Greater Noida

Introduction To NOSQL with Cloud Database

Unit: 5

Mr. SOVERS SINGH BISHT
Dept. Data Science & CSBS
B-Tech 4th Sem NIET
ONLINE & Offline (All Branches)

Mr. SOVERS SINGH BISHT UNIT 05 1


12/05/2022

Mr. Sovers Singh Bisht


Dept. Head (Data Science & CSBS)
B.Tech, M.Tech in CSE

Area of Expertise: Data Science, Big Data Analytics & Cloud Computing.

Evaluation Scheme

• B. Tech (Data Science & Related Branches)


• Fourth Semester
• Professional Core Course

DATABASE MANAGEMENT SYSTEMS

L–T–P: 3–1–0    Credits: 4

Course Contents / Syllabus

Branch-wise Applications
• From the 1980s to the Internet era in the late 1990s, SQL databases
dominated the development landscape. Large commercial applications, niche
products, and custom applications of all types were based on SQL.
• But the rise of the Internet has changed application development profoundly.
The amount of data, the structure of the data, the scale of applications, the
way applications have developed have all changed dramatically.
• These changes have led many organizations of all sizes to adopt NoSQL
database technology.
• Today, data can easily be captured and accessed from many sources, such as
Facebook and Google.
• Users' personal information, geographic location data, user-generated
content, social graphs and machine logs are some examples of data that is
growing rapidly.
• Making use of such data requires processing it in large volumes.
• Relational databases are often not well suited to this, and NoSQL databases
evolved to handle these large volumes of data properly.

Course Objective

The objective of this course is:

• To present an introduction to database management systems, with an
emphasis on how to organize, maintain and retrieve information efficiently
and effectively in relational and non-relational databases.
After successfully completing this course, students will be able to:
• Distinguish the different types of NoSQL databases
• Understand the impact of the cluster on database design
• State the CAP theorem and explain its main points
• Explain where HBase, MongoDB, Cassandra, Neo4j, and Redis fit with the
CAP theorem
• Work with the Hadoop Distributed File System (HDFS) as a foundation for
NoSQL technologies
• Describe the design of HBase, MongoDB, Cassandra, Neo4j, and Redis
• Use the data control, definition, and manipulation languages of the NoSQL
databases covered in the course
Course Outcomes

Program Outcomes

Engineering Graduates will be able to Understand:


1. The pace of development with NoSQL databases can be much faster than with a
SQL database.
2. The structure of many different forms of data is more easily handled and evolved
with a NoSQL database.
3. The amount of data in many applications cannot be served affordably by a SQL
database.
4. The scale of traffic and need for zero downtime cannot be handled by SQL.
5. New application paradigms can be more easily supported.

CO-PO Mapping

         PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12
CO1      2    2    3    2    3    3    -    1    -    2     2     3
CO2      2    2    2    2    2    2    -    1    -    1     2     2
CO3      2    2    3    2    3    2    -    2    -    1     2     3
CO4      3    3    3    3    3    2    -    2    -    2     1     3
CO5      3    2    2    2    3    2    -    2    1    3     3     3

Average  2.4  2.2  2.6  2.2  2.8  2.2  -    1.6  1    1.8   2     2.8

Question Paper Template

SECTION – A                                      CO
1. Attempt all parts- [10×1=10]
  1-a. Question- (1)
  1-b. Question- (1)
  1-c. Question- (1)
  1-d. Question- (1)
  1-e. Question- (1)
  1-f. Question- (1)
  1-g. Question- (1)
  1-h. Question- (1)
  1-i. Question- (1)
  1-j. Question- (1)
2. Attempt all parts- [5×2=10]
  2-a. Question- (2)
  2-b. Question- (2)
  2-c. Question- (2)
  2-d. Question- (2)
  2-e. Question- (2)

SECTION – B                                      CO
3. Answer any five of the following- [5×6=30]
  3-a. Question- (6)
  3-b. Question- (6)
  3-c. Question- (6)
  3-d. Question- (6)
  3-e. Question- (6)
  3-f. Question- (6)
  3-g. Question- (6)

SECTION – C                                      CO
4. Answer any one of the following- [5×10=50]
  4-a. Question- (10)
  4-b. Question- (10)
5. Answer any one of the following-
  5-a. Question- (10)
  5-b. Question- (10)
6. Answer any one of the following-
  6-a. Question- (10)
  6-b. Question- (10)
7. Answer any one of the following-
  7-a. Question- (10)
  7-b. Question- (10)
8. Answer any one of the following-
  8-a. Question- (10)
  8-b. Question- (10)

Prerequisite and Recap

Prerequisites:
• Linux/Windows operating system.
• Database management software such as Oracle.
• Knowledge of SQL and PL/SQL.
• Cloud infrastructure.
• Handling big data in unstructured databases.
• Programming languages (Python or Java).

Recap:
• Discussion about Cloud and Database Management System.

Brief Intro of Subject with video
Digital transformation is the name for the trend toward serving customers using scalable,
customizable, Internet and mobile applications. These applications are often hard to build and evolve
rapidly using SQL technology. For this reason, from the mid-2000s to 2020 we have seen a steady rise
in the adoption of NoSQL database technology.

The rise of NoSQL is an important event in computer science and in application development because
SQL has been so dominant for so long. Many other forms of database technology have come and
gone, but few have had the wide adoption of NoSQL.

By understanding the rise in popularity of NoSQL databases, we should be able to shed light on when
it makes sense to use NoSQL.
NoSQL covers a lot of different database structures and data models.

• http://www.nptelvideos.com/lecture.php?id=6516
• http://www.nptelvideos.com/lecture.php?id=6517
• http://www.nptelvideos.com/lecture.php?id=6518
• http://www.nptelvideos.com/lecture.php?id=6519
• https://www.youtube.com/watch?v=2yQ9TGFpDuM

CONTENT
Introduction to NoSQL with Cloud Database:
1. Definition of NoSQL
2. History of NoSQL and Different NoSQL products
3. Exploring MongoDB
4. Interfacing and Interacting with NoSQL
5. NoSQL Storage Architecture
6. CRUD operations with MongoDB
7. Querying, Modifying and Managing NoSQL Data stores
8. Indexing and ordering datasets (MongoDB)
9. Cloud database: Introduction to Cloud databases
10. NoSQL with Cloud Database
11. Introduction to Real-time Databases

Unit Objective

The objectives of Unit 5 are:

1. To provide an overview of the exciting and growing field of NoSQL databases.

2. To build preliminary knowledge of the DBMS domain and to explain when
NoSQL should be used:
• When a huge amount of data needs to be stored and retrieved.
• When the relationships between the stored data items are not that important.
• When the data changes over time and is not structured.
• When support for constraints and joins is not required at the database level.
• When the data is growing continuously and you need to scale the database
regularly to handle it.

Definition of NoSQL

Objective:
In this topic we focus on the advantages of working with NoSQL databases
such as MongoDB and Cassandra. The main advantages are high scalability and
high availability. High scalability: a NoSQL database such as MongoDB uses
sharding for horizontal scaling.

Recap:
Revision of Database Management Systems.


NoSQL databases (aka "not only SQL") are non-tabular databases and store data differently
than relational tables. NoSQL databases come in a variety of types based on their data
model. The main types are document, key-value, wide-column, and graph. They provide
flexible schemas and scale easily with large amounts of data and high user loads.

When people use the term “NoSQL database,” they typically use it to refer to any non-
relational database. Some say the term “NoSQL” stands for “non SQL” while others say it
stands for “not only SQL.” Either way, most agree that NoSQL databases are databases that
store data in a format other than relational tables.

A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a
mechanism for storage and retrieval of data that is modeled in means other than the
tabular relations used in relational databases. Such databases have existed since the late
1960s, but the name "NoSQL" was only coined in the early 21st century, triggered by the
needs of Web 2.0 companies. NoSQL databases are increasingly used in big data and
real-time web applications. NoSQL systems are also sometimes called "Not only SQL" to
emphasize that they may support SQL-like query languages or sit alongside SQL databases
in polyglot-persistent architectures.


Introduction to NoSQL
1. Commonly expanded as "Not Only SQL".
2. The name "NoSQL" was first used in 1998 by Carlo Strozzi for his lightweight,
open-source relational database that did not expose an SQL interface.
3. Many NoSQL databases are open source.
4. Often presented as the future of databases.
5. Very compatible with distributed systems.
6. Lower cost.
7. High-performance databases.
8. Created to handle huge volumes of data.
9. Used by Facebook, Google, Wikipedia and others.

Figure: Increasing Web Application Data

Characteristics of NoSQL
1. Large data volumes.
2. Scalable replication and distribution (Horizontal scaling).
3. Queries need to return answers quickly.
4. Asynchronous Inserts & Updates.
5. Schema-less.
6. Designed to support caching without third-party tools.
7. Full ACID transaction properties are usually not required.
8. Follows the BASE model instead.
9. Trade-offs are governed by the CAP theorem.
10. No join statements.
11. No complicated relationships.
12. Less administration time (lower cost).

ACID
1. Atomic – Either the whole transaction completes (commits) or none of it does.
2. Consistent – Every transaction leaves the database in a state that satisfies its
defined constraints.
3. Isolated – The transaction behaves as if it were the only operation being
performed on the database.
4. Durable – Once the transaction completes, its result will not be reversed.
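
The all-or-nothing behaviour of an atomic transaction can be illustrated with a small sketch. This is plain JavaScript over an in-memory Map, not a real database: the `accounts` data, the `transfer` function and the "no negative balance" constraint are all hypothetical names chosen for illustration.

```javascript
// Toy in-memory "database": account balances keyed by name (hypothetical data).
const accounts = new Map([["alice", 100], ["bob", 50]]);

// Apply a transfer atomically: work on a private copy and publish it only if
// every step succeeds, so a failure leaves the original state untouched.
function transfer(db, from, to, amount) {
  const draft = new Map(db);                  // private working copy
  if (!draft.has(from) || !draft.has(to)) {
    throw new Error("unknown account");
  }
  draft.set(from, draft.get(from) - amount);
  if (draft.get(from) < 0) {                  // consistency constraint
    throw new Error("insufficient funds");    // abort: nothing is committed
  }
  draft.set(to, draft.get(to) + amount);
  for (const [name, balance] of draft) {      // commit: copy results back
    db.set(name, balance);
  }
}

transfer(accounts, "alice", "bob", 30);       // succeeds: alice 70, bob 80
let aborted = false;
try {
  transfer(accounts, "alice", "bob", 500);    // violates the constraint
} catch (err) {
  aborted = true;                             // balances stay at 70 and 80
}
```

The second transfer fails after the draft has already been changed, yet the shared `accounts` map is unaffected, which is the point of atomicity. Durability and concurrent isolation are not modeled here.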

BASE

1. Basically Available.

2. Soft state(expiration of information).

3. Eventually Consistent.

4. Weak consistency.

5. Availability first.

6. Simpler and faster.

Figure: CAP Theorem

Major NoSQL Types
• Key-Value Store
– Based on hashing.
– Basic get/put/delete operations.
– Very fast, because a value is looked up directly by its key.

• Document Store
– Documents structured as JSON, XML, and similar formats.
– No joins (handle them in your application code).

• Column Database
– Each storage block contains data from only one column.
– Reduces access and scanning time.
– Still uses tables, but without join statements.
– Better suited to data analytics.
Developer's Viewpoint
Is SQL better?
• A natural reaction.
• It matches everyone's experience.
• Fear of change.
NoSQL proponents argue that it will:
• Simplify your data model.
• Be easy to install.
• Leave you with fewer bugs that are easier to find.
• Lower administration effort (fewer DBAs).
• Deliver high performance.
• Make scaling much simpler.
• Support rapid development.
• Handle large binary objects.
• Handle graphs and relationships.

A Look at Query Syntax
db.inventory.find( { type: "snacks" } );
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } );
db.inventory.find( { producer: { company: 'ABC123', address: '123 Street' } } );
db.inventory.find( { memos: { $elemMatch: { memo: 'on time', by: 'shipping' } } } );
db.inventory.insert( { _id: 10, type: "misc", item: "card", qty: 15 } );
db.users.remove( { status: "D" } );
db.users.insert( { name: "sue", age: 26, status: "A" } );
db.users.update( { age: { $gt: 18 } }, { $set: { status: "A" } }, { multi: true } );
db.inventory.save( { type: "book", item: "notebook", qty: 40 } );

What about functions?

var myCursor = db.inventory.find( { type: 'food' } );
myCursor.forEach(printjson);

var myCursor = db.inventory.find( { type: 'food' } );
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];

Query Analysis

db.inventory.find( { type: 'food' } ).explain()

Summary
NoSQL:
• Handles huge data.
• High availability at small cost.
• More data redundancy.
• High performance.
• Less administration time.
• Fewer standards.
SQL:
• Good for problems that need ACID guarantees.
• Expensive.
• Less data redundancy.
• Increasing availability means increasing cost.
• More standards.
• More administration.

Pick the right tool for your job!

History of NoSQL and Different NoSQL Products

Objective:
In this topic we focus on the motivations for the NoSQL approach. These
include simplicity of design, simpler "horizontal" scaling to clusters of
machines (which is a problem for relational databases), finer control
over availability, and limiting the object-relational impedance mismatch.

Recap:
Revision of Database Management Systems.

Brief history of NoSQL databases


NoSQL databases emerged in the late 2000s as the cost of storage dramatically decreased.
Gone were the days of needing to create a complex, difficult-to-manage data model in
order to avoid data duplication. Developers (rather than storage) were becoming the
primary cost of software development, so NoSQL databases optimized for developer
productivity.

Figure: Cost Per MB of Data Over Time (Log Scale)


As storage costs rapidly decreased, the amount of data that applications needed to store
and query increased. This data came in all shapes and sizes — structured, semi-structured,
and polymorphic — and defining the schema in advance became nearly impossible. NoSQL
databases allow developers to store huge amounts of unstructured data, giving them a lot
of flexibility.

Additionally, the Agile Manifesto was rising in popularity, and software engineers were
rethinking the way they developed software. They were recognizing the need to rapidly
adapt to changing requirements. They needed the ability to iterate quickly and make
changes throughout their software stack — all the way down to the database. NoSQL
databases gave them this flexibility.

Cloud computing also rose in popularity, and developers began using public clouds to host
their applications and data. They wanted the ability to distribute data across multiple
servers and regions to make their applications resilient, to scale out instead of scale up, and
to intelligently geo-place their data. Some NoSQL databases like MongoDB provide these
capabilities.

NoSQL database features


Each NoSQL database has its own unique features. At a high level, many NoSQL
databases have the following features:

• Flexible schemas
• Horizontal scaling
• Fast queries due to the data model
• Ease of use for developers
Benefits of NoSQL Databases
NoSQL databases offer several benefits over relational databases: flexible data
models, horizontal scaling, fast queries for data modeled around its access
patterns, and ease of use for developers.

1. Flexible data models

NoSQL databases typically have very flexible schemas. A flexible schema allows you to
easily make changes to your database as requirements change. You can iterate quickly
and continuously integrate new application features to provide value to your users faster.

2. Horizontal scaling
Most SQL databases require you to scale-up vertically (migrate to a larger, more expensive
server) when you exceed the capacity requirements of your current server. Conversely,
most NoSQL databases allow you to scale-out horizontally, meaning you can add cheaper,
commodity servers whenever you need to.

3. Fast queries
Queries in NoSQL databases can be faster than SQL databases. Why? Data in SQL databases
is typically normalized, so queries for a single object or entity require you to join data from
multiple tables. As your tables grow in size, the joins can become expensive. However, data
in NoSQL databases is typically stored in a way that is optimized for queries. The rule of
thumb when you use MongoDB is Data that is accessed together should be stored together.
Queries typically do not require joins, so the queries are very fast.

4. Easy for developers


Some NoSQL databases like MongoDB map their data structures to those of popular
programming languages. This mapping allows developers to store their data in the same
way that they use it in their application code. While it may seem like a trivial advantage, this
mapping can allow developers to write less code, leading to faster development time and
fewer bugs.
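
The "stored together" rule of thumb from the fast-queries point can be contrasted with a normalized layout in a short sketch. This is plain JavaScript over in-memory arrays, not MongoDB or SQL; the customer and order data and the `customerWithOrders` helper are hypothetical.

```javascript
// Normalized (relational-style) layout: two "tables" linked by customerId.
const customers = [{ id: 1, name: "Sue" }];
const orders = [
  { id: 10, customerId: 1, item: "notebook" },
  { id: 11, customerId: 1, item: "card" },
];

// Reading one customer with their orders needs a join-like pass over both.
function customerWithOrders(customerId) {
  const customer = customers.find((c) => c.id === customerId);
  const items = orders
    .filter((o) => o.customerId === customerId)
    .map((o) => o.item);
  return { name: customer.name, orders: items };
}

// Document-style layout: the orders are embedded in the customer document,
// so a single lookup by key returns everything that is used together.
const customerDocs = new Map([
  [1, { name: "Sue", orders: ["notebook", "card"] }],
]);

const joined = customerWithOrders(1);
const embedded = customerDocs.get(1);
```

Both reads produce the same result. The embedded form trades some duplication and update cost for a simpler, single-lookup read path.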
Drawbacks of NoSQL Databases
One of the most frequently cited drawbacks of NoSQL databases is that they don’t support
ACID (atomicity, consistency, isolation, durability) transactions across multiple documents.
With appropriate schema design, single record atomicity is acceptable for lots of
applications. However, there are still many applications that require ACID across multiple
records.

To address these use cases MongoDB added support for multi-document ACID transactions
in the 4.0 release, and extended them in 4.2 to span sharded clusters.

Since data models in NoSQL databases are typically optimized for queries and not for
reducing data duplication, NoSQL databases can be larger than SQL databases. Storage is
currently so cheap that most consider this a minor drawback, and some NoSQL databases
also support compression to reduce the storage footprint.

Depending on the NoSQL database type you select, you may not be able to achieve all of
your use cases in a single database. For example, graph databases are excellent for analyzing
relationships in your data but may not provide what you need for everyday retrieval of the
data such as range queries. When selecting a NoSQL database, consider what your use cases
will be and if a general purpose database like MongoDB would be a better option.

Types of NoSQL databases


Over time, four major types of NoSQL databases emerged: document databases, key-value
databases, wide-column stores, and graph databases.

• Document databases
• Key-value stores
• Column-oriented databases
• Graph databases

1. Document databases store data in documents similar to JSON (JavaScript Object
Notation) objects. Each document contains pairs of fields and values. The values can
typically be a variety of types including things like strings, numbers, booleans, arrays, or
objects.
2. Key-value databases are a simpler type of database where each item contains keys
and values.
3. Wide-column stores store data in tables, rows, and dynamic columns.
4. Graph databases store data in nodes and edges. Nodes typically store information
about people, places, and things, while edges store information about the relationships
between the nodes.

Document Databases
A document database stores data in JSON, BSON , or XML documents (not Word documents
or Google docs, of course). In a document database, documents can be nested. Particular
elements can be indexed for faster querying.

Documents can be stored and retrieved in a form that is much closer to the data objects
used in applications, which means less translation is required to use the data in an
application. SQL data must often be assembled and disassembled when moving back and
forth between applications and storage.

Document databases are popular with developers because they have the flexibility to
rework their document structures as needed to suit their application, shaping their data
structures as their application requirements change over time. This flexibility speeds
development because in effect data becomes like code and is under the control of
developers. In SQL databases, intervention by database administrators may be required to
change the structure of a database.

The most widely adopted document databases are usually implemented with a scale-out
architecture, providing a clear path to scalability of both data volumes and traffic.


Use cases include ecommerce platforms, trading platforms, and mobile app development
across industries.

Comparing MongoDB vs PostgreSQL offers a detailed analysis of MongoDB, the leading
NoSQL database, and PostgreSQL, one of the most popular SQL databases.

Key-Value Stores
The simplest type of NoSQL database is a key-value store . Every data element in the
database is stored as a key value pair consisting of an attribute name (or "key") and a
value. In a sense, a key-value store is like a relational database with only two columns:
the key or attribute name (such as state) and the value (such as Alaska).

Use cases include shopping carts, user preferences, and user profiles.


Column-Oriented Databases
While a relational database stores data in rows and reads data row by row, a column store
is organized as a set of columns. This means that when you want to run analytics on a
small number of columns, you can read those columns directly without consuming
memory with the unwanted data. Columns are often of the same type and benefit from
more efficient compression, making reads even faster. Columnar databases can quickly
aggregate the value of a given column (adding up the total sales for the year, for example).
Use cases include analytics.

Unfortunately, there is no free lunch, which means that while columnar databases are
great for analytics, the way in which they write data makes it very difficult for them to be
strongly consistent as writes of all the columns require multiple write events on disk.
Relational databases don't suffer from this problem as row data is written contiguously to
disk.
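
The difference between the two layouts can be sketched with the "total sales" aggregate mentioned above. This is a plain JavaScript illustration over in-memory data, not a real storage engine; the regions, products and sales figures are made up.

```javascript
// Row-oriented layout: each record keeps all of its fields together.
const rows = [
  { region: "north", product: "pen", sales: 120 },
  { region: "south", product: "card", sales: 80 },
  { region: "east", product: "book", sales: 200 },
];

// Column-oriented layout: one array per column; the same position in each
// array belongs to the same record.
const columns = {
  region: ["north", "south", "east"],
  product: ["pen", "card", "book"],
  sales: [120, 80, 200],
};

// Row store: every whole record is visited just to read one field.
const totalFromRows = rows.reduce((sum, row) => sum + row.sales, 0);

// Column store: only the "sales" column is read.
const totalFromColumns = columns.sales.reduce((sum, value) => sum + value, 0);
```

Adding one new record shows the cost on the other side: the row layout appends a single object, while the column layout has to append to three separate arrays, which mirrors the multiple write events described above.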


Graph Databases

A graph database focuses on the relationship between data elements. Each element is
stored as a node (such as a person in a social media graph). The connections between
elements are called links or relationships. In a graph database, connections are first-class
elements of the database, stored directly. In relational databases, links are implied, using
data to express the relationships.

A graph database is optimized to capture and search the connections between data
elements, overcoming the overhead associated with JOINing multiple tables in SQL.

Very few real-world business systems can survive solely on graph queries. As a result graph
databases are usually run alongside other more traditional databases.

Use cases include fraud detection, social networks, and knowledge graphs.
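
A small sketch of a relationship query over directly stored connections, assuming a toy social graph. It uses a plain JavaScript adjacency list rather than a graph database; the `follows` data and the `twoHops` function are hypothetical.

```javascript
// Nodes are people; each connection is stored directly as an entry in the
// adjacency list (all names here are hypothetical).
const follows = new Map([
  ["ana", ["ben", "cal"]],
  ["ben", ["dev"]],
  ["cal", ["dev", "eli"]],
  ["dev", []],
  ["eli", []],
]);

// Friends-of-friends: walk the stored connections two hops out from the
// start node, skipping the start node and its direct connections.
function twoHops(graph, start) {
  const direct = new Set(graph.get(start) ?? []);
  const result = new Set();
  for (const friend of direct) {
    for (const friendOfFriend of graph.get(friend) ?? []) {
      if (friendOfFriend !== start && !direct.has(friendOfFriend)) {
        result.add(friendOfFriend);
      }
    }
  }
  return [...result].sort();
}

const suggestions = twoHops(follows, "ana");
```

In a relational schema the same question would typically need a self-join on a relationship table for each hop, which is the overhead a graph database is designed to avoid.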

As you can see, despite a common umbrella, NoSQL databases are diverse in their data
structures and their applications.

Exploring MongoDB

Objective:
In this topic we focus on MongoDB, a source-available, cross-platform,
document-oriented database program. Classified as a NoSQL database
program, MongoDB uses JSON-like documents with optional schemas.
MongoDB is developed by MongoDB Inc. and licensed under the Server Side
Public License.

Recap:
Revision of Database Management Systems.

NoSQL Databases: Introduction to NoSQL & MongoDB

About MongoDB
• MongoDB is an open-source document database and a leading NoSQL
database. MongoDB is written in C++.
• This unit covers the MongoDB concepts needed to create and deploy a
highly scalable, performance-oriented database.


MongoDB
• MongoDB is a cross-platform, document-oriented database that provides high
performance, high availability, and easy scalability. MongoDB works on the concepts of
collections and documents.
• Database
– A database is a physical container for collections. Each database gets its own set of
files on the file system. A single MongoDB server typically has multiple databases.
• Collection
– A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table.
A collection exists within a single database. Collections do not enforce a schema.
Documents within a collection can have different fields. Typically, all documents in a
collection serve a similar or related purpose.
• Document
– A document is a set of key-value pairs. Documents have a dynamic schema. Dynamic
schema means that documents in the same collection do not need to have the same
set of fields or structure, and common fields in a collection's documents may hold
different types of data.
• The following table shows the relationship of RDBMS terminology with MongoDB.
RDBMS            MongoDB
Database         Database
Table            Collection
Tuple/Row        Document
Column           Field
Table Join       Embedded Documents
Primary Key      Primary Key (default key _id provided by MongoDB itself)

Database Server and Client
mysqld/Oracle    mongod
mysql/sqlplus    mongo
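
The dynamic schema described above can be pictured with a collection modeled as a plain JavaScript array of objects. This is only an illustration of the data shape, not MongoDB itself; the `users` documents and their fields are hypothetical.

```javascript
// A "collection" modeled as an array of documents (plain objects).
// Documents in the same collection need not share the same fields.
const users = [
  { _id: 1, name: "sue", age: 26, status: "A" },
  { _id: 2, name: "raj", email: "raj@example.com" },   // no age or status
  { _id: 3, name: "li", address: { city: "Noida" } },  // embedded document
];

// Collect the distinct field names actually present across the collection.
const fieldNames = new Set(users.flatMap((doc) => Object.keys(doc)));

// An embedded document is read by walking into the object, not by joining.
const city = users[2].address.city;
```

A relational table would need every row to carry all six columns, with NULLs where a value is missing.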
Introduction to MongoDB

• The name comes from "humongous", reflecting huge data

• Written in C++; first released in 2009

• Creator: 10gen (now MongoDB Inc.), founded by former DoubleClick staff


MongoDB: Goal

• Goal: bridge the gap between key-value stores (which are fast and scalable) and
relational databases (which have rich functionality).


What is MongoDB?

• Definition: MongoDB is an open source, document-oriented database
designed with both scalability and developer agility in mind.
• Instead of storing your data in tables and rows as you would with a
relational database, in MongoDB you store JSON-like documents with
dynamic schemas (schema-free, schemaless).

What is MongoDB? (Cont’d)

• Document-Oriented DB
– Unit object is a document instead of a row (tuple) in relational DBs


Is It Fast?

• For semi-structured & complex relationships: Yes

Figure: It is Growing Fast

Figure: Integration with Others

Figure: NoSQL: Categories

Interfacing and Interacting with NoSQL

Objective:
In this topic we focus on introducing the essential ways of
interacting with NoSQL data stores. The types of NoSQL stores vary
and so do the ways of accessing and interacting with them. This
topic attempts to summarize a few of the most prominent of these
disparate ways of accessing and querying data in NoSQL databases.

Recap:
Revision of Database Management Systems.

Objective:
In this topic we focus on MongoDB data types such as String, Integer,
Boolean and Double.

Recap:
Revision of NoSQL Databases.

Data types
• MongoDB supports many datatypes. Some of them are −
• String − This is the most commonly used datatype to store the data. String in MongoDB must be UTF-8
valid.
• Integer − This type is used to store a numerical value. Integer can be 32 bit or 64 bit depending upon
your server.
• Boolean − This type is used to store a boolean (true/ false) value.
• Double − This type is used to store floating point values.
• Min/Max keys − This type is used to compare a value against the lowest and highest BSON elements.
• Arrays − This type is used to store arrays, lists, or multiple values under one key.
• Timestamp − This type stores a timestamp. This can be handy for recording when a document has been
modified or added.
• Object − This datatype is used for embedded documents.
• Null − This type is used to store a Null value.
• Symbol − This datatype is used identically to a string; however, it's generally reserved for languages that
use a specific symbol type.
• Date − This datatype is used to store the current date or time in UNIX time format. You can specify your
own date time by creating object of Date and passing day, month, year into it.
• Object ID − This datatype is used to store the document’s ID.
• Binary data − This datatype is used to store binary data.
• Code − This datatype is used to store JavaScript code into the document.
• Regular expression − This datatype is used to store regular expression.

12/05/2022 Mr. SOVERS SINGH BISHT UNIT 05 56


THE CONCEPT LEARNING TASK
Data types
Data Model

• BSON format (binary JSON)
• Developers can easily map documents to modern object-oriented languages without a complicated ORM layer.
• Lightweight, traversable, efficient

Data types
Terms Mapping (DB vs. MongoDB)

Data types
JSON
[Figure: one JSON document, with a field name and a field value labeled]

• Field Value
– Scalar (Int, Boolean, String, Date, …)
– Document (Embedding or Nesting)
– Array of JSON objects

Data types

Another Example

Remember: it is stored in a binary format (BSON).

Data types
MongoDB Model

[Figure: one document (e.g., one tuple in RDBMS) inside one collection (e.g., one table in RDBMS)]

• A collection is a group of similar documents
• Within a collection, each document must have a unique _id

Unlike RDBMS: MongoDB does not enforce relational integrity constraints such as foreign keys.

Data types

MongoDB Model

[Figure: one document (e.g., one tuple in RDBMS) inside one collection (e.g., one table in RDBMS)]

• Field names cannot start with the $ character
• Field names cannot contain the . character
• Maximum size of a single document: 16 MB

Data types

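Example Document in MongoDB

• _id is a special field in each document
• Unique within each collection
• _id → Primary Key in RDBMS
• _id is 12 bytes by default (an ObjectId); you can also set it yourself
• Or, when it is generated automatically:
• 1st 4 bytes → timestamp
• Next 3 bytes → machine id
• Next 2 bytes → process id
• Last 3 bytes → incremental values
• (Current MongoDB versions replace the machine id and process id with a single 5-byte random value.)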
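The byte layout above can be explored with a small sketch in plain JavaScript. It assumes an ObjectId written as a 24-character hex string and the layout of current MongoDB versions (4-byte timestamp, 5-byte random value, 3-byte counter). The function names and the sample id are illustrative only, not part of any MongoDB API.

```javascript
// Decode the creation time embedded in a 24-character hex ObjectId string.
// Assumption: the first 4 bytes (8 hex digits) hold seconds since the Unix epoch.
function objectIdTimestamp(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Split the id into its three parts (hypothetical helper for illustration).
function objectIdParts(hexId) {
  return {
    timestamp: hexId.slice(0, 8), // 4 bytes
    random: hexId.slice(8, 18),   // 5 bytes (older versions: machine id + process id)
    counter: hexId.slice(18, 24), // 3 bytes
  };
}

const exampleId = "507f1f77bcf86cd799439011"; // illustrative value
console.log(objectIdTimestamp(exampleId).toISOString());
console.log(objectIdParts(exampleId));
```

Because the timestamp comes first, sorting documents by an auto-generated _id roughly follows their creation order.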

NoSQL Storage Architecture

Objective:
• In this topic we focus on the NoSQL database approach, which is characterized by a move
away from the complexity of SQL-based servers. The logic of validation, access control,
mapping queryable indexed data, correlating related data, conflict resolution, maintaining
integrity constraints, and triggered procedures is moved out of the database layer.
Recap:
• Revision of RDBMS architecture.


NoSQL Storage Architecture

• An architecture pattern is a logical way of categorizing the data that will be stored in the
database. NoSQL is a type of database that helps to perform operations on big data and store it
in a valid format. It is widely used because of its flexibility and wide variety of services.

• Architecture Patterns of NoSQL:
• Data is stored in NoSQL in any of the following four data architecture patterns.

• 1. Key-Value Store Database
• 2. Column Store Database
• 3. Document Database
• 4. Graph Database


NoSQL Storage Architecture

• The four patterns are explained below.

• 1. Key-Value Store Database:
• This model is one of the most basic NoSQL models. As the name suggests, data is stored as
key-value pairs. The key is usually a string, an integer, or a sequence of characters, but it can
also be a more advanced data type. Each value is linked to its key. Key-value databases generally
store data as a hash table in which each key is unique. The value can be of any type (JSON,
BLOB (Binary Large Object), strings, etc.). This pattern is commonly used in shopping websites
and e-commerce applications.

• Advantages:
• Can handle large amounts of data and heavy load.
• Easy retrieval of data by key.
• Limitations:
• Complex queries may involve multiple key-value pairs, which can slow performance.
• Data involving many-to-many relationships is difficult to represent and may collide.
• Examples:
• DynamoDB
• Berkeley DB
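A minimal in-memory sketch of the key-value pattern, assuming a JavaScript Map as the hash table. The class name KeyValueStore and the key "cart:user42" are hypothetical; real stores such as DynamoDB add persistence, partitioning and replication on top of this idea.

```javascript
// A tiny in-memory key-value store: each unique key maps to one opaque value.
class KeyValueStore {
  constructor() {
    this.table = new Map(); // plays the role of the hash table
  }
  put(key, value) {
    this.table.set(key, value); // writing an existing key overwrites its value
  }
  get(key) {
    return this.table.get(key); // undefined when the key is absent
  }
  delete(key) {
    return this.table.delete(key);
  }
}

// Shopping-cart style usage: the value is an opaque JSON string.
const cart = new KeyValueStore();
cart.put("cart:user42", JSON.stringify({ items: ["pen", "notebook"] }));
cart.put("cart:user42", JSON.stringify({ items: ["pen"] })); // same key, new value
const stored = JSON.parse(cart.get("cart:user42"));
console.log(stored.items);
```

Lookups by key are direct, but the store knows nothing about what is inside a value, which is why queries across many pairs are costly in this pattern.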

NoSQL Storage Architecture
• 2. Column Store Database:
• Rather than storing data in relational tuples, the data is stored in individual cells that are
grouped into columns. Column-oriented databases operate on columns, storing large amounts of
data for each column together. The format and names of the columns can differ from one row to
another. Every column is treated separately, yet an individual column may still contain multiple
other columns, as in traditional databases.
• Basically, columns are the unit of storage in this type.

• Advantages:
• Data is readily available.
• Queries like SUM, AVERAGE, COUNT can be easily performed on columns.
• Examples:
• HBase
• Bigtable by Google
• Cassandra
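The advantage for aggregate queries can be seen in a small sketch that converts a row layout into a column layout. The sample data and the helper toColumnStore are made up for illustration, and the sketch assumes every row has the same fields. It is not how HBase or Cassandra store data on disk.

```javascript
// Row-oriented layout: one object per record.
const rowStore = [
  { id: 1, city: "Noida", amount: 120 },
  { id: 2, city: "Delhi", amount: 80 },
  { id: 3, city: "Noida", amount: 50 },
];

// Column-oriented layout: one array per column, values kept together.
function toColumnStore(rows) {
  const columns = {};
  for (const row of rows) {
    for (const [name, value] of Object.entries(row)) {
      if (!(name in columns)) columns[name] = [];
      columns[name].push(value);
    }
  }
  return columns;
}

const columnStore = toColumnStore(rowStore);

// SUM, COUNT and AVERAGE only need to read the single "amount" column.
const sum = (values) => values.reduce((total, v) => total + v, 0);
const amountSum = sum(columnStore.amount);
const amountCount = columnStore.amount.length;
const amountAverage = amountSum / amountCount;
console.log({ amountSum, amountCount, amountAverage });
```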

NoSQL Storage Architecture

• 3. Document Database:
• The document database stores and retrieves data in the form of key-value pairs, but here the
values are called documents. A document is a complex data structure; it can take the form of
text, arrays, strings, JSON, XML or any similar format. The use of nested documents is also very
common. This model is effective because much of the data created today is in the form of JSON
and is unstructured.
• Advantages:
• This format is very useful and apt for semi-structured data.
• Storage, retrieval and management of documents is easy.
• Limitations:
• Handling multiple documents is challenging.
• Aggregation operations may not work accurately.
• Examples:
• MongoDB
• CouchDB
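A nested document can be sketched as an ordinary JavaScript object. The document contents and the helper getPath are hypothetical; the dot-separated path only imitates the dot notation that document databases commonly use to reach embedded fields.

```javascript
// One document with an embedded document and arrays (illustrative data).
const post = {
  _id: 1,
  title: "NoSQL Overview",
  author: { name: "A. Student", dept: "Data Science" }, // embedded document
  tags: ["nosql", "mongodb"],                            // array of scalars
  comments: [                                            // array of documents
    { user: "u1", text: "Nice" },
    { user: "u2", text: "Thanks" },
  ],
};

// Follow a dot-separated path through nested documents and arrays.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

console.log(getPath(post, "author.name"));
console.log(getPath(post, "comments.1.user"));
console.log(getPath(post, "author.email")); // a missing field is simply undefined
```

All data about the post travels together in one document, so no join is needed to read it.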

NoSQL Storage Architecture

Figure – Document Store Model in form of JSON documents


NoSQL Storage Architecture
• 4. Graph Database:
• This architecture pattern deals with the storage and management of data in graphs. Graphs
are structures that depict connections between two or more objects in some data. The objects
or entities are called nodes and are joined together by relationships called edges. Each edge
has a unique identifier. Each node serves as a point of contact for the graph. This pattern is
very commonly used in social networks, where there are a large number of entities and each
entity has one or many characteristics that are connected by edges. In the relational pattern
tables are loosely connected, whereas in a graph the connections are stored explicitly and are
strong and rigid in nature.
• Advantages:
• Fast traversal, because connections are stored directly.
• Spatial data can be easily handled.
• Limitations:
• Wrong connections may lead to infinite loops.
• Examples:
• Neo4J
• FlockDB (used by Twitter)
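The traversal advantage and the infinite-loop limitation can both be seen in a short sketch. It assumes a graph held as an adjacency list in a JavaScript Map. The "follows" data is invented, and the breadth-first traversal is one common technique, not the algorithm of any particular graph database.

```javascript
// Nodes are users; each edge means "follows". Note the cycle carol -> alice.
const follows = new Map([
  ["alice", ["bob", "carol"]],
  ["bob", ["carol"]],
  ["carol", ["alice"]],
  ["dave", []],
]);

// Breadth-first traversal. The visited set is what prevents an infinite
// loop when the edges form a cycle.
function reachableFrom(graph, start) {
  const visited = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    for (const next of graph.get(node) || []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(next);
      }
    }
  }
  return [...visited];
}

console.log(reachableFrom(follows, "alice"));
```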

NoSQL Storage Architecture

Figure – Graph model format of NoSQL Databases


CRUD operations with MongoDB

Objective:
• In this topic we focus on CRUD. CRUD is an acronym from the world of computer programming
that refers to the four functions considered necessary to implement a persistent storage
application: create, read, update and delete.
Recap:
• Revision of Database Management Systems.


CRUD operations with MongoDB

Must Practice It

Install it → Practice simple stuff → Move to complex stuff

Install it from here: http://www.mongodb.org

Manual: http://docs.mongodb.org/master/MongoDB-manual.pdf
(Focus on Ch. 3, 4 for now)

Dataset: http://docs.mongodb.org/manual/reference/bios-example-collection/


CRUD operations with MongoDB

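CRUD

• Create
– db.collection.insert( <document> )
– db.collection.save( <document> )
– db.collection.update( <query>, <update>, { upsert: true } )
• Read
– db.collection.find( <query>, <projection> )
– db.collection.findOne( <query>, <projection> )
• Update
– db.collection.update( <query>, <update>, <options> )
• Delete
– db.collection.remove( <query>, <justOne> )
• Note: these are the legacy shell helpers. Current MongoDB shells deprecate insert(), save(), update() and remove() in favor of insertOne()/insertMany(), updateOne()/updateMany()/replaceOne() and deleteOne()/deleteMany().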
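To practice the meaning of the four operations without a running server, the following is a rough in-memory stand-in. TinyCollection is a hypothetical class written only for this illustration: it supports equality matching only, treats the update argument as a set of fields to overwrite, and is not MongoDB's implementation.

```javascript
class TinyCollection {
  constructor() {
    this.docs = [];
    this.nextId = 1; // stand-in for an automatically generated _id
  }
  // Equality match on every field of the query; {} matches all documents.
  static matches(doc, query) {
    return Object.keys(query).every((key) => doc[key] === query[key]);
  }
  insert(doc) {
    const stored = { _id: this.nextId++, ...doc };
    this.docs.push(stored);
    return stored;
  }
  find(query = {}) {
    return this.docs.filter((doc) => TinyCollection.matches(doc, query));
  }
  // Updates only the first match unless multi is true; with upsert,
  // a new document is inserted when nothing matches.
  update(query, fields, { upsert = false, multi = false } = {}) {
    const hits = this.find(query);
    if (hits.length === 0) {
      if (upsert) this.insert({ ...query, ...fields });
      return upsert ? 1 : 0;
    }
    const targets = multi ? hits : hits.slice(0, 1);
    for (const doc of targets) Object.assign(doc, fields);
    return targets.length;
  }
  remove(query = {}, justOne = false) {
    let removed = 0;
    this.docs = this.docs.filter((doc) => {
      const hit = TinyCollection.matches(doc, query) && !(justOne && removed > 0);
      if (hit) removed++;
      return !hit;
    });
    return removed;
  }
}

const users = new TinyCollection();
users.insert({ name: "Asha", age: 20 });
users.insert({ name: "Ravi", age: 20 });
users.update({ age: 20 }, { age: 21 });                         // first match only
users.update({ name: "Meena" }, { age: 19 }, { upsert: true }); // no match: inserted
users.remove({ name: "Ravi" });
console.log(users.find());
```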

CRUD operations with MongoDB

CRUD Examples


Querying, Modifying and Managing NoSQL Data stores

Objective:
• In this topic we focus on horizontal partitioning, or sharding. Most NoSQL and NewSQL data
stores implement some form of it, storing sets of rows/records in different segments (or
shards), which may be located on different servers.
Recap:
• Revision of Database Management Systems.


Querying, Modifying and Managing NoSQL Data stores

Examples

In RDBMS vs. in MongoDB:
• Either insert the 1st document
• Or create the “Users” collection explicitly


Creating, Updating and Deleting documents & Querying
Insertion

• The collection “users” is created automatically if it does not exist


Creating, Updating and Deleting documents & Querying
Multi-Document Insertion
(Use of Arrays)

• All the documents are inserted at once


Creating, Updating and Deleting documents & Querying
Multi-Document Insertion
(Bulk Operation)
• A temporary object in memory
• Holds your insertions and uploads them at once
• There is also a Bulk Ordered object
• The _id field is added automatically


Creating, Updating and Deleting documents & Querying
Deletion
(Remove Operation)

• You can put a condition on any field in the document (even _id)

db.users.remove( {} ) removes all documents from the users collection (the empty query document {} matches everything)


Creating, Updating and Deleting documents & Querying
Update

• Without the multi: true option, update() modifies only the 1st matching document

Equivalent in SQL:


Creating, Updating and Deleting documents & Querying
Update (Cont’d)

Two operators


Creating, Updating and Deleting documents & Querying
Replace a document

[Figure labels: Query Condition, New doc]

For the document having item = “BE10”, replace it with the given document


Creating, Updating and Deleting documents & Querying
Insert or Replace

The upsert option

• If a document having item = “TBD1” is in the DB, it will be replaced
• Otherwise, it will be inserted.


Creating, Updating and Deleting documents & Querying

Any relational database has a typical schema design that shows the number of tables and the
relationships between these tables. In MongoDB, by contrast, there is no built-in concept of
relationships between collections.


Creating, Updating and Deleting documents & Querying
MongoDB Create Database


Creating, Updating and Deleting documents & Querying
MongoDB Drop Database


Creating, Updating and Deleting documents & Querying
MongoDB Create Collection


Indexing and ordering datasets (MongoDB)

Objective:
• In this topic we focus on indexing. MongoDB uses multikey indexes to index the content
stored in arrays. When you index a field that holds an array value, MongoDB creates separate
index entries for every element of the array. These multikey indexes allow queries to select
documents that contain arrays by matching on an element or elements of the arrays.
Recap:
• Revision of DBMS architecture.


Indexing and ordering datasets (MongoDB)

• Indexes support the efficient resolution of queries. Without indexes, MongoDB must scan
every document of a collection to select those documents that match the query statement.
This scan is highly inefficient and requires MongoDB to process a large volume of data.

• Indexes are special data structures that store a small portion of the data set in an
easy-to-traverse form. The index stores the value of a specific field or set of fields, ordered
by the value of the field as specified in the index.
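The difference between a collection scan and an index lookup can be sketched in plain JavaScript. This is a simplified stand-in: it uses a sorted array searched by binary search, whereas MongoDB's indexes are B-tree structures, and the sample documents and function names are invented for illustration.

```javascript
const docs = [
  { _id: 1, title: "MongoDB" },
  { _id: 2, title: "Cassandra" },
  { _id: 3, title: "Neo4j" },
  { _id: 4, title: "HBase" },
];

// Collection scan: inspect documents one by one until a match is found.
function findByScan(collection, field, value) {
  return collection.find((doc) => doc[field] === value);
}

// Index: only the field value and the document position, kept in sorted order.
function buildIndex(collection, field) {
  return collection
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Binary search over the sorted index entries.
function findByIndex(collection, index, value) {
  let low = 0;
  let high = index.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    if (index[mid].key === value) return collection[index[mid].position];
    if (index[mid].key < value) low = mid + 1;
    else high = mid - 1;
  }
  return undefined;
}

const titleIndex = buildIndex(docs, "title");
console.log(findByScan(docs, "title", "Neo4j"));
console.log(findByIndex(docs, titleIndex, "Neo4j"));
```

The index holds only a small part of each document, but it must be kept up to date on every write, which is the cost paid for faster reads.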

• The createIndex() Method
• To create an index, use the createIndex() method of MongoDB.

• Syntax
• The basic syntax of the createIndex() method is as follows:

• >db.COLLECTION_NAME.createIndex({KEY:1})
• Here KEY is the name of the field on which you want to create the index, and 1 specifies ascending order. To create the index in descending order, use -1.


Indexing and ordering datasets (MongoDB)
• Example
• >db.mycol.createIndex({"title":1})
{
   "createdCollectionAutomatically" : false,
   "numIndexesBefore" : 1,
   "numIndexesAfter" : 2,
   "ok" : 1
}
>
• In the createIndex() method you can pass multiple fields to create a compound index on those fields.

• >db.mycol.createIndex({"title":1,"description":-1})
>


Indexing and ordering datasets (MongoDB)
• The dropIndex() method
• You can drop a particular index using the dropIndex() method of MongoDB.
• Syntax
• The basic syntax of the dropIndex() method is as follows:
• >db.COLLECTION_NAME.dropIndex({KEY:1})
• Here KEY is the name of the field whose index you want to drop, and the value (1 for ascending, -1 for descending) must match the order used when the index was created.
• Example
• > db.mycol.dropIndex({"title":1})
{
   "ok" : 0,
   "errmsg" : "can't find index with key: { title: 1.0 }",
   "code" : 27,
   "codeName" : "IndexNotFound"
}
• The output above is the error returned when no index with the given key exists on the collection.
• The dropIndexes() method
• This method deletes multiple (specified) indexes on a collection.
• Syntax
• The basic syntax of the dropIndexes() method is as follows −


Indexing and ordering datasets (MongoDB)
• >db.COLLECTION_NAME.dropIndexes()
• Example
• Assume we have created a compound index on the collection named mycol as shown below; together with the default _id index, the collection then has 2 indexes −
• > db.mycol.createIndex({"title":1,"description":-1})
• The following example removes the index created above from mycol −
• >db.mycol.dropIndexes({"title":1,"description":-1})
• { "nIndexesWas" : 2, "ok" : 1 }
• The getIndexes() method
• This method returns the description of all the indexes in the collection.
• Syntax
• Following is the basic syntax of the getIndexes() method −
• db.COLLECTION_NAME.getIndexes()
• Example
• Assume we have created a compound index on the collection named mycol as shown below −

• > db.mycol.createIndex({"title":1,"description":-1})


Objective:
• In this topic we focus on capped collections, which are fixed-size collections that support
high-throughput operations that insert and retrieve documents based on insertion order.

Recap:
• Revision of NoSQL architecture.

Capped Collections
• Overview
• Capped collections are fixed-size collections that support high-throughput operations that insert and
retrieve documents based on insertion order. Capped collections work in a way similar to circular
buffers: once a collection fills its allocated space, it makes room for new documents by overwriting the
oldest documents in the collection.
• Behavior
• Insertion Order
• Capped collections guarantee preservation of the insertion order. As a result, queries do not need an
index to return documents in insertion order. Without this indexing overhead, capped collections can
support higher insertion throughput.

• Automatic Removal of Oldest Documents


• To make room for new documents, capped collections automatically remove the oldest documents in
the collection without requiring scripts or explicit remove operations.
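The circular-buffer behavior described above can be imitated with a few lines of JavaScript. CappedBuffer is a hypothetical class for illustration only, and it caps by document count for simplicity, whereas a real capped collection is primarily limited by size in bytes.

```javascript
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);
    // Once the cap is exceeded, the oldest document is dropped automatically.
    if (this.docs.length > this.maxDocs) this.docs.shift();
  }
  // Documents come back in insertion order without any index.
  find() {
    return [...this.docs];
  }
  // Newest first, comparable in spirit to sorting by reverse natural order.
  findNewestFirst() {
    return [...this.docs].reverse();
  }
}

const log = new CappedBuffer(3);
for (let n = 1; n <= 5; n++) {
  log.insert({ event: n });
}
console.log(log.find());            // events 3, 4, 5 remain
console.log(log.findNewestFirst()); // events 5, 4, 3
```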

• Consider the following potential use cases for capped collections:

• Store log information generated by high-volume systems. Inserting documents in a capped collection
without an index is close to the speed of writing log information directly to a file system. Furthermore,
the built-in first-in-first-out property maintains the order of events, while managing storage use.
• Cache small amounts of data in a capped collection. Since caches are read rather than write heavy,
you would either need to ensure that this collection always remains in the working set (i.e. in RAM) or
accept some write penalty for the required index or indexes.
Capped Collections
• _id Index
• Capped collections have an _id field and an index on the _id field by default.

• Restrictions and Recommendations


• Updates
• If you plan to update documents in a capped collection, create an index so that these update operations do not require a collection scan.

• Document Size
• Changed in version 3.2.

• If an update or a replacement operation changes the document size, the operation will fail.

• Document Deletion
• You cannot delete documents from a capped collection. To remove all documents from a collection, use the drop() method to drop the
collection and recreate the capped collection.

• Sharding
• You cannot shard a capped collection.

• Query Efficiency
• Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is similar to using the tail
command on a log file.

• Aggregation $out
• The aggregation pipeline stage $out cannot write results to a capped collection.

• Transactions
• Starting in MongoDB 4.2, you cannot write to capped collections in transactions. Reads from capped collections are still supported in
transactions.
Capped Collections

• Procedures
• Create a Capped Collection
• You must create capped collections explicitly using the db.createCollection() method,
which is a helper in the mongo shell for the create command. When creating a capped
collection you must specify the maximum size of the collection in bytes, which MongoDB
will pre-allocate for the collection. The size of the capped collection includes a small
amount of space for internal overhead.

• db.createCollection( "log", { capped: true, size: 100000 } )

• If the size field is less than or equal to 4096, then the collection will have a cap of 4096
bytes. Otherwise, MongoDB will raise the provided size to make it an integer multiple of
256.
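The sizing rule stated above can be written out as a small function. This is only a sketch of the rule as the slide describes it; effectiveCappedSize is a hypothetical name, and the exact behavior may differ between MongoDB versions.

```javascript
// Effective size in bytes for a requested capped collection size,
// following the rule described above.
function effectiveCappedSize(requestedBytes) {
  if (requestedBytes <= 4096) return 4096;       // minimum cap
  return Math.ceil(requestedBytes / 256) * 256;  // round up to a multiple of 256
}

console.log(effectiveCappedSize(1000));    // below the minimum
console.log(effectiveCappedSize(100000));  // rounded up to the next multiple of 256
console.log(effectiveCappedSize(5242880)); // already a multiple of 256
```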

• Additionally, you may also specify a maximum number of documents for the collection
using the max field as in the following document:

• db.createCollection("log", { capped : true, size : 5242880, max : 5000 } )

Capped Collections
• Query a Capped Collection
• If you perform a find() on a capped collection with no ordering specified, MongoDB guarantees that
the ordering of results is the same as the insertion order.

• To retrieve documents in reverse insertion order, issue find() along with the sort() method with the
$natural parameter set to -1, as shown in the following example:

• db.cappedCollection.find().sort( { $natural: -1 } )

• Check if a Collection is Capped


• Use the isCapped() method to determine if a collection is capped, as follows:

• db.collection.isCapped()

• Convert a Collection to Capped


• You can convert a non-capped collection to a capped collection with the convertToCapped
command:

• db.runCommand({"convertToCapped": "mycoll", size: 100000});

Capped Collections

• The size parameter specifies the size of the capped collection in bytes.

• The convertToCapped command holds a database exclusive lock for the duration of the
operation. Other operations which lock the same database will be blocked until the operation
completes. See “What locks are taken by some common client operations?” in the MongoDB
documentation for operations that lock the database.

• Tailable Cursor
• You can use a tailable cursor with capped collections. Similar to the Unix tail -f
command, the tailable cursor "tails" the end of a capped collection. As new
documents are inserted into the capped collection, you can use the tailable cursor
to continue retrieving documents.

Cloud database: Introduction of Cloud database

Objective:
• In this topic we focus on the cloud database, which is a database service built and accessed
through a cloud platform. It serves many of the same functions as a traditional database with
the added flexibility of cloud computing. Users install software on a cloud infrastructure to
implement the database.
Recap:
• Revision of Cloud Architecture.


Cloud database: Introduction of Cloud database

• A cloud database is a database service built and accessed through a cloud platform. It serves
many of the same functions as a traditional database with the added flexibility of cloud
computing. Users install software on a cloud infrastructure to implement the database.

• Key features:
• A database service built and accessed through a cloud platform
• Enables enterprise users to host databases without buying dedicated hardware
• Can be managed by the user or offered as a service and managed by a provider
• Can support relational databases (including MySQL and PostgreSQL) and NoSQL databases
(including MongoDB and Apache CouchDB)
• Accessed through a web interface or vendor-provided API


Cloud database: Introduction of Cloud database

• Why cloud databases

• Ease of access
• Users can access cloud databases from virtually anywhere, using a vendor’s API or web interface.

• Scalability
• Cloud databases can expand their storage capacity at run time to accommodate changing needs. Organizations pay only for what they use.

• Disaster recovery
• In the event of a natural disaster, equipment failure or power outage, data is kept secure through backups on remote servers.


Cloud database: Introduction of Cloud database

• Considerations for cloud databases


• Control options

• Users can opt for a virtual machine image managed like a traditional database or a provider’s
database as a service (DBaaS).

• Database technology

• SQL databases are difficult to scale but very common. NoSQL databases scale more easily but do
not work with some applications.

• Security

• Most cloud database providers encrypt data and provide other security measures; organizations
should research their options.

• Maintenance

• When using a virtual machine image, one should ensure that IT staffers can maintain the underlying
infrastructure.

Cloud database: Introduction of Cloud database

• WHAT IS A CLOUD DATABASE?

• Let’s dig deeper into the cloud-based world that we are living in. Cloud database services
cover everything from storing all kinds of data to providing access and delivering the data to
the parties involved. As mentioned above, the data is stored on the internet, and the service
is normally of three kinds:

• Platform as a service (PaaS)
• Software as a service (SaaS)
• Infrastructure as a service (IaaS)

• Platform as a service (PaaS) is the most common type here, provisioning servers, data
storage, and operating systems. It acts as a platform for the virtual database, saving hardware
cost and making the data accessible from all around the world.

• SaaS, on the other hand, provides the entire software as a service to the organization in
exchange for a fee, and is an excellent option for organizations with a lot of web users.

• IaaS provides a complete infrastructure on which a business can run its applications.


Cloud database: Introduction of Cloud database

CLOUD DATABASE TECHNOLOGIES LIST

• Cloud computing is on the rise because of the flexibility and ease of the services it provides.
Several well-known IT giants are planning to capture the market. Most cloud databases run on
well-known cloud computing platforms like Rackspace, Salesforce, GoGrid, and Amazon EC2.

• Here are some of the most beneficial cloud services for data storage.

• Amazon Web Services or AWS - AWS needs no introduction, as it is already counted as one of the top cloud database technologies.
• Azure by Microsoft - This is Microsoft’s entry into the cloud space, which has already gained a lot of momentum.
• Oracle Database cloud - Everyone has heard about Oracle because of its traditional database system, and now it is capturing the cloud storage space.
• SAP - SAP is the giant when it comes to offering software for enterprises, and it is now ready for cloud storage with its platform called HANA.


NoSQL with Cloud Database

Objective:
• In this topic we focus on NoSQL databases, which are specifically designed for low-cost
commodity hardware. These databases are mostly used for storing and accessing data across
multiple storage clusters. For example, Google, Facebook, Google+, Google Bigtable, Amazon
Dynamo, Twitter, etc. collect and store terabytes of data for their users every day.
Recap:
• Revision of Cloud Architecture.


NoSQL with Cloud Database

• What is a cloud database?


• A “cloud database” can be one of two distinct things: a traditional or NoSQL
database installed and running on a cloud virtual machine (be it public cloud,
private cloud, or hybrid cloud platforms), or a cloud provider’s fully managed
database-as-a-service (DBaaS) offering. The former, running your own self-
managed database in a cloud environment, is really no different from
operating a traditional database. Cloud DBaaS, on the other hand, is the
natural database equivalent of software-as-a-service (SaaS): pay as you go, and
only for what you use, and let the system handle all the details of provisioning
and scaling to meet demand, while maintaining consistently high performance.
• Cloud database options:
• Traditional database running on cloud virtual machine (VM)
• Fully managed database-as-a-service

Most of the time (and for most of the remainder of this unit), the
term “cloud database” refers to a cloud-based database-as-a-service.


NoSQL with Cloud Database
• Why use a cloud database/DBaaS?
• The key benefits of cloud databases are that they are accessible from anywhere, scalable from day one, and
designed for reliability and performance.
• Common cloud database use cases
• Cloud databases work in most cases that traditional databases do. They are particularly valuable when
building software products that:

• Are cloud-native

• Require large volumes of data

• Need to handle high scale traffic

• Are distributed geographically

• Data applications that take advantage of centralization, like legacy modernization and analytics, are also
fantastic candidates for cloud database usage.
• While certain use cases are more obvious candidates for cloud database usage, more traditional use cases, like
real-time online transaction processing, caching, or data warehousing work just as well in the fully managed
paradigm.

NoSQL with Cloud Database

• Cloud database considerations


• Whether you’re still thinking about whether a cloud database is right for you, or in
the process of selecting the ideal database-as-a-service for your needs, there are a
few key factors to take into consideration:

• Cloud Database Providers

• Database Technology

• Management System

• Cost Model

• Security

• Extras

NoSQL with Cloud Database
• MongoDB Atlas cloud database
• MongoDB can be installed and run on any cloud provider or on-premise network as a
self-managed database cluster or virtual machine, or on AWS, GCP, or Azure using
MongoDB Atlas, MongoDB’s cloud database-as-a-service (DBaaS) offering. There are major
benefits to adopting the DBaaS option, including:

• Simplified management

• Elastic autoscaling

• Redundancy, backup, and restore

• Charts

• Connectors

• Schema navigator

NoSQL with Cloud Database

• MongoDB Atlas, part of MongoDB’s broader data-as-a-service (DaaS) development platform, is
a powerful and compelling alternative to managing your own NoSQL, or traditional, database,
or using a cloud provider-specific managed offering.

• The way a cloud database works is that rather than installing, configuring, and
maintaining a database instance or instances, an automated system is able to
provision, manage, and scale the underlying database cluster for you.

• Fully managed database services handle the complexities of maintaining a consistently
available, high performance cluster in a way that allows you, the developer, to access it as a
simple, globally available resource.

• You can treat the cluster as a single database instance, covered by a transparent
usage-based pricing model, so you’re never worrying about over- or under-provisioning.


Introduction to Real time Database.

Objective:
• In this topic we focus on real-time database systems, which are designed to operate in a
timely manner and must perform certain actions within specific timing constraints.
Recap:
• Revision of Cloud Architecture.


Introduction to Real time Database.
Definition
• A Real-Time Data Base System (RTDBS) can be defined as a computing system that is designed to operate in a timely manner.
• It must perform certain actions within specific timing constraints (producing results while meeting predefined deadlines).
• A Real-Time Data Base System can also be defined as a traditional database that uses an extension to give it the additional power to yield reliable responses.


Introduction to Real time Database.
RTDBS Structure
• A typical Real-Time Database System consists of:
– Controlled System: the underlying application
– Controlling System:
• A computer monitoring the state of the environment
• Supplying the environment with the appropriate driving
signals.
• The state of the environment as perceived by
the controlling system must be consistent with
the actual state of the environment.
Specifications
validity of data
• An effective RTDBS must consider:
– Temporal consistency: maintaining consistency between
the actual state of the environment and the state as
reflected or perceived by the system.
– Deadlines: timing constraints which must be met in addition
to the desired computations.
– Priority scheduling: a policy for ordering the execution of the
outstanding processes according to some predefined
criteria.
integrity of data
• In conclusion, the correctness of a Real-Time Database System
depends not only on logical correctness but also on the
timeliness of its actions.
Services and Examples
• Telecommunication Systems
– Routers and network management systems
– Telephone switching systems
• Control Systems
– Automatic tracking and object positioning
– Engine control in automobiles
• Multimedia servers for real-time streaming
• E-commerce and e-business
– Stock market: program stock trading
– Financial services: credit card transactions
• Web-based data services


System Models and Timing


Deadlines
• Soft-Deadline:
– Desirable but not critical
– Missing a soft deadline does not cause a system
failure or compromise the system’s integrity
– Example: operator switchboard for a telephone
[Figure: value function v(t) for a soft deadline, with axis marks v0, d1, d2, and t]
Deadlines

• Firm-Deadline:
– Desirable but not critical (like the soft-deadline case)
– A task is not executed after its deadline, and no value is
gained by the system from tasks that miss
their deadlines
– Example: an autopilot system
[Figure: value function v(t) for a firm deadline, with axis marks v0, d, and t]
Deadlines

• Hard-Deadline:
– Timely and logically correct execution is
considered to be critical
– Missing a hard deadline can result in catastrophic
consequences
– Also known as Safety-Critical
– Example: data gathered by a sensor
[Figure: value function v(t) for a hard deadline, with axis marks v0, d, and t]
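The v(t) plots on the three deadline slides can be read as value functions: the benefit the system gains if a task completes at time t. The sketch below is only an illustration of that reading. The exact curve shapes are assumptions (linear decay between d1 and d2 for the soft case, and a configurable penalty for the hard case), and the function names are hypothetical.

```python
def soft_value(t, v0, d1, d2):
    """Soft deadline: full value up to d1, then (assumed) linear decay to zero at d2."""
    if t <= d1:
        return v0
    if t >= d2:
        return 0.0
    return v0 * (d2 - t) / (d2 - d1)


def firm_value(t, v0, d):
    """Firm deadline: full value up to d, no value afterwards."""
    return v0 if t <= d else 0.0


def hard_value(t, v0, d, penalty=float("-inf")):
    """Hard deadline: a miss is modelled (by assumption) as a large negative value."""
    return v0 if t <= d else penalty
```

A late soft task still earns part of its value, a late firm task earns nothing and can be dropped, and a late hard task is treated as a failure of the whole system.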


Design Paradigms
• Time-Triggered (TT)
– System activities are initiated at predefined instants in time
– Assessments of resource requirements and
resource availability are required
– TT architecture can provide predictable behavior
due to its pre-planned execution pattern.

Design Paradigms
• Event-Triggered (ET)
– System activities are initiated in response to the
occurrence of particular events that are possibly
caused by the environment
– The resource-need assessment in an ET architecture
is usually probabilistic
– ET is not as reliable as TT, but it provides more
flexibility and suits more classes of applications
– ET behavior is usually not predictable.


Tasks Periodicity
• Periodic Tasks
– Execute at regular intervals of time
– Correspond to the TT architecture
– Have hard deadlines characterized by their periods
(requires worst-case analysis).

• Aperiodic Tasks
– Execution time cannot be anticipated a priori
– Activation of a task is a random event caused by a trigger
– Correspond to the ET architecture
– Have soft deadlines (no worst-case analysis)


Tasks Periodicity
• Sporadic Tasks
– Tasks which are aperiodic in nature, but have hard
deadlines
– Used to handle emergency conditions or exceptional
situations
– Worst-case calculations are done using a schedulability
constraint
– The schedulability constraint defines a minimum period
between any two sporadic events from the same
source.
Scheduling
• Each task within a real-time system has
– A deadline
– An arrival time
– Possibly an estimated worst-case execution time
• A scheduler can be defined as an algorithm or policy
for ordering the execution of the outstanding processes
• A scheduler may be:
– Preemptive
• Can arbitrarily suspend and resume the execution of a task
without affecting its behavior

Scheduling (Cont)
– Non-preemptive
• A task must be run without interruption until completion
– Hybrid
• Preemptive scheduler, but preemption is only allowed at certain
points within the code of each task.
• Real-time scheduling algorithms can be:
– Static
» Known as fixed-priority, where priorities are computed off-line
» Requires complete a priori knowledge of the real-time environment
in which it is deployed
» Inflexible: the scheme is workable only if all the tasks are effectively
periodic.
» Can work only for simple systems; performs inconsistently as the
load increases.

Scheduling (Cont)
• Dynamic
– Assumes unpredictable task-arrival times
– Attempts to schedule tasks dynamically upon arrival
– Dynamically computes and assigns a priority value to each
task
– Decisions are based on task characteristics and the current
state of the system
– Flexible scheduler that can deal with unpredictable events.

Priority-Based Scheduling
• Conventional scheduling algorithms aim at
balancing the number of CPU-bound and I/O-bound
jobs to maximize system utilization and
throughput
• Real-time tasks need to be scheduled according
to their criticalness and timeliness
• A real-time system must ensure that the progress
of higher-priority tasks (ideally) is never
hindered by lower-priority tasks.
Priority-Based Scheduling
Methods
• Earliest-Deadline-First (EDF):
– The task with the current closest (earliest)
deadline is assigned the highest priority in the
system and executed next
• Value-Functions: highest value (benefit) first
– The scheduler is required to assign priorities as
well as to define the system value of completing
each task at any instant in time

Priority-Based Scheduling
Methods
• Value-Density (VD): highest
(value/computation) first
– The scheduler tends to select the tasks that earn
more value per time unit they consume
– It is a greedy technique, since it always schedules
the task that has the highest expected value
within the shortest possible time unit.
• Complex functions of deadline, value, and slack
time.
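The three selection rules named on these slides (earliest deadline first, highest value first, and highest value density first) differ only in the key used to pick the next task from the ready set. The following is a minimal sketch under that reading; the `Task` fields and the sample numbers are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    deadline: float      # absolute deadline
    value: float         # benefit of completing the task
    computation: float   # estimated execution time


def pick_edf(ready):
    """Earliest-Deadline-First: the closest deadline gets the highest priority."""
    return min(ready, key=lambda t: t.deadline)


def pick_highest_value(ready):
    """Value-function rule: the most valuable task runs first."""
    return max(ready, key=lambda t: t.value)


def pick_value_density(ready):
    """Value-Density: highest value per unit of computation time first."""
    return max(ready, key=lambda t: t.value / t.computation)


ready = [
    Task("A", deadline=10, value=8, computation=2),
    Task("B", deadline=5, value=3, computation=1),
    Task("C", deadline=20, value=9, computation=6),
]
```

On this sample set the three rules disagree: EDF picks B, highest value picks C, and value density picks A, which shows why the choice of policy matters.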
Synchronization
• Priority inversion problem: a higher-priority task can
be blocked by a lower-priority task, possibly an
unbounded number of times and for unbounded
periods.
• Solutions:
– The Priority Inheritance Protocol
• Execute the blocking transaction (low priority) with the priority
of the blocked transaction (high priority).
• The task inherits the highest priority level of all the tasks it
blocks while it executes its critical section (the shared resource).
• “Intermediate” blocking is eliminated.

Synchronization (Cont)
• Priority Abort Protocol
– abort the low priority transaction - no blocking at
all
– quick resolution, but wasted resources
• Conditional Priority Inheritance Protocol
– Based on the estimated length of the transaction
– The blocking transaction inherits the priority only if it is
close to completion; otherwise it is aborted.
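The three synchronization protocols (priority inheritance, priority abort, and conditional priority inheritance) can be compared as one decision made when a requester finds a resource held by another transaction. The sketch below is an illustration only. The function name, the convention that a larger number means higher priority, and the `threshold` that stands for "close to completion" are all assumptions.

```python
def resolve_conflict(protocol, requester_prio, holder_prio,
                     holder_remaining=0.0, threshold=1.0):
    """Return (action, effective priority of the lock holder).

    Assumption: a larger number means a higher priority.
    """
    if requester_prio <= holder_prio:
        # No priority inversion: the requester simply waits.
        return ("wait", holder_prio)
    if protocol == "inherit":
        # Holder continues, but at the blocked transaction's priority.
        return ("wait", requester_prio)
    if protocol == "abort":
        # Low-priority holder is aborted: no blocking, but its work is wasted.
        return ("abort_holder", holder_prio)
    if protocol == "conditional":
        # Inherit only if the holder is (by assumption) close to completion.
        if holder_remaining <= threshold:
            return ("wait", requester_prio)
        return ("abort_holder", holder_prio)
    raise ValueError(f"unknown protocol: {protocol}")
```

For example, `resolve_conflict("conditional", 9, 2, holder_remaining=5.0)` aborts the holder, while the same call with `holder_remaining=0.5` lets it finish at priority 9.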

Real Time Database Systems
Overview
• Topics related to design of RTDBS in a centralized
uni-processor system:
– RTDBS System Models
– Scheduling RTDB Transactions
• Concurrency Control
• Conflict Resolution
• Deadlocks
– Admission Control
– Memory Management
– I/O and Disk Scheduling
Conventional Databases:
Transactions and Serializability
• Transaction: a collection of read and write
operations that constitutes a consistent
transformation of the system state.
• When executed alone, each transaction transforms a
consistent state into a new consistent state
• Transactions preserve the consistency of the database
information
• Schedule: a particular sequencing of the actions from
different transactions.
• Consistent Schedule: a schedule that gives each
transaction a consistent view of the database state.
Introduction to Real time Database.

Conventional Databases:
Transactions and Serializability
• Database inconsistencies can be caused by:
– Failures
– Concurrency
• Four properties associated with transactions
known as ACID properties are used to prevent
such problems



Introduction to Real time Database.

Conventional Databases:
ACID Properties
A Atomicity: Either all or none of the transaction’s operations are
performed. All the operations of a transaction are treated as a
single, indivisible, atomic unit.

C Consistency: A transaction maintains the integrity constraints on
the database.

I Isolation: Transactions can execute concurrently but with no
interference with each other’s operations.

D Durability: All changes made by a committed transaction become
permanent in the database, surviving any subsequent failures.
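Atomicity and consistency can be observed with any transactional engine. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the `accounts` table, its `CHECK` constraint, and the `transfer` helper are illustrative assumptions, not part of the slides.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The integrity constraint (balance >= 0) was violated: nothing is applied.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

In a failing transfer the credit to the destination has already been executed when the debit violates the constraint, and the rollback undoes it, so either both updates are visible or neither is.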

Conventional Databases:
ACID Properties (Cont.)
• Consistency of the database is preserved by each
transaction
• Recovery protocols are used to ensure the Atomicity
and Durability properties
• A difficulty in dealing with traditional transactions is
that different execution paths can have significantly
different requirements
• Concurrent execution may violate the database
integrity constraints regardless of the correctness of
individual transactions.
Serializability
• An execution is said to be serializable if it produces the same
output and has the same effect on the database as some serial
execution of the same transactions.
• Serializability is a notion of correctness in any DBMS
• Conflict Serializability:
– The simplest and most common form of serializability
– Ensures that conflicting operations appear in the same order in two
equivalent executions
– Conflicts can happen between read and write operations on the same
data object.
• View Serializability
– Two executions are equivalent if each transaction reads the same
values in the two executions.
– The final value of the database is the same in both executions
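Conflict serializability is commonly tested with a precedence graph: add an edge Ti → Tj whenever an operation of Ti precedes a conflicting operation of Tj, and accept the schedule only if the graph has no cycle. The sketch below assumes a schedule is a list of `(transaction, operation, item)` tuples with operations `"r"` and `"w"`; this encoding and the function names are illustrative choices.

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for every pair of conflicting operations, Ti's coming first."""
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same item, and at least one of them is a write.
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by peeling off sources)."""
    edges = precedence_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    while nodes:
        sources = {
            n for n in nodes
            if not any(dst == n and src in nodes for src, dst in edges)
        }
        if not sources:
            return False  # every remaining node has an incoming edge: a cycle
        nodes -= sources
    return True


serial_like = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
interleaved = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
```

Here `serial_like` yields only the edge T1 → T2, whereas `interleaved` yields both T1 → T2 and T2 → T1, so it is not conflict serializable.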

Recoverable History
• Cascading Aborts: If a transaction Tj reads a value that
was last written by an aborted transaction Ti, then Tj
must also be aborted
• To preserve Durability, once a transaction commits, it
cannot subsequently be aborted, nor can its effects be
changed, because of cascading aborts.
• To assure Atomicity and Durability, an execution must
be Recoverable
• An execution is Recoverable if, once a transaction is
committed, the transaction is guaranteed not to be
involved in cascading aborts.
Recoverable History (Cont)
• Cascadeless: Read only committed written data. That
is, if transaction Tj reads from Ti, then Ti must be an
already committed transaction; i.e.,
– Wi[x] → Rj[x] ⇒ Ci → Rj[x]
• Strict: Read and write only committed written data.
That is, if transaction Tj reads from Ti, or overwrites a
data item that was last written by Ti, then Ti must be
an already committed transaction; i.e.,
– Wi[x] → Rj[x] ⇒ Ci → Rj[x]
– Wi[x] → Wj[x] ⇒ Ci → Wj[x]
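The three history classes can be checked mechanically on a small history. The sketch below is a simplified illustration: it assumes histories made only of reads, writes, and commits (no explicit aborts), encoded as `(transaction, operation, item)` tuples with operations `"r"`, `"w"`, and `"c"`. The encoding and the `classify` name are assumptions.

```python
def classify(history):
    """Report whether a history is recoverable, cascadeless, and strict."""
    last_writer = {}      # item -> transaction that wrote it last
    commit_pos = {}       # transaction -> position of its commit
    reads_from = []       # (reader, writer) pairs
    cascadeless = strict = True

    for pos, (txn, op, item) in enumerate(history):
        if op == "c":
            commit_pos[txn] = pos
            continue
        writer = last_writer.get(item)
        other = writer is not None and writer != txn
        dirty = other and writer not in commit_pos  # writer not yet committed
        if op == "r":
            if other:
                reads_from.append((txn, writer))
            if dirty:
                cascadeless = strict = False
        elif op == "w":
            if dirty:
                strict = False
            last_writer[item] = txn

    # Recoverable: whenever a reader commits, its writer committed earlier.
    recoverable = all(
        reader not in commit_pos
        or (writer in commit_pos and commit_pos[writer] < commit_pos[reader])
        for reader, writer in reads_from
    )
    return {"recoverable": recoverable, "cascadeless": cascadeless, "strict": strict}


dirty_read_commit_first = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "c", None), ("T1", "c", None)]
dirty_read_commit_after = [("T1", "w", "x"), ("T2", "r", "x"), ("T1", "c", None), ("T2", "c", None)]
dirty_overwrite = [("T1", "w", "x"), ("T2", "w", "x"), ("T1", "c", None), ("T2", "c", None)]
clean = [("T1", "w", "x"), ("T1", "c", None), ("T2", "r", "x"), ("T2", "w", "x"), ("T2", "c", None)]
```

The four sample histories step through the hierarchy: not recoverable, recoverable but not cascadeless, cascadeless but not strict, and strict.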

RTDBS vs. Conventional DB

• Conventional Transactions
– Logically correct and consistent (ACID):
• atomicity
• consistency
• isolation
• durability

• Real-Time Transactions
– Logically correct and consistent (ACID)
– “Approximately correct”: trade quality or correctness for timeliness
– Time correctness:
• time constraints on transactions
• temporal constraints on data
Conventional DB vs. RTDBS

• Conventional Databases:
– Logical consistency
• ACID properties of transactions: Atomicity, Isolation, Consistency, Durability
• Data integrity constraints

• Real-Time Database Systems:
– Logical consistency
• ACID properties (may be relaxed)
• Data integrity constraints
– Enforce time constraints
• Deadlines of transactions
• External consistency (state of the environment and its reflection in the
database): absolute validity interval (AVI)
• Temporal consistency (among data used to derive other data):
relative validity interval (RVI)
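The two validity intervals can be expressed as simple timestamp checks. A minimal sketch follows, assuming each data item carries the time at which it was sampled. It treats an item as absolutely valid while its age does not exceed its AVI, and a set of items as relatively valid when their timestamps lie within the RVI of one another. The function names are hypothetical.

```python
def absolutely_valid(timestamp, avi, now):
    """External consistency: the value is still fresh enough to reflect the environment."""
    return now - timestamp <= avi


def relatively_valid(timestamps, rvi):
    """Temporal consistency: data used together were sampled close enough in time."""
    return max(timestamps) - min(timestamps) <= rvi


# Example: temperature sampled at t=95 and pressure at t=99, current time 100.
temperature_ts, pressure_ts, now = 95, 99, 100
```

With an AVI of 5 for temperature the reading is still usable at time 100, but with an RVI of 2 the pair must not be combined to derive a new value, because the samples are 4 time units apart.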
Conventional DB vs. RTDBS
• Real-time systems
• Task centric
– Deadlines attached to tasks

• Real-time databases
• Data centric
– Data has temporal validity, i.e., deadlines also attached to
data
– Transactions must be executed by their deadlines to keep the data
valid, in addition to producing results in a timely manner

A Real-Time Database Model

[Figure: Real-Time Database Model]
• Any new transaction must pass through an Admission Control
mechanism, which monitors and regulates the total number of
concurrently active transactions within the system in order to avoid
thrashing
• Every new or resubmitted transaction is assigned a Priority Level,
which orders its scheduling preference relative to the other
concurrent transactions within the system
• Before a transaction performs an operation on a data object, it
must go through the Concurrency Control component in order to
achieve the required synchronization. If the transaction’s request
for a granule is denied, the transaction will be placed into a Wait
Queue.
• The waiting transaction will be reactivated when the requested
granule becomes available, after which the transaction performs its
operation.

A Real-Time Database Model
• Similarly, if a transaction requests an item that is
currently not in main-memory, an I/O request is
initiated and the transaction will be placed into a wait
queue.
• The waiting transaction will be reactivated when the
requested granule becomes available in main-memory,
and there is no active higher-priority transaction.
• When a transaction completes all of its operations, it
commits its result(s) and releases all of the data items
in its possession.

A Real-Time Database Model
• A transaction may abort/restart a number of
times before it commits. There are various types
of aborts :
– Terminating abort:
• An abort due to missing a deadline, or
• Self-abort – a transaction may abort itself due to an
exceptional condition.
– Non-terminating abort: An abort due to a deadlock
or a data conflict. In this case, the transaction may be
restarted if its deadline remains feasible.
Scheduling RTDB Transactions
• In addition to the standard physical resources, RTDB
systems must manage the data objects stored in the
database; transactions accessing this data have to be
scheduled in accordance with real-time performance
objectives.
• The scheduling process of transactions in a RTDB
system consists of:
– Concurrency Control
– Conflict Resolution

Scheduling RTDB Transactions
• Concurrency Control Protocols
– Locking
– Time-stamping
– Multiversion
– Validation
• All of these have the same goal: enforcing
serializability.
• These protocols need to be modified, and their trade-offs
must be re-evaluated, for RTDB systems.

Scheduling RTDB Transactions
Concurrency Control Protocol
• Locks are used to synchronize concurrent
actions
• Two-Phase Locking (2PL)
– All locking operations precede the first unlock
operation in the transaction
– Expanding phase (locks are acquired)
– Shrinking phase (locks are released)
– Suffers from deadlock and priority inversion
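The two-phase rule can be checked on a transaction's sequence of lock operations: once the first unlock has been issued (the shrinking phase has begun), no further lock may be acquired. The sketch below assumes operations are given as `(action, item)` pairs; this encoding and the function name are illustrative.

```python
def is_two_phase(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True          # expanding phase is over
        elif action == "lock" and shrinking:
            return False              # a lock after an unlock violates 2PL
    return True


two_phase = [("lock", "x"), ("lock", "y"), ("unlock", "x"), ("unlock", "y")]
not_two_phase = [("lock", "x"), ("unlock", "x"), ("lock", "y"), ("unlock", "y")]
```

The check only validates the lock ordering of a single transaction; it does not detect the deadlocks or priority inversions that 2PL can still suffer from.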
Scheduling RTDB Transactions
Conflict Resolution Protocol
• Conflict Resolution Protocol
– Priority-based Wound-Wait Conflict Resolution
• The original scheme was designed to use timestamps.
• It was modified so that the scheme uses priorities
instead of timestamps
• Modified scheme known as High-Priority (HP) and as
Priority-Abort (PA)

Scheduling RTDB Transactions
Deadlocks
• Deadlocks
– A deadlock occurs whenever a set of transactions becomes involved in a
circular wait, that is, a cycle in what is known as a wait-for graph
– There are five deadlock resolution policies, which take into
account:
• the timing properties of the transactions
• the cost of abort operations

Scheduling RTDB Transactions
Deadlocks
• Policy 1:
– Always aborts the transaction invoking deadlock detection.
• Policy 2:
– Trace the deadlock cycle
– abort the first tardy transaction encountered in a deadlock cycle.
– If no tardy transaction is found, abort the transaction with the
furthest deadline.
• Policy 3:
– Trace the deadlock cycle
– abort the first tardy transaction encountered in a deadlock cycle.
– If no tardy transaction is found, abort the transaction with the earliest
deadline.

Scheduling RTDB Transactions
Deadlocks
• Policy 4:
– Trace the deadlock cycle, and abort the first tardy transaction
encountered in a deadlock cycle.
– If no tardy transaction is found, abort the transaction with the least
criticalness.
• Policy 5:
– Abort the infeasible transaction with the least criticalness.
– If all transactions are feasible, then abort a feasible transaction with
the least criticalness.
– This policy is sensitive to the accuracy of the computation time
because it requires information about the remaining execution time
– Therefore, the total execution time requirement of each transaction
must be known when it starts.
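Policies 2 and 3 share the same steps: trace the deadlock cycle in the wait-for graph, abort the first tardy transaction found, and otherwise fall back on the furthest (Policy 2) or earliest (Policy 3) deadline. The sketch below illustrates those steps under simplifying assumptions: each transaction waits for at most one other transaction, and "tardy" means its deadline has already passed. The function names are hypothetical.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from `start`; return the cycle reached, or [] if none."""
    path, seen = [], set()
    node = start
    while node is not None and node not in seen:
        seen.add(node)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return []
    return path[path.index(node):]


def choose_victim(cycle, deadlines, now, prefer="furthest"):
    """Abort the first tardy transaction; otherwise use the deadline rule.

    prefer="furthest" corresponds to Policy 2, prefer="earliest" to Policy 3.
    """
    for txn in cycle:
        if deadlines[txn] < now:      # tardy: deadline already missed
            return txn
    pick = max if prefer == "furthest" else min
    return pick(cycle, key=lambda txn: deadlines[txn])


waits_for = {"T1": "T2", "T2": "T3", "T3": "T1", "T4": "T1"}
deadlines = {"T1": 50, "T2": 30, "T3": 80, "T4": 60}
```

Starting detection from T4 finds the cycle T1, T2, T3; T4 itself only waits on the cycle and is not a candidate victim.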

Scheduling RTDB Transactions
Conflict Resolution Protocol
[Figure: Outline of the Protocol]


Scheduling RTDB Transactions
Admission Control
• Admission Controller:
• Reject transaction
• Admit contingency action

• Scheduler:
• Drop transaction (firm/soft)
• Replace transaction with contingency action (hard)
• Postpone transaction execution (soft)

Scheduling RTDB Transactions
Memory Management

• Memory management is concerned with three
types of decisions:
– transaction admission
– buffer allocation
– buffer replacement

Future Research Areas in RTDBS
• Resource management and scheduling
• Recovery
• Concurrency Control
• Fault tolerance and security models to interact with RTDBS
• Query languages for explicit specification of real-time constraints ->
RT-SQL
• Distributed real-time databases
• Data models to support complex multimedia objects
• Schemes to process a mixture of hard, soft, and firm timing
constraints and complex transaction structures
• Support for more active features in real-time context
• Interaction with legacy systems (conventional databases)

Daily Quiz

Q1. Compare NoSQL and RDBMS.
Q2. What is NoSQL?
Q3. What are the features of NoSQL?
Q4. Explain the difference between NoSQL and relational databases.
Q5. Explain “Polyglot Persistence” in NoSQL.
Q6. How does NoSQL DB budget memory?
Q7. How to script NoSQL DB configuration?
Q8. Does NoSQL Database interact with Oracle Database?
Q9. What is the difference between NoSQL and MySQL databases?
Q10. Explain Oracle NoSQL Database.

Weekly/Monthly/Unit-Wise Assignment

Assignment
Q1: What are NoSQL databases? What are the different types of NoSQL databases?

Q2: What do you understand by NoSQL databases? Explain.

Q3: Explain difference between scaling horizontally and vertically for databases

Q4: What are the advantages of NoSQL over traditional RDBMS?

Q5: When should we embed one document within another in MongoDB?

Q6: Define ACID Properties?

Q7: Does MongoDB support ACID transaction management and locking functionalities?

Q8: Explain advantages of BSON over JSON in MongoDB?

Q9: How can you achieve primary key - foreign key relationships in MongoDB?

Q10: How do I perform the SQL JOIN equivalent in MongoDB?


Faculty Video Links, YouTube & NPTEL Video Links and Online Courses Details

You Tube video

http://www.nptelvideos.com/lecture.php?id=6516

http://www.nptelvideos.com/lecture.php?id=6517

http://www.nptelvideos.com/lecture.php?id=6518

http://www.nptelvideos.com/lecture.php?id=6519

https://www.youtube.com/watch?v=2yQ9TGFpDuM

MCQ

• 1. Most NoSQL databases support automatic __________ meaning that you get high availability
and disaster recovery.
• (a)processing
• (b)scalability
• (c) replication
• (d)all of the mentioned

• 2. Which of the following are the simplest NoSQL databases?
• (a)Key-value
• (b)Wide-column
• (c) Document
• (d)All of the mentioned

• 3.________ stores are used to store information about networks, such as social connections.
• (a)Key-value
• (b)Wide-column
• (c) Document
• (d)Graph

• 4. NoSQL databases are used mainly for handling large volumes of ______________ data.
• (a)unstructured
• (b) structured
• (c)semi-structured
• (d) all of the mentioned

• 5. In which of the following languages is MongoDB written?
• (a)Javascript
• (b) C
• (c)C++
• (d) All of the mentioned

• 6. Point out the correct statement.


• (a)MongoDB is classified as a NoSQL database
• (b) MongoDB favors XML format more than JSON
• (c)MongoDB is column-oriented database store
• (d) All of the mentioned


• 7. Which of the following format is supported by MongoDB?


• (a)SQL
• (b)XML
• (c) BSON
• (d)All of the mentioned

• 8. NoSQL was designed with security in mind, so developers or security teams don't need to worry about
implementing a security layer. Is it true or false?
• (a)True
• (b)False

• 9. Which of the following is not a reason NoSQL has become a popular solution for some organizations?
• (a)Better scalability
• (b)Improved ability to keep data consistent
• (c) Faster access to data than relational database management systems (RDBMS)
• (d)More easily allows for data to be held across multiple servers

• 10. NoSQL prohibits structured query language (SQL). Is it True or False?


• (a)True
• (b)False

Glossary Questions
Fill the following blanks with one of the given options-

Expected Questions for University Exam

1. What do you mean by NoSQL?


2. What are the features of NoSQL?
3. What is the CAP theorem? How is it applicable to NoSQL systems?
4. Explain the difference: RDBMS vs NoSQL?
5. What are the major challenges with traditional RDBMS?
6. What are the different types of NoSQL databases?
7. How does NoSQL relate to big data?
8. Can you explain transaction support using BASE in NoSQL?
9. What is a Key-Value store or Key-Value database?
10. What is a column-store database?

Recap

• This unit provides the fundamentals of NoSQL and its latest
trends in industry.
• It also covers the different types of NoSQL databases.
• Whether you experience a natural disaster, power failure, or other
crisis, having your data stored in the cloud helps ensure it is backed up and
protected in a secure and safe location. Being able to access your data
again quickly allows you to conduct business as usual, minimizing any
downtime and loss of productivity.
• This unit also introduces cloud databases and
querying on cloud databases.


Thank You
