Interview Questions SQL PDF
What is a Database?
A database is an organized collection of data, stored and retrieved digitally from a remote or
local computer system. Databases can be vast and complex, and such databases are
developed using fixed design and modeling approaches.
What is DBMS?
DBMS stands for Database Management System. A DBMS is system software responsible for
the creation, retrieval, updating, and management of the database. It ensures that data
is consistent, organized, and easily accessible by serving as an interface between the
database and its end-users or application software.
RDBMS stands for Relational Database Management System. The key difference here,
compared to DBMS, is that RDBMS stores data in the form of a collection of tables, and
relations can be defined between the common fields of these tables. Most modern database
management systems like MySQL, Microsoft SQL Server, Oracle, IBM DB2, and Amazon
Redshift are based on RDBMS.
What is SQL?
SQL stands for Structured Query Language. It is the standard language for relational database
management systems. It is especially useful for handling structured data composed of entities
(variables) and the relations between those entities.
SQL is a standard language for retrieving and manipulating data in structured databases. In
contrast, MySQL is a relational database management system, like SQL Server, Oracle or IBM
DB2, that is used to manage SQL databases.
A table is an organized collection of data stored in the form of rows and columns. Columns
run vertically and rows run horizontally. The columns in a table are called fields
while the rows are referred to as records.
Constraints are used to specify the rules concerning data in the table. They can be applied to
single or multiple fields in an SQL table, either when the table is created or afterwards using
the ALTER TABLE command. The constraints are:
• NOT NULL - Restricts NULL value from being inserted into a column.
• CHECK - Verifies that all values in a field satisfy a condition.
• DEFAULT - Automatically assigns a default value if no value has been specified for the field.
• UNIQUE - Ensures that only unique values are inserted into the field.
• INDEX - Indexes a field, providing faster retrieval of records. Strictly speaking, an index is a separate database object rather than a constraint.
• PRIMARY KEY - Uniquely identifies each record in a table.
• FOREIGN KEY - Ensures referential integrity for a record in another table.
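The constraints listed above can be tried out with a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The `departments` and `employees` tables and the `violates` helper are hypothetical names chosen for illustration, and other database systems may report violations differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL UNIQUE
    )
""")
conn.execute("""
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        email   TEXT NOT NULL UNIQUE,
        age     INTEGER CHECK (age >= 18),
        status  TEXT DEFAULT 'active',
        dept_id INTEGER REFERENCES departments(dept_id)
    )
""")
conn.execute("INSERT INTO departments (dept_id, name) VALUES (1, 'Engineering')")
conn.execute(
    "INSERT INTO employees (emp_id, email, age, dept_id) "
    "VALUES (1, 'a@example.com', 30, 1)"
)


def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# DEFAULT filled in the status column because no value was supplied.
default_status = conn.execute(
    "SELECT status FROM employees WHERE emp_id = 1"
).fetchall()[0][0]

not_null_hit = violates("INSERT INTO employees (emp_id, email, age) VALUES (2, NULL, 25)")
unique_hit = violates("INSERT INTO employees (emp_id, email, age) VALUES (3, 'a@example.com', 25)")
check_hit = violates("INSERT INTO employees (emp_id, email, age) VALUES (4, 'b@example.com', 12)")
primary_key_hit = violates("INSERT INTO employees (emp_id, email, age) VALUES (1, 'c@example.com', 40)")
foreign_key_hit = violates(
    "INSERT INTO employees (emp_id, email, age, dept_id) VALUES (5, 'd@example.com', 40, 99)"
)
```

Each `*_hit` variable records whether the corresponding insert was rejected, which makes it easy to see one rule at a time.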
The PRIMARY KEY constraint uniquely identifies each row in a table. It must contain UNIQUE
values and has an implicit NOT NULL constraint.
A table in SQL is strictly restricted to have one and only one primary key, which is composed
of a single field or multiple fields (columns).
A UNIQUE constraint ensures that all values in a column are different. This provides
uniqueness for the column(s) and helps identify each row uniquely. Unlike a primary key,
multiple unique constraints can be defined per table. The syntax for UNIQUE is quite
similar to that of PRIMARY KEY, but the two are not interchangeable: a table can have only
one primary key, and a UNIQUE column, unlike a primary key column, is not implicitly NOT NULL.
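As a rough illustration of one primary key coexisting with several UNIQUE constraints, the sketch below uses Python's sqlite3 module with a hypothetical `users` table. The NULL handling shown is SQLite's behaviour and is an assumption that may not hold for every database system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,  -- the single primary key
        email    TEXT UNIQUE,          -- first unique constraint
        username TEXT UNIQUE           -- second unique constraint
    )
""")
conn.execute("INSERT INTO users VALUES (1, 'a@example.com', 'alice')")

# In SQLite, a UNIQUE column accepts several NULLs because NULLs count as distinct.
conn.execute("INSERT INTO users VALUES (2, NULL, 'bob')")
conn.execute("INSERT INTO users VALUES (3, NULL, 'carol')")

# A repeated non-NULL value in a UNIQUE column is rejected.
try:
    conn.execute("INSERT INTO users VALUES (4, 'a@example.com', 'dave')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchall()[0][0]
```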
The SQL Join clause is used to combine records (rows) from two or more tables in a SQL
database based on a related column between the two.
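A join on a related column might look like the following sketch, written with Python's sqlite3 module. The `customers` and `orders` tables and their contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN keeps only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# LEFT JOIN keeps every customer, including those without orders.
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.order_id)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```

Here 'Linus' has no orders, so he is absent from the inner join result but appears in the left join result with a count of zero.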
What is a Self-Join?
A self-join is a regular join in which a table is joined to itself based on some relation
between its own column(s). A self-join uses the INNER JOIN or LEFT JOIN clause, and table aliases
are used to give the two references to the table different names within the query.
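A common case for a self-join is an employee table whose rows point at their manager in the same table. The sketch below assumes a hypothetical `employees` table with a `manager_id` column and uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'Ada', NULL), (2, 'Grace', 1), (3, 'Linus', 1);
""")

# The aliases e (employee) and m (manager) refer to the same table.
# LEFT JOIN keeps employees that have no manager.
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employees AS e
    LEFT JOIN employees AS m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()
```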
What is a Cross-Join?
A cross join is the Cartesian product of the two tables included in the join. The
result contains one row for every pairing of a row from the first table with a row from the
second, so its row count is the product of the two tables' row counts. If a WHERE clause that
matches columns from both tables is added to a cross join, the query behaves like an INNER JOIN.
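The row-count claim and the equivalence with INNER JOIN can be checked with a small sketch in Python's sqlite3 module. The `authors` and `books` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO books VALUES (1, 1, 'Notes'), (2, 1, 'Letters'), (3, 2, 'Compilers');
""")

# 2 authors x 3 books: every author is paired with every book.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM authors CROSS JOIN books"
).fetchall()[0][0]

# Filtering the cross join on the related column...
filtered_cross = conn.execute("""
    SELECT a.name, b.title
    FROM authors AS a CROSS JOIN books AS b
    WHERE a.author_id = b.author_id
    ORDER BY b.book_id
""").fetchall()

# ...returns the same rows as the equivalent INNER JOIN.
inner = conn.execute("""
    SELECT a.name, b.title
    FROM authors AS a INNER JOIN books AS b ON a.author_id = b.author_id
    ORDER BY b.book_id
""").fetchall()
```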
A database index is a data structure that provides a quick lookup of data in a column or
columns of a table. It enhances the speed of operations accessing data from a database
table at the cost of additional writes and memory to maintain the index data structure.
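One way to see an index being used is to compare the query plan before and after creating it. The sketch below does this in SQLite through Python's sqlite3 module. The `users` table, the `idx_users_email` index, and the `plan` helper are illustrative names, and the exact plan text differs between SQLite versions, so only the presence of the index name is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", "Oslo" if i % 2 else "Lima") for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT user_id FROM users WHERE email = 'user42@example.com'"

plan_before = plan(query)  # without an index SQLite scans the whole table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)   # with the index SQLite can search it directly
```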
There are different types of indexes that can be created for different purposes:
Unique indexes are indexes that help maintain data integrity by ensuring that no two rows
of data in a table have identical key values. Once a unique index has been defined for a
table, uniqueness is enforced whenever keys are added or changed within the index.
Non-unique indexes, on the other hand, are not used to enforce constraints on the tables
with which they are associated. Instead, non-unique indexes are used solely to improve
query performance by maintaining a sorted order of data values that are used frequently.
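The difference between a unique and a non-unique index can be sketched as follows with Python's sqlite3 module. The `products` table and both index names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, sku TEXT, category TEXT)")

# The unique index enforces integrity; the non-unique index only helps lookups.
conn.execute("CREATE UNIQUE INDEX idx_products_sku ON products (sku)")
conn.execute("CREATE INDEX idx_products_category ON products (category)")

conn.execute("INSERT INTO products (sku, category) VALUES ('A-1', 'tools')")
# A repeated category is fine because that index is non-unique.
conn.execute("INSERT INTO products (sku, category) VALUES ('A-2', 'tools')")

# A repeated sku is rejected by the unique index.
try:
    conn.execute("INSERT INTO products (sku, category) VALUES ('A-1', 'toys')")
    duplicate_sku_rejected = False
except sqlite3.IntegrityError:
    duplicate_sku_rejected = True

product_count = conn.execute("SELECT COUNT(*) FROM products").fetchall()[0][0]
```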
A clustered index is an index in which the order of the rows in the table corresponds to the
order of the entries in the index. This is why only one clustered index can exist on a given table,
whereas multiple non-clustered indexes can exist on the same table.
The only difference between clustered and non-clustered indexes is that the database
manager attempts to keep the data in the database in the same order as the corresponding
keys appear in the clustered index.
Clustering indexes can improve the performance of most query operations because they
provide a linear-access path to data stored in the database.
Data Integrity is the assurance of accuracy and consistency of data over its entire life-cycle
and is a critical aspect of the design, implementation, and usage of any system which stores,
processes, or retrieves data. It also defines integrity constraints to enforce business rules on
the data when it is entered into an application or a database.
The UNION operator combines the result sets of two or more SELECT statements and returns
them as a single result set with duplicate rows removed; UNION ALL keeps the duplicates.
The MINUS operator in SQL returns the rows from the result set of the first SELECT query
that do not appear in the result set of the second SELECT query. MINUS is the Oracle
spelling; standard SQL and several other systems call the same operator EXCEPT.
The INTERSECT clause in SQL combines the result-set fetched by the two SELECT statements
where records from one match the other and then returns this intersection of result-sets.
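The three set operators can be compared side by side in a short sketch using Python's sqlite3 module. SQLite spells the difference operator EXCEPT rather than MINUS. The `spring` and `autumn` tables and the `names` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE spring (student TEXT);
    CREATE TABLE autumn (student TEXT);
    INSERT INTO spring VALUES ('Ada'), ('Grace'), ('Linus');
    INSERT INTO autumn VALUES ('Grace'), ('Linus'), ('Ken');
""")


def names(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql).fetchall()]


# UNION merges both result sets and drops duplicate rows.
union_rows = names(
    "SELECT student FROM spring UNION SELECT student FROM autumn ORDER BY student"
)
# UNION ALL keeps the duplicates.
union_all_count = len(names(
    "SELECT student FROM spring UNION ALL SELECT student FROM autumn"
))
# EXCEPT (MINUS in Oracle): rows from the first query that are not in the second.
except_rows = names(
    "SELECT student FROM spring EXCEPT SELECT student FROM autumn ORDER BY student"
)
# INTERSECT: rows present in both queries.
intersect_rows = names(
    "SELECT student FROM spring INTERSECT SELECT student FROM autumn ORDER BY student"
)
```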
• One-to-One - A relationship between two tables in which each record in one table is
associated with at most one record in the other table.
• One-to-Many & Many-to-One - The most commonly used relationship, in which a record
in one table is associated with multiple records in the other table.
• Many-to-Many - This is used in cases when multiple instances on both sides are needed for
defining a relationship.
• Self-Referencing Relationships - This is used when a table needs to define a relationship with
itself.
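A many-to-many relationship is usually modelled with a third, linking table that holds one row per pairing. The sketch below assumes hypothetical `students`, `courses`, and `enrollments` tables and uses Python's sqlite3 module. Each side of the link is itself a one-to-many relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT);
    -- Linking table: one row per (student, course) pairing.
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(student_id),
        course_id  INTEGER REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO courses VALUES (1, 'SQL'), (2, 'Algebra');
    INSERT INTO enrollments VALUES (1, 1), (1, 2), (2, 1);
""")

# One course has many students...
sql_students = [row[0] for row in conn.execute("""
    SELECT s.name
    FROM students AS s
    JOIN enrollments AS e ON e.student_id = s.student_id
    JOIN courses AS c ON c.course_id = e.course_id
    WHERE c.title = 'SQL'
    ORDER BY s.name
""").fetchall()]

# ...and one student has many courses.
ada_courses = [row[0] for row in conn.execute("""
    SELECT c.title
    FROM courses AS c
    JOIN enrollments AS e ON e.course_id = c.course_id
    JOIN students AS s ON s.student_id = e.student_id
    WHERE s.name = 'Ada'
    ORDER BY c.title
""").fetchall()]
```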
An alias is a feature of SQL that is supported by most, if not all, RDBMSs. It is a temporary
name assigned to the table or table column for the purpose of a particular SQL query. In
addition, aliasing can be employed as an obfuscation technique to secure the real names of
database fields. A table alias is also called a correlation name.
An alias is introduced explicitly with the AS keyword, although in some cases the keyword can be
omitted. Nevertheless, using the AS keyword is always good practice.
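Both column aliases and a table alias appear in the sketch below, which uses Python's sqlite3 module and a hypothetical `employees` table. The column aliases show up as the names of the result columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, first_name TEXT, salary REAL);
    INSERT INTO employees VALUES (1, 'Ada', 5000.0), (2, 'Grace', 6000.0);
""")

# e is a table alias; name and annual_salary are column aliases.
cursor = conn.execute("""
    SELECT e.first_name AS name, e.salary * 12 AS annual_salary
    FROM employees AS e
    ORDER BY e.emp_id
""")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()
```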
What is Normalization?
Normalization represents the way of organizing structured data in the database efficiently. It
includes the creation of tables, establishing relationships between them, and defining rules
for those relationships. Inconsistency and redundancy can be kept in check based on these
rules, hence, adding flexibility to the database.
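As a rough sketch of the idea, the example below splits a flat table that repeats customer details on every order into two related tables, so that each customer's city is stored once. The table names and data are invented for illustration, and the code uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: the customer's city is repeated on every order row.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY, customer_name TEXT, customer_city TEXT, item TEXT
    );
    INSERT INTO orders_flat VALUES
        (1, 'Ada', 'London', 'pen'),
        (2, 'Ada', 'London', 'ink'),
        (3, 'Grace', 'New York', 'book');

    -- Normalized: customer details live in one place and orders refer to them.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE, city TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    );
    INSERT INTO customers (name, city)
        SELECT DISTINCT customer_name, customer_city FROM orders_flat;
    INSERT INTO orders (order_id, customer_id, item)
        SELECT f.order_id, c.customer_id, f.item
        FROM orders_flat AS f
        JOIN customers AS c ON c.name = f.customer_name;
""")

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchall()[0][0]

# A change of city is now a single-row update instead of one update per order.
conn.execute("UPDATE customers SET city = 'Paris' WHERE name = 'Ada'")
ada_order_cities = conn.execute("""
    SELECT o.order_id, c.city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE c.name = 'Ada'
    ORDER BY o.order_id
""").fetchall()
```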
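What is Denormalization?
Denormalization is the inverse of normalization: redundant data is deliberately added to a
normalized schema, for example by combining tables, to speed up reads by avoiding costly joins.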
TRUNCATE command is used to delete all the rows from the table and free the space
containing the table.
DROP command is used to remove an object from the database. If you drop a table, all the
rows in the table are deleted and the table structure is removed from the database.
If a table is dropped, everything associated with the table is dropped as well. This includes
the relationships defined on the table with other tables, the integrity checks and constraints,
and the access privileges and other grants that the table has. To create and use the table again in its
original form, all these relationships, checks, constraints, and privileges need to be
redefined. However, if a table is truncated, none of the above problems exist and the table
retains its original structure.
The TRUNCATE command deletes all the rows from the table and frees the space
containing the table.
The DELETE command deletes only the rows that match the condition given in
the WHERE clause, or all the rows in the table if no condition is specified, but it does
not free the space containing the table.
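The difference between deleting rows and dropping a table can be observed in the sketch below, which uses Python's sqlite3 module. SQLite has no TRUNCATE statement, so only DELETE and DROP are shown; the `logs` table and the `table_exists` and `row_count` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (log_id INTEGER PRIMARY KEY, level TEXT);
    INSERT INTO logs (level) VALUES ('info'), ('error'), ('info');
""")


def table_exists(name):
    """Check SQLite's catalog for a table with the given name."""
    rows = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchall()
    return len(rows) > 0


def row_count():
    return conn.execute("SELECT COUNT(*) FROM logs").fetchall()[0][0]


# DELETE with a WHERE clause removes only the matching rows.
conn.execute("DELETE FROM logs WHERE level = 'error'")
after_conditional_delete = row_count()

# DELETE without a WHERE clause removes every row but keeps the table.
conn.execute("DELETE FROM logs")
after_full_delete = row_count()
exists_after_delete = table_exists("logs")

# DROP removes the table definition itself.
conn.execute("DROP TABLE logs")
exists_after_drop = table_exists("logs")
```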
The user-defined functions in SQL are like functions in any other programming language that
accept parameters, perform complex calculations, and return a value. They are written to use
the logic repetitively whenever required. There are two types of SQL user-defined functions:
• Scalar Function: User-defined scalar functions return a single scalar value.
• Table-Valued Functions: User-defined table-valued functions return a table as output.
o Inline: returns a table data type based on a single SELECT statement.
o Multi-statement: returns a tabular result-set but, unlike inline, multiple SELECT statements
can be used inside the function body.
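A scalar user-defined function can be sketched in SQLite by registering a Python callable through `sqlite3.Connection.create_function`. This differs from systems that define functions with a CREATE FUNCTION statement, and table-valued functions are not covered here. The `price_with_tax` function and the `items` table are hypothetical.

```python
import sqlite3


def price_with_tax(price, tax_rate):
    """Scalar function: takes two values and returns a single value."""
    return round(price * (1 + tax_rate), 2)


conn = sqlite3.connect(":memory:")
# Register the callable under an SQL name, declaring that it takes 2 arguments.
conn.create_function("price_with_tax", 2, price_with_tax)

conn.executescript("""
    CREATE TABLE items (name TEXT, price REAL);
    INSERT INTO items VALUES ('pen', 2.0), ('book', 10.0);
""")

# The function can now be called inside queries like a built-in.
priced = conn.execute(
    "SELECT name, price_with_tax(price, 0.25) FROM items ORDER BY name"
).fetchall()
```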