Database Performance Optimization. Andrey Avtomonov


DATABASE PERFORMANCE OPTIMIZATION

Andrey Avtomonov | Dubna | April 2013


AGENDA
Reasons for optimization
Levels of optimization
Best practices for building an application layer
Optimizing communication (bind variables, caching, array
interface)
DB schema improvement (indexes, partitioning)
WHY OPTIMIZE?

A large number of today's applications run well thanks to:

CPU speed
DB buffer cache size

Over time, users start to experience performance issues:

Active data exceeds the buffer cache
I/O becomes the dominant component of query response time
WHERE TO START?
A concept of tuning by layers
These stages are dictated by the reality of how applications,
databases, and operating systems interact

What are these layers?


APPLICATION LAYER
Applications send requests to the database in the form of SQL
statements (including stored procedure calls).
The database responds to these requests with return codes and
result sets.

Ways of minimizing application workload:

Structure the application to avoid overloading the database
Optimize the database design
CACHING
The best-tuned SQL is the SQL you didn't execute.
Caching is best suited to tables that are:
Frequently accessed
Small
Made up of static lookup data

Caching can degrade performance by contributing to memory shortages
Caching frequently updated data may require a sophisticated
synchronization mechanism
External tools may be used to perform fast and complex searches
(Apache Lucene)
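
The caching idea above can be sketched as a small read-through cache. This is a minimal illustration, not a production design: the class name LookupCache and the loader function are hypothetical, and the loader stands in for whatever query would normally read the lookup table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Read-through cache for a small, static lookup table (hypothetical helper). */
class LookupCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for the real database query

    LookupCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value; the loader runs only on the first request for a key. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Drops one entry, for example after the underlying row has been updated. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

The invalidate method hints at the synchronization problem mentioned above: every code path that changes the table has to remember to call it, which is why caching works best for data that rarely changes.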
DB COMMUNICATION
Many performance issues are caused not by the database itself,
but by the way you communicate with it
Reuse database connections
Establishing a DB connection is expensive

Applications should avoid continually creating and releasing database connections

In simple applications, a connection should be created at application initialization and used for all transactions

In a web-based application, connection pooling should be used
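
To show what pooling means in practice, here is a minimal sketch of a fixed-size pool. SimplePool is a hypothetical class written for this illustration; it is generic so that it runs without a database, and in a real application the pooled type would be java.sql.Connection, normally managed by an established pooling library or the application server rather than hand-written code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Fixed-size pool: resources are created once and then reused (illustrative only). */
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // the expensive "connect" happens only here
        }
    }

    /** Blocks until a pooled resource is free. */
    T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pooled resource", e);
        }
    }

    /** Returns a resource to the pool instead of closing it. */
    void release(T resource) {
        idle.add(resource);
    }

    int idleCount() {
        return idle.size();
    }
}
```

However many requests borrow and release, the factory runs only as many times as the pool size, which is the saving the slide refers to.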


BIND VARIABLES

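// Without a bind variable: the literal becomes part of the SQL text
Statement stmt = conn.createStatement();
stmt.execute("UPDATE emp SET sal = sal*1.5 WHERE empno = " + fEmpNo);

// With a bind variable: the SQL text is identical for every empno
// (:1 is an Oracle-style placeholder; portable JDBC code uses ? instead)
PreparedStatement pstmt = conn.prepareStatement(
    "UPDATE emp SET sal = sal*1.5 WHERE empno = :1");
pstmt.setInt(1, fEmpNo);
pstmt.execute();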
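
One way to see why the second form helps: a database can reuse an already parsed statement only when the SQL text is identical. The sketch below needs no database and simply counts how many distinct SQL texts each style produces. SqlTextCount is a hypothetical helper, and the ? placeholder is the standard JDBC parameter marker.

```java
import java.util.HashSet;
import java.util.Set;

class SqlTextCount {
    /** Distinct SQL texts when the value is concatenated into the statement. */
    static int distinctWithLiterals(int[] empNos) {
        Set<String> texts = new HashSet<>();
        for (int empNo : empNos) {
            texts.add("UPDATE emp SET sal = sal*1.5 WHERE empno = " + empNo);
        }
        return texts.size();
    }

    /** Distinct SQL texts when a placeholder is used and the value is sent separately. */
    static int distinctWithBinds(int[] empNos) {
        Set<String> texts = new HashSet<>();
        for (int i = 0; i < empNos.length; i++) {
            texts.add("UPDATE emp SET sal = sal*1.5 WHERE empno = ?");
        }
        return texts.size();
    }
}
```

For three different employee numbers the first method reports three statements and the second reports one.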
SQL INJECTION

Not only do bind variables improve performance, they also secure an application.

A condition such as OR 1=1 injected into the WHERE clause makes the query return every row:

Select ENAME,SAL From emp Where EmpCode = 1 OR 1=1

Sensitive information may get exposed:

Statement stmt = conn.createStatement();
// Input that closes the string literal and appends a UNION query
String fName = "' UNION Select ename||'-'||sal From emp Where '1' = '1";
ResultSet rset = stmt.executeQuery(
    "Select JOB From emp Where ename = '" + fName + "'");
while (rset.next()) {
    System.out.println(rset.getString("JOB"));
}

Output (excerpt):
ADAMS-1100
ALLEN-1600
BLAKE-2850
CLARK-2450
FORD-3000
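
To make the attack visible, the sketch below builds the statement text the same way as the vulnerable code above and compares it with the bind-variable form. It only manipulates strings and needs no database; InjectionDemo is a hypothetical class name and ? is the standard JDBC parameter marker.

```java
class InjectionDemo {
    /** Builds the query by concatenation, as the vulnerable code does (do not use in real code). */
    static String concatenated(String name) {
        return "Select JOB From emp Where ename = '" + name + "'";
    }

    /** With a bind variable the SQL text never contains user input. */
    static String parameterized() {
        return "Select JOB From emp Where ename = ?";
    }
}
```

With the malicious input, concatenated(...) yields
Select JOB From emp Where ename = '' UNION Select ename||'-'||sal From emp Where '1' = '1'
so the database sees two queries joined by UNION. The parameterized text stays the same whatever the user types, because the value is passed separately through setString.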
ARRAY INTERFACE

Array fetches retrieve batches of rows from the DB in a single call

With Oracle, the default JDBC array size is 10
Fetching rows in batches reduces:
the number of calls issued to the DB server
network traffic
logical I/O overhead

Avoid large fetches where possible


[Chart: response time versus array size]
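
The effect of the array size on the number of calls is simple arithmetic, sketched below. FetchSizing is a hypothetical helper; the calculation ignores any extra end-of-data call a driver may make. The second method uses the standard JDBC call Statement.setFetchSize, which drivers are allowed to treat as a hint.

```java
import java.sql.SQLException;
import java.sql.Statement;

class FetchSizing {
    /** Calls needed to fetch the given number of rows when fetchSize rows arrive per call. */
    static long roundTrips(long rows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        return (rows + fetchSize - 1) / fetchSize; // ceiling division
    }

    /** Asks the driver to fetch in batches of the given size for this statement. */
    static void useBatches(Statement stmt, int fetchSize) throws SQLException {
        stmt.setFetchSize(fetchSize);
    }
}
```

Under these assumptions, fetching 1,000 rows takes 100 calls with an array size of 10 and 10 calls with an array size of 100.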
STORED PROCEDURES
To reduce network round trips, some application logic can be
moved to the DB tier.
Most RDBMSs have a native procedural language for writing:
Functions
Procedures
Triggers
Packages
Other languages (e.g. Java)
HOW TO FIND WHAT YOU NEED?
INDEXES
An index is an object with its own storage that provides fast
access to table rows.
Indexes exist primarily to improve performance, so establishing
an optimal indexing strategy is critical to database performance.
We'll discuss 3 kinds of indexes:
B-tree
Bitmap
B-TREE INDEX
Has a hierarchical tree structure
B stands for BALANCED
Suited to OLTP systems
FEATURES
Each leaf node is at the same depth
Every row requires the same number of index reads to locate.

Leaf nodes are linked to their neighbors in both directions

A B-tree index satisfies queries with >, < or BETWEEN operators.

Selective indexes are more efficient than nonselective ones
because they point more directly to specific values.

Selectivity
A highly selective expression returns a small proportion of rows
from a table
A DATE_OF_BIRTH column is selective; a GENDER column is not.
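
A common way to put a number on selectivity is the ratio of distinct values to total rows. The sketch below assumes that definition, which the slides do not state explicitly; the class name Selectivity is hypothetical.

```java
import java.util.HashSet;
import java.util.List;

class Selectivity {
    /** Distinct values divided by total rows: closer to 1.0 means more selective. */
    static <T> double of(List<T> columnValues) {
        if (columnValues.isEmpty()) {
            return 0.0;
        }
        return (double) new HashSet<T>(columnValues).size() / columnValues.size();
    }
}
```

A GENDER column holding M, F, M, F scores 0.5 and keeps falling as the table grows, while a column in which every value is different scores 1.0.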
CONCATENATED INDEXES
A concatenated (or composite) index is an index composed of
more than one column.

More selective than a single-column index

Useful when several columns within a table are frequently queried together

The index will be used when its leading columns appear in a query:

If an index is created on FNAME, LNAME, DOB, then queries on
FNAME, LNAME, DOB / FNAME, LNAME / FNAME will use the index
Combinations that omit the leading column (e.g. LNAME, DOB) generally will not
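
The leading-column rule can be expressed as a small check. This is a simplified model, not how any particular optimizer decides: CompositeIndex is a hypothetical class, and it ignores features such as skip scans that some databases offer.

```java
import java.util.List;
import java.util.Set;

class CompositeIndex {
    /** Counts how many leading index columns are constrained by the query. */
    static int usablePrefixLength(List<String> indexColumns, Set<String> queriedColumns) {
        int usable = 0;
        for (String column : indexColumns) {
            if (!queriedColumns.contains(column)) {
                break; // the first missing column ends the usable prefix
            }
            usable++;
        }
        return usable;
    }
}
```

For an index on FNAME, LNAME, DOB, a query on FNAME and LNAME gives 2, a query on FNAME alone gives 1, and a query on LNAME and DOB gives 0, meaning the index offers no direct entry point.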
WHAT'S THE PERFORMANCE GAIN?

SELECT cust_id FROM sh.customers c
WHERE cust_first_name = 'Connor'
AND cust_last_name = 'Bishop'
AND cust_year_of_birth = 1976;
BITMAP INDEX
Creates a bitmap for each unique value of a single column
More effective than a B-tree for columns with few distinct values
Bitmaps on different columns can be merged efficiently
Suited to DSS systems
Oracle locks many rows when an indexed column is updated
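
The structure and the merge step can be sketched with java.util.BitSet. This is a toy in-memory model with a hypothetical class name, BitmapIndex; real bitmap indexes are compressed and stored on disk, but the idea of one bitmap per distinct value and a bitwise AND to combine conditions is the same.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class BitmapIndex {
    private final Map<String, BitSet> bitmaps = new HashMap<>();

    /** Builds one bitmap per distinct value; bit i is set when row i holds that value. */
    BitmapIndex(String[] columnValues) {
        for (int row = 0; row < columnValues.length; row++) {
            bitmaps.computeIfAbsent(columnValues[row], v -> new BitSet()).set(row);
        }
    }

    /** Rows holding the given value (an empty bitmap if the value never occurs). */
    BitSet rowsWith(String value) {
        BitSet bits = bitmaps.get(value);
        return bits == null ? new BitSet() : (BitSet) bits.clone();
    }

    /** Merges two single-column bitmaps with a bitwise AND. */
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }
}
```

With one index on a gender column and another on a marital status column, and(gender.rowsWith("F"), status.rowsWith("single")) gives the rows matching both conditions without touching the table.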
INDEXES ARE NOT FREE
INDEX OVERHEAD
Indexes reduce the performance of DML statements
(INSERT/UPDATE/DELETE/MERGE)

The cost of index maintenance varies and can sometimes be significant

Avoid creating indexes on frequently updated columns


PARTITIONING
Partitioning splits a table into multiple segments.
Each of the segments can be manipulated individually
Partitioning provides the following advantages:
Partition elimination (Pruning)
Limits the scope of a search for full-table scans

Partition-wise join
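
Partition elimination can be illustrated with a toy model of a range-partitioned table. RangePartitions and the partition names are hypothetical; each partition holds keys below its own upper bound and at or above the previous partition's upper bound, and a query range touches only the partitions it overlaps.

```java
import java.util.ArrayList;
import java.util.List;

class RangePartitions {
    private final String[] names;
    private final int[] upperBounds; // exclusive upper bound per partition, ascending

    RangePartitions(String[] names, int[] upperBounds) {
        if (names.length != upperBounds.length) {
            throw new IllegalArgumentException("one upper bound per partition is required");
        }
        this.names = names;
        this.upperBounds = upperBounds;
    }

    /** Partitions that can contain keys in [from, to]; all others are pruned. */
    List<String> partitionsFor(int from, int to) {
        List<String> hit = new ArrayList<>();
        int lower = Integer.MIN_VALUE; // inclusive lower bound of the current partition
        for (int i = 0; i < names.length; i++) {
            int upper = upperBounds[i];
            if (from < upper && to >= lower) {
                hit.add(names[i]);
            }
            lower = upper;
        }
        return hit;
    }
}
```

With yearly partitions p2011, p2012 and p2013 bounded at 2012, 2013 and 2014, a query for the year 2012 needs to scan only p2012; the other two partitions are never read.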
QUESTIONS?
