MySQL Made Easy - Joseph C Scott
Introduction
MySQL has grown up. Once dismissed as a lightweight toy for websites,
MySQL is now a viable mission-critical data-management solution.
Whereas before it was the ideal choice for websites, now it contains many
of the features needed in other environments, all while maintaining its
impressive speed. It has long been able to outperform many commercial
solutions for raw speed and has an elegant and powerful permissions
system, but now version 4 includes the ACID-compliant InnoDB transactional
storage engine.
MySQL is faster, has online backup facilities, and has a multitude of new
features. There is little reason not to consider MySQL for your database
solution. MySQL AB, the company behind MySQL, offers efficient and
low-cost support, and, like most open-source communities, you'll find lots
of free support on the Web. Standard features not yet included in MySQL
(such as views and stored procedures) are currently under development and
may even be ready by the time you read this book.
Ease of use: MySQL is easy to use and administer. Many older databases
suffer from legacy issues, making administration needlessly complex.
MySQL's tools are powerful and flexible without sacrificing usability.
How to program: This book assists you in programming with MySQL, but
it will not teach you how to program from the beginning.
Embedded MySQL.
A system on which to install MySQL (if you don't have access to one
already): MySQL can be installed on your desktop PC, but when used in
serious applications it more often runs on a dedicated server.
Readers without formal database design knowledge will benefit from Part II
, "Designing a Database," which covers the often-ignored database design
issues required for developing large-scale databases.
Any readers wanting to administer MySQL will benefit from Part III ,
"MySQL Administration," which progresses from the basics for new users
to the advanced issues for optimizing high-performance databases. It also
explains backups, replication, security, and installation.
Finally, you should turn to the appendixes when you need MySQL SQL,
function, and operator references, as well as references to the database
functions and methods used by most popular programming languages.
With version 3, MySQL dominated the low end of the Internet market. And
with MySQL's release of version 4, the product is now appealing to a much
wider range of customers. With the open-source Apache dominating the
web server market and various open-source operating systems (such as
Linux and FreeBSD) performing strongly in the server market, MySQL's
time has come in the database market.
But I'm getting ahead of myself. This chapter provides a brief introduction
to relational database concepts. You'll learn exactly what a relational
database is and how it works, as well as key terminology. Armed with this
information, you'll be ready to jump into creating a simple database and
manipulating its data.
What Is a Database?
The easiest way to understand a database is as a collection of related files.
Imagine a file (either paper or electronic) of sales orders in a shop. Then
there's another file of products, containing stock records. To fulfill an order,
you'd need to look up the product in the order file and then look up the
stock levels for that particular product in the product file. A database and
the software that controls the database, called a database management
system (DBMS), help with this kind of task. Most databases today
are relational databases, so named because they deal with tables of data
related by a common field. For example, Table 1.1 shows the Product table,
and Table 1.2 shows the Invoice table. As you can see, the relation between
the two tables is based on the common field stock_code . Any two
tables can relate to each other simply by having a field in common.
Database Terminology
Let's take a closer look at the previous two tables to see how they are
organized:
Each table consists of many rows and columns.
Each row contains data about one single entity (such as one product or one
order). This is called a record. For example, the first row in Table 1.1 is a
record; it describes the A416 product, which is a box of nails that costs 14
cents. To reiterate, the terms row and record are interchangeable (in formal
relational terminology, a row is also called a tuple).
Each column contains one piece of data that relates to the record, called an
attribute. Examples of attributes are the quantity of an item sold or the price
of a product. An attribute, when referring to a database table, is called a
field. For example, the data in the Description column in Table 1.1 are
fields. To reiterate, the terms attribute and field are interchangeable.
Given this kind of structure, the database gives you a way to manipulate
this data: SQL. SQL is a powerful way to search for records or make
changes. Almost all DBMSs use SQL, although many have added their own
enhancements to it. This means that when you learn about SQL in this
chapter and in more detail in later chapters, you aren't learning something
specific to MySQL. Most of what you learn can be used on any other
relational database, such as PostgreSQL, Oracle, Sybase, or SQL Server.
But after tasting the benefits of MySQL, you probably won't want to
change!
If the MySQL client is not on your desktop and you need to connect to a
second machine to use the MySQL client, you'll probably use something
such as Telnet or a Secure Shell (SSH) client to do so. Using one of these is
a matter of opening the Telnet program, entering the hostname, username,
and password. If you're unsure about this, ask your system administrator for
help.
Once you've logged into a machine on which the MySQL client program is
installed, connecting to the server is easy:
On a Unix machine (for example, Linux or FreeBSD), run the following
command from the command line within your shell:
% mysql -h hostname -u username -ppassword databasename
On a Windows machine, run the same command from the command prompt:
% mysql -h hostname -u username -ppassword databasename
The % refers to the shell prompt. It'll probably look different on your
machine—for example, c:\> on some Windows setups or $ on some Unix
shells. The -h and the -u can be followed by a space (you can also leave out
the space), but the -p must be followed by the password immediately, with
no intermediate spaces.
Tip: You can leave the password off the command line; you'll then be
prompted for it when MySQL starts, and you can enter the password
without it appearing on the screen. This avoids anyone seeing your
password entered in plain text.
The hostname would be the machine hosting the server (perhaps something
such as www.sybex.com or an IP such as 196.30.168.20). You don't need
a hostname if you're already logged into the server (in other words, the
MySQL client and server are on the same machine). The administrator
assigns you the username and password (this is your MySQL password and
username, which is not the same as your login to the client machine). Some
insecure systems don't require any username or password.
Tip: Sometimes the system administrator makes your life a little harder by
not putting MySQL into the default path. So, when you type the mysql
command, you may get a command not found error (Unix) or a bad
command or file name error (Windows) even though you know you have
MySQL installed. If this happens, you'll need to enter the full path to the
MySQL client (for example, /usr/local/build/mysql/bin/mysql, or on
Windows, something such as C:\mysql\bin\mysql). Ask your administrator
for the correct path if this is a problem in your setup.
For ease of use, you're going to use the same password for the root
user, g00r002b, as you will for the user you're creating, guru2b.
Next, you will have to create the firstdb database that you're going to be
working with throughout:
mysql> CREATE DATABASE firstdb;
Query OK, 1 row affected (0.01 sec)
Finally, the user you're going to be working as, guru2b, with a password
of g00r002b, needs to be created and given full access to the firstdb
database. Note that this assumes you will be connecting to the database
from localhost (that is, the database client and database server are on the
same machine). If this is not the case, replace localhost with the
appropriate host name:
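mysql> GRANT ALL ON firstdb.* TO guru2b@localhost IDENTIFIED BY 'g00r002b';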
guru2b is your MySQL username, and I'll use it throughout the book;
g00r002b is your password. You may choose, or be assigned, another
username. You'll learn how to grant permissions in Chapter 14 .
You'll start by creating a table inside your sample database, and then
populating this table with data. Once you've got some tables with data in
them, you'll learn how to perform queries on these tables. First, connect,
specifying your newly created database, with a command along these lines:
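% mysql -u guru2b -pg00r002b firstdb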
All these permission hassles may seem troublesome now, but they're a
useful feature. At some stage you'll want to make sure that not just anybody
can access your data, and permissions are the way you'll ensure this.
You can connect to your database either way for now, specifying the
database when you connect or later once you are connected. In the future,
when you have more than one database to use on the system, you'll find it
much easier to change between databases with the USE statement.
Creating a Table
Now that you've connected to your database, you'll want to put something
in it. To get going, you're going to create a database that could track a sales
team. As you learned, a database consists of many tables, and to start with,
you'll create a table that contains data about sales representatives. You
will store their names, employee numbers, and commissions. To create a
table, you're also going to use the CREATE command, but you need to
specify TABLE rather than DATABASE, as well as a few extras. Enter a
CREATE statement along the following lines (the column names and types
are explained just below):
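mysql> CREATE TABLE sales_rep(
employee_number INT,
surname VARCHAR(40),
first_name VARCHAR(30),
commission TINYINT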
);
Query OK, 0 rows affected (0.00 sec)
Don't forget the semicolon at the end of the line. All MySQL commands
must end with a semicolon. Forgetting it is one of the most common
beginner mistakes.
You don't need to enter the statement exactly as printed; I've used multiple
lines to make it easier to follow, but entering it on one line will make it
easier for you. Also, if you use a different case, it will still work.
Throughout this book, I use uppercase to represent MySQL keywords and
lowercase to represent names you can choose yourself. For example, you
could have entered the following:
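mysql> create table sales_rep(
employee_number int,
surname varchar(40),
first_name varchar(30),
commission tinyint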
);
without any problems. However, if you had misspelled the TABLE keyword
in that statement, MySQL would have returned a syntax error (ERROR
1064: You have an error in your SQL syntax), because it does not recognize
the misspelled keyword. So take care to enter the capitalized text
exactly; you can rename the text appearing in lowercase without any
problems (as long as you are consistent and use the same names
throughout).
You may be wondering about the INT, VARCHAR, and TINYINT terms
that appear after the fieldnames. They're what are called data types or
column types. INT stands for integer, a whole number usually ranging from
-2,147,483,648 to 2,147,483,647. That's about a third of the world's
population, so it should cover the sales team no matter how large it grows.
VARCHAR stands for variable-length character. The number in brackets is
the maximum length of the character string. An amount of 30 and 40
characters should suffice for the first name and surname, respectively.
And TINYINT stands for tiny integer, usually a whole number from -128 to
127. The commission field refers to a percentage value, and because no one
can earn more than 100 percent, a tiny integer is sufficient. Chapter 2 ,
"Data Types and Table Types," goes into more detail about the various
column types and when to use them.
SHOW TABLES lists all the existing tables in the current database. In the
case of your newly created firstdb there's only one: sales_rep. So unless
your short-term memory is really shot, this command probably wasn't much
use. But in big databases with many tables, the name of that obscure table
you created two months ago may have slipped your mind. Or perhaps you're
encountering a new database for the first time. That's when SHOW TABLES
is invaluable.
Examining the Table Structure with DESCRIBE
DESCRIBE is the command that shows you the structure of a table. To see
that MySQL has created your table correctly, type the following:
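mysql> DESCRIBE sales_rep;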
There are all kinds of extra columns in this table with which you're not yet
familiar. For now, you should be interested in the field and the type. You'll
learn about the other headings in Chapter 2 . The fieldnames are exactly as
you entered them, and the two VARCHAR fields have the size you allocated
them. Notice that the INT and TINYINT fields have been allocated a size,
too, even though you didn't specify one when you created them. Remember
that a TINYINT by default ranges from -128 to 127 (four characters
including the minus sign), and an INT ranges from -2,147,483,648 to
2,147,483,647 (11 characters including the minus sign), so the seemingly
mysterious size allocation refers to the display width.
To enter this data into the table, you use the SQL statement INSERT to
create a record, as follows:
mysql> INSERT INTO sales_rep(employee_number,surname,first_name,commission)
    VALUES(1,'Rive','Sol',10);
mysql> INSERT INTO sales_rep(employee_number,surname,first_name,commission)
    VALUES(2,'Gordimer','Charlene',15);
mysql> INSERT INTO sales_rep(employee_number,surname,first_name,commission)
    VALUES(3,'Serote','Mike',10);
The string field (a VARCHAR character field) needs single quotes around its
value, but the numeric fields (commission, employee_number) don't.
Note: Make sure you have enclosed the right field values in quotes and that
you have matched quotes correctly (whatever gets an open quote must get a
close quote), as this often trips up newcomers to SQL.
There is also a shortcut INSERT statement that leaves out the field list.
For the last record, you could have used the following:
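mysql> INSERT INTO sales_rep VALUES(3,'Serote','Mike',10);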
However, naming the fields explicitly is usually better, for two reasons.
First, it reduces the chances of error (with the shortcut form you could not
see just from the statement whether first_name and surname were in the
wrong order), and second, it makes your programs more flexible. Chapter
5 , "Programming with MySQL," discusses this topic in more detail.
You can also load data from a text file with the LOAD DATA INFILE
statement. This method involves less typing of course, and the server also
processes it more quickly.
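A minimal form, assuming the records sit in a tab-separated text file called
sales_rep.txt on the client machine, would be:
mysql> LOAD DATA LOCAL INFILE 'sales_rep.txt' INTO TABLE sales_rep;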
The format of the data file must be correct, with no exceptions. In this case,
where you're using the defaults, the text file has each record on a new line,
and each field separated by a tab. Assuming the \t character represents a
tab, and each line ends with a newline character, the text file would have to
look exactly as follows:
1\tRive\tSol\t10
2\tGordimer\tCharlene\t15
3\tSerote\tMike\t10
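You retrieve data again with the SELECT statement. For example, to return
just the commission field of each record:
mysql> SELECT commission FROM sales_rep;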
The SELECT statement has several parts. The first part, immediately after
the SELECT, is the list of fields. You could have returned a number of other
fields, instead of just commission, as follows:
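mysql> SELECT first_name,surname,commission FROM sales_rep;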
You could also use a wildcard (* ) to return all the fields, as follows:
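mysql> SELECT * FROM sales_rep;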
The * wildcard means all fields in the table. So in the previous example, all
four fields are returned, in the same order they exist in the table structure.
Operator   Condition 1   Condition 2   Result
AND        True          True          True (hot AND humid, as both are true)
AND        True          False         False (not hot AND humid, as it's not humid)
AND        False         True          False (not hot AND humid, as it's not hot)
AND        False         False         False (not hot AND humid, as neither is true)
OR         True          True          True (hot OR humid, as both are true)
OR         True          False         True (hot OR humid, as it's hot)
OR         False         True          True (hot OR humid, as it's humid)
OR         False         False         False (not hot OR humid, as neither is true)
This result may be exactly what you want. But what if the manager had
meant something slightly different? The employee must have a surname of
Rive and then can either have a first name of Sol or have a commission of
greater than 10 percent. The second result would then not apply, as although
her commission is greater than 10 percent, her first name is not Sol.
The AND construct means that both clauses must be true. You would have to
give the query as follows:
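mysql> SELECT * FROM sales_rep
    WHERE surname='Rive' AND (first_name='Sol' OR commission>10);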
Note the parentheses in this query. When you have multiple conditions, it
becomes critical to know the order in which the conditions must be
processed. Does the OR part or the AND part come first? Often you will be
given unclear verbal instructions, but this example shows you the
importance of getting clarity before you implement the query.
Sometimes errors like these are never discovered! This is often what's meant
by computer error, but it all comes down eventually to a person, usually
someone who devised the wrong query.
Take note of the %. It's a wildcard, similar to *, but specifically for use
inside a SELECT condition. It means 0 or more characters. So all of the
earlier permutations on the spelling you were considering would have been
returned. You can use the wildcard any number of times, which allows
queries such as this:
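mysql> SELECT * FROM sales_rep WHERE surname LIKE '%e%';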
This returns all the records, as it looks for any surname with an e in it. This
is different from the following query, which only looks for surnames that
start with an e:
mysql> SELECT * FROM sales_rep WHERE surname LIKE 'e%';
Empty set (0.00 sec)
You could also use a query such as the following, which searches for
surnames that have an e anywhere in the name and then end with an e:
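mysql> SELECT * FROM sales_rep WHERE surname LIKE '%e%e';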
Let's add a few more records to the table so that you can try some more
complex queries. Add the following two records:
mysql> INSERT INTO sales_rep VALUES(4,'Rive','Mongane',10);
mysql> INSERT INTO sales_rep VALUES(5,'Smith','Mike',12);
Sorting
Another useful and commonly used clause allows sorting of the results. An
alphabetical list of employees would be useful, and you can use the ORDER
BY clause to help you generate it:
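mysql> SELECT * FROM sales_rep ORDER BY surname;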
You may have noticed that this list is not quite correct if you want to sort by
name because Sol Rive appears before Mongane Rive. To correct this, you
need to sort on the first name as well when the surnames are the same. To
achieve this, use the following:
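mysql> SELECT * FROM sales_rep ORDER BY surname,first_name;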
Now the order is correct. To sort in reverse order (descending order), you
use the DESC keyword. The following query returns all records in order of
commission earned, from high to low:
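mysql> SELECT * FROM sales_rep ORDER BY commission DESC;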
Again, you may want to further sort the three employees earning 10-percent
commission. To do so, you can use the ASC keyword. Although not strictly
necessary because it is the default sort order, the keyword helps add clarity.
For example, sorting the ties by surname:
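mysql> SELECT * FROM sales_rep ORDER BY commission DESC, surname ASC;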
If there's only one number after the LIMIT clause, it determines the number
of rows returned.
Note: LIMIT 0 returns no records. This may seem useless, but it's a useful
way to test your query on large databases without actually running it.
The LIMIT clause does not only allow you to return a limited number of
records starting from the beginning or end of the dataset. You can also tell
MySQL what offset to use—in other words, from which result to start
limiting. If there are two numbers after the LIMIT clause, the first is the
offset and the second is the row limit. The next example returns three
records in descending order of commission, starting from the third:
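mysql> SELECT * FROM sales_rep ORDER BY commission DESC LIMIT 2,3;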
The 2 is the offset (remember the offset starts at 0, so 2 is actually the third
record), and the 3 is the number of records to return.
Notice the parentheses when you use functions. The function is applied to
whatever is inside the parentheses. Throughout this book, I write functions
with the parentheses whenever I refer to them to remind you that it is a
function and how to use it—for example,MAX() .
This query is all well and good, but what if you just wanted to return a list
of surnames, with each surname appearing only once? You don't want Rive
to appear for both Mongane and Sol, but just once. The solution is to
use DISTINCT, as follows:
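mysql> SELECT DISTINCT surname FROM sales_rep;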
Counting
As you can see beneath the results so far, MySQL displays the number of
rows returned, such as 4 rows in set. Sometimes, all you really want
to return is the number of results, not the contents of the records
themselves. In this case, you'd use the COUNT() function:
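mysql> SELECT COUNT(surname) FROM sales_rep;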
It doesn't really make much difference which field you counted in the
previous example, as there are as many surnames as there are first names in
the table; counting the first_name field instead would have returned the
same result.
And to find the lowest commission that any of the sales staff is earning, use
this:
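mysql> SELECT MIN(commission) FROM sales_rep;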
SUM() works in the same way. It's unlikely you would find much use for
totaling the commissions as shown here, but you can get an idea of how it
works:
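mysql> SELECT SUM(commission) FROM sales_rep;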
Obviously this is not the reason you use MySQL. There's no danger of
schools embracing MySQL for students to use in mathematics
examinations. But it's a useful feature inside a query. For example, if you
want to see what commissions the sales reps would be earning if you
increased everyone's commission by 1 percent, use this:
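mysql> SELECT first_name,surname,commission+1 FROM sales_rep;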
Deleting Records
To delete a record, MySQL uses the DELETE statement. It's similar
to SELECT, except that as the entire record is deleted, there is no need to
specify any columns. You just need the table name and the condition. For
example, Mike Smith resigns, so to remove him, you use this:
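mysql> DELETE FROM sales_rep WHERE employee_number=5;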
You could also have used the first name and surname as a condition to
delete, and in this case it would have worked as well. But for real-world
databases, you'll want to use a unique field to identify the right person.
Chapter 3 will introduce the topic of indexes, but for now keep in mind
that employee_number is a unique field, and it is best to use it (Mike Smith
is not that uncommon a name!). You will formalize the fact
that employee_number is unique in your database structure in the section on
indexes.
Warning: A DROP TABLE statement gives no warnings and no
notification—the table, and all the data in it, is simply dropped! Be careful
with this statement.
You can do the same with a database:
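mysql> DROP DATABASE databasename;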
Now you get some idea why permissions become so important! Allowing
anyone who can connect to MySQL such power would be disastrous. You'll
learn how to prevent catastrophes like this with permissions in Chapter 14 .
Changing Table Structure
The final DDL statement, ALTER, allows you to change the structure of
tables. You can add columns, change column definitions, rename tables, and
drop columns.
Adding a Column
Let's say that you realize you need to create a column in your sales_rep
table to store the date the person joined. UPDATE won't do, as this only
changes the data, not the structure. To make this change, you use the ALTER
statement:
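mysql> ALTER TABLE sales_rep ADD date_joined DATE;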
Your manager gives you another requirement for your database. (Although
most changes are easy to perform, it's always better to get the design right
from the beginning, as some changes will have unhappy consequences.
Chapter 9 , "Database Design," introduces the topic of database design.)
You need to store the year that the sales rep was born, in order to perform
analysis on the age distribution of the staff. MySQL has a YEAR column
type, which you can use. Add the following column:
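mysql> ALTER TABLE sales_rep ADD year_born YEAR;
(The column name year_born here is an illustrative choice.) Later, to hold a
full date of birth rather than just the year, the same column can be converted
with the CHANGE clause:
mysql> ALTER TABLE sales_rep CHANGE year_born birthday DATE;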
After the CHANGE clause comes the old column name, then the new column
name, followed by its definition. To change the definition, but not the name
of the column, you simply repeat the same column name as both the old and
the new name.
Renaming a Table
One morning your manager barges in demanding that the term sales rep no
longer be used. From henceforth, the employees are cash-flow enhancers,
and records need to be kept of their enhancement value. Noting the wild
look in your manager's eye, you decide to comply, first by adding the new
field:
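mysql> ALTER TABLE sales_rep ADD enhancement_value INT;
and then by renaming the table (cash_flow_enhancer is an illustrative choice
of name here):
mysql> ALTER TABLE sales_rep RENAME cash_flow_enhancer;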
The next day your manager is looking a little sheepish and red-eyed, and is
mumbling something about doctored punch. You decide to change the table
name back and drop the new column, before anyone notices.
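mysql> ALTER TABLE cash_flow_enhancer RENAME TO sales_rep;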
Note the difference between the two ALTER ... RENAME statements: TO has
been included after the second RENAME. Both statements are identical in
function, however. There are quite a few cases like this where MySQL has
more than one way of doing something. In fact, there's even a third way to
rename a table; you could use RENAME TABLE old_tablename TO
new_tablename. These sorts of things are often there to provide compliance
with other databases or with the ANSI SQL standard.
Dropping a Column
To remove the unwanted enhancement_value column (what made us think
INT was right for this mysterious field anyway?), use ALTER ... DROP, as
follows:
mysql> ALTER TABLE sales_rep DROP
enhancement_value;
Query OK, 4 rows affected (0.06 sec) Records: 4
Duplicates: 0 Warnings: 0
A DESCRIBE of the table at this point shows six fields.
If you do a query, returning date_joined and birthday, you get the following
result:
The NULL values indicate that you never entered anything into these fields.
You'll have noticed the Null heading returned when you DESCRIBE a
table. YES is the default for this; it means the field is allowed to have
nothing in it. Sometimes you want to specify that a field can never contain
a NULL value. You'll learn how to do this in Chapter 3 . NULL values often
affect the results of queries and have their own complexities, which are also
examined in Chapters 2 and 3 . To make sure you have no NULL values,
update the sales rep records as follows:
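For example, with illustrative dates for the first employee (repeat with
appropriate values for the others):
mysql> UPDATE sales_rep SET date_joined='2000-02-15', birthday='1976-03-18'
    WHERE employee_number=1;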
There are a host of useful date functions. These examples show just a few;
see Appendix A , "MySQL Syntax Reference," and Appendix B for more
information.
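For example, DATE_FORMAT() displays a date in whatever layout you
choose:
mysql> SELECT DATE_FORMAT(date_joined,'%m-%d-%Y') FROM sales_rep;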
The part in single quotes after the date_joined column is called the format
string. Inside the format string, you use a specifier to specify exactly what
format to return. %m returns the month (01–12), %d returns the day (01–31),
and %Y returns the year in four digits. There are a huge range of specifiers;
see Appendix B for a full listing. In the meantime, some examples follow:
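mysql> SELECT DATE_FORMAT(date_joined,'%W %e %M %y') FROM sales_rep;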
%W returns the weekday name, %M returns the month name, %e returns the
day (1–31), and %y returns the year in a two-digit format. Note that %d also
returns the day (01–31), but it differs from %e in that the leading zeros are
included.
In the following query, %a is the abbreviated weekday name, %D is the day
of month with the suffix attached, %b is the abbreviated month name, and
%Y is the four-digit year:
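mysql> SELECT DATE_FORMAT(date_joined,'%a %D %b %Y') FROM sales_rep;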
Note: NOW() returns the date and time. There is a column type called
DATETIME that allows you to store data in the same format (YYYY-MM-DD
HH:MM:SS) in your tables.
You can do other conversions on the birthday field when you return it. Just
in case you were worried about not being able to return the year, because
you've replaced the year field with a date of birth, you can use the YEAR()
function, as follows:
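mysql> SELECT surname,first_name,YEAR(birthday) FROM sales_rep;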
MySQL has other functions for returning just a part of a date: MONTH() and
DAYOFMONTH():
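mysql> SELECT surname,first_name,MONTH(birthday) AS month,
    DAYOFMONTH(birthday) AS day FROM sales_rep;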
+----------+------------+-------+------+
| surname | first_name | month | day |
+----------+------------+-------+------+
| Rive | Mongane | 1 | 4 |
| Rive | Sol | 3 | 18 |
| Serote | Mike | 6 | 18 |
| Gordimer | Charlene | 11 | 30 |
+----------+------------+-------+------+
4 rows in set (0.01 sec)
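You can combine the name fields into a single column with the CONCAT()
function; a query along these lines produces the result that follows:
mysql> SELECT CONCAT(first_name,' ',surname) AS name,MONTH(birthday) AS month,
    DAYOFMONTH(birthday) AS day FROM sales_rep;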
+-------------------+-------+------+
| name | month | day |
+-------------------+-------+------+
| Mongane Rive | 1 | 4 |
| Sol Rive | 3 | 18 |
| Mike Serote | 6 | 18 |
| Charlene Gordimer | 11 | 30 |
+-------------------+-------+------+
4 rows in set (0.00 sec)
Note: Mind the space used inside CONCAT(). Just as with date specifiers,
you can use any characters to format your CONCAT() output.
mysql> SELECT DAYOFYEAR(date_joined) FROM sales_rep WHERE employee_number=1;
+------------------------+
| DAYOFYEAR(date_joined) |
+------------------------+
| 46 |
+------------------------+
Can you create these tables? Here are the statements I used (for the
customer table, the column sizes are a reasonable choice; anything large
enough for the names will do):
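mysql> CREATE TABLE customer(
id int,
first_name varchar(30),
surname varchar(40)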
);
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE sales(
code int,
sales_rep int,
customer int,
value int
);
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO customer(id,first_name,surname)
VALUES (1,'Yvonne','Clegg'),
(2,'Johnny','Chaka-Chaka'),
(3,'Winston','Powers'),
(4,'Patricia','Mankunku');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> INSERT INTO
sales(code,sales_rep,customer,value) VALUES
(1,1,1,2000),
(2,4,3,250),
(3,2,3,500),
(4,1,4,450),
(5,3,1,3800),
(6,1,2,500);
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> SELECT sales_rep,customer,value,first_name,surname
    FROM sales,sales_rep
    WHERE code=1 AND sales_rep.employee_number=sales.sales_rep;
+-----------+----------+-------+------------+---------+
| sales_rep | customer | value | first_name | surname |
+-----------+----------+-------+------------+---------+
|         1 |        1 |  2000 | Sol        | Rive    |
+-----------+----------+-------+------------+---------+
1 row in set (0.00 sec)
The first part of the query, after the SELECT, lists the fields you want to
return. Easy enough—you just list the fields you want from both tables.
The second part, after the FROM, tells MySQL which tables to use. In this
case it's two tables: the sales and sales_rep tables.
The third part, after the WHERE, contains the condition: code=1, which
returns the first record from the sales table. The next part is what makes this
query a join. This is where you tell MySQL which fields to join on or which
fields the relation between the tables exists on. The relation between
the sales table and the sales_rep table is between the employee_number
field in sales_rep and the sales_rep field in the sales table. So, because you
find a 1 in the sales_rep field, you must look for employee_number 1 in the
sales_rep table.
Let's try another one. This time you want to return all the sales that Sol Rive
(employee_number 1) has made. Let's look at the thought process behind
building this query: Which tables do you need? Clearly sales_rep and sales.
That's already part of the query: FROM sales_rep,sales.
What fields do you want? You want all the sales information. So the field
list becomes SELECT code,customer,value.
And finally, what are your conditions? The first is that you only want Sol
Rive's results, and the second is to specify the relation, which is between
the sales_rep field in the sales table and the employee_number field in
the sales_rep table. So, the conditions are as follows: WHERE
first_name='Sol' AND surname='Rive' AND
sales.sales_rep = sales_rep.employee_number.
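Putting it all together:
mysql> SELECT code,customer,value FROM sales_rep,sales
    WHERE first_name='Sol' AND surname='Rive'
    AND sales.sales_rep = sales_rep.employee_number;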
+------+----------+-------+
| code | customer | value |
+------+----------+-------+
| 1 | 1 | 2000 |
| 4 | 4 | 450 |
| 6 | 2 | 500 |
+------+----------+-------+
3 rows in set (0.00 sec)
+------+----------+-------+
| code | customer | value |
+------+----------+-------+
|    1 |        1 |  2000 |
|    4 |        4 |   450 |
+------+----------+-------+
3 rows in set (0.00 sec)
You could have written the query without table names before any
fieldnames, because there are no fields that have the same names in the
different tables. Or you could have written it like this:
mysql> SELECT sales.code,sales.customer,sales.value
    FROM sales,sales_rep
    WHERE sales_rep.first_name='Sol' AND sales_rep.surname='Rive'
    AND sales.sales_rep = sales_rep.employee_number;
+------+----------+-------+
| code | customer | value |
+------+----------+-------+
| 1 | 1 | 2000 |
| 4 | 4 | 450 |
| 6 | 2 | 500 |
+------+----------+-------+
3 rows in set (0.00 sec)
Now let's try the join again, with the name corrected, but without using the
dot notation to specify the table names:
Just from reading the query you can probably see it is not clear. So now you
have to use the table names every time you reference one of
the employee_number fields:
+------+----------+-------+
| code | customer | value |
+------+----------+-------+
|    4 |        4 |   450 |
|    6 |        2 |   500 |
+------+----------+-------+
3 rows in set (0.00 sec)
Let's change the fieldname back to what it was before going any further:
Note
You can also use CURRENT_DATE() instead of NOW(), which will give
you the same result here (CURRENT_DATE() returns just the date, without
the time portion).
The previous query result does not return the age, only the difference in
years. It does not take into account days and months. This section describes
age calculation; it may be tricky for novice users. Don't be put off. Once
you've practiced your basic queries, this type of query will be old hat!
You need to subtract the years as you've done previously but then subtract a
further year if a full year has not passed. Someone born on the 10th of
December in 2001 is not one year old in January 2002, but only after the
10th of December in 2002. A good way of doing this is to take the MM-DD
components of the two date fields (the current date and the birthdate) and
compare them. If the current one is larger, a full year has passed, and the
year calculation can be left. If the current MM-DD part is less than the birth
date one, less than a full year has passed, and you must subtract one from
the calculation. This may sound tricky, and there are some quite complex
ways of doing the calculation floating around, but MySQL makes it easier
because it evaluates a true expression to 1 and a false expression to 0:
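mysql> SELECT YEAR(NOW()) > YEAR(birthday), YEAR(NOW()) < YEAR(birthday)
    FROM sales_rep WHERE employee_number=1;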
The current year is greater than the birthday year of employee 1. That is
true and evaluates to 1. The current year is not less than the birthday year.
That is false and evaluates to 0.
Now you need a quick way to return just the MM-DD component of the
date. A shortcut way of doing this is to use the RIGHT() string
function:
mysql> SELECT RIGHT(CURRENT_DATE,5),RIGHT(birthday,5) FROM sales_rep;
+-----------------------+-------------------+
| RIGHT(CURRENT_DATE,5) | RIGHT(birthday,5) |
+-----------------------+-------------------+
| 04-06 | 03-18 |
| 04-06 | 11-30 |
| 04-06 | 01-04 |
| 04-06 | 06-18 |
+-----------------------+-------------------+
4 rows in set (0.00 sec)
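Putting the pieces together, an age query along these lines produces the
result that follows:
mysql> SELECT surname, first_name,
    YEAR(CURRENT_DATE) - YEAR(birthday) -
    (RIGHT(CURRENT_DATE,5) < RIGHT(birthday,5)) AS age
    FROM sales_rep;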
+----------+------------+------+
| surname | first_name | age |
+----------+------------+------+
| Rive | Sol | 26 |
| Gordimer | Charlene | 43 |
| Rive | Mongane | 20 |
| Serote | Mike | 30 |
+----------+------------+------+
4 rows in set (0.00 sec)
Your results may not match these results exactly because time marches on,
and you'll be running the query at a later date than I did.
Warning: Be careful to match parentheses when doing such a complex
calculation. For every opening parenthesis, you need a closing parenthesis,
and in the correct place!
Can you spot a case when the previous age query will not work? When the
current year is the same as the birth year, you'll end up with –1 as the
answer. Once you've had a good look at the appendixes, try and work out
your own way of calculating age. There are many possibilities and just as
many plaintive cries to MySQL to develop an AGE() function.
Grouping in a Query
Now that you have a sales table, let's put the SUM() function to better use
than you did earlier, by returning the total value of sales:
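mysql> SELECT SUM(value) FROM sales;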
Now you want to find out the total sales of each sales rep. To do this
manually, you would need to group the sales table according to sales_rep.
You would place all of the sales made by
sales_rep 1 together, total the value, and then repeat with sales rep 2. SQL
has the GROUP BY clause, which MySQL uses in the same way:
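mysql> SELECT sales_rep,SUM(value) FROM sales GROUP BY sales_rep;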
If you had tried the same query without grouping, you'd have gotten an
error: mysql> SELECT sales_rep,SUM(value) FROM sales;
Now try a more complex query that uses a number of the concepts you have
learned. Let's return the name of a sales rep with the fewest number of
sales. First, you would have to return an employee number. You may get
back a different number when you run the query, as there are three people
who've made only one sale. It doesn't matter which one you return for now.
You would do this as follows:
mysql> SELECT sales_rep,COUNT(*) AS count FROM sales
    GROUP BY sales_rep ORDER BY count LIMIT 1;
Can you take it even further? Can you perform a join as well to return the
name of sales rep 4? If you can do this having started the book as a novice,
you're well on the way to earning your username, "guru to be"! Here's the
query to do so:
mysql> SELECT
first_name,surname,sales_rep,COUNT(*) AS count
from sales,sales_rep WHERE
sales_rep=employee_number GROUP BY
sales_rep,first_name,surname ORDER BY count
LIMIT 1;
+------------+---------+-----------+-------+
| first_name | surname | sales_rep | count |
+------------+---------+-----------+-------+
| Mongane | Rive | 4 | 1 |
+------------+---------+-----------+-------+
1 row in set (0.00 sec)
Summary
MySQL is a relational database management system. Logically, data is
structured into tables, which are related to each other by means of a
common field. Tables consist of rows (or records), and records consist of
columns (or fields). Fields can be of differing types: a numeric type, a string
type, or a date type. (This chapter merely introduces SQL. You will build
your skills in the language throughout this book.)
The MySQL server is what stores the data and runs queries on it. To
connect to the MySQL server, you need the MySQL client. This can be on
the same machine as the server or a remote machine.
The CREATE statement creates databases and the tables within the database.
The INSERT statement places records into a table.
The SELECT statement returns the results of a query.
The DELETE statement removes records from a table.
The UPDATE statement modifies the data in a table.
Functions add to the power of MySQL. You can recognize a function by the
parentheses immediately following it. There are many MySQL functions—
mathematical ones such as SUM() to calculate the total of a result set, date
and time functions such as YEAR() to extract the year portion of a date, and
string functions such as RIGHT() to extract part of a string starting on the
right side of that string.
Armed with this basic information, you are now ready to learn the crucial
intricacies of structuring data, move on to more advanced SQL, and
encounter the various types of tables MySQL uses for different kinds of
solutions.
There are three main types of columns in MySQL: numeric, string, and
date. Although there are many more specific types, which you'll learn about
shortly, you can classify each of these into one of the three main types.
Generally, you should choose the smallest possible column type, as this will
save space and be faster to access and update. However, choosing too small
a column can result in data being lost or cut off when you insert it, so be
sure to choose a type that covers all eventualities. The following sections
explore each type in detail.
Note that table and database names can be case sensitive! By default they
are not case sensitive on Windows, but they are case sensitive on most
versions of Unix, except MacOS X.
Note
When performing a query on a numeric column type, you do not need to
use quotes around the numeric value.
Table 2.1 lists the numeric types available in MySQL.
TINYINT[(M)] [UNSIGNED] [ZEROFILL]: A tiny integer, -128 to 127
(SIGNED); 0 to 255 (UNSIGNED).
BIT, BOOL: Synonyms for TINYINT(1).
SMALLINT[(M)] [UNSIGNED] [ZEROFILL]: A small integer, -32,768 to
32,767 (SIGNED); 0 to 65,535 (UNSIGNED).
MEDIUMINT[(M)] [UNSIGNED] [ZEROFILL]: A medium-sized integer,
-8,388,608 to 8,388,607 (SIGNED); 0 to 16,777,215 (UNSIGNED).
INT[(M)] [UNSIGNED] [ZEROFILL]: An integer, -2,147,483,648 to
2,147,483,647 (SIGNED); 0 to 4,294,967,295 (UNSIGNED).
INTEGER[(M)] [UNSIGNED] [ZEROFILL]: A synonym for INT.
BIGINT[(M)] [UNSIGNED] [ZEROFILL]: A big integer,
-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (SIGNED); 0 to
18,446,744,073,709,551,615 (UNSIGNED).
FLOAT[(M,D)] [UNSIGNED] [ZEROFILL], FLOAT(precision)
[UNSIGNED] [ZEROFILL]: A small (single-precision) floating-point
number.
DOUBLE[(M,D)] [UNSIGNED] [ZEROFILL]: A double-precision
floating-point number, -1.7976931348623157E+308 to
-2.2250738585072014E-308, 0, and 2.2250738585072014E-308 to
1.7976931348623157E+308.
DOUBLE PRECISION[(M,D)] [UNSIGNED] [ZEROFILL]: A synonym
for DOUBLE.
REAL[(M,D)] [UNSIGNED] [ZEROFILL]: Another synonym for DOUBLE.
DECIMAL[(M[,D])] [UNSIGNED] [ZEROFILL]: A decimal number,
stored like a string with one byte for each character (this is called unpacked;
all other numeric types are packed). The maximum range is the same as for
DOUBLE. M refers to the total number of digits (excluding the sign and
decimal point, except for versions earlier than 3.23), and D refers to the
number of digits after the decimal point; it should always be less than M,
and defaults to 0 if omitted. Unlike the other numeric types, M and D
constrain the range of allowed values. With UNSIGNED, negative values
are disallowed.
DEC[(M[,D])] [UNSIGNED] [ZEROFILL]: A synonym for DECIMAL.
NUMERIC[(M[,D])] [UNSIGNED] [ZEROFILL]: Another synonym for
DECIMAL.
Use the following guidelines when deciding what numeric type to choose:
Choose the smallest applicable type (TINYINT rather than INT if the value
will never go beyond 127 signed).
For whole numbers, choose an integer type. (Remember that money can
also be stored as a whole number: in cents rather than dollars, for example,
which makes it an integer.) It could also reasonably be stored as
a DECIMAL.
For high precision, use integer types rather than floating-point types
(rounding errors afflict floating-point numbers).
The M value in Table 2.1 often causes confusion. Setting M to a higher value
than the type allows will not extend its limit. For example:
mysql> CREATE TABLE test2(id TINYINT(10));
Query OK, 0 rows affected (0.32 sec)
mysql> INSERT INTO test2(id) VALUES(100000000);
Query OK, 1 row affected (0.00 sec)
mysql> SELECT id FROM test2;
+------+
| id |
+------+
| 127 |
+------+
1 row in set (0.00 sec)
Even though the figure inserted was fewer than 10 digits, because it is a
signed TINYINT, it is limited to a maximum positive value of 127.
Conversely, specifying an M smaller than the type's natural width does not
cut values off: it constrains neither the range that can be stored nor the
number of digits displayed. For example:
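For instance, with an illustrative test3 table:
mysql> CREATE TABLE test3(id TINYINT(2));
mysql> INSERT INTO test3(id) VALUES(100);
mysql> SELECT id FROM test3;
The value 100 is stored and displayed in full; the (2) affects only the display
width (and padding when ZEROFILL is used).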
TINYBLOB: A tiny binary large object; maximum 255 characters (2^8 - 1).
Requires length + 1 bytes of storage. Searching is done case sensitively.
TINYTEXT: Same as TINYBLOB, except that searching is done case
insensitively. In most situations, rather use VARCHAR, as it should be
faster.
BLOB: A binary large object; maximum 65,535 characters (2^16 - 1).
Requires length + 2 bytes of storage. Same as TEXT, except that searching
is done case sensitively.
TEXT: Maximum 65,535 characters (2^16 - 1); requires length + 2 bytes of
storage. Searching is done case insensitively.
MEDIUMBLOB, MEDIUMTEXT: Maximum 16,777,215 characters
(2^24 - 1); requires length + 3 bytes of storage.
LONGBLOB, LONGTEXT: Maximum 4,294,967,295 characters (2^32 - 1);
requires length + 4 bytes of storage.
ENUM('value1','value2',...): Enumeration. Can hold only one of the
specified values, NULL, or "". Maximum of 65,535 values.
SET('value1','value2',...): A set. Can contain zero to 64 values from the
specified list.
Use the following guidelines when deciding what string type to choose:
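By default, searching on CHAR and VARCHAR columns is not case
sensitive, while BLOB and BINARY columns are searched case sensitively.
As a quick sketch, using a test5 table like the one altered just below:
mysql> CREATE TABLE test5(first_name CHAR(10));
mysql> INSERT INTO test5(first_name) VALUES('Nkosi');
mysql> SELECT * FROM test5 WHERE first_name='nkosi';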
This search returned a result even though you specified nkosi rather
than Nkosi. If you ALTER the table, specifying the first_name column
as BINARY, you no longer find the result, as follows:
mysql> ALTER TABLE test5 CHANGE first_name
first_name CHAR(10) BINARY; Query OK, 1 row
affected (0.16 sec)
Records: 1 Duplicates: 0 Warnings: 0
ENUM columns have some interesting features. If you add an invalid value,
an empty string ("" ) is inserted instead, as follows:
mysql> CREATE TABLE test6(bool
ENUM("true","false")); Query OK, 0 rows affected
(0.17 sec)
mysql> INSERT INTO test6(bool) VALUES ('true');
Query OK, 1 row affected (0.17 sec)
mysql> INSERT INTO test6(bool) VALUES ('troo');
Query OK, 1 row affected (0.06 sec)
You can also perform queries on enumerated fields based on their indexes
(the first value starts at 1). In the previous example, true would reflect as
an index of 1, false as an index of 2, NULL as an index of NULL, and any
other value ("") as an index of 0. For example:
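mysql> SELECT * FROM test6 WHERE bool = 1;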
If you insert an index value directly, you'll see the full enumerated value
when you return a result:
mysql> INSERT INTO test6(bool) VALUES(2);
Query OK, 1 row affected (0.00 sec)
Note that the order of the elements is always the same as specified in the
CREATE TABLE statement, so mango,apple is stored as apple,mango and
appears this way in the sorted results, too.
Note: You can create a column of type CHAR(0). This may seem useless,
but it can be useful when using old applications that depend on the
existence of a field but don't actually store anything in it. You can also use
it if you need a field that can contain only two values, NULL and "".
Column type: Display format
TIMESTAMP(14): YYYYMMDDHHMMSS
TIMESTAMP(12): YYMMDDHHMMSS
TIMESTAMP(10): YYMMDDHHMM
TIMESTAMP(8): YYYYMMDD
TIMESTAMP(6): YYMMDD
TIMESTAMP(4): YYMM
TIMESTAMP(2): YY
This does not mean that data is lost, though. The number only affects the
display; even in a column defined as TIMESTAMP(2), the full 14 digits are
stored, so if at a later stage you change the table definition, the
full TIMESTAMP will be correctly displayed. Functions, except
for UNIX_TIMESTAMP(), work on the display value.
MySQL Options
When you run the mysql command to connect to MySQL, you can use any
of the options shown in Table 2.5 .
-?, --help: Displays the help and exits.
-A, --no-auto-rehash: Allows for quicker startup. Automatic rehashing is
the feature that allows you to press Tab and have MySQL try to complete
the table or field name. MySQL hashes the field and table names on startup,
but sometimes, when you have many tables and fields, startup can become
too slow; this option switches hashing off. To use hashing when this option
has been specified on startup, enter rehash on the command line.
-b, --no-beep: Turns off the beeping each time there's an error.
-B, --batch: Accepts SQL statements in batch mode; displays results with
tab separation; doesn't use the history file.
--character-sets-dir=...: Tells MySQL in which directory character sets are
located.
-C, --compress: Uses compression in the server/client protocol.
-#, --debug[=...]: Writes a debugging log.
-D, --database=...: Specifies the database to use.
--default-character-set=...: Sets the default character set.
-e, --execute=...: Executes the command and quits. The output is the same
as with the -B option.
-E, --vertical: Prints the output of a query vertically, with each field on a
different line. Without this option you can also force this output by ending
your statements with \G.
-f, --force: Forces MySQL to continue processing even if you get a SQL
error. Useful in batch mode when processing from files.
-g, --no-named-commands: Disables named commands.
-G, --enable-named-commands: Enables named commands.
-h, --host=...: Specifies the host to connect to.
-H, --html: Produces HTML output.
-i, --ignore-space: Ignores spaces after function names (see the example
later in this section).
-L, --skip-line-numbers: Causes MySQL not to write line numbers for
errors. Can be useful when outputting to result files in which you later want
to search for errors or compare.
--no-pager: Disables the pager and results in output going to standard
output. See the --pager option.
--no-tee: Disables the outfile. See the interactive help (\h) also.
-n, --unbuffered: Flushes the buffer after each query.
-N, --skip-column-names: Causes MySQL not to write column names in
results.
-O, --set-variable var=option: Gives a variable a value. --help lists the
variables.
-o, --one-database: Only updates the default database. This can be useful
for skipping updates to other databases in the update log.
--pager[=...]: Long result output will usually scroll off the screen; this
option sends the result to a pager. The default pager is your ENV
variable PAGER. Valid pagers are less, more, cat [> filename], and so on.
This option does not work in batch mode, and the pager works only in
Unix.
-p[password], --password[=...]: Specifies the password to use when
connecting. If you leave the value out, you will be prompted for it; with the
short form there must be no space between -p and the password.
-P, --port=...: By default, you connect to a MySQL server through port
3306. You can change this by specifying the TCP/IP port number to use for
the connection.
-q, --quick: Doesn't cache the result; prints it row by row. This may slow
down the server if the output is suspended.
-r, --raw: Writes column values without escape conversion (used with
batch mode).
-s, --silent: Runs more silently, producing less output.
-S, --socket=...: Specifies the socket file to use for the connection.
-t, --table: Displays output in table format (the default in interactive
mode).
-T, --debug-info: Prints some debugging information at exit.
--tee=...: Appends a copy of all output to the given file.
-u, --user=...: Specifies the user to connect as.
-U, --safe-updates[=#], --i-am-a-dummy[=#]: Only allows UPDATE and
DELETE statements that use keys. If this option is set by default, you can
reset it by using --safe-updates=0.
-v, --verbose: Causes MySQL to give more verbose output (-v -v -v gives
the table output format, -t).
-V, --version: Outputs the version information and exits.
-w, --wait: If the connection is down, this option will wait and try to
connect later, rather than aborting.
Automatic rehashing is the feature that allows you to press the Tab key and
complete the table or field name. MySQL creates the hash when you
connect, but sometimes, when you have many tables and fields, startup can
become too slow. Using the -A or --no-auto-rehash option switches this off.
The -E option prints the results vertically. You can get this kind of output,
even if you haven't connected to MySQL with this option active, by
using \G at the end of a query:
*************************** 1. row
*************************** id: 1
first_name: Yvonne
surname: Clegg
*************************** 2. row
*************************** id: 2
first_name: Johnny
surname: Chaka-Chaka
*************************** 3. row
*************************** id: 3
first_name: Winston
surname: Powers
*************************** 4. row
*************************** id: 4
first_name: Patricia
surname: Mankunku
*************************** 5. row
*************************** id: 5
first_name: Francois
surname: Papo
*************************** 6. row
*************************** id: 7
first_name: Winnie
surname: Dlamini
*************************** 7. row
*************************** id: 6
first_name: Neil
surname: Beneke
7 rows in set (0.00 sec)
The ignore-space option (-i) allows you to be more lax when using
functions in your queries. For example, the following normally causes an
error (note the space after MAX):
mysql> SELECT MAX (value) FROM sales;
ERROR 1064: You have an error in your SQL syntax
near '(value) from sales' at line 1
But if you'd used the -i option to connect, there'd have been no problem:
The -o option only allows updates to the default database. If you had
connected with this option, you would not have been able to make an
update to any tables in the firstdb database.
The -U option (also called the "I am a dummy" option) helps avoid
unpleasant surprises by not permitting an UPDATE or DELETE that does not
use a key (you'll look at keys in detail in Chapter 4 , "Indexes and Query
Optimization"). If you connect with this option, a command like the
following will not work:
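For instance, a blanket update with no key in the WHERE clause (or no
WHERE clause at all):
mysql> UPDATE sales_rep SET commission = 20;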
If you do run into the ISAM table type, you should change it to the more
efficient MyISAM type. MyISAM tables also allow you to use more of
MySQL's built-in functionality. To convert an ISAM table to a MyISAM
table, use the following:
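mysql> ALTER TABLE tablename TYPE=MYISAM;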
MyISAM Tables
The MyISAM table type replaced ISAM in version 3.23.0. MyISAM
indexes are much smaller than ISAM indexes, so the system will use fewer
resources when doing aSELECT using an index on a MyISAM table.
However, MyISAM uses more processor power to insert a record into the
more compressed index.
MyISAM data files are given the extension .MYD, and the indexes have the
extension .MYI. Each MyISAM database is stored in its own directory, so if
you've been doing the examples from Chapter 1 and have permission to
look inside the directory firstdb, you'll see the following files:
sales_rep.MYI
sales_rep.MYD
sales.MYD
sales.MYI
customer.MYD
customer.MYI
The data files should always be larger than the index files. In Chapter 4
you'll see how to use indexes properly and what they actually contain.
There are three subtypes of MyISAM tables: static, dynamic, or
compressed.
MySQL decides whether to use dynamic or static tables when the table is
created. Static tables are the default format, used if there are
no VARCHAR, BLOB, or TEXT columns. If any of these column types exist,
the table type becomes dynamic.
Static Tables
Static tables (also more descriptively called fixed-length tables) are of a
fixed length. Look at Figure 2.1 , which shows the characters stored in one
field of a mini-table. The field is a first name, set to CHAR(10).
Easy to reconstruct after a crash (again, because the positions of records are
fixed, MySQL knows where each record is, so only a record being written
during the crash will be lost)
Requires more disk space (30 characters needed for the 3 records, even
though only 16 are used for the names)
Not necessary to reorganize with myisamchk (see Chapter 10 , "Basic
Administration," for more on this)
Dynamic Tables
Columns in dynamic tables are of different lengths. If the same data used in
the static table is placed into a dynamic table, it will be stored as shown in
Figure 2.2 .
All string columns are dynamic (unless they are less than 4 bytes. In this
case, the space saved would be negligible, and the extra complexity would
lead to a performance loss).
Each record has a header, which indicates which string columns are empty
and which numeric columns contain a zero (not NULL records), in which
case they are not stored to disk. Nonempty strings contain a length byte,
plus the string contents.
Compressed Tables
Compressed tables are read-only table types that use much less disk space.
They are ideal for use with archival data which will not change (as they can
only currently be read from, not written to), and where not much space is
available, such as for a CD-ROM.
-f, --force: Forces myisampack to pack the table even if the temporary file
exists, if the compression causes the table to become bigger, or if the table
is too small to compress in the first place.
-?, --help: Displays help and exits.
-j big_tablename, --join=big_tablename: Joins all tables listed on the
command line into one big table. All tables that you want to combine must
be identical (in all aspects such as columns and indexes).
-p #, --packlength=#: Specifies the row length pointer storage size, in bytes
(1 to 3). Usually you'd only use this option when you're running
myisampack a second time: occasionally, myisampack notices that it should
have used a shorter length pointer during the process (normally it gets it
right!), and the next time you pack the table you can tell it to use the
optimal storage size.
-s, --silent: Runs in silent mode, writing output only when errors occur.
-t, --test: Doesn't actually pack the table, only tests packing it.
-w, --wait: If the table is in use, waits and retries. Using this option in
conjunction with --skip-external-locking is not recommended if there is a
possibility of the table being updated while you're packing.
Let's compress one of the tables you've used so far. You have to use the-f
option because the table is too small to compress normally:
C:\Program Files\MySQL\bin>myisampack -v -f
..\data\firstdb\sales_~1 Compressing
..\data\firstdb\sales_~1.MYD: (5 records)
- Calculating statistics
MERGE Tables
MERGE tables are amalgamations of identical MyISAM tables. They were
introduced in version 3.23.25. You'd normally use them only when your
MyISAM tables are getting too big.
Smaller table size. Some operating systems have a file size limit, and
splitting the tables and creating a MERGE table allows one to get around
this. Also, files can be more easily transferred, such as by copying them to
CD.
You can make most of the original tables read-only and allow INSERTs
into the most recent table. This means you run the risk of only one small
table getting corrupted during an UPDATE or INSERT, and the repairs on
this table would be much quicker.
You need to take care when changing one of the underlying tables, as this
will corrupt the MERGE table (no actual harm is done, just the MERGE
table may be unavailable).
Let's insert some data into the tables so you can test it later:
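Assuming two identically structured tables, sales_rep1 and sales_rep2,
combined into a MERGE table (the table names here are illustrative), the
setup and a couple of test rows might look like this:
mysql> CREATE TABLE sales_rep1 (employee_number INT, surname VARCHAR(40),
    first_name VARCHAR(30), commission TINYINT);
mysql> CREATE TABLE sales_rep2 (employee_number INT, surname VARCHAR(40),
    first_name VARCHAR(30), commission TINYINT);
mysql> CREATE TABLE sales_rep_merge (employee_number INT, surname VARCHAR(40),
    first_name VARCHAR(30), commission TINYINT)
    TYPE=MERGE UNION=(sales_rep1,sales_rep2);
mysql> INSERT INTO sales_rep1 VALUES(1,'Rive','Sol',10);
mysql> INSERT INTO sales_rep2 VALUES(2,'Gordimer','Charlene',15);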
Now, if you do a query on the merged table, all the records in sales_rep1
and sales_rep2 are available:
Based on the previous results, you don't know which underlying table any
of the records is in. Fortunately, if you're updating a record, you don't need
to know: an UPDATE issued against the MERGE table will update the
record correctly. Because the record only physically exists on the
underlying level, queries to both the MERGE table and the underlying
MyISAM table will reflect the correct data.
To change the structure of an underlying table, drop the MERGE table first,
then make your changes, and then rebuild the MERGE table. If you make
the changes and forget to drop your MERGE table, you may find you are
unable to access the MERGE table properly. Dropping and rebuilding will
solve this.
HEAP Tables
HEAP tables are the fastest table types because they are stored in memory
and use a hashed index. The downside is that, because they are stored in
memory, all data will be lost in the case of a crash. They also can't hold
quite so much data (unless you've got a big budget for RAM).
As with any table, you can create a table based on the contents of another
table. HEAP tables are often used to give faster access to an already
existing table—to leave the original table for inserting and updating and
then have the new table for fast reading. Let's create one from the sales_rep
table. If you haven't already created the sales_rep table, create and populate
it; the CREATE statement looks something like this (using the columns
from Chapter 1):
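mysql> CREATE TABLE sales_rep(
employee_number INT,
surname VARCHAR(40),
first_name VARCHAR(30),
commission TINYINT,
date_joined DATE,
birthday DATE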
) TYPE=MyISAM;
Now, you create a HEAP table that takes a subset of sales_rep and puts it
into memory for fast access:
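A sketch of what this might look like (the table name sales_rep_heap and
the column subset are illustrative choices):
mysql> CREATE TABLE sales_rep_heap TYPE=HEAP
    SELECT employee_number,surname,first_name FROM sales_rep;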
As you can see, there are quite a few differences between MyISAM indexes
and HEAP indexes. A HEAP table could actually be slower if you're relying
on an index it does not use. See Chapter 4 for more on using keys.
InnoDB Tables
InnoDB tables are a transaction-safe table type (this means you have
COMMIT and ROLLBACK capabilities). In a MyISAM table, the entire table
is locked when inserting; just for that fraction of a second, no other
statements can be run on the table. InnoDB uses row-level locking so that
only the row is locked, not the entire table, and statements can still be
performed on other rows.
For performance purposes you should use InnoDB tables if your data
performs large numbers of INSERTs or UPDATEs relative to SELECTs.
MyISAM would be a better choice when your database performs large
numbers of SELECTs relative to UPDATEs or INSERTs.
To use InnoDB tables, MySQL will need to have been compiled with
InnoDB support (see Chapter 15 , "Installing MySQL," for full details),
such as the mysqld-max distribution. There are also a number of
configuration parameters that you should set up before you can rely on this
table type for good performance, so be sure to read Chapter 13 ,
"Configuring and Optimizing MySQL," for more details.
When you start MySQL with InnoDB options compiled, and use only the
defaults, you'll see something like the following:
C:\MySQL\bin> mysqld-max
InnoDB: The first specified data file .\ibdata1 did not exist:
InnoDB: a new database to be created!
InnoDB: Setting file .\ibdata1 size to 64 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Log file .\ib_logfile0 did not exist: new to be created
InnoDB: Setting log file .\ib_logfile0 size to 5 MB
InnoDB: Log file .\ib_logfile1 did not exist: new to be created
InnoDB: Setting log file .\ib_logfile1 size to 5 MB
InnoDB: Doublewrite buffer not found: creating new
InnoDB: Doublewrite buffer created
InnoDB: Creating foreign key constraint system tables
InnoDB: Foreign key constraint system tables created
020504 12:42:52 InnoDB: Started
C:\MYSQL\BIN\MYSQLD~2.EXE: ready for connections
Warning
Before version 4, you could not just start MySQL like this. You had to set at
least innodb_data_file_path in the configuration file. This config file is
discussed more fully in Chapter 13.
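For example, one possible line in the configuration file looks like this (the 64MB size mirrors the default shown above; the exact value is up to you):

[mysqld]
innodb_data_file_path = ibdata1:64M:autoextend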
InnoDB tables are different from MyISAM tables in that the databases are
not stored in a directory, with the tables as files. All tables and indexes are
stored in an InnoDB tablespace (which can consist of one or more files; in
the previous example it's ibdata1).
Table size is therefore not limited by the operating system's file size limit.
BDB Tables
BDB stands for Berkeley Database (it was originally created at the
University of California, Berkeley). It is also a transaction-capable table
type. As with InnoDB tables, BDB support needs to be compiled into
MySQL to make it work (the mysql-max distribution comes with BDB
support).
Summary
Choosing the types for your fields is important if you want to get the best
performance out of MySQL. Numeric types allow you to do calculations
and are usually smaller than string types. Date types allow you to easily
store date and time data.
HEAP tables are the fastest of the lot and are stored in memory. InnoDB
and BDB tables are transaction safe, allowing statements to be grouped for
integrity of data. InnoDB tables make use of consistent reads, which means
results of tables are displayed as they appear after a completed transaction.
This isn't always ideal, and you can override this behavior with read locks
for updating or for sharing.
Operators
Operators are the building blocks of complex queries. Logical operators
(such as AND and OR) allow you to relate a number of conditions in various
ways. Arithmetic operators (such as + or *) allow you to perform basic
mathematical operations in your queries. Comparison operators (such as >
or <) allow you to compare values, and narrow result sets in this way.
Finally, bit operators, while not often used, allow you to work at a bit level
in your queries.
Logical Operators
Logical operators reduce to either true (1) or false (0). For example, if you
are male, and I ask whether you're male OR female (assume I'm asking for a
yes/no answer), the answer would be yes, or true. If I ask whether you're
male AND female, the answer would be no, or false. The AND and OR in the
questions are logical operators. Table 3.1 describes the operators in more
detail.
Table 3.1: Logical Operators

Operator   Syntax                 Description
AND, &&    c1 AND c2, c1 && c2    Only true if both conditions c1 and c2 are true.
OR, ||     c1 OR c2, c1 || c2     True if either c1 or c2 is true.
NOT, !     NOT c1, ! c1           True if c1 is false, false if c1 is true.
Instead of populating a table and running queries against this, the following
examples will produce either a 1 or a 0. Each row in the tables you query
will also reduce to a 1 or a 0. 1s will be returned, and 0s will not be. If you
understand this, you can apply the principles to any of your own tables. If
you're going through these operators for the first time, see if you can predict
the results based on Table 3.1 .
mysql> SELECT 1 AND 0;
+---------+
| 1 AND 0 |
+---------+
| 0 |
+---------+
Arithmetic Operators
Arithmetic operators are used to perform basic mathematical operations.
For example, when I say that 2 + 3 = 5, the plus sign (+) is an arithmetic
operator. Table 3.2 describes the arithmetic operators available in MySQL.
For example, adding together two columns of type INT will produce an INT:
This returns 3.5, not 0.5, because the division is performed first (remember
the rules of precedence you learned at school?). However, you should
always use parentheses to make clear which operations are to be performed
first to someone who doesn't know the rules. To make the previous query
clearer, rewrite it with parentheses, as follows:
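The original query isn't reproduced here; an illustrative pair, with assumed numbers, showing the same precedence point:

mysql> SELECT 1+5/2;    # the division happens before the addition: 3.50
mysql> SELECT 1+(5/2);  # the same result, but the intent is now explicit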
Note
Even though all the values in this query are integers, because the result has
a non-integer element, it is returned as a floating-point number.
This next example demonstrates the modulus operator:
mysql> SELECT 5 % 3;
+-------+
| 5 % 3 |
+-------+
| 2 |
+-------+
Comparison Operators
Comparison operators are used when making comparisons between values.
For example, I can make the statement that 34 is greater than 2. The is
greater than part is a comparison operator. Table 3.3 lists and describes the
comparison operators used in MySQL.
Table 3.3: Comparison Operators

Operator               Syntax                          Description
>=                     a >= b                          True if a is greater than or equal to b.
<=                     a <= b                          True if a is less than or equal to b.
<=>                    a <=> b                         NULL-safe equality: true if a equals b, even when both are NULL.
IS NULL                a IS NULL                       True if a is NULL.
IS NOT NULL            a IS NOT NULL                   True if a is not NULL.
BETWEEN                a BETWEEN b AND c               True if a falls between b and c (inclusive).
NOT BETWEEN            a NOT BETWEEN b AND c           True if a does not fall between b and c.
LIKE                   a LIKE b                        True if a matches b, using the % and _ wildcards.
IN                     a IN (b1, b2, b3…)              True if a is equal to anything in the list.
NOT IN                 a NOT IN (b1, b2, b3…)          True if a is not equal to anything in the list.
REGEXP, RLIKE          a REGEXP b, a RLIKE b           True if a matches b with a regular expression.
NOT REGEXP, NOT RLIKE  a NOT REGEXP b, a NOT RLIKE b   True if a does not match b with a regular expression.
mysql> SELECT 200 = NULL, 200 <> NULL, 200 < NULL, 200 > NULL;
+------------+-------------+------------+------------+
| 200 = NULL | 200 <> NULL | 200 < NULL | 200 > NULL |
+------------+-------------+------------+------------+
|       NULL |        NULL |       NULL |       NULL |
+------------+-------------+------------+------------+
You need to use the IS NULL (or IS NOT NULL) comparison instead:
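For example:

mysql> SELECT 200 IS NULL, 200 IS NOT NULL;
+-------------+-----------------+
| 200 IS NULL | 200 IS NOT NULL |
+-------------+-----------------+
|           0 |               1 |
+-------------+-----------------+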
Warning
MySQL does not sort the two values of a BETWEEN, so if you get the order
wrong, the results for all rows will be false. Make sure that the first number
is the lower number.
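For example:

mysql> SELECT 5 BETWEEN 2 AND 10;   # 1
mysql> SELECT 5 BETWEEN 10 AND 2;   # 0: the lower bound must come first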
Because a appears earlier in the alphabet than b, the result of the following
is true. String comparisons are performed from left to right, one character at
a time:
In the following example, the first b is less than or equal to b; however, the
second b on the left has nothing to compare against (there is no second
character in the right-hand string), so the comparison is false:
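The original statements aren't shown; an illustrative pair making the same point:

mysql> SELECT 'a' < 'b';    # 1
mysql> SELECT 'bb' <= 'b';  # 0: the first characters match, but there is nothing on the right for the second b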
The IN() function can be used to test one value against a number of
possible values. The field can match any of the comma-separated values
listed inside the parentheses, as demonstrated in the next example:
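An illustrative example, with assumed values:

mysql> SELECT 'Clegg' IN ('Powers', 'Mankunku', 'Clegg');   # 1
mysql> SELECT 'Clegg' IN ('Powers', 'Mankunku');            # 0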
Regular Expressions
Regular expressions allow you to perform complex comparisons in
MySQL. Hearing the term regular expressions, for many people, is like
mentioning the plague to a medieval doctor. Immediately the face frowns,
the excuses are prepared, and all hope is lost. And it can be complicated—
entire books have been written on the topic. But using them in MySQL is
not difficult, and they add a useful flexibility to comparisons in MySQL.
Table 3.5 describes the regular expression operators available in MySQL.
To get the equivalent with LIKE, you'd have had to use % wildcards at the
end:
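An illustrative pair, with an assumed string:

mysql> SELECT 'Johannesburg' REGEXP 'Jo';   # 1: the pattern can match anywhere in the string
mysql> SELECT 'Johannesburg' LIKE 'Jo%';    # 1: LIKE needs the % wildcard for the same effect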
However, the following does not match, as the plus sign (+) indicates that g
had to appear one or more times:
We could also try using the asterisk to match against the name ian or the
alternative spelling, iain. Any other letter after a should cause the
match to fail. For example:
But the problem is that this also matches 'iaiiiin', as the asterisk matches any
number of characters, as follows:
To correct this, you'd have to limit the match on the 'i' to either zero or one.
Changing the asterisk to a question mark character achieves this. It still
matches 'ian' and 'iain', but not 'iaiiin', as follows:
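The original patterns aren't reproduced here; a plausible sketch using anchored patterns:

mysql> SELECT 'ian' REGEXP '^iai*n$';      # 1
mysql> SELECT 'iain' REGEXP '^iai*n$';     # 1
mysql> SELECT 'iaiiiin' REGEXP '^iai*n$';  # 1: the asterisk allows any number of i's
mysql> SELECT 'iaiiin' REGEXP '^iai?n$';   # 0
mysql> SELECT 'iain' REGEXP '^iai?n$';     # 1: the question mark allows zero or one i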
The following matches because {3,} means the a must occur at least three
times:
At first glance, you may think the following should not match because the a
matches four times, but {3} means it should match exactly three times. It
does, however, match three times, as well as twice, once, and four times:
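For example (illustrative values):

mysql> SELECT 'aaaa' REGEXP 'a{3,}';  # 1: at least three a's
mysql> SELECT 'aaaa' REGEXP 'a{3}';   # 1: an unanchored a{3} still finds three a's within the string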
The caret (^) anchors the start, and the dollar sign ($) the end; omitting
either would cause the match to succeed.
The following match fails because the {3} only refers to the c, not the
entire abc:
Note the difference between curly braces and square brackets in the next
example. The curly braces group abc as a whole, and the square brackets
would have allowed any of a, or b, or c to match, allowing a whole range of
other possibilities, such as the following:
The following uses parentheses to achieve the same effect, with the vertical
bar character (|) being used to group alternate substrings:
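The original statements aren't shown; a sketch of the distinctions being described:

mysql> SELECT 'abcabcabc' REGEXP '^abc{3}$';    # 0: {3} applies only to the c
mysql> SELECT 'abccc' REGEXP '^abc{3}$';        # 1
mysql> SELECT 'abcabcabc' REGEXP '^[abc]{3}$';  # 0: the brackets match three single characters
mysql> SELECT 'cba' REGEXP '^[abc]{3}$';        # 1: any three characters from the set
mysql> SELECT 'abcabcabc' REGEXP '^(abc){3}$';  # 1: the parentheses group abc as a whole
mysql> SELECT 'cba' REGEXP '^(a|b|c){3}$';      # 1: equivalent to the square brackets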
Bit Operators
To understand how bit operations work, you'll need to know a little bit
about binary numbers and binary arithmetic. These kinds of queries
aren't often used, but any self-respecting "guru2be" needs to have them as
part of their repertoire. Table 3.6 describes the bit operators.
The ordinary number system, called the decimal number system , works off
a base 10. You have 10 fingers, after all, so it makes sense. You count from
zero to nine, and then when you hit 10, you move to the "tens" column, and
start at zero again:
00 01 02 03 04 05 06 07 08 09 10
The decimal number system has 10 digits, from zero to nine. But people
working with computers have often found it useful to work with a number
system based on two digits, zero and one. These represent the two states of
an electrical connection, on and off:
00 01 10 11
Instead of moving to the "tens" column when you run out of digits (in
decimal, after 9 comes 10), you move into the "twos" column (in binary,
after 1 comes 10, which is pronounced "one-zero" to avoid confusion with
the decimal number).
For example, the decimal number 2421 breaks down by column as:
2 * 1000 + 4 * 100 + 2 * 10 + 1 * 1
If you can follow all that (imagining that you are a child learning to count in
decimal helps), you'll see how to apply the same concepts to binary
numbers.
In binary, columns increase in size by powers of 2, as shown in Figure 3.3 .
So, converting binary numbers to decimal is easy enough, but how about
the reverse, converting decimal numbers to binary? It's equally simple. To
convert the number 18 to binary, start with Figure 3.4 .
Starting on the left, there are clearly no 64s in 18, and no 32s. There is,
however, one 16 in 18. So you write a 1 in the 16 column, as shown in
Figure 3.5 .
You've accounted for 16 of your 18, so you now subtract 16 from 18,
leaving you with 2. Continuing to the right, there are no 8s, no 4s, and one
2, in 2. And since 2 minus 2 is 0, you stop once you've written the 1 in the
two column, as shown in Figure 3.6 .
Figure 3.6: Step 3, Converting decimal to binary
In binary, 18 is then 10010. With larger numbers, you just use more
columns to the left (representing 128s, 256s, and so on). Binary numbers
can get very long very quickly, which is why you don't usually store
numbers that way. Octal (base 8) and hexadecimal (base 16) are two other
number systems that are more convenient.
Let's get back to the bit operators; take two numbers, 9 and 7. In binary they
are 1001 and 111, respectively. The bit operators work on the individual
bits of the binary numbers that make up the numbers 9 and 7.
For a bitwise AND operation, both bits need to be 1 for the result to be 1 (just
like an ordinary AND). Figure 3.7 shows the two binary numbers.
Note
A bitwise AND is the same no matter which way around you do it—in other
words, 9&7 is the same as 7&9.
For a bitwise OR, either digit should be 1 for the result to be 1. So, Figure
3.8 shows a bitwise OR performed on the same 9 and 7.
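For example:

mysql> SELECT 9 & 7;   # 1: binary 1001 AND 0111 = 0001
mysql> SELECT 9 | 7;   # 15: binary 1001 OR 0111 = 1111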
The << is the left shift operator, so a << b means that the bits of a are shifted
left by b columns. For example, consider 2 << 1. In binary, 2 is 10. If this is
shifted left 1 bit, you get 100, which is 4. For example:
Now you have 15, which is 1111; when shifted 4 bits left, you get
11110000. Convert this to decimal in the usual way, as in Figure 3.9 .
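The statements themselves aren't shown above; for example:

mysql> SELECT 2 << 1;    # 4: binary 10 becomes 100
mysql> SELECT 15 << 4;   # 240: binary 1111 becomes 11110000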
The >> is the right shift operator, so a >> b shifts the bits of a right by b
columns. Bits shifted beyond the "ones" column are lost. And, again,
shifting by a negative number returns 0.
For example:
In binary, 3 is 11, shifted right by 1 with 1 floating past the ones column (or
1.1 if you'd like, although there is no decimal point in binary notation).
Because you're dealing with integers, the numbers to the right of the
"decimal point" are dropped (perhaps we should call it the binary point, but
there's probably a Hollywood movie coming out by that name), and you're
left with 1 (in both binary and decimal). For example:
This one shifts too far to the right, losing all the bits.
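The queries aren't reproduced here; an illustrative pair (the second value is an assumption):

mysql> SELECT 3 >> 1;   # 1: binary 11 shifted right one place leaves 1
mysql> SELECT 3 >> 3;   # 0: all the bits have been shifted away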
Advanced Joins
You've already looked at a basic two-table join in Chapter 1 . But joins can
get much more complicated than that, and badly written joins are the
culprits in the majority of serious performance problems.
Let's return to the tables created in the previous chapter . If you skipped that
chapter, you can re-create them by running the following statements:
) TYPE=MyISAM;
) TYPE=MyISAM;
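Those CREATE statements are truncated above; a plausible minimal sketch, assuming only the columns used in this chapter's queries (the code column in sales is an assumption):

mysql> CREATE TABLE customer (
         id INT,
         first_name VARCHAR(30),
         surname VARCHAR(40)
       ) TYPE=MyISAM;

mysql> CREATE TABLE sales (
         code INT,
         sales_rep INT,
         customer INT,
         value INT
       ) TYPE=MyISAM;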
INSERT INTO sales_rep VALUES (1, 'Rive', 'Sol',
10, '2000-02-15', '1976-03-18');
+-----------+----------+-------+------------+---------+
| sales_rep | customer | value | first_name | surname |
+-----------+----------+-------+------------+---------+
|         1 |        1 |  2000 | Sol        | Rive    |
+-----------+----------+-------+------------+---------+
To do a more complex join over all three tables is not much more
complicated. If you wanted to return the first names and surnames of both
the sales rep and the customer, as well as the value of the sale, you'd use
this query:
mysql> SELECT sales_rep.first_name, sales_rep.surname, value,
       customer.first_name, customer.surname
       FROM sales, sales_rep, customer
       WHERE sales_rep.employee_number = sales.sales_rep
       AND customer.id = sales.customer;
+------------+----------+-------+------------+-------------+
| first_name | surname  | value | first_name | surname     |
+------------+----------+-------+------------+-------------+
| Sol        | Rive     |  2000 | Yvonne     | Clegg       |
| Mike       | Serote   |  3800 | Yvonne     | Clegg       |
| Sol        | Rive     |   500 | Johnny     | Chaka-Chaka |
| Charlene   | Gordimer |   500 | Winston    | Powers      |
| Mongane    | Rive     |   250 | Winston    | Powers      |
| Sol        | Rive     |   450 | Patricia   | Mankunku    |
+------------+----------+-------+------------+-------------+
Inner Joins
Inner joins are just another way of describing the first kind of join you
learned. The following two queries are identical:
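The statements themselves aren't reproduced here; a plausible sketch of the two equivalent forms (comma join versus INNER JOIN), followed by a hypothetical insert of a sale with no associated customer, which the discussion below assumes has been made:

mysql> SELECT sales_rep.first_name, sales_rep.surname, value,
       customer.first_name, customer.surname
       FROM sales, sales_rep, customer
       WHERE sales_rep.employee_number = sales.sales_rep
       AND customer.id = sales.customer;

mysql> SELECT sales_rep.first_name, sales_rep.surname, value,
       customer.first_name, customer.surname
       FROM sales INNER JOIN sales_rep
       ON sales_rep.employee_number = sales.sales_rep
       INNER JOIN customer ON customer.id = sales.customer;

mysql> INSERT INTO sales (sales_rep, customer, value) VALUES (2, NULL, 670);

Rerunning either join after the insert produces the result below: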
+------------+----------+-------+------------+-------------+
| first_name | surname  | value | first_name | surname     |
+------------+----------+-------+------------+-------------+
| Sol        | Rive     |  2000 | Yvonne     | Clegg       |
| Mike       | Serote   |  3800 | Yvonne     | Clegg       |
| Sol        | Rive     |   500 | Johnny     | Chaka-Chaka |
| Charlene   | Gordimer |   500 | Winston    | Powers      |
| Mongane    | Rive     |   250 | Winston    | Powers      |
| Sol        | Rive     |   450 | Patricia   | Mankunku    |
+------------+----------+-------+------------+-------------+
What's going on? Where is the new sale? The problem here is that, because
the customer is NULL in the sales table, the join condition is not fulfilled.
Remember, when you looked at the operators earlier in this chapter, you
saw that the = operator excludes NULL values. The <=> operator won't help
because there are no NULL records in the customer table, so even a null-
friendly equality check won't help.
The solution here is to do an OUTER JOIN. These return a result for each
matching result of the one table, whether or not there is an associated record
in the other table. So even though the customer field is NULL in the sales
table and has no relation with the customer table, a record will be returned.
A LEFT OUTER JOIN is one which returns all matching rows from the
left table, regardless of whether there is a corresponding row in the right
table. The syntax for a LEFT JOIN (a shorter name for a LEFT OUTER
JOIN) is as follows:
ON id=customer;
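Only the ON clause survives above; a plausible full form, assuming sales is the left table so that its row with no associated customer is preserved (as the output below shows):

mysql> SELECT first_name, surname, value
       FROM sales LEFT JOIN customer
       ON id=customer;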
+------------+-------------+-------+
| first_name | surname     | value |
+------------+-------------+-------+
| Yvonne     | Clegg       |  2000 |
| Winston    | Powers      |   250 |
| Winston    | Powers      |   500 |
| Patricia   | Mankunku    |   450 |
| Yvonne     | Clegg       |  3800 |
| Johnny     | Chaka-Chaka |   500 |
| NULL       | NULL        |   670 |
+------------+-------------+-------+
Table order in a LEFT JOIN is important. The table from which all
matching rows are returned must be the left table (before the LEFT JOIN
keywords). If you'd reversed the order and tried the following:
ON id=customer;
+------------+-------------+-------+
| first_name | surname     | value |
+------------+-------------+-------+
| Yvonne     | Clegg       |  2000 |
| Yvonne     | Clegg       |  3800 |
| Johnny     | Chaka-Chaka |   500 |
| Winston    | Powers      |   250 |
| Winston    | Powers      |   500 |
| Patricia   | Mankunku    |   450 |
+------------+-------------+-------+
then you'd have seen only the six records. Because the left table is the
customer table in this query, and the join matches only those records that
exist in the left table, the sales record with the NULL customer (meaning
there is no relation to the customer table) is not returned.
Note
A LEFT JOIN was more frequently called a LEFT OUTER JOIN in the
past. For the sake of familiarity, MySQL accepts this term, too.
Of course, you can extend this across a third table to answer the original
query (names of customers and sales reps, as well as sales values, for each
sale). See if you can do it. This is my suggestion:
+------------+----------+-------+------------+-------------+
| first_name | surname  | value | first_name | surname     |
+------------+----------+-------+------------+-------------+
| Sol        | Rive     |  2000 | Yvonne     | Clegg       |
| Mongane    | Rive     |   250 | Winston    | Powers      |
| Charlene   | Gordimer |   500 | Winston    | Powers      |
| Sol        | Rive     |   450 | Patricia   | Mankunku    |
| Mike       | Serote   |  3800 | Yvonne     | Clegg       |
| Sol        | Rive     |   500 | Johnny     | Chaka-Chaka |
| Charlene   | Gordimer |   670 | NULL       | NULL        |
+------------+----------+-------+------------+-------------+
Tip
If you get confused by which table to put on which side for left and right
joins, remember that a right join reads all records from the right table,
including nulls, while a left join reads all records from the left table,
including nulls.
There is only one field that is identical in both tables, but if there were
more, each of these would become part of the join condition.
A NATURAL JOIN can also be a LEFT or RIGHT JOIN. The following
two statements are identical:
mysql> SELECT first_name,surname,value FROM
customer LEFT JOIN sales
ON customer.id=sales.id;
+------------+-------------+-------+
| first_name | surname     | value |
+------------+-------------+-------+
| Yvonne     | Clegg       |  2000 |
| Yvonne     | Clegg       |  3800 |
| Johnny     | Chaka-Chaka |   500 |
| Winston    | Powers      |   250 |
| Winston    | Powers      |   500 |
| Patricia   | Mankunku    |   450 |
+------------+-------------+-------+
The USING keyword allows a bit more control than a NATURAL JOIN. If
there is more than one identical field in the two tables, this keyword allows
you to specify which of these fields are used as join conditions. For
example, taking two tables A and B with identical fields a, b, c, d, the
following are equivalent:
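For example (using the abstract tables A and B from the text):

mysql> SELECT * FROM A LEFT JOIN B USING (a);
mysql> SELECT * FROM A LEFT JOIN B ON A.a = B.a;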
You use the DISTINCT keyword to avoid duplicates because there are sales
reps who have made more than one sale.
But the reverse is as useful. The boss is raging and heads must roll. Which
sales reps have not made any sales? You can find this information by seeing
which sales reps appear in the sales_rep table but do not have a
corresponding entry in the sales table:
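The query isn't reproduced here; a sketch of the usual pattern for this:

mysql> SELECT sales_rep.first_name, sales_rep.surname
       FROM sales_rep LEFT JOIN sales
       ON sales_rep.employee_number = sales.sales_rep
       WHERE sales.sales_rep IS NULL;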
To see the use of this statement, let's create another table, containing a list
of customers handed over from the previous owner of your store:
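A minimal sketch, assuming the new table is called old_customer and mirrors the customer columns:

mysql> CREATE TABLE old_customer (
         id INT,
         first_name VARCHAR(30),
         surname VARCHAR(40)
       ) TYPE=MyISAM;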
Now, to get a list of all customers, both old and new, you can use the
following:
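For example, under the same assumption about the old_customer table:

mysql> SELECT first_name, surname FROM customer
       UNION
       SELECT first_name, surname FROM old_customer;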
You can also order the output as normal. You just need to be careful about
whether the ORDER BY clause applies to the entire UNION or just to the
one SELECT:
The sorting is performed on the entire UNION. If you just want to sort the
second SELECT, you'd need to use parentheses:
+------+------------+-------------+
| id   | first_name | surname     |
+------+------------+-------------+
| 5432 | Thulani    | Salie       |
| 2342 | Shahiem    | Papo        |
|    2 | Johnny     | Chaka-Chaka |
|    1 | Yvonne     | Clegg       |
|    4 | Patricia   | Mankunku    |
|    3 | Winston    | Powers      |
+------+------------+-------------+
Tip
Whenever there's possible ambiguity, such as where the sorting applies, use
parentheses. It ensures the sort is applied to the correct part and also means
that anyone else trying to interpret your statements will have an easier time.
Don't assume everyone else knows as much as you!
UNION requires some thought. You can quite easily put together unrelated
fields as long as the number of fields returned by each SELECT matches and
the data types are the same. MySQL will happily return these results to you,
even though they are meaningless:
mysql> SELECT id, surname FROM customer UNION ALL
SELECT value, sales_rep FROM sales;
+------+-------------+
| id | surname |
+------+-------------+
| 1 | Clegg |
| 2 | Chaka-Chaka |
| 3 | Powers |
|    4 | Mankunku    |
| 2000 | 1           |
|  250 | 4           |
|  500 | 2           |
|  450 | 1           |
| 3800 | 3           |
|  500 | 1           |
|  670 | 2           |
+------+-------------+
Sub-selects
Many queries make use of a SELECT within a SELECT. Sub-selects are
scheduled for implementation in version 4.1. Until now, MySQL did not
allow sub-selects, partly by design (they are often less efficient than
alternatives, as you'll see later) and partly because it was low on a list of
1,001 other "vital" things to implement. With MySQL about to implement
them, you'll need to see how they work.
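The query producing the result below isn't reproduced here; a hedged sketch, assuming you want the sales rep who made the sale worth 2000:

mysql> SELECT first_name, surname FROM sales_rep
       WHERE employee_number = (SELECT sales_rep FROM sales WHERE value = 2000);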
+------------+---------+
| first_name | surname |
+------------+---------+
| Sol | Rive |
+------------+---------+
But you already know another, better way of doing this, which is the join:
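Under the same assumption, the join version would look something like this:

mysql> SELECT first_name, surname FROM sales_rep, sales
       WHERE sales_rep.employee_number = sales.sales_rep AND value = 2000;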
The reason I say better is that often the join is a more efficient way of doing
the query and will return the results quicker. It may not make much
difference on a tiny database, but in large, heavily used tables where
performance is vital, you'll want every extra microsecond you can get out of
MySQL.
To return all the sales reps who have not yet made a sale, you could again
use a sub-select, if your DBMS allows it, as follows:
mysql> SELECT first_name,surname FROM sales_rep
       WHERE employee_number NOT IN (SELECT sales_rep FROM sales);
Notice the difference between the outputs of the two statements. DELETE
informs you how many rows have been removed, but TRUNCATE
doesn't; TRUNCATE just removes the lot without counting them. It actually
does this by dropping and re-creating the table.
User Variables
MySQL has a feature that allows you to store values as temporary variables,
which you can use again in a later statement. In the vast majority of cases
you'd use a programming language to do this sort of thing (see Chapter 5 ,
"Programming with MySQL"), but MySQL variables can be useful when
working on the MySQL command line.
User variables are set in a particular thread (or connection to a server) and
cannot be accessed by another thread. They are unset when the thread is
closed or the connection lost.
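For example, you might set and read a variable like this (the variable name is just an illustration):

mysql> SET @target = 1000;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT @target;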
If you close the connection and reconnect from Window1, MySQL will
have cleared the variable, as follows, from Window1:
mysql> exit
% mysql firstdb
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 14 to server
version: 4.0.1-alpha-max
Similarly, a user variable set in the field list cannot be used as a condition.
The following will not work because the user variable has not been set in
time for the condition:

mysql> SELECT @d:=2000,value FROM sales WHERE value>@d;
Empty set (0.00 sec)

You would have had to set the variable specifically before the query, as follows:

mysql> SET @d=2000;
Query OK, 0 rows affected (0.00 sec)
You can also set a variable in the WHERE clause itself. Be aware then that it
will not be correctly reflected in the field list unless you reset the variable
again! For example:
To reflect this correctly, you'd have to set the variable again in the field list:
This is not an elegant way of implementing user variables; instead, set them
separately beforehand.
Warning
Remember that user variables are set for the duration of the thread. You
may not get the results you expect if you forget to initialize a user variable.
If any of the lines in the file contains a SQL error, MySQL will
immediately stop processing the rest of the file. Change test.sql to the
following. You add the DELETE statement at the top so that if you rerun the
set of statements a number of times, you won't be stuck with any duplicate
records:
When you run this from the command line, you'll see MySQL returns an
error:
% mysql firstdb < test.sql
ERROR 1064 at line 2: You have an error in your
SQL syntax near ''Sandile','Cohen')' at line 1
If you look at what the customer table contains now, you'll see that the first
record has been correctly inserted, but because the second line contains an
error (the id field is not specified), MySQL stopped processing at that point:
You can force MySQL to continue processing even if there are errors with
the force option (see Chapter 2, "Data Types and Table Types," for a full
list of MySQL options):

% mysql -f firstdb < test.sql
ERROR 1064 at line 2: You have an error in your
SQL syntax near ''Sandile','Cohen')' at line 1
Even though the error is still reported, all the valid records have still been
inserted, as you can see if you view the table again:
id      first_name      surname
1       Yvonne          Clegg
2       Johnny          Chaka-Chaka
3       Winston         Powers
4       Patricia        Mankunku
5       Francois        Papo
7       Winnie          Dlamini
6       Neil            Beneke
Notice that the output is not exactly the same as it would be if you were
running in interactive mode. The data is tab delimited, and there are no
formatting lines around them. If you did want the interactive format in the
output file, you could use the table option,-t , for example:
+------+------------+-------------+
| id   | first_name | surname     |
+------+------------+-------------+
|    1 | Yvonne     | Clegg       |
|    2 | Johnny     | Chaka-Chaka |
|    3 | Winston    | Powers      |
|    4 | Patricia   | Mankunku    |
|    5 | Francois   | Papo        |
|    7 | Winnie     | Dlamini     |
|    6 | Neil       | Beneke      |
+------+------------+-------------+
7 rows in set (0.00 sec)
You can delete the records added through the text files, as you will not need
them later:

mysql> DELETE FROM customer WHERE id > 4;

Reasons for using batch mode include the following:
You can reuse SQL statements you need again.
You can copy and send files to other people.
It's easy to make changes to a file if there are any errors.
Sometimes you have to run in batch mode, such as when you want to run
certain SQL commands repeatedly at a certain time each day (for example,
with Unix's cron ).
This is all fine and well, but what happens if something goes wrong, and the
system crashes after the first query is completed, but before the second one
is complete? Person1 will have the money removed from their account, and
believe the payment has gone through, but person2 will be irate, believing
the payment was never made. In this sort of case, it's vital that either both
queries are processed together, or neither at all. To do this, you wrap the
queries together in what is called a transaction, with a BEGIN statement to
indicate the start of the transaction, and a COMMIT statement to indicate the
end. Only when the COMMIT is processed will all queries be made
permanent. If something goes wrong in between, you can use
the ROLLBACK command to reverse the incomplete part of the transaction.
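A minimal sketch of such a transaction, assuming a hypothetical account table with balance and person columns:

mysql> BEGIN;
mysql> UPDATE account SET balance = balance - 500 WHERE person = 'person1';
mysql> UPDATE account SET balance = balance + 500 WHERE person = 'person2';
mysql> COMMIT;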
Let's run some queries to see how this works. You'll have to create the table
only if you haven't already done so in Chapter 2 :
mysql> CREATE TABLE innotest (f1 INT, f2 CHAR(10), INDEX (f1)) TYPE=InnoDB;
Query OK, 0 rows affected (0.10 sec)

mysql> INSERT INTO innotest(f1) VALUES(1);
Query OK, 1 row affected (0.00 sec)
If you now do a ROLLBACK, you will undo this transaction, as it has not yet
been committed:
mysql> ROLLBACK;
Query OK, 0 rows affected (0.00 sec)
You can repeat the previous statement, this time doing a COMMIT before
you exit. Once the COMMIT is run, the transaction is complete, so when you
reconnect, the new record will be present:
mysql> BEGIN;
Query OK, 0 rows affected (0.05 sec)
mysql> INSERT INTO innotest(f1) VALUES(2);
Query OK, 1 row affected (0.06 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.05 sec)
mysql> EXIT
Bye
C:\Program Files\MySQL\bin> mysql firstdb
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 9 to server
version: 4.0.1-alpha-max
Consistent Reads
By default, InnoDB tables perform a consistent read. What this means is
that when a SELECT is performed, MySQL returns the values present in the
database up until the most recently completed transaction. If any
transactions are still in progress, any UPDATE or INSERT statements will
not be reflected. Before you disagree, there is one exception: The open
transaction itself can see the changes (you probably noticed that when you
did the BEGIN-INSERT-SELECT, the inserted result was displayed). To
demonstrate this, you need to have two windows open and be connected to
the database.
mysql> BEGIN;
mysql> SELECT MAX(f1) FROM innotest;
At the same time, another user is doing the same in Window2:

mysql> BEGIN;

Now, both users (Window1 and Window2) add a new record and commit
their transaction:

mysql> INSERT INTO innotest(f1) VALUES(4);
Query OK, 1 row affected (0.11 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)
Now if either user does a SELECT, they'll see the following:
The consistent read has not produced what you'd hoped for: records with
values 4 and 5. The way to avoid this is with an update lock on the SELECT.
By letting MySQL know you are reading in order to update, it will not let
anyone else read that value until your transaction is finished.
First, let's remove the incorrect 4 from the table, so you can do it properly
this time:

mysql> DELETE FROM innotest WHERE f1=4;
Query OK, 2 rows affected (0.00 sec)

Now, set the update lock as follows in Window1:

mysql> BEGIN;
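The locking read itself isn't reproduced here; a SELECT with FOR UPDATE is the usual way to take such a lock, for example:

mysql> SELECT MAX(f1) FROM innotest FOR UPDATE;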
Now, safe in the knowledge that 4 is the latest value, you can add 5 to the
table in Window2:
mysql> INSERT INTO innotest(f1) VALUES(5);
Query OK, 1 row affected (0.06 sec)
mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO innotest(f1) VALUES(6);
Query OK, 1 row affected (0.00 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)
Automatic COMMITs
By default, unless you specify a transaction with BEGIN, MySQL
automatically commits statements. For example, a query in Window1
returns the following:
However, AUTOCOMMIT=0 does not set this across the entire server, only
for that particular session. If Window2 also sets AUTOCOMMIT to 0, you'll
experience different behavior.
First, set AUTOCOMMIT in both Window1 and Window2:

mysql> SET AUTOCOMMIT=0;
Query OK, 0 rows affected (0.00 sec)
Now, run the following in Window1 to see what's present:
The 9 from the new record does not appear, even though you have
committed the results! The reason is that the SELECT in Window1 is also
part of a transaction. The consistent read has been assigned a timepoint, and
this timepoint only moves forward when the transaction it was set in is
completed.
As you saw before, the only way to see the latest results is to SELECT with
a LOCK IN SHARE MODE. This would have waited until the transaction
doing the inserting had done a COMMIT.
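For example, such a query would look like this:

mysql> SELECT MAX(f1) FROM innotest LOCK IN SHARE MODE;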
Note the long period of time the query took. The fact that there is not a
"quick" SELECT in BDB tables means that any transactions that are
delayed could have serious performance problems.
A query run from Window2 will wait while the transaction is active:
mysql> SELECT f1 FROM bdbtest;
Only when the transaction is committed will the result appear.
Commit the transaction in Window1:
mysql> COMMIT;
Query OK, 0 rows affected (0.05 sec)
And now the query retrieves its results in Window2 (you do not have to
retype the query):
Locking Tables
In the discussions on InnoDB and BDB tables, you've come across the
concept of row-level locking, where individual rows are locked for a period
of time. Row-level locks are much more efficient when the table needs to
perform high volumes of INSERTs or UPDATEs. Row-level locking,
though, is available only to transaction-safe table types (BDB and InnoDB).
MySQL also has table-level locking, which is available to all table types.
There are two kinds of table locks: read locks and write locks. Read locks
mean that only reads may be performed on the table, and writes are locked.
Write locks mean that no reads or writes may be performed on the table for
the duration of the lock. The syntax to lock a table is as follows:
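In its simplest form, the syntax is:

LOCK TABLES tablename {READ | WRITE} [, tablename2 {READ | WRITE} ...];

and the lock is released with:

UNLOCK TABLES;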
If the thread that created the lock tries to add a record to the customer table,
it will fail. It will not wait for the lock to be released (because it created the
lock, if it hung it would never be able to release the lock); rather,
the INSERT just fails. Try this from Window1:
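A sketch of what this looks like (the inserted values are illustrative):

mysql> LOCK TABLES customer READ;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO customer(first_name, surname) VALUES ('Test', 'Case');
ERROR 1099: Table 'customer' was locked with a READ lock and can't be updated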
Note
You can use either the singular or the plural form. Both [UN]LOCK TABLE
and [UN]LOCK TABLES are valid, no matter how many tables you're
locking. MySQL doesn't care about grammar!
Write locks have a higher priority than read locks, so if one thread is
waiting for a read lock, and a request for a write lock comes along, the read
lock has to wait until the write lock has been obtained, and released, before
it obtains its read lock, as follows:
The read lock cannot be obtained until the write lock is released. In the
meantime, another request for a write lock comes along, which also has to
wait until the first write lock is released.
Only when the write lock on Window 3 is released can the read lock from
Window2 be obtained (no need to retype):
You can override this behavior by specifying that the write lock should be a
lower priority, with the LOW_PRIORITY keyword. If you run the previous
example again with a low-priority request for a write lock, the earlier read
lock will be obtained first.
Table locks are mostly used in this way on tables that do not support
transactions. If you're using an InnoDB or a BDB table, use BEGIN
and COMMIT instead to avoid anomalies in your data. The following is an
example of where it could be used. If your customer_sales_values table is
empty, populate it with some records:
Now assume that Johnny Chaka-Chaka has made two sales, both of which
are being processed by different clerks. The one sale is worth $100 and the
other $300. Both clerks go through a process of reading the existing value,
then adding either 100 or 300 to this, and then updating the record. The
problem comes if both perform the SELECT before either is updated. Then
the one update will overwrite the other, and the one value will be lost, as
follows.
This is Window1:
mysql> UPDATE customer_sales_values SET
value=500+100 WHERE first_name='Johnny' and
surname='Chaka-Chaka';
Query OK, 1 row affected (0.01 sec)
This is Window2:
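The second clerk's statement isn't reproduced here; presumably it looks something like this, based on the same stale reading of 500:

mysql> UPDATE customer_sales_values SET value=500+300
       WHERE first_name='Johnny' AND surname='Chaka-Chaka';
Query OK, 1 row affected (0.01 sec)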
After both sales have been captured, the total value of Johnny's sales is
$800, which is $100 short! If you'd taken the care to lock the table, you'd
have avoided the problem.
After you reset the data and start again, run the followingUPDATE :
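The statements themselves aren't reproduced here; a plausible sequence for Window1, which locks the table and reads the current value (shown below) before updating it:

mysql> LOCK TABLES customer_sales_values WRITE;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT value FROM customer_sales_values
       WHERE first_name='Johnny' AND surname='Chaka-Chaka';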
+-------+
| value |
+-------+
| 500 |
+-------+
Window2 obtains the lock (no need to retype), and can complete the rest of
the transaction as follows:
Transaction Levels
You can change the default behavior when dealing with transactions by
setting the transaction isolation level. MySQL supports a number of
transaction isolation levels, including the following:
SERIALIZABLE This level does not allow phantom reads, which is when
another transaction has committed a new row that matches the results of
your query. The data will be the same each time.
To change the transaction level, use the following syntax:
SET [ scope ] TRANSACTION ISOLATION LEVEL
{ isolation_level }
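For example, to make the current session use the strictest level:

mysql> SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;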
Summary
Joins can be much more complicated than the simple two-table joins
covered in Chapter 1, "Quickstart to MySQL." Inner joins ignore NULL
values in a table being joined (or rows where there is no associated
record), and outer joins include NULL data. Left outer joins return all data
in the table specified first (on the left), including those without an
associated record in the right table, while right outer joins return all data in
the table specified on the right of the join. Full outer joins combine the
features of a left and a right join, but MySQL does not yet support this.
Natural joins use the fact that common fields may be named the same, and
simplify the syntax if this is the case.
The UNION command combines the results of more than one query into one.
Sub-selects are queries within queries. Often they perform more efficiently
if they are rewritten as a join.
Deleting records one by one, as with the DELETE statement, is not efficient
if you just want to remove all the records in a table. The TRUNCATE
statement is a quicker way of doing this, though it doesn't return the number
of records deleted, as DELETE does.
User variables allow you to store values for use in a later query. You need to
take care when using them, however, that the user variable is set before it is
required. In SELECT statements, the condition (the WHERE clause) is
performed first, before the field list (immediately after the SELECT and
where user variables are usually set).
MySQL can also be run in batch mode, with SQL statements stored in files
for ease of editing and reuse. You can also redirect the output to a file, so
for example, results of queries can be examined easily at a later stage.
All table types also allow table locking, where the entire table can be
locked, as opposed to just the row as with transaction-safe tables.
In the next chapter, you'll sharpen your skills some more and look at how
you can optimize the performance of your database. You'll explore creating
indexes, writing queries more efficiently, and improving the server's
performance.
+------+------------+-------------+
| id   | first_name | surname     |
+------+------------+-------------+
|    1 | Yvonne     | Clegg       |
|    2 | Johnny     | Chaka-Chaka |
|    3 | Winston    | Powers      |
|    4 | Patricia   | Mankunku    |
|    5 | Francois   | Papo        |
|    7 | Winnie     | Dlamini     |
|    6 | Neil       | Beneke      |
+------+------------+-------------+
Now, imagine you were doing the job of MySQL. If you wanted to return
any records with the surname of Beneke, you'd probably start at the top and
examine each record. Without any further information, there's no way for
you, or MySQL itself, to know where to find records matching this
criterion. Scanning the table in this way (from start to finish, examining all
the records) is called a full table scan. When tables are large, this becomes
inefficient; full table scans of tables consisting of many hundreds of
thousands of records can run very slowly.
To overcome this problem, it would help if the records were sorted. Let's
look for the same record as before, but on a table sorted by surname:
+------+------------+-------------+
| id   | first_name | surname     |
+------+------------+-------------+
|    6 | Neil       | Beneke      |
|    2 | Johnny     | Chaka-Chaka |
|    1 | Yvonne     | Clegg       |
|    7 | Winnie     | Dlamini     |
|    4 | Patricia   | Mankunku    |
|    5 | Francois   | Papo        |
|    3 | Winston    | Powers      |
+------+------------+-------------+
Now you can search this table much more quickly. Because you know the
records are stored alphabetically by surname, you know once you reach the
surname Chaka-Chaka, which begins with a C, that there can be no more
Beneke records. You've only had to examine one record, as opposed to the
seven you would have had to examine in the unordered table. That's quite a
saving, and in a bigger table the benefits would be even greater.
Therefore, it may look like sorting the table is the solution. But,
unfortunately, you may want to search the table in other ways, too. For
example, perhaps you want to return the record with an id of 3. With the
table still ordered by surname, you would have to examine all the records
again, and once more you're stuck with slow, inefficient queries.
The solution is to create separate lists for each field that you need to order.
These don't contain all the fields, just the fields that you need ordered and a
pointer to the complete record in the full table. These lists are called indexes,
and they are one of the most underused and misused aspects of relational
databases (see Figure 4.1). Indexes are stored as separate files in some
cases (MyISAM tables), or as part of the same tablespace in other cases
(InnoDB tables).
The terms index and key are used almost interchangeably; both
denote a physical index. When MySQL indicates that a primary key exists,
there is always an associated index. Throughout this text, the term key
indicates the presence of a physical index.
To create a primary key on an already existing table, you can use the ALTER
keyword:

ALTER TABLE tablename ADD PRIMARY KEY(fieldname1 [,fieldname2...]);
Choosing a primary key for the customer table is fairly easy. The id field
lends itself to this because each customer has a different id and there are no
null fields. Either of the name fields would not be ideal, as there may be
duplicates at some stage. To add a primary key to the id field of the
customer table, you need to change the field to not allow nulls and then add
the primary key. You can do this in one statement, as follows:
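A sketch of such a statement (the INT type is an assumption based on how the id field has been used so far):

mysql> ALTER TABLE customer MODIFY id INT NOT NULL, ADD PRIMARY KEY(id);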
You can see the changes you've made to the table with this statement by
examining the columns:
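For example:

mysql> SHOW COLUMNS FROM customer;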
The id field does not have a YES in the Null column, indicating that it no
longer can accept null values. It also has PRI in the Key column, indicating
the primary key.
Primary keys can also consist of more than one field. Sometimes there is no
one field that can uniquely identify a record. To add a primary key in this
instance, separate the fields with a comma:
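The general form looks like this (field names here are placeholders):

ALTER TABLE tablename ADD PRIMARY KEY(fieldname1, fieldname2);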
Let's assume you add a new record with the same code as an existing
record:
There is no problem so far. Even though there are now two records with a
code of 7, there is nothing in the table structure that disallows this. But now,
with your new knowledge of the reasons for using a primary key, you
decide to make the code field a primary key:
You have a duplicate value for the code field, and by definition a primary
key should always be unique. Here, you would have to either remove or
update the duplicates or use an ordinary index that allows duplicates. Most
tables work better with a primary key, though. In this situation, it's easy to
update the responsible record:
You can always create an index at a later stage, though, with this code:

ALTER TABLE tablename ADD INDEX [indexname]
(fieldname1 [,fieldname2...]);
or with the following code:
mysql> CREATE INDEX indexname on
tablename(fieldname1 [,fieldname2...]);
Both of these statements ask for an index name, although with the CREATE
INDEX statement the index name is mandatory. If, in the ALTER TABLE…
ADD INDEX… statement, you do not name the index, MySQL will assign
its own name based on the fieldname. MySQL takes the first field as the
index name if there is more than one field in the index. If there is a second
index starting with the same field, MySQL appends _2, then _3, and so on
to the index name.
The following sales table has a primary key, but it could also do with an
index on the value field. You may quite often be searching for records with
a value greater or less than a certain amount, or ordering by the value:
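For example:

mysql> ALTER TABLE sales ADD INDEX (value);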
To create a full-text index when the table is created, use this syntax:
CREATE TABLE tablename (fieldname columntype,
fieldname2 columntype, FULLTEXT(fieldname
[,fieldname2...]));
The optional keyword INDEX can be added, as in this syntax:
CREATE TABLE tablename (fieldname columntype,
fieldname2 columntype, FULLTEXT INDEX(fieldname
[,fieldname2...]));
To create a full-text index once the table is already in existence, use this
syntax:

ALTER TABLE tablename ADD FULLTEXT [indexname] (fieldname [,fieldname2...]);
or the following code:
CREATE FULLTEXT INDEX indexname ON
tablename(fieldname [,fieldname2...]);
Let's create a table and try to create full-text indexes on some of the fields,
as follows:
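The CREATE statement isn't reproduced here; a minimal sketch, assuming a single text column f1 as used in the searches that follow:

mysql> CREATE TABLE ft2 (f1 VARCHAR(100), FULLTEXT(f1)) TYPE=MyISAM;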
Barbarians'),
('In the Heart of the Country'),
('The Master of Petersburg'),
('Writing and Being'),
('Heart of the Beast'),
('Heart of the Beest'),
('The Beginning and the End'),
('Master Master'),
('A Barbarian at my Door');
To return the results of a full-text search, you use the MATCH() function,
and MATCH() a field AGAINST() a value, as in this example, which looks
for occurrences of the word Master:
mysql> SELECT * FROM ft2 WHERE MATCH(f1) AGAINST ('Master');
+--------------------------+
| f1 |
+--------------------------+
| Master Master |
| The Master of Petersburg |
+--------------------------+
2 rows in set (0.01 sec)
Noise Words
Now you run another search:
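The search isn't reproduced here; presumably it was something like this, returning nothing:

mysql> SELECT * FROM ft2 WHERE MATCH(f1) AGAINST ('the');
Empty set (0.00 sec)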
These results are not as you may expect. Most of the titles contain the word
the, and The Beginning and the End contains it twice, yet is not reflected.
There are a number of reasons for this:
MySQL has what is called a 50-percent threshold. Any words that appear
in more than 50 percent of the fields are treated as noise words, meaning
that they are ignored.
Any words of three or fewer letters are excluded from the index.
There is a predefined list of noise words, with the included.
So, The Beginning and the End didn't have a chance!
Warning
If you have a table with only one record, all words will be noise words, so a
full-text search would not return anything! Tables with very few records can
also increase the likelihood of words being treated as noise words.
The following query returns nothing, even though the word for does appear
in the data, because for is a word of three or fewer characters, and by default
these are excluded from the index:
Relevance
You are not limited to using the MATCH() function in the WHERE condition;
you can return the results, too, as follows:
MySQL can return the relevance as well as the required fields at no extra
cost in time because the two calls to the MATCH() function are identical:
mysql> SELECT f1,(MATCH(f1) AGAINST ('Master'))
       FROM ft2;
-    The word following it is prohibited and must not be present in any returned row.
<    The word following it has a lower relevance than other words.
>    The word following it has a higher relevance than other words.
( )  Used to group words in subexpressions.
The ~ operator assigns a negative weight to the word that follows it (this is
not the same as the - operator, which excludes the row altogether if the word
is found, or as the < operator, which still assigns a positive, though lower,
relevance to the word).
This result may seem surprising if you compare it to the previous example,
but because the word Dog is three or fewer letters, it is excluded for
purposes of the search.
The next two examples demonstrate the difference between searching for a
whole word and a part of a word (making use of the * operator):
By default only whole words are matched, unless the * operator is used.
The next three examples demonstrate the use of the > and < operators to
increase and decrease the weightings respectively:
+-----------------------------+------+
| f1 | m |
+-----------------------------+------+
| In the Heart of the Country | 1 |
| Heart of the Beast | 2 |
| Heart of the Beest | 2 |
+-----------------------------+------+
3 rows in set (0.00 sec)
mysql> SELECT f1,MATCH(f1) AGAINST ('Heart Beest
>Beast'
+-----------------------------+-----------------+
| f1                          | m               |
+-----------------------------+-----------------+
| In the Heart of the Country | 1               |
| Heart of the Beast          | 2               |
| Heart of the Beest          | 1.6666667461395 |
+-----------------------------+-----------------+
3 rows in set (0.00 sec)
The next five examples demonstrate the difference between the < operator,
which adds a decreased, yet positive, weight to the match; the ~ operator,
which places a negative weight on the match; and the - operator, which
prohibits the match. The first example is a basic Boolean search, with a
weighting of 1 for a match:
Next, the < operator reduces this weighting to roughly 2/3, which is still a
positive weight:

mysql> SELECT *,MATCH(f1) AGAINST ('<Door' IN BOOLEAN MODE)
The ~ operator reduces this weight to a negative, and so, since the result is
less than 0 when matched with A Barbarian at my Door, the row is not
returned:
Finally, this next example shows the difference between the ~ and -
operators, where the - operator prohibits the match when Door is found:

mysql> SELECT *,MATCH(f1) AGAINST ('-Door Barbarian*' IN BOOLEAN MODE)
+--------------------+------------------+
| f1                 | m                |
+--------------------+------------------+
| Heart of the Beast | 1.25             |
| Heart of the Beest | 0.83333337306976 |
+--------------------+------------------+
2 rows in set (0.00 sec)
The next two examples demonstrate the difference between a search using
the "" operators and one without them. The "" operators allow you to
search for an exact match on a phrase:
Warning
Full-text indexes can take a long time to generate and cause OPTIMIZE
statements to take a long time as well.
Or, if the table already exists, you can use either this syntax:
ALTER TABLE tablename ADD UNIQUE [indexname ]
(fieldname [,fieldname2...]);
or this syntax:
CREATE UNIQUE INDEX indexname ON
tablename(fieldname [,fieldname2...]);
If the index contains a single field, that field cannot contain duplicate
values:
mysql> CREATE TABLE ui_test(f1 INT, f2 INT, UNIQUE(f1));
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO ui_test VALUES(1,2);
Query OK, 1 row affected (0.01 sec)
mysql> INSERT INTO ui_test VALUES(1,3);
ERROR 1062: Duplicate entry '1' for key 1
Although the field f1 was not specified as UNIQUE when it was created, the
existence of the unique index prevents any duplication. If the index contains
more than one field, individual field values can be duplicated, but the
combination of field values making up the entire index cannot be
duplicated:
Note
You cannot create an index (except for a full-text index) on an entire BLOB
or TEXT field, so in this case you'd have to specify the index size.
The id field is a numeric field that is already a primary key, and because
you've been assigning the id in sequence to date, turning the id into an auto
increment field will allow MySQL to automate this process for you. The
following code makes the id field auto increment:
VALUES('Breyton','Tshbalala');
Query OK, 1 row affected (0.00 sec)
mysql> SELECT * FROM customer;
+----+------------+-------------+
| id | first_name | surname     |
+----+------------+-------------+
|  1 | Yvonne     | Clegg       |
|  2 | Johnny     | Chaka-Chaka |
|  3 | Winston    | Powers      |
|  4 | Patricia   | Mankunku    |
|  5 | Francois   | Papo        |
|  7 | Winnie     | Dlamini     |
|  6 | Neil       | Beneke      |
|  9 | Breyton    | Tshbalala   |
+----+------------+-------------+
8 rows in set (0.01 sec)
The id is now 9. Even though the next highest remaining record is 7, the
most recently inserted value was 8.
This can be useful for updates where you need to create a new auto
increment value. For example, the following code finds the most recently
inserted auto increment value, and adds 1 to it in order to set a new id value
for Breyton Tshbalala:
If you want to reset the auto increment counter to start at a particular value,
such as back to 1 after you've deleted all the records, you can use this:
ALTER TABLE tablename
AUTO_INCREMENT=auto_inc_value;
Let's create a test to examine the behavior:
The auto increment counter has maintained its value, even though the table
has been emptied. You can useTRUNCATE to clear the table, and this will
reset the auto increment counter:
In most cases you'd use this feature when the table is emptied, but this is not
necessary; you can reset the counter even while there are records in the
table:
Note
Currently this too only works for MyISAM tables. You cannot set the auto
increment counter to anything except 1 with an InnoDB table, for example.
Because the -500 was outside the positive range of values permissible for
an auto increment, MySQL has set it to the maximum allowed for an INT:
2147483647. If you try to add another record, you'll get a duplicate key
error because MySQL cannot make an INT value any higher:
Warning
Be careful to ensure that you always have enough space for your records. If
you create an auto increment on a SIGNED TINYINT field, once you hit 127
you'll start getting duplicate key errors.
Here you would expect the id to have a value of 501. However, you'd
receive a rude shock! Look at the following:
mysql> SELECT * FROM ai_test2;
+-----+-------+
| id | f1 |
+-----+-------+
| 50 | one |
| 500 | two |
| 1 | three |
+-----+-------+
3 rows in set (0.00 sec)
Now you have a duplicate key, and the UPDATE fails. The other error occurs
when multiple connections are made. Open two windows to get two
connections to the database. From Window1:
So far, so good. Now go to the second window, and insert another record.
From Window2:

mysql> INSERT INTO ai_test2(f1) VALUES('two');
The value returned is still 1, when it should be 2. So now if you try and use
the value for an update, you'll get the familiar duplicate key error:
mysql> UPDATE ai_test2 SET id=LAST_INSERT_ID()+1
WHERE f1='one'; ERROR 1062: Duplicate entry '2' for
key 1
All three records have the same id value, as the primary key consists of two
fields: rank and id. Once you start adding other records to each rank, you'll
see the familiar incremental behavior:
The auto increments continue from where they left off, regardless of
the ALTER statement.
MySQL cannot find out approximately how many rows there are between
two values (this result is used by the query optimizer to decide which index
is most efficient to use). See the "Helping MySQL's Query Optimizer
withANALYZE " section later in this chapter for more on this.
ISAM tables use a B-Tree index stored in files with an extension of.ism .
InnoDB tables cannot use full-text indexes.
When searching for the MAX() or MIN() values, MySQL only needs to
look at the first or last value in the sorted index table, which is extremely
quick. If there are frequent requests for MAX() or MIN() values, an index
on the appropriate field would be extremely useful.
If the id field were indexed here, MySQL would never even need to look at
the data file. This would not apply if the index consisted only of a portion
of the full column data (for example, the field is a VARCHAR of 20
characters, but the index was created on the first 10 characters only).
Because the records are returned in sorted order, which is exactly what an
index is, an index on the surname field would be useful for this query. If
the ORDER BY is DESC, the index is simply read in reverse order.
+------------+---------+------------+
| first_name | surname | commission |
+------------+---------+------------+
| Mike | Serote | 10 |
+------------+---------+------------+
1 row in set (0.00 sec)
An index can be used when your query condition contains a wildcard, such
as:
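An illustrative pair of such conditions (the values are assumptions):

mysql> SELECT * FROM customer WHERE surname LIKE 'Cle%';   # can use an index on surname
mysql> SELECT * FROM customer WHERE surname LIKE '%egg';   # the leading wildcard makes the index useless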
The difference is that the latter has a wildcard as the first character. Because
the index is sorted alphabetically from the first character, the presence of
the wildcard renders the index useless.
Choosing Indexes
Now that you know where MySQL uses indexes, here are a few tips to help
you with choosing indexes.
It should go without saying that you should only create indexes where you
have queries that will make use of them (on fields in your WHERE condition,
for example), and not on fields where they will not be used (such as where
the first character of the condition is a wildcard).
Create indexes that return as few rows as possible. A primary key is the best
here, as each primary key is uniquely associated with one record. Similarly,
indexes on enumerated fields will not be particularly useful (for example,
an index on a field containing the values yes or no would only serve to
reduce the selection to half, with all the overhead of maintaining an index).
Use short indexes (index only the first 10 characters of a name, for
example, rather than the entire field).
Don't create too many indexes. An index adds to the time to update or add a
record, so if the index is for a rarely used query that can afford to run
slightly more slowly, consider not creating the index.
If you performed a query with all three of these fields in the condition,
you'd be making the most use of this index:
mysql> SELECT * FROM customer WHERE surname='Clegg'
AND initial='X' AND first_name='Yvonne';
You would also make the most of the index if you searched for surname and
initial:

mysql> SELECT * FROM customer WHERE surname='Clegg' AND initial='X';
or just surname:
mysql> SELECT * FROM customer WHERE
surname='Clegg';
However, if you broke the leftmost sequence and searched for either first
name or initial, or both first name and initial, MySQL would not use the
index. For example, none of the following makes use of an index:
If you searched on the first and third fields of the index (surname and
first_name), you'd be breaking the sequence, so the index would not be
fully used. However, because surname is the first part of the index, this
portion of the index would still be used:
You can use leftmost prefixing whenever an index is used, such as with
anORDER BY clause.
So what do all these columns mean? See Table 4.2 for an explanation.
Table 4.2: What the EXPLAIN Columns Mean

Column  Description

table  Shows you which table the rest of the row is about (it's trivial in this
single-table example, but useful when several tables are joined).

type  This is an important column, as it tells you which type of join is being
used. From best to worst, the join types are system, const, eq_ref, ref, range,
index, and ALL.

possible_keys  Shows which indexes could possibly apply to this table. If it's
empty, there are no possible indexes. You could make one available by looking
for a relevant field from the WHERE clause.

key  The index that was actually used. If this is NULL, then no index was used.
Rarely, MySQL chooses a less optimal index. In this case, you could force the
use of an index by employing USE INDEX(indexname) with your SELECT statement or
force MySQL to ignore an index with IGNORE INDEX(indexname).

key_len  The length of the index used. The shorter you can make this without
losing accuracy, the better.

ref  Tells you which column of the index is used, or a constant if this is
possible.

rows  Number of rows MySQL believes it must examine to be able to return the
required data.

Extra  Extra information about how MySQL will resolve the query. This is
discussed more fully in Table 4.3, but the bad ones to see here are Using
temporary and Using filesort, which mean that MySQL is unable to make much use
of an index at all, and the results will be slow to retrieve.
Table 4.3 looks at what the descriptions returned by the extra column mean.

Table 4.3: What Descriptions in the extra EXPLAIN Column Mean

Extra Value  Description

Distinct  Once MySQL has found one row that matches the row combination, it
will not search for more.

Not exists  MySQL optimized the LEFT JOIN, and once it has found one row that
matches the LEFT JOIN criteria, it will not search for more.

Range checked for each record  No ideal index was found, so for each row
combination from the earlier tables, MySQL checks which index to use and uses
this to return rows from the table. This is one of the slowest joins with an
index.

Using filesort  When you see this, the query needs to be optimized. MySQL will
need to do an extra step to find out how to sort the rows it returns. It sorts
by going through all rows according to the join type and storing the sort key
and a pointer to the row for all rows that match the condition. The keys are
then sorted, and finally the rows are returned in the sorted order.

Using index  The column data is returned from the table using only information
in the index, without having to read the actual row. This occurs when all the
required columns for the table are part of the same index.

Using temporary  When you see this, the query needs to be optimized. Here,
MySQL needs to create a temporary table to hold the result, which usually
occurs when you perform an ORDER BY on a different column set than the one you
did a GROUP BY on.

Where used  A WHERE clause is used to restrict which rows will be matched
against the next table or returned to the client. If you don't want to return
all rows from the table, and the join type is either ALL or index, this should
appear, or else there may be a problem with your query.
The type column returned by EXPLAIN tells you the type of join being
used. Table 4.4 explains the join types, listing them in order of most to
least efficient.

Table 4.4: The EXPLAIN Join Types

Type  Description

const  A maximum of one record from the table can match this query (the index
would be either a primary key or a unique index). With only one row, the value
is effectively a constant, as MySQL reads this value first and then treats it
identically to a constant.

eq_ref  In a join, MySQL reads one record from this table for each combination
of records from the earlier tables in the query. It is used when the query uses
all parts of an index that is either a primary key or a unique key.

ref  This join type occurs if the query uses a key that is not a unique or
primary key or is only part of one of these types (for instance, makes use of
leftmost prefixing). All records that match will be read from this table for
each combination of rows from earlier tables. This join type is heavily
dependent on how many records are matched by the index: the fewer, the better.

range  This join type uses the index to return rows from within a range, such
as what occurs if you search for something using > or <.

index  This join type scans the entire index for each combination of records
from the earlier tables (which is better than ALL, as the indexes are usually
smaller than the table data).

ALL  This join scans the entire table for each combination of records from
earlier tables. This is usually very bad and should be avoided as much as
possible.
Let's return to the example:
You already have a primary key in the customer table on the id field, and,
because there is only one condition in our query, the id field is equivalent to
a constant, and the query is as optimal as it can be. The rows column tells
you that MySQL only needed to look at one row to return the results. You
can't get better than that! Also, the type of the join (in this case, it's not really
a join) is const, standing for constant, which is the best type. Let's look at
what happens if you perform a similar query on a table with no indexes:
mysql> SELECT * FROM sales_rep;
+-----------------+----------+------------+------------+-------------+------------+
| employee_number | surname  | first_name | commission | date_joined | birthday   |
+-----------------+----------+------------+------------+-------------+------------+
|               1 | Rive     | Sol        |         10 | 2000-02-15  | 1976-03-18 |
|               2 | Gordimer | Charlene   |         15 | 1998-07-09  | 1958-11-30 |
|               3 | Serote   | Mike       |         10 | 2001-05-14  | 1971-06-18 |
|               4 | Rive     | Mongane    |         10 | 2002-11-23  | 1982-01-04 |
|               5 | Jomo     | Ignesund   |         10 | 2002-11-29  | 1968-12-01 |
+-----------------+----------+------------+------------+-------------+------------+
5 rows in set (0.01 sec)
An obvious choice for a primary key here is the employee_number field.
There are no duplicate values, and you would not want any, so you can
make it a primary key without any complications:
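mysql> ALTER TABLE sales_rep ADD PRIMARY KEY(employee_number);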
Not so good! This query doesn't make use of any indexes. This isn't
surprising because there's only one index so far, a primary key on
employee_number. It looks like adding an index on the commission field
will improve matters. Because the commission field has duplicate values, it
cannot be a primary key (of which you are only allowed one per table
anyhow) or a unique key. And because it's not a character field, a full-text
index is not an option, so the only key available to you is an ordinary index:
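mysql> ALTER TABLE sales_rep ADD INDEX(commission);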
Now, if you rerun the check on the query, you get the following result:
This is much better. MySQL performs the calculation once, coming up with
a constant of 15, and can search the index for values less than this. You
could also have entered the query as this:
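mysql> EXPLAIN SELECT * FROM sales_rep WHERE commission < 15;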
where you worked out the constant yourself, but this would not be
noticeably faster. Subtracting 5 from 20 costs MySQL an almost
immeasurably small amount of time (or at least you'd be challenged to take
a measurement!). Notice what happens if you wanted to return all the sales
representatives who would earn a commission of exactly 20 percent after
the 5-percent increase:
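mysql> EXPLAIN SELECT * FROM sales_rep WHERE commission = 20-5;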
The query type changes from range to ref, which is a better one to use if
possible. This can be easily understood because returning an exact value is
less work than returning a range of values (or many exact values).
In this example, the full three fields of the index are used, and the join type
is ref because the index in question allows duplicates. If the table structure
excluded the possibility of a duplicate combination of surname, initial, and
first_name, the join type would have been eq_ref. Note the ref
column, const,const,const, indicating that all three parts of the index
are compared against a constant value. The next example shows a similar
situation:
Again, the index is used correctly, but only the first two fields are used. The
key length here is shorter (meaning MySQL has less to scan and is thus a
little quicker). The next case makes no use of leftmost prefixing:
This query does not adhere to the principles of leftmost prefixing and
makes no use of an index. The next example also makes no use of leftmost
prefixing, but does still make use of an index:
Although leftmost prefixing is not used here, because the first_name field is
out of sequence, the surname is still enough to make good use of the index.
In this case, it narrows the number of rows that MySQL needs to scan to
only one, because the surname Clegg is unique, irrespective of any first
names or initials.
Optimizing Selects
In a join, you can calculate the number of rows MySQL needs to search by
multiplying all the rows together. In the following example, MySQL would
need to examine 5*8*1 rows, giving a total of 40:
+-----------+--------+---------------+---------+---------+----------+------+-------+
| table     | type   | possible_keys | key     | key_len | ref      | rows | Extra |
+-----------+--------+---------------+---------+---------+----------+------+-------+
| sales_rep | ALL    | PRIMARY       | NULL    | NULL    | NULL     |    5 |       |
| sales     | ALL    | NULL          | NULL    | NULL    | NULL     |    8 |       |
| customer  | eq_ref | PRIMARY       | PRIMARY | 4       | sales.id |    1 |       |
+-----------+--------+---------------+---------+---------+----------+------+-------+
3 rows in set (0.00 sec)
You can see that the more tables that are joined, the greater the rows
searched. Part of good database design is balancing the choice between
smaller database tables that require more joins with larger tables that are
harder to maintain. Chapter 8 , "Database Normalization," will introduce
some useful techniques to help you achieve this.
ON sales.sales_rep = sales_rep.employee_number;
+-----------+------+---------------+------+---------+------+------+-------+
| table     | type | possible_keys | key  | key_len | ref  | rows | Extra |
+-----------+------+---------------+------+---------+------+------+-------+
| sales_rep | ALL  | NULL          | NULL | NULL    | NULL |    5 |       |
| sales     | ALL  | NULL          | NULL | NULL    | NULL |    8 |       |
+-----------+------+---------------+------+---------+------+------+-------+
2 rows in set (0.00 sec)
The number of rows to examine in this query is five times eight (from the
rows column), which gives you 40. No indexes are used here. If you
examine the structure of the first two tables, you can see why.
Because you have no WHERE condition, the query will return all records
from the first table, sales_rep. The join is then performed between the
employee_number field from the sales_rep table (which you cannot use an
index for because you are returning all records) and the sales_rep field from
the sales table (which is not indexed). The problem is that the join condition
does not make use of an index. If you added an index to the sales_rep field,
you'd improve performance:
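mysql> ALTER TABLE sales ADD INDEX(sales_rep);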
ON sales.sales_rep = sales_rep.employee_number;
+-----------+------+---------------+-----------+---------+---------------------------+------+-------+
| table     | type | possible_keys | key       | key_len | ref                       | rows | Extra |
+-----------+------+---------------+-----------+---------+---------------------------+------+-------+
| sales_rep | ALL  | NULL          | NULL      | NULL    | NULL                      |    5 |       |
| sales     | ref  | sales_rep     | sales_rep | 5       | sales_rep.employee_number |    2 |       |
+-----------+------+---------------+-----------+---------+---------------------------+------+-------+
2 rows in set (0.00 sec)
You've reduced the number of rows that MySQL needs to read from 40 to
10 (5*2), a fourfold improvement.
If you performed the LEFT JOIN the other way around (with the sales
table containing the possible nulls, not the sales_rep table), you'd get a
different result with EXPLAIN:
Only eight rows are examined because while all rows from the sales table
are being returned, the primary key on the sales_rep table
(employee_number) is being used to perform the join.
Now, if you change the sales_rep field to not allow nulls, you'll see some
changes:
The order in which the tables are presented to MySQL can, in some cases,
also make a difference in the speed of the query. MySQL tries its best to
choose the best options, but it doesn't always know in advance which route
will be the quickest. The next section explains how to help MySQL gather
as much information as possible in advance about the index composition,
but, as the following example demonstrates, sometimes even this isn't enough.
Now, imagine you need to perform a join on these tables. The following
query will give you the required results:
mysql> SELECT * FROM t1,t2 LEFT JOIN t3 ON
(t3.f3=t1.f1)
+-------+--------+---------------+---------+---------+-------+------+-------------------------+
| table | type   | possible_keys | key     | key_len | ref   | rows | Extra                   |
+-------+--------+---------------+---------+---------+-------+------+-------------------------+
| t1    | index  | NULL          | PRIMARY | 4       | NULL  |    2 | Using index             |
| t2    | index  | PRIMARY,b,f2  | PRIMARY | 4       | NULL  |    2 | Using index             |
| t3    | eq_ref | PRIMARY,c,f3  | PRIMARY | 4       | t1.f1 |    1 | Using index             |
| t4    | eq_ref | PRIMARY,d,f4  | PRIMARY | 4       | t1.f1 |    1 | where used; Using index |
+-------+--------+---------------+---------+---------+-------+------+-------------------------+
4 rows in set (0.00 sec)
There are two index scans, one on t1 and another on t2, meaning that
MySQL needs to scan the entire index. Looking carefully at the query,
you'll see that the LEFT JOIN is what requires t2 to be read before t4. You
can get around this by changing the order of the tables and separating t2
from the LEFT JOIN:
+-------+--------+---------------+---------+---------+-------+------+-------------+
| table | type   | possible_keys | key     | key_len | ref   | rows | Extra       |
+-------+--------+---------------+---------+---------+-------+------+-------------+
| t1    | index  | NULL          | PRIMARY | 4       | NULL  |    2 | Using index |
| t3    | eq_ref | PRIMARY,c,f3  | PRIMARY | 4       | t1.f1 |    1 | Using index |
| t4    | eq_ref | PRIMARY,d,f4  | PRIMARY | 4       | t1.f1 |    1 | Using index |
| t2    | eq_ref | PRIMARY,b,f2  | PRIMARY | 4       | t4.f4 |    1 | Using index |
+-------+--------+---------------+---------+---------+-------+------+-------------+
4 rows in set (0.01 sec)
Notice the difference! According to the rows column, only 2*1*1*1 rows
need to be read (2 in total), as opposed to 2*2*1*1 (4 in total) with the
earlier query. Of course, the results are the same:
ANALYZE TABLE updates the key distribution for the table if it is not
up-to-date. (Running ANALYZE is equivalent to running myisamchk -a
or myisamchk --analyze. See Chapter 12, "Database Replication,"
for more information.)
Warning: This only works with MyISAM and BDB tables, and the table is locked
with a read lock for the duration of the process, so you don't want to do this
when your database is busy.
You can see the information available to MySQL by running a SHOW INDEX
statement (for example, SHOW INDEX FROM sales_rep).
Table 4.5 explains the meanings of the columns returned by a SHOW
INDEX statement.

Table 4.5: What the SHOW INDEX Columns Mean

Column  Description

Table  Name of the table you're looking at.

Non_unique  Whether the index can contain duplicate values (0 if it cannot, 1 if it can).

Key_name  The name of the index.

Packed  How the key is packed, or NULL if it is not packed.

Null  Contains YES if the indexed column may contain null values.

Comment  Any comments about the index.
Deletes and updates can leave gaps in the table (especially when your tables
contain TEXT, BLOB, or VARCHAR fields). This means the drive is doing
too much work, because the head needs to skip over these gaps when
reading.
OPTIMIZE TABLE solves this problem, removing the gaps in the data by
rejoining fragmented records, doing the equivalent of a defrag on the table
data.
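For example, to defragment the sales_rep table used in this chapter, you could run:
mysql> OPTIMIZE TABLE sales_rep;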
Benchmarking Functions
The BENCHMARK() function tells you how long it takes MySQL to
perform a function a certain number of times. It can help to give a basic
idea of the difference in capabilities between machines. The syntax is as
follows:
SELECT BENCHMARK(number_of_repetitions,expression)
Although this gives a rough indication of the speed of the database server,
there are many other factors, such as disk speed, that are not taken into
account by this test. Its main aim is to help in the optimization of functions.
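For example, the following (the expression itself is illustrative only) times one
million evaluations of a simple function; BENCHMARK() always returns 0, and the
figure of interest is the elapsed time reported by the client:
mysql> SELECT BENCHMARK(1000000, SQRT(2));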
You can optimize an UPDATE statement in the same way as you would the
equivalent SELECT statement. Also note that the fewer indexes you use
and the smaller the data, the quicker the operation will be. Take care not to
use superfluous indexes or make the fields or indexes larger than they need
to be.
You can speed up the process by disabling keys for the duration of the
period you're adding data. MySQL will then only have to concentrate on
adding the data and not worry about adding to the index files at the same
time. The data itself is then added much quicker, and when the indexes are
built separately, the process will be more optimal as well. You can use the
following procedure:
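A minimal sketch of the procedure for a MyISAM table (the table and file names
are placeholders):
mysql> ALTER TABLE tablename DISABLE KEYS;
mysql> LOAD DATA INFILE '/tmp/data.txt' INTO TABLE tablename;
mysql> ALTER TABLE tablename ENABLE KEYS;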
You're not always going to be able to insert from a text file, though. But if
you can group your inserts, multiple value lists are added much quicker
than the separate statements. For instance, the following query:
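INSERT INTO tablename VALUES (record1),(record2),(record3);
runs noticeably faster than the equivalent three single-row statements:
INSERT INTO tablename VALUES (record1);
INSERT INTO tablename VALUES (record2);
INSERT INTO tablename VALUES (record3);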
The reason for this is that the indexes are only flushed once per INSERT
statement. If you require multiple INSERT statements, you can use locking
to achieve the same result. For transactional tables (InnoDB or BDB), wrap
the statements in a single transaction, like this:
BEGIN;
INSERT INTO tablename VALUES (record1),(record2),(record3);
INSERT INTO tablename VALUES (record4),(record5),(record6);
COMMIT;
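For nontransactional tables (MyISAM, for example), LOCK TABLES achieves the same
effect; a minimal sketch:
LOCK TABLES tablename WRITE;
INSERT INTO tablename VALUES (record1),(record2),(record3);
INSERT INTO tablename VALUES (record4),(record5),(record6);
UNLOCK TABLES;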
Matters become slightly more complex when you're adding records from
many threads. Consider a scenario where the first thread adds 10,000
records, and the second thread adds one record. Using locking, the overall
speed will be a lot faster, but the second thread will only complete after the
first thread. Without locking, the second thread will complete much more
quickly, but the overall speed will be slower. The importance of the second
thread relative to the first will determine which method you choose.
One option is to use INSERT LOW_PRIORITY. This causes the insert to
lose its usual pushy behavior and wait until there are no more read
queries waiting in the queue. The problem, though, is that if your database
is very busy, the client performing the INSERT LOW_PRIORITY may take
a long time to find a gap, if ever! Another option is INSERT DELAYED.
Here the client is released immediately and the rows are queued in memory
until the table is free, but you have no idea when the data will be inserted,
if at all (queued rows are lost if the server goes down), so use these options
with caution.
Summary
Poor use of indexes is probably the single most important cause of
performance problems. An index is a small, sorted file that points to the
main data file. Finding a particular record is then quicker because only the
small index file has to be searched.
An index can be a primary key (a unique index that cannot contain nulls), a
unique index, an ordinary index (that can contain duplicates), or a full-text
index. Full-text indexes allow a high level of sophistication in searching
text fields for certain combinations of keywords.
Auto increment fields are associated with the primary key and allow
MySQL to automatically take care of the sequencing of the field. If a record
is inserted, MySQL will add 1 to the previous auto incremented value and
use this for the value of the inserted auto increment field.
You can begin to tap into the real power of a database when it becomes part
of a larger information system, with fully functional applications adding
their own value to the system. A news website, for example, needs tools to
add and sort the news articles, to display them on a website, and to track the
most popular stories. Most journalists have no interest in learning
Structured Query Language (SQL), however, so they need a well-designed
interface to link them to the database. This interface could be a web page
with a Hypertext Markup Language (HTML) form, with a submit button
that calls a script to run an INSERT statement. Or the interface could be a
news feed that takes articles from the newspaper's QuarkXPress system and
automatically adds them to the database.
The possibilities for information systems are endless. What these scenarios
have in common is that they include a developed application to add extra
levels of logic that MySQL cannot supply. In theory, you can use any
programming language to develop applications. Languages commonly used
are Java, C, PHP, Perl, C++, Visual Basic, Python, and Tcl, which mostly
have well-developed Application Programming Interfaces (APIs) for
interfacing with MySQL. The appendixes in this book contain APIs for
most of these programming languages.
Throughout this chapter, all examples are in PHP—not because you should
all be using PHP, but simply because so many of you already do, because its
syntax is familiar to anyone with a C-like background (C, Perl, or C++),
and because it's simple enough for other programmers to follow. It's the
programming principles that are important, however—not the syntax. All
code is extensively commented in this chapter so that you can follow it no
matter what language you are familiar with or your level of competency.
Featured in this chapter:
Using persistent connections
Making your code portable and easier to maintain
Assessing the database's vs. the application's workload
Exploring the application development process
You don't always want the connections to hang around too long, though. In
most cases, the web server will clean up after itself. However, I have
encountered a web server that was having problems, resulting in it
restarting itself but not cleaning up the connections. The web server was set
to allow 400 instances, and MySQL could take 750 connections. With the
web server misbehaving and not cleaning up, it effectively doubled the
number of connections it was making to the database server, allowing a
potential 800. Suddenly the database server was running out of available
connections. You can minimize the risk of persistent connections hanging
around for too long by reducing the wait_timeout mysqld variable
(or interactive_timeout, depending on how you connect), which
determines the amount of time MySQL allows a connection to remain
inactive before closing it. (Chapter 10, "Basic Administration," and
Chapter 13 , "Configuring and Optimizing MySQL," explain how to set
these variables.) The default is 28,800 seconds (8 hours). By reducing this
to 600 seconds, I prevented the problem from recurring. Then there was just
the web server to worry about!
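For example, assuming MySQL 4.0.3 or later, the variable can be lowered at
runtime like this (see the chapters mentioned above for the configuration-file
equivalent):
mysql> SET GLOBAL wait_timeout=600;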
Connecting
Most programming languages make it easy to connect to a database through
native functions. For example, PHP has a host of functions for use with
MySQL, such as mysql_connect(), mysql_query(), and so on.
When programming a tiny application with one connection to the database,
using the native classes, it's easy to use something simple to connect to
MySQL (see Listing 5.1 ).
Listing 5.1: TOTALLY_IMPORTABLE.PHP
// connect with the details hard-coded into the script
// (the host, username, and password shown are placeholders)
$db = mysql_connect("dbhostname.co.za", "db_app", "g00r002b");
// basic error checking - if the connection is not successful,
// print an error message and exit the script
if (!$db) {
    echo "There was a problem connecting to the database.";
    exit;
}
Many examples you'll come across connect in this way because it's simple
to understand and fine for small situations. But to use a database in a more
serious application, you'll want to make it as portable, as easy to maintain,
and as secure as possible.
Imagine you have 10 scripts that connect to the database. If all 10 scripts
connected in the same way, and then one day you had to move the database
to a new server or wanted to change your password, you'd have to change
all 10 scripts.
I inherited a situation like this once before, and, facing the possibility of the
password having been compromised (with the password in hundreds of
locations, it's more likely to be found as well), I had to do lots of grunt
work! The best solution is to build the application correctly from the
beginning. Place your database connection details in a separate location,
which is then included in scripts that connect to the database. Then, when
you need to change the details, you only need to change them in one place
(and you know that nothing has been forgotten). Changing the password in
hundreds of locations involves the risk of missing a location and only
finding out when functionality fails.
The connection details themselves go into the included file, db.inc:
$host = "dbhostname.co.za";
$user = "db_app";
$pass = "g00r002b";
The script that needs the connection then includes this file and connects:
require_once "$include_path/db.inc";
// includes the file containing the connection details, db.inc,
// located in the path $include_path, which should be a safe location
$db = mysql_pconnect($host, $user, $pass);
// basic error checking - if the connection is not successful,
// print an error message and exit the script
if (!$db) {
    echo "There was a problem connecting to the database.";
    exit;
}
Your application may well be written in a language other than PHP, for
example, so the previous examples will not work well if just directly
translated into Java. What's important is the principle of making your
applications as easy to maintain (by storing connection information in one
location) and as portable (by avoiding database-specific code) as possible.
Database Queries
It's acceptable to use shortcuts such as SELECT * when querying MySQL
directly. But you should avoid this in your applications, as they make them
less portable. Take a situation where there are three fields in the entrants
database table; in order, they are id, first_name, and surname. Your
programming code may look something like Listing 5.6 .
$id = $row[0];
// Because the id is the first field in the database,
// it is returned as the first element of the array,
// which of course starts at 0
$first_name = $row[1];
$surname = $row[2];
// .. do some processing with the details
This works at first. But now, somebody (it's always somebody else) makes a
change to the database structure, adding a new field between first_name and
surname, called initial. Your code does not need the initial. Suddenly your
code does not work, as the initial is the third element of the array
(or$row[2] ), and it is stored as$surname . The actual surname, now
in$row[3] , is never accessed.
$id = $row["id"];
$first_name = $row["first_name"];
$surname = $row["surname"];
// .. do some processing with the details
This is better, because now, even after the initial field has been added to the
database table, your code will work. It can handle some changes in database
structure, and it is more readable. A programmer with no knowledge of the
database structure will know what fields are being returned. But you can
still make an improvement. By running a SELECT * query, you're asking
MySQL to return all the fields from the table. Your code only needs to use
three fields, so why waste the resources in returning all of them, with the
extra disk input/output and greater pressure on your network this involves?
Rather, just specify the fields you want to return. Not only does this use
fewer resources, but it also makes your code even more readable. In fact, in
some instances returning the associative array is more resource hungry than
returning a numeric array. In this case, you can still keep your code
readable, even when returning a numeric array, by specifying the fields as
shown in Listing 5.8 .
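The essential change in that case is in the query itself, which names only the
fields the code uses, along these lines:
SELECT id, first_name, surname FROM entrants;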
If the table structure changes, once again the code will break. By adding a
field, initial, the number of fields being inserted will not match the number
of fields in the table, and the query will fail.
The way to solve this is to specify the database fields you are inserting into,
as shown in Listing 5.10 .
Listing 5.10: FLEXIBLE_INSERT.PHP
// assuming the connection $db has already been made
$result = mysql_query("INSERT INTO entrants(id, first_name, surname)
    VALUES('$id','$first_name','$surname')", $db);
This also has the advantage of being more readable, especially considering
that the field values will not always match the field names as in this
example.
$surname[] = $row["surname"];
    // add the surname as the next element of the
    // surname array (and creates the array if not yet done)
}
sort($surname);
// the sort() function sorts the array
// ... continue processing the sorted data

$surname[] = $row["surname"];
    // add the surname as the next element of the
    // surname array (and creates the array if not yet done)
}
// ... continue processing the sorted data
Listing 5.12 makes a lot more sense. MySQL could (or should) have an
index on the surname field if this is a commonly performed operation, and
reading the data in order, from an index, is much quicker than reading the
data in the unordered format and then using the application to perform the
sort.
In fact, it's possible that reading the sorted data from the database will be
quicker than reading the unsorted data (even before taking the sort()
function into account) because the sorted data may only need to be read
from the index, not the data file.
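Put another way, the difference between the two listings comes down to where
the sorting happens; a sketch of the two queries involved (assuming the
entrants table used earlier):
-- Listing 5.11 approach: read unsorted, then sort in the application
SELECT surname FROM entrants;
-- Listing 5.12 approach: let MySQL (and any index on surname) do the sorting
SELECT surname FROM entrants ORDER BY surname;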
A similar, more extreme (but still all too common) example is one where
the application is performing the work of the WHERE clause, as Listing 5.13
demonstrates.
Listing 5.13: WORK_THE_SCRIPT2.PHP
if ($row["surname"] == 'Johnson') {
    $johnson[] = $row["surname"];
    // add the surname as the next element of the
    // johnson array (and creates the array if not yet done)
}
elseif ($row["surname"] == 'Makeba') {
    $makeba[] = $row["surname"];
    // add the surname as the next element of the
    // makeba array (and creates the array if not yet done)
}
}
// ... process the makeba and johnson arrays
You can write these code snippets more elegantly if you're processing many
names, but the point is that Listing 5.14 is much more efficient because
MySQL is doing the work, limiting the results returned and reducing the
resources used.
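A sketch of the kind of query Listing 5.14 would rely on, letting the WHERE
clause do the filtering (the exact field and values are assumptions based on
the snippet above):
SELECT surname FROM entrants WHERE surname IN ('Johnson','Makeba');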
$codelist .= $row["code"].",";
// add the code, followed by a comma, to the $codelist variable
}
$codelist = substr($codelist, 0, -1);
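// remove the trailing comma from the list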
Listing 5.15 would function, but again it is placing far too much of a load
on the application. Instead, with some thought, MySQL could perform a
query to return the results, as Listing 5.16 demonstrates.
while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
    // ... process the entrant details
}
When sub-selects are implemented (in version 4.1, according to the current
schedule), the resulting code, shown in Listing 5.17, is even simpler.
Listing 5.17: WORK_THE_DB3_2.PHP
// assuming the connection $db has already been made
$result = mysql_query("SELECT first_name,surname FROM entrants
    WHERE code NOT IN (SELECT code FROM referred_entrants)", $db);
while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
    // ... process the entrant details
}
The project team needs to undertake the following when determining user
requirements:
Engage the users to determine their requirements.
Guide the users, explaining why certain suggestions are not practical or
suggesting better alternatives. The team needs to use their experience.
Bring unsaid requirements into the open so that they can be documented.
What's obvious to the user may not be obvious to the developers, and
important requirements may go missing.
Make the requirements as complete as possible. For example, a user may
request that a reservation system "take the booking." On its own, this is
not enough; each field required for a booking, and the processes behind it,
needs to be spelled out.
Once you understand the user requirements, write them down and return
them to both the users and to the project owners (the people paying for the
project) for confirmation. Users need to be sure what they are going to be
getting; there should be no surprises later.
Get the owners to sign off on the requirements. That way, there is less
likelihood of either party being unhappy. Scope creep, when a project's
requirements continue to grow during development, is an insidious
problem that occurs frequently, usually when the requirements have not
been formally agreed upon beforehand. Either the project owners demand
more and more or someone from either party discovers critical new issues
that no one thought of until then.
Using Pseudocode
Pseudocode is another step that can help the programmer develop an
application more quickly and more easily. Instead of worrying about the
exact syntax requirements of the programming language, pseudocode
tackles the logical requirements, creating the algorithms needed to solve the
problems.
Phase 3: Coding
This is the step that is often the only step. But coding becomes much easier
when the documentation produced in the previous stages exists. Use the
following tips during the coding phase:
Always document your code. Include comments inside the code, and create
separate documents noting how the application is put together and how it
works. If not for yourself, do this for anyone else to follow you. (You'll be
surprised how easy it is to forget some "trivial" detail after a few months.)
A coder is selfish if they have not left ample documents for someone else to
follow. Make sure you build in time for this, though; don't let the pressure
of a deadline make you skimp on this aspect.
Use clear filenames, function names, and variable names. For example, the
function f3() is not very intuitive, but calculate_interest() is
clearer.
Don't reinvent the wheel. There are ample classes and sample code
available. Rarely will coders need to do something unique. Mostly the job is
repetitive, and coders play the job of librarian, tracking down the right code
for the job.
Initialize all your variables and document them in one place, even when
coding in languages that allow variables to be used before they are
initialized. This is not only more readable, but it is more secure.
Break your code up into small parts. This makes it easier to understand,
easier to debug, and easier to maintain. In web development, for example,
some sources have espoused the virtues of one script to handle everything,
and the result is usually a jumbled mess.
Use directories intelligently. All parts of the application do not, and usually
should not, appear in the same directory. Group them logically and for
security purposes.
Reuse your own code. Know what functions and classes have been created,
and use these again and again. Write them so that they can be used for
slightly different tasks. This makes changing code at a later stage much
easier. This applies especially to things such as database connections.
When debugging a query, it can help to run the query directly in MySQL
and not through the application (I often display the query and then paste this
into the MySQL client). This allows you to narrow down the error, so you
know whether it's in the query you're running or something in the
application.
Simple code is good code. Writing a chunk of code on one unreadable line,
just because you can, may make you look clever, but it just frustrates those
who have to read your code later. Coders who overly complicate their code
are failing to do what coders are supposed to do: simplifying complexities.
Simple code is usually as fast, too. It can even be faster, such as when you
use simple string functions instead of regular expressions.
In projects with more than one coder, use coding standards. That way when
members of the coding team work on each other's code, they will be more
productive because it will take them less time to adjust to the new standard
(or lack of it). Coding standards include such things as the number of
spaces (or tabs) to use when indenting and variable name conventions (such
as whether to use capitalization, for example, $totalConnects;
underscores, for example, $total_connects; or neither, for
example, $totalconnects). The standards could go so far as to
prescribe the editor to use, because different editors could align code
differently, making it less readable.
Larger projects too need some sort of version control. This ensures that
when more than one person works on a piece of code, their work does not
conflict. Even one-person projects can easily get out of hand if, like me, you
have a tendency to work on different machines and save versions all over
the place. Larger projects can use something such as CVS or Visual
SourceSafe (these can be found at http://www.cvshome.org/
and http://msdn.microsoft.com/ssafe/ , respectively), and
small projects can simply use a numbering scheme.
Unit testing These tests ensure that each class, method, and function works
properly in its own right. You need to test whether the returned results are
correct no matter what the input; in other words, account for every possible
scenario.
Integration testing These tests ensure that each unit works as it should
when integrated with other units.
System testing The entire system is tested as a whole. This would include
testing the performance of the system under load (stress testing), multiple
users, and so on.
Regression testing These are the tests that take place to see if the fixes
have broken any other functionality. It's extremely common for "quick
fixes" to have unforeseen consequences.
Summary
There are a number of techniques to make your database applications more
portable and easier to maintain, especially regarding database queries.
Separate your connection details from the scripts that make the connections,
and make sure they are stored in only one place so that you can easily
change the details. Make sure future changes in the database table structure
do not affect your scripts if the change is unrelated. Specify field names in
your SELECT and INSERT queries.
When coding, be sure to avoid the common pitfalls so that your code is
flexible, portable, and easily maintainable. Then, before you implement the
application, be prepared to test it extensively, which should be just as
important as coding the application.
User-Defined Functions
You would mostly write UDFs in C or C++. You can use them with binary
or source distributions that have been configured with
--with-mysqld-ldflags=-rdynamic. Once added, UDFs are always available when
the server restarts, unless the --skip-grant-tables option is used.
There are two kinds of UDFs: standard and aggregate. Standard UDFs are
like ordinary built-in functions such as POW() and SIN() and work on a
single row of data, and aggregate functions are similar to the built-in SUM()
and AVG() functions that work with groups.
The name field contains the name of the UDF (and the name of the main
C/C++ function). The ret field indicates whether the UDF can return nulls,
the dl field indicates the name of the library containing the UDF (many
UDFs can be bundled into one library), and the type field indicates whether
the UDF is a standard or aggregate UDF.
Note: If you've upgraded from an older version of MySQL, you may not have all
the columns. Run the mysql_fix_privilege_tables script (in the bin directory)
to add the missing columns to your table.
In this section, you'll first look at compiling and installing the sample UDFs
that come with a
MySQL distribution and then go on to writing your own. Before you get
started, create and add records to a small table, which you'll use to test the
UDFs once you've added them:
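A minimal sketch of such a table (the name, fields, and values are purely
illustrative):
mysql> CREATE TABLE udf_test (id INT, word VARCHAR(40));
mysql> INSERT INTO udf_test VALUES (1,'apple'),(2,'banana'),(3,'cherry'),
    (4,'date'),(5,'elderberry');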
To find the correct compiler options for your system, you can use the make
utility, which checks for dependencies. Each system will differ, but after
runningmake you may get output something like this:
% make udf_example.o
g++ -DMYSQL_SERVER -
DDEFAULT_MYSQL_HOME="\"/usr/local\""
-DDATADIR="\"/usr/local/var\"" -
DSHAREDIR="\"/usr/local/share/mysql\""
-DHAVE_CONFIG_H -I../innobase/include -
I./../include -I./../regex
-I. -I../include -I. -O3 -DDBUG_OFF -fno-implicit-
templates
-fno-exceptions -fno-rtti -c udf_example.cc
Take the options supplied, and use them to compile the UDF. Again, your
system may differ —some will require the-c option to be left out, others
will need it. You should know your system well enough, know someone
who does, or have enough patience to try again if your first (and second and
third…) attempts result in a dead end. Your final command may look
something like this:
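Following the pattern used for the other UDFs later in this chapter, the
command would be along these lines:
% gcc -shared -o udf_example.so udf_example.cc -I../innobase/include
  -I./../include -I./../regex -I. -I../include -I.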
Once you've compiled the UDF, move it to the place your shared libraries
usually go. With Unix systems, it's any directory searched by ld
(mostly /usr/lib or /lib), or you can set an environment variable to
point to the directory where you're storing your library. Typing man dlopen will
give you the name of the environment variable (usually LD_LIBRARY or
LD_LIBRARY_PATH). You would set this in your startup script
(mysql.server or mysqld_safe). With Windows systems, you'll usually place
the UDF in the WINDOWS\System32 or WINNT\System32 directory.
Copy your compiled file to the appropriate location, for example:
% cp udf_example.so /usr/lib
Once you have placed the file, some systems may require you to create the
necessary links (such as by runningldconfig ) or restarting MySQL
before you can load the function.
To load the UDF from the MySQL command line, use theCREATE
FUNCTION statement. The syntax is as follows:
CREATE [AGGREGATE] FUNCTION function_name RETURNS
{STRING|REAL|INTEGER} SONAME shared_library_name
The example comes with a number of UDFs (you can bundle more than one
in a single library). For now, you'll load just three of the functions, as
follows:
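The statements would be along the following lines (the exact set depends on how
the example library was built; these mirror the three functions used in the
rest of this section):
mysql> CREATE FUNCTION metaphon RETURNS STRING SONAME 'udf_example.so';
mysql> CREATE FUNCTION myfunc_double RETURNS REAL SONAME 'udf_example.so';
mysql> CREATE AGGREGATE FUNCTION avgcost RETURNS REAL SONAME 'udf_example.so';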
Now you can test the new UDF. To see what all the UDFs do, you should
look at the udf_example.cc file. The metaphon UDF (the correct
name for the algorithm is actually metaphone) takes a string and returns a
result based on the way the string sounds. It is similar to the better-known
soundex algorithm, but tuned more for English.
This is not particularly useful, but at least now you know how to add
functions. Test that the aggregate function is working by using this code:
There is only one result, as the function works with a group. You didn't use
a GROUP BY clause, so the entire result set is taken as a single group. If
you grouped by the contents of id, you'd get five results, as there are five
unique id values:
You can drop a UDF with theDROP FUNCTION statement, for example:
mysql> DROP FUNCTION myfunc_double;
Query OK, 0 rows affected (0.01 sec)
You can view the list of available UDFs by looking at the contents of
thefunc table in the mysql database, as follows:
mysql> SELECT * FROM mysql.func;
+----------+-----+----------------+-----------+
| name | ret | dl | type |
+----------+-----+----------------+-----------+
| metaphon | 0 | udf_example.so | function |
| avgcost | 1 | udf_example.so | aggregate |
+----------+-----+----------------+-----------+
2 rows in set (0.01 sec)
This implies that the user adding or removing the function needs to have
INSERT or DELETE permission for the func table in the mysql database. You
would usually only give this to an administrator because, besides the security
risk of access to the mysql database, a UDF can potentially cause a lot of harm.
Now let's create a UDF from scratch. You're first going to create a standard
(not aggregate) UDF, called count_vowels.
Standard UDFs
A standard UDF has one main function, which is named the same as the
UDF and is required, and two optional functions, which are named similarly
but with_init and _deinit appended to the end. All these functions
must be in the same library.
It is declared as follows:
my_bool function_name_init(UDF_INIT *initid,
UDF_ARGS *args, char *message);
The function returns a Boolean type, which you set to false if the function
did not pick up any errors or true if an error was spotted.
char *ptr This is a pointer that the UDF can use—for example, to pass
data across all three functions. Allocate the memory in the init function if
the pointer is used for new data.
For numeric types, the return value of the main function is simply the value.
If it's a string type, the return value is a pointer to the result, with the
length stored in the length argument. The result buffer is defaulted to 255
bytes, so if the result is less than this, the pointer should be the result
pointer passed into the main function. If it's more, it should be the pointer
allocated in the init function (you'll need to allocate space with malloc()
and then later deallocate the space in the deinit function).
#ifdef STANDARD
#include <stdio.h>
#include <string.h>
#else
#include <my_global.h>
#include <my_sys.h>
#endif
#include <mysql.h>
#include <m_ctype.h>
#include <m_string.h>
extern "C" {
my_bool count_vowels_init(UDF_INIT *initid,
UDF_ARGS *args, char *message); void
count_vowels_deinit(UDF_INIT *initid);
long long count_vowels(UDF_INIT *initid, UDF_ARGS
*args,
}
/* Makes sure there is one argument passed, and that it's a string. */
my_bool count_vowels_init(UDF_INIT *initid, UDF_ARGS *args, char *message) {
  if (args->arg_count != 1 || args->arg_type[0] != STRING_RESULT) {
    strcpy(message,"You can only pass one argument, and it must be a string");
    return 1;
  }
  return 0;
}

/* no need for a deinit function, as we don't allocate extra memory */
void count_vowels_deinit(UDF_INIT *initid) {
}
/* the main function: counts the lowercase vowels in the string argument */
long long count_vowels(UDF_INIT *initid, UDF_ARGS *args,
                       char *is_null, char *error) {
  long long num_vowels = 0;
  char *word = args->args[0]; /* pointer to the string argument */
  unsigned long i;
  char c;
  for (i = 0; i < args->lengths[0]; i++) {
    c = word[i];
    switch ( c ) {
      case 'a':
      case 'e':
      case 'i':
      case 'o':
      case 'u':
        num_vowels++; /* if the letter in c is a vowel, increment the counter */
    }
  }
  return num_vowels;
}
Once you've saved the file, make and compile it, and then copy it to the
directory where you place your libraries, as discussed earlier:
% make count_vowels.o
g++ -DMYSQL_SERVER -DDEFAULT_MYSQL_HOME="\"/usr/local/mysql\""
  -DDATADIR="\"/usr/local/mysql/var\""
  -DSHAREDIR="\"/usr/local/mysql/share/mysql\""
  -DHAVE_CONFIG_H -I../innobase/include -I./../include
  -I./../regex -I. -I../include -I. -O3 -DDBUG_OFF
  -fno-implicit-templates -fno-exceptions -fno-rtti -c count_vowels.cc
% gcc -shared -o count_vowels.so count_vowels.cc -I../innobase/include
  -I./../include -I./../regex -I. -I../include -I.
If you passed a nonstring argument, such as one from the id field, or more than
one argument, you'd get the error message you specified:
Warning: If you make a change to the UDF, be sure to DROP the function from
MySQL before you upload it again. You're quite likely to crash MySQL and
have to restart it if you don't!
The Reset Function This function is called at the beginning of each new
group. Data used for group calculations are reset here. You declare the
function as follows:
void xxx_reset(UDF_INIT *initid, UDF_ARGS *args,
    char *is_null, char *error);
The Add Function This is called for each row of the group except the first
row. You'll probably want it to be called for every row, in which case you'll
need to call it from within the reset function.
The Main Function The main function is only called once per group of
data (at the end), so it performs any necessary calculations on the entire
group of data (usually accessed by initid->ptr).
The Init Function Behaves the same as with a standard UDF, except
the ptr attribute becomes much more important in an aggregate function. It
stores data about each group, which is added within the add function. The
main function can then access it to get data about the entire group.
The Deinit Function Plays the same role as with a standard UDF, except
that it will almost always exist, as you need to clean up ptr.
#ifdef STANDARD
#include <stdio.h>
#include <string.h>
#else
#include <my_global.h>
#include <my_sys.h>
#endif
#include <mysql.h>
#include <m_ctype.h>
#include <m_string.h> // To get strmov()
#ifdef HAVE_DLOPEN
/* These must be right or mysqld will not find the
symbol! */
extern "C" {
my_bool count_agg_vowels_init( UDF_INIT* initid,
UDF_ARGS* args, char* message ); void
count_agg_vowels_deinit( UDF_INIT* initid );
void count_agg_vowels_reset( UDF_INIT* initid,
UDF_ARGS* args,– char* is_null, char *error );
void count_agg_vowels_add( UDF_INIT* initid,
UDF_ARGS* args,– char* is_null, char *error );
long long count_agg_vowels( UDF_INIT* initid,
UDF_ARGS* args,– char* is_null, char *error );
}
struct count_agg_vowels_data {
  unsigned long long count;
};
my_bool count_agg_vowels_init( UDF_INIT* initid, UDF_ARGS* args, char* message ) {
  struct count_agg_vowels_data *data;
  if (args->arg_count != 1 || args->arg_type[0] != STRING_RESULT) {
    strcpy(message,"You can only pass one argument, and it must be a string");
    return 1;
  }
  initid->max_length = 20;
  data = new struct count_agg_vowels_data;
  data->count = 0;
  initid->ptr = (char*)data;
  return 0;
}
void count_agg_vowels_deinit( UDF_INIT* initid ) {
  delete initid->ptr;
}
/* called once at the beginning of each group: resets data->count to 0
   for the new group and needs to call the add function as well */
void count_agg_vowels_reset( UDF_INIT* initid, UDF_ARGS* args,
    char* is_null, char *error ) {
  struct count_agg_vowels_data *data = (struct count_agg_vowels_data*)initid->ptr;
  data->count = 0;
  count_agg_vowels_add( initid, args, is_null, error );
}

/* called for each row of the group: counts the vowels in the string
   argument and adds them to the running total for the group */
void count_agg_vowels_add( UDF_INIT* initid, UDF_ARGS* args,
    char* is_null, char *error ) {
  struct count_agg_vowels_data *data = (struct count_agg_vowels_data*)initid->ptr;
  char *word = args->args[0]; /* pointer to string */
  unsigned long i;
  char c;
  for (i = 0; i < args->lengths[0]; i++) {
    c = word[i];
    switch ( c ) {
      case 'a': case 'e': case 'i': case 'o': case 'u':
        data->count++;
    }
  }
}

/* returns data->count, or a null if it cannot find anything */
long long count_agg_vowels( UDF_INIT* initid, UDF_ARGS* args,
    char* is_null, char* error ) {
  struct count_agg_vowels_data *data = (struct count_agg_vowels_data*)initid->ptr;
  if (!data->count) { *is_null = 1; return 0; }
  return data->count;
}
#endif /* HAVE_DLOPEN */
% make count_agg_vowels.o
g++ -DMYSQL_SERVER -DDEFAULT_MYSQL_HOME="\"/usr/local/mysql\""
  -DDATADIR="\"/usr/local/mysql/var\""
  -DSHAREDIR="\"/usr/local/mysql/share/mysql\""
  -DHAVE_CONFIG_H -I../innobase/include -I./../include
  -I./../regex -I. -I../include -I. -O3 -DDBUG_OFF
  -fno-implicit-templates -fno-exceptions -fno-rtti -c count_agg_vowels.cc
% gcc -shared -o count_agg_vowels.so count_agg_vowels.cc -I../innobase/include
  -I./../include -I./../regex -I. -I../include -I.
Make sure you've dropped any pre-existing functions of the same name (if
you're updating a function) before you upload them or reload them. You
may have to manually delete the function from the func table if you've
previously added a broken UDF that you can't DROP in the normal way.
You can also try stopping and restarting MySQL before reloading the
function, though this is not usually necessary.
Make sure the return type when you CREATE FUNCTION matches the type
you return from the main function in your code (string, real, or integer). You
may have to configure MySQL with --with-mysqld-ldflags=-rdynamic
in order to implement the UDF.
Summary
MySQL allows you to add user-defined functions (UDFs). You use these as
you would a normal function inside a query. There are two kinds of UDFs:
aggregate and standard. Aggregate UDFs work on groups of data and can
be used with the GROUP BY clause. Standard UDFs work on single rows of
data.
Chapter 7: Understanding
Relational Databases
Overview
Just as perhaps we take movie special effects for granted until we see what
the state of the art was in previous eras, so we can't fully appreciate the
power of relational databases without seeing what preceded them.
Relational databases allow any table to relate to any other table through
means of common fields. It is a highly flexible system, and most modern
databases are relational. Featured in this chapter:
The hierarchical database model
The network database model
The relational database model
Learning basic terms
Table keys and foreign keys
Views
Languages such as Perl, with its powerful regular expressions ideal for
processing text, have made the job a lot easier than before; however,
accessing data from files is still a challenging task. Without a standard way
to access data, systems are more prone to errors, are slower to develop, and
are more difficult to maintain. Data redundancy (where data is duplicated
unnecessarily) and poor data integrity (where data is not changed in all the
necessary locations, leading to wrong or outdated data being supplied) are
frequent consequences of the file access method of data storage. For these
reasons, database management systems (DBMSs) were developed to
provide a standard and reliable way to access and update data. They provide
an intermediary layer between the application and the data, and the
programmer is able to concentrate on developing the application, rather
than worrying about data access issues.
There are a number of database models. First you'll learn about two
common models, the hierarchical database model and the network database
model. Then you'll investigate the one that MySQL (along with most
modern DBMSs) uses, the relational model.
Tables 7.1 and 7.2 relate to each other through the stock_code field. Any
two tables can relate to each other simply by creating a field they have in
common.
Table 7.1: The Product Table
Stock_code   Description        Price
A416         Nails, box         $0.14
C923         Drawing pins, box  $0.08
Data are the values kept in the database. On their own, the data mean very
little. CA 684-213 is an example of data in a DMV (Division of Motor
Vehicles) database.
Each record is made up of fields (which are the vertical columns of the
table, also called attributes). Basically, a record is one fact (for example,
one customer or one sale).
These fields can be of various types . MySQL has many types, as you saw
in Chapter 2 ("Data Types and Table Types"), but generally the types fall
into three kinds: character, numeric, and date. For example, a customer
name is a character field, a customer's birthday is a date field, and a
customer's number of children is a numeric field.
The range of allowed values for a field is called the domain (also called
a field specification). For example, a credit_card field may be limited to
only the values Mastercard, Visa, and Amex.
A field is said to contain a null value when it contains nothing at all. Null
fields can create complexities in calculations and have consequences for
data accuracy. For this reason, many fields are specifically set not to contain
null values.
A one-to-one (1:1) relationship is where for each instance of the first table
in a relationship, only one instance of the second table exists. An example
of this would be a case where a chain of stores carries a vending machine.
Each vending machine can only be in one store, and each store carries only
one vending machine (see Figure 7.3 ).
A mandatory relationship exists where for each instance of the first table in
a relationship, one or more instances of the second must exist. For example,
for a music group to exist, there must exist at least one musician in that
group.
Data integrity refers to the condition where data is accurate, valid, and
consistent. An example of poor integrity would be if a customer telephone
number is stored differently in two different locations. Another is where a
course record contains a reference to a lecturer who is no longer present at
the school. In Chapter 8 , "Database Normalization," you'll learn a
technique that assists you to minimize the risk of these sorts of problems:
database normalization.
Now that you've been introduced to some of the basic terms, the next
section will cover table keys, a fundamental aspect of relational databases,
in more detail.
A primary key is a candidate key that has been designated to identify unique
records in the table throughout the database structure. As an example, Table
7.3 shows the customer table.
At first glance, there are two possible candidate keys for this table. Either
customer_code or a combination of first_name, surname, and
telephone_number would suffice. It is always better to choose the candidate
key with the least number of fields for the primary key, so you would
choose customer_code in this example. Upon reflection, there is also the
possibility of the second combination not being unique. The combination of
first_name, surname, and telephone_number could in theory be duplicated,
such as where a father has a son of the same name who is contactable at the
same telephone number. This system would have to expressly exclude this
possibility for these three fields to be considered for the status of primary
key.
Foreign keys allow for something called referential integrity. What this
means is that if a foreign key contains a value, this value refers to an
existing record in the related table. For example, take a look at Table 7.4
and Table 7.5 .
Referential integrity exists here, as all the lecturers in the course table exist
in the lecturer table. However, let's assume Anne Cohen leaves the
institution, and you remove her from the lecturer table. In a situation where
referential integrity is not enforced, she would be removed from the lecturer
table, but not from the course table, shown in Table 7.6 and Table 7.7 .
Now, when you look up who lectures Introduction to Programming, you are
sent to a nonexistent record. This is called poor data integrity.
Foreign keys also allow cascading deletes and updates. For example, if
Anne Cohen leaves, taking the Introduction to Programming course with
her, all trace of her can be removed from both the lecturer and course table
by using one statement. The delete "cascades" through the relevant tables,
removing all relevant records. Since version 3.23.44, MySQL has supported
the checking of foreign keys with the InnoDB table type, and cascading
deletes have been supported since version 4.0.0. Remember, though, that
enforcing referential integrity has a performance cost; without it, however,
it becomes the responsibility of the application to maintain data integrity.
Foreign keys can contain null values, indicating that no relationship exists.
Introducing Views
Views are virtual tables. They are only a structure and contain no data.
Their purpose is to allow a user to see a subset of the actual data. Views are
one of the most frequent MySQL feature requests, and are due to be
implemented in version 5.
A view can consist of a subset of one table. For example, Table 7.8 is a
subset of the full table, shown in Table 7.9 .
Table 7.8 (a view of the student table): First_name, Surname, Grade

Table 7.9: The Student Table — Student_id, First_name, Surname, Grade, Address, Telephone
This view could be used to allow other students to see their fellow students'
marks but not allow them access to personal information.
Or a view could be a combination of a number of tables, such as the view
shown in Table 7.10 . It's a combination of Table 7.11 , Table 7.12 , and
Table 7.13 .
Student table: Student_id, First_name, Surname, Address, Telephone
Grade table: Student_id, Course_id, Grade
Views are also useful for security. In larger organizations, where many
developers may be working on a project, views allow developers access to
only the data they need. What they don't need, even if it is in the same table,
is hidden from them, safe from being seen or manipulated. It also allows
queries to be simplified for developers. For example, without the view, a
developer would have to retrieve the fields in the view with the following
sort of query:
SELECT first_name, surname, course_description, grade
FROM student, grade, course
WHERE grade.student_id = student.student_id
  AND grade.course_id = course.course_id;
With the view, a developer could do the same with the following:
SELECT first_name, surname, course_description, grade
FROM student_grade_view;
This is much simpler for a junior developer who hasn't yet learned how
to do joins, and it's just less hassle for a senior developer, too!
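When views do arrive (they are planned for version 5), the student_grade_view used above could be defined along the following lines. This is only a sketch using standard SQL syntax; the table and column names are carried over from the earlier query:

CREATE VIEW student_grade_view AS
    SELECT first_name, surname, course_description, grade
    FROM student, grade, course
    WHERE grade.student_id = student.student_id
      AND grade.course_id = course.course_id;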
Summary
Before databases, programmers stored data in files. However, accessing
data from files is inefficient for the programmer, so databases were created.
Hierarchical databases store data in a top-down, one-to-many structure.
They are inflexible and also create a lot of work for the programmers.
Network databases allow easier representation of many-to-many
relationships, but they are difficult to develop and maintain.
Relational databases allow any table to relate to any other table through
means of common fields. It is a highly flexible system, and most modern
databases are relational.
Understanding Normalization
In Part I , "Using MySQL," you created some tables in MySQL. Perhaps
you've been using MySQL for a while with small projects where the
databases contain one or two tables. But as you become more experienced
and begin to tackle bigger projects, you may find that the queries you need
become more complex and unwieldy, you begin to experience performance
problems, or data anomalies start to creep in. Without some knowledge of
database design and normalization, these problems may become
overwhelming, and you will be unable to take the next step in your mastery
of MySQL. Database normalization is a technique that can help you avoid
data anomalies and other problems with managing your data. It consists of
transforming a table through various stages: 1st normal form, 2nd normal
form, 3rd normal form, and beyond. It aims to remove redundantly stored
data and reduce the risk of data anomalies.
Let's begin by creating a sample set of data. You'll walk through the process
of normalization first, without worrying about the theory, to get an
understanding of the reasons you'd want to normalize. Once you've done
that, I'll introduce the theory and steps of the various stages of
normalization, which will make the whole process you're about to carefully
go through now much simpler the next time you do it.
Imagine you are working on a system that records plants placed in certain
locations and the soil descriptions associated with them.
The location:
Location code: 11
Location name: Kirstenbosch Gardens
The location:
Location code: 12
Location name: Karbonkelberg Mountains
There is a problem with the previous data. Tables in relational databases are
in a grid, or table, format (MySQL, like most modern databases, is a
relational database), with each row being one unique record. Let's try to
rearrange this data in the form of a tabular report (as shown in Table 8.1 ).
Table 8.2: Trying to Create a Table with the Plant Data

Location Code | Location Name           | Plant Code | Plant Name   | Soil Category | Soil Description
11            | Kirstenbosch Gardens    | 431        | Leucadendron | A             | Sandstone
NULL          | NULL                    | 446        | Protea       | B             | Sandstone/Limestone
NULL          | NULL                    | 482        | Erica        | C             | Limestone
12            | Karbonkelberg Mountains | 431        | Leucadendron | A             | Sandstone
NULL          | NULL                    | 449        | Restio       | B             | Sandstone/Limestone
This table is not much use, though. The first three rows are actually a group,
all belonging to the same location. If you take the third row by itself, the
data is incomplete, as you cannot tell in which location the Erica is to be
found. Also, with the table as it stands, you cannot use the location code, or any
other field, as a primary key (remember, a primary key is a field, or list of
fields, that uniquely identifies one record). There is not much use in having a
table if you can't uniquely identify each record in it.
So, the solution is to make sure that each table row can stand alone, and is
not part of a group, or set. To achieve this, remove the groups, or sets of
data, and make each row a complete record in its own right, which results in
Table 8.3 .
Table 8.3: Each Record Stands Alone

Location Code | Location Name           | Plant Code | Plant Name   | Soil Category | Soil Description
11            | Kirstenbosch Gardens    | 431        | Leucadendron | A             | Sandstone
11            | Kirstenbosch Gardens    | 446        | Protea       | B             | Sandstone/Limestone
11            | Kirstenbosch Gardens    | 482        | Erica        | C             | Limestone
12            | Karbonkelberg Mountains | 431        | Leucadendron | A             | Sandstone
12            | Karbonkelberg Mountains | 449        | Restio       | B             | Sandstone/Limestone
Note The primary keys are shown in italics in Table 8.3 and the following
tables.
Notice that the location code cannot be a primary key on its own. It does
not uniquely identify a row of data. So, the primary key must be a
combination of location code and plant code. Together these two fields
uniquely identify one row of data. Think about it: You would never add the
same plant type more than once to a particular location. Once you have the
fact that it occurs in that location, that's enough. If you want to record
quantities of plants at a location—for this example you're just interested in
the spread of plants—you don't need to add an entire new record for each
plant; rather, just add a quantity field. If for some reason you would be
adding more than one instance of a plant/location combination, you'd need
to add something else to the key to make it unique.
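A minimal sketch of such a table, with assumed column types and an assumed name of plant_location, would declare the two fields together as the primary key:

CREATE TABLE plant_location (
    location_code INT NOT NULL,
    plant_code    INT NOT NULL,
    quantity      INT,                       # the optional extra field discussed above
    PRIMARY KEY (location_code, plant_code)  # together the two fields identify one row
);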
So, now the data can go in table format, but there are still some problems
with it. The table stores the information that code 11 refers to the
Kirstenbosch Gardens three times! Besides the waste of space, there is
another serious problem. Look carefully at the data in Table 8.4 .
Did you notice anything strange in the data in Table 8.4 ? Congratulations if
you did! Kirstenbosch is misspelled in the second record. Now imagine
trying to spot this error in a table with thousands of records! By using the
structure in Table 8.4 , the chances of data anomalies increase dramatically.
The solution is simple. You remove the duplication. What you are doing is
looking for partial dependencies—in other words, fields that are dependent
on a part of a key and not the entire key. Because both the location code and
the plant code make up the key, you look for fields that are dependent only
on location code or only on plant code.
There are quite a few fields where this is the case. Location name is
dependent on location code (plant code is irrelevant in determining location
name), and plant name, soil category, and soil description are all dependent
on plant code. So, take out all these fields, as shown in Table 8.5 .
Table 8.5: Removing the Fields Not Dependent on the Entire Key
Location Code Plant Code
11 431
11 446
11 482
12 431
12 449
Clearly you can't remove the data and leave it out of the database
completely. You take it out and put it into a new table, consisting of the
fields that have the partial dependency and the fields on which they are
dependent. For each of the key fields in the partial dependency, you create a
new table (in this case, both are already part of the primary key, but this
doesn't always have to be the case). So, you identified plant name, soil
description, and soil category as being dependent on plant code. The new
table will consist of plant code as a key, as well as plant name, soil category,
and soil description, as shown in Table 8.6 .
Table 8.6: Creating a New Table with Plant Data

Plant Code | Plant Name   | Soil Category | Soil Description
431        | Leucadendron | A             | Sandstone
446        | Protea       | B             | Sandstone/Limestone
482        | Erica        | C             | Limestone
449        | Restio       | B             | Sandstone/Limestone
You do the same process with the location data, as shown in Table 8.7 .
Table 8.7: Creating a New Table with Location Data

Location Code | Location Name
11            | Kirstenbosch Gardens
12            | Karbonkelberg Mountains
See how these tables remove the earlier duplication problem? There is only
one record that contains Kirstenbosch Gardens, so the chances of noticing a
misspelling are much higher. And you aren't wasting space storing the name
in many different records. Notice that the location code and plant code
fields are repeated in two tables. These are the fields that create the relation,
allowing you to associate the various plants with the various locations.
Obviously there is no way to remove the duplication of these fields without
losing the relation altogether, but it is far more efficient storing a small code
repeatedly than a large piece of text.
But the table is still not perfect. There is still a chance for anomalies to slip
in. Examine Table 8.8 carefully.
Plant Code | Plant Name   | Soil Category | Soil Description
431        | Leucadendron | A             | Sandstone
446        | Protea       | B             | Sandstone/Limestone
482        | Erica        | C             | Limestone
449        | Restio       | B             | Sandstone
The problem in Table 8.8 is that the Restio has been associated with
sandstone, when in fact, having a soil category of B, it should be a mix of
sandstone and limestone. (The soil category determines the soil description
in this example). Once again you are storing data redundantly: The soil
category to soil description relationship is being stored in its entirety for
each plant. As before, the solution is to take out this excess data and place it
in its own table. What you are in fact doing at this stage is looking for
transitive relationships, or relationships where a nonkey field is
dependent on another nonkey field. Soil description, although in one sense
dependent on plant code (it did seem to be a partial dependency when we
looked at it in the previous step), is actually dependent on soil category. So,
soil description must be removed: Once again, take it out and place it in a
new table, along with its actual key (soil category), as shown in Table 8.9
and Table 8.10 .
Table 8.9: Plant Data After Removing the Soil Description

Plant Code | Plant Name   | Soil Category
431        | Leucadendron | A
446        | Protea       | B
482        | Erica        | C
449        | Restio       | B
Table 8.10: Creating a New Table with the Soil Description

Soil Category | Soil Description
A             | Sandstone
B             | Sandstone/Limestone
C             | Limestone
You've cut down the chance of anomalies once again. It is now impossible
to mistakenly assume soil category B is associated with anything but a mix
of sandstone and limestone. The soil description to soil category
relationship is stored in only one place: the new soil table, where you can
be sure it is accurate.
Let's look at this example without the data tables to guide you. Often when
you're designing a system you don't yet have a complete set of test data
available, and it's not necessary if you understand how the data relates. I've
used the tables to demonstrate the consequences of storing data in tables
that were not normalized, but without them you have to rely on
dependencies between fields, which is the key to database normalization.
Although not always seen as part of the definition of 1st normal form, the
principle of atomicity is usually applied at this stage as well. This means
that all columns must contain atomic values: a single, indivisible value, not
a set or list of values.
So far, the plant example has no keys, and there are repeating groups. To get
it into 1st normal form, you'll need to define a primary key and change the
structure so that there are no repeating groups; in other words, each
row/column intersection contains one, and only one, value. Without this,
you cannot put the data into the ordinary two-dimensional table that most
databases require. You define location code and plant code as the primary
key together (neither on its own can uniquely identify a record), and replace
the repeating groups with a single-value attribute. After doing this, you are
left with the data shown in Table 8.11 .
Table 8.11 (fields): Location code, Location name, Plant code, Plant name, Soil category, Soil description
Let's examine all the fields. Location name is only dependent on location
code. Plant name, soil category, and soil description are only dependent on
plant code. (This assumes that each plant only occurs in one soil type,
which is the case in this example). So you remove each of these fields, and
place them in a separate table, with the key being that part of the original
key on which they are dependent. For example, with plant name, the key is
plant code. This leaves you with Table 8.12 , Table 8.13 , and Table 8.14 .
Table 8.12 (fields): Plant code, Location code
Table 8.13: Table Resulting from Fields Dependent on Plant Code (the plant table) — Plant code, Plant name, Soil category, Soil description
Table 8.14 (fields): Location code, Location name
As only the plant table has more than one nonkey attribute, you can ignore
the others because they are in 3rd normal form already. All fields are
dependent on the primary key in some way, since the tables are in 2nd
normal form. But is this dependency through another nonkey field? Plant
name is not dependent on either soil category or soil description. Nor is soil
category dependent on either soil description or plant name. However, soil
description is dependent on soil category. You use the same procedure as
before, removing it, and placing it in its own table with the attribute that it
was dependent on as the key. You are left with Table 8.15 , Table 8.16 ,
Table 8.17 , and Table 8.18 .
Table 8.15 (fields): Plant code, Location code
Table 8.16: The Plant Table with Soil Description Removed — Plant code, Plant name, Soil category
Table 8.17: The New Soil Table — Soil category, Soil description
Table 8.18 (fields): Location code, Location name
All of these tables are now in 3rd normal form. 3rd normal form is usually
sufficient for most tables, because it avoids the most common kind of data
anomalies. I suggest getting most tables you work with to 3rd normal form
before you implement them, as this will achieve the aims of normalization
listed at the beginning of the chapter in the vast majority of cases. The
normal forms beyond this, such as Boyce-Codd normal form and 4th
normal form, are rarely useful for business applications. In most cases,
tables in 3rd normal form are already in these normal forms anyway. But
any skillful database practitioner should know the exceptions, and be able
to normalize to the higher levels when required.
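Expressed as MySQL table definitions, the 3rd normal form structure of the plant example might look like the following sketch; the column types and exact names are assumptions:

CREATE TABLE location (
    location_code INT NOT NULL PRIMARY KEY,
    location_name VARCHAR(60)
);

CREATE TABLE soil (
    soil_category    CHAR(1) NOT NULL PRIMARY KEY,
    soil_description VARCHAR(40)
);

CREATE TABLE plant (
    plant_code    INT NOT NULL PRIMARY KEY,
    plant_name    VARCHAR(40),
    soil_category CHAR(1)                    # determines the soil description via the soil table
);

CREATE TABLE plant_location (
    location_code INT NOT NULL,
    plant_code    INT NOT NULL,
    PRIMARY KEY (location_code, plant_code)
);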
Student | Course | Instructor
Assume that the following is true for Table 8.19 : Each instructor takes only
one course.
Each course can have one or more instructors. Each student has only one
instructor per course. Each student can take one or more courses.
What would the key be? None of the fields on their own would be sufficient
to uniquely identify a record, so you have to use two fields. Which two
should you use?
Perhaps student and instructor seem like the best choice, as that would
allow you to determine the course. Or you could use student and course,
which would determine the instructor. For now, let's use student and course
as the key (see Table 8.20 ).
Student | Course | Instructor
What normal form is this table in? It's in first normal form, as it has a key
and no repeating groups. It's also in 2nd normal form, as the instructor is
dependent on both other fields (students have many courses and therefore
instructors, and courses have many instructors). Finally, it's also in 3rd
normal form, as there is only one nonkey attribute.
But there are still some data anomalies. Look at the data sample in Table
8.21 .
Table 8.21: More Data Anomalies

Student         | Course      | Instructor
Conrad Pienaar  | Biology     | Nkosizana Asmal
Dingaan Fortune | Mathematics | Kader Dlamini
Gerrie Jantjies | Science     | Helen Ginwala
Mark Thobela    | Biology     | Nkosizana Asmal
Conrad Pienaar  | Science     | Peter Leon
Alicia Ncita    | Science     | Peter Leon
Quinton Andrews | Mathematics | Kader Dlamini
The fact that Peter Leon teaches science is stored redundantly, as are Kader
Dlamini with mathematics and Nkosizana Asmal with biology. The
problem is that the instructor determines the course. Or put another way,
course is determined by instructor. The table conforms to 3rd normal form
rules because no nonkey attribute is dependent upon another nonkey
attribute. However, a key attribute is dependent upon a nonkey attribute!
Again, you can use the familiar method of removing this field and placing it
into another table, along with its key (see Table 8.22 and Table 8.23 ).
After removing the course field, the primary key needs to include both
remaining fields to uniquely identify a record.
Although we had chosen course as part of the primary key in the original
table, the instructor determines the course, which is why we make it the
primary key in this table. As you can see, the redundancy problem has been
solved.
Table 8.24: Using Student and Instructor as the Key (Student Course Instructor Table)

Student | Instructor | Course
Once again it's in 1st normal form because there is a primary key and there
are no repeating groups. This time, though, it's not in 2nd normal form
because course is determined by only part of the key: the instructor. By
removing course and its key, instructor, you get the data shown in Table
8.25 and Table 8.26 .
Table 8.25: Removing Course (Student Instructor Table) — Student | Instructor
Table 8.26 — Instructor | Course
Either way you do it, by making sure the tables are normalized into Boyce-
Codd normal form, you get the same two resulting tables. It's usually the
case that when there are alternate fields to choose as a key, it doesn't matter
which ones you choose initially, because after normalizing you end up with
the same tables either way.
Student | Instructor | Course
But this still has some potentially anomalous behavior. The fact that Kader
Dlamini teaches mathematics is still stored more than once, as is the fact
that Dingaan Fortune takes mathematics. The real problem is that the table
stores more than one kind of fact: that of a student-to-course relationship, as
well as that of a student-to-instructor relationship. You can avoid this, as
always, by separating the data into two tables, as shown in Table 8.29 and
Table 8.30 .
Usually you would store this data in one table, as you need all three records
to see which combinations are valid. Afzal Ignesund sells magazines for
Wordsworth, but not necessarily books. Felicia Powers happens to sell both
books and magazines for Exclusive. However, let's add another condition: If
a sales rep sells a certain product, and they sell it for a particular company,
then they must sell that product for that company.
Let's look at a larger data set adhering to this condition (see Table 8.32 ).
Table 8.33: Creating a Table with Sales Rep and Product

Sales Rep      | Product
Felicia Powers | Books
Felicia Powers | Magazines
Afzal Ignesund | Books
Table 8.34: Creating a Table with Sales Rep and Company

Sales Rep      | Company
Felicia Powers | Exclusive
Felicia Powers | Wordsworth
Afzal Ignesund | Wordsworth

A third table, pairing Company with Product, completes the decomposition; its Product column lists Books and Magazines.
Basically, a table is in 5th normal form if it cannot be made into any smaller
tables with different keys (most tables can obviously be made into smaller
tables with the same key!).
Beyond 5th normal form, you enter the heady realms of domain key normal
form, a kind of theoretical ideal. Its practical use to a database designer is
similar to that of the concept of infinity to a bookkeeper—i.e., it exists in
theory, but is not going to be used in practice. Even the most corrupt
executive is not going to expect that of the bookkeeper!
For those interested in pursuing this academic and highly theoretical topic
further, I suggest obtaining a copy ofAn Introduction to Database Systems
by C.J. Date (Addison-Wesley, 1999).
Understanding Denormalization
Denormalization is the process of reversing the transformations made
during normalization for performance reasons. It's a topic that stirs
controversy among database experts; there are those who claim the costs are
too high and never denormalize, and there are those who tout its benefits
and routinely denormalize.
Be sure you are willing to trade the decreased data integrity for the increase
in performance.
Table 8.36 introduces a common structure where it may not be in your best
interests to denormalize. Can you tell which normal form the table is in?
Table 8.36: Customer Table Customer Table
ID
First name
Surname
Address line 1
Address line 2
Town
ZIP code
Table 8.36 must be in 1st normal form because it has a primary key and
there are no repeating groups. It must be in 2nd normal form because there's
only one key, so there cannot be any partial dependencies. And 3rd normal
form? Are there any transitive dependencies? It looks like it. ZIP code is
probably determined by the town attribute. To make it into 3rd normal form,
you should take out ZIP code, putting it in a separate table with town as the
key. In most cases, I would suggest not doing this though. Although this
table is not really in 3rd normal form, separating this table is not worth the
trouble. The more tables you have, the more joins you need to do, which
slows the system down. The reason you normalize at all is to reduce the size
of tables by removing redundant data (which can often speed up the
system). But you also need to look at how your tables are used. Town and
ZIP code would almost always be returned together, as part of the address.
In most cases, the small amount of space you save by removing the
duplicate town/ZIP code combinations would not offset the slowing down
of the system because of the extra joins. In some situations, this may be
useful, perhaps where you need to sort addresses according to ZIP codes or
towns for thousands of customers, and the distribution of the data means
that a query to the new, smaller table can return the results substantially
quicker. In the end, experienced database designers can go beyond rigidly
following the steps, as they understand how the data will be used. And that
is something only experience can teach you. Normalization is just a helpful
set of steps that most often produces an efficient table structure and not a
rule for database design.
Tip
I've seen some scary database designs out there, almost always because of
not normalizing rather than too much normalization. So if you're unsure,
normalize!
Summary
Database normalization is a process performed on your tables to make them
less likely to fall prey to various common kinds of data anomaly:
1st normal form contains no repeating groups and guarantees that all
attributes are dependent on a primary key.
2nd normal form contains no partial dependencies.
3rd normal form contains no transitive dependencies.
Design The Design phase is where a conceptual design is created from the
previously determined requirements, and a logical and physical design are
created that will ready the database for implementation.
Testing The Testing phase is where the database is tested and fine-tuned,
usually in conjunction with the associated applications.
Operation The Operation phase is where the database is working normally,
producing information for its users.
Maintenance The Maintenance phase is where changes are made to the
database in response to new requirements or changed operating conditions
(such as heavier load).
Phase 1: Analysis
Your existing system can no longer cope. It's time to move on. Perhaps the
existing paper system is generating too many errors, or the old Perl script
based on flat files can no longer handle the load. Or perhaps an existing
news database for a website is struggling under its own popularity and
needs an upgrade. This is the stage where the existing system is reviewed.
When reviewing a system, the designer needs to look at the bigger picture
—not just the hardware or existing table structures, but the whole situation
of the organization calling for the redesign. For example, a large bank with
centralized management would have a different structure and a different
way of operating from a decentralized media organization, where anyone
can post news onto a website. This may seem trivial, but understanding the
organization you're building the database for is vital to designing a good
database for it. The same demands in the bank and media organizations
should lead to different designs because the organizations are different. In
other words, a solution that was constructed for the bank cannot be
unthinkingly implemented for the media organization, even when the
situation seems similar. A culture of central control at the bank may mean
that news posted on the bank website has to be moderated and authorized
by central management, or may require the designer to keep detailed audit
trails of who modified what and when. On the flip side, the media
organization may be more laissez-faire and will be happy with news being
modified by any authorized editor. Understanding an organization's culture
helps the designer to ask the right questions. The bank may not ask for the
audit trail, it may simply expect it; and when the time comes to roll out the
implementation, the audit trail would need to be patched on, requiring more
time and resources.
Once you understand the organization structure, you can question the users
of any existing system as to what their problems and needs are, what
constraints exist currently, and what the objectives of the new database
system are, as well as what constraints will exist then. You need to question
different role players, as each can add a new understanding as to what the
database may need. For example, the media organization's marketing
department may want to track movements from one news article to another
on its website, but the editorial department may want detailed statistics
about the times of day certain articles are read. You may also be alerted to
possible future requirements. Perhaps the editorial department is planning
to expand the website, which will give them the staff to cross-link web
articles. Keeping this future requirement in mind could make it easier to add
the crosslinking feature when the time comes.
Constraints can include hardware ("We have to use our existing database
server, an AMD Duron 900MHz") or people ("We only have one data
capturer on shift at any one time"). Constraints also refer to the limitations
on values. For example, a student's grade in a university database may not
be able to go beyond 100 percent, or the three categories of seats in a
theatre database are small, medium, and large.
Of course, although anything is possible given infinite time and money, this
is almost never forthcoming. Determining scope, and formalizing it, is an
important part of the project. If the budget is for one month's work but the
ideal solution requires three, the designer must make clear these constraints
and agree with the project owners on which facets are not going to be
implemented.
Phase 2: Design
The Design phase is where the requirements identified in the previous phase
are used as the basis to develop the new system. Another way of putting it
is that the business understanding of the data structures is converted into a
technical understanding. The what questions ("What data are required?
What are the problems to be solved?") are replaced by how questions ("How
will the data be structured? How is the data to be accessed?").
This phase consists of three parts: the conceptual design, the logical design,
and the physical design. Some methodologies merge the logical design
phase into the other two phases. Note that this chapter is not aimed at being
a definitive discussion of database design methodologies (there are whole
books written on that!); rather it aims to introduce you to the topic.
Conceptual Design
The purpose of the conceptual design phase is to build a conceptual model
based upon the previously identified requirements but closer to the final
physical model. The most useful and common conceptual model is called
an entity-relationship model.
Is it its own thing that cannot be separated into subcategories? For example,
a car-rental agency may have different criteria and storage requirements for
different kinds of vehicles. Vehicle may not be an entity, as it can be
broken up into car and boat, which are the entities.
Does it list a type of thing, not an instance? The video game blow-em-up 6
is not an entity, but rather an instance of the game entity.
Relationships
Entities are related in certain ways. For example, a customer can belong to a
library and can take out books. A book can be found in a particular library.
Understanding what you are storing data about, and how the data relate,
leads you a large part of the way to a physical implementation in the
database.
Optional For each instance of entity A, there may or may not exist
instances of entity B.
Figure 9.3 shows husband and wife entities. Each husband must have one
and only one wife, and each wife must have one, and only one, husband.
Both relationships are mandatory.
An entity can also have a relationship with itself. Such an entity is called a
recursive entity. Take a person entity: If you're interested in storing data
about which people are brothers, you will have an "is a brother to"
relationship. In this case, the relationship is an M:N relationship.
The term cardinality refers to the specific number of instances possible for
a relationship. Cardinality limits list the minimum and maximum possible
occurrences of the associated entity. In the husband and wife example, the
cardinality limit is (1,1), and in the case of a student who can take between
one and eight courses, the cardinality limits would be represented as (1,8).
The first step in developing the diagram is to identify all the entities in the
system. In the initial stage, it is not necessary to identify the attributes, but
this may help to clarify matters if the designer is unsure about some of the
entities. Once the entities are listed, relationships between these entities are
identified and modeled according to their type: one-to-many, optional, and
so on. There are many software packages that can assist in drawing an
entity-relationship diagram, but any graphical package should suffice.
Tip Drawing entity-relationship diagrams is an acquired skill, though, and
more experienced designers will have a good idea of what works and of
possible problems at a later stage, having gone through the process before.
Once the diagram has been approved, the next stage is to replace many-to-
many relationships with two one-to-many relationships. A DBMS cannot
directly implement many-to-many relationships, so they are decomposed
into two smaller relationships. To achieve this, you have to create
an intersection, or composite, entity type. Because intersection entities are
less "real-world" than ordinary entities, they are sometimes difficult to
name. In this case, you can name them according to the two entities being
intersected. For example, you can intersect the many-to-many relationship
between student and course by a student-course entity (see
Figure 9.4 ).
The same applies even if the entity is recursive. The person entity that has
an M:N relationship "is brother to" also needs an intersection entity. You
can come up with a good name for the intersection entity in this
case:brother . This entity would contain two fields, one for each person
of the brother relationship—in other words, the primary key of the first
brother and the primary key of the other brother (see Figure 9.5 ).
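A sketch of the brother intersection table, assuming the person table uses an integer primary key, could be as simple as this:

CREATE TABLE brother (
    person_code  INT NOT NULL,   # primary key of the first brother, from the person table
    brother_code INT NOT NULL,   # primary key of the other brother
    PRIMARY KEY (person_code, brother_code)
);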
Each entity will become a database table, and each attribute will become a
field of this table. Foreign keys can be created if the DBMS supports them
and the designer decides to implement them. If the relationship is
mandatory, the foreign key must be defined as NOT NULL, and if it is
optional, the foreign key can allow nulls. For example, because of the
invoice line-to-product relationship in the previous example, the product
code field is a foreign key in the invoice line table. Because the invoice line
must contain a product, the field must be defined as NOT NULL. Currently,
InnoDB tables do support foreign key constraints; MyISAM tables do not
support foreign keys in version 4, but they probably will in version 4.1.
A DBMS that does support foreign keys uses ON DELETE CASCADE and
ON DELETE RESTRICT clauses in their definitions. ON DELETE
RESTRICT means that records cannot be deleted unless all records
associated with that foreign key are deleted. In the invoice line-to-product
case, ON DELETE RESTRICT in the invoice line table means that if a
product is deleted, the deletion will not take place unless all associated
invoice lines with that product are deleted as well. This avoids the
possibility of an invoice line existing that points to a nonexistent
product. ON DELETE CASCADE achieves a similar effect but more
automatically (and more dangerously!). If the foreign key was declared
with ON DELETE CASCADE, associated invoice lines would
automatically be deleted if a product was deleted. ON UPDATE CASCADE
is similar to ON DELETE CASCADE, in that all foreign key references to a
primary key are updated when the primary key is updated.
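The following sketch shows how the invoice line table described here might declare such a foreign key on InnoDB; the table and column names are assumptions, an existing product table keyed on product_code is assumed, and depending on your MySQL version the ON UPDATE CASCADE clause may not yet be supported:

CREATE TABLE invoice_line (
    invoice_code INT NOT NULL,
    line_number  INT NOT NULL,
    product_code INT NOT NULL,              # mandatory relationship, so NOT NULL
    quantity     INT,
    PRIMARY KEY (invoice_code, line_number),
    INDEX (product_code),
    FOREIGN KEY (product_code) REFERENCES product (product_code)
        ON DELETE RESTRICT                  # refuse to delete a product still referenced here
        ON UPDATE CASCADE                   # follow changes to the product's primary key
) TYPE=InnoDB;

Swapping RESTRICT for CASCADE gives the more automatic (and more dangerous) behavior described above.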
Keep unrelated data in different tables. People who are used to using
spreadsheets often make this mistake because they are used to seeing all
their data in one two-dimensional table. A relational database is much more
powerful; don't "hamstring" it in this way.
Don't store values you can calculate. Let's say you're interested in three
numbers: A, B, and the product of A and B (A * B). Don't store the product.
It wastes space and can easily be calculated if you need it. And it makes
your database more difficult to maintain: If you change A, you also have to
change all of the products as well. Why waste your database's efforts on
something you can calculate when you need it?
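For example, rather than keeping a stored product column, you can derive it whenever you query; the table and column names here are made up purely for illustration:

SELECT a, b, a * b AS product    # the product is calculated on the fly, never stored
FROM measurement;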
Does your design cater to all the conditions you've analyzed? In the heady
rush of creating an entity-relationship diagram, you can easily overlook a
condition. Entity-relationship diagrams are usually better at getting
stakeholders to spot an incorrect rule than spot a missing one. The business
logic is as important as the database logic and is more likely to be
overlooked. For example, it's easy to spot that you cannot have a sale
without an associated customer, but have you built in that a customer cannot
be approved for a sale of less than $500 if another approved customer has
not recommended them?
Are your attributes, which are about to become your field names, well
chosen? Fields should be clearly named. For example, if you use f1 and f2
instead of surname and first_name, the time saved in less typing will
be lost in looking up the correct spelling of the field or in mistakes where a
developer thought f1 was the first name, and f2 the surname. Similarly, try
to avoid the same names for different fields. If six tables have a primary key
of code, you're making life unnecessarily difficult. Rather, use more
descriptive terms, such as sales_code or customer_code.
Don't create too many relationships. Almost every table in a system can be
related by some stretch of the imagination, but there's no need to do this.
For example, a tennis player belongs to a sports club. A sports club belongs
to a region. The tennis players then also belong to a region, but this
relationship can be derived through the sports club, so there's no need to
add another foreign key (except to achieve performance benefits for certain
kinds of queries). Normalizing can help you avoid this sort of problem (and
even when you're trying to optimize for speed, it's usually better to
normalize and then consciously denormalize rather than not normalize at
all).
Conversely, have you catered to all relations? Do all relations from your
entity-relationship diagram appear as common fields in your table
structures? Have you covered all relations? Are all many-to-many
relationships broken up into two one-to-many relationships, with an
intersection entity?
Have you listed all the constraints? Constraints include a gender that can
only be m or f, ages of schoolchildren that cannot exceed 20, or e-mail
addresses that need to have an at sign (@) and at least one period (.); don't
take these limits for granted. At some stage the system will need to
implement them, and you're going to either forget to do so, or have to go
back to gather more data if you don't list these up front.
Are you planning to store too much data? Should a customer be asked to
supply their eye color, favorite kind of fish, and names of their grandparents
if they are simply trying to register for an online newsletter? Sometimes
stakeholders want too much information from their customers. If the user is
outside the organization, they may not have a voice in the design process,
but they should always be thought of foremost. Consider also the difficulty
and time taken to capture all the data. If a telephone operator needs to take
all this information down before making a sale, imagine how much slower
they will be. Also consider the impact data has on database speed. Larger
tables are generally slower to access, and unnecessary BLOB, TEXT,
and VARCHAR fields lead to record and table fragmentation.
Have you combined fields that should be separate? Combining first name
and surname into one name field is a common mistake. Later you'll realize
that sorting names alphabetically is tricky if you've stored them as John
Ellis and Alfred Ntombela. Keep distinct data discrete.
Has every table got at least a primary key? There had better be a good
reason for leaving out a primary key. How else are you going to identify a
unique record quickly? Consider that an index speeds up access time
tremendously, and when kept small it adds very little overhead. Also, it's
usually better to create a new field for the primary key rather than take
existing fields. First name and surname may be unique in your current
dataset, but they may not always be. Creating a system-defined primary key
ensures that it will always be unique.
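In MySQL, the usual way to create such a system-defined key is an AUTO_INCREMENT column, as in this sketch (the table and column names are illustrative):

CREATE TABLE member (
    member_code INT NOT NULL AUTO_INCREMENT,  # generated by the server, always unique
    first_name  VARCHAR(30),
    surname     VARCHAR(40),
    PRIMARY KEY (member_code)
);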
Give some thought to your other indexes. What fields are likely to be used
in the condition to access the table? You can always create more indexes
later when you test the system, but add any you think you need at this stage.
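For instance, if you expect to search the hypothetical member table above by surname, you might add an index for it up front and create further ones later as testing shows the need:

ALTER TABLE member ADD INDEX idx_surname (surname);
# CREATE INDEX idx_surname ON member (surname);   # an equivalent alternative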
Have you covered all character sets you may need? German letters, for
example, have an expanded character set, and if the database is to cater to
German users it will have to take this into account. Similarly, dates and
currency formats should be carefully considered if the system is to be
international.
Phase 3: Implementation
The Implementation phase is where you install the DBMS on the required
hardware, optimize the database to run best on that hardware and software
platform, and create the database and load the data. The initial data could be
either new data captured directly or existing data imported from a MySQL
database or other DBMS. You also establish database security in this phase
and give the various users that you've identified access applicable to their
requirements. Finally, you also initiate backup plans in this phase.
Phase 4: Testing
The Testing phase is where the performance, security, and integrity of the
data are tested. Usually this will occur in conjunction with the applications
that have been developed. You test the performance under various load
conditions to see how the database handles multiple concurrent connections
or high volumes of updating and reading. Are the reports generated quickly
enough? For example, an application designed with MyISAM tables may
prove too slow because the impact of the updates was underestimated. The
table type may have to be changed to InnoDB in response.
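The conversion itself is a one-line statement; assuming a table named orders, something like the following would do it (newer versions also accept ENGINE in place of TYPE):

ALTER TABLE orders TYPE=InnoDB;   # convert the table from MyISAM to the transactional InnoDB type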
Data integrity also needs to be tested, as the application may have logical
flaws that result in transactions being lost or other inaccuracies. Further,
security needs to be tested to ensure that users can access and change only
the data they should.
The testing and fine-tuning process is an iterative one, with multiple tests
performed and changes implemented.
The following are the steps in the Testing phase:
1. Test the performance.
2. Test the security.
3. Test the data integrity.
4. Fine-tune the parameters or modify the logical or physical designs in
response to the tests.
Phase 5: Operation
The Operation phase takes place when the testing is complete and the
database is ready to be rolled out for everyday use. The users of the system
begin to operate the system, load data, read reports, and so on. Inevitably,
problems come to light. The designer needs to manage the database's scope
carefully at this stage, as users may expect all their desires to be pandered
to. Poor database designers may find themselves extending the project well
beyond their initial time estimate, and the situation may also become
unpleasant if the scope has not been clearly defined and agreed upon.
Project owners will feel wronged if their needs are not met, and the
database designers will feel overworked and underpaid. Even when scope
has been well managed, there will always be new requirements. These then
lead into the next stage.
Phase 6: Maintenance
The database Maintenance phase incorporates general maintenance, such as
maintaining the indexes, optimizing the tables, adding and removing users,
and changing passwords, as well as backups and restoration of backups in
case of a failure. (See Chapter 10 , "Basic Administration," for more
information about maintenance.) New requirements also start to be
requested, and this may result in new fields, or new tables, being created.
As the system and organization changes, the existing database becomes less
and less sufficient to meet the organization's needs. For example, the media
organization may be amalgamated with media bodies from other countries,
requiring integration of many data sources, or the volumes and staff may
expand (or reduce) dramatically. Eventually, there comes a time, whether
it's 10 months after completion or 10 years, when the database system needs
to be replaced. The maintenance of the existing database begins to drain
more and more resources, and the effort to create a new design is matched
by the current maintenance effort. At this point, the database is coming to
the end of its life, and a new project begins its life in the Analysis phase.
The designer asks various questions to get more detailed information, such
as "What is a poet, as far as the system goes? Does Poet's Circle keep track
of poets even if they haven't written or published poems? Are publications
recorded even before there are any associated poems? Does a publication
consist of one poem or many? Are potential customer details recorded?"
The following summarizes the responses:
All captured poems are written by an associated poet, whose details are
already in the system. There can be no poems submitted and stored without
a full set of details of the poet.
Next, you need to determine the relationships between these entities. You
can identify the following:
A poet can write many poems. The analysis identified the fact that a poet
can be stored in the system even if there are no associated poems. Poems
may be captured at a later point in time, or the poet may still be a potential
poet. Conversely, many poets could conceivably write a poem, though the
poem must have been written by at least one poet.
A publication may contain many poems (an anthology) or just one. It can
also contain no poems (poetry criticism, for example). A poem may or may
not appear in a publication.
A sale must be for at least one publication but may be for many. A
publication may or may not have made any sales.
A customer may have many sales made to them, or none at all. A sale is made for one
and only one customer.
You can identify the following attributes:
Poet: first name, surname, address, telephone number
Poem: poem title, poem contents
Publication: title, price
Sales: date, amount
Customer: first name, surname, address, telephone number
Based on these entities and relationships, you can construct the entity-
relationship diagram shown in Figure 9.6 .
Now, to begin the logical and physical design, you need to add attributes
that can create the relationship between the entities, and specify primary
keys. You do what's usually best and create new, unique primary keys.
Tables 9.1 through 9.7 show the structures for the tables created from each
of the entities.
Sale-publication table:
sale code — joint primary key, foreign key, integer
publication code — joint primary key, foreign key, integer

Table 9.6: Sale Table
sale code — primary key, integer
date — date
amount — numeric (10,2)
customer code — foreign key, integer

(The remaining field definitions from these tables: primary key, integer; character (30); character (40); character (100); character (20); character (30).)
MySQL will have no problem with this design and is selected as the
DBMS. Existing hardware and operating system platforms are also selected.
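As a sketch of how two of these structures could translate into MySQL statements (the exact names, and the use of NOT NULL, are assumptions), the sale and sale-publication tables might be created as follows:

CREATE TABLE sale (
    sale_code     INT NOT NULL,
    sale_date     DATE,
    amount        NUMERIC(10,2),
    customer_code INT,                          # foreign key to the customer table
    PRIMARY KEY (sale_code)
);

CREATE TABLE sale_publication (
    sale_code        INT NOT NULL,
    publication_code INT NOT NULL,
    PRIMARY KEY (sale_code, publication_code)   # joint primary key; both fields are also foreign keys
);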
A project manager who does not properly account for testing is simply incompetent.
No matter how tiny your system, make sure you allocate time for thorough
testing and time for fixing the inevitable bugs.
Once testing is complete, the system can be rolled out. You decide on a
low-key rollout and give a few selected poets access to the website to
upload their poems. You discover other problems: Obscure browsers have
incompatibilities that lead to garbled poems being submitted. Strictly
speaking, this doesn't fall into the database programmer's domain, but it's
the kind of situation testing will reveal once all the elements of the system
are working together. You decide to insist that users make use of browsers
that can render the developed pages correctly, and browsers that don't
adhere to these standards are barred from uploading.
Atomicity
Atomicity means the entire transaction must complete. If this is not the case,
the entire transaction is aborted. This ensures that the database can never be
left with partially completed transactions, which lead to poor data integrity.
If you remove money out of one bank account, for example, but the second
request fails and the system cannot place the money in another bank
account, both requests must fail. The money cannot simply be lost or taken
from one account without going into the other.
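With a transactional table type such as InnoDB, the transfer can be wrapped in a single transaction. This sketch assumes an account table with account_no and balance columns:

BEGIN;                                     # START TRANSACTION in more recent versions
UPDATE account SET balance = balance - 100 WHERE account_no = 1;
UPDATE account SET balance = balance + 100 WHERE account_no = 2;
COMMIT;                                    # both changes become permanent together
# If anything goes wrong before the COMMIT, issue ROLLBACK instead,
# and neither account is changed.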
Consistency
Consistency refers to the state the data is in when certain conditions are
met. For example, one rule may be that each invoice must relate to a
customer in the customer table. These rules may be broken during the
course of a transaction if, for example, the invoice is inserted without a
related customer, which is added at a later stage in the transaction. These
temporary violations are not visible outside of the transaction, and will
always be resolved by the time the transaction is complete.
Isolation
Isolation means that any data being used during the processing of one
transaction cannot be used by another transaction until the first transaction
is complete. For example, if two people deposit $100 into an account with a
balance of $900, the first transaction must add $100 to $900, and the second
must add $100 to $1,000. If the second transaction reads the $900 before
the first transaction has completed, both transactions will seem to succeed,
but $100 has gone missing. The second transaction must wait until it alone
is accessing the data.
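With InnoDB you can make a later transaction wait by explicitly locking the row you read; a sketch, again assuming the account table from the previous example:

BEGIN;
SELECT balance FROM account WHERE account_no = 1 FOR UPDATE;   # locks the row until COMMIT
UPDATE account SET balance = balance + 100 WHERE account_no = 1;
COMMIT;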
Durability
Durability refers to the fact that once data from a transaction has been
committed, its effects will remain, even after a system failure. While a
transaction is under way, the effects are not persistent. If the database
crashes, backups will always restore it to a consistent state prior to the
transaction commencing. Nothing a transaction does should be able to
change this fact.
Summary
Good database design ensures a longer-living and more efficient database
system. By spending the time to design it carefully, designers can avoid
most of the commonly repeated errors that plague many existing databases.
The database lifecycle (DBLC) can be defined in many ways, but it comes
down to the same main steps. First, the Analysis phase is where information
is gathered and the existing system is examined to identify current
problems, possible solutions, and so on. Then, the Design phase is where
the new system is carefully designed, first conceptually for the stakeholders
and then logically and physically for implementation. Next, the
Implementation phase physically rolls out the database, before the Testing
phase brings any problems to light. Then, when the Testing phase has
succeeded, the system is put into operation for day-to-day use. Almost
immediately, the Maintenance phase starts. As change requests come in,
routine optimizations and backups need to be performed. Finally, once
maintenance becomes too intensive, a new database cycle begins to replace
the aging system.
Transactions ensure that the database remains in a consistent state
throughout its existence. There are four principles that keep this so.
Atomicity states that all requests within a transaction succeed or fail as one.
Consistency ensures that the database will always return to a coherent state
between transactions. Isolation ensures that all requests from one
transaction are completed before the next transaction that affects the same
data is allowed to begin processing. And durability keeps the database
consistent even in the case of failures.
Meeting MySQL as an
Administrator
As an administrator, you'll need to know a lot more about how MySQL
works than if you were just running queries. You'll need to be familiar with
the utilities supplied with the MySQL distribution, as well as how your
MySQL is set up. The main utilities you will use as an administrator are the
following:
mysqld This is not really a utility, as it's the MySQL server. You'll come
across terms such as the mysqld variables, and you should know this is
nothing more esoteric than the server variables.
datadir = C:/mysqldata
You can also just get the variable you're interested in by using grep (Unix)
or find (Windows), as follows, first on Unix, then on Windows:
% mysqladmin -uroot -pg00r002b variables | grep 'datadir'
| datadir | /usr/local/mysql/data/ |
or
C:\mysql\bin> mysqladmin variables | find "datadir"
| datadir | C:\mysql\data\ |
The data directory in this case is /usr/local/mysql/data, the
default for most binary installations on Unix, and C:\mysql\data in
the Windows example.
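If you're already connected with the mysql client, you can get the same information with a SQL statement instead of mysqladmin:

SHOW VARIABLES LIKE 'datadir';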
The data directory usually contains the log files (they are placed there by
default, though you can change this) as well as the actual data. In the case
of MyISAM tables (the default), each database has its own directory, and
within this directory each table has three corresponding files: an .MYD file
for the data, an .MYI file for the indexes, and an .frm file for the definition.
BDB tables are also stored in the same directory, but they consist of a .db
file and an .frm definition file. InnoDB tables have their .frm definition
file in the database directory, but the actual data is stored one level up, on
the same level as the database directories.
TCP/IP through a port This is the slowest method but is the only way to
connect to a server running on Windows 95/98/Me or to connect remotely
to a Unix machine.
Warning There are multiple ways to start a server, and trying the wrong
one could cause problems! Most production systems use a script to
automatically start MySQL upon booting up, and you should use this if one
is in place.
Some distributions come with a script called mysql.server, which may even
be automatically installed for you (sometimes renamed just to mysql). This
would usually be placed in a directory where processes are automatically
activated upon booting. If this is the case, you'd use this script to start.
mysql.server takes start and stop options. The following is a common
startup used on Red Hat Linux:
% /etc/rc.d/init.d/mysql start
% Starting mysqld daemon with databases from
/usr/local/mysql/data
On FreeBSD, the file may be placed in /usr/local/etc/rc.d , in
which case you'd use the following:
% /usr/local/etc/rc.d/mysql.sh start
To shut down, you can use mysqladmin:
% mysqladmin shutdown -uroot -p
Enter password:
020706 16:56:02 mysqld ended
The next two lines ensure that MySQL is started when the system boots up
and reaches multiuser mode (run level 3) and shuts down with the system
(run level 0). They create a link from the appropriate level to the
mysql.server script:
% ln -s /etc/rc.d/init.d/mysql.server
/etc/rc.d/rc3.d/S99mysql % ln -s
/etc/rc.d/init.d/mysql.server
/etc/rc.d/rc0.d/S01mysql
The next example is from a recent version of FreeBSD, where the startup
scripts just need to be copied to the rc.d directory and given an .sh
extension:
% cp /usr/local/mysql/support-files/mysql.server
/usr/local/etc/rc.d/mysql.sh
Make sure your script is executable, and not accessible to any unauthorized
eyes, with the following:
% chmod 700 /usr/local/etc/rc.d/mysql.sh
Some older systems may use /etc/rc.local to start scripts, in which
case you should add something like the following to the file:
/bin/sh -c 'cd /usr/local/mysql ;
./bin/safe_mysqld --user=mysql &'
% su
Password:
% /usr/local/mysql/bin/mysqld_safe --user=mysql &
[1] 24756
% Starting mysqld daemon with databases from
/usr/local/mysql-max-4.0.2-alpha-unknown-
freebsdelf4.6-i386/data
For other problems, the MySQL error log may give you some assistance.
See the section titled "The Error Log " later in this chapter.
Starting and Shutting Down in Windows
An optimized binary that supports named pipes (for use with NT/2000/XP).
It can run on 95/ 98/Me, but no named pipes will be created, as these
operating systems do not support it.
Note If you're still using Windows 95, make sure Winsock 2 is installed.
Older versions of Windows 95 do not come with Winsock 2, and MySQL
will not run. You can download it from www.microsoft.com .
Make sure your my.ini file contains the executable you want to use. You
may run mysqld-max manually and then want to start it automatically with
winmysqladmin. However, if your file contains the following, for example:
[WinMySQLAdmin]
Server=C:/PROGRAM FILES/MYSQL/bin/mysqld-opt.exe
you won't be able to use the transactional capability you may have been
expecting. You can edit the my.ini file manually, or you can use
winmysqladmin to modify it, selecting my.ini Setup and changing the
mysqld file (see Figure 10.1 ).
Figure 10.1: Using winmysqladmin to update the my.ini configuration file
With NT/2000/XP, install MySQL as a service as follows:
C:\> c:\mysql\bin\mysqld-max-nt -install
If you don't want MySQL to start automatically, but you still want it as a
service, run the same command with the manual option:
C:\mysql\bin> mysqld-max-nt --install-manual
You can then start the service with the following:
C:\> net start mysql
A my.ini file for this setup could look something like this:
[mysqld]
basedir=C:/Program Files/mysql
datadir=C:/Program Files/data
[WinMySQLAdmin]
Server=C:/Program Files/mysql/bin/mysqld-max-
nt.exe
Note Windows pathnames are specified with forward slashes, not the usual
Windows backslash, in option files. If you want to use backslashes, you'll
need to escape them (with another backslash), as the backslash is a
special MySQL character, for example: Server=C:\\Program
Files\\mysql\\bin\\mysqld-opt.exe .
You may have wanted to use spaces in your filename and tried something
like this:
C:/program files/mysql
instead of this:
C:/progra~1/mysql
If the pathname contains spaces, use the short 8.3 form.
Configuring MySQL
To get MySQL to run smoothly in the way you want, you'll need to
configure it in certain ways, such as to set the default table type to InnoDB
or to display error messages in a certain language. You can set most options
in three ways: from the command line, from a configuration file, or from a
preset environment variable. Setting options from the command line is
useful for testing, but it's not useful if you want to keep those options over a
long period of time. Environment variables are almost never used. The most
common, and the most useful method, is through a configuration file.
On Unix, the startup configuration file is usually called my.cnf and can be
placed in the following locations. MySQL reads this from top to bottom, so
the lower positions have a higher precedence (see Table 10.2 ).
Table 10.2: Precedence of the Configuration Files on Unix

File — Description
/etc/my.cnf — Global options that apply to all servers and users. If you're
unsure where to put a configuration file, place it here.
DATA_DIRECTORY/my.cnf — Options specific to the server that stores its data in
the specified DATA_DIRECTORY. This is usually /usr/local/mysql/data for
binary or /usr/local/var for source installations. Be warned that this is
not necessarily the same as the --datadir option specified for mysqld.
Rather, it's the one specified when the system was set up.
defaults-extra-file — Options specific to server or client utilities started with
the --defaults-extra-file=filename command-line option.
~/.my.cnf — Options specific to the user.

The corresponding files on Windows are:

File — Description
C:\my.cnf — Global options that apply to all servers and users (you could just
use the previous my.ini file instead).
C:\DATA_DIRECTORY\my.cnf — Options specific to the server that stores its data
in the specified DATA_DIRECTORY (which is usually C:\mysql\data ).
defaults-extra-file=filename — Options specific to server or client utilities
started with the --defaults-extra-file=filename command-line option.
In Windows, if the C drive is not the boot drive, or you use the
winmysqladmin utility, you have to use a my.ini configuration file (in the
Windows system folder).
Note: Windows has no configuration file for options specific to the user.
A sample configuration file follows:
[client]
port            = 3306
socket          = /tmp/mysql.sock
[mysqld]
port            = 3306
socket          = /tmp/mysql.sock
set-variable    = key_buffer=16M
set-variable    = max_allowed_packet=1M
set-variable    = table_cache=64
set-variable    = sort_buffer=512K
set-variable    = net_buffer_length=8K
set-variable    = myisam_sort_buffer_size=8M
set-variable    = ft_min_word_length=3
server-id       = 1
[mysqldump]
quick
set-variable    = max_allowed_packet=16M
[mysql]
no-auto-rehash
# Remove the next comment character if you are not familiar with SQL
#safe-updates
[myisamchk]
set-variable    = key_buffer=20M
set-variable    = sort_buffer=20M
set-variable    = read_buffer=2M
set-variable    = write_buffer=2M
[mysqlhotcopy]
interactive-timeout
The hash (#) denotes a comment, and the square brackets ([]) are section
markers. Terms inside square brackets denote which program the settings
that follow will affect. In this example, the setting interactive-timeout
will apply only when running the program mysqlhotcopy. The options
beneath a section marker apply to that section until the next section marker.
In the previous example, the first port applies to clients, and the second port
applies to the MySQL server. They're usually the same, but they don't have
to be (such as when you run multiple MySQL servers on the same
machine).
MySQL distributions also ship with several sample configuration files (such
as my-small.cnf, my-medium.cnf, my-large.cnf, and my-huge.cnf), each
suited to a system of a particular size. View the files to check the latest
documentation, though; 512MB is not going to remain a "large" system
forever. I suggest copying the one that comes closest to your needs to the
directory where MySQL will read it and then making any further
modifications.
Tip
Keep a backup of your configuration file as well. If your system fails, you
may lose quite a lot of time reconfiguring the server.
Logging
Examining log files may not sound like your idea of fun on a Friday
evening, but it can be an invaluable aid in not only identifying problems
that have already occurred, but spotting situations that, if left untouched,
may soon cause you to lose more than a Friday night. MySQL has a number
of different log files:
The error log This is the place to look for problems with starting, running,
or stopping MySQL.
The query log All connections and executed queries are logged here.
The binary update log All SQL statements that change data are stored.
The slow query log All queries that took more than long_query_time
to execute, or that didn't make use of any indexes, are logged here.
The update log This has been deprecated and should be replaced by the
binary update log in all instances. It stores SQL statements that change data.
The ISAM log Logs all changes to the ISAM tables. Used only for
debugging the ISAM code.
The Error Log
The error log contains startup and shutdown information and any critical
errors that occur while running. It will log if the server died and was
automatically restarted or if MySQL notices that a table needs to be
automatically checked or repaired. The log may also contain a stack trace
when MySQL dies. A sample error log follows:
010710 19:52:43 mysqld started
010710 19:52:43 Can't start server: Bind on TCP/IP
port: Address already in use
010710 19:52:43 Do you already have another mysqld
server running on port: 3306 ?
010710 19:52:43 Aborting
This logs a situation you were warned about at the beginning of the chapter,
where MySQL has been improperly started. When you then attempted to
start MySQL properly, you could not, because another process had already
bound the port. To solve the problem, you had to end the rogue process
(such as by running kill -s 9 PID or kill -9 PID on Unix, or using
the Task Manager on Windows).
The Query Log
The query log records all connections and executed queries. It can be
useful to see who is connecting (and when) for security purposes, as well
as for debugging to see if the server is receiving queries correctly. A
sample query log follows:
/usr/local/mysql-max-4.0.1-alpha-pc-linux-gnu-i686/bin/mysqld,
Version: 4.0.1-alpha-max-log, started with:
Tcp port: 3306  Unix socket: /tmp/mysql.sock
Time              Id  Command   Argument
020707  1:01:29    1  Connect   root@localhost on
020707  1:01:35    1  Init DB   firstdb
020707  1:01:38    1  Query     show tables
020707  1:01:51    1  Query     select * from innotest
020707  1:01:54    1  Quit
The Binary Update Log
The binary update log is activated with the --log-bin[=filename] server
option. Any extension you give will be dropped, as MySQL adds its own
extension to the binary log. If no filename is specified, the binary log is
named after the host machine, with -bin appended. MySQL also creates a
binary log index file, with the same name and an extension of .index. The
index can be given a different name (and location) with the following option:
--log-bin-index=binlog_index_filename
The binary update logs contain all the SQL statements that update the
database, as well as how long the query took to execute and a timestamp of
when the query was processed. Statements are logged in the same order as
they are executed (after the query is complete but before transactions are
completed or locks removed). Updates that have not yet been committed are
placed in a cache first.
The binary update log is also useful for restoring backups (see Chapter 11 ,
"Database Backups") and for when you are replicating a slave database
from a master (see Chapter 12 , "Database Replication").
Binary update logs start with an extension of .001. A new one is created,
with the number incremented by one, each time the server is restarted or
one of mysqladmin refresh, mysqladmin flush-logs, or FLUSH
LOGS is run. A new binary log is also created (and incremented) when the
binary log reaches max_binlog_size. After a few restarts, the binary log
index file might contain the following:
./test-bin.001
./test-bin.002
./test-bin.003
./test-bin.004
If you now flushed the logs, the binary update index would be appended
with the new binary log:
% mysqladmin -u root -pg00r002b flush-logs
The index file now contains the following:
./test-bin.001
./test-bin.002
./test-bin.003
./test-bin.004
./test-bin.005
You can delete all the unused binary update logs withRESET MASTER :
mysql> RESET MASTER;
Query OK, 0 rows affected (0.00 sec)
The binary update index now reflects that there is only one binary update
log:
./test-bin.006
Warning: Do not remove binary update logs until you are sure they are not
going to be needed again.
Not all updates to all databases need to be logged; in many cases, you may
only want to store updates to certain databases. The binlog-do-db
and binlog-ignore-db options in the my.cnf and my.ini
configuration files allow you to control this. The first specifically sets
which database updates are to be logged. For example, the following:
binlog-do-db = firstdb
will log updates only to the firstdb database, but the following:
binlog-ignore-db = test
will log updates to all databases except test. You can add multiple lines if
you want to log multiple databases:
binlog-do-db = test
binlog-do-db = firstdb
Viewed with the mysqlbinlog utility (discussed later in this chapter), the
contents of a binary update log look something like this:
use firstdb;
SET TIMESTAMP=1023036087;
CREATE TABLE customer(id INT);
# at 146
#020602 18:41:40 server id 1  Query  thread_id=3  exec_time=0  error_code=0
SET TIMESTAMP=1023036100;
INSERT INTO customer(id) VALUES(1);
# at 218
#020602 18:43:12 server id 1  Query  thread_id=5  exec_time=0  error_code=0
SET TIMESTAMP=1023036192;
INSERT INTO customer VALUES(12);
# at 287
#020602 18:45:00 server id 1  Stop
where MySQL allows reads and writes to happen at the same time in
MyISAM tables, but enabling them with these two kinds of statement
would mean that the binary update log could not be reliably used for
restoring backups or for replication.
The Slow Query Log
The slow query log is a useful log to have in place; the performance impact
is not high (assuming most of your queries are not slow!), and it highlights
the queries that most need attention (where indexes are missing or not
optimally used). A sample slow query log follows:
/usr/local/mysql-max-4.0.1-alpha-pc-linux-gnu-i686/bin/mysqld,
Version: 4.0.1-alpha-max-log, started with:
Tcp port: 3306  Unix socket: /tmp/mysql.sock
Time              Id  Command   Argument
# Time: 020707 13:57:57
# User@Host: root[root] @ localhost []
# Query_time: 0 Lock_time: 0 Rows_sent: 8
Rows_examined: 8
use firstdb;
select id from sales;
# Time: 020707 13:58:47
# User@Host: root[root] @ localhost []
# Query_time: 0 Lock_time: 0 Rows_sent: 6
Rows_examined: 8
In this log, the select id from sales query is there because it did
not make use of an index. The query could have made use of an index on
the id field. See Chapter 4, "Indexes and Query Optimization," for a
discussion on where to use indexes.
You can also use the mysqldumpslow utility to display the results of the
slow query log:
% mysqldumpslow test-slow.log
Reading mysql slow query log from test-slow.log
Count: 1 Time=0.00s (0s) Lock=0.00s (0s) Rows=0.0
(0), root[root]@localhost
For the noncritical logs, doing the following will suffice (assuming you start
in the directory containing the log files). The following is on a Unix system:
mv logfile backup_directory/logfile.old
mysqladmin flush-logs
And the following is on a Windows system:
move logfile backup_directory\logfile.old
mysqladmin flush-logs
Flushing the logs (which you can also do while connected to the server with
the SQL statement FLUSH LOGS) closes and reopens log files that do not
increment in sequence (such as the slow query log). Or, in the case of logs
that are incremented (the binary update log), flushing the logs creates a new
log file with an extension incremented by one from the previous one and
forces MySQL to use this new file.
The old log file can then either be moved to a backup location or just
deleted if it will be of no further use. Any queries that are processed
between the two statements are not logged, as the query log for that
moment in time does not exist. Logging is only re-created when the logs
are flushed. For example, assuming that the query log is called querylog,
the following set of commands shows one way to rotate logs. You need to
have two windows open: Window1 connected to your shell or command
line, and Window2 connected to MySQL.
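A minimal sketch, assuming the query log is called querylog (as above) and
/db_backups is the backup directory:
Window1: % mv querylog /db_backups/querylog.old
Window2: mysql> FLUSH LOGS;
Once the logs have been flushed, MySQL writes to a new querylog file, and
the old one can be compressed or archived at leisure.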
This technique cannot be used with the critical log files (such as the binary
update log) because if they are to be useful for replication or for restoration
of backups, there cannot be the possibility of any queries being missed. For
this reason, a new binary update log, with an extension that increments by
one each time, is created whenever the logs are flushed. Records will only
be added to the latest log, meaning you can move older ones without
worrying about queries going missing. Assuming the binary update log is
called gmbinlog and starting with one binary update log, try the
following:
GMBINLOG 001
GMBINL~1 IND
GMBINLOG 002
       3 file(s)   0 dir(s)

C:\Program Files\MySQL\data> move gmbinlog.001 D:\backup_directory\gmbinlog001.old

Warning: If you're using replication, do not remove old binary log files until
you are sure no slave servers will still need them. See Chapter 12 for more
details.
MySQL for Red Hat Linux comes with a log rotation script. If the
distribution you're using does not, you can use this one as the basis to create
your own:
/usr/local/var/mysqld.log {
        # create 600 mysql mysql
        notifempty
        daily
        rotate 3
        missingok
        compress
    postrotate
        # just if mysqld is really running
        if test -n "`ps acx|grep mysqld`"; then
            /usr/local/bin/mysqladmin flush-logs
        fi
    endscript
}
Optimizing Tables
Optimizing currently works only with MyISAM and partially with BDB
tables. With MyISAM tables, optimizing does the following:
Defragments tables where rows are split or have been deleted
Sorts the indexes if they have not been already
Updates the index statistics if they have not been already
With BDB tables, optimizing analyzes the key distribution (the same
as ANALYZE TABLE; see the "Analyzing Tables with ANALYZE TABLE"
section later in this chapter).
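For a single table, a sketch of the statement (using the sales table from the
sample database) looks like this:
mysql> OPTIMIZE TABLE sales;
The same could be done from the command line with mysqlcheck:
% mysqlcheck -o firstdb sales -uroot -pg00r002b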
You could also optimize the entire database by leaving out any table
references, with the following:
% mysqlcheck -o firstdb -uroot -pg00r002b
The -r option of myisamchk repairs the table but also eliminates wasted space:
% myisamchk -r sales
- recovering (with sort) MyISAM-table 'sales'
Data records: 8
- Fixing index 1
- Fixing index 2
If you do not specify the path to the table index file, and you're not in the
right directory, you'll get the following error:
% myisamchk -r customer
myisamchk: error: File 'customer' doesn't exist
Specifying the full path to the .MYI file corrects this:
% myisamchk -r
/usr/local/mysql/data/firstdb/customer
- recovering (with keycache) MyISAM-table
'/usr/local/mysql/data/firstdb/customer' Data
records: 0
Tables are locked during the optimization, so don't run this during peak
hours! Also, make sure you have a reasonable amount of free space on the
system.
Analyzing Tables
Analyzing tables improves performance by updating the index information
for a table so that MySQL can make a better decision on how to join tables.
The distribution of the various index elements is stored for later usage.
(Analyzing currently only works with MyISAM and BDB tables.)
The table will only be analyzed again if it has changed since the last time it
was analyzed:
mysql> ANALYZE TABLE sales;
+---------------+---------+----------+-----------------------------+
| Table         | Op      | Msg_type | Msg_text                    |
+---------------+---------+----------+-----------------------------+
| firstdb.sales | analyze | status   | Table is already up to date |
+---------------+---------+----------+-----------------------------+
1 row in set (0.00 sec)
If you tried to analyze a table that does not support analysis (such as an
InnoDB table), no harm would be done; the operation would just fail. For
example:
% mysqlcheck -a firstdb innotest -uroot -pg00r002b
You could also analyze all the tables in a database by leaving out any table
names:
% mysqlcheck -a firstdb -uroot -pg00r002b
Checking Tables
Errors can occur when the indexes are not synchronized with the data.
System crashes or power failures can all cause situations where the tables
have become corrupted. Corruption of the data is fairly rare; in most cases,
the corruption is of the index files. These can be hard to spot, though you
may notice information being returned slowly or data not being found that
should be there. Checking tables should be the first thing you do when you
suspect an error. Some of the symptoms of corrupted tables include errors
such as the following:
Got error ### from table handler. The perror utility gives more information
about the error number. Just run perror (which is stored in the same
directory as the other binaries, such as mysqladmin) followed by the error
number. For example:
For example:
% perror 126
126 = Index file is crashed / Wrong file format
Some of the other more common errors include:
126 = Index file is crashed / Wrong file format
127 = Record-file is crashed
132 = Old database file
134 = Record was already deleted (or record file
crashed)
135 = No more room in record file
136 = No more room in index file
141 = Duplicate unique key or constraint on write
or update
144 = Table is crashed and last repair failed
145 = Table was marked as crashed and should be
repaired
Once connected to the MySQL server, you can issue a CHECK TABLE
command, make use of the mysqlcheck utility (when the server is running),
or use the myisamchk utility when the server has been stopped. Checking
updates the index statistics and checks for errors.
If any errors are found, the table will need to be repaired (see the "Repairing
Tables" section later in this chapter). Serious errors mark the table as
corrupt, in which case it can no longer be used until it is repaired.
Tip
Always check tables after a power failure or a system crash. You can
usually fix any corruption that has occurred before users notice any
problems.
The CHECK TABLE statement can take one of the following options:
QUICK      Does not scan the rows to check for incorrect links.
MEDIUM     The default option. It scans rows to check that deleted links are correct. It also calculates a key checksum for the rows and verifies this with a calculated checksum for the keys.
EXTENDED   This is the slowest method, but it checks the table for complete consistency by doing a full key lookup for every index associated with each row.
The QUICK option is useful for checking tables where you don't suspect any
errors. If an error or warning is returned, you should try to repair the table.
You can check more than one table at a time by listing the tables one after
another.
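For example, assuming the sales and customer tables from the sample
database, the statement might look like this:
mysql> CHECK TABLE sales, customer;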
Option                     Description
-c, --check                Checks tables.
-C, --check-only-changed   Checks tables that have changed since the last check or were not closed properly.
-F, --fast                 Checks tables that haven't been closed properly.
-e, --extended             This is the slowest form of checking, but it will make sure the table is completely consistent. You can also use this option to repair, though it is usually not necessary.
-m, --medium-check         This is much faster than the extended check, and it finds the vast majority of errors.
-q, --quick                The fastest check; this does not check table rows when checking. When repairing, it only repairs the index tree.
You can check more than one table by listing a number of tables after the
database name:
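For example, to check the sales and customer tables in one run (credentials
as in this chapter's other examples), you might use something like:
% mysqlcheck -c firstdb sales customer -uroot -pg00r002b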
You can check all tables in the database by just specifying the name of the
database:
% mysqlcheck -c firstdb -uroot -pg00r002b
The default for myisamchk is the ordinary check option (-c). There is also
the fast check (-F), which only checks tables that haven't been closed
properly. This is not the same as the lowercase -f option, which is the force
option, meaning the check continues even if errors occur. There is also the
medium check (-m), slightly slower and more complete. The most extreme
option is the -e option (which performs an extended check), which is the
most thorough and slowest option. It's also usually a sign of desperation;
use this only when all other options have failed. Increasing the
key_buffer_size variable can speed up the extended check (if you
have enough memory). See Table 10.6 for the checking options.
Option                     Description
-c, --check                Ordinary check and the default option.
-e, --extend-check         Slowest and most thorough form of check.
-F, --fast                 Fast check, which only checks tables that haven't been closed properly.
-C, --check-only-changed   Checks only the tables that have been changed since the last check.
-f, --force                This runs the repair option if any errors are found in the table.
-i, --information          Displays statistics about the table that is checked.
-m, --medium-check         Medium check, faster than an extended check, and good enough for most cases.
-U, --update-state         Keeps information about when the table was checked and whether the table has crashed, which is useful for the -C option. Should not be used when the table is being used and the --skip-external-locking option is active.
% myisamchk largetable.MYI
Checking MyISAM file: largetable.MYI
Data records: 2960032 Deleted blocks: 0
myisamchk: warning: 1 clients is using or hasn't
closed the table properly
- check file-size
myisamchk: warning: Size of datafile is: 469968400
Should be: 469909252
- check key delete-chain
- check record delete-chain
- check index reference
- check data record references index: 1
- check data record references index: 2
- check data record references index: 3
myisamchk: error: Found 2959989 keys of 2960032
- check record links
myisamchk: error: Record-count is not ok; is
2960394 Should be: 2960032 myisamchk: warning:
Found 2960394 parts Should be: 2960032 parts
Repairing Tables
If you have checked the tables and errors have been found, you'll need to
repair them. There are various repair options available, depending on which
method you use, but you may not have success. If the disk has failed, or if
none of them work, the only option is to restore from your backup.
Repairing a table can take up significant resources, both disk and memory:
Some space for the new index file (on the same disk as the original). The
old index is deleted at the start, so this is usually not significant, but it will
be if the disk is close to full.
If the error is caused by the table running out of space and the table type is
InnoDB, you will have to enlarge the InnoDB tablespace. MyISAM tables
have a huge theoretical size limit (eight million terabytes), but by default
pointers are only allocated for 4GB. If the table reaches this limit, you can
extend it by using the MAX_ROWS and AVG_ROW_LENGTH ALTER
TABLE parameters. To prepare the table called limited for great things
(currently it only has three records), you use something like the following:
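(The MAX_ROWS and AVG_ROW_LENGTH values in this sketch are illustrative
rather than prescriptive.)
mysql> ALTER TABLE limited MAX_ROWS=1000000000 AVG_ROW_LENGTH=100;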
EXTENDED   Attempts to recover every possible row from the data file. This option should not be used except as a last resort because it may produce garbage rows.
USE_FRM    This is the option to use if the .MYI file is missing or has a corrupted header. It will rebuild the indexes from the definitions found in the .frm table definition file.
The current error message (4.0.3) indicates that the .MYD file cannot be
found, when it's actually the .MYI file that's missing. The error message is
likely to have been clarified by the time you read this. To repair the table in
this instance, you need to use the USE_FRM option, which, as the name
suggests, uses the .frm definition file to re-create the .MYI index file:
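A sketch, continuing with the customer table as an example:
mysql> REPAIR TABLE customer USE_FRM;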
If for some reason all tables in a database are corrupt, you can repair them
all by just supplying the database name:
% mysqlcheck -r firstdb -uroot -pg00r002b
The repair options are as follows:
Option                       Description
-D #, --data-file-length=#   Sets the maximum length of the data file when re-creating it.
-e, --extend-check           Attempts to recover every possible row from the data file. This option should not be used except as a last resort because it may produce garbage rows.
-f, --force                  Overwrites old temporary files instead of aborting.
-k #, --keys-used=#          Specifies the keys to use, which can make the process faster. Each binary bit stands for one key, starting at 0 for the first key (for example, 1 is the first index, 10 is the second index).
-r, --recover                Repairs most corruption and should be the first option attempted. You can increase the sort_buffer_size to make the recovery go more quickly if you have the memory. This option will not recover from the rare form of corruption where a unique key is not unique.
-o, --safe-recover           A more thorough, yet slower, repair option than -r that should be used only if -r fails. This reads through all rows and rebuilds the indexes based on the rows. It also uses less disk space than -r because a sort buffer is not created. You can increase the size of key_buffer_size to improve repair speed.
-n, --sort-recover           Forces MySQL to use sorting to resolve the indexes, even if the resulting temporary files are large.
--character-sets-dir=...     The directory containing the character sets.
--set-character-set=name     Specifies a new character set for the index.
-t, --tmpdir=path            Specifies a new path for storing temporary files if you don't want to use whatever the TMPDIR environment variable specifies.
-q, --quick                  Fastest repair because the data file is not modified. Specifying the -q twice (-q -q) will modify the data file if there are duplicate keys. Uses much less disk space as well because the data file is not modified.
-u, --unpack                 Unpacks a file that has been packed with the myisampack utility.
The server should either not be running, or you must be sure there is no
interaction with the tables with which you're working, such as when you
start MySQL with the --skip-external-locking option. If the
--skip-external-locking option is not on, you can only safely use
myisamchk to repair tables if you are sure there will be no simultaneous
access. Whether --skip-external-locking is used or not, you'll
need to flush the tables before starting the repair (with mysqladmin
flush-tables) and ensure that there is no access. You may get wrong
results (with tables being marked as corrupted even when they are not) if
mysqld or anything else accesses the table while myisamchk is running.
You must run myisamchk from the directory containing the .MYI files or
supply the path. The following examples show a repair in action, with
MySQL deciding whether to use sorting or a keycache:
% myisamchk -r customer
- recovering (with keycache) MyISAM-table
'customer.MYI' Data records: 0
% myisamchk -r sales
- recovering (with sort) MyISAM-table 'sales.MYI'
Data records: 9
- Fixing index 1
- Fixing index 2
Using mysqlcheck
The mysqlcheck utility is a boon to more recent users of MySQL:
previously, much of the checking and repairing functionality could only be
used when the server was shut down, but with mysqlcheck that limitation is
a thing of the past. Its options are as follows:
Option                        Description
-1, --all-in-1                Combines queries for tables into one query per database (instead of one per table). Tables are in a comma-separated list.
-a, --analyze                 Analyzes the tables.
--auto-repair                 If a checked table is corrupted, it is automatically repaired once the checking is complete.
-#, --debug=...               Outputs a debug log.
--character-sets-dir=...      The directory containing the character sets.
-c, --check                   Checks the tables (the default behavior).
-C, --check-only-changed      Checks only tables that have changed since the last check or were not closed properly.
--compress                    Compresses the data transferred between the client and server if both support compression.
-?, --help                    Displays a help message and exits.
-B, --databases               Checks several databases. All name arguments are treated as database names.
-e, --extended                This is the slowest form of checking but will make sure the table is completely consistent. You can also use this option to repair, though it is usually not necessary.
-h, --host=...                Hostname to which to connect.
-m, --medium-check            Much faster than the extended check and finds the vast majority of errors.
-o, --optimize                Optimizes the tables.
-p, --password[=...]          Password to use when connecting to the server.
-P, --port=...                TCP/IP port to use when connecting to the server.
-q, --quick                   The fastest check; does not check table rows. When repairing, only the index tree is repaired.
-r, --repair                  Repairs most corruption.
-s, --silent                  Displays no output except for error messages.
-S, --socket=...              Specifies the socket file to use when connecting.
--tables                      List of tables to check. With the -B option, this will take precedence.
-u, --user=#                  Specifies the MySQL username to use when connecting to the server.
-v, --verbose                 Displays more information.
-V, --version                 Displays version information and exits.
The mysqlcheck utility also has a feature that allows it to be run in different
ways without specifying the options. By simply creating a copy of
mysqlcheck with one of the following names, it will take on that default
behavior: a copy named mysqlrepair defaults to repairing (-r),
mysqlanalyze to analyzing (-a), and mysqloptimize to optimizing (-o).
Using myisamchk
The myisamchk utility is the older utility, available since the early days of
MySQL. It is also used to analyze, check, and repair tables, but care needs
to be taken if you want to use it when the server is running. Table 10.10
describes the general myisamchk options, Table 10.11 describes the check
options, Table 10.12 describes the repair options, and Table 10.13 describes
other options.
The server should either not be running, or you must be sure there is no
interaction with the tables with which you're working, such as when you
start MySQL with the --skip-external-locking option. If the
--skip-external-locking option is not on, you can only safely use
myisamchk to repair tables if you are sure there will be no simultaneous
access. Whether or not --skip-external-locking is used, you'll
need to flush the tables before starting the repair (with mysqladmin
flush-tables) and ensure that there is no access.
I suggest you rather use one of the other options if the server is running.
The syntax is as follows:
myisamchk [options] tablename[s]
You must run myisamchk from the directory where the .MYI index files are
located unless you specify the path to them; otherwise, you'll get the
following error:
% myisamchk -r sales.MYI
myisamchk: error: File 'sales.MYI' doesn't exist
Specifying the path solves the problem:
% myisamchk -r
/usr/local/mysql/data/firstdb/sales.MYI
- recovering (with sort) MyISAM-table
'/usr/local/mysql/data/firstdb/sales.MYI' Data
records: 9
- Fixing index 1
- Fixing index 2
% myisamchk -r sales
- recovering (with sort) MyISAM-table 'sales'
Data records: 9
- Fixing index 1
- Fixing index 2
% myisamchk -r sales.MYI
- recovering (with sort) MyISAM-table 'sales.MYI'
Data records: 9
- Fixing index 1
- Fixing index 2
You can use wildcard characters to search all tables in a database directory
(*.MYI) or even all tables in all databases:
% myisamchk -r /usr/local/mysql/data/*/*.MYI
The general myisamchk options are as follows:
Option                                     Description
-#, --debug=debug_options                  Outputs a debug log. A common debug_option string is d:t:o,filename.
-?, --help                                 Displays a help message and exits.
-O var=option, --set-variable var=option   Sets the value of a variable. The possible variables and their default values for myisamchk can be examined with myisamchk --help.
-s, --silent                               Silent mode; writes output only when errors occur.
-v, --verbose                              Prints more information.
-V, --version                              Displays the myisamchk version and exits.
-w, --wait                                 If the table is locked, -w will wait for the table to be unlocked rather than exiting with an error. If mysqld was running with the --skip-external-locking option, the table can only be locked by another myisamchk command.
% myisamchk --help
..
Possible variables for option --set-variable (-O) are:
key_buffer_size           current value: 520192
myisam_block_size         current value: 1024
read_buffer_size          current value: 262136
write_buffer_size         current value: 262136
sort_buffer_size          current value: 2097144
sort_key_blocks           current value: 16
decode_bits               current value: 9
ft_min_word_len           current value: 4
ft_max_word_len           current value: 254
ft_max_word_len_for_sort  current value: 20
Note: Inside the my.cnf (or my.ini) file, there are separate sections for
mysqld and myisamchk, so variables such as these can also be set in the
[myisamchk] section of the configuration file.
The check options are as follows:
Option                      Description
-c, --check                 Ordinary check and the default option.
-e, --extend-check          Slowest and most thorough form of check. If you are using --extend-check and don't have much memory, you should increase the value of key_buffer_size a lot!
-F, --fast                  Fast check that only checks tables that haven't been closed properly.
-C, --check-only-changed    Checks only the tables that have been changed since the last check.
-f, --force                 This runs the repair option if any errors are found in the table.
-i, --information           Displays statistics about the table that is checked.
-m, --medium-check          Medium check, faster than an extended check and good enough for most cases.
-U, --update-state          Keeps information about when the table was checked and whether the table has crashed, which is useful for the -C option. Should not be used when the table is being used and the --skip-external-locking option is active.
-T, --read-only             Does not mark the table as checked (useful for running myisamchk when the server is active and the --skip-external-locking option is in use).
The repair options are as follows:
Option                       Description
-D #, --data-file-length=#   Sets the maximum length of the data file when re-creating it.
-e, --extend-check           Attempts to recover every possible row from the data file. This option should not be used except as a last resort because it may produce garbage rows.
-f, --force                  Overwrites old temporary files instead of aborting.
-k #, --keys-used=#          Specifies the keys to use, which can make the process faster. Each binary bit stands for one key, starting at 0 for the first key (for example, 1 is the first index, 10 is the second index).
-r, --recover                Repairs most corruption and should be the first option attempted. You can increase the sort_buffer_size to make the recovery go more quickly if you have the memory. This option will not recover from the rare form of corruption where a unique key is not unique.
-o, --safe-recover           A more thorough, slower repair option than -r, which should be used only if -r fails. This reads through all rows and rebuilds the indexes based on the rows. It also uses less disk space than -r because a sort buffer is not created. You can increase the size of key_buffer_size to improve repair speed.
-n, --sort-recover           Forces MySQL to use sorting to resolve the indexes, even if the resulting temporary files are large.
--character-sets-dir=...     The directory containing the character sets.
--set-character-set=name     Specifies a new character set for the index.
-t, --tmpdir=path            Specifies a new path for storing temporary files if you don't want to use the contents of the TMPDIR environment variable.
-q, --quick                  Fastest repair, as the data file is not modified. Running this option a second time (-q -q) will modify the data file if there are duplicate keys. Uses much less disk space as well because the data file is not modified.
-u, --unpack                 Unpacks a file that has been packed with the myisampack utility.
Table 10.13: Other myisamchk Options
-a, --analyze                    Analyzes tables. This improves performance by updating the index information for a table so that MySQL can make a better decision on how to join tables.
-R index_num, --sort-records=#   Sorts the records according to the given index. This can speed up queries that are ordered on this index, as well as ranged selects. It will probably be very slow if you sort a large table for the first time.
-d, --description                Displays a description of the table, for example:
% myisamchk -d customer
MyISAM file: customer
Record format: Packed
Character set: latin1 (8)
Data records: 3 Deleted blocks: 0 Recordlength: 75
table description:
Key Start Len Index Type
1 2 4 unique long
Summary
MySQL has a host of tools to make administering the database server as
painless a task as possible. But the more critical your data, and the larger
your tables, the more important it is that you can competently and quickly
handle problems when they occur. You can stop and start the MySQL server
in a number of ways, such as using mysqld directly, but it is highly
recommended that you use a wrapper script, such as the mysqld_safe script
supplied with distributions.
Windows and Unix have quite different methods of automating startups, but
both are fairly easy to implement once you know what you are doing.
The next chapters will investigate these topics in more detail: database
security, replication, and configuration.
Database Backups
This chapter shows various ways you can back up and restore with MySQL.
Once you know the complexities and the possibilities involved in backing
up, you'll be in a better position to implement the best strategy for your
situation.
When you use the BACKUP TABLE statement, the backup path needs to be
the full path to the directory you want to save to, and should not be a
filename. This makes a copy of the .frm (definition) and .MYD (data) files,
but not the .MYI (index) file. You can rebuild the index once the database
has been restored.
When dealing with files, you'll need to watch out for file permissions.
MySQL does not give the most friendly error message to warn you if, when
backing up, you do not have the correct permissions to all the files and
directories.
% mysql firstdb
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 13 to server version: 4.0.1-alpha-max-log
mysql> BACKUP TABLE sales TO '/db_backups';
+---------------+--------+----------+--------------------------------------+
| Table         | Op     | Msg_type | Msg_text                             |
+---------------+--------+----------+--------------------------------------+
| firstdb.sales | backup | error    | Failed copying .frm file: errno = 13 |
| firstdb.sales | backup | status   | Operation failed                     |
+---------------+--------+----------+--------------------------------------+
The problem in this example is that MySQL does not have permission to
write files to the /db_backups directory. You need to exit MySQL, and
from the command line make the mysql user the owner of the directory:
mysql> exit
Bye
% chown mysql db_backups/
Warning: You need to have the correct permissions to do this. Ensure that
the user you're working as has the correct permissions. In this example it's
root, so there is no problem, but you may not be working as root. If you
have problems, you may need help from your systems administrator.
% mysql firstdb
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 15 to server version: 4.0.1-alpha-max-log
mysql> BACKUP TABLE sales TO '/db_backups';
This time the backup has been successful, and you can view the newly
created files by exiting to the command line once again:
% ls -l db_backups/
total 10
-rw-rw----   1 mysql   mysql   ...   sales.MYD
-rw-rw----   1 mysql   mysql   ...   sales.frm
Tip: Get help from your systems administrator if you do not have the
permissions to create the files or if you're not sure. Also, remember that
BACKUP currently only works for MyISAM tables (check your latest
documentation, though, as this may no longer be the case by the time you
read this). BACKUP places a read lock on the table before backing it up to
ensure that the backed-up table is consistent.
You can also back up more than one table at a time, by listing more than
one table name:
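For example, a sketch using two of the sample tables:
mysql> BACKUP TABLE sales, sales_rep TO '/db_backups';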
The lock is placed on one table at a time, first sales, then after sales is
backed up, on sales_rep, and so on. This allows for consistent individual
tables, but if you want to achieve a consistent snapshot of all the tables at
the same time, you'll have to place your own locks on the tables:
Note that you cannot lock tables individually:
mysql> LOCK TABLE sales READ;
Query OK, 0 rows affected (0.00 sec)
mysql> LOCK TABLE sales_rep READ;
Query OK, 0 rows affected (0.00 sec)
mysql> LOCK TABLE customer READ;
Query OK, 0 rows affected (0.00 sec)
LOCK TABLE automatically releases all locks held by the same thread, so
the only lock still held by the time of the backup was on the customer table.
To be able to lock a table, you need theLOCK TABLES privilege and
theSELECT Note
privilege for the table you're trying to lock.
mysql> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)
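To hold read locks on all three tables at the same time, you would instead
issue a single statement, something like this:
mysql> LOCK TABLES sales READ, sales_rep READ, customer READ;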
C:\MySQL\bin> cd \
C:\>mkdir db_backups
C:\>mysql firstdb
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 3 to server
version: 4.0.1-alpha-max
If you run the backup with an unescaped Windows pathname, such as
BACKUP TABLE sales TO 'c:\db_backups', you'll get an error.
Unfortunately, the error message is not too clear. The problem is that the
backslash (\) is the MySQL escape character, used to escape other special
characters such as single or double quotes. To use the backslash for a path
in Windows, you need to escape it with another escape character:
mysql> BACKUP TABLE sales TO 'c:\\db_backups';
+---------------+--------+----------+----------+
| Table         | Op     | Msg_type | Msg_text |
+---------------+--------+----------+----------+
| firstdb.sales | backup | status   | OK       |
+---------------+--------+----------+----------+
1 row in set (0.55 sec)
Don't panic. Can you see the error in the previous code? The path is not
correct. One of your worst enemies when something does go wrong will be
panic. After seeing the previous result in a crisis situation, you could easily
run screaming from the building cursing MySQL's dodgy software. But
there's usually a simple reason that something does not work, such as the
previous typo. The correct path will correctly restore the table, as this Unix
example shows:
And, just to placate the most paranoid, let's see if thesales table did in fact
restore correctly:
% cd /usr/local/mysql/data/firstdb
% cp sales.* /db_backups
% cp sales_rep.* /db_backups
% cp customer.* /db_backups
%
And the equivalent on Windows:
C:\MySQL\data\firstdb> copy sales.* c:\db_backup
sales.frm
sales.MYI
sales.MYD
3 file(s) copied
C:\MySQL\data\firstdb>copy sales_rep.*
c:\db_backup
sales_rep.frm
sales_rep.MYI
sales_rep.MYD
3 file(s) copied
Once you've copied the files, you can release the locks from Window1, as
follows: mysql> UNLOCK TABLES;
Query OK, 0 rows affected (0.00 sec)
Warning: For the duration of the backup, while the locks are in place, you
will not be able to add new records to the tables and will also experience a
performance penalty for reads. If at all possible, do not perform backups
during peak hours!
There's also a possibility that the Unix permissions could come back to
haunt you. If you did not back the files up as the mysql user (you would
usually have done this as the root user), you're likely to see the following:
mysql> SELECT * FROM sales;
ERROR 1017: Can't find file: './firstdb/sales.frm'
(errno: 13)
The problem is that you've copied the file back, but the mysql user cannot
access this file. The following snippet, from Window1, shows that I've
backed up the files as the root user.
[root@test firstdb]# ls -l
total 183
...
-rw-r-----   1 root    root     153 May 27 22:27 sales.MYD
-rw-r-----   1 root    root    3072 May 27 22:27 sales.MYI
-rw-r-----   1 root    root    8634 May 27 22:27 sales.frm
-rw-rw----   1 mysql   mysql    156 May 22 21:50 sales_rep.MYD
-rw-rw----   1 mysql   mysql   3072 May 22 21:50 sales_rep.MYI
-rw-rw----   1 mysql   mysql   8748 May 22 21:50 sales_rep.frm
...
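One way out of this (a sketch, assuming you're still in the firstdb data
directory) is to hand the restored files back to the mysql user:
% chown mysql sales.*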
Another approach is the mysqldump utility, which creates a text file
containing the SQL statements needed to re-create the tables. To back up
the customer table, run the following from the Unix shell:
% mysqldump firstdb customer > /db_backups/customer_2002_11_12.sql
Or run the following from the command line on Windows:
C:\MySQL\bin> mysqldump firstdb customer >
c:\db_backups\customer_2002_11_12.sql
C:\MySQL\bin>
Remember to specify the path, username, and password if you need to. This
creates a file in thedb_backups directory containing the SQL statements
needed to re-create the customer table. You can view this file in any text
editor, such as Notepad or vi. The first part of the file contains the
following:
The hashes (#) are just comments, containing innocuous information about
versions and so on. Later in the file are the important SQL statements
needed to re-create the various tables. This snippet is what can re-create the
customer table:
#
# Table structure for table 'customer'
#
#
# Dumping data for table 'customer' #
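Between those comment headers come the actual statements; a sketch of
roughly what they look like (the column types here are assumptions, not the
book's exact definition):
CREATE TABLE customer (
  id int(11) NOT NULL default '0',
  first_name varchar(30) default NULL,
  surname varchar(40) default NULL,
  initial varchar(5) default NULL,
  PRIMARY KEY (id)
) TYPE=MyISAM;
INSERT INTO customer VALUES (1,'Yvonne','Clegg','X');
INSERT INTO customer VALUES (2,'Johnny','Chaka-Chaka','B');
...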
Warning: Once again, don't do this to test the backup of a live database.
This is just a way to simulate the loss of your database.
To restore the table on a Unix machine, run:
% mysql firstdb <
/db_backups/customer_2002_11_12.sql
Or to restore from Windows:
C:\MySQL\bin> mysql firstdb <
c:\db_backups\customer_2002_11_12.sql
The table has now been restored.
Table 11.1 describes the various options available to mysqldump.
Table 11.1: mysqldump Options
Option                                    Description
--add-locks                               Adds LOCK TABLES before and UNLOCK TABLES after each table dump.
--add-drop-table                          Adds a DROP TABLE statement before each CREATE TABLE statement.
-A, --all-databases                       Dumps all databases.
-a, --all                                 Includes all MySQL-specific create options.
--allow-keywords                          Usually column names cannot be the same as a keyword. This allows the creation of column names that are keywords by beginning each column name with the table name.
-c, --complete-insert                     Uses complete INSERT statements that include the column names.
-C, --compress                            Compresses the data transferred between the client and server if both support compression.
-B, --databases                           Dumps several databases. All name arguments are treated as database names.
--delayed                                 Inserts the rows with INSERT DELAYED statements.
-e, --extended-insert                     Uses the multiline INSERT syntax. The output is more compact and also runs faster because the index buffer is flushed only after each INSERT statement.
-#, --debug[=option_string]               Traces program usage for debugging purposes.
--help                                    Displays a help message and exits.
--fields-terminated-by=...                Same as the LOAD DATA INFILE options. See the sections on SELECT INTO and LOAD DATA INFILE.
--fields-enclosed-by=...                  Same as the LOAD DATA INFILE options.
--fields-optionally-enclosed-by=...       Same as the LOAD DATA INFILE options.
--fields-escaped-by=...                   Same as the LOAD DATA INFILE options.
--lines-terminated-by=...                 Same as the LOAD DATA INFILE options.
-F, --flush-logs                          Flushes the log file before starting the dump.
-f, --force                               Continues even if there are MySQL errors during the dump.
-h, --host=...                            Dumps data from the MySQL server found on the named host. The default host is localhost.
-l, --lock-tables                         Locks all tables before starting the dump. The tables are locked with READ LOCAL, which allows concurrent inserts for MyISAM tables.
-K, --disable-keys                        Indexes are disabled before the INSERT statements, and enabled afterward, which makes the insertions much quicker.
-n, --no-create-db                        Does not include CREATE DATABASE statements in the output.
-t, --no-create-info                      Does not include the CREATE TABLE statement, so the tables are assumed to exist already.
-d, --no-data                             Only dumps the table structure and does not include any INSERT statements for the table.
--opt                                     A shorthand that turns on several other options (such as --quick, --add-drop-table, --add-locks, --extended-insert, and --lock-tables) to give the fastest possible dump for restoring.
-ppassphrase, --password[=passphrase]     Password to use when connecting to the server.
-S, --socket=...                          Specifies the socket file to use when connecting to localhost (the default).
--tables                                  Overrides the -B or --databases option.
-u, --user=...                            Specifies the MySQL username to use when connecting to the server. The default value is your Unix login name.
-O var=option, --set-variable var=option  Sets the value of a variable.
-v, --verbose                             Makes MySQL more talkative by forcing it to display more information on the mysqldump process.
-V, --version                             Displays version information and exits.
-w, --where='where-condition'             Dumps only the records that satisfy the where condition. The condition must appear in quotes.
-X, --xml                                 Dumps the database as well-formed XML.
-x, --first-slave                         Locks all tables across all databases.
-O net_buffer_length=n                    When multirow INSERT statements are used (as with -e or --opt), limits the length of each statement to this buffer size.
The following example uses --where to limit the backup to only those
records where id>5:
% mysqldump --where='id>5' firstdb customer >
/db_backups/customer_2002_11-12.sql
This leaves the output looking like this:
#
# Dumping data for table 'customer'
# WHERE: id>5
#
The error message here is quite clear: MySQL cannot write to this directory
because you've forgotten to escape the \ characters. The \ in Windows is the
escape character, and because it's also part of the Windows path, it needs to
be escaped when used in that context. On Unix, a similar error is common:
in this case, though, MySQL does not even warn you of an error. Somebody
from a Windows environment could easily put the slashes the wrong way
and not get their backup. Always check that your backup has been created.
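The file shown below could have been produced by a SELECT INTO OUTFILE
statement along these lines (the filename is illustrative):
mysql> SELECT * INTO OUTFILE '/db_backups/customer.dat' FROM customer;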
Looking at the file in any text editor (such as vi or Notepad), you'll see the
following:
1 Yvonne Clegg X
2 Johnny Chaka-Chaka B
3 Winston Powers M
4 Patricia Mankunku C
5 Francois Papo P
7 Winnie Dlamini \N
6 Neil Beneke \N
10 Breyton Tshabalala B
Tabs separate the fields, and newlines separate the records, which is the
same default used by LOAD DATA INTO. You can also change these
defaults by adding options at the end of the statement. The full set of
options for SELECT INTO (and LOAD DATA INTO) is as follows:
[FIELDS
[TERMINATED BY '\t']
[[OPTIONALLY] ENCLOSED BY ''] [ESCAPED BY '\\' ]
]
[LINES TERMINATED BY '\n']
Here are some examples of using non-default options when using SELECT
INTO, and the resulting text files:
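For instance, the first file below could be produced with something like the
following (the output filename is an assumption):
mysql> SELECT * INTO OUTFILE '/db_backups/customer2.dat' FIELDS TERMINATED BY 'zz' FROM customer;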
1zzYvonnezzCleggzzX
2zzJohnnyzzChaka-ChakazzB
3zzWinstonzzPowerszzM
4zzPatriciazzMankunkuzzC
5zzFrancoiszzPapozzP
7zzWinniezzDlaminizz\N
6zzNeilzzBenekezz\N
10zzBreytonzzTshabalalazzB
The default tab character is replaced by the characters zz between each field.
Warning: The characters zz are used here just to make a point. It is
dangerous to use ordinary characters like this as terminators. If the text
contained the phrase zzz, the fields would all be out of alignment, as
MySQL would think the first two characters were a terminator. Use
conventional characters such as tabs.
1|Yvonne|Clegg|X[end]2|Johnny|Chaka-Chaka|B[end]
3|Winston|Powers|M[end]4|Patricia|Mankunku|C[end]
5|Francois|Papo|P[end]7|Winnie|Dlamini|\N[end]
6|Neil|Beneke|\N[end]10|Breyton|Tshabalala|B[end]
"1"|"Yvonne"|"Clegg"|"X"
"2"|"Johnny"|"Chaka-Chaka"|"B"
"3"|"Winston"|"Powers"|"M"
"4"|"Patricia"|"Mankunku"|"C"
"5"|"Francois"|"Papo"|"P"
"7"|"Winnie"|"Dlamini"|\N
"6"|"Neil"|"Beneke"|\N
"10"|"Breyton"|"Tshabalala"|"B"
1|"Yvonne"|"Clegg"|"X"
2|"Johnny"|"Chaka-Chaka"|"B"
3|"Winston"|"Powers"|"M"
4|"Patricia"|"Mankunku"|"C"
5|"Francois"|"Papo"|"P"
7|"Winnie"|"Dlamini"|\N
6|"Neil"|"Beneke"|\N
10|"Breyton"|"Tshabalala"|"B"
You can also back up a subset of the data, using a condition in your SELECT
statement:
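A statement along these lines would produce the output shown below (the
exact condition, terminator, and filename are illustrative):
mysql> SELECT * INTO OUTFILE '/db_backups/customer3.dat' FIELDS TERMINATED BY '|' FROM customer WHERE id<10;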
1|Yvonne|Clegg|X
2|Johnny|Chaka-Chaka|B
3|Winston|Powers|M
4|Patricia|Mankunku|C
5|Francois|Papo|P
7|Winnie|Dlamini|\N
6|Neil|Beneke|\N
The syntax for LOAD DATA is as follows:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'filename'
[REPLACE | IGNORE]
INTO TABLE tablename
[FIELDS
[TERMINATED BY '\t']
[[OPTIONALLY] ENCLOSED BY '']
[ESCAPED BY '\\' ]
]
[LINES TERMINATED BY '\n']
[IGNORE number LINES]
[(col_name,...)]
Let's remove the data from the customer table and restore it using LOAD DATA:
mysql> TRUNCATE customer;
Query OK, 0 rows affected (0.02 sec)
To restore the table on Unix, use:
mysql> LOAD DATA INFILE '/db_backups/customer.dat'
INTO TABLE customer; Query OK, 8 rows affected
(0.01 sec)
Records: 8 Deleted: 0 Skipped: 0 Warnings: 0
If you're trying unsuccessfully to use LOAD DATA, perhaps you don't have
permission to read a file on the server. The user doing the LOAD DATA
needs to have the FILE privilege (see Chapter 14, "Database Security"),
and the file needs to either exist in the database directory or be readable by
all.
A common mistake is not to match the terminators and enclosure
characters. They must be exactly the same as they are in the data file (or
were specified in the SELECT INTO statement). If not, the result is that
everything seems to work, but the table remains blank or full of NULL
values. See the next section for more information.
If you're using the LOCAL keyword, it will not work if MySQL has been
started with the
--local-infile=0 option (see Chapter 13 , "Configuring and
Optimizing MySQL").
The pathname or filename was not specified correctly (remember to use the
escape character for Windows pathnames).
The problem was that you did not match the terminators correctly.
Remember that customer2.dat was created with the FIELDS
TERMINATED BY 'zz' option. So, you need to restore it in the same way:
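A sketch of such a restore (the path is illustrative):
mysql> LOAD DATA INFILE '/db_backups/customer2.dat' INTO TABLE customer FIELDS TERMINATED BY 'zz';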
+----+------------+-------------+---------+
| id | first_name | surname     | initial |
+----+------------+-------------+---------+
|  1 | Yvonne     | Clegg       | X       |
|  2 | Johnny     | Chaka-Chaka | B       |
|  3 | Winston    | Powers      | M       |
|  4 | Patricia   | Mankunku    | C       |
|  5 | Francois   | Papo        | P       |
|  7 | Winnie     | Dlamini     | NULL    |
|  6 | Neil       | Beneke      | NULL    |
| 10 | Breyton    | Tshabalala  | B       |
+----+------------+-------------+---------+
8 rows in set (0.00 sec)
What if you now realize you made a mistake and want to restore the entire
table? If you immediatelyLOAD DATA from a file containing a full backup,
you'd come across the following problem:
+----+------------+-------------+---------+
| id | first_name | surname     | initial |
+----+------------+-------------+---------+
|  1 | Yvonne     | Clegg       | X       |
|  2 | Johnny     | Chaka-Chaka | B       |
|  3 | Winston    | Powers      | M       |
|  4 | Patricia   | Mankunku    | C       |
|  5 | Francois   | Papo        | P       |
|  7 | Winnie     | Dlamini     | NULL    |
|  6 | Neil       | Beneke      | NULL    |
+----+------------+-------------+---------+
7 rows in set (0.00 sec)
You've got a duplicate key error, and the file stopped processing at that
point. You could of course simply clear the table beforehand, as you've been
doing before all the restores to date, but if you were trying to restore a table
that already contains records, you may not want to remove everything and
start again. The two keywords that become useful here are REPLACE
and IGNORE. The latter ignores any rows that duplicate an existing row on
a unique index or primary key. So, as in this situation, IGNORE is useful
where you know the records have not changed and you don't want to drop
and restore all the records again:
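A sketch of the statement (path as in the earlier examples):
mysql> LOAD DATA INFILE '/db_backups/customer.dat' IGNORE INTO TABLE customer;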
You can see that of the eight rows, seven were skipped and only the missing
record was inserted. On a large file this would save a lot of time and prevent
the inconvenience of temporarily having no data available. All records are
now present again:
The REPLACE keyword comes in useful where the values of the records
have changed, and you want to restore the records existing on disk. To
demonstrate, you'll make the far-too-common mistake of updating all
records when you only meant to update one, accidentally changing all
surnames to Fortune:
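A sketch of the blunder and the rescue (both statements are illustrative):
mysql> UPDATE customer SET surname='Fortune';
mysql> LOAD DATA INFILE '/db_backups/customer.dat' REPLACE INTO TABLE customer;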
LOAD DATA LOCAL is an option that allows the contents of a file that
exists on the MySQL client machine to be uploaded to the database server.
LOW_PRIORITY causes the addition of the new records to wait until there
are no other clients reading from the table (as it does with an ordinary
INSERT).
The CONCURRENT keyword is a useful one if you still want the table to
be read. It allows other threads to read from a MyISAM table (but slows the
LOAD DATA process down).
Many of the options are the same as the LOAD DATA equivalents. The
table to import the data into is determined from the filename: mysqlimport
drops the extension of the filename, so customer.dat is imported into
the customer table.
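For example, something like the following would load customer.dat into the
customer table (path and credentials as used elsewhere in this chapter):
% mysqlimport -uroot -pg00r002b firstdb /db_backups/customer.dat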
The mysqlimport options are as follows:
Option                                         Description
-#, --debug[=option_string]                    Traces program usage for debugging purposes.
-d, --delete                                   Empties the table before importing the text file.
--fields-terminated-by=...                     Same as the LOAD DATA INFILE option.
--fields-enclosed-by=...                       Same as the LOAD DATA INFILE option.
--fields-optionally-enclosed-by=...            Same as the LOAD DATA INFILE option.
--fields-escaped-by=...                        Same as the LOAD DATA INFILE option.
--lines-terminated-by=...                      Same as the LOAD DATA INFILE option.
-f, --force                                    Continues even if there are MySQL errors during the import.
--help                                         Displays a help message and exits.
-h host_name, --host=host_name                 Imports data to the MySQL server found on the named host. The default host is localhost.
-i, --ignore                                   Added records that would cause a duplicate key error are ignored. Usually they result in an error and cause the process to stop at that point.
-l, --lock-tables                              Locks all tables for writing before processing any text files, which keeps the tables synchronized on the server.
-L, --local                                    Reads input files from the client machine. If you connect to localhost (the default), text files are assumed to be on the server.
-ppassphrase, --password[=passphrase]          Password to use when connecting to the server.
-P portnumber, --port=portnumber               Specifies the TCP/IP port number to use when connecting to the host. This does not apply in the case of connections to localhost. See the -S option.
-r, --replace                                  Added records that duplicate an existing record on a unique key replace the existing record.
-s, --silent                                   Displays output only when errors occur.
-S /path/to/socket, --socket=/path/to/socket   Specifies the socket file to use when connecting to localhost.
-u username, --user=username                   Specifies the MySQL username to use when connecting to the server.
-v, --verbose                                  Displays more information.
-V, --version                                  Displays the version and exits.
The mysqlhotcopy utility, a Perl script, is another way to back up databases,
copying the table files directly while the tables are locked. It takes the
following options:
Option           Description
--allowold       If the backup files already exist, mysqlhotcopy usually aborts. This option appends _old to the filenames and continues operating.
--keepold        The renamed files from --allowold are usually deleted after the operation. This option leaves them there.
--noindices      This option does not include the index files in the backup, which speeds the process. After restoring the files, the indexes can be rebuilt with myisamchk -rq.
--method=#       Allows you to specify whether to use cp or scp to copy the files.
-q, --quiet      Only errors are outputted.
--debug          Enables debugging.
-n, --dryrun     Outputs messages, but does not do the actions.
--regexp=#       Copies all databases with names that match the regular expression.
--suffix=#       Gives a suffix for the names of copied databases.
--checkpoint=#   Inserts checkpoint entries into the specified database table.
--flushlog       Flushes the logs once all the tables are locked.
--tmpdir=#       Allows you to specify a temporary directory.
mysqlhotcopy gets its options from the client and mysqlhotcopy groups in
the option files.
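A typical invocation might look something like this (database and target
directory as used earlier in the chapter):
% mysqlhotcopy firstdb /db_backups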
To restore a backup made with mysqlhotcopy, replace the files in the data
directory, as if you'd made the direct copies yourself.
There are a number of prerequisites to meet in order to run mysqlhotcopy:
You need to be able to run Perl scripts on your database server.
You need write access to the directory to which you're trying to back up.
You need select privileges on the databases you are backing up. In order to
flush the table, you need reload privileges.
After you've made a backup with mysqldump, restart MySQL with the --
log-bin option. When the time comes to restore, restore the mysqldump
file, and then use the binary log files to return the database to its most
recent status.
For example, let's assume that the last backup was from customer.dat ,
which restores it to the
10 records shown here:
Once you're at this stage (having just made the backup), start the server
with binary logging enabled if you haven't already:
C:\MySQL\bin> mysqladmin shutdown
020601 23:59:01 mysqld ended
If it's not there already, place the following option inside yourmy.cnf
ormy.ini file to enable binary logging:
log-bin
Now restart the server:
C:\MySQL\bin> mysqld-max
020602 18:58:21 InnoDB: Started
C:\MySQL\bin> mysql firstdb;
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 3 to server
version: 4.0.1-alpha-max-log
Now let's simulate a crash by stopping the server and deleting the customer
data and index files:
mysql> exit
Bye
C:\MySQL\bin> del c:\MySQL\data\firstdb\customer.*
Depending on your setup, you may not have permission to remove the files
until you shut the server down or change to root.
If you delete the files, and still have a connection active, and then try to
perform a query on the customer table, you may still get results, as the
results may be cached. But when you shut the server down and restart, you
will not be able to find any customer data:
In order to restore it, you need to use the binary update log. First, let's look
at what's in the binary update log. It's not a text file, so you can't use an
ordinary text editor, but MySQL comes with a utility, mysqlbinlog. Running
this utility on one of the binary update log files will output the contents of
the log. The syntax is as follows:
mysqlbinlog path_to_binary_update_log
Let's see what's in the log:
Of course, this output is not much good on the screen. You can pipe it to the
actual database as follows:
C:\MySQL\bin> mysqlbinlog ..\data\speed_demon-
bin.001 | mysql firstdb
Now, you can view your table and see that the record has been restored:
-o ,--offset=N
-h ,--host=server
-P ,--port=port
Shows only the queries, not any extra info
Skips a number of entries starting from the beginning, specified byN
Gets the binary log from the specified server Uses the specified port to
connect to the remote server
-u ,--user=username Username to connect to the server
-p , - Password to connect to the server
password=password
Starts reading the binary log at position N Gets the raw table dump
usingCOM_TABLE_DUMB Displays the version and exits
Backing Up and Restoring InnoDB Tables
Ordinarily, to make a backup of InnoDB tables, you need to either take the
database server down or shut out access from clients. There are two main
ways to back up, and for critical data you should use both methods. One is
to use mysqldump (the same as for MyISAM tables), with no write access
permitted for the duration of the backup. This creates a text file with the
SQL statements needed to restore the tables. The second is to make copies
of the binary database files. To do this, you need to shut down the database
without any errors and then copy the data files, InnoDB log files,
configuration file (my.cnf or my.ini file), and the definition files (.frm) to a
safe place:
% mysqladmin shutdown
% ls -l
total 76145
(listing of the data directory, showing the ib* InnoDB data and log files and the database directories such as firstdb)
You should copy all the files from the data directory starting withib , as
these are the InnoDB logs and data. For instance:
% cd /usr/local/mysql/data/
% cp ib* /db_backups/
Now copy the configuration files (remember to copy them all if you have more than one):
% cp /etc/my.cnf /db_backups/
Then copy the definition files, in this case innotest inside thefirstdb
directory (all definition files, as well as MySQL data and index files, exist
inside a directory with the same name as the database):
% cp firstdb/innotest.frm /db_backups/
Now, let's restart the server so that a malicious user can come along and destroy the data:
% mysqld-max
% Starting mysqld daemon with databases from
/usr/local/mysql/data % mysql firstdb
mysql> TRUNCATE innotest;
Query OK, 11 rows affected (0.00 sec)
All the data has been deleted. Your phone will soon start ringing, and it's
time to restore the backup. Once again you need to bring down the server to
prevent interference:
% mysqladmin shutdown
020601 21:20:34 mysqld ended
% cp /db_backups/ib* /usr/local/mysql/data/
cp: overwrite
'/usr/local/mysql/data/ib_arch_log_0000000000'? y
cp: overwrite '/usr/local/mysql/data/ib_logfile0'?
y
cp: overwrite '/usr/local/mysql/data/ib_logfile1'?
y
cp: overwrite '/usr/local/mysql/data/ibdata1'? y
% mysqld-max
% Starting mysqld daemon with databases from
/usr/local/mysql/data % mysql firstdb
mysql> SELECT * FROM innotest;
+------+------+
| f1 | f2 |
+------+------+
| 1 | NULL |
| 2 | NULL |
| 3 | NULL |
| 4 | NULL |
| 5 | NULL |
| 6 | NULL |
| 7 | NULL |
| 8 | NULL |
| 9 | NULL |
| 10 | NULL |
+------+------+
10 rows in set (0.12 sec)
In the case of a server crash, to restore InnoDB data you simply need to
restart the server. If general logging and log archiving are on (which is
recommended), the InnoDB tables will automatically restore themselves
from the MySQL logs (the MySQL logs are the "ordinary" logs, not the
InnoDB logs). Any uncommitted transactions present at the time of the
crash will be rolled back. The output will look similar to this:
InnoDB: Database was not shut down normally.
InnoDB: Starting recovery from log files...
InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 0 24785115
InnoDB: Doing recovery: scanned up to log sequence number 0 24850631
InnoDB: Doing recovery: scanned up to log sequence number 0 24916167
InnoDB: 1 uncommitted transaction(s) which must be rolled back
InnoDB: Starting rollback of uncommitted transactions
InnoDB: Rolling back trx no 982
InnoDB: Rolling back of trx no 982 completed
InnoDB: Rollback of uncommitted transactions completed
InnoDB: Starting an apply batch of log records to the database...
InnoDB: Apply batch completed
InnoDB: Started
mysqld: ready for connections
Tip InnoDB files are not as portable as MyISAM files. They can only be used on other platforms if that machine has the same floating-point number format as the machine on which they were generated. This means, for example, you can move the files between Intel x86 machines, no matter what operating systems you're using.
mysqldump creates a text file containing the SQL statements needed to regenerate the tables. Using the file as input to the MySQL daemon restores the data.
The binary update log, if enabled, keeps a record of all changes to the database tables. The mysqlbinlog utility can be used to view the contents of the log or to reapply the updates made to the database since a backup.
InnoDB tables are not stored in per-table files like MyISAM tables, so they require extra care. They also have their own logging mechanism.
Understanding Replication
Replication works as follows: The slave starts with an exact copy of the
data on the master. Binary logging is then enabled on the master, and the
slave connects to the master periodically and views the binary log to see
what changes have been made since the last time it connected. The slave
will then automatically repeat these statements on its server.
The master.info file on the slave allows it to keep track of which point it is at in the master binary log. The relation between the master binary log and the slave's master.info file is important; if these two fall out of sync, the data may no longer be identical on both servers. Replication can be useful as a form of backup (against disk error, not human error) and also to speed up performance. It is a practical way of running multiple databases, particularly in an environment where SELECT statements far outnumber INSERT or UPDATE statements. (The slaves can then be optimized entirely for SELECT statements, and the master can handle the INSERT and UPDATE statements.)
Setting Up Replication
A number of scenarios exist for setting up a master and slave relationship.
You'll see some of these in the examples later in this chapter. The steps that
follow are the basic ones to get replication started.
2. Make a copy of the tables and data. If the database has been used for a while and binary logging is already in place (see Chapter 10, "Basic Administration," for details on binary logging), take note of the offset immediately after the backup. (See the "Replication Commands" section later in the chapter for more information.) The LOAD DATA FROM MASTER operation on the slave can take the place of this step. LOAD DATA FROM MASTER currently only works with MyISAM tables and is best used with small datasets or in situations where the data on the master can be locked for the duration of the operation. Version 4.1 should solve some of these deficiencies.
Add the following options to the master's configuration file:
[mysqld]
log-bin
server-id=1
Add options such as these to the slave's configuration file:
[mysqld]
master-host=master_hostname
master-user=replication_user
master-password=replication_password
master-port=master_TCP/IP_port
server-id=unique_number
2. Copy the data taken from the master onto the slave (if you're not running LOAD DATA FROM MASTER).
3. Start the slave server.
4. If you haven't yet got the data, use LOAD DATA FROM MASTER to access it. Now, with both servers running, you should start to see replication occurring.
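Note that the replication user referenced in the slave configuration must exist on the master with the REPLICATION SLAVE permission. On MySQL 4.0.2 and later it could be created with a statement along these lines (the '%' host is only a placeholder; restrict it to the slave's host where possible):
mysql> GRANT REPLICATION SLAVE ON *.* TO 'replication_user'@'%' IDENTIFIED BY 'replication_password';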
Replication Options
Table 12.1 describes the various replication options available to the master,
and Table 12.2 describes the various replication options available to the
slave.
Table 12.1: Master Configuration File Options

log-bin=filename
Activates binary logging. This option must be present on the master for replication to occur. The filename is optional. To clear the log, run RESET MASTER, and do not forget to run RESET SLAVE on all slaves. By default the binary log will be called hostname.xxx, with xxx being a number starting at 001 and incrementing by one each time the log is rotated.

log-bin-index=filename
Specifies the name of the binary log index file (which lists the binary log files in order, so the slave will always know the active one). The default is hostname.index.

sql-bin-update-same
If set, setting SQL_LOG_BIN also sets SQL_LOG_UPDATE to the same value, and vice versa.

binlog-do-db=database_name
Logs updates to the binary log only from the database_name database.

binlog-ignore-db=database_name
Logs all updates to the binary log except those from the database_name database. You can also set the database to ignore on the slave.
Table 12.2: Slave Configuration File Options

master-host=host
Specifies the hostname or IP address of the master to replicate from. Once replication has begun, the master.info data will determine this, and you'll need to use a CHANGE MASTER TO statement to change it.

master-user=username
Specifies the username the slave will connect to the master with. The user should have REPLICATION SLAVE permission on the master. Defaults to test. Once replication has begun, the master.info data will determine this, and you'll need to use a CHANGE MASTER TO statement to change it.

master-password=password
Specifies the password with which the slave will connect to the master. Defaults to an empty string. Once replication has begun, the master.info data will determine this, and you'll need to use a CHANGE MASTER TO statement to change it.

master-port=portnumber
Specifies the port the master listens on (defaults to the value of MYSQL_PORT, usually 3306). Once replication has begun, the master.info data will determine this, and you'll need to use a CHANGE MASTER TO statement to change it.

master-connect-retry=seconds
If the connection between the master and slave goes down, MySQL will wait this many seconds before trying to reconnect (default 60).

master-ssl
Specifies that replication take place using Secure Sockets Layer (SSL).

master-ssl-key=key_name
If SSL is set to be used (the master-ssl option), this specifies the master SSL key filename.

master-ssl-cert=certificate_name
If SSL is set to be used (the master-ssl option), this specifies the master SSL certificate name.

report-host=hostname
Specifies the hostname of the slave to be reported to the master (for the master to use during a SHOW SLAVE HOSTS statement). Not set by default. Other methods of determining the host are not reliable; hence the need for this option.

report-port=portnumber
Specifies the port for connecting to the slave that is reported to the master. You should only need this if the slave is on a nondefault port, or connection takes place in a nonstandard way.

replicate-do-table=db_name.table_name
Tells the slave to replicate only statements that update the specified table name, from the specified database. You can use this option multiple times to replicate multiple tables.

replicate-ignore-table=db_name.table_name
Tells the slave not to replicate any statement that updates the specified table (even if other tables are also updated by the same statement). You can specify multiple instances of this option.

replicate-wild-do-table=db_name.table_name
Tells the slave to replicate statements only where they match the specified table (similar to the replicate-do-table option), but where the match takes into account wildcards. For example, where the table name is db%.tb%, the match will apply to any database beginning with the letters db and any table beginning with the letters tb.

replicate-wild-ignore-table=db_name.table_name
Tells the slave not to replicate a statement that updates the specified table, even if other tables are also updated by the same statement, similar to the replicate-ignore-table option, except that wildcards are taken into account. For example, where the table name is db%.tb%, replication will not be performed where the database begins with the letters db and the table begins with the letters tb. You can specify multiple instances of this option.

replicate-do-db=database_name
Tells the slave thread to replicate a statement only when the database is database_name. You can use this option multiple times to replicate multiple databases.

replicate-ignore-db=database_name
Tells the slave thread not to replicate a statement when the current database is database_name. You can use this option multiple times to specify multiple databases to ignore.

replicate-rewrite-db=master_database->slave_database
If the database on your slave has a different name to that on the master, you'll need to map the relationship with this option.

log-slave-updates
Tells the slave to log replicated updates to its own binary log. Not set by default. If you plan to use the slave as a master to another slave, you'll need to set this option.

slave-skip-errors=[err_code1,err_code2,... | all]
Tells the slave to continue replicating even when a statement results in one of the listed error codes (or any error, if all is specified). Use with care, as skipped statements can leave the slave out of sync with the master.

skip-slave-start
With this option set, replication will not begin when the server starts. You can manually begin it with the SLAVE START command.

slave_compressed_protocol=#
If set to 1, MySQL uses compression to transfer the data between slave and master if both servers support this.

slave_net_timeout=#
Determines the number of seconds to wait for more data from the master before a read is aborted.
Replication Commands
You should be familiar with both the slave and the master replication
commands. The following are the slave replication commands:
SLAVE START and SLAVE STOP start and stop the replication process, respectively.
SHOW SLAVE STATUS returns information about the slave, including the important facts of whether the slave is connected to the master (Slave_IO_Running), whether replication is running (Slave_SQL_Running), which binary log is being used (Master_Log_File and Relay_Master_Log_File), and what position is current in the binary log (Read_Master_Log_Pos and Exec_master_log_pos).
The RESET SLAVE statement causes the slave to forget its position in the master logs.
LOAD DATA FROM MASTER takes a copy of the data on the master and brings it onto the slave. Currently, this is not useful for large datasets or for situations where the master cannot be unavailable for long, as it acquires a global read lock when copying the data. It also updates the values of MASTER_LOG_FILE and MASTER_LOG_POS. Currently it only works with MyISAM tables. This statement is likely to become the standard way of preparing the slave in future, so be sure to read your latest documentation.
SHOW MASTER LOGS shows the list of binary log files available. You'd
usually use this before purging the logs.
SHOW SLAVE HOSTS returns a list of slaves registered with the master (note that by default a slave does not register itself but requires the report-host configuration option to be set).
Replication Complexities
The following are a few crucial issues you need to keep in mind when
setting up and configuring replication:
FLUSH statements are not replicated, which will affect you if you update the permission tables directly on the master and then use FLUSH to activate the changes. The changes will not take effect on the slaves until you run a FLUSH statement there as well.
Make sure your masters and slaves have the same character set.
The RAND() function does not replicate properly when passed a random expression as an argument; you can safely use something like UNIX_TIMESTAMP() as the argument instead. Queries that update data and use user variables are also not replicated safely (although this is set to change; check your latest documentation).
For the examples that follow, the slave's configuration file contains options such as these:
master-host = 192.168.4.100
master-user = replication_user
master_password = replication_pwd
server-id = 3
replicate-do-db = replication_db
With the slave running, the slave status shows that replication is working:
mysql> SHOW SLAVE STATUS\G
          Master_Host: 192.168.4.100
          Master_User: replication_user
          Master_Port: 3306
        Connect_retry: 60
      Master_Log_File: g-bin.001
  Read_Master_Log_Pos: 79
       Relay_Log_File: s-bin.002
        Relay_Log_Pos: 124
Relay_Master_Log_File: g-bin.001
     Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
      Replicate_do_db: replication_db
  Replicate_ignore_db:
           Last_errno: 0
           Last_error:
         Skip_counter: 0
  Exec_master_log_pos: 79
      Relay_log_space: 132
1 row in set (0.00 sec)
mysql> INSERT INTO replication_table (f1,f2) VALUES(2,'second');
Query OK, 1 row affected (0.06 sec)
mysql> SELECT * FROM replication_table;
+------+--------+
| f1 | f2 |
+------+--------+
| 1 | first |
| 2 | second |
+------+--------+
2 rows in set (0.00 sec)
Back on the master you can run DELETE and UPDATE statements, and these will be mirrored on the slave.
The slave does not have to stay connected to the master at all times to
remain in sync as long as the binary logs are correct, as the next example
demonstrates. First, shut down the slave:
% /usr/local/mysql/bin/mysqladmin -uroot -pg00r002b shutdown
020821 17:25:37 mysqld ended
Then add another record to the master:
mysql> INSERT INTO replication_table (f1,f2) VALUES(3,'third');
Query OK, 1 row affected (0.03 sec)
Back on the slave, restart the server, connect to the replication_db database,
and you'll see the new record has been added:
% bin/mysqld_safe &
[1] 1989
% /usr/local/mysql/bin/mysql -uroot -pg00r002b
mysql
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 3 to server
version: 4.0.2-alpha-max-log
The master could go down as well, and the slave would keep trying to
reconnect (every master-connect-retry seconds, which has a
default of 60) until the master was up again.
Be careful when making changes to the binary logs, though, because this is
the only thing the slave has to go on. The next example shows an example
where data can get lost. First, shut down the slave:
% /usr/local/mysql/bin/mysqladmin -uroot -pg00r002b shutdown
020821 17:25:37 mysqld ended
As before, add another record to the master, but this time run the RESET MASTER statement afterward (this removes all old binary logs and starts again with binary log 1):
Now, when you restart the slave, it will not pick up the change:
% bin/mysqld_safe &
[1] 1989
You can bring them back into sync by resetting the slave, as follows:
mysql> RESET SLAVE;
Query OK, 0 rows affected (0.01 sec)
The slave status has now changed and will again be looking at the
beginning of binary log 1, or position 79:
Now back on the master, add the record again, and take a look at the master
status, where the binary log has moved to position 180:
If you're having an astute day, you may have noticed that the record has
been added twice on the master:
Now copy this data to the slave, and check the offset on the master's binary
log. Make sure no new data was written to the master after you copied the
data but before you could check the status:
Now, on the slave, perform the same operations as in the previous example: copy the data, set the configuration options (with the following change), delete the master.info file if it exists (it would have been created in the previous example in the data directory, C:\mysql\data or /usr/local/mysql/data), and restart the server. The only difference is that the configuration file should contain the option:
skip-slave-start
You don't want to start the slave replicating until you have set it to begin at
the right point. Now add some more records to the master:
Now, on the slave, you'll need to tell it to start with the correct binary log file and the correct offset. To do this, set the MASTER_LOG_FILE to g-bin.001 (or whatever was shown when you ran SHOW MASTER STATUS), and set the MASTER_LOG_POS to 280 (or whatever is applicable in your case). Once this is done, start the slave replicating and test the results.
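Assuming the master status showed the log g-bin.001 and position 280, the statements on the slave would look something like this:
mysql> CHANGE MASTER TO MASTER_LOG_FILE='g-bin.001', MASTER_LOG_POS=280;
mysql> SLAVE START;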
Copy this data onto a clean slave (if you've run previous examples on the slave server, make sure you delete the master.info file, and start the slave with the skip-slave-start option).
Now flush the logs on the master, simulating a server that's been running a while:
mysql> FLUSH LOGS;
mysql> FLUSH LOGS;
Now, when you look at the master status, you'll see that the server is already
onto its third binary log:
Now start the slave, and start it replicating from the correct point:
The slave will now start from the correct log on the master. You still have
two other binary logs on the master server, taking up space, and you'll need
to start maintaining the log files to ensure they don't get out of hand. You
may feel tempted to delete logs one and two, but you cannot safely do this
if there is still the possibility that slaves may need to make use of them.
To check this, you'll need to check the slave status for each slave. In this
case, there's only one:
You can see that the slave is using g-bin.003 and is up-to-date (position 4). If all slaves are up-to-date, you can safely remove binary logs 1 and 2 from the master with the PURGE MASTER LOGS statement, as follows:
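For example, to remove all logs up to (but not including) the one the slave is currently reading:
mysql> PURGE MASTER LOGS TO 'g-bin.003';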
If you did a listing in the data directory (or wherever you've specified the
binary logs to be), you'll see that the two earlier ones have been removed.
This statement will fail if an active slave is attempting to read a log you're
trying to delete, giving the following error:
If a slave is not connected and you purge a binary log that it has not yet used, that slave will be unable to continue replicating. At some stage this process may be automated, but for now you'll have to check each slave manually to see its position. Let's see what would happen if you didn't check. First, stop the slave:
Now, if you restart the slave, it will not replicate, as it's looking for a nonexistent binary log:
mysql> SLAVE START;
Query OK, 0 rows affected (0.00 sec)
The problem now is that the most recent INSERT statement on the master
does not appear in a log anywhere because all "old" logs have been purged.
If you've made a backup of the binary logs, you can easily restore it, but if
not, you can manually rerun the statement and then make the slave look at
the most recent log, as follows:
This example demonstrates that replication is not about an exact copy of the
data. Rather, it's about replicating the statements from one server onto the
other. This will result in an exact copy of the data, but if themaster.info
file or the binary logs are tampered with, MySQL will not be able to follow
the commands in sequence, which may result in the data getting out of sync.
Copy the data to the slave, and start the slave. The slave should be clean (no
master.info file and no data in the replication_db database) and contain
what looks like an ordinary set of options in its configuration file, as
follows:
master-host = 192.168.4.100
master-user = replication_user
master_password = replication_pwd
server-id = 3
replicate-do-db = replication_db
Start the slave server and take a look at the slave status to see that
replication has started correctly:
mysql> SHOW SLAVE STATUS\G
          Master_Host: 192.168.4.100
          Master_User: replication_user
          Master_Port: 3306
        Connect_retry: 60
      Master_Log_File: g-bin.001
  Read_Master_Log_Pos: 280
       Relay_Log_File: s-bin.003
        Relay_Log_Pos: 325
Relay_Master_Log_File: g-bin.001
     Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
      Replicate_do_db: replication_db
  Replicate_ignore_db:
           Last_errno: 0
           Last_error:
         Skip_counter: 0
  Exec_master_log_pos: 280
      Relay_log_space: 329
1 row in set (0.00 sec)
The problem is that the slave started replicating from the beginning of the first binary log, which means it repeated the two INSERT statements even though the copy of the data was made after that. There are two solutions. You can either run RESET MASTER on the master immediately after making the copy, or run a CHANGE MASTER TO statement on the slave before starting to replicate to set it to begin at the right point, as you did in the "Replicating with an Active Binary Log on the Master" section (which entails starting the server with the skip-slave-start option).
Add a few records onto the master and reset the master, so you do not
repeat the error of the previous example:
Copy the data to the slave and start the server replicating. Add a new record
on the master:
mysql> INSERT INTO replication_table (f1,f2) VALUES(3,'third');
Query OK, 1 row affected (0.01 sec)
The data on the slave should now appear as follows:
Although if you checked the data on both servers they'd be identical, there's actually an error on the slave, as it has attempted to INSERT the record twice. Unless you used the risky slave-skip-errors option in the configuration file, replication will now stop, and the slave will report the erroneous query, as follows:
The error is clearly reported so that you can investigate the cause and take action. In this case, the error was that the statement was repeated on the slave when it shouldn't have been. You can get the slave replicating correctly once more by telling it to skip the next command in the master binary log and continue from there. You use the SET SQL_SLAVE_SKIP_COUNTER command for this. Once this is run, you can start the slave (you can only tell a slave to skip when replication is stopped), and it will continue as before, as follows:
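A minimal sketch of that sequence, assuming a single statement needs to be skipped (some versions require the GLOBAL keyword):
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
mysql> SLAVE START;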
Summary
Replication is a useful feature that allows you to keep an exact copy of data
from one server on another server. The master database writes to the binary
log, and any number of slave servers connect to the master, read the binary
update log, and replicate these statements on their servers.
The relationship between the binary log on the master and the
master.info file on the slave is the key to keeping replication in sync.
If the master binary logs are removed before the slaves have used them,
replication will fail.
Also important when tuning is the information supplied by the server itself.
You can view this from the command line with the following:
% mysqladmin extended-status
or, when connected to the server, with the SHOW STATUS statement.
The list of variables and status information grows longer with each new
release. Your version will probably have more than this, and you should
read the latest documentation to see exactly what these extras do. A full
explanation of those currently in use is given later in this chapter, in Table
13.2 .
Warning
On Windows, the .cnf extension can conflict with FrontPage and NetMeeting.
As a starting point, I suggest replacing your my.cnf file (or my.ini) with one of these configurations, choosing the configuration closest to your needs.
Choosing the right configuration for your system will get you a large part of the way toward optimal performance, but achieving optimal usage requires fine-tuning the configuration for your system and usage specifics. You'll see some of the variables in the following sections.
Optimizing table_cache
The table_cache variable is one of the most useful variables to adjust.
Every time MySQL accesses a table, if there is space in the cache, that table
is placed there. It's faster to access the table in memory than the table on
disk. You can see whether you need to increase the value of your table_cache by examining the value of open_tables at peak times (one of the extended status values you saw with SHOW STATUS or mysqladmin extended-status). If you find that open_tables is at the same value as your table_cache, and the value of opened_tables (another extended status value) is increasing, you should increase the table_cache if you have enough memory.
Note The number of open tables can be higher than the number of tables in your databases. MySQL is multithreaded, and there may be many queries running at a time, each of which may open a table.
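One way to gather these figures (on a Unix-like system) is with the mysqladmin commands mentioned earlier, for example:
% mysqladmin variables | grep table_cache
% mysqladmin extended-status | grep -i 'open.*tables'
On Windows you can run the same mysqladmin commands and look for the values in the full output.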
table_cache - 512
open_tables - 103
opened_tables - 1723
uptime - 4021421 (measured in seconds)
It looks like the table_cache is set too high in this case. The server has
been up for a long time (if the server had just come up you wouldn't know if
thetable_cache would be reached soon or if theopened_tables
would soon begin to increase). The number of opened tables is reasonably
low, and the number of open tables is well below what it could be,
considering that this is a peak time.
table_cache - 64
open_tables - 64
opened_tables - 431
uptime - 1662790 (measured in seconds)
table_cache - 64
open_tables - 64
opened_tables - 22423
uptime - 19538
Optimizing key_buffer_size
The key_buffer_size affects the size of the index buffers, which in turn affects the speed of index handling, in particular index reads. The higher the value, the more of the indexes MySQL can hold in memory, which is much faster to access than from disk. A suggested rule of thumb is to set it to between a quarter and half of the available memory on your server (if your server is dedicated to MySQL). You can get a good idea of how to adjust the key_buffer_size by comparing the key_read_requests and key_reads status values. The ratio of key_reads to key_read_requests should be as low as possible, with 1:100 being about the highest acceptable limit (1:1000 is better, 1:10 is terrible). The key_reads value indicates how many times the key needs to be read from disk, which is what you want to avoid by setting the key buffer to as high a value as possible.
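You can pull both counters the same way as before, for example:
% mysqladmin extended-status | grep -i key_read
As a worked example with made-up numbers: if Key_read_requests is 10,000,000 and Key_reads is 5,000, the ratio is 1:2000, comfortably within the guideline; if Key_reads were 1,000,000, the ratio would be 1:10, and key_buffer_size should be increased.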
Scenario 1 reflects a healthy situation; the ratio is better than 1:10000. But alarm bells should be ringing in scenario 2, where the ratio is a worrying 1:11. As a solution, you should increase the key_buffer_size as much as the memory allows. A hardware upgrade is necessary if you don't have enough memory to cater for this.
Note
Persistent connections cannot be used in CGI mode, and are affected by the
KeepAlive settings in the Apache web server.
This scenario examines a web server under heavy load that uses persistent
connections:
max_connections - 250
max_used_connections - 210
threads_connected - 202
threads_running - 1
It may look like MySQL is wasting resources in this scenario, but in this case the 202 threads_connected are persistent, based upon the number of instances of the web server, and are hardly taking up any resources. Only one thread is actually running, so the database is probably not under much strain. If threads_connected gets ever closer to max_connections without any problems, you may even want to increase max_connections to avoid exceeding the connection limit. You can see how close the connections have ever gotten to the maximum by looking at the max_used_connections value. If this is close or equal to max_connections, it's certainly time to make allowances for an increase.
Warning
When testing, make sure to test properly under load. There are some
documents on the Web with all kinds of erroneous comparisons between
persistent and nonpersistent connections.
In a system such as the previous scenario, a climbing threads_running
value is often an indicator that the database server is not handling the load.
Examining the process list can help identify the queries causing the
blockage. What follows is a portion of the output from a database server
just before it crashed. The number of threads_connected continued increasing until the server could handle it no longer and fell over. The processlist output helped to identify the problematic queries:
% mysqladmin processlist
Id     User   Host        Db    Command  Time  State                 Info
6464   mysql  websrv2...  news  Sleep    ...
6482   mysql  websrv2...  news  Sleep    ...
6486   mysql  websrv2...  news  Sleep    ...
7549   mysql  websrv2...  news  Sleep    ...
8126   mysql  websrv2...  news  Sleep    ...
9938   mysql  websrv2...  news  Sleep    ...
1696   mysql  websrv2...  news  Sleep    ...
4143   mysql  websrv2...  news  Sleep    ...
5071   mysql  websrv2...  news  Sleep    ...
5135   mysql  websrv2...  news  Sleep    ...
92707  mysql  zubat...    news  Sleep    1
93014  mysql  zubat...    news  Query    156   Copying to tmp table  select distinct arts.a_id, arts.headline1, nartsect.se_id, arts.mdate, arts.set_
93267  mysql  zubat...    news  Query    ...   Copying to tmp table  select distinct arts.a_id, arts.headline1, nartsect.se_id, arts.mdate, ...
...    mysql  zubat...    news  Query    ...                         select s_id from arts where a_id='32C436'
Of the two web servers indicated in the list, websrv2 is behaving normally
(all its threads are completed, and the connections are sleeping), but zubat
has problem queries piling up.
There are many queries, and this is only a small portion of the whole list,
but the query you should examine in this case is the one beginning like this:
select distinct arts.a_id, arts.headline1,
nartsect.se_id, arts.mdate, arts.set ...
Notice how the status for all of these queries is Copying to tmp table, and the others were Locked. In this case, the problem was that a
developer had made a change to the query so that it no longer used an
index. Chapter 4 , "Indexes and Query Optimization," discusses this topic in
more detail.
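A quick way to confirm whether a query is using an index is EXPLAIN; for instance, taking the short query from the process list above (output not shown here):
mysql> EXPLAIN SELECT s_id FROM arts WHERE a_id='32C436';
If the type column of the output shows ALL rather than ref or const, the query is scanning the entire table and the relevant index is missing or unusable.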
Routinely examining the processlist output can help identify queries
that are slow even before they cause something as drastic as the server to
fail.
The slow_queries value is another good one to examine. If it is creeping up all the time, it probably indicates a problem. A well-tuned system should have as few slow queries as possible. Some complex joins may be unavoidably slow, but it's more likely that slow queries are just badly optimized.
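To catch such queries, the slow query log can be enabled from the configuration file; a minimal sketch (the 5-second threshold is only an example, and some older versions need the set-variable syntax):
[mysqld]
log-slow-queries
long_query_time=5
log-long-format
Adding log-long-format also sends queries that do not use an index to the slow query log.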
For example, an InnoDB data file configuration spreading the tablespace over two disks might look like this:
innodb_data_file_path=/disk1/ibdata1:900M;/disk2/ibdata2:50M:autoextend
Here the two data files are placed on different disks (called disk1 and disk2). The data will first be placed in ibdata1, until the 900MB limit is reached, and then will be placed into ibdata2. Once the 50MB limit is reached, ibdata2 will automatically extend in 8MB chunks.
If a disk becomes physically full, you'll need to add another data file on another disk, which requires some manual work for configuration. To do this, look at the physical size of the final data file and round it down to the nearest megabyte. Set this data file size specifically, and add the new data file definition. For example, if the disk2 specified previously fills up with ibdata2 at 109MB, you'll use something like the following definition:
innodb_data_file_path=/disk1/ibdata1:900M;/disk2/i
bdata2:109M;/disk3/ ibdata3:500M:autoextend
You'll need to restart the server for the changes to take effect.
The following are the mysqld startup options, which can be given on the command line or in the configuration file:
--basedir=path
The path to the base directory or the MySQL installation directory. Other paths are usually taken relative to this.
--big-tables
Allows large result sets by saving all temporary sets on file when memory is not sufficient.
--bind-address=IP
The Internet Protocol (IP) address or hostname to which to bind MySQL.
--character-sets-dir=path
The directory where the character sets are located.
--chroot=path
For security purposes, you can start MySQL in a chroot environment with this option. This causes MySQL to run in a subset of the directories, hiding the full directory structure. This does, however, limit the usage of LOAD DATA INFILE and SELECT ... INTO OUTFILE.
--core-file
Writes a core file if mysqld dies.
-h, --datadir=path
The path to the data directory.
--debug[...]=
Writes a debugging log (only available if MySQL has been compiled with debugging support).
--default-character-set=charset
Sets the default character set (the default is latin1).
--default-table-type=type
Sets the default table type (tables of this type are created if no table type is specified in the CREATE statement). By default, this will be MyISAM.
--delay-key-write-for-all-tables
With this option, MySQL does not flush the key buffers between writes for any MyISAM table.
--des-key-file=filename
The default keys are read from this file. This is used by the DES_ENCRYPT() and DES_DECRYPT() functions.
--enable-external-locking
This option enables system locking. It should not be used on systems where the lockd daemon does not work fully. (This applied to Linux, although this may no longer be the case with newer versions.)
--enable-named-pipe
On Windows NT/2000/XP, this option enables support for named pipes.
-T, --exit-info
A bit mask of different flags used in debugging. It is not suggested you use this unless you know what you're doing!
--flush
This option ensures MySQL flushes all changes to disk after each SQL statement. Usually the operating system handles this, and you should not need to use this unless you're having problems.
-?, --help
Displays help and exits.
--init-file=file
Reads SQL statements from this file at startup.
-L, --language=...
This option sets the language to be used for client error messages. Can be the language or the full path to the language file.
-l, --log[=file]
Logs connections and queries to this file.
--log-isam[=file]
This option logs all changes to MyISAM or ISAM files to the specified file (only used when debugging these table types).
--log-slow-queries[=file]
Logs all queries that take longer than the value of the variable long_query_time (in seconds) to execute to the slow query log.
--log-update[=file]
Logs updates to the update log.
--log-bin[=file]
Logs updates to the binary log (required for replication and useful when restoring from backups).
--log-long-format
Logs more information. If the slow query log is being used (--log-slow-queries), any queries that do not use an index are logged there as well.
--low-priority-updates
If this option is used, all inserts, updates, and deletes will have a lower priority than selects. Where you don't want this to apply to all queries, you can use SET OPTION SQL_LOW_PRIORITY_UPDATES=1 to apply it to a specific thread or LOW_PRIORITY to apply it to a specific INSERT, UPDATE, or DELETE query.
--memlock
Locks mysqld into memory. This option is only available if your system supports the mlockall() function (as Solaris does). You'd normally only use this if the operating system is having problems and mysqld is swapping to disk. You can see if --memlock has been used by looking at the value of the locked_in_memory variable.
--safe-show-database
Not used in any but the earliest versions of MySQL 4. If set, users who do not have any privileges having anything to do with a database will not see that database listed when they perform a SHOW DATABASES statement (the SHOW DATABASES privilege removes the need for this).
--safe-user-create
This option adds to security by not allowing users to create new users (with the GRANT statement) unless they have INSERT privilege on the mysql.user table or one of the columns in that table.
--skip-grant-tables
Causes the server to ignore the privilege tables entirely. You should only use this if you have to (such as when you, as root, have forgotten the password). Once you've finished, run mysqladmin flush-privileges or mysqladmin reload to start using the privilege tables again.
--skip-external-locking
Does not use system (external) locking.
--skip-new
MySQL uses ISAM as a default table type and does not use some of the options that were new in version 3.23. It also implies --skip-delay-key-write. This option should not be needed anymore, unless its behavior changes.
--skip-symlink
This option ensures that one cannot delete or rename files to which a symlinked file in the data directory points. You should use it if you are not using symlinks, as a security measure to ensure no one can drop or rename a file outside of the mysqld data directory.
--skip-safemalloc
Skips checking for overruns for each memory allocation and memory freeing (these checks are done when MySQL is configured with --with-debug=full).
--skip-show-database
If set, the SHOW DATABASES statement does not return results unless the client has the PROCESS privilege. (The SHOW DATABASES privilege, introduced in early versions of MySQL 4, removes the need for this.)
--skip-stack-trace
Does not use stack traces (which is useful if running mysqld under a debugger). Some systems require this option to get a core file.
--skip-thread-priority
Disables the use of thread priorities for a faster response time.
--socket=path
The path to the socket file for use for local connections (instead of the default, usually /tmp/mysql.sock).
--sql-mode=option[,option[,option...]]
Sets the various differences between the ANSI standard and MySQL behavior.
--temp-pool
This option should only be needed when an operating system leaks memory when creating large numbers of new files with different names (as happened with some versions of Linux). Instead, MySQL will use a small set of names for temporary files.
--transaction-isolation={READ-UNCOMMITTED | READ-COMMITTED | REPEATABLE-READ | SERIALIZABLE}
Sets the default transaction isolation level. See the discussion in Chapter 4.
-t, --tmpdir=path
The directory used to store temporary files and tables. It's useful to make it different to your usual temporary space if that is too small to hold temporary tables.
-u, --user=[user_name | userid]
Supplies a username to run MySQL as. When starting mysqld as root, this option must be used.
-V, --version
Displays version information and exits.
-W, --warnings
Warnings will be displayed in the error file.
The following server variables can be viewed with SHOW VARIABLES or mysqladmin variables:
bdb_cache_size
The size of the buffer allocated to cache data and indexes for BDB tables. If your system does not use BDB tables, use the --skip-bdb option to avoid wasting memory.
bdb_home
The base directory for BDB tables, which should be the same as --datadir.
bdb_max_lock
The maximum number of locks that can be applied to a BDB table. Increase this if your transactions are likely to be long or your queries require many rows to be examined. Errors such as bdb: Lock table is out of available locks or Got error 12 from ... indicate a need to increase the value. The default value is 10000.
bdb-no-recover
If set, MySQL does not start BDB in recover mode. Usually you should only set this if there is corruption in the BDB logs that prevents a successful startup.
bdb-no-sync
If set, the BDB logs are not synchronously flushed.
bdb_logdir
The directory where the BDB log files are kept.
bdb_shared_data
If set, BDB is started in multi-process mode.
bdb_tmpdir
The directory BDB uses for temporary files.
binlog_cache_size
The cache size for transactions to be written to the binary log. If the transactions are large and take more than the default 32KB cache, you should increase this.
character_set
The default (current) character set.
character_sets
The full list of supported character sets. If you are compiling MySQL from source and know you are never going to use them, you can compile MySQL not to support the extra character sets.
concurrent_insert
If active (by default it is), you can insert into MyISAM tables at the same time as querying, giving a performance gain (as long as the table contains no gaps from previously deleted records; you can ensure this by regularly optimizing the tables). The --safe or --skip-new options nullify this.
connect_timeout
The time in seconds MySQL waits for packets before it times out with a Bad handshake. The default is 5 seconds. This helps to prevent denial-of-service attacks where many bad connection attempts are made in order to prevent legitimate users from connecting.
datadir
The path to the data directory.
delay_key_write
If active (the default), MySQL will not flush the key buffer for a table on every index update for tables with the DELAY_KEY_WRITE option. Rather, it will only be flushed when the table is closed. This increases the speed of key writes, but it also increases the chance of corruption, so you should regularly check the tables. You can specify that the DELAY_KEY_WRITE option is the default for all tables by using the --delay-key-write-for-all-tables option. The --safe or --skip-new options nullify this option.
delayed_insert_limit
After inserting this many delayed rows, the INSERT DELAYED handler checks whether any SELECT statements are pending and lets them run first.
delayed_insert_timeout
How long in seconds an INSERT DELAYED handler thread waits for further INSERT statements before terminating.
delayed_queue_size
The number of rows allocated for the INSERT DELAYED queue. If this limit is reached, clients performing INSERT DELAYED will wait until there is space.
flush
If set, MySQL flushes all changes to disk after each SQL statement. Usually the operating system handles this. Defaults to OFF, as you should not need to use this unless you're having problems.
flush_time
The time in seconds between automatic flushes (where the tables are closed and synchronized to disk). This is usually set to 0 unless you're running a system with very few resources or Windows 95/98/Me.
ft_min_word_len
The minimum length of a word to be included in a FULLTEXT index.
ft_max_word_len
The maximum length of a word to be included in a FULLTEXT index.
ft_max_word_len_sort
Words of this length or less are inserted into the FULLTEXT index with the fast index re-creation method when the index is rebuilt; words longer than this are inserted the slow way. You are unlikely to want to change the default (20) unless the words in your index are of unusual average lengths. If the value is set too high, the process will be slower, as the temporary files will be bigger and fewer keys will be in one sort block. If the value is too low, too many words will be inserted the slow way.
ft_boolean_syntax
The list of operators supported in boolean full-text searches.
have_bdb
Set to YES if MySQL supports BDB tables, or DISABLED if the --skip-bdb option is used.
have_innodb
Set to YES if MySQL supports InnoDB tables, or DISABLED if the --skip-innodb option is used.
have_raid
Set to YES if MySQL was compiled with RAID table support.
have_openssl
Set to YES if MySQL supports SSL (encrypted) connections.
init_file
The name of the file of SQL statements (specified with --init-file) run when the server starts.
innodb_data_home_dir
The common part of the directory path for all InnoDB data files.
innodb_data_file_path
The names and sizes of the individual InnoDB data files (see the discussion earlier in this chapter).
innodb_mirrored_log_groups
Specifies the number of identical copies of log groups to keep for the database. Currently this should be 1.
innodb_log_group_home_dir
The directory where the InnoDB log files are written.
innodb_log_files_in_group
Specifies the number of log files in the log group. 3 is the suggested value (logs are written in rotation).
innodb_log_file_size
Specifies the size in megabytes of each log file in a log group. Suggested values are from 1MB to 1/innodb_log_files_in_group of the innodb_buffer_pool_size. A high value saves disk input/output (I/O) because less flush activity is needed, but slows recovery after a crash. The total size of the log files should not be more than 4GB on 32-bit computers.
innodb_log_buffer_size
Specifies the size of the buffer used to write logs. A suggested value is 8MB. The larger the buffer, the less disk I/O, because then transactions do not need to be written to disk until they are committed.
innodb_flush_log_at_trx_commit
If set, logs are flushed to disk as soon as the transaction is committed (and are therefore permanent and able to survive a crash). This should not normally be set to anything but ON if your transactions are important. You can set this to OFF if performance is more critical and you want to reduce disk I/O at the cost of safety.
innodb_log_arch_dir
Specifies the directory where the logs are to be archived. Currently this should be the same as innodb_log_group_home_dir because log archiving is not currently used.
innodb_log_archive
If set, InnoDB log files will be archived. Currently, MySQL recovers using its own log files, so this should be set to OFF.
innodb_buffer_pool_size
Specifies the size in bytes of the memory buffer used to cache table indexes and data. The larger this is, the better the performance, because less disk I/O is then required. Up to 80 percent of memory is suggested on a dedicated database server, as anything larger may cause operating system paging.
innodb_additional_mem_pool_size
Specifies the size in bytes of a memory pool used to store information about the internal data structures. 2MB is a possibility, but if you have many tables, make sure there is enough memory allocated; otherwise MySQL will use operating system memory (you can see the warnings in the error log if this occurs and increase the value).
innodb_file_io_threads
The number of file I/O threads in InnoDB. The suggested value is 4, but it is suspected Windows may benefit from a higher setting.
innodb_lock_wait_timeout
The time in seconds an InnoDB transaction waits for a lock before rolling back. InnoDB detects deadlocks automatically in its own lock tables, but if they come from outside (such as a LOCK TABLES statement), deadlock may arise, in which case this value is used.
innodb_flush_method
The method InnoDB uses to flush data and logs to disk.
interactive_timeout
The time in seconds that the server waits for any activity on an interactive connection (one using the CLIENT_INTERACTIVE option when connecting) before closing it. The wait_timeout variable applies to ordinary connections. The default is 28800.
join_buffer_size
The size in bytes of the buffer used for full joins (joins where no index is used). The buffer is allocated for each full join. Increasing this will make full joins faster, although the best way to speed up a join is by adding appropriate indexes.
key_buffer_size
The size in bytes of the buffer used for index blocks. This is discussed fully in the section "Optimizing key_buffer_size."
language
The language used for error messages.
long_query_time
The time in seconds that defines a slow query. Queries that take longer than this will cause the slow_queries counter to be incremented and will be logged in the slow query log file if slow query logging is enabled.
lower_case_table_names
Set to 1 if table names are stored in lowercase and are case insensitive. The default is 0.
max_allowed_packet
The maximum allowable size in bytes of one packet of data. The message buffer is initialized to the size specified by net_buffer_length, but it can grow up to this size. If you use large BLOB or TEXT columns, set this to the size of the largest column.
max_binlog_size
As soon as the current binary log exceeds this size, the logs will be rotated and a new one created.
max_connect_errors
After this many interrupted connections from a host, that host is blocked from further connections until FLUSH HOSTS is run.
max_delayed_threads
The maximum number of threads used to handle INSERT DELAYED statements.
max_heap_table_size
The maximum size to which HEAP tables are allowed to grow.
max_join_size
Joins that MySQL determines will return more rows than this limit will return an error. This prevents users from accidentally (or maliciously) running huge queries that could return many millions of rows and take up too many resources.
max_sort_length
When sorting BLOB or TEXT fields, only up to this number of bytes will be used. For example, if this is set to 1024, only the first 1024 characters will be used in sorting.
max_user_connections
The maximum number of simultaneous connections allowed per user (0 means no limit).
max_tmp_tables
The maximum number of temporary tables a client can keep open at the same time. (At the time of this writing, this is not used; check the latest documentation.)
max_write_lock_count
If this many consecutive write locks occur, MySQL will allow a number of read locks to run.
myisam_bulk_insert_tree_size
When MySQL inserts in bulk (for example, LOAD DATA INFILE...), it uses a tree-like cache to speed up the process. This is the maximum size in bytes of the cache for each thread. The default is 8MB, and setting it to 0 disables this feature. The cache is only used when adding to a table that is not empty.
myisam_recover_options
The MyISAM recovery mode, as set with the --myisam-recover option.
myisam_sort_buffer_size
The size in bytes of the buffer allocated when sorting or repairing an index.
myisam_max_extra_sort_file_size
When MySQL creates an index, it subtracts the key cache size from the size of the temporary table it would use with fast index creation. If the difference is larger than this amount (specified in megabytes), MySQL uses the key cache method.
myisam_max_sort_file_size
The maximum size (in megabytes) of the temporary file MySQL creates when creating or repairing indexes. If this size would be exceeded, MySQL uses the slower key cache method to create or repair the index.
net_buffer_length
The size in bytes that the communication buffer is reset to between queries. To save memory on systems with low memory, set this to the expected length of SQL statements sent by clients. It is automatically enlarged to the size of max_allowed_packet if a statement exceeds this length.
net_read_timeout
Time in seconds that MySQL waits for data from a connection before aborting the read. When writing, net_write_timeout is used instead, and slave_net_timeout is used for the master/slave connection.
net_retry_count
The number of times to retry an interrupted read on a communication port before giving up.
net_write_timeout
Time in seconds to wait for a block to be written to a connection before aborting the write.
open_files_limit
MySQL uses this value to reserve file descriptors. Increase this if you get the error Too many open files. Usually this is set to 0, in which case MySQL uses the larger of max_connections*5 or max_connections + table_cache*2.
pid_file
The location of the file in which the server writes its process ID.
port
The TCP/IP port on which the server listens.
protocol_version
The version of the client/server protocol the server uses.
record_buffer
The size in bytes of the buffer each thread allocates for sequential table scans.
record_rnd_buffer
The size in bytes of the buffer used when reading rows in sorted order after a sort, so that rows read through this buffer avoid disk seeks. If not set, it will be the same as the record_buffer.
query_buffer_size
The initial size in bytes allocated to the query buffer. It should be sufficient for most queries; otherwise it should be increased.
query_cache_limit
The limit in bytes for the query cache. Results larger than this will not be cached. The default is 1MB.
query_cache_size
The size in bytes allocated to the query cache (where results are stored from old queries). 0 indicates the cache is disabled.
query_cache_startup_type
Determines how the query cache operates: 0 is off, 1 is on, and 2 is on demand.
safe_show_database
Not used in any but the earliest versions of MySQL 4. If set, users who do not have any privileges having anything to do with a database will not see that database listed when they perform a SHOW DATABASES statement (the SHOW DATABASES privilege removes the need for this).
server_id
The ID of the server. Important for replication purposes to identify servers.
skip_external_locking
ON if external (system) locking is disabled.
skip_networking
ON if the server accepts only local (socket) connections and no TCP/IP connections.
skip_show_database
If set, the SHOW DATABASES statement does not return results unless the client has the PROCESS privilege. (The SHOW DATABASES privilege, introduced in early versions of MySQL 4, removes the need for this.)
slave_net_timeout
The number of seconds to wait for more data from the master before a read on the replication connection is aborted.
slow_launch_time
Threads that take longer than this many seconds to create increment the Slow_launch_threads status counter.
socket
The socket file used for local connections.
sort_buffer
The size in bytes allocated to the buffer used by sorts. See the discussion earlier in this chapter titled "Optimizing the sort_buffer variable."
table_cache
The number of open tables for all threads. See the discussion earlier in this chapter titled "Optimizing table_cache."
table_type
The default table type.
thread_cache_size
The number of threads kept in a cache for reuse. New threads are taken from the cache if any are available, and a client's threads are placed in the cache when it disconnects if space is available. If you have lots of new connections, you can increase this value to improve performance. Systems with good thread implementations normally don't benefit much from this. You can see its efficiency by comparing the Connections and Threads_created status variables.
thread_concurrency
A hint to the operating system about the desired number of threads to run at the same time (used on Solaris).
thread_stack
The size in bytes of the stack for each thread. The behavior of the crash-me benchmark depends on this value.
timezone
The time zone for the server.
The following status variables can be viewed with SHOW STATUS or mysqladmin extended-status:
Aborted_clients
Indicates the number of connections that were aborted because for some reason the client did not close the connection properly. This could happen if the client program died or did not disconnect cleanly.
Com_xxx
One of these variables exists for each kind of statement. The value indicates the number of times this statement has been executed.
Connections
The number of attempts to connect to the MySQL server.
Created_tmp_disk_tables
The number of implicit temporary tables on disk that were created while executing statements.
Created_tmp_tables
The number of implicit temporary tables in memory that were created while executing statements.
Created_tmp_files
The number of temporary files created by mysqld.
Delayed_insert_threads
The number of delayed insert handler threads currently in use.
Delayed_writes
The number of records written by an INSERT DELAYED statement.
Flush_commands
The number of FLUSH statements that have been executed.
Handler_commit
The number of internal COMMIT commands.
Handler_delete
The number of times rows have been deleted from tables.
Handler_read_first
The number of times the first entry of an index was read, usually indicating a full index scan (for example, a SELECT of an entire indexed column).
Handler_read_prev
The number of requests to read the previous row in index order. This would be used by a SELECT fieldlist ORDER BY fields DESC type of statement.
Handler_read_rnd_next
The number of requests to read the next row in the data file. You would usually not want this to be high, because it means that queries are not making use of indexes and have to read from the data file.
Key_read_requests
The number of requests causing a key block to be read from the key cache. The Key_reads:Key_read_requests ratio should be no higher than 1:100 (i.e., 1:10 is bad).
Key_reads
The number of physical reads causing a key block to be read from disk. The Key_reads:Key_read_requests ratio should be no higher than 1:100 (again, 1:10 is bad).
Max_used_connections
The maximum number of connections in use at any one time. See the connections discussion earlier in this chapter ("Dealing with Too Many Connections").
Not_flushed_key_blocks
The number of key blocks in the key cache that have changed but have not yet been flushed to disk.
Not_flushed_delayed_rows
The number of records currently in INSERT DELAYED queues waiting to be written.
Open_tables
The number of tables that are currently open. See the table cache discussion earlier in this chapter ("Optimizing table_cache").
Open_files
The number of files that are currently open.
Open_streams
The number of streams that are open. These are mostly used for logging.
Opened_tables
The number of tables that have been opened. See the table cache discussion earlier in this chapter ("Optimizing table_cache").
Qcache_not_cached
The number of queries that were not cached (due to being too large, or because of the QUERY_CACHE_TYPE).
Sort_merge_passes
The number of merge passes performed during a sort. If this becomes too large, you should increase the sort_buffer.
Sort_range
The number of sorts performed with ranges.
Sort_rows
The number of sorted rows.
You can use the SET statement in two ways. The default is for the change you make to affect the SESSION only, meaning that when you connect next time (and for all other connections) the variable will still be at the setting specified in the configuration file. If you specify the GLOBAL keyword, all new connections will use the new value. When the server restarts, however, it will always use the values set in the configuration file, so you always need to make the changes there as well. To set a variable with the GLOBAL option, you need to have the SUPER permission.
If, after experimenting with the new variable, you decide to return to the old value, there's no need to trust your memory or to look it up in the configuration file. You can use the DEFAULT keyword to restore a GLOBAL value to the value in the configuration file, or a SESSION value to the GLOBAL value. For example:
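A minimal sketch, using max_join_size purely as an illustration:
mysql> SET GLOBAL max_join_size = 1000000;
mysql> SET SESSION max_join_size = DEFAULT;
mysql> SET GLOBAL max_join_size = DEFAULT;
The second statement returns the session value to the current GLOBAL value, and the third returns the GLOBAL value to its default.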
When set (1), MySQL automatically COMMIT s statements unless you wrap
them inBEGIN and COMMIT statements. MySQL also automatically
COMMIT s all open transactions when you set AUTOCOMMIT .
When set (1), all temporary tables are stored on disk instead of in memory.
This makes temporary tables slower, but it prevents the problem of running
out of memory. The default is 0.
Sets the AUTO_INCREMENT value (so the nextINSERT statement that uses
anAUTO_INCREMENT field will use this value).
When set (1), all update statements ( INSERT , UPDATE ,DELETE ,LOCK
TABLE WRITE ) wait for there to be no pending reads (SELECT
,LOCKTABLE READ ) on the table they're accessing.
MAX_JOIN_SIZE = value | DEFAULT
By setting a maximum size in rows, you can prevent MySQL from running queries that may not make use of indexes properly or that may have the potential to slow the server down when run in bulk or at peak times. Setting this to anything but DEFAULT resets SQL_BIG_SELECTS. If SQL_BIG_SELECTS is set, then MAX_JOIN_SIZE is ignored. If the query is already cached, MySQL will ignore this limit and return the results.
SQL_BIG_SELECTS = 0 | 1
If set (1, the default), MySQL allows large queries. If not set (0), MySQL will not allow queries where it will have to examine more than max_join_size rows. This is useful to avoid running accidental or malicious queries that could bring the server down.
SQL_BUFFER_RESULT = 0 | 1
If set (1), MySQL places query results into a temporary table (in some cases speeding up performance by releasing table locks earlier).
SQL_LOG_OFF = 0 | 1
If set (1), MySQL will not log for the client (this is not the update log). The SUPER permission is required.
SQL_LOG_UPDATE = 0 | 1
If not set (0), MySQL will not use the update log for the client. Requires the SUPER permission.
SQL_QUOTE_SHOW_CREATE = 0 | 1
If set (1, the default), MySQL will quote table and column names.
SQL_SAFE_UPDATES = 0 | 1
If set (1), MySQL will not perform UPDATE or DELETE statements that don't use an index or a LIMIT clause, which helps prevent unpleasant accidents.
SQL_SELECT_LIMIT = value | DEFAULT
Sets the maximum number of records (default unlimited) that can be returned with a SELECT statement. LIMIT takes precedence over this.
TIMESTAMP = timestamp_value | DEFAULT
Sets the time for the client. You can use this to get the original timestamp when using the update log to restore rows. The timestamp_value is a Unix epoch timestamp.
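For instance, to protect an interactive session against accidental unindexed updates and runaway result sets, you could set something like the following (the limit value here is only illustrative):
mysql> SET SQL_SAFE_UPDATES=1;
mysql> SET SQL_SELECT_LIMIT=1000;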
Memory
Memory is the most important element because it allows you to increase the mysqld variables that control the caches. Large amounts of memory mean you can create large key and table caches, allowing MySQL to use fast memory rather than slow disk as much as possible. Extra memory on its own is of little use, though, unless you actively tweak the mysqld variables to make use of it, so you can't be too lazy and just stick in the memory and wait for fireworks.
Disks
Ultimately, MySQL has to fetch the data from disk, and this is where fast
disk access plays a role. The disk seek time is important because it
determines how fast the physical disk can move to get to the data it needs,
so you should choose the disk with the best disk seek time. Also, SCSI
(small computer system interface) disks are usually faster than IDE
(Intelligent [or Integrated] Drive Electronics) disks, so you'll probably want
these.
CPU
The faster the processor, the quicker any calculations can be done and the
quicker the results sent back to the client. Besides processor speed, the bus
speed and the size of the cache are important. An analysis of available
processors is beyond the scope of this book and will probably be outdated
before this book is even published, but be sure to investigate your processor
possibilities carefully to see how it performs in various benchmarks.
Using Benchmarks
MySQL distributions come with a benchmark suite called run-all-
tests . You can use it to test various DBMSs to see how well they
perform. To use it, you need to have Perl, the Perl DBI module, and the
DBD module for the DBMS you want to test. Table 13.5 explains the
options for
run-all-tests .
--comments='some comment'
Adds a comment to the benchmark output.
--cmp=server[,server...]
Runs the test with limits from the specified servers. By running all servers with the same --cmp, the test results will be comparable between the different SQL servers.
--create-options=#
Specifies extra options to pass to the CREATE TABLE statements (for example, a table type).
--database
Specifies the database in which the test tables are created. The default is the test database.
--debug
Displays debugging information. You normally only use this when debugging a test.
--dir
Indicates where the test results should be stored. The default is the output directory.
--fast
Allows the use of nonstandard ANSI SQL commands to make the test go faster.
--fast-insert
Uses fast inserts where possible, which include multiple value lists, such as INSERT INTO tablename VALUES (values),(values), or simply INSERT INTO tablename VALUES (values) rather than INSERT INTO tablename(fields) VALUES (values).
--field-count
Specifies how many fields there are to be in the test table. Usually only used when debugging a test.
--force
Continues the test even when encountering an error. Deletes tables before creating new ones. Usually only used when debugging a test.
--groups
Indicates how many different groups there are to be in the test. Usually only used when debugging a test.
--lock-tables
Allows the use of table locking to get more speed.
--log
Saves the results to the --dir directory.
--loop-count
Indicates how many times each test loop is to be executed. Usually only used when debugging a test.
--help
Displays a list of options.
--host='hostname'
Specifies the host where the database server is located. The default is localhost.
--silent
Does not output information about the server when the test starts.
--skip-delete
Specifies that the test tables created are not deleted. Usually only used when debugging a test.
--skip-test=test1[,test2,...]
Excludes the specified tests when running the benchmark.
--small-test
Speeds up the tests by using smaller limits.
--small-tables
Uses fewer rows to run the tests. This would be used if the database cannot handle large tables for some reason (they could have small partitions, for example).
--suffix
Adds a suffix to the database name in the benchmark output filename. Used when you want to run multiple tests without overwriting the results. When using the --fast option, the suffix is automatically _fast.
--random
Generates random initial values for the sequence of test executions, which could be used to imitate real conditions.
--threads=#
Specifies the number of threads to use for multiuser benchmarks. The default is 5.
--tcpip
Uses TCP/IP to connect to the server. This allows the test to do many new connections in a row, as the TCP/IP stack can be filled.
--time-limit
Specifies a time limit in seconds for a test loop before the test ends and the result is estimated. The default is 600 seconds.
--use-old-results
Uses the old results from the --dir directory instead of actually running the tests.
--user='user_name'
Specifies the user to connect as.
--verbose
Displays more info. Usually only used when debugging a test.
--optimization='some comments'
Adds comments about optimizations done before the test.
--hw='some comments'
Adds comments about hardware used for this test.
To runrun-all-tests , change to thesql-bench directory from the
base directory. The following is a sample output of the benchmark:
% cd sql-bench
% perl run-all-tests --small-test --password='g00r002b'
Benchmark DBD suite: 2.14
Date of test: 2002-07-21 21:35:42
Running tests on: Linux 2.2.5-15 i686
Arguments: --small-test
Comments:
Limits from:
Server version: MySQL 4.0.1 alpha max log
Optimization: None
Hardware:
It's also important to benchmark your own applications (under the highest
possible load) before you roll them out. An application that can help you
impose load on your server is super-smack, downloadable from the MySQL
site.
The crash-me utility, also found in the sql-bench directory, takes a number of options:
--batch-mode
Runs the test without asking for input and exits if it encounters errors.
--comment='some comment'
Adds the specified comment to the crash-me limit file.
--check-server
Does a new connection to the server every time it checks if the server is still running. This can be useful if a previous query causes wrong data to start being returned.
--database='database'
Specifies the database in which to run the tests.
--user='user_name'
Specifies the username to connect as.
--start-cmd='command to restart server'
Will use the specified command to restart the database server in the case of it dying. (The availability of this option says everything!)
--sleep='time in seconds'
Specifies the time in seconds to wait before restarting the server. The default is 10 seconds.
A sample display ofcrash-me follows:
% perl crash-me --password='g00r002b'
Running crash-me 1.57 on 'MySQL 4.0.1 alpha max
log'
The || symbol does not mean OR; instead, it performs string concatenation. This is the PIPES_AS_CONCAT mysqld sql-mode option.
Having spaces before function names no longer results in an error. This has the consequence that all function names become reserved words. This is the IGNORE_SPACE mysqld sql-mode option.
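As a rough sketch (combine whichever of the modes described above you need), the server could be started with something like:
% mysqld_safe --sql-mode=PIPES_AS_CONCAT,IGNORE_SPACE &
With PIPES_AS_CONCAT in effect, a query such as SELECT 'abc' || 'def' returns the string 'abcdef' rather than the result of a logical OR.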
language=french
You can also edit the error messages yourself (perhaps you want your database to have that personal touch) or contribute your own set in another language, and give something back to the MySQL community. To change the error messages, simply edit the errmsg.txt file in the appropriate language directory (usually share/language_name from the MySQL base directory), run the comp_err utility, and restart the server. For example:
% cp errmsg.txt errmsg.bak
% vi errmsg.txt
Here I edited the error message that read as follows:
"No Database Selected",
to read as follows instead:
"Haven't you forgotten something - No Database
Selected",
and then saved it:
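The edited text file then needs to be compiled into the errmsg.sys file that the server actually reads. With the comp_err utility, the step would look something like the following (the exact arguments may differ between versions, so check your distribution):
% comp_err errmsg.txt errmsg.sys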
Then restart the server, and the new error messages will take effect:
% mysqladmin shutdown
% /etc/rc.d/init.d/mysql start
% mysql -uroot -pg00r002b
mysql> SELECT * FROM a;
ERROR 1046: Haven't you forgotten something - No
Database Selected
You can see what character sets are available in your distribution by looking at the value of the character_sets variable.
When you change a character set, you'll need to rebuild your indexes to
ensure they sort according to the rules of the new character set.
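To look at the character_sets value, a statement along these lines works (the exact list returned will depend on your distribution):
mysql> SHOW VARIABLES LIKE 'character_sets';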
# sql/share/charsets/Index
#
# This file lists all of the available character
sets.
big5 1
czech 2
dec8 3
dos 4
german1 5
hp8 6
koi8_ru 7
latin1 8
latin2 9
swe7 10
usa7 11
ujis 12
sjis 13
cp1251 14
danish 15
hebrew 16
# The win1251 character set is deprecated. Please use cp1251 instead.
win1251 17
tis620 18
euc_kr 19
estonia 20
hungarian 21
koi8_ukr 22
win1251ukr 23
gb2312 24
greek 25
win1250 26
croat 27
gbk 28
cp1257 29
latin5 30
martian 31
In the .conf file, lines beginning with a # are comments, words are separated by any amount of whitespace, and every word must be in hexadecimal format. There are four arrays. In order, they are ctype (containing 257 elements), to_lower and to_upper (each containing 256 elements), and sort_order (also containing 256 elements). A sample .conf file to look at is the standard latin1.conf.
The ctype array contains bit values, with one element for one character. The to_lower and to_upper arrays simply hold the lowercase and uppercase characters that correspond to each member of the character set. For example, to_lower['A'] contains a, while to_upper['z'] contains Z.
The sort_order array indicates the order in which characters are to be sorted (it usually corresponds to to_upper, in which case the sorting will be case insensitive). All of the arrays are indexed by character value, except ctype, which is indexed by character value + 1 (an old legacy).
If you're brave enough to tackle adding a new complex character set, there are a few more steps to this process. See the MySQL documentation for what is required (as well as the documentation in the existing complex character sets: czech, gbk, sjis, and tis620).
Summary
To understand how to get the most out of your database server, it's
important to understand the number of options you have when fine-tuning
the server. To see how an existing server has been set up, use theSHOW
VARIABLES statement, as well asSHOW STATUS to see how it's been
handling. The output of these two statements can reveal many hidden
problems, including queries that are not optimized, poor use of available
memory, or simply that it's time for an upgrade.
MySQL supplies four sample configuration files that can help you get better performance from the server. Just choose the closest of my-huge.cnf, my-large.cnf, my-medium.cnf, or my-small.cnf for your server situation.
Two of the easiest and most important variables to tweak are table_cache (the number of tables MySQL can keep open) and key_buffer_size (how much of the indexes MySQL can keep in memory, minimizing disk access).
MySQL was developed in Scandinavia and has long had good support for languages besides English. It is easy to display error messages in other languages or to add a character set.
I use this throughout this chapter for convenience, to make the password
visible for purposes of the examples, but a security-conscious user should
not connect in this way for the following reasons:
Anyone looking over your shoulder can see the password in plain text.
Programs that view the system status (such as the Unixps ) could see the
password in plain text.
Instead, connect by entering the password when prompted for it:
% mysql -uroot -p
Enter password:
If you need to store the password in a file, make sure it is properly secured.
For example, if the password is stored in themy.cnf file in the user's home
directory on a server, this file should not be readable by anyone else. The
root user of the system can of course read this file. Be aware that the root
user of the system is not necessarily the same as the MySQL root user.
Similarly, applications often make use of a configuration file to store the
database password. Make sure this is secure too.
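For example, a password stored for the command-line client in a user's home directory (on most systems this is the ~/.my.cnf file) might look something like the following sketch, with the file permissions tightened so that only the owner can read it:
[client]
user=root
password=g00r002b
% chmod 600 ~/.my.cnf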
Warning: Never store a configuration file that contains a database password for a web application, or any password for that matter, in the web tree.
A booking system has ordinary users who can only insert records to a
particular table and an administrator who can update this table.
Table 14.1: The MySQL Tables
user: Lists users and the associated hosts and passwords that may access the server, as well as the global permissions they have. It's best to disallow any global permissions and instead specifically allow access in one of the other tables.
db: Lists databases that users may access. Permissions granted here apply to all tables in the database.
host: Together with the db table, allows a more controlled form of access based on the particular host.
tables_priv: Lists access to specific tables. Permissions granted here apply to all columns in the table.
columns_priv: Lists access to specific columns.
func: Not yet used.
mysql> SHOW COLUMNS FROM user;
(output abridged: the Host, User, and Password columns are followed by the permission columns, from Select_priv through to the max_connections column, with each of the *_priv columns an enum('N','Y') defaulting to N)
mysql> SHOW COLUMNS FROM tables_priv;
(output abridged: 8 columns, including Table_priv, a set('Select','Insert','Update','Delete','Create','Drop','Grant','References','Index','Alter'), and Column_priv, a set('Select','Insert','Update','References'))
8 rows in set (0.01 sec)
mysql> SHOW COLUMNS FROM columns_priv;
+-------------+----------------------------------------------+------+-----+---------+-------+
| Field       | Type                                         | Null | Key | Default | Extra |
+-------------+----------------------------------------------+------+-----+---------+-------+
| Host        | char(60) binary                              |      | PRI |         |       |
| Db          | char(64) binary                              |      | PRI |         |       |
| User        | char(16) binary                              |      | PRI |         |       |
| Table_name  | char(64) binary                              |      | PRI |         |       |
| Column_name | char(64) binary                              |      | PRI |         |       |
| Timestamp   | timestamp(14)                                | YES  |     | NULL    |       |
| Column_priv | set('Select','Insert','Update','References') |      |     |         |       |
+-------------+----------------------------------------------+------+-----+---------+-------+
7 rows in set (0.01 sec)
mysql> SHOW COLUMNS FROM func;
+-------+------------------------------+------+---
--+----------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------------------+------+---
--+----------+-------+
| name | char(64) binary | | PRI | |
| ret | tinyint(1) | | | 0 |
| dl | char(128) | | | |
| type | enum('function','aggregate') | | |
function |
+-------+------------------------------+------+---
--+----------+-------+
4 rows in set (0.01 sec)
Table 14.2 describes the various privilege columns.
Host: The host machine from which the user connects.
User: The username supplied for the connection (the -u option).
Password: The password the user connects as (the -p option).
Db: The database the permissions apply to.
Select_priv: Permission to read records from a table (a SELECT statement).
Insert_priv: Permission to add records to a table (an INSERT statement).
Update_priv: Permission to change records (an UPDATE statement).
Delete_priv: Permission to remove records (a DELETE statement).
Create_priv: Permission to create databases or tables.
File_priv: Permission to read and write files on the server (for LOAD DATA INFILE or SELECT INTO OUTFILE statements). Any files that the MySQL user can read are readable.
Create_tmp_table_priv: Permission to create temporary tables (CREATE TEMPORARY TABLE).
Lock_tables_priv: Permission to lock a table for which the user has SELECT permission.
Execute_priv: Permission to run stored procedures (scheduled for MySQL 5).
Repl_client_priv: Permission to ask about replication slaves and masters.
Repl_slave_priv: Permission to replicate (see Chapter 12, "Database Replication").
ssl_type: Permission to connect is only granted if Secure Sockets Layer (SSL) is used.
ssl_cipher: Permission to connect is only granted if a specific cipher is present.
x509_issuer: Permission to connect is only granted if the certificate is issued by a specific issuer.
x509_subject: Permission to connect is only granted if the certificate contains a specific subject.
max_questions: Maximum number of queries the user can perform per hour.
max_updates: Maximum number of updates the user can perform per hour.
max_connections: Maximum number of times the user can connect per hour.
The db table is examined next. MySQL looks for the database on which the
user is performing the operation. If this does not exist, permission is denied,
and the operation fails. If the database does exist, and the host and user
match, the field relating to the operation is examined. If permission is
granted for the required operation, the operation succeeds. If permission is
not granted, MySQL proceeds to the next step. If the database and user
combination does exist, and the host field is blank, MySQL examines the
host table to see whether the host can perform the required operation. If the
host and database are found in the host table, the related field on both the
host and db tables determines whether the operation succeeds. If permission
is granted on both tables, the operation succeeds. If not, MySQL proceeds
to the next step.
MySQL examines the tables_priv table taking into account the table(s) on
which the operation is being performed. If the host, user, db, and table
combination do not exist, the operation fails. If they do exist, the related
field is examined. If permission is not granted, MySQL proceeds to the next
step. If permission is granted, the operation succeeds.
Finally, MySQL examines the columns_priv tables, taking into account the
table columns being used in the operation. If permission related to the
required operation is granted here, the operation succeeds. If not, it fails.
The order of precedence for the MySQL permission tables is shown in
Figure 14.1 .
Figure 14.1: Precedence for MySQL permission
tables
mysql> SELECT * FROM user;
(output abridged: the user table contains four rows. The root user may connect from localhost and from test.testhost.co.za with no password and with every permission column set to Y; an anonymous user, with a blank username and no password, may connect from the same two hosts with every permission column set to N.)
4 rows in set (0.05 sec)
mysql> SELECT * FROM db;
(output abridged: the db table contains two rows, one for the test database and one for any database matching test\_%. Both give any host (%) and a blank username every permission except Grant_priv, which is set to N.)
2 rows in set (0.01 sec)
mysql> SELECT * FROM host;
Empty set (0.00 sec)
mysql> SELECT * FROM tables_priv;
Empty set (0.00 sec)
mysql> SELECT * FROM columns_priv;
Empty set (0.00 sec)
Notice that the default settings are not secure. Anyone can connect from the
local host as the root user and have total authority. An anonymous user
(where no username is supplied) can connect from the local host to the
default test database and to any database where the name begins withtest .
Notice the use of the PASSWORD() function. You must use this function
when updating the tables directly. It encrypts the password so that it cannot
be read simply by viewing the contents of the tables. For example:
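The statement would look something like the following (the password value here is only a placeholder; use your own):
mysql> INSERT INTO user (Host, User, Password)
VALUES ('localhost', 'administrator', PASSWORD('some_password'));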
Before the permissions are flushed, this data does not take effect. You can
connect as the administrator user without any password:
% mysql -uadministrator;
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 6 to server
version: 4.0.1-alpha-max-log
Type 'help;' or '\h' for help. Type '\c' to clear
the buffer.
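Once the privilege tables have been flushed, for example with the following statement (mysqladmin reload has the same effect), the new entry takes effect and the same connection attempt fails without the password:
mysql> FLUSH PRIVILEGES;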
% mysql -uadministrator;
ERROR 1045: Access denied for user:
'administrator@localhost' (Using password: NO)
It's never good to use the root user for anything but administration. Day-to-
day connections should be through users with permissions developed
especially for the tasks that user performs. For this sales system, you're
going to add two users—an administrator and a regular user. The
administrator will have full permissions to do anything, and the regular user
will have certain limitations. To add the administrator, you could simply
add a record to the user table, giving a full set of permissions to the
administrator. But that would mean the administrator of the sales rep system
would have full access to any other database that gets developed on the
system. It's almost always better to limit permissions at a user level and
then activate permissions on a lower level. You're going to add a record to
the user table and to the db table to do this. I use an INSERT statement
without specifying fields (for ease of typing) with the db table example, in
case you're following these examples. Be sure that the fields match the
fields in the tables from your distribution, in case they have changed:
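For the db table, the statement might look something like the following sketch (the values mirror the row shown in the output that follows; remember to flush the privileges afterward):
mysql> INSERT INTO db VALUES ('localhost', 'firstdb', 'administrator',
'Y','Y','Y','Y','N','N','N','N','N','N');
mysql> FLUSH PRIVILEGES;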
mysql> SELECT * FROM user;
(output abridged: the user table now contains five rows; the new administrator entry for localhost has an encrypted password, 26981a09472b4835, and every permission column set to N)
5 rows in set (0.05 sec)
mysql> SELECT * FROM db;
(output abridged: the db table now also contains a row giving administrator, connecting from localhost, the Select, Insert, Update, and Delete permissions on the firstdb database, with the remaining permission columns set to N)
If you were not logged in as root, you'd get an error indicating that the
anonymous user does not have permission:
% mysql mysql;
ERROR 1044: Access denied for user: '@localhost'
to database 'mysql'
mysql> SELECT * FROM user;
(output abridged: the user table now contains six rows; the new regular_user entry has an encrypted password, 1bfcf83b2eb5e59, and every permission column set to N)
6 rows in set (0.05 sec)
mysql> SELECT * FROM db;
(output abridged: the db table now contains four rows; the new row grants regular_user, connecting from localhost, only the Select permission on the sales database, with all other permission columns set to N)
4 rows in set (0.01 sec)
You can also revoke permissions in the same way as you grant them:
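For example, to take back the permission granted above, something like the following would do it:
mysql> REVOKE SELECT ON sales.* FROM regular_user@localhost;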
mysql> SELECT * FROM user;
(output abridged: the administrator and regular_user entries remain in the user table, each with every permission column set to N)
mysql> SELECT * FROM db;
(output abridged: the db table is back to three rows; the regular_user row for the sales database has been removed)
Notice that all trace of the user has been removed from the db table but that
the user still exists in the user table. There is no way to remove this from
the table without directly deleting it. A user with no permissions
(calledUSAGE permission) can still connect to the server and access some
information, such as, for early versions of MySQL 4, viewing the existing
databases! For example:
mysql> exit
Bye
% mysql -uregular_user -pl3tm37n_2
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 21 to server
version: 4.0.1-alpha
-max-log
To remove all traces of the user, delete them from the user table directly
(while connected as root):
mysql> exit
Bye
% mysql mysql -uroot -pg00r002b
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 22 to server
version: 4.0.1-alpha
-max-log
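The delete itself would look something like the following (connected to the mysql database, as above):
mysql> DELETE FROM user WHERE User='regular_user' AND Host='localhost';
mysql> FLUSH PRIVILEGES;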
The privileges that can be granted are as follows:
ALL PRIVILEGES
Same as ALL.
ALTER
Permission to change the structure of a table (an ALTER statement), excluding indexes.
CREATE
Permission to create databases or tables, excluding indexes.
CREATE TEMPORARY TABLES
Permission to create a temporary table (CREATE TEMPORARY TABLE statement).
DELETE
Permission to remove records from a table (a DELETE statement).
DROP
Permission to drop databases or tables, excluding indexes.
EXECUTE
Permission to run stored procedures (scheduled for MySQL 5).
FILE
Permission to read and write files on the server (for LOAD DATA INFILE or SELECT INTO OUTFILE statements). Any files that the MySQL user can read are readable.
GRANT
Permission to pass the user's own permissions on to other users (the WITH GRANT OPTION clause).
INDEX
Permission to create or drop indexes.
INSERT
Permission to add records to a table (an INSERT statement).
LOCK TABLES
Permission to lock tables for which the user has SELECT permission.
PROCESS
Permission to view the running server threads (SHOW PROCESSLIST).
REFERENCES
Not currently used.
RELOAD
Permission to flush the tables, logs, and privileges (FLUSH statements).
REPLICATION CLIENT
Permission to ask about replication slaves and masters.
REPLICATION SLAVE
Permission to replicate (see Chapter 12, "Database Replication").
SHOW DATABASES
Permission to see all databases (a SHOW DATABASES statement).
The earlier example granted permissions for all tables in the sales database.
You can easily manipulate this by changing the database and table names
you grant on (see Table 14.4 ).
For example:
mysql> GRANT SELECT ON *.* TO
regular_user@localhost IDENTIFIED BY 'l3tm37n_2';
Query OK, 0 rows affected (0.00 sec)
Because permission is granted on all databases, there is no need for an entry
in the database table; just the user table, with the field select_priv set to Y:
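A query along the following lines will confirm it:
mysql> SELECT Host, User, Select_priv FROM user WHERE User='regular_user';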
You can set a password for the user you're connected as, as follows:
mysql> SET PASSWORD=PASSWORD('g00r002b2');
Query OK, 0 rows affected (0.00 sec)
A user with access to the user table in themysql database can set
passwords for other users too, by specifying the user:
mysql> SET PASSWORD FOR root=PASSWORD('g00r002b');
Query OK, 0 rows affected (0.00 sec)
Wildcard Permissions
There's no need to enter 1,001 hosts if that's how many hosts to which you
need to grant access. MySQL accepts wildcards in the host table. For
example, the following allows a user to connect from a host ending
withmarsorbust.co.za :
mysql> GRANT USAGE ON *.* TO regular_user@'%.marsorbust.co.za'
IDENTIFIED BY 'l3tm37n';
Query OK, 0 rows affected (0.03 sec)
mysql> SELECT * FROM user WHERE host LIKE
'%mars%';
(output abridged: one row, for host %.marsorbust.co.za and user regular_user, with an encrypted password and every permission column set to N)
1 row in set (0.05 sec)
First, stop MySQL completely. As the root user on Unix, if you run MySQL
out of/init.d , you may be able to run the following:
% /etc/rc.d/init.d/mysql stop
Killing mysqld with pid 5091
Wait for mysqld to exit\c
.\c
.\c
.\c
.\c
020612 01:14:41 mysqld ended
done
If not, still as root, you'll need to kill the specific MySQL-related processes:
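That means finding the process IDs and killing them, something like the following (the pid shown is taken from the earlier output; yours will differ):
% ps ax | grep mysqld
% kill 5091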
In the eventuality that this doesn't work, you may have to usekill -9
(followed by the process ID) to really kill the process.
On Windows, you can simply use the task manager to close MySQL.
Then, restart MySQL without the grant tables (this ignores any permission
restrictions): % mysqld_safe --skip-grant-tables
And now you should be able to add a root password, either by manipulating the tables directly, with GRANT, or with mysqladmin:
% mysqladmin -u root password 'g00r002b'
Don't forget to stop the server, and restart it without--skip-grant-
tables , to activate your root password.
% mv user.MYD user_bak.olddata
% mysql -uroot -pg00r002b;
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 6 to server
version: 4.0.1-alphamax-log
If this is the case, you should still start MySQL without the grant tables and
then try dropping the table:
mysql> DROP TABLE user;
Query OK, 0 rows affected (0.01 sec)
You'll need to reload (or flush the privilege tables) in order to activate the
permissions, and then once again you can start issuing commands:
mysql> exit
Bye
[root@test mysql]# mysqladmin reload
[root@test mysql]# mysql
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 7 to server
version: 4.0.1-alpha-max-log
The following demonstrates this in action, using two databases: sales and customer. The administrator creates regular_user2, with permission to perform SELECT queries on the sales database, and then grants the GRANT option to the first regular_user, who has permission to SELECT on the customer database and who will then in turn grant the same rights to regular_user2:
mysql> exit
Bye
[root@test /root]# mysql -u regular_user -pl3tm37n
Welcome to the MySQL monitor. Commands end with ;
or \g. Your MySQL connection id is 4 to server
version: 4.0.1-alpha-max-log
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> GRANT SELECT ON customer.* TO regular_user@localhost
IDENTIFIED BY 'l3tm37n' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
MAX_QUERIES_PER_HOUR n
MAX_UPDATES_PER_HOUR n
MAX_CONNECTIONS_PER_HOUR n
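These clauses limit the resources a user may consume. As a sketch (the numbers are arbitrary), a grant using them could look like this:
mysql> GRANT SELECT ON sales.* TO regular_user@localhost
WITH MAX_QUERIES_PER_HOUR 100 MAX_UPDATES_PER_HOUR 20
MAX_CONNECTIONS_PER_HOUR 10;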
MySQL users are mostly of three kinds: There are individual people (for
example, Anique or Channette), applications (for example, a salary system
or the news website), and roles (for example, updating the news or updating
the salaries). These may overlap to various degrees—for example, Anique
may update both salaries and news, using both applications and performing
both roles. The DBA needs to decide whether to issue Anique and
Channette their own passwords, issue passwords to the news and salary
systems, or create a user based upon whether the news or salaries are being
updated.
If you opt for users as individuals and issue Anique her own password, she
only has to remember one login to the database. But then she needs to be
given access to update both the salary and news databases. If she, or the
application she is using, makes a mistake, there is the possibility of her
damaging data on which she should not even be working. For example, if
the salary and news databases both have a days_data table—with the news
database version growing continually until it is archived and the salary data
being manually removed after it has been processed—there is the
possibility of her removing the news table when she meant to remove the
salary table.
If you opt for users as applications, you solve some of these problems.
However, it seems a user now has to remember two passwords. Also, you
cannot track which user has made which changes to the database. You have
a solution, however, because where security is necessary, it's likely that the
individual will have to log into the application (potentially allowing you to
track the changes an individual makes to the database), and the application
then logs into the database. The user could have the same username and
password to both applications, but they could never destroy news data when
connected as the salary application (as you'd not have given the salary user
permission to update the news database).
Applications, though, often have many roles, with many levels of user.
Perhaps anyone may view their own salary details, but only an
administrator can update them. Giving the application permission to update
data potentially allows an ordinary user to update the data. Consider also
the development process: A trusted senior developer builds the salary
administration component of the application, and a team of junior
developers builds the salary-viewing component. Issuing the same
password to the application allows the junior developers to update the data
when they don't need to and probably shouldn't be allowed to update it. In
this case, you could issue usernames based upon a combination of role and
application (salary administration, salary viewing, news administrator, news
viewing).
Always issue the minimum permissions you can. (But be reasonable! You'll
always get some sadists who take great glee in granting permission on a
query-by-query basis. For example, allowing the user to read the surname
column, then forcing them to come begging for more permission when they
need to read the first name shortly afterward.) The global permissions
assigned in the user table should always be N, though, and then access to
specific databases granted in the db table.
Although you should always issue the minimum privileges required, there
are some privileges that are particularly dangerous, where the security risk
may outweigh the convenience factor. Remember that you should never
grant access on a global level.
Any privileges on the mysql database: A malicious user who can view even the encrypted passwords may still be able to gain access.
ALTER: A malicious user could make changes to the privilege tables, such as renaming them, which renders them ineffective.
DROP: If a user can DROP the mysql database, the permission limitations will no longer be in place.
FILE: Users with the FILE privilege can potentially access any file that is readable by all. They can also create files with the privileges of the MySQL user.
GRANT: This allows users to give their privileges to others, who may not be as trustworthy as the original user.
PROCESS: Queries that are running can be viewed in plain text, including any that change or set passwords.
SHUTDOWN: It's unlikely a DBA will be fooled into granting this privilege easily, and it should go without saying that users with the SHUTDOWN privilege can shut down the server and deny access to everyone.
SSL Connections
The connection between the client and the server is by default not
encrypted. In most network architectures, this would not be a risk because
the connection between the database client and server is not public. But
there are instances where data needs to be moved over public lines, and an
unencrypted connection potentially allows someone to view the data as it is
moved.
MySQL can be configured to support SSL connections, although this does
impact on performance. To do this, perform the following steps:
1. Install the openssl library, which can be found atwww.openssl.org/ .
2. Configure MySQL with the--with-vio --with-openssl option.
If you need to check whether an existing installation of MySQL supports SSL (or whether your installation has worked), check to see whether the variable have_openssl is YES.
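One way to check it:
mysql> SHOW VARIABLES LIKE 'have_openssl';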
Once SSL is supported, you can make use of it with various grant options (see Table 14.5).
REQUIRE SSL: The client must connect with SSL encryption.
REQUIRE X509: The client has to have a valid certificate to connect.
REQUIRE ISSUER cert_issuer: The client's certificate must come from the specified issuer.
REQUIRE SUBJECT cert_subject: The client's certificate must contain the specified subject.
REQUIRE CIPHER cipher: The connection must use the specified cipher.
REQUIRE SSL is the least restrictive of the SSL options. SSL encryption of any kind is acceptable. This would be useful where you don't want to send plain text, but simple encryption of the connection is sufficient. REQUIRE ISSUER and REQUIRE SUBJECT are more secure because the certificate has to come from a specific issuer or contain a specific subject. You can specify any or all of the previous options at the same time (the AND is optional).
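The grants could look something like the following sketches (the issuer, subject, and cipher strings are only placeholders):
mysql> GRANT SELECT ON sales.* TO regular_user@localhost
IDENTIFIED BY 'l3tm37n' REQUIRE SSL;
mysql> GRANT SELECT ON sales.* TO regular_user@localhost
IDENTIFIED BY 'l3tm37n'
REQUIRE ISSUER 'issuer_name' AND SUBJECT 'subject_name'
AND CIPHER 'EDH-RSA-DES-CBC3-SHA';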
Application Security
Most security holes are caused by poor applications. There are a number of
common pitfalls to avoid:
Never trust user data. Always verify any data entered by a user.
System Security
By default, MySQL runs as its own user on Unix. The user and group are
mysql. Never be tempted to allow anyone access to the system as the mysql
user—it should only be for the database itself. MySQL also creates a
separate directory where it places data files. This directory is accessible
only to the mysql user. The default settings have been chosen for a reason;
keep to these principles:
Summary
The MySQL user management mechanism is powerful, flexible, and often
misused. The mysql database contains the various permission tables that
allow access to be controlled based on user, host, and the action being
performed. You can update the tables directly with SQL statements (in
which case flushing the tables activates the changes) or through the more
convenientGRANT andREVOKE statements.
The first task in a new installation should be to issue a root password. Until
then, anyone can connect as root and have full access to everything. You
can use mysqladmin, theSET statement, orGRANT to do this.
MySQL allows SSL connections for added security. This is not installed by
default because it has performance implications.
You also learned some general principles for securing your data:
Never issue a user the root password. They should always be connecting
with another username.
Never give anyone access to the user table, even for reading. Just viewing
the encrypted password is enough to potentially allow a user full access.
Always issue the minimum permissions you can. Issuing minimum
permissions means that the user table contains N for all columns.
Ensure you cannot connect as the root user without a password from any
server. Passwords should never be stored in plain text and should not be
dictionary words.
Check the user privileges every now and again and make sure that no one
has granted anyone else inappropriate privileges.
Chapter 15: Installing MySQL
Overview
Perhaps you're a beginner and need to install your own copy of MySQL to
have something on which to practice. Or perhaps you're lucky enough to
have had systems administrators performing this task for you until now, and
you're just curious. Whatever the reason, if you use MySQL extensively,
you're likely to need to install MySQL yourself at some stage. And if you're
approaching the task with trepidation, this chapter aims to ease you through
the process and make you wonder afterward what all the fuss was about.
Installing from binary, which means you use a distribution that has already
been compiled by the MySQL developers (or another party)
Installing from source, which means you compile the MySQL source code
yourself and install it
Installing a binary is usually the easiest and quickest way to install MySQL,
but the choice depends on a number of factors, as well as how comfortable
you are with compiling software. Windows users rarely need to do this, but
FreeBSD users, for example, may find themselves doing this quite often.
There are a number of reasons you may want to install from source:
The system you are installing on does not have a binary distribution. Binary
distributions were available for Linux, FreeBSD, Windows, Solaris, MacOS
X, HPUX, AIX, SCO, SGI Irix, Dec OSF, and BSDi at the time of writing,
although not all of these had distributions for the latest version of MySQL.
You think you can better optimize MySQL by using a different compiler or
different compilation options.
You want something that is not available in a binary distribution, such as
additional character sets, a bug fix, or a different configuration.
Tables 15.1 and 15.2 provide overviews of the directories in a default binary and source installation, respectively.
Table 15.1: Directories of a Binary Installation
bin: This is where the binary executables are found, including the all-important mysqld, as well as all the utilities such as mysqladmin, mysqlcheck, and mysqldump.
data: The actual databases, as well as log files.
include: The include (header) files.
lib: The client libraries.
Table 15.2: Directories of a Source Installation
bin: This is where the binary executables for the client programs and utilities are found, such as mysqladmin, mysqlcheck, and mysqldump.
include: The include (header) files.
info: Documentation in info format.
lib: The client libraries.
libexec: The mysqld server goes here, not in the bin directory, in a default source installation.
share/mysql: A directory for each language containing files with the error messages for that specific language.
sql-bench: Benchmark results and utilities.
var: The actual databases, as well as log files.
basedir=D:/installation-path/
datadir=D:/data-path/
3. MySQL comes with a number of executable files. Choose the one you want depending on your needs (see Table 15.3).
Table 15.3: Executable Files
mysqld: Compiled with full debugging and automatic memory allocation checking; the slowest of the binaries.
mysqld-opt: An optimized binary (without InnoDB support).
mysqld-nt: An optimized binary that supports named pipes (for use with NT/2000/XP). It can run on 95/98/Me, but no named pipes will be created, as these operating systems do not support them.
mysqld-max: An optimized binary with support for symbolic links and InnoDB and BDB tables.
mysqld-max-nt: Like mysqld-max, but with support for named pipes.
% cd /usr/local
4. Now extract the file:
% gunzip -c /home/mysql-max-4.x.x-platform-os-extra.tar.gz | tar -xvf -
The filename you'll see will depend on the distribution you're using. I'm seeing mysql-max-4.0.2-alpha-pc-linux-gnu-i686 at present. Make sure you have the right version for your system.
Warning: The Sun version of tar has been known to give problems, so use the GNU version of tar instead.
Once this is complete, a new directory will have been created, based on the
name of the distribution you're installing, as follows:
% ls -l my*
total 1
drwxr-xr-x 13 mysql users 1024
That name is a bit clunky for everyday use, and you'll most probably want
to create a symlinkmysql pointing to the new directory so
that/usr/local/mysql/ can be the path to MySQL:
% ln -s mysql-max-4.x.x-platform-os-extra mysql
% ls -l my*
lrwxrwxrwx 1 root root 40
% cd mysql
% ls -l
(output abridged: the listing includes the COPYING, COPYING.LIB, ChangeLog, INSTALL-BINARY, and README files, along with the distribution directories such as bin, all owned by the mysql user)
% scripts/mysql_install_db
Preparing db table
Preparing host table
Preparing user table
Preparing func table
Preparing tables_priv table
Preparing columns_priv table
Installing all prepared tables
020701 23:19:07 ./bin/mysqld: Shutdown Complete
Now, change the ownership to ensure that MySQL and the data directory
are controlled by the newly created MySQL user, if they're not already:
See Table 15.1 earlier in this chapter for an overview of the contents of the
newly created directories.
MySQL-4.x.x-platform-os-extra.rpm: The MySQL server software, needed unless you're merely connecting to an existing server.
MySQL-client-4.x.x-platform-os-extra.rpm: The MySQL client software, needed to connect to a MySQL server.
MySQL-bench-4.x.x-platform-os-extra.rpm: Various MySQL tests and benchmarks. The Perl and msql-mysql rpm files are required.
MySQL-devel-4.x.x-platform-os-extra.rpm: Various libraries and include files required if you want to compile other MySQL clients.
MySQL-shared-4.x.x-platform-os-extra.rpm: The MySQL client shared libraries.
MySQL-embedded-4.x.x-platform-os-extra.rpm: The MySQL embedded server.
MySQL-4.x.x-platform-os-extra.src.rpm: The source code for the previous rpm files. Not required if you're doing a binary installation.
MySQL-Max-4.x.x-platform-os-extra.rpm: MySQL Max rpm (with support for InnoDB tables, etc.).
To install the rpm files, simply run the rpm utility with each rpm you want
to install listed. The client and server are typically the minimum you'll want
to install:
% rpm -i MySQL-4.x.x-platform-os-extra.rpm MySQL-
client-4.x.x-platform-os-extra.rpm
Installing via rpm results in a slightly different structure than that resulting
from an ordinary binary installation. The data is placed in
the/var/lib/mysql directory, and, by placing entries in
the/etc/rc.d/ directory—as well as creating
a/etc/rc.d/init.d/mysql script—MySQL is set up to begin
automatically upon booting up. Be careful about installing over a previous
installation in this way because you'll have to redo any changes you'd made.
% su
Password:
% groupadd mysql
% useradd -g mysql mysql
2. Move to the directory where you want to place the files (for example,
/usr/local/src or wherever your convention dictates).
3. Unpack the files:
% gunzip -c /tmp/mysql-4.x.x-platform-os-extra.tar.gz | tar -xvf -
4. Once complete, a new directory will have been created. Move into the
new directory, which is where you're going to configure and build MySQL
from: % cd mysql-4.x.x-extra
5. Run the configure script, which is a useful little script supplied with the distribution that allows you to set various options for your installation.
There are a large number of options available. Some of the more useful
ones are described next; others are mentioned in the "Compiling MySQL
Optimally" sidebar later.
% ./configure --prefix=/usr/local \
--localstatedir=/usr/local/mysql/data
You can do one of the following instead if you prefer:
If you don't want to compile the server but just the client programs for
connecting to an existing server, use the--without-server option: %
./configure --without-server
To uselibmysqld.a , the embedded MySQL library, you'll need to
employ the--with-embedded-server option:
% ./configure --with-embedded-server
To change the location of the socket file from the default (usually/tmp ),
useconfigure as follows (the path name must be absolute): %
./configure --with-unix-socket-
path=/usr/local/sockets/mysql.sock
For a full set of available options, run the following:
% ./configure --help
7. Once this is complete (which may take a little while depending on your
setup), you'll need to build the binaries, with themake command:
% make
8. Next, you'll need to install the binaries:
% make install
9. Now, continue as you would have if you'd installed a binary, creating the
permission tables and changing ownership of the files. These examples
assume you decided on/usr/ local/mysql as prefix:
% cd /usr/local/mysql
% scripts/mysql_install_db
Preparing db table
Preparing host table
Preparing user table
Preparing func table
Preparing tables_priv table
Preparing columns_priv table
Installing all prepared tables
010726 19:40:05 ./bin/mysqld: Shutdown Complete
% chown -R root /usr/local/mysql
% chgrp -R mysql /usr/local/mysql
% chown -R mysql /usr/local/mysql/data
The standard MySQL distributions are fairly close to optimal compiles, but
if you're out to get every last drop of performance, you can make some
improvements. It's also quite easy to do the reverse and slow things down,
so be careful when using the following tips:
You can define flags or the compiler name used by the compiler—for
example:
% CFLAGS=-O3
% CXX=gcc
% CXXFLAGS=-O3
% CC=gcc
% export CC CFLAGS CXX CXXFLAGS
Link statically, not dynamically (in other words, use the --static
option). This uses more disk space but runs faster (13 percent on Linux
according to MySQL measurements).
MySQL binaries are mostly compiled with gcc because pgcc (Pentium gcc)
has been known to cause problems on non-Intel processors. Compiling with
pgcc if your processor is from the Intel Pentium family may result in some
gains (1 percent according to MySQL tests, up to a reported 10-percent
improvement). Similarly, on a Sun server, the SunPro C++ compiler has
been about 5 percent faster than gcc in the past.
If you plan to run multiple versions of MySQL on the same machine, you'll
need to ensure that they do not attempt to use the same socket file or listen
on the same TCP/IP port. They will also have their ownpid file. The
defaults are port 3306 and/tmp/mysql.sock on most systems. A
convenient way of managing this is with the mysqld_multi utility, discussed
later in this chapter. You can change the default port and TCP/IP settings in
the configuration file, assuming it's a different configuration file than the
other server. For example:
socket=/tmp/mysql2.sock
port=3307
Clients can connect to servers running on a different socket by using the --socket option:
% mysql --socket=/tmp/mysql2.sock -uroot -pg00r002b
You can also specify the server to connect to by specifying the
configuration file to use for the client. For example:
% mysql --defaults-
file=/usr/local/mysql2/etc/my.cnf -uroot -
pg00r002b
If you're compiling MySQL, configure the second server with a new port
number, socket path, and installation directory. For example:
% ./configure --with-tcp-port=3307 \
--with-unix-socket-path=/tmp/mysql2.sock \
--prefix=/usr/local/mysql2
Warning: Never have more than one server controlling the same data, which is a foolproof recipe for corruption! They should not need to write to the same log files either.
[mysqld_multi]
mysqld = /usr/local/bin/mysqld_safe
mysqladmin = /usr/local/bin/mysqladmin
user = root
password = g00r002b

[mysqld1]
socket = /tmp/mysql.sock
port = 3306
pid-file = /usr/local/mysql/var/hostname.pid
datadir = /usr/local/mysql/var
language = /usr/local/share/mysql/english
user = hartmann

[mysqld2]
socket = /tmp/mysql.sock2
port = 3307
pid-file = /usr/local/mysql/var2/hostname.pid
datadir = /usr/local/mysql/var2
language = /usr/local/share/mysql/french
user = yves

[mysqld3]
socket = /tmp/mysql.sock3
port = 3308
pid-file = /usr/local/mysql/var3/hostname.pid
datadir = /usr/local/mysql/var3
language = /usr/local/share/mysql/german
user = cleo

[mysqld4]
socket = /tmp/mysql.sock4
port = 3309
pid-file = /usr/local/mysql/var4/hostname.pid
datadir = /usr/local/mysql/var4
language = /usr/local/share/mysql/english
user = caledon
--config-file=...
Sets an alternative configuration file for the groups. (It will not affect the [mysqld_multi] group.)
--log=...
Specifies the log file, taking the full path and name of the file. If this file already exists, logs will be appended to the end of the file.
--mysqladmin=...
The full path and name of the mysqladmin binary, used to shut down the server.
--mysqld=...
The full path and name of the mysqld binary to be used, or more often the mysqld_safe binary. Options are passed to mysqld. You'll need to make changes to mysqld_safe or ensure that it's in your PATH environment variable.
--user=...
The user for mysqladmin. Make sure this user has the correct privileges to do what is needed (shutdown_priv).
--version
Displays the version number and exits.
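With a configuration file like the one above, starting and checking some of the servers could look something like this (the numbers refer to the [mysqldN] groups):
% mysqld_multi start 1,2
% mysqld_multi report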
You may have a problem with your configuration file (my.cnf or my.ini).
Check your syntax carefully, or use the standard configuration file that
came with your distribution to see if you can start MySQL.
Another common error is the following:
Can't start server: Bind on unix socket....
or the following:
Can't start server: Bind on TCP/IP port: Address
already in use
This error occurs when trying to install a second copy of MySQL onto the
same port or socket as an existing installation. Make sure you specify a
different port or socket before starting.
Permission problems are common, too. Make sure you've followed the steps
listed in the installation section so that at least the permissions on the
MySQL directories are correct. If you're using sockets, you also need
permission to write the socket file (usually to /tmp).
Another common problem is with libraries. For example, if you run Linux
and have installed shared libraries, make sure the location of these shared
libraries is listed in your /etc/ld.so.conf file. For example, if you
have:
/usr/local/lib/mysql/libmysqlclient.so
make sure /etc/ld.so.conf contains:
/usr/local/lib/mysql
Then run ldconfig.
When starting MySQL where there are existing BDB tables, you may also run
into problems; check the error log for the specific message.
Compile Problems
If you run into problems compiling and need to do it a second time, you'll
need to make sure configure is run from a clean slate; otherwise, it uses
information from its previous incarnation, stored in the file
config.cache. You'll need to remove this file each time you
configure. Also, old object files may still be in existence, and to ensure a
clean recompile, you should remove these. Run the following:
% rm config.cache
% make clean
You can also run make distclean, if it is available.
Your compiler may be out of date. Currently MySQL suggests using gcc
2.95.2 or egcs 1.0.3a, but this is likely to have changed, so check
the latest documentation.
Other problems could result from an incompatible version of make.
Currently, MySQL recommends GNU make, version 3.75 or higher.
If you get an error when compiling sql_yacc.cc, you may have run out
of disk space. In some situations, compilation of this file uses up too many
resources (even when there seems to be plenty available), and the compile
aborts with an error.
Windows Problems
If you double-click setup.exe and the process begins but never
completes, you may have something interfering with MySQL. Try one of
these procedures:
Close all Windows applications, including services and ones from the
system tray.
Alternatively, try the install in Safe mode (by pressing F8 when booting,
then choosing the option from the menu).
At worst, you may have to reinstall Windows and install MySQL first,
before anything else gets in the way. The problem is less likely to occur on
production machines dedicated as MySQL database servers; it's more likely
on multipurpose computers running all kinds of other applications.
There are a large number of new privileges in the user table (in the mysql
database). MySQL supplies a script to add these new permissions while
maintaining the existing permissions.
Called mysql_fix_privilege_tables, it derives the REPLICATION
SLAVE and REPLICATION CLIENT privileges from the old FILE
privilege, and the SUPER and EXECUTE privileges from the old PROCESS
privilege. Without running this script, all users will have the SHOW
DATABASES, CREATE TEMPORARY TABLES, and LOCK TABLES
privileges.
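The script is typically run once after upgrading, supplying the MySQL root
password. The exact path and invocation depend on your installation layout
(the path below is an assumption for a source installation):
% /usr/local/mysql/scripts/mysql_fix_privilege_tables root_password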
A number of startup variables have been renamed:
myisam_bulk_insert_tree_size to bulk_insert_buffer_size
query_cache_startup_type to query_cache_type
record_buffer to read_buffer_size
record_rnd_buffer to read_rnd_buffer_size
sort_buffer to sort_buffer_size
warnings to log-warnings
A number of startup options have been deprecated (they will still work for
now):
record_buffer
sort_buffer
warnings
A number of SQL variables have also been renamed:
SQL_BIG_TABLES to BIG_TABLES
SQL_LOW_PRIORITY_UPDATES to LOW_PRIORITY_UPDATES
SQL_MAX_JOIN_SIZE to MAX_JOIN_SIZE
SQL_QUERY_CACHE_TYPE to QUERY_CACHE_TYPE
Results of all bitwise operators (<<, >>, |, &, ~) are now unsigned, as is the
result of subtraction between two integers when either of them is unsigned
(the latter can be disabled by starting MySQL with the
--sql-mode=NO_UNSIGNED_SUBTRACTION option).
If you're using the Perl DBD::mysql module, you'll need to use a version
more recent than 1.2218, as earlier versions used the old drop_db()
function.
Instead of using SET SQL_SLAVE_SKIP_COUNTER=#, you need to
use SET GLOBAL SQL_SLAVE_SKIP_COUNTER=#.
Multithreaded clients should use the functions mysql_thread_init()
and mysql_thread_end().
Summary
Installing MySQL is not difficult. If you choose a binary distribution, which
is recommended in most cases, it's a matter of pointing and clicking in
Windows and a few simple commands in Unix. There are valid reasons for
choosing a source distribution, though—perhaps MySQL is not yet
available in binary on your platform, or you need to get the most out of
MySQL by compiling it more optimally. Whatever your reason, with
careful tweaking of the compile options, you can get a fast yet stable
installation of MySQL.
Understanding RAID
RAID stands for redundant array of inexpensive disks. It doesn't stand
for redundant array of independent disks; it's amazing how quickly one
wrong source gets duplicated on the Internet, though the term independent
has some accuracy. The term comes from a 1987 paper by researchers
David Patterson, Garth Gibson, and Randy Katz. RAID improves
performance and fault tolerance. The alternative is often described as single
large expensive disk (SLED).
RAID 0
RAID 0 (sometimes just called striping , as it's not really a true type of
RAID) is where data is broken into blocks and spread across multiple
drives. Block sizes are the same for each drive, but different block sizes can
be set initially depending on the circumstances. This allows for increased
performance, as one of the major bottlenecks is moving the drive head.
RAID 0 improves the chances of data requested at the same time being on
different drives, meaning the reads can take place simultaneously, without
having to wait for one read to finish, reposition the head, and then read the
second set of data. In general, the more drives there are, the better the
performance. Performance is even better if each drive has its own
controller, though this is not critical. Just make sure a controller can
handle the load if it is responsible for multiple drives. RAID 0 allows for no
fault tolerance, however. In fact, it increases the chance of a failure, as there
is more than one drive that can fail, rendering the set of data unavailable.
RAID 1
RAID 1 (also called mirroring ) is where writes to one drive are repeated on
another drive. This improves fault tolerance because there is an up-to-date
backup in case of one drive failing. Drive failures are the most common
type of hardware failure, and RAID 1 protects against this kind of failure.
Write performance is poor, as there are simultaneous writes taking place,
although reads are slightly quicker, as there are multiple drives to access.
You need at least two drives to implement RAID 1.
Figure 16.2 shows RAID 1 being implemented with two drives, each
containing an identical copy.
Figure 16.2: RAID 1
RAID 3
RAID 3 is the same as RAID 0, except that it also sets aside a dedicated
drive for error correction, providing some level of fault tolerance. Data is
striped at a byte level across multiple drives. Another drive stores parity
data. Parity is determined during a write and checked during a read. Parity
information allows recovery if one drive fails. You need at least three drives
to implement RAID 3. It usually requires hardware implementation to be of
much use. Small writes and reads are fast, but large blocks of data usually
require data to be read from all data drives, meaning that performance can
be as slow as a single drive.
Figure 16.3 shows RAID 3 being used to stripe data across two drives (A
and B), with a third drive (C) being used to store parity.
RAID 4
RAID 4 is similar to RAID 3, except that striping is performed at block
level, not byte level. Reads smaller than one block in size will be fast
(generally getting faster as each new drive is added). As with RAID 3, it
requires at least three drives, and the parity data allows recovery if one
drive fails. The parity drive can become a bottleneck, and choosing RAID 5
can overcome this limitation.
Figure 16.4 shows RAID 4 being used to stripe data across three drives (A,
B, and C), with a fourth drive (D) being used to store parity.
RAID 5
RAID 5 also allows for striping, as well as stripe error correction data,
improving both performance and fault tolerance. It is similar to RAID 4,
except that parity data is stored on each drive. Writes are faster than RAID
4 (there is no one drive bottleneck), but reads are slower, as parity
information takes up space on each drive and has to be skipped over. RAID
5 is often recommended for database servers, as it builds in redundancy and
improves performance.
RAID 10
RAID 10 is a combination of RAID 1 and RAID 0 (mirroring and striping).
It provides all the performance benefits of striping, as well as the fault
tolerance of mirroring. It's the best of both worlds, but the cost is high. It
requires at least four drives to implement.
Figure 16.6 shows RAID 10 being implemented across four drives (A, B, C,
and D). A and B mirror the data (blocks one to four of data appear on both
drives), and drives C and D provide the performance boost with data being
striped across them. RAID 10 is often suggested for database servers, as it
gives the greatest level of performance boost and fault tolerance, albeit at a
cost penalty (for the number of drives).
RAID 0+1
RAID 0+1 is often confused with RAID 10. Where RAID 10 is a striped
array of drives, whose segments are mirrored, RAID 0+1 is a mirrored array
of drives, whose segments are striped. Generally RAID 0+1 is chosen when
performance is a higher priority than reliability, and RAID 10 is chosen
when reliability is a higher priority than performance. RAID 0+1 is also
expensive and requires at least four drives to implement. Figure 16.7 shows
RAID 0+1 being implemented across four drives (A, B, C, and D).
Figure 16.7: RAID 0+1
Let's test the creation of a new database, s_db, which you're going to
symlink. Assuming you already have a directory, /disk2, which is the
secondary disk you want to place the database on, create the directory
where you're going to store the data (/disk2/mysql/data/s_db) and change
the permissions and ownership, first on a Unix system, as follows:
% cd /disk2
% mkdir mysql
% mkdir mysql/data
% mkdir mysql/data/s_db
% chown mysql /disk2/mysql/data/s_db/
% chgrp mysql /disk2/mysql/data/s_db/
% chmod 700 /disk2/mysql/data/s_db/
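The directory then needs to be symlinked from inside the MySQL data
directory so that it appears as a database. A minimal sketch, assuming the
data directory is /usr/local/mysql/data as elsewhere in this chapter:
% cd /usr/local/mysql/data
% ln -s /disk2/mysql/data/s_db s_db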
Now, to see that data is actually placed on the second disk, create and
populate a small table, as follows:
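A minimal version that would produce the s1 table files listed below (the
single INT column is an assumption):
mysql> USE s_db;
Database changed
mysql> CREATE TABLE s1 (f1 INT);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO s1 VALUES (1);
Query OK, 1 row affected (0.00 sec)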
Check that the new data has been created on the secondary disk:
mysql> exit
Bye
% ls -l /disk2/mysql/data/s_db/
total 14
-rw-rw---- 1 mysql mysql 5 Jul 8 02:26 s1.MYD
-rw-rw---- 1 mysql mysql 1024 Jul 8 02:26 s1.MYI
-rw-rw---- 1 mysql mysql 8550 Jul 8 02:25 s1.frm
Because this is inside your data directory, it will appear on the same level as
all your other MyISAM databases (in this case, firstdb, mysql, and test), and
when you connect to MySQL, you'll see it as an existing database:
your my.ini configuration file). It's also possible your version of MySQL
was not compiled with -DUSE_SYMDIR, in which case symlinks will not
work at all. Usually, the mysql-max and mysql-max-nt servers are compiled
with this option; however, you should check the latest documentation.
Features that don't yet work with symlinked tables include the following:
BACKUP TABLE and RESTORE TABLE (the symlinks will be lost).
mysqldump does not store symlink information in the dump.
ALTER TABLE (it ignores the INDEX/DATA DIRECTORY="path"
CREATE TABLE options).
To symlink a table when you create it, use the INDEX DIRECTORY or DATA
DIRECTORY="path" option. DATA DIRECTORY creates a symlink for the
.MYD file, and INDEX DIRECTORY places the .MYI file. The following
example places the data file of a new table in the firstdb database in the
new directory you created earlier.
mysql> CREATE TABLE s_table (f1 INT) DATA DIRECTORY =
    -> '/disk2/mysql/data/s_db';
Query OK, 0 rows affected (0.20 sec)
mysql> INSERT INTO s_table VALUES(1);
Query OK, 1 row affected (0.01 sec)
You can see that the .frm file (containing the structure) is in the usual data
directory, and the .MYD data file is in the new location:
% cd /usr/local/mysql/data/firstdb/
% ls -l /disk2/mysql/data/s_db/
total 16
-rw-rw---- 1 mysql mysql 5 Jul 8 02:26 s1.MYD
-rw-rw---- 1 mysql mysql 1024 Jul 8 02:26 s1.MYI
-rw-rw---- 1 mysql mysql 8550 Jul 8 02:25 s1.frm
-rw-rw---- 1 mysql mysql 5 Jul 8 05:35 s_table.MYD
% ls -l s*
lrwxrwx--x   1 mysql  mysql    34 Jul  8 05:34 s_table.MYD ->
    /disk2/mysql/data/s_db/s_table.MYD
-rw-rw----   1 mysql  mysql  1024 Jul  8 05:35 s_table.MYI
-rw-rw----   1 mysql  mysql  8548 Jul  8 05:34 s_table.frm
MySQL has created the symlink for the table. You could also have
explicitly created this symlink yourself (taking the same care with
permissions as you did with creating the database symlink).
The data and index files can be created in different locations, as this
example demonstrates, assuming the existence of the directory
/disk3/mysql/data/indexes:
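The CREATE TABLE statement that produces this layout is not reproduced in
the text; a minimal version (the single INT column is an assumption) would
be:
mysql> CREATE TABLE s2_table (f1 INT)
    -> DATA DIRECTORY = '/disk2/mysql/data/s_db'
    -> INDEX DIRECTORY = '/disk3/mysql/data/indexes';
Query OK, 0 rows affected (0.03 sec)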
% ls -l /disk3/mysql/data/indexes/
total 2
-rw-rw----   1 mysql  mysql  1024 Jul  8 06:01 s2_table.MYI
% ls -l /disk2/mysql/data/s_db/
...
-rw-rw----   1 mysql  mysql     0 Jul  8 06:01 s2_table.MYD
...
% ls -l /usr/local/mysql/data/firstdb/
...
lrwxrwx--x   1 mysql  mysql    35 Jul  8 06:01 s2_table.MYD ->
    /disk2/mysql/data/s_db/s2_table.MYD
lrwxrwx--x   1 mysql  mysql    38 Jul  8 06:01 s2_table.MYI ->
    /disk3/mysql/data/indexes/s2_table.MYI
-rw-rw----   1 mysql  mysql  8548 Jul  8 06:01 s2_table.frm
...
Note
The INDEX DIRECTORY and DATA DIRECTORY options do not work
when running MySQL on Windows (although check your latest
documentation).
Summary
RAID is a method of using multiple disks to store data. RAID 0 (striping)
spreads blocks of data across multiple disks. It speeds up performance but
has no redundancy capability. RAID 1 (mirroring) slows down write speed
and can speed up reads, but it is mainly used for protection from drive
failure. RAID 2, 3, 4, and 5 all make use of striping and use various forms
of parity for fault tolerance. RAID 10 and RAID 0+1 combine mirroring
and striping (although they implement them slightly differently—RAID 10
with a greater focus on reliability, and RAID 0+1 focusing more on speed).
Part A: Appendixes
Appendix List
Appendix A: MySQL Syntax Reference
Appendix B: MySQL Function and Operator Reference
Appendix C: PHP API
Appendix D: Perl DBI
Appendix E: Python Database API
Appendix F: Java API
Appendix G: C API
Appendix H: ODBC and .NET
ALTER
The ALTER syntax is as follows:
ALTER [IGNORE] TABLE table_name
alter_specification [, alter_specification ...]
The alter_specification syntax can be any of the following:
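A typical use is adding a column to an existing table (the table and column
names here are for illustration only):
mysql> ALTER TABLE customer ADD COLUMN email VARCHAR(100);
Query OK, 0 rows affected (0.05 sec)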
ANALYZE TABLE
ANALYZE TABLE table_name [,table_name...]
For MyISAM and BDB tables, this analyzes and stores the key distribution
for the specified tables. It locks the table with a read lock for the operation's
duration.
BACKUP TABLE
BACKUP TABLE table_name [,table_name...] TO 'path_name'
For MyISAM tables, this copies the data and data definition files to the
backup directory.
BEGIN
BEGIN
The BEGIN statement begins a transaction, or set of statements. The
transaction remains open until the next COMMIT or ROLLBACK statement.
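For example, with a transactional table type (InnoDB or BDB), a pair of
related updates can be committed as a unit (the table and column names are
for illustration only):
mysql> BEGIN;
mysql> UPDATE account SET balance=balance-100 WHERE id=1;
mysql> UPDATE account SET balance=balance+100 WHERE id=2;
mysql> COMMIT;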
CHECK TABLE
CHECK TABLE tbl_name[,tbl_name...] [option
[option...]]
The option can be one of the following:
CHANGED
EXTENDED
FAST
MEDIUM
QUICK
This checks a MyISAM or InnoDB table for errors and, for MyISAM
tables, updates the index statistics. The QUICK option doesn't scan the rows
to check links. The FAST option only checks tables that weren't closed
properly. The CHANGED option is the same as FAST, except that it also
checks tables that have changed since the last check. The MEDIUM option
verifies that deleted links are correct, and the EXTENDED option does a full
index lookup for each key in each row.
COMMIT
COMMIT
TheCOMMIT statement ends a transaction, or set of statements, and flushes
the results to disk.
CREATE
The CREATE syntax can be one of the following:
[table_options] [select_statement]
The create_definition syntax can be any of the following:
TEMPORARY tables exist only for as long as the connection is active. You
need to have CREATE TEMPORARY TABLES permission to do this.
The RAID_TYPE option helps operating systems that cannot support large
files to overcome the file size limit. The STRIPED option is the only one
currently used. For MyISAM tables, this creates subdirectories inside the
database directory, each containing a portion of the data file. The first
1024*RAID_CHUNKSIZE bytes go into the first portion, the next
1024*RAID_CHUNKSIZE bytes go into the next portion, and so on.
The PACK_KEYS=1 option packs numeric fields in the index for MyISAM
tables (as well as strings, which it does by default). This is only useful if
you have indexes with many duplicate numbers.
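A simple CREATE TABLE, of the kind used throughout this book, looks like
the following (the table and column names are for illustration only):
mysql> CREATE TABLE product (
    ->   id INT NOT NULL AUTO_INCREMENT,
    ->   name VARCHAR(60),
    ->   PRIMARY KEY (id)
    -> );
Query OK, 0 rows affected (0.02 sec)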
DELETE
The DELETE syntax can be any of the following:
The DELETE statement deletes records from the table (or tables) that adhere
to the where_clause (or all records if there is no clause).
The LOW_PRIORITY keyword causes the DELETE to wait until no other
clients are reading the table before processing it.
The QUICK keyword causes MySQL not to merge index leaves during
the DELETE, which is sometimes quicker.
LIMIT determines the maximum number of records to be deleted.
The ORDER BY clause causes MySQL to remove records in a certain order
(which is useful with a LIMIT clause).
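For example, to remove only the oldest matching record (the table and field
names are for illustration only):
mysql> DELETE FROM sales WHERE code=5 ORDER BY sale_date LIMIT 1;
Query OK, 1 row affected (0.01 sec)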
DESC
DESC is a synonym forDESCRIBE .
DESCRIBE
DESCRIBE table_name {field_name | wildcard}
DESCRIBE returns the definition of the specified table and fields (the same
as SHOW COLUMNS FROM table_name).
The wildcard can be part of the fieldname and can be a percentage sign (%),
meaning a number of characters, or an underscore (_), meaning one
character.
DO
The DO syntax is as follows:
DO expression, [expression, ...]
DO has the same effect as a SELECT, except that it does not return results
(making it slightly faster).
DROP
TheDROP syntax is as follows:
DROP DATABASE [IF EXISTS] database_name
DROP TABLE [IF EXISTS] table_name [,
table_name,...] [RESTRICT | CASCADE]
DROP INDEX index_name ON table_name
DROP DATABASE removes the database and all its tables.
DROP TABLE removes the specified table.
DROP INDEX removes the specified index.
MySQL returns an error if the database or table doesn't exist, unless the
IF EXISTS clause is used.
DROP TABLE automatically commits active transactions.
RESTRICT and CASCADE are not currently implemented.
EXPLAIN
EXPLAIN table_name
EXPLAIN select_query
Theselect_query is the same as specified in theSELECT description.
FLUSH
FLUSH flush_option [,flush_option] ...
Theflush_option can be any of the following:
DES_KEY_FILE
HOSTS
LOGS
QUERY CACHE
PRIVILEGES
STATUS
TABLES
[TABLE | TABLES] table_name [,table_name...]
TABLES WITH READ LOCK
USER_RESOURCES
Flushing the DES_KEY_FILE reloads the DES keys. With the HOSTS
option, the host cache is emptied (which you use after changing IP
addresses, for example). Flushing the LOGS closes and reopens the log files
and increments the binary log. Flushing the QUERY CACHE defragments the
query cache. Flushing the PRIVILEGES reloads the permission tables from
the mysql database. Flushing the STATUS resets the status variables.
Flushing the TABLES flushes the query cache and also closes all open
tables; you can specify only certain tables to flush. You can place a READ
LOCK on the tables, which is useful for locking a group of tables for backup
purposes. Flushing the USER_RESOURCES resets user resources (used for
limiting queries, connections, and updates per hour).
GRANT
GRANT privilege_type [(field_list)]
    [, privilege_type [(field_list)] ...]
    ON {table_name | * | *.* | database_name.*}
    TO user_name [IDENTIFIED BY [PASSWORD] 'password']
    [, user_name [IDENTIFIED BY 'password'] ...]
    [REQUIRE NONE | [{SSL | X509}] [CIPHER cipher [AND]]
        [ISSUER issuer [AND]] [SUBJECT subject]]
    [WITH [GRANT OPTION | MAX_QUERIES_PER_HOUR # |
        MAX_UPDATES_PER_HOUR # | MAX_CONNECTIONS_PER_HOUR #]]
The privilege_type can be one of the following:
ALTER
CREATE
CREATE TEMPORARY TABLES
DELETE
DROP
EXECUTE
FILE
Permission to read and write files on the server (for LOAD DATA INFILE
or SELECT INTO OUTFILE statements). Any files that the mysql user can
read are readable.
INDEX
INSERT
LOCK TABLES
PROCESS
REPLICATION CLIENT
Permission to ask about the replication slaves and masters.
SUPER
UPDATE
USAGE
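For example, to give a user read and write access to a single database (the
username, host, and password follow the examples used elsewhere in this
chapter):
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON firstdb.*
    -> TO 'hartmann'@'localhost' IDENTIFIED BY 'g00r002b';
Query OK, 0 rows affected (0.01 sec)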
INSERT
The INSERT syntax can be any of the following:
The LOW_PRIORITY keyword causes the INSERT to wait until no other
clients are reading the table before processing it. With the DELAYED
keyword, MySQL frees the client immediately but waits to perform the INSERT.
JOIN
MySQL accepts any of the following join syntaxes:
table_name, table_name
table_name [CROSS] JOIN table_name
table_name INNER JOIN table_name condition
table_name STRAIGHT_JOIN table_name
table_name LEFT [OUTER] JOIN table_name condition
table_name LEFT [OUTER] JOIN table_name
table_name NATURAL [LEFT [OUTER]] JOIN table_name
table_name LEFT [OUTER] JOIN table_name ON conditional_expr
table_name RIGHT [OUTER] JOIN table_name condition
table_name RIGHT [OUTER] JOIN table_name
table_name NATURAL [RIGHT [OUTER]] JOIN table_name
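For example, an inner join between two tables on a shared key (the table
and field names are for illustration only):
mysql> SELECT c.first_name, s.amount
    -> FROM customer AS c INNER JOIN sales AS s ON c.id = s.customer_id;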
KILL
KILL thread_id
Kills the specified thread. You can use SHOW PROCESSLIST to identify
thread IDs. The SUPER privilege is required to kill processes not owned by
the current connection.
LOAD DATA
LOAD DATA reads data from a text file and adds it to a table. This is a
quicker way of adding high volumes of data than using INSERT.
The LOCAL keyword indicates that the file is on the client machine;
otherwise the file is assumed to be on the database server. LOCAL will not
work if the server was started with the --local-infile=0 option, or the
client has not been enabled to support it.
The IGNORE number LINES option ignores a number of lines at the top of
the file (which is useful when the file contains a header). LOAD DATA
INFILE is the complement of SELECT...INTO OUTFILE.
LOCK TABLES
LOCK TABLES table_name [AS alias] {READ | [READ
LOCAL] | [LOW_PRIORITY] WRITE} [,table_name {READ
| [LOW_PRIORITY] WRITE} ...]
LOCK TABLES places a lock on the specified tables. The lock can be READ
(other connections cannot write, only read), READ LOCAL (the same as READ
except that writes from other connections that do not conflict are allowed),
or WRITE (which blocks reading or writing from other connections). If
the WRITE lock is LOW_PRIORITY, READ locks are placed first.
Usually WRITE locks have higher priority.
OPTIMIZE
OPTIMIZE TABLE table_name [,table_name]...
For MyISAM tables, this sorts the index, updates the statistics, and
defragments the data file.
For BDB tables, this is the same asANALYZE TABLE.
This locks the table for the duration of the operation (which can take some
time).
RENAME
TheRENAME syntax is as follows:
RENAME TABLE table_name TO new_table_name[,
table_name2 TO new_table_name2,...]
RENAME allows you to give a table (or list of tables) a new name. You can
also move a table to a new database by
specifyingdatabase_name.table_name , as long as the database is
on the same disk.
REPAIR TABLE
REPAIR TABLE table_name [,table_name...]
[EXTENDED] [QUICK] [USE_FRM]
Repairs a corrupted MyISAM table. With the QUICK option, only the index
tree is repaired. WithEXTENDED , the index is re-created row by row.
WithUSE_FRM , the index is repaired based upon the data definition file
(for when the index is missing or totally corrupted).
REPLACE
TheREPLACE syntax can be one of the following:
RESET
RESET reset_option [,reset_option] ...
Thereset_option can be any of the following:
MASTER
QUERY CACHE
SLAVE
RESET MASTER deletes all binary logs and empties the binary log
index.RESET SLAVE resets a slave's position for replicating with a
master.RESET QUERY CACHE empties the query cache.
RESTORE TABLE
RESTORE TABLE table_name [,table_name...] FROM
'path' Restores a table backed up withBACKUP TABLE . It will not
overwrite existing tables.
REVOKE
REVOKE privilege_type [(field_list)]
[,privilege_type [(field_list)] ...] ON
{table_name | * | *.* | database_name.*} FROM
user_name [, user_name ...]
ROLLBACK
ROLLBACK
TheROLLBACK statement ends a transaction, or set of statements, and
undoes any statements in that transaction.
SELECT
The SELECT syntax is as follows:
SELECT [STRAIGHT_JOIN] [SQL_SMALL_RESULT] [SQL_BIG_RESULT]
    [SQL_BUFFER_RESULT] [SQL_CACHE | SQL_NO_CACHE]
    [SQL_CALC_FOUND_ROWS] [HIGH_PRIORITY]
    [DISTINCT | DISTINCTROW | ALL]
    expression, ...
    [INTO {OUTFILE | DUMPFILE} 'file_name' export_options]
    [FROM table_names
        [WHERE where_clause]
        [GROUP BY {unsigned_integer | field_name | formula} [ASC | DESC], ...]
        [HAVING where_definition]
        [ORDER BY {unsigned_integer | field_name | formula} [ASC | DESC], ...]
        [LIMIT [offset,] rows]
        [PROCEDURE procedure_name]
        [FOR UPDATE | LOCK IN SHARE MODE]]
At its simplest, a SELECT can be run without a table at all, as follows:
SELECT VERSION();
or as follows:
SELECT 42/10;
The expression can also be given an alias with the keyword AS. For
example:
SELECT 22/7 AS about_pi
The alias can be used elsewhere in the statement (but not in
the WHERE clause, which is usually determined first).
The table_names clause is a comma-separated list of tables used in the
query. You can also use an alias for a table name. For example:
SELECT watts FROM wind_water_solar_power AS n;
You can also control MySQL's index usage if you're unhappy with
MySQL's choice (which you can view by using EXPLAIN) with the USE
INDEX and IGNORE INDEX clauses after the table name. The syntax is as
follows:
GROUP BY groups output rows, which is useful when you use an
aggregate function. Two non-ANSI MySQL extensions are that you can use
ASC or DESC with GROUP BY, and you can also use fields in the
expression that are not mentioned in the GROUP BY clause. For example:
LIMIT takes one or two arguments to limit the number of rows returned. If
there is one argument, it's the maximum number of rows to return; if two,
the first is the offset and the second the maximum number of rows to return.
If the second argument is -1, MySQL will return all rows from the specified
offset until the end. For example, to return from row 2 to the end, use this:
LIMIT 1,-1
Using INTO DUMPFILE causes MySQL to write one row into the file,
without any column or line terminations and without any escaping.
With InnoDB and BDB tables, the FOR UPDATE clause write-locks the
rows.
SET
SET [GLOBAL | SESSION] variable_name=expression,
[[GLOBAL | SESSION | LOCAL ]
variable_name=expression...]
AUTOCOMMIT = 0 | 1
When set (1), MySQL automatically COMMITs statements unless you wrap
them in BEGIN and COMMIT statements. MySQL also automatically
COMMITs all open transactions when you set AUTOCOMMIT.
BIG_TABLES = 0 | 1
When set (1), all temporary tables are stored on disk instead of in memory.
This makes temporary tables slower, but it prevents the problem of running
out of memory. The default is 0.
INSERT_ID = #
Sets the AUTO_INCREMENT value (so the next INSERT statement that
uses an AUTO_INCREMENT field will use this value).
MAX_JOIN_SIZE = value | DEFAULT
By setting a maximum size in rows, you can prevent MySQL from running
queries that may not be making proper use of indexes or that may have the
potential to slow the server down when run in bulk or at peak times.
Setting this to anything but DEFAULT resets SQL_BIG_SELECTS. If
SQL_BIG_SELECTS is set, then MAX_JOIN_SIZE is ignored. If the query
is already cached, MySQL will ignore this limit and return the results.
SQL_AUTO_IS_NULL = 0 | 1
If set (1, the default), then the last inserted row for an AUTO_INCREMENT
column can be found with WHERE auto_increment_column IS NULL.
This is used by Microsoft Access and other programs connecting through
ODBC.
SQL_BIG_SELECTS = 0 | 1
If set (1, the default), then MySQL allows large queries. If not set (0), then
MySQL will not allow queries where it will have to examine more than
max_join_size rows. This is useful to avoid running accidental or
malicious queries that could bring the server down.
SQL_BUFFER_RESULT = 0 | 1
If set (1), MySQL places query results into a temporary table (in some cases
speeding up performance by releasing table locks earlier).
SQL_LOG_BIN = 0 | 1
If not set (0), MySQL will not write to the binary log for the client (this is
not the update log). The SUPER permission is required.
SQL_LOG_UPDATE = 0 | 1
If not set (0), MySQL will not use the update log for the client. This
requires the SUPER permission.
SQL_QUOTE_SHOW_CREATE = 0 | 1
If set (1, the default), MySQL will quote table and column names.
SQL_SAFE_UPDATES = 0 | 1
If set (1), MySQL will not perform UPDATE or DELETE statements that
don't use either an index or a LIMIT clause, which helps prevent unpleasant
accidents.
SQL_SELECT_LIMIT = value | DEFAULT
Sets the maximum number of rows returned by SELECT statements;
DEFAULT restores the unlimited behavior.
TIMESTAMP = timestamp_value | DEFAULT
Sets the time for the client. This can be used to get the original timestamp
when using the update log to restore rows. The timestamp_value is a
Unix epoch timestamp.
The old SET OPTION syntax is now deprecated, so you should not use it
anymore.
SET TRANSACTION
SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL
{ READ UNCOMMITTED | READ COMMITTED | REPEATABLE
READ | SERIALIZABLE }
Sets the transaction isolation level. By default it applies to the next
transaction only, unless the SESSION or GLOBAL keyword is used (which
sets the level for all transactions on the current connection or for all
transactions on all new connections, respectively).
SHOW
TheSHOW syntax can be any of the following:
TRUNCATE
TRUNCATE TABLE table_name
The TRUNCATE statement deletes all records from a table. It is quicker than
the equivalent DELETE statement because it DROPs and re-CREATEs the
table. It is not transaction safe (and will return an error if there are any
active transactions or locks).
UNION
SELECT ... UNION [ALL] SELECT ... [UNION SELECT
...]
UNION combines the results of multiple SELECT statements into one.
Without the ALL keyword, duplicate rows are removed.
UNLOCK TABLES
UNLOCK TABLES
Releases all locks held by the current connection.
UPDATE
UPDATE [LOW_PRIORITY] [IGNORE] table_name SET
field_name1= expression1 [, field_
name2=expression2, ...] [WHERE where_clause]
[LIMIT #]
The UPDATE statement updates the contents of existing rows in the
database.
The SET clause specifies which fields to update and what the new values are
to be.
The where_clause gives conditions a row must meet in order to
be updated.
IGNORE causes MySQL to ignore UPDATEs that would cause a duplicate
primary key or unique key, instead of aborting the UPDATE.
The LOW_PRIORITY keyword causes the UPDATE to wait until no other
clients are reading the table before processing it.
The expression can take the current value of a field; for example, to add 5
to all employees' commissions, you could use the following:
UPDATE employee SET commission=commission+5;
LIMIT determines the maximum number of records to be updated.
USE
USE database_name
Changes the current active database to the specified database.
Logical Operators
Logical operators, or Boolean operators, check whether something is true or
false. They return 0 if the expression is false and 1 if it's true. Null values
are handled differently depending on the operator; usually they return
a NULL result.
AND, &&
value1 AND value2
value1 && value2
Returns true (1) if both values are true.
For example:
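ANDing a true value with a false value returns 0:
mysql> SELECT 1 AND 0;
+---------+
| 1 AND 0 |
+---------+
|       0 |
+---------+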
OR, ||
value1 OR value2
value1 || value2
Returns true (1) if eithervalue1 orvalue2 is true.
For example:
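ORing a true value with a false value returns 1:
mysql> SELECT 1 OR 0;
+--------+
| 1 OR 0 |
+--------+
|      1 |
+--------+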
NOT, !
NOT value1 ! value1
Returns the opposite ofvalue1 , which is true ifvalue1 is false and false
ifvalue1 is true.
For example:
mysql> SELECT !1;
+----+
| !1 |
+----+
| 0 |
+----+
mysql> SELECT NOT(1=2);
+----------+
| NOT(1=2) |
+----------+
|        1 |
+----------+
Arithmetic Operators
Arithmetic operators perform basic mathematical calculations. If any of the
values are null, the results of the entire operation are also usually null. For
purposes of the calculation, strings are converted to numbers. Some strings
are converted to the equivalent number (such as the strings '1' and '33'), but
others are converted to 0 (such as the strings 'one' and 'abc').
+
value1 + value2
Adds two values together. For example:
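Adding two literal numbers:
mysql> SELECT 2+3;
+-----+
| 2+3 |
+-----+
|   5 |
+-----+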
–
value1 - value2
Subtractsvalue2 fromvalue1 .
For example:
*
value1 * value2
Multiplies two values with each other.
For example:
/
value1 / value2
Dividesvalue1 byvalue2 .
For example:
%
value1 % value2
Returns the modulus (the remainder after value1 is divided by value2).
For example:
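Dividing 7 by 3 leaves a remainder of 1:
mysql> SELECT 7 % 3;
+-------+
| 7 % 3 |
+-------+
|     1 |
+-------+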
Comparison Operators
Comparison operators compare values and return true or false (1 or 0)
depending on the results. If there's a null value, the operator will
returnNULL as a result in most cases. Different types can be compared
(strings, numbers, dates, and so on), though if the types are different, you
need to be careful. MySQL converts the types to an equivalent as well as it
can.
=
value1 = value2
True if both value1 and value2 are equal. If either is null, this will
return NULL.
For example:
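Comparing two equal literals returns 1:
mysql> SELECT 1=1;
+-----+
| 1=1 |
+-----+
|   1 |
+-----+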
!=, <>
value1 <> value2
value1 != value2
True if value1 is not equal to value2.
For example:
>
value1 > value2
True ifvalue1 is greater thanvalue2 .
For example:
<
value1 < value2
True ifvalue1 is less thanvalue2 .
For example:
>=
value1 >= value2
True ifvalue1 is greater than or equal tovalue2 .
For example:
<=
value1<= value2
True ifvalue1 is less than or equal tovalue2 .
For example:
<=>
value1 <=> value2
The NULL-safe equality operator: true if both values are equal, with NULL
treated as an ordinary value, so NULL <=> NULL returns 1 rather than NULL.
For example:
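Unlike the ordinary = operator, the NULL-safe comparison returns 1 when
both sides are NULL:
mysql> SELECT NULL <=> NULL;
+---------------+
| NULL <=> NULL |
+---------------+
|             1 |
+---------------+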
IS NULL
value1 IS NULL
True ifvalue1 is null (not false).
For example:
BETWEEN
value1 BETWEEN value2 AND value3
True ifvalue1 is inclusively betweenvalue2 andvalue3 .
For example:
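Testing whether a literal falls inside a range:
mysql> SELECT 5 BETWEEN 1 AND 10;
+--------------------+
| 5 BETWEEN 1 AND 10 |
+--------------------+
|                  1 |
+--------------------+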
LIKE
value1 LIKE value2
True ifvalue1 matchesvalue2 on an SQL pattern match. A percentage
(%) refers to any number of characters, and an underscore (_) refers to one
character.
For example:
mysql> SELECT 'abc' LIKE 'ab_';
+------------------+
| 'abc' LIKE 'ab_' |
+------------------+
| 1 |
+------------------+
mysql> SELECT 'abc' LIKE '%c';
+-----------------+
| 'abc' LIKE '%c' |
+-----------------+
|               1 |
+-----------------+
IN
value1 IN (value2 [value3,...])
True ifvalue1 is equal to any value in the comma-separated list.
For example:
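Checking whether a literal appears in a list:
mysql> SELECT 'b' IN ('a','b','c');
+----------------------+
| 'b' IN ('a','b','c') |
+----------------------+
|                    1 |
+----------------------+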
Bit Operators
The bit operators are not used often. They allow you to work with bit values
and perform bit calculations in your queries.
&
value1 & value2
Performs a bitwise AND. This converts the values to binary and compares
the bits. Only if both of the corresponding bits are 1 is the resulting bit also
1.
For example:
mysql> SELECT 2&1;
+-----+
| 2&1 |
+-----+
| 0 |
+-----+
mysql> SELECT 3&1;
+-----+
| 3&1 |
+-----+
| 1 |
+-----+
|
value1 | value2
Performs a bitwise OR. This converts the values to binary and compares the
bits. If either of the corresponding bits are 1, the resulting bit is also 1.
For example:
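ORing the bits of 2 (binary 10) and 1 (binary 01) gives 3 (binary 11):
mysql> SELECT 2|1;
+-----+
| 2|1 |
+-----+
|   3 |
+-----+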
<<
value1 << value2
Convertsvalue1 to binary and shifts the bits ofvalue1 left by the
amount ofvalue2 .
For example:
mysql> SELECT 2<<1;
+------+
| 2<<1 |
+------+
|    4 |
+------+
>>
value1 >> value2
Convertsvalue1 to binary and shifts the bits ofvalue1 right by the
amount ofvalue2 .
For example:
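Shifting 4 (binary 100) one bit to the right gives 2 (binary 10):
mysql> SELECT 4>>1;
+------+
| 4>>1 |
+------+
|    2 |
+------+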
To perform date calculations, you can also use the usual operators (+, –, and
so on) rather than the date functions. MySQL also correctly converts
between units. When, for example, you add 1 month to month 12, MySQL
will increment the year and correctly calculate the months.
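For example, adding one month to a December date rolls the year over
correctly:
mysql> SELECT '2003-12-15' + INTERVAL 1 MONTH;
+---------------------------------+
| '2003-12-15' + INTERVAL 1 MONTH |
+---------------------------------+
| 2004-01-15                      |
+---------------------------------+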
ADDDATE
ADDDATE(date,INTERVAL expression type) A synonym
forDATE_ADD() .
CURDATE
CURDATE() A synonym for theCURRENT_DATE() function.
CURRENT_DATE
CURRENT_DATE()
Returns the current system date as either the string YYYY-MM-DD or the
numeric YYYYMMDD depending on the context.
For example:
CURRENT_TIME
CURRENT_TIME()
Returns the current system time as either the string hh:mm:ss or the number
hhmmss, depending on the context of the function.
For example:
CURRENT_TIMESTAMP
CURRENT_TIMESTAMP() This function is a synonym for theNOW()
function.
CURTIME
CURTIME() A synonym for theCURRENT_TIME() function.
DATE_ADD
DATE_ADD(date,INTERVAL expression type)
Adds a certain time period to the specified date. You can use a negative
value for the expression, in which case it will be subtracted. The type must
be one of those listed at the beginning of this section ("Date and Time
Functions"), and the expression must match the type.
For example:
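Adding two months to a date:
mysql> SELECT DATE_ADD('2003-01-01', INTERVAL 2 MONTH);
+------------------------------------------+
| DATE_ADD('2003-01-01', INTERVAL 2 MONTH) |
+------------------------------------------+
| 2003-03-01                               |
+------------------------------------------+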
DATE_FORMAT
DATE_FORMAT(date,format_string)
Formats the specified date based upon the format string, which can consist
of the specifiers shown in Table B.2 .
For example:
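Using the day, month name, and year specifiers from Table B.2:
mysql> SELECT DATE_FORMAT('2003-01-15', '%d %M %Y');
+---------------------------------------+
| DATE_FORMAT('2003-01-15', '%d %M %Y') |
+---------------------------------------+
| 15 January 2003                       |
+---------------------------------------+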
DATE_SUB
DATE_SUB(date,INTERVAL expression type)
Subtracts a certain time period from the specified date. You can use a
negative value for the expression, in which case it will be added. The type
must be one of those listed at the beginning of this section ("Date and Time
Functions"), and the expression must match the type.
For example:
DAYOFMONTH
DAYOFMONTH(date)
Returns the day of the month for the supplied date as a number from 1 to
31. For example:
DAYOFWEEK
DAYOFWEEK(date)
Returns the day of the week for the supplied date as a number from 1 for
Sunday to 7 for Saturday, which is the Open Database Connectivity
(ODBC) standard.
For example:
DAYOFYEAR
DAYOFYEAR(date)
Returns the day of the year for the supplied date as a number from 1 to 366.
For example:
EXTRACT
EXTRACT(date_type FROM date)
Uses the specified date type to return the portion of the date. See the list of
date types before the start of the date functions.
For example:
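Extracting just the year portion of a date:
mysql> SELECT EXTRACT(YEAR FROM '2003-04-05');
+---------------------------------+
| EXTRACT(YEAR FROM '2003-04-05') |
+---------------------------------+
|                            2003 |
+---------------------------------+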
FROM_DAYS
FROM_DAYS(number)
Converts the specified number into a date based on the number of days
since Jan 1, year 0, and returns the result. Does not take the days lost in the
change to the Gregorian calendar into account.
For example:
FROM_UNIXTIME
FROM_UNIXTIME(unix_timestamp [, format_string])
Converts the specified timestamp into a date and returns the result. The
returned date will be formatted if there is a format string supplied. The
format string can be any of those from theDATE_FORMAT() function.
For example:
HOUR
HOUR(time)
Returns the hour for the specified time, from 0 to 23.
For example:
MINUTE
MINUTE(time)
Returns the minutes for the specified time, from 0 to 59. For example:
MONTH
MONTH(date)
Returns the month for the specified date, from 1 to 12. For example:
MONTHNAME
MONTHNAME(date)
Returns the name of the month for the specified date. For example:
NOW
NOW()
Returns the current timestamp (date and time in the format YYYY-MM-DD
hh:mm:ss), either as a string or numeric depending on the context. The
function will return the same result for multiple calls on a single query.
For example:
PERIOD_ADD
PERIOD_ADD(period,months)
Adds the months to the period (specified as eitherYYMM orYYYYMM ) and
returns the result asYYYYMM .
For example:
PERIOD_DIFF
PERIOD_DIFF(period1,period2)
Returns the number of months betweenperiod andperiod2 (which are
specified in the formatYYMM orYYYYMM ).
For example:
QUARTER
QUARTER(date)
Returns the quarter of the specified date, from 1 to 4. For example:
SEC_TO_TIME
SEC_TO_TIME(seconds)
Converts the seconds to time, returning either a string (hh:mm:ss ) or
numeric (hhmmss ) depending on the context.
For example:
SECOND
SECOND(time)
Returns the seconds for the specified time, from 0 to 59.
For example:
SUBDATE
SUBDATE(date,INTERVAL expression type)
A synonym forDATE_SUB() .
SYSDATE
SYSDATE()
A synonym for theNOW() function.
TIME_FORMAT
TIME_FORMAT(time,format)
Identical toDATE_FORMAT() except that you can only use the subset of
formats dealing with time (or else you'll returnNULL ).
TIME_TO_SEC
TIME_TO_SEC(time)
Converts the time to seconds and returns the result. For example:
TO_DAYS
TO_DAYS(date)
Returns the number of days since Jan 1 year 0 for the specified date. Does
not take the days lost in the change to the Gregorian calendar into account.
For example:
UNIX_TIMESTAMP
UNIX_TIMESTAMP([date])
Returns the number of seconds since 1970-01-01 00:00:00 GMT, either for
the current moment (with no argument) or for the specified date.
For example:
WEEK
WEEK(date [,week_start])
Returns the week in a given year for the specified date, from 0 to 53. The
week is assumed to start on Sunday, unless the optionalweek_start
argument is set to 1, in which case the week is assumed to start on Monday.
It can also explicitly set to 0 for Sunday starts. The function will return 0
for dates before the first Sunday (or Monday) of the year.
For example:
Use theYEARWEEK() function to roll the week over from the previous year
if the date is before the first Sunday (or Monday) of the year.
WEEKDAY
WEEKDAY(date)
Returns the day of the week for the supplied date as a number from 0 for
Monday to 6 for Sunday.
For example:
YEAR
YEAR(date)
Returns the year for the specified date, from 1000 to 9999.
For example:
mysql> SELECT YEAR('2002-06-30'); +-----------------
---+
| YEAR('2002-06-30') |
+--------------------+
| 2002 |
+--------------------+
YEARWEEK
YEARWEEK(date [,week_start])
Returns a combination of year and week for the specified date. The week is
assumed to start on Sunday, unless the optionalweek_start argument is
set to 1, in which case the week is assumed to start on Monday. It can also
explicitly set to 0 for Sunday starts. The year could be the previous year to
the date for dates before the first Sunday (or Monday) in the year or in the
following year.
For example:
String Functions
String functions mostly take string arguments and return string results.
Unlike most programming languages, the first character of the string is
position 1, not 0.
ASCII
ASCII(string)
Returns the ASCII value of the first (leftmost) character of the string, 0 if
the string is empty, andNULL if the string is null.
For example:
For example:
BIT_LENGTH
BIT_LENGTH(string)
Returns the string length in bits. For example:
CHAR
CHAR(number1[, number2[, ...]])
This function returns the characters that would result if each number were
an integer converted from ASCII code, skipping null values. Decimals are
rounded to the nearest integer value.
For example:
CHAR_LENGTH
Synonym for theLENGTH() function, except that multibyte characters are
only counted once.
CHARACTER_LENGTH
Synonym for theLENGTH() function, except that multibyte characters are
only counted once.
CONCAT
CONCAT(string1[,string2[,...]])
Concatenates the string arguments and returns the resulting string orNULL
if any argument is null. Arguments that are not strings are converted to
strings.
For example:
CONCAT_WS
CONCAT_WS(separator, string1[, string2[, ...]])
Like CONCAT(), but places the separator between each of the strings,
skipping any strings that are NULL.
For example:
CONV
CONV(number,from_base,to_base)
Converts a number from one base to another. Returns the converted number
represented as string, 0 if the conversion cannot be made (the function will
convert as far as it can from the left), andNULL if the number is null. The
number is assumed to be an integer, but it can be passed as a string. It is
assumed to be unsigned unless the to base is a negative number. The bases
can be anything between 2 and 36 (withto_base possibly being negative).
For example:
ELT
ELT(number, string1 [,string2, ...])
Usesnumber as an index to decide which string to return; 1 returns the first
string, 2 the second, and so on. ReturnsNULL if there is no matching string.
For example:
EXPORT_SET
EXPORT_SET(number,on,off[,separator[,number_of_bit
s]])
Examines the number in binary, and for each bit that is set, returnson , and
for each that doesn't, returnsoff . The default separator is a comma, but
you can specify something else. Sixty-four bits is used, but you can change
thenumber_of_bits .
For example:
FIELD
FIELD(string, string1 [, string2 , ...])
Returns the index ofstring in the list following. Ifstring1 matches, the
index will be 1. If it'sstring2 then it will be 2, and so on. It will return 0
if the string is not found.
For example:
FIND_IN_SET
FIND_IN_SET(string,stringlist)
Similar to FIELD() in that it returns an index matching the string, but this
function searches either a string separated by commas or the typeSET . It
will return 1 if the string matches the first substring before the comma (or
the element of the set), 2 if the second substring matches, and so on. It
returns 0 if there is no match. Note that it matches whole commaseparated
substrings, not just any portions of the string.
For example:
HEX
HEX(string or number)
Returns the hexadecimal value (a string representation) of the specified
BIGINT number, 0 if the number cannot be converted (the function will
convert as far as it can from the left), or NULL if it's null.
For example:
INSERT
INSERT(string,position,length,newstring)
Returns string with the length characters starting at position
replaced by newstring.
INSTR
INSTR(string,substring)
Searches the string case insensitively (unless either string is binary) for
the first occurrence ofsubstring and returns the position or returns 0
ifsubstring was not found. The first letter is at position 1.
For example:
LCASE
LCASE(string) Synonym forLOWER() .
LEFT
LEFT(string,length)
Returns the leftmostlength characters from the string. This function is
multibyte safe.
For example:
LENGTH
LENGTH(string)
Returns the length in characters of the string. Converts the argument to a
string if it can.
For example:
LOAD_FILE
LOAD_FILE(file_name)
Reads the file and returns the file contents as a string. The file must be on
the server, you must specify the full pathname to the file, and you must
have the FILE privilege. The file must be readable by all and be smaller
than max_allowed_packet. If the file doesn't exist or can't be read
because of one of the previous reasons, the function returns NULL.
LOCATE
LOCATE(substring, string [,position])
Searches the string case-insensitively (unless either string is binary) for
the first occurrence of substring and returns the position, or returns 0
if substring was not found. If the optional position argument is
supplied, the search starts at that point. The first character is at position 1.
For example:
mysql> SELECT LOCATE('My','MySQL');
+----------------------+
| LOCATE('My','MySQL') |
+----------------------+
|                    1 |
+----------------------+
mysql> SELECT LOCATE('C','Cecilia',2);
+-------------------------+
| LOCATE('C','Cecilia',2) |
+-------------------------+
|                       3 |
+-------------------------+
This is the same as theINSTR() function but with the arguments reversed.
LOWER
LOWER(string)
Returns a string with all characters converted to lowercase (according to the
current character set mapping). The function is multibyte safe.
For example:
LPAD
LPAD(string,length,padding_string)
Left-pads the string with thepadding_string until the result islength
characters long. If the string is longer than the length, it will be shortened
tolength characters.
For example:
LTRIM
LTRIM(string)
Removes leading spaces from the string and returns the result. For example:
MAKE_SET
MAKE_SET(number, string1 [, string2, ...])
Returns a set (string where the elements are comma separated) with the
strings that match the number converted to binary. The first string appears if
bit 0 is set, the second string if bit1 is set, and so on. If the bit argument is
set to 3, then the first two strings are returned because 3 is 11 in binary.
For example:
OCT
OCT(number)
Returns the octal value (a string representation) of the specifiedBIGINT
number, 0 if the number cannot be converted (the function will convert as
far as it can from the left), or NULL if it's null.
For example:
OCTET_LENGTH
Synonym for theLENGTH() function.
ORD
ORD(string)
Returns the ASCII value of the first (leftmost) character of the string, 0 if
the string is empty, andNULL if the string is null. This is the same as the
ASCII function, unless the character is a multibyte character, in which case
the value is calculated as a base 256 number— that is, each byte being
worth 256 times more than the next byte. For example, the formula for a
two-byte character would be as follows: (byte_1_ASCII code * 256) +
(byte_2_ASCII_ code).
For example:
POSITION
POSITION(substring IN string)
A synonym for the two-argument form of LOCATE(substring, string).
For example:
QUOTE
QUOTE(string)
Escapes the single quote ('), backslash (\), ASCII NUL, and Ctrl+Z
characters, and surrounds the string with single quotes so it can be safely
used in an SQL statement. Single quotes are not added if the argument
is NULL.
For example:
mysql> SELECT QUOTE("What's Up?");
+---------------------+
| QUOTE("What's Up?") |
+---------------------+
| 'What\'s Up?'       |
+---------------------+
REPEAT
REPEAT(string,count)
Repeats the string argument count times and returns the result, returns an
empty string if count is not positive, or returnsNULL if either argument in
null.
For example:
REPLACE
REPLACE(string,from_string,to_string)
Replaces all occurrences offrom_str found in thestring withto_str
and returns the result. The function is multibyte safe.
For example:
mysql> SELECT REPLACE('ftp://test.host.co.za','ftp','http');
+-----------------------------------------------+
| REPLACE('ftp://test.host.co.za','ftp','http') |
+-----------------------------------------------+
| http://test.host.co.za                        |
+-----------------------------------------------+
REVERSE
REVERSE(string)
Reverses the order of the characters instring and returns the result. This
function is multibyte safe.
For example:
RIGHT
RIGHT(string,length)
Returns the rightmostlength characters from the string. This function is
multibyte safe.
For example:
RPAD
RPAD(string,length,padding_string)
Right-pads the string with thepadding_string until the result
islength characters long. If the string is longer than the length, it will be
shortened tolength characters.
For example:
RTRIM
RTRIM(string)
Removes trailing spaces from the string and returns the result. For example:
mysql> SELECT CONCAT('a',RTRIM('b '),'c');
+-----------------------------+
| CONCAT('a',RTRIM('b '),'c') |
+-----------------------------+
| abc                         |
+-----------------------------+
SOUNDEX
SOUNDEX(string)
Returns a soundex string, a phonetic representation of the string, so that
words that sound alike produce the same soundex string.
For example:
SPACE
SPACE(number)
Returns a string consisting ofnumber spaces.
For example:
SUBSTRING
SUBSTRING(string, position [,length])
SUBSTRING(string FROM position [FOR length])
Returns a substring of the string argument starting at theposition (which
starts at 1) and optionally with the specifiedlength .
For example:
SUBSTRING_INDEX
SUBSTRING_INDEX(string,delimiter,count)
Returns the substring from the string up untilcount (ifcount is positive)
or beyondcount (ifcount is negative) occurrences ofdelimiter .
The function is multibyte safe.
For example:
For example:
UCASE
UCASE(string) Synonym forUPPER() .
UPPER
UPPER(string)
Returns a string with all characters converted to uppercase (according to the
current character set mapping). The function is multibyte safe.
For example:
Numeric Functions
Numeric functions deal with numbers, mostly taking numeric arguments
and returning numeric results. In the case of an error, they return NULL.
You need to take care not to go beyond the numeric range of a number;
most MySQL functions work with BIGINTs (2^63 signed, or 2^64
unsigned), and if you go beyond that range, MySQL will usually return NULL.
ABS
ABS(number)
Returns the absolute value (positive value) of a number. The function is safe
for use with BIGINT values.
For example:
ACOS
ACOS(number)
Returns the arc cosine of number (the inverse cosine). The number must be
between –1 and 1 or the function returnsNULL .
For example:
ASIN
ASIN(number)
Returns the arc sine of number (the inverse sine). The number must be
between –1 and 1 or the function returnsNULL .
For example:
ATAN2
ATAN2(number1,number2) A synonym
forATAN(number1,number2) .
CEILING
CEILING(number)
Rounds up the number to the nearest integer and returns it as aBIGINT .
For example:
mysql> SELECT CEILING(2.98);
+---------------+
| CEILING(2.98) |
+---------------+
| 3 |
+---------------+
mysql> SELECT CEILING(-2.98);
+----------------+
| CEILING(-2.98) |
+----------------+
|             -2 |
+----------------+
COS
COS(number_radians)
Returns the cosine ofnumber_radians .
For example:
COT
COT(number_radians)
Returns the cotangent ofnumber_radians .
For example:
DEGREES
DEGREES(number)
Converts the number from radians to degrees and returns the result.
For example:
EXP
EXP(number)
Returns the numbere (the base of natural logarithms) raised to the specified
power. For example:
FORMAT
FORMAT(number,decimals)
Formats the number to a format with each three digits separated by a
comma and rounds the result to the specified number of places.
For example:
LEAST
LEAST(argument1, argument2 [, ...])
Returns the smallest of the arguments.
For example:
LN
LN(number) Synonym for theLOG(number) function.
LOG
LOG(number1 [, number2])
Returns the natural logarithm of number1 if there's one argument. You can
also use an arbitrary base by supplying a second argument, in which case
the function returns LOG(number2) / LOG(number1) .
For example:
LOG2
LOG2(number1)
Returns the base 2 logarithm ofnumber1 . This is equivalent
toLOG(number1)/LOG(2) .
For example:
mysql> SELECT LOG2(4);
MOD
MOD(number1,number2)
Returns the modulus ofnumber1 andnumber2 (the remainder
ofnumber1 divided by number2 ). This is the same as the% operator. This
is safe to use withBIGINT s.
For example:
PI
PI()
Returns the value of pi (or at least a close representation). MySQL uses the
full double precision but only returns five characters by default.
For example:
POW
POW(number1,number2) This function is a synonym
forPOWER(number1,number2).
POWER
POWER(number1,number2)
Raisesnumber1 to the power ofnumber2 and returns the value.
For example:
RADIANS
RADIANS(number1)
Converts the number from degrees to radians and returns the result.
For example:
RAND
RAND([number])
Returns a random floating-point number between 0 and 1. Supplying the
optional argument seeds the generator, so the same seed produces the same
sequence of values.
For example:
ROUND
ROUND(number1 [, number2])
Returns the argument number1 , rounded to the nearest integer. You can
supply a second argument to specify the number of decimals to round to
(the default is 0, or no decimals). The rounding behavior for numbers
exactly in the middle is based upon the underlying C library.
For example:
SIGN
SIGN(number)
Returns –1, 0, or 1 depending on whether the argument is negative, zero or
not a number, or positive.
For example:
SIN
SIN(number_radians)
Returns the sine ofnumber_radians .
For example:
SQRT
SQRT(number)
Returns the square root of the argument. For example:
TAN
TAN(number_radians)
Returns the tangent ofnumber_radians .
For example:
TRUNCATE
TRUNCATE(number,decimals)
Truncates (or increases) the number to the specified number of decimal
places.
For example:
mysql> SELECT TRUNCATE(2.4,5);
+-----------------+
| TRUNCATE(2.4,5) |
+-----------------+
|         2.40000 |
+-----------------+
mysql> SELECT TRUNCATE(2.998,0);
+-------------------+
| TRUNCATE(2.998,0) |
+-------------------+
|                 2 |
+-------------------+
mysql> SELECT TRUNCATE(-12.43,1);
+--------------------+
| TRUNCATE(-12.43,1) |
+--------------------+
|              -12.4 |
+--------------------+
Aggregate Functions
Aggregate functions are functions that work on a group of data (meaning
they can be used in aGROUP BY clause). If there is noGROUP BY clause,
they will assume the entire result set is the group and return only one result.
For the following examples, assume a simple table exists, as follows:
AVG
AVG(expression)
Returns the average value of the expressions in the group. Will return 0 if
the expression is not numeric.
For example:
BIT_AND
BIT_AND(expression)
Returns the bitwise AND of all bits in the expressions from the group
(performed with 64-bit precision).
BIT_OR
BIT_OR(expression)
Returns the bitwise OR of all bits in the expressions from the group
(performed with 64-bit precision).
For example:
COUNT
COUNT( [DISTINCT] expression1, [expression2])
Returns the number of non-null values in the group.
If the expression is a field, it returns the number of rows that don't contain
null values in that field. COUNT(*) returns the number of all rows, null or
not. The DISTINCT option returns the number of unique non-null values (or
unique combinations, if more than one expression is used).
MAX
MAX(expression)
Returns the largest value of the expressions in the group. The expression
can be numeric or string.
For example:
mysql> SELECT MAX(field1) FROM table1;
+-------------+
| MAX(field1) |
+-------------+
|          20 |
+-------------+
MIN
MIN(expression)
Returns the smallest value of the expressions in the group. The expression
can be numeric or string.
For example:
STD
STD(expression)
Returns the standard deviation of the values in the expressions from the
group.
For example:
STDDEV
STDDEV(expression) A synonym for theSTD() function.
SUM
SUM(expression)
Returns the sum of the expressions in the group, or NULL if there are no
rows. The expression can be numeric or string.
For example:
Other Functions
The following functions include encryption functions, comparison
functions, control flow functions, and other miscellaneous functions.
AES_DECRYPT
AES_DECRYPT(encrypted_string,key_string)
Decrypts the result of anAES_ENCRYPT() function.
AES_ENCRYPT
AES_ENCRYPT(string,key_string)
Encrypts the string with the AES algorithm, using key_string as the key,
and returns a binary string that can be decrypted with AES_DECRYPT().
BENCHMARK
BENCHMARK(count,expression)
Runs the expression count times. Used mainly for testing to see how
fast MySQL runs an expression. Always returns 0; the time (on the client)
displayed below the function is the useful part of the output.
For example:
CASE
CASE value WHEN [compare_value1] THEN result1
[WHEN [compare_value2] THEN result2 ...] [ELSE
result3] END
CASE WHEN [condition1] THEN result1 [WHEN
[condition2] THEN result2 ...] [ELSE result3] END
There are two forms of the CASE statement. The first returns a result
depending on the value. It compares the value to the various compare_values
and returns the result associated with the matching value (after the THEN),
returns the result after the ELSE if no match is found, or returns NULL if
there is no result to return.
The second evaluates the conditions in turn and returns the result associated
with the first true condition, returns the result after the ELSE if none are
true, or returns NULL if there is no result to return.
For example:
mysql> SELECT CASE 'a' WHEN 'a' THEN 'a it is' END;
+--------------------------------------+
| CASE 'a' WHEN 'a' THEN 'a it is' END |
+--------------------------------------+
| a it is |
+--------------------------------------+
mysql> SELECT CASE 'b' WHEN 'a' THEN 'a it is' WHEN 'b' THEN 'b it is' END;
+--------------------------------------------------------------+
| CASE 'b' WHEN 'a' THEN 'a it is' WHEN 'b' THEN 'b it is' END |
+--------------------------------------------------------------+
| b it is                                                      |
+--------------------------------------------------------------+
mysql> SELECT CASE 9 WHEN 1 THEN 'is 1' WHEN 2 THEN 'is 2' ELSE 'not found' END;
+-------------------------------------------------------------------+
| CASE 9 WHEN 1 THEN 'is 1' WHEN 2 THEN 'is 2' ELSE 'not found' END |
+-------------------------------------------------------------------+
| not found                                                         |
+-------------------------------------------------------------------+
mysql> SELECT CASE 9 WHEN 1 THEN 'is 1' WHEN 2 THEN 'is 2' END;
+----------------------------------------------------+
| CASE 9 WHEN 1 THEN 'is 1' WHEN 2 THEN 'is 2' END   |
+----------------------------------------------------+
| NULL                                               |
+----------------------------------------------------+
mysql> SELECT CASE WHEN 1>2 THEN '1>2' WHEN 2=2 THEN 'is 2' END;
+-----------------------------------------------------+
| CASE WHEN 1>2 THEN '1>2' WHEN 2=2 THEN 'is 2' END   |
+-----------------------------------------------------+
| is 2                                                |
+-----------------------------------------------------+
mysql> SELECT CASE WHEN 1>2 THEN '1>2' WHEN 2<2 THEN '2<2' ELSE 'none' END;
+----------------------------------------------------------------+
| CASE WHEN 1>2 THEN '1>2' WHEN 2<2 THEN '2<2' ELSE 'none' END   |
+----------------------------------------------------------------+
| none                                                           |
+----------------------------------------------------------------+
mysql> SELECT CASE WHEN BINARY 'a' = 'A' THEN 'bin' WHEN 'a'='A' THEN 'text' END;
+---------------------------------------------------------------------+
| CASE WHEN BINARY 'a' = 'A' THEN 'bin' WHEN 'a'='A' THEN 'text' END |
+---------------------------------------------------------------------+
| text                                                                |
+---------------------------------------------------------------------+
mysql> SELECT CASE WHEN BINARY 1=1 THEN '1' WHEN 2=2 THEN '2' END;
+-------------------------------------------------------+
| CASE WHEN BINARY 1=1 THEN '1' WHEN 2=2 THEN '2' END   |
+-------------------------------------------------------+
| 1                                                     |
+-------------------------------------------------------+
The type of the return value (INTEGER ,DOUBLE , orSTRING ) is the same
as the type of the first returned value (the expression after the firstTHEN ).
CAST
CAST(expression AS type)
Converts the expression to the specified type and returns the result. The
type can be one of the following: BINARY, DATE, DATETIME, SIGNED,
SIGNED INTEGER, TIME, UNSIGNED, and UNSIGNED INTEGER.
MySQL usually converts types automatically. For example, if you add two
number strings, the result will be numeric, and if any part of a calculation
is unsigned, the entire result will be unsigned. You can use CAST() to
change this behavior.
For example:
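As an illustration, subtraction that would otherwise wrap around as an unsigned value can be cast back to a signed result (output shown for a 64-bit integer build):
mysql> SELECT CAST(1-2 AS UNSIGNED), CAST(CAST(1-2 AS UNSIGNED) AS SIGNED);
+-----------------------+---------------------------------------+
| CAST(1-2 AS UNSIGNED) | CAST(CAST(1-2 AS UNSIGNED) AS SIGNED) |
+-----------------------+---------------------------------------+
|  18446744073709551615 |                                    -1 |
+-----------------------+---------------------------------------+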
CONNECTION_ID
CONNECTION_ID()
Returns the uniquethread_id of the connection.
For example:
CONVERT
CONVERT(expression,type) This is a synonym
forCAST(expression AS type) , which is the ANSI SQL99 syntax.
DATABASE
DATABASE()
Returns the name of the current database or returns an empty string if there
is none. For example:
DECODE
DECODE(encoded_string,password_string)
Decodes the encoded string using the password string and returns the result.
The decoded string is usually generated by theENCODE() function first.
For example:
DES_DECRYPT
DES_DECRYPT(encrypted_string [, key_string]) Decrypts
a string encrypted withDES_ENCRYPT() .
DES_ENCRYPT
DES_ENCRYPT(string [, (key_number | key_string) ]
)
Uses the Data Encryption Standard (DES) algorithm to encrypt the string
and returns a binary string. If the optional key argument is omitted, the first
key from the des-key file is used. If the key argument is a number (from 0–
9), the corresponding key from the des-key file is used. If the key argument
is a string, that will be the key.
If the key values in the des-key file change, MySQL can read the new
values when you run a FLUSH DES_KEY_FILE statement, which requires
the RELOAD privilege. This function only works if MySQL has been built
with Secure Sockets Layer (SSL) support.
ENCODE
ENCODE(string,password_string)
Returns an encoded binary string. You can use DECODE() , with the same
password_string , to return the original string. The encoded and
decoded strings will be the same length.
For example:
ENCRYPT
ENCRYPT(string [, salt])
Encrypts a string using the Unix crypt() system call and returns the
result. The optional salt argument is a string used in the encryption. Its
specific behavior depends on the underlying system call.
For example:
GET_LOCK
GET_LOCK(string,timeout)
Tries to obtain a lock with the name given by string, waiting up to timeout
seconds. Returns 1 if the lock was obtained, 0 if the attempt timed out, or
NULL if an error occurred. The lock is released with RELEASE_LOCK(), when
you run another GET_LOCK(), or when the connection terminates.
IF
IF(expression1,expression2,expression3)
Returns expression2 if expression1 evaluates to true (that is, not 0 and not
NULL); otherwise it returns expression3.
For example, IF(1<2,'yes','no') returns yes. The next example returns false
because the real number 0.49 is evaluated as the integer 0:
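A sketch of such a call (the result follows the integer evaluation described above; later MySQL versions may treat 0.49 as true):
mysql> SELECT IF(0.49,'true','false');
+-------------------------+
| IF(0.49,'true','false') |
+-------------------------+
| false                   |
+-------------------------+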
IFNULL
IFNULL(expression1,expression2)
Returnsexpression1 if it's not null; otherwise it returnsexpression2
. The result can be numeric or string depending on the context.
For example:
INET_ATON
INET_ATON(dotted_quad_string)
Returns an integer 4- or 8-byte network address from the dotted quad string.
For example:
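For instance (the result is 209*256^3 + 207*256^2 + 224*256 + 40):
mysql> SELECT INET_ATON('209.207.224.40');
+-----------------------------+
| INET_ATON('209.207.224.40') |
+-----------------------------+
|                  3520061480 |
+-----------------------------+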
INET_NTOA
INET_NTOA(network_address)
Returns the dotted-quad string representation of a 4- or 8-byte network
address.
For example:
IS_FREE_LOCK
IS_FREE_LOCK(string)
Used to check whether a lock named string, created withGET_LOCK() , is
free.
Returns 1 if the lock is free, 0 if the lock is held, orNULL on other errors.
For example:
mysql> SELECT IS_FREE_LOCK('one');
+---------------------+
| IS_FREE_LOCK('one') |
+---------------------+
|                   1 |
+---------------------+
LAST_INSERT_ID
LAST_INSERT_ID([expression])
Returns the last value inserted into anAUTO_INCREMENT field from this
connection, or 0 if there have been none.
For example:
MASTER_POS_WAIT
MASTER_POS_WAIT(log_name, log_position)
Used for replication synchronization. Run on the slave, this waits until the
slave has applied all updates up to the specified position in the master's log
file before continuing. For example:
MD5
MD5(string)
Uses the Message Digest algorithm to calculate a 128-bit checksum from
the string and returns the resulting 32-digit hexadecimal number.
For example:
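For instance (the same string always yields the same digest):
mysql> SELECT MD5('test');
+----------------------------------+
| MD5('test')                      |
+----------------------------------+
| 098f6bcd4621d373cade4e832627b4f6 |
+----------------------------------+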
NULLIF
NULLIF(expression1,expression2)
Returns expression1 unless expression1 is equal to expression2, in which
case it returns NULL. This is equivalent to CASE WHEN expression1 =
expression2 THEN NULL ELSE expression1 END, so expression1 is evaluated
twice when the values are not equal.
For example:
PASSWORD
PASSWORD(string)
Converts the string into an encrypted password and returns the result. This
function is used for encrypting passwords in the user table of the mysql
database. It is not reversible, and it uses a different scheme from normal
Unix password encryption.
For example:
RELEASE_LOCK
RELEASE_LOCK(string)
Releases the lock named string, obtained with GET_LOCK(). Returns 1 if the
lock was released, 0 if the lock was not obtained by this thread (in which
case it remains held), and NULL if the named lock does not exist.
For example:
SESSION_USER
SESSION_USER()
Returns the MySQL user and host that are connected with the current
thread. For example:
SHA
SHA(string)
Uses the Secure Hash algorithm to calculate a 160-bit checksum from the
string and returns the resulting 40-digit hexadecimal number. It is considered
cryptographically stronger than the MD5() function.
For example:
SHA1
SHA1(string) A synonym forSHA() .
SYSTEM_USER
SYSTEM_USER() A synonym forSESSION_USER() .
USER
USER() A synonym forSESSION_USER() .
VERSION
VERSION()
Returns the MySQL server version as a string, with-log appended if
logging is enabled.
For example:
mysql_affected_rows
int mysql_affected_rows([resource
mysql_connection])
Returns the number of rows affected by the last statement that made a
change to the data (INSERT, UPDATE, DELETE, LOAD DATA, REPLACE),
or -1 if the query failed. Remember that REPLACE INTO affects two rows
for each replaced row in the original table (one DELETE and one INSERT).
If the database connection is not specified, the most recently opened
connection is used.
mysql_change_user
boolean mysql_change_user(string username, string
password [, string database [, resource
mysql_connection]])
Changes the current MySQL user (the one that logged in) to another user
(specifying that user's username and password). You can also change the
database at the same time or specify a new connection; otherwise the
current connection and database will be used. Returns true if successful and
false if not, in which case the existing user and details are maintained.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
$change_succeeded = mysql_change_user($new_user,
$new_password, $database, $connect);
mysql_client_encoding
int mysql_client_encoding ([resource
mysql_connection])
Returns the default character set (for example, latin1) for the specified
connection, or for the most recently opened connection if none is specified.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
$charset = mysql_client_encoding($connect); print
"The current character set is $charset";
mysql_close
boolean mysql_close([resource mysql_connection])
Closes the specified connection or the most recently opened connection.
Does not close persistent connections. For example:
mysql_connect
mysql_connection mysql_connect([string hostname [,
string username [, string password [, boolean
new_connection [, int client_flags]]]]])
Opens a connection to a MySQL server and returns a connection resource,
or false on failure. The hostname can also take a port (the port follows a
colon after the host name). The final parameter can be one or more of the
following flags, which determine elements of MySQL's behavior when connected:
MYSQL_CLIENT_COMPRESS Uses the compression protocol.
MYSQL_CLIENT_IGNORE_SPACE Allows space after function names.
mysql_data_seek
boolean mysql_data_seek (resource query_result,
int row)
Moves the internal row pointer (0 is the first row) associated with the query
result to a new position. The next row that is retrieved
(frommysql_fetch_row() or
mysql_fetch_array() , for example) will be the specified row.
Returns true if the move succeeded and false if not (usually because the
query result does not have any or as many associated rows).
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username, $password);
// select the database
mysql_select_db("database1", $connect);
// run a query, then move the internal row pointer to the fourth row (rows count from 0)
$result = mysql_query("SELECT field1, field2 FROM table1", $connect);
mysql_data_seek($result, 3);
$row = mysql_fetch_row($result); // $row now holds the fourth row of the result
mysql_db_name
string mysql_db_name (resource query_result, int
row[, mixed unused])
Returns the name of a database. The query result would have been returned
from an earlier call to themysql_list_dbs() function. The row
specifies which element of the query result set (which starts at 0) to return.
mysql_db_query
query_result mysql_db_query ( string database,
string query [, resource mysql_connection])
mysql_drop_db
boolean mysql_drop_db(string database [, resource
mysql_connection])
Drops the specified database for the specified connection or the most
recently opened connection if none is specified. Returns true if successful
and false if the database could not be dropped.
else {
print "Database old_db could not be dropped";
}
mysql_errno
int mysql_errno([resource connection])
Returns the error number of the most recently performed MySQL function,
or zero if there was no error. Uses the specified connection (or the most
recently opened connection if none is specified).
This function will return zero after any successful MySQL-related function
has been executed, except formysql_error() andmysql_errno() ,
which leave the value unchanged.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// attempt to use a database that you've just
dropped mysql_select_db("old_db", $connect);
// Displays the error code - 1049
if (mysql_errno()) {
print "MySQL has thrown the following error:
".mysql_errno(); }
mysql_error
string mysql_error([resource mysql_connection])
Returns the text of the error message of the most recently performed
MySQL function, or an empty string ('') if there was no error. Uses the
specified connection (or the most recently opened connection if none is
specified).
This function will return an empty string after any successful MySQL-
related function has been executed, except formysql_error()
andmysql_errno() , which leave the value unchanged.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password); // attempt to use a database that
you've just dropped mysql_select_db("old_db",
$connect);
// Displays the error text - Unknown database
'old_db' if (mysql_errno()) {
print "MySQL has thrown the following error:
".mysql_error(); }
mysql_escape_string
string mysql_escape_string (string stringname)
Returns a string with all characters that could break the query escaped (with
a backslash placed before them). The characters that are escaped include
null (\x00), new line (\n), carriage return (\r), backslash (\), single quote ('),
double quote ("), and Ctrl+Z (\x1A).
This makes a query safe to use. Anytime user input is used in a query, this
function should be used to make the query safe. You can also use the
slightly less complete addslashes() function.
For example:
// original unsafe string
$field_value = "Isn't it true that the case may be";
// escape the special characters
$field_value = mysql_escape_string($field_value);
// now it's safe and displays: Isn\'t it true that the case may be
print "$field_value";
mysql_fetch_array
array mysql_fetch_array (resource query_result [,
int array_type])
Returns an array of strings based upon a row from a query result returned
from a function such asmysql_query() , and returns false if it fails or
there are no more rows available. The row returned is based upon the
position of the internal row pointer, which is then incremented by one. (The
row pointer starts at 0 immediately after a query is run.)
The second parameter specifies how the data is to be returned. If the array
type is set to MYSQL_ASSOC, data is returned as an associative array (the
same as if you'd used the mysql_fetch_assoc() function). If the array
type is set to MYSQL_NUM, the data is returned as a numeric array (the same
as if you'd used the mysql_fetch_row() function). The third option,
MYSQL_BOTH, is the default used if no option is specified, and it allows you
to access the data as an associative or numeric array.
The associative array takes as a key the names of the fields only (dropping
any table name prefix). If there are duplicate field names, you need to use
an alias; otherwise the last-mentioned value will overwrite the earlier one.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username, $password);
// select the database
mysql_select_db("database1", $connect);
// set and run the query
$sql = "SELECT field1,field2 FROM table1";
$result = mysql_query($sql, $connect);
// fetch each row as an associative array and display a field from it
while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
    print $row['field1']."<br>\n";
}
mysql_fetch_assoc
array mysql_fetch_assoc (resource query_result)
Returns an array of strings based upon a row from a query result returned
from a function such asmysql_query() , and returns false if it fails or
there are no more rows available. The row returned is based upon the
position of the internal row pointer, which is then incremented by one. (The
row pointer starts at 0 immediately after a query is run.)
The data is returned as an associative array, which takes as a key the names
of the fields only (dropping any table name prefix). If there are duplicate
field names, you need to use an alias; otherwise the last-mentioned value
will overwrite the earlier one. This function is the same as
usingmysql_fetch_array() with theMYSQL_ASSOC parameter.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
mysql_fetch_field
object mysql_fetch_field(resource query_result [,
int offset ])
$max_length = $row->max_length;
$name = $row->name;
$type = $row->type;
print "Name:$name <br>\n";
print "Type:$type <br>\n";
print "Maximum Length:$max_length <br><br>\n\n";
mysql_fetch_lengths
array mysql_fetch_lengths(resource query_result)
Returns an array of the lengths of each field in the last row fetched from a
query result (the length of that result, not the maximum length), and returns
false if it wasn't successful. Use themysql_field_len() function to
return maximum length for a field.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
// set and run the query
$sql = "SELECT field1,field2 FROM table1"; $result
= mysql_query($sql, $connect);
mysql_fetch_object
object mysql_fetch_object(resource query_result)
Returns an object with properties based upon a row from a query result
returned from a function such asmysql_query() . The row returned is
based upon the position of the internal row pointer, which is then
incremented by one. (The row pointer starts at 0 immediately after a query
is run.)
Each property of the object is based upon a name (or alias) of the field from
the query. For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_fetch_row
array mysql_fetch_row(resource query_result)
Returns an array of strings based upon a row from a query result returned
from a function such asmysql_query() , or returns false if it fails or
there are no more rows available. The row returned is based upon the
position of the internal row pointer, which is then incremented by one. (The
row pointer starts at 0 immediately after a query is run.)
The data is returned as a numeric array (the same as if you had used the
mysql_fetch_array() function with theMYSQL_NUM parameter).
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_field_flags
string mysql_field_flags(resource query_result,
int offset)
Returns a string containing flags of the specified field based upon a row
from a query result returned from a function such asmysql_query() .
The offset determines which field is examined (0 for the first field).
mysql_field_len
int mysql_field_len (resource query_result, int
offset)
Returns the maximum length (determined by the database structure) of the
specified field based upon a row from a query result returned from a
function such asmysql_query() . The offset (starting at 0) determines
the field.
mysql_field_name
string mysql_field_name(resource query_result, int
offset)
Returns the name of the specified field based upon a row from a query
result returned from a function such asmysql_query() . The offset
(starting at 0) determines the field.
For example:
mysql_field_seek
boolean mysql_field_seek(resource query_result,
int offset)
Moves the internal pointer to a new field of the query result, based on the
offset (starting at 0 with the first field). The next call to
themysql_fetch_field() function will start at this offset. This is not
that useful because you can move the pointer directly with themysql_
fetch_field() function.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_field_table
string mysql_field_table(resource query_result,
int offset)
Returns the name of the table the field in a query result determined by the
offset (starting at 0) refers to, or returns false if there's an error. The
deprecatedmysql_fieldtable() function is identical.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_field_type
string mysql_field_type(resource query_result, int
offset)
for($i=0;$i<mysql_num_fields($result);$i++) {
echo "Field $i is of type:
".mysql_field_type($result, $i) . "<br>\n";
}
mysql_free_result
boolean mysql_free_result(resource query_result)
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_get_client_info
string mysql_get_client_info()
Returns a string containing the MySQL client library version (4.0.2, for
example).
For example:
// displays - Client library version is: 4.0.2
(for example) print "Client library version is:
".mysql_get_client_info();
mysql_get_host_info
string mysql_get_host_info([resource
mysql_connection])
mysql_get_proto_info
int mysql_get_proto_info([resource
mysql_connection])
Returns an integer containing the protocol version (for example, 10) used
by the connection. The information is from the specified connection (or the
most recently opened connection if none is specified).
For example:
// displays - Protocol version: 10 (for example)
print "Protocol version: ".mysql_get_proto_info();
mysql_get_server_info
string mysql_get_server_info([resource
mysql_connection])
Returns a string containing the MySQL server version (for example, 4.0.3).
The information is from the specified connection (or the most recently
opened connection if none is specified).
For example:
// displays - Server version: 4.0.3-beta-log (for
example) print "Server version:
".mysql_get_server_info();
mysql_info
string mysql_info ( [resource mysql_connection])
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
// displays:
// Query info: String format: Rows matched: 19
Changed: 19 Warnings: 0 //(for example)
print "Query info: ".mysql_info();
mysql_insert_id
int mysql_insert_id([resource mysql_connection])
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_list_dbs
query_result mysql_list_dbs([resource
mysql_connection])
mysql_list_fields
query_result mysql_list_fields(string database,
string table [, resource mysql_connection])
$max_length = $row->max_length;
$name = $row->name;
$type = $row->type;
print "Name:$name <br>\n";
print "Type:$type <br>\n";
print "Maximum Length:$max_length <br><br>\n\n";
mysql_list_processes
query_result mysql_list_processes ([resource
mysql_connection])
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// return all the processes
$result = mysql_list_processes($connect);
mysql_list_tables
query_result mysql_list_tables(string database[,
resource mysql_connection])
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// returns the list of tables
$result = mysql_list_tables("database1");
// loop through the rows of tables, and display the names
for ($i = 0; $i < mysql_num_rows($result); $i++) {
    print "Table name: ".mysql_tablename($result, $i)."<br>\n";
}
mysql_num_fields
int mysql_num_fields(resource query_result)
Returns an integer containing the number of fields in a query result orNULL
on error.
The deprecatedmysql_numfields() function is identical.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// return a list of all fields in database1.table1
$result = mysql_list_fields("database1",
"table1");
// Displays: Num fields in database1: 6 (for
example) print "Num fields in database1:
".mysql_num_fields($result);
mysql_num_rows
int mysql_num_rows(resource query_result)
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// return the list of databases on the connection
$result = mysql_list_dbs($connect);
// loop through the results, returning the database names one by one
for ($i = 0; $i < mysql_num_rows($result); $i++) {
    print mysql_db_name($result, $i) . "<br>\n";
}
mysql_pconnect
mysql_connection mysql_pconnect([string hostname
[, string username [, string password [, int
client_flags]]]])
For example:
mysql_ping
boolean mysql_ping ([resource mysql_connection])
Returns true if the MySQL server is up and false if not. The ping is
attempted using the specified connection (or the most recently opened
connection if none is specified). If the ping fails, the script will try to
reconnect with the same parameters.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password); // time passes...
if (mysql_ping()) {
print "Still connected";
}
else {
print "Connection lost";
}
mysql_query
query_result mysql_query(string query[, resource
mysql_connection [, int result_mode]])
Returns a query result (if the query was one that produces a result, such as
SELECT or DESCRIBE), returns true if the query did not produce a result
but succeeded (such as DELETE or UPDATE), and returns false if the query
failed. The query is sent to the specified database using the specified
connection (or the most recently opened connection if none is specified).
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_real_escape_string
string mysql_real_escape_string (string stringname
[, resource mysql_connection])
Returns a string with all characters that could break the query escaped (a
backslash is placed before them). The characters that are escaped include
null (\x00), new line (\n), carriage return (\r), backslash (\), single quote ('),
double quote ("), and Ctrl+Z (\x1A). Unlike mysql_escape_string(), it takes
the character set of the connection into account. It does not escape
percentage (%) and underscore (_) characters.
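A minimal sketch (the connection variables and the table and field names are illustrative):
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username, $password);
// escape a user-supplied value before using it in a query
$field_value = mysql_real_escape_string($user_input, $connect);
$sql = "SELECT field1 FROM table1 WHERE field2 = '$field_value'";
$result = mysql_query($sql, $connect);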
mysql_result
mixed mysql_result(resource query_result, int row
[, mixed field_specifier])
This function is markedly slower than the functions that return the entire
row, such as mysql_fetch_row() andmysql_fetch_array() , so
use one of those instead. Also, don't mix this function with functions that
return the entire row.
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
mysql_select_db
boolean mysql_select_db(string database [,
resource mysql_connection])
Changes the current database to the specified database. Uses the specified
connection (or the most recently opened connection if none is specified). If
no connection is open, it will attempt to call mysql_connect() with no
parameters to connect. Returns true if successful, and false if not.
mysql_stat
string mysql_stat ([resource mysql_connection])
Returns a string containing the server status. This contains uptime, threads,
questions, slow queries, opens, flush tables, open tables, and queries per
second average. Uses the specified connection (or the most recently opened
open connection if none is specified).
For example:
mysql_tablename
string mysql_tablename(resource query_result, int
row)
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// returns the list of tables
$result = mysql_list_tables("database1");
// loops through the rows of tables, and displays
the names for($i=0; $i < mysql_num_rows($result);
$i++) {
print "Table name: ".mysql_tablename($result,
$i)."<br>\n"; }
mysql_thread_id
int mysql_thread_id ([resource mysql_connection])
Returns an integer containing the current thread ID. For example:
// displays - Thread id: 2394 (for example) print
"Thread id: ".mysql_thread_id();
mysql_unbuffered_query
query_result mysql_unbuffered_query(string query
[, resource mysql_connection [, int result_mode]])
Returns an unbuffered query result (if the query was one that produces a
result, such as SELECT orDESCRIBE ), returns true if the query did not
produce a result but succeeded (such asDELETE orUPDATE ), and returns
false if the query failed. The query is sent using the specified connection (or
the most recently opened open connection if none is specified).
For example:
// open a persistent connection to the database
$connect = mysql_pconnect($hostname, $username,
$password);
// select the database
mysql_select_db("database1", $connect);
To install Perl DBI support for MySQL, you need the DBI, DBD-mysql,
Data-Dumper, and File-Spec modules. You can download the latest
versions from
www.mysql.com/CPAN . Full installation instructions come with the
software.
Throughout this appendix, you'll see the following conventions for variable
names:
$dbh A database handle object, returned from the connect() or connect_cached() methods
$sth A statement handle object, returned from a prepare() method among others
$drh A driver handle object (rarely used in applications)
$h A database, statement, or driver handle
$rc A Boolean return code (true for success or false for an error)
$rv A return value of sorts, usually an integer
@ary An array of values, usually a row of data returned from a query
$rows The number of rows processed, or -1 if unknown
$fh A file handle
undef A NULL, or undefined, value
\%attributes A reference to a hash of attribute values, used for various purposes by the methods
To use the DBI, you need to load the DBI module at the beginning of your
script, as follows:
use DBI;
Then, you need to return a database handle, usually with the connect()
DBI class method. The database handle then accesses methods that can run
queries and return results, usually returning a statement handle.
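As a minimal sketch of this flow (the data source, user, and table details are illustrative):
use DBI;

# connect to the firstdb database on localhost
my $dbh = DBI->connect("DBI:mysql:database=firstdb;host=localhost",
    "guru2b", "g00r002b", { RaiseError => 1 });

# prepare and run a query, then fetch the rows one at a time
my $sth = $dbh->prepare("SELECT first_name, surname FROM customer");
$sth->execute();
while (my @row = $sth->fetchrow_array) {
    print "$row[0] $row[1]\n";
}
$sth->finish();
$dbh->disconnect();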
available_drivers
@ary = DBI->available_drivers[($quiet)];
Returns a list of available drivers (DBD modules). This gives a warning if
there are drivers with the same name; setting the optional $quiet argument
to true suppresses the warning.
connect
$dbh = DBI->connect($datasource, $username,
$password [, \%attributes]);
mysql_read_default_file=filename
The specified file is used as an option file (MySQL configuration file,
usuallymy.ini or my.cnf on the server).
mysql_read_default_group=groupname
When reading an option file, the default group to use is[client]. This
changes the group to[groupname] .
The following option causes communication between the client and server
to be compressed:
mysql_compression=1
The following option specifies the path to the Unix socket used to connect
to the server: mysql_socket=/path/to/socket
The optional username and password will take the values of theDBI_USER
andDBI_PASS environment variables if not specified.
If the connection fails, it returns undef and sets $DBI::err
and $DBI::errstr.
You can use the\%attributes parameter to set the various attributes,
such as
AutoCommit (recommended),RaiseError , andPrintError .
For example:
connect_cached
$dbh = DBI->connect_cached($data_source,
$username, $password[ , \%attributes])
Just like connect() , except that the details of the database handle are
also stored in a hash array. This same database handle is used for further
identical calls to connect_cached() if it's still valid.
TheCachedKids attribute (accessed via$dbh-> {Driver}->
{CachedKids} ) contains cache data.
This is still a fairly new method and is likely to change. It is not the same as
an Apache::DBI persistent connection.
data_sources
@ary = DBI->data_sources($driver [,
\%attributes]); Returns an array of all databases available to the
named driver (mysql in this case).
trace
trace($trace_level [, $trace_filename])
The trace level can be from 0 to 9, with 0 disabling tracing, 1 best for a
general overview, 2 being the most commonly used, and the other methods
adding more and more driver and DBI detail.
err
$rv = $h->err;
Returns the native error code from the last method (usually an integer).
errstr
$error_string = $dbh->errstr;
Returns an error string from the failure of the previous call.
For example:
my $hostname = 'localhost';
my $database = 'firstdb';
my $username = 'guru2b';
my $password = 'g00r002b';
set_err
$rv = $h->set_err($err, $errstr [, $state, $method
[, $rv]]);
A new method mainly used by DBI drivers and subclasses. It sets the err
,errstr , and state values for the handle (to enable error handling
throughRaiseError and so on). $method sets a more useful method
name for the error string, and$rv sets a return value (usuallyundef ).
For example:
sub doodle {
# … try to 'doodle'
or return $sth->set_err(1234, "Nope. Sorry. Out of
luck. It all–
state
$rv = $h->state;
Returns an error code inSQLSTATE format. Usually returns the
generalS1000 code when the driver does not supportSQLSTATE .
trace
trace($trace_level [, $trace_filename]) See the
earliertrace method.
trace_msg
$h->trace_msg($message_text [, $minimum_level]);
If tracing is enabled, outputs the message text to the trace file. If the
minimum level is set (default 1), this only outputs the message if the trace
level is at least that level.
hash
$hash_value = DBI::hash($buffer [, $type]);
looks_like_number
@bool = DBI::looks_like_number(@array);
Returns a Boolean array, with true for each element of the original array
that looks like a number, false for each element that does not, andundef
for elements that are empty or not defined.
neat
$neat_string = DBI::neat($value [, $maxlen]);
Formats and neatens the value and quotes string for display purposes, not
for passing to the database server. If the maximum length is exceeded, the
string will be shortened to $maxlen - 4 and an ellipsis (...) will be
added to the end. If $maxlen is not specified, then $DBI::neat_maxlen,
which defaults to 400, will be used.
neat_list
$neat_string = DBI::neat_list(\@listref [, $maxlen
[, $field_sep]]);
Calls theneat() function for each element of the list and returns a string
with all the elements separated by$field_sep , which defaults to a
comma (, ).
begin_work
$rc = $dbh->begin_work or die $dbh->errstr;
Begins a transaction and turnsAutoCommit off until the transaction ends
withcommit() orrollback() .
column_info
$sth = $dbh-
>column_info($catalog,$schema,$table,$column);
An experimental method that returns an active statement handle for getting
information about columns.
commit
$rc = $dbh->commit;
Commits the current transaction. TheAutoCommit parameter needs to be
off for this to have any effect.
disconnect
$rc = $dbh->disconnect;
Uses the specified database handle to disconnect from the database. Returns
true if successful or false if not.
The method does not define whether to roll back or commit currently open
transactions, so make sure you specifically commit or roll them back in
your applications before calling disconnect .
do
$rv = $dbh->do($statement [,\%attributes
[,@bind_values]]);
Prepares and executes a single statement, typically one that does not return
rows (such as INSERT, UPDATE, or DELETE). Returns the number of rows
affected, -1 if the number is unknown, or undef if an error occurred.
foreign_key_info
$sth = $dbh->foreign_key_info($pk_catalog,
$pk_schema, $pk_table– [, $fk_catalog, $fk_schema,
$fk_table]);
The returned result depends on which tables are supplied. If only the
foreign key table is supplied (by passingundef as the primary key
argument), the results contain all foreign keys in that table and the
associated primary keys. If only the primary key table is supplied, the
results contain the primary key from that table and all associated foreign
keys. If both tables are supplied, the results contain the foreign key from the
foreign key table that refers to the primary key of the primary key table.
get_info
$value = $dbh->get_info( $info_type ); An experimental
method returning implementation information.
ping
$rc = $dbh->ping; Checks if the database is still running and the
connection is active.
prepare
$sth = $dbh->prepare($statement [, \%attributes])
prepare_cached
$sth = $dbh->prepare_cached($statement [,
\%attributes,[ $allow_active]]);
The same as prepare , except that the statement handle is stored in a hash
so that future calls to identical arguments will return the same handle.
The$allow_active argument has three settings. The default, 0,
generates a warning and callsfinish() on the statement handle before it
is returned, and 1 callsfinish() but suppresses the warning. If set to 2,
the DBI will not callfinish() before returning the statement.
primary_key
@key_column_names = $dbh->primary_key($catalog,
$schema, $table);
An experimental interface to theprimary_key_info method that
returns an array of fieldnames that make up the primary key, in sequence,
for the specified table.
primary_key_info
$sth = $dbh->primary_key_info($catalog, $schema,
$table); An experimental method for getting information about primary
key columns.
quote
$quoted_string = $dbh->quote($string
[,$data_type])
quote_identifier
$sql = $dbh->quote_identifier( $name1[ , $name2,
$name3, \%attributes ]); Escapes any special characters in an
identifier (such as a fieldname) for use in a query.
rollback
$rc = $dbh->rollback;
Rolls back the current transaction. TheAutoCommit parameter needs to be
off for this to have any effect.
selectall_arrayref
$ary_ref = $dbh->selectall_arrayref($statement [,
\%attributes [,@bind_values]]);
selectall_hashref
$hash_ref = $dbh->selectall_hashref($statement,
$key_field– [, \%attributes [,@bind_values]]);
selectcol_arrayref
$ary_ref = $dbh->selectcol_arrayref($statement [,
\%attributes [,@bind_values]]);
For example:
# perform a query and return the two columns
my $array_ref = $dbh->selectcol_arrayref("SELECT
first_name, surname–
selectrow_array
@row_ary = $dbh->selectrow_array($statement [,
\%attributes [,@bind_values]]);
selectrow_arrayref
$ary_ref = $dbh->selectrow_arrayref($statement [,
\%attributes [,@bind_values]]);
selectrow_hashref
$hash_ref = $dbh->selectrow_hashref($statement [,
\%attributes [,@bind_values]]);
A method that combines the prepare() ,execute() ,
andfetchrow_hashref() methods into one for ease of use. It returns
the first row of data returned from the query. The statement can also be a
statement handle that has already been prepared, in which case the method
does not do aprepare() .
table_info
$sth = $dbh->table_info($catalog, $schema, $table,
$type [, \%attributes]);
An experimental method that returns an active statement handle for getting
information about tables and views from the database.
tables
@names = $dbh->tables($catalog, $schema, $table,
$type);
An experimental interface to thetable_info() method that returns an
array of table names.
type_info
@type_info = $dbh->type_info($data_type);
An experimental method that returns an array of hash references containing
information about data type variants.
type_info_all
$type_info_all = $dbh->type_info_all;
An experimental method that returns a reference to an array containing
information about data types supported by the database and driver.
bind_col
$rc = $sth->bind_col($column_number,
\$column_variable);
Binds a field (starting at 1) in the result from aSELECT statement to a
variable. Seebind_ columns for more information.
bind_columns
$rc = $sth-
>bind_columns(@list_of_refs_to_vars_to_bind);
Calls thebind_col() method for each field from theSELECT statement.
The number of references must match the number of fields.
For example:
bind_param
$rv = $sth->bind_param($bind_num, $bind_value
[\%attributes | $bind_type]);
Used to bind a value with a placeholder, indicated with a question mark ( ?
). A placeholder is used where you're planning to run a similar query
multiple times, where only a parameter changes each time.
For example:
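A sketch with a single placeholder (the table and column names are illustrative):
my $sth = $dbh->prepare("SELECT first_name, surname FROM customer WHERE id = ?");
$sth->bind_param(1, $id);
$sth->execute();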
You can also use the optional bind type parameter to indicate what type the
placeholder should have. For example:
$sth->bind_param(1, $bind_value, {TYPE =>
SQL_INTEGER});
or the equivalent shortcut (which requires you to import DBI with
use DBI qw(:sql_types)):
$sth->bind_param(1, $bind_value, SQL_INTEGER);
Alternatively, you can use the\%attributes parameter, as follows:
$sth->bind_param(1, $bind_value, {TYPE =>
SQL_INTEGER});
This returns an integer.
For example:
my $hostname = 'localhost';
my $database = 'firstdb';
my $username = 'guru2b';
my $password = 'g00r002b';
}
}
$sth->finish();
bind_param_array
$rc = $sth->bind_param_array($p_num,
$array_ref_or_value [, \%attributes | $bind_type])
Used to bind an array to a placeholder set in the prepared statement, ready
for execution with theexecute_array() method.
For example:
bind_param_inout
$rv = $sth->bind_param_inout($p_num, \$bind_value,
$max_len [, \%attributes | $bind_type]) or ...
The same as thebind_param() method, but you can update values (for
stored procedures). MySQL does not currently support this.
dump_results
$rows = $sth->dump_results($max_len, $lsep, $fsep,
$fh);
Outputs all rows from the statement handle to$fh (defaultSTDOUT ) after
calling DBI::neat_ list for each row.$lsep is the row separator
(with a default of\n ), $fsep the field separator, with a default of comma
(, ), and the$max_len defaults to35 .
execute
$rv = $sth->execute([@bind_values]);
Executes a prepared statement and returns the number of rows affected (for
a query that doesn't return data, such asINSERT orUPDATE ). Returns0E0
(treated as true) if no rows are affected orundef if an error occurs. Use one
of the fetch methods to process the data.
my $hostname = 'localhost'
my $database = 'firstdb';
my $username = 'guru2b';
my $password = 'g00r002b';
execute_array
$rv = $sth->execute_array(\%attributes[,
@bind_values]);
Executes a prepared statement for each parameter set
withbind_param_array() or in @bind_ values and returns the
total number of rows affected.
fetch
An alias forfetchrow_arrayref() .
fetchall_arrayref
$table = $sth->fetchall_arrayref([$slice [, $max_rows]]);
Returns all rows returned from the query as a reference to an array
containing one reference per row.
If no rows are returned, it returns a reference to an empty array. If an error
occurs, it returns the data fetched until the error, if any.
Some examples will make this clearer. The first two examples will return
references to an array of array references. First, to return just the second
field of every row, use the following:
$tbl_ary_ref = $sth->fetchall_arrayref([1]);
To return the third last and last field of every row, use the following:
$tbl_ary_ref = $sth->fetchall_arrayref([-3,-1]);
The next two examples return a reference to an array of hash references.
First, to fetch all fields of all rows as a hash ref, use this:
$tbl_ary_ref = $sth->fetchall_arrayref({});
To fetch only the fields called fname and sname of each row as a hash ref,
with the keys named as FNAME and sname, use the following:
$tbl_ary_ref = $sth->fetchall_arrayref({ FNAME=>1,
sname=>1 });
fetchall_hashref
$hash_ref = $sth->fetchall_hashref($key_field);
Returns a reference to a hash that contains one entry per row at most. If the
query returns no rows, the method returns a reference to an empty hash. If
an error occurs, it returns the data fetched until the error, if any.
The $key_field parameter specifies the fieldname that holds the value
to be used for the key for the returned hash, or it can be a number
corresponding to a field (note that this starts at 1, not 0). The method
returns an error if the key does not match a field, either as a name or a
number.
You'd normally only use this when the key field value for each row is
unique; otherwise the values for the second and subsequent rows overwrite
earlier ones of the same key. For example:
$dbh->{FetchHashKeyName} = 'NAME_lc';
$sth = $dbh->prepare("SELECT id, fname, sname FROM
tname"); $hash_ref = $sth->fetchall_hashref('id');
print "The surname for id 8: $hash_ref->{8}->
{sname}";.
fetchrow_array
@row = $sth->fetchrow_array;
Returns an array of field values from the next row of data, using a
previously prepared statement handle. The elements of the row can then be
accessed as$row[0] ,$row[1] , and so on. This moves the row pointer
so that the next call to this method will return the following row.
fetchrow_arrayref
$row_ref = $sth->fetchrow_arrayref
Returns a reference to an array of field values from the next row of data,
using a previously prepared statement handle. The elements of the row can
then be accessed as$row_ ref->[0] ,
$row_ref->[1] , and so on.
This moves the row pointer so that the next call to this method will return
the following row.
fetchrow_hashref
$hash_ref = $sth->fetchrow_hashref[($name)];
Returns a reference to a hash table with the fieldname as key and field
contents as value, using a previously prepared statement handle. The
elements of the row can then be accessed as$hash_ref->
{fieldname1} ,$hash_ref->{fieldname2} , and so on.
This moves the row pointer so that the next call to this method will return
the following row.
The optionalname parameter specifies the name the attributes will be given.
It defaults to NAME , thoughNAME_uc orNAME_lc (uppercase or
lowercase) is suggested for portability. Thefetchrow_arrayref
andfetchrow_array methods are markedly quicker.
finish
$rc = $sth->finish;
Frees system resources associated with a statement handle, indicating that
no more data will be returned from it.
Returns true if successful or false if not.
rows
$rv = $sth->rows;
Returns the number of rows changed by the last SQL statement (after
anUPDATE or INSERT statement, for example) or –1 if the number is
unknown.
Active
Active (boolean, read-only)
True if the handle object is active (connected to a database for a database
handle or with more data to fetch for a statement handle).
ActiveKids
ActiveKids (integer, read-only)
The number of currently active database handles (for a driver handle) or the
number of currently active statement handles (for a database handle). Active
means connected to a database for a database handle or with more data to
fetch for a statement handle.
ChopBlanks
ChopBlanks (boolean, inherited)
Specifies whether the various fetch methods will chop leading and trailing
blanks fromCHAR fields. Set to true if they will and false (the default) if
not.
FetchHashKeyName
FetchHashKeyName (string, inherited)
HandleError
HandleError (code ref, inherited)
Allows you to create your own way to handle errors. You can set it to a
reference to a subroutine, which will be called when an error is detected
(whenRaiseError and PrintError would be called). If the subroutine
returns false, thenRaiseError and PrintError are checked, as
normal. The subroutine is called with three parameters, the error message
string (the same asRaiseError andPrintError would use), the DBI
handle, and the first value returned by the failed method.
InactiveDestroy
InactiveDestroy (boolean)
Designed for Unix applications that fork child processes. False (the default)
indicates that a handle is destroyed automatically when it passes out of
scope. True indicates that the handle is not automatically destroyed.
Kids
Kids (integer, read-only)
Contains the number of current database handles (for a driver handle) or the
number of current statement handles for a database handle.
LongReadLen
LongReadLen (unsigned integer, inherited)
LongTruncOK
LongTruncOk (boolean, inherited)
False (the default) indicates that attempting to fetch long values longer
thanLongReadLen will cause the fetch to fail. True will return a truncated
value.
PrintError
PrintError (boolean, inherited)
When set to true (the default), errors in a method generate a warning.
Usually set to false whenRaiseError is set to true.
private_*
private_*
You can store extra information of your own as a private attribute in a DBI
handle by specifying a name beginning withprivate_ . You should name
it in the manner private_your_ module_name_description ,
and use just one attribute.
Because of the way the Perl tie mechanism works, you cannot reliably use
the ||= operator directly to initialize the attribute.
To initialize, use a two-step approach like this:
my $descriptive_name = $dbh->{private_your_module_name_descriptive_name};
$descriptive_name ||= $dbh->{private_your_module_name_descriptive_name} = { ... };
You cannot use the following as you may expect:
my $descriptive_name = $dbh->{private_your_module_name_descriptive_name} ||= { ... };
Profile
Profile (inherited)
Allows method call timing statistics to be reported. See
theDBI::Profile documentation for more details.
RaiseError
RaiseError (boolean, inherited)
False by default; when set to true this will cause errors to raise exceptions
rather than just returning error codes. UsuallyPrintError is set to false
whenRaiseError is true.
ShowErrorStatement
ShowErrorStatement (boolean, inherited)
When true, appends the statement text to the error messages from
RaiseError and PrintError . It applies to statement handle errors
andprepare() ,do() , and select database handle methods.
Taint
Taint (boolean, inherited)
If set to true, and Perl is running in taint mode ( –T ), all data from the
database is treated as being tainted as are arguments to most DBI methods.
It defaults to false. More data may be treated as tainted at a later stage, so
useTaint carefully!
TraceLevel
TraceLevel (integer, inherited)
Setting this is an alternative to using the trace() method to set the DBI
trace level. The trace level can be from 0 to 9, with 0 disabling tracing, 1
best for a general overview, 2 being the most commonly used, and the other
methods adding more and more driver and DBI detail.
Warn
Warn (boolean, inherited) True by default, enables warnings.
You can use$SIG{__WARN__} to catch warnings.
AutoCommit
AutoCommit (boolean)
If set to true, then SQL statements are automatically committed. If false,
they are part of a transaction by default and need to be committed or rolled
back.
Driver
Driver (handle)
Contains the handle of the parent driver.
For example:
$dbh->{Driver}->{Name}
Name
Name (string)
The database name.
RowCacheSize
RowCacheSize (integer)
The size the application would like the local row cache to be or undef if
the row cache is not implemented. Setting it to a negative number specifies
memory size to be used for caching, 0 has the size automatically
determined, 1 disables the cache, and a larger positive number is the size of
the cache in rows.
Statement
Statement (string, read-only)
The most recent SQL statement passed toprepare() .
CursorName
CursorName (string, read-only)
The name of the cursor associated with the statement handle orundef if it
cannot be obtained.
NAME
NAME (array-ref, read-only)
A reference to an array of fieldnames. Names can be uppercase, lowercase,
or mixed case, depending on the driver, so use NAME_lc or NAME_uc instead
for portability across systems.
For example, to display the second column, use the following:
$sth = $dbh->prepare("select * from customer");
$sth->execute;
@row = $sth->fetchrow_array;
print "Column 2: $sth->{NAME}->[1]";
NAME_hash
NAME_hash (hash-ref, read-only)
For example:
NAME_lc
NAME_lc (array-ref, read-only) The same asNAME but only
returns lowercase names.
NAME_lc_hash
NAME_lc_hash (hash-ref, read-only) The same
asNAME_hash but only returns lowercase names.
NAME_uc
NAME_uc (array-ref, read-only) The same as NAME but only
returns uppercase names.
NAME_uc_hash
NAME_uc_hash (hash-ref, read-only) The same
asNAME_hash but only returns uppercase names.
NULLABLE
NULLABLE (array-ref, read-only)
A reference to an array of whether the field can contain nulls. Will contain 0
for no, 1 for yes, and 2 for unknown.
For example:
print "Field 1 can contain a NULL" if $sth->
{NULLABLE}->[0];
NUM_OF_FIELDS
NUM_OF_FIELDS (integer, read-only)
The number of fields that the prepared statement will return. It will be 0 for
statements that do not return fields (INSERT ,UPDATE , and so on).
NUM_OF_PARAMS
NUM_OF_PARAMS (integer, read-only) The number of
placeholders in the prepared statement.
ParamValues
ParamValues (hash ref, read-only) A reference to a hash
containing the values bound to placeholders orundef if not available.
PRECISION
PRECISION (array-ref, read-only)
RowsInCache
RowsInCache (integer, read-only)
The number of unfetched rows in the cache orundef if the driver does not
support a local row cache.
SCALE
SCALE (array-ref, read-only)
Returns a reference to an array of integer values for each column.NULL
(undef ) values indicate columns where scale is not applicable.
Statement
Statement (string, read-only) The last SQL query passed to
theprepare() method.
TYPE
TYPE (array-ref, read-only) A reference to an array of integer
values (representing the data type) for each field.
Dynamic Attributes
These are attributes that have a short lifespan and are only available
immediately after being set. They apply to the handle that's just been
returned.
err
$DBI::err
The same as$handle->err .
errstr
$DBI::errstr
The same as$handle->errstr .
lasth
$DBI::lasth
Returns the DBI handle used with the most recent DBI method call or the
parent of the handle (if it exists) if the method call was aDESTROY .
rows
$DBI::rows
The same as$handle->rows .
state
$DBI::state
The same as$handle->state .
}
}
$sth->finish();
$dbh->disconnect;
Attributes
Attributes can be available to the entire module, or they can be specific to a
cursor. This section describes the available attributes according to how they
are available.
Module Attributes
These attributes are available to the entire module.
apilevel
A string constant containing the supported version of the DB-API (2.0 if
you're using the version 2.0, for example).
conv
Maps MySQL types to Python objects. This defaults to
MySQLdb.converters.conversions .
paramstyle
A string constant containing the type of parameter marker (placeholder)
formatting that the interface uses. This can beformat , for example:
...WHERE fieldname=%s
or it can bepyformat , for example:
...WHERE fieldname=%(name)s
threadsafety
An integer constant containing the level of thread safety. It can be 0 (no
thread sharing), 1 (threads can share the module only), 2 (threads can share
the module and the connections), or 3 (threads can share the module,
connections, and cursors). The default is 1.
Cursor Attributes
These attributes are specific to a cursor object, returned from
thecursor() method.
Arraysize
Specifies the number of rows returned by thefetchmany() method, and
affects the fetchall() method's performance. This defaults to 1, or one
row at a time.
Description
This attribute is read only and is a sequence of sequences describing the
columns in the current result set, each with seven items: name, type_code,
display_size, internal_size, precision, scale, and null_ok.
The name and type_code items are mandatory; the rest are set to None if
there are no meaningful values for them.
This is set to None if the query does not return any rows or has not yet been
invoked.
Rowcount
This attribute is read only, and it indicates the number of rows the last query
affected or returned, or it returns –1 if the number of rows is unknown or a
query has not been invoked.
Methods
Methods can be available to the entire module to a connection or a cursor.
This section describes the various methods according to whether they are
module, connection, or cursor methods.
Module Methods
Module methods are available to the entire module. The most important is
theconnect() method:
dbh = MySQLdb.connect(parameters)
dbh = MySQLdb.connect(user='guru2b',
passwd='g00r002b',– host='test.host.co.za',
db='firstdb')
Connection Methods
These methods are for use on a connection object, returned from the
MySQLdb.connect() method. You'll almost always use thecursor()
andclose() methods. Thecommit() androllback() methods are
used only for transactions.
begin
dbh.begin()
Begins a transaction, turning offAUTOCOMMIT if it's on, until the
transaction ends with a call tocommit() orrollback() .
close
dbh.close()
Closes the connection and frees the associated resources.
commit
dbh.commit()
Commits any open transactions.
cursor
dbh.cursor([cursorClass])
rollback
dbh.rollback()
Rolls back any open transactions. Closing the connection without explicitly
calling this method will implicitly callrollback for any open
transactions.
Cursor Methods
These methods are for accessing and manipulating data; they work on a
cursor object, returned from thecursor() method.
close
cursor.close()
Immediately frees resources associated with the cursor.
execute
cursor.execute(query[,parameters])
Prepares and executes a database query. The method also allows you to use
placeholders to optimize repeat queries of a similar type by specifying
various parameters. Placeholders are usually marked with a question mark
(? ), but MySQLdb does not currently support this. You need to use%s to
indicate a placeholder (if theparamstyle attribute is set to format )
because MySQLdb treats all values as strings no matter what type the fields
actually are.
For example:
cursor.execute('INSERT INTO customer(first_name,surname) VALUES (%s, %s)',
    ('Mike', 'Harksen'))
You can also use a Python mapping as the second argument if you set
MySQLdb.paramstyle = 'pyformat' .
You can use lists of tuples as the second parameter, but this usage is
deprecated. Use executemany instead.
executemany
cursor.executemany(query,seq_of_parameters)
Prepares a database query and then runs multiple instances with
placeholders to optimize repeats of similar queries.
For example:
cursor.executemany('INSERT INTO customer(first_name,surname) VALUES (%s, %s)',
    (('Mike', 'Harksen'), ('Mndeni', 'Vidal'), ('John', 'Vilakazi')))
You can also use a Python mapping as the second set of arguments if you
set MySQLdb.paramstyle = 'pyformat' .
fetchall
cursor.fetchall()
Fetches all rows of a query result (from the current row pointer), returning
them as a sequence of sequences (list of tuples).
The arraysize attribute of the cursor can affect the method's performance.
This throws an exception if there's an error.
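For example, a minimal sketch assuming the customer table used in the other examples:
cursor = dbh.cursor()
cursor.execute("SELECT first_name, surname FROM customer")
rows = cursor.fetchall()
for (first_name, surname) in rows:
    print first_name, surname
cursor.close()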
fetchmany
cursor.fetchmany([size=cursor.arraysize])
Fetches the next set of rows of a query result (size rows, defaulting to the cursor's arraysize), returning them as a sequence of sequences. An empty sequence is returned when no more rows are available.
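For example, a sketch that reads the result set in batches of ten rows (again assuming the customer table):
cursor = dbh.cursor()
cursor.execute("SELECT first_name, surname FROM customer")
while 1:
    rows = cursor.fetchmany(10)
    if not rows:
        break
    for (first_name, surname) in rows:
        print first_name, surname
cursor.close()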
fetchone
cursor.fetchone()
Returns the next row from a query result set.
This throws an exception if there's an error.
For example:
import MySQLdb
dbh = None
try:
    dbh = MySQLdb.connect(user='guru2b', passwd='g00r002b',
        host='test.host.co.za', db='firstdb')
except:
    print "Could not connect to MySQL server."
    exit(0)
try:
    cursor = dbh.cursor()
    # illustrative update; the SET value can be anything
    cursor.execute("UPDATE customer SET first_name='Jackie' WHERE surname='Burger'")
    print "Rows updated: ", cursor.rowcount
    cursor.close()
except:
    print "Could not update the table."
General Methods
This section describes some sundry methods for connecting or for accessing
configuration data from a file.
getBundle
bundle.getBundle(filename)
Loads data from a properties file called Config.properties. Although not JDBC specific, it would be used when storing connection data in a configuration file.
For example,Config.properties contains the following:
Driver = com.mysql.jdbc.Driver
Conn = jdbc:mysql://test.host.com/firstdb?
user=guru2b&password=g00r002b
The main program then contains the following:
ResourceBundle rb = ResourceBundle.getBundle("Config");
String conn = rb.getString("Conn");
...
Class.forName(rb.getString("Driver"));
Connection connection = DriverManager.getConnection(conn);
getConnection
DriverManager.getConnection(connection_details)
Establishes a connection to the given database URL (optionally taking the username and password as further arguments). With the Caucho driver, the URL takes the following format:
jdbc:mysql-caucho://host_name[:port]/database_name
For example:
Connection connection = DriverManager.getConnection(
    "jdbc:mysql-caucho://test.host.co.za/firstdb", "guru2b", "g00r002b");
The Connector/J driver uses a slightly different format, as follows:
jdbc:mysql://[host_name][:port]/database_name[?property1=value1][&property2=value2]
The properties can be any of the ones listed in Table F.1, although you'll
mostly just use the password and username.
getString
bundle.getString(string) SeegetBundle for an example of
reading data from a configuration file.
Connection Methods
The connection methods require a valid connection, returned from the getConnection() method.
clearWarnings
connection.clearWarnings()
Clears all warnings for the connection, returning a void.
close
connection.close()
Closes the database connection and frees all connection resources, returning
a void.
commit
connection.commit()
Commits open transactions.
createStatement
connection.createStatement([int resultSetType, int resultSetConcurrency])
Creates a statement object for sending SQL statements to the database. The optional arguments determine the type and concurrency of the result sets the statement produces.
getAutoCommit
connection.getAutoCommit()
Returns true ifAutoCommit mode is set for the connection and false if it is
not.
getMetaData
connection.getMetaData()
Returns a database metadata object containing metadata about the database
with which the connection has been made.
getTransactionIsolation
connection.getTransactionIsolation()
Returns the connection's current transaction isolation level.
getTypeMap
connection.getTypeMap() Returns the Map object associated with the connection.
isClosed
connection.isClosed() Returns true if the connection has been
closed and false if it's still open.
isReadOnly
connection.isReadOnly() Returns true if the connection is read
only and false if not.
nativeSQL
connection.nativeSQL(String sql) Returns a string with the
supplied string converted to the system's native SQL.
prepareStatement
connection.prepareStatement(String sql)
Prepares the statement to be sent to the database, which means you can make use of placeholders (or parameters). Use the setInt() and setString() methods to set the values of the parameters.
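For example, a minimal sketch (assuming an open connection and the customer table used elsewhere in this appendix):
PreparedStatement preparedstatement = connection.prepareStatement(
    "INSERT INTO customer (first_name, surname) VALUES (?, ?)");
preparedstatement.setString(1, "Mike");
preparedstatement.setString(2, "Harksen");
preparedstatement.executeUpdate();
preparedstatement.close();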
rollback
connection.rollback() Undoes all changes in the current
transaction.
setAutoCommit
connection.setAutoCommit(boolean mode) Sets the connection's AutoCommit mode (true if set, false if not).
setReadOnly
connection.setReadOnly(boolean mode) Passing the method
true sets the connection to read-only mode.
setTransactionIsolation
connection.setTransactionIsolation(int level)
Sets the transaction isolation level for the connection.
setTypeMap
connection.setTypeMap(Map map) Sets the type Map object for
the connection.
Statement Methods
These methods require a valid statement object, returned from the createStatement() or prepareStatement() method.
addBatch
statement.addBatch(String sql)
preparedstatement.addBatch()
Adds the SQL statement to a current list of statements, which can then be executed with the executeBatch() method.
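For example, a minimal sketch that batches two inserts (assuming an open connection and the customer table):
Statement statement = connection.createStatement();
statement.addBatch("INSERT INTO customer (first_name, surname) VALUES ('Mike', 'Harksen')");
statement.addBatch("INSERT INTO customer (first_name, surname) VALUES ('John', 'Vilakazi')");
int[] updateCounts = statement.executeBatch();
statement.close();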
clearBatch
statement.clearBatch()
Clears the list of statements in the batch that have been added by the addBatch() method.
clearWarnings
statement.clearWarnings()
Clears all the warnings associated with the statement.
close
statement.close()
Frees all resources associated with the statement.
execute
statement.execute(String sql [,int
autoGeneratedKeys | int[] columnIndexes | String[]
columnNames])
preparedstatement.execute()
Executes a SQL statement. It returns true if the query returns a result set (such as for a SELECT statement), and returns false if no result set was produced (such as for an INSERT or UPDATE statement). The options indicate that auto-generated keys should be made available for retrieval: either all of them, or those named in the integer or string arrays, respectively.
executeBatch
statement.executeBatch()
Executes all statements in the batch (added by addBatch), returning an integer array of update counts, or returning false if any of the statements did not execute correctly.
executeQuery
statement.executeQuery(String sql)
preparedstatement.executeQuery()
Executes a query that returns data (such asSELECT orSHOW ) and returns a
single result set.
executeUpdate
statement.executeUpdate(String sql)
preparedstatement.executeUpdate()
Executes a query that modifies data (such asUPDATE ,INSERT , orALTER
) and returns the number of rows affected.
getConnection
statement.getConnection() Returns the connection object that
created the statement.
getFetchSize
statement.getFetchSize()
Returns as an integer the number of the default fetch size for aResultSet
object from this statement.
getMaxFieldSize
statement.getMaxFieldSize()
Returns as an integer the maximum number of bytes that can be returned for
character and binary column values for aResultSet object from this
statement.
getMaxRows
statement.getMaxRows()
Returns as an integer the maximum number of rows it's possible for
aResultSet object from this statement to contain.
getMoreResults
statement.getMoreResults([int current])
Moves to the next result from the statement, returning true if there is another valid ResultSet, or returning false if not. If there's no parameter, any current ResultSet objects are closed; otherwise they are dealt with according to the value of current (which can be CLOSE_CURRENT_RESULT, KEEP_CURRENT_RESULT, or CLOSE_ALL_RESULTS).
getQueryTimeout
statement.getQueryTimeout()
Returns the number of seconds the driver will wait for a query to execute
before it times out.
getResultSet
statement.getResultSet() Returns a result set from the current
statement.
getResultSetType
statement.getResultSetType() Returns the type forResultSet
objects for the current statement.
getUpdateCount
statement.getUpdateCount()
Retrieves the current result as an update count; if the result is
aResultSet object or there are no more results, –1 is returned.
setXXX
preparedstatement.setXXX(int parameter, xxx value)
Sets a parameter in a previously prepared statement. The parameters start at
1. The value is of the appropriate type (see Table F.2 ).
Table F.2: SQL Types and the Equivalent Set Methods
SQL Type        Java Method
BIGINT          setLong()
BINARY          setBytes()
BIT             setBoolean()
BLOB            setBlob()
CHAR            setString()
DATE            setDate()
DECIMAL         setBigDecimal()
DOUBLE          setDouble()
FLOAT           setDouble()
INTEGER         setInt()
LONGVARBINARY   setBytes()
LONGVARCHAR     setString()
NUMERIC         setBigDecimal()
OTHER           setObject()
REAL            setFloat()
SMALLINT        setShort()
TIME            setTime()
TIMESTAMP       setTimestamp()
TINYINT         setByte()
VARBINARY       setBytes()
VARCHAR         setString()
For example:
preparedstatement = connection.prepareStatement(
    "UPDATE customer SET first_name = ? WHERE surname = ?"); // illustrative statement
preparedstatement.setString(1, "Jackie");
preparedstatement.setString(2, "Burger");
setCursorName
statement.setCursorName(String cursorname) Sets the
SQL cursor name to be used by laterexecute() methods.
setEscapeProcessing
statement.setEscapeProcessing(boolean mode)
Sets escape processing (if mode is true) or disables it (if mode is false). The
default is true. Escape processing has no effect
withPreparedStatement objects.
setFetchSize
statement.setFetchSize(int size)
Gives the driver an idea of how many rows should be returned from the
database when more rows are needed for the statement.
setMaxFieldSize
statement.setMaxFieldSize(int limit) Sets the maximum
number of bytes in a binary or characterResultSet column.
setMaxRows
statement.setMaxRows(int limit) Sets the maximum number of rows that a ResultSet object can contain.
setQueryTimeout
statement.setQueryTimeout(int seconds) Sets the number
of seconds the driver will wait for a query to execute before it times out.
ResultSet Methods
These methods require a validResultSet object, returned from
thegetResultSet() method.
absolute
resultset.absolute(int row)
Moves the cursor to the specified row from the result set (rows start at 1).
You can use a negative number to move to a row starting from the end of
the result set. Returns true if the cursor is on the row, false if not.
afterLast
resultset.afterLast()
Moves the cursor to the end of the result set.
beforeFirst
resultset.beforeFirst()
Moves the cursor to the start of the result set.
cancelRowUpdates
resultset.cancelRowUpdates()
Cancels updates made to the current row in the result set.
close
resultset.close()
Closes the result set and frees all associated resources.
deleteRow
resultset.deleteRow()
Deletes the currentResultSet row from the database (and the result set).
findColumn
resultset.findColumn(String field_name)
Maps the fieldname to the column in the result set and returns the column
index as an integer.
first
resultset.first()
Moves the cursor to the first row in the result set. Returns true if there is a
valid first row, false if not.
getXXX
resultset.getXXX(String fieldname | int
fieldindex)
Returns the contents of a field of the specified type. You can identify the
field by its name or its position.
Table F.3 shows the SQL types and their equivalent Java methods.
Table F.3: SQL Types and Equivalent Get Methods
SQL Type        Java Method
BIGINT          getLong()
BINARY          getBytes()
BIT             getBoolean()
BLOB            getBlob()
CHAR            getString()
DATE            getDate()
DECIMAL         getBigDecimal()
DOUBLE          getDouble()
FLOAT           getDouble()
INTEGER         getInt()
LONGVARBINARY   getBytes()
LONGVARCHAR     getString()
NUMERIC         getBigDecimal()
OTHER           getObject()
REAL            getFloat()
SMALLINT        getShort()
TIME            getTime()
TIMESTAMP       getTimestamp()
TINYINT         getByte()
VARBINARY       getBytes()
VARCHAR         getString()
getCursorName
resultset.getCursorName() Returns the name of the SQL cursor used by the result set.
getFetchSize
resultset.getFetchSize() Returns the fetch size for the result set
object.
getMetaData
resultset.getMetaData()
Returns aResultSetMetaData object with the number, type, and
properties of the result set's columns.
getRow
resultset.getRow() Returns an integer containing the current row
number.
getStatement
resultset.getStatement() Returns the statement object that
created the result set.
getType
resultset.getType() Returns the type of the result set object.
getWarnings
resultset.getWarnings() Returns the first SQLWarning triggered by a call from a ResultSet method on this result set.
insertRow
resultset.insertRow() Inserts the contents of the insert row into
the database (and the result set).
isAfterLast
resultset.isAfterLast() Returns true if the cursor is after the last row in the result set, false if not.
isBeforeFirst
resultset.isBeforeFirst() Returns true if the cursor is before the
first row in the result set, false if not.
isFirst
resultset.isFirst() Returns true if the cursor is at the first row in
the result set, false if not.
isLast
resultset.isLast() Returns true if the cursor is at the last row in the
result set, false if not.
last
resultset.last()
Moves the cursor to the last row in the result set. Returns true if there is a
valid last row, false if not.
moveToCurrentRow
resultset.moveToCurrentRow()
Moves the cursor to the remembered cursor position, which is usually the
current row. This has no effect if the cursor is not on the insert row. See
themoveToInsertRow() method.
moveToInsertRow
resultset.moveToInsertRow()
Moves the cursor to the insert row (a buffer where a new row can be placed
with an update method). It remembers the current cursor position, which
can be returned to with the moveToCurrentRow() method.
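For example, a minimal sketch of adding a row through the insert row (this assumes an updatable result set on the customer table, created with ResultSet.CONCUR_UPDATABLE):
resultset.moveToInsertRow();
resultset.updateString("first_name", "Lance");
resultset.updateString("surname", "Plaaitjies");
resultset.insertRow();
resultset.moveToCurrentRow();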
next
resultset.next()
Moves the cursor to the next row in the result set and returns true if there is
a next row, false if there is not (the end has been reached).
For example:
connection = DriverManager.getConnection(url, "guru2b", "g00r002b");
statement = connection.createStatement();
resultset = statement.executeQuery("SELECT first_name, surname FROM customer");
while (resultset.next()) {
    String first_name = resultset.getString("first_name");
    String surname = resultset.getString("surname");
    System.out.print("Name: " + first_name + " " + surname);
}
previous
resultset.previous()
Moves the cursor to the previous row in the result set and returns true if
there is a previous row, false if there is not (the start has been reached).
For example:
while(resultset.previous()) { ... }
refreshRow
resultset.refreshRow() Refreshes the current result set row with
the most recent value in the database.
relative
resultset.relative(int rows)
Moves the cursor forward (ifrows is positive) or backward (ifrows is
negative) byrows number of positions.
rowDeleted
resultset.rowDeleted() Returns true if a row has been detected as
deleted in the result set, false if not.
rowInserted
resultset.rowInserted() Returns true if a row has been detected
as inserted in the result set, false if not.
rowUpdated
resultset.rowUpdated() Returns true if a row has been detected as
updated in the result set, false if not.
setFetchSize
resultset.setFetchSize(int rows)
Gives the driver an idea of how many rows should be returned from the
database when more rows are needed for the result set.
updateXXX
resultset.updateXXX(String fieldname | int fieldindex, xxx value)
Updates the field with a value of the specified type. You can identify the field by its name or its position. Table F.4 shows the SQL types and their equivalent Java methods.
Table F.4: SQL Types and the Equivalent Update Methods
SQL Type        Java Method
BIGINT          updateLong()
BINARY          updateBytes()
BIT             updateBoolean()
BLOB            updateBlob()
CHAR            updateString()
DATE            updateDate()
DECIMAL         updateBigDecimal()
DOUBLE          updateDouble()
FLOAT           updateDouble()
INTEGER         updateInt()
LONGVARBINARY   updateBytes()
LONGVARCHAR     updateString()
NUMERIC         updateBigDecimal()
NULL            updateNull()
OTHER           updateObject()
REAL            updateFloat()
SMALLINT        updateShort()
TIME            updateTime()
TIMESTAMP       updateTimestamp()
TINYINT         updateByte()
VARBINARY       updateBytes()
VARCHAR         updateString()
updateRow
resultset.updateRow()
Updates the database with the contents of the current row of the result set.
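For example, a short sketch assuming an updatable result set positioned on a row of the customer table:
resultset.updateString("first_name", "Jackie");
resultset.updateRow();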
wasNull
resultSet.wasNull() Returns true if the previous field read was a
SQLNULL , false if not.
ResultSetMetaData Methods
These methods require a valid ResultSetMetaData object, returned
from the getMetaData() method. They get information about the
results. Columns begin at 1 for purposes of counting and offsetting.
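For example, a minimal sketch that lists the columns of a result set (assuming the resultset object from the earlier examples):
ResultSetMetaData metadata = resultset.getMetaData();
for (int i = 1; i <= metadata.getColumnCount(); i++) {
    System.out.println(metadata.getColumnName(i) + " " + metadata.getColumnTypeName(i));
}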
getColumnCount
resultsetmetadata.getColumnCount()
Returns the number of columns in the result set.
getColumnDisplaySize
resultsetmetadata.getColumnDisplaySize(int column)
Returns the maximum character width of the specified column.
getColumnName
resultsetmetadata.getColumnName(int column)
Returns the fieldname of the specified column.
getColumnType
resultsetmetadata.getColumnType(int column)
Returns the SQL type of the specified column.
getColumnTypeName
resultsetmetadata.getColumnTypeName(int column)
Returns the database-specific type name of the specified column.
getPrecision
resultsetmetadata.getPrecision(int column)
Returns the number of decimal digits (the precision) of the specified column.
getScale
resultsetmetadata.getScale(int column) Returns the number of digits after the decimal point in the specified column.
getTableName
resultsetmetadata.getTableName(int column) Returns the name of the table that owns the specified column.
isAutoIncrement
resultsetmetadata.isAutoIncrement(int column) Returns
true if the specified column is an auto increment field, false if not.
isCaseSensitive
resultsetmetadata.isCaseSensitive(int column) Returns
true if the specified column is case sensitive, false if not.
isDefinitelyWritable
resultsetmetadata.isDefinitelyWritable(int column)
Returns true if a write on the specified column will succeed, false if it may
not.
isNullable
resultsetmetadata.isNullable(int column)
Returns the nullable status of the specified column, which can
becolumnNoNulls , columnNullable ,
orcolumnNullableUnknown .
isReadOnly
resultsetmetadata.isReadOnly(int column) Returns true if
the specified column is read only, false if not.
isSearchable
resultsetmetadata.isSearchable(int column) Returns true
if the specified column can be used in aWHERE clause, false if not.
isSigned
resultsetmetadata.isSigned(int column) Returns true if the
specified column is a signed column, false if not.
isWritable
resultsetmetadata.isWritable(int column) Returns true if
the specified column can be written to, false if not.
SQLException Methods
You use these methods when aSQLException object has been created.
getErrorCode
sqlexception.getErrorCode()
Returns the error code from the vendor.
getMessage
sqlexception.getMessage()
Inherited from Throwable, this method returns the message string for the Throwable.
getNextException
sqlexception.getNextException()
Returns the nextSQLException or null if there is none.
getSQLState
sqlexception.getSQLState()
Returns aSQLState identifier.
printStackTrace
sqlexception.printStackTrace(PrintStream s)
Inherited from theThrowable class, this method prints the stack trace to
the standard error stream.
setNextException
sqlexception.setNextException(SQLException e)
Adds a SQLException to the end of the chain.
Warning Methods
You use these methods when aSQLWarning object has been created.
getNextWarning
sqlwarning.getNextWarning()
Returns the nextSQLWarning , or null if there is none.
setNextWarning
sqlwarning.setNextWarning(SQLWarning w)
Adds aSQLWarning .
try {
Class.forName("com.mysql.jdbc.Driver");
}
catch (Exception e) {
System.err.println("Unable to load driver.");
e.printStackTrace();
}
Connection dbh = null;
try {
    Statement sth;
    ResultSet resultset;
    // connect to the database using the Connector/J driver
    dbh = DriverManager.getConnection("jdbc:mysql://localhost/firstdb?user=mysql");
    sth = dbh.createStatement();
    resultset = sth.executeQuery("SELECT first_name, surname FROM customer");
    while (resultset.next()) {
        String first_name = resultset.getString("first_name");
        String surname = resultset.getString("surname");
        System.out.println("Name: " + first_name + " " + surname);
    }
    sth.close();
}
catch( SQLException e ) {
e.printStackTrace();
}
finally {
if(dbh != null) {
try {
dbh.close();
}
catch(Exception e) {}
}
}
}
}
Appendix G: C API
The C API comes with MySQL distributions and is included in the
mysqlclient library. Many of the MySQL clients are written in C, and most
of the APIs from other languages use the C API (take a look at the
similarity between the C and PHP functions, for example). You can also use
the C API for C++ development; for an object-oriented approach, you can
use MySQL++, available from the MySQL website (www.mysql.com ).
my_ulonglong
An unsigned numeric type with a range of 0 to 1.84e19, used for return values from functions such as mysql_affected_rows(), mysql_num_rows(), and mysql_insert_id(). Some of these functions return (my_ulonglong)-1 to indicate an error.
MYSQL
A database handle (a connection to the database server). The variable is
initialized with the mysql_init() function and used by most of the API
functions.
MYSQL_FIELD
Field data, returned from the mysql_fetch_field() function,
including the fieldname, type, and size. The actual field values are stored in
theMYSQL_ROW structure. You can find the following members in the
structure:
unsigned int flags Any of the flags shown in Table G.2 can be set;
they provide extra information about the field.
The macros shown in Table G.3 ease the testing of some of the flag values.
IS_BLOB(flags)  True if the field is a BLOB or TEXT (deprecated; use FIELD_TYPE_BLOB instead)
IS_NUM(flags)   True if the field is numeric
unsigned int decimals The number of decimal places used by a
numeric.
MYSQL_FIELD_OFFSET
Represents the position of the field pointer within a field list, beginning at 0
for the first field. It is used by themysql_field_seek() function.
MYSQL_RES
A structure containing the results of a query that returns data (such
asSELECT ,DESCRIBE , SHOW , orEXPLAIN ).
MYSQL_ROW
A single row of data, obtained from themysql_fetch_row() function.
All data is represented as an array of strings that may contain null bytes if
any of the data is binary.
C API Functions
The C API functions are for opening and closing connections to the server,
performing queries, analyzing query results, debugging, and performing
administrative tasks. You'll need to know them well and know how they
interact with the C API data types to master the C API.
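As a minimal sketch of how these pieces fit together (assuming the firstdb database and customer table used throughout the book, with error handling kept to a minimum):
#include <stdio.h>
#include <mysql.h>

int main(void) {
    MYSQL mysql;
    MYSQL_RES *results;
    MYSQL_ROW row;

    mysql_init(&mysql);
    if (!mysql_real_connect(&mysql, "localhost", "guru2b", "g00r002b",
                            "firstdb", 0, NULL, 0)) {
        printf("Connection error: %s\n", mysql_error(&mysql));
        return 1;
    }
    if (mysql_query(&mysql, "SELECT first_name, surname FROM customer") == 0) {
        results = mysql_store_result(&mysql);
        while ((row = mysql_fetch_row(results))) {
            printf("%s %s\n", row[0], row[1]);
        }
        mysql_free_result(results);
    }
    mysql_close(&mysql);
    return 0;
}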
mysql_affected_rows
my_ulonglong mysql_affected_rows(MYSQL *mysql)
Returns the number of rows affected by the last query (for example, the number of rows removed with a DELETE statement or the number of rows returned from a SELECT statement, in which case it's the same as the mysql_num_rows() function). It returns –1 on error.
With UPDATE statements, the row is not counted as affected if it matched the condition but no changes were made, unless the CLIENT_FOUND_ROWS flag is set when connecting with mysql_real_connect().
For example:
/* Update the customer table, and return the number of records affected */
mysql_query(&mysql, "UPDATE customer SET first_name='Jackie' WHERE surname='Wood'");
affected_rows = mysql_affected_rows(&mysql);
mysql_change_user
my_bool mysql_change_user(MYSQL *mysql, char
*username, char *password, char *database)
Changes the current MySQL user (the one that logged in) to another user (specifying that user's username and password). You can also change the database at the same time; otherwise, the current connection and database will be used. This returns 0 if successful or a nonzero value if not, in which case the existing user and details are maintained.
For example:
if (mysql_change_user(&mysql, "guru2b", "g00r002b", "firstdb")) {
    printf("Unable to change user and/or database!");
}
mysql_character_set_name
char *mysql_character_set_name(MYSQL *mysql)
Returns the name of the default character set (usually ISO-8859-1, or
Latin1).
For example:
printf("The default character set is: %s \n",
mysql_character_set_name(&mysql));
mysql_close
void mysql_close(MYSQL *mysql)
Closes the connection and frees the resources.
For example:
mysql_close(&mysql);
mysql_connect
MYSQL *mysql_connect(MYSQL *mysql, const char
*host, const char *user, const char *passwd)
For connecting to MySQL. The function has been deprecated, so instead
use mysql_real_ connect() .
mysql_create_db
int mysql_create_db(MYSQL *mysql, const char *db)
For creating a database. The function has been deprecated, so instead use
mysql_query() .
mysql_data_seek
void mysql_data_seek(MYSQL_RES *res, unsigned int
offset)
Moves the internal row pointer (0 is the first row) associated with the
results returned from mysql_store_result() to a new position. The
offset is the row to move to, starting at 0.
For example:
mysql_data_seek(results,
mysql_num_rows(results)-1);
mysql_debug
mysql_debug(char *debug)
To use, the MySQL client needs to have been compiled with debugging
enabled. Uses the Fred Fish debug library.
For example:
/* Traces application activity in the file
debug.out */ mysql_debug("d:t:O,debug.out");
mysql_drop_db
int mysql_drop_db(MYSQL *mysql, const char *db)
For dropping a database. The function has been deprecated, so instead use
mysql_query() .
mysql_dump_debug_info
int mysql_dump_debug_info(MYSQL *mysql)
Writes connection debug information into the log. The connection needs
theSUPER privilege to be able to do this. This returns 0 if it succeeded;
otherwise it returns a nonzero result.
For example:
result = mysql_dump_debug_info(&mysql);
mysql_eof
my_bool mysql_eof(MYSQL_RES *result)
Checks whether the last row has been read. The function has been deprecated, so instead use mysql_error() or mysql_errno().
mysql_errno
unsigned int mysql_errno(MYSQL *mysql)
Returns the error code for the most recent API function or 0 if there have
been no errors. You can retrieve the actual text of the error using
themysql_error() function.
For example:
error = mysql_errno(&mysql);
mysql_error
char *mysql_error(MYSQL *mysql)
Returns the error message (in the current server language) for the most
recent API function or an empty string if there was no error. If there have
been no errors in the connection, the function returns 0.
For example:
printf("Error: '%s'\n", mysql_error(&mysql));
mysql_escape_string
unsigned int mysql_escape_string(char *to, const
char *from, unsigned int length)
Returns a string with all characters that could break the query escaped (a backslash placed before them). Instead, use mysql_real_escape_string() because it respects the current character set.
mysql_fetch_field
MYSQL_FIELD *mysql_fetch_field(MYSQL_RES *result)
Returns the field data of the current field. You can call this function
repeatedly to return data of the following fields in the result. This returns a
null value when there are no more fields to return.
For example:
while((field = mysql_fetch_field(results))) {
/* .. process results by accessing field->name,
field->length etc */
}
mysql_fetch_field_direct
MYSQL_FIELD * mysql_fetch_field_direct(MYSQL_RES *
result,– unsigned int field_number)
Returns field data of the specified field (which starts at 0). For example:
/* Return the second field */
field = mysql_fetch_field_direct(results, 1);
mysql_fetch_fields
MYSQL_FIELD *mysql_fetch_fields(MYSQL_RES *
result)
Returns an array of field data from each field in the result. For example:
num_fields = mysql_num_fields(results);
fields = mysql_fetch_fields(results);
/* Access the field data as fields[0].name, fields[1].name and so on */
mysql_fetch_lengths
unsigned long *mysql_fetch_lengths(MYSQL_RES
*result)
Returns an array of the lengths of the fields from the current row (called
with mysql_fetch_ row() ) in the result set, or null if there was an
error.
This is the only function that correctly returns the length of binary fields
(for example, BLOB s).
For example:
unsigned long *length_array;
/* Return the next row of data */
row = mysql_fetch_row(results);
/* Return the array of lengths */
length_array = mysql_fetch_lengths(results);
/* ... Access lengths as length_array[0], length_array[1] and so on */
mysql_fetch_row
MYSQL_ROW mysql_fetch_row(MYSQL_RES *result)
Returns the next row from the result or null if there are no more rows or an
error. For example:
MYSQL_ROW row;
row = mysql_fetch_row(results);
/* Access the row data as row[0], row[1] and so on */
mysql_field_count
unsigned int mysql_field_count(MYSQL *mysql)
Returns the number of fields in the last executed query. It allows you to determine whether a NULL returned from mysql_use_result() or mysql_store_result() is because of an error or because it shouldn't return a result (a non-SELECT type query). For checking the number of fields in a successful result set, use mysql_num_fields().
For example:
results = mysql_store_result(&mysql);
/* test if no result set was found */
if (results == NULL) {
    /* if no result, test whether the field count was zero or not */
    if (mysql_field_count(&mysql) > 0) {
        /* fields were expected, so an error must have occurred */
    }
}
mysql_field_seek
MYSQL_FIELD_OFFSET mysql_field_seek(MYSQL_RES
*result, MYSQL_FIELD_OFFSET offset)
Moves the internal field pointer (which starts at 0) to the specified field. The next call to mysql_fetch_field() will return the specified field. This returns the previous field position pointer.
mysql_field_tell
MYSQL_FIELD_OFFSET mysql_field_tell(MYSQL_RES
*result)
Returns the current position of the field pointer. For example:
/* Record the current position */ current_pos =
mysql_field_tell(results);
mysql_free_result
void mysql_free_result(MYSQL_RES *result) Frees the
resources allocated to a result set.
mysql_get_client_info
char *mysql_get_client_info(void)
Returns a string containing the client's MySQL library version. For
example:
/* Displays - Client library version is: 4.0.2
(for example) */ printf("Client library version
is: %s\n", mysql_get_client_info());
mysql_get_host_info
char *mysql_get_host_info(MYSQL *mysql)
Returns a string containing the connection information. For example:
/* Displays - Type of connection: Localhost via
UNIX socket (for example) */ printf("Type of
connection: %s", mysql_get_host_info(&mysql));
mysql_get_proto_info
unsigned int mysql_get_proto_info(MYSQL *mysql)
Returns an integer containing the protocol version (for example, 10) used
by the connection. For example:
/* displays - Protocol version: 10 (for example)
*/ printf("Protocol version: %d\n",
mysql_get_proto_info(&mysql));
mysql_get_server_info
char *mysql_get_server_info(MYSQL *mysql)
Returns a string containing the MySQL server version (for example, 4.0.3).
For example:
printf("Server version: %s\n", mysql_get_server_info(&mysql));
mysql_init
MYSQL *mysql_init(MYSQL *mysql)
Returns an initialized MySQL handle, ready for a mysql_real_connect().
mysql_insert_id
my_ulonglong mysql_insert_id(MYSQL *mysql)
Returns a value containing the most recently inserted AUTO_INCREMENT value, or 0 if the most recent query did not insert an auto-incremented value.
For example:
last_auto_increment = mysql_insert_id(&mysql);
mysql_kill
int mysql_kill(MYSQL *mysql, unsigned long
process_id)
Requests that MySQL kill the thread specified by the process_id. This returns 0 if the operation was successful or a nonzero value if it failed. Requires that you have the PROCESS privilege.
For example:
kill = mysql_kill(&mysql, 1293);
mysql_list_dbs
MYSQL_RES *mysql_list_dbs(MYSQL *mysql, const char
*wild)
Returns a result set containing the names of the databases on the server that match the wild pattern (equivalent to the SQL statement SHOW DATABASES LIKE 'wild') or null if there was an error. This returns all databases if passed a null pointer.
For example:
MYSQL_RES *database_names;
/* returns a list of all databases with 'db' in the name */
database_names = mysql_list_dbs(&mysql, "%db%");
/* ... Don't forget to free the resources at a later stage */
mysql_free_result(database_names);
mysql_list_fields
MYSQL_RES *mysql_list_fields(MYSQL *mysql, const
char *table, const char *wild)
Returns a result set containing the names of the fields in the specified table that match the wild pattern (equivalent to the SQL statement SHOW COLUMNS FROM tablename LIKE 'wild') or null if there was an error. This returns all fields if passed a null pointer.
For example:
MYSQL_RES *field_names;
/* returns a list of all fields with 'name' in the name */
field_names = mysql_list_fields(&mysql, "customer", "%name%");
/* ... Don't forget to free the resources at a later stage */
mysql_free_result(field_names);
mysql_list_processes
MYSQL_RES *mysql_list_processes(MYSQL *mysql)
Returns a result set describing the threads currently running on the server (equivalent to the SQL statement SHOW PROCESSLIST) or null if there was an error.
For example:
MYSQL_RES *threadlist;
MYSQL_ROW row;
threadlist = mysql_list_processes(&mysql);
row = mysql_fetch_row(threadlist);
/* Access the thread data as row[0], row[1] and so on */
/* ... Don't forget to free the resources at a later stage */
mysql_free_result(threadlist);
mysql_list_tables
MYSQL_RES *mysql_list_tables(MYSQL *mysql, const
char *wild)
Returns a result set containing the names of the tables in the current database that match the wild pattern (equivalent to the SQL statement SHOW TABLES LIKE 'wild') or null if there was an error. This returns all tables if passed a null pointer.
For example:
MYSQL_RES *tablelist;
/* returns a list of all tables with 'customer' in the name */
tablelist = mysql_list_tables(&mysql, "%customer%");
/* ... Don't forget to free the resources at a later stage */
mysql_free_result(tablelist);
mysql_num_fields
unsigned int mysql_num_fields(MYSQL_RES *result)
Returns the number of fields (columns) in a result set.
For example:
num_fields = mysql_num_fields(results);
mysql_num_rows
my_ulonglong mysql_num_rows(MYSQL_RES *result)
Returns the number of rows in a query result (only the results to date if
mysql_use_result() was used to get the result set).
For example:
num_rows = mysql_num_rows(results);
mysql_options
int mysql_options(MYSQL *mysql, enum mysql_option
option, void *value)
Sets extra connect options for the connection about to be made. It can be called multiple times and is called after mysql_init() and before mysql_real_connect(). This returns 0 if successful or a nonzero value if it was passed an invalid option. Commonly used options include MYSQL_OPT_CONNECT_TIMEOUT (connect timeout in seconds), MYSQL_OPT_COMPRESS (use the compressed protocol), MYSQL_INIT_COMMAND (a statement to execute when connecting), MYSQL_READ_DEFAULT_FILE, and MYSQL_READ_DEFAULT_GROUP (read options from the named option file or group).
For example:
MYSQL mysql;
mysql_init(&mysql);
mysql_options(&mysql, MYSQL_OPT_COMPRESS, 0);
mysql_options(&mysql, MYSQL_INIT_COMMAND, "FLUSH TABLES");
mysql_ping
int mysql_ping(MYSQL *mysql)
Returns 0 if the MySQL server is up or a nonzero value if not. If the ping
fails, the program will try to reconnect.
mysql_query
int mysql_query(MYSQL *mysql, const char *query)
Executes the specified null-terminated SQL query string. This returns 0 if successful or a nonzero value if there was an error. For queries containing binary data, use mysql_real_query() instead.
mysql_real_connect
MYSQL *mysql_real_connect(MYSQL *mysql, const char *host, const char *user,
    const char *passwd, const char *db, uint port, const char *unix_socket,
    uint client_flag)
Establishes a connection to the MySQL server with the specified arguments, as follows:
MYSQL *mysql An existing MYSQL structure, created when you called mysql_init().
For example:
mysql_init(&mysql);
mysql_options(&mysql, MYSQL_OPT_COMPRESS, 0);
mysql_options(&mysql, MYSQL_INIT_COMMAND, "FLUSH TABLES");
if (!mysql_real_connect(&mysql, "localhost", "guru2b", "g00r002b",
    "firstdb", 0, NULL, 0)) {
    printf("The following connection error occurred: %s\n", mysql_error(&mysql));
}
mysql_real_escape_string
unsigned long mysql_real_escape_string(MYSQL *mysql, char *new_string,
    char *old_string, unsigned long old_string_length)
Encodes old_string into new_string, escaping any characters (such as quotes and null bytes) that could break a query, taking the connection's current character set into account. It returns the length of the escaped string. The new_string buffer must be at least old_string_length*2+1 bytes long.
For example:
/* the original query is 4 bytes (a, b, c and the null character) */
char *old_query = "abc\000";
char new_query[4*2+1];
mysql_real_escape_string(&mysql, new_query, old_query, 4);
mysql_real_query
int mysql_real_query(MYSQL *mysql, const char
*query, unsigned long length)
Executes the query (which can also use binary data), specifying the length
as well (excluding a null character). You can then retrieve the result, if
applicable, with themysql_ store_result()
ormysql_use_result() functions.
For example:
query_result = mysql_real_query(&mysql, "CREATE
DATABASE seconddb");
mysql_reload
int mysql_reload(MYSQL *mysql)
A deprecated function that reloads the grant tables, assuming the connected user has the RELOAD privilege. Rather, use the mysql_query() function.
mysql_row_seek
MYSQL_ROW_OFFSET mysql_row_seek(MYSQL_RES *result,
MYSQL_ROW_OFFSET offset)
Moves the internal row pointer to the specified row, returning the original
row pointer. The MYSQL_ROW_OFFSET should be the structure returned
from either the
mysql_row_tell() function or anothermysql_row_seek()
function, not just a row number (in which case you'd use the
mysql_data_seek() function).
For example:
current_location =
mysql_row_seek(result,row_offset);
mysql_row_tell
MYSQL_ROW_OFFSET mysql_row_tell(MYSQL_RES *result)
Returns the current position of the row pointer. You can use this with
mysql_row_seek() to move to the specified row. Use
aftermysql_store_result() , not
mysql_use_result() .
For example:
MYSQL_ROW_OFFSET current_position =
mysql_row_tell(results);
/* A little later..., move back to this position
*/ moved_position =
mysql_row_seek(result,current_position);
mysql_select_db
int mysql_select_db(MYSQL *mysql, const char *db)
Changes the current database to the specified database (assuming the user
has permission to change). It returns 0 if successful or a nonzero value if
there was an error. For example:
mysql_select_db(&mysql, "seconddb");
mysql_shutdown
int mysql_shutdown(MYSQL *mysql)
Requests that the MySQL server shut down. The user must have
theSHUTDOWN privilege for this to work. Returns 0 if successful or a
nonzero value if there was an error. For example:
mysql_shutdown(&mysql);
mysql_stat
char *mysql_stat(MYSQL *mysql)
Returns a string containing the server status. This contains uptime, threads,
questions, slow queries, opens, flush tables, open tables, and queries per
second average. For example:
printf("%s\n", mysql_stat(&mysql));
mysql_store_result
MYSQL_RES *mysql_store_result(MYSQL *mysql)
For all queries that return data, you need to call either this function or mysql_use_result(). This stores the query results into the MYSQL_RES structure, or returns null in case of an error or if the query did not return data (such as after a CREATE DATABASE or an INSERT). You should use mysql_field_count() to count the number of fields expected from the query. If it's not zero (when the query was not expected to return any data), then an error has occurred.
MYSQL_RES *results;
mysql_query(&mysql, "SELECT first_name, surname FROM customers");
results = mysql_store_result(&mysql);
mysql_thread_id
unsigned long mysql_thread_id(MYSQL * mysql)
Returns the current thread ID of the connection, usually in order to kill it
with mysql_kill() .
For example:
thread_id = mysql_thread_id(&mysql);
mysql_use_result
MYSQL_RES *mysql_use_result(MYSQL *mysql)
For all queries that return data, you need to call either this function or mysql_store_result(). This function reads the data row by row, not all at once as mysql_store_result() does. It is therefore faster, but it does not allow other queries to be run until all the data has been returned, making locking more of a problem than usual. It returns a null value in case of an error or if the query did not return data (such as after a CREATE DATABASE or an INSERT). You should use mysql_field_count() to count the number of fields expected from the query. If it's not zero (when the query was not expected to return any data), then an error has occurred.
For example:
MYSQL_RES *results;
mysql_query(&mysql, "SELECT first_name,surname FROM customer");
results = mysql_use_result(&mysql);
#include <stdio.h>
#include <mysql.h>
/* the two basic includes */
6. You can click the Options button to be able to select various options to
cater for the idiosyncrasies of each application (those that are not 100-
percent ODBC compliant). For example, currently with Microsoft Access
you should check the Return Matching Rows option. You may need to
experiment to get things working smoothly, and you should look at the
MySQL website as well, which will have the latest options for many
common applications.
[myodbc]
Driver      = /usr/local/lib/libmyodbc3.so
Description = MySQL ODBC 3.51 Driver DSN
SERVER      = localhost
PORT        =
USER        = root
Password    = g00r002b
Database    = firstdb
OPTION      = 3
SOCKET      =
2     Cannot handle receiving the actual number of affected rows (the number of found rows will be returned instead).
64    Ignores the database name in a structure such as databasename.tablename.fieldname.
128   An experimental option that forces the use of ODBC manager cursors.
256   An experimental option that disables the use of extended fetch.
512   CHAR fields are padded to the full length of the field.
1024  The SQLDescribeCol() function returns fully qualified column names.
2048  Uses compressed protocol.
4096  Causes the server to ignore spaces between the function name and the open bracket and makes all function names keywords as a result.
7. You can optionally have MyODBC trace the queries it passes to the server in the /tmp/myodbc.sql file.
8. If the data source connection details are not correct, you'll need to change
them, which you can do at this point as well. Remember that MySQL
permissions have to be granted to allow access. See Chapter 14, "Database
Security" for details on this.
Using ODBC
The following examples demonstrate inserting a record into and selecting
records from a MySQL database via ODBC in different programming
environments. The first examples show the connection being made directly,
and the DAO example shows a connection being made through a data
source.
Warning Be sure to keep the formatting the same or the examples may not
work.
set path=%path%;C:\WINNT\Microsoft.NET\Framework\v1.0.3705
C:\WINNT\Microsoft.NET\Framework\v1.0.3705\csc /t:exe /out:odbc_cnet.exe odbc_cnet.cs /r:"C:\Program Files\Microsoft.NET\Odbc.Net\Microsoft.Data.Odbc.dll"
pause
Listing H.2: dbnet.cs
using Console = System.Console;
using Microsoft.Data.Odbc;

namespace MyODBC {
    class MySQLCSharp {
        static void Main(string[] args) {
            try {
                // Set the arguments for connecting to a MySQL firstdb database with MyODBC 3.51
                string MySQLConnectionArgs = "DRIVER={MySQL ODBC 3.51 Driver};" +
                    "SERVER=www.testhost.co.za;DATABASE=firstdb;UID=guru2b;" +
                    "PASSWORD=g00r002b;OPTION=0";

                // Open the connection (the insert and select below are an illustrative sketch)
                OdbcConnection MySQLConnection = new OdbcConnection(MySQLConnectionArgs);
                MySQLConnection.Open();

                // Insert a record
                OdbcCommand MySQLCommand = new OdbcCommand(
                    "INSERT INTO customer (first_name, surname) VALUES ('Werner', 'Christerson')",
                    MySQLConnection);
                MySQLCommand.ExecuteNonQuery();

                // Select the records and display them
                MySQLCommand = new OdbcCommand(
                    "SELECT id, first_name, surname FROM customer", MySQLConnection);
                OdbcDataReader MySQLDataReader = MySQLCommand.ExecuteReader();
                while (MySQLDataReader.Read()) {
                    Console.WriteLine("" + MySQLDataReader.GetInt32(0) + ":" +
                        MySQLDataReader.GetString(1) + " " + MySQLDataReader.GetString(2));
                }
                MySQLDataReader.Close();
                MySQLConnection.Close();
            }
            // If there's an ODBC Exception, catch it
            catch (OdbcException MySQLOdbcException) {
                throw MySQLOdbcException;
            }
        }
    }
}
'Declare the connection, recordset, and query string
Dim MySQLConnection As New ADODB.Connection
Dim Results As ADODB.Recordset
Dim SQLQuery As String

'Connect using the MyODBC 3.51 driver
MySQLConnection.ConnectionString = "DRIVER={MySQL ODBC 3.51 Driver};" & _
    "SERVER=www.testhost.co.za;DATABASE=customer;UID=guru2b;" & _
    "PWD=g00r002b;OPTION=0"
MySQLConnection.Open
Set Results = New ADODB.Recordset
Results.CursorLocation = adUseServer

'There are two common ways of inserting - the first is the direct insert
SQLQuery = "INSERT INTO customer (first_name, surname) VALUES ('Werner', 'Christerson')"
MySQLConnection.Execute SQLQuery

'The second way of inserting is to add to a result set using the
'AddNew method. First return a result set
Results.Open "SELECT * FROM customer", MySQLConnection, adOpenDynamic, adLockOptimistic
Results.AddNew
Results!first_name = "Lance"
Results!surname = "Plaaitjies"
Results.Update
Results.Close

'select a record from the customer table, return the results,
'loop through the results displaying them
Results.Open "SELECT id, first_name, surname FROM customer", MySQLConnection
While Not Results.EOF
    Debug.Print Results!id & ":" & Results!first_name & " " & Results!surname
    Results.MoveNext
Wend
Results.Close
MySQLConnection.Close
To gain access to the ADO 2.0 objects in Visual Basic, set a reference to the
ADODB type library contained inMSADO15.DLL . It appears in the
References dialog box (available from the Project menu) as Microsoft
ActiveX Data Objects 2.0 Library.
MySQLConnection.CursorDriver = rdUseOdbc
MySQLConnection.EstablishConnection rdDriverNoPrompt

'There are two common ways of inserting - the first is the direct insert
SQLQuery = "INSERT INTO customer (first_name, surname) VALUES ('Lance', 'Plaaitjies')"
MySQLConnection.Execute SQLQuery, rdExecDirect

'select records from the customer table, return the results,
'loop through the results displaying them
Set Results = MySQLConnection.OpenResultset("SELECT id, first_name, surname FROM customer", _
    rdOpenKeyset, rdConcurRowVer, rdExecDirect)
While Not Results.EOF
    Debug.Print Results!id & ":" & Results!first_name & " " & Results!surname
    Results.MoveNext
Wend

'Free the result set, and the connection
Results.Close
MySQLConnection.Close
End Sub
To gain access to the RDO 2.0 objects in Visual Basic, set a reference to the
RDO type library contained inMSRD020.DLL . It appears in the
References dialog box (available from the Project menu) as Microsoft
Remote Data Objects 2.0. The code needs to appear inside a form for
theDebug.Print method to work. Alternatively, you can change it
toMsgBox to make the code executable.
"g00r002b", dbUseODBC)
Set MySQLConnection =
Works.OpenConnection("MySQLConn",–
rdDriverCompleteRequired, False, "ODBC;DSN=MyDAO")
MySQLConnection.Close Works.Close
End Sub
To gain access to the DAO objects in Visual Basic, set a reference to the
DAO type library contained inDAO360.DLL . It appears in the References
dialog box (available from the Project menu) as Microsoft DAO 3.6 Object
Library. The code needs to appear inside a form for theDebug.Print
method to work. Alternatively, you can change it toMsgBox to make the
code executable. This example requires a DSN to be set for it to work.
MyODBC Functions
The following sections serve as a function reference for experienced
programmers. The descriptions in this appendix apply to MyODBC 3.5x .
SQLAllocConnect
Allocates memory for a connection handle. The function is deprecated and
has been replaced withSQLAllocHandle() , which is called with
theSQL_HANDLE_DBC argument.
SQLAllocEnv
Obtains an environment handle from the driver. The function is deprecated
and has been replaced withSQLAllocHandle() , which is called with
the SQL_HANDLE_ENV argument.
SQLAllocHandle
SQLAllocHandle (handle_type, input_handle,
output_handle_pointer);
Allocates a handle (either a connection, descriptor, environment, or
statement handle).
The handle_type can be one of SQL_HANDLE_ENV (environment handle), SQL_HANDLE_DBC (connection handle), or SQL_HANDLE_STMT (statement handle).
The input_handle describes the context for allocating the new handle. This will be SQL_NULL_HANDLE if the handle_type is SQL_HANDLE_ENV, an environment handle if the handle_type is SQL_HANDLE_DBC, and a connection handle if it's SQL_HANDLE_STMT.
SQLAllocStmt
Allocates memory for a statement handle. The function is deprecated and
has been replaced withSQLAllocHandle() , which is called with
theSQL_HANDLE_STMT argument.
SQLBindParameter
SQLBindParameter(statement_handle, parameter_number, parameter_type,
    value_type, sql_type, column_size, decimal_digits,
    parameter_value_pointer, buffer_length, string_length_pointer);
Binds a parameter marker in a SQL statement. The parameter_number starts at 1. For example:
SQLUINTEGER id_ptr;
SQLINTEGER idl_ptr;
// Prepare SQL
SQLPrepare(sth, "INSERT INTO customer(id)
VALUES(?)", SQL_NTS);
// Bind id to the parameter for the id column
SQLBindParameter(sth, 1, SQL_PARAM_INPUT,
SQL_C_ULONG, SQL_LONG, 0, 0, &id_ptr, 0,
&idl_ptr);
// ...
SQLExecute(sth);
SQLBulkOperations
SQLBulkOperations(statement_handle, operation);
Performs bulk operations.
SQLCancel
SQLCancel(statement_handle)
Cancels operations on the specified statement handle.
SQLCloseCursor
SQLCloseCursor(statement_handle);
Closes any open cursors for the specified statement handle.
SQLColAttribute
SQLColAttribute (statement_handle, record_number,–
field_identifier, character_attribute_pointer,
buffer_length,– string_length_pointer,
numeric_attribute_pointer);
Describes attributes of a field from the result set.
Therecord_number argument is the number of the record, starting at 1.
SQLColAttributes
Describes attributes of a field from the result. The function is deprecated
and has been replaced withSQLColAttribute() .
SQLColumnPrivileges
SQLColumnPrivileges(statement_handle, catalog_name,
    catalog_name_length, schema_name, schema_name_length,
    table_name, table_name_length, column_name, column_name_length);
Returns a list of columns and associated privileges for the specified table.
SQLColumns
SQLColumns(statement_handle, catalog_name, catalog_name_length,
    schema_name, schema_name_length, table_name, table_name_length,
    column_name, column_name_length);
Returns the list of column names in the specified table.
SQLConnect
SQLConnect(connection_handle, datasource_name,
datasource_name_length, user_name,
user_name_length, password, password_length);
Connects to the data source with the specified username and password.
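As a minimal sketch of the allocate-and-connect sequence (the DSN name MyDSN is an assumption; the username and password follow the book's other examples, and error checking is omitted for brevity):
SQLHENV env;
SQLHDBC dbc;

/* Allocate an environment handle and request ODBC 3 behavior */
SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);

/* Allocate a connection handle and connect to the data source */
SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
SQLConnect(dbc, (SQLCHAR *)"MyDSN", SQL_NTS,
           (SQLCHAR *)"guru2b", SQL_NTS,
           (SQLCHAR *)"g00r002b", SQL_NTS);

/* ... work with the connection ... */

SQLDisconnect(dbc);
SQLFreeHandle(SQL_HANDLE_DBC, dbc);
SQLFreeHandle(SQL_HANDLE_ENV, env);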
SQLDataSources
Implemented by the Driver Manager, this function returns a list of available
data sources.
SQLDescribeCol
SQLDescribeCol(statement_handle, column_number,
column_name, buffer_length, name_length_pointer,
data_type_pointer, column_size_pointer,
decimal_digits_pointer, nullable_pointer);
Describes a column in the result set. Thecolumn_number argument is the
number of the column in the result set, starting at 1.
SQLDescribeParam
SQLDescribeParam(statement_handle, parameter_number,
    data_type_pointer, parameter_size_pointer,
    decimal_digits_pointer, nullable_pointer);
Returns the description of a parameter marker associated with a prepared statement.
SQLDisconnect
SQLDisconnect(connection_handle);
Closes the connection specified by the connection handle.
SQLDriverConnect
SQLDriverConnect(connection_handle, window_handle, in_connection,
    in_connection_length, out_connection, out_connection_length,
    buffer_length, prompt_flag);
An alternative to SQLConnect that connects to the data source using a connection string, optionally prompting the user for any missing connection information.
The prompt_flag argument specifies whether the driver must prompt for more information to connect. It can be SQL_DRIVER_PROMPT, SQL_DRIVER_COMPLETE, SQL_DRIVER_COMPLETE_REQUIRED, or SQL_DRIVER_NOPROMPT.
SQLDrivers
Implemented by the Driver Manager, this function returns details of the
installed drivers.
SQLEndTran
SQLEndTran(handle_type, handle, completion_type);
Ends an open transaction, calling a rollback or commit.
Thehandle_type argument contains eitherSQL_HANDLE_ENV
orSQL_HANDLE_DBC depending on the type of handle (environment or
connection).
Thehandle argument specifies the actual handle.
Thecompletion_type determines whether the transaction is ended with
a commit or a rollback, and it can be eitherSQL_COMMIT
orSQL_ROLLBACK .
SQLError
This function is for returning error information and is deprecated. You can
use SQLGetDiagRec orSQLGetDiagField to replace it.
SQLExecDirect
SQLExecDirect(statement_handle, sql, sql_length);
Executes a SQL statement. Quicker thanSQLExecute if the statement is
only to be executed once, as there is no need to prepare it.
SQLExecute
SQLExecute(statement_handle);
Executes a previously prepared statement (withSQLPrepare ).
UseSQLExecDirect if the statement is only to be executed once and
there is no need to prepare it.
SQLExtendedFetch
This function returns scrollable results and is deprecated. Instead
useSQLFetchScroll .
SQLFetch
SQLFetch(statement_handle); Returns the next row of data.
SQLFetchScroll
SQLFetchScroll(statement_handle, fetch_type,
offset);
Returns data for the specified row.
SQLFreeConnect
Frees the connection handle. The function is deprecated, so instead
useSQLFreeHandle .
SQLFreeEnv
Frees the environment handle. The function is deprecated, so instead use
SQLFreeHandle .
SQLFreeHandle
SQLFreeHandle(handle_type,handle);
Frees a handle (either connection, descriptor, environment, or statement
handle).
SQLFreeStmt
SQLFreeStmt(statement_handle, option);
Stops the processing of a statement.
SQLForeignKeys
SQLForeignKeys(statement_handle, primary_key_catalog_name,
    primary_key_catalog_name_length, primary_key_schema_name,
    primary_key_schema_name_length, primary_key_table_name,
    primary_key_table_name_length, foreign_key_catalog_name,
    foreign_key_catalog_name_length, foreign_key_schema_name,
    foreign_key_schema_name_length, foreign_key_table_name,
    foreign_key_table_name_length);
Returns foreign keys in the specified table and foreign keys in other tables
linked with the specified table.
SQLGetConnectAttr
SQLRETURN SQLGetConnectAttr(connection_handle,
attribute, value_pointer, buffer_length,
string_length_pointer);
Returns the value of a connection attribute. The attribute argument can be one of the values in Table H.2.
Table H.2: Attribute and Associated value_pointer Contents
Attribute             value_pointer Contents
SQL_ATTR_AUTOCOMMIT   Indicates whether to use auto-commit or manual-commit mode. Can be SQL_AUTOCOMMIT_OFF (in which case manual-commit mode is used) or SQL_AUTOCOMMIT_ON (auto-commit mode, the default).
SQLGetConnectOption
Returns the connection option value. The function has been deprecated, so
instead use SQLGetConnectAttr .
SQLGetCursorName
SQLGetCursorName(statement_handle,cursor_name,curs
or_name_length, name_length_pointer);
Returns the name of the cursor associated with the statement handle.
Thecursor_name argument points to a buffer where the cursor name is
returned.
SQLGetDiagField
SQLGetDiagField(handle_type, handle, record_number, diagnostic_identifier,
    diagnostic_id_pointer, buffer_length, string_length_pointer);
Returns a field of diagnostic information associated with the specified handle. The diagnostic_identifier argument specifies which field to return, including the following:
SQL_DIAG_COLUMN_NUMBER     Returns the column number from the result set or parameter number in a parameter set, starting at 0. If neither of these apply, it will contain SQL_NO_COLUMN_NUMBER or SQL_COLUMN_NUMBER_UNKNOWN.
SQL_DIAG_ROW_COUNT         Returns the number of rows affected by a SQL operation that modifies data (such as INSERT or DELETE) run by SQLExecute, SQLExecDirect, SQLBulkOperations, or SQLSetPos.
SQL_DIAG_SERVER_NAME       Returns a string containing the server name to which the diagnostics refer.
SQL_DIAG_SQLSTATE          Returns a five-character string containing the SQLSTATE code.
SQL_DIAG_SUBCLASS_ORIGIN   Returns a string indicating the SQLSTATE subclass origin (for example, ISO 9075 or ODBC 3.0).
The diagnostic_id_pointer points to a buffer where the diagnostic data is to be returned.
The buffer_length argument can contain one of the following values:
The length of the diagnostic_id_pointer (or SQL_NTS) if diagnostic_id_pointer points to a string.
The result of SQL_LEN_BINARY_ATTR(length) if diagnostic_id_pointer points to a binary buffer.
SQLGetDiagRec
SQLGetDiagRec(handle_type, handle, record_number, sql_state,
    native_error_pointer, message_text, message_text_length,
    text_length_pointer);
SQLGetEnvAttr
SQLGetEnvAttr(environment_handle, attribute,
value_pointer, buffer_length,
string_length_pointer);
Returns the environment attribute value.
The attribute argument can be one of the supported values listed in Table H.4.
Table H.4: Environment Attributes
Attribute                    value_pointer Contents
SQL_ATTR_CONNECTION_POOLING  A 32-bit value for enabling or disabling connection pooling.
SQL_ATTR_CP_MATCH            A 32-bit value for determining how a connection is selected from the available pool.
SQL_ATTR_ODBC_VERSION        A 32-bit value determining whether the driver exhibits ODBC 2.x or ODBC 3.x behavior.
SQLGetFunctions
SQLGetFunctions(connection_handle, function_id,
supported_pointer);
Returns the functions the driver supports.
SQLGetInfo
SQLGetInfo(connection_handle,info_type,
info_value_pointer, buffer_length,
string_length_pointer);
Returns information about the driver and server.
SQLGetStmtAttr
SQLGetStmtAttr(statement_handle, attribute,
value_pointer, buffer_length,
string_length_pointer);
Returns the statement attribute value.
The attribute argument can be one of the supported options listed in Table H.5.
Table H.5: The SQLGetStmtAttr attribute Argument
Attribute                      value_pointer Contents
SQL_ATTR_CURSOR_SCROLLABLE
SQL_ATTR_CURSOR_SENSITIVITY
SQL_ATTR_CURSOR_TYPE
SQL_ATTR_KEYSET_SIZE
SQL_ATTR_MAX_LENGTH            Maximum amount of data that can be returned from a character or binary column; longer data is truncated and returns SQL_SUCCESS_WITH_INFO.
SQL_ATTR_MAX_ROWS              Maximum number of rows the driver can return (0 is no limit).
SQL_ATTR_NOSCAN
SQL_ATTR_PARAM_BIND_TYPE
SQL_ATTR_PARAM_OPERATION_PTR
SQL_ATTR_PARAM_STATUS_PTR      Points to an array of values (one for each row) with status information after a call to SQLExecute or SQLExecDirect, containing one of the following: SQL_PARAM_SUCCESS, SQL_PARAM_SUCCESS_WITH_INFO (successful with a warning), SQL_PARAM_ERROR, SQL_PARAM_UNUSED (usually because SQL_PARAM_IGNORE was set), or SQL_PARAM_DIAG_UNAVAILABLE. Can also be set to a null pointer, in which case data is not returned.
SQL_ATTR_PARAMS_PROCESSED_PTR
SQL_ATTR_PARAMSET_SIZE
SQL_ATTR_QUERY_TIMEOUT
SQL_ATTR_ROW_ARRAY_SIZE        The number of rows returned by a call to SQLFetch or SQLFetchScroll. The default is 1.
SQL_ATTR_ROW_BIND_OFFSET_PTR
SQL_ATTR_ROW_BIND_TYPE
SQL_ATTR_ROW_NUMBER
SQL_ATTR_ROW_OPERATION_PTR
SQL_ATTR_ROW_STATUS_PTR        Points to an array of values (one for each row) containing row status values after a call to either SQLFetch or SQLFetchScroll. Can also be set to a null pointer, and the driver will not return the array.
SQL_ATTR_ROWS_FETCHED_PTR
SQL_ATTR_SIMULATE_CURSOR
SQLGetStmtOption
Returns the statement option value. The function is deprecated, so instead
use SQLGetStmtAttr .
SQLGetTypeInfo
SQLGetTypeInfo(statement_handle,data_type);
Returns a SQL result set with information about the specified data type.
Settingdata_type toSQL_ALL_TYPES returns information about the
data types returned by the server.
SQLNativeSql
SQLNativeSql(connection_handle, sql_string, sql_string_length,
    modified_sql_string, modified_sql_string_length,
    string_length_pointer);
Returns the SQL string as modified (translated) by the driver, without executing it.
SQLNumParams
SQLNumParams(statement_handle,
parameter_count_pointer);
Returns the number of parameters in a statement.
Theparameter_count_pointer argument points to a buffer where
the number of parameters is to be returned.
SQLNumResultCols
SQLNumResultCols(statement_handle,
column_count_pointer);
Returns the number of columns in the result set.
Thecolumn_count_pointer argument points to a buffer where the
number of columns is to be returned.
SQLParamData
Used in conjunction withSQLPutData to supply parameter data at
execution time. (This is useful for long data values.)
SQLPrepare
SQLPrepare(statement_handle, sql_string,
sql_string_length); Prepares a SQL statement for later execution.
SQLPrimaryKeys
SQLPrimaryKeys(statement_handle, catalog_name,
catalog_name_length, schema_name,
schema_name_length, table_name,
table_name_length); Returns the primary key columns from the
specified table.
SQLPutData
SQLPutData(statement_handle, data_pointer,
data_pointer_length);
For sending column or parameter data at execution time.
SQLSetConnectAttr
SQLSetConnectAttr(connection_handle, attribute,
value_pointer, string_length);
Sets a connection attribute.
See SQLGetConnectAttr for a list and description of possible attributes.
The value_pointer argument points to the value of the attribute, whose type depends on the attribute.
The string_length argument can contain one of the following values:
The length of the value_pointer (or SQL_NTS) if value_pointer points to a string.
The result of SQL_LEN_BINARY_ATTR(length) if value_pointer points to a binary buffer.
One of SQL_IS_INTEGER, SQL_IS_UINTEGER, SQL_IS_SMALLINT, or SQL_IS_USMALLINT if value_pointer points to a specific fixed-length data type. SQL_IS_POINTER if value_pointer points to another pointer.
SQLSetConnectOption
Sets a connection option. This function has been deprecated, so instead use
SQLSetConnectAttr .
SQLSetCursorName
SQLSetCursorName(statement_handle,cursor_name,curs
or_name_length); Specifies a cursor name.
SQLSetEnvAttr
SQLSetEnvAttr(environment_handle, attribute, value_pointer,
    string_length);
Sets an environment attribute. See SQLGetEnvAttr for a list of possible attributes.
SQLSetPos
SQLSetPos(statement_handle, row_number, operation,
lock_type);
Moves a cursor to a position in a fetched block of data and can also refresh
data in the row set or update and delete the underlying data.
The row_number argument selects the row in the result set the operation affects (starting at 1). Setting it to 0 applies the operation to every row.
The operation argument can be SQL_POSITION, SQL_REFRESH, SQL_UPDATE, or SQL_DELETE.
SQLSetScrollOptions
Sets options affecting cursor behavior. This function is deprecated, so
instead use SQLSetStmtAttr .
SQLSetStmtAttr
SQLSetStmtAttr(statement_handle, attribute, value_pointer,
    string_length);
Sets a statement attribute.
For the list of possible attribute values, see SQLGetStmtAttr.
The value_pointer argument points to the value of the attribute, whose type depends on the attribute.
The string_length argument can contain one of the following values:
The length of the value_pointer (or SQL_NTS) if value_pointer points to a string.
The result of SQL_LEN_BINARY_ATTR(length) if value_pointer points to a binary buffer.
One of SQL_IS_INTEGER, SQL_IS_UINTEGER, SQL_IS_SMALLINT, or SQL_IS_USMALLINT if value_pointer points to a specific fixed-length data type. SQL_IS_POINTER if value_pointer points to another pointer.
SQLSetStmtOption
Sets a statement option. This function has been deprecated, so instead use
SQLSetStmtAttr.
SQLSpecialColumns
SQLSpecialColumns(statement_handle,
identifier_type, catalog_name, catalog_length,
schema_name, schema_length, table_name,
table_name_length, scope, nullable);
Returns the set of columns that uniquely identifies a row in the specified table. The scope argument is the minimum required scope of the row ID. It can be one of the following: SQL_SCOPE_CURROW (the row ID is definitely valid only while on that row), SQL_SCOPE_TRANSACTION (the row ID is definitely valid for the duration of the current transaction), or SQL_SCOPE_SESSION (the row ID is definitely valid for the entire session).
SQLStatistics
SQLStatistics(statement_handle, catalog_name, catalog_name_length,
    schema_name, schema_name_length, table_name, table_name_length,
    index_type, reserved);
Returns statistics about the specified table and its indexes as a result set.
SQLTablePrivileges
SQLTablePrivileges(statement_handle, catalog_name,
catalog_length, Schema_name, schema_name_length,
table_name, table_name_length); Returns a list of tables and
associated privileges.
SQLTables
SQLTables(statement_handle, catalog_name,
catalog_length, Schema_name, schema_name_length,
table_name, table_name_length, table_type,
table_type_length);
Returns the list of tables, catalog, or schema names and table types.
SQLTransact
Ends a transaction. This function has been deprecated, so instead
useSQLEndTran .