How to Extract Substrings in PostgreSQL and MySQL
Database:
Operators:
Table of Contents
Problem
You'd like to extract a substring from a string in a PostgreSQL or MySQL database.
Example 1
In the emails
table, there is an email
column. You'd like to display the first seven characters of each email.
The table looks like this:
jake99@gmail.com |
tamarablack@zoho.com |
notine@yahoo.fr |
jessica1995@onet.pl |
Solution 1
SELECT email, SUBSTRING(email, 1, 7) AS substring FROM emails;
Another syntax:
SELECT email, SUBSTRING(email FROM 1 FOR 7) AS substring FROM emails;
The result is:
substring | |
---|---|
jake99@gmail.com | jake99@ |
tamarablack@zoho.com | tamarab |
notine@yahoo.fr | notine@ |
jessica1995@onet.pl | jessica |
Discussion
Use the SUBSTRING()
function. The first argument is the string or the column name. The second argument is the index of the character at which the substring should begin. The third argument is the length of the substring.
Watch out! Unlike in some other programming languages, in SQL the indexes start at 1, not 0. This means the first character has index 1, the second character has index 2, etc.
SUBSTRING(email, 1, 7)
will return the substrings of the values in the email column that start at the beginning of the strings (first character) and go for seven characters. The other notation, SUBSTRING(email FROM 1 FOR 7)
, does exactly the same. The argument after the FROM
is the starting index, and the argument after the FOR
is the substring length.
The third argument of the SUBSTRING()
function is optional. If you omit it, you'll get the substring that starts at the index in the second argument and goes all the way up to the end of the string. SUBSTRING(email, 1)
will return the whole string, just as will SUBSTRING(email FROM 1)
.
Example 2
You'd like to display the substring between indexes 2 and 6 (inclusive).
Solution 2
SELECT email, SUBSTRING(email, 2, 5) AS substring FROM emails;
Alternative syntax:
SELECT email, SUBSTRING(email FROM POSITION('@' IN email)) AS substring FROM emails;
The result is:
substring | |
---|---|
jake99@gmail.com | @gmail.com |
tamarablack@zoho.com | @zoho.com |
notine@yahoo.fr | @yahoo.fr |
jessica1995@onet.pl | @onet.pl |
Discussion
You use the SUBSTRING()
function like in the previous examples. This time, you're looking for a specific character whose position can vary from row to row. To find the index of the specific character, you can use the POSITION(character IN expression)
function, where character
is the specific character at which you'd like to start the substring (here, @
). The argument expression
is the column or a constant string from which you'd like to retrieve the substring.
If you want the substring to go all the way to the end of the original string, the third argument in the SUBSTRING()
function (or the FOR
argument) is not needed. Otherwise, it should be the length of the substring.
You may also want to retrieve a substring that doesn't end at the end of the string but at some specific character, e.g., before .
. Here's an example:
SELECT email, SUBSTRING( email, POSITION('@' IN email), POSITION('.' IN email) - POSITION('@' IN email)) AS substring FROM emails;
Another syntax:
SELECT email, SUBSTRING(email FROM POSITION('@' IN email) FOR POSITION('.' IN email) - POSITION('@' IN email)) AS substring FROM emails;
The result of this query is:
substring | |
---|---|
jake99@gmail.com | @gmail |
tamarablack@zoho.com | @zoho |
notine@yahoo.fr | @yahoo |
jessica1995@onet.pl | @onet |
The part POSITION('.' IN email) - POSITION('@' IN email)
simply calculates the length of the substring: from the position of @
(inclusive) to the position of .
(exclusive).