How to Split a String in Oracle
Database:
Operators:
Table of Contents
Problem:
You want to split a string in Oracle.
Example:
You have a sentence, and you'd like to split it by the space character.
Solution:
SELECT REGEXP_SUBSTR('An example sentence.', '[^ ]+', 1, level) AS parts FROM dual CONNECT BY REGEXP_SUBSTR('An example sentence.', '[^ ]+', 1, level) IS NOT NULL;
The result table looks like this:
parts |
---|
An |
example |
sentence. |
Discussion:
To get substrings from a string, you can use Oracle's built-in REGEXP_SUBSTR() function. It takes four arguments:
- The string to be searched for a substring.
- The regular expression; i.e., the pattern you'd like to find.
- The position at which you'd like to start looking for the pattern (the default value is 1, which means starting from the beginning of the string).
- The number specifying which occurrence of the matching substring you'd like to see (the default value is
1
, which means the first matching substring should be shown).
In our example, the string to be searched is 'An example sentence.'
In the string, you look for substrings that are words, which we define here as any groups of any characters except spaces. This is why we specify the pattern '[^ ]+'
, i.e., any sequence of non-space characters. [^ ]
stands for any character except space, and the +
sign stands for one or more occurrences of such characters. The third argument should be the default (1
), since you want to look for words in the entire string from the beginning. The last argument is the trickiest – you need its value to change for each string to be searched. It should be level – a pseudocolumn available in hierarchical queries using CONNECT BY
– which, in our case, will be equal to the row number of the current row in the result when CONNECT BY is used. So, you have:
REGEXP_SUBSTR('An example sentence.', '[^ ]+', 1, level)
This expression should be selected from dual
– a pseudotable used in Oracle when you don’t select from an actual table. Then, you should CONNECT BY
the hierarchical queries operator, on the condition that the retrieved substring shouldn't be NULL
. The retrieved substring will be NULL
once all the words are retrieved; the hierarchical query will then stop the execution. So, you have:
FROM dual CONNECT BY REGEXP_SUBSTR('An example sentence.', '[^ ]+', 1, level) IS NOT NULL
You can read here about level
and CONNECT BY
.
If you'd like to split the string by some other delimiter, you can easily modify the query. For example, if you'd like to split the string by the comma, change '[^ ]+'
to '[^,]+'
. Here's an example query:
SELECT REGEXP_SUBSTR('Johnson,Blair,Black,Newman,Henderson', '[^,]+', 1, level) AS parts FROM dual CONNECT BY REGEXP_SUBSTR('Johnson,Blair,Black,Newman,Henderson', '[^,]+', 1, level) IS NOT NULL;