How to Replace Part of a String in T-SQL
Database:
Operators:
Table of Contents
Problem:
You’d like to replace part of a string with another string in T-SQL.
Example 1:
Our database has a table named life_insurance
with data in the following columns: policy_ID
, last_name
, and first_name
.
policy_ID | last_name | first_name |
---|---|---|
v-01 | Anstruther - Gough | Gary |
V-23 | Elliot - Murray - Stewart | Mary |
3A-v | Smith - Dorrie | Alex |
As you can see, some policyholders have hyphenated surnames. We’d like to change the separator between the parts of these policyholders’ last names. Currently, the separator is -
: it consists of a space, a hyphen (-
), and another space; we want to replace this with a single hyphen (-
).
Solution:
SELECT REPLACE( last_name, ' - ', '-' ) AS correct_last_name, first_name FROM life_insurance;
This query returns a list of policyholders with correctly hyphenated last names, i.e. names separated by a hyphen instead of space-hyphen-space:
correct_last_name | first_name |
---|---|
Anstruther-Gough | Gary |
Elliot-Murray-Stewart | Mary |
Smith-Dorrie | Alex |
Discussion:
Use the T-SQL function REPLACE()
to replace a substring (a word, character, group of letters, etc.) with another substring. The target can be a string, an expression (or an expression returning a string) or a column name. This function takes three arguments:
- The target string, expression, etc. (In our case, it’s the column
last_name
.) - The substring to replace (here, the space-hyphen-space pattern
-
). - The substring to insert (here, a hyphen
-
).
Notice that this function replaces all occurrences of the substring in a given text or column, not just the first occurrence. In our example, Mary’s complex surname consists of three names, so her last_name
contains two occurrences of -
, which were replaced by -
.
Example 2:
In the next example, we replace the character v
with 5
in the policy ID numbers.
Solution for Example 2:
SELECT REPLACE( policy_id, 'v', '5' ) AS new_policy_id, last_name, first_name FROM life_insurance;
Here is the result:
policy_id | last_name | first_name |
---|---|---|
5-01 | Anstruther - Gough | Gary |
5-23 | Elliot - Murray - Stewart | Mary |
3A-5 | Smith - Dorrie | Alex |
This query replaces both v
and V
with 5
in the policy_id
column. By default, SQL Server’s REPLACE
function is not case sensitive, but this actually depends on your database server settings.
You can specify whether this function is case sensitive or insensitive by using the COLLATE
clause. Look at the last example:
Solution 3:
SELECT REPLACE(policy_id COLLATE SQL_Latin1_General_CP1_CS_AS, 'v', '5') AS new_policy_id, last_name, first_name FROM life_insurance;
Here is the result:
policy_id | last_name | first_name |
---|---|---|
5-01 | Anstruther - Gough | Gary |
V-23 | Elliot - Murray - Stewart | Mary |
3A-5 | Smith - Dorrie | Alex |
In this query, the V
was not replaced because the REPLACE
function is now case sensitive (i.e. v
is not the same as V
). In the COLLATE
clause, _CS
denotes case-sensitivity; if you change it to _CI
, the function will be case insensitive.
What’s a collation? In SQL Server, it’s a set of rules that determine how values will be compared and sorted and how accent marks will be treated. It can be handled differently in different database systems. COLLATE
is the SQL Server clause that sets the collation by using various options like _CI
, _CS
, and _AS
and the name of an encoding set (like SQL_Latin1_General).
You can read more about COLLATE
in the SQL Server documentation.