Trimming (computer programming)
In computer programming, trimming (trim) or stripping (strip) is a string manipulation in which leading and trailing whitespace is removed from a string.
For example, the string (enclosed by apostrophes)
' this is a test '
would be changed, after trimming, to
'this is a test'
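As a minimal illustration, the same example can be reproduced in Python (chosen here only as a sample language) with the built-in str.strip method:

```python
# Trim leading and trailing whitespace with Python's built-in str.strip().
original = " this is a test "
trimmed = original.strip()

print(repr(trimmed))   # 'this is a test'
# Python strings are immutable, so the original value is left unchanged.
print(repr(original))  # ' this is a test '
```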
Variants
- Left or right trimming
- The most popular variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively or, in the case of Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal and Java do not have these variants built-in, although Object Pascal (Delphi) has TrimLeft and TrimRight functions.[1]
- Whitespace character list parameterization
- Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, and also offers variants with a predicate parameter (a functor) to select which characters are trimmed.
- Special empty string return value
- An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, Apache Jakarta's StringUtils has a function called stripToNull which returns null in place of an empty string.
- Space normalization
- Space normalization is a related string manipulation where, in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is performed by the function named Trim() in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space() function in XSLT and XPath.
- In-place trimming
- While most algorithms return a new (trimmed) string, some alter the original string in-place. Notably, the Boost library allows either in-place trimming or a trimmed copy to be returned.
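The first four variants above can be sketched in Python. This is only an illustration: lstrip, rstrip, strip and split are Python built-ins, while strip_to_none and normalize_space are hypothetical helper names introduced here, not functions from any library mentioned above.

```python
from typing import Optional

text = "  --hello--  "

# Left or right trimming only.
left = text.lstrip()    # leading whitespace removed
right = text.rstrip()   # trailing whitespace removed

# Character-list parameterization: trim a caller-supplied set of characters.
custom = text.strip(" -")


def strip_to_none(s: str) -> Optional[str]:
    """Hypothetical helper: return None when nothing remains after trimming."""
    stripped = s.strip()
    return stripped if stripped else None


def normalize_space(s: str) -> str:
    """Hypothetical helper: trim, and collapse inner whitespace runs to one space."""
    # str.split() with no argument splits on whitespace runs and drops empty ends.
    return " ".join(s.split())
```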
Definition of whitespace
The characters that are considered whitespace vary between programming languages and implementations. For example, C traditionally only counts space, tab, line feed, and carriage return characters, while languages which support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.
Java's trim method considers ASCII spaces and control codes as whitespace, contrasting with the Java isWhitespace() method,[2] which recognizes all Unicode space characters.
Delphi's Trim function considers characters U+0000 (NULL) through U+0020 (SPACE) to be whitespace.
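The practical effect of these differing definitions can be seen in a short Python sketch. The helper trim_through_space is a hypothetical function written for this illustration; it imitates the "U+0000 through U+0020" rule described above and is compared with Python's own str.strip, which follows Unicode whitespace rules.

```python
def trim_through_space(s: str) -> str:
    """Hypothetical helper: trim every character from U+0000 through U+0020."""
    start, end = 0, len(s)
    while start < end and s[start] <= "\u0020":
        start += 1
    while end > start and s[end - 1] <= "\u0020":
        end -= 1
    return s[start:end]


# NULL, tab and space in front; space and no-break space (U+00A0) at the end.
sample = "\u0000\t data \u00a0"

# Control codes and ASCII space are removed, but the no-break space stays.
by_code_point = trim_through_space(sample)

# str.strip() removes the no-break space, but stops at the leading NULL.
by_unicode = sample.strip()
```

The two results differ at both ends of the string, which is why code ported between languages should not assume that "trim" means the same thing everywhere.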
Usage
Following are examples of trimming a string using several programming languages. All of the implementations shown return a new string and do not alter the original variable.
Example usage | Languages |
---|---|
String.Trim([chars]) | C#, VB.NET, Windows PowerShell |
string.strip(); | D |
(.trim string) | Clojure |
sequence [ predicate? ] trim | Factor |
(string-trim '(#\Space #\Tab #\Newline) string) | Common Lisp |
(string-trim string) | Scheme |
string.trim() | Java, JavaScript (1.8.1+, Firefox 3.5+) |
Trim(String) | Pascal,[3] QBasic, Visual Basic, Delphi |
string.strip() | Python |
strings.Trim(string, chars) | Go |
LTRIM(RTRIM(String)) | Oracle SQL, T-SQL |
strip(string [,option, char]) | REXX |
string:strip(string [,option, char]) | Erlang |
string.strip | Ruby |
string =~ s/^\s+//r =~ s/\s+$//r | Perl 5 |
string.trim | Perl 6 |
trim(string) | PHP |
[string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] | Objective-C using Cocoa |
string withBlanksTrimmed, string withoutSpaces, string withoutSeparators | Smalltalk (Squeak, Pharo), Smalltalk |
strip(string) | SAS |
string trim $string | Tcl |
TRIM(string) or TRIM(ADJUSTL(string)) | Fortran |
TRIM(string) | SQL |
TRIM(string) or LTrim(string) or RTrim(String) | ColdFusion |
String.trim string | OCaml 4+ |
Other languages
In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
AWK
In AWK, one can use regular expressions to trim:
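gsub(/^[ \t]+/, "", v)           # ltrim: remove leading spaces and tabs from v
gsub(/[ \t]+$/, "", v)           # rtrim: remove trailing spaces and tabs from v
gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim: remove both in one call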
or:
function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
function trim(s) { return rtrim(ltrim(s)); }
C/C++
There is no standard trim function in C or C++. Most of the available string libraries[4] for C contain code which implements trimming, or functions that significantly ease an efficient implementation. The function has also often been called EatWhitespace in some non-standard C libraries.
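In C, programmers often combine an ltrim and an rtrim to implement trim:
#include <ctype.h>   /* isspace */
#include <string.h>  /* strlen, memmove */

/* Remove trailing whitespace by moving the terminator to the left. */
void rtrim(char *str)
{
    size_t n = strlen(str);
    while (n > 0 && isspace((unsigned char)str[n - 1])) {
        n--;
    }
    str[n] = '\0';
}

/* Remove leading whitespace by shifting the remainder to the front. */
void ltrim(char *str)
{
    size_t n = 0;
    while (str[n] != '\0' && isspace((unsigned char)str[n])) {
        n++;
    }
    /* The + 1 copies the terminating '\0' as well. */
    memmove(str, str + n, strlen(str) - n + 1);
}

/* Trim both ends in place; rtrim runs first so ltrim moves fewer bytes. */
void trim(char *str)
{
    rtrim(str);
    ltrim(str);
}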
The open source C++ library Boost has several trim variants, including a standard one:[5]
#include <string>
#include <boost/algorithm/string/trim.hpp>
std::string trimmed = boost::algorithm::trim_copy(std::string(" string "));
Note that Boost's function named simply trim modifies the input sequence in-place[6] and does not return a result.
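The distinction between in-place trimming and returning a trimmed copy can be sketched in Python using a mutable bytearray. The function trim_in_place is a hypothetical helper written for this illustration, and the set of bytes it treats as whitespace is an assumption (the six ASCII whitespace characters).

```python
def trim_in_place(buf: bytearray) -> None:
    """Hypothetical helper: remove leading and trailing ASCII whitespace from buf itself."""
    whitespace = b" \t\n\r\x0b\x0c"  # assumed whitespace set for this sketch
    end = len(buf)
    while end > 0 and buf[end - 1] in whitespace:
        end -= 1
    del buf[end:]          # cut the tail first, as the C version above does
    start = 0
    while start < len(buf) and buf[start] in whitespace:
        start += 1
    del buf[:start]        # then shift the remainder to the front


data = bytearray(b"  \tabc \n")
trim_in_place(data)        # data itself is modified; nothing is returned

source = b"  abc "
copied = source.strip()    # a new object is returned; source is untouched
```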
Another open source C++ library, Qt, has several trim variants, including a standard one:[7]
#include <QString>
trimmed = s.trimmed();
The Linux kernel also includes a strip function, strstrip(), since 2.6.18-rc1, which trims the string "in place". Since 2.6.33-rc1, the kernel uses strim() instead of strstrip() to avoid false warnings.[8]
Haskell
A trim algorithm in Haskell:
import Data.Char (isSpace)
trim :: String -> String
trim = f . f
    where f = reverse . dropWhile isSpace
This may be interpreted as follows: f drops the leading whitespace and then reverses the string. f is then applied again to its own output, which drops what was originally the trailing whitespace and restores the original order. Note that the type signature (the second line) is optional.
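For readers unfamiliar with Haskell, the same "drop leading whitespace, reverse, and repeat" idea can be sketched in Python. The names trim and f mirror the Haskell definition; this is a transliteration for illustration, not an efficient way to trim in Python.

```python
from itertools import dropwhile


def trim(s: str) -> str:
    """Apply 'drop leading whitespace, then reverse' twice."""
    def f(chars):
        # Drop the leading whitespace, then reverse what is left.
        return list(dropwhile(str.isspace, chars))[::-1]

    # The first pass handles the front; the second handles the (reversed) back.
    return "".join(f(f(s)))
```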
J
The trim algorithm in J is a functional description:
trim =. #~ [: (+./\ *. +./\.) ' '&~:
That is: filter (#~) for non-space characters (' '&~:) between leading (+./\) and (*.) trailing (+./\.) spaces.
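The mask-based approach can be spelled out step by step in Python. This is a sketch under the assumption that the J expression is read as "running OR from the left, AND running OR from the right, of the non-space mask"; the function name trim_spaces and the variable names are hypothetical, and the J fragments in the comments are that reading rather than a formal translation.

```python
from itertools import accumulate
from operator import or_


def trim_spaces(s: str) -> str:
    """Keep only characters lying between the first and last non-space character."""
    not_space = [c != " " for c in s]                        # ' '&~:
    # True from the first non-space character onward.
    from_left = list(accumulate(not_space, or_))             # +./\
    # True up to and including the last non-space character.
    from_right = list(accumulate(reversed(not_space), or_))[::-1]  # +./\.
    keep = [a and b for a, b in zip(from_left, from_right)]  # *.
    return "".join(c for c, k in zip(s, keep) if k)          # #~
```

Like the J version, this sketch treats only the space character as trimmable; tabs and newlines are kept.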
JavaScript
There is a built-in trim function in JavaScript 1.8.1 (Firefox 3.5 and later), and the ECMAScript 5 standard. In earlier versions it can be added to the String object's prototype as follows:
String.prototype.trim = function() {
    return this.replace(/^\s+/g, "").replace(/\s+$/g, "");
};
Perl
Perl 5 has no built-in trim function. However, the functionality is commonly achieved using regular expressions.
Example:
$string =~ s/^\s+//; # remove leading whitespace
$string =~ s/\s+$//; # remove trailing whitespace
or:
$string =~ s/^\s+|\s+$//g ; # remove both leading and trailing whitespace
These examples modify the value of the original variable $string.
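The single-expression form translates almost directly to other regular expression engines. A minimal Python sketch using the standard re module follows; trim_regex is a hypothetical name, and unlike the Perl statements it returns a new string rather than modifying a variable.

```python
import re


def trim_regex(s: str) -> str:
    """Remove leading and trailing whitespace with one alternation."""
    # re.sub replaces every match, which plays the role of Perl's /g flag.
    return re.sub(r"^\s+|\s+$", "", s)
```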
Also available for Perl is StripLTSpace in String::Strip from CPAN.
There are, however, two functions that are commonly used to strip whitespace from the end of strings, chomp and chop:
- chop removes the last character from a string and returns it.
- chomp removes the trailing newline character(s) from a string if present. (What constitutes a newline depends on $INPUT_RECORD_SEPARATOR.)
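The difference between the two can be made concrete with rough Python analogues. Both functions below are hypothetical helpers written for this comparison. They return the shortened string instead of modifying a variable as Perl does, and the chomp analogue assumes a single fixed separator.

```python
def chop(s: str) -> str:
    """Rough analogue of Perl's chop: drop the last character, whatever it is."""
    return s[:-1]


def chomp(s: str, separator: str = "\n") -> str:
    """Rough analogue of Perl's chomp: drop one trailing separator, if present."""
    if separator and s.endswith(separator):
        return s[: -len(separator)]
    return s
```

Applied to a string without a trailing newline, chop still removes a character while chomp leaves the string alone, which is why chomp is usually the safer choice when reading lines.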
In Perl 6 (since renamed Raku), strings have a trim method.
Example:
$string = $string.trim; # remove leading and trailing whitespace
$string .= trim; # same thing
Tcl
The Tcl string command has three relevant subcommands: trim, trimright and trimleft. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove. The default is whitespace (space, tab, newline, carriage return).
Example of trimming vowels:
set string onomatopoeia
set trimmed [string trim $string aeiou] ;# result is nomatop
set r_trimmed [string trimright $string aeiou] ;# result is onomatop
set l_trimmed [string trimleft $string aeiou] ;# result is nomatopoeia
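For comparison, Python's strip family accepts the same kind of character-set argument, so the vowel example can be reproduced as a quick check (shown only as a cross-language illustration):

```python
word = "onomatopoeia"

# The argument is a set of characters to remove, not a prefix or suffix.
trimmed = word.strip("aeiou")      # 'nomatop'
r_trimmed = word.rstrip("aeiou")   # 'onomatop'
l_trimmed = word.lstrip("aeiou")   # 'nomatopoeia'
```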
XSLT
XSLT includes the function normalize-space(string) which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.
Example:
<xsl:variable name='trimmed'>
<xsl:value-of select='normalize-space(string)'/>
</xsl:variable>
XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.
Another XSLT technique for trimming is to utilize the XPath 2.0 substring() function.
External links
- Tcl: string trim
- Faster JavaScript Trim - compares various JavaScript trim implementations
Notes
- [1] http://www.freepascal.org/docs-html/rtl/sysutils/trim.html
- [2] http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Character.html#isWhitespace(char)
- [3] http://gnu-pascal.de/gpc-hr/Trim.html
- [4] http://www.and.org/vstr/comparison
- [5] http://www.boost.org/doc/html/string_algo/usage.html#id2742817
- [6] http://www.boost.org/doc/html/trim.html
- [7] http://doc.trolltech.com/4.5/qstring.html#trimmed
- [8] http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.33-rc1-git1.log