Perl | Regex Character Classes
A character class matches one character out of a set of characters. It lets you match any of several characters when you do not know in advance which one will appear. The set of characters to be matched is written between square brackets []. A character class always matches exactly one character. If none of the listed characters is found, the whole regex fails to match.
Suppose you have many strings such as #g#, #e#, #k#, #s#, #.# or #@#, and you need to match a # character followed by 'g', 'e', 'k', 's', '.' or '@', followed by another # character. The regex /#[geks@.]#/ does this: it matches a #, then any one character listed in [], then another #. This regex will not match "##", "#ge#" or "#gg#", because a character class always matches exactly one character, here the single character between the two '#' characters.
Important Points:
- The dot (.) inside a character class loses its special meaning of "any character except newline".
- Inside a character class, the dot (.) matches only a literal dot.
- Most special characters lose their special meaning inside a character class, but a few characters, such as the caret (^), the dash (-), the backslash (\) and the closing bracket (]), have a special meaning there.
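As a minimal sketch of the points about the dot, the snippet below compares a dot used as a wildcard with a dot placed inside a character class. The pattern a.c, the sample strings and the helper name dot_check are illustrative assumptions, not part of the original example.

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether $s matches each of the two patterns.
sub dot_check {
    my ($s) = @_;
    my $wild    = ($s =~ /^a.c$/)   ? 1 : 0;   # dot outside a class: wildcard
    my $literal = ($s =~ /^a[.]c$/) ? 1 : 0;   # dot inside a class: literal dot
    return ($wild, $literal);
}

for my $s ('a.c', 'abc') {
    my ($wild, $literal) = dot_check($s);
    printf "%s: wildcard=%d class=%d\n", $s, $wild, $literal;
}
```

Only 'a.c' satisfies the class version, while both strings satisfy the wildcard version.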
    # Perl program to demonstrate
    # character class

    # Actual string
    my $str = "#g#";

    # Match '#', one character from the class, then '#'
    if ($str =~ /#[geks@.]#/) {
        # Prints "Match Found" if the pattern is found in $str
        print "Match Found\n";
    }
    else {
        # Prints "Match Not Found" if it is not found in $str
        print "Match Not Found\n";
    }

Output:
    Match Found
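To see the "exactly one character" rule in action, the following sketch runs the pattern /#[geks@.]#/ against several of the strings mentioned earlier. The helper name hash_wrapped and the choice of sample strings are assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumed pattern: '#', then exactly one character from the set, then '#'.
my $pattern = qr/#[geks@.]#/;

# Hypothetical helper: true if the string contains the pattern.
sub hash_wrapped {
    my ($str) = @_;
    return ($str =~ $pattern) ? 1 : 0;
}

for my $str ('#g#', '#.#', '#@#', '##', '#ge#', '#gg#') {
    my $result = hash_wrapped($str) ? "Match Found" : "Match Not Found";
    print "$str => $result\n";
}
```

The first three strings match. The last three do not, because they have either zero or two characters between the '#' characters.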
Range In Character Class: Typing out a long list of characters is tedious and error-prone, since it is easy to skip one or two of them. A range makes this easier: a dash (-) between two characters specifies every character from the first to the last.
For example, instead of [abcdef] you can write /[a-f]/.
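One way to convince yourself that the two forms are equivalent is to test every lowercase letter against both. This is only a sketch, and the variable name $mismatches is an assumption chosen for the illustration.

```perl
use strict;
use warnings;

# Compare the spelled-out class with its range form on every lowercase letter.
my $mismatches = 0;
for my $ch ('a' .. 'z') {
    my $listed = ($ch =~ /^[abcdef]$/) ? 1 : 0;
    my $ranged = ($ch =~ /^[a-f]$/)    ? 1 : 0;
    $mismatches++ if $listed != $ranged;
}
print "Mismatches: $mismatches\n";
```

The count stays at zero because both classes accept the same six letters and reject the rest.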
Important Points:
- A range is specified using the dash (-) symbol.
- You can combine multiple ranges of letters, digits, etc. in one class, like [0-9a-gA-G]. The class still matches a single character, which may be any character from any of the listed ranges.
- To match a literal dash (-), put it first or last inside the square brackets, as in [a-z-], or escape it as \-.
- To match a closing square bracket, precede it with a backslash, i.e. \], inside the square brackets, as in [\]].
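The last two points can be sketched as follows. The variable names $with_dash and $with_bracket and the sample strings are assumptions made for this illustration.

```perl
use strict;
use warnings;

my $with_dash    = qr/^[a-z0-9-]+$/;   # the trailing '-' is a literal dash
my $with_bracket = qr/[\]]/;           # '\]' is a literal closing bracket

print "dash ok\n"    if 'geeks-for-geeks' =~ $with_dash;
print "bracket ok\n" if 'list[0]' =~ $with_bracket;
print "no bracket\n" unless 'list' =~ $with_bracket;
```

Placing the dash at the end of the class avoids any ambiguity with a range, which is why it needs no backslash there.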
    # Perl program to demonstrate
    # range in character class

    # Actual string
    my $str = "61geeks";

    # Match any one character in the ranges 0-7 or a-z
    if ($str =~ /[0-7a-z]/) {
        # Prints "Match Found" if the pattern is found in $str
        print "Match Found\n";
    }
    else {
        # Prints "Match Not Found" if it is not found in $str
        print "Match Not Found\n";
    }

Output:
    Match Found
Negated Character Class: To negate a character class, put a caret (^) as the first character inside the brackets. The class then matches any one character except those listed, including any characters covered by a range. A caret placed anywhere else in the class is a literal caret.
    # Perl program to demonstrate
    # negated character class

    # Actual string
    my $str = "geeks56";

    # Using a negated character class:
    # match any one character that is not g, e, k, s or 0-7
    if ($str =~ /[^geeks0-7]/) {
        # Prints "Match Found" if such a character is found in $str
        print "Match Found\n";
    }
    else {
        # Prints "Match Not Found" if it is not found in $str
        print "Match Not Found\n";
    }

Output:
    Match Not Found
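The result above is "Match Not Found" because every character of "geeks56" belongs to the excluded set. As a rough sketch, the snippet below uses a capture group to show which character triggers a match in other strings. The helper name first_outside and the extra sample strings are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first character outside the set, or undef.
sub first_outside {
    my ($str) = @_;
    return $str =~ /([^geeks0-7])/ ? $1 : undef;
}

for my $str ('geeks56', 'geeks89', 'Geeks') {
    my $hit = first_outside($str);
    print "$str: ", defined $hit ? "Match Found ('$hit')" : "Match Not Found", "\n";
}
```

Here "geeks89" matches on '8', which lies outside the range 0-7, and "Geeks" matches on 'G', since character classes are case-sensitive unless the /i modifier is used.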