Until now, we've only used the test harness to create Pattern objects in their most basic form. This section explores advanced techniques such as creating patterns with flags and using embedded flag expressions. It also explores some remaining methods we haven't discussed yet.

Creating a Pattern with Flags

The Pattern class defines an alternate compile method that accepts a set of flags affecting the way the pattern is matched. The flags parameter is a bit mask that may include any of the following public static fields:

    Pattern.CANON_EQ
    Pattern.CASE_INSENSITIVE
    Pattern.COMMENTS
    Pattern.DOTALL
    Pattern.MULTILINE
    Pattern.UNICODE_CASE
    Pattern.UNIX_LINES

In the following steps we will modify the test harness, RegexTestHarness, to create a pattern with case-insensitive matching.

First, modify the code to call the alternate version of compile:

    pattern = Pattern.compile(REGEX, Pattern.CASE_INSENSITIVE);

Then edit your input file, regex.txt, to contain the following:

    dog
    DoGDOg

Finally, compile and run the test harness to get the following results:

    Current REGEX is: dog
    Current INPUT is: DoGDOg
    I found the text "DoG" starting at index 0 and ending at index 3.
    I found the text "DOg" starting at index 3 and ending at index 6.

As you can see, the string literal "dog" matches both occurrences, regardless of case. To compile a pattern with multiple flags, separate the flags to be included using the bitwise OR operator (|):

    pattern = Pattern.compile("[az]$", Pattern.MULTILINE | Pattern.UNIX_LINES);

Note that for clarity you could also specify an int variable:

    final int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
    Pattern pattern = Pattern.compile("aa", flags);

Embedded Flag Expressions

It's also possible to enable various flags using embedded flag expressions. Embedded flag expressions are an alternative to the two-argument version of compile, and are specified in the regular expression itself. The following example uses the original test harness, RegexTestHarness.java, with the embedded flag expression (?i) to enable case-insensitive matching.

    Current REGEX is: (?i)foo
    Current INPUT is: FOOfooFoOfoO
    I found the text "FOO" starting at index 0 and ending at index 3.
    I found the text "foo" starting at index 3 and ending at index 6.
    I found the text "FoO" starting at index 6 and ending at index 9.
    I found the text "foO" starting at index 9 and ending at index 12.

Once again, all matches succeed regardless of case.

The embedded flag expressions that correspond to Pattern's publicly accessible fields are presented in the following table:

    Constant                    Equivalent Embedded Flag Expression
    Pattern.CANON_EQ            None
    Pattern.CASE_INSENSITIVE    (?i)
    Pattern.COMMENTS            (?x)
    Pattern.MULTILINE           (?m)
    Pattern.DOTALL              (?s)
    Pattern.UNICODE_CASE        (?u)
    Pattern.UNIX_LINES          (?d)
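
To make the correspondence in the table concrete, here is a minimal sketch comparing the two-argument compile call with the equivalent embedded flag expression. The class name FlagEquivalenceDemo and the helper findAll are hypothetical names chosen for illustration; they are not part of the test harness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of RegexTestHarness.
final class FlagEquivalenceDemo {

    // Collects every substring of input that the pattern finds, in order.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<String>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "DoGDOg";

        // Flag passed as the second argument to compile.
        Pattern withFlag = Pattern.compile("dog", Pattern.CASE_INSENSITIVE);
        // Same flag enabled inside the regular expression itself.
        Pattern embedded = Pattern.compile("(?i)dog");
        // No flag at all, for comparison.
        Pattern plain = Pattern.compile("dog");

        System.out.println(findAll(withFlag, input)); // [DoG, DOg]
        System.out.println(findAll(embedded, input)); // [DoG, DOg]
        System.out.println(findAll(plain, input));    // []

        // Several flags: bitwise OR on one side, combined letters on the other.
        Pattern orFlags = Pattern.compile("a.b", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Pattern embeddedBoth = Pattern.compile("(?is)a.b");
        System.out.println(orFlags.matcher("A\nB").matches());      // true
        System.out.println(embeddedBoth.matcher("A\nB").matches()); // true
    }
}
```

Which form to prefer is largely a matter of taste: the two-argument call keeps the regular expression itself shorter, while the embedded form travels with the expression when it is stored in a file such as regex.txt.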

Using the matches(String,CharSequence) Method

The Pattern class defines a convenient matches method that allows you to quickly check whether an entire input string matches a given pattern. As with all public static methods, you should call matches with its class name, such as Pattern.matches("\\d", "1"). In this example, the method returns true, because the digit "1" matches the regular expression \d.
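
The following sketch illustrates one detail that is easy to miss: matches succeeds only when the whole input matches the expression, whereas Matcher.find looks for the pattern anywhere in the input. The class MatchesDemo and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting whole-input matching with searching.
final class MatchesDemo {

    // Compiled once so that repeated checks do not recompile the expression.
    private static final Pattern DIGIT = Pattern.compile("\\d");

    // True only if the entire input is exactly one digit.
    static boolean isSingleDigit(String input) {
        return DIGIT.matcher(input).matches();
    }

    // True if a digit occurs anywhere in the input.
    static boolean containsDigit(String input) {
        return DIGIT.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d", "1"));  // true
        System.out.println(Pattern.matches("\\d", "12")); // false: the whole input must match
        System.out.println(isSingleDigit("a1b"));         // false
        System.out.println(containsDigit("a1b"));         // true
    }
}
```

Pattern.matches compiles the expression on every call, so when the same expression is checked many times, a precompiled Pattern like the DIGIT constant above is usually the better choice.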

Using the split(String) Method

The split method is a great tool for gathering the text that lies on either side of the pattern that's been matched. As shown below in the SplitTest code, the split method could extract the words "one two three four five" from the string "one:two:three:four:five":

    import java.util.regex.*;

    public final class SplitTest {

        private static String REGEX = ":";
        private static String INPUT = "one:two:three:four:five";

        public static void main(String[] argv) {
            Pattern p = Pattern.compile(REGEX);
            // Break the input apart at every match of the pattern.
            String[] items = p.split(INPUT);
            for (int i = 0; i < items.length; i++) {
                System.out.println(items[i]);
            }
        }
    }

    OUTPUT:
    one
    two
    three
    four
    five

For simplicity, we've matched a string literal, the colon (:), instead of a complex regular expression. Since we're still using Pattern and Matcher objects, you can use split to get the text that falls on either side of any regular expression. Here's the same example, SplitTest2, modified to split on digits instead:

    import java.util.regex.*;

    public final class SplitTest2 {

        private static String REGEX = "\\d";
        private static String INPUT = "one9two4three7four1five";

        public static void main(String[] argv) {
            Pattern p = Pattern.compile(REGEX);
            // Each single digit now acts as a delimiter.
            String[] items = p.split(INPUT);
            for (int i = 0; i < items.length; i++) {
                System.out.println(items[i]);
            }
        }
    }

    OUTPUT:
    one
    two
    three
    four
    five

Pattern Method Equivalents in java.lang.String

Regular expression support has also been introduced to java.lang.String through several methods that mimic the behavior of java.util.regex.Pattern. For convenience, key excerpts from their API are presented below.

public boolean matches(String regex): Tells whether or not this string matches the given regular expression. An invocation of this method of the form str.matches(regex) yields exactly the same result as the expression Pattern.matches(regex, str).

public String[] split(String regex, int limit): Splits this string around matches of the given regular expression. An invocation of this method of the form str.split(regex, n) yields the same result as the expression Pattern.compile(regex).split(str, n).

public String[] split(String regex): Splits this string around matches of the given regular expression. This method works the same as if you invoked the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are not included in the resulting array.
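
The equivalences listed above can be checked directly. The sketch below is only an illustration: the class StringRegexDemo and its helpers sameMatches and sameSplit are hypothetical names, and the sample input with two trailing colons was chosen to show how the limit argument affects trailing empty strings.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class comparing String methods with their Pattern counterparts.
final class StringRegexDemo {

    // str.matches(regex) versus Pattern.matches(regex, str).
    static boolean sameMatches(String str, String regex) {
        return str.matches(regex) == Pattern.matches(regex, str);
    }

    // str.split(regex, n) versus Pattern.compile(regex).split(str, n).
    static boolean sameSplit(String str, String regex, int limit) {
        return Arrays.equals(str.split(regex, limit),
                             Pattern.compile(regex).split(str, limit));
    }

    public static void main(String[] args) {
        String input = "one:two:three::";

        System.out.println(sameMatches("1", "\\d"));   // true
        System.out.println(sameSplit(input, ":", 2));  // true

        // Limit of zero (the one-argument form): trailing empty strings are dropped.
        System.out.println(input.split(":").length);     // 3
        // Negative limit: trailing empty strings are kept.
        System.out.println(input.split(":", -1).length); // 5
        // Positive limit: at most that many items; the last holds the remainder.
        System.out.println(Arrays.toString(input.split(":", 2))); // [one, two:three::]
    }
}
```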
Copyright 1995-2004 Sun Microsystems, Inc. All rights reserved.