java.util.regex
public final class java.util.regex.Pattern
Represents a pattern used for matching, searching, or replacing strings.
Patterns are specified in terms of regular expressions and compiled into
an instance of this class. They are then used in conjunction with a
Matcher to perform the actual search.
A typical use case looks like this:
Pattern p = Pattern.compile("Hello, A[a-z]*!");
Matcher m = p.matcher("Hello, Android!");
boolean b1 = m.matches(); // true
m.reset("Hello, Robot!"); // reuse the same Matcher with new input
boolean b2 = m.matches(); // false
The above code could also be written in a more compact fashion, though this
variant is less efficient, since Pattern and Matcher objects are created on
the fly instead of being reused:
boolean b1 = Pattern.matches("Hello, A[a-z]*!", "Hello, Android!"); // true
boolean b2 = Pattern.matches("Hello, A[a-z]*!", "Hello, Robot!"); // false
Please consult the
package documentation for an
overview of the regular expression syntax used in this class as well as
Android-specific implementation details.
Summary
Constants
Type | Constant | Description | Value | Hex value |
int | CANON_EQ | This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent. | 128 | 0x00000080 |
int | CASE_INSENSITIVE | This constant specifies that a Pattern is matched case-insensitively. | 2 | 0x00000002 |
int | COMMENTS | This constant specifies that a Pattern may contain whitespace or comments. | 4 | 0x00000004 |
int | DOTALL | This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case. | 32 | 0x00000020 |
int | LITERAL | This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings. | 16 | 0x00000010 |
int | MULTILINE | This constant specifies that the meta characters '^' and '$' match the beginning and end of each input line, respectively. | 8 | 0x00000008 |
int | UNICODE_CASE | This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters. | 64 | 0x00000040 |
int | UNIX_LINES | This constant specifies that only the Unix line ending ('\n') is recognized as a line terminator by the '.', '^', and '$' meta characters. | 1 | 0x00000001 |
Public Methods
Protected Methods
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString,
wait, wait, wait
Details
Constants
public static final int CANON_EQ
This constant specifies that a character in a Pattern and a character in
the input string only match if they are canonically equivalent. It is
(currently) not supported in Android.
Constant Value: 128 (0x00000080)
public static final int CASE_INSENSITIVE
This constant specifies that a Pattern is matched case-insensitively. That
is, the patterns "a+" and "A+" would both match the string "aAaAaA".
Note: For Android, the CASE_INSENSITIVE constant (currently) always
includes the meaning of the UNICODE_CASE constant. So if case insensitivity
is enabled, this automatically extends to all Unicode characters. The
UNICODE_CASE constant itself has no special consequences.
Constant Value: 2 (0x00000002)
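The "a+" / "A+" example above can be tried out with a small sketch like the following. The class and method names (CaseInsensitiveSketch, matchesIgnoringCase) are made up for illustration and are not part of the API.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate CASE_INSENSITIVE.
class CaseInsensitiveSketch {
    // Returns true if the whole input matches the regex, ignoring case.
    static boolean matchesIgnoringCase(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                .matcher(input)
                .matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("a+", "aAaAaA")); // true
        System.out.println(matchesIgnoringCase("A+", "aAaAaA")); // true
        // Without the flag, the upper-case letters in the input do not match.
        System.out.println(Pattern.matches("a+", "aAaAaA"));     // false
    }
}
```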
public static final int COMMENTS
This constant specifies that a Pattern may contain whitespace or comments.
Otherwise comments and whitespace are taken as literal characters.
Constant Value: 4 (0x00000004)
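As a rough illustration of the COMMENTS flag, the sketch below spreads a pattern over several lines and annotates each part. It assumes the usual convention that whitespace in the pattern is ignored and that '#' starts a comment running to the end of the line. The names CommentsSketch, PHONE and isPhone are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate COMMENTS.
class CommentsSketch {
    // Whitespace and '#' comments inside the pattern are skipped by the compiler.
    static final Pattern PHONE = Pattern.compile(
            "\\d{3}   # three-digit prefix\n"
          + "-        # literal dash\n"
          + "\\d{4}   # four-digit line number",
            Pattern.COMMENTS);

    static boolean isPhone(String input) {
        return PHONE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPhone("555-1234"));   // true
        // Only whitespace in the pattern is ignored, not whitespace in the input.
        System.out.println(isPhone("555 - 1234")); // false
    }
}
```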
public static final int DOTALL
This constant specifies that the '.' meta character matches arbitrary
characters, including line endings, which is normally not the case.
Constant Value: 32 (0x00000020)
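A minimal sketch of the difference DOTALL makes, using a pattern whose '.' has to cross a line ending. DotAllSketch and dotMatchesNewline are illustrative names only.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate DOTALL.
class DotAllSketch {
    // Tries to match "a.b" against an input whose middle character is '\n'.
    static boolean dotMatchesNewline(int flags) {
        return Pattern.compile("a.b", flags).matcher("a\nb").matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatchesNewline(0));              // false
        System.out.println(dotMatchesNewline(Pattern.DOTALL)); // true
    }
}
```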
public static final int LITERAL
This constant specifies that the whole Pattern is to be taken literally,
that is, all meta characters lose their meanings.
Constant Value: 16 (0x00000010)
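The following sketch shows one way the LITERAL flag might be used: the '.' in "a.c" is treated as a plain dot instead of a wildcard. LiteralSketch and literalMatch are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate LITERAL.
class LiteralSketch {
    // Matches the input against the text taken literally, with no meta characters.
    static boolean literalMatch(String text, String input) {
        return Pattern.compile(text, Pattern.LITERAL).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(literalMatch("a.c", "a.c"));    // true
        System.out.println(literalMatch("a.c", "abc"));    // false
        // Without LITERAL, '.' is a wildcard and "abc" matches.
        System.out.println(Pattern.matches("a.c", "abc")); // true
    }
}
```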
public static final int MULTILINE
This constant specifies that the meta characters '^' and '$' match the
beginning and end of each input line, respectively. Normally, they match
only the beginning and the end of the complete input.
Constant Value: 8 (0x00000008)
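To make the effect on '^' and '$' concrete, here is a small sketch that looks for a line consisting only of "b" inside a three-line input. MultilineSketch and findsLineB are illustrative names, not part of the API.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate MULTILINE.
class MultilineSketch {
    // Searches the input "a\nb\nc" for a line that is exactly "b".
    static boolean findsLineB(int flags) {
        return Pattern.compile("^b$", flags).matcher("a\nb\nc").find();
    }

    public static void main(String[] args) {
        // Default: '^' and '$' anchor to the whole input, so nothing is found.
        System.out.println(findsLineB(0));                 // false
        // MULTILINE: '^' and '$' also anchor at line boundaries.
        System.out.println(findsLineB(Pattern.MULTILINE)); // true
    }
}
```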
public static final int UNICODE_CASE
This constant specifies that a Pattern is matched case-insensitively with
regard to all Unicode characters. It is used in conjunction with the
CASE_INSENSITIVE constant to extend its meaning to all Unicode characters.
Note: For Android, the CASE_INSENSITIVE constant (currently) always
includes the meaning of the UNICODE_CASE constant. So if case insensitivity
is enabled, this automatically extends to all Unicode characters. The
UNICODE_CASE constant then has no special consequences.
Constant Value: 64 (0x00000040)
public static final int UNIX_LINES
This constant specifies that only the Unix line ending ('\n') is
recognized as a line terminator by the '.', '^', and '$' meta characters.
Constant Value: 1 (0x00000001)
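One observable consequence of UNIX_LINES, sketched below, is that a carriage return is no longer treated as a line ending, so '.' can match it. This reflects the behavior of the standard Java implementation; UnixLinesSketch and dotMatchesCarriageReturn are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate UNIX_LINES.
class UnixLinesSketch {
    // Tries to match a single '.' against the one-character input "\r".
    static boolean dotMatchesCarriageReturn(int flags) {
        return Pattern.compile(".", flags).matcher("\r").matches();
    }

    public static void main(String[] args) {
        // Default: '\r' counts as a line ending, which '.' does not match.
        System.out.println(dotMatchesCarriageReturn(0));                  // false
        // UNIX_LINES: only '\n' is a line ending, so '.' matches '\r'.
        System.out.println(dotMatchesCarriageReturn(Pattern.UNIX_LINES)); // true
    }
}
```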
Public Methods
public static Pattern compile(String pattern, int flags)
Compiles a regular expression, creating a new Pattern instance in the
process. The flags argument modifies the behavior of the Pattern.
Parameters
pattern | the regular expression. |
flags | the flags to set. Any combination (bitwise OR) of the constants
defined in this class is valid.
Note: Currently, the CASE_INSENSITIVE and UNICODE_CASE constants have
slightly special behavior in Android, and the CANON_EQ constant is not
supported at all. |
Returns
- the new Pattern instance.
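Since the flags are plain int bit masks, several of them can be combined with the bitwise OR operator. The sketch below is one possible way to do that; CompileFlagsSketch, GREETING and isGreeting are illustrative names only.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate combining flags.
class CompileFlagsSketch {
    // Case-insensitive, and '.' may also match a line ending.
    static final Pattern GREETING = Pattern.compile(
            "hello.world", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static boolean isGreeting(String input) {
        return GREETING.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isGreeting("hello world"));  // true
        System.out.println(isGreeting("HELLO\nWorld")); // true
        System.out.println(isGreeting("hello--world")); // false
    }
}
```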
public static Pattern compile(String pattern)
Compiles a regular expression, creating a new Pattern instance in the
process. This is a convenience method that calls compile(String, int) with
a flags value of zero.
Parameters
pattern | the regular expression. |
Returns
- the new Pattern instance.
public int flags()
Returns the flags that have been set for this Pattern.
Returns
- the flags that have been set. A combination of the constants
defined in this class.
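Because the returned value is a combination of bit masks, an individual flag can be tested with a bitwise AND. A minimal sketch, in which FlagsSketch and hasFlag are hypothetical names:

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate flags().
class FlagsSketch {
    // Returns true if the given flag constant is set on the pattern.
    static boolean hasFlag(Pattern pattern, int flag) {
        return (pattern.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("x", Pattern.MULTILINE | Pattern.COMMENTS);
        System.out.println(hasFlag(p, Pattern.MULTILINE)); // true
        System.out.println(hasFlag(p, Pattern.DOTALL));    // false
        // compile(String) uses a flags value of zero.
        System.out.println(Pattern.compile("x").flags());  // 0
    }
}
```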
public Matcher matcher(CharSequence input)
Returns a Matcher for the Pattern and a given input. The Matcher can be
used to match the Pattern against the whole input, find occurrences of the
Pattern in the input, or replace parts of the input.
Parameters
input | the input to process. |
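The three uses named above (matching the whole input, finding occurrences, replacing parts) could look roughly like this. MatcherSketch, DIGITS and findAll are illustrative names; the Matcher calls used are matches(), find(), group() and replaceAll(String).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate matcher().
class MatcherSketch {
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Collects every occurrence of the pattern in the input, in order.
    static List<String> findAll(Pattern pattern, CharSequence input) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Match against the whole input.
        System.out.println(DIGITS.matcher("12345").matches());          // true
        // Find occurrences inside the input.
        System.out.println(findAll(DIGITS, "a1b22c333"));               // [1, 22, 333]
        // Replace parts of the input.
        System.out.println(DIGITS.matcher("a1b22c333").replaceAll("#")); // a#b#c#
    }
}
```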
public static boolean matches(String regex, CharSequence input)
Tries to match a given regular expression against a given input. This is a
convenience method that compiles the regular expression into a Pattern,
builds a Matcher for it, and then does the match. If the same regular
expression is used for multiple operations, it is recommended to compile
it into a Pattern explicitly and request a reusable Matcher.
Parameters
regex | the regular expression. |
input | the input to process. |
Returns
- true if and only if the Pattern matches the input.
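The recommendation to compile once and reuse a Matcher might be followed as in the sketch below, which checks several inputs against one precompiled Pattern. StaticMatchesSketch and countMatching are hypothetical names; Matcher.reset(CharSequence) is used to point the same Matcher at each new input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate Matcher reuse.
class StaticMatchesSketch {
    // Counts how many inputs match the pattern, reusing a single Matcher.
    static int countMatching(Pattern pattern, String[] inputs) {
        Matcher m = pattern.matcher("");
        int count = 0;
        for (String input : inputs) {
            if (m.reset(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("Hello, A[a-z]*!");
        String[] inputs = {"Hello, Android!", "Hello, Robot!", "Hello, Alice!"};
        System.out.println(countMatching(p, inputs)); // 2
        // The one-shot variant compiles the expression again on every call.
        System.out.println(Pattern.matches("Hello, A[a-z]*!", "Hello, Android!")); // true
    }
}
```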
public String pattern()
Returns the regular expression that was compiled into this Pattern.
public static String quote(String s)
Quotes a given string using "\Q" and "\E", so that all other
meta-characters lose their special meaning. If the string is used for a
Pattern afterwards, it can only be matched literally.
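A typical reason to quote a string is to search for text that happens to contain meta-characters. The sketch below is one way to do that; QuoteSketch and containsLiteral are illustrative names.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate quote().
class QuoteSketch {
    // Returns true if the haystack contains the needle as literal text.
    static boolean containsLiteral(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLiteral("total: 1+1=2", "1+1")); // true
        // Unquoted, "1+1" means "one or more 1s followed by a 1".
        System.out.println(
                Pattern.compile("1+1").matcher("total: 1+1=2").find()); // false
    }
}
```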
public String[] split(CharSequence inputSeq, int limit)
Splits the given input sequence around occurrences of the Pattern. The
method first determines all occurrences of the Pattern inside the input
sequence. It then builds an array of the "remaining" strings before,
in-between, and after these occurrences. The limit parameter determines
the maximal number of entries in the resulting array and the handling of
trailing empty strings.
Parameters
inputSeq | the input sequence. |
limit | Determines the maximal number of entries in the resulting
array.
- For limit > 0, it is guaranteed that the resulting array
contains at most limit entries.
- For limit < 0, the length of the resulting array is
exactly the number of occurrences of the Pattern + 1.
All entries are included.
- For limit == 0, the length of the resulting array is at most
the number of occurrences of the Pattern + 1. Empty
strings at the end of the array are not included.
|
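The three cases for the limit can be compared side by side on an input that ends in empty fields. This is only a sketch; SplitLimitSketch and splitOnComma are hypothetical names.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the limit parameter.
class SplitLimitSketch {
    static final Pattern COMMA = Pattern.compile(",");

    static String[] splitOnComma(String input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,"; // four commas, two trailing empty fields
        // limit > 0: at most that many entries; the last entry holds the rest.
        System.out.println(Arrays.toString(splitOnComma(input, 2)));  // [a, b,c,,]
        // limit < 0: occurrences + 1 entries, trailing empty strings kept.
        System.out.println(splitOnComma(input, -1).length);           // 5
        // limit == 0: trailing empty strings dropped.
        System.out.println(Arrays.toString(splitOnComma(input, 0)));  // [a, b, c]
    }
}
```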
public String[] split(CharSequence input)
Splits a given input around occurrences of the Pattern. This is a
convenience method that is equivalent to calling the method
split(java.lang.CharSequence, int) with a limit of 0.
Parameters
input | the input sequence. |
public String toString()
Returns a string containing a concise, human-readable description of the
receiver.
Returns
- a printable representation of the receiver.
Protected Methods
protected void finalize()
Called by the virtual machine when there are no longer any (non-weak)
references to the receiver. Subclasses can use this facility to guarantee
that any associated resources are cleaned up before the receiver is
garbage collected. Uncaught exceptions which are thrown during the
running of the method cause it to terminate immediately, but are
otherwise ignored.
Note: The virtual machine assumes that the implementation in class Object
is empty.