HBase Java Regular Expressions Support

This topic defines the subset of Java regular expressions that is supported for HPE Ezmeral Data Fabric Database tables.

Filters used with Scan operations support regular expressions. When you filter scans on HPE Ezmeral Data Fabric Database tables, you can use the regular expressions supported by the Perl-Compatible Regular Expressions (PCRE) library, as well as a subset of the regular expressions supported by java.util.regex.Pattern.

Characters

Pattern     Description
x           The character x
\\          The backslash character
\0n         The character with octal value 0n (0 <= n <= 7)
\0nn        The character with octal value 0nn (0 <= n <= 7)
\xhh        The character with hexadecimal value 0xhh
\t          The tab character ('\u0009')
\n          The newline (line feed) character ('\u000A')
\r          The carriage-return character ('\u000D')
\f          The form-feed character ('\u000C')
\a          The alert (bell) character ('\u0007')
\e          The escape character ('\u001B')
\cx         The control character corresponding to x
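
The escapes above can be tried locally before they are used in a scan filter. The following is a minimal sketch that assumes plain java.util.regex semantics; the class name CharacterEscapeSketch and its matches helper are illustrative only, and the database's filter engine may differ in edge cases.

```java
import java.util.regex.Pattern;

class CharacterEscapeSketch {
    // True if the entire input matches the regex under java.util.regex rules.
    // Hypothetical helper for illustration; not part of any database API.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // In Java source, every regex backslash is doubled inside a string literal.
        System.out.println(matches("a\\tb", "a\tb"));        // tab escape: true
        System.out.println(matches("\\x41", "A"));           // hexadecimal 0x41 is 'A': true
        System.out.println(matches("\\011", "\t"));          // octal 011 is the tab character: true
        System.out.println(matches("C:\\\\tmp", "C:\\tmp")); // literal backslash: true
        System.out.println(matches("\\cA", "\u0001"));       // control-A: true
    }
}
```

Note the two layers of escaping: the regex `\\` (one literal backslash) has to be written as `"\\\\"` in a Java string literal.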

Character Classes

Pattern     Description
[abc]       a, b, or c (simple class)
[^abc]      Any character except a, b, or c (negation)
[a-zA-Z]    a through z or A through Z, inclusive (range)
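
As a quick illustration of the simple, negated, and range classes above, here is a small sketch using java.util.regex. The class name CharacterClassSketch and its helper are assumptions made for this example, not part of the product.

```java
import java.util.regex.Pattern;

class CharacterClassSketch {
    // True if the entire input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));          // simple class: true
        System.out.println(matches("[^abc]", "b"));         // negation excludes b: false
        System.out.println(matches("[^abc]", "d"));         // negation accepts d: true
        System.out.println(matches("[a-zA-Z]+", "RowKey")); // range, repeated: true
        System.out.println(matches("[a-zA-Z]+", "Row1"));   // digit is outside the range: false
    }
}
```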

Predefined Character Classes

Pattern     Description
.           Any character (may or may not match line terminators)
\d          A digit: [0-9]
\D          A non-digit: [^0-9]
\s          A whitespace character: [ \t\n\x0B\f\r]
\S          A non-whitespace character: [^\s]
\w          A word character: [a-zA-Z_0-9]
\W          A non-word character: [^\w]
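
The predefined classes are shorthand for the bracketed classes shown in their descriptions. A minimal sketch, assuming default java.util.regex behavior; PredefinedClassSketch and the sample strings such as user042 are made up for illustration.

```java
import java.util.regex.Pattern;

class PredefinedClassSketch {
    // True if the entire input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("user\\d+", "user042"));         // digits after a prefix: true
        System.out.println(matches("\\D+", "abc1"));                // contains a digit: false
        System.out.println(matches("\\w+\\s\\w+", "hello world"));  // word, whitespace, word: true
        System.out.println(matches("\\S+", "hello world"));         // contains whitespace: false
        System.out.println(matches("\\w+\\W\\w+", "key-value"));    // '-' is a non-word character: true
        System.out.println(matches("a.c", "abc"));                  // dot matches any one character: true
    }
}
```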

Classes for Unicode Blocks and Categories

Pattern     Description
\p{Lu}      An uppercase letter (simple category)
\p{Sc}      A currency symbol
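
The two Unicode categories above can be exercised as follows. This is a sketch under java.util.regex semantics; the class name UnicodeCategorySketch and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

class UnicodeCategorySketch {
    // True if the entire input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\p{Lu}\\w*", "Table"));      // starts with an uppercase letter: true
        System.out.println(matches("\\p{Lu}\\w*", "table"));      // starts with a lowercase letter: false
        System.out.println(matches("\\p{Sc}\\d+", "$100"));       // dollar sign is a currency symbol: true
        System.out.println(matches("\\p{Sc}\\d+", "\u20AC100"));  // euro sign, written as a Unicode escape: true
        System.out.println(matches("\\p{Sc}\\d+", "#100"));       // '#' is not a currency symbol: false
    }
}
```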

Boundaries

Pattern     Description
^           The beginning of a line
$           The end of a line
\b          A word boundary
\B          A non-word boundary
\A          The beginning of the input
\G          The end of the previous match
\Z          The end of the input but for the final terminator, if any
\z          The end of the input
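
Boundary matchers consume no characters; they only assert a position. The sketch below uses java.util.regex with single-line inputs to show the difference between \b and \B, and between \Z and \z. The class BoundarySketch, its found helper, and the row-001 style keys are hypothetical.

```java
import java.util.regex.Pattern;

class BoundarySketch {
    // True if the regex matches anywhere in the input (hypothetical helper).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("\\bcat\\b", "a cat sat"));    // whole word: true
        System.out.println(found("\\bcat\\b", "concatenate"));  // embedded in a word: false
        System.out.println(found("\\Bcat\\B", "concatenate"));  // no boundary on either side: true
        System.out.println(found("^row", "row-001"));           // anchored at the start: true
        System.out.println(found("\\d+$", "row-001"));          // anchored at the end: true
        System.out.println(found("\\Arow", "row-001"));         // beginning of the input: true
        System.out.println(found("001\\z", "row-001\n"));       // trailing newline blocks the match: false
        System.out.println(found("001\\Z", "row-001\n"));       // final terminator is ignored: true
    }
}
```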

Greedy Quantifiers

Pattern     Description
X?          X, once or not at all
X*          X, zero or more times
X+          X, one or more times
X{n}        X, exactly n times
X{n,}       X, at least n times
X{n,m}      X, at least n but not more than m times

Reluctant Quantifiers

Pattern     Description
X??         X, once or not at all
X*?         X, zero or more times
X+?         X, one or more times
X{n}?       X, exactly n times
X{n,}?      X, at least n times
X{n,m}?     X, at least n but not more than m times

Possessive Quantifiers

Pattern     Description
X?+         X, once or not at all
X*+         X, zero or more times
X++         X, one or more times
X{n}+       X, exactly n times
X{n,}+      X, at least n times
X{n,m}+     X, at least n but not more than m times
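
The greedy, reluctant, and possessive forms share the same descriptions but differ in how much they consume and whether they backtrack. In java.util.regex, greedy quantifiers take as much as possible and give characters back if needed, reluctant ones take as little as possible, and possessive ones take as much as possible and never give anything back. A minimal sketch of that difference, with QuantifierSketch and firstMatch as illustrative names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierSketch {
    // Returns the first substring matched by the regex, or null if there is none.
    // Hypothetical helper for illustration.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        System.out.println(firstMatch("<.+>", input));   // greedy: <a><b>
        System.out.println(firstMatch("<.+?>", input));  // reluctant: <a>
        System.out.println(firstMatch("<.++>", input));  // possessive consumes the final '>': null

        System.out.println(firstMatch("\\d{2,3}", "12345"));   // greedy takes three digits: 123
        System.out.println(firstMatch("\\d{2,3}?", "12345"));  // reluctant takes two digits: 12
        System.out.println(firstMatch("\\d{2,3}+", "12345"));  // possessive takes three digits: 123
    }
}
```

The possessive case returns null because `.++` swallows the closing `>` and cannot release it for the rest of the pattern.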

Logical Operators

Pattern     Description
XY          X followed by Y
X|Y         Either X or Y
(X)         X, as a capturing group

Back References

Pattern     Description
\n          Whatever the nth capturing group matches (here n is a group number, as in \1, not the newline escape)
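
Alternation, capturing groups, and back references are often combined. The following sketch, written against java.util.regex, captures one alternative and then reuses a captured group with \1. GroupingSketch and group1 are hypothetical names introduced for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupingSketch {
    // Returns the text captured by group 1 of the first match, or null if nothing matches.
    // Hypothetical helper for illustration.
    static String group1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Alternation inside a capturing group, followed by a literal.
        System.out.println(group1("(cat|dog)food", "dogfood"));          // dog

        // \1 must repeat exactly what group 1 captured.
        System.out.println(group1("\\b(\\w+) \\1\\b", "it is is here")); // is
        System.out.println(group1("(\\w)\\1", "hello"));                 // l
        System.out.println(group1("(\\w)\\1", "world"));                 // null
    }
}
```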

Quotation

Pattern     Description
\           Nothing, but quotes the following character
\Q          Nothing, but quotes all characters until \E
\E          Nothing, but ends quoting started by \Q
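
Quoting matters when the value being matched contains regex metacharacters such as `.` or `+`. A minimal sketch, assuming java.util.regex semantics; QuotingSketch is an illustrative name, while Pattern.quote is the standard Java helper that wraps a literal in \Q and \E.

```java
import java.util.regex.Pattern;

class QuotingSketch {
    // True if the entire input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("a.b", "axb"));            // unquoted dot matches any character: true
        System.out.println(matches("a\\.b", "axb"));          // quoted dot is literal: false
        System.out.println(matches("a\\.b", "a.b"));          // quoted dot matches a dot: true

        System.out.println(matches("1+1=2", "1+1=2"));        // '+' acts as a quantifier: false
        System.out.println(matches("\\Q1+1=2\\E", "1+1=2"));  // everything between the markers is literal: true

        // Pattern.quote builds the quoted form for an arbitrary literal.
        System.out.println(Pattern.quote("1+1=2"));           // \Q1+1=2\E
    }
}
```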

Special Constructs

Pattern     Description
(?:X)       X, as a non-capturing group
(?=X)       X, via zero-width positive lookahead
(?!X)       X, via zero-width negative lookahead
(?<=X)      X, via zero-width positive lookbehind
(?<!X)      X, via zero-width negative lookbehind
(?>X)       X, as an independent, non-capturing group
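
Lookaround constructs test the surrounding text without including it in the match, and the independent group (?>X) does not backtrack into X once it has matched. The sketch below shows each construct under java.util.regex; SpecialConstructSketch, firstMatch, and the sample strings are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpecialConstructSketch {
    // Returns the first substring matched by the regex, or null if there is none.
    // Hypothetical helper for illustration.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Non-capturing group: groups the repetition without creating a numbered group.
        System.out.println(firstMatch("(?:ab)+c", "xababc"));                         // ababc
        System.out.println(Pattern.compile("(?:ab)+c").matcher("").groupCount());     // 0

        // Lookahead and lookbehind assert context but are not part of the match.
        System.out.println(firstMatch("\\d+(?=USD)", "100USD"));     // 100
        System.out.println(firstMatch("foo(?!bar)", "foobar"));      // null
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost $42"));  // 42
        System.out.println(firstMatch("(?<!un)happy", "unhappy"));   // null

        // Independent group: a+ keeps every 'a', so the trailing "ab" can never match.
        System.out.println(firstMatch("(?>a+)ab", "aaab"));          // null
        System.out.println(firstMatch("(?:a+)ab", "aaab"));          // aaab
    }
}
```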