Character classes

[character_group]               Positive character group: Matches any single character in character_group. By default, the match is case-sensitive.
[^character_group]              Negation: Matches any single character that is not in character_group. By default, characters in character_group are case-sensitive.
[firstCharacter-lastCharacter]  Character range: Matches any single character in the range from firstCharacter to lastCharacter.
.                               Wildcard: Matches any single character except \n.
\p{name}                        Matches any single character in the Unicode general category or named block specified by name.
\P{name}                        Matches any single character that is not in the Unicode general category or named block specified by name.
\w                              Matches any word character.
\W                              Matches any non-word character.
\s                              Matches any white-space character.
\S                              Matches any non-white-space character.
\d                              Matches any decimal digit.
\D                              Matches any character that is not a decimal digit.
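
As an illustration of the constructs above, here is a minimal sketch using Python's standard re module. This is an assumption made for the sake of a runnable example: the table does not name a regex engine, Python's re shares the bracket and shorthand classes but does not support \p{name} or \P{name}, and its exact definitions of \w, \s, and \d may differ slightly from other engines. The sample string is made up.

```python
import re

text = "Gray cat, 42 fish"  # made-up sample input

# [character_group]: any single character listed in the group.
group = re.findall(r"[ae]", text)                   # ['a', 'a']
# [^character_group]: any single character NOT in the group.
negated = re.findall(r"[^a-z ]", text)              # ['G', ',', '4', '2']
# [first-last]: any single character in the range.
in_range = re.findall(r"[a-c]", text)               # ['a', 'c', 'a']
# Groups are case-sensitive unless an ignore-case option is set.
strict = re.findall(r"[g]", text)                   # []
relaxed = re.findall(r"[g]", text, re.IGNORECASE)   # ['G']
# . matches any single character except \n.
dots = re.findall(r".", "a\nb")                     # ['a', 'b']

# Shorthand classes and their negations.
words = re.findall(r"\w+", text)        # ['Gray', 'cat', '42', 'fish']
non_words = re.findall(r"\W", text)     # [' ', ',', ' ', ' ']
spaces = re.findall(r"\s", text)        # [' ', ' ', ' ']
non_spaces = re.findall(r"\S+", text)   # ['Gray', 'cat,', '42', 'fish']
digits = re.findall(r"\d", text)        # ['4', '2']
non_digits = re.findall(r"\D+", text)   # ['Gray cat, ', ' fish']

print(words, digits, negated)
```

Each uppercase shorthand (\W, \S, \D) matches exactly the characters its lowercase counterpart rejects, which is why the two result lists for each pair never overlap.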


Unicode general categories

Lu  Letter, Uppercase
Ll  Letter, Lowercase
Lt  Letter, Titlecase
Lm  Letter, Modifier
Lo  Letter, Other
Mn  Mark, Nonspacing
Mc  Mark, Spacing Combining
Me  Mark, Enclosing
Nd  Number, Decimal Digit
Nl  Number, Letter
No  Number, Other
Pc  Punctuation, Connector
Pd  Punctuation, Dash
Ps  Punctuation, Open
Pe  Punctuation, Close
Pi  Punctuation, Initial quote
Pf  Punctuation, Final quote
Po  Punctuation, Other
Sm  Symbol, Math
Sc  Symbol, Currency
Sk  Symbol, Modifier
So  Symbol, Other
Zs  Separator, Space
Zl  Separator, Line
Zp  Separator, Paragraph
Cc  Other, Control
Cf  Other, Format
Cs  Other, Surrogate
Co  Other, Private Use
Cn  Other, Not Assigned
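C   All Other categories: Cc, Cf, Cs, Co, and Cn
L   All Letter categories: Lu, Ll, Lt, Lm, and Lo
M   All Mark categories: Mn, Mc, and Me
N   All Number categories: Nd, Nl, and No
P   All Punctuation categories: Pc, Pd, Ps, Pe, Pi, Pf, and Po
S   All Symbol categories: Sm, Sc, Sk, and So
Z   All Separator categories: Zs, Zl, and Zp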
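
To see which characters fall into these categories, the sketch below emulates \p{name} and \P{name} with Python's standard unicodedata module, since Python's re module has no \p support. The helper names in_category and find_all are hypothetical, the sample string is made up, and the results depend on the Unicode version bundled with the Python interpreter, which may not match the version used by another regex engine.

```python
import unicodedata


def in_category(ch: str, name: str) -> bool:
    # Emulates \p{name}: a two-letter name must equal the character's
    # general category; a one-letter name matches the whole group.
    category = unicodedata.category(ch)
    if len(name) == 2:
        return category == name
    return category.startswith(name)


def find_all(text: str, name: str, negate: bool = False) -> list:
    # negate=True emulates \P{name}.
    return [ch for ch in text if in_category(ch, name) != negate]


# "Ünïcode 2024: €5 (ok)", written with escapes to keep the source ASCII.
sample = "\u00dcn\u00efcode 2024: \u20ac5 (ok)"

upper = find_all(sample, "Lu")        # uppercase letters only
digits = find_all(sample, "Nd")       # decimal digits
currency = find_all(sample, "Sc")     # currency symbols
punctuation = find_all(sample, "P")   # any Punctuation subcategory
non_letters = "".join(find_all(sample, "L", negate=True))

print(upper, digits, currency, punctuation, repr(non_letters))
```

Treating a one-letter name as a prefix test mirrors the grouping rows above: P covers Pc, Pd, Ps, Pe, Pi, Pf, and Po in a single check.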


Named blocks

Code point range  Block name
0000~007F  IsBasicLatin
0080~00FF  IsLatin-1Supplement
0100~017F  IsLatinExtended-A
0180~024F  IsLatinExtended-B
0250~02AF  IsIPAExtensions
02B0~02FF  IsSpacingModifierLetters
0300~036F  IsCombiningDiacriticalMarks
0370~03FF  IsGreek or IsGreekandCoptic
0400~04FF  IsCyrillic
0500~052F  IsCyrillicSupplement
0530~058F  IsArmenian
0590~05FF  IsHebrew
0600~06FF  IsArabic
0700~074F  IsSyriac
0780~07BF  IsThaana
0900~097F  IsDevanagari
0980~09FF  IsBengali
0A00~0A7F  IsGurmukhi
0A80~0AFF  IsGujarati
0B00~0B7F  IsOriya
0B80~0BFF  IsTamil
0C00~0C7F  IsTelugu
0C80~0CFF  IsKannada
0D00~0D7F  IsMalayalam
0D80~0DFF  IsSinhala
0E00~0E7F  IsThai
0E80~0EFF  IsLao
0F00~0FFF  IsTibetan
1000~109F  IsMyanmar
10A0~10FF  IsGeorgian
1100~11FF  IsHangulJamo
1200~137F  IsEthiopic
13A0~13FF  IsCherokee
1400~167F  IsUnifiedCanadianAboriginalSyllabics
1680~169F  IsOgham
16A0~16FF  IsRunic
1700~171F  IsTagalog
1720~173F  IsHanunoo
1740~175F  IsBuhid
1760~177F  IsTagbanwa
1780~17FF  IsKhmer
1800~18AF  IsMongolian
1900~194F  IsLimbu
1950~197F  IsTaiLe
19E0~19FF  IsKhmerSymbols
1D00~1D7F  IsPhoneticExtensions
1E00~1EFF  IsLatinExtendedAdditional
1F00~1FFF  IsGreekExtended
2000~206F  IsGeneralPunctuation
2070~209F  IsSuperscriptsandSubscripts
20A0~20CF  IsCurrencySymbols
20D0~20FF  IsCombiningDiacriticalMarksforSymbols or IsCombiningMarksforSymbols
2100~214F  IsLetterlikeSymbols
2150~218F  IsNumberForms
2190~21FF  IsArrows
2200~22FF  IsMathematicalOperators
2300~23FF  IsMiscellaneousTechnical
2400~243F  IsControlPictures
2440~245F  IsOpticalCharacterRecognition
2460~24FF  IsEnclosedAlphanumerics
2500~257F  IsBoxDrawing
2580~259F  IsBlockElements
25A0~25FF  IsGeometricShapes
2600~26FF  IsMiscellaneousSymbols
2700~27BF  IsDingbats
27C0~27EF  IsMiscellaneousMathematicalSymbols-A
27F0~27FF  IsSupplementalArrows-A
2800~28FF  IsBraillePatterns
2900~297F  IsSupplementalArrows-B
2980~29FF  IsMiscellaneousMathematicalSymbols-B
2A00~2AFF  IsSupplementalMathematicalOperators
2B00~2BFF  IsMiscellaneousSymbolsandArrows
2E80~2EFF  IsCJKRadicalsSupplement
2F00~2FDF  IsKangxiRadicals
2FF0~2FFF  IsIdeographicDescriptionCharacters
3000~303F  IsCJKSymbolsandPunctuation
3040~309F  IsHiragana
30A0~30FF  IsKatakana
3100~312F  IsBopomofo
3130~318F  IsHangulCompatibilityJamo
3190~319F  IsKanbun
31A0~31BF  IsBopomofoExtended
31F0~31FF  IsKatakanaPhoneticExtensions
3200~32FF  IsEnclosedCJKLettersandMonths
3300~33FF  IsCJKCompatibility
3400~4DBF  IsCJKUnifiedIdeographsExtensionA
4DC0~4DFF  IsYijingHexagramSymbols
4E00~9FFF  IsCJKUnifiedIdeographs
A000~A48F  IsYiSyllables
A490~A4CF  IsYiRadicals
AC00~D7AF  IsHangulSyllables
D800~DB7F  IsHighSurrogates
DB80~DBFF  IsHighPrivateUseSurrogates
DC00~DFFF  IsLowSurrogates
E000~F8FF  IsPrivateUse or IsPrivateUseArea
F900~FAFF  IsCJKCompatibilityIdeographs
FB00~FB4F  IsAlphabeticPresentationForms
FB50~FDFF  IsArabicPresentationForms-A
FE00~FE0F  IsVariationSelectors
FE20~FE2F  IsCombiningHalfMarks
FE30~FE4F  IsCJKCompatibilityForms
FE50~FE6F  IsSmallFormVariants
FE70~FEFF  IsArabicPresentationForms-B
FF00~FFEF  IsHalfwidthandFullwidthForms
FFF0~FFFF  IsSpecials
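
Because each named block is just a contiguous code point range, a block test can be approximated in engines that lack \p{IsBlockName}. The sketch below assumes Python, whose re module does not support named blocks, and copies a handful of rows from the table above into a lookup list. BLOCKS, block_of, and greek_run are hypothetical names introduced for illustration, and only the listed subset of blocks is recognized.

```python
import re
from bisect import bisect_right

# (first code point, last code point, block name), sorted by first code point.
# Only a small subset of the table, chosen for illustration.
BLOCKS = [
    (0x0000, 0x007F, "IsBasicLatin"),
    (0x0080, 0x00FF, "IsLatin-1Supplement"),
    (0x0370, 0x03FF, "IsGreek"),
    (0x0400, 0x04FF, "IsCyrillic"),
    (0x3040, 0x309F, "IsHiragana"),
    (0x30A0, 0x30FF, "IsKatakana"),
    (0x4E00, 0x9FFF, "IsCJKUnifiedIdeographs"),
]
STARTS = [first for first, _, _ in BLOCKS]


def block_of(ch: str):
    # Binary-search for the last block starting at or before the code point,
    # then confirm the code point does not fall past that block's end.
    code_point = ord(ch)
    index = bisect_right(STARTS, code_point) - 1
    if index >= 0:
        _, last, name = BLOCKS[index]
        if code_point <= last:
            return name
    return None  # not in any block listed in BLOCKS


# A character range over the block's code points stands in for \p{IsGreek}.
greek_run = re.compile(r"[\u0370-\u03FF]+")

print(block_of("A"), block_of("\u03bb"), block_of("\u3042"))
print(greek_run.findall("alpha \u03b1\u03b2\u03b3 beta"))
```

The range-based pattern works for any row in the table: substitute the row's first and last code points into the bracket expression.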