public final class Character extends Object implements Serializable, Comparable<Character>
The wrapper for the primitive type char. This class also provides a
number of utility methods for working with characters.
Character data is kept up to date as Unicode evolves.
See the Locale data section of the Locale documentation for details of
the Unicode versions implemented by current and historical Android releases.
The Unicode specification, character tables, and other information are available at http://www.unicode.org/.
Unicode characters are referred to as code points. The range of valid
code points is U+0000 to U+10FFFF. The Basic Multilingual Plane (BMP)
is the code point range U+0000 to U+FFFF. Characters above the BMP are
referred to as Supplementary Characters. On the Java platform, UTF-16
encoding and char pairs are used to represent code points in the
supplementary range. A pair of char values that represents a
supplementary character is made up of a high surrogate with a value
range of 0xD800 to 0xDBFF and a low surrogate with a value range of
0xDC00 to 0xDFFF.
On the Java platform a char value represents either a single BMP code
point or a UTF-16 unit that's part of a surrogate pair. The int type
is used to represent all Unicode code points.
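The surrogate ranges described above can be illustrated with a small sketch. It is not the platform's implementation: the class name SurrogatePairSketch and the encode helper are hypothetical, and the input is assumed to be a valid code point. The helper splits a supplementary code point into a high and a low surrogate by hand, and main compares the result with Character.toChars(int).

```java
import java.util.Arrays;

final class SurrogatePairSketch {
    /** Encodes a (valid) code point as one or two UTF-16 units by hand. */
    static char[] encode(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            // BMP code points fit in a single char.
            return new char[] { (char) codePoint };
        }
        // The offset above U+10000 has 20 significant bits: 10 per surrogate.
        int offset = codePoint - Character.MIN_SUPPLEMENTARY_CODE_POINT;
        char high = (char) (Character.MIN_HIGH_SURROGATE + (offset >>> 10));
        char low = (char) (Character.MIN_LOW_SURROGATE + (offset & 0x3FF));
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        int codePoint = 0x1F600; // a supplementary code point chosen for illustration
        char[] manual = encode(codePoint);
        char[] library = Character.toChars(codePoint);
        // Prints: U+1F600 -> D83D DE00
        System.out.printf("U+%X -> %04X %04X%n", codePoint, (int) manual[0], (int) manual[1]);
        // Prints: true
        System.out.println(Arrays.equals(manual, library));
    }
}
```

The high surrogate carries the upper 10 bits of the offset and the low surrogate the lower 10 bits, which is why both ranges span exactly 0x400 values.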
Unicode categories
Here's a list of the Unicode character categories and the corresponding Java constants,
grouped semantically to provide a convenient overview. This table is also useful in
conjunction with \p and \P in regular expressions.
Category | Description | Java constant |
---|---|---|
Cn | Unassigned | UNASSIGNED |
Cc | Control | CONTROL |
Cf | Format | FORMAT |
Co | Private use | PRIVATE_USE |
Cs | Surrogate | SURROGATE |
Lu | Uppercase letter | UPPERCASE_LETTER |
Ll | Lowercase letter | LOWERCASE_LETTER |
Lt | Titlecase letter | TITLECASE_LETTER |
Lm | Modifier letter | MODIFIER_LETTER |
Lo | Other letter | OTHER_LETTER |
Mn | Non-spacing mark | NON_SPACING_MARK |
Me | Enclosing mark | ENCLOSING_MARK |
Mc | Combining spacing mark | COMBINING_SPACING_MARK |
Nd | Decimal digit number | DECIMAL_DIGIT_NUMBER |
Nl | Letter number | LETTER_NUMBER |
No | Other number | OTHER_NUMBER |
Pd | Dash punctuation | DASH_PUNCTUATION |
Ps | Start punctuation | START_PUNCTUATION |
Pe | End punctuation | END_PUNCTUATION |
Pc | Connector punctuation | CONNECTOR_PUNCTUATION |
Pi | Initial quote punctuation | INITIAL_QUOTE_PUNCTUATION |
Pf | Final quote punctuation | FINAL_QUOTE_PUNCTUATION |
Po | Other punctuation | OTHER_PUNCTUATION |
Sm | Math symbol | MATH_SYMBOL |
Sc | Currency symbol | CURRENCY_SYMBOL |
Sk | Modifier symbol | MODIFIER_SYMBOL |
So | Other symbol | OTHER_SYMBOL |
Zs | Space separator | SPACE_SEPARATOR |
Zl | Line separator | LINE_SEPARATOR |
Zp | Paragraph separator | PARAGRAPH_SEPARATOR |
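As an illustration of how the category constants and the \p syntax relate, the following sketch maps a few getType results back to their two-letter codes and uses one of those codes in a regular expression. The class CategorySketch and its methods are hypothetical names introduced for this example, and only four categories are handled.

```java
import java.util.regex.Pattern;

final class CategorySketch {
    /** Returns the two-letter Unicode category code for a few common categories. */
    static String categoryCode(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "Lu";
            case Character.LOWERCASE_LETTER:     return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.SPACE_SEPARATOR:      return "Zs";
            default:                             return "other"; // remaining categories omitted for brevity
        }
    }

    /** Uses the category code Lu inside a regular expression. */
    static boolean isAllUppercase(String text) {
        return Pattern.matches("\\p{Lu}+", text);
    }

    public static void main(String[] args) {
        System.out.println(categoryCode('A'));      // Lu
        System.out.println(categoryCode('7'));      // Nd
        System.out.println(isAllUppercase("ABC"));  // true
        System.out.println(isAllUppercase("AbC"));  // false
    }
}
```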
Modifier and Type | Class and Description |
---|---|
static class |
Character.Subset |
static class |
Character.UnicodeBlock
Represents a block of Unicode characters, as defined by the Unicode 4.0.1
specification.
|
Modifier and Type | Field and Description |
---|---|
static byte |
COMBINING_SPACING_MARK
Unicode category constant Mc.
|
static byte |
CONNECTOR_PUNCTUATION
Unicode category constant Pc.
|
static byte |
CONTROL
Unicode category constant Cc.
|
static byte |
CURRENCY_SYMBOL
Unicode category constant Sc.
|
static byte |
DASH_PUNCTUATION
Unicode category constant Pd.
|
static byte |
DECIMAL_DIGIT_NUMBER
Unicode category constant Nd.
|
static byte |
DIRECTIONALITY_ARABIC_NUMBER
Unicode bidirectional constant AN.
|
static byte |
DIRECTIONALITY_BOUNDARY_NEUTRAL
Unicode bidirectional constant BN.
|
static byte |
DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
Unicode bidirectional constant CS.
|
static byte |
DIRECTIONALITY_EUROPEAN_NUMBER
Unicode bidirectional constant EN.
|
static byte |
DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
Unicode bidirectional constant ES.
|
static byte |
DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
Unicode bidirectional constant ET.
|
static byte |
DIRECTIONALITY_LEFT_TO_RIGHT
Unicode bidirectional constant L.
|
static byte |
DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
Unicode bidirectional constant LRE.
|
static byte |
DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
Unicode bidirectional constant LRO.
|
static byte |
DIRECTIONALITY_NONSPACING_MARK
Unicode bidirectional constant NSM.
|
static byte |
DIRECTIONALITY_OTHER_NEUTRALS
Unicode bidirectional constant ON.
|
static byte |
DIRECTIONALITY_PARAGRAPH_SEPARATOR
Unicode bidirectional constant B.
|
static byte |
DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
Unicode bidirectional constant PDF.
|
static byte |
DIRECTIONALITY_RIGHT_TO_LEFT
Unicode bidirectional constant R.
|
static byte |
DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
Unicode bidirectional constant AL.
|
static byte |
DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
Unicode bidirectional constant RLE.
|
static byte |
DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
Unicode bidirectional constant RLO.
|
static byte |
DIRECTIONALITY_SEGMENT_SEPARATOR
Unicode bidirectional constant S.
|
static byte |
DIRECTIONALITY_UNDEFINED
Unicode bidirectional constant.
|
static byte |
DIRECTIONALITY_WHITESPACE
Unicode bidirectional constant WS.
|
static byte |
ENCLOSING_MARK
Unicode category constant Me.
|
static byte |
END_PUNCTUATION
Unicode category constant Pe.
|
static byte |
FINAL_QUOTE_PUNCTUATION
Unicode category constant Pf.
|
static byte |
FORMAT
Unicode category constant Cf.
|
static byte |
INITIAL_QUOTE_PUNCTUATION
Unicode category constant Pi.
|
static byte |
LETTER_NUMBER
Unicode category constant Nl.
|
static byte |
LINE_SEPARATOR
Unicode category constant Zl.
|
static byte |
LOWERCASE_LETTER
Unicode category constant Ll.
|
static byte |
MATH_SYMBOL
Unicode category constant Sm.
|
static int |
MAX_CODE_POINT
The maximum code point value,
U+10FFFF . |
static char |
MAX_HIGH_SURROGATE
The maximum value of a high surrogate or leading surrogate unit in UTF-16
encoding, '\uDBFF'. |
static char |
MAX_LOW_SURROGATE
The maximum value of a low surrogate or trailing surrogate unit in UTF-16
encoding, '\uDFFF'. |
static int |
MAX_RADIX
The maximum radix used for conversions between characters and integers.
|
static char |
MAX_SURROGATE
The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'. |
static char |
MAX_VALUE
The maximum
Character value. |
static int |
MIN_CODE_POINT
The minimum code point value,
U+0000 . |
static char |
MIN_HIGH_SURROGATE
The minimum value of a high surrogate or leading surrogate unit in UTF-16
encoding, '\uD800'. |
static char |
MIN_LOW_SURROGATE
The minimum value of a low surrogate or trailing surrogate unit in UTF-16
encoding, '\uDC00'. |
static int |
MIN_RADIX
The minimum radix used for conversions between characters and integers.
|
static int |
MIN_SUPPLEMENTARY_CODE_POINT
The minimum value of a supplementary code point,
U+010000 . |
static char |
MIN_SURROGATE
The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'. |
static char |
MIN_VALUE
The minimum
Character value. |
static byte |
MODIFIER_LETTER
Unicode category constant Lm.
|
static byte |
MODIFIER_SYMBOL
Unicode category constant Sk.
|
static byte |
NON_SPACING_MARK
Unicode category constant Mn.
|
static byte |
OTHER_LETTER
Unicode category constant Lo.
|
static byte |
OTHER_NUMBER
Unicode category constant No.
|
static byte |
OTHER_PUNCTUATION
Unicode category constant Po.
|
static byte |
OTHER_SYMBOL
Unicode category constant So.
|
static byte |
PARAGRAPH_SEPARATOR
Unicode category constant Zp.
|
static byte |
PRIVATE_USE
Unicode category constant Co.
|
static int |
SIZE
The number of bits required to represent a Character value in
unsigned form. |
static byte |
SPACE_SEPARATOR
Unicode category constant Zs.
|
static byte |
START_PUNCTUATION
Unicode category constant Ps.
|
static byte |
SURROGATE
Unicode category constant Cs.
|
static byte |
TITLECASE_LETTER
Unicode category constant Lt.
|
static Class<Character> |
TYPE
The
Class object that represents the primitive type char . |
static byte |
UNASSIGNED
Unicode category constant Cn.
|
static byte |
UPPERCASE_LETTER
Unicode category constant Lu.
|
Constructor and Description |
---|
Character(char value)
Constructs a new
Character with the specified primitive char
value. |
Modifier and Type | Method and Description |
---|---|
static int |
charCount(int codePoint)
Calculates the number of
char values required to represent the
specified Unicode code point. |
char |
charValue()
Gets the primitive value of this character.
|
static int |
codePointAt(char[] seq,
int index)
Returns the code point at
index in the specified array of
character units. |
static int |
codePointAt(char[] seq,
int index,
int limit)
Returns the code point at
index in the specified array of
character units, where index has to be less than limit . |
static int |
codePointAt(CharSequence seq,
int index)
Returns the code point at
index in the specified sequence of
character units. |
static int |
codePointBefore(char[] seq,
int index)
Returns the code point that precedes
index in the specified
array of character units. |
static int |
codePointBefore(char[] seq,
int index,
int start)
Returns the code point that precedes the
index in the specified
array of character units and is not less than start . |
static int |
codePointBefore(CharSequence seq,
int index)
Returns the code point that precedes
index in the specified
sequence of character units. |
static int |
codePointCount(char[] seq,
int offset,
int count)
Counts the number of Unicode code points in the subsequence of the
specified char array, as delineated by
offset and count . |
static int |
codePointCount(CharSequence seq,
int beginIndex,
int endIndex)
Counts the number of Unicode code points in the subsequence of the
specified character sequence, as delineated by
beginIndex and
endIndex . |
static int |
compare(char lhs,
char rhs)
Compares two
char values. |
int |
compareTo(Character c)
Compares this object to the specified character object to determine their
relative order.
|
static int |
digit(char c,
int radix)
Convenience method to determine the value of the specified character
c in the supplied radix. |
static int |
digit(int codePoint,
int radix)
Convenience method to determine the value of the character
codePoint in the supplied radix. |
boolean |
equals(Object object)
Compares this object with the specified object and indicates if they are
equal.
|
static char |
forDigit(int digit,
int radix)
Returns the character which represents the specified digit in the
specified radix.
|
static byte |
getDirectionality(char c)
Gets the Unicode directionality of the specified character.
|
static byte |
getDirectionality(int codePoint)
Gets the Unicode directionality of the specified character.
|
static String |
getName(int codePoint)
Returns the name of the given code point, or null if the code point is unassigned.
|
static int |
getNumericValue(char c)
Returns the numeric value of the specified Unicode character.
|
static int |
getNumericValue(int codePoint)
Gets the numeric value of the specified Unicode code point.
|
static int |
getType(char c)
Gets the general Unicode category of the specified character.
|
static int |
getType(int codePoint)
Gets the general Unicode category of the specified code point.
|
int |
hashCode()
Returns an integer hash code for this object.
|
static char |
highSurrogate(int codePoint)
Returns the high surrogate for the given code point.
|
static boolean |
isBmpCodePoint(int codePoint)
Tests whether the given code point is in the Basic Multilingual Plane (BMP).
|
static boolean |
isDefined(char c)
Indicates whether the specified character is defined in the Unicode
specification.
|
static boolean |
isDefined(int codePoint)
Indicates whether the specified code point is defined in the Unicode
specification.
|
static boolean |
isDigit(char c)
Indicates whether the specified character is a digit.
|
static boolean |
isDigit(int codePoint)
Indicates whether the specified code point is a digit.
|
static boolean |
isHighSurrogate(char ch)
Indicates whether
ch is a high- (or leading-) surrogate code unit
that is used for representing supplementary characters in UTF-16
encoding. |
static boolean |
isIdentifierIgnorable(char c)
Indicates whether the specified character is ignorable in a Java or
Unicode identifier.
|
static boolean |
isIdentifierIgnorable(int codePoint)
Indicates whether the specified code point is ignorable in a Java or
Unicode identifier.
|
static boolean |
isISOControl(char c)
Indicates whether the specified character is an ISO control character.
|
static boolean |
isISOControl(int c)
Indicates whether the specified code point is an ISO control character.
|
static boolean |
isJavaIdentifierPart(char c)
Indicates whether the specified character is a valid part of a Java
identifier other than the first character.
|
static boolean |
isJavaIdentifierPart(int codePoint)
Indicates whether the specified code point is a valid part of a Java
identifier other than the first character.
|
static boolean |
isJavaIdentifierStart(char c)
Indicates whether the specified character is a valid first character for
a Java identifier.
|
static boolean |
isJavaIdentifierStart(int codePoint)
Indicates whether the specified code point is a valid first character for
a Java identifier.
|
static boolean |
isJavaLetter(char c)
Deprecated. Use isJavaIdentifierStart(char) instead.
|
static boolean |
isJavaLetterOrDigit(char c)
Deprecated. Use isJavaIdentifierPart(char) instead.
|
static boolean |
isLetter(char c)
Indicates whether the specified character is a letter.
|
static boolean |
isLetter(int codePoint)
Indicates whether the specified code point is a letter.
|
static boolean |
isLetterOrDigit(char c)
Indicates whether the specified character is a letter or a digit.
|
static boolean |
isLetterOrDigit(int codePoint)
Indicates whether the specified code point is a letter or a digit.
|
static boolean |
isLowerCase(char c)
Indicates whether the specified character is a lower case letter.
|
static boolean |
isLowerCase(int codePoint)
Indicates whether the specified code point is a lower case letter.
|
static boolean |
isLowSurrogate(char ch)
Indicates whether
ch is a low- (or trailing-) surrogate code unit
that is used for representing supplementary characters in UTF-16
encoding. |
static boolean |
isMirrored(char c)
Indicates whether the specified character is mirrored.
|
static boolean |
isMirrored(int codePoint)
Indicates whether the specified code point is mirrored.
|
static boolean |
isSpace(char c)
Deprecated. Use isWhitespace(char) instead.
|
static boolean |
isSpaceChar(char c)
Indicates whether the specified character is a Unicode space character.
|
static boolean |
isSpaceChar(int codePoint)
Indicates whether the specified code point is a Unicode space character.
|
static boolean |
isSupplementaryCodePoint(int codePoint)
Indicates whether
codePoint is within the supplementary code
point range. |
static boolean |
isSurrogate(char ch)
Tests whether the given character is a high or low surrogate.
|
static boolean |
isSurrogatePair(char high,
char low)
Indicates whether the specified character pair is a valid surrogate pair.
|
static boolean |
isTitleCase(char c)
Indicates whether the specified character is a titlecase character.
|
static boolean |
isTitleCase(int codePoint)
Indicates whether the specified code point is a titlecase character.
|
static boolean |
isUnicodeIdentifierPart(char c)
Indicates whether the specified character is valid as part of a Unicode
identifier other than the first character.
|
static boolean |
isUnicodeIdentifierPart(int codePoint)
Indicates whether the specified code point is valid as part of a Unicode
identifier other than the first character.
|
static boolean |
isUnicodeIdentifierStart(char c)
Indicates whether the specified character is a valid initial character
for a Unicode identifier.
|
static boolean |
isUnicodeIdentifierStart(int codePoint)
Indicates whether the specified code point is a valid initial character
for a Unicode identifier.
|
static boolean |
isUpperCase(char c)
Indicates whether the specified character is an upper case letter.
|
static boolean |
isUpperCase(int codePoint)
Indicates whether the specified code point is an upper case letter.
|
static boolean |
isValidCodePoint(int codePoint)
Indicates whether
codePoint is a valid Unicode code point. |
static boolean |
isWhitespace(char c)
Indicates whether the specified character is a whitespace character in
Java.
|
static boolean |
isWhitespace(int codePoint)
Indicates whether the specified code point is a whitespace character in
Java.
|
static char |
lowSurrogate(int codePoint)
Returns the low surrogate for the given code point.
|
static int |
offsetByCodePoints(char[] seq,
int start,
int count,
int index,
int codePointOffset)
Determines the index in a subsequence of the specified character array
that is offset
codePointOffset code points from index . |
static int |
offsetByCodePoints(CharSequence seq,
int index,
int codePointOffset)
Determines the index in the specified character sequence that is offset
codePointOffset code points from index . |
static char |
reverseBytes(char c)
Reverses the order of the first and second byte in the specified
character.
|
static char[] |
toChars(int codePoint)
Converts the specified Unicode code point into a UTF-16 encoded sequence
and returns it as a char array.
|
static int |
toChars(int codePoint,
char[] dst,
int dstIndex)
Converts the specified Unicode code point into a UTF-16 encoded sequence
and copies the value(s) into the char array
dst , starting at
index dstIndex . |
static int |
toCodePoint(char high,
char low)
Converts a surrogate pair into a Unicode code point.
|
static char |
toLowerCase(char c)
Returns the lower case equivalent for the specified character if the
character is an upper case letter.
|
static int |
toLowerCase(int codePoint)
Returns the lower case equivalent for the specified code point if it is
an upper case letter.
|
String |
toString()
Returns a string containing a concise, human-readable description of this
object.
|
static String |
toString(char value)
Converts the specified character to its string representation.
|
static char |
toTitleCase(char c)
Returns the title case equivalent for the specified character if it
exists.
|
static int |
toTitleCase(int codePoint)
Returns the title case equivalent for the specified code point if it
exists.
|
static char |
toUpperCase(char c)
Returns the upper case equivalent for the specified character if the
character is a lower case letter.
|
static int |
toUpperCase(int codePoint)
Returns the upper case equivalent for the specified code point if the
code point is a lower case letter.
|
static Character |
valueOf(char c)
Returns a
Character instance for the char value passed. |
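Among the methods summarized above, digit and forDigit convert between characters and numeric values in a given radix. The sketch below shows one way they might be combined. RadixSketch and parse are hypothetical names, and the helper ignores integer overflow for simplicity.

```java
final class RadixSketch {
    /** Parses an unsigned number in the given radix; returns -1 for invalid input. */
    static int parse(String text, int radix) {
        if (text.isEmpty()) {
            return -1;
        }
        int value = 0;
        for (int i = 0; i < text.length(); i++) {
            int d = Character.digit(text.charAt(i), radix);
            if (d < 0) {
                return -1; // not a digit in this radix (or the radix is out of range)
            }
            value = value * radix + d; // overflow is not handled in this sketch
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16));             // 255
        System.out.println(parse("1010", 2));            // 10
        System.out.println(parse("2", 2));               // -1
        System.out.println(Character.forDigit(11, 16));  // b
        // forDigit returns the char with value 0 when digit is not below radix.
        System.out.println((int) Character.forDigit(16, 16)); // 0
    }
}
```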
public static final char MIN_VALUE
The minimum Character value.
public static final char MAX_VALUE
The maximum Character value.
public static final int MIN_RADIX
public static final int MAX_RADIX
public static final Class<Character> TYPE
The Class object that represents the primitive type char.
public static final byte UNASSIGNED
public static final byte UPPERCASE_LETTER
public static final byte LOWERCASE_LETTER
public static final byte TITLECASE_LETTER
public static final byte MODIFIER_LETTER
public static final byte OTHER_LETTER
public static final byte NON_SPACING_MARK
public static final byte ENCLOSING_MARK
public static final byte COMBINING_SPACING_MARK
public static final byte DECIMAL_DIGIT_NUMBER
public static final byte LETTER_NUMBER
public static final byte OTHER_NUMBER
public static final byte SPACE_SEPARATOR
public static final byte LINE_SEPARATOR
public static final byte PARAGRAPH_SEPARATOR
public static final byte CONTROL
public static final byte FORMAT
public static final byte PRIVATE_USE
public static final byte SURROGATE
public static final byte DASH_PUNCTUATION
public static final byte START_PUNCTUATION
public static final byte END_PUNCTUATION
public static final byte CONNECTOR_PUNCTUATION
public static final byte OTHER_PUNCTUATION
public static final byte MATH_SYMBOL
public static final byte CURRENCY_SYMBOL
public static final byte MODIFIER_SYMBOL
public static final byte OTHER_SYMBOL
public static final byte INITIAL_QUOTE_PUNCTUATION
public static final byte FINAL_QUOTE_PUNCTUATION
public static final byte DIRECTIONALITY_UNDEFINED
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
public static final byte DIRECTIONALITY_ARABIC_NUMBER
public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_NONSPACING_MARK
public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL
public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR
public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR
public static final byte DIRECTIONALITY_WHITESPACE
public static final byte DIRECTIONALITY_OTHER_NEUTRALS
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
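public static final char MIN_HIGH_SURROGATE
The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uD800'.
public static final char MAX_HIGH_SURROGATE
The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uDBFF'.
public static final char MIN_LOW_SURROGATE
The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDC00'.
public static final char MAX_LOW_SURROGATE
The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDFFF'.
public static final char MIN_SURROGATE
The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'.
public static final char MAX_SURROGATE
The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'.
public static final int MIN_SUPPLEMENTARY_CODE_POINT
The minimum value of a supplementary code point, U+010000.
public static final int MIN_CODE_POINT
The minimum code point value, U+0000.
public static final int MAX_CODE_POINT
The maximum code point value, U+10FFFF.
public static final int SIZE
The number of bits required to represent a Character value in unsigned form.
public Character(char value)
Constructs a new Character with the specified primitive char value.
Parameters:
value - the primitive char value to store in the new instance.
public char charValue()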
public int compareTo(Character c)
Compares this object to the specified character object to determine their relative order.
Specified by:
compareTo in interface Comparable<Character>
Parameters:
c - the character object to compare this object to.
Returns:
0 if the value of this character and the value of c are equal; a positive value if the value of this character is greater than the value of c; a negative value if the value of this character is less than the value of c.
See also:
Comparable
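A brief sketch of the ordering contract follows. CompareSketch and sortedCopy are hypothetical names used only for this example. Because the documentation specifies only the sign of the result, the example normalizes it with Integer.signum before printing.

```java
import java.util.Arrays;

final class CompareSketch {
    /** Sorts a copy of the given characters using their natural order (compareTo). */
    static Character[] sortedCopy(Character... chars) {
        Character[] copy = Arrays.copyOf(chars, chars.length);
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        Character a = 'a';
        Character b = 'b';
        System.out.println(Integer.signum(a.compareTo(b))); // -1
        System.out.println(Integer.signum(b.compareTo(a))); // 1
        System.out.println(a.compareTo('a'));               // 0
        System.out.println(Arrays.toString(sortedCopy('c', 'a', 'b'))); // [a, b, c]
    }
}
```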
public static int compare(char lhs, char rhs)
Compares two char values.
public static Character valueOf(char c)
Returns a Character instance for the char value passed.
If it is not necessary to get a new Character instance, it is recommended to use this method instead of the constructor, since it maintains a cache of instances which may result in better performance.
Parameters:
c - the char value for which to get a Character instance.
Returns:
a Character instance for c.
public static boolean isValidCodePoint(int codePoint)
Indicates whether codePoint is a valid Unicode code point.
Parameters:
codePoint - the code point to test.
Returns:
true if codePoint is a valid Unicode code point; false otherwise.
public static boolean isSupplementaryCodePoint(int codePoint)
Indicates whether codePoint is within the supplementary code point range.
Parameters:
codePoint - the code point to test.
Returns:
true if codePoint is within the supplementary code point range; false otherwise.
public static boolean isHighSurrogate(char ch)
Indicates whether ch is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
Parameters:
ch - the character to test.
Returns:
true if ch is a high-surrogate code unit; false otherwise.
See also:
isLowSurrogate(char)
public static boolean isLowSurrogate(char ch)
Indicates whether ch is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
Parameters:
ch - the character to test.
Returns:
true if ch is a low-surrogate code unit; false otherwise.
See also:
isHighSurrogate(char)
public static boolean isSurrogate(char ch)
public static boolean isSurrogatePair(char high, char low)
Indicates whether the specified character pair is a valid surrogate pair.
Parameters:
high - the high surrogate unit to test.
low - the low surrogate unit to test.
Returns:
true if high is a high-surrogate code unit and low is a low-surrogate code unit; false otherwise.
See also:
isHighSurrogate(char), isLowSurrogate(char)
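One possible use of the surrogate tests above is to detect malformed UTF-16 input. The following sketch scans a sequence for the first surrogate that is not part of a valid pair. UnpairedSurrogateSketch and firstUnpairedSurrogate are hypothetical names introduced for illustration.

```java
final class UnpairedSurrogateSketch {
    /** Returns the index of the first unpaired surrogate in text, or -1 if there is none. */
    static int firstUnpairedSurrogate(CharSequence text) {
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (!Character.isSurrogate(c)) {
                i++; // ordinary BMP unit
            } else if (i + 1 < text.length()
                    && Character.isSurrogatePair(c, text.charAt(i + 1))) {
                i += 2; // well-formed pair: skip both units
            } else {
                return i; // lone high surrogate, or a low surrogate with no leader
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String wellFormed = "a" + new String(Character.toChars(0x1F600)) + "b";
        String truncated = wellFormed.substring(0, 2); // cuts the pair in half
        System.out.println(firstUnpairedSurrogate(wellFormed)); // -1
        System.out.println(firstUnpairedSurrogate(truncated));  // 1
    }
}
```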
public static int charCount(int codePoint)
Calculates the number of char values required to represent the specified Unicode code point. This method checks if the codePoint is greater than or equal to 0x10000, in which case 2 is returned, otherwise 1. To test if the code point is valid, use the isValidCodePoint(int) method.
Parameters:
codePoint - the code point for which to calculate the number of required chars.
Returns:
2 if codePoint >= 0x10000; 1 otherwise.
See also:
isValidCodePoint(int), isSupplementaryCodePoint(int)
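A common use of charCount is to step through a sequence one code point at a time. The sketch below pairs it with codePointAt(CharSequence, int). The class and method names are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointWalkSketch {
    /** Collects every code point of text, advancing by one or two chars per step. */
    static List<Integer> codePoints(CharSequence text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int codePoint = Character.codePointAt(text, i);
            result.add(codePoint);
            i += Character.charCount(codePoint); // 2 for supplementary, 1 otherwise
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a" + new String(Character.toChars(0x1F600)) + "b";
        System.out.println(text.length());           // 4 char units
        System.out.println(codePoints(text).size()); // 3 code points
    }
}
```

Advancing by a fixed step of 1 instead would visit the low surrogate of each pair as if it were a separate character.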
public static int toCodePoint(char high, char low)
Converts a surrogate pair into a Unicode code point. The isSurrogatePair(char, char) method should be used prior to this method to validate the pair.
Parameters:
high - the high surrogate unit.
low - the low surrogate unit.
See also:
isSurrogatePair(char, char)
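The validation advice above can be sketched as follows. ToCodePointSketch and both helpers are hypothetical. The second helper repeats the conversion with explicit arithmetic, under the assumption that the pair has already been validated, to show what the combination amounts to.

```java
final class ToCodePointSketch {
    /** Validates the pair first, then delegates to Character.toCodePoint. */
    static int combine(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return Character.toCodePoint(high, low);
    }

    /** The same conversion written out by hand, assuming a valid pair. */
    static int combineByHand(char high, char low) {
        return ((high - Character.MIN_HIGH_SURROGATE) << 10)
                + (low - Character.MIN_LOW_SURROGATE)
                + Character.MIN_SUPPLEMENTARY_CODE_POINT;
    }

    public static void main(String[] args) {
        char high = 0xD83D;
        char low = 0xDE00;
        System.out.printf("U+%X%n", combine(high, low));       // U+1F600
        System.out.printf("U+%X%n", combineByHand(high, low)); // U+1F600
    }
}
```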
public static int codePointAt(CharSequence seq, int index)
Returns the code point at index in the specified sequence of character units. If the unit at index is a high-surrogate unit, index + 1 is less than the length of the sequence and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
Parameters:
seq - the source sequence of char units.
index - the position in seq from which to retrieve the code point.
Returns:
the Unicode code point or char value at index in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index is negative or greater than or equal to the length of seq.
public static int codePointAt(char[] seq, int index)
Returns the code point at index in the specified array of character units. If the unit at index is a high-surrogate unit, index + 1 is less than the length of the array and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
Parameters:
seq - the source array of char units.
index - the position in seq from which to retrieve the code point.
Returns:
the Unicode code point or char value at index in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index is negative or greater than or equal to the length of seq.
public static int codePointAt(char[] seq, int index, int limit)
Returns the code point at index in the specified array of character units, where index has to be less than limit. If the unit at index is a high-surrogate unit, index + 1 is less than limit and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
Parameters:
seq - the source array of char units.
index - the position in seq from which to get the code point.
limit - the index after the last unit in seq that can be used.
Returns:
the Unicode code point or char value at index in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if index < 0, index >= limit, limit < 0 or if limit is greater than the length of seq.
public static int codePointBefore(CharSequence seq, int index)
Returns the code point that precedes index in the specified sequence of character units. If the unit at index - 1 is a low-surrogate unit, index - 2 is not negative and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
Parameters:
seq - the source sequence of char units.
index - the position in seq following the code point that should be returned.
Returns:
the Unicode code point or char value before index in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index is less than 1 or greater than the length of seq.
public static int codePointBefore(char[] seq, int index)
Returns the code point that precedes index in the specified array of character units. If the unit at index - 1 is a low-surrogate unit, index - 2 is not negative and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
Parameters:
seq - the source array of char units.
index - the position in seq following the code point that should be returned.
Returns:
the Unicode code point or char value before index in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index is less than 1 or greater than the length of seq.
public static int codePointBefore(char[] seq, int index, int start)
Returns the code point that precedes the index in the specified array of character units and is not less than start. If the unit at index - 1 is a low-surrogate unit, index - 2 is not less than start and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
Parameters:
seq - the source array of char units.
index - the position in seq following the code point that should be returned.
start - the index of the first element in seq.
Returns:
the Unicode code point or char value before index in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index <= start, start < 0, index is greater than the length of seq, or if start is equal to or greater than the length of seq.
public static int toChars(int codePoint, char[] dst, int dstIndex)
Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array dst, starting at index dstIndex.
Parameters:
codePoint - the Unicode code point to encode.
dst - the destination array to copy the encoded value into.
dstIndex - the index in dst from where to start copying.
Returns:
the number of char value units copied into dst.
Throws:
IllegalArgumentException - if codePoint is not a valid code point.
NullPointerException - if dst is null.
IndexOutOfBoundsException - if dstIndex is negative, greater than or equal to dst.length or equals dst.length - 1 when codePoint is a supplementary code point.
public static char[] toChars(int codePoint)
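Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
Parameters:
codePoint - the Unicode code point to encode.
Returns:
the UTF-16 encoded char sequence. If codePoint is a supplementary code point, then the returned array contains two characters, otherwise it contains just one character.
Throws:
IllegalArgumentException - if codePoint is not a valid code point.
public static int codePointCount(CharSequence seq, int beginIndex, int endIndex)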
Counts the number of Unicode code points in the subsequence of the specified character sequence, as delineated by beginIndex and endIndex. Any surrogate values with missing pair values will be counted as one code point.
Parameters:
seq - the CharSequence to look through.
beginIndex - the inclusive index to begin counting at.
endIndex - the exclusive index to stop counting at.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if beginIndex < 0, beginIndex > endIndex or if endIndex is greater than the length of seq.
public static int codePointCount(char[] seq, int offset, int count)
Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by offset and count. Any surrogate values with missing pair values will be counted as one code point.
Parameters:
seq - the char array to look through.
offset - the inclusive index to begin counting at.
count - the number of char values to look through in seq.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if offset < 0, count < 0 or if offset + count is greater than the length of seq.
public static int offsetByCodePoints(CharSequence seq, int index, int codePointOffset)
Determines the index in the specified character sequence that is offset codePointOffset code points from index.
Parameters:
seq - the character sequence to find the index in.
index - the start index in seq.
codePointOffset - the number of code points to look backwards or forwards; may be a negative or positive value.
Returns:
the index in seq that is codePointOffset code points away from index.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if index < 0, index is greater than the length of seq, or if there are not enough values in seq to skip codePointOffset code points forwards or backwards (if codePointOffset is negative) from index.
public static int offsetByCodePoints(char[] seq, int start, int count, int index, int codePointOffset)
Determines the index in a subsequence of the specified character array that is offset codePointOffset code points from index. The subsequence is delineated by start and count.
Parameters:
seq - the character array to find the index in.
start - the inclusive index that marks the beginning of the subsequence.
count - the number of char values to include within the subsequence.
index - the start index in the subsequence of the char array.
codePointOffset - the number of code points to look backwards or forwards; may be a negative or positive value.
Returns:
the index in seq that is codePointOffset code points away from index.
Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if start < 0, count < 0, index < start, index > start + count, start + count is greater than the length of seq, or if there are not enough values in seq to skip codePointOffset code points forward or backward (if codePointOffset is negative) from index.
public static int digit(char c, int radix)
Convenience method to determine the value of the specified character c in the supplied radix. The value of radix must be between MIN_RADIX and MAX_RADIX.
public static int digit(int codePoint, int radix)
Convenience method to determine the value of the character codePoint in the supplied radix. The value of radix must be between MIN_RADIX and MAX_RADIX.
public boolean equals(Object object)
object must be an instance of Character and have the same char value as this object.
equals in class Object
object - the object to compare this character with.
true if the specified object is equal to this Character; false otherwise.
Object.hashCode()
public static char forDigit(int digit, int radix)
radix must be between MIN_RADIX and MAX_RADIX inclusive; digit must be non-negative and smaller than radix. If any of these conditions does not hold, 0 is returned.
digit - the integer value.
radix - the radix.
digit in the radix.
public static String getName(int codePoint)
As a fallback mechanism this method returns strings consisting of the Unicode block name (with underscores replaced by spaces), a single space, and the uppercase hex value of the code point, using as few digits as necessary.
Examples:
Character.getName(0) returns "NULL".
Character.getName('e') returns "LATIN SMALL LETTER E".
Character.getName('٦') returns "ARABIC-INDIC DIGIT SIX".
Character.getName(0xe000) returns "PRIVATE USE AREA E000".
IllegalArgumentException - if codePoint is not a valid code point.
public static int getNumericValue(char c)
getNumericValue(int).
c - the character
c exists, -1 if there is no numeric value for c, -2 if the numeric value cannot be represented as an integer.
public static int getNumericValue(int codePoint)
There are two points of divergence between this method and the Unicode specification. This method treats the letters a-z (in both upper and lower cases, and their full-width variants) as numbers from 10 to 35. The Unicode specification also supports the idea of code points with non-integer numeric values; this method does not (except to the extent of returning -2 for such code points).
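The divergences described above can be seen in a small sketch. It assumes the behaviour documented here (letters and their full-width variants map to 10 through 35, -1 means no numeric value, -2 means the value is not a non-negative integer). The class and method names are hypothetical, and results for less common code points depend on the Unicode data shipped with the runtime.

```java
final class NumericValueSketch {
    /** Turns the result of Character.getNumericValue into a readable label. */
    static String describe(int codePoint) {
        int value = Character.getNumericValue(codePoint);
        if (value == -1) {
            return "no numeric value";
        }
        if (value == -2) {
            return "not representable as an int";
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        // '7', Latin letters, FULLWIDTH LATIN CAPITAL LETTER A,
        // ROMAN NUMERAL TWELVE, VULGAR FRACTION ONE HALF, and '$'.
        int[] samples = {'7', 'a', 'Z', 0xFF21, 0x216B, 0x00BD, '$'};
        for (int codePoint : samples) {
            System.out.printf("U+%04X -> %s%n", codePoint, describe(codePoint));
        }
    }
}
```

Callers that need a strict decimal digit rather than this letter-aware mapping may prefer digit(int, int) with an explicit radix.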
codePoint - the code point
codePoint exists, -1 if there is no numeric value for codePoint, -2 if the numeric value cannot be represented with an integer.
public static int getType(char c)
c - the character to get the category of.
c.
public static int getType(int codePoint)
codePoint - the Unicode code point to get the category of.
codePoint.
public static byte getDirectionality(char c)
c - the character to get the directionality of.
c.
public static byte getDirectionality(int codePoint)
codePoint - the Unicode code point to get the directionality of.
codePoint.
public static boolean isMirrored(char c)
c - the character to check.
true if c is mirrored; false otherwise.
public static boolean isMirrored(int codePoint)
codePoint - the code point to check.
true if codePoint is mirrored; false otherwise.
public int hashCode()
Object
By contract, any two objects for which Object.equals(java.lang.Object) returns true must return the same hash code value. This means that subclasses of Object usually override both methods or neither method.
Note that hash values must not change over time unless information used in equals comparisons also changes.
See Writing a correct hashCode method if you intend to implement your own hashCode method.
hashCode in class Object
Object.equals(java.lang.Object)
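As a minimal sketch of how equals and hashCode relate for Character, the following checks that two boxed values with the same char compare equal and report the same hash code. The class and method names are hypothetical. The sketch deliberately does not rely on any particular hash value, only on the two values agreeing.

```java
final class CharacterEqualitySketch {
    /** True when both boxed chars are equal and also share a hash code. */
    static boolean equalWithSameHash(char a, char b) {
        Character x = Character.valueOf(a);
        Character y = Character.valueOf(b);
        return x.equals(y) && x.hashCode() == y.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(equalWithSameHash('é', 'é'));
        System.out.println(equalWithSameHash('a', 'b'));
        // An object of another type is never equal, even if it "looks" the same.
        Object text = "a";
        System.out.println(Character.valueOf('a').equals(text));
    }
}
```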
public static char highSurrogate(int codePoint)
public static char lowSurrogate(int codePoint)
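The two methods above are listed without a description, so the following is only a sketch of how they are commonly used, assuming they return the leading and trailing UTF-16 units of a supplementary code point as in the standard Java library. The class name and the encode helper are hypothetical, and the helper assumes its argument is a valid code point.

```java
final class SurrogateSketch {
    /** Encodes one valid code point as UTF-16: one char in the BMP, two otherwise. */
    static char[] encode(int codePoint) {
        if (Character.isBmpCodePoint(codePoint)) {
            return new char[] {(char) codePoint};
        }
        return new char[] {
            Character.highSurrogate(codePoint),
            Character.lowSurrogate(codePoint)
        };
    }

    public static void main(String[] args) {
        char[] pair = encode(0x1F600);
        System.out.printf("high=U+%04X low=U+%04X%n", (int) pair[0], (int) pair[1]);
        // A well-formed pair counts as a single code point.
        System.out.println(Character.codePointCount(pair, 0, pair.length));
        // A surrogate with a missing partner still counts as one code point.
        char[] lone = {pair[0]};
        System.out.println(Character.codePointCount(lone, 0, lone.length));
    }
}
```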
public static boolean isBmpCodePoint(int codePoint)
char.
public static boolean isDefined(char c)
c - the character to check.
true if the general Unicode category of the character is not UNASSIGNED; false otherwise.
public static boolean isDefined(int codePoint)
codePoint - the code point to check.
true if the general Unicode category of the code point is not UNASSIGNED; false otherwise.
public static boolean isDigit(char c)
c - the character to check.
true if c is a digit; false otherwise.
public static boolean isDigit(int codePoint)
codePoint - the code point to check.
true if codePoint is a digit; false otherwise.
public static boolean isIdentifierIgnorable(char c)
c - the character to check.
true if c is ignorable; false otherwise.
public static boolean isIdentifierIgnorable(int codePoint)
codePoint - the code point to check.
true if codePoint is ignorable; false otherwise.
public static boolean isISOControl(char c)
c - the character to check.
true if c is an ISO control character; false otherwise.
public static boolean isISOControl(int c)
c - the code point to check.
true if c is an ISO control character; false otherwise.
public static boolean isJavaIdentifierPart(char c)
c - the character to check.
true if c is valid as part of a Java identifier; false otherwise.
public static boolean isJavaIdentifierPart(int codePoint)
codePoint - the code point to check.
true if codePoint is valid as part of a Java identifier; false otherwise.
public static boolean isJavaIdentifierStart(char c)
c - the character to check.
true if c is a valid first character of a Java identifier; false otherwise.
public static boolean isJavaIdentifierStart(int codePoint)
codePoint - the code point to check.
true if codePoint is a valid start of a Java identifier; false otherwise.
@Deprecated public static boolean isJavaLetter(char c)
Use isJavaIdentifierStart(char) instead.
c - the character to check.
true if c is a Java letter; false otherwise.
@Deprecated public static boolean isJavaLetterOrDigit(char c)
Use isJavaIdentifierPart(char) instead.
c - the character to check.
true if c is a Java letter or digit; false otherwise.
public static boolean isLetter(char c)
c - the character to check.
true if c is a letter; false otherwise.
public static boolean isLetter(int codePoint)
codePoint - the code point to check.
true if codePoint is a letter; false otherwise.
public static boolean isLetterOrDigit(char c)
c - the character to check.
true if c is a letter or a digit; false otherwise.
public static boolean isLetterOrDigit(int codePoint)
codePoint - the code point to check.
true if codePoint is a letter or a digit; false otherwise.
public static boolean isLowerCase(char c)
c - the character to check.
true if c is a lower case letter; false otherwise.
public static boolean isLowerCase(int codePoint)
codePoint - the code point to check.
true if codePoint is a lower case letter; false otherwise.
@Deprecated public static boolean isSpace(char c)
Use isWhitespace(char) instead.
c - the character to check.
true if c is a Java space; false otherwise.
public static boolean isSpaceChar(char c)
c - the character to check.
true if c is a Unicode space character; false otherwise.
public static boolean isSpaceChar(int codePoint)
codePoint - the code point to check.
true if codePoint is a Unicode space character; false otherwise.
public static boolean isTitleCase(char c)
c - the character to check.
true if c is a titlecase character; false otherwise.
public static boolean isTitleCase(int codePoint)
codePoint - the code point to check.
true if codePoint is a titlecase character; false otherwise.
public static boolean isUnicodeIdentifierPart(char c)
c - the character to check.
true if c is valid as part of a Unicode identifier; false otherwise.
public static boolean isUnicodeIdentifierPart(int codePoint)
codePoint - the code point to check.
true if codePoint is valid as part of a Unicode identifier; false otherwise.
public static boolean isUnicodeIdentifierStart(char c)
c - the character to check.
true if c is a valid first character for a Unicode identifier; false otherwise.
public static boolean isUnicodeIdentifierStart(int codePoint)
codePoint - the code point to check.
true if codePoint is a valid first character for a Unicode identifier; false otherwise.
public static boolean isUpperCase(char c)
c - the character to check.
true if c is an upper case letter; false otherwise.
public static boolean isUpperCase(int codePoint)
codePoint - the code point to check.
true if codePoint is an upper case letter; false otherwise.
public static boolean isWhitespace(char c)
c - the character to check.
true if the supplied c is a whitespace character in Java; false otherwise.
public static boolean isWhitespace(int codePoint)
codePoint - the code point to check.
true if the supplied codePoint is a whitespace character in Java; false otherwise.
public static char reverseBytes(char c)
c - the character to reverse.
public static char toLowerCase(char c)
c - the character
if c is an upper case character then its lower case counterpart, otherwise just c.
public static int toLowerCase(int codePoint)
codePoint - the code point to convert.
if codePoint is an upper case character then its lower case counterpart, otherwise just codePoint.
public String toString()
Object
getClass().getName() + '@' + Integer.toHexString(hashCode())
See Writing a useful toString method if you intend to implement your own toString method.
public static String toString(char value)
value - the character to convert.
public static char toTitleCase(char c)
c - the character to convert.
c if it exists, otherwise c.
public static int toTitleCase(int codePoint)
codePoint - the code point to convert.
codePoint if it exists, otherwise codePoint.
public static char toUpperCase(char c)
c - the character to convert.
if c is a lower case character then its upper case counterpart, otherwise just c.
public static int toUpperCase(int codePoint)
codePoint - the code point to convert.
if codePoint
is a lower case character then its upper
case counterpart, otherwise just codePoint
.
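The case conversion methods above can be compared side by side with a short sketch. The class and helper names are hypothetical. The sample U+01C6 (LATIN SMALL LETTER DZ WITH CARON) is chosen because its titlecase form differs from its uppercase form, which shows why toTitleCase exists as a separate method.

```java
final class CaseMappingSketch {
    /** Returns the {lower, upper, title} mappings of a single code point. */
    static int[] mappings(int codePoint) {
        return new int[] {
            Character.toLowerCase(codePoint),
            Character.toUpperCase(codePoint),
            Character.toTitleCase(codePoint)
        };
    }

    public static void main(String[] args) {
        int[] samples = {'a', 'Q', '5', 0x01C6};
        for (int codePoint : samples) {
            int[] m = mappings(codePoint);
            System.out.printf("U+%04X lower=U+%04X upper=U+%04X title=U+%04X%n",
                    codePoint, m[0], m[1], m[2]);
        }
    }
}
```

Code points without a case mapping, such as the digit '5', come back unchanged from all three methods.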