Character Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
The Character class wraps a value of the primitive type char in an object.
[Android.Runtime.Register("java/lang/Character", DoNotGenerateAcw=true)]
public sealed class Character : Java.Lang.Object, IConvertible, IDisposable, Java.Interop.IJavaPeerable, Java.IO.ISerializable, Java.Lang.IComparable
[<Android.Runtime.Register("java/lang/Character", DoNotGenerateAcw=true)>]
type Character = class
inherit Object
interface IConvertible
interface ISerializable
interface IJavaObject
interface IDisposable
interface IJavaPeerable
interface IComparable
Remarks
The Character class wraps a value of the primitive type char in an object. An object of class Character contains a single field whose type is char.
In addition, this class provides several methods for determining a character's category (lowercase letter, digit, etc.) and for converting characters from uppercase to lowercase and vice versa.
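As a rough illustration of the category and case-conversion methods, the sketch below calls the underlying java.lang.Character API directly. The binding exposes the same operations under PascalCase names such as GetType and ToUpperCase. The class name CharacterCategoryDemo and the helper describe are hypothetical, introduced only for this example.

```java
// Sketch only: exercises the underlying java.lang.Character API.
class CharacterCategoryDemo {
    // Hypothetical helper (not part of the API): names a few general categories.
    static String describe(char ch) {
        switch (Character.getType(ch)) {
            case Character.LOWERCASE_LETTER:
                return "lowercase letter";
            case Character.UPPERCASE_LETTER:
                return "uppercase letter";
            case Character.DECIMAL_DIGIT_NUMBER:
                return "decimal digit";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        for (char ch : new char[] {'a', 'Q', '7', '?'}) {
            System.out.println(ch + " -> " + describe(ch));
        }
        // Case conversion works in both directions.
        System.out.println(Character.toUpperCase('a')); // A
        System.out.println(Character.toLowerCase('Q')); // q
    }
}
```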
Unicode Conformance
The fields and methods of class Character are defined in terms of character information from the Unicode Standard, specifically the UnicodeData file that is part of the Unicode Character Database maintained by the Unicode Consortium. This file specifies properties, including name and general category, for every assigned Unicode code point or character range. The file and its description are available from the Unicode Consortium at http://www.unicode.org.
Character information is based on the Unicode Standard, version 13.0.
The Java platform has supported different versions of the Unicode Standard over time. Upgrades to newer versions of the Unicode Standard occurred in the following Java releases, each indicating the new version:
Java release | Unicode version |
Java SE 15 | Unicode 13.0 |
Java SE 13 | Unicode 12.1 |
Java SE 12 | Unicode 11.0 |
Java SE 11 | Unicode 10.0 |
Java SE 9 | Unicode 8.0 |
Java SE 8 | Unicode 6.2 |
Java SE 7 | Unicode 6.0 |
Java SE 5.0 | Unicode 4.0 |
Java SE 1.4 | Unicode 3.0 |
JDK 1.1 | Unicode 2.0 |
JDK 1.0.2 | Unicode 1.1.5 |
Variations from these base Unicode versions, such as recognized appendixes, are documented elsewhere.
Unicode Character Representations
The char data type (and therefore the value that a Character object encapsulates) is based on the original Unicode specification, which defined characters as fixed-width 16-bit entities. The Unicode Standard has since been changed to allow for characters whose representation requires more than 16 bits. The range of legal code points is now U+0000 to U+10FFFF, known as Unicode scalar value. (Refer to the definition of the U+n notation in the Unicode Standard: http://www.unicode.org/reports/tr27/#notation.)
The set of characters from U+0000 to U+FFFF is sometimes referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater than U+FFFF are called supplementary characters. The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values, the first from the high-surrogates range (\uD800-\uDBFF), the second from the low-surrogates range (\uDC00-\uDFFF).
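The surrogate-pair representation described above can be sketched with the underlying java.lang.Character API as follows. The class name SurrogatePairDemo and the helper encode are hypothetical. The code point 0x2F81A is the supplementary CJK ideograph that these remarks use as an example elsewhere.

```java
// Sketch only: shows how one supplementary code point maps to two UTF-16 code units.
class SurrogatePairDemo {
    // Hypothetical helper: encodes a code point as UTF-16 code units
    // (one char for a BMP code point, two for a supplementary one).
    static char[] encode(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        int codePoint = 0x2F81A; // a supplementary code point (above U+FFFF)
        char[] units = encode(codePoint);

        System.out.println(units.length); // 2
        System.out.println(Character.isHighSurrogate(units[0])); // true
        System.out.println(Character.isLowSurrogate(units[1])); // true
        System.out.printf("%04X %04X%n", (int) units[0], (int) units[1]); // D87E DC1A

        // Combining the pair recovers the original code point.
        System.out.println(Character.toCodePoint(units[0], units[1]) == codePoint); // true
    }
}
```

In the binding, the corresponding members are ToChars(Int32), IsHighSurrogate(Char), IsLowSurrogate(Char), and ToCodePoint(Char, Char).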
A char value, therefore, represents Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. The lower (least significant) 21 bits of int are used to represent Unicode code points and the upper (most significant) 11 bits must be zero. Unless otherwise specified, the behavior with respect to supplementary characters and surrogate char values is as follows:
- The methods that only accept a char value cannot support supplementary characters. They treat char values from the surrogate ranges as undefined characters. For example, Character.isLetter('\uD840') returns false, even though this specific value, if followed by any low-surrogate value in a string, would represent a letter.
- The methods that accept an int value support all Unicode characters, including supplementary characters. For example, Character.isLetter(0x2F81A) returns true because the code point value represents a letter (a CJK ideograph).
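The difference between the char and int overloads can be checked with a short sketch against the underlying java.lang.Character API. The class and constant names are hypothetical. The lone surrogate is built with a cast rather than a Unicode escape, purely to keep the source free of escape sequences.

```java
// Sketch only: contrasts the char and int overloads of isLetter.
class CharVersusIntDemo {
    // A lone high surrogate (U+D840); on its own it is not a complete character.
    static final char LONE_HIGH_SURROGATE = (char) 0xD840;
    // A supplementary code point for a CJK ideograph.
    static final int CJK_SUPPLEMENTARY = 0x2F81A;

    public static void main(String[] args) {
        // char overload: surrogate values are treated as undefined characters.
        System.out.println(Character.isLetter(LONE_HIGH_SURROGATE)); // false
        // int overload: the full code point is recognized as a letter.
        System.out.println(Character.isLetter(CJK_SUPPLEMENTARY)); // true

        // Only the lower 21 bits may be used: U+10FFFF is the largest valid code point.
        System.out.println(Character.isValidCodePoint(0x10FFFF)); // true
        System.out.println(Character.isValidCodePoint(0x110000)); // false
    }
}
```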
In the Java SE API documentation, Unicode code point is used for character values in the range between U+0000 and U+10FFFF, and Unicode code unit is used for 16-bit char values that are code units of the UTF-16 encoding. For more information on Unicode terminology, refer to the Unicode Glossary.
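The distinction between code points and code units shows up when measuring or walking a string. The sketch below uses the underlying java.lang.Character API. The class name CodePointIterationDemo and the helper countCodePoints are hypothetical.

```java
// Sketch only: counts and iterates code points rather than UTF-16 code units.
class CodePointIterationDemo {
    // Hypothetical helper: number of Unicode code points in the whole string.
    static int countCodePoints(String text) {
        return Character.codePointCount(text, 0, text.length());
    }

    public static void main(String[] args) {
        // 'a', one supplementary character (two code units), then 'b'.
        String text = "a" + new String(Character.toChars(0x2F81A)) + "b";

        System.out.println(text.length()); // 4 code units
        System.out.println(countCodePoints(text)); // 3 code points

        // Advance by charCount so a surrogate pair is visited once, not twice.
        for (int i = 0; i < text.length(); ) {
            int codePoint = Character.codePointAt(text, i);
            System.out.printf("U+%04X%n", codePoint);
            i += Character.charCount(codePoint);
        }
    }
}
```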
Added in 1.0.
Java documentation for java.lang.Character.
Portions of this page are modifications based on work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 2.5 Attribution License.
Constructors
Character(Char) |
Constructs a newly allocated Character object that represents the specified char value. |
Fields
Bytes |
The number of bytes used to represent a char value in unsigned binary form. |
CombiningSpacingMark |
General category "Mc" in the Unicode specification. |
ConnectorPunctuation |
General category "Pc" in the Unicode specification. |
Control |
General category "Cc" in the Unicode specification. |
CurrencySymbol |
General category "Sc" in the Unicode specification. |
DashPunctuation |
General category "Pd" in the Unicode specification. |
DecimalDigitNumber |
General category "Nd" in the Unicode specification. |
DirectionalityArabicNumber |
Weak bidirectional character type "AN" in the Unicode specification. |
DirectionalityBoundaryNeutral |
Weak bidirectional character type "BN" in the Unicode specification. |
DirectionalityCommonNumberSeparator |
Weak bidirectional character type "CS" in the Unicode specification. |
DirectionalityEuropeanNumber |
Weak bidirectional character type "EN" in the Unicode specification. |
DirectionalityEuropeanNumberSeparator |
Weak bidirectional character type "ES" in the Unicode specification. |
DirectionalityEuropeanNumberTerminator |
Weak bidirectional character type "ET" in the Unicode specification. |
DirectionalityFirstStrongIsolate |
Weak bidirectional character type "FSI" in the Unicode specification. |
DirectionalityLeftToRight |
Strong bidirectional character type "L" in the Unicode specification. |
DirectionalityLeftToRightEmbedding |
Strong bidirectional character type "LRE" in the Unicode specification. |
DirectionalityLeftToRightIsolate |
Weak bidirectional character type "LRI" in the Unicode specification. |
DirectionalityLeftToRightOverride |
Strong bidirectional character type "LRO" in the Unicode specification. |
DirectionalityNonspacingMark |
Weak bidirectional character type "NSM" in the Unicode specification. |
DirectionalityOtherNeutrals |
Neutral bidirectional character type "ON" in the Unicode specification. |
DirectionalityParagraphSeparator |
Neutral bidirectional character type "B" in the Unicode specification. |
DirectionalityPopDirectionalFormat |
Weak bidirectional character type "PDF" in the Unicode specification. |
DirectionalityPopDirectionalIsolate |
Weak bidirectional character type "PDI" in the Unicode specification. |
DirectionalityRightToLeft |
Strong bidirectional character type "R" in the Unicode specification. |
DirectionalityRightToLeftArabic |
Strong bidirectional character type "AL" in the Unicode specification. |
DirectionalityRightToLeftEmbedding |
Strong bidirectional character type "RLE" in the Unicode specification. |
DirectionalityRightToLeftIsolate |
Weak bidirectional character type "RLI" in the Unicode specification. |
DirectionalityRightToLeftOverride |
Strong bidirectional character type "RLO" in the Unicode specification. |
DirectionalitySegmentSeparator |
Neutral bidirectional character type "S" in the Unicode specification. |
DirectionalityUndefined |
Undefined bidirectional character type. |
DirectionalityWhitespace |
Neutral bidirectional character type "WS" in the Unicode specification. |
EnclosingMark |
General category "Me" in the Unicode specification. |
EndPunctuation |
General category "Pe" in the Unicode specification. |
FinalQuotePunctuation |
General category "Pf" in the Unicode specification. |
Format |
General category "Cf" in the Unicode specification. |
InitialQuotePunctuation |
General category "Pi" in the Unicode specification. |
LetterNumber |
General category "Nl" in the Unicode specification. |
LineSeparator |
General category "Zl" in the Unicode specification. |
LowercaseLetter |
General category "Ll" in the Unicode specification. |
MathSymbol |
General category "Sm" in the Unicode specification. |
MaxCodePoint |
The maximum value of a Unicode code point, constant U+10FFFF. |
MaxHighSurrogate |
The maximum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant '\uDBFF'. |
MaxLowSurrogate |
The maximum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant '\uDFFF'. |
MaxRadix |
The maximum radix available for conversion to and from strings. |
MaxSurrogate |
The maximum value of a Unicode surrogate code unit in the UTF-16 encoding, constant '\uDFFF'. |
MaxValue |
The constant value of this field is the largest value of type char, '\uFFFF'. |
MinCodePoint |
The minimum value of a Unicode code point, constant U+0000. |
MinHighSurrogate |
The minimum value of a Unicode high-surrogate code unit in the UTF-16 encoding, constant '\uD800'. |
MinLowSurrogate |
The minimum value of a Unicode low-surrogate code unit in the UTF-16 encoding, constant '\uDC00'. |
MinRadix |
The minimum radix available for conversion to and from strings. |
MinSupplementaryCodePoint |
The minimum value of a Unicode supplementary code point, constant U+10000. |
MinSurrogate |
The minimum value of a Unicode surrogate code unit in the UTF-16 encoding, constant '\uD800'. |
MinValue |
The constant value of this field is the smallest value of type char, '\u0000'. |
ModifierLetter |
General category "Lm" in the Unicode specification. |
ModifierSymbol |
General category "Sk" in the Unicode specification. |
NonSpacingMark |
General category "Mn" in the Unicode specification. |
OtherLetter |
General category "Lo" in the Unicode specification. |
OtherNumber |
General category "No" in the Unicode specification. |
OtherPunctuation |
General category "Po" in the Unicode specification. |
OtherSymbol |
General category "So" in the Unicode specification. |
ParagraphSeparator |
General category "Zp" in the Unicode specification. |
PrivateUse |
General category "Co" in the Unicode specification. |
Size |
The number of bits used to represent a char value in unsigned binary form, constant 16. |
SpaceSeparator |
General category "Zs" in the Unicode specification. |
StartPunctuation |
General category "Ps" in the Unicode specification. |
Surrogate |
General category "Cs" in the Unicode specification. |
TitlecaseLetter |
General category "Lt" in the Unicode specification. |
Unassigned |
General category "Cn" in the Unicode specification. |
UppercaseLetter |
General category "Lu" in the Unicode specification. |
Properties
Class |
Returns the runtime class of this Object. |
Handle |
The handle to the underlying Android instance. (Inherited from Object) |
JniIdentityHashCode | (Inherited from Object) |
JniPeerMembers | |
PeerReference | (Inherited from Object) |
ThresholdClass |
This API supports the Mono for Android infrastructure and is not intended to be used directly from your code. (Inherited from Object) |
ThresholdType |
This API supports the Mono for Android infrastructure and is not intended to be used directly from your code. (Inherited from Object) |
Type |
The Class instance representing the primitive type char. |
Methods
CharCount(Int32) |
Determines the number of char values needed to represent the specified character (Unicode code point). |
CharValue() |
Returns the value of this Character object. |
Clone() |
Creates and returns a copy of this object. (Inherited from Object) |
CodePointAt(Char[], Int32, Int32) |
Returns the code point at the given index of the char array, where only array elements with index less than limit can be used. |
CodePointAt(Char[], Int32) |
Returns the code point at the given index of the char array. |
CodePointAt(ICharSequence, Int32) |
Returns the code point at the given index of the CharSequence. |
CodePointAt(String, Int32) |
Returns the code point at the given index of the CharSequence. |
CodePointBefore(Char[], Int32, Int32) |
Returns the code point preceding the given index of the char array, where only array elements with index greater than or equal to start can be used. |
CodePointBefore(Char[], Int32) |
Returns the code point preceding the given index of the char array. |
CodePointBefore(ICharSequence, Int32) |
Returns the code point preceding the given index of the CharSequence. |
CodePointBefore(String, Int32) |
Returns the code point preceding the given index of the CharSequence. |
CodePointCount(Char[], Int32, Int32) |
Returns the number of Unicode code points in a subarray of the char array argument. |
CodePointCount(ICharSequence, Int32, Int32) |
Returns the number of Unicode code points in the text range of the specified char sequence. |
CodePointCount(String, Int32, Int32) |
Returns the number of Unicode code points in the text range of the specified char sequence. |
CodePointOf(String) |
Returns the code point value of the Unicode character specified by the given Unicode character name. |
Compare(Char, Char) |
Compares two char values numerically. |
CompareTo(Character) |
Compares two Character objects numerically. |
Digit(Char, Int32) |
Returns the numeric value of the character ch in the specified radix. |
Digit(Int32, Int32) |
Returns the numeric value of the specified character (Unicode code point) in the specified radix. |
Dispose() | (Inherited from Object) |
Dispose(Boolean) | (Inherited from Object) |
Equals(Object) |
Indicates whether some other object is "equal to" this one. (Inherited from Object) |
ForDigit(Int32, Int32) |
Determines the character representation for a specific digit in the specified radix. |
GetDirectionality(Char) |
Returns the Unicode directionality property for the given character. |
GetDirectionality(Int32) |
Returns the Unicode directionality property for the given character (Unicode code point). |
GetHashCode() |
Returns a hash code value for the object. (Inherited from Object) |
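GetName(Int32) |
Returns the Unicode name of the specified character codePoint, or null if the code point is unassigned. |
GetNumericValue(Char) |
Returns the int value that the specified Unicode character represents. |
GetNumericValue(Int32) |
Returns the int value that the specified character (Unicode code point) represents. |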
GetType(Char) |
Returns a value indicating a character's general category. |
GetType(Int32) |
Returns a value indicating a character's general category. |
HashCode(Char) |
Returns a hash code for a char value; compatible with Character.hashCode(). |
HighSurrogate(Int32) |
Returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. |
IsAlphabetic(Int32) |
Determines if the specified character (Unicode code point) is alphabetic. |
IsBmpCodePoint(Int32) |
Determines whether the specified character (Unicode code point) is in the Basic Multilingual Plane (BMP). |
IsDefined(Char) |
Determines if a character is defined in Unicode. |
IsDefined(Int32) |
Determines if a character (Unicode code point) is defined in Unicode. |
IsDigit(Char) |
Determines if the specified character is a digit. |
IsDigit(Int32) |
Determines if the specified character (Unicode code point) is a digit. |
IsHighSurrogate(Char) |
Determines if the given char value is a Unicode high-surrogate code unit (also known as leading-surrogate code unit). |
IsIdentifierIgnorable(Char) |
Determines if the specified character should be regarded as an ignorable character in a Java identifier or a Unicode identifier. |
IsIdentifierIgnorable(Int32) |
Determines if the specified character (Unicode code point) should be regarded as an ignorable character in a Java identifier or a Unicode identifier. |
IsIdeographic(Int32) |
Determines if the specified character (Unicode code point) is a CJKV (Chinese, Japanese, Korean and Vietnamese) ideograph, as defined by the Unicode Standard. |
IsISOControl(Char) |
Determines if the specified character is an ISO control character. |
IsISOControl(Int32) |
Determines if the referenced character (Unicode code point) is an ISO control character. |
IsJavaIdentifierPart(Char) |
Determines if the specified character may be part of a Java identifier as other than the first character. |
IsJavaIdentifierPart(Int32) |
Determines if the character (Unicode code point) may be part of a Java identifier as other than the first character. |
IsJavaIdentifierStart(Char) |
Determines if the specified character is permissible as the first character in a Java identifier. |
IsJavaIdentifierStart(Int32) |
Determines if the character (Unicode code point) is permissible as the first character in a Java identifier. |
IsJavaLetter(Char) |
Obsolete.
Determines if the specified character is permissible as the first character in a Java identifier. |
IsJavaLetterOrDigit(Char) |
Obsolete.
Determines if the specified character may be part of a Java identifier as other than the first character. |
IsLetter(Char) |
Determines if the specified character is a letter. |
IsLetter(Int32) |
Determines if the specified character (Unicode code point) is a letter. |
IsLetterOrDigit(Char) |
Determines if the specified character is a letter or digit. |
IsLetterOrDigit(Int32) |
Determines if the specified character (Unicode code point) is a letter or digit. |
IsLowerCase(Char) |
Determines if the specified character is a lowercase character. |
IsLowerCase(Int32) |
Determines if the specified character (Unicode code point) is a lowercase character. |
IsLowSurrogate(Char) |
Determines if the given char value is a Unicode low-surrogate code unit (also known as trailing-surrogate code unit). |
IsMirrored(Char) |
Determines whether the character is mirrored according to the Unicode specification. |
IsMirrored(Int32) |
Determines whether the specified character (Unicode code point) is mirrored according to the Unicode specification. |
IsSpace(Char) |
Obsolete.
Determines if the specified character is ISO-LATIN-1 white space. |
IsSpaceChar(Char) |
Determines if the specified character is a Unicode space character. |
IsSpaceChar(Int32) |
Determines if the specified character (Unicode code point) is a Unicode space character. |
IsSupplementaryCodePoint(Int32) |
Determines whether the specified character (Unicode code point) is in the supplementary character range. |
IsSurrogate(Char) |
Determines if the given char value is a Unicode surrogate code unit. |
IsSurrogatePair(Char, Char) |
Determines whether the specified pair of char values is a valid Unicode surrogate pair. |
IsTitleCase(Char) |
Determines if the specified character is a titlecase character. |
IsTitleCase(Int32) |
Determines if the specified character (Unicode code point) is a titlecase character. |
IsUnicodeIdentifierPart(Char) |
Determines if the specified character may be part of a Unicode identifier as other than the first character. |
IsUnicodeIdentifierPart(Int32) |
Determines if the specified character (Unicode code point) may be part of a Unicode identifier as other than the first character. |
IsUnicodeIdentifierStart(Char) |
Determines if the specified character is permissible as the first character in a Unicode identifier. |
IsUnicodeIdentifierStart(Int32) |
Determines if the specified character (Unicode code point) is permissible as the first character in a Unicode identifier. |
IsUpperCase(Char) |
Determines if the specified character is an uppercase character. |
IsUpperCase(Int32) |
Determines if the specified character (Unicode code point) is an uppercase character. |
IsValidCodePoint(Int32) |
Determines whether the specified code point is a valid Unicode code point value. |
IsWhitespace(Char) |
Determines if the specified character is white space according to Java. |
IsWhitespace(Int32) |
Determines if the specified character (Unicode code point) is white space according to Java. |
JavaFinalize() |
Called by the garbage collector on an object when garbage collection determines that there are no more references to the object. (Inherited from Object) |
LowSurrogate(Int32) |
Returns the trailing surrogate (a low surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding. |
Notify() |
Wakes up a single thread that is waiting on this object's monitor. (Inherited from Object) |
NotifyAll() |
Wakes up all threads that are waiting on this object's monitor. (Inherited from Object) |
OffsetByCodePoints(Char[], Int32, Int32, Int32, Int32) |
Returns the index within the given char subarray that is offset from the given index by codePointOffset code points. |
OffsetByCodePoints(ICharSequence, Int32, Int32) |
Returns the index within the given char sequence that is offset from the given index by codePointOffset code points. |
OffsetByCodePoints(String, Int32, Int32) |
Returns the index within the given char sequence that is offset from the given index by codePointOffset code points. |
ReverseBytes(Char) |
Returns the value obtained by reversing the order of the bytes in the specified char value. |
SetHandle(IntPtr, JniHandleOwnership) |
Sets the Handle property. (Inherited from Object) |
ToArray<T>() | (Inherited from Object) |
ToChars(Int32, Char[], Int32) |
Converts the specified character (Unicode code point) to its UTF-16 representation. |
ToChars(Int32) |
Converts the specified character (Unicode code point) to its UTF-16 representation stored in a char array. |
ToCodePoint(Char, Char) |
Converts the specified surrogate pair to its supplementary code point value. |
ToLowerCase(Char) |
Converts the character argument to lowercase using case mapping information from the UnicodeData file. |
ToLowerCase(Int32) |
Converts the character (Unicode code point) argument to lowercase using case mapping information from the UnicodeData file. |
ToString() |
Returns a string representation of the object. (Inherited from Object) |
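ToString(Char) |
Returns a String object representing the specified char. |
ToString(Int32) |
Returns a String object representing the specified character (Unicode code point). |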
ToTitleCase(Char) |
Converts the character argument to titlecase using case mapping information from the UnicodeData file. |
ToTitleCase(Int32) |
Converts the character (Unicode code point) argument to titlecase using case mapping information from the UnicodeData file. |
ToUpperCase(Char) |
Converts the character argument to uppercase using case mapping information from the UnicodeData file. |
ToUpperCase(Int32) |
Converts the character (Unicode code point) argument to uppercase using case mapping information from the UnicodeData file. |
UnregisterFromRuntime() | (Inherited from Object) |
ValueOf(Char) |
Returns a Character instance representing the specified char value. |
Wait() |
Causes the current thread to wait until it is awakened, typically by being notified or interrupted. (Inherited from Object) |
Wait(Int64, Int32) |
Causes the current thread to wait until it is awakened, typically by being notified or interrupted, or until a certain amount of real time has elapsed. (Inherited from Object) |
Wait(Int64) |
Causes the current thread to wait until it is awakened, typically by being notified or interrupted, or until a certain amount of real time has elapsed. (Inherited from Object) |
Operators
Explicit(Character to Char) |
Explicit Interface Implementations
Extension Methods
JavaCast<TResult>(IJavaObject) |
Performs an Android runtime-checked type conversion. |
JavaCast<TResult>(IJavaObject) | |
GetJniTypeName(IJavaPeerable) |
Gets the JNI name of the type of the instance |
JavaAs<TResult>(IJavaPeerable) |
Try to coerce |
TryJavaCast<TResult>(IJavaPeerable, TResult) |
Try to coerce |