UCharacter Class

Definition

<strong>[icu enhancement]</strong> ICU's replacement for java.lang.Character.

[Android.Runtime.Register("android/icu/lang/UCharacter", ApiSince=24, DoNotGenerateAcw=true)]
public sealed class UCharacter : Java.Lang.Object
[<Android.Runtime.Register("android/icu/lang/UCharacter", ApiSince=24, DoNotGenerateAcw=true)>]
type UCharacter = class
    inherit Object
Inheritance
UCharacter
Attributes

Remarks

<strong>[icu enhancement]</strong> ICU's replacement for java.lang.Character. Methods, fields, and other functionality specific to ICU are labeled '<strong>[icu]</strong>'.

The UCharacter class provides extensions to the java.lang.Character class. These extensions provide support for more Unicode properties. Each ICU release supports the latest version of Unicode available at that time.

For some time before Java 5 added support for supplementary Unicode code points, the ICU UCharacter class and many other ICU classes already supported them. As a result, some UCharacter methods and constants were widened slightly differently from how the Character class methods and constants were widened later. In particular, Character#MAX_VALUE is still a char with the value U+FFFF, while UCharacter#MAX_VALUE is an int with the value U+10FFFF.

Code points are represented in these APIs using ints. While it would be more convenient in Java to have a separate primitive datatype for them, ints suffice in the meantime.
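The following is a minimal Java sketch of why code points are carried as ints. It uses only java.lang.Character and java.lang.String from the standard JDK as a stand-in, since the ICU class is not part of the JDK; the class name CodePointDemo and the helper countCodePoints are hypothetical.

```java
final class CodePointDemo {
    // Counts code points by stepping over UTF-16 code units.
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);      // an int, so it can exceed U+FFFF
            i += Character.charCount(cp);   // 1 unit for BMP, 2 for supplementary
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00";         // 'a' followed by U+1F600
        System.out.println("UTF-16 units: " + s.length());
        System.out.println("code points:  " + countCodePoints(s));
        // Character.MAX_VALUE is a char; the full range needs an int.
        System.out.println((int) Character.MAX_VALUE == 0xFFFF);
        System.out.println(Character.MAX_CODE_POINT == 0x10FFFF);
    }
}
```

A char can hold only one UTF-16 code unit, so any API that must return a supplementary code point has to use a wider type.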

Aside from the additions for UTF-16 support and the updated Unicode properties, the main differences between UCharacter and Character are: <ul> <li> UCharacter is not designed to be a char wrapper and does not have APIs that involve managing that single char.<br> These include: <ul> <li> char charValue(), <li> int compareTo(java.lang.Character, java.lang.Character), etc. </ul> <li> UCharacter does not include Character APIs that are deprecated, nor does it include the Java-specific character information, such as boolean isJavaIdentifierPart(char ch). <li> Character maps characters 'A' - 'Z' and 'a' - 'z' to the numeric values '10' - '35'. UCharacter also does this in digit and getNumericValue, to adhere to the Java semantics of these methods. The new methods unicodeDigit and getUnicodeNumericValue do not treat the above code points as having numeric values. This is a semantic change from ICU4J 1.3.1. </ul>
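The letter-to-number mapping described in the last item can be observed with java.lang.Character alone. This is a small sketch of the Java semantics that digit and getNumericValue are said to follow; it does not call the ICU class, and the names LetterValueDemo, numericValue, and digitInRadix are hypothetical.

```java
final class LetterValueDemo {
    // Java semantics: Latin letters carry the numeric values 10 to 35.
    static int numericValue(char c) {
        return Character.getNumericValue(c);
    }

    // Returns the digit value in the given radix, or -1 if c is not a digit there.
    static int digitInRadix(char c, int radix) {
        return Character.digit(c, radix);
    }

    public static void main(String[] args) {
        System.out.println(numericValue('A'));      // 10
        System.out.println(numericValue('z'));      // 35
        System.out.println(digitInRadix('f', 16));  // 15
        System.out.println(digitInRadix('A', 10));  // -1, 'A' is not a decimal digit
    }
}
```

Code that needs only the Unicode Character Database numeric values should, per the text above, prefer the ICU-specific methods instead.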

In addition to Java compatibility functions, which calculate derived properties, this API provides low-level access to the Unicode Character Database.

Unicode assigns each code point (not just assigned characters) values for many properties. Most of them are simple boolean flags, or constants from a small enumerated list. For some properties, values are strings or other relatively more complex types.

For more information see "About the Unicode Character Database" (http://www.unicode.org/ucd/) and the ICU User Guide chapter on Properties (https://unicode-org.github.io/icu/userguide/strings/properties).

There are also functions that provide easy migration from C/POSIX functions like isblank(). Their use is generally discouraged because the C/POSIX standards do not define their semantics beyond the ASCII range, which means that different implementations exhibit very different behavior. Instead, Unicode properties should be used directly.

There are also only a few, broad C/POSIX character classes, and they tend to be used for conflicting purposes. For example, the "isalpha()" class is sometimes used to determine word boundaries, while a more sophisticated approach would at least distinguish initial letters from continuation characters (the latter including combining marks). (In ICU, BreakIterator is the most sophisticated API for word boundaries.) Another example: There is no "istitle()" class for titlecase characters.

ICU 3.4 and later provides API access for all twelve C/POSIX character classes. ICU implements them according to the Standard Recommendations in Annex C: Compatibility Properties of UTS #18 Unicode Regular Expressions (http://www.unicode.org/reports/tr18/#Compatibility_Properties).

API access for C/POSIX character classes is as follows:

            - alpha:     isUAlphabetic(c) or hasBinaryProperty(c, UProperty.ALPHABETIC)
            - lower:     isULowercase(c) or hasBinaryProperty(c, UProperty.LOWERCASE)
            - upper:     isUUppercase(c) or hasBinaryProperty(c, UProperty.UPPERCASE)
            - punct:     ((1<<getType(c)) & ((1<<DASH_PUNCTUATION)|(1<<START_PUNCTUATION)|
                          (1<<END_PUNCTUATION)|(1<<CONNECTOR_PUNCTUATION)|(1<<OTHER_PUNCTUATION)|
                          (1<<INITIAL_PUNCTUATION)|(1<<FINAL_PUNCTUATION)))!=0
            - digit:     isDigit(c) or getType(c)==DECIMAL_DIGIT_NUMBER
            - xdigit:    hasBinaryProperty(c, UProperty.POSIX_XDIGIT)
            - alnum:     hasBinaryProperty(c, UProperty.POSIX_ALNUM)
            - space:     isUWhiteSpace(c) or hasBinaryProperty(c, UProperty.WHITE_SPACE)
            - blank:     hasBinaryProperty(c, UProperty.POSIX_BLANK)
            - cntrl:     getType(c)==CONTROL
            - graph:     hasBinaryProperty(c, UProperty.POSIX_GRAPH)
            - print:     hasBinaryProperty(c, UProperty.POSIX_PRINT)
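The punct entry above tests a code point's general category against a bit mask. A rough Java sketch of the same technique follows, using java.lang.Character.getType as a stand-in for the ICU getType. Note the assumption that the JDK constants INITIAL_QUOTE_PUNCTUATION and FINAL_QUOTE_PUNCTUATION correspond to the INITIAL_PUNCTUATION and FINAL_PUNCTUATION names used above; the names PosixPunctDemo, PUNCT_MASK, and isPunct are hypothetical.

```java
final class PosixPunctDemo {
    // One bit per punctuation general category (Pd, Ps, Pe, Pc, Po, Pi, Pf).
    static final int PUNCT_MASK =
          (1 << Character.DASH_PUNCTUATION)
        | (1 << Character.START_PUNCTUATION)
        | (1 << Character.END_PUNCTUATION)
        | (1 << Character.CONNECTOR_PUNCTUATION)
        | (1 << Character.OTHER_PUNCTUATION)
        | (1 << Character.INITIAL_QUOTE_PUNCTUATION)
        | (1 << Character.FINAL_QUOTE_PUNCTUATION);

    // True if the code point's general category is one of the masked categories.
    static boolean isPunct(int codePoint) {
        return ((1 << Character.getType(codePoint)) & PUNCT_MASK) != 0;
    }

    public static void main(String[] args) {
        for (int cp : new int[] {'!', '(', '_', '+', 'a'}) {
            System.out.println((char) cp + " -> " + isPunct(cp));
        }
    }
}
```

Because every category value is below 32, a single int mask covers all of them and the test is one shift and one AND. Symbols such as '+' fall outside the mask because their category is a symbol category, not a punctuation category.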

The C/POSIX character classes are also available in UnicodeSet patterns, using patterns like [:graph:] or \p{graph}.

<strong>[icu] Note:</strong> There are several ICU (and Java) whitespace functions. Comparison:<ul> <li> isUWhiteSpace=UCHAR_WHITE_SPACE: Unicode White_Space property; most of general categories "Z" (separators) + most whitespace ISO controls (including no-break spaces, but excluding IS1..IS4) <li> isWhitespace: Java isWhitespace; Z + whitespace ISO controls but excluding no-break spaces <li> isSpaceChar: just Z (including no-break spaces)</ul>
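Two of the three functions compared above have direct java.lang.Character counterparts, so their difference can be sketched without the ICU class. isUWhiteSpace is ICU-specific and is not shown here. The names WhitespaceDemo and classify are hypothetical.

```java
final class WhitespaceDemo {
    // Returns a two-letter code: 'W' if isWhitespace, 'S' if isSpaceChar, '-' otherwise.
    static String classify(int codePoint) {
        return (Character.isWhitespace(codePoint) ? "W" : "-")
             + (Character.isSpaceChar(codePoint) ? "S" : "-");
    }

    public static void main(String[] args) {
        System.out.println(classify(' '));     // ordinary space
        System.out.println(classify(0x00A0));  // no-break space
        System.out.println(classify('\t'));    // tab, an ISO control
        System.out.println(classify(0x001C));  // IS4 (file separator)
    }
}
```

The no-break space is the case most likely to surprise: it counts as a space character (category Z) but not as Java whitespace, while the tab behaves the other way around.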

This class is not subclassable.

Java documentation for android.icu.lang.UCharacter.

Portions of this page are modifications based on work created and shared by the Android Open Source Project and used according to terms described in the Creative Commons 2.5 Attribution License.

Fields

FoldCaseDefault
Obsolete.

<strong>[icu]</strong> Option value for case folding: use default mappings defined in CaseFolding.

FoldCaseExcludeSpecialI
Obsolete.

<strong>[icu]</strong> Option value for case folding: Use the modified set of mappings provided in CaseFolding.

MaxCodePoint

Constant U+10FFFF, same as Character#MAX_CODE_POINT.

MaxHighSurrogate

Constant U+DBFF, same as Character#MAX_HIGH_SURROGATE.

MaxLowSurrogate

Constant U+DFFF, same as Character#MAX_LOW_SURROGATE.

MaxRadix

Compatibility constant for Java Character's MAX_RADIX.

MaxSurrogate

Constant U+DFFF, same as Character#MAX_SURROGATE.

MaxValue

The highest Unicode code point value (scalar value), constant U+10FFFF (uses 21 bits).

MinCodePoint

Constant U+0000, same as Character#MIN_CODE_POINT.

MinHighSurrogate

Constant U+D800, same as Character#MIN_HIGH_SURROGATE.

MinLowSurrogate

Constant U+DC00, same as Character#MIN_LOW_SURROGATE.

MinRadix

Compatibility constant for Java Character's MIN_RADIX.

MinSupplementaryCodePoint

Constant U+10000, same as Character#MIN_SUPPLEMENTARY_CODE_POINT.

MinSurrogate

Constant U+D800, same as Character#MIN_SURROGATE.

MinValue

The lowest Unicode code point value, constant 0.

NoNumericValue

Special value that is returned by getUnicodeNumericValue(int) when no numeric value is defined for a code point.

ReplacementChar

Unicode value used when translating into Unicode encoding form and there is no existing character.

SupplementaryMinValue

The minimum value for Supplementary code points, constant U+10000.

TitlecaseNoBreakAdjustment
Obsolete.

Do not adjust the titlecasing indexes from BreakIterator::next() indexes; titlecase exactly the characters at breaks from the iterator.

TitlecaseNoLowercase
Obsolete.

Do not lowercase non-initial parts of words when titlecasing.
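The surrogate and supplementary constants in this section fit together arithmetically. As a sketch, the following Java code combines a surrogate pair into a code point by hand, using the java.lang.Character constants that the fields above are documented as matching; the names SurrogateDemo and combine are hypothetical, and the code does not call the ICU class.

```java
final class SurrogateDemo {
    // Combines a high and a low surrogate into one supplementary code point.
    // Each surrogate contributes 10 bits; the result is offset by U+10000.
    static int combine(char high, char low) {
        return ((high - Character.MIN_HIGH_SURROGATE) << 10)
             + (low - Character.MIN_LOW_SURROGATE)
             + Character.MIN_SUPPLEMENTARY_CODE_POINT;
    }

    public static void main(String[] args) {
        int cp = combine('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(cp));
        // The JDK helper performs the same computation.
        System.out.println(cp == Character.toCodePoint('\uD83D', '\uDE00'));
    }
}
```

Feeding the two maximum surrogates into the same formula yields U+10FFFF, which is why that value is the top of the code point range.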

Properties

Class

Returns the runtime class of this Object.

(Inherited from Object)
ExtendedNameIterator

<strong>[icu]</strong>

Handle

The handle to the underlying Android instance.

(Inherited from Object)
JniIdentityHashCode (Inherited from Object)
JniPeerMembers
NameIterator

<strong>[icu]</strong>

PeerReference (Inherited from Object)
ThresholdClass

This API supports the Mono for Android infrastructure and is not intended to be used directly from your code.

(Inherited from Object)
ThresholdType

This API supports the Mono for Android infrastructure and is not intended to be used directly from your code.

(Inherited from Object)
TypeIterator

<strong>[icu]</strong>

UnicodeVersion

<strong>[icu]</strong> Returns the version of Unicode data used.

Methods

CharCount(Int32)

Same as Character#charCount.

Clone()

Creates and returns a copy of this object.

(Inherited from Object)
CodePointAt(Char[], Int32, Int32)

Same as Character#codePointAt(char[], int, int).

CodePointAt(Char[], Int32)

Same as Character#codePointAt(char[], int).

CodePointAt(ICharSequence, Int32)

Same as Character#codePointAt(CharSequence, int).

CodePointAt(String, Int32)

Same as Character#codePointAt(CharSequence, int).

CodePointBefore(Char[], Int32, Int32)

Same as Character#codePointBefore(char[], int, int).

CodePointBefore(Char[], Int32)

Same as Character#codePointBefore(char[], int).

CodePointBefore(ICharSequence, Int32)

Same as Character#codePointBefore(CharSequence, int).

CodePointBefore(String, Int32)

Same as Character#codePointBefore(CharSequence, int).

CodePointCount(Char[], Int32, Int32)

Equivalent to the Character#codePointCount(char[], int, int) method, for convenience.

CodePointCount(ICharSequence, Int32, Int32)

Equivalent to the Character#codePointCount(CharSequence, int, int) method, for convenience.

CodePointCount(String, Int32, Int32)

Equivalent to the Character#codePointCount(CharSequence, int, int) method, for convenience.

Digit(Int32, Int32)

Returns the numeric value of a decimal digit code point.

Digit(Int32)

Returns the numeric value of a decimal digit code point.

Dispose() (Inherited from Object)
Dispose(Boolean) (Inherited from Object)
Equals(Object)

Indicates whether some other object is "equal to" this one.

(Inherited from Object)
FoldCase(Int32, Boolean)

<strong>[icu]</strong> The given character is mapped to its case folding equivalent according to UnicodeData.

FoldCase(Int32, FoldCaseOptions)

<strong>[icu]</strong> The given character is mapped to its case folding equivalent according to UnicodeData.

FoldCase(String, Boolean)

<strong>[icu]</strong> The given string is mapped to its case folding equivalent according to UnicodeData.

FoldCase(String, FoldCaseOptions)

<strong>[icu]</strong> The given string is mapped to its case folding equivalent according to UnicodeData.

ForDigit(Int32, Int32)

Provides the java.lang.Character forDigit API, for convenience.

GetAge(Int32)

<strong>[icu]</strong> Returns the "age" of the code point.

GetBidiPairedBracket(Int32)

<strong>[icu]</strong> Maps the specified character to its paired bracket character.

GetCharFromExtendedName(String)

<strong>[icu]</strong>

GetCharFromName(String)

<strong>[icu]</strong>

GetCharFromNameAlias(String)

<strong>[icu]</strong>

GetCodePoint(Char, Char)

<strong>[icu]</strong> Returns a code point corresponding to the two surrogate code units.

GetCodePoint(Char)

<strong>[icu]</strong> Returns the code point corresponding to the BMP code point.

GetCodePoint(Int32, Int32)

<strong>[icu]</strong> Returns a code point corresponding to the two surrogate code units.

GetCombiningClass(Int32)

<strong>[icu]</strong> Returns the combining class of the argument code point.

GetDirection(Int32)

<strong>[icu]</strong> Returns the Bidirection property of a code point.

GetDirectionality(Int32)

Equivalent to the Character#getDirectionality(char) method, for convenience.

GetExtendedName(Int32)

<strong>[icu]</strong> Returns a name for a valid codepoint.

GetHanNumericValue(Int32)

<strong>[icu]</strong> Returns the numeric value of a Han character.

GetHashCode()

Returns a hash code value for the object.

(Inherited from Object)
GetIntPropertyMaxValue(Int32)

<strong>[icu]</strong> Returns the maximum value for an integer/binary Unicode property.

GetIntPropertyMinValue(Int32)

<strong>[icu]</strong> Returns the minimum value for an integer/binary Unicode property type.

GetIntPropertyValue(Int32, Int32)

<strong>[icu]</strong> Returns the property value for a Unicode property type of a code point.

GetMirror(Int32)

<strong>[icu]</strong> Maps the specified code point to a "mirror-image" code point.

GetName(Int32)

<strong>[icu]</strong> Returns the most current Unicode name of the argument code point, or null if the code point is unassigned, is outside the range UCharacter.MIN_VALUE to UCharacter.MAX_VALUE, or does not have a name.

GetName(String, String)

<strong>[icu]</strong> Returns the names for each of the characters in a string.

GetNameAlias(Int32)

<strong>[icu]</strong> Returns the corrected name from NameAliases.

GetNumericValue(Int32)

Returns the numeric value of the code point as a nonnegative integer.

GetPropertyEnum(ICharSequence)

<strong>[icu]</strong> Return the UProperty selector for a given property name, as specified in the Unicode database file PropertyAliases.

GetPropertyEnum(String)

<strong>[icu]</strong> Return the UProperty selector for a given property name, as specified in the Unicode database file PropertyAliases.

GetPropertyName(Int32, Int32)

<strong>[icu]</strong> Return the Unicode name for a given property, as given in the Unicode database file PropertyAliases.

GetPropertyValueEnum(Int32, ICharSequence)

<strong>[icu]</strong> Return the property value integer for a given value name, as specified in the Unicode database file PropertyValueAliases.

GetPropertyValueEnum(Int32, String)

<strong>[icu]</strong> Return the property value integer for a given value name, as specified in the Unicode database file PropertyValueAliases.

GetPropertyValueName(Int32, Int32, Int32)

<strong>[icu]</strong> Return the Unicode name for a given property value, as given in the Unicode database file PropertyValueAliases.

GetType(Int32)

Returns a value indicating a code point's Unicode category.

GetUnicodeNumericValue(Int32)

<strong>[icu]</strong> Returns the numeric value for a Unicode code point as defined in the Unicode Character Database.

HasBinaryProperty(ICharSequence, Int32)

<strong>[icu]</strong> Returns true if the property is true for the string.

HasBinaryProperty(Int32, Int32)

<strong>[icu]</strong> Check a binary Unicode property for a code point.

HasBinaryProperty(String, Int32)

<strong>[icu]</strong> Returns true if the property is true for the string.

IsBaseForm(Int32)

<strong>[icu]</strong> Determines whether the specified code point is of base form.

IsBMP(Int32)

<strong>[icu]</strong> Determines if the code point is in the Basic Multilingual Plane (BMP).

IsDefined(Int32)

Determines if a code point has a defined meaning in the up-to-date Unicode standard.

IsDigit(Int32)

Determines if a code point is a Java digit.

IsHighSurrogate(Char)

Same as Character#isHighSurrogate.

IsHighSurrogate(Int32)

Same as Character#isHighSurrogate, except that the ICU version accepts int for code points.

IsIdentifierIgnorable(Int32)

Determines if the specified code point should be regarded as an ignorable character in a Java identifier.

IsISOControl(Int32)

Determines if the specified code point is an ISO control character.

IsJavaIdentifierPart(Int32)

Compatibility override of Java method, delegates to java.lang.Character.isJavaIdentifierPart.

IsJavaIdentifierStart(Int32)

Compatibility override of Java method, delegates to java.lang.Character.isJavaIdentifierStart.

IsLegal(Int32)

<strong>[icu]</strong> Determines whether a code point is legal. A code point is illegal if, for example, it is out of bounds (less than 0 or greater than UCharacter.MAX_VALUE).

IsLegal(String)

<strong>[icu]</strong> A string is legal iff all its code points are legal.

IsLetter(Int32)

Determines if the specified code point is a letter.

IsLetterOrDigit(Int32)

Determines if the specified code point is a letter or digit.

IsLowerCase(Int32)

Determines if the specified code point is a lowercase character.

IsLowSurrogate(Char)

Same as Character#isLowSurrogate.

IsLowSurrogate(Int32)

Same as Character#isLowSurrogate, except that the ICU version accepts int for code points.

IsMirrored(Int32)

Determines whether the code point has the "mirrored" property.

IsPrintable(Int32)

<strong>[icu]</strong> Determines whether the specified code point is a printable character according to the Unicode standard.

IsSpaceChar(Int32)

Determines if the specified code point is a Unicode-specified space character.

IsSupplementary(Int32)

<strong>[icu]</strong> Determines if the code point is a supplementary character.

IsSupplementaryCodePoint(Int32)

Same as Character#isSupplementaryCodePoint.

IsSurrogatePair(Char, Char)

Same as Character#isSurrogatePair.

IsSurrogatePair(Int32, Int32)

Same as Character#isSurrogatePair, except that the ICU version accepts int for code points.

IsTitleCase(Int32)

Determines if the specified code point is a titlecase character.

IsUAlphabetic(Int32)

<strong>[icu]</strong>

IsULowercase(Int32)

<strong>[icu]</strong>

IsUnicodeIdentifierPart(Int32)

Determines if the specified character is permissible as a non-initial character of an identifier according to UAX #31 Unicode Identifier and Pattern Syntax.

IsUnicodeIdentifierStart(Int32)

Determines if the specified character is permissible as the first character in an identifier according to UAX #31 Unicode Identifier and Pattern Syntax.

IsUpperCase(Int32)

Determines if the specified code point is an uppercase character.

IsUUppercase(Int32)

<strong>[icu]</strong>

IsUWhiteSpace(Int32)

<strong>[icu]</strong>

IsValidCodePoint(Int32)

Equivalent to Character#isValidCodePoint.

IsWhitespace(Int32)

Determines if the specified code point is a white space character.

JavaFinalize()

Called by the garbage collector on an object when garbage collection determines that there are no more references to the object.

(Inherited from Object)
Notify()

Wakes up a single thread that is waiting on this object's monitor.

(Inherited from Object)
NotifyAll()

Wakes up all threads that are waiting on this object's monitor.

(Inherited from Object)
OffsetByCodePoints(Char[], Int32, Int32, Int32, Int32)

Equivalent to the Character#offsetByCodePoints(char[], int, int, int, int) method, for convenience.

OffsetByCodePoints(ICharSequence, Int32, Int32)

Equivalent to the Character#offsetByCodePoints(CharSequence, int, int) method, for convenience.

OffsetByCodePoints(String, Int32, Int32)

Equivalent to the Character#offsetByCodePoints(CharSequence, int, int) method, for convenience.

SetHandle(IntPtr, JniHandleOwnership)

Sets the Handle property.

(Inherited from Object)
ToArray<T>() (Inherited from Object)
ToChars(Int32, Char[], Int32)

Same as Character#toChars(int, char[], int).

ToChars(Int32)

Same as Character#toChars(int).

ToCodePoint(Char, Char)

Same as Character#toCodePoint.

ToCodePoint(Int32, Int32)

Same as Character#toCodePoint, except that the ICU version accepts int for code points.

ToLowerCase(Int32)

The given code point is mapped to its lowercase equivalent; if the code point has no lowercase equivalent, the code point itself is returned.

ToLowerCase(Locale, String)

Returns the lowercase version of the argument string.

ToLowerCase(String)

Returns the lowercase version of the argument string.

ToLowerCase(ULocale, String)

Returns the lowercase version of the argument string.

ToString()

Returns a string representation of the object.

(Inherited from Object)
ToString(Int32)

Converts argument code point and returns a String object representing the code point's value in UTF-16 format.

ToTitleCase(Int32)

Converts the code point argument to titlecase.

ToTitleCase(Locale, String, BreakIterator, TitlecaseOptions)

<strong>[icu]</strong>

ToTitleCase(Locale, String, BreakIterator)

Returns the titlecase version of the argument string.

ToTitleCase(String, BreakIterator)

Returns the titlecase version of the argument string.

ToTitleCase(ULocale, String, BreakIterator, TitlecaseOptions)

Returns the titlecase version of the argument string.

ToTitleCase(ULocale, String, BreakIterator)

Returns the titlecase version of the argument string.

ToUpperCase(Int32)

Converts the character argument to uppercase.

ToUpperCase(Locale, String)

Returns the uppercase version of the argument string.

ToUpperCase(String)

Returns the uppercase version of the argument string.

ToUpperCase(ULocale, String)

Returns the uppercase version of the argument string.

UnregisterFromRuntime() (Inherited from Object)
Wait()

Causes the current thread to wait until it is awakened, typically by being <em>notified</em> or <em>interrupted</em>.

(Inherited from Object)
Wait(Int64, Int32)

Causes the current thread to wait until it is awakened, typically by being <em>notified</em> or <em>interrupted</em>, or until a certain amount of real time has elapsed.

(Inherited from Object)
Wait(Int64)

Causes the current thread to wait until it is awakened, typically by being <em>notified</em> or <em>interrupted</em>, or until a certain amount of real time has elapsed.

(Inherited from Object)
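Several methods in this section (CodePointBefore, CodePointCount, OffsetByCodePoints, ToChars) are documented as equivalent to their java.lang.Character counterparts. The sketch below exercises those JDK counterparts on a string that contains a supplementary character; it is an illustration of the shared semantics, not a call into the ICU class, and the names CodePointNavDemo, lastCodePoint, and skip are hypothetical.

```java
final class CodePointNavDemo {
    // Returns the last code point of a non-empty string,
    // reading a whole surrogate pair if the string ends with one.
    static int lastCodePoint(String s) {
        return Character.codePointBefore(s, s.length());
    }

    // Returns the remainder of s after skipping n code points (not n chars).
    static String skip(String s, int n) {
        return s.substring(Character.offsetByCodePoints(s, 0, n));
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x1F600)) + "bc";
        System.out.println(s.length());                                   // UTF-16 units
        System.out.println(Character.codePointCount(s, 0, s.length()));   // code points
        System.out.println(skip(s, 1));
        System.out.println(Integer.toHexString(lastCodePoint("a\uD83D\uDE00")));
    }
}
```

Indexing by code point keeps a surrogate pair from being split, which plain char indexing does not guarantee.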

Explicit Interface Implementations

IJavaPeerable.Disposed() (Inherited from Object)
IJavaPeerable.DisposeUnlessReferenced() (Inherited from Object)
IJavaPeerable.Finalized() (Inherited from Object)
IJavaPeerable.JniManagedPeerState (Inherited from Object)
IJavaPeerable.SetJniIdentityHashCode(Int32) (Inherited from Object)
IJavaPeerable.SetJniManagedPeerState(JniManagedPeerStates) (Inherited from Object)
IJavaPeerable.SetPeerReference(JniObjectReference) (Inherited from Object)

Extension Methods

JavaCast<TResult>(IJavaObject)

Performs an Android runtime-checked type conversion.

GetJniTypeName(IJavaPeerable)

Gets the JNI name of the type of the instance self.

JavaAs<TResult>(IJavaPeerable)

Try to coerce self to type TResult, checking that the coercion is valid on the Java side.

TryJavaCast<TResult>(IJavaPeerable, TResult)

Try to coerce self to type TResult, checking that the coercion is valid on the Java side.

Applies to