A selection of utility methods for accessing Unicode information about Characters and performing locale-aware transformations on Strings.

Platform: Java
By: Tom Bentley
Packages
ceylon.unicode
Dependencies
java.base 7
Values
shared arabicNumber arabicNumber
shared boundaryNeutral boundaryNeutral
shared commonNumberSeparator commonNumberSeparator
shared europeanNumber europeanNumber
shared europeanNumberSeparator europeanNumberSeparator
shared europeanNumberTerminator europeanNumberTerminator
shared leftToRight leftToRight
shared leftToRightEmbedding leftToRightEmbedding
shared leftToRightOverride leftToRightOverride
shared letterLowercase letterLowercase

The general category Ll (Lowercase Letter).

shared letterModifier letterModifier

The general category Lm (Modifier Letter).

shared letterOther letterOther

The general category Lo (Other Letter).

shared letterTitlecase letterTitlecase

The general category Lt (Titlecase Letter).

shared letterUppercase letterUppercase

The general category Lu (Uppercase Letter).

shared markCombiningSpacing markCombiningSpacing

The general category Mc (Spacing Mark).

shared markEnclosing markEnclosing

The general category Me (Enclosing Mark).

shared markNonspacing markNonspacing

The general category Mn (Nonspacing Mark).

shared nonspacingMark nonspacingMark
shared numberDecimalDigit numberDecimalDigit

The general category Nd (Decimal Number).

shared numberLetter numberLetter

The general category Nl (Letter Number).

shared numberOther numberOther

The general category No (Other Number).

shared otherControl otherControl

The general category Cc (Control).

shared otherFormat otherFormat

The general category Cf (Format).

shared otherNeutrals otherNeutrals
shared otherPrivateUse otherPrivateUse

The general category Co (Private Use).

shared otherSurrogate otherSurrogate

The general category Cs (Surrogate).

shared otherUnassigned otherUnassigned

The general category Cn (Unassigned).

shared paragraphSeparator paragraphSeparator
shared popDirectionalFormat popDirectionalFormat
shared punctuationClose punctuationClose

The general category Pe (Close Punctuation).

shared punctuationConnector punctuationConnector

The general category Pc (Connector Punctuation).

shared punctuationDash punctuationDash

The general category Pd (Dash Punctuation).

shared punctuationFinalQuote punctuationFinalQuote

The general category Pf (Final Punctuation).

shared punctuationInitialQuote punctuationInitialQuote

The general category Pi (Initial Punctuation).

shared punctuationOpen punctuationOpen

The general category Ps (Open Punctuation).

shared punctuationOther punctuationOther

The general category Po (Other Punctuation).

shared rightToLeft rightToLeft
shared rightToLeftArabic rightToLeftArabic
shared rightToLeftEmbedding rightToLeftEmbedding
shared rightToLeftOverride rightToLeftOverride
shared segmentSeparator segmentSeparator
shared separatorLine separatorLine

The general category Zl (Line Separator).

shared separatorParagraph separatorParagraph

The general category Zp (Paragraph Separator).

shared separatorSpace separatorSpace

The general category Zs (Space Separator).

shared symbolCurrency symbolCurrency

The general category Sc (Currency Symbol).

shared symbolMath symbolMath

The general category Sm (Math Symbol).

shared symbolModifier symbolModifier

The general category Sk (Modifier Symbol).

shared symbolOther symbolOther

The general category So (Other Symbol).

shared undefined undefined
shared String? unicodeVersion

The version of the Unicode standard being used, or null if this information was not available.

shared whitespace whitespace
Functions
shared Boolean assigned(Integer codePoint)

Determine if the given integer code point is assigned a Unicode character.
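Since the module targets the Java platform, the check can be illustrated with the JDK directly. The following is a minimal Java sketch, assuming the behaviour matches java.lang.Character.isDefined; whether the module actually delegates to that method is an assumption, and the class name AssignedSketch is hypothetical.

```java
final class AssignedSketch {
    // True when the JDK's Unicode tables assign a character to the code point.
    static boolean assigned(int codePoint) {
        return Character.isDefined(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(assigned(0x41));     // U+0041 LATIN CAPITAL LETTER A
        System.out.println(assigned(0x10FFFF)); // U+10FFFF is a noncharacter (category Cn)
    }
}
```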

shared String characterName(Character character)

The Unicode name of the given character.
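For illustration, the JDK exposes the same information through java.lang.Character.getName (available since Java 7). This is a sketch under the assumption that the two agree for assigned characters; the class name CharacterNameSketch is hypothetical.

```java
final class CharacterNameSketch {
    // Returns the Unicode name of the code point, or null if it is unassigned.
    static String characterName(int codePoint) {
        return Character.getName(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(characterName('A')); // LATIN CAPITAL LETTER A
    }
}
```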

shared Directionality directionality(Character character)

The directionality of the given character.
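A rough Java sketch of the same lookup, assuming the values listed on this page correspond to the DIRECTIONALITY_* constants of java.lang.Character. Only a handful of cases are mapped, and the returned strings merely echo the value names above; the class DirectionalitySketch is hypothetical.

```java
final class DirectionalitySketch {
    // Maps a few JDK directionality constants to the value names used on this page.
    static String directionality(int codePoint) {
        switch (Character.getDirectionality(codePoint)) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:        return "leftToRight";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:        return "rightToLeft";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC: return "rightToLeftArabic";
            case Character.DIRECTIONALITY_EUROPEAN_NUMBER:      return "europeanNumber";
            case Character.DIRECTIONALITY_WHITESPACE:           return "whitespace";
            default:                                            return "other"; // not mapped in this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(directionality('A'));    // leftToRight
        System.out.println(directionality(0x05D0)); // rightToLeft (HEBREW LETTER ALEF)
        System.out.println(directionality(0x0627)); // rightToLeftArabic (ARABIC LETTER ALEF)
    }
}
```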

shared GeneralCategory generalCategory(Character character)

The general category of the given character.
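As an illustration, java.lang.Character.getType returns the general category as an integer constant. The sketch below maps a few of those constants to the two-letter Unicode abbreviations; that the module's GeneralCategory values line up with these constants is an assumption, and GeneralCategorySketch is a hypothetical name.

```java
final class GeneralCategorySketch {
    // Translates a few JDK category constants into Unicode two-letter abbreviations.
    static String generalCategory(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:     return "Lu";
            case Character.LOWERCASE_LETTER:     return "Ll";
            case Character.DECIMAL_DIGIT_NUMBER: return "Nd";
            case Character.SPACE_SEPARATOR:      return "Zs";
            case Character.CURRENCY_SYMBOL:      return "Sc";
            case Character.UNASSIGNED:           return "Cn";
            default:                             return "??"; // not mapped in this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(generalCategory('a')); // Ll
        System.out.println(generalCategory('$')); // Sc
    }
}
```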

shared {String*} graphemes(String text, String tag = ...)

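The graphemes contained in the given string. A string may contain fewer graphemes than code points, because a single grapheme (a user-perceived character) can be made up of several code points, such as a base letter followed by a combining mark.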

Parameters:
  • text

    The string

  • tag = system.locale

    The IETF BCP 47 language tag string of the locale.
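Grapheme segmentation of this kind can be sketched in Java with java.text.BreakIterator.getCharacterInstance. This assumes the JDK's character boundaries are an acceptable stand-in for the module's behaviour, which this page does not state; the class name GraphemeSketch is hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class GraphemeSketch {
    // Splits text at the character (grapheme) boundaries reported by the JDK.
    static List<String> graphemes(String text, String tag) {
        BreakIterator it = BreakIterator.getCharacterInstance(Locale.forLanguageTag(tag));
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            result.add(text.substring(start, end));
        }
        return result;
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT forms a single grapheme,
        // so four code points yield three graphemes.
        System.out.println(graphemes("ae\u0301b", "en").size()); // 3
    }
}
```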

shared String lowercase(String string, String tag = ...)

Convert the given string to lowercase according to the rules of the locale with the given language tag.

Parameters:
  • string

    The string to convert to lowercase.

  • tag = system.locale

    The IETF BCP 47 language tag string of the locale.
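To show why the locale matters, here is a small Java sketch built on String.toLowerCase(Locale) and Locale.forLanguageTag. That the module's function yields the same results is an assumption; LowercaseSketch is a hypothetical name.

```java
import java.util.Locale;

final class LowercaseSketch {
    // Lowercases using the case-mapping rules of the locale named by the BCP 47 tag.
    static String lowercase(String string, String tag) {
        return string.toLowerCase(Locale.forLanguageTag(tag));
    }

    public static void main(String[] args) {
        System.out.println(lowercase("TITLE", "en")); // title
        // In Turkish, capital I lowercases to dotless i (U+0131).
        System.out.println(lowercase("TITLE", "tr").equals("t\u0131tle")); // true
    }
}
```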

shared Boolean privateUse(Integer codePoint)

Determine if the given integer code point belongs to a Unicode Private Use Area.
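One way to express this check in Java is to compare the general category with Character.PRIVATE_USE. This is only a sketch of the idea; how the module implements the test is not stated here, and PrivateUseSketch is a hypothetical name.

```java
final class PrivateUseSketch {
    // True when the JDK classifies the code point as general category Co (Private Use).
    static boolean privateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    public static void main(String[] args) {
        System.out.println(privateUse(0xE000)); // first code point of the BMP Private Use Area
        System.out.println(privateUse('A'));
    }
}
```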

shared {String*} sentences(String text, String tag = ...)

The sentences contained in the given string, according to the rules of the given locale. Whitespace is trimmed from the beginning and end of each sentence, but whitespace contained within the sentence is not normalized.

Parameters:
  • text

    The string

  • tag = system.locale

    The IETF BCP 47 language tag string of the locale.
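The described behaviour (split into sentences, trim each one, leave inner whitespace alone) can be approximated in Java with java.text.BreakIterator.getSentenceInstance. This is a sketch under that assumption; String.trim only strips characters up to U+0020, which may differ from the module's notion of whitespace, and SentenceSketch is a hypothetical name.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SentenceSketch {
    // Splits text at sentence boundaries and trims each sentence at both ends.
    static List<String> sentences(String text, String tag) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.forLanguageTag(tag));
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence); // inner whitespace is left untouched
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sentences("Hello world. How are you?", "en"));
        // [Hello world., How are you?]
    }
}
```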

shared String uppercase(String string, String tag = ...)

Convert the given string to uppercase according to the rules of the locale with the given language tag.

Parameters:
  • string

    The string to convert to uppercase.

  • tag = system.locale

    The IETF BCP 47 language tag string of the locale.
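A comparable Java sketch for uppercasing, built on String.toUpperCase(Locale). It shows that the result can depend on the locale and can even change the string's length; equivalence with the module's function is assumed rather than confirmed, and UppercaseSketch is a hypothetical name.

```java
import java.util.Locale;

final class UppercaseSketch {
    // Uppercases using the case-mapping rules of the locale named by the BCP 47 tag.
    static String uppercase(String string, String tag) {
        return string.toUpperCase(Locale.forLanguageTag(tag));
    }

    public static void main(String[] args) {
        System.out.println(uppercase("i", "en"));                  // I
        // In Turkish, small i uppercases to dotted capital I (U+0130).
        System.out.println(uppercase("i", "tr").equals("\u0130")); // true
        // U+00DF (sharp s) has no single-character uppercase mapping here and becomes "SS".
        System.out.println(uppercase("stra\u00DFe", "de"));        // STRASSE
    }
}
```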

shared {String*} words(String text, String tag = ...)

The words and punctuation contained in the given string, according to the rules of the given locale. Any non-whitespace character not contained in a word is treated as a whole word. All whitespace characters are discarded.

Parameters:
  • text

    The string

  • tag = system.locale

    The IETF BCP 47 language tag string of the locale.
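The rules described above (punctuation becomes its own word, whitespace is dropped) can be sketched in Java with java.text.BreakIterator.getWordInstance. Treating a segment as whitespace when String.trim leaves it empty is a simplification chosen for this example, and WordSketch is a hypothetical name.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class WordSketch {
    // Splits text at word boundaries, keeping punctuation and discarding whitespace runs.
    static List<String> words(String text, String tag) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.forLanguageTag(tag));
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (!segment.trim().isEmpty()) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!", "en")); // [Hello, ,, world, !]
    }
}
```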

Classes
shared abstract Directionality

Enumerates the Directionalities defined by the Unicode specification.

shared abstract GeneralCategory

Enumerates the major classes of General Category defined by the Unicode specification.

shared abstract Letter

Enumerates the general categories in the Letter major class.

shared abstract Mark

Enumerates the general categories in the Mark major class.

shared abstract Number

Enumerates the general categories in the Number major class.

shared abstract Other

Enumerates the general categories in the Other major class.

shared abstract Punctuation

Enumerates the general categories in the Punctuation major class.

shared abstract Separator

Enumerates the general categories in the Separator major class.

shared abstract Symbol

Enumerates the general categories in the Symbol major class.

shared arabicNumber
shared boundaryNeutral
shared commonNumberSeparator
shared europeanNumber
shared europeanNumberSeparator
shared europeanNumberTerminator
shared leftToRight
shared leftToRightEmbedding
shared leftToRightOverride
shared letterLowercase

The general category Ll (Lowercase Letter).

shared letterModifier

The general category Lm (Modifier Letter).

shared letterOther

The general category Lo (Other Letter).

shared letterTitlecase

The general category Lt (Titlecase Letter).

shared letterUppercase

The general category Lu (Uppercase Letter).

shared markCombiningSpacing

The general category Mc (Spacing Mark).

shared markEnclosing

The general category Me (Enclosing Mark).

shared markNonspacing

The general category Mn (Nonspacing Mark).

shared nonspacingMark
shared numberDecimalDigit

The general category Nd (Decimal Number).

shared numberLetter

The general category Nl (Letter Number).

shared numberOther

The general category No (Other Number).

shared otherControl

The general category Cc (Control).

shared otherFormat

The general category Cf (Format).

shared otherNeutrals
shared otherPrivateUse

The general category Co (Private Use).

shared otherSurrogate

The general category Cs (Surrogate).

shared otherUnassigned

The general category Cn (Unassigned).

shared paragraphSeparator
shared popDirectionalFormat
shared punctuationClose

The general category Pe (Close Punctuation).

shared punctuationConnector

The general category Pc (Connector Punctuation).

shared punctuationDash

The general category Pd (Dash Punctuation).

shared punctuationFinalQuote

The general category Pf (Final Punctuation).

shared punctuationInitialQuote

The general category Pi (Initial Punctuation).

shared punctuationOpen

The general category Ps (Open Punctuation).

shared punctuationOther

The general category Po (Other Punctuation).

shared rightToLeft
shared rightToLeftArabic
shared rightToLeftEmbedding
shared rightToLeftOverride
shared segmentSeparator
shared separatorLine

The general category Zl (Line Separator).

shared separatorParagraph

The general category Zp (Paragraph Separator).

shared separatorSpace

The general category Zs (Space Separator).

shared symbolCurrency

The general category Sc (Currency Symbol).

shared symbolMath

The general category Sm (Math Symbol).

shared symbolModifier

The general category Sk (Modifier Symbol).

shared symbolOther

The general category So (Other Symbol).

shared undefined
shared whitespace