Decomposition | Not followed by canonical composition | Followed by canonical composition |
---|---|---|
Canonical decomposition | NFD | NFC |
Compatible decomposition | NFKD | NFKC |
sun.text.Normalizer is an unsupported internal class, so code using it may break in later JDK releases.
It is used internally by the Collator class.
package sun.text;
public class Normalizer {
    public static Mode COMPOSE;
    public static Mode COMPOSE_COMPAT;
    public static Mode DECOMP;
    public static Mode DECOMP_COMPAT;
    public static String normalize(String str, Mode mode, int options);
    public static String compose(String source, boolean compat, int options);
    public static String decompose(String source, boolean compat, int options);
}
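As a hedged illustration of the four normalization forms, the sketch below uses the public java.text.Normalizer class (available since Java 6) rather than the internal sun.text.Normalizer listed above. That substitution, the class name NormalizationForms and the helper codePoints are assumptions made for this example only.

```java
import java.text.Normalizer;

class NormalizationForms {
    // Thin wrapper around the public API; Normalizer.Form has NFD, NFC, NFKD, NFKC.
    static String normalize(String s, Normalizer.Form form) {
        return Normalizer.normalize(s, form);
    }

    // Show each code point as U+XXXX so that combining marks are visible.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";      // precomposed e with acute accent
        String fiLigature = "\ufb01";  // compatibility character for "fi"
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form + ": "
                    + codePoints(normalize(eAcute, form)) + " | "
                    + codePoints(normalize(fiLigature, form)));
        }
    }
}
```

The canonical forms leave the ligature alone, while the compatibility forms replace it by the two letters f and i.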
The Collator class is the main public class for comparing strings. It has an equality test
boolean equals(String source, String target)
and a string comparison method
int compare(String source, String target)
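A minimal sketch of these two methods, assuming an English-locale collator obtained from the factory method; the class name CollatorCompare and the helper collate are hypothetical. It contrasts the collator with String.compareTo, which compares raw character values.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorCompare {
    // Locale.ENGLISH is an arbitrary choice for this example.
    static final Collator COLLATOR = Collator.getInstance(Locale.ENGLISH);

    // Negative, zero or positive, like any Comparator.
    static int collate(String a, String b) {
        return COLLATOR.compare(a, b);
    }

    public static void main(String[] args) {
        // Raw char values: 'a' (97) comes after 'B' (66).
        System.out.println("compareTo: " + "a".compareTo("B"));
        // Collation: the base letter a sorts before the base letter b.
        System.out.println("collator:  " + collate("a", "B"));
        System.out.println("equals:    " + COLLATOR.equals("abc", "abc"));
    }
}
```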
- Collator can take a Locale parameter in its getInstance factory method, or use the default locale
- Collator is an abstract class that must be subclassed. The JDK supplies one subclass, RuleBasedCollator
Collator applies the normalization rules (compose/decompose, compatible/non-compatible) of Normalizer. The mode is set by
setDecomposition(int decompositionMode)
with one of the values
- CANONICAL_DECOMPOSITION (Normalization Form D)
- FULL_DECOMPOSITION (Normalization Form KD)
- NO_DECOMPOSITION (default)
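The effect of the decomposition mode can be sketched as follows. With a decomposing mode, both strings are normalized before comparison, so a precomposed character and its decomposed spelling collate as equal. The class name DecompositionDemo, the helper compareWithMode and the choice of Locale.FRENCH are assumptions for illustration.

```java
import java.text.Collator;
import java.util.Locale;

class DecompositionDemo {
    // Compare two strings with a fresh collator using the given decomposition mode.
    static int compareWithMode(String a, String b, int decompositionMode) {
        Collator collator = Collator.getInstance(Locale.FRENCH);
        collator.setDecomposition(decompositionMode);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String precomposed = "\u00e9";  // e with acute as one code point
        String decomposed = "e\u0301";  // e followed by a combining acute
        System.out.println("canonical: " + compareWithMode(
                precomposed, decomposed, Collator.CANONICAL_DECOMPOSITION));

        String ligature = "\ufb01";     // the "fi" compatibility ligature
        System.out.println("full:      " + compareWithMode(
                ligature, "fi", Collator.FULL_DECOMPOSITION));
    }
}
```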
Strength | Description | Example |
---|---|---|
PRIMARY | The base letters are different | A versus B |
SECONDARY | The base letters are the same, but the accents are different | A versus Á |
TERTIARY | The letters are the same but differ by case | A versus a |
IDENTICAL | The letters are identical | A versus A |
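The strengths in the table can be tried out with setStrength. This is a sketch only: the class name StrengthDemo and the helper sameAt are hypothetical, and Locale.US is an arbitrary choice.

```java
import java.text.Collator;
import java.util.Locale;

class StrengthDemo {
    // True when the two strings collate as equal at the given strength.
    static boolean sameAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String aAcute = "\u00e1";  // a with acute accent
        System.out.println("PRIMARY   a/b: " + sameAt(Collator.PRIMARY, "a", "b"));
        System.out.println("PRIMARY   a/\u00e1: " + sameAt(Collator.PRIMARY, "a", aAcute));
        System.out.println("SECONDARY a/\u00e1: " + sameAt(Collator.SECONDARY, "a", aAcute));
        System.out.println("SECONDARY a/A: " + sameAt(Collator.SECONDARY, "a", "A"));
        System.out.println("TERTIARY  a/A: " + sameAt(Collator.TERTIARY, "a", "A"));
    }
}
```

A weaker strength ignores the finer differences: PRIMARY ignores accents and case, SECONDARY ignores case only.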
The Collator class has a factory method that takes a Locale:
Collator.getInstance(Locale)
Collator orders strings according to the locale rules, the collator strength and the canonicalisation (decomposition mode):
if (collate.compare(str1, str2) > 0) ...
You can make your own rules for RuleBasedCollator
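As a small, hedged example of custom rules, the sketch below builds a RuleBasedCollator from a made-up rule string that reverses the order of a, b and c. The rule string, the class name CustomRules and the helper reversedAbc are all invented for illustration.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CustomRules {
    // Hypothetical rule string: c sorts before b, which sorts before a.
    static final String REVERSED_ABC = "< c < b < a";

    static RuleBasedCollator reversedAbc() {
        try {
            return new RuleBasedCollator(REVERSED_ABC);
        } catch (ParseException e) {
            // The constructor rejects malformed rule strings.
            throw new IllegalStateException("bad collation rules", e);
        }
    }

    public static void main(String[] args) {
        List<String> letters = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Collections.sort(letters, reversedAbc());
        System.out.println(letters);
    }
}
```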
Collections has a static method
sort(List<T> list, Comparator<? super T> c)
and Collator implements Comparator, so a Collator can be passed to sort directly
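A minimal sketch of sorting with a Collator as the Comparator, compared with the natural String order. The class name CollatorSort and the helper sorted are hypothetical.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class CollatorSort {
    // Return a sorted copy, leaving the caller's list untouched.
    static List<String> sorted(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cherry", "Banana", "apple");

        List<String> natural = new ArrayList<>(words);
        Collections.sort(natural);  // raw char order: upper case first
        System.out.println("natural:  " + natural);

        System.out.println("collator: " + sorted(words, Locale.ENGLISH));
    }
}
```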
The BreakIterator class can be used to segment text into characters, words, lines or sentences
BreakIterator.getCharacterInstance(Locale l);
BreakIterator.getWordInstance(Locale l);
BreakIterator.getLineInstance(Locale l);
BreakIterator.getSentenceInstance(Locale l);
The text is set by
iterator.setText(String s)
and the boundaries are found by
int iterator.first();
int iterator.next();
which return indexes into the current text string, and BreakIterator.DONE when there
are no more
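The first()/next() loop can be sketched as below for word boundaries. The class name WordBreaks and the helper words are hypothetical, and skipping segments that do not start with a letter or digit is this example's own choice for dropping spaces and punctuation.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class WordBreaks {
    // Collect the segments between word boundaries that look like words.
    static List<String> words(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getWordInstance(locale);
        iterator.setText(text);
        List<String> result = new ArrayList<>();
        int start = iterator.first();
        for (int end = iterator.next();
                end != BreakIterator.DONE;
                start = end, end = iterator.next()) {
            String segment = text.substring(start, end);
            // Spaces and punctuation are segments too; keep only words.
            if (Character.isLetterOrDigit(segment.codePointAt(0))) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, big world.", Locale.ENGLISH));
    }
}
```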