| Class | Description |
|---|---|
| CJKAnalyzer | An Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter, and filters stopwords with StopFilter. |
| CJKBigramFilter | Forms bigrams of CJK terms that are generated from StandardTokenizer or ICUTokenizer. |
| CJKBigramFilterFactory | Factory for CJKBigramFilter. |
| CJKWidthCharFilter | A CharFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent Basic Latin, and folds halfwidth Katakana variants into the equivalent kana. |
| CJKWidthCharFilterFactory | Factory for CJKWidthCharFilter. |
| CJKWidthFilter | A TokenFilter that normalizes CJK width differences: folds fullwidth ASCII variants into the equivalent Basic Latin, and folds halfwidth Katakana variants into the equivalent kana. |
| CJKWidthFilterFactory | Factory for CJKWidthFilter. |
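
The two transformations named in the table, width folding and bigram formation, can be illustrated with a small JDK-only sketch. This is not the Lucene implementation: the class `CjkSketch` and its methods `foldWidth` and `bigrams` are hypothetical names, and Unicode NFKC normalization is used as a stand-in for width folding. NFKC covers both mappings listed above (fullwidth ASCII to Basic Latin, halfwidth Katakana to kana) but also rewrites other compatibility characters, so it is broader than what the table describes. Returning a lone character unchanged is likewise a choice made for this sketch.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
final class CjkSketch {

    // Assumption: NFKC stands in for width folding. It maps fullwidth ASCII
    // to Basic Latin and halfwidth Katakana to regular kana, but it also
    // normalizes other compatibility characters.
    static String foldWidth(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    // Forms overlapping two-character terms from a run of characters,
    // iterating over code points so supplementary characters stay intact.
    static List<String> bigrams(String run) {
        int[] cps = run.codePoints().toArray();
        List<String> out = new ArrayList<>();
        if (cps.length == 1) {
            // Sketch choice: a lone character is returned as a single term.
            out.add(run);
            return out;
        }
        for (int i = 0; i + 1 < cps.length; i++) {
            out.add(new String(cps, i, 2));
        }
        return out;
    }

    public static void main(String[] args) {
        // Fullwidth "ABC" (U+FF21..U+FF23) folds to Basic Latin.
        System.out.println(foldWidth("\uFF21\uFF22\uFF23"));
        // Halfwidth Katakana KA, TA, KA, NA folds to regular Katakana.
        System.out.println(foldWidth("\uFF76\uFF80\uFF76\uFF85"));
        // Three Han characters yield two overlapping bigrams.
        System.out.println(bigrams("\u6771\u4EAC\u90FD"));
    }
}
```

For the three-character input 東京都, the sketch produces the overlapping terms 東京 and 京都. In the real analysis chain the table lists width normalization before bigram formation, so a caller of this sketch would apply `foldWidth` first and `bigrams` second.
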
Three analyzers are provided for Chinese, each of which treats Chinese text in a different way.
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.