| Package | Description |
|---|---|
| org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
| org.apache.lucene.analysis.tokenattributes | General-purpose attributes for text analysis. |
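As a quick illustration of how the two packages fit together, the sketch below (assuming `lucene-core` is on the classpath) drives a `StandardTokenizer` and reads each token's text through a `CharTermAttribute`; the sentence string is an arbitrary example input.

```java
import java.io.IOException;
import java.io.StringReader;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class TokenizeExample {
    public static void main(String[] args) throws IOException {
        // StandardTokenizer splits text using the UAX #29 word-break rules.
        StandardTokenizer tokenizer = new StandardTokenizer();
        tokenizer.setReader(new StringReader("Lucene is a search library."));
        // The attribute instance is registered once; its contents are
        // overwritten in place on every incrementToken() call.
        CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
        tokenizer.reset();
        while (tokenizer.incrementToken()) {
            System.out.println(term.toString());  // one token per line, punctuation dropped
        }
        tokenizer.end();
        tokenizer.close();
    }
}
```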
| Class | Description |
|---|---|
| CharTermAttribute | The term text of a Token. |
| Class | Description |
|---|---|
| BytesTermAttribute | This attribute can be used if you have the raw term bytes to be indexed. |
| CharTermAttribute | The term text of a Token. |
| CharTermAttributeImpl | Default implementation of CharTermAttribute. |
| FlagsAttribute | This attribute can be used to pass different flags down the Tokenizer chain, e.g. from one TokenFilter to another one. |
| KeywordAttribute | This attribute can be used to mark a token as a keyword. |
| OffsetAttribute | The start and end character offset of a Token. |
| PackedTokenAttributeImpl | Default implementation of the common attributes used by Lucene: CharTermAttribute, TypeAttribute, PositionIncrementAttribute, PositionLengthAttribute, OffsetAttribute, TermFrequencyAttribute. |
| PayloadAttribute | The payload of a Token. |
| PayloadAttributeImpl | Default implementation of PayloadAttribute. |
| PositionIncrementAttribute | Determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching. |
| PositionLengthAttribute | Determines how many positions this token spans. |
| TermFrequencyAttribute | Sets the custom term frequency of a term within one document. |
| TermToBytesRefAttribute | This attribute is requested by TermsHashPerField to index the contents. |
| TypeAttribute | A Token's lexical type. |
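Several of the attributes above can be read side by side from one stream. The sketch below (again assuming `lucene-core` on the classpath, with an arbitrary example input) inspects `OffsetAttribute` and `PositionIncrementAttribute` alongside the term text while tokenizing with `StandardTokenizer`.

```java
import java.io.IOException;
import java.io.StringReader;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

public class AttributeExample {
    public static void main(String[] args) throws IOException {
        StandardTokenizer ts = new StandardTokenizer();
        ts.setReader(new StringReader("quick brown fox"));
        // All three attributes share the stream's state and are updated
        // together on each incrementToken().
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        OffsetAttribute offset = ts.addAttribute(OffsetAttribute.class);
        PositionIncrementAttribute posIncr = ts.addAttribute(PositionIncrementAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
            // term text, character offsets into the input, and the gap
            // from the previous token's position
            System.out.printf("%s [%d-%d] +%d%n",
                term.toString(), offset.startOffset(), offset.endOffset(),
                posIncr.getPositionIncrement());
        }
        ts.end();
        ts.close();
    }
}
```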
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.