org.apache.lucene.analysis (Lucene 8.9.0 API)

Interface Summary
Interface	Description
BaseTokenStreamTestCase.CheckClearAttributesAttribute	Attribute that records if it was cleared or not.

Class Summary
Class	Description
BaseTokenStreamTestCase	Base class for all Lucene unit tests that use TokenStreams.
BaseTokenStreamTestCase.CheckClearAttributesAttributeImpl	Attribute that records if it was cleared or not.
CannedBinaryTokenStream	TokenStream from a canned list of binary (BytesRef-based) tokens.
CannedBinaryTokenStream.BinaryToken	Represents a binary token.
CannedTokenStream	TokenStream from a canned list of Tokens.
CollationTestBase	Base test class for testing Unicode collation.
CrankyTokenFilter	Throws IOException from random TokenStream methods.
LookaheadTokenFilter<T extends LookaheadTokenFilter.Position>	An abstract TokenFilter to make it easier to build graph token filters requiring some lookahead.
LookaheadTokenFilter.Position	Holds all state for a single position; subclass this to record other state at each position.
MockAnalyzer	Analyzer for testing.
MockBytesAnalyzer	Analyzer for testing that encodes terms as UTF-16 bytes.
MockCharFilter	The purpose of this CharFilter is to send offsets out of bounds if the analyzer doesn't use correctOffset or does incorrect offset math.
MockFixedLengthPayloadFilter	TokenFilter that adds random fixed-length payloads.
MockGraphTokenFilter	Randomly inserts overlapped (posInc=0) tokens with posLength sometimes > 1.
MockHoleInjectingTokenFilter	Randomly injects holes (similar to what a stop filter would do).
MockLowerCaseFilter	A lowercasing TokenFilter.
MockPayloadAnalyzer	Wraps a whitespace tokenizer with a filter that sets the first token, and odd tokens to posinc=1, and all others to 0, encoding the position as pos: XXX in the payload.
MockRandomLookaheadTokenFilter	Uses LookaheadTokenFilter to randomly peek at future tokens.
MockReaderWrapper	Wraps a Reader, and can throw random or fixed exceptions, and spoon feed read chars.
MockSynonymAnalyzer	Adds the synonym "dog" for "dogs", and the synonym "cavy" for "guinea pig".
MockSynonymFilter	Adds the synonym "dog" for "dogs", and the synonym "cavy" for "guinea pig".
MockTokenFilter	A TokenFilter for testing that removes terms accepted by a DFA.
MockTokenizer	Tokenizer for testing.
MockUTF16TermAttributeImpl	Extension of CharTermAttributeImpl that encodes the term text as UTF-16 bytes instead of as UTF-8 bytes.
MockVariableLengthPayloadFilter	TokenFilter that adds random variable-length payloads.
SimplePayloadFilter	Simple payload filter that sets the payload as pos: XXXX.
Token	A Token is an occurrence of a term from the text of a field.
TokenStreamToDot	Consumes a TokenStream and outputs the dot (graphviz) string (graph).
ValidatingTokenFilter	A TokenFilter that checks consistency of the tokens (e.g., offsets are consistent with one another).
VocabularyAssert	Utility class for doing vocabulary-based stemming tests.
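
As a rough illustration of how some of the classes listed above can be combined, the sketch below feeds a fixed list of Token instances through CannedTokenStream into MockLowerCaseFilter and checks the result with BaseTokenStreamTestCase. The class name TestCannedTokensSketch, the method name, and the sample terms are hypothetical. The sketch assumes the Lucene 8.9.0 test framework and its dependencies (JUnit 4 and randomizedtesting) are on the test classpath.

```java
import java.io.IOException;

import org.apache.lucene.analysis.BaseTokenStreamTestCase;
import org.apache.lucene.analysis.CannedTokenStream;
import org.apache.lucene.analysis.MockLowerCaseFilter;
import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.TokenStream;

// Hypothetical test class; not part of Lucene itself.
public class TestCannedTokensSketch extends BaseTokenStreamTestCase {

  public void testCannedTokensThroughFilter() throws IOException {
    // A canned stream needs no tokenizer: each Token carries its own
    // term text, start offset, and end offset.
    TokenStream canned = new CannedTokenStream(
        new Token("Guinea", 0, 6),
        new Token("Pig", 7, 10));

    // Wrap the canned stream in the filter under test.
    TokenStream stream = new MockLowerCaseFilter(canned);

    // Check the terms together with their start and end offsets.
    assertTokenStreamContents(
        stream,
        new String[] {"guinea", "pig"},
        new int[] {0, 7},
        new int[] {6, 10});
  }
}
```

A canned stream is mostly useful when a filter has to be exercised with an exact token sequence that would be awkward to produce from raw text.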

Package org.apache.lucene.analysis Description

Support for testing analysis components.

The main classes of interest are:

BaseTokenStreamTestCase: Using its helper methods is highly recommended (especially in conjunction with MockAnalyzer or MockTokenizer), as they contain many assertions and checks to catch bugs.
MockTokenizer: Tokenizer for testing. It serves as a replacement for the WHITESPACE, SIMPLE, and KEYWORD tokenizers. If you are writing a component such as a TokenFilter, test it by wrapping this tokenizer to get the extra checks.
MockAnalyzer: Analyzer for testing. It uses MockTokenizer for additional verification. If you are testing a custom component such as a query parser or analyzer wrapper that consumes analysis streams, test it with this analyzer.
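
The three classes above are typically used together. The following is a minimal sketch of that pattern, not a definitive recipe. The class name TestMockAnalysisSketch, the method names, and the sample inputs are hypothetical. MockLowerCaseFilter stands in for whatever TokenFilter is actually under test. The sketch assumes the Lucene 8.9.0 test framework and its dependencies (JUnit 4 and randomizedtesting) are on the test classpath.

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.BaseTokenStreamTestCase;
import org.apache.lucene.analysis.MockAnalyzer;
import org.apache.lucene.analysis.MockLowerCaseFilter;
import org.apache.lucene.analysis.MockTokenizer;
import org.apache.lucene.analysis.Tokenizer;

// Hypothetical test class; not part of Lucene itself.
public class TestMockAnalysisSketch extends BaseTokenStreamTestCase {

  // Test a TokenFilter by wrapping MockTokenizer, as recommended above.
  public void testFilterOverMockTokenizer() throws IOException {
    Analyzer analyzer = new Analyzer() {
      @Override
      protected TokenStreamComponents createComponents(String fieldName) {
        // WHITESPACE mode with lowerCase=false, so any lowercasing
        // in the output comes from the filter, not the tokenizer.
        Tokenizer tokenizer = new MockTokenizer(MockTokenizer.WHITESPACE, false);
        return new TokenStreamComponents(tokenizer, new MockLowerCaseFilter(tokenizer));
      }
    };

    // Fixed input with the expected output terms.
    assertAnalyzesTo(analyzer, "Quick Brown FOX", new String[] {"quick", "brown", "fox"});

    // Randomized input; the helper runs its consistency checks on each iteration.
    checkRandomData(random(), analyzer, 100);
    analyzer.close();
  }

  // MockAnalyzer(Random) splits on whitespace and lowercases.
  public void testMockAnalyzerDefaults() throws IOException {
    Analyzer analyzer = new MockAnalyzer(random());
    assertAnalyzesTo(analyzer, "Hello World", new String[] {"hello", "world"});
    analyzer.close();
  }
}
```

A component that consumes analysis streams, such as a query parser, can be handed the MockAnalyzer instance in the same way as any other Analyzer.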