WhitespaceTokenizerFactory (Lucene 8.9.0 API)

java.lang.Object
- org.apache.lucene.analysis.util.AbstractAnalysisFactory
  - org.apache.lucene.analysis.util.TokenizerFactory
    - org.apache.lucene.analysis.core.WhitespaceTokenizerFactory

```
public class WhitespaceTokenizerFactory
extends TokenizerFactory
```
Factory for WhitespaceTokenizer.
```
 <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode"  maxTokenLen="256"/>
   </analyzer>
 </fieldType>
```
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; must be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). This rarely needs changing; when it is omitted, the default is CharTokenizer::DEFAULT_MAX_TOKEN_LEN
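The two options above can be illustrated without Lucene on the classpath. The following is only a rough sketch of the idea, not Lucene's implementation: it assumes the "java" rule corresponds to `Character.isWhitespace` and approximates the "unicode" rule with the Unicode White_Space property as exposed by `java.util.regex`. It also assumes that a token reaching `maxTokenLen` is cut and the remainder continues as a new token. The class and method names (`WhitespaceRuleSketch`, `tokenize`, `isSeparator`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class WhitespaceRuleSketch {
    // Assumption: stand-in for the "unicode" rule, using the Unicode White_Space property.
    private static final Pattern UNICODE_WS = Pattern.compile("\\p{IsWhite_Space}");

    // Decides whether a code point separates tokens under the given rule.
    static boolean isSeparator(int codePoint, String rule) {
        if ("java".equals(rule)) {
            return Character.isWhitespace(codePoint);
        }
        if ("unicode".equals(rule)) {
            return UNICODE_WS.matcher(new String(Character.toChars(codePoint))).matches();
        }
        throw new IllegalArgumentException("rule must be \"java\" or \"unicode\": " + rule);
    }

    // Splits text on separators; tokens that reach maxTokenLen chars are cut (assumed behavior).
    static List<String> tokenize(String text, String rule, int maxTokenLen) {
        if (maxTokenLen <= 0) {
            throw new IllegalArgumentException("maxTokenLen must be greater than 0");
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isSeparator(cp, rule)) {
                flush(current, tokens);
            } else {
                current.appendCodePoint(cp);
                if (current.length() >= maxTokenLen) {
                    flush(current, tokens);
                }
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Emits the pending token, if any, and clears the buffer.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "a\u00A0b c"; // contains a no-break space between "a" and "b"
        System.out.println(tokenize(text, "java", 255).size());    // no-break space is not a separator here
        System.out.println(tokenize(text, "unicode", 255).size()); // no-break space is a separator here
    }
}
```

The no-break space (U+00A0) is the usual case where the two rules disagree: `Character.isWhitespace` excludes it, while the Unicode White_Space property includes it.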
Since:

3.1

SPI Name (Note: This is case-insensitive. e.g., if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):

"whitespace"

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`NAME` SPI name
`static String`	`RULE_JAVA`
`static String`	`RULE_UNICODE`
- Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
  LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion

Constructor Summary

Constructors
Constructor and Description
`WhitespaceTokenizerFactory(Map<String,String> args)` Creates a new WhitespaceTokenizerFactory

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`Tokenizer`	`create(AttributeFactory factory)` Creates a TokenStream of the specified input using the given AttributeFactory

Methods inherited from class org.apache.lucene.analysis.util.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers

Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - NAME
```
public static final String NAME
```
    SPI name
    
    See Also:
    
    Constant Field Values
  - RULE_JAVA
```
public static final String RULE_JAVA
```
    See Also:
    
    Constant Field Values
  - RULE_UNICODE
```
public static final String RULE_UNICODE
```
    See Also:
    
    Constant Field Values
- Constructor Detail
  - WhitespaceTokenizerFactory
```
public WhitespaceTokenizerFactory(Map<String,String> args)
```
    Creates a new WhitespaceTokenizerFactory
- Method Detail
  - create
```
public Tokenizer create(AttributeFactory factory)
```
    Description copied from class: TokenizerFactory
    
    Creates a TokenStream of the specified input using the given AttributeFactory
    
    Specified by:
    
    create in class TokenizerFactory
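Outside of Solr, the constructor and `create` method documented above can be used directly. The following is a minimal sketch, assuming Lucene 8.9.0 (`lucene-core` and `lucene-analyzers-common`) is on the classpath; the class name `WhitespaceFactoryExample` and the sample input are illustrative only. The argument keys mirror the `rule` and `maxTokenLen` attributes from the XML example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizerFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.AttributeFactory;

public class WhitespaceFactoryExample {
    public static void main(String[] args) throws IOException {
        // Use a mutable map: the factory reads its options from it during construction.
        Map<String, String> params = new HashMap<>();
        params.put("rule", "unicode");
        params.put("maxTokenLen", "256");

        WhitespaceTokenizerFactory factory = new WhitespaceTokenizerFactory(params);

        // Tokenizer is Closeable, so try-with-resources releases it at the end.
        try (Tokenizer tokenizer = factory.create(AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY)) {
            CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
            tokenizer.setReader(new StringReader("quick brown fox"));
            tokenizer.reset();                  // required before the first incrementToken()
            while (tokenizer.incrementToken()) {
                System.out.println(term.toString());
            }
            tokenizer.end();                    // finalizes end-of-stream state
        }
    }
}
```

The `reset()`, `incrementToken()`, `end()`, `close()` sequence follows the usual TokenStream consumer workflow; skipping `reset()` typically results in an exception at the first `incrementToken()` call.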

Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.