public class AutomatonQuery extends MultiTermQuery implements Accountable
A Query that will match terms against a finite-state machine.
This query will match documents that contain terms accepted by a given finite-state machine. The automaton can be constructed with the org.apache.lucene.util.automaton API. Alternatively, it can be created from a regular expression with RegexpQuery or from the standard Lucene wildcard syntax with WildcardQuery.
When the query is executed, it creates an equivalent DFA of the finite-state machine and enumerates the term dictionary in an intelligent way to reduce the number of comparisons. For example, the regular expression [dl]og? will make approximately four comparisons: do, dog, lo, and log.
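The four-comparison example can be illustrated with a small, standard-library-only sketch. It is not Lucene's implementation: the transition table, the class name DfaEnumerationSketch, and the methods collect and matchingTerms are all hypothetical, and the sketch assumes an acyclic DFA (a finite language) so that the accepted strings can simply be listed in sorted order and looked up one by one. Lucene's real enumeration also has to handle automata that accept infinitely many strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class DfaEnumerationSketch {
    // Hand-built DFA for [dl]og? (hypothetical layout, not Lucene's Automaton class).
    // TRANSITIONS.get(state) maps an input character to the next state.
    static final List<TreeMap<Character, Integer>> TRANSITIONS = new ArrayList<>();
    static final boolean[] ACCEPTING = {false, false, true, true};

    static {
        for (int i = 0; i < 4; i++) {
            TRANSITIONS.add(new TreeMap<>());
        }
        TRANSITIONS.get(0).put('d', 1); // start state: d or l
        TRANSITIONS.get(0).put('l', 1);
        TRANSITIONS.get(1).put('o', 2); // then o; state 2 accepts "do" and "lo"
        TRANSITIONS.get(2).put('g', 3); // optional g; state 3 accepts "dog" and "log"
    }

    // Depth-first walk in character order, so accepted strings come out sorted.
    // Assumes an acyclic DFA; a cyclic one would recurse forever here.
    static void collect(int state, StringBuilder prefix, List<String> out) {
        if (ACCEPTING[state]) {
            out.add(prefix.toString());
        }
        for (Map.Entry<Character, Integer> e : TRANSITIONS.get(state).entrySet()) {
            prefix.append(e.getKey().charValue());
            collect(e.getValue(), prefix, out);
            prefix.setLength(prefix.length() - 1);
        }
    }

    // Probes the sorted dictionary once per accepted string; probes[0] counts the lookups.
    static List<String> matchingTerms(TreeSet<String> dictionary, int[] probes) {
        List<String> candidates = new ArrayList<>();
        collect(0, new StringBuilder(), candidates);
        List<String> matches = new ArrayList<>();
        for (String candidate : candidates) {
            probes[0]++;
            if (dictionary.contains(candidate)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<String> dictionary = new TreeSet<>(List.of("cat", "do", "dot", "log", "zebra"));
        int[] probes = new int[1];
        System.out.println(matchingTerms(dictionary, probes) + " after " + probes[0] + " probes");
    }
}
```

With the five-term dictionary in main, the sketch finds do and log after four lookups (do, dog, lo, log), regardless of how many unrelated terms the dictionary holds.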
Nested classes/interfaces inherited from class MultiTermQuery: MultiTermQuery.RewriteMethod, MultiTermQuery.TopTermsBlendedFreqScoringRewrite, MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite, MultiTermQuery.TopTermsScoringBooleanQueryRewrite
Modifier and Type | Field and Description |
---|---|
protected Automaton | automaton: the automaton to match index terms against |
protected boolean | automatonIsBinary |
protected CompiledAutomaton | compiled |
protected Term | term: term containing the field, and possibly some pattern structure |
Fields inherited from class MultiTermQuery: CONSTANT_SCORE_BOOLEAN_REWRITE, CONSTANT_SCORE_REWRITE, field, rewriteMethod, SCORING_BOOLEAN_REWRITE
Fields inherited from interface Accountable: NULL_ACCOUNTABLE
Constructor and Description |
---|
AutomatonQuery(Term term, Automaton automaton): Create a new AutomatonQuery from an Automaton. |
AutomatonQuery(Term term, Automaton automaton, int maxDeterminizedStates): Create a new AutomatonQuery from an Automaton. |
AutomatonQuery(Term term, Automaton automaton, int maxDeterminizedStates, boolean isBinary): Create a new AutomatonQuery from an Automaton. |
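Two of these constructors take a maxDeterminizedStates limit. As a rough illustration of why such a cap is useful, the sketch below runs a textbook subset construction over a small NFA and gives up once the number of DFA states would exceed the limit. Everything here is an assumption made for illustration: the class DeterminizeLimitSketch, the exception TooManyStatesException (a stand-in, not Lucene's exception class), and the NFA encoding are hypothetical, and Lucene's own determinization and its exact accounting of the limit may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class DeterminizeLimitSketch {
    /** Hypothetical stand-in for the exception thrown when the limit is exceeded. */
    static final class TooManyStatesException extends RuntimeException {
        TooManyStatesException(int maxStates) {
            super("determinizing would need more than " + maxStates + " states");
        }
    }

    // Subset construction without epsilon transitions.
    // nfa.get(state) maps an input character to the set of possible next states.
    // Returns the number of reachable DFA states, or throws if it exceeds maxStates.
    static int determinize(List<Map<Character, Set<Integer>>> nfa, int start, int maxStates) {
        Map<Set<Integer>, Integer> seen = new HashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        Set<Integer> initial = new TreeSet<>();
        initial.add(start);
        seen.put(initial, 0);
        work.add(initial);
        while (!work.isEmpty()) {
            Set<Integer> current = work.poll();
            // Group the NFA targets of every state in the subset by input character.
            Map<Character, Set<Integer>> moves = new TreeMap<>();
            for (int state : current) {
                for (Map.Entry<Character, Set<Integer>> e : nfa.get(state).entrySet()) {
                    moves.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
                }
            }
            for (Set<Integer> target : moves.values()) {
                if (!seen.containsKey(target)) {
                    if (seen.size() >= maxStates) {
                        throw new TooManyStatesException(maxStates);
                    }
                    seen.put(target, seen.size());
                    work.add(target);
                }
            }
        }
        return seen.size();
    }

    // NFA for (a|b)*a(a|b){n}: n + 2 NFA states, but 2^(n+1) DFA states.
    static List<Map<Character, Set<Integer>>> nthFromLastIsA(int n) {
        List<Map<Character, Set<Integer>>> nfa = new ArrayList<>();
        for (int i = 0; i <= n + 1; i++) {
            nfa.add(new HashMap<>());
        }
        nfa.get(0).put('a', new TreeSet<>(List.of(0, 1)));
        nfa.get(0).put('b', new TreeSet<>(List.of(0)));
        for (int i = 1; i <= n; i++) {
            nfa.get(i).put('a', new TreeSet<>(List.of(i + 1)));
            nfa.get(i).put('b', new TreeSet<>(List.of(i + 1)));
        }
        return nfa;
    }

    public static void main(String[] args) {
        System.out.println(determinize(nthFromLastIsA(3), 0, 10_000));
    }
}
```

The example NFA has only five states for n = 3, yet its DFA needs 16. Each extra position doubles that number, which is the kind of growth a state limit guards against.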
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj): Override and implement query instance equivalence properly in a subclass. |
Automaton | getAutomaton(): Returns the automaton used to create this query. |
protected TermsEnum | getTermsEnum(Terms terms, AttributeSource atts): Construct the enumeration to be used, expanding the pattern term. |
int | hashCode(): Override and implement query hash code properly in a subclass. |
boolean | isAutomatonBinary(): Is this a binary (byte) oriented automaton. |
long | ramBytesUsed(): Return the memory usage of this object in bytes. |
String | toString(String field): Prints a query to a string, with field assumed to be the default field and omitted. |
void | visit(QueryVisitor visitor): Recurse through the query tree, visiting any child queries. |
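Methods inherited from class MultiTermQuery: getField, getRewriteMethod, getTermsEnum, rewrite, setRewriteMethod
Methods inherited from class Query: classHash, createWeight, sameClassAs, toString
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface Accountable: getChildResources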
protected final Automaton automaton
protected final CompiledAutomaton compiled
protected final Term term
protected final boolean automatonIsBinary
public AutomatonQuery(Term term, Automaton automaton)
Create a new AutomatonQuery from an Automaton.
Parameters:
term - Term containing field and possibly some pattern structure. The term text is ignored.
automaton - Automaton to run; terms that are accepted are considered a match.
public AutomatonQuery(Term term, Automaton automaton, int maxDeterminizedStates)
Create a new AutomatonQuery from an Automaton.
Parameters:
term - Term containing field and possibly some pattern structure. The term text is ignored.
automaton - Automaton to run; terms that are accepted are considered a match.
maxDeterminizedStates - maximum number of states in the resulting automaton. If the automaton would need more than this many states, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex automata.
public AutomatonQuery(Term term, Automaton automaton, int maxDeterminizedStates, boolean isBinary)
Create a new AutomatonQuery from an Automaton.
Parameters:
term - Term containing field and possibly some pattern structure. The term text is ignored.
automaton - Automaton to run; terms that are accepted are considered a match.
maxDeterminizedStates - maximum number of states in the resulting automaton. If the automaton would need more than this many states, TooComplexToDeterminizeException is thrown. Higher numbers require more space but can process more complex automata.
isBinary - if true, this automaton is already binary and will not go through the UTF32ToUTF8 conversion.
protected TermsEnum getTermsEnum(Terms terms, AttributeSource atts) throws IOException
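Description copied from class: MultiTermQuery
Construct the enumeration to be used, expanding the pattern term. This method should not return null (it should instead return TermsEnum.EMPTY if no terms match). The TermsEnum must already be positioned to the first matching term.
The given AttributeSource is passed by the MultiTermQuery.RewriteMethod to share information between segments; for example, TopTermsRewrite uses it to share maximum competitive boosts.
Specified by: getTermsEnum in class MultiTermQuery
Throws: IOException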
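public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
Overrides: hashCode in class MultiTermQuery
See Also: Query.equals(Object)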
public boolean equals(Object obj)
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.
Typically a query will be equal to another only if it is an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
Overrides: equals in class MultiTermQuery
See Also: Query.sameClassAs(Object), Query.classHash()
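The equals and hashCode contract described above can be sketched without any Lucene dependency. The classes below are hypothetical: BaseQuery only imitates the role of the sameClassAs and classHash utility methods (their real implementations in Query may differ), and PatternQuery is an invented subclass whose document-filtering properties are a field name and a pattern.

```java
import java.util.Objects;

final class QueryEqualitySketch {
    /** Hypothetical stand-in for the utility methods a query base class provides. */
    abstract static class BaseQuery {
        // True only if other is non-null and has exactly the same runtime class.
        protected final boolean sameClassAs(Object other) {
            return other != null && getClass() == other.getClass();
        }

        // A hash that is stable per class, to be mixed into hashCode().
        protected final int classHash() {
            return getClass().getName().hashCode();
        }
    }

    /** Invented query type whose identity is its field name and pattern. */
    static final class PatternQuery extends BaseQuery {
        private final String field;
        private final String pattern;

        PatternQuery(String field, String pattern) {
            this.field = Objects.requireNonNull(field);
            this.pattern = Objects.requireNonNull(pattern);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!sameClassAs(obj)) {
                return false;
            }
            PatternQuery other = (PatternQuery) obj;
            return field.equals(other.field) && pattern.equals(other.pattern);
        }

        @Override
        public int hashCode() {
            // Combine the class hash with every property that equals() compares.
            return 31 * classHash() + Objects.hash(field, pattern);
        }
    }
}
```

Two PatternQuery instances built from the same field and pattern compare equal and share a hash code, which is what a cache keyed on queries relies on.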
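public String toString(String field)
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
public void visit(QueryVisitor visitor)
Description copied from class: Query
Recurse through the query tree, visiting any child queries.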
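public Automaton getAutomaton()
Returns the automaton used to create this query.
public boolean isAutomatonBinary()
Is this a binary (byte) oriented automaton.
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes.
Specified by: ramBytesUsed in interface Accountable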
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.