public class HighFreqTerms extends Object
The HighFreqTerms class extracts the top n most frequent terms (by document frequency) from an existing Lucene index and reports their document frequency.
If the -t flag is given, both document frequency and total tf (total number of occurrences) are reported, ordered by descending total tf.
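The difference between the two orderings can be illustrated without Lucene. The following is a minimal stand-alone sketch, not the tool's implementation: `TopTermsSketch`, `Stats`, `collect`, `top`, `BY_DOC_FREQ` and `BY_TOTAL_TF` are hypothetical names introduced here, and the "index" is assumed to be a list of whitespace-tokenized strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Stand-alone illustration; none of these names come from Lucene. */
final class TopTermsSketch {

    /** Hypothetical per-term counters. */
    static final class Stats {
        final String term;
        int docFreq;        // number of documents that contain the term
        long totalTermFreq; // total occurrences across all documents

        Stats(String term) {
            this.term = term;
        }
    }

    /** Orders terms by ascending document frequency. */
    static final Comparator<Stats> BY_DOC_FREQ =
            Comparator.comparingInt((Stats s) -> s.docFreq);

    /** Orders terms by ascending total term frequency. */
    static final Comparator<Stats> BY_TOTAL_TF =
            Comparator.comparingLong((Stats s) -> s.totalTermFreq);

    /** Counts docFreq and totalTermFreq for whitespace-separated tokens. */
    static Map<String, Stats> collect(List<String> docs) {
        Map<String, Stats> stats = new HashMap<>();
        for (String doc : docs) {
            Set<String> seenInDoc = new HashSet<>();
            for (String token : doc.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                Stats s = stats.computeIfAbsent(token, Stats::new);
                s.totalTermFreq++;
                // A term contributes to docFreq only once per document.
                if (seenInDoc.add(token)) {
                    s.docFreq++;
                }
            }
        }
        return stats;
    }

    /** Returns up to n terms, highest-ranked first under the given comparator. */
    static List<Stats> top(Map<String, Stats> stats, int n, Comparator<Stats> order) {
        List<Stats> sorted = new ArrayList<>(stats.values());
        // Descending by the chosen statistic; ties are broken alphabetically.
        sorted.sort(order.reversed().thenComparing((Stats s) -> s.term));
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        Map<String, Stats> stats = collect(Arrays.asList("a a a a b", "b c", "b c"));
        for (Stats s : top(stats, 2, BY_DOC_FREQ)) {
            System.out.println(s.term + " docFreq=" + s.docFreq);
        }
        for (Stats s : top(stats, 2, BY_TOTAL_TF)) {
            System.out.println(s.term + " totalTermFreq=" + s.totalTermFreq);
        }
    }
}
```

In this toy data the term `a` occurs four times but only in one document, so it ranks first by total term frequency yet does not appear in the top two by document frequency.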
Modifier and Type | Class and Description |
---|---|
static class | HighFreqTerms.DocFreqComparator: Compares terms by docTermFreq |
static class | HighFreqTerms.TotalTermFreqComparator: Compares terms by totalTermFreq |
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_NUMTERMS |
Constructor and Description |
---|
HighFreqTerms() |
Modifier and Type | Method and Description |
---|---|
static TermStats[] | getHighFreqTerms(IndexReader reader, int numTerms, String field, Comparator<TermStats> comparator): Returns TermStats[] ordered by the specified comparator |
static void | main(String[] args) |
public static final int DEFAULT_NUMTERMS
public static TermStats[] getHighFreqTerms(IndexReader reader, int numTerms, String field, Comparator<TermStats> comparator) throws Exception
Throws: Exception
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.