Class HyperLogLogPlusPlus
java.lang.Object
org.elasticsearch.search.aggregations.metrics.AbstractHyperLogLogPlusPlus
org.elasticsearch.search.aggregations.metrics.HyperLogLogPlusPlus
- All Implemented Interfaces:
Closeable,AutoCloseable,org.elasticsearch.core.Releasable
Hyperloglog++ counter, implemented based on pseudo code from
http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/40671.pdf and its appendix
https://docs.google.com/document/d/1gyjfMHy43U9OWBXxfaeG-3MjGzejW1dlpyMwEYAAWEI/view?fullscreen
This implementation is different from the original implementation in that it uses a hash table instead of a sorted list for linear
counting. Although this requires more space and makes hyperloglog (which is less accurate) used sooner, this is also considerably faster.
Trying to understand what this class does without having read the paper is considered adventurous.
The HyperLogLogPlusPlus contains two algorithms, one for linear counting and the HyperLogLog algorithm. Initially hashes added to the
data structure are processed using the linear counting until a threshold defined by the precision is reached where the data is replayed
to the HyperLogLog algorithm and then this is used.
It supports storing several HyperLogLogPlusPlus structures which are identified by a bucket number.
-
Field Summary
FieldsModifier and TypeFieldDescriptionstatic intstatic intstatic intprotected intFields inherited from class org.elasticsearch.search.aggregations.metrics.AbstractHyperLogLogPlusPlus
HYPERLOGLOG, LINEAR_COUNTING -
Constructor Summary
ConstructorsConstructorDescriptionHyperLogLogPlusPlus(int precision, BigArrays bigArrays, long initialBucketCount) -
Method Summary
Modifier and TypeMethodDescriptionprotected voidaddRunLen(long bucketOrd, int register, int runLen)longcardinality(long bucketOrd)Returns the current computed cardinalityvoidclose()voidcollect(long bucket, long hash)Collect a value in the given bucketprotected booleangetAlgorithm(long bucketOrd)Algorithm used in the given bucketprotected AbstractHyperLogLog.RunLenIteratorgetHyperLogLog(long bucketOrd)Get HyperLogLog algorithmprotected AbstractLinearCounting.HashesIteratorgetLinearCounting(long bucketOrd)Get linear counting algorithmlongmaxOrd()Get the number of data structuresstatic longmemoryUsage(int precision)Return the expected per-bucket memory usage for the given precision.voidmerge(long thisBucket, AbstractHyperLogLogPlusPlus other, long otherBucket)intPrecision of the algorithmstatic intprecisionFromThreshold(long count)Compute the required precision so thatcountdistinct entries would be counted with linear counting.
-
Field Details
-
DEFAULT_PRECISION
public static final int DEFAULT_PRECISION- See Also:
- Constant Field Values
-
MIN_PRECISION
public static final int MIN_PRECISION- See Also:
- Constant Field Values
-
MAX_PRECISION
public static final int MAX_PRECISION- See Also:
- Constant Field Values
-
p
protected final int p
-
-
Constructor Details
-
HyperLogLogPlusPlus
-
-
Method Details
-
precisionFromThreshold
public static int precisionFromThreshold(long count)Compute the required precision so thatcountdistinct entries would be counted with linear counting. -
memoryUsage
public static long memoryUsage(int precision)Return the expected per-bucket memory usage for the given precision. -
maxOrd
public long maxOrd()Description copied from class:AbstractHyperLogLogPlusPlusGet the number of data structures- Specified by:
maxOrdin classAbstractHyperLogLogPlusPlus
-
cardinality
public long cardinality(long bucketOrd)Returns the current computed cardinality -
getAlgorithm
protected boolean getAlgorithm(long bucketOrd)Description copied from class:AbstractHyperLogLogPlusPlusAlgorithm used in the given bucket- Specified by:
getAlgorithmin classAbstractHyperLogLogPlusPlus
-
getLinearCounting
Description copied from class:AbstractHyperLogLogPlusPlusGet linear counting algorithm- Specified by:
getLinearCountingin classAbstractHyperLogLogPlusPlus
-
getHyperLogLog
Description copied from class:AbstractHyperLogLogPlusPlusGet HyperLogLog algorithm- Specified by:
getHyperLogLogin classAbstractHyperLogLogPlusPlus
-
collect
public void collect(long bucket, long hash)Description copied from class:AbstractHyperLogLogPlusPlusCollect a value in the given bucket- Specified by:
collectin classAbstractHyperLogLogPlusPlus
-
close
public void close() -
addRunLen
protected void addRunLen(long bucketOrd, int register, int runLen) -
merge
-
precision
public int precision()Precision of the algorithm
-