cascading.pipe.assembly
Class Unique.FilterPartialDuplicates
java.lang.Object
cascading.operation.BaseOperation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
cascading.pipe.assembly.Unique.FilterPartialDuplicates
- All Implemented Interfaces:
- Filter<java.util.LinkedHashMap<Tuple,java.lang.Object>>, Operation<java.util.LinkedHashMap<Tuple,java.lang.Object>>, java.io.Serializable
- Enclosing class:
- Unique
public static class Unique.FilterPartialDuplicates
- extends BaseOperation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
- implements Filter<java.util.LinkedHashMap<Tuple,java.lang.Object>>
Class FilterPartialDuplicates is a Filter
that removes previously observed duplicates from the tuple stream.
It is typically used in tandem with a First
Aggregator
to improve de-duping performance by removing as many duplicate values
as possible before the intermediate GroupBy
operator.
The threshold
value bounds the size of an LRU cache of observed values. If more than threshold unique values
are seen, the oldest cached values are evicted from the cache, so a duplicate of an evicted value is not removed by this Filter.
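The bounded-cache behavior described above can be illustrated with a minimal, standalone sketch. It does not use Cascading and is not the library's actual implementation: the class `PartialDuplicateSketch` and its methods `newLruCache`, `isRemove`, and `filter` are hypothetical names introduced here, and plain strings stand in for Tuple values. The sketch assumes an access-ordered `java.util.LinkedHashMap` whose `removeEldestEntry` hook evicts entries once the cache grows past the threshold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Cascading.
class PartialDuplicateSketch {
    // Marker value stored for every cached key.
    private static final Object PRESENT = new Object();

    // Builds an access-ordered map that holds at most `threshold` entries.
    static <K> LinkedHashMap<K, Object> newLruCache(final int threshold) {
        return new LinkedHashMap<K, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Object> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                return size() > threshold;
            }
        };
    }

    // Returns true if the value is currently cached, i.e. it is an observed duplicate.
    static <K> boolean isRemove(Map<K, Object> cache, K value) {
        // put returns the previous mapping; non-null means the value was already cached.
        return cache.put(value, PRESENT) != null;
    }

    // Applies the filter to a whole list and returns the values that pass through.
    static <K> List<K> filter(List<K> input, int threshold) {
        Map<K, Object> cache = newLruCache(threshold);
        List<K> kept = new ArrayList<K>();
        for (K value : input) {
            if (!isRemove(cache, value)) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "a", "b", "c", "a", "b");
        System.out.println(filter(input, 10)); // prints [a, b, c]
        System.out.println(filter(input, 2));  // prints [a, b, c, a, b]
    }
}
```

With a threshold of 2, the later "a" and "b" pass through because their earlier occurrences were already evicted. This is why the filter only removes partial duplicates and a downstream grouping step is still needed for an exact result.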
- See Also:
- Unique, Serialized Form
Fields inherited from interface cascading.operation.Operation
ANY
Method Summary
void | cleanup(FlowProcess flowProcess, OperationCall<java.util.LinkedHashMap<Tuple,java.lang.Object>> operationCall) | Method cleanup does nothing, and may safely be overridden.
boolean | equals(java.lang.Object object)
int | hashCode()
boolean | isRemove(FlowProcess flowProcess, FilterCall<java.util.LinkedHashMap<Tuple,java.lang.Object>> filterCall) | Method isRemove returns true if input should be removed from the tuple stream.
void | prepare(FlowProcess flowProcess, OperationCall<java.util.LinkedHashMap<Tuple,java.lang.Object>> operationCall) | Method prepare does nothing, and may safely be overridden.
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Unique.FilterPartialDuplicates
public Unique.FilterPartialDuplicates()
- Constructor FilterPartialDuplicates creates a new FilterPartialDuplicates instance.
Unique.FilterPartialDuplicates
@ConstructorProperties(value="threshold")
public Unique.FilterPartialDuplicates(int threshold)
- Constructor FilterPartialDuplicates creates a new FilterPartialDuplicates instance.
- Parameters:
threshold
- of type int, the maximum number of unique values kept in the LRU cache
prepare
public void prepare(FlowProcess flowProcess,
OperationCall<java.util.LinkedHashMap<Tuple,java.lang.Object>> operationCall)
- Description copied from class:
BaseOperation
- Method prepare does nothing, and may safely be overridden.
- Specified by:
prepare
in interface Operation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
- Overrides:
prepare
in class BaseOperation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
isRemove
public boolean isRemove(FlowProcess flowProcess,
FilterCall<java.util.LinkedHashMap<Tuple,java.lang.Object>> filterCall)
- Description copied from interface:
Filter
- Method isRemove returns true if input should be removed from the tuple stream.
- Specified by:
isRemove
in interface Filter<java.util.LinkedHashMap<Tuple,java.lang.Object>>
- Parameters:
flowProcess
- of type FlowProcess
filterCall
- of type FilterCall
- Returns:
- boolean
cleanup
public void cleanup(FlowProcess flowProcess,
OperationCall<java.util.LinkedHashMap<Tuple,java.lang.Object>> operationCall)
- Description copied from class:
BaseOperation
- Method cleanup does nothing, and may safely be overridden.
- Specified by:
cleanup
in interface Operation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
- Overrides:
cleanup
in class BaseOperation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
equals
public boolean equals(java.lang.Object object)
- Overrides:
equals
in class BaseOperation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
hashCode
public int hashCode()
- Overrides:
hashCode
in class BaseOperation<java.util.LinkedHashMap<Tuple,java.lang.Object>>
Copyright © 2007-2011 Concurrent, Inc. All Rights Reserved.