public class WritableSequenceFile extends SequenceFile
A SequenceFile that reads and writes values of the given writableType Class, instead of the Tuple instances used by default in SequenceFile.

This class is a convenience for those who need to read/write specific types from existing sequence files without them being wrapped in a Tuple instance.

Note that due to the nature of sequence files, only one type can be stored in each of the key and value positions, though the two positions can hold different types (e.g. LongWritable and Text).

If keyType is null, valueType must not be null, and vice versa, assuming you only wish to store a single value. NullWritable is used as the empty type for either a null keyType or valueType.
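For illustration, a minimal construction sketch (field names and paths are hypothetical, and the Cascading and Hadoop jars are assumed to be on the classpath):

```java
import cascading.scheme.hadoop.WritableSequenceFile;
import cascading.tap.Tap;
import cascading.tap.hadoop.Hfs;
import cascading.tuple.Fields;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class WritableSequenceFileExample
  {
  public static void main( String[] args )
    {
    // key and value stored as two distinct Writable types
    WritableSequenceFile keyValueScheme =
      new WritableSequenceFile( new Fields( "offset", "line" ), LongWritable.class, Text.class );

    // value-only variant: keyType is left null, so NullWritable
    // fills the empty key position in the sequence file
    WritableSequenceFile valueOnlyScheme =
      new WritableSequenceFile( new Fields( "line" ), Text.class );

    // bind the scheme to an HDFS resource (hypothetical path)
    Tap tap = new Hfs( keyValueScheme, "path/to/sequencefile" );
    }
  }
```

The values read through either scheme arrive as plain Writable instances in the named tuple positions, rather than being nested inside a Tuple as with the default SequenceFile scheme.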
Modifier and Type | Field and Description |
---|---|
protected java.lang.Class<? extends org.apache.hadoop.io.Writable> | keyType |
protected java.lang.Class<? extends org.apache.hadoop.io.Writable> | valueType |
Constructor and Description |
---|
WritableSequenceFile(Fields fields, java.lang.Class<? extends org.apache.hadoop.io.Writable> valueType) Constructor WritableSequenceFile creates a new WritableSequenceFile instance. |
WritableSequenceFile(Fields fields, java.lang.Class<? extends org.apache.hadoop.io.Writable> keyType, java.lang.Class<? extends org.apache.hadoop.io.Writable> valueType) Constructor WritableSequenceFile creates a new WritableSequenceFile instance. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(java.lang.Object object) |
int | hashCode() |
void | sink(FlowProcess<? extends org.apache.hadoop.conf.Configuration> flowProcess, SinkCall<java.lang.Void,org.apache.hadoop.mapred.OutputCollector> sinkCall) Method sink writes out the given Tuple found on SinkCall.getOutgoingEntry() to the SinkCall.getOutput(). |
void | sinkConfInit(FlowProcess<? extends org.apache.hadoop.conf.Configuration> flowProcess, Tap<org.apache.hadoop.conf.Configuration,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector> tap, org.apache.hadoop.conf.Configuration conf) Method sinkConfInit initializes this instance as a sink. |
boolean | source(FlowProcess<? extends org.apache.hadoop.conf.Configuration> flowProcess, SourceCall<java.lang.Object[],org.apache.hadoop.mapred.RecordReader> sourceCall) Method source will read a new "record" or value from SourceCall.getInput() and populate the available Tuple via SourceCall.getIncomingEntry(), returning true on success or false if no more values are available. |
Methods inherited from class cascading.scheme.hadoop.SequenceFile: getExtension, sourceCleanup, sourceConfInit, sourcePrepare

Methods inherited from class cascading.scheme.Scheme: getNumSinkParts, getSinkFields, getSourceFields, getTrace, isSink, isSource, isSymmetrical, presentSinkFields, presentSinkFieldsInternal, presentSourceFields, presentSourceFieldsInternal, retrieveSinkFields, retrieveSourceFields, setNumSinkParts, setSinkFields, setSourceFields, sinkCleanup, sinkPrepare, sinkWrap, sourceRePrepare, sourceWrap, toString
protected final java.lang.Class<? extends org.apache.hadoop.io.Writable> keyType
protected final java.lang.Class<? extends org.apache.hadoop.io.Writable> valueType
@ConstructorProperties(value={"fields","valueType"})
public WritableSequenceFile(Fields fields, java.lang.Class<? extends org.apache.hadoop.io.Writable> valueType)
Constructor WritableSequenceFile creates a new WritableSequenceFile instance.
Parameters:
fields - of type Fields
valueType - of type Class, may not be null

@ConstructorProperties(value={"fields","keyType","valueType"})
public WritableSequenceFile(Fields fields, java.lang.Class<? extends org.apache.hadoop.io.Writable> keyType, java.lang.Class<? extends org.apache.hadoop.io.Writable> valueType)
Constructor WritableSequenceFile creates a new WritableSequenceFile instance.
Parameters:
fields - of type Fields
keyType - of type Class
valueType - of type Class

public void sinkConfInit(FlowProcess<? extends org.apache.hadoop.conf.Configuration> flowProcess, Tap<org.apache.hadoop.conf.Configuration,org.apache.hadoop.mapred.RecordReader,org.apache.hadoop.mapred.OutputCollector> tap, org.apache.hadoop.conf.Configuration conf)
Description copied from class: Scheme
This method is executed client side as a means to provide necessary configuration parameters used by the underlying platform.
It is not intended to initialize resources that would be necessary during the execution of this class, like a "formatter" or "parser".
See Scheme.sinkPrepare(cascading.flow.FlowProcess, SinkCall) if resources must be initialized before use, and Scheme.sinkCleanup(cascading.flow.FlowProcess, SinkCall) if resources must be destroyed after use.
Overrides:
sinkConfInit in class SequenceFile
Parameters:
flowProcess - of type FlowProcess
tap - of type Tap
conf - of type Configuration

public boolean source(FlowProcess<? extends org.apache.hadoop.conf.Configuration> flowProcess, SourceCall<java.lang.Object[],org.apache.hadoop.mapred.RecordReader> sourceCall) throws java.io.IOException
Description copied from class: Scheme
Method source will read a new "record" or value from SourceCall.getInput() and populate the available Tuple via SourceCall.getIncomingEntry(), returning true on success or false if no more values are available.

It's ok to set a new Tuple instance on the incomingEntry TupleEntry, or to simply re-use the existing instance.

Note this is the only time it is safe to modify a Tuple instance handed over via a method call.

This method may optionally throw a TapException if it cannot process a particular instance of data. If the payload Tuple is set on the TapException, that Tuple will be written to any applicable failure trap Tap.
Overrides:
source in class SequenceFile
Parameters:
flowProcess - of type FlowProcess
sourceCall - of type SourceCall
Returns:
true when a Tuple was successfully read
Throws:
java.io.IOException
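The contract above can be sketched as a key/value override like the following. This is illustrative only, not the actual implementation: the real class additionally handles a null keyType or valueType via NullWritable and may re-use key/value instances stored in the source context rather than creating new ones per record.

```java
@Override
public boolean source( FlowProcess<? extends Configuration> flowProcess,
                       SourceCall<Object[], RecordReader> sourceCall ) throws IOException
  {
  RecordReader input = sourceCall.getInput();

  // fresh key/value holders; a production implementation would re-use
  // instances kept in sourceCall.getContext()
  Object key = input.createKey();
  Object value = input.createValue();

  if( !input.next( key, value ) )
    return false; // no more values available

  // setting a new Tuple on the incoming TupleEntry is explicitly allowed here
  sourceCall.getIncomingEntry().setTuple( new Tuple( key, value ) );
  return true;
  }
```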
public void sink(FlowProcess<? extends org.apache.hadoop.conf.Configuration> flowProcess, SinkCall<java.lang.Void,org.apache.hadoop.mapred.OutputCollector> sinkCall) throws java.io.IOException
Description copied from class: Scheme
Method sink writes out the given Tuple found on SinkCall.getOutgoingEntry() to the SinkCall.getOutput().

This method may optionally throw a TapException if it cannot process a particular instance of data. If the payload Tuple is set on the TapException, that Tuple will be written to any applicable failure trap Tap. If not set, the incoming Tuple will be written instead.
Overrides:
sink in class SequenceFile
Parameters:
flowProcess - of type FlowProcess
sinkCall - of type SinkCall
Throws:
java.io.IOException
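The sink side of the contract can be sketched analogously. Again this is an illustrative sketch, not the actual implementation, which substitutes NullWritable.get() for whichever of the key or value positions has a null type:

```java
@Override
public void sink( FlowProcess<? extends Configuration> flowProcess,
                  SinkCall<Void, OutputCollector> sinkCall ) throws IOException
  {
  TupleEntry entry = sinkCall.getOutgoingEntry();

  // split the outgoing Tuple back into a (key, value) Writable pair,
  // assuming the scheme declared both a keyType and a valueType
  Writable key = (Writable) entry.getObject( 0 );
  Writable value = (Writable) entry.getObject( 1 );

  sinkCall.getOutput().collect( key, value );
  }
```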
public boolean equals(java.lang.Object object)
Copyright © 2007-2017 Cascading Maintainers. All Rights Reserved.