java.lang.Object
cascading.tap.Tap<Config,java.lang.Void,Output>
cascading.tap.SinkTap<JobConf,OutputCollector>
cascading.tap.hadoop.TemplateTap
public class TemplateTap
Class TemplateTap can be used to write tuple streams out to sub-directories based on the values in the Tuple instance.
The constructor takes an Hfs Tap and a Formatter format syntax String. This allows Tuple values at given positions to be used as directory names.
Note that Hadoop can only sink to directories, and all files in those directories are "part-xxxxx" files.
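As a rough illustration of how a Formatter format String can turn Tuple values into directory names, the sketch below uses only the JDK. The class `PathTemplateSketch` and its `resolve` method are hypothetical and not part of Cascading; it assumes the formatted string is simply appended to the parent path as a sub-directory.

```java
import java.util.Locale;

// Hypothetical helper, not part of Cascading: shows how a Formatter-style
// template can turn tuple values into a sub-directory path.
final class PathTemplateSketch {

    // Formats the tuple values with the template and appends the result
    // to the parent path, separated by a single '/'.
    static String resolve(String parentPath, String pathTemplate, Object... tupleValues) {
        String subDirectory = String.format(Locale.ROOT, pathTemplate, tupleValues);
        String base = parentPath.endsWith("/")
                ? parentPath.substring(0, parentPath.length() - 1)
                : parentPath;
        return base + "/" + subDirectory;
    }

    public static void main(String[] args) {
        // Two tuple values become two directory levels.
        System.out.println(resolve("output/path", "%s/%s", "us", 2013));
        // prints "output/path/us/2013"

        // Positional arguments reorder the values.
        System.out.println(resolve("output/path/", "%2$s-%1$s", "us", 2013));
        // prints "output/path/2013-us"
    }
}
```

With the class documented here, the equivalent template would be passed as the pathTemplate argument, for example `new TemplateTap(parent, "%s/%s")`.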
openTapsThreshold limits the number of files that may be open for output at the same time. This value defaults to 300 files.
Each time the threshold is exceeded, 10% of the least recently used open files will be closed.
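The eviction policy described above can be pictured with an access-ordered `LinkedHashMap`. This is a minimal sketch and not Cascading's implementation: the class `OpenOutputsSketch` is hypothetical, open files are stood in for by `StringBuilder` instances, and it assumes the 10% is computed from the threshold, with at least one entry closed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Cascading's implementation: tracks open outputs in
// access order and closes a tenth of them once the threshold is exceeded.
final class OpenOutputsSketch {
    private final int threshold;
    // accessOrder=true keeps the least recently used entry first.
    private final Map<String, StringBuilder> open = new LinkedHashMap<>(16, 0.75f, true);
    // Paths that were closed, in the order they were evicted.
    final List<String> closed = new ArrayList<>();

    OpenOutputsSketch(int threshold) {
        this.threshold = threshold;
    }

    // Returns the open output for a path, opening it (and evicting) if needed.
    StringBuilder outputFor(String path) {
        StringBuilder out = open.get(path); // a hit marks the entry as recently used
        if (out == null) {
            out = new StringBuilder();
            open.put(path, out);
            if (open.size() > threshold) {
                closeLeastRecentlyUsed();
            }
        }
        return out;
    }

    // Closes 10% of the threshold (assumption), but at least one entry.
    private void closeLeastRecentlyUsed() {
        int toClose = Math.max(1, threshold / 10);
        Iterator<String> it = open.keySet().iterator();
        for (int i = 0; i < toClose && it.hasNext(); i++) {
            closed.add(it.next()); // a real sink would flush and close the file here
            it.remove();
        }
    }

    int openCount() {
        return open.size();
    }

    public static void main(String[] args) {
        OpenOutputsSketch outputs = new OpenOutputsSketch(10);
        for (int i = 0; i < 10; i++) {
            outputs.outputFor("p" + i);
        }
        outputs.outputFor("p0");  // touch p0 so that p1 becomes least recently used
        outputs.outputFor("p10"); // eleventh path exceeds the threshold
        System.out.println(outputs.closed); // prints "[p1]"
    }
}
```

A lower threshold trades more frequent re-opening of outputs for fewer simultaneously open files.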
Nested Class Summary | |
---|---|
static class | TemplateTap.TemplateScheme |
Constructor Summary | |
---|---|
TemplateTap(Hfs parent, java.lang.String pathTemplate) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, int openTapsThreshold) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, SinkMode sinkMode) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, SinkMode sinkMode, boolean keepParentOnDelete) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, SinkMode sinkMode, boolean keepParentOnDelete, int openTapsThreshold) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, int openTapsThreshold) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, SinkMode sinkMode) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, SinkMode sinkMode, boolean keepParentOnDelete) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
TemplateTap(Hfs parent, java.lang.String pathTemplate, SinkMode sinkMode, boolean keepParentOnDelete, int openTapsThreshold) | Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String. |
Method Summary | | |
---|---|---|
boolean | commitResource(JobConf conf) | Method commitResource allows the underlying resource to be notified when all write processing is successful so that any additional cleanup or processing may be completed. |
boolean | createResource(JobConf conf) | Method createResource creates the underlying resource. |
boolean | deleteResource(JobConf conf) | Method deleteResource deletes the resource represented by this instance. |
boolean | equals(java.lang.Object object) | |
java.lang.String | getIdentifier() | Method getIdentifier returns a String representing the resource this Tap instance represents. |
long | getModifiedTime(JobConf conf) | Method getModifiedTime returns the date this resource was last modified. |
int | getOpenTapsThreshold() | Method getOpenTapsThreshold returns the openTapsThreshold of this TemplateTap object. |
Tap | getParent() | Method getParent returns the parent Tap of this TemplateTap object. |
java.lang.String | getPathTemplate() | Method getPathTemplate returns the pathTemplate Formatter format String of this TemplateTap object. |
int | hashCode() | |
TupleEntryCollector | openForWrite(FlowProcess<JobConf> flowProcess, OutputCollector output) | Method openForWrite opens the resource represented by this Tap instance. |
boolean | resourceExists(JobConf conf) | Method resourceExists returns true if the path represented by this instance exists. |
boolean | rollbackResource(JobConf conf) | Method rollbackResource allows the underlying resource to be notified when any write processing has failed or was stopped so that any cleanup may be started. |
java.lang.String | toString() | |
Methods inherited from class cascading.tap.SinkTap |
---|
getSourceFields, isSource, openForRead, sourceConfInit |
Methods inherited from class cascading.tap.Tap |
---|
flowConfInit, getConfigDef, getFullIdentifier, getScheme, getSinkFields, getSinkMode, getStepConfigDef, getTrace, hasConfigDef, hasProcessConfigDef, isEquivalentTo, isKeep, isReplace, isSink, isTemporary, isUpdate, openForRead, openForWrite, outgoingScopeFor, presentSinkFields, presentSourceFields, resolveFields, resolveIncomingOperationFields, retrieveSinkFields, retrieveSourceFields, setScheme, sinkConfInit, taps |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
@ConstructorProperties(value={"parent","pathTemplate"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
Parameters:
parent - of type Hfs
pathTemplate - of type String
@ConstructorProperties(value={"parent","pathTemplate","openTapsThreshold"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
openTapsThreshold limits the number of open files to be output to.
Parameters:
parent - of type Hfs
pathTemplate - of type String
openTapsThreshold - of type int
@ConstructorProperties(value={"parent","pathTemplate","sinkMode"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, SinkMode sinkMode)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
Parameters:
parent - of type Hfs
pathTemplate - of type String
sinkMode - of type SinkMode
@ConstructorProperties(value={"parent","pathTemplate","sinkMode","keepParentOnDelete"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, SinkMode sinkMode, boolean keepParentOnDelete)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deleteResource(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.
Parameters:
parent - of type Hfs
pathTemplate - of type String
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
@ConstructorProperties(value={"parent","pathTemplate","sinkMode","keepParentOnDelete","openTapsThreshold"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, SinkMode sinkMode, boolean keepParentOnDelete, int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deleteResource(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.
openTapsThreshold limits the number of open files to be output to.
Parameters:
parent - of type Hfs
pathTemplate - of type String
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
openTapsThreshold - of type int
@ConstructorProperties(value={"parent","pathTemplate","pathFields"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.
This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data not in the result file can be used in the template path name.
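The separation between path fields and sink fields can be sketched with plain JDK types. Everything here is hypothetical and for illustration only: a tuple is modeled as a map from field name to value, and `PathFieldsSketch`, `pathFor`, and `recordFor` are not Cascading APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Cascading code: a tuple is modeled as a
// field-name -> value map.
final class PathFieldsSketch {

    // Picks the pathFields values, in pathFields order, and formats them
    // into the template.
    static String pathFor(Map<String, Object> tuple, List<String> pathFields, String pathTemplate) {
        Object[] values = new Object[pathFields.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = tuple.get(pathFields.get(i));
        }
        return String.format(Locale.ROOT, pathTemplate, values);
    }

    // Picks only the sink fields; these are the values that would be written
    // to the result file.
    static List<Object> recordFor(Map<String, Object> tuple, List<String> sinkFields) {
        List<Object> record = new ArrayList<>();
        for (String field : sinkFields) {
            record.add(tuple.get(field));
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("year", 2013);
        tuple.put("country", "us");
        tuple.put("name", "alice");
        tuple.put("amount", 42);

        // The path uses country and year, in that order.
        System.out.println(pathFor(tuple, Arrays.asList("country", "year"), "%s/%s"));
        // prints "us/2013"

        // The written record holds only name and amount.
        System.out.println(recordFor(tuple, Arrays.asList("name", "amount")));
        // prints "[alice, 42]"
    }
}
```

In this sketch the country and year values appear only in the directory name, which is the kind of independence the description above refers to.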
Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
@ConstructorProperties(value={"parent","pathTemplate","pathFields","openTapsThreshold"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.
This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data not in the result file can be used in the template path name.
openTapsThreshold limits the number of open files to be output to.
Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
openTapsThreshold - of type int
@ConstructorProperties(value={"parent","pathTemplate","pathFields","sinkMode"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, SinkMode sinkMode)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.
This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data not in the result file can be used in the template path name.
Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
sinkMode - of type SinkMode
@ConstructorProperties(value={"parent","pathTemplate","pathFields","sinkMode","keepParentOnDelete"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, SinkMode sinkMode, boolean keepParentOnDelete)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.
This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data not in the result file can be used in the template path name.
keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deleteResource(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.
Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
@ConstructorProperties(value={"parent","pathTemplate","pathFields","sinkMode","keepParentOnDelete","openTapsThreshold"}) public TemplateTap(Hfs parent, java.lang.String pathTemplate, Fields pathFields, SinkMode sinkMode, boolean keepParentOnDelete, int openTapsThreshold)
Constructor TemplateTap creates a new TemplateTap instance using the given parent Hfs Tap as the base path and default Scheme, and the pathTemplate as the Formatter format String.
The pathFields is a selector that selects and orders the fields to be used in the given pathTemplate.
This constructor also allows the sinkFields of the parent Tap to be independent of the pathFields, so data not in the result file can be used in the template path name.
keepParentOnDelete, when set to true, prevents the parent Tap from being deleted when deleteResource(org.apache.hadoop.mapred.JobConf) is called, typically an issue when used inside a Cascade.
openTapsThreshold limits the number of open files to be output to.
Parameters:
parent - of type Hfs
pathTemplate - of type String
pathFields - of type Fields
sinkMode - of type SinkMode
keepParentOnDelete - of type boolean
openTapsThreshold - of type int
Method Detail |
---|
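public Tap getParent()
Method getParent returns the parent Tap of this TemplateTap object.
public java.lang.String getPathTemplate()
Method getPathTemplate returns the pathTemplate Formatter format String of this TemplateTap object.
public java.lang.String getIdentifier()
Description copied from class: Tap
Method getIdentifier returns a String representing the resource this Tap instance represents.
getIdentifier in class Tap<JobConf,java.lang.Void,OutputCollector>
See Also: Tap.getIdentifier()
public int getOpenTapsThreshold()
Method getOpenTapsThreshold returns the openTapsThreshold of this TemplateTap object.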
public TupleEntryCollector openForWrite(FlowProcess<JobConf> flowProcess, OutputCollector output) throws java.io.IOException
Description copied from class: Tap
Method openForWrite opens the resource represented by this Tap instance.
The output value may be null; if so, sub-classes must inquire with the underlying Scheme via Scheme.sinkConfInit(cascading.flow.FlowProcess, Tap, Object) to get the proper output type and instantiate it before calling super.openForWrite().
openForWrite in class Tap<JobConf,java.lang.Void,OutputCollector>
Throws:
java.io.IOException
public boolean createResource(JobConf conf) throws java.io.IOException
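Description copied from class: Tap
Method createResource creates the underlying resource.
createResource in class Tap<JobConf,java.lang.Void,OutputCollector>
Parameters:
conf - of type JobConf
Throws:
java.io.IOException - when there is an error making directories
See Also: Tap.createResource(Object)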
public boolean deleteResource(JobConf conf) throws java.io.IOException
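Description copied from class: Tap
Method deleteResource deletes the resource represented by this instance.
deleteResource in class Tap<JobConf,java.lang.Void,OutputCollector>
Parameters:
conf - of type JobConf
Throws:
java.io.IOException - when the resource cannot be deleted
See Also: Tap.deleteResource(Object)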
public boolean commitResource(JobConf conf) throws java.io.IOException
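Description copied from class: Tap
Method commitResource allows the underlying resource to be notified when all write processing is successful so that any additional cleanup or processing may be completed.
See Tap.rollbackResource(Object) to handle cleanup in the face of failures.
This method is invoked once "client side" and not in the cluster, if any.
commitResource in class Tap<JobConf,java.lang.Void,OutputCollector>
Throws:
java.io.IOException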
public boolean rollbackResource(JobConf conf) throws java.io.IOException
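Description copied from class: Tap
Method rollbackResource allows the underlying resource to be notified when any write processing has failed or was stopped so that any cleanup may be started.
See Tap.commitResource(Object) to handle cleanup when the write has successfully completed.
This method is invoked once "client side" and not in the cluster, if any.
rollbackResource in class Tap<JobConf,java.lang.Void,OutputCollector>
Throws:
java.io.IOException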
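The relationship between commitResource and rollbackResource can be pictured as a client-side wrapper around a write. The sketch below is an assumption about call order only and does not show how Cascading schedules these calls: `CommitRollbackSketch`, its `Resource` interface, and `RecordingResource` are hypothetical, and only the two method names mirror the ones documented here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Cascading code: shows the order in which a client
// might call commit or rollback around a write.
final class CommitRollbackSketch {

    // Stand-in for a sink resource; the method names mirror the documented ones.
    interface Resource {
        void write() throws Exception;
        boolean commitResource();
        boolean rollbackResource();
    }

    // Runs the write, then commits on success or rolls back on failure.
    static boolean writeThenFinish(Resource resource) {
        try {
            resource.write();
        } catch (Exception e) {
            resource.rollbackResource(); // the write failed: start cleanup
            return false;
        }
        return resource.commitResource(); // the write succeeded: finish processing
    }

    // Records which calls were made so the order can be inspected.
    static final class RecordingResource implements Resource {
        final List<String> events = new ArrayList<>();
        private final boolean failOnWrite;

        RecordingResource(boolean failOnWrite) {
            this.failOnWrite = failOnWrite;
        }

        public void write() throws Exception {
            events.add("write");
            if (failOnWrite) {
                throw new IOException("simulated write failure");
            }
        }

        public boolean commitResource() {
            events.add("commit");
            return true;
        }

        public boolean rollbackResource() {
            events.add("rollback");
            return true;
        }
    }

    public static void main(String[] args) {
        RecordingResource ok = new RecordingResource(false);
        writeThenFinish(ok);
        System.out.println(ok.events); // prints "[write, commit]"

        RecordingResource failing = new RecordingResource(true);
        writeThenFinish(failing);
        System.out.println(failing.events); // prints "[write, rollback]"
    }
}
```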
public boolean resourceExists(JobConf conf) throws java.io.IOException
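Description copied from class: Tap
Method resourceExists returns true if the path represented by this instance exists.
resourceExists in class Tap<JobConf,java.lang.Void,OutputCollector>
Parameters:
conf - of type JobConf
Throws:
java.io.IOException - when the status cannot be determined
See Also: Tap.resourceExists(Object)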
public long getModifiedTime(JobConf conf) throws java.io.IOException
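Description copied from class: Tap
Method getModifiedTime returns the date this resource was last modified.
getModifiedTime in class Tap<JobConf,java.lang.Void,OutputCollector>
Parameters:
conf - of type JobConf
Throws:
java.io.IOException
See Also: Tap.getModifiedTime(Object)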
public boolean equals(java.lang.Object object)
equals in class Tap<JobConf,java.lang.Void,OutputCollector>
public int hashCode()
hashCode in class Tap<JobConf,java.lang.Void,OutputCollector>
public java.lang.String toString()
toString in class java.lang.Object