org.apache.hadoop.mapreduce
Class OutputCommitter
java.lang.Object
extended by org.apache.hadoop.mapreduce.OutputCommitter
Direct Known Subclasses:
FileOutputCommitter, OutputCommitter (org.apache.hadoop.mapred)
public abstract class OutputCommitter
extends Object
OutputCommitter describes the commit of task output for a Map-Reduce job.
The Map-Reduce framework relies on the OutputCommitter of the job to:
- Set up the job during initialization. For example, create the temporary output directory for the job.
- Clean up the job after it completes. For example, remove the temporary output directory.
- Set up the task's temporary output.
- Check whether a task needs a commit, so that the commit procedure is skipped for tasks that do not need it.
- Commit the task output.
- Discard the task commit.
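The responsibilities listed above can be sketched on a local filesystem. This is a simplified model, not the Hadoop class: the method names mirror those of OutputCommitter (setupJob, cleanupJob, setupTask, needsTaskCommit, commitTask, abortTask), but the String attempt ids, the `_temporary` directory layout, and the `LocalCommitterSketch` and `demo` names are assumptions made for illustration. The real methods take JobContext and TaskAttemptContext arguments.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Simplified local-filesystem model of the commit protocol; NOT a Hadoop class.
class LocalCommitterSketch {
    private final Path outputDir;
    private final Path tempDir; // assumed layout: <output>/_temporary/<attemptId>

    LocalCommitterSketch(Path outputDir) {
        this.outputDir = outputDir;
        this.tempDir = outputDir.resolve("_temporary");
    }

    Path taskDir(String attemptId) { return tempDir.resolve(attemptId); }

    // Job setup: create the temporary output directory.
    void setupJob() throws IOException { Files.createDirectories(tempDir); }

    // Job cleanup: remove the temporary output directory and anything left in it.
    void cleanupJob() throws IOException { deleteTree(tempDir); }

    // Task setup: give the task attempt its own scratch directory.
    void setupTask(String attemptId) throws IOException {
        Files.createDirectories(taskDir(attemptId));
    }

    // A task needs a commit only if it actually produced output.
    boolean needsTaskCommit(String attemptId) throws IOException {
        Path dir = taskDir(attemptId);
        return Files.isDirectory(dir) && !children(dir).isEmpty();
    }

    // Commit: promote the attempt's files into the final output directory.
    void commitTask(String attemptId) throws IOException {
        Path dir = taskDir(attemptId);
        for (Path file : children(dir)) {
            Files.move(file, outputDir.resolve(file.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        }
        Files.delete(dir);
    }

    // Abort: discard everything the attempt wrote.
    void abortTask(String attemptId) throws IOException { deleteTree(taskDir(attemptId)); }

    private static List<Path> children(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.sorted().collect(Collectors.toList());
        }
    }

    private static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) return;
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order so that children are deleted before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) Files.delete(p);
    }

    // Runs one committed, one empty, and one aborted attempt; returns the final file names.
    static List<String> demo() {
        try {
            Path out = Files.createTempDirectory("job-output");
            LocalCommitterSketch c = new LocalCommitterSketch(out);
            c.setupJob();

            c.setupTask("attempt_0"); // writes output, so it is committed
            Files.write(c.taskDir("attempt_0").resolve("part-0"),
                        "k\tv\n".getBytes(StandardCharsets.UTF_8));
            if (c.needsTaskCommit("attempt_0")) c.commitTask("attempt_0");

            c.setupTask("attempt_1"); // writes nothing, so the commit is skipped
            if (c.needsTaskCommit("attempt_1")) c.commitTask("attempt_1");

            c.setupTask("attempt_2"); // writes output but is aborted
            Files.write(c.taskDir("attempt_2").resolve("part-2"),
                        "x\ty\n".getBytes(StandardCharsets.UTF_8));
            c.abortTask("attempt_2");

            c.cleanupJob();
            return children(out).stream()
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `LocalCommitterSketch.demo()` returns a list containing only `part-0`: the empty attempt is never committed, the aborted attempt's file is discarded, and the temporary directory is gone after cleanup. Writing into a per-attempt directory and moving files only on commit is what keeps partial or failed output out of the final directory.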
org.apache.hadoop.mapreduce
Class OutputFormat<K,V>
java.lang.Object
extended by org.apache.hadoop.mapreduce.OutputFormat<K,V>
Direct Known Subclasses:
DBOutputFormat, FileOutputFormat, FilterOutputFormat, NullOutputFormat
public abstract class OutputFormat<K,V>
extends Object
OutputFormat describes the output-specification for a Map-Reduce job.
The Map-Reduce framework relies on the OutputFormat of the job to:
- Validate the output-specification of the job. For example, check that the output directory doesn't already exist.
- Provide the RecordWriter implementation to be used to write out the output files of the job. Output files are stored in a FileSystem.
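The two duties above can be illustrated with a small local-filesystem model. It is a sketch under stated assumptions, not the Hadoop API: `checkOutputSpecs` and `getRecordWriter` mirror method names on OutputFormat, but here they take plain `Path` arguments instead of JobContext and TaskAttemptContext, and `RecordWriterSketch`, `TextOutputFormatSketch`, `demo`, and the tab-separated line format are hypothetical choices for the example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical stand-in for a record writer; NOT the Hadoop RecordWriter class.
interface RecordWriterSketch<K, V> {
    void write(K key, V value);
    void close();
}

// Simplified model of an output format that writes "key<TAB>value" lines.
class TextOutputFormatSketch<K, V> {

    // Validate the output specification: refuse to reuse an existing directory.
    void checkOutputSpecs(Path outputDir) {
        if (Files.exists(outputDir)) {
            throw new IllegalStateException("Output directory already exists: " + outputDir);
        }
    }

    // Provide the writer used to emit the records of one output file.
    RecordWriterSketch<K, V> getRecordWriter(Path file) {
        final BufferedWriter out;
        try {
            out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new RecordWriterSketch<K, V>() {
            @Override
            public void write(K key, V value) {
                try {
                    out.write(key + "\t" + value);
                    out.newLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public void close() {
                try {
                    out.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
    }

    // Validates a fresh output directory, writes two records, and reads them back.
    static List<String> demo() {
        try {
            Path outputDir = Files.createTempDirectory("job").resolve("out");
            TextOutputFormatSketch<String, Integer> format = new TextOutputFormatSketch<>();
            format.checkOutputSpecs(outputDir); // passes: "out" does not exist yet
            Files.createDirectories(outputDir);

            Path part = outputDir.resolve("part-0");
            RecordWriterSketch<String, Integer> writer = format.getRecordWriter(part);
            writer.write("a", 1);
            writer.write("b", 2);
            writer.close();
            return Files.readAllLines(part, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this model, running the validation before any writer is created means a job pointed at an existing directory fails up front rather than after work has been done.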
https://hadoop.apache.org/docs/r1.2.1/api/org/apache/hadoop/mapreduce/OutputFormat.html