Spring Batch requires unique job parameters for each execution, so you can add the current time as a job parameter:
Map<String, JobParameter> confMap = new HashMap<String, JobParameter>();
confMap.put("time", new JobParameter(System.currentTimeMillis()));
JobParameters jobParameters = new JobParameters(confMap);
jobLauncher.run(springCoreJob, jobParameters);
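Why a timestamp parameter helps can be shown with a toy model. This is not Spring Batch code: the class JobInstanceModel, its run method and the "input" parameter are hypothetical names used only to illustrate that a job instance is identified by its job name plus its parameters, so a second run with identical parameters is rejected while a run with an extra "time" value is accepted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only (hypothetical class, NOT part of Spring Batch).
// A job instance is identified here by job name + parameters.
final class JobInstanceModel {
    // Identity keys of job instances that have already completed.
    private final Set<String> completed = new HashSet<>();

    // "Runs" the job once; rejects a repeat of the same name + parameters.
    void run(String jobName, Map<String, Object> params) {
        // TreeMap gives a stable key order, so equal maps produce equal keys.
        String key = jobName + new TreeMap<>(params);
        if (!completed.add(key)) {
            throw new IllegalStateException(
                "A job instance already exists and is complete for parameters " + params);
        }
    }

    public static void main(String[] args) {
        JobInstanceModel launcher = new JobInstanceModel();

        Map<String, Object> fixed = new HashMap<>();
        fixed.put("input", "data.csv"); // hypothetical parameter
        launcher.run("springCoreJob", fixed); // first run is accepted

        try {
            launcher.run("springCoreJob", fixed); // same parameters again
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        // Adding the current time makes the parameter set unique.
        Map<String, Object> unique = new HashMap<>(fixed);
        unique.put("time", System.currentTimeMillis());
        launcher.run("springCoreJob", unique);
    }
}
```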
Monday, 30 December 2013
org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException: A job instance already exists and is complete for parameters
Labels: hadoop, multiple jobs, spring batch
Friday, 20 December 2013
Sort mapreduce output keys in descending order
Add the following nested class to your job class.
NB. This only works if your key type is Text. For any other key type, modify the reverse comparator class accordingly.
public static class ReverseComparator extends WritableComparator {
    
    private static final Text.Comparator TEXT_COMPARATOR = new Text.Comparator();
    public ReverseComparator() {
        super(Text.class);
    }
    @Override
    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
       return (-1)* TEXT_COMPARATOR.compare(b1, s1, l1, b2, s2, l2);
    }
    @SuppressWarnings("rawtypes")
    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        if (a instanceof Text && b instanceof Text) {
                return (-1)*(((Text) a).compareTo((Text) b));
        }
        return super.compare(a, b);
    }
}
In the new API (org.apache.hadoop.mapreduce), set the comparator on your Job instance (here called job):
job.setSortComparatorClass(ReverseComparator.class);
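The effect of negating the comparison result can be tried without a cluster. The sketch below is plain Java, not Hadoop code: ReverseByteOrderDemo and its methods are hypothetical names, and it assumes keys are compared as unsigned UTF-8 bytes in lexicographic order, which is how Text keys are understood to be ordered.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration (hypothetical class, no Hadoop dependency).
final class ReverseByteOrderDemo {
    // Lexicographic comparison of unsigned bytes (ascending order).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Negating the ascending result yields descending order.
    static int compareReversed(byte[] a, byte[] b) {
        return -compareBytes(a, b);
    }

    // Sorts a copy of the keys in descending byte order.
    static List<String> sortDescending(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        sorted.sort((s, t) -> compareReversed(
                s.getBytes(StandardCharsets.UTF_8),
                t.getBytes(StandardCharsets.UTF_8)));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortDescending(Arrays.asList("apple", "cherry", "banana")));
    }
}
```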
Set separator for mapreduce output
By default the separator between key and value in the text output is a tab character. To use a different separator, set this property, where conf is an org.apache.hadoop.conf.Configuration object:
conf.set("mapred.textoutputformat.separator", ",");
The MapReduce output (i.e. the keys and values) will be comma separated in this case. In Hadoop 2.x the same setting is also available under the newer name mapreduce.output.textoutputformat.separator.
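What the property changes in each output line can be sketched in plain Java. This is only a model of the behaviour described above, not the Hadoop implementation: SeparatorDemo and formatRecord are hypothetical names, and a Map stands in for the Configuration object.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (hypothetical class): one output line is key + separator + value.
final class SeparatorDemo {
    static final String SEPARATOR_KEY = "mapred.textoutputformat.separator";
    static final String DEFAULT_SEPARATOR = "\t";

    // Builds one output line, falling back to a tab when the property is unset.
    static String formatRecord(Map<String, String> conf, String key, String value) {
        String separator = conf.getOrDefault(SEPARATOR_KEY, DEFAULT_SEPARATOR);
        return key + separator + value + "\n";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.print(formatRecord(conf, "hello", "3")); // tab separated

        conf.put(SEPARATOR_KEY, ",");
        System.out.print(formatRecord(conf, "hello", "3")); // comma separated
    }
}
```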
Tuesday, 10 December 2013
Region servers going down in CDH4 due to a MapReduce job
I faced this problem because I had set the scan caching to 500, i.e. every scanner call fetches 500 rows at once for the MapReduce job, which is memory intensive and not recommended.
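A rough back-of-the-envelope check shows why a large caching value hurts. The sketch below is plain arithmetic, not HBase code: ScanCachingEstimate is a hypothetical helper, and the average row sizes are made-up figures for illustration only.

```java
// Rough estimate only (hypothetical helper): memory held per scanner batch.
final class ScanCachingEstimate {
    // Rows fetched per call multiplied by the average row size in bytes.
    static long bytesPerBatch(int caching, long avgRowBytes) {
        return Math.multiplyExact((long) caching, avgRowBytes);
    }

    public static void main(String[] args) {
        // Assumed row sizes; real rows depend entirely on your table.
        long smallRow = 1_000L;     // about 1 KB per row
        long largeRow = 1_000_000L; // about 1 MB per row

        System.out.println(bytesPerBatch(500, smallRow) + " bytes per batch");
        System.out.println(bytesPerBatch(500, largeRow) + " bytes per batch");
    }
}
```

With wide rows, every concurrent map task multiplies this figure again, so a smaller caching value keeps each batch manageable.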
data driven db input format
With DataDrivenDBInputFormat, include the id in the VO as well.
With DBInputFormat, do not use the id in the VO.
