Apache Shuffle on YARN
You can disable Direct Shuffle and enable Apache Shuffle by modifying the configuration
    options in the yarn-site.xml and mapred-site.xml files. This
    page describes how to configure Apache Shuffle for MapReduce applications.
The shuffling phase in Hadoop is the process of transferring mappers intermediate output to the reducers. Direct shuffle increases the load on file system disks. You can enable the Apache Shuffle to reduce the load on file system disks.
Configuration for Apache Shuffle
Add the following property to 
      yarn-site.xml
        file:<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property> Add the following properties to mapred-site.xml file:
 <property>
    <name>mapreduce.job.shuffle.provider.services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>mapreduce.job.reduce.shuffle.consumer.plugin.class</name>
    <value>org.apache.hadoop.mapreduce.task.reduce.Shuffle</value>
</property>
<property>
    <name>mapreduce.job.map.output.collector.class</name>
    <value>org.apache.hadoop.mapred.MapTask$MapOutputBuffer</value>
</property>
<property>
    <name>mapred.ifile.outputstream</name>
    <value>org.apache.hadoop.mapred.IFileOutputStream</value>
</property>
<property>
    <name>mapred.ifile.inputstream</name>
    <value>org.apache.hadoop.mapred.IFileInputStream</value>
</property>
<property>
    <name>mapred.local.mapoutput</name>
    <value>true</value>
</property>
<property>
    <name>mapreduce.task.local.output.class</name>
    <value>org.apache.hadoop.mapred.YarnOutputFiles</value>
</property>