Spark parameters
Dynamic Executor Allocation
spark.dynamicAllocation.enabled=True
spark.dynamicAllocation.executorIdleTimeout=2m
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=2000
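A minimal sketch of how the dynamic allocation settings above might be passed on the command line. The helper name to_submit_args and the application file app.py are hypothetical. The spark.shuffle.service.enabled entry is an assumption that is not in the list above: dynamic allocation on YARN commonly relies on the external shuffle service, so check whether your deployment needs it.

```python
# Values copied from the list above, except where marked as an assumption.
DYNAMIC_ALLOCATION = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.executorIdleTimeout": "2m",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "2000",
    # Assumption (not in the original list): external shuffle service enabled.
    "spark.shuffle.service.enabled": "true",
}


def to_submit_args(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# app.py is a placeholder for the real application.
command = ["spark-submit"] + to_submit_args(DYNAMIC_ALLOCATION) + ["app.py"]
print(" ".join(command))
```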
Better fetch failure handling
spark.max.fetch.failures.per.stage = 10
Scaling Spark Driver
spark.rpc.io.serverThreads = 64
Tuning memory configurations
1. Enable off-heap memory
spark.memory.offHeap.enabled = True
spark.memory.offHeap.size = 3g
spark.executor.memory = 3g
spark.yarn.executor.memoryOverhead = 0.1 * (spark.executor.memory + spark.memory.offHeap.size)
2. Garbage collection tuning
spark.executor.extraJavaOptions = -XX:ParallelGCThreads=4 -XX:+UseParallelGC
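The memory overhead formula in item 1 can be worked through with the values given there (3g of executor heap plus 3g of off-heap memory). This is only a sketch of the arithmetic: the function names are hypothetical, only the "g" suffix is handled, and rounding up to a whole MiB is an assumption.

```python
import math


def gib_to_mib(size):
    """Convert a size such as '3g' to MiB. Only the 'g' suffix is handled here."""
    text = size.strip().lower()
    if not text.endswith("g"):
        raise ValueError(f"unsupported size string: {size!r}")
    return int(text[:-1]) * 1024


def memory_overhead_mib(executor_memory, off_heap_size, fraction=0.1):
    """Apply overhead = fraction * (executor memory + off-heap size), in MiB."""
    total_mib = gib_to_mib(executor_memory) + gib_to_mib(off_heap_size)
    # Rounding up is an assumption of this sketch.
    return math.ceil(fraction * total_mib)


overhead = memory_overhead_mib("3g", "3g")
# 0.1 * (3072 + 3072) = 614.4 MiB, rounded up to 615 MiB
print(f"spark.yarn.executor.memoryOverhead = {overhead} (MiB)")
```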
Eliminate Disk I/O bottleneck
1. spark.shuffle.file.buffer=1Mb
spark.unsafe.sorter.spill.reader.buffer.size=1Mb
2. spark.file.transferTo=false
spark.shuffle.unsafe.file.output.buffer=5Mb
3. spark.io.compression.lz4.blockSize=512KB
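To see roughly why larger buffers reduce disk I/O, the sketch below counts buffer flushes for a given amount of shuffle output. It assumes, as a simplification, that every full buffer results in one write to disk, and it compares the documented default for spark.shuffle.file.buffer (32k) with the 1Mb value above. The helper names and the 1 GiB output size are illustrative only.

```python
UNITS = {"k": 1024, "kb": 1024, "m": 1024 ** 2, "mb": 1024 ** 2}


def to_bytes(size):
    """Parse a size string such as '32k', '1Mb' or '512KB' into bytes."""
    text = size.strip().lower()
    # Try two-letter suffixes first so 'mb' is not mistaken for 'b'-less 'm'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * UNITS[suffix]
    return int(text)  # no suffix: treat the value as plain bytes


def flush_count(output_bytes, buffer_size):
    """Number of buffer flushes needed to write output_bytes (ceiling division)."""
    buffer_bytes = to_bytes(buffer_size)
    return -(-output_bytes // buffer_bytes)


one_gib = 1024 ** 3  # illustrative shuffle output size
print(flush_count(one_gib, "32k"))  # 32768 flushes with the default buffer
print(flush_count(one_gib, "1Mb"))  # 1024 flushes with the tuned buffer
```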
Cache index files on Shuffle Server
spark.shuffle.service.index.cache.entries=2048
Scaling External Shuffle Service
Tune shuffle service worker threads and backlog
spark.shuffle.io.serverThreads=128
spark.shuffle.io.backLog=8192
Configurable shuffle registration timeout and retry
spark.shuffle.registration.timeout = 2m
spark.shuffle.registration.maxAttempts = 5
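The interaction between a registration timeout and a maximum number of attempts can be sketched as a simple retry loop. This is not Spark's implementation; register_with_retry and the register callback are hypothetical, and the sketch only illustrates the general idea that each attempt gets its own timeout and the caller gives up after the last attempt.

```python
import time


def register_with_retry(register, timeout_s, max_attempts, sleep_s=0.0):
    """Call register(timeout_s) until it succeeds or max_attempts is reached.

    register is a hypothetical callback that raises TimeoutError when the
    shuffle service does not answer within timeout_s seconds.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return register(timeout_s)
        except TimeoutError as err:
            last_error = err
            if attempt < max_attempts:
                time.sleep(sleep_s)  # optional pause before the next attempt
    raise RuntimeError(
        f"registration failed after {max_attempts} attempts"
    ) from last_error


# Usage with a stand-in callback that times out twice, then succeeds.
calls = []


def flaky_register(timeout_s):
    calls.append(timeout_s)
    if len(calls) < 3:
        raise TimeoutError("shuffle service busy")
    return "registered"


# 120 seconds and 5 attempts mirror the 2m / 5 values listed above.
print(register_with_retry(flaky_register, timeout_s=120, max_attempts=5))
```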