Spark performance reduced with increased number of worker nodes




I have a Spark application that loads data from CSV files, calls the Drools engine, applies a flatMap, and saves the results to output files (.csv).



Below are the two test cases:



1) When I run this application with 2 worker nodes of the same configuration (8 cores each), it takes 5.2 minutes to complete:



Number of executors: 133
Total number of cores: 16
Used memory: 2 GB (1 GB per executor)
Available memory: 30 GB



2) When I run this application with 3 worker nodes of the same configuration (8 cores each), it takes 7.6 minutes to complete.



Expected result



The application should take less time after adding one more worker node with the same configuration.



Actual result



The application takes more time after adding one more worker node with the same configuration.
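To put numbers on the gap between expectation and result, here is a rough back-of-the-envelope sketch. It assumes the 3-worker run had 24 cores in total (3 x 8, which the question implies but does not state) and that runtime would ideally be inversely proportional to core count, which is an idealization that real Spark jobs rarely reach.

```python
# Timings reported above (minutes); core counts assume 8 cores per worker.
cores_before, minutes_before = 2 * 8, 5.2
cores_after, minutes_after = 3 * 8, 7.6  # 24 cores is an assumption

# Idealized assumption: runtime is inversely proportional to total cores.
ideal_minutes_after = minutes_before * cores_before / cores_after

# Observed change relative to the 2-worker run and to the idealized estimate.
observed_ratio = minutes_after / minutes_before
gap_vs_ideal = minutes_after / ideal_minutes_after

print(f"ideal 3-worker runtime: {ideal_minutes_after:.2f} min")
print(f"observed vs 2 workers: {observed_ratio:.2f}x")
print(f"observed vs ideal: {gap_vs_ideal:.2f}x")
```

Under these assumptions the 3-worker run is about 1.46x slower than the 2-worker run, and a bit more than twice the idealized estimate of roughly 3.47 minutes.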



I am running the application using the spark-submit command in standalone mode.
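The exact command is not shown here, so the following is only a sketch of what a standalone-mode submission of this shape might look like, assembled as an argument list in Python. The master URL, main class, and jar name are hypothetical placeholders; only the core count and per-executor memory come from the first test case. The flags themselves (`--master`, `--class`, `--total-executor-cores`, `--executor-memory`) are standard spark-submit options.

```python
import shlex


def build_spark_submit(master_url, app_jar, main_class, total_cores, executor_memory):
    """Return the spark-submit argument list for a standalone-mode run."""
    return [
        "spark-submit",
        "--master", master_url,
        "--class", main_class,
        "--total-executor-cores", str(total_cores),
        "--executor-memory", executor_memory,
        app_jar,
    ]


# Hypothetical values: host, class, and jar names are placeholders.
cmd = build_spark_submit(
    master_url="spark://master-host:7077",
    app_jar="rules-app.jar",
    main_class="com.example.RulesApp",
    total_cores=16,
    executor_memory="1g",
)
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to vary a single parameter, such as the total core count, between the 2-worker and 3-worker runs while holding everything else fixed.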



I want to understand why increasing the number of worker nodes doesn't improve performance. Is that not the correct expectation?



EDIT



After looking at other similar questions on Stack Overflow, I tried running the application with spark.dynamicAllocation.enabled=true; however, it degraded performance further.
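For reference, dynamic allocation is usually passed together with a few companion properties rather than on its own. The sketch below flattens such a property set into `--conf` arguments. The values are illustrative and not taken from this setup, and whether the external shuffle service was enabled in the run described above is unknown.

```python
# Illustrative values only; none of these numbers come from the question.
dynamic_allocation_conf = {
    "spark.dynamicAllocation.enabled": "true",
    # In standalone mode the external shuffle service is normally needed as well.
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "3",
}


def to_conf_args(conf):
    """Flatten a property dict into repeated `--conf key=value` arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


conf_args = to_conf_args(dynamic_allocation_conf)
print(" ".join(conf_args))
```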





Spark: Inconsistent performance number in scaling number of cores
– user8371915
Jul 3 at 10:37





@user8371915 Thanks for pointing that out. However, the problem in that question involves a single machine with an increased number of cores (which is why the solution mentions multithreading on a single machine). In my case there are multiple machines (2 worker nodes in the first case and 3 in the second). Does that solution still apply in my case?
– Raj
Jul 3 at 12:06









