Subject: Where does the Driver run?


I explained this in my LinkedIn article "The Operational
Advantages of Spark as a Distributed Processing Framework".

An extract:

*2) YARN Deployment Modes*

The *deployment mode* of Spark simply means where the driver program
runs. On YARN there are two options: *Spark Client Mode* and *Spark
Cluster Mode*. These are described below:

*In Client mode,* *the driver process runs on the node from which you
submit the Spark job to your cluster.* This is often the Edge Node. This
mode is valuable when you want to use Spark interactively, as in our case,
where we would like to display high value prices in the dashboard. In
Client mode you do not reserve any resources from your cluster for the
driver process, because it runs on the submitting node.

*In Cluster mode,* *you submit the Spark job to your cluster and the driver
process runs inside the cluster, within the YARN application master.* In
this mode you cannot use the Spark job interactively, as the client through
which you submit the job can exit as soon as it has successfully submitted
the job to the cluster. You have to reserve some resources for the driver
process, as it runs in your cluster.
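
To make the difference concrete, here is a minimal Python sketch that only builds the two spark-submit command lines. It does not launch anything. The flags --master, --deploy-mode and --driver-memory are standard spark-submit options; the helper name spark_submit_command, the script name my_app.py and the memory values are assumptions for illustration only.

```python
def spark_submit_command(app, deploy_mode, driver_memory="1g"):
    """Build a spark-submit argument list for YARN in the given deploy mode.

    Hypothetical helper for illustration; it only assembles the command.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", deploy_mode,      # decides where the driver runs
        "--driver-memory", driver_memory,  # driver heap size
        app,
    ]


# Client mode: the driver runs on the submitting node (e.g. the Edge Node).
client_cmd = spark_submit_command("my_app.py", "client")

# Cluster mode: the driver runs inside the cluster, so its memory
# has to be budgeted from cluster resources (2g is an assumed value).
cluster_cmd = spark_submit_command("my_app.py", "cluster", driver_memory="2g")

print(" ".join(client_cmd))
print(" ".join(cluster_cmd))
```

The only difference between the two commands is the value passed to --deploy-mode (plus whatever driver memory you choose to budget); the application itself is unchanged.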


Dr Mich Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
On Sat, 23 Mar 2019 at 21:13, Pat Ferrel <[EMAIL PROTECTED]> wrote: