Ch02-Spark 应用执行模式

Ch02-Spark 应用执行模式

July 10, 2018
Apache Spark
spark

Spark 应用执行模式的不同主要体现在 Cluster Manager 使用的是哪个。如果是单独的进程,那么就是 Standalone 模式;如果是 Hadoop Yarn,那就是 Hadoop Yarn 模式。

1. Cluster Mananger #

类型 说明
Standalone a simple cluster manager included with Spark that makes it easy to set up a cluster.
Hadoop YARN the resource manager in Hadoop 2.
Apache Mesos a general cluster manager that can also run Hadoop MapReduce and service applications. (Deprecated)
Kubernetes an open-source system for automating deployment, scaling, and management of containerized applications.

2. Yarn 部署模式 #

Cluster Mananger 选择 Yarn 的时候存在两种部署模式,一种是 client,一种是 cluster。

bin/spark-submit --master yarn --deploy-mode client
bin/spark-submit --master yarn --deploy-mode cluster

client 和 Cluster 唯一的不同点是 Spark Context 运行在哪里。如果运行在 Spark Yarn Client 中,那便是 Client 模式,如下图左所示;如果运行在 Application Master 中,那便是 Cluster 模式,如下图右所示。

spark-deploy

3. 参考文献 #