How to connect Spark workers to the Spark driver in Kubernetes (standalone cluster)
I created a Dockerfile with just Debian and Apache Spark downloaded from the main website. I then created a Kubernetes deployment with one pod running the Spark driver and another running a Spark worker:
NAME                            READY   STATUS    RESTARTS      AGE
spark-driver-54446998ff-2rz5h   1/1     Running   0             45m
spark-worker-5d55b54d8d-9vfs7   1/1     Running   2 (69m ago)   16h
Both are tested to work. I am able to launch the Spark driver with ./start-master.sh, located in /spark-dir/sbin/. This is the log generated by start-master.sh:
22/04/28 04:34:25 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
22/04/28 04:34:25 INFO Master: Starting Spark master at spark://10.244.1.148:7077
22/04/28 04:34:25 INFO Master: Running Spark version 3.2.1
22/04/28 04:34:26 INFO Utils: Successfully started service 'MasterUI' on port 8080.
22/04/28 04:34:27 INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://spark-driver-54446998ff-2rz5h:8080
22/04/28 04:34:28 INFO Master: I have been elected leader! New state: ALIVE
This is what the spark-driver pod's /etc/hosts contains:
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.244.1.148 spark-driver-54446998ff-2rz5h
Now I can connect my worker on another pod in the same namespace with ./start-worker.sh spark://10.244.1.148:7077, and this succeeds, because the spark-driver log shows:
22/04/28 04:34:52 INFO Master: Registering worker 10.244.2.134:44413 with 4 cores, 1024.0 MiB RAM
My question is: for this to work dynamically, the worker pod needs to be able to look up the address of spark-driver in order to connect to it. I have read that a potential way of doing this is to use a DNS service, but so far I've been unsuccessful in getting it to work.
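For example, since the Service below is named spark-driver, my understanding is that the worker should be able to register using the Service's DNS name instead of the pod IP, roughly like this (the default namespace in the fully qualified form is an assumption on my part):
# register with the master via the Service name instead of the pod IP
./start-worker.sh spark://spark-driver:7077
# or with the fully qualified cluster DNS name
./start-worker.sh spark://spark-driver.default.svc.cluster.local:7077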
This is my deployment.yaml file, which contains the Service as well, but I'm unable to understand how it all works together.
apiVersion: v1
kind: Service
metadata:
  name: spark-driver
spec:
  type: ClusterIP
  # type: NodePort
  selector:
    app.kubernetes.io/name: spark-3.2.1
    app.kubernetes.io/instance: spark-driver
  ports:
    - name: service
      protocol: TCP
      port: 80
      targetPort: service-port
    - name: spark-master
      protocol: TCP
      port: 8080
      targetPort: spark-ui-port
    - name: spark-worker
      protocol: TCP
      port: 7077
      targetPort: spark-wkr-port
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-driver
  labels:
    app.kubernetes.io/name: spark-3.2.1
    app.kubernetes.io/instance: spark-driver
    app.kubernetes.io/version: 0.0.4
    app.kubernetes.io/managed-by: kubernetes-standalone-cluster
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-3.2.1-driver
  template:
    metadata:
      labels:
        app: spark-3.2.1-driver
    spec:
      containers:
        - name: spark-driver
          image: zzzzzzzzzzz
          ports:
            - containerPort: 80
              name: service-port
            - containerPort: 8080
              name: spark-ui-port
            - containerPort: 7077
              name: spark-wkr-port
          resources:
            requests:
              cpu: "2"
              memory: "2Gi"
            limits:
              cpu: "4"
              memory: "3Gi"
          env:
            - name: SPARK_MASTER_HOST
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: SPARK_MASTER_PORT
              value: "7077"
            - name: SPARK_MODE
              value: driver
            - name: TERM
              value: xterm
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-worker
  labels:
    app.kubernetes.io/name: spark-3.2.1-worker
    app.kubernetes.io/instance: spark-worker
    app.kubernetes.io/version: 0.0.4
    app.kubernetes.io/managed-by: kubernetes-standalone-cluster
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-3.2.1-worker
  template:
    metadata:
      labels:
        app: spark-3.2.1-worker
    spec:
      containers:
        - name: spark-worker
          image: zzzzzzzzzzz
          resources:
            requests:
              cpu: "2"
              memory: "1Gi"
            limits:
              cpu: "4"
              memory: "2Gi"
          env:
            - name: SPARK_MODE
              value: worker
            - name: TERM
              value: xterm
---
How should I configure the Service, or the Spark environment variables, so that the Spark worker can use DNS to connect to the Spark driver?
Solution 1:[1]
I am trying to accomplish the same thing as you, so far unsuccessfully. However, this may be useful to you: you can expose the host IP of the driver pod as an environment variable, like so:
env:
  - name: "SPARK_DRIVER_HOST_IP"
    valueFrom:
      fieldRef:
        apiVersion: "v1"
        fieldPath: "status.hostIP"
This works for me: I go ahead and set the spark.driver.host property to the value of SPARK_DRIVER_HOST_IP. I am actually doing this from within the container that runs the Spark application, in my SparkConf settings.
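If it helps, the same property can also be passed on the command line through spark-submit instead of in code; this is only a sketch, and the master host, class name and jar here are placeholders:
# sketch: hand the exposed host IP to Spark as spark.driver.host
# (<master-host>, com.example.MyApp and my-app.jar are placeholders)
spark-submit \
  --master spark://<master-host>:7077 \
  --conf spark.driver.host=$SPARK_DRIVER_HOST_IP \
  --class com.example.MyApp \
  my-app.jar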
However, my issue is that the executor gets Connection Refused when trying to connect to [driverHostIP]:[PORT]. I suspect this is because I need a Service to expose this IP, like the one you have in your YAML, but I am not sure.
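One thing that might be worth checking (just a guess on my part) is whether the Service actually resolves and has any endpoints behind it, for example:
# empty ENDPOINTS output would mean the Service selector matches no pods
kubectl get endpoints spark-driver
# check name resolution from inside a pod (getent ships with the Debian base image)
kubectl exec -it <worker-pod-name> -- getent hosts spark-driver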
I am hoping that between us we have the two pieces of this solution, and that using the exposed driver IP address together with the spark-driver Service will work. Let me know whether having the driver IP handy is helpful or not.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | 9945