spark-dependencies job is failing #405
Can you provide the output of
I have the same issue when running via kubectl, as detailed in jaeger-kubernetes.
@jkandasa I am not able to reproduce your issue. Could you please try updating the spark-dependencies image to the latest tag, which is currently:
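A minimal way to refresh the image locally, assuming the default jaegertracing/spark-dependencies repository on Docker Hub (the repository is confirmed later in the thread):

```bash
# Pull the most recent build of the dependencies job image
docker pull jaegertracing/spark-dependencies:latest
```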
I have created this CR:

```yaml
# setup an elasticsearch with `make es`
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200
    dependencies:
      enabled: true
      schedule: "*/1 * * * *"
```

and ES via
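Applying a CR like this is the standard flow; a minimal sketch, assuming the file is saved as simple-prod.yaml and that the operator names the cron job <jaeger-name>-spark-dependencies (the pattern visible in the pod names later in this thread):

```bash
# Apply the CR in a namespace watched by the jaeger-operator
kubectl apply -f simple-prod.yaml
# Check the cron job the operator creates for the dependencies task
kubectl get cronjob simple-prod-spark-dependencies
```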
@pavolloffay ok, I will check with this image and update here. Thank you!
@pavolloffay I tried to pull the latest image on a different machine; it reports the same image that I used. Could you please confirm the repository?
The repository and image are correct. Before, I was referring to the ID, which does not seem to be the image SHA. But your image SHA is shown below as
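To compare images across machines, the repo digest (not the local image ID) is the stable identifier. A quick check, assuming Docker and the jaegertracing/spark-dependencies repository:

```bash
# Print the content-addressable digest of the local image
docker inspect --format '{{index .RepoDigests 0}}' \
  jaegertracing/spark-dependencies:latest
```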
@jkandasa could you provide exact information about the ES installation? What version, where did you download it, and how is it deployed and configured?
@pavolloffay In this test, I was using the ES image from https://registry.centos.org/rhsyseng/elasticsearch/5.6.10
@pavolloffay I tried again with the operator-provided ES cluster and I see the same issue.
spark-dependencies log: jaegerqe-spark-dependencies-1559226240-5tmxj.log
oc describe output for the spark-dependencies pod: jaegerqe-spark-dependencies-1559226240-5tmxj.txt
CR file: crfile.yaml.txt
The self-provisioned ES cluster does not work; it's not supported at the moment.
Fixed in jaegertracing/spark-dependencies#66
I am having a problem deploying the jaeger-spark-dependencies service in my OpenShift cluster: the service cannot connect to the elasticsearch service. I used the following command to deploy the spark-dependencies service:
The log from the jaeger-spark-dependencies container is:
Then, I tried with:
And I got this:
Spark dependencies do not support TLS at the moment (#294).
Hi @pavolloffay, this issue still appears in my k8s cluster. Here is the error log:
I think I am using the version after your fix, and here is the info:
Here is my ES cluster info:
Here is my Spark cronjob info:
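For reference, a generic way to gather that kind of cron job info (the pod name is a placeholder):

```bash
# List cron jobs, find the dependencies pods, and read the last run's log
kubectl get cronjobs
kubectl get pods | grep spark-dependencies
kubectl logs <spark-dependencies-pod-name>
```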
Could you please pull the latest spark dependencies image? The latest seems to be
@pavolloffay thanks,
That seems weird; there might be some inconsistency then, such as using different images between clusters.
Thanks, it seems everything works.
I'm currently running into this as well (v1.20). I suspect it's because I override the job pod's securityContext's runAsUser. Using the latest tag (
Should I create a new issue for this? In general, I'm struggling quite a bit to get the operator to run on a cluster with restrictive PSPs.

```yaml
securityContext:
  fsGroup: 1
  runAsGroup: 999
  runAsNonRoot: true
  runAsUser: 999 # didn't work. With 185, i.e. the Dockerfile's default user, the same issue occurs (https://github.com/jaegertracing/spark-dependencies/blob/master/Dockerfile)
  supplementalGroups:
    - 1
```

Logs:
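A quick way to check whether the job's UID actually has a passwd entry, which is a common cause of Spark startup failures under arbitrary UIDs (the pod name is a placeholder):

```bash
# Show the effective UID/GID inside the job pod and look it up in /etc/passwd;
# getent prints nothing for an anonymous UID
kubectl exec <spark-dependencies-pod> -- sh -c 'id; getent passwd "$(id -u)"'
```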
Not sure if this helps anybody, but I had the same problem in a different context, while creating a custom Spark image for use in Kubernetes.

```dockerfile
FROM bitnami/spark:3.1.1

ENV TINI_VERSION v0.19.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /usr/bin/tini

USER root
RUN chmod +x /usr/bin/tini && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
    chmod ugo+rwx -R /opt/bitnami/spark
USER 1001

ADD entrypoint.sh /opt/
ENTRYPOINT ["/opt/bitnami/scripts/spark/entrypoint.sh", "/opt/entrypoint.sh"]
```

entrypoint.sh:

```bash
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# echo commands to the terminal output
set -ex
# Check whether there is a passwd entry for the container UID
myuid=$(id -u)
mygid=$(id -g)
# turn off -e for getent because it will return error code in anonymous uid case
set +e
uidentry=$(getent passwd $myuid)
set -e
# If there is no passwd entry for the container UID, attempt to create one
if [ -z "$uidentry" ] ; then
if [ -w /etc/passwd ] ; then
echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
else
echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
fi
fi
SPARK_CLASSPATH="$SPARK_CLASSPATH:${SPARK_HOME}/jars/*"
env | grep SPARK_JAVA_OPT_ | sort -t_ -k4 -n | sed 's/[^=]*=\(.*\)/\1/g' > /tmp/java_opts.txt
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt
if [ -n "$SPARK_EXTRA_CLASSPATH" ]; then
SPARK_CLASSPATH="$SPARK_CLASSPATH:$SPARK_EXTRA_CLASSPATH"
fi
if [ "$PYSPARK_MAJOR_PYTHON_VERSION" == "2" ]; then
pyv="$(python -V 2>&1)"
export PYTHON_VERSION="${pyv:7}"
export PYSPARK_PYTHON="python"
export PYSPARK_DRIVER_PYTHON="python"
elif [ "$PYSPARK_MAJOR_PYTHON_VERSION" == "3" ]; then
pyv3="$(python3 -V 2>&1)"
export PYTHON_VERSION="${pyv3:7}"
export PYSPARK_PYTHON="python3"
export PYSPARK_DRIVER_PYTHON="python3"
fi
# If HADOOP_HOME is set and SPARK_DIST_CLASSPATH is not set, set it here so Hadoop jars are available to the executor.
# It does not set SPARK_DIST_CLASSPATH if already set, to avoid overriding customizations of this value from elsewhere e.g. Docker/K8s.
if [ -n "${HADOOP_HOME}" ] && [ -z "${SPARK_DIST_CLASSPATH}" ]; then
export SPARK_DIST_CLASSPATH="$($HADOOP_HOME/bin/hadoop classpath)"
fi
if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
fi
case "$1" in
driver)
shift 1
CMD=(
"$SPARK_HOME/bin/spark-submit"
--conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
--deploy-mode client
"$@"
)
;;
executor)
shift 1
CMD=(
${JAVA_HOME}/bin/java
"${SPARK_EXECUTOR_JAVA_OPTS[@]}"
-Xms$SPARK_EXECUTOR_MEMORY
-Xmx$SPARK_EXECUTOR_MEMORY
-cp "$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH"
org.apache.spark.executor.CoarseGrainedExecutorBackend
--driver-url $SPARK_DRIVER_URL
--executor-id $SPARK_EXECUTOR_ID
--cores $SPARK_EXECUTOR_CORES
--app-id $SPARK_APPLICATION_ID
--hostname $SPARK_EXECUTOR_POD_IP
)
;;
*)
echo "Non-spark-on-k8s command provided, proceeding in pass-through mode..."
CMD=("$@")
;;
esac
# Execute the container CMD under tini for better hygiene
exec /usr/bin/tini -s -- "${CMD[@]}"
```
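Building and pushing the image is then the usual flow (registry and tag are placeholders):

```bash
# Build the custom Spark image with the writable /etc/passwd workaround
docker build -t my-registry/spark-custom:3.1.1 .
docker push my-registry/spark-custom:3.1.1
```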
Hi, I encountered the same issue. The Docker image I use:
When I run the following command, there is no problem:
But when I change to another way (same image version):
My environment variables:
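For comparison, a minimal invocation of the dependencies job against Elasticsearch, using the STORAGE and ES_NODES variables documented for jaegertracing/spark-dependencies (the endpoint is a placeholder):

```bash
# Run the dependencies job once against an Elasticsearch backend
docker run --rm \
  -e STORAGE=elasticsearch \
  -e ES_NODES=http://elasticsearch:9200 \
  jaegertracing/spark-dependencies
```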
Setup:
spark-dependencies, but it fails. Error:
CR file:
Elasticsearch access with curl:
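For a basic reachability check against the ES endpoint, a cluster-health query is the usual first step (URL assumed from the CR earlier in this thread):

```bash
# Verify Elasticsearch answers and report cluster health
curl -s http://elasticsearch:9200/_cluster/health?pretty
```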