Pod crash loop back-off when deploying on OCP 4.10 on top of IBM Rocks #34
Comments
@jotak can you take a look? Is this connected to the latest PR that removed the decode stage? |
@andresmareca-ibm thanks for looking into this. I think this is probably more related to the FLP and NOO repos, but having the issue here is also OK. |
Hi @andresmareca-ibm , Another option is to prefer using released versions rather than |
By the way,
@mariomac , any idea about that? |
@andresmareca-ibm could I see your Also, if possible, can I see the netobserv-controller-manager pod logs? |
I've been digging into the default permissions that we grant to the eBPF agent. The In order to try to reproduce the issue, and verify that we provide a patch that will actually work, what version of OpenShift are you using? |
I deployed the cluster using the IBM Cloud procedure. The version is 4.10.16_1521 |
I'm going to create a new cluster and reapply the scripts in the following order:
At the end I should have the same pods as in the pictures above, right? |
@andresmareca-ibm you don't need to clone the In the NOO repo, you should do:
Then you can deploy the example flowcollector:
If you want the eBPF agent to be deployed, you should set the agent: ebpf property in the descriptor |
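For illustration, a minimal sketch of what such a descriptor might look like with the eBPF agent selected. Only the `agent: ebpf` property comes from the comment above; the `apiVersion`, the resource name `cluster`, and the `namespace` value are assumptions based on the operator's sample flowcollector from around that time and may differ in your version, so check the sample shipped with the NOO repo before applying it.

```yaml
# Sketch only: apiVersion, metadata.name and spec.namespace are assumptions,
# not taken from this thread. Compare against the sample in the NOO repo.
apiVersion: flows.netobserv.io/v1alpha1
kind: FlowCollector
metadata:
  name: cluster
spec:
  namespace: network-observability
  # Select the eBPF agent, as described in the comment above
  agent: ebpf
```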
@mariomac BTW: I use |
FYI:: @ctrath ^^^ |
I'm getting this error during the ebpf container creation. |
@andresmareca-ibm this means that the kernel does not support this capability. Out of curiosity, which Linux distribution and kernel version are you using? Anyway, you can work around this issue by adding |
We have already solved the error that prevented the pods from starting. We had to create a service account with the necessary permissions, as shown below.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: netobserv-ebpf-agent-test
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: netobserv-clusterrole
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "watch", "list"]
  - apiGroups:
      - security.openshift.io
    resourceNames:
      - hostmount-anyuid
      - privileged
      - cgroup
    resources:
      - securitycontextconstraints
    verbs:
      - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: netobserv-rolebinding
subjects:
  - kind: ServiceAccount
    name: netobserv-ebpf-agent-test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: netobserv-clusterrole
I'm closing this issue, but I'll open another one because a new error has come up. Thank you very much!! |
I'm deploying this agent in an OpenShift cluster managed by IBM Cloud. During the deploy phase I hit an issue with some of the pods.
First I deployed the network-observability-operator following the instructions in the README file.
After that I tried to deploy what is in netobserv-ebpf-agent. Here it failed for the first time due to missing serviceAccount permissions. After adding them to the serviceAccount I was able to start the actual pods:
This is where it fails a second time. The output of the pods is:
Any ideas on how to fix it?