Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8s-int-test-build: zk-less druid cluster and http based segment/task managment #10686

Merged
merged 12 commits into from
Jan 6, 2021

Conversation

himanshug
Copy link
Contributor

@himanshug himanshug commented Dec 17, 2020

Linked #10542

Description

Follow up on #10669

@himanshug himanshug changed the title zk-less druid cluster in k8s build k8s-int-test-build: zk-less druid cluster and http based segment/task managment Dec 17, 2020
@himanshug himanshug removed the WIP label Dec 19, 2020
### Gotchas

- Label/Annotation path in each pod spec MUST EXIST, which is easily satisfied if there is at least one label/annotation in the pod spec already. This limitation may be removed in future.
- Druid Pods need permissions to be able to add labels to self-pod, List and Watch other Pods, create ConfigMap for leader election. Assuming, "default" service account is used by Druid pods, you might need to add following or something similar Kubernetes Role and Role Binding.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious. Which kind of druid server needs these add labels to self-pod, List and Watch other Pods, create ConfigMap permissions? MiddleManager or all the druid servers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

everyone needs pod list/update .
create/update ConfigMap is needed only by coordinators/overlords which participate in leader election ... but read ConfigMap is needed by all so as to know who the leader is. That said, I haven't tested such granularity and would just give same permission to all pods.


## Debug And FastFail

sudo /usr/local/bin/kubectl get pod
Copy link
Contributor

@zhangyue19921010 zhangyue19921010 Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: /usr/local/bin/kubectl could be replaced by $KUBECTL. The same as sudo /usr/local/bin/kubectl get svc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

export MINIKUBE_WANTREPORTERRORPROMPT=false
export MINIKUBE_HOME=$HOME
export KUBECONFIG=$HOME/.kube/config
export KUBECTL="sudo /usr/local/bin/kubectl"
Copy link
Contributor

@zhangyue19921010 zhangyue19921010 Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It seems that export KUBECTL="sudo /usr/local/bin/kubectl" is never used in setup_k8s_cluster.sh . Maybe can be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@himanshug
Copy link
Contributor Author

@zhangyue19921010 thanks for taking a look

@zhangyue19921010
Copy link
Contributor

LTGM :)

Copy link
Member

@clintropolis clintropolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhangyue19921010
Copy link
Contributor

As a follow-up we should try to get some or all of https://github.com/apache/druid/blob/master/integration-tests/src/test/java/org/apache/druid/tests/leadership/ITHighAvailabilityTest.java running on this configuration.

Hi @clintropolis and @himanshug Are there any progress being made on that? If no, I'd like to take the chance to implement it after this PR have been merged. :)

@himanshug
Copy link
Contributor Author

himanshug commented Jan 5, 2021

@clintropolis thanks for approving, I will do one more round of checks of things in this PR tomorrow and merge.

Are there any progress being made on that? If no, I'd like to take the chance to implement it after this PR have been merged

@zhangyue19921010 I did some work there and will finish it, I am adding few more related things in that PR . that work currently lives in my fork at https://github.com/himanshug/druid/tree/k8s_build2

@himanshug himanshug merged commit d2e6240 into apache:master Jan 6, 2021
JulianJaffePinterest pushed a commit to JulianJaffePinterest/druid that referenced this pull request Jan 22, 2021
… managment (apache#10686)

* zk-less druid cluster in k8s build

* attempt to fix build and use http based remote task management

* mm/router logs for debugging

* add default account k8s role and binding for pod, configMap access

* fix issue

* change router port to 8088 for common readinessProbe

* break build_run_k8s_cluster.sh into separate scripts

* revert changes to K8sDruidNodeAnnouncer.java

* k8s extension doc update

* add license to new file

* address review comments

* do not try to load lookups at startup to improve cluster startup time
@jihoonson jihoonson added this to the 0.21.0 milestone Jul 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants