This repository has been archived by the owner on May 22, 2020. It is now read-only.

Add experimental-flannel-overlay flag #69

Closed

zreigz
Contributor

@zreigz zreigz commented Jun 8, 2016

This PR is a proof of concept for the --experimental-flannel-overlay flag. The flag is not well documented yet, so I have been experimenting with it. I removed the docker "bootstrap" service and flannel. From what I can see in the logs, it uses the hairpin plugin in hairpin-veth mode. I ran the e2e tests with different hyperkube versions:

v1.2.0

Ran 94 of 293 Specs in 14982.578 seconds
FAIL! -- 59 Passed | 35 Failed | 0 Pending | 199 Skipped 

v1.2.4

Ran 94 of 293 Specs in 4962.784 seconds
FAIL! -- 59 Passed | 32 Failed | 0 Pending | 199 Skipped 

v1.3.0-alpha.5

Ran 94 of 293 Specs in 591.524 seconds
FAIL! -- 81 Passed | 13 Failed | 0 Pending | 199 Skipped

I hope this opens a discussion about networking in the docker-multinode project.

@zreigz
Contributor Author

zreigz commented Jun 8, 2016

cc @cheld @luxas @mikedanese

@luxas
Contributor

luxas commented Jun 8, 2016

I really like this, we'll get rid of 200 lines :)

However, I don't know if it's recommended for use... or if kubenet is a better approach long-term.
But kubenet in turn does depend on cni, which isn't multiarch... yet: containernetworking/cni#241

So I vote for going with this for now, and then switching over to kubenet (probably with cni v0.4).

What does the networking team think?
cc @kubernetes/sig-network

@cheld
Contributor

cheld commented Jun 10, 2016

@luxas In my understanding, --experimental-flannel-overlay alone has no effect on this setup. I tried it and did not see any change in the network setup. It is intended to be used together with --configure-cbr0=true. See https://github.com/kubernetes/kubernetes/blob/release-1.2/pkg/kubelet/kubelet.go#L2763.
With both flags set, the kubelet reads the flannel output and configures the cbr0 bridge. See: https://github.com/kubernetes/kubernetes/blob/release-1.2/pkg/kubelet/flannel_helper.go

I assume a docker restart will be needed afterwards.

I tried to run it in hyperkube:

docker run \
    --volume=/:/rootfs:ro \
    --volume=/sys:/sys:ro \
    --volume=/dev:/dev \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw,rslave \
    --volume=/var/run:/var/run:rw \
    --net=host \
    --pid=host \
    --privileged=true \
    -d \
    gcr.io/google_containers/hyperkube-amd64:v1.3.0-alpha.5 \
    /hyperkube kubelet \
        --allow-privileged=true \
        --containerized \
        --hostname-override="127.0.0.1" \
        --address="0.0.0.0" \
        --api-servers=http://localhost:8080 \
        --configure-cbr0=true \
        --experimental-flannel-overlay=true \
        --v=2

I0610 13:23:02.632089    3282 flannel_helper.go:84] Found flannel subnet file /var/run/flannel/subnet.env
I0610 13:23:02.632173    3282 flannel_helper.go:145] Read kv options FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.72.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=true
 from /var/run/flannel/subnet.env
W0610 13:23:02.632203    3282 flannel_helper.go:149] Ignoring non key-value pair []
I0610 13:23:02.632263    3282 kubelet.go:2967] Flannel server handshake failed stat /etc/default/docker: no such file or directory
I0610 13:23:02.637179    3282 kubelet.go:2547] skipping pod synchronization - [network state unknown]

The kubelet fails when trying to write the docker opts.
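
For readers unfamiliar with the subnet file, the key-value parsing visible in the log above can be approximated with a few lines of shell. This is only a sketch, not the kubelet's actual code (that lives in flannel_helper.go): the temporary file stands in for /var/run/flannel/subnet.env, its contents are copied from the log, and the lower-case variable names are made up for illustration.

```shell
# Sketch only: parse a flannel-style subnet file into shell variables.
# The temp file stands in for /var/run/flannel/subnet.env; values come from the log above.
subnet_file="$(mktemp)"
cat > "$subnet_file" <<'EOF'
FLANNEL_NETWORK=10.1.0.0/16
FLANNEL_SUBNET=10.1.72.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=true

EOF

# Split each line on the first '='. Lines without a known key
# (such as the trailing blank line) are skipped.
while IFS='=' read -r key value; do
  case "$key" in
    FLANNEL_NETWORK) flannel_network="$value" ;;
    FLANNEL_SUBNET)  flannel_subnet="$value" ;;
    FLANNEL_MTU)     flannel_mtu="$value" ;;
    FLANNEL_IPMASQ)  flannel_ipmasq="$value" ;;
  esac
done < "$subnet_file"
rm -f "$subnet_file"

echo "bridge address ${flannel_subnet}, MTU ${flannel_mtu}, masquerade ${flannel_ipmasq}"
```

The bridge address and MTU are the values a cbr0 setup would need; writing them into /etc/default/docker is the step that fails in the containerized kubelet.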

BTW: the --configure-cbr0 option does work when executed in hyperkube; however, I do not think it is of much help on its own.

Overall, I think the flag is intended to be used without hyperkube.

I think the only option for hyperkube is the CNI plugin; however, it fails without any error log when executed in hyperkube.

@cheld
Contributor

cheld commented Jun 10, 2016

@zreigz I assume the e2e tests do not verify the network setup. I assume you get the same results without the flag.

@luxas
Contributor

luxas commented Jun 12, 2016

Ok, seems like it was just too good to be true :)
@cheld However, does CNI work now that kubernetes/kubernetes#24983 is in?

Guess it failed silently because it didn't recognize the options from k8s, but that should be fixed now

@cheld
Contributor

cheld commented Jun 13, 2016

Currently, I have tested:

  • kubelet binary with dummy container in manifest (not full kubernetes) - seems to work
  • kubelet in container (hyperkube v1.3.0-alpha.5) - not working. It seems to fail during init without an error log. The add-to-network command is never called. I will investigate further.

@cheld
Contributor

cheld commented Jun 13, 2016

@luxas I was thinking about an alternative approach: In the master.sh script, we could pull kubelet, cni and manifest files out of hyperkube, create a unit file for kubelet and run on host directly.

Advantages:

  • Get rid of docker-bootstrap
  • Move etcd and flannel into hyperkube
  • The script would be pretty short
  • No modifications to /etc/default/docker
  • The user could customize the manifest files
  • No kubelet-in-container related issues

IMHO, better than the current script but not as good as making the hyperkube work for multi-node.
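
A rough sketch of the first two steps of that idea (getting the binary out of the image and generating a unit file) could look like the following. Everything here is an assumption for illustration: the function names, the destination path /usr/local/bin/hyperkube, and the unit contents are hypothetical, and the kubelet flags are simply the ones already used elsewhere in this thread. Extracting the cni and manifest files is not covered.

```shell
# Sketch only: function names, paths, and unit contents are hypothetical.

# Copy the /hyperkube binary out of the image onto the host.
# Needs a running Docker daemon, so it is defined here but not invoked.
extract_hyperkube() {
  local image="$1" dest="$2" cid
  cid="$(docker create "$image" /hyperkube)"   # create a stopped container
  docker cp "${cid}:/hyperkube" "$dest"        # copy the binary to the host
  docker rm "$cid" > /dev/null                 # clean up the container
}

# Print a systemd unit that runs the kubelet directly on the host.
render_kubelet_unit() {
  local binary="$1"
  cat <<EOF
[Unit]
Description=Kubernetes kubelet (binary extracted from hyperkube)
After=docker.service
Requires=docker.service

[Service]
ExecStart=${binary} kubelet \\
  --allow-privileged=true \\
  --api-servers=http://localhost:8080 \\
  --config=/etc/kubernetes/manifests-multi \\
  --v=2
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

unit="$(render_kubelet_unit /usr/local/bin/hyperkube)"
printf '%s\n' "$unit"
```

In a real script the rendered text would be written to a unit file and enabled with systemctl; printing it first makes it easy to review before touching the host.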

@zreigz
Contributor Author

zreigz commented Jun 14, 2016

Some of the failing tests indicate a network problem:

[Fail] [k8s.io] Networking [It] should function for intra-pod communication [Conformance] 
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/networking.go:197


[Fail] [k8s.io] Networking [It] should provide Internet connection for containers [Conformance] 
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/networking.go:54

@cheld
Contributor

cheld commented Jun 14, 2016

@zreigz what setup produced these test results?

@zreigz
Contributor Author

zreigz commented Jun 14, 2016

  docker run -d \
    --net=host \
    --pid=host \
    --privileged \
    --restart=${RESTART_POLICY} \
    ${KUBELET_MOUNTS} \
    gcr.io/google_containers/hyperkube-${ARCH}:${K8S_VERSION} \
    /hyperkube kubelet \
      --allow-privileged \
      --api-servers=http://localhost:8080 \
      --config=/etc/kubernetes/manifests-multi \
      --cluster-dns=${DNS_SERVER_IP} \
      --cluster-domain=${DNS_DOMAIN} \
      --hostname-override=$(ip -o -4 addr list ${NET_INTERFACE} | awk '{print $4}' | cut -d/ -f1) \
      --v=2 --experimental-flannel-overlay=true
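
The --hostname-override value above comes from a small pipeline that extracts the IPv4 address of ${NET_INTERFACE}. As a sketch of what each stage does, the snippet below runs the same awk and cut stages on a canned line in the one-line format that ip -o -4 addr list prints. The interface name and address are invented sample values, not taken from the test cluster.

```shell
# Sample line in the one-line (-o) output format of `ip -4 addr list`; values are invented.
sample='2: eth0    inet 192.168.1.10/24 brd 192.168.1.255 scope global eth0\       valid_lft forever preferred_lft forever'

# awk picks the 4th whitespace-separated field (address/prefix);
# cut then drops the prefix length after the slash.
node_ip="$(printf '%s\n' "$sample" | awk '{print $4}' | cut -d/ -f1)"
echo "$node_ip"
```

If the interface carries more than one IPv4 address, the original pipeline yields one line per address, which would not be a valid hostname override.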

@cheld
Contributor

cheld commented Jun 16, 2016

Can we close this PR?

@zreigz
Contributor Author

zreigz commented Jun 16, 2016

I will close it because we have been going in CNI direction

@zreigz zreigz closed this Jun 16, 2016
@luxas
Contributor

luxas commented Jun 17, 2016

Works fine for me.

@colemickens

Can someone elaborate on what "we have been going in CNI direction" means? Or point me to an issue/design doc about this?

@luxas
Contributor

luxas commented Jun 17, 2016

We want to get rid of the restart of the main docker daemon and instead implement CNI/kubenet as the overlay network provider.

If we can remove the restart, we can remove all OS dependent code, and we're getting closer to "run a kubernetes cluster anywhere where docker is above the vX.Y version".

I'm not an expert on CNI/kubenet, yet :), but I think it's doable at some point quite soon.


@zreigz
Contributor Author

zreigz commented Jun 17, 2016

We are testing hyperkube with the flannel CNI plugin. There are some problems.

When you start the newest hyperkube with the CNI plugin, the add-ons crash because of network problems. We haven't tested this before, so we don't know whether it is a regression or something new.

@bprashanth

We're already running with kubenet on by default, so a restart isn't necessary at HEAD. I haven't tested with the CNI flannel plugin yet (and probably no one else has either), so if you find an issue, please report it.

@zreigz
Contributor Author

zreigz commented Jun 17, 2016

OK I will create the issue.

@zreigz
Contributor Author

zreigz commented Jun 17, 2016

Issue created: kubernetes/kubernetes#27603
