[BUG] kube-ovn-controller crash because of concurrent map read and write #4298

Closed
oilbeater opened this issue Jul 16, 2024 · 0 comments · Fixed by #4302
Labels
bug Something isn't working

Comments

@oilbeater (Collaborator)

Kube-OVN Version

v1.13.0

Kubernetes Version

v1.29.2

Operation-system/Kernel Version

GitHub Actions environment

Description

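kube-ovn-controller can crash on a concurrent map access to a shared runtime.Scheme (0xc000603730 in both goroutines below). controller.Run at github.com/kubeovn/kube-ovn/pkg/controller/controller.go:246 registers the Kube-OVN types into the Scheme (a map write, goroutine 86) while the leader-election renew loop started from github.com/kubeovn/kube-ovn/cmd/controller/controller.go:110 reads the same Scheme through ObjectKinds (a map read, goroutine 1):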

fatal error: concurrent map read and map write

goroutine 1 [running]:
k8s.io/apimachinery/pkg/runtime.(*Scheme).ObjectKinds(0xc000603730, {0x55a0e4d41de0?, 0xc00036af00?})
	k8s.io/apimachinery@v0.30.2/pkg/runtime/scheme.go:263 +0xe9
k8s.io/apimachinery/pkg/runtime.(*parameterCodec).EncodeParameters(0x55a0e61a9720, {0x55a0e4d41de0, 0xc00036af00}, {{0x55a0e3c2c269, 0x13}, {0x55a0e3c0e0a2, 0x2}})
	k8s.io/apimachinery@v0.30.2/pkg/runtime/codec.go:190 +0x62
k8s.io/client-go/rest.(*Request).SpecificallyVersionedParams(0xc00032efc0, {0x55a0e4d41de0?, 0xc00036af00?}, {0x55a0e4d42038?, 0x55a0e61a9720?}, {{0x55a0e3c2c269?, 0xc00052e4e0?}, {0x55a0e3c0e0a2?, 0x55a0e1efd591?}})
	k8s.io/client-go@v12.0.0+incompatible/rest/request.go:377 +0x75
k8s.io/client-go/rest.(*Request).VersionedParams(...)
	k8s.io/client-go@v12.0.0+incompatible/rest/request.go:370
k8s.io/client-go/kubernetes/typed/coordination/v1.(*leases).Update(0xc0007303c0, {0x55a0e4d5e1f8, 0xc000312700}, 0xc0001368c0, {{{0x0, 0x0}, {0x0, 0x0}}, {0x0, 0x0, ...}, ...})
	k8s.io/client-go@v12.0.0+incompatible/kubernetes/typed/coordination/v1/lease.go:135 +0x152
k8s.io/client-go/tools/leaderelection/resourcelock.(*LeaseLock).Update(0xc000200a20, {0x55a0e4d5e1f8, 0xc000312700}, {{0xc0000541b9, 0x24}, 0x1e, {{0xc19dbbed8e0528d4, 0x2e080a1, 0x55a0e61ce640}}, {{0xc19dbbed8e60683c, ...}}, ...})
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/resourcelock/leaselock.go:75 +0x294
k8s.io/client-go/tools/leaderelection.(*LeaderElector).tryAcquireOrRenew(0xc000200b40, {0x55a0e4d5e1f8, 0xc000312700})
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:335 +0x245
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1.1()
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:275 +0x1f
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntil.ConditionFunc.WithContext.func1({0x55a0e1f5402b?, 0x0?})
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/wait.go:109 +0x13
k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext({0x55a0e4d5e518?, 0xc0002b6780?}, 0xc19dbbf28e605c9d?)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/wait.go:154 +0x4c
k8s.io/apimachinery/pkg/util/wait.poll({0x55a0e4d5e518, 0xc0002b6780}, 0x0?, 0xc00072b988, 0xc00072b9d0)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/poll.go:245 +0x32
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext({0x55a0e4d5e518?, 0xc0002b6780?}, 0xc00072b9b0?, 0xc00072b9e0?)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/poll.go:200 +0x53
k8s.io/apimachinery/pkg/util/wait.PollImmediateUntil(0x55a0e4d5e188?, 0xc000512500?, 0x4a817c800?)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/poll.go:187 +0x3c
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew.func1()
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:274 +0xd4
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:226 +0x33
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00072bbf8, {0x55a0e4d384e0, 0xc0006077d0}, 0x1, 0xc0002b6720)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:227 +0xaf
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00072bbf8, 0x165a0bc00, 0x0, 0x1, 0xc0002b6720)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:161
k8s.io/client-go/tools/leaderelection.(*LeaderElector).renew(0xc000200b40, {0x55a0e4d5e188?, 0xc0005124b0?})
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:271 +0x118
k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc000200b40, {0x55a0e4d5e188, 0xc00015cc80})
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:214 +0xfa
k8s.io/client-go/tools/leaderelection.RunOrDie({0x55a0e4d5e188, 0xc00015cc80}, {{0x55a0e4d65078, 0xc000200a20}, 0x6fc23ac00, 0x4a817c800, 0x165a0bc00, {0xc00059ec40, 0xc000010378, 0x0}, ...})
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:228 +0x85
github.com/kubeovn/kube-ovn/cmd/controller.CmdMain()
	github.com/kubeovn/kube-ovn/cmd/controller/controller.go:110 +0x59b
main.main()
	github.com/kubeovn/kube-ovn/cmd/cmdmain.go:98 +0x185

goroutine 40 [chan receive]:
k8s.io/client-go/tools/record.NewBroadcaster.func1()
	k8s.io/client-go@v12.0.0+incompatible/tools/record/event.go:219 +0x2c
created by k8s.io/client-go/tools/record.NewBroadcaster in goroutine 1
	k8s.io/client-go@v12.0.0+incompatible/tools/record/event.go:218 +0x1e5

goroutine 28 [syscall]:
os/signal.signal_recv()
	runtime/sigqueue.go:152 +0x29
os/signal.loop()
	os/signal/signal_unix.go:23 +0x13
created by os/signal.Notify.func1.1 in goroutine 1
	os/signal/signal.go:151 +0x1f

goroutine 29 [chan receive]:
main.dumpProfile.func1()
	github.com/kubeovn/kube-ovn/cmd/cmdmain.go:45 +0x37
created by main.dumpProfile in goroutine 1
	github.com/kubeovn/kube-ovn/cmd/cmdmain.go:43 +0xe6

goroutine 30 [chan receive]:
main.dumpProfile.func2()
	github.com/kubeovn/kube-ovn/cmd/cmdmain.go:70 +0x37
created by main.dumpProfile in goroutine 1
	github.com/kubeovn/kube-ovn/cmd/cmdmain.go:68 +0x125

goroutine 31 [chan receive]:
github.com/kubeovn/kube-ovn/cmd/controller.CmdMain.func1()
	github.com/kubeovn/kube-ovn/cmd/controller/controller.go:38 +0x2f
created by github.com/kubeovn/kube-ovn/cmd/controller.CmdMain in goroutine 1
	github.com/kubeovn/kube-ovn/cmd/controller/controller.go:36 +0xbb

goroutine 76 [select]:
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x55a0e4d2d718, {0x55a0e4d384e0, 0xc0002b2000}, 0x1, 0x0)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:238 +0x12c
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x55a0e4d2d718, 0x12a05f200, 0x0, 0x1, 0x0)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:204 +0x7f
k8s.io/apimachinery/pkg/util/wait.Until(...)
	k8s.io/apimachinery@v0.30.2/pkg/util/wait/backoff.go:161
created by github.com/kubeovn/kube-ovn/pkg/util.InitKlogMetrics in goroutine 1
	github.com/kubeovn/kube-ovn/pkg/util/klog_metrics.go:24 +0x1f

goroutine 77 [select]:
k8s.io/klog/v2.(*flushDaemon).run.func1()
	k8s.io/klog/v2@v2.130.1/klog.go:1141 +0x117
created by k8s.io/klog/v2.(*flushDaemon).run in goroutine 1
	k8s.io/klog/v2@v2.130.1/klog.go:1137 +0x171

goroutine 81 [chan receive]:
sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler.func1()
	sigs.k8s.io/controller-runtime@v0.18.4/pkg/manager/signals/signal.go:38 +0x27
created by sigs.k8s.io/controller-runtime/pkg/manager/signals.SetupSignalHandler in goroutine 31
	sigs.k8s.io/controller-runtime@v0.18.4/pkg/manager/signals/signal.go:37 +0xc5

goroutine 86 [runnable]:
reflect.Value.MethodByName({0x55a0e4c78a60?, 0x55a0e61ce280?, 0x16?}, {0x55a0e3c1c5aa?, 0xc?})
	reflect/value.go:2110 +0x175
k8s.io/apimachinery/pkg/runtime.(*Scheme).AddKnownTypeWithName(0xc000603730, {{0x55a0e3c17e07, 0xa}, {0x55a0e3c0e0a2, 0x2}, {0x55a0e465a902, 0xd}}, {0x55a0e4d41d90, 0x55a0e61ce280})
	k8s.io/apimachinery@v0.30.2/pkg/runtime/scheme.go:184 +0x52c
k8s.io/apimachinery/pkg/runtime.(*Scheme).AddKnownTypes(0xc000603730, {{0x55a0e3c17e07?, 0xa?}, {0x55a0e3c0e0a2?, 0xa?}}, {0x55a0e61abca0?, 0x6?, 0x55a0e4d41e58?})
	k8s.io/apimachinery@v0.30.2/pkg/runtime/scheme.go:148 +0x165
k8s.io/apimachinery/pkg/apis/meta/v1.AddToGroupVersion(0xc000603730, {{0x55a0e3c17e07?, 0x0?}, {0x55a0e3c0e0a2?, 0x0?}})
	k8s.io/apimachinery@v0.30.2/pkg/apis/meta/v1/register.go:73 +0x1e5
github.com/kubeovn/kube-ovn/pkg/apis/kubeovn/v1.addKnownTypes(0xc000603730)
	github.com/kubeovn/kube-ovn/pkg/apis/kubeovn/v1/register.go:75 +0x8ae
k8s.io/apimachinery/pkg/runtime.(*SchemeBuilder).AddToScheme(...)
	k8s.io/apimachinery@v0.30.2/pkg/runtime/scheme_builder.go:29
github.com/kubeovn/kube-ovn/pkg/controller.Run({0x55a0e4d5e188, 0xc0005124b0}, 0xc000164608)
	github.com/kubeovn/kube-ovn/pkg/controller/controller.go:246 +0x62
github.com/kubeovn/kube-ovn/cmd/controller.CmdMain.func3({0x55a0e4d5e188?, 0xc0005124b0?})
	github.com/kubeovn/kube-ovn/cmd/controller/controller.go:117 +0x25
created by k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run in goroutine 1
	k8s.io/client-go@v12.0.0+incompatible/tools/leaderelection/leaderelection.go:213 +0xe6

goroutine 38 [IO wait]:
internal/poll.runtime_pollWait(0x7f4750231db8, 0x72)
	runtime/netpoll.go:345 +0x85
internal/poll.(*pollDesc).wait(0x8?, 0x10?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc000316480)
	internal/poll/fd_unix.go:611 +0x2ac
net.(*netFD).accept(0xc000316480)
	net/fd_unix.go:172 +0x29
net.(*TCPListener).accept(0xc0002a9120)
	net/tcpsock_posix.go:159 +0x1e
net.(*TCPListener).Accept(0xc0002a9120)
	net/tcpsock.go:327 +0x30
net/http.(*Server).Serve(0xc00076c000, {0x55a0e4d51b10, 0xc0002a9120})
	net/http/server.go:3260 +0x33e
net/http.(*Server).ListenAndServe(0xc00076c000)
	net/http/server.go:3189 +0x71
github.com/kubeovn/kube-ovn/cmd/controller.CmdMain.func2()
	github.com/kubeovn/kube-ovn/cmd/controller/controller.go:89 +0x314
created by github.com/kubeovn/kube-ovn/cmd/controller.CmdMain in goroutine 1
	github.com/kubeovn/kube-ovn/cmd/controller/controller.go:56 +0x2db

goroutine 101 [IO wait]:
internal/poll.runtime_pollWait(0x7f4750231eb0, 0x72)
	runtime/netpoll.go:345 +0x85
internal/poll.(*pollDesc).wait(0xc000316180?, 0xc0001e2e00?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000316180, {0xc0001e2e00, 0x700, 0x700})
	internal/poll/fd_unix.go:164 +0x27a
net.(*netFD).Read(0xc000316180, {0xc0001e2e00?, 0x7f4709855958?, 0xc0006cc318?})
	net/fd_posix.go:55 +0x25
net.(*conn).Read(0xc00053a528, {0xc0001e2e00?, 0xc000753938?, 0x55a0e1ef3bbb?})
	net/net.go:185 +0x45
crypto/tls.(*atLeastReader).Read(0xc0006cc318, {0xc0001e2e00?, 0x0?, 0xc0006cc318?})
	crypto/tls/conn.go:806 +0x3b
bytes.(*Buffer).ReadFrom(0xc0002469b0, {0x55a0e4d38a60, 0xc0006cc318})
	bytes/buffer.go:211 +0x98
crypto/tls.(*Conn).readFromUntil(0xc000246708, {0x55a0e4d37da0, 0xc00053a528}, 0xc000753980?)
	crypto/tls/conn.go:828 +0xde
crypto/tls.(*Conn).readRecordOrCCS(0xc000246708, 0x0)
	crypto/tls/conn.go:626 +0x3cf
crypto/tls.(*Conn).readRecord(...)
	crypto/tls/conn.go:588
crypto/tls.(*Conn).Read(0xc000246708, {0xc00075f000, 0x1000, 0xc000498000?})
	crypto/tls/conn.go:1370 +0x156
bufio.(*Reader).Read(0xc0001f8600, {0xc0001a2e40, 0x9, 0x55a0e618c710?})
	bufio/bufio.go:241 +0x197
io.ReadAtLeast({0x55a0e4d371c8, 0xc0001f8600}, {0xc0001a2e40, 0x9, 0x9}, 0x9)
	io/io.go:335 +0x90
io.ReadFull(...)
	io/io.go:354
golang.org/x/net/http2.readFrameHeader({0xc0001a2e40, 0x9, 0x753dc0?}, {0x55a0e4d371c8?, 0xc0001f8600?})
	golang.org/x/net@v0.26.0/http2/frame.go:237 +0x65
golang.org/x/net/http2.(*Framer).ReadFrame(0xc0001a2e00)
	golang.org/x/net@v0.26.0/http2/frame.go:501 +0x85
golang.org/x/net/http2.(*clientConnReadLoop).run(0xc000753fa8)
	golang.org/x/net@v0.26.0/http2/transport.go:2358 +0xda
golang.org/x/net/http2.(*ClientConn).readLoop(0xc00072ec00)
	golang.org/x/net@v0.26.0/http2/transport.go:2254 +0x8b
created by golang.org/x/net/http2.(*Transport).newClientConn in goroutine 100
	golang.org/x/net@v0.26.0/http2/transport.go:869 +0xd1b

goroutine 39 [chan receive]:
k8s.io/apimachinery/pkg/watch.(*Broadcaster).loop(0xc0002a6370)
	k8s.io/apimachinery@v0.30.2/pkg/watch/mux.go:268 +0x66
created by k8s.io/apimachinery/pkg/watch.NewLongQueueBroadcaster in goroutine 1
	k8s.io/apimachinery@v0.30.2/pkg/watch/mux.go:93 +0x125

Steps To Reproduce

The crash is intermittent. Running the GitHub Actions job repeatedly will eventually hit it, as in this run:

https://github.com/kubeovn/kube-ovn/actions/runs/9957878461/job/27511204912?pr=4289

Current Behavior

kube-ovn-controller occasionally crashes with "fatal error: concurrent map read and map write".

Expected Behavior

kube-ovn-controller runs without crashing.

@oilbeater added the bug label Jul 16, 2024