fluent-bit: Support multi instances for fluent-bit go loki plugin (revised) #1454

Closed
wants to merge 1 commit into from

cosmo0920
Contributor

@cosmo0920 cosmo0920 commented Dec 24, 2019

I've noticed that the current implementation does not cover the multi-instance use case.

Signed-off-by: Hiroshi Hatake <cosmo0920.oucc@gmail.com>

What this PR does / why we need it:
In the previous implementation, the fluent-bit go loki plugin does not
support multiple instances.

This is because a single global variable, var plugin *loki, is used for plugin
instance management.
As a result, it is overwritten every time another loki output instance is initialized.

This implementation is based on the multi-instance support code in the fluent-plugin-go-s3 plugin.

I know a similar patch was sent in #1294, but with this patch users do not need to specify a context ID; the ID is calculated automatically. Users can keep using the fluent-bit plugin the same way after this patch is applied.
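The idea of replacing the single global with a registry that assigns IDs automatically can be sketched as below. This is a minimal, self-contained illustration and not the code in this patch: the `loki` struct fields and the `registerInstance` and `lookupInstance` helpers are hypothetical names, and in the real plugin the settings would come from fluent-bit rather than from `main`.

```go
package main

import (
	"fmt"
	"sync"
)

// loki stands in for the plugin's per-output state (hypothetical fields).
type loki struct {
	url    string
	labels string
}

var (
	mu        sync.Mutex
	instances []*loki // one entry per [OUTPUT] section
)

// registerInstance stores p and returns an automatically assigned ID
// (its index in the slice), so users never configure an ID themselves.
func registerInstance(p *loki) int {
	mu.Lock()
	defer mu.Unlock()
	instances = append(instances, p)
	return len(instances) - 1
}

// lookupInstance returns the instance for id, or nil if id is unknown.
func lookupInstance(id int) *loki {
	mu.Lock()
	defer mu.Unlock()
	if id < 0 || id >= len(instances) {
		return nil
	}
	return instances[id]
}

func main() {
	const url = "http://localhost:3100/loki/api/v1/push"
	cpu := registerInstance(&loki{url: url, labels: `{tag="my_cpu"}`})
	thermal := registerInstance(&loki{url: url, labels: `{tag="my_thermal"}`})

	// Each ID resolves to its own instance instead of the last one registered.
	fmt.Println(cpu, lookupInstance(cpu).labels)
	fmt.Println(thermal, lookupInstance(thermal).labels)
}
```

With a single `var plugin *loki`, the second registration would replace the first, which matches the overwriting behaviour described above.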

Which issue(s) this PR fixes:
Fixes #1446

Special notes for your reviewer:

I built the binary with go build -race, and with this patch it crashes with a SEGV.

However, the same SEGV also occurs on current master (dafb9d8) when built with go build -race.

The fluent-bit config I used is:

[INPUT]
  Name cpu
  Tag  my_cpu

[INPUT]
  Name thermal
  Tag  my_thermal

[OUTPUT]
  Name loki
  Match my_cpu
  Url http://localhost:3100/loki/api/v1/push
  BatchWait 1
  BatchSize 30720
  Labels {test="fluent-bit-go", tag="my_cpu"}
  LogLevel info

[OUTPUT]
  Name loki
  Match my_thermal 
  Url http://localhost:3100/loki/api/v1/push
  BatchWait 1
  BatchSize 30720
  Labels {test="fluent-bit-go", tag="my_thermal"}
  LogLevel info

With this config, fluent-bit can handle both loki instances.

Checklist

  • Documentation added
  • Tests updated

@cosmo0920
Contributor Author

I've obtained a backtrace using a shared object (the fluent-bit plugin) built with debug symbols enabled:

$ go build -race -gcflags="-N -l" -ldflags "-X github.com/grafana/loki/pkg/build.Branch=support-multi-instances -X github.com/grafana/loki/pkg/build.Version=support-multi-instances-fa18b87 -X github.com/grafana/loki/pkg/build.Revision=fa18b87a -X github.com/grafana/loki/pkg/build.BuildUser=hhatake@hhatake-debian -X github.com/grafana/loki/pkg/build.BuildDate=2019-12-24T08:54:32Z" -tags netgo -mod=vendor -buildmode=c-shared -o cmd/fluent-bit/out_loki.so ./cmd/fluent-bit/

Then,

$ gdb --args ./bin/fluent-bit -c fluent.conf -e ~/go/1.13.4/src/github.com/grafana/loki/cmd/fluent-bit/out_loki.so
<snip>      
(gdb) r
Starting program: /media/work/hhatake/GitHub/fluent-bit/build/bin/fluent-bit -c fluent.conf -e /home/hhatake/go/1.13.4/src/github.com/grafana/loki/cmd/fluent-bit/out_loki.so
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: File "/home/hhatake/.goenv/versions/1.13.4/src/runtime/runtime-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /home/hhatake/.goenv/versions/1.13.4/src/runtime/runtime-gdb.py
line to your configuration file "/home/hhatake/.gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/home/hhatake/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
[New Thread 0x7ffff4bd7700 (LWP 1218)]
==1128==ERROR: ThreadSanitizer failed to allocate 0x2871000 (42405888) bytes at address 21fffdbf1c000 (errno: 12)
[New Thread 0x7fffedfff700 (LWP 1219)]

Thread 2 "fluent-bit" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff4bd7700 (LWP 1218)]
0x00007ffff61877ae in __tsan_read () from /home/hhatake/go/1.13.4/src/github.com/grafana/loki/cmd/fluent-bit/out_loki.so
(gdb) bt
#0  0x00007ffff61877ae in __tsan_read () from /home/hhatake/go/1.13.4/src/github.com/grafana/loki/cmd/fluent-bit/out_loki.so
#1  0x00007ffff5267996 in racecall () at /home/hhatake/.goenv/versions/1.13.4/src/runtime/race_amd64.s:381
#2  0x00007ffff5238300 in ?? () at /home/hhatake/.goenv/versions/1.13.4/src/runtime/proc.go:1080
   from /home/hhatake/go/1.13.4/src/github.com/grafana/loki/cmd/fluent-bit/out_loki.so
#3  0x00007ffff5263a88 in runtime.rt0_go () at /home/hhatake/.goenv/versions/1.13.4/src/runtime/asm_amd64.s:220
#4  0x0000000000000000 in ?? ()

This SEGV seems to come from the Go runtime. Am I wrong?

@JensErat JensErat mentioned this pull request Dec 27, 2019
2 tasks
@JensErat
Contributor

Indeed, the Id configuration is not required at all; I mostly took it from the upstream documentation. I removed it again in my approach, and I even removed the lookup from the local slice: you can as well pass a pointer to the plugin instead of a pointer to an integer used to retrieve the plugin from the slice. When going over your code, I also realized we were both missing a nil check in the shutdown code...
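The two points in this comment (storing the plugin pointer itself as the per-instance context, and guarding the shutdown path against nil) could look roughly like the sketch below. It is only an illustration under stated assumptions: the `plugin` type, the `contexts` map, and `exitAll` are hypothetical stand-ins for the context slot that fluent-bit keeps per output, so the example runs without cgo.

```go
package main

import "fmt"

// plugin stands in for one loki output instance (hypothetical type).
type plugin struct {
	name    string
	stopped bool
}

// stop would flush and shut the instance down; here it only records the call.
func (p *plugin) stop() { p.stopped = true }

// contexts simulates the per-instance context slot kept by the host.
// The stored value is the *plugin itself, not an index into a slice.
type contexts map[string]interface{}

// exitAll stops every instance and returns how many were stopped.
// Slots that hold nil (for example because initialization failed)
// or a value of another type are skipped instead of dereferenced.
func exitAll(ctxs contexts) int {
	n := 0
	for _, v := range ctxs {
		p, ok := v.(*plugin)
		if !ok || p == nil {
			continue
		}
		p.stop()
		n++
	}
	return n
}

func main() {
	ctxs := contexts{
		"loki.0": &plugin{name: "my_cpu"},
		"loki.1": nil, // init failed before a plugin was created
	}
	fmt.Println("stopped:", exitAll(ctxs))
}
```

The type assertion covers both an empty context and a typed nil pointer, which is the case an unguarded shutdown loop would crash on.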

Regarding the -race flag: I also had no success and saw similar, if not the same, effects as you observed. To me it looks like the crashing code actually comes from the sanitizer runtime that gets linked in; I gave up on this. I only found some reports of cgo failing with -race, but none where it was working.

@cosmo0920 cosmo0920 force-pushed the support-multi-instances branch 2 times, most recently from 096b505 to 8ffb481 on December 28, 2019 07:19
@cosmo0920
Contributor Author

you can as well pass a pointer to the plugin instead of a pointer to an integer used to retrieve the plugin from the slice. When going over your code, I also realized we were both missing a nil check in the shutdown code...

Oh, yeah. Thank you for your feedback. 👍

Regarding the -race flag: I also had no success and saw similar, if not the same, effects as you observed. To me it looks like the crashing code actually comes from the sanitizer runtime that gets linked in; I gave up on this. I only found some reports of cgo failing with -race, but none where it was working.

I also gave up on it then....

@cyriltovena
Contributor

Closing in favor of #1294; I still added you as a co-author in that PR.

@cyriltovena cyriltovena closed this Jan 6, 2020

Successfully merging this pull request may close these issues.

fluent bit loki output plugin match rule not working multiple loki servers
3 participants