[Bug] Cannot register file error #796

Open
axot opened this issue Mar 14, 2024 · 3 comments

axot commented Mar 14, 2024

Describe the question/issue

When the per-second log ingestion rate is high, the system produces the error "cannot register file".

Configuration

The customer needs Fluent Bit to collect more than 3000 log records per second, but it cannot collect all of them.
About 3000 per second is the number of records Fluent Bit actually collects; the application emits more than that.

Fluent Bit is deployed through the built-in logging feature of EKS on Fargate (the aws-logging ConfigMap).

Logs are sent to both Kinesis Data Firehose and CloudWatch Logs, and the record counts at the two destinations match.

filters.conf:
----
[FILTER]
    Name     parser
    Match    kube.*
    Key_name log
    Parser   crio
    Reserve_Data On
[FILTER]
    Name        kubernetes
    Match       kube.*
    Merge_Log   On
    Merge_Log_Key       log_data
    Buffer_Size 0
    Kube_Meta_Cache_TTL 300s
[FILTER]
    Name  rewrite_tag
    Match kube.*
    Rule  $log_data['logger'] ^(search)$ search true
[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['container_name'] envoy
[FILTER]
    Name    grep
    Match   *
    Exclude $kubernetes['container_name'] xray-daemon

flb_log_cw:
----
true

output.conf:
----
[OUTPUT]
    Name      cloudwatch
    Match     kube.*
    region    ap-northeast-1
    log_group_name    *****
    log_stream_prefix from-fluent-bit-
    auto_create_group true
[OUTPUT]
    Name    kinesis_firehose
    Match   kube.*
    region  ap-northeast-1
    delivery_stream *****
[OUTPUT]
    Name    kinesis_firehose
    Match   search
    region  ap-northeast-1
    delivery_stream *****

parsers.conf:
----
[PARSER]
    Name        crio
    Format      Regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%LZ
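
For readers who want to check what the crio parser above extracts from a container log line, here is a minimal Python sketch. It is an illustration only, not how Fluent Bit runs the parser: Python spells named groups `(?P<name>...)` where the config uses `(?<name>...)`, the sample line is hypothetical, and the helper names `parse_crio` and `matches_search_rule` are made up for this example. The second helper approximates the rewrite_tag rule from filters.conf, assuming the `log` field holds JSON that ends up under `log_data`.

```python
import json
import re

# Same pattern as the crio parser, translated to Python's named-group syntax.
CRIO_LINE = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>P|F) (?P<log>.*)$"
)


def parse_crio(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = CRIO_LINE.match(line)
    return match.groupdict() if match else None


def matches_search_rule(record):
    """Approximate the rule `$log_data['logger'] ^(search)$`.

    Assumes the `log` field contains a JSON object (hypothetical payload).
    """
    try:
        log_data = json.loads(record["log"])
    except (ValueError, TypeError):
        return False
    if not isinstance(log_data, dict):
        return False
    return re.search(r"^(search)$", str(log_data.get("logger", ""))) is not None


# Hypothetical sample line in the CRI log format.
sample = '2024-03-14T08:40:51.627000000Z stdout F {"logger": "search", "msg": "hello"}'
record = parse_crio(sample)
print(record["stream"], record["logtag"], matches_search_rule(record))
```

A line that does not start with a timestamp followed by `stdout` or `stderr` returns `None`, which corresponds to a record the parser would leave unparsed.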

Fluent Bit Log Output

{
    "@timestamp": "2024-03-14 08:40:51.627",
    "@message": {
        "log": "[2024/03/14 08:40:51] [error] [plugins/in_tail/tail_fs_inotify.c:147 errno=2] No such file or directory"
    },
    "@logStream": "from-fluent-bit-**********",
    "@log": "-**********"
},
{
    "@timestamp": "2024-03-14 08:40:51.627",
    "@message": {
        "log": "[2024/03/14 08:40:51] [error] [input:tail:tail.0] inode=1836081 cannot register file /var/log/containers/-**********.log"
    },
    "@logStream": "from-fluent-bit--**********",
    "@log": "-**********"
},

Fluent Bit Version Info

Fluent Bit v1.9.10 (the Fluent Bit built into EKS on Fargate)
EKS version 1.27, 1.28, 1.29

Cluster Details

The VPC allows unrestricted outbound traffic; inbound traffic is restricted to specific IP addresses and security groups.

App Mesh is in use
EKS runs on Fargate
Fluent Bit is the one built into Fargate

Application Details

Beyond about 3000 log records per second, logs can no longer be collected.
The log volume is roughly 6 MB per second.

Related Issues

Not sure whether this issue is related to the EKS AMI update involving a NOFILE limit of 1024:
awslabs/amazon-eks-ami#1535
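
To help confirm or rule out the NOFILE theory, the open-file limit a process actually runs with can be read directly. Below is a minimal Python sketch using the standard `resource` module (Unix only). It reports the limits of the process that runs it, so it would need to run in the same environment as the affected workload, which may not be possible for the Fargate-managed Fluent Bit. The helper names are hypothetical.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"NOFILE soft={describe(soft)} hard={describe(hard)}")
```

A soft limit of 1024 here would be consistent with the AMI change mentioned above, but would not by itself prove that it causes the "cannot register file" error.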

@nooperpudd

Same issue:

[2024/04/07 13:14:08] [error] [plugins/in_tail/tail_fs_inotify.c:147 errno=2] No such file or directory
[2024/04/07 13:14:08] [error] [input:tail:tail.1] inode=20972144 cannot register file /var/log/pods/amazon-cloudwatch_fluent-bit-4fb7m_256061af-86f8-48eb-b45e-d3a5d2190006/fluent-bit/0.log (deleted)

EKS version: 1.29
Plugin version: amazon-cloudwatch-observability v1.4.0-eksbuild.1
Fluent Bit: aws-for-fluent-bit:2.32.0.20240304


chuanAlloy commented Jul 17, 2024

Same issue:

[2024/07/17 14:03:56] [error] [plugins/in_tail/tail_fs_inotify.c:147 errno=2] No such file or directory
[2024/07/17 14:03:56] [error] [input:tail:tail.0] inode=76569335 cannot register file /var/log/pods/devops-ops_fluentbit-devops-ops-aws-for-fluent-bit-t589r_fbaef6e3-eb19-4599-a11a-cf82da3e9be7/aws-for-fluent-bit/0.log (deleted)

EKS: 1.24
Fluentbit: public.ecr.aws/aws-observability/aws-for-fluent-bit:2.32.2.20240516

Contributor

joebowbeer commented Jul 19, 2024

Source code for this "No such file or directory" error:

https://github.com/fluent/fluent-bit/blob/574a69af744535b6e016965f02eef9f739a5df1e/plugins/in_tail/tail_fs_inotify.c#L147

NOTE that the in_tail plugin code included with aws-for-fluent-bit (fluentbit v1.9) is 2 or 3 years old.

The in_tail code in fluentbit v2 and v3 has seen a lot of changes, but even so it may not be issue free:

fluent/fluent-bit#2110
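
Two of the reports above show the path with a "(deleted)" suffix together with errno=2 (ENOENT). On Linux, the kernel appends that suffix to the target of a /proc/<pid>/fd/<n> link once the file has been unlinked. One plausible reading (an assumption, not confirmed in this thread) is that the log file was rotated or removed while it was still open, and a later path-based call then failed. Here is a minimal Python sketch of that general OS behaviour. It does not use inotify and is unrelated to Fluent Bit's actual code; the function name is hypothetical.

```python
import errno
import os
import tempfile


def deleted_file_demo():
    """Unlink a file that is still open, then try to reach it by path again.

    Returns (errno from the path lookup, name reported under /proc or None).
    """
    fd, path = tempfile.mkstemp(suffix=".log")
    try:
        os.unlink(path)  # simulate rotation/cleanup while the fd stays open

        # The already-open descriptor keeps working after the unlink.
        os.write(fd, b"still writable\n")

        # Any call that goes through the path now fails with ENOENT (errno 2).
        try:
            os.stat(path)
            code = 0
        except OSError as exc:
            code = exc.errno

        # On Linux, /proc/self/fd/<fd> shows the old name plus " (deleted)".
        proc_link = f"/proc/self/fd/{fd}"
        name = os.readlink(proc_link) if os.path.islink(proc_link) else None
        return code, name
    finally:
        os.close(fd)


if __name__ == "__main__":
    code, name = deleted_file_demo()
    print(errno.errorcode.get(code), name)
```

On systems without /proc the second value is `None`; the ENOENT result from the path lookup is the same either way.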
