
Loki cannot query logs from before the upgrade from 2.3 -> 2.7 #8738

Open · mohit94ag opened this issue Mar 7, 2023 · 5 comments

mohit94ag commented Mar 7, 2023

Describe the bug
We recently upgraded Loki from version 2.3 to 2.7. The upgrade was successful and we can query logs created after the upgrade, but anything from before it is not returned. There is no error in the logs or the dashboard either, which makes it hard to debug what the issue might be.

Expected behavior
Being able to query logs from before the upgrade.

Environment:

  • Infrastructure: EKS v1.24
  • Deployment tool: Helm

Loki config

    # This tells loki to require the X-Scope-OrgID header when receiving log events.
    # promtail sets this to neeva-loki, with the net effect that this names the S3 directory
    # that holds all the logs.
    auth_enabled: false
    chunk_store_config:
      max_look_back_period: 0s
    ingester:
      chunk_block_size: 262144
      chunk_idle_period: 3m
      chunk_retain_period: 1m
      lifecycler:
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1
    limits_config:
      enforce_metric_name: false
    schema_config:
      configs:
      - from: 2021-11-05
        store: boltdb-shipper
        object_store: s3
        schema: v11
        index:
          prefix: loki_index_
          period: 24h
    storage_config:
      aws:
        s3: s3://us-west-2/neeva-shared-us-west-2
      boltdb_shipper:
        active_index_directory: /data/index
        shared_store: s3
        shared_store_key_prefix: neeva-loki/index/
        cache_location: /data/boltdb-cache
    compactor:
      working_directory: /data/compactor
      shared_store: s3
    server:
      http_listen_port: 3100

DylanGuedes (Contributor) commented

Is the configuration file exactly the same? If so, maybe it's a subtle change to some default that you're currently relying on. Out of curiosity, if you revert to 2.3, can you query old logs again? And if you can, do you mind grabbing a copy of the localhost:3100/config output and comparing it with the post-2.7 output?
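
For reference, a minimal way to capture and compare those dumps (a sketch; it assumes you can reach the Loki HTTP port, e.g. via a kubectl port-forward to 3100):

    # While still on the old version:
    curl -s http://localhost:3100/config > config-old.yaml
    # After upgrading to 2.7:
    curl -s http://localhost:3100/config > config-new.yaml
    # Show every difference, including changed defaults:
    diff -u config-old.yaml config-new.yaml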

mohit94ag (Author) commented Mar 7, 2023

There were two changes I made to the config when upgrading:

  1. Removing the following from limits_config:
    reject_old_samples: true
    reject_old_samples_max_age: 168h
  2. Setting auth_enabled from true to false.

Other than these two, everything else is the same.

Also, I went from 2.3 to 2.7, and on reverting to 2.3 I can query the older logs from before the upgrade, but not the ones written after it. Do you still need the config? I did the revert some time back and would prefer not to disturb the cluster to avoid missing any logs, but I can do so if necessary.
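
Worth spelling out, given the comment at the top of the config above (this is an inference, not something confirmed in this thread): auth_enabled also determines the tenant ID under which Loki reads and writes, and the tenant ID names the storage directory. A sketch of the two variants side by side:

    # Pre-upgrade: multi-tenant mode. The X-Scope-OrgID header that
    # promtail sends (neeva-loki) names the directory holding the data.
    auth_enabled: true
    limits_config:
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 168h

    # Post-upgrade: single-tenant mode. Loki reads and writes under the
    # built-in tenant ID "fake", a different directory than neeva-loki.
    auth_enabled: false
    limits_config:
      enforce_metric_name: false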

DylanGuedes (Contributor) commented

Sorry, nothing caught my attention enough to be considered the culprit. My hypothesis leaned more towards some change to the schema, but it looks like you didn't change anything there 😢

bluesky6529 (Contributor) commented

Going from 2.3 -> 2.7, maybe the schema needs to be updated?

    schema_config:
      configs:
      - from: 2021-11-05
        store: boltdb-shipper
        object_store: s3
        schema: v11
        index:
          prefix: loki_index_
          period: 24h
      - from: 2023-03-08 ### update date
        store: boltdb-shipper
        object_store: s3
        schema: v12 #### upgrade
        index:
          prefix: loki_index_
          period: 24h
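
One note when adding the new entry, per the schema docs linked below: give it a from date in the future (the date is interpreted as 00:00 UTC), so that every component switches to the new schema at the same moment and queries spanning the boundary keep working against both periods.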

https://grafana.com/docs/loki/latest/operations/storage/schema/

j-karls commented Nov 14, 2023

@bluesky6529 wrote: "Going from 2.3 -> 2.7, maybe the schema needs to be updated?"

Yes, but that alone isn't quite enough. The full answer can be found in issues 7827 and 1111. I only figured it out after a good day of frustration and zero documentation, during which I kept coming across this thread. I'm upgrading from just 2.6.1 to 2.7.5 using the SingleBinary mode, so that upgrade path should be easy, right? No.

The problem is caused by the migration to the new Grafana Labs-controlled Helm chart:

  • The default value for schema_config has changed, as you mention. You need to make sure your new setup matches the old one (where you probably didn't define any schema_config yourself); you can think about transitioning to a new schema afterwards.
  • The mount point of your "storage" persistent volume has changed from /data to /var/loki. So the boltdb and chunks directories from your old Loki install now end up in /var/loki/loki, whereas before they were in /data/loki. The Helm chart template (version 4.10.0, which I use) hardcodes this, so I can't even change it back to something sane. However, you can set filesystem.chunks_directory: /var/loki/loki/chunks/ (see the sketch after this list).
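
A minimal sketch of how those two fixes might look in the chart's values.yaml (the key names are assumptions based on the grafana/loki chart's layout and aren't verified against 4.10.0; the schema values are placeholders, so copy whatever your old /config dump shows):

    loki:
      # Pin the schema to what the pre-migration install actually used,
      # instead of inheriting the chart's new default.
      schemaConfig:
        configs:
          - from: "2021-11-05"   # placeholder: your old schema's start date
            store: boltdb-shipper
            object_store: filesystem
            schema: v11
            index:
              prefix: loki_index_
              period: 24h
      storage:
        type: filesystem
        filesystem:
          # The migration moved the old data under /var/loki/loki, so
          # point the chunks directory at the old location.
          chunks_directory: /var/loki/loki/chunks/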

Hope this helps someone.

Rant: It's honestly the same thing every time. These migration paths seem barely tested and meet only the barest definition of a minimum viable product. You have to guess at half the values you should be setting to port your setup, and then some values are inexplicably changed with no supported way of changing them back. If this is what we can expect now that the Loki team has taken responsibility for the Helm chart, then I really wish they hadn't.
