featureflagservice database migrations fail in kubernetes #398

Closed
joshleecreates opened this issue Oct 3, 2022 · 1 comment · Fixed by #402
Labels
bug Something isn't working

Comments

@joshleecreates
Contributor

Bug Report

Which version of the demo are you using?
b6e75ee

Symptom

When using the Helm charts to deploy to k8s, the featureflagservice fails to run its migrations because it does not wait for the Postgres pod to be ready.

What is the expected behavior?

I expect the Ecto database migrations to run automatically, as they do when using Docker Compose.

What is the actual behavior?

Instead, the featureflagservice starts with the following errors:

02:51:47.123 [error] Could not create schema migrations table. This error usually happens due to the following:

  * The database does not exist
  * The "schema_migrations" table, which Ecto uses for managing
    migrations, was defined by another library
  * There is a deadlock while migrating (such as using concurrent
    indexes with a migration_lock)

To fix the first issue, run "mix ecto.create".

To address the second, you can run "mix ecto.drop" followed by
"mix ecto.create". Alternatively you may configure Ecto to use
another table and/or repository for managing migrations:

    config :featureflagservice, Featureflagservice.Repo,
      migration_source: "some_other_table_for_schema_migrations",
      migration_repo: AnotherRepoForSchemaMigrations

The full error report is shown below.

** (DBConnection.ConnectionError) connection not available and request was dropped from queue after 2971ms. This means requests are coming in and your connection pool cannot serve them fast enough. You can address this by:

  1. Ensuring your database is available and that you can connect to it
  2. Tracking down slow queries and making sure they are running fast enough
  3. Increasing the pool_size (although this increases resource consumption)
  4. Allowing requests to wait longer by increasing :queue_target and :queue_interval

See DBConnection.start_link/2 for more information

Reproduce

Deploy the latest helm charts and check the logs for featureflagservice.

Additional context

It seems like a fix would be to modify the featureflagservice (ffs) Docker image to wait for connectivity on the database port before running the release.
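
To make the "wait for connectivity" idea concrete, here is a minimal sketch of a TCP readiness check. It is written in Python purely for illustration and is not taken from the featureflagservice image. The function name `wait_for_port` and its parameters are hypothetical. Note that a successful connection only shows the port is accepting connections, not that the database has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; the real fix may look different.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection; success means
            # something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, timed out, or host not resolvable yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An entrypoint could call something like `wait_for_port("postgres", 5432)` before starting the release and exit with a non-zero status when it returns `False`. The host name and port here are assumptions about the deployment, not values from the charts.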

joshleecreates added the bug label Oct 3, 2022
@puckpuck
Contributor

puckpuck commented Oct 5, 2022

I wonder if we can have the entire process fail on error, and leverage container restart policies in docker/kubernetes instead. @tsloughter is this something we can have Elixir do?
