Fully integrate airflow to TDP #105

Open
PACordonnier opened this issue Mar 2, 2023 · 0 comments · May be fixed by #117

PACordonnier commented Mar 2, 2023

Apache Airflow is already present in tdp-collection-extras but in a basic state.

It needs more configuration to work with a TDP cluster. The role also needs a refactor (see #59).

Here are the requirements I think are necessary to consider Airflow integrated:

  • Airflow version should be at least 2.3.X. Airflow 2.4 dropped compatibility with Python 3.6 (is that really an issue?)
  • A secure webserver needs to be installed (with authentication and SSL)
  • One scheduler should be installed
  • At least one worker needs to be installed
    • Workers rely on Celery task queues, which require a message transport backend (RabbitMQ and Redis are popular choices)
  • Airflow should connect to a PostgreSQL or MySQL database for production use
  • Airflow needs to be able to use Hive and Spark in its workflow (configure Spark and Hive providers)
  • An HDFS provider would be nice to have, but the existing one seems very out of date
  • Kerberos and proxy users need to be configured
  • Airflow daemons need to be installed as systemd services
  • Enable multi-tenancy if possible
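
To make the list above more concrete, here is a rough sketch of how several of these requirements could map onto Airflow settings, expressed as `AIRFLOW__{SECTION}__{KEY}` environment variables. It is only an illustration, not the role's actual implementation: the function name `airflow_env` and every host, path, and principal in the usage note are hypothetical placeholders, and it assumes Airflow 2.3 or later, where `sql_alchemy_conn` lives in the `[database]` section.

```python
def airflow_env(db_url, broker_url, result_backend,
                ssl_cert, ssl_key, principal, keytab):
    """Build Airflow environment variables for a Celery-based, Kerberized setup.

    Hypothetical helper: all argument values are supplied by the caller
    and nothing here is taken from the TDP role itself.
    """
    settings = {
        # Workers are distributed through Celery task queues.
        ("core", "executor"): "CeleryExecutor",
        # Enable Kerberos support in Airflow.
        ("core", "security"): "kerberos",
        # Metadata database (PostgreSQL or MySQL for production use).
        ("database", "sql_alchemy_conn"): db_url,
        # Message transport backend (for example Redis or RabbitMQ).
        ("celery", "broker_url"): broker_url,
        ("celery", "result_backend"): result_backend,
        # SSL for the webserver.
        ("webserver", "web_server_ssl_cert"): ssl_cert,
        ("webserver", "web_server_ssl_key"): ssl_key,
        # Kerberos identity used for ticket renewal.
        ("kerberos", "principal"): principal,
        ("kerberos", "keytab"): keytab,
    }
    # Airflow reads any option from AIRFLOW__<SECTION>__<KEY>.
    return {
        f"AIRFLOW__{section.upper()}__{key.upper()}": value
        for (section, key), value in settings.items()
    }
```

For example, `airflow_env("postgresql+psycopg2://airflow:secret@db.example/airflow", "redis://redis.example:6379/0", "db+postgresql://airflow:secret@db.example/airflow", "/etc/ssl/airflow.pem", "/etc/ssl/airflow.key", "airflow/host.example@REALM.EXAMPLE", "/etc/security/keytabs/airflow.keytab")` returns a dictionary that could be written to an environment file read by each systemd unit. Webserver authentication, proxy users, and the Hive and Spark providers are not covered by this sketch.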
@PACordonnier PACordonnier self-assigned this Mar 2, 2023
@gonzaloetjo gonzaloetjo linked a pull request May 5, 2023 that will close this issue
2 tasks