
Trino Operator doesn't pull core-site.xml into the catalog config when applying TrinoCatalog of type Hive #525

Closed
therealslimjp opened this issue Jan 22, 2024 · 4 comments

@therealslimjp

Affected Stackable version

23.11.0

Affected Trino version

428

Current and expected behavior

I am defining a TrinoCatalog of type hive and expect to be able to create tables via Trino with CREATE TABLE (...) LOCATION (hdfs://...). This does not work; it only works if I reference the specific, currently active HDFS namenode pod in the location field.
I would expect to be able to create tables with just hdfs:// and without specifying the pod, because I reference both the hdfs-site.xml and the core-site.xml in the TrinoCatalog resource.
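For illustration, the catalog resource looks roughly like this (resource names are placeholders and only the parts relevant here are shown; the field layout is my reading of the Stackable docs, not copied verbatim from my setup):

apiVersion: trino.stackable.tech/v1alpha1
kind: TrinoCatalog
metadata:
  name: hive                        # placeholder catalog name
  labels:
    trino: simple-trino             # placeholder TrinoCluster label
spec:
  connector:
    hive:
      metastore:
        configMap: hive-metastore   # discovery ConfigMap of the HiveCluster
      hdfs:
        configMap: hdfs             # discovery ConfigMap of the HdfsCluster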

Possible solution

Reference the core-site.xml in the catalog config properties file, similarly to the hdfs-site.xml.

Additional context

I have seen that inside the catalog config directory of the Trino coordinator both files are mounted: the hdfs-site.xml and the core-site.xml, each with the correct values. Even though both exist, only the hdfs-site.xml is referenced in the catalog config properties file. I think this is not the desired behavior and the core-site.xml should be referenced there too, so Trino is able to resolve the HDFS namenode.

For example, this is the properties file I am referring to:

cat /stackable/config/catalog/<trino-catalog-name>.properties
connector.name=hive
hive.config.resources=/stackable/config/catalog/<trino-catalog-name>/hdfs-config/hdfs-site.xml
hive.metastore.uri=thrift\://hive-metastore-default-0.hive-metastore-default.<my_namespace>.svc.cluster.local\:9083
hive.s3.aws-access-key=${ENV\:<SecretName>}
hive.s3.aws-secret-key=${ENV\:<SecretName>}
hive.s3.endpoint=http\://minio-druid.default.svc.cluster.local\:9000
hive.s3.path-style-access=true
hive.s3.ssl.enabled=false
hive.security=allow-all
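The change I have in mind would simply extend hive.config.resources to include both files, e.g. (a sketch, assuming the core-site.xml is mounted next to the hdfs-site.xml; Trino accepts a comma-separated list for this property):

hive.config.resources=/stackable/config/catalog/<trino-catalog-name>/hdfs-config/hdfs-site.xml,/stackable/config/catalog/<trino-catalog-name>/hdfs-config/core-site.xml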

Environment

No response

Would you like to work on fixing this bug?

None

@sbernauer
Member

Hi @therealslimjp, thanks for raising this!
I wonder why this is not working for you, as it normally should. E.g. in the integration test we add the HDFS cluster (named hdfs) to the catalog and are able to call HDFS by the HA name (same name as the HdfsCluster):
run_query(connection, "CREATE SCHEMA IF NOT EXISTS hive.hdfs WITH (location = 'hdfs://hdfs/trino/')")
rows_written = run_query(connection, "CREATE TABLE IF NOT EXISTS hive.hdfs.taxi_data_copy AS SELECT * FROM hive.minio.taxi_data")[0][0]
assert rows_written == 5000 or rows_written == 0
assert run_query(connection, "SELECT COUNT(*) FROM hive.hdfs.taxi_data_copy")[0][0] == 5000

What is the exact error message you get? As mentioned in the docs, you also need to give the Hive metastore access to HDFS; maybe this is missing?
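For reference, giving the metastore access to HDFS means pointing the HiveCluster at the HDFS discovery ConfigMap, roughly like this (names are placeholders and only the relevant part of the spec is shown, as a sketch rather than a complete resource):

apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: hive-metastore        # placeholder name
spec:
  clusterConfig:
    hdfs:
      configMap: hdfs         # discovery ConfigMap of the HdfsCluster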

Regardless, adding the core-site.xml is a good thing which we already planned to do anyway (it is probably needed for Kerberos); I opened #526 for that.

@therealslimjp
Author

I see, in your integration test example you add the nameservice value ('hdfs') to the path. This actually works for us too, so we don't have to reference the exact pod anymore, which is good, thank you. Nevertheless, if I am not mistaken, adding the core-site.xml would make setting the nameservice inside the path unnecessary, because Trino would pick it up through the fs.defaultFS field in the core-site.xml, right?
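To spell out what I mean: as far as I understand, the core-site.xml from the discovery ConfigMap would contain something like the following (nameservice name taken from your example), which should be enough for Trino to resolve hdfs:// paths without an explicit authority:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hdfs</value>
</property>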

@sbernauer
Member

I have to admit I had not thought about this, but yes, that sounds right! We write the defaultFS into the HDFS discovery ConfigMap.

As we merged #526 today, the current nightly trino-operator version should now pull in the core-site.xml and remove the need for the nameservice in the path (although I have not tested this explicitly).

@therealslimjp
Author

Alright, thanks for the effort. I am going to try it out. I am closing this issue for now.
