Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SEDONA-243] R features: read/write geoparquet, get names from shapefiles #770

Merged
merged 5 commits into from
Feb 17, 2023

Conversation

gregleleu
Copy link
Contributor

Updates to the R package:

  • reading/writing geoparquet files
  • getting names to sdf from cases where "fieldNames" exists (shapefile)

How was this patch tested?

Added/updated the relevant R tests

Did this PR include necessary documentation updates?

Updated the R documentation only

R/R/data_interface.R Show resolved Hide resolved
R/R/data_interface.R Show resolved Hide resolved
@jiayuasu
Copy link
Member

jiayuasu commented Feb 16, 2023

@gregleleu Each Sedona PR needs to be associated with a JIRA ticket. I can create a Sedona JIRA account for you. This way, you can easily contribute to Sedona R in the future. To do so, I just need your email address. You can send your email address to me via jiayu@apache.org. Thanks!

@gregleleu
Copy link
Contributor Author

gregleleu commented Feb 16, 2023

I created an account, and wanted to create an issue, but JIRA is so complicated I gave up. I'll try again next time.

Regarding the "generic" function, it's not really the R way, cf. sparklyr's documentation (https://spark.rstudio.com/packages/sparklyr/latest/reference/)

Any idea why the checks failed? I'm using the same test as the scala version – an exception with the words "GeoParquet file does not contain valid geo metadata" – but it raised another kind of exception for spark 3.0.3.
Haven't done java in a long time, but any chance that Spark 3.0.3 doesn't handle the AnalysisException the same way? It wasn't SparkException back then...

@jiayuasu
Copy link
Member

jiayuasu commented Feb 16, 2023

@gregleleu

Spark Analysis exception has been there for a while: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala

Not sure why it failed. In fact, we have the same Scala tests on Spark 3.0.3 and it can successfully capture the exception. @Kontinuation Hi Kristin, any idea why this AnalysisException was not found in Spark 3.0.3?

@gregleleu In addition, Sedona JIRA does not allow public signup (enforced by ASF). If you didn't have an account before, I have to create one manually for you.

@Kontinuation
Copy link
Member

@gregleleu

Spark Analysis exception has been there for a while: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala

Not sure why it failed. In fact, we have the same Scala tests on Spark 3.0.3 and it can successfully capture the exception. @Kontinuation Hi Kristin, any idea why this AnalysisException was not found in Spark 3.0.3?

AnalysisException has some ABI-breaking changes after 3.0, so the code compiled with spark 3.3 could break on spark 3.0. Can we simply replace them with IllegalArgumentException?

@jiayuasu
Copy link
Member

@Kontinuation Yes, let's change it to IllegalArgumentException. Can you make a PR to fix it?

@jiayuasu jiayuasu changed the title [SEDONA-XXX] R features: read/write geoparquet, get names from shapefiles [SEDONA-243] R features: read/write geoparquet, get names from shapefiles Feb 16, 2023
@gregleleu
Copy link
Contributor Author

Do we keep the informative message ("GeoParquet file does not contain valid geo metadata") if we change the class of the exception?

@Kontinuation
Copy link
Member

Do we keep the informative message ("GeoParquet file does not contain valid geo metadata") if we change the class of the exception?

Yes, the message will be kept.

@Kontinuation
Copy link
Member

@Kontinuation Yes, let's change it to IllegalArgumentException. Can you make a PR to fix it?

Submitted #771

@jiayuasu
Copy link
Member

@gregleleu #771 has been merged. Please pull from the upstream. This will trigger GitHub actions of this PR.

sedona_write_geoparquet instead of sedona_save_geoparquet fro coeherence
cleanup
Fix in documentation
Using a Sparklyr exported function instead of using ":::"
@gregleleu
Copy link
Contributor Author

@jiayuasu Made a few changes, notably changed the name of the write function to be coherent with the rest of the sparklyr ecosystem.

@jiayuasu jiayuasu merged commit b3ce901 into apache:master Feb 17, 2023
@gregleleu gregleleu deleted the r-features branch March 4, 2023 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants