[connectors] fix: Use csv document internalId for tableId, add csv in migration #6677

Merged
merged 2 commits into from
Aug 6, 2024

Contributor

@tdraier tdraier commented Aug 6, 2024

Description

Currently, for CSV files in Google/Microsoft, we use the Google/MS ID as the table_id. We should instead use our internalId to avoid conflicts.
This PR also adds Google/Microsoft CSV files to the parents migration.

Only one data source has CSV enabled (doctolib); it may need a full reindex, since the table IDs change for CSV files.
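
The ID change described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the `CsvFile` shape, the `makeInternalId` helper, and the prefixing scheme are all hypothetical, and the cross-provider collision is just one assumed example of the kind of conflict a raw provider ID could cause.

```typescript
// Hypothetical types for illustration only; not the connector's real types.
type Provider = "google_drive" | "microsoft";

interface CsvFile {
  provider: Provider;
  providerId: string; // ID assigned by Google Drive or Microsoft
}

// Assumed scheme: namespace the provider ID so it is unique on our side.
function makeInternalId(file: CsvFile): string {
  return `${file.provider}-${file.providerId}`;
}

// Before: the raw provider ID was used directly as the table ID.
function legacyTableId(file: CsvFile): string {
  return file.providerId;
}

// After: the table ID is the document's internal ID.
function tableIdForCsv(file: CsvFile): string {
  return makeInternalId(file);
}

const googleCsv: CsvFile = { provider: "google_drive", providerId: "abc123" };
const msCsv: CsvFile = { provider: "microsoft", providerId: "abc123" };

// Raw provider IDs can collide; namespaced internal IDs do not.
console.log(legacyTableId(googleCsv) === legacyTableId(msCsv));
console.log(tableIdForCsv(googleCsv) === tableIdForCsv(msCsv));
```

Because existing tables were keyed by the old IDs, switching the derivation changes the ID of every CSV table, which is why a reindex may be needed.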

Risk

Deploy Plan

Deploy connectors
Reindex datasources with csv

Contributor

@JulesBelveze JulesBelveze left a comment

I'm not sure I understand why we retrieve both worksheet and ms-excel in microsoftTables, while in GDrive we only retrieve text/csv and don't include application/vnd.google-apps.spreadsheet?

@tdraier
Contributor Author

tdraier commented Aug 6, 2024

> I'm not sure I understand why we retrieve both worksheet and ms-excel in microsoftTables, while in GDrive we only retrieve text/csv and don't include application/vnd.google-apps.spreadsheet?

For Microsoft, we have:

  • worksheets
  • CSV files, which are application/vnd.ms-excel (I may also need to catch text/csv)

All of these are in the same table, microsoft_nodes.

For Google:

  • sheets
  • CSV files (text/csv)

Files which are application/vnd.google-apps.spreadsheet are actually parsed as Google Sheets (the equivalent of application/vnd.openxmlformats-officedocument.spreadsheetml.sheet for Microsoft). Here, sheets are in a different table than files.
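
As a rough sketch of the MIME type mapping described above (the type names, constants, and `classifyNode` function are hypothetical and only illustrate the explanation; they are not taken from the connector code):

```typescript
// Hypothetical names for illustration; only the MIME types come from the discussion.
type DriveProvider = "google_drive" | "microsoft";
type TableKind = "sheet" | "csv" | "other";

// Spreadsheet MIME type per provider.
const SHEET_MIME_TYPES: Record<DriveProvider, string> = {
  google_drive: "application/vnd.google-apps.spreadsheet",
  microsoft:
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
};

// MIME types under which CSV files show up per provider.
// For Microsoft, text/csv may also need to be added here.
const CSV_MIME_TYPES: Record<DriveProvider, string[]> = {
  google_drive: ["text/csv"],
  microsoft: ["application/vnd.ms-excel"],
};

function classifyNode(provider: DriveProvider, mimeType: string): TableKind {
  if (mimeType === SHEET_MIME_TYPES[provider]) {
    return "sheet";
  }
  if (CSV_MIME_TYPES[provider].indexOf(mimeType) !== -1) {
    return "csv";
  }
  return "other";
}

console.log(classifyNode("microsoft", "application/vnd.ms-excel"));
console.log(classifyNode("google_drive", "text/csv"));
console.log(
  classifyNode("google_drive", "application/vnd.google-apps.spreadsheet")
);
```

Under this reading, the Microsoft query has to match two MIME types because both kinds of node live in microsoft_nodes, whereas the Google query only needs text/csv because sheets are stored separately from files.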

@JulesBelveze
Contributor

@tdraier Gotcha 👌🏼

@tdraier tdraier merged commit 5947825 into main Aug 6, 2024
3 checks passed
@tdraier tdraier deleted the fix/csv-id branch August 6, 2024 11:57
flvndvd pushed a commit that referenced this pull request Aug 8, 2024
albandum pushed a commit that referenced this pull request Aug 28, 2024