[Feature] Add Sharepoint datasoure Issue #19 #48

Closed
wants to merge 12 commits into from

Conversation

@akizito akizito commented Apr 25, 2024

Adds the nesis API implementation for the SharePoint datasource. This is the backend that will support adding SharePoint document library sources.

@akizito akizito linked an issue Apr 25, 2024 that may be closed by this pull request

gitguardian bot commented Apr 25, 2024

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.

@akizito akizito requested a review from mawandm April 25, 2024 04:45
    metadata: Dict[str, Any],
    cache_client: memcache.Client,
) -> None:
    global _sharepoint_context
@mawandm mawandm Apr 25, 2024

@akizito I think it'd be best to have this as a local variable rather than a global, and then pass it down as a parameter to the _sync_sharepoint_documents and _unsync_sharepoint_documents functions. This way fetch_documents becomes thread safe, in case we want to use it in a multi-threaded manner in the future.
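
A minimal sketch of that suggestion, not the PR's actual code: the signatures are simplified, and `_connect`, the `endpoint` key, and the `actions` list are hypothetical stand-ins used only to show the context being created locally and passed down.

```python
from typing import Any, Dict, List


def _connect(connection: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for building the SharePoint client context.
    return {"site_url": connection["endpoint"]}


def _sync_sharepoint_documents(sp_context: Dict[str, Any], actions: List[str]) -> None:
    # The context arrives as an argument; nothing is read from module state.
    actions.append(f"sync:{sp_context['site_url']}")


def _unsync_sharepoint_documents(sp_context: Dict[str, Any], actions: List[str]) -> None:
    actions.append(f"unsync:{sp_context['site_url']}")


def fetch_documents(connection: Dict[str, Any]) -> List[str]:
    # Local variable: each call (and so each thread) gets its own context.
    sp_context = _connect(connection)
    actions: List[str] = []
    _sync_sharepoint_documents(sp_context, actions)
    _unsync_sharepoint_documents(sp_context, actions)
    return actions
```

Because no call writes to shared module state, two concurrent calls with different connections cannot overwrite each other's context.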


    _LOG.info(f"Completed syncing to endpoint {rag_endpoint}")


def _process_file(file, connection, rag_endpoint, http_client, metadata, cache_client):
    self_link = file.serverRelativeUrl

@akizito the self_link needs to be the full URL, to guarantee its uniqueness. It looks like EncodedAbsUrl may have the answer.
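
If EncodedAbsUrl turns out not to be available, one possible fallback is to build the absolute URL from the site URL and the server-relative path. This is only a sketch under that assumption: `build_self_link` is a hypothetical helper, and the host in the usage note is a placeholder.

```python
from urllib.parse import quote, urlsplit


def build_self_link(site_url: str, server_relative_url: str) -> str:
    """Combine the site's scheme and host with a server-relative path."""
    parts = urlsplit(site_url)
    # Percent-encode spaces and other unsafe characters, keeping "/" separators.
    return f"{parts.scheme}://{parts.netloc}{quote(server_relative_url)}"
```

For example, with a site URL of `https://example.sharepoint.com/sites/nesit-test`, the path `/sites/nesit-test/Shared Documents/some_file.pdf` becomes `https://example.sharepoint.com/sites/nesit-test/Shared%20Documents/some_file.pdf`, which stays unique across sites and tenants.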

"self_link": r"/sites/nesit-test/Shared Documents/some_file.pdf",
},
)

@mawandm mawandm Apr 25, 2024

The test above covers the sync (create) document process, so we just need another test for the unsync process and another for the sync (update) process. See the test in nesis/api/tests/core/document_loaders/test_s3.py

client_id = connection.get("client_id")
tenant = connection.get("tenant")
thumbprint = connection.get("thumbprint")
cert_path = connection.get("certificate_path")

@akizito for this, I think we should store the certificate contents instead of the path. To use them, we would persist them to a temporary file and pass that file's path as the cert_path parameter. This way, the connection details can be stored in and retrieved from the database.
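
A rough sketch of the temporary-file approach, using only the standard library. `certificate_file` is a hypothetical helper and the `certificate` connection key mentioned below is an assumption, not something defined in this PR.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def certificate_file(certificate: str) -> Iterator[str]:
    """Write certificate contents to a temp file and yield its path."""
    # mkstemp creates the file readable and writable only by the current user.
    fd, path = tempfile.mkstemp(suffix=".pem")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(certificate)
        yield path
    finally:
        # Remove the file once the caller is done with it.
        os.remove(path)
```

It could be used as `with certificate_file(connection["certificate"]) as cert_path:` around the client setup, so the certificate only exists on disk for as long as the connection is being established.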

@akizito akizito marked this pull request as draft April 26, 2024 02:57
@akizito akizito force-pushed the 19-feature-add-sharepoint-datasoure branch from fbc6ac1 to 73b40ca Compare April 26, 2024 03:00
mawandm commented Apr 29, 2024

Fixed by #61

@mawandm mawandm closed this Apr 29, 2024
@akizito akizito deleted the 19-feature-add-sharepoint-datasoure branch April 29, 2024 22:28
Development

Successfully merging this pull request may close these issues.

[Feature] Add Sharepoint datasoure
3 participants