[Feature] Add SharePoint datasource Issue #19 #48
Conversation
✅ GitGuardian: there are no secrets present in this pull request anymore. If these secrets were true positives and are still valid, we highly recommend revoking them.
```python
    metadata: Dict[str, Any],
    cache_client: memcache.Client,
) -> None:
    global _sharepoint_context
```
@akizito I think it'd be best to have this as a local variable rather than a global, and then pass it down as a parameter to the `_sync_sharepoint_documents` and `_unsync_sharepoint_documents` functions. This way, `fetch_documents` becomes thread-safe if in the future we want to use it in a multi-threaded manner.
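A minimal sketch of the suggested refactor, using illustrative names rather than the PR's actual code: the SharePoint context is created locally inside `fetch_documents` and passed down explicitly, so no shared global state remains.

```python
def _sync_sharepoint_documents(sharepoint_context, rag_endpoint):
    # Receives the context as a parameter instead of reading a global.
    return f"synced {sharepoint_context['site']} to {rag_endpoint}"


def _unsync_sharepoint_documents(sharepoint_context, rag_endpoint):
    return f"unsynced {sharepoint_context['site']} from {rag_endpoint}"


def fetch_documents(connection, rag_endpoint):
    # Local variable: each call (and therefore each thread) gets its own
    # context, so concurrent calls cannot clobber each other's state.
    sharepoint_context = {"site": connection["site_url"]}
    sync_result = _sync_sharepoint_documents(sharepoint_context, rag_endpoint)
    unsync_result = _unsync_sharepoint_documents(sharepoint_context, rag_endpoint)
    return sync_result, unsync_result
```

Because nothing is shared between calls, two threads calling `fetch_documents` with different connections each operate on their own context.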
```python
    _LOG.info(f"Completed syncing to endpoint {rag_endpoint}")


def _process_file(file, connection, rag_endpoint, http_client, metadata, cache_client):
    self_link = file.serverRelativeUrl
```
@akizito the `self_link` needs to be the full URL. This is to guarantee its uniqueness. It looks like `EncodedAbsUrl` may have the answer.
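For illustration only: `EncodedAbsUrl` is the SharePoint list-item field that already carries the file's absolute, URL-encoded address, which is what makes `self_link` unique across sites. If only `serverRelativeUrl` were available, an equivalent absolute URL could be derived from the site root; the helper name below is an assumption, not code from this PR.

```python
from urllib.parse import quote


def build_self_link(site_root: str, server_relative_url: str) -> str:
    # Mimics what EncodedAbsUrl provides: an absolute, percent-encoded URL,
    # e.g. site_root="https://contoso.sharepoint.com" plus
    # server_relative_url="/sites/nesit-test/Shared Documents/some_file.pdf".
    # quote() leaves "/" intact but encodes spaces and other unsafe chars.
    return site_root.rstrip("/") + quote(server_relative_url)
```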
```python
        "self_link": r"/sites/nesit-test/Shared Documents/some_file.pdf",
    },
)
```
The test above tests the sync (create) document process, so we just need another test for the un_sync process and another for the sync (update). See the test nesis/api/tests/core/document_loaders/test_s3.py
```python
    client_id = connection.get("client_id")
    tenant = connection.get("tenant")
    thumbprint = connection.get("thumbprint")
    cert_path = connection.get("certificate_path")
```
@akizito for this, I think that we should have the certificate contents instead of the path. To use them, we would persist them into a temporary file and pass that file's path into the `cert_path` parameter. This way, the connection details can be stored in and retrieved from the database.
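A sketch of that approach, with an assumed helper name and connection key: the PEM certificate contents live in the connection record (and therefore in the database), and are materialized into a temporary file only when the SharePoint client requires a path.

```python
import os
import tempfile


def materialize_cert(connection: dict) -> str:
    # Certificate contents come from the database-backed connection record,
    # not from local disk. Caller is responsible for deleting the returned
    # file (os.unlink) once the SharePoint client no longer needs it.
    cert_contents = connection["certificate"]
    tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".pem", delete=False)
    try:
        tmp.write(cert_contents)
        tmp.close()
        return tmp.name  # pass this as the cert_path parameter
    except Exception:
        tmp.close()
        os.unlink(tmp.name)
        raise
```

`delete=False` is what lets the path outlive the `NamedTemporaryFile` handle so the SharePoint client can open it by name.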
Force-pushed fbc6ac1 to 73b40ca
Add deployment instructions for (1) Docker Compose, (2) Helm chart, (3) Ametnes Platform. Closes #25
Fixed by #61
Adding the Nesis API implementation for the SharePoint datasource. This is the backend that will support adding SharePoint document library sources.