Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial work to create a service for returning related documents for copilot purposes. #74906

Merged
merged 37 commits into from
Aug 28, 2024

Conversation

CyrusNajmabadi
Copy link
Member

No description provided.

@dotnet-issue-labeler dotnet-issue-labeler bot added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Aug 26, 2024
/// examples might be checking to see which symbols are used at that particular location and prioritizing documents
/// those symbols are defined in.
/// </summary>
ValueTask GetRelatedDocumentIdsAsync(Document document, int position, Func<DocumentId, CancellationToken, ValueTask> callbackAsync, CancellationToken cancellationToken);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switched to callback-style (Vs IAsyncEnumerable). it was just much easier to model, once i had all the OOP stuff hooked up.

@@ -13,38 +13,39 @@
<ItemGroup>
<ServiceHubService Include="Microsoft.VisualStudio.LanguageServices.AssetSynchronization" ClassName="Microsoft.CodeAnalysis.Remote.RemoteAssetSynchronizationService+Factory" />
<ServiceHubService Include="Microsoft.VisualStudio.LanguageServices.AsynchronousOperationListener" ClassName="Microsoft.CodeAnalysis.Remote.RemoteAsynchronousOperationListenerService+Factory" Audience="AllClientsIncludingGuests"/>
<ServiceHubService Include="Microsoft.VisualStudio.LanguageServices.CodeLensReferences" ClassName="Microsoft.CodeAnalysis.Remote.RemoteCodeLensReferencesService+Factory" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just sorted.

<ServiceHubService Include="Microsoft.VisualStudio.LanguageServices.ProjectTelemetry" ClassName="Microsoft.CodeAnalysis.Remote.RemoteProjectTelemetryService+Factory" />
<ServiceHubService Include="Microsoft.VisualStudio.LanguageServices.LegacySolutionEventsAggregation" ClassName="Microsoft.CodeAnalysis.Remote.RemoteLegacySolutionEventsAggregationService+Factory" />
<ServiceHubService Include="Microsoft.VisualStudio.LanguageServices.RelatedDocuments" ClassName="Microsoft.CodeAnalysis.Remote.RemoteRelatedDocumentsService+Factory" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is new.

[ExportLanguageService(typeof(IRelatedDocumentsService), LanguageNames.CSharp), Shared]
[method: ImportingConstructor]
[method: Obsolete(MefConstruction.ImportingConstructorMessage, error: true)]
internal sealed class CSharpRelatedDocumentsService() : AbstractRelatedDocumentsService<
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

language specific logic. for now, the lang specific logic is just iterating and finding which nodes to bind. we bind Name nodes (X, X.Y, X<Y>, etc.), and member access nodes with a left dotter name (like the X.Y in X.Y.Z()).


namespace Microsoft.CodeAnalysis.RelatedDocuments;

internal interface IRemoteRelatedDocumentsService
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remote side of the API. we want to defer to OOP to do things so we're not churning through tons of memory on the VS side, causing impactful GCs.

[ExportRemoteServiceCallbackDispatcher(typeof(IRemoteRelatedDocumentsService)), Shared]
[method: ImportingConstructor]
[method: Obsolete(MefConstruction.ImportingConstructorMessage, error: true)]
internal sealed class RelatedDocumentsServiceServerCallbackDispatcher() : RemoteServiceCallbackDispatcher, IRemoteRelatedDocumentsService.ICallback
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next two types are just OOP goo to allow an oop service to report bvack results as they are found.

RemoteServiceCallbackId callbackId,
CancellationToken cancellationToken)
{
return RunServiceAsync(solutionChecksum, async solution =>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops service is trivial, it just receives the inputs, gets the service on the oop side, and calls into it.

@CyrusNajmabadi
Copy link
Member Author

@dibarbet @genlu ptal. The lsp handler PR comes after this one.

@CyrusNajmabadi
Copy link
Member Author

NOte: as this is green, i'd like to roll any non-critical changes into: #74918

{
protected override async ValueTask GetRelatedDocumentIdsInCurrentProcessAsync(
Document document,
int position,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

position needs to be optional.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is. you can just pass 0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using -1 for this purpose. I think 0 is a valid position in file, so potentially we could return a different list of related files for "position 0" compare to "the entire file"

if (IsPossibleTypeName(memberAccess.Expression, out var nameToken))
{
// Something like `X.Y.Z` where `X.Y` is a type name. Bind X.Y
yield return (memberAccess.Expression, nameToken);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For X.Y.Z, memberAccess.Expression is X.Y, but the nameToken returned would be X, right? Shouldn't it be Y?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not exactly. we have X.Y.Z. But we pass X.Y Into IsPossibleTypeName, and if it succeeds, we return Y.

// results to whatever client is calling into us.
await ProducerConsumer<DocumentId>.RunParallelAsync(
// Order the nodes by the distance from the requested position.
IteratePotentialTypeNodes(root).OrderBy(t => t.expression.SpanStart - position),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.expression.SpanStart - position

Should this be the absolute value of t.expression.SpanStart - position ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup. That would make more sense!

Copy link
Member

@genlu genlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a solid starting point. I have a couple of question about the code, other than those LGTM. Feel free to address in the follow up PR

One concern is the perf/behavior on really large files. We might consider a different grouping strategy like

  1. (if a position is provided) find types referenced within enclosing member
  2. types referenced in other members' signatures
  3. types referenced inside other members

We could always try to return everything from first group in the first callback, so consumer of the API would always get the most relevant ones and have an opportunity to decide if they want to wait for more. Thoughts?

@CyrusNajmabadi
Copy link
Member Author

CyrusNajmabadi commented Aug 28, 2024

We could always try to return everything from first group in the first callback, so consumer of the API would always get the most relevant ones and have an opportunity to decide if they want to wait for more. Thoughts?

Totally. Though note that the design of this (including with LSP) is to be entirely streaming based. The service just keeps calling the callback as it finds results, and teh client just pulls as long as it wants. This works in LSP as well, where we take the results we are informed about, and we send them along as 'reports' to the client.

@CyrusNajmabadi CyrusNajmabadi merged commit a3dbd80 into dotnet:main Aug 28, 2024
28 checks passed
@CyrusNajmabadi CyrusNajmabadi deleted the relatedDocuments branch August 28, 2024 03:57
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone Aug 28, 2024
@CyrusNajmabadi
Copy link
Member Author

We might consider a different grouping strategy like ...

Yup. All those ideas are great. This was meant more as a starting position. The code that iterates the nodes can be greatly tweaked as we go forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead VSCode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants