Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Activity Ids #46890

Merged
merged 8 commits into from
Jan 21, 2021
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,8 @@ public partial class Activity : IDisposable
private static readonly IEnumerable<ActivityLink> s_emptyLinks = new ActivityLink[0];
private static readonly IEnumerable<ActivityEvent> s_emptyEvents = new ActivityEvent[0];
#pragma warning restore CA1825
internal static readonly Random s_random = new Random();
tarekgh marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tarekgh - sorry if my memory betrays me but this seems to me that will cause collisions of trace IDs when using the package on .NET Framework: the seed is the time and it has a resolution around 15ms and reaches zero in a bit less than 50 days. Anytime that we get the same seed we will be repeating the trace ids that right now have a very strong guarantee of uniqueness.

A thing that perhaps also changed was that it was problematic to use the same random object concurrently, has that changed?

In order to keep the same strong guarantee for trace IDs, I think we need to keep using NewGuid for them. That said the span IDs don't need the same strong guarantees and perhaps can leverage Random.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refreshing my own memory: I looked at guaranteeing uniqueness about 6 months ago in the context of SignalFx instrumentation, ptal at the top comment signalfx/signalfx-dotnet-tracing#64 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pjanotti, thanks for the feedback. Yes I agree the current solution is not enough. NewGuid is very slow too which we are trying to change use something else. I'll look more and see if I can find better way.

Copy link
Member Author

@tarekgh tarekgh Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pjanotti what you think of the idea using thread static for Random and create it with a different seed for different threads. (we can be obtain the seed from Guid.NewGuid().GetHashCode() as example). I can try that and measure it to see if it still be faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't thought it through, but a few initial thoughts:
a) we could make an internal version of Random that uses long instead of int for the seed, all the internal state, and the output type of the Next() function. This gets us to 63 bits of entropy (the implementation does Math.Abs() that loses 1 bit) and also doubles the throughput of the generator. The downside is maybe 100 mostly copied lines of code and double the amount of memory used by the per-thread Random.
b) Random uses a substractive generator that has an internal state that is 55x larger than what we are using for the seed. I don't have quick access to Knuth's book where he analyzes the algorithm but it appeared that if we put more entropy into that initialization then we get more entropy in the resulting random sequences.
c) If the default seeds are weak we could always use something like Guid.NewGuid() to generate a strong random seed. As long as we only call it once per thread 1us should be neglible.

Copy link
Member

@noahfalk noahfalk Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pjanotti what you think of the idea using thread static for Random

If the size of the seed remains 32 bits I think it would still cause trouble. I am assuming we have a bunch of VMs in a distributed system (~1000 machines), each of which is starting potentially many threads, and each of these threads starts at least one Activity. All of these Activities are being logged to a database and the transactions are stored for a few months before being deleted. If each machine during that time period created 100 threads (seems very likely for an that app restarts periodically) then there are 100K total seeds generated. This gives a 68% chance that two of the threads had the same seed. All of the Activities generated by those two threads would have colliding IDs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tarekgh, @noahfalk is correct: the issue is that a random generator that takes a larger seed is needed to avoid collisions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will try to look if I can generate a random numbers with 64-bit seeds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please have a look and let me know what you think about the last changes.


private static readonly ActivitySource s_defaultSource = new ActivitySource(string.Empty);

private const byte ActivityTraceFlagsIsSet = 0b_1_0000000; // Internal flag to indicate if flags have been set
Expand Down Expand Up @@ -1657,15 +1659,22 @@ public void CopyTo(Span<byte> destination)
}

/// <summary>
/// Sets the bytes in 'outBytes' to be random values. outBytes.Length must be less than or equal to 16
/// Sets the bytes in 'outBytes' to be random values. outBytes.Length must be either 8 or 16 bytes.
/// </summary>
/// <param name="outBytes"></param>
internal static unsafe void SetToRandomBytes(Span<byte> outBytes)
{
Debug.Assert(outBytes.Length <= sizeof(Guid)); // Guid is 16 bytes, and so is TraceId
Guid guid = Guid.NewGuid();
ReadOnlySpan<byte> guidBytes = new ReadOnlySpan<byte>(&guid, sizeof(Guid));
guidBytes.Slice(0, outBytes.Length).CopyTo(outBytes);
Debug.Assert(outBytes.Length == 16 || outBytes.Length == 8);
Random r = Activity.s_random;

Unsafe.WriteUnaligned(ref outBytes[0], r.Next());
Unsafe.WriteUnaligned(ref outBytes[4], r.Next());

if (outBytes.Length >= 16)
tarekgh marked this conversation as resolved.
Show resolved Hide resolved
{
Unsafe.WriteUnaligned(ref outBytes[8], r.Next());
Unsafe.WriteUnaligned(ref outBytes[12], r.Next());
}
}

/// <summary>
Expand Down