Update "String addition" performance tests to give a better perspective #10201

Closed
2 tasks done
santisq opened this issue Jun 23, 2023 · 2 comments · Fixed by #10202
Labels
area-sdk-docs Area - SDK docs issue-doc-idea Issue - request for new content

Comments

@santisq
Contributor

santisq commented Jun 23, 2023

Prerequisites

  • Existing Issue: Search the existing issues for this repository. If there is an issue that fits your needs do not file a new one. Subscribe, react, or comment on that issue instead.
  • Descriptive Title: Write the title for this issue as a short synopsis. If possible, provide context. For example, "Document new Get-Foo cmdlet" instead of "New cmdlet."

PowerShell Version

5.1, 7.2, 7.3, 7.4

Summary

I would like to suggest updating the String addition section of the Performance docs to show unified tests comparing StringBuilder, the addition assignment operator (+=), and the -join operator, similar to the ones used in #9997. This would give a better performance perspective and make the tests fairer.

The doc is currently offering measurements like:

$string = ''
Measure-Command {
    foreach ($i in 1..10000) {
        $string += "Iteration $i`n"
    }
    $string
} | Select-Object TotalMilliseconds

whereas a fairer test would be:

Measure-Command {&{
    $string = ''
    foreach ($i in 1..10000) {
        $string += "Iteration $i`n"
    }
    $string
}} | Select-Object TotalMilliseconds

It's worth noting that the current tests execute in the caller's scope, since Measure-Command dot-sources the script block. Performance tests should always be executed in their own scope, & { }, for fairness and potentially more accurate results.
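
The scoping difference described above can be observed directly. The following is a minimal sketch and not part of the original proposal; the variable names `$leakedByMeasure` and `$keptInChildScope` are made up for illustration.

```powershell
# Hypothetical variable names, chosen only for this illustration.
Remove-Variable -Name leakedByMeasure, keptInChildScope -ErrorAction Ignore

# Measure-Command runs the script block in the caller's scope,
# so this assignment is still visible after the command returns.
$null = Measure-Command { $leakedByMeasure = 'assigned inside Measure-Command' }

# Wrapping the body in & { } gives it a child scope; the variable
# is discarded when that scope ends.
$null = Measure-Command { & { $keptInChildScope = 'assigned inside & { }' } }

$leakedExists    = Test-Path -Path Variable:leakedByMeasure
$containedExists = Test-Path -Path Variable:keptInChildScope
```

If the behavior is as described, `$leakedExists` should be true and `$containedExists` should be false, which is why the variable lookups in the unwrapped test are not comparable to those in the wrapped one.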

Details

Proposed Performance Tests

$tests = @{
    'StringBuilder' = {
        $sb = [System.Text.StringBuilder]::new()
        foreach ($i in 0..$args[0]) {
            $sb = $sb.AppendLine("Iteration $i")
        }
        $sb.ToString()
    }
    'Join operator' = {
        $string = @(
            foreach ($i in 0..$args[0]) {
                "Iteration $i"
            }
        ) -join "`n"
        $string
    }
    'Addition Assignment +=' = {
        $string = ''
        foreach ($i in 0..$args[0]) {
            $string += "Iteration $i`n"
        }
        $string
    }
}

10kb, 50kb, 100kb | ForEach-Object {
    $groupResult = foreach ($test in $tests.GetEnumerator()) {
        $ms = (Measure-Command { & $test.Value $_ }).TotalMilliseconds

        [pscustomobject]@{
            Iterations        = $_
            Test              = $test.Key
            TotalMilliseconds = [math]::Round($ms, 2)
        }

        [GC]::Collect()
        [GC]::WaitForPendingFinalizers()
    }

    $groupResult = $groupResult | Sort-Object TotalMilliseconds
    $groupResult | Select-Object *, @{
        Name       = 'RelativeSpeed'
        Expression = {
            $relativeSpeed = $_.TotalMilliseconds / $groupResult[0].TotalMilliseconds
            [math]::Round($relativeSpeed, 2).ToString() + 'x'
        }
    }
}
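
One detail to keep in mind when reading the proposed tests: the three script blocks do not return byte-identical strings. `AppendLine` terminates every line with `[Environment]::NewLine`, `-join` adds no trailing separator, and the `+=` variant ends with a trailing newline character. The sketch below is an illustrative check only (the `$last` value and the normalization step are assumptions, not part of the proposal) that the outputs match once line endings are normalized.

```powershell
$last = 3  # small illustrative count; the proposed tests use 10kb, 50kb, 100kb

# StringBuilder: each AppendLine call adds [Environment]::NewLine.
$sb = [System.Text.StringBuilder]::new()
foreach ($i in 0..$last) { $null = $sb.AppendLine("Iteration $i") }
$fromBuilder = $sb.ToString()

# Join operator: separators only between items, no trailing newline.
$fromJoin = @(foreach ($i in 0..$last) { "Iteration $i" }) -join "`n"

# Addition assignment: every item carries its own trailing newline.
$fromAddition = ''
foreach ($i in 0..$last) { $fromAddition += "Iteration $i`n" }

# Normalize CRLF to LF and drop trailing newlines before comparing.
$normalized = $fromBuilder, $fromJoin, $fromAddition | ForEach-Object {
    ($_ -replace "`r`n", "`n").TrimEnd("`n")
}
$allEqual = @($normalized | Select-Object -Unique).Count -eq 1
```

The differences amount to a few characters per line, so they are unlikely to change the relative ranking in the measurements, but they matter if the outputs are compared for equality.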

Measurements Results

.NET - PowerShell 7.4.0-preview.3

Iterations Test                   TotalMilliseconds RelativeSpeed
---------- ----                   ----------------- -------------
     10240 Join operator                      72.79 1x
     10240 StringBuilder                      89.62 1.23x
     10240 Addition Assignment +=           1151.77 15.82x
     51200 StringBuilder                     105.06 1x
     51200 Join operator                     165.47 1.58x
     51200 Addition Assignment +=          31361.21 298.51x
    102400 StringBuilder                     140.13 1x
    102400 Join operator                     284.59 2.03x
    102400 Addition Assignment +=         124368.18 887.52x

.NET Framework - Windows PowerShell 5.1

Iterations Test                   TotalMilliseconds RelativeSpeed
---------- ----                   ----------------- -------------
     10240 StringBuilder                      38.59 1x
     10240 Join operator                     120.09 3.11x
     10240 Addition Assignment +=           2508.15 64.99x
     51200 StringBuilder                      62.07 1x
     51200 Join operator                     248.17 4x
     51200 Addition Assignment +=          72645.55 1170.38x
    102400 StringBuilder                     137.31 1x
    102400 Join operator                     514.65 3.75x
    102400 Addition Assignment +=         326358.47 2376.8x

Proposed Content Type

About Topic, Concept

Proposed Title

No response

Related Articles

@santisq santisq added issue-doc-idea Issue - request for new content needs-triage Waiting - Needs triage labels Jun 23, 2023
@sdwheeler sdwheeler added area-sdk-docs Area - SDK docs and removed needs-triage Waiting - Needs triage labels Jun 23, 2023
@sdwheeler sdwheeler self-assigned this Jun 23, 2023
michaeltlombardi added a commit that referenced this issue Jun 23, 2023
* Refactor performance tests

* Apply suggestions from review

---------

Co-authored-by: Mikey Lombardi (He/Him) <michael.t.lombardi@gmail.com>
@iRon7
Contributor

iRon7 commented Jun 24, 2023

@santisq,

feel free to chime in if you have any observation

Thanks for your update, I think it is a nice improvement of this section.

The only thing that keeps nagging at the back of my mind is that sometimes you are better off not concatenating strings at all and just piping them through (as is usually advised with objects).
In a pipeline like:

Get-Content .\in.txt | Foreach-Object { "some string processing on $_" } | Set-Content .\out.txt

Don't blindly use, for example, the string builder on the whole content just because "it is faster".
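
A self-contained version of that pipeline might look like the sketch below. The temporary file names and the sample input are assumptions made for illustration; the point is only that each line streams from reader to writer without the whole content being assembled into one string.

```powershell
# Hypothetical temp files used only for this illustration.
$tempDir = [System.IO.Path]::GetTempPath()
$inPath  = Join-Path $tempDir 'string-pipeline-in.txt'
$outPath = Join-Path $tempDir 'string-pipeline-out.txt'

# Create a small sample input file.
1..5 | ForEach-Object { "line $_" } | Set-Content -Path $inPath

# Stream: each line is read, transformed, and handed to Set-Content
# one at a time. No intermediate string is built up.
Get-Content -Path $inPath |
    ForEach-Object { "some string processing on $_" } |
    Set-Content -Path $outPath

$result = @(Get-Content -Path $outPath)

# Clean up the sample files.
Remove-Item -Path $inPath, $outPath
```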

But this comment probably comes down to a stricter general note (or warning):

Note

Many of the techniques described here aren't idiomatic PowerShell and may reduce the readability of a PowerShell script. Besides, as they often choke the PowerShell pipeline, they might consume considerably more memory. Script authors are advised to use idiomatic PowerShell unless performance dictates otherwise.

@santisq
Contributor Author

santisq commented Jun 24, 2023

@iRon7 I agree with that, but the doc section is making a recommendation about string addition, not about how to write output to a file. Also, the perf tests describe an extreme scenario: nobody would or should be adding to a string that many times, but in case they are, it's good to advise on which technique is better.
