Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NO MERGE: CAD-2907 ouroboros-consensus-shelley: computation stub module #3113

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

deepfire
Copy link
Contributor

@deepfire deepfire commented May 4, 2021

This provides a stub computation in ouroboros-consensus-shelley's applyLedgerBlock/reapplyLedgerBlock that is configurable for its run time.

Note that this requires calibration to be performed once, at system init time.

@deepfire deepfire added the WIP Do not merge, work in progress. label May 4, 2021
@deepfire deepfire self-assigned this May 4, 2021
@deepfire deepfire force-pushed the cad-2907-stub-compute branch 3 times, most recently from e74b9d3 to f13beab Compare May 4, 2021 17:16
applyHelper $
-- Apply the BBODY transition using the ticked state
withExcept BBodyError ..: SL.applyBlock

reapplyLedgerBlock = runIdentity ...:
seq (stubComputation stubComputationArg) $
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written, this thunk will only ever be evaluated once during the program execution, right? (And Common Subexpression Elimination likely even causes that one thunk to be shared between both applyLedgerBlock and reapplyLedgerBlock.)

You'd need to give stubComputation a dummy argument seemingly dependent upon the context here (eg the arguments to reapplyLedgerBlock). By "seemingly" I mean real enough that GHC won't float the thunk out from under reapplyLedgerBlock lambda but not real enough that the value actually affects the computation.

Copy link
Contributor

@nfrisby nfrisby May 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pointer comparison seems like a fine way to do it.

My only follow-up that is that you might want to add the pointer comparison for each of the arguments, not just the first. I haven't double-checked, but it seems entirely reasonable that the first argument in particular could be captured in a partial application closure just once. Since this method body is so small and GHC tries pretty hard to inline dictionaries, I'd worry (at least a little bit :)) that even this thing will still be thunked.

If you repeat the same fake usage for all three arguments of these methods, I'd be more confident that the stub computation is always being repeated per invocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nfrisby, this is sufficiently interesting that I'd be inclined to try diving into core analysis.

Before I seriously consider this -- do you think it would be plausible, or it is a pipe dream due to core volume / entanglement?

Copy link
Contributor

@nfrisby nfrisby May 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seq (stubComputation (stubComputationArg
                             + I# (reallyUnsafePtrEquality# x x))
                             + I# (reallyUnsafePtrEquality# y y))
                             + I# (reallyUnsafePtrEquality# z z))
                             )

I anticipate that would prevent the floating.

Also: it should be relatively clear whether there is sharing of the thunk between invocations. If that is happening, then you'll see essentially no difference compared to the baseline (ie no computational delay) run-time of a multi-block execution. If it isn't happening, then you should be able to (for large enough input stub arguments) see a marked difference in run-time.

So I don't think you need to Core dive here. But maybe you want to.


In that case, there are two basic steps I'd suggest starting with:

  • Find where the call-sites end up, after inlining, etc. EG If applyLedgerBlock occurs in Foo.hs, that's a bit misleading if the function that contains that occurrence is itself always inlined into its own call-sites. And so on.

  • Then inspect the Core at those final call-sites. I usually start with -ddump-simpl --dsuppress-all in an OPTIONS_GHC pragma.

Because so much of the Consensus layer is (a) polymorphic in the block type and (b) has large enough RHSes that they won't be inlined, I don't think step 1 will actually involve that much work. The "root occurrences" are in only six core library files. Sounds manageable.

$ git grep -l applyLedgerBlock -- 'ouroboros-consensus/src/*hs' | wc -l
6

HTH!

@deepfire
Copy link
Contributor Author

deepfire commented May 6, 2021

@nfrisby, thank you for catching this!

Does that look better now?

P.S. reallyUnsafePtrEquality# x x returns 0# in GHCi:

Prelude GHC.Prim GHC.Exts> let a = 1 in I# (reallyUnsafePtrEquality# a a)
0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WIP Do not merge, work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants