Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(abandoned) [compiler][rewrite] PropagateScopeDeps hir rewrite #30079

Closed
wants to merge 2 commits into from

Conversation

mofeiZ
Copy link
Contributor

@mofeiZ mofeiZ commented Jun 24, 2024

Stack from ghstack (oldest at bottom):

Quick background:

Rvalues / temporaries:

In the compiler, unnamed temporaries that represents the evaluation of an expression
In the code snippet below, $1, $2, $3, and $4 are temporaries.

// input
function Component({ bar} ) {
  const x = {a: foo(bar), b: {}};
}
// gets lowered to
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
[3] $4 = Call $2(<unknown> $3)
[4] $5 = Object {  }
[5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6

The compiler currently treats temporaries and named variables (e.g. x) differently in this pass.

  • named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple Identifier instances -- each with its own scope and mutable range)
  • temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see [compiler][fixtures] test repros: codegen, alignScope, phis #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in ReactiveScope.declarations and later promote them to named variables.
    In the same example, $4, $5, and $6 need to be promoted: $2 ->t0, $5 ->t1, and $6 ->t2.
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6

Dependencies

ReactiveScope.dependencies records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate obj.a.b without before creating x and checking objIsNull.

// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}

While other memoization strategies with different constraints exist, the current compiler requires that ReactiveScope.dependencies be re-orderable to the beginning of the reactive scope. But.. PropertyLoads from null values will throw TypeError. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See https://github.com/facebook/react-forget/pull/2709)


Rough high level overview

  1. Pass 1
    Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
  2. Pass 2 (collectHoistablePropertyLoads)
    Walk over instructions to generate sidemaps:
    a. temporary identifier -> named variable and property path (e.g. $3 -> {obj: props, path: ["a", "b"]})
    b. block -> accessed variables and properties (e.g. bb0 -> [ {obj: props, path: ["a", "b"]} ])
    c. Walk over control flow graph to understand the set of object and property paths that can be read by each basic block. This analysis:
    • relies on post-dominator trees
    • traverses the CFG from entry (producing the set of variables/paths unconditionally evaluated before a block).
    • traverses the CFG from exit (producing the set of variables/paths unconditionally evaluated after a block).
  3. Pass 3: (collectDependencies)
    Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps

Will add more fixture tests (although most cases should be covered in reduce-reactive-deps).
Tested by syncing internally and checking compilation output differences (internal link)


Followups:

  1. Rewrite function expression deps
    This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate this function-expr hoisting bug. A short term fix here is to simply call some form of collectNonNullObjects on every function expression to find hoistable variable / paths. In the longer term, we should refactor out FunctionExpression.deps.

  2. Enable optional paths
    (a) don't count optional load temporaries as dependencies (e.g. collectOptionalLoadRValues(...)).
    (b) add optional paths back. This is a bit tricky as we'll want to implement some merging logic for ConditionalAccess | OptionalChain | UnconditionalAccess. In addition, our current optional chain lowering is slightly incorrect / imprecise

Copy link

vercel bot commented Jun 24, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
react-compiler-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 10, 2024 11:27pm

@react-sizebot
Copy link

Comparing: 6aea169...e0acdd8

Critical size changes

Includes critical production bundles, as well as any change greater than 2%:

Name +/- Base Current +/- gzip Base gzip Current gzip
oss-stable/react-dom/cjs/react-dom.production.js = 6.68 kB 6.68 kB = 1.83 kB 1.83 kB
oss-stable/react-dom/cjs/react-dom-client.production.js = 497.93 kB 497.93 kB = 89.26 kB 89.26 kB
oss-experimental/react-dom/cjs/react-dom.production.js = 6.69 kB 6.69 kB = 1.83 kB 1.83 kB
oss-experimental/react-dom/cjs/react-dom-client.production.js = 502.75 kB 502.75 kB = 89.96 kB 89.96 kB
facebook-www/ReactDOM-prod.classic.js = 597.17 kB 597.17 kB = 105.33 kB 105.33 kB
facebook-www/ReactDOM-prod.modern.js = 571.52 kB 571.52 kB = 101.27 kB 101.26 kB
test_utils/ReactAllWarnings.js Deleted 62.88 kB 0.00 kB Deleted 15.69 kB 0.00 kB

Significant size changes

Includes any change greater than 0.2%:

Expand to show
Name +/- Base Current +/- gzip Base gzip Current gzip
test_utils/ReactAllWarnings.js Deleted 62.88 kB 0.00 kB Deleted 15.69 kB 0.00 kB

Generated by 🚫 dangerJS against e0acdd8

mofeiZ added a commit that referenced this pull request Jun 25, 2024
…ent terminal preds when pruning labels"

Missed this initially in `pruneUnusedLabelsHIR`. It wasn't an active bug as `preds` wasn't referenced by later passes, until #30079

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Jun 25, 2024
…ds when pruning labels"

Missed this initially in `pruneUnusedLabelsHIR`. It wasn't an active bug as `preds` wasn't referenced by later passes, until #30079

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Jun 25, 2024
…ent terminal preds when pruning labels"

Missed this initially in `pruneUnusedLabelsHIR`. It wasn't an active bug as `preds` wasn't referenced by later passes, until #30079

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Jun 25, 2024
…ds when pruning labels"

Missed this initially in `pruneUnusedLabelsHIR`. It wasn't an active bug as `preds` wasn't referenced by later passes, until #30079

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Jun 25, 2024
…ning labels

Missed this initially in `pruneUnusedLabelsHIR`. It wasn't an active bug as `preds` wasn't referenced by later passes, until #30079

ghstack-source-id: 3e151b74c31554299e870f001c0ac3f72706318c
Pull Request resolved: #30076
@mofeiZ mofeiZ changed the title [wip][will rewrite] Working draft of PropagateScopeDeps hir rewrite [compiler][rewrite] PropagateScopeDeps hir rewrite Jul 10, 2024
\### Quick background:
\#### Rvalues / temporaries:
In the compiler, unnamed temporaries that represents the evaluation of an expression
In the code snippet below, $1, $2, $3, and $4 are temporaries.
```js
// input
function Component({ bar} ) {
  const x = {a: foo(bar), b: {}};
}
// gets lowered to
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
[3] $4 = Call $2(<unknown> $3)
[4] $5 = Object {  }
[5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```
The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

\#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
\### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectHoistablePropertyLoads)
Walk over instructions to generate sidemaps:
  a. temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
  b. block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  c. Walk over control flow graph to understand the set of object and property paths that can be read by each basic block. This analysis:
    - relies on post-dominator trees
    - traverses the CFG from entry (producing the set of variables/paths unconditionally evaluated *before* a block).
    - traverses the CFG from exit (producing the set of variables/paths unconditionally evaluated *after* a block).
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps

Will add more fixture tests (although most cases should be covered in `reduce-reactive-deps`).
Tested by syncing internally and checking compilation output differences ([internal link](https://fburl.com/wiki_markdown/nazsiszd))

---
\### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.ts). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) add optional paths back. This is a bit tricky as we'll want to implement some merging logic for `ConditionalAccess | OptionalChain | UnconditionalAccess`. In addition, our current optional chain lowering is slightly incorrect / imprecise

[ghstack-poisoned]
mofeiZ added a commit that referenced this pull request Jul 10, 2024
\### Quick background:
\#### Rvalues / temporaries:
In the compiler, unnamed temporaries that represents the evaluation of an expression
In the code snippet below, $1, $2, $3, and $4 are temporaries.
```js
// input
function Component({ bar} ) {
  const x = {a: foo(bar), b: {}};
}
// gets lowered to
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
[3] $4 = Call $2(<unknown> $3)
[4] $5 = Object {  }
[5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```
The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

\#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
\### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectHoistablePropertyLoads)
Walk over instructions to generate sidemaps:
  a. temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
  b. block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  c. Walk over control flow graph to understand the set of object and property paths that can be read by each basic block. This analysis:
    - relies on post-dominator trees
    - traverses the CFG from entry (producing the set of variables/paths unconditionally evaluated *before* a block).
    - traverses the CFG from exit (producing the set of variables/paths unconditionally evaluated *after* a block).
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps

Will add more fixture tests (although most cases should be covered in `reduce-reactive-deps`).
Tested by syncing internally and checking compilation output differences ([internal link](https://fburl.com/wiki_markdown/nazsiszd))

---
\### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.ts). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) add optional paths back. This is a bit tricky as we'll want to implement some merging logic for `ConditionalAccess | OptionalChain | UnconditionalAccess`. In addition, our current optional chain lowering is slightly incorrect / imprecise

ghstack-source-id: 0562a03ed7325678e9797f1d2cedd8fbf2cd0dc9
Pull Request resolved: #30079
@mofeiZ mofeiZ marked this pull request as ready for review July 10, 2024 23:22
// @enableTreatFunctionDepsAsConditional
import { Stringify } from "shared-runtime";

function Component({ props }) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's confusing to destructure props and name the identifier props, pick a different name?

Comment on lines +28 to +30
const functionExprRvals = fn.env.config.enableTreatFunctionDepsAsConditional
? collectFunctionExpressionRValues(fn)
: new Set<IdentifierId>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we simplify the initial version of this by not supporting this mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, happy to add this in a later PR in the stack!

return nodes;
}

function deriveNonNull(fn: HIRFunction, nodes: Map<BlockId, BlockInfo>): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bookkeeping/traversal logic is making it really hard to follow the core logic of this pass. I would have expected that a non-null object analysis could use the dominator tree rather than the walking forward/backwards that's happening here. Did you try using dominators? If yeah, what was the challenge?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding (please correct if I'm misunderstanding something, my algo knowledge is much weaker than yours), dominator tree would work but misses some cases the non-hir implementation currently covers.

Take this example. Here, we know that it's safe to access a.b on line 5 even though no dominator of its basic block accesses a.b.

0: if(...) {
1:   a.b;
2: } else {
3:   a.b;
4: }
5: // It's safe to take `a.b` as a dependency of scope@0 (i.e. hoist the a.b access)
6: scope @0 {
7:   if (cond) {
8:     // access a.b;
9:   }
10:}

I believe a more precise formation would be that a.b is safe to read in $bb_i$ if any of the following hold:

  • $bb_i$ itself accesses a.b
  • a.b is safe to access in every pred of a.b
  • a.b is safe to access in every successor of a.b

Basic block cycles can be handled by filtering / dropping the offending block (e.g. reducing x = x && y && ... to x = y && ...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. That makes sense. I suspect this can still be simplified - in particular, i had to remap the variable names in my head to help keep track of things. See comments for the most unclear cases.

One thing that strikes me about this algorithm is that it isn't accurate in the sense that it doesn't account for the actual live range of values. During the initial collectNodes stage we account for mutable ranges when deciding if a property access can locally be considered to be non-nullable. But then in the deriveNonNull phase we don't have access to the mutable ranges anymore. AFAICT, if you have:

bb0:
  let x = null
  if cond then bb1 else bb2
bb1:
  x = {}
  goto bb3
bb2:
  ...
  goto bb3:
bb3:
  x.foo.bar

this algorithm will infer x.foo.bar as non-nullable for bb0, via the intersection of bb1 & bb2 which in turn comes from bb3. So the results are actually inaccurate, but that's okay because of how we filter out dependencies in PropagateScopeDeps.

This isn't necessarily a blocker, but it isn't great to have a pass that returns results that are technically incorrect — it's very easy to accidentally use them in a different way later, w/o realizing that important context. Thoughts on this, and whether we could make the results guaranteed accurate?


// Handle scopes that begin or end at the start of the given block,
// invoking scope enter/exit callbacks as applicable
handleBlock(block: BasicBlock): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This abstraction kind of divides up when processing happens (scope entry/exit ends up on the side) separate from processing the block itself. Just feels weird, for lack of a better word, and a bit hard to follow.

At least for me, it tends to be easier to follow something where there's a simple for loop over the blocks in standard (RPO) order, and we do stuff for each block. Maybe pulling this out into a side-table, that would allow looking up scope start/exit per block, would be easier to follow?

Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok i spent a bunch of time reading though. There's a ton here, but a lot of the core logic is retained from the existing implementation which makes it a bit easier. See comments. In particular, the note about updating our feature flag setup — it's time to drop the non-HIR based versions of older passes so that we can more easily compare the results of this new pass, and fall back quickly if necessary.

const declaringScope = declarations.get(place.identifier.id);
if (
declaringScope != null &&
scopeTraversal.activeScopes.indexOf(declaringScope) === -1 &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does activeScopes need to be an array? could it be a Set instead to make this faster/clearer? or expose an API like isScopeActive(scope) and currentScope(): Scope | null

Comment on lines +100 to +102
for (const place of eachInstructionOperand(instr)) {
handlePlace(place);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's intentional that we're not visiting lvalues right?

}
}

export function propagateScopeDependenciesHIR(fn: HIRFunction): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've mentioned this on previous PRs, but putting the most important function at the bottom makes it really tedious to get back to when reading.

export type BlockInfo = {
blockId: BlockId;
scope: ScopeId | null;
preds: Set<BlockId>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just store the Block itself, which has the preds on it already?

nodeId: BlockId,
kind: "succ" | "pred",
traversalState: Map<BlockId, "active" | "done">,
result: Map<BlockId, Set<PropertyLoadNode>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: name this more clearly? nonNullablesByBlock or similar?

Comment on lines +336 to +337
// non-active neighbors with no recorded results can occur due to backedges.
// it's not safe to assume they can be filtered out (e.g. not intersected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +340 to +342
// if (object.identifier.id === 32) {
// console.log(printDependency(nextDependency));
// }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:-)

return nodes;
}

function deriveNonNull(fn: HIRFunction, nodes: Map<BlockId, BlockInfo>): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. That makes sense. I suspect this can still be simplified - in particular, i had to remap the variable names in my head to help keep track of things. See comments for the most unclear cases.

One thing that strikes me about this algorithm is that it isn't accurate in the sense that it doesn't account for the actual live range of values. During the initial collectNodes stage we account for mutable ranges when deciding if a property access can locally be considered to be non-nullable. But then in the deriveNonNull phase we don't have access to the mutable ranges anymore. AFAICT, if you have:

bb0:
  let x = null
  if cond then bb1 else bb2
bb1:
  x = {}
  goto bb3
bb2:
  ...
  goto bb3:
bb3:
  x.foo.bar

this algorithm will infer x.foo.bar as non-nullable for bb0, via the intersection of bb1 & bb2 which in turn comes from bb3. So the results are actually inaccurate, but that's okay because of how we filter out dependencies in PropagateScopeDeps.

This isn't necessarily a blocker, but it isn't great to have a pass that returns results that are technically incorrect — it's very easy to accidentally use them in a different way later, w/o realizing that important context. Thoughts on this, and whether we could make the results guaranteed accurate?

Comment on lines +521 to +526
for (const [_blockId, node] of nodes) {
if (node.scope === scopeId) {
nonNullObjects = [...node.assumedNonNullObjects].map((n) => n.fullPath);
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should translate nodes into a Map keyed by scope earlier?

Comment on lines +311 to +316
propagateScopeDependenciesHIR(hir);
yield log({
kind: "hir",
name: "PropagateScopeDependenciesHIR",
value: hir,
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that enableReactiveScopesInHIR has been enabled by default for some time with strong validation of the results, i think as a precursor to this diff we should promote enableReactiveScopesInHIR to stable, removing the alternative code path. Let's repurpose the feature flag for the remainder of the conversion, including this one.

Basically, it would be nice to just compare the old/new PropagateScopeDeps behavior without also including all the other bugs from non-enableReactiveScopesInHIR mode.

Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comments, also from the fixtures it looks like there is an issue with the non-null inference

@@ -30,15 +30,15 @@ import { identity } from "shared-runtime";
function Component(props) {
const $ = _c(4);
let x;
if ($[0] !== props.value) {
if ($[0] !== props) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this seems wrong, props.value is unconditionally accessed and should be retained as the dep

Copy link
Contributor Author

@mofeiZ mofeiZ Jul 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! This is a bit of a weird one. The reason we're not inferring props.value as unconditionally accessed is because we check mutable ranges before adding to assumedNonNullObjects. This is to avoid recording loads within an object's mutable range (e.g. read(x.y.z); x.y = null;)

// in function collectNodes
      if (value.kind === "PropertyLoad") {
        const propertyNode = c.declareProperty(...);
        if ( !isMutable(instr, value.object))  // <----- this line
          let curr = propertyNode.parent;
          while (curr != null) {
            assumedNonNullObjects.add(curr);
            curr = curr.parent;
          }
        }

The edge case here is that we're inferring a mutable range for props due to it being directly aliased into a context variable, which gets captured / mutated within a lambda.

@@ -29,15 +29,15 @@ import { identity } from "shared-runtime";
function Component(props) {
const $ = _c(4);
let x;
if ($[0] !== props.value) {
if ($[0] !== props) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@@ -25,11 +25,11 @@ import { c as _c } from "react/compiler-runtime"; // @enableTreatFunctionDepsAsC
function Component(props) {
const $ = _c(5);
let t0;
if ($[0] !== props) {
if ($[0] !== props.bar) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@@ -38,10 +38,11 @@ function Component(props) {
items = t0;

items?.push(props.a);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, so we know that props is non-null and therefore can hoist the props.a, even though that property load is optional

@mofeiZ mofeiZ changed the title [compiler][rewrite] PropagateScopeDeps hir rewrite (abandoned) [compiler][rewrite] PropagateScopeDeps hir rewrite Sep 5, 2024
@mofeiZ mofeiZ closed this Sep 5, 2024
mofeiZ added a commit that referenced this pull request Sep 6, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectHoistablePropertyLoads)
Walk over instructions to generate sidemaps:
  a. temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
  b. block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  c. Walk over control flow graph to understand the set of object and property paths that can be read by each basic block. This analysis:
    - relies on post-dominator trees
    - traverses the CFG from entry (producing the set of variables/paths unconditionally evaluated *before* a block).
    - traverses the CFG from exit (producing the set of variables/paths unconditionally evaluated *after* a block).
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.ts). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 51e0759c68c931c19fee02665bd37a7274a45572
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 470ab175a901772567139bd0bb93b3e3c71c9dc7
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: e1fe5d80076df15c5277c6761138f45f267e11dd
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: e1fe5d80076df15c5277c6761138f45f267e11dd
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 11, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 6c62db14213284bc61869b8db57198546c399f7f
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 12, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 2507f6ea751dce09ad1dccd353ae6fc7cf411582
Pull Request resolved: #30894
mofeiZ added a commit that referenced this pull request Sep 12, 2024
Resubmission of #30079 -- core logic unchanged, but needed to rebase past #30573

### Quick background
#### Temporaries

The compiler currently treats temporaries and named variables (e.g. `x`) differently in this pass.
- named variables may be reassigned (in fact, since we're running after LeaveSSA, a single named identifier's IdentifierId may map to multiple `Identifier` instances -- each with its own scope and mutable range)
- temporaries are replaced with their represented expressions during codegen. This is correct (mostly correct, see #29878) as we're careful to always lower the correct evaluation semantics. However, since we rewrite reactive scopes entirely (to if/else blocks), we need to track temporaries that a scope produces in `ReactiveScope.declarations` and later promote them to named variables.
In the same example, $4, $5, and $6 need to be promoted: $2 ->`t0`,  $5 ->`t1`, and $6 ->`t2`.
```js
[1] $2 = LoadGlobal(global) foo
[2] $3 = LoadLocal bar$1
scope 0:
  [3] $4 = Call $2(<unknown> $3)
scope 1:
  [4] $5 = Object {  }
scope 2:
  [5] $6 = Object { a: $4, b: $5 }
[6] $8 = StoreLocal Const x$7 = $6
```

#### Dependencies
`ReactiveScope.dependencies` records the set of (read-only) values that a reactive scope is dependent on. This is currently limited to just variables (named variables from source and promoted temporaries) and property-loads.
All dependencies we record need to be hoistable -- i.e. reordered to just before the ReactiveScope begins. Not all PropertyLoads are hoistable.

In this example, we should not evaluate `obj.a.b` without before creating x and checking `objIsNull`.
```js
// reduce-reactive-deps/no-uncond.js
function useFoo({ obj, objIsNull }) {
  const x = [];
  if (isFalse(objIsNull)) {
    x.push(obj.a.b);
  }
  return x;
}
```

While other memoization strategies with different constraints exist, the current compiler requires that `ReactiveScope.dependencies` be re-orderable to the beginning of the reactive scope. But.. `PropertyLoad`s from null values will throw `TypeError`. This means that evaluating hoisted dependencies should throw if and only if the source program throws. (It is also a bug if source throws and compiler output does not throw. See facebook/react-forget#2709)

---
### Rough high level overview
1. Pass 1
Walk over instructions to gather every temporary used outside of its defining scope (same as ReactiveFunction version). These determine the sidemaps we produce, as temporaries used outside of their declaring scopes get promoted to named variables later (and are not considered hoistable rvals).
2. Pass 2 (collectTemporariesSidemap)
Walk over instructions to generate a sidemap of temporary identifier -> named variable and property path (e.g. `$3 -> {obj: props, path: ["a", "b"]}`)
2. Pass 2 (collectHoistablePropertyLoads)
  a. Build a sidemap of block -> accessed variables and properties (e.g. `bb0 -> [ {obj: props, path: ["a", "b"]} ]`)
  b. Propagate "non-nullness" i.e. variables and properties for which we can safely evaluate `PropertyLoad`.
  A basic block can unconditionally read from identifier X if any of the following applies:
    - the block itself reads from identifier X
    - all predecessors of the block read from identifier X
    - all successors of the block read from identifier X
4. Pass 3: (collectDependencies)
Walks over instructions again to record dependencies and declarations, using the previously produced sidemaps. We do not record any control-flow here
5. Merge every scope's recorded dependencies with the set of hoistable PropertyLoads

Tested by syncing internally and (1) checking compilation output differences ([internal link](https://www.internalfb.com/intern/everpaste/?handle=GPCfUBt_HCoy_S4EAJDVFJyJJMR0bsIXAAAB)), running internally e2e tests ([internal link](https://fburl.com/sandcastle/cs5mlkxq))

---
### Followups:
1. Rewrite function expression deps
This change produces much more optimal output as the compiler now uses the function CFG to understand which variables / paths are assumed to be non-null. However, it may exacerbate [this function-expr hoisting bug](https://github.com/facebook/react/blob/main/compiler/packages/babel-plugin-react-compiler/src/__tests__/fixtures/compiler/bug-invalid-hoisting-functionexpr.tsx). A short term fix here is to simply call some form of `collectNonNullObjects` on every function expression to find hoistable variable / paths. In the longer term, we should refactor out `FunctionExpression.deps`.

2. Enable optional paths
(a) don't count optional load temporaries as dependencies (e.g. `collectOptionalLoadRValues(...)`).
(b) record optional paths in both collectHoistablePropertyLoads and dependency collection

ghstack-source-id: 2507f6ea751dce09ad1dccd353ae6fc7cf411582
Pull Request resolved: #30894
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed React Core Team Opened by a member of the React Core Team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants