refactor TaskScopes to use an aggregation tree (vercel/turborepo#5992)
### Description

Remove TaskScope and add a similar tree structure instead.

It aggregates Tasks in the task graph (parent-children) into aggregated
nodes, which summarize important information from all contained tasks.
It's a tree structure, and any subgraph can be aggregated into a single
aggregated node to query this information efficiently.

We aggregate the following information from tasks:

* is there any unfinished task
  * and an event when this reaches zero (so one can wait for it)
* collectibles emitted
* a list of dirty tasks

Retrieving this information doesn't require walking the tree, since it's
eagerly aggregated whenever it changes on a Task. Since it's a tree
structure, updating such information on a Task has to walk the tree
upwards, which is `O(R + log(R) * log(N))`, where N is the number of Tasks
in the graph and R is the number of roots of the graph.
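
The eager upward aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `AggregationNode`, `AggregationTree`, `change_unfinished` and `has_unfinished` are hypothetical names, each node has a single parent (the real structure lets a Task be part of several bottom trees), and only the unfinished-task count is aggregated.

```rust
/// Hypothetical, simplified aggregation node. It only knows its parent
/// ("upper"), mirroring a tree that is connected from bottom to top only.
#[derive(Debug, Default)]
pub struct AggregationNode {
    /// Index of the parent node in the arena, `None` for a root.
    pub upper: Option<usize>,
    /// Number of unfinished tasks in the subgraph this node summarizes.
    pub unfinished: usize,
}

/// Arena that owns all nodes; indices stand in for references.
#[derive(Debug, Default)]
pub struct AggregationTree {
    pub nodes: Vec<AggregationNode>,
}

impl AggregationTree {
    /// Adds a node below `upper` and returns its index.
    pub fn add_node(&mut self, upper: Option<usize>) -> usize {
        self.nodes.push(AggregationNode { upper, unfinished: 0 });
        self.nodes.len() - 1
    }

    /// Applies `delta` to the unfinished count at `start` and walks the
    /// tree upwards, updating every ancestor eagerly. Returns the nodes
    /// whose count dropped to zero, where a real implementation could
    /// notify waiters.
    pub fn change_unfinished(&mut self, start: usize, delta: isize) -> Vec<usize> {
        let mut reached_zero = Vec::new();
        let mut current = Some(start);
        while let Some(index) = current {
            let node = &mut self.nodes[index];
            let updated = node.unfinished as isize + delta;
            assert!(updated >= 0, "unfinished count must not go negative");
            node.unfinished = updated as usize;
            if delta < 0 && node.unfinished == 0 {
                reached_zero.push(index);
            }
            current = node.upper;
        }
        reached_zero
    }

    /// Reading the aggregated value is a single lookup, no tree walk.
    pub fn has_unfinished(&self, index: usize) -> bool {
        self.nodes[index].unfinished > 0
    }
}
```

In this sketch the cost of an update is the length of the path to the root, while a query on any aggregated node is constant time. That is the trade-off the description relies on.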

It's also possible to query the aggregation from any Task to get
information about the roots. This has to walk the tree up to every root,
which should be `O(R + log(R) * log(N))`, but it's possible to take a
shortcut as soon as any root already contains the information needed.

We use that to gather the following information about roots:

* is this root active

The tree is only connected from bottom to top. It's not possible to walk
the tree from top to bottom.

The tree is built from two parts, the top tree and the bottom tree.
A height 0 bottom tree will aggregate Tasks up to a configured
connectivity. A height 1 bottom tree will aggregate height 0 bottom
trees up to a configured connectivity. This continues recursively.
Since one doesn't know which height of bottom tree is needed to
aggregate a certain subgraph, a top tree is used. A depth 0 top tree
aggregates Tasks of infinite connectivity by using a bottom tree of
height X and optionally a depth 1 top tree. This continues with the depth 1
top tree using a height X + 1 bottom tree and a depth 2 top tree.

The connectivity and X are subject to fine-tuning.

In general, a Task can be in multiple bottom trees as an inner node, but to
ensure tree reuse there is a limitation to that. Once a bottom tree
starts at a certain Task, that Task cannot be an inner node of another
bottom tree. So a Task is either a non-root inner node in one or more
bottom trees or the root node of exactly one bottom tree.

When a Task is an inner node in multiple bottom trees, the cost of the
children of the Task is multiplied by the number of bottom trees. This
can create a performance hit. To avoid that, there is a threshold
(subject to fine-tuning) which converts the Task into the root of a new
bottom tree when the number of bottom trees reaches the threshold.
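
The inner-node-or-root rule and the threshold can be sketched as a small state machine. This is an assumption-laden illustration rather than the commit's implementation: `BottomTreeRole`, `TaskState`, `add_to_bottom_tree` and the value of `MAX_INNER_MEMBERSHIPS` are all hypothetical, bottom trees are identified by plain `u32` ids, and the bookkeeping needed to detach the Task from its former bottom trees is left out.

```rust
use std::collections::HashSet;

/// Hypothetical threshold; the description only says it is subject to
/// fine-tuning.
pub const MAX_INNER_MEMBERSHIPS: usize = 3;

#[derive(Debug, PartialEq, Eq)]
pub enum BottomTreeRole {
    /// Non-root inner node of the listed bottom trees (hypothetical ids).
    Inner(HashSet<u32>),
    /// Root of exactly one bottom tree of its own.
    Root,
}

pub struct TaskState {
    pub role: BottomTreeRole,
}

impl TaskState {
    pub fn new() -> Self {
        Self { role: BottomTreeRole::Inner(HashSet::new()) }
    }

    /// Tries to add the task as an inner node of bottom tree `tree`.
    /// Returns `true` if the task is now an inner node of that tree, and
    /// `false` if the task is (or has just become) the root of its own
    /// bottom tree, in which case the caller would have to connect to
    /// that tree instead.
    pub fn add_to_bottom_tree(&mut self, tree: u32) -> bool {
        let convert = match &mut self.role {
            BottomTreeRole::Root => return false,
            BottomTreeRole::Inner(trees) => {
                trees.insert(tree);
                trees.len() >= MAX_INNER_MEMBERSHIPS
            }
        };
        if convert {
            // Reaching the threshold turns the task into a root, which caps
            // how often the cost of its children is multiplied.
            self.role = BottomTreeRole::Root;
            false
        } else {
            true
        }
    }
}
```

With the threshold assumed here, a task accepts membership in two distinct bottom trees and becomes a root when a third one tries to include it. From then on every caller is told to use the task's own bottom tree.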

The same limitations apply to higher-level bottom trees.


Closes WEB-1621
sokra committed Sep 28, 2023
1 parent 7d1e61d commit fdf5e86
Showing 49 changed files with 4,087 additions and 2,973 deletions.
7 changes: 2 additions & 5 deletions crates/node-file-trace/src/lib.rs
@@ -495,13 +495,10 @@ async fn run<B: Backend + 'static, F: Future<Output = ()>>(
module_options,
resolve_options,
);
+let _ = output.resolve_strongly_consistent().await?;

let source = TransientValue::new(Vc::into_raw(output));
-let issues = output
-.peek_issues_with_path()
-.await?
-.strongly_consistent()
-.await?;
+let issues = output.peek_issues_with_path().await?;

let console_ui = ConsoleUi::new(log_options);
Vc::upcast::<Box<dyn IssueReporter>>(console_ui)
2 changes: 2 additions & 0 deletions crates/turbo-tasks-auto-hash-map/src/lib.rs
@@ -1,3 +1,5 @@
#![feature(hash_raw_entry)]

pub mod map;
pub mod set;

108 changes: 108 additions & 0 deletions crates/turbo-tasks-auto-hash-map/src/map.rs
@@ -190,6 +190,28 @@ impl<K: Eq + Hash, V, H: BuildHasher + Default> AutoMap<K, V, H> {
}
}

pub fn raw_entry_mut<Q>(&mut self, key: &Q) -> RawEntry<'_, K, V, H>
where
K: Borrow<Q>,
Q: Hash + Eq + ?Sized,
{
let this = self as *mut Self;
match self {
AutoMap::List(list) => match list.iter().position(|(k, _)| k.borrow() == key) {
Some(index) => RawEntry::Occupied(OccupiedRawEntry::List { list, index }),
None => RawEntry::Vacant(VacantRawEntry::List { this, list }),
},
AutoMap::Map(map) => match map.raw_entry_mut().from_key(key) {
std::collections::hash_map::RawEntryMut::Occupied(entry) => {
RawEntry::Occupied(OccupiedRawEntry::Map { this, entry })
}
std::collections::hash_map::RawEntryMut::Vacant(entry) => {
RawEntry::Vacant(VacantRawEntry::Map(entry))
}
},
}
}

/// see [HashMap::shrink_to_fit](https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.shrink_to_fit)
pub fn shrink_to_fit(&mut self) {
match self {
@@ -337,6 +359,15 @@ impl<'a, K, V> Iterator for Iter<'a, K, V> {
}
}

impl<'a, K, V> Clone for Iter<'a, K, V> {
fn clone(&self) -> Self {
match self {
Iter::List(iter) => Iter::List(iter.clone()),
Iter::Map(iter) => Iter::Map(iter.clone()),
}
}
}

pub enum IterMut<'a, K, V> {
List(std::slice::IterMut<'a, (K, V)>),
Map(std::collections::hash_map::IterMut<'a, K, V>),
@@ -515,6 +546,83 @@ impl<'a, K: Eq + Hash, V, H: BuildHasher + Default + 'a> VacantEntry<'a, K, V, H
}
}

pub enum RawEntry<'a, K, V, H> {
Occupied(OccupiedRawEntry<'a, K, V, H>),
Vacant(VacantRawEntry<'a, K, V, H>),
}

pub enum OccupiedRawEntry<'a, K, V, H> {
List {
list: &'a mut Vec<(K, V)>,
index: usize,
},
Map {
this: *mut AutoMap<K, V, H>,
entry: std::collections::hash_map::RawOccupiedEntryMut<'a, K, V, H>,
},
}

impl<'a, K: Eq + Hash, V, H: BuildHasher> OccupiedRawEntry<'a, K, V, H> {
/// see [HashMap::RawOccupiedEntryMut::get_mut](https://doc.rust-lang.org/std/collections/hash_map/struct.RawOccupiedEntryMut.html#method.get_mut)
pub fn get_mut(&mut self) -> &mut V {
match self {
OccupiedRawEntry::List { list, index } => &mut list[*index].1,
OccupiedRawEntry::Map { entry, .. } => entry.get_mut(),
}
}

/// see [HashMap::RawOccupiedEntryMut::into_mut](https://doc.rust-lang.org/std/collections/hash_map/struct.RawOccupiedEntryMut.html#method.into_mut)
pub fn into_mut(self) -> &'a mut V {
match self {
OccupiedRawEntry::List { list, index } => &mut list[index].1,
OccupiedRawEntry::Map { entry, .. } => entry.into_mut(),
}
}
}

impl<'a, K: Eq + Hash, V, H: BuildHasher + Default> OccupiedRawEntry<'a, K, V, H> {
/// see [HashMap::OccupiedEntry::remove](https://doc.rust-lang.org/std/collections/hash_map/enum.OccupiedEntry.html#method.remove)
pub fn remove(self) -> V {
match self {
OccupiedRawEntry::List { list, index } => list.swap_remove(index).1,
OccupiedRawEntry::Map { entry, this } => {
let v = entry.remove();
let this = unsafe { &mut *this };
if this.len() < MIN_HASH_SIZE {
this.convert_to_list();
}
v
}
}
}
}

pub enum VacantRawEntry<'a, K, V, H> {
List {
this: *mut AutoMap<K, V, H>,
list: &'a mut Vec<(K, V)>,
},
Map(std::collections::hash_map::RawVacantEntryMut<'a, K, V, H>),
}

impl<'a, K: Eq + Hash, V, H: BuildHasher + Default + 'a> VacantRawEntry<'a, K, V, H> {
/// see [HashMap::RawVacantEntryMut::insert](https://doc.rust-lang.org/std/collections/hash_map/struct.RawVacantEntryMut.html#method.insert)
pub fn insert(self, key: K, value: V) -> &'a mut V {
match self {
VacantRawEntry::List { this, list } => {
if list.len() >= MAX_LIST_SIZE {
let this = unsafe { &mut *this };
this.convert_to_map().entry(key).or_insert(value)
} else {
list.push((key, value));
&mut list.last_mut().unwrap().1
}
}
VacantRawEntry::Map(entry) => entry.insert(key, value).1,
}
}
}

impl<K, V, H> Serialize for AutoMap<K, V, H>
where
K: Eq + Hash + Serialize,
6 changes: 6 additions & 0 deletions crates/turbo-tasks-auto-hash-map/src/set.rs
@@ -137,6 +137,12 @@ impl<'a, K> Iterator for Iter<'a, K> {
}
}

impl<'a, K> Clone for Iter<'a, K> {
fn clone(&self) -> Self {
Self(self.0.clone())
}
}

pub struct IntoIter<K>(super::map::IntoIter<K, ()>);

impl<K> Iterator for IntoIter<K> {
3 changes: 3 additions & 0 deletions crates/turbo-tasks-memory/Cargo.toml
@@ -19,6 +19,7 @@ num_cpus = "1.13.1"
once_cell = { workspace = true }
parking_lot = { workspace = true }
priority-queue = "1.3.0"
ref-cast = "1.0.20"
rustc-hash = { workspace = true }
tokio = { workspace = true }
tracing = { workspace = true }
@@ -38,6 +39,7 @@ turbo-tasks-testing = { workspace = true }
turbo-tasks-build = { workspace = true }

[features]
track_unfinished = []
unsafe_once_map = []
log_running_tasks = []
log_scheduled_tasks = []
@@ -48,6 +50,7 @@ print_scope_updates = []
print_task_invalidation = []
inline_add_to_scope = []
inline_remove_from_scope = []
default = []

[[bench]]
name = "mod"
2 changes: 1 addition & 1 deletion crates/turbo-tasks-memory/benches/scope_stress.rs
@@ -18,7 +18,7 @@ pub fn scope_stress(c: &mut Criterion) {
let mut group = c.benchmark_group("turbo_tasks_memory_scope_stress");
group.sample_size(20);

-for size in [10, 100, 200, 300] {
+for size in [5, 10, 15, 20, 25, 30, 100, 200, 300] {
group.throughput(criterion::Throughput::Elements(
/* tasks for fib from 0 to size - 1 = */
(size as u64) * (size as u64) +