Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add types for Object.groupBy() and Map.groupBy() #56805

Merged
merged 8 commits into from
Jan 11, 2024

Conversation

bakkot
Copy link
Contributor

@bakkot bakkot commented Dec 16, 2023

Fixes #47171.

The tests are pretty simple because the types are simple, but I'm happy to expand them.

Try them on the playground.

Co-authored-by: Nick McCurdy <nick@nickmccurdy.com>
@huw
Copy link

huw commented Dec 16, 2023

I’d probably also credit @karlhorky and @nikeee ❤️

Co-authored-by: Karl Horky <karl.horky@gmail.com>
Co-authored-by: Niklas Mollenhauer <nikeee@outlook.com>
@bakkot
Copy link
Contributor Author

bakkot commented Dec 16, 2023

Done. Hope that's alright @karlhorky @nikeee.

@DanielRosenwasser
Copy link
Member

@typescript-bot pack this

@typescript-bot
Copy link
Collaborator

typescript-bot commented Dec 18, 2023

Heya @DanielRosenwasser, I've started to run the tarball bundle task on this PR at 547e844. You can monitor the build here.

@typescript-bot
Copy link
Collaborator

typescript-bot commented Dec 18, 2023

Hey @DanielRosenwasser, I've packed this into an installable tgz. You can install it for testing by referencing it in your package.json like so:

{
    "devDependencies": {
        "typescript": "https://typescript.visualstudio.com/cf7ac146-d525-443c-b23c-0d58337efebc/_apis/build/builds/159135/artifacts?artifactName=tgz&fileId=CFD5AF1D1FCD328827EEC59F4D7DA6580765BE88B68F32DA4FD25CE87FA5199502&fileName=/typescript-5.4.0-insiders.20231218.tgz"
    }
}

and then running npm install.


There is also a playground for this build and an npm module you can use via "typescript": "npm:@typescript-deploys/pr-build@5.4.0-pr-56805-4".;

@DanielRosenwasser
Copy link
Member

This seems good overall, though users will have to opt in to the more precise inference for keys.

I would also think to stick with the same precedent we have elsewhere for type parameter names (K and T), and I feel like specifying the key first might be more desirable if we ever had the ability to only provide some arguments while letting the others be inferred.

I'll let others weigh in.

@bakkot
Copy link
Contributor Author

bakkot commented Dec 20, 2023

This seems good overall, though users will have to opt in to the more precise inference for keys.

What do you mean by "more precise"?

Object.groupBy([0, 2, 8], x => x < 5 ? 'small' : 'large') correctly infers the key type as 'small | 'large', which is what it ought to be.

Similarly type Employee = { name: string, role: 'ic' | 'manager' }; Object.groupBy([] as Employee[], x => x.role); infers the key type as 'ic' | 'manager' - again, what it should be. (These are both in the tests.)

I would also think to stick with the same precedent we have elsewhere for type parameter names (K and T), and I feel like specifying the key first might be more desirable if we ever had the ability to only provide some arguments while letting the others be inferred.

SGTM, done.

Copy link
Member

@sandersn sandersn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks right, just one question about Partial.

groupBy<K extends PropertyKey, T>(
items: Iterable<T>,
keySelector: (item: T, index: number) => K,
): Partial<Record<K, T[]>>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the return type Partial?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you might not actually get all of the keys.

As a trivial example, consider the case that the iterable is empty. Then the resulting record will have no keys at all. Typing as Record<K, T[]> would mean that e.g. Object.groupBy(vals, x => x < 5 ? 'small' : 'large') would be typed as always having small and large keys, such that result.small.length would check without errors, which is misleading.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, I forgot that Map<K, T[]> has get: (k: K) => T[] | undefined so was confused by the difference in types.

@sandersn sandersn self-assigned this Jan 2, 2024
@sandersn
Copy link
Member

sandersn commented Jan 3, 2024

@bakkot I'll merge this after you merge from main (or rebaseline, whichever).

@bakkot
Copy link
Contributor Author

bakkot commented Jan 3, 2024

@sandersn Done.

@sandersn sandersn merged commit fbf908b into microsoft:main Jan 11, 2024
19 checks passed
@bakkot bakkot deleted the groupby branch January 11, 2024 20:05
@nikelborm
Copy link

Hi, I just noticed this and think it needs some improvements, since it doesn't handle cases like these:

function tester<K extends number | string>(k: K, index: number): K extends number ? 'numbers' : 'strings' {
    return (typeof k === 'number') ? 'numbers' : 'strings';
}


// expected numbers ✅ 
const a1 = tester(123, 123);
//    ^?  const a1: "numbers"


// expected strings ✅ 
const a2 = tester('123', 123);
//    ^?  const a1: "strings"

// expected error ✅ 
const a3 = tester(true, 123);
//    Argument of type 'boolean' is not assignable to parameter of type 'string | number'.

const basic = Object.groupBy([0, 2, 8, 'asd'], tester);
//    ^?  const basic: Partial<Record<"numbers" | "strings", (string | number)[]>>


// expected number[] | undefined ❌
basic.numbers
//     ^?  (property) numbers?: (string | number)[] | undefined


// expected string[] | undefined ❌
basic.strings
//     ^?  (property) numbers?: (string | number)[] | undefined

@bakkot
Copy link
Contributor Author

bakkot commented Jan 12, 2024

@nikelborm Typescript's types don't generally go for that level of type-level logic. It might be possible in theory but it would significantly complicate the types. The current types aren't wrong, just slightly imprecise.

@nikelborm
Copy link

They do go, and do it pretty well

Screenshot from 2024-01-12 19-50-37

@nikelborm
Copy link

nikelborm commented Jan 12, 2024

It neither does "hard" logic

function tester(k: number | string, index: number): k is string {
    return typeof k === 'string';
}

// expected {'true'?: string[], 'false'?: number[]}
const basic = Object.groupBy([0, 2, 8, 'asd'], tester);
//  Type 'boolean' is not assignable to type 'PropertyKey'.

nor easy (it doesn't seems hard to support just booleans)

function tester2(k: number | string, index: number): boolean {
    return typeof k === 'string';
}

// expected Partial<Record<"true" | "false", (number | string)[]>>
const basic2 = Object.groupBy([0, 2, 8, 'asd'], tester2);
//    Type 'boolean' is not assignable to type 'PropertyKey'.

It's very common use case to divide an array of something into 2 groups like this:

const { true:strings, false: numbers} = Object.groupBy([0, 2, 8, 'asd'], (k) => typeof k === 'string');

It doesn't have to be as simple as checking is this a string or not. It may cover many other business logic use cases

const people = [{name: 'Julie', age: 15}, {name: 'John', age: 23}]
const { true: adults, false: children } = Object.groupBy(people, (person) => person.age >= 18);

@bakkot
Copy link
Contributor Author

bakkot commented Jan 12, 2024

@nikelborm You're welcome to send a followup PR and see if the TS team accepts it.

@nikelborm
Copy link

@sandersn what do you think about this?

@huw
Copy link

huw commented Jan 13, 2024

Hey there 🙂 You’ve raised 2 separate issues here:

  1. groupBy actually accepts all return types that are coercible to PropertyKey, and we should widen the types
  2. That groupBy doesn’t narrow on the real-world output types of key selector functions

PropertyKey coercion

You are correct to identify that, under spec, Object.groupBy([1, 2, 3], () => true) should technically be typed as { true?: number[], false?: number[] }.

We can see this because the array grouping proposal specifies that outputs should be coerced to property keys if possible, and it refers to ToPropertyKey, which specifies that booleans should be coerced to the literals ‘true’ and ‘false’.

TypeScript actually manages this natively for numeric types, which are usually used to index arrays (which are sort of objects under the hood)—that’s why PropertyKey extends number, even though the spec says property keys can only be strings or Symbols.

Although other types are coercible to property keys, TypeScript seems to have preferred simplicity/sanity/ease. Challenging this is much more philosophical and would affect a large amount of TypeScript, so at the very least you’ll have to raise a new issue. It would affect a lot more than Object.groupBy().

Narrowing key selector outputs

I am by no means a TypeScript expert, but I don’t think there’s a way to do what you’re asking for. To recap, you correctly noted that (for example) groupBy([1, "2", 3], (item) => typeof item === "string" ? "string" : "number") implicitly types keySelector as (item: string | number) => "string" | "number", which doesn’t have enough type information to determine the output type from the input type.

To fix this, you added a generic parameter to determine the output type from the input type, such as groupBy([1, "2", 3], <K extends number | string>(item: K): K extends string ? "string" : "number" => typeof item === "string" ? "string" : "number"). I don’t think there’s a way to achieve this without generics.

What you need, concretely, is a type that can map specific K keys to T values. A generic function type can do this by itself, as we’ve seen. But the issue is that this would require higher-order generics to infer the output type from the input type within the already-genericised function. Here’s some fake code that is doing roughly what you mean:

declare function groupBy<K extends PropertyKey, KeySelector<T> extends (item: T, index: number) => K>(
    items: Iterable<genericof KeySelector>,
    keySelector: KeySelector,
): { [key in T]?: KeySelector<key>[] };

#1213 seems to be the canonical issue for higher-kinded types, in case you want to follow it (or someone there is looking for a use-case here).

Alternatively, you might get clever and create a mapping from KT directly, using mapped object types:

declare function groupBy<TypeMap extends {}, Key extends keyof TypeMap = keyof TypeMap, KeySelector extends (item: TypeMap[Key], index: number) => Key = (item: TypeMap[Key], index: number) => Key>(
    items: Iterable<TypeMap[Key]>,
    keySelector: KeySelector,
): { [key in Key]?: TypeMap[key][] };

groupBy<{ "string": string; "number": number }>([1, "2", 3], <K extends number | string>(value: K): K extends string ? "string" : "number" => typeof value === "string" ? "string" : "number");

This actually works! But (a) it’s a hack, and TypeScript doesn’t accept hacks in the core library and (b) it requires you to always specify TypeMap, which isn’t possible to infer, and as noted above TypeScript doesn’t usually accept types that are complex to use in this way. Also, perhaps more importantly, is that it’s no more ergonomic than just overriding the return type of groupBy yourself.

Here’s the TypeScript playground I used for the above.

I hope that clears things up! I sort of hate to write stuff like this because it feels very conservative and self-justifying; if you have a problem with either of these answers I absolutely support starting a constructive dialogue with the TypeScript team to try and change these things—it may be the case that this is the issue that tips the balance :)

(Also, just as a matter of etiquette, I personally wouldn’t @-mention a maintainer in the future if another commenter disagrees with your take. I have to imagine it’s pretty annoying to be called in to adjudicate things, especially when another commenter has explained the next course of action (raise a new issue), and it’s pretty likely they already have notifications on for this thread and will see it in due time. In most open-source repos it’s a pretty fast way to get your position ignored.)

@slorber
Copy link

slorber commented Jun 28, 2024

Returning Partial Partial<Record<K, T[]>> is quite annoying in practice

See for example:

CleanShot 2024-06-28 at 10 55 40

We now have to deal with entries with undefined values, which technically is not really possible in practice.

Isn't there a way to express a Record where "only keys are Partial", but values are always defined when the key exists?


I tried various things but not sure to have a good solution.

Here's my last attempt:

type SubsetRecord<K extends PropertyKey, T> = {
  [P in K]?: T;
};


declare function groupBy<K extends PropertyKey, T>(
  items: Iterable<T>,
  keySelector: (item: T, index: number) => K,
): SubsetRecord<K, T[]>


type User = {name: string,age: number, type: "a" | "b" | "c"}

const users: User[] = [{name: "Seb",age: 42, type: "a"}]

const groups = groupBy(users, u => u.type);


// ✅ entries do not have undefined 
Object.entries(groups) satisfies [string,User[]][]

// ✅ 
groups["a"]!.toSorted()

// ✅ 
groupBy(users,u => "Seb")["Seb"]!.toSorted()


// ❌ should be forbidden: value can be undefined
// @ts-expect-error
groups["a"].toSorted()

// ❌ should be forbidden because keys do not match
// @ts-expect-error
groupBy(users,u => "Seb")["blabla"] 

// ❌ should be forbidden: value can be undefined
// @ts-expect-error
groupBy(users,u => "Seb")["Seb"].toSorted()

@ktkiiski
Copy link

ktkiiski commented Oct 1, 2024

Returning Partial Partial<Record<K, T[]>> is quite annoying in practice

See for example:

CleanShot 2024-06-28 at 10 55 40

We now have to deal with entries with undefined values, which technically is not really possible in practice.

Isn't there a way to express a Record where "only keys are Partial", but values are always defined when the key exists?

I agree with this.

In cases where the type of the group key K becomes string (instead of a more narrow type such as "a" | "b" | "c"), having the Partial there is unnecessary and causes these issues where, according to the typing, the object values could be undefined, even though they never are in practice.

This all comes down to TypeScript's limits to separate if the property is there at all vs. if they exist with value undefined, when dealing with this kinds of types, so for now that's something that we need to live with.

However, imo the developer experience would be greatly improved if the signature of Object.groupBy is changed to this:

interface ObjectConstructor {
    /**
     * Groups members of an iterable according to the return value of the passed callback.
     * @param items An iterable.
     * @param keySelector A callback which will be invoked for each item in items.
     */
    groupBy<K extends PropertyKey, T>(
        items: Iterable<T>,
        keySelector: (item: T, index: number) => K,
    ): string extends K ? Record<K, T[]> : Partial<Record<K, T[]>>; // 👈 changed return type
}

The change is on the return value of the method: it now conditionally returns the regular "dictionary" type, e.g. Record<string, User[]>, without unnecessary Partial.

Now, the previously mentioned unnecessary undefined in the value type of e.g. Object.entries would be gone.

This return type still matches the implementation, but is more useful in cases where the grouping key is just the broad type of string.

Also, with this change, it also leaves up to the developer to choose their noUncheckedIndexedAccess configuration depending on how they would like to deal with possibly undefined values with indexed property access.

What do you think?

@bakkot
Copy link
Contributor Author

bakkot commented Oct 1, 2024

I think having the types incorrectly suggest that every key is present is worse than having the types incorrectly suggest that keys which are present might have value undefined. The former will accept incorrect programs, which you discover by running into bugs at runtime and can only fix by globally enabling noUncheckedIndexedAccess (which affects your whole program), whereas the latter will reject correct programs, which you discover at compile-time and can fix by adding ! or other casts to the specific cases where you need it.

So I do not think that would be a good change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
For Backlog Bug PRs that fix a backlog bug
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Add Array.prototype.groupBy[toMap] to JS lib definitions.
8 participants