Remove RawLiteralType synthetic type #6121

Michael0x2a · 2018-12-31T01:44:02Z

This diff changes how we track raw literal types in the semantic analysis phase. It makes the following changes:

1. Removes the RawLiteralType synthetic type.
2. Adds a new TypeOfAny: TypeOfAny.invalid_type as suggested in Better error message for "Invalid type" #4030.
3. Modifies AnyType so it can optionally contain a new RawLiteral class. This class contains information about the underlying literal that produced that particular TypeOfAny.
4. Adjusts mypy to stop recommending using Literal[...] when doing A = NewType('A', 4) or T = TypeVar('T', bound=4).

   (The former suggestion is a bad one: you can't create a NewType of a Literal[...] type. The latter suggestion is a valid but stupid one: T = TypeVar('T', bound=Literal[4]) is basically the same thing as T = Literal[4].)

This resolves Regression: error messages for literal expressions in invalid places are worse #5989.

The net effect of this diff is that:

1. RawLiteralTypes no longer leak during fine-grained mode, which should partially help unblock Add tests for Literal types with incremental and fine-grained mode #6075.
2. The way mypy handles literal expressions in types is "inverted".

   Previously, we by default assumed literal expressions would belong inside Literal[...] and tacked on some logic to make them convert into error AnyTypes. Now, we do the reverse: we start with an error AnyType and convert those into Literal[...]s as needed.

   This more closely mirrors the way mypy used to work before we started work on Literal types. It should also hopefully help reduce some of the cognitive burden of working on other parts of the semantic analysis code, since we no longer need to worry about the RawLiteralType synthetic type.
3. We now have more flexibility in how we choose to handle invalid types: since they're just Anys, we have more opportunities to intercept and customize the exact way in which we handle errors.

Also see Better error message for "Invalid type" #4030 for additional context. (This diff lays some of the groundwork for that issue.)
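The "inverted" flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not mypy's actual code: TypeOfAny, AnyType, RawLiteral and LiteralType are heavily simplified stand-ins for the real classes, and analyze_literal_expr and analyze_literal_subscript are hypothetical helper names invented for this sketch.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Union

LiteralValue = Union[int, str, bool]


class TypeOfAny(Enum):
    # Simplified stand-in; the real enum has more members.
    from_error = auto()
    invalid_type = auto()


@dataclass(frozen=True)
class RawLiteral:
    """Remembers the literal expression that produced an error Any."""
    value: LiteralValue


@dataclass(frozen=True)
class AnyType:
    type_of_any: TypeOfAny
    raw_literal: Optional[RawLiteral] = None


@dataclass(frozen=True)
class LiteralType:
    value: LiteralValue


def analyze_literal_expr(value: LiteralValue) -> AnyType:
    """Hypothetical: a bare literal in a type position starts as an error Any."""
    return AnyType(TypeOfAny.invalid_type, RawLiteral(value))


def analyze_literal_subscript(args: List[AnyType]) -> List[LiteralType]:
    """Hypothetical: inside Literal[...], recover the literal from each Any."""
    result = []
    for arg in args:
        if arg.raw_literal is None:
            raise ValueError("argument to Literal[...] is not a literal expression")
        result.append(LiteralType(arg.raw_literal.value))
    return result
```

With this shape, code outside of Literal[...] only ever sees an ordinary AnyType, and only the Literal[...] handling needs to know about RawLiteral.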


Michael0x2a · 2018-12-31T01:49:00Z

One meta-note:

I think my "fix" of just removing the assert from the astmerge visitor in #6075 might have actually been fine: after analyzing the fine-grained logic, it seemed to me that encountering RawLiteralTypes during that phase of the update logic might have actually been expected behavior.

However, I decided that it would just be easier to rip out RawLiteralType altogether instead of sitting down and formalizing this hunch.

I do feel a little bad that this PR tacks on some additional cruft to AnyType, but we're already tracking a lot of provenance-related info in that class, so I felt this wasn't too extreme of a change.

JukkaL · 2019-01-02T16:56:56Z

This seems too hacky to me. The new responsibility of AnyType seems like an implementation shortcut but doesn't really make sense conceptually, at least to me. Is there something that makes it hard to fix the leaking of raw literal types in fine-grained incremental mode using the existing implementation technique?

Adding TypeOfAny.invalid_type while also keeping RawLiteralType sounds like a more promising approach to me, though I haven't thought about this very carefully.

ilevkivskyi · 2019-01-03T14:43:46Z

I am actually with Jukka here.

Also, if we add TypeOfAny.invalid_type we can also easily remove this TODO item in typeanal.py:

        # TODO: Would it be better to always return Any instead of UnboundType
        # in case of an error? On one hand, UnboundType has a name so error messages
        # are more detailed, on the other hand, some of them may be bogus.
        return t

ilevkivskyi · 2019-01-03T14:45:59Z

Also, it would make sense to use invalid_type in some other places like this:

        else:  # sym is None
            if self.third_pass:
                self.fail('Invalid type "{}"'.format(t.name), t)
                return AnyType(TypeOfAny.from_error)  # <- here

Michael0x2a · 2019-01-03T20:00:35Z

> Adding TypeOfAny.invalid_type while also keeping RawLiteralType sounds like a more promising approach to me, though I haven't thought about this very carefully.

The main problem with this approach is that I think adding just TypeOfAny.invalid_type doesn't actually buy us much. It lets us determine broadly where that TypeOfAny came from, but I don't think we'd necessarily have enough context to do anything useful in places that end up using that Any.

And if we do add some context to Any, I think we'd end up with a solution similar to what this PR does. That was actually what I was hoping to eventually do: follow up by generalizing RawLiteral so it carries context about any kind of invalid type (and I'd perhaps rename the class to InvalidTypeContext or something).

In any case, I'll look into just plugging the leak as well -- but I still think it's worth adding in some variation of this PR if we want to add TypeOfAny.invalid_type.
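To make the point about context concrete, here is a rough sketch of what a generalized InvalidTypeContext might look like. Everything in it is an assumption for illustration: the fields, the InvalidAny wrapper and the describe helper are hypothetical, and the message strings are examples rather than a claim about mypy's exact output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InvalidTypeContext:
    """Hypothetical generalization of RawLiteral to any invalid type."""
    source_text: str   # the text the user wrote in the type position
    is_literal: bool   # True if that text was a literal expression


@dataclass(frozen=True)
class InvalidAny:
    """Stand-in for an AnyType whose kind is TypeOfAny.invalid_type."""
    context: Optional[InvalidTypeContext] = None


def describe(t: InvalidAny) -> str:
    """Build an error message; more context allows a more specific message."""
    if t.context is None:
        # Only the kind of Any is known, so nothing specific can be said.
        return "Invalid type"
    if t.context.is_literal:
        return "Invalid type: try using Literal[{}] instead?".format(
            t.context.source_text)
    return 'Invalid type "{}"'.format(t.context.source_text)
```

Without the context object, every consumer of the Any falls into the first branch, which is the limitation described above.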

Michael0x2a · 2019-01-04T02:14:45Z

> Is there something that makes it hard to fix the leaking of raw literal types in fine-grained incremental mode using the existing implementation technique?

Ok, after doing more poking around, I'm pretty confident that we're not actually leaking anything and that removing the assert from astmerge's TypeReplaceVisitor.visit_raw_literal_type is the correct thing to do.

In short, we're invoking this class within NodeReplaceVisitor.fixup_type. One of the places this function is called is inside NodeReplaceVisitor.process_base_func, which looks like this:

def process_base_func(self, node: FuncBase) -> None:
    self.fixup_type(node.type)
    node.info = self.fixup(node.info)
    if node.unanalyzed_type:
        # Unanalyzed types can have AST node references
        self.fixup_type(node.unanalyzed_type)

It looks like node.unanalyzed_type has (intentionally) not been semantically analyzed, which is why we're encountering the RawLiteralTypes. This is also probably why TypeReplaceVisitor implements visit_unbound_type (I didn't notice it did until just now).
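A toy model of the situation, as a sketch only: the classes below are simplified stand-ins rather than the real astmerge code, types are modeled as flat lists, and the raw_literals_seen counter exists purely so the behavior can be observed. It shows why a visitor that also walks unanalyzed types has to tolerate raw literal types instead of asserting.

```python
from typing import Dict, List, Union


class Instance:
    """Stand-in for a type that refers to a class by name."""
    def __init__(self, name: str) -> None:
        self.name = name

    def accept(self, visitor: "TypeReplaceVisitor") -> None:
        visitor.visit_instance(self)


class RawLiteralType:
    """Stand-in for a literal expression seen before semantic analysis."""
    def __init__(self, value: object) -> None:
        self.value = value

    def accept(self, visitor: "TypeReplaceVisitor") -> None:
        visitor.visit_raw_literal_type(self)


class TypeReplaceVisitor:
    def __init__(self, replacements: Dict[str, str]) -> None:
        self.replacements = replacements
        self.raw_literals_seen = 0  # for illustration only

    def visit_instance(self, t: Instance) -> None:
        # Swap stale references for their replacements.
        t.name = self.replacements.get(t.name, t.name)

    def visit_raw_literal_type(self, t: RawLiteralType) -> None:
        # Tolerate the node rather than asserting: unanalyzed types
        # can legitimately still contain raw literals.
        self.raw_literals_seen += 1


class FuncBase:
    def __init__(self,
                 type: List[Instance],
                 unanalyzed_type: List[Union[Instance, RawLiteralType]]) -> None:
        self.type = type
        self.unanalyzed_type = unanalyzed_type


def process_base_func(node: FuncBase, visitor: TypeReplaceVisitor) -> None:
    for t in node.type:
        t.accept(visitor)
    # The unanalyzed type is visited too, so raw literals reach the visitor.
    for u in node.unanalyzed_type:
        u.accept(visitor)
```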

ilevkivskyi · 2019-01-04T17:05:59Z

@Michael0x2a as I mentioned in #6075, I am fine to merge it as is. So you can keep only the parts necessary to fix #5989 in this PR. Optionally, you can also fix the TODO item I mentioned above.

ilevkivskyi · 2019-01-04T17:08:56Z

(Also I would keep TypeOfAny.invalid_type because it will be useful for #4030.)

Michael0x2a · 2019-01-04T19:02:07Z

I think I'll just close this PR and submit a new one with the error message fixes.

Michael0x2a mentioned this pull request Dec 31, 2018

Add tests for Literal types with incremental and fine-grained mode #6075

Merged

Fix typo

8f3e5ad

Michael0x2a mentioned this pull request Dec 31, 2018

Literal types tracking issue #5935

Closed

42 tasks

Michael0x2a mentioned this pull request Jan 3, 2019

Release 0.660 planning #6130

Closed

Michael0x2a closed this Jan 4, 2019

Michael0x2a mentioned this pull request Jan 6, 2019

Improve error messages related to literal types #6149

Merged
