18 November, 2020 Meeting Notes


Remote attendees:

Name Abbreviation Organization
Waldemar Horwat WH Google
Bradford C. Smith BSH Google
Robin Ricard RRD Bloomberg
Jordan Harband JHD Invited Expert
Istvan Sebestyen IS Ecma
Richard Gibson RGN OpenJS Foundation
Frank Yung-Fong Tang FYT Google
Michael Ficarra MF F5 Networks
Chengzhong Wu CZW Alibaba
Chip Morningstar CM Agoric
Sergey Rubanov SRV Invited Expert
Michael Saboff MLS Apple
Devin Rousso DRO Apple
Shaheer Shabbir SSR Apple
Leo Balter LEO Salesforce
Caio Lima CLA Igalia
Marja Hölttä MHA Google
Shane F. Carr SFC Google
Sven Sauleau SSA Babel
Mathias Bynens MB Google
Markus W. Scherer MWS Google
Daniel Rosenwasser DRR Microsoft
Mark E. Davis MED Google
Daniel Ehrenberg DE Igalia

JSON modules for Stage 3

Presenter: Daniel Ehrenberg (DE), Dan Clark (DDC)

DDC: Okay, JSON modules for stage 3. A quick recap: this was originally split off from import assertions, so this is not about the import assertions mechanics. This is just the bit about what a host should do when the assertions list includes the `type: "json"` assertion. If that assertion is present, the host must either reject the import, or the module must be treated as a JSON module, which is to say the module content must be parsed as JSON and the resulting object is the module's default export. There are no other exports. Type assertions are not required on all hosts. Hosts like the web, where there's a MIME-type security concern, will want to require this `type: "json"` assertion (the syntax on this slide is out of date), but other hosts that don't have the same conservative security concerns as the web can use this syntax without the assertion being present and still interpret the module as JSON. So it's optional. I think the big question here from last time is whether these should be mutable or not. It is the position of the champion group that the JSON object should be mutable. That's more natural to developers who are used to mutability from existing JSON module systems, where mutability is the default. And there's just this issue: if you need immutability, if you need your JSON not to be changed, you have some workarounds to get this by reordering your imports and ensuring you're able to lock down the JSON before other modules can import it. But if you need mutability and the JSON is immutable by default, then you're kind of stuck.
The only way you can then make changes to that imported object is to do a deep copy. As for the shape of the imported value: the default export is the entire JSON object. There are no named exports, and the exported JSON is just made up of objects and arrays, like what you'd get from JSON.parse, not records or tuples like you'd get from a parseImmutable. We've previously got external positions on this: TAG review signed off, and the Mozilla position is "worth prototyping". These positions were obtained when this was paired with import assertions, but the JSON modules part of the proposal hasn't really changed since we got these approvals. There's an HTML integration PR for this up; again, most of the complexity there is in the import assertions stuff, and the JSON modules part hasn't really changed since the last time we discussed the proposal. And the spec has been split off from the import assertions spec. It's pretty much just the couple of bullet points that say what must happen when the `type: "json"` assertion is present, and that plugs into the Web IDL synthetic module spec, and that's pretty much all there is to it. We plan to ask for stage 3 this meeting. We should probably open it up to the queue first; I think the biggest question here is going to be - if I can find the slide - this mutable versus immutable discussion.
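As a concrete illustration of the mechanism being described (using the assertion syntax from the import assertions proposal as it stood at this meeting; the exact keyword was still evolving), a JSON module import looks roughly like:

```js
// On hosts with MIME-type security concerns (e.g. the web), the assertion
// is required; other hosts may accept the import without it.
import config from "./config.json" assert { type: "json" };

// The parsed JSON value is the module's default export;
// there are no named exports.
console.log(config);
```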

MM: So this is just something that I became curious about, in no sense an objection, which is why did we - you know, in terms of non-JavaScript resources, it seems like the most natural place to have started would be to read in the contents of a file either as binary, turning into a big Uint8Array, or as UTF-8, turning into a string, and then, you know, JSON.parse of a string could have given us JSON resources indirectly. Not that that would be preferable, but it's certainly a more universal place to start. So, why did we not start with binary and text?

DE: I think that would be an interesting proposal and I'd encourage people to Champion it. I mean we see that pattern in tooling today. It's not really clear which mime type we should be expecting for that. There are several different mime types that we could use for binary. I think JSON is used very widely in JavaScript programs. So it's a natural thing to include and the relative priority of those two things is just kind of subjective.

MM: Okay.

CM: Yeah, just some issues about mutability. I'm a little concerned about the "well, just control the order of imports" as being the explanation for why mutability might be okay, since this presumes that you're in complete control of the entire ordering of the import graph, which seems like an implausible state in the general case, and I'm wondering: do you know something I don't about how one controls what order imports happen in?

DE: I can answer that question. Also, I think I can see how this is a concern, and I'm not really sure if it's practical to control this ordering. What you can do is: any caller can freeze it, because every importer gets the same identity. This is one of those cases where we have conflicting arguments, and somehow both sides - if we want to move forward with this proposal we kind of have to agree to disagree and make a compromise.

CM: Because I mean, I think that obviously there's no way you can have a compromise position without making the proposal more complicated somehow and I understand where that might be a non-starter. But you know if I import the JSON module and then immediately freeze it I still have no way of knowing whether somebody else has imported it and modified it when I wasn't looking, you know, somehow somebody else got to it prior to my getting to it.

DE: I want to go back and disagree with you where you say that allowing both would be too complicated. I think that would be a totally valid thing to do in a follow-on proposal, in particular when we have the evaluator attributes that were mentioned in the import assertions proposal. We're really interested in hearing about more different use cases, and one that MF raised previously was that you could have module attributes that change how a JSON module is interpreted. So one way of changing how it's interpreted would be to parse it as records and tuples, or as frozen objects and arrays. I think that would be a valid thing to do. To continue to free associate: how do you know that no one else imported it and then modified it? But also, how do you know that - like, in node, nobody wrote to the file system before you did your module read, or, on the web, installed a service worker to intercept the network? Ultimately, you kind of have to be in control of things in order to have guarantees about what's going on. There's a lot of things going on.

CM: I mean, it's certainly true that there are lots of places for things to go off the rails and you have to be in control of all of them, but if there were some way to direct the import to give you an immutable form - I think in conjunction with the records and tuples proposal might be a good place to stand to do that, or simply some kind of annotation or attribute which said "give this to me in an immutable form, please". I just think that if you want it mutable it should be you doing it. You want to be sure the thing imported was actually the thing that you intended to import. But arranging to have a trusted path to the resource is a separate problem, and it's something you need for any import. For something which is code, code is in control of itself - it gets a chance to have a say on what it is that's being returned prior to anybody else getting a handle to it. Whereas something which is pure data has no agency, and therefore any kind of qualities that you want to ensure have to be ensured extrinsically. Whether they're ensured by the platform or by the consumer of the data can make a lot of difference in how much effort it is, how reliable and trustworthy the mechanism is, and whether you push a lot of additional complexity onto users. I just generally think that the need for immutability here is real, and while I understand some people want to have a mutable form as well, I would be much happier if the need for immutability were given more weight than just "oh, well, that's your problem".
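The "any caller can freeze it" workaround discussed here can be sketched in plain JavaScript. Because every importer of a module sees the same object identity, one deep freeze locks the value down for all of them. (The `JSON.parse` literal below stands in for a hypothetical imported JSON module; `deepFreeze` is an illustrative helper, not part of any proposal.)

```javascript
// Recursively freeze an object graph. JSON data has no cycles, but the
// isFrozen check also makes repeated calls cheap.
function deepFreeze(value) {
  if (value !== null && typeof value === "object" && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const key of Object.keys(value)) {
      deepFreeze(value[key]);
    }
  }
  return value;
}

// Stand-in for: import config from "./config.json" assert { type: "json" };
const config = deepFreeze(JSON.parse('{"port": 8080, "hosts": ["a", "b"]}'));

// Later mutation attempts are ignored (or throw in strict mode):
// config.port = 9090;  // no effect once frozen
```

As CM notes, though, this only helps if the freezing module runs before any other importer touches the object.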

JHD: In general, this kind of seems like the same issue that exists with any shared object, which includes anything that's exported from a JavaScript module. If I export default an object from a module and I import it, how do I know that another module hasn't mutated that object before I got to it? If I freeze it, how do I know that something that imports it later doesn't need to modify it? This sort of interaction always happens with a thing that is shared and mutable.

CM: If I'm returning it I can freeze it. I know I've frozen it. If somebody else really needed to change it and expected it to be mutable, they will have an error and they will know that they now have a problem. Whereas the opposite failure, when it needed to be immutable but it was not -- that can manifest in a security problem that only surfaces much later.

JHD: Yeah, so CM, you're envisioning a use case where you're creating a JSON file. I feel like I understand the use case where you're creating a JavaScript object and you want that thing to be immutable, but you're envisioning a use case where you create a JSON file and you somehow need consumers not to be able to mutate an object.

CM: Well, let's say I have a block of configuration data, and I want to be able to have people read their configuration without being able to change what configuration is seen by other people who also read the configuration. I might want to return an object that conveys some configuration information and do that in a way that doesn't provide a side channel between different importers.

JHD: right, but if you want it to be immutable you already have the choice of doing that by making your module not be JSON, by having it be a JS module.

CM: And what I would do is I would have to have some more complicated thing. You could wrap the mutable import in a module whose job is to import JSON modules and return them in immutable form, and you could develop a package that does that, and maybe that's going to be something that appears on npm in short order after this goes out. So this is not a catastrophic "oh my God, the world is ending" objection. It's just, I'm cranky.

JHD: Yeah. I mean, I agree with you - a simple Babel transform, for example, could do that, where it could take JSON as an input and spit out a JavaScript module that has inlined data that's frozen. But there's no trivial way to do the inverse and unfreeze, and deep cloning is also not a trivial thing. That is true.
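JHD's asymmetry point can be made concrete: freezing is one call, but "unfreezing" requires a deep copy. For pure JSON data a serialize/parse round-trip is a workable sketch of that copy (it is safe here only because JSON values contain no functions, symbols, dates, or cycles):

```javascript
// A deeply frozen value, standing in for an immutable JSON module export.
const frozen = Object.freeze({ retries: 3, backoff: Object.freeze([1, 2, 4]) });

// The inverse of freezing: a full deep copy, O(size of the data).
const mutableCopy = JSON.parse(JSON.stringify(frozen));
mutableCopy.retries = 5; // fine: the copy is ordinary mutable data

console.log(frozen.retries, mutableCopy.retries); // 3 5
```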

DE: So if I could respond - I'm sympathetic to both CM's and JHD's arguments, especially to the argument that immutability is often the right default. At the same time, we've been discussing this particular issue for several months over a number of different TC39 meetings, and I'd like guidance from the committee on how to decide one way or the other. What do you all recommend? I feel like we've articulated these arguments.

CM: You’re looking for something more useful than "don't argue do it my way".

DE: I guess we have to - I mean, we could make a decision one way or the other, which is kind of like that, or we could not do this proposal. We have to decide among those three options.

YSV: Okay, so I just want to note I stepped out for maybe two minutes and I missed the exact topic that's being discussed right now. So to verify: we're talking about the decision around whether the singleton object returned from this module is mutable or frozen, right?

CM: Indeed.

YSV: Okay, so from our side, when we reviewed the proposal, our feeling was that this should be immutable. And there should potentially be a second proposal, something like a JSON.module method, which gives you the behavior that you would expect from JSON.parse: it gives you a copy of the JSON module that you can then mutate. But making that a separate proposal was what appealed to us.

DE: Could you elaborate on what you mean by this second proposal? Previously I mentioned the possibility of using module attributes. I have no idea what JSON.module would be.

YSV: This would be a completely separate proposal that would leverage the import capability that we currently have, but it would instead give you back something that is a copy, and mutable - so it breaks from what our module systems do right now, which is to give you back a singleton. In this JSON.module version, which would take a string like a URL specifier, it would use that same infrastructure, but from the immutable JSON object coming out of the JSON file, we would create a copy that is then mutable. That's what we were thinking might be an interesting direction.

DE: So, I'm a little confused by the suggestion, because you're talking about something different. I agree that we could provide mutable imports in a follow-up proposal, but I don't understand the details of that JSON.module proposal. Do you have anything written about it that I could refer to?

YSV: We just have a couple of notes between ourselves, but effectively what I'm trying to get across is: I think that the direction you currently have, which is immutable singletons for JSON modules, is the right direction, and for the follow-on proposal - we can figure out the details about that later - what we could do is leverage what we have here and make a copy that's mutable.

CM: I believe the proposal that's on the table that Daniel and company are proposing is for the imported object to be mutable. And that's what I'm objecting to, right?

YSV: Okay, so we think it should be immutable as well.

DE: Okay. Right. So I think that the champion group has expressed flexibility on this question - would you agree with that, Dan? Yes. So, do we have any further thoughts from the committee, especially from people who haven't spoken yet, on this question of whether it should be mutable?

KG: Great, so Dan was proposing a specific question that we're asking for, like, positive or negative feelings on - what was the specific question, Dan?

DE: If it's phrased in terms of positive or negative feelings, the positive feeling would be towards the aspect of this proposal that we're making, that JSON modules be mutable. If you vote negative, then you think JSON modules should be immutable by default. Either option we choose leaves open various forms of follow-on proposals for making the opposite choice available in a different form, as YSV mentioned. All right, so we'll leave that open. RPR, I think you're up next.

RPR: So, only because we're really being asked to chip in here. From my point of view, I think immutability provides better overall behavior because, certainly for one of our use cases, it would enable safe sharing across multiple users. However, I don't see that as the high-order bit. The most important thing is that we get something that is acceptable to the host environments that are going to use this. So that means something that is acceptable to Node and something that is acceptable to browsers, and I feel like our steer on this in the past has been guided by input that we've had from those constituencies. So this is a mild push towards immutability, but, you know, it could go either way.

DE: So the feedback we've gotten from web browsers so far, up until right now, has mostly been towards mutable JSON modules. And I can't really speak for Node; I can't speak to the landscape of opinions there.

RPR: So I worry a little bit that if we just express the TC39 view here - sure, we may wind up preferring immutability, but that could then cause contention down the line. So, that's all. Thank you.

SYG: [written]: "What does immutable mean exactly? Deep frozen or a new mechanic?"

DE: I think immutability, as I've been picturing it, would be deeply frozen: each array or object would have Object.freeze applied to it. So I feel like this is something that we would need to write up concrete spec text for before asking for consensus, and we haven't done that yet, given the things expressed on the issue so far. Thanks.

KKL: Who would block if mutable and who would block if immutable?

DE: Can we do that question after we talked through the substantial points? Like once we run out of the queue then we could ask about who would block.

RGN: Okay, so thinking about the motivations for the proposal and it being a bridge from the present to the future, what authors have available to them now is to load JSON as JavaScript and it inherently comes back mutable. So if we preserve that characteristic then we still provide a way to get safety of no longer executing code without breaking any current use cases that people might have where they modify the data coming back. This may not be the strongest argument, but I think that it provides a better transition and then follow-ups can introduce ways to have stronger guarantees of immutability.

DRO: So maybe I missed this before, but there was a question earlier about, you know, other proposals like binary data and whatnot. My personal preference, if you just gave me this feature, would be for this to be immutable. But I'm vaguely ambivalent - if you gave it to me either way, for most of my use cases it wouldn't make a difference. But if, long term, you have other data formats coming in as, you know, arrays or something like that, and those are mutable, well, it might be strange to have that difference. So I don't know if there's a specific thing in mind there, but I would think that might push the direction towards mutable all the time.

DE: Yeah, I want to agree with you there. I think it's clear that there are lots of module types that we want to add in the future where there's no reasonable way to make them immutable. For example, CSS modules could also be thought of as a data structure which you could make immutable, but CSS modules will definitely be normal, mutable CSS things. So this proposal could not establish any kind of precedent that everything is immutable by default.

YSV: So I just want to clarify what I said earlier: we're not pushing for this to be immutable. It might be better if we have it as immutable, because chances are that, due to the expectations that users have had from JSON.parse - not from require of JSON, as Jordan brings up - they may be expecting to work with a copy, and this could lead to hard-to-catch bugs. So for that reason, it may be a good idea to make it immutable. But if we do that, then we may want to have a second API that does create the copy. That's more our position, rather than "it must be immutable". There's a slight preference for immutability, and I think Daniel, who just spoke, made a really good point that we might want to think about whether future data types that come in will be mutable or not, and how the decision we make here will impact that later.
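The expectation mismatch YSV describes can be shown in a few lines: JSON.parse hands each caller an independent fresh copy, while a module import would hand every importer the same singleton object, so mutations would become visible across importers.

```javascript
const text = '{"theme": "dark"}';

// JSON.parse: every call produces a fresh, independent value.
const a = JSON.parse(text);
const b = JSON.parse(text);
a.theme = "light";

console.log(b.theme); // "dark" - parse results are independent copies
console.log(a === b); // false - unlike a module import, which is a singleton
```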

MM: Yeah, so the Uint8Array for binary, I think, does not argue that binary should therefore be shared mutability; it more argues that we should have an immutable way to represent binary data. We have an immutable way to represent text data, which is strings, and people have even used strings for binary data. The bigger point is that Moddable has repeatedly raised the desire to have some form of raw representation of various collections, and likewise the Moddable way of treating pre-compiled modules really naturally wants pre-compiled resources, like JSON resources, to be something they just put in ROM without having to do the extra bookkeeping to shadow that with mutability. So I think that once we extend beyond JSON, we're going to continue to have desires for immutability being the default, and to adjust the data types that we provide to deal with that.

Conclusion/Resolution

  • We will revisit this topic later

Tour of Temporal

Presenter: Ujjwal Sharma (USA)

USA: Welcome everyone to Temporal. This is a stage 2 status update, but this one's special, and let me tell you why. If you've been feeling lost in the last couple of months - we've been doing so much, and there's so much lost context - let me start from the beginning. What is Temporal? It is this new date/time library that we've been working on for JavaScript. One of the bigger points on which we have consensus, more or less, is that Date is not ideal, and we're planning to improve on that. We have multiple types for different kinds of data, which avoids bugs; it's strongly typed in a nice way, and I really like it. All the date/time objects are immutable. Mutability, as we just discussed, is not the best thing, especially when you're dealing with dates and times. And of course we have strong support for internationalization, because as we know, dates and times are inherently concepts that you need to perceive in your locale. The importance of separating types is something that the champions group has focused a lot on since the beginning. The types represent the information that you actually have; we don't want you using loose types that do not exactly represent what you have. They also avoid the buggy pattern of filling in default values, like zeros or UTC, in cases where you do not actually have such a value, and they always do the appropriate calculations based on which type you're in. So, how's it going, you may ask? Oh, thanks for asking. Temporal is stable now. The champions group vigorously debated and reached conclusions that we believe will work best for everyone. We have now settled on a complete design proposal. The polyfill, spec, documentation - everything is now complete. You can go on npm right now, download version 0.6 of the polyfill, and use that to check things out. We want feedback from everyone.
To make any further refinements before we go for stage 3, the plan is to continue the review period and ask for stage 3 in two or more months. Unless some major issues come up in review, hopefully things will go smoothly and you'll like what we've come up with. We'll try to see if we can go for stage 3 in January, but if some people haven't had time to complete their review, we'll obviously give you more time. That's not a problem at all.

USA: So let's get into it. What are the Temporal types? The data model is based on the common use cases that we've examined, and all the types are serializable using ISO 8601. There are a few extensions that we have planned, and they're being worked out on the standards track - for example, the time zone and calendar extensions; I have more on that in a bit, but don't worry. There are plain types, which do not have a time zone, and then there is a zoned type, which does have a time zone. Most types have a calendar. This is because a lot of operations are inherently calendar-dependent. If you ask me to add a month to a date, well, how big is a month, really? That is a calendar-based operation, and for that you need to specify which calendar should do the operation. Because of this, we have first-class support for additional calendars, which is really cool. And then we have the date/time information, which is held in internal slots that are never changed after initial construction. This means that the data types are purely functional, and the API will really give you lots of joy when you see it.

USA: So first off we have the simplest type, which is PlainDate. It is a type that has a calendar date and no time. If you ask me "what's the date today?", that's the kind of thing we have here. You can initialize a date - say, year 2006, month 8, and day 24 - and you can query these slots: you can ask what the year is and it'll give you 2006, you can check if it's a leap year (it's not, for 2006), and you can get the [?]. There are a lot of fun things you can do with dates. PlainTime works in a similar way: it represents a time in a single day. You can construct a time and check out the exact nanoseconds - that's the kind of precision we're working with.
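The slide examples being narrated here look roughly like the following sketch, based on the version 0.6 polyfill API described in the talk (Temporal was a stage 2 proposal at this point, so names and details were subject to change):

```js
// PlainDate: a calendar date with no time and no time zone.
const date = new Temporal.PlainDate(2006, 8, 24);
date.year;       // 2006
date.inLeapYear; // false

// PlainTime: a wall-clock time within a single day, down to nanoseconds.
const time = Temporal.PlainTime.from("19:39:09.068346205");
time.nanosecond; // 205
```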

USA: Next up we have PlainDateTime. PlainDateTime represents a date and a time, with a calendar of course, but without a time zone. For example, if you have December 7th, 1985 at 3 p.m. in the Gregorian calendar, this is the kind of data type that you would use. It is used when the time zone data is not known, not available, or not needed. When I'm going about my day-to-day life, for example, I don't think too much about time zones - not until I need to coordinate a meeting with friends. So you can use this type when, say, you're working on a fitness tracker or something like that and don't really care about the exact time zone.

USA: Then we have YearMonth. A year-month is a subset of a date; it gives you just the year and the month. You might use it to store, say, the month of a TC39 meeting, or maybe you're storing which month you started working at a particular place. So you can see I have October 2020, and because this year has been awfully long, I can check the days in this month, and I can also see the days in the year and notice that it's slightly longer than average. So that's not ideal.

USA: Then we have PlainMonthDay, which is the opposite of a year-month, really. I can represent, hey, there's my birthday - which is a month and a day, you know, because my birthday happens to exist every year - and then I can convert that to a date by specifying the year and check which day of the week my birthday is on in 2030. Oh, it's on a Sunday. That's not very good. So you can work with that.

USA: Next up we have Instant. Instant represents a point on the universal timeline. It replaces, sort of, the frequent use of UTC that we see in objects like the legacy Date object. It has no time zone. Logically it stands in the place of legacy Date, which also lacks a time zone but confusingly has methods that operate in the current time zone - which rarely works as you'd expect, and makes it more confusing. To access calendar or clock units like month or year, a time zone is needed: you can create an Instant from, say, zero epoch seconds, and then convert that to a zoned time in a particular time zone.

USA: Next up you have durations. These are signed ISO 8601 durations, so they have a direction. You can create a duration just using the from method on Duration, and you can then, for example, call the total method and get the exact number of minutes in that duration.
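A small sketch of the duration API as described (again using the stage 2 polyfill's shape; method options may have changed since):

```js
// Durations are signed ISO 8601 durations: "PT2H30M" is 2 hours 30 minutes.
const d = Temporal.Duration.from("PT2H30M");
d.total({ unit: "minutes" }); // 150

// The sign carries direction: a negative duration points backwards in time.
const back = Temporal.Duration.from("-PT30M");
```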

USA: Next up we have calendars. As I mentioned, we have first-class support for calendars, and the main calendar is ISO 8601. It might sound weird - it's usually called Gregorian, but fun fact, it's actually not Gregorian; it differs slightly from the Gregorian calendar. We have decided to add support for calendars beyond just this one: we are planning to support all the international calendars provided through ECMA-402, and you also have the ability to define custom calendars. Calendars are special: you can have fields in a certain calendar, call dateFromFields on that calendar, and get a date object back, or you can query months in a year or days in a year - these operations in a calendar-specific context, which can be more useful than just the ISO calendar.

USA: Next up we have time zones. Time zones, just like the calendars we talked about, are a major building block of Temporal. All time zones in the IANA time zone database are provided built in, so, you know, any human time zone all over planet Earth is accessible. You can of course use the time zone data yourself to create custom time zones; this allows you to create more esoteric or more specific time zones. You can use these objects to convert between types: you can project instants into date-times, turn date-times into instants, and you can get the offset transitions for a particular time zone.

USA: And finally, to sum it all up, we have ZonedDateTime, as I mentioned: a date-time with a calendar and with a time zone. Interestingly, this type did not exist in the original draft and was not intended to be added; it was added after we started working on the cookbook and realized that it was really, really common for people to couple an instant or a date-time together with a time zone, because a lot of the time you have a time zone as implied context in arithmetic and other operations. So we added this type, which allows you to persist the time zone between different operations and make your life just a little bit easier. The math operations are, of course, adjusted for DST and are compliant with iCalendar, so that's really useful. It's similar to Moment Timezone, except Moment only supports one calendar.
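The ZonedDateTime idea can be sketched like this (string format and method names follow the polyfill API described in the talk; this is a stage 2 proposal, so details are illustrative):

```js
// A ZonedDateTime couples an exact instant with a time zone (and calendar),
// so arithmetic stays in that zone and adjusts for DST.
const zdt = Temporal.ZonedDateTime.from(
  "2020-11-18T10:00-08:00[America/Los_Angeles]"
);

// Advances one calendar day in that zone, even across a DST transition,
// rather than blindly adding 24 hours of elapsed time.
zdt.add({ days: 1 });
```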

USA: Then you ask me: how do I get the current time? Well, that's generally the most common question, and that's done using Temporal.now. Temporal.now is a single object on the Temporal namespace, and it contains all the different functions that you need to get the current time. You can call Temporal.now.instant() to get the current instant, or get the user's time zone, or get the current plain time in the ISO calendar - you can really see all the possibilities here. We have namespaced this into a single place to make it easy to virtualize, block, or change this part of the proposal without affecting everything else, which has really no side channels.
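A sketch of the Temporal.now namespace as described (method names here follow my reading of the then-current polyfill and may not match the final API exactly):

```js
// All ambient "current time" access lives under one object, which makes it
// easy to virtualize or deny in a sandboxed environment.
Temporal.now.instant();      // the current exact time as an Instant
Temporal.now.timeZone();     // the user's current time zone
Temporal.now.plainDateISO(); // today's date in the ISO calendar
```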

USA: Okay, now that we've talked about all these different data types, let's talk about the type relationships. As I said, there are basically two realms that we deal with: one is exact time, and one is the sort of more casual wall-clock time. On one side you have Instant; on the other, you have PlainDateTime and the other plain objects. You can move between them through the time zone - you can go from one to the other - and then there's ZonedDateTime, which is the more overarching type that includes the precision of an instant but the useful API of a date-time. And you of course have the calendar, which plays into all of these types and makes arithmetic great, and you have durations for doing all this arithmetic. Now that I'm talking so much about arithmetic, we can actually move on to the operations.

USA: But first, talking about string persistence — as I mentioned, we're using ISO 8601. What you see in the green box is the canonical ISO 8601 form; the IETF profile of this is called RFC 3339. They're mostly the same, and using this format you can represent a whole lot of the objects and information that you have in Temporal. Unfortunately, this format is kind of old and doesn't include all the information that we need — for example time zones and calendars, which are really the building blocks of what we need — and for that we have proposed a few extensions that allow you to include all this information. We are working with the folks in the IETF calendaring ("calsify") working group and also CalConnect, which is the body that works on these standards on the ISO side of things, to consult with the experts, to align our format, to make sure that the format we're working with ends up in the greater standards universe with more and more tools that operate on it, and to get high-quality feedback from people who are experts in this field, because it's certainly a bit outside our syllabus.

USA: So, as I said, talking about the operations, we have a whole lot of interesting operations to take a look at. Of course you have the Temporal constructors. They are low level: if you have the exact ISO-calendar dates and times, you can create a new object with them directly. However, for most people we created something called from. These "from" methods are convenience static methods on each type. If you have used Rust, for example, or Array.from in JavaScript, you would be familiar with this paradigm — it's a common pattern. It's a convenience method allowing you to use a variety of inputs: you can pass a string that can be deserialized into that type, or you can provide a property bag, or you can provide another object of the same type, and it will be cloned and returned to you. And we have a bunch of really useful properties. We have different fields that can be used to access certain information — time zone, calendar — you can really see here how you can query all that information. And then you have convenience properties that are more like computed properties: you can get the day of the week, or inLeapYear, which is a Boolean that tells you if it's a leap year. It's really fun stuff, and if you're building applications using these types, they come in really handy.
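The "from" pattern described above can be sketched in plain JavaScript. This is an illustrative toy class, not the real Temporal implementation — the actual `Temporal.PlainDate.from` does full validation and calendar handling:

```javascript
// Toy sketch of the "from" convenience pattern (hypothetical class,
// not the real Temporal.PlainDate).
class SimpleDate {
  constructor(year, month, day) {
    this.year = year;
    this.month = month;
    this.day = day;
  }
  static from(item) {
    // Another instance of the same type: clone it.
    if (item instanceof SimpleDate) {
      return new SimpleDate(item.year, item.month, item.day);
    }
    // A property bag: read the fields.
    if (typeof item === "object" && item !== null) {
      return new SimpleDate(item.year, item.month, item.day);
    }
    // A string: deserialize an ISO 8601 date like "2020-11-18".
    const [y, m, d] = String(item).split("-").map(Number);
    return new SimpleDate(y, m, d);
  }
}

const a = SimpleDate.from("2020-11-18");              // from a string
const b = SimpleDate.from({ year: 2020, month: 11, day: 18 }); // from a bag
const c = SimpleDate.from(a);                         // a distinct clone of `a`
```

All three inputs produce equivalent objects, and cloning an instance returns a new object rather than the same reference.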

USA: Next up we have with. Say, for example, you have the current date and you want the first day of the month. Well, you cannot mutate these objects — as I said, they're immutable — so you can't just set the day field to 1. What you can do is call the "with" method, and it returns your date with those changes made — it's actually a new date with the changes applied. That's what I meant when I said these are pretty functional; it's really nice to make all these changes without mutating the original object. Next you have the "to" methods: you call "to" plus the name of the type you want to convert to, to convert objects from one type to another. This is really useful for applications that move between different Temporal types, because we have so many of them. So, for example, if you have a PlainDate you can convert that to a PlainDateTime, and exact types can be converted similarly; these accept strings, options bags — the whole deal. Then we have basic math: add and subtract, which are really simple methods that let you do addition and subtraction. These accept a duration-like: you can pass an options bag with a number of minutes and a number of nanoseconds, maybe a number of days or weeks, whatever you prefer. You can use a Duration instance that you've carried over from a previous operation, or an ISO 8601 string that can be deserialized into a duration, and you can also specify the overflow mode to make sure the arithmetic behaves the way you'd like.
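The immutable "with" behavior can be sketched in plain JavaScript. This is a toy helper on a plain object; the real Temporal `with` methods validate fields and return a proper Temporal instance:

```javascript
// Toy sketch of Temporal's immutable "with" pattern on a plain object.
// (Illustrative only; real Temporal objects are proper classes.)
function withFields(date, changes) {
  // Return a new frozen object; never mutate the original.
  return Object.freeze({ ...date, ...changes });
}

const someDay = Object.freeze({ year: 2020, month: 11, day: 18 });
const firstOfMonth = withFields(someDay, { day: 1 });
// someDay is unchanged; firstOfMonth is a distinct new value.
```

The key design point is that every "modification" produces a fresh object, so code elsewhere holding a reference to `someDay` can never observe a change.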

USA: Then we have the difference methods, since and until. If you have two objects of the same type, you can find the difference between them; depending on which direction you want to go — from later to earlier or from earlier to later — you use one of these two methods, and it returns a Duration object, as I mentioned on the last slide. You can then use this Duration object either to display it or in further calculations.
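As a plain-JavaScript illustration of what an exact-time difference computes at a fixed granularity (Temporal's since/until return a `Temporal.Duration` with configurable units; this sketch just does the underlying subtraction with epoch milliseconds from the built-in Date):

```javascript
// Difference between two exact times, expressed in seconds.
// (Plain-JS sketch of the instant-to-instant subtraction.)
const nov1 = Date.UTC(2020, 10, 1); // months are 0-based: 10 = November
const dec1 = Date.UTC(2020, 11, 1); // 11 = December
const seconds = (dec1 - nov1) / 1000;
// November 2020 has 30 days → 30 * 86400 = 2,592,000 seconds
```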

USA: Next up we have compare. Of course, when you play around with dates so much, you want to compare them at some point, right? If you have a long list of dates and times and you want to sort them, you'll need a compare method. So we have a static compare method on each type that allows you to compare two objects of that type, and because it returns -1, 0, or 1, it is compatible with Array.prototype.sort. We also have a convenience method called equals, which is a subset of compare: instead of calling compare and checking whether the result is 0, you can just use equals, which returns a Boolean — a more common, more specific use case of comparing. We also allow rounding, so you can, for example, round a time to the nearest hour or, as you see here, to the nearest 30 minutes, or round a date to the nearest ten days — whatever you want, really. Rounding is a really important use case, as we identified: while I was working on the duration format proposal, it became evident that rounding is something people really want, especially when displaying values, because you often don't need all that precision — I don't want to know in how many nanoseconds my cab is going to arrive. But if you think about it, it has applications beyond the realm of formatting; rounding is useful for all sorts of arithmetic and is relevant for all the types, so we added it to the main proposal instead.
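Both behaviors can be sketched in plain JavaScript: a three-way comparator with the -1/0/1 shape `Array.prototype.sort` expects, and rounding to the nearest 30 minutes. The function names here are illustrative, not Temporal's API:

```javascript
// Three-way compare (-1, 0, 1), the shape Array.prototype.sort expects.
function compareISO(a, b) {
  return a < b ? -1 : a > b ? 1 : 0; // ISO 8601 date strings sort lexicographically
}

const dates = ["2020-12-01", "2020-01-15", "2020-11-18"];
dates.sort(compareISO);
// → ["2020-01-15", "2020-11-18", "2020-12-01"]

// Rounding to the nearest 30-minute increment, on minutes-since-midnight.
function roundToNearest(minutes, increment) {
  return Math.round(minutes / increment) * increment;
}

const rounded = roundToNearest(11 * 60 + 44, 30); // 11:44 → 690 minutes = 11:30
```

An `equals`-style check then falls out of the comparator: `compareISO(a, b) === 0`.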

USA: Next up we have serialization. As I mentioned, for the serialization format you can call toString or toJSON on every Temporal object. The toJSON function is mostly there to make sure JSON.stringify works. If you notice, this ISO string is not very readable for human beings — I simply don't like it, personally — and you can call methods like toLocaleString (more on that later) to create human-readable representations; the ISO representations are more useful for transferring values across the wire. So, talking about toLocaleString: Intl and Temporal have a really fun relationship. Everything in Temporal supports toLocaleString. This is done with Intl.DateTimeFormat, so in a post-Temporal world, if you go to DateTimeFormat and pass in any Temporal object instead of a Date, it would just work — except for Duration, because durations are fundamentally different. We are working on the duration format proposal to also allow you to display durations in a nice locale-friendly way; as you might have noticed, that's a separate proposal that's also being worked on. And we are also working on APIs for accessing the default calendar, to make life easier for everybody using Temporal.
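The toJSON hook relies on a standard JavaScript mechanism: `JSON.stringify` calls an object's `toJSON` method if one is present. A minimal sketch with a hypothetical class (not the real Temporal.Instant):

```javascript
// JSON.stringify invokes toJSON automatically, which is how Temporal
// objects end up as their ISO 8601 strings in JSON. (Toy class for illustration.)
class FakeInstant {
  constructor(iso) { this.iso = iso; }
  toString() { return this.iso; }
  toJSON() { return this.toString(); }
}

const payload = { starts: new FakeInstant("2020-11-18T10:00:00Z") };
const json = JSON.stringify(payload);
// → '{"starts":"2020-11-18T10:00:00Z"}'
```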

USA: So what are the next steps for feedback and reviews? Well, I'd really recommend that you try Temporal and give us feedback. It has been quite a ride and, as I mentioned, the polyfill is now out on npm, so you can install it locally and try it out. It's called `proposal-temporal`, to really ensure that people don't use it in production — it's a proposal. You can read the docs at this link, and from the docs you can just pop open the developer console in your favorite browser and access the Temporal API right there; if you want to play with it without downloading the polyfill, that's one way. And you can review the complete spec, which is at this link. We would be really happy if you file issues; if you have any conflicting opinions or any suggestions for the champions group, it would be amazing to hear your thoughts. Of course, not all suggestions can be taken into account, but we'd be really happy to discuss them with you, and the earlier you do this, the better and the easier it is for us — so please check it out right now. I am really, really happy to announce to you all: Temporal is stable! I am sorry that we missed our goal of reaching stability by mid-October; we really tried, but there were a few final API details that just needed to be settled. We'd like to propose it for stage 3 in January, but we would be happy to push back to March if by January somebody hasn't had the chance to review; we will make sure we reach out to everyone as much as possible. Please review this as soon as you can. Remember, it's version 0.60 — please don't review an outdated version, that would waste a lot of your time. The sooner we get feedback, the easier it is for us to incorporate it, debate it, and discuss it.
We're getting feedback from more parties — more area experts in different fields, especially calendars and date/time — and we are fixing any issues that come up. As I mentioned, we are also working with the folks at IETF and ISO to make sure that the extensions we've mentioned here end up on the official standards track. And that's it.

RGN: Yeah, so if you go back to the slide showing the string serialization format. The parts are... actually, it's nicely labeled. So the first part is standardized by ISO 8601 and RFC 3339, which references it. The second part, where it adds the bracketed time zone, is functionally a de facto standard that originated within Java. And the part following that, the bracketed calendar extension, is an invention of this group. There's work with standards bodies to find a way to incorporate that, but it is work in progress. My position on this is that TC39 should not ship a serialization format that hasn't been standardized by an organization with responsibility in this space. I wanted to get a temperature check to see if that opinion was shared, or if instead people are comfortable with this innovation coming from TC39 in the hopes that a later specification published by such a group corresponds with it. To clarify, the positive phrasing in this case would be: don't ship a format that hasn't been externally specified.
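For reference, the three parts RGN describes appear in a string like the one below. The bracketed calendar key shown here is illustrative draft syntax that may change as standardization proceeds:

```
2020-11-18T10:23:00+01:00[Europe/Paris][u-ca=iso8601]
```

- `2020-11-18T10:23:00+01:00` — standardized by ISO 8601 / RFC 3339
- `[Europe/Paris]` — bracketed time zone annotation, the de facto convention originating in Java
- `[u-ca=iso8601]` — bracketed calendar annotation, the TC39 extension still being standardized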

DE: I'm really confused by this question because, as USA explained, the plan is to specify it and get approval in parallel with this — not to be innovators here. So I don't understand the purpose of this temperature check.

RGN: I'm seeking a stricter definition of what that means. Working in parallel could mean that TC39 ships this first in the hope or expectation that another group publishes something more definitive, or it could mean TC39 does not publish this until that other group has already done so. The latter is my strong preference.

DE: I feel like this would be better framed as a question about what the plan is. I don't think we need an official ISO standard that's officially published because that takes years, but I do think - I was imagining that stage 3 would be based on having some discussion with them. So are you talking about blocking stage 3 or stage 4?

RGN: Stage 4, and it's not necessarily ISO... an IETF RFC would be sufficient in my mind. But again, other people might be comfortable with shipping something even before that and this seemed to me like the kind of thing that the committee should weigh in on.

AKI: To clarify, your question that you're asking people's feelings on right now is, how do you feel about shipping a feature that does not yet have a published standard alongside it. Is that accurate?

RGN: That is accurate.

USA: One thing that I'd like to mention, however: as DE pointed out, the champions group understands that it's important to work with experts in this field and with other standards bodies, and I believe this whole question of getting it into other standards is not really black and white. We're working with them right now; if they have any suggestions about the things we have proposed, we work on them and try, before we publish, to align the standards as much as we can. In the case of ISO — to add a little more detail — they had been working with a rather more verbose format, but even ISO has been very positive so far, and the people at CalConnect have even mentioned that they'd be happy to include a format like this as a shorthand. So there's a generally positive feeling regarding this.

RGN: Yeah, I definitely agree. Everyone's goals are definitely in favor of alignment between standards bodies; I just want to see if there is an official stance that this committee should take regarding leadership versus deference. And it's possible that this is too soon. I'd like to have an answer before the stage 4 question comes up, but another outcome might be that it's inappropriate at this time and should instead be raised in the future.

DE: It's not that it's inappropriate to raise it. It's that the contrast you're drawing is a false contrast. This is something that we're working on, and the plan is to advance it, so I don't understand why it's being raised as a "choose one way or the other" thing. I think it would be good to open an issue to work out the exact stage 4 requirement. The emoji mechanism is not suitable for this because we don't have two clear contrasting choices; we're sort of agreeing with each other about what the plan is.

RGN: Are we? Is the plan that the external spec must exist before this can advance to stage 4?

DE: I think this is something to wordsmith, because "external spec exists" is an ambiguous term — there are several different draft stages, and I think we have to work with them to even understand which draft stage is the relevant one to ask about. I don't think a published, numbered ISO document is a reasonable stage 4 requirement, but I don't know the IETF process well enough to see which exact artifact from them is the requirement. So I think this is just something to iterate on in the issues, in conjunction with those people.

RGN: Okay, works for me.

WH: I'm curious what kind of duration you get when you subtract two instants. If you subtract the instant of the beginning of December 1st, 2020 and November 1st, 2020, do you get a P30D, a duration in seconds, or something else?

USA: All these operations have different options that allow you to specify the kind of granularity you want in the output. In the case of durations, we have specified that they follow a canonical set of definitions of the time units, but only up to days; above that we never assume how big a time unit is. I hope that answers your question.

WH: So if you subtract the two days I mentioned, do you get a duration in seconds, in nanoseconds, or in days?

USA: I think the default would be to get seconds, but you can specify really if you want something more granular.

WH: Okay.

SFC: The way we deal with that problem is that each type's difference function has what we call a largest unit. When you call instant.until, for example, the largest unit is allowed to be what makes sense for that type. We've been wavering back and forth for Instant about whether to go up to hours or days; days can be ambiguous because of daylight saving time switches and things like that. According to the docs right now, we're on hours — we've gone back and forth on that at least two or three times — but the largest unit we allow depends on the type we're talking about, which is hours if you take the duration between two instants. Using instant.until, you can only get up to hours.

https://github.com/tc39/proposal-temporal/blob/main/docs/instant.md#instantuntilother-temporalinstant--string-options-object--temporalduration

AKI: Does that fully answer the question you have WH?

WH: Yes, it's kind of a weird space to do things in. There are also leap seconds, which I assume you're still ignoring.

SFC: Yeah, we're ignoring those in this particular case.

SYG: This may be a naive question. Is there guidance somewhere for what is the subset of features that should be available if there is no timezone data or other kind of data dependent stuff? Like if we don't bundle ICU for a particular build, should we disable everything that's here or some set of stuff?

USA: No. The general direction we've taken is to have as many things as possible not require CLDR data, and to decouple those that do. A few features — especially IANA time zones and certain calendars — would require, say, TZ data or calendar data, but the general arithmetic methods, especially in the ISO calendar, do not require additional data. So in that realm you can do all operations without additional data from Intl.

SYG: Then my request as an implementer is to please put the guidance in the spec for stage 3 what is the set of features you expect to be available without CLDR data and to do a pass to make sure that set of features is coherent.

USA: Okay, that sounds good.

DE: Just to comment further on the 262/402 layering — that is indeed a weak point of the spec text right now, so to clarify concretely: all of the classes in the specification would be present without 402, including time zones. Just as with legacy Date, the exact set of time zones is implementation-defined; we don't normatively reference the IANA time zone database. I think it would be reasonable for an implementation that doesn't include ICU or an equivalent library to support the full set, but it's also permitted to support a smaller set of time zones. The things that we would include in the 402 part are, one, the interaction with Intl formatters: those are separated just as today — there's a default toLocaleString method, which would probably output the ISO 8601 string, and ECMA-402 overrides it to put the value through a DateTimeFormat. The other place where 402 interacts is in Temporal calendars, supporting additional calendars besides ISO 8601. The time types themselves are not affected by the 402/262 split, and this is also something we expect to have more formal spec text for — splitting out an actual 402 profile document — in the next couple of months.

SYG: Okay. Thanks.

AKI: All right. The queue is empty.

USA: Yay, I guess that's all and thank you everyone for your time and I hope to see you on the issue tracker.

AKI: Excellent. Thank you very much. Love a good temporal update.

Intl Enumeration API update

Presenter: Frank Yung-Fong Tang (FYT)

FYT: Cool. My name is Frank Yung-Fong Tang, from Google. I'll just give you an update about the Intl enumeration API — this is not for stage advancement. This is a proposal we're working on; I think next time I will probably bring it back in a more polished state and ask for advancement, but this time I'm just giving an update. It was originally on the agenda earlier, but I wasn't able to work on it then, so there isn't a huge amount to talk about.

FYT: So what is the Intl enumeration API? ECMA-402 has lots of APIs, and many of them take option bags as arguments where it's not clear to the caller which option values are supported. [technical issues with the slides]

FYT: So the charter of this API is to allow the developer to programmatically figure out the supported values for some options in the pre-existing APIs. The motivation originally came from, I think, the Temporal proposal, which wanted to identify which time zones are supported. So I put together a Stage 0 proposal; it passed Stage 1 in June 2020 and advanced to Stage 2 last September. The scope of this API currently covers not all the values in ECMA-402, but only some of them. For example, locales and regions are out of scope: those are very long lists, and it's also hard to define what "supported" means for them, because there are multiple places support can come from. We're focusing instead on things like whether DateTimeFormat supports a calendar, whether NumberFormat supports a currency or a unit, time zones, and so on.

FYT: After the Stage 2 advancement we changed some of the design. Originally the spec used several different functions — one to list supported calendars, another for units, another for currencies — each returning an array. But later we discussed that and folded it into a single method called supportedValuesOf, which takes a key deciding whether you're asking about calendars, collations, number systems, time zones, and so on.

FYT: So basically that function returns an object that supports the iterator symbol, which you can call to iterate — we changed it to return an iterator. You call Intl.supportedValuesOf, passing one of those keys, and then you can iterate through the result, because, for example, there can be a lot of time zones. Some of those keys may also take options. For time zones, for example, if we don't pass any option, we return all the time zones, but if you pass a region — say, US — it will only return the time zones used there. For the collation key we are considering adding options too; we haven't finalized those thoughts yet, but there may be some way to narrow down what is supported for a particular collation. And I think that's about it — we're still working this out. One of the issues from the Stage 2 advancement is that we still need people who understand the privacy concerns to help us see whether our current design has major issues. I believe the fingerprinting potential is pretty limited, because the list depends only on which version of the software is shipped, so the same versions of a user agent will probably all expose very similar or identical lists — until, for example, they pick up a newer version of the time zone data, say because some government decides to split a time zone and a new time zone ID is added. Until that happens, they'll probably all return the same values.
Of course, there's a chance that some user agents may install different data sets for different languages, and that could become a fingerprinting issue because different users would get different packages; currently we don't see that happening, but some may be thinking about going in that direction. Also, if you are interested in this and you are someone who knows about fingerprinting and privacy concerns, it would be really helpful if you could chip in, take a look, and give us some feedback. That's it — those are my updates. Any questions?
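As a concrete reference for this API's shape: the form that eventually shipped in engines differs slightly from the iterator design described here — `Intl.supportedValuesOf` returns a sorted array — but usage is otherwise close to what's sketched in the slides:

```javascript
// Querying supported option values by key. (At the time of this discussion
// the design returned an iterator; the API that later shipped returns a
// sorted array of strings.)
const calendars = Intl.supportedValuesOf("calendar");
const timeZones = Intl.supportedValuesOf("timeZone");

console.log(Array.isArray(calendars));      // true in engines that ship it
console.log(calendars.includes("gregory")); // the Gregorian calendar key
console.log(timeZones.length > 0);          // the IANA time zone list is non-empty
```

Note that the region-filtering option discussed above did not make it into the shipped API; the shipped method takes only the key.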

DRO: Yeah, along the lines of what you just mentioned with the potential fingerprinting concerns: as a general suggestion, whenever you have an API that enumerates things, you're opening yourself up to those concerns. As a possibility, perhaps instead of an API that says "give me all of the supported values that match this", you could have an API that says "is this thing supported?" — sort of flip it on its head. I don't really see common situations where someone would need to know the full list of time zones, and in those situations I feel like there are other ways of solving those problems that don't involve exposing the full enumerated list through an API like this, and therefore the potential fingerprinting concern. So I would imagine that a querying API might be a better way forward.

FYT: Okay. One of the issues — just to give an example — is: what if people want to build a list of which time zones are supported in, let's say, India? Well, not India — some bigger area, let's say Russia. That is information we're trying to provide.

DRO: Just a suggestion: if you're talking about making a picker or a list view for that area, generally speaking my recommendation would be to create a new input type (e.g. `<input type="timezone">`) so that control over what is shown is given to the browser/user rather than relying on the good faith of the page or app that is using this enumeration. That kind of design has been used for other, similar things in the past. One example would be trying to show a list of supported fonts: rather than having an enumeration API for all of the fonts that exist on the system, you have a way for the browser to present the fonts and let the user pick, giving them the control.

JHD: I'm in the queue to reply to that. DRO any of these things that need to be available in the browser also need to be available on the server where HTML generated user input is evaluated - so any DOM element is just not a solution to those use cases. I realize that with a SPA that's fully client-rendered that would be sufficient, but many apps don't opt for that for various reasons.

DRO: All I would say to that is: that's a perfectly valid point, but I don't think we would want to do those things at the expense, so to speak, of the potential fingerprinting in browsers or other places. I think there's possibly a way that satisfies both.

SFC: Yeah, so about adding new HTML input types. This is an idea that's been brought up before when we've been talking about this proposal, and what we've heard at least once or twice is that the problem with HTML input elements is that people like to write custom pickers — custom styled, custom input elements. For example, you might be typing into a cell in your spreadsheet and it auto-fills the rest of the time zone name for you. That's not a time zone picker UI element, and that's one of the use cases we're trying to empower with this proposal. For example, I can't remember the last time I've been to a website that uses the HTML date picker; most websites use custom date pickers for a whole variety of reasons. As long as they're implemented correctly, they can use APIs such as this one.

DRO: Yeah, for what it's worth I wasn't saying that that was the solution. It was a possible solution. It's just that there is a possible concern here with fingerprinting that we would like to avoid.

FYT: Well, actually, even that could be fingerprinted. Related to that, I just saw that the width of the file picker element depends on the locale, so people are actually using that width to figure out which language you are using. So it could go either way — that itself becomes a fingerprinting problem. Just an interesting thing to bring up.

SFC: I'm glad we're having this discussion and I would like to reply about the fingerprinting concern here. So suppose that you were to have a large comprehensive list of, say, 250 timezone IDs, and you pass those 250 timezone IDs to the inverted function. You get a true or false value for all 250 of them. Is that not also a fingerprint itself? Why is that approach less of a fingerprinting vector than having an iterator method such as this one?

DRO: I don't think it's any less of a fingerprinting vector, but I think the difference is that it gives more control to the host to mitigate those fingerprinting concerns if it decides to, and I think it signals intent a lot more clearly. There's not much you can reason about when it comes to an enumeration API — if you're handed an entire list, it's really hard to derive intention from that. I'm not saying it's easy to derive intention from repeated querying calls either, but at least it enables what I believe is the more common use case: not "get this enumeration", but "find out if this item exists in the list". Either outcome you're describing could be achieved with both designs, but there are more ways to prevent bad actors with a querying API than with an enumeration API.

SFC: So the other direction is supported in limited cases; for example, it is possible for units by trying to pass the unit into Intl.NumberFormat; you get an exception if it's not supported. I think that's largely the case for some of these types, but maybe not comprehensively, so that's definitely an interesting suggestion. I think the other direction is also useful; they're two related questions. I wouldn't say necessarily that having one means the other is no less important for its own use cases. I think that maybe we should consider both directions and then we can debate each one on its own merits.
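The "other direction" SFC mentions — detecting support via an exception — can be done today for units, since `Intl.NumberFormat` throws a `RangeError` for unknown unit identifiers:

```javascript
// Feature-detecting a measurement unit by probing Intl.NumberFormat:
// unsupported unit identifiers throw a RangeError at construction time.
function isUnitSupported(unit) {
  try {
    new Intl.NumberFormat("en", { style: "unit", unit });
    return true;
  } catch (e) {
    if (e instanceof RangeError) return false;
    throw e; // some other, unexpected failure
  }
}

isUnitSupported("meter");            // true in engines with unit support
isUnitSupported("not-a-real-unit");  // false — the identifier is rejected
```

This is the querying-API shape DRO advocates, built out of the existing exception behavior rather than a new enumeration method.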

FYT: Yeah, I think this touches on a bigger issue, not specific to this API, that came up last time we were here: whether any feature that we put into 262 or 402 should be detectable at all. Currently websites can often figure out whether a particular feature is supported, usually in very hacky ways, and of course that detection can become a fingerprinting vector too. Here we just try to make the detection easier — but then, of course, you can say that makes fingerprinting easier. So there's a contradiction of direction: should every API that ships in our standards be designed so that web developers can easily figure out whether something exists or is supported, or should we not allow that — or even make it harder? I think there's a contradiction there.

WH: I am concerned about fingerprinting, but I don't understand the position that enumeration helps fingerprinting in any significant way here, because these sets are small. The set of time zones commonly used in browsers is likely to be well-known, and you can just iterate through all of them and find out if they're supported or not. So if the concern is fingerprinting, then the better solution is to reduce the variety of different kinds of subsets you might have.

DRO: One quick reply to that is time zones are entirely man-made and political. It's totally possible for a new time zone to appear tomorrow or a million or a thousand or none of them. So it very well may be that today this is true, but this API that we're designing should be designed for the future and I don't think that it's a safe assumption to say that this truth that is true today will always be the case. That is the concern about fingerprinting.

WH: I don't see how that's relevant to what I just said.

DRO: I mean my point was more that this list might not be small in the future.

FYT: But let's say you add 300 new time zones — all the browsers will ship those 300 time zones to serve their users, right? So why does this become a fingerprinting issue? Of course, during the transition some users will upgrade faster than others and won't have the new 300 time zones yet, but any vendor who supports users across the world will ship those 300 time zones within a reasonable period of time, right?

DRO: Well, I think you even gave the example earlier that it's potentially possible for the list to depend on who installs the package, or whether you add an additional package that contains this kind of information. It is an assumption that the host will contain all time zones, and that is not an assumption that we want unless it's codified into the standard. And again, even then, time zones are political, so it's very possible that depending on where you are, someone might not want to support or acknowledge the existence of a time zone. I know that sounds ridiculous, but other ridiculous things like this have happened.

AKI: I think in that sense in a lot of ways it’s no different than political borders.

DRO: Exactly.

YSV: I don't think that we've had this come up from our side before, and Zibi isn't on the call, so I'm going to be representing a hesitation that we have towards this API in general. We are not entirely convinced by the use cases. Additionally, this deviates from how Intl has been working so far, in that the APIs have been opaque, and this introduces a kind of transparency into what's available in Intl: rather than saying "I need this thing for this specific user; what is the time zone for this specific user?", this is an enumeration of everything. This has a pretty heavy cost on implementers, and we're not sure that the use case mandates this cost. So it's a hesitation, and I wanted to make sure that the committee was aware of it.

LEO: I have a reply regarding fingerprinting. We had some discussions about fingerprinting in an ECMA-402 meeting in February 2020; IIRC that one happened in person. We might have notes about it, as we ended up reaching some conclusions, e.g. that fingerprinting is not a problem for many different aspects, etc. I would suggest Devin be in the loop and reach out to us (ECMA-402 / TG2) so we can bring what we have from those notes and our conclusions.

SYG: YSV, quick clarifying question for you. What are the implementer costs?

YSV: I'm not entirely clear on that because I'm not the expert in this area. That would be Zibi. I don't know. Let me ask, I can get that answer.

SFC: Yeah, I just wanted to give a quick history lesson here. One of the original motivations for this proposal was that Temporal previously had a way to get the list of time zones from Temporal.TimeZone. And when we removed that feature from Temporal, committee members raised a lot of pushback. So then we decided to go ahead and prioritize work on this Intl proposal to add the list of available time zones in Intl instead of adding it to Temporal. So I think it would be really helpful, especially when we're talking about use cases, if some of the original people who had been advocating for having this feature in Temporal would speak up and describe their use cases. Please post them on the repo offline just so we're all on the same page here, because if we don't advance this proposal for whatever reason, then, you know, we're losing that feature that was originally in Temporal, to get the list of available time zones, and that won't be available to developers. So I would definitely request that the people who were originally interested from the Temporal side post on the repo.

KKL: I just wanted to make sure that WH's point was heard and understood: that one way to reduce fingerprinting is to reduce the variety of expressed combinations. I think that people understood that to mean that the quantity was related to variety, and I don't think that's the case. I think Waldemar is suggesting that if on day one you have one set of time zones available and at time two you have 300 more, that just means you have two fingerprints, not 300 more fingerprints.

WH: That's correct.

AKI: Thank you for that clarification, that helped me a lot. All right, thanks everyone. We are at time, Frank. Do you have any 15-second items you want to wrap up with?

FYT: Nope, thanks

Conclusion/Resolution

  • No change
  • Request for those who wanted timezone listing in temporal to chime in with use cases on the enumeration repo.

JS Module Blocks

Presenter: Surma (SUR)

SUR: I want to talk a bit about module blocks. I'm going to talk about it with slides in the color palette of a cake that I had recently. Before I get into syntax, which we've probably already talked about, I wanted to quickly outline what the actual problems are that I was aiming to solve with this. The whole problem space is about responsiveness, and whenever someone cares about responsiveness in the low-latency sense, not in the responsive-design sense, JavaScript is facing a challenge. I mean, you all know this, but JavaScript is event-driven and single-threaded, so you have to keep your tasks nice and short to allow other tasks to get processed as quickly as possible and keep your latency low. The thing is that developers increasingly care about having low latency in their apps, not only on the web but on the server side in Node as well, and the language is not really supporting developers very well in this desire. To achieve responsiveness, the advice is often to chunk your code, or to "yield to the browser" when working on the web, which means you chunk your code into multiple smaller pieces so that other tasks can be processed in between; that is a form of cooperative scheduling. Apart from the fact that the advice is often to use setTimeout, which is especially bad on the web because we have timeout clamping to a minimum of four milliseconds, even if you do it right you are left to figure out what the right chunk size is. And if you want to make any guarantees, or even estimates, about responsiveness, the capabilities of the environment play a huge role: any given chunk size can be too big or too wasteful depending on the environment, and by that I mean the device that the code is running on.
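The chunking advice SUR describes can be sketched as follows. This is a minimal illustration; `chunked` and the chunk size are made-up names for this sketch, not part of any proposal or API:

```javascript
// A minimal sketch of "chunking" a long task so other work can run in
// between pieces. Names here are illustrative only.
function* chunked(items, chunkSize) {
  for (let i = 0; i < items.length; i += chunkSize) {
    yield items.slice(i, i + chunkSize);
  }
}

// On the web you would process one chunk per task, e.g. scheduling the
// next chunk with setTimeout(processNext, 0) so other events can run in
// between (subject to the 4ms clamping mentioned above). Here we just
// drain synchronously to show the chunk boundaries.
const chunks = [...chunked([1, 2, 3, 4, 5], 2)];
// chunks is [[1, 2], [3, 4], [5]]
```

As the talk notes, picking `chunkSize` well depends on the device the code runs on, which is exactly the problem workers sidestep.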
So I guess it's to no one's surprise at this point that what I'm talking about here is workers, where you move the actual work to a different thread to keep your main thread, the one that does the processing of events, free to respond quickly to incoming events. Workers are now in Node and have been on the web platform for a long, long time, and I've been working on this off-main-thread problem in the JavaScript context for a couple of years now with that perspective on the web, but I also care about it in Node and am quite passionate about it. I won't go into any more detail about all the stuff I think there is to talk about here, but if you're curious I'm going to shamelessly link to a talk that I gave in November 2013 as well as a blog post that I wrote on this topic. Now, if workers are the solution, I think developers are picking up on this: developer portals like CSS-Tricks or Smashing Magazine are starting to cover off-main-thread as a hot topic. But at the same time workers have often been called unergonomic, and that's a problem because they are perceived as a big hurdle to entry, or even as a hurdle to adoption. So I think we have a chance to preempt an upcoming developer need. I think the problem is that many people learned to think about threads in the world of C++ or C# or Java and expect something similar here, and we know that this requires shared memory, which we can't really retrofit onto JavaScript as a whole, but that doesn't mean that people won't try. And so one big difference from threads is that workers have to be in a different file, and people often point out they don't like that.
It seems minor, but people really dislike having to break up code that belongs together just so they can use a worker, and in the age of bundlers it can be surprisingly hard to stop a bundler from bundling, that is, to keep a file separate. As a result there's a whole bunch of packages out there that pretend to run functions in a worker, trying to emulate the developer experience of more traditional approaches to threads, and this often relies on stringification or "blobification", as I call it. Those patterns kind of work until they don't. Stringification turns a function into a string before sending it over, and uses eval or the Function constructor or something similar to turn it back into code. Apart from the obvious performance hit of double parsing, there are many invisible trip wires that make code that looks correct stop working: you can't really close over variables, because the variables you intend to close over are not part of the string; some globals are available in one thread but not the other; any form of eval-like behavior is often not CSP compatible; and tooling has a hard time catching these errors. Blobification relies on using blob URLs. It's pretty much the same as stringification, but with proper URLs, and probably a bit more CSP compatible, although I'm a bit foggy on the details. Either way, if you're using either of these techniques, they get especially frustrating the second paths come into the equation: with stringification, paths are now relative to the worker where the string got reparsed, and with blobification, relative and absolute paths are completely meaningless; only full URLs including the hostname will continue to work. And so this piece of code here would not work as a developer might typically expect it to.
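The closure trip wire SUR mentions can be demonstrated concretely. This is an illustrative sketch, not code from the presentation; the names are made up, and the `Function` constructor stands in for what such libraries do on the worker side:

```javascript
// Sketch of the "stringification" pattern: a function is turned into a
// string and revived in another context. The closed-over variable is
// silently lost, because only the source text travels.
function makeWorkSource() {
  const factor = 2;              // closed-over variable
  const work = (n) => n * factor;
  return work.toString();        // "(n) => n * factor" - `factor` is not carried along
}

// On the receiving side the string is revived, e.g. with the Function
// constructor. The revived function no longer sees `factor`.
const revived = new Function(`return (${makeWorkSource()});`)();

let error;
try {
  revived(21); // throws ReferenceError: factor is not defined
} catch (e) {
  error = e;
}
```

The code looks correct at the call site, which is exactly why these failures are so surprising in practice.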
And so this entire problem space can actually be abstracted to realms in general: if you create a realm, how do you even think about code execution inside that realm without making it an ergonomic nightmare? Cue, finally, this proposal. It's based on the old blocks proposal from Domenic, Nicolas, and me, and there's also a bit of Justin's inline modules proposal in here, so it's a new iteration. In its simplest form, it is just a way to declare an inline module, which you can then use dynamic import to, quote-unquote, instantiate. Dynamic import is asynchronous, but that is actually a good thing, because any module block could potentially import other modules, potentially from a URL, or make use of top-level await, so instantiation should be an asynchronous process. By making module blocks work as modules, we can build on top of a lot of great spec work that is already well defined, well explored, and well understood by developers, avoiding a lot of potential problems or questions, because in the end it is just a module. With respect to syntax, we can't really introduce new keywords, I've been told, and I've been taught about contextual keywords, so there is no line terminator allowed between module and the opening brace; but this is not set in stone. I'm happy to bikeshed syntax with you all if you think it's a bad idea, but I think this actually looks quite idiomatic and nice. We did consider strings and template strings, but in the end they imply that you can close over values from outside the module scope. We did try to do this in the blocks proposal way back when, and it turns out that any time you try to close over values and still keep the code transferable between realms, it turns into an absolute can of worms. So by using modules, an existing primitive, we naturally solve this by disallowing closing over values right from the get-go.
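The simplest form SUR describes would look roughly like this. This is proposed syntax only, not implemented in any engine, so the snippet is illustrative and the names are made up:

```js
// Proposed syntax - not available anywhere yet.
const greeter = module {
  export default function greet(name) {
    return `Hello, ${name}!`;
  }
};

// Dynamic import "instantiates" the block. It is asynchronous because
// the block could itself import other modules or use top-level await.
const { default: greet } = await import(greeter);
greet("world");
```

Note that nothing inside the braces can reference bindings from the surrounding scope; the block is a full module boundary.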
Stringification, as I said, has this double-parsing cost, whereas these modules would just participate in the module caching layer, at least within the same realm, but could potentially even be shared or cached across realms by the engine. I'm not quite sure about that idea, but I think it definitely leaves a lot of room for optimization here, because there is no closing over variables. Secondary benefits of this approach include that the parsing and compilation of these inline modules can kick off earlier in the engine, even before instantiation or transfer has occurred, and by consequence errors of multiple kinds can be caught early rather than at instantiation or at runtime. Now, with the realms proposal we can create multiple realms within one file, and with module blocks we would then also be able to put the code for each of these realms in the same file and share it across these realms however we like. I think there's a really nice bit of synergy here: these two things are kind of the complement of each other, so that state and code are now nicely encapsulated. And workers benefit as well, because as I said, they're kind of just a realm too. So to build on top of this and address the ergonomics problem I mentioned earlier, my goal would be that in the end the Worker constructor would accept not only a path to a file but also a module block, so I can instantiate the worker from within the same file. Not only that, but we could also make module blocks structured-cloneable, meaning that we can send modules using postMessage and instantiate them on the other side. In this way we would finally give JavaScript a way to model tasks that works across realm boundaries, and allow people to build, you know, a proper scheduler, for example, on top of this primitive.
And to address the path problem I mentioned earlier, at least the path problem you get when you use stringification or blobification: the idea would be that the module block inherits the import.meta.url from the module it is syntactically embedded in. In case you don't know, import.meta.url is the URL of the module you are currently in; it's like the portable version of document.currentScript, or __dirname and __filename in Node. This way, paths would work intuitively for developers. So if you look at this code sample, this would now work as expected: even if the scheduler package is loaded from a CDN and the resulting worker would run on a completely different origin, this would still continue to work as expected, and that's a really nice thing. Lastly, in terms of compatibility with APIs that have not been updated to take a module block, the idea is to allow module blocks to be turned into object URLs. This way module blocks will also work with APIs that were only designed to accept good old string URLs. An object URL's lifetime is often a bit iffy, but it is tied to the creating realm, which should be sufficient for the vast majority of use cases if you want to make use of this kind of technique. You might be thinking about bundlers about now, that this might be nice to use with bundlers, for example allowing a file to contain multiple modules that can import each other. But Dan has split out a separate proposal that is built on the same ideas and is complementary, to address that specific use case in isolation; it targets static imports instead, because we need to think about module identifiers at that point, and I think he'll talk about this at some point. And then there are a couple of open questions still, and I'm happy to hear your thoughts either here or on the repository.
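The worker ergonomics described above might look like this in practice. Everything here is proposed only: neither the `module { }` syntax nor the extended Worker and postMessage support exists yet, and the file names are made up:

```js
// Proposed usage - illustrative only.
const task = module {
  // import.meta.url in here would be inherited from the surrounding
  // module, so this relative import resolves as the author expects,
  // even if this code was itself loaded from another origin.
  export default async function run() {
    const { helper } = await import("./helper.js");
    return helper();
  }
};

// The Worker constructor would accept a module block directly...
const worker = new Worker(task, { type: "module" });

// ...and module blocks would be structured-cloneable, so they could
// also be sent to an existing worker or realm via postMessage.
worker.postMessage(module { export default () => 42; });
```

This is the "tasks that work across realm boundaries" idea: the block is a first-class value you can hand to whichever realm should run it.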
For example, what exactly is the type of a module block? It's probably an object, but does it have a prototype, and if so, what kind? I do not know; happy to hear ideas here. There's another open question about caching: do we cache the module, and if we do, do we cache it by its parse node as the cache key, or do we really create a new module on each evaluation, similar to an object literal? If we cache it by the parse node, then assertions 1 and 2 on the slide would pass while assertions 3 and 4 would throw, whereas if it's like an object literal, all the assertions would throw. Personally I think making it behave similarly to an object literal is the most intuitive for developers, and I can actually see wanting to create the same module multiple times in a loop, but it might be interesting to hear some more opinions on this and what the implications are. Lastly, I have questions for the engine implementers, who are hopefully here: is this actually as simple as I believe it to be? Does passing a module block to a Worker constructor work? What about postMessaging module blocks? Is this actually doable, or am I breaking a million assumptions in everyone's code base? And with that, I am happy to start a discussion and hopefully get some questions from y'all.
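The caching question can be sketched like this. Proposed syntax, illustrative only; these are not the exact assertions from the slide:

```js
// Illustrative only - sketches the two candidate semantics.
function makeBlock() {
  return module { export default 0; };
}
const a = makeBlock();
const b = makeBlock();

// Cached by parse node:   a === b  - the same syntactic occurrence
//                                    always yields the same block.
// Object-literal-like:    a !== b  - each evaluation creates a fresh
//                                    block, which SUR suggests is the
//                                    most intuitive behavior.
```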

MM: Yep, so I think that there's a lot of convergence between your module block notion and a non-syntactic notion in the compartments proposal called static module records. The module record in the spec language is an odd beast, because it starts off with the static information but then, as it's linked and initialized, it's modified in place. As I understand your module block, the module block object is not modified in place as it is linked and initialized; rather, it's static, and the thing corresponding to a module record would be something dynamic that is derived from the module block. So that's very much like our static module record. So I want to verify that the module block object itself does not capture its linkage graph; rather, the resolution of the import specifiers would be per importing context, so the same module block would resolve to different import graphs in different contexts.

SUR: As far as I understand you, I think you are correct. In my head, it's as simple as creating an inline module on the fly that needs to get evaluated. So yeah, every time you call an import, the static imports in that module would have to be re-evaluated. Whether there is room for optimization in an engine I do not know, but in its simplest form, I think what you just said is correct.

MM: Yeah, I don't think there's an optimization cost here, because if the specifier resolves to something that's already a cached instance, then it gets reused anyway. So there's a flexibility benefit, and I don't think there's a corresponding cost. In any case, I very much recommend that you take a look at that. Moddable doesn't yet have an explicit reified notion of the static module record, but the compartment API as implemented by Moddable already takes pre-compiled modules, and they can be linked differently in different compartments by providing different import environments; we get different import environments reusing the same static pre-compiled information. So I think we're very much [?] here, with the addition that you're creating a syntactic notion, whereas the compartment proposal had not thought to create a syntactic notion.

SUR: All right. I mean yeah, yeah, I'll definitely take a look at sounds good.

PST: You broke no assumptions in XS. I don't have much to say, just that I want to agree with what MM just said about Moddable XS, and to answer the question "did you break a million assumptions": at least in XS, you didn't.

DRR: I brought this up in one of the tooling meetings earlier: while you are cementing syntactically where this module scope is and that it's separate in some way, it's still hard for our tooling to figure out exactly where that module is intended to be used and what sort of settings you intend to use with that module block. I don't know what the specific solutions are there. I'm not saying it's something that's insurmountable, but I don't know of a good, clean, easy way to make those things mesh together well. So I just do want to give that input from what I'm seeing. I think I'm done speaking. I don't know if anyone wants to respond to that or we can move on.

SYG: I do have a reply to that. I'm wondering what scope information you want to inject.

DRR: So the specific case given in the presentation is between a plain DOM context and a web worker context. That's one of the most common ones, and if you're using a tool like TypeScript you need to make that explicit as part of your compilation settings. You need to say: when I'm type checking this file, I'm intending to use it with the DOM APIs, so you're not going to accidentally use something that's not there. That's one of the key reasons that you would do that, and likewise you don't get all of the pollution of the worker globals in a plain DOM context. So part of what if - [interrupted]. So basically that's what I had in mind when I talked about scope. I guess scopes aren't the right term; I'm thinking more of the global environment in a sense, like figuring out what that is.

SYG: Yeah, that makes sense to me; that seems like a problem to solve. In the general case it doesn't seem solvable to me, in that the core value add here is that these are kind of unbound, right, and then you bind them upon importing them somewhere else. You parse this thing and you get this kind of unbound module, and then when you import it you bind it to the new global in a new context; until then you can clone it around and whatever, and you have this nice first-class thing that you can actually pass around. So for the use case which I imagine would be common, where you always want to run it in the same kind of context, like a worker context, then if there is some other way, probably out of band, to signal that intent, that would help the tooling. But for the general unbound case - and I'm just spitballing here, this might not actually ever come up in practice - suppose I have some code that works in multiple kinds of worklets, and maybe the API is mostly the same, but the types actually turn out to be different for the same global identifiers or something. That will be hard for tooling to handle - you can type that with union types or something, but that seems kind of weird.

DRR: But yeah, you would not want that. Maybe can I ask a clarifying question about the proposal then? I wasn't entirely sure: is this something where in these module blocks you cannot reference anything that's captured, or is it basically that you do some syntactic checks and then, when you actually use this in some realm, it gets bound and you get any resolution errors then? Is it the latter? That sounds like what it was.

SUR: No; currently, in my head, there is no capturing, no closing over anything that is outside the module block, because we tried that in a previous proposal and it was just so hard to work out. So the idea for now was to not close over anything, to make it really clear, even at a glance, that the curly braces are module boundaries and there is no referencing anything outside of them.

DRR: I want to keep talking about this, but I want to let other people make you go. So let's circle back.

DE: So I think the idea is that it closes over the global object, but not the other lexically scoped things. So for type systems that need to track this, it seems pretty similar to the case where you have scripts that might be included in a worker or with another global object. I agree with what Shu was saying about how you might assume some kind of stability; I don't want to put words in his mouth. When I say it closes over the global object, I mean an instantiation of a module closes over the global object. So the module block is unbound, and it can reference only things in the global object of wherever it is imported, which is how you instantiate a module. Thanks MM for the clarification. Yeah, that was my comment. So I think it would be good to work through this, and hopefully we can work together before stage 3 to validate the design.

JWK: I have to mention that TypeScript doesn't work well if you make one file both work in a worker and in the main thread, so I think the module block doesn't make things worse.

DE: Okay, very briefly: I think one object representation of these module blocks, when they're not instantiated, could be as a string, and I linked to an issue where I discuss that.

KKL: I think that a lot of the design considerations so far have been very good, in particular echoing MM's sentiment that a module block evaluating to a static module record is good, and it is also consistent with that design that, since the module has no lexical capture of the surrounding environment, it's effectively equivalent to a separate file. A useful design consideration, I think, is that the module block would have to also capture the referrer specifier: I think it would have to inherit the import.meta.url of the environment in which it's declared, and that would need to be carried with the static module record to wherever it's evaluated, so that the import specifiers can be resolved.

SYG: So to SUR's question about "is structured cloning workable": for V8 it should be very easy. How I'm thinking about the internal representation is as this unbound module script that then gets bound when you eventually import it. Such a thing exists already in the implementation: in the API itself, when you first compile a module, you get back this unbound script. So basically, unbound scripts already exist, and you can bind them to a particular context.

JWK: You say it supports structured cloning so can it be stored in an IndexedDB?

SUR: I am not sure that that needs to be the conclusion. Like, I don't know if there's value in storing a module block inside IndexedDB; for now, for simplicity's sake, I would exclude it unless anyone has a use case where it actually becomes valuable.

SYG: Yeah, I was understanding the question to be for postmessage. Is that right? That's mainly for postmessage.

SUR: Yeah. I was mostly thinking about making it so that you can send it to other realms: other workers, service workers, whatever, that kind of thing.

JWK: I'm not saying it should, but if you support structured cloning, then by the spec of IndexedDB it can be stored into the database as a natural result, IIRC.

KM: I don't think that's totally true. I don't think, for instance, you can do that with WebAssembly modules: despite the fact that you can postMessage them, you can't store them in IndexedDB.

SFC: I think that this proposal is really interesting from the perspective of giving a novel way to express parallel code and parallelism in code. I was just wondering if there's a lot of prior art with this model, with the JavaScript model of one main thread with workers that communicate back with the main thread as opposed to the model that's used in other programming languages like Python and Java etcetera. We've seen multiprocessing in Python, and then of course there's traditional threading and locks between threads. So, does this model of parallel programming have a lot of prior art, or is this a new invention? And if it's a new invention, does this suit itself well to the same kinds of logic that can be implemented, say, with multiprocessing?

SUR: The short answer is yes, there is a lot of prior art. I've been working on this for the last couple of years, and I've been maintaining a couple of apps that make use of this multi-threaded architecture. So I think I have a fairly good understanding of where the ergonomics problems are, and I'm pretty confident that this would solve the vast majority of them, which is why I've now finally put this all together. But I'm happy to chat more with you about this offline.

SYG: Quick reply to SFC. I don't know if you were able to make the meeting yesterday, or was it two days ago, where I gave a talk about my vision for concurrent JS in the future. The model that this would better enable, the worker model that the web uses, is something I'm thinking of as actor-inspired: not quite actors, but communicating event loops. That thing you said is exactly what this would help with, and there is a lot of prior art there.

SFC: Yeah. I think I missed your presentation, but I'll review it and talk with you offline. Thanks.

YSV: SUR, you asked about implementers review of this and I would just like to directly answer that. Yes. This is perfectly feasible within the SpiderMonkey engine.

SUR: Brilliant. Thank you.

MM: Yeah, so the module block itself does not have a specifier name. You can use it as an argument in your extended dynamic import expression, because that takes a first-class value, but because you can't name it, you can't import from it using a static import, and therefore the thing itself cannot export a live binding. Is all of that correct?

JWK: In the compartments proposal there is a synchronous import API, so you can import it synchronously. And why would you want to import module blocks statically? Why not just write the code in the module?

MM: I'm not expressing a desire. I'm just trying to verify my understanding in particular. I think the fact that it cannot export a live binding is something that makes me feel good, not bad. I hate live bindings. but I wanted to verify my understanding.

SUR: I'm going to defer to to Dan on this one.

DE: So, fortunately or unfortunately, I believe this would support live bindings just the same way, because module namespace objects do support live bindings. But you're right, this would only be useful through dynamic import, not static. If you're interested in bundling static [?]. I have a presentation later in this meeting on that topic.

MM: Thanks.

SUR: Okay, since the speaker queue is now empty, I want to ask if there's any opposition. I don't know about the process, but my intention was to move this to stage 1, so I'm asking for consensus.

MM: Enthusiastic thumbs up from me.

JWK?: I love this proposal.

DRO: I like the things that the proposal is trying to solve. Part of me wonders if there's a slightly different way to achieve it, but I like the idea of making it easier to do something like a worker thread right just in the same script block.

SUR: I'm happy to hear any and all of your ideas and concerns on this but the problem space is just really interesting to me. So yeah.

KM: Defining the problem space is what's needed for stage 1; it's not necessarily a prescribed solution at that point so that's exactly the time to have this conversation.

WH: I am also enthusiastic about this proposal.

MBS: So I think with that in mind we have stage 1 potentially with the caveat that DRO, while excited about the problem space, still thinks there might be other solutions. Which might be good to just have as a note there in the notes.

AKI: In case you aren't familiar with the process document rubric, Stage 1 is all about TC39 expressing an interest in exploring the problem space. Since this is your first proposal, I want to remind you to be open to being flexible about what the solution looks like as it’s not necessarily what the committee is endorsing for stage 1.

SUR: I will definitely keep that in mind. Thank you very much.

Conclusion/Resolution

  • Stage 1

Process Update

Presenter: Yulia Startsev (YSV)

YSV: This is picking up from our conversation last meeting about our process document. This is a super quick summary, and I'll show you the pull request in a second. The summary is in two parts and covers the entire change that is being made to the process document. The first segment is tips for achieving consensus. Highlights there are encouraging async work, raising constraints in issues rather than waiting for the meeting, and acknowledging that consensus doesn't always happen, just to give people a realistic idea of what the process looks like. It also begins with a definition of what a constraint is and how constraints work, and finally it has emphasis on the importance of stage 3 to stage 4 advancement: proposals cannot be blocked from stage 3 to stage 4, which means that two implementations have already been shipped, either vaguely or with issues that have been resolved and agreed to by the committee. So that's the first big chunk of text. The second chunk of text is about cases where the committee does not come to consensus: what does that look like? This section explicitly requires that we write down the reason why a proposal doesn't advance. This allows us to learn from situations where something didn't move forward; or, if the situation in the world has changed in a way that makes the goal more realistic than it was in the past, we can revisit that decision, see why we decided against it before, and use that in future decisions. There are further details in this section about what constraints mean and what blocking means, and it clarifies that the committee doesn't reject proposals. This is something that's been a part of our process for a long time, but people outside the committee are often confused: "oh, this proposal got rejected and I'm sad", when in reality we don't reject; proposals just get blocked from advancement.
So that clarifies that, and it advocates for actionable constraints wherever possible. Of course that's maybe not always possible, but if it is possible, we should be aiming to say "I would like this to be changed in a way that is actionable" rather than saying "I don't like this at all". So this is just a continuation; there's no difference between these two slides in terms of what they mean. The next part is also just defining parts of our process a bit more clearly. Conditional advancement: it's something we didn't really have a term for, but we sometimes did it. So let's say someone sees a presentation and something is said that hadn't been considered by that delegate before; they might need a bit of time to understand it more and speak directly to the champion. In that case, they might say "I'm not ready to speak about this yet, can we have some time to discuss offline and come back to this later in the meeting?". We've done that several times; this writes it down. The other thing it does is write down the concept of conditional advancement, which is: there is an issue I want to have investigated, but if the issue turns out to be what I think it is and it's not a real blocker, then this can advance; if it is a real blocker, then I want us to solve it before this advances. Something like conditional advancement will save us the trouble of revisiting that proposal again later just to have someone say "this wasn't an actionable issue and we can now advance". So that's the purpose of that part of the text.

YSV: Withdrawing proposals, reverting to earlier stages, and adopting proposals after the champion has left them; that's just writing down how all those three things work. And finally there is a segment of text on the scope of responsibility for champions. We have similar scope-of-responsibility text for reviewers and editors, but not for champions, and this adds that. What changed from the last time I presented this document is pretty much the length, and removing duplicated bits of information. Some of the segments were copying information from other segments; I removed those, I tried to make the language more terse, etc. And yeah, I'm asking for consensus to have this merged in.

MLS: So you said that reject and block are different things. Could you describe the substantive difference between those two?

YSV: Okay, so: we don't say that something will never happen. We don't say that a given proposal, when it's blocked, has absolutely no possibility of ever being a realistic proposal that comes back to committee and is then adopted. Instead we block things and then they don't get advanced. They might not ever be picked up again. So that functionally is the same as rejecting something, but we sort of leave the door open. That's the difference.

MLS: It seems in practice they're the same thing. It seems a little interesting that we're not willing to use the word reject.

YSV: I am happy to hear other delegates comment on this, because this has been something that has been floating around multiple people in the past. I don't have a strong opinion here. I do think that there is some value to it - take something like classes. We worked on classes for I think 15 years and they came back in multiple different forms. If something like that had been said to be rejected, how would it return to committee? I guess that's the kind of workflow that I'm thinking of. I think there's always a possibility that maybe we overlooked something, or the scenario changes in a way that makes sense for us, so we're not necessarily rejecting things; they're not advancing because we don't think they're a good idea, and advancing would require considerable rethinking or considerable change in the status quo. But maybe that is rejection. I don't have that strong of an opinion about it, and if the committee right now decides that we want to remove that line, I am totally fine with that.

MM: I like the terminology of avoiding terms like reject, and talking about lack of progress rather than rejection. I'm in favor of this, of the way you have it.

WH: Speaking to this issue: We have a list of all the active proposals. So if we never reject any, aren't we going to wind up with a lot of dead proposals cluttering it up?

YSV: That's a great thing to bring up. I think it's really great to look at the inactive proposals section as well, because we have this long list of inactive proposals which are withdrawn, and if I remember correctly - I may not remember correctly right now - I don't think we have anything that's explicitly rejected. We might have one or two, but we have a list of all of them, which is really cool, because if you're trying to see whether someone has tried to work on something before, you can go to that link and be like: oh yeah, somebody has tried to do that. Why did it fail?

WH: Okay, to understand your position, we should never clean it up, we should just archive all past proposals and keep them in the list as a historical archive?

YSV: I think so. I think it's really interesting as an archival document and it really helps us see how we're developing as committee.

SYG: I want to engage with MLS on his question. So I think it is true that we have priors that we reevaluate. Case in point: "no data races, no shared memory" was a pretty strongly held position. "Let's not expose GC" was a strongly held position. And we have changed our thinking on both, given what's happened in the ecosystem. I think the distinction that YSV is trying to make does make sense to me. I was wondering about your position on the reject versus block thing - like, the difference to me is that there is a time difference. I don't think we can ever say we will never consider something, but practically it's probably understood as: we're not going to revisit this without significant new information, new changes in the ecosystem, or other external factors. So that might take longer to come to pass than a block, where we can work through some concrete issues and then keep progressing.

MLS: The one proposal that comes to mind is SIMD.js, which I think all of the implementors basically said we won't implement, because we think that wasm is probably a better forum to provide that functionality. So I think maybe we can say that it's blocked, but effectively it's rejected in my mind.

MM: I think we're ignoring the fact that there's also withdrawn: the champions themselves can withdraw a proposal, and that's a valid state.

SYG: That's a thing that already exists.

MLS: SIMD.js, I don't know whether it's been withdrawn; it's just a dead proposal that hasn't been worked on for some time. There's also that we've had a history of a few proposals that have been shopped to other venues, and they've advanced in those venues, and effectively they're dead now for JavaScript. They were rejected, or the champion felt that they were rejected and took an easier path to implementation.

SYG: So my question to you then is do you see value in kind of capturing that limbo state more formally in the process doc?

MLS: Yeah, I think it'd be good that when proposals are quote-unquote blocked, if the reasons for blocking from various delegates are strong enough, whatever constraints need to be overcome are included in the proposal repository, so that we know how likely something is to be revived, and so that somebody who, several years later, thinks a feature like that is worth pursuing again understands what they're up against.

YSV: Just a second, I'll share the actual document. I think there might be interesting text in the actual document, because I may have misrepresented just now what that "reject" concept is. So: not all issues with a proposal are easily solvable. Some issues are too fundamental and serious, requiring significant rework of the proposal, or may be unsolvable. This might capture what you mentioned. In these situations consensus is withheld; this may be referred to colloquially as a block. If the proposal will require substantial work to address the concern, it may need to be rethought, or may not have enough justification to pursue at this time. So this is how a block is currently being defined. And this next sentence, where it says that proposals don't get rejected: it's that there's always the option for a champion to pick a proposal up, make modifications, and re-present it to the committee in order to seek consensus. So I may have misrepresented how those two paragraphs interact with one another.

MLS: So if something has unsolvable issues, doesn't that mean that the proposal should be rejected?

DE: I think there's sort of a difference in polarity here. If we want to create a concept of rejected proposals, that could be useful; I think if we want to reject a proposal, that's something we should achieve consensus as a committee for, and then we could have consensus to sort of un-reject it. But there's a much more common case than one where we have consensus on rejecting the proposal, which is that there's a blocking issue, and this is very relevant to understanding the status of the proposal. It's true that we haven't always documented this very well. In proposal readmes it's often not clear what a proposal is being held up on, so I think this is more about establishing a convention that the proposal readme should accurately describe what the status is and what it may be blocked on. And you know, we're all using GitHub, so if the proposal champion is not making the update, then anybody in the committee or outside the committee can make a PR to their readme to document the status. If the proposal champion is inactive, if the proposal is just sort of being dropped, then the chairs could review and land that kind of PR. I think MLS identifies a real problem and this is a place where we could make positive changes.

MLS: I mostly agree with you, Daniel. Not reaching consensus on advancement does not imply consensus on a block, so I think it's too strong to require that we reach consensus on blocking, since we are a consensus model of advancing. If we don't reach consensus, it's blocked until we do reach consensus, and if some on the committee believe that something is unsolvable, or maybe that view is widely held, then effectively it is blocked, and I think it's good to face the reality that it's going to stay blocked, and so therefore we reject it. I don't say we should remove it from the repository; we could keep it for posterity among things that we considered and rejected for various reasons. That doesn't mean a similar solution can't come back, but that that particular solution is considered unworkable.

DE: I mean, I agree with you that we should document these things better, but I think there's a difference between saying "this person is blocking this proposal for this reason" and the committee recognizing it to be kind of permanently blocked, because people change their minds, they find extra evidence. For the committee to recognize something as being rejected is something different from acknowledging the state of "it's currently blocked by a person for a reason".

MLS: So would you agree with me that a proposal that was first made at TC39, that received some kind of pushback and didn't reach consensus for moving forward, and was later moved to another standards venue, advanced there, and is therefore part of that standard - that we would consider that as something we are not going to go forward with, because it's already implemented elsewhere in the ecosystem, and therefore it's been rejected, and that rejection is an implicit, not an explicit, rejection?

DE: These things are really complicated. I don't know if you're alluding to cancellation. Personally I still have hope for doing something more about cancellation in committee, and I regret the current state of it, and I think these things are pretty complicated to summarize in just a single line of "it's rejected". I think to make such a strong statement we should have committee consensus on it.

MLS: I'm not saying that the criteria is that just because it got implemented in another standard we, you know, by fiat rejected it. I think it requires some committee discussion and agreement, but I think it'd be good to acknowledge that, so that we can have proper documentation of how that proposal fared at TC39 and in other venues.

MM: Now that you've clarified I don't understand what you're saying.

MLS: So if some proposal is presented at TC39, and at some point the champion feels that they're not making progress through multiple attempts to get their proposal to advance, and so they take that proposal, or some variation of it, to another standards body and present it there, and it gets advanced and is part of that standard, and there's no need now to have it as part of ECMAScript, would you consider -

MM: I disagree with that line of reasoning. I think that if, having taken it elsewhere, it advanced elsewhere and then became part of the JavaScript ecosystem - you know, it should always be on the table for it to come back to TC39.

YSV: I would like to make a suggestion here after hearing everyone's thoughts. One principle that I tried to maintain in writing this document is that the proposal's forward motion is the responsibility of the champion who cares about the proposal. And a block, or, you know, if we introduced a concept of rejection, they would be functionally the same, because whether we have a single person blocking a proposal or we have the entire committee saying that it should be rejected, the main difference is the signal and what we say with it. It doesn't have an impact on the proposal's forward motion, because that will rely on the champion either taking it and significantly reworking it, or withdrawing it, or taking it to another standards body or something like that. It really comes from that individual or group of individuals. So I think one thing we can say is: maybe if we want to give the message, from the committee's perspective as a whole rather than from a single blocker or multiple blockers, that a given proposal shouldn't proceed, what we can do is say that the committee can recommend that the champion withdraw the proposal. It would be a very strong statement, very similar to saying that we reject it, but it would leave the forward motion in the hands of the champion.

MM: I like that refinement.

MLS: One clarifying comment. If the committee generally would encourage a champion to withdraw and they refuse, or vice versa, they withdraw and the committee doesn't think it needs to be withdrawn, it seems like that may be, in either case, kind of a disconnect between the champion and the committee.

YSV: Yes, but this would also reflect how we currently work, because a champion may withdraw a proposal, or a member organization may withdraw a proposal on behalf of a champion who's no longer working with the organization. We had that at Mozilla. And then another member organization may say: we wish to pick this work up again and start the proposal process again, using that withdrawn proposal as a starting point. And if a champion chooses not to withdraw a proposal in spite of the recommendation of the committee to do so, then they also have the rather heavy task of convincing a committee that has gone to the point of saying "we have consensus that this is a bad idea". They would have to find a way to convince the committee that said that to accept their proposal. So yeah, I don't know if that answers your question.

MLS: Somewhat, yeah.

SYG: I would like to express general support for this PR. I think it's a pretty nice clarification of our existing working norms, is how I'm looking at it. It seems like most of the current discussion - around a notion of rejection, how we should frame it, and a recommendation from committee - is out of scope for this PR, which I see as a strict improvement. I think we should be mindful not to take this as the opportunity to game out different things that we want to see formally captured in the process document. If we have novel things we would like to capture in the process document, it seems like those should be separate PRs and discussions. That's it for me.

MLS: This is a completely separate issue. There have been times when we've talked about removing something from the language, and I'm not talking about shared array buffers and the whole Spectre thing - I think that was very principled, and it was done when shared array buffers were pretty new - but there have been other times, rare, but other times, when we've considered removing something from the language. And I think we struggle because we don't have a process for doing that - and this PR probably isn't the place to do it - but we probably should spend some time thinking about what a principled way of doing that is, given that something is in the standard and we now consider removing it. Is that just a demotion from stage 4 to some lesser stage, or is it something else?

YSV: That's a really great point. I would love to continue talking about that.

[queue is empty]

YSV: Okay. So from what I heard now, there is a concern about this line. SYG said that it's out of scope for the current PR. What I would suggest is, if it is a blocking issue, I could reword this sentence to say: the committee does not have an established concept of a rejected proposal; however, the committee may recommend a champion withdraw a given proposal - and then the rest is the same. That would be the change that I would recommend here, and then everything else stays the same. Or, since so many people have reviewed this so far, we can leave it as it is right now and I will make a new PR with that wording change to address the concern that was raised by MLS. And then to address MLS's next topic, which I think is very important - "what do we do about demoting language features?" - I can open an issue for that on the reflector. So on that first topic, should I add that clarification to the slide?

MLS: Separate PR is fine with me.

YSV: Okay, so I will put that into a separate PR and people can review that at their leisure. And I suppose the next thing to ask is: does this look okay to people to be merged without the green background?

[general agreement]

YSV: I had a couple of people write me offline saying that they also support this change (ljharb and leo). So I will go ahead and squash this and merge it, if there are no objections.

RPR: Congratulations Yulia you have consensus.

WH: For the third item you raised about deleting things from a language: I would view it as a proposal that would have to go through the stages if it's anything more substantive than a bug fix. Proposals are for changes to the language. When something is already in the language standard, deleting it is a change like any other that should go through the stages.

YSV: I'm also thinking about it the same way. So what I'll do is - because we currently have the Symbol.species removal in flight, which is being worked on by Shu and myself, and we're following what WH just suggested - I'll open an issue on this more general topic and we can discuss how we handle that kind of deletion.

Conclusion/Resolution

  • PR to be merged

Adopting Unicode behavior for set notation in regular expressions

Presenter: Mathias Bynens (MB) & Markus W. Scherer (MWS)

MWS: Hello, good morning, good evening. My name is Markus Scherer, I work for Google. I think Mathias is going to start the presentation on this occasion.

MB: I'm happy to let you do the presentation if you prefer, Markus; I just wanted to say a few words as an intro. Maybe while you get set up with the presentation I can start the intro, because I just want to give some context to this proposal. It's a brand new proposal at stage 0; we're not asking for any stage advancement, so not even for stage 1 today. We just want to throw some ideas out there and see what the general sense is within the committee about some of these ideas. We want to figure out if this is something worth pursuing in one particular way or another. Another thing I wanted to make very clear is that in both the repository and these slides (prepared from the content in the repository), we have some illustrative example slides. I want to make it very clear that we are not tied to any particular syntax here. So although we do use some example syntax in the slides, I'm really hoping to avoid rat-holing on syntactic details too much today. We're really trying to focus on the use cases and whether or not the committee thinks this is a problem worth solving or investigating further. With that out of the way, MWS, go ahead when you're ready.

MWS: All right, so this is not actually precisely about properties of strings or sequence properties; this is more broadly about regular expressions. What we are proposing here today is to add set notation to character classes in regular expressions. Basically, starting from where we are: we have Unicode properties, and that's a wonderful way of, number one, making the character class or regular expression future-proof, because as Unicode adds characters like digits or letters, these things naturally grow and we don't have to update them ourselves. It also means that when we have something that would take hundreds of ranges - actually in this case I think it would be 63 ranges just for the digits - the regular expressions don't get totally unwieldy. That's all fine and good, but typically you quickly get into a place where you want to have whatever the property says plus a few things; or maybe you want to combine multiple properties; or you want a property like [?] except for something, and then you want to remove things, and sometimes what you remove is another property, sometimes what you remove is just a list of exception cases. And it's also quite common for people to use basically an intersection of sets, meaning: I want characters to match that have both this property and that other property.

MWS: Currently what we have in JavaScript regular expressions is that we can make a union inside a character class. You can use a character class escape, which means you can have the properties in there, and you can have characters and ranges, and you can make the union of these. So basically, for one notion of a quote-unquote "identifier letter", for example, you would take all the letter characters, all the combining marks, and all the characters that have a numeric kind of property - including the digits, but also Roman numerals and such. And in this case I also added an underscore, just to illustrate that we can have just a single character [?] as well. That helps for union, but what if you only want the letters that are in the Khmer script, or the Cyrillic script, or some other script? And this is a real-life example - except for the underscore here, which kind of gets thrown out as we intersect with the Khmer script. This is the kind of thing that's used in real life. I fished this out of a piece of Google code, except that for ECMAScript I had to express the intersection as a positive lookahead - that's the issue - and that's pretty unintuitive. When I looked on StackOverflow and other places for how to do intersection and set difference in various regular expression engines, for some of them it works as a built-in feature, and for some of them you have to do things that are not obvious, and one of the suggestions that keeps coming up is to use lookahead.
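The lookahead workaround described here can be tried in any current engine. This is an illustrative reconstruction of the Khmer-letters example from the slides, not the slides' exact code:

```javascript
// Intersection emulated with a lookahead: a character matches only if it
// is in the Khmer script AND is a letter, combining mark, or number.
const khmerLetter = /(?=\p{Script=Khmer})[\p{L}\p{M}\p{N}]/u;

console.log(khmerLetter.test('ក')); // U+1780 KHMER LETTER KA → true
console.log(khmerLetter.test('a')); // a letter, but not Khmer → false
```

Note that the lookahead and the character class each test the same input character, which is why MWS describes this as doing two lookups per character.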

MWS: This is kind of clunky and not intuitive, but it's also slower, because it actually does two lookups per character. I don't know if engines can optimize that, but typically I would expect it to have an impact. We can do the same thing with a negative lookahead, for subtraction. So if we want the non-ASCII digits, we can match on decimal number characters, but only if the character is not also an ASCII 0 through 9 digit. So that kind of works. The other thing that people can do, of course, is take the full list of ranges corresponding to the property, remove the ones that they don't want, and basically pre-compute and then hard-code the character class. And that works, of course, but now we are back to having lots of ranges. So, abbreviated as here - this is actually sixty-two ranges of 0 through 9 in its full form currently, and every few years Unicode adds a script that has its own set of digits. So this gets out of date; we lose the benefits of readability and of self-updating properties. So probably this is not the best way of doing things. What we are proposing is basically to add real support for doing intersection and subtraction (set difference), and with that, to make it useful and meaningful, we need nested character classes. So we need to be able to have not just a class escape like a \p{…} inside a character class, but also another square-bracket character class, for doing things like exceptions. So for example, instead of doing the grouping and the positive lookahead, we could just write something that looks like - well, yeah - there is a syntax for doing an intersection. We have the Khmer script property, and the Letter, Mark, and Number class; the intersection of those two classes or sets is what we want for Khmer letters. I put a note here that this is not necessarily the actual syntax we would be using; this is the kind of syntax that's used in other places.
There is variation on things like precedence, and on whether to use a single or double ampersand, and various things like that. We're not trying to settle those kinds of things here; we're just presenting an idea. For subtraction, for the non-ASCII digits, instead of doing a negative lookahead we can just write a character class, and the character class gets computed from the decimal numbers except for those ASCII digits. And you could also express it in a different way: you could use a property here, right? You could have the decimal numbers minus the property for the ASCII range, so this could be a \p{ASCII}. But in this case it's kind of easy to list the single code points for the digits that we all know and love. And of course, in a union, in order to keep the syntax consistent and predictable, it would also be handy if we could have a square-bracket character class in there, even though it's not strictly necessary - in an example like this it would work just as well if you left out the inner square brackets, but it would be strange if we allowed a character class nested inside some constructs but not here. So I dug up a few more examples from Google code, where people were doing things like breaking spaces - basically taking all the 30 or so space characters in Unicode and removing the ones that work like a non-breaking space, and things like that. I'm aware that the line-break properties aren't currently supported in ECMAScript regular expressions, but I think that's a decent illustration. Anyway, there's an emoji property, and there are ASCII characters that also have this property, for the keycap sequences, and I found code that wanted to remove those - it didn't like having the ASCII characters in there. There was code looking for combining marks that were not script-specific, so they intersected the combining mark property with having the Inherited or Common script.
So those are all the characters that are not like the vowel marks in, say, Khmer, but more like the acute and grave that we use in French and Spanish and other places. Or there was a piece of code that was looking for the first letter in a script: it had a starting point of taking things that are quick-check Maybe or Yes in normalization form NFC, and then removing things that have the Common or Inherited script properties, meaning punctuation, symbols, those kinds of things.
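The negative-lookahead emulation of subtraction mentioned above also works in today's engines. A sketch of the non-ASCII-digits example (illustrative, not the slides' exact code):

```javascript
// Subtraction emulated with a negative lookahead: match any Unicode
// decimal digit (\p{Nd}) that is not also an ASCII digit 0-9.
const nonAsciiDigit = /(?![0-9])\p{Nd}/u;

console.log(nonAsciiDigit.test('٣')); // U+0663 ARABIC-INDIC DIGIT THREE → true
console.log(nonAsciiDigit.test('7')); // ASCII digit, subtracted out → false
```

The alternative is to hard-code the sixty-plus ranges of non-ASCII digits, which is exactly the maintenance problem MWS describes.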

MWS: For comparison, we have two versions of this slide. We looked at other regular expression engines; they all basically support unions and nested classes. Of the ones we looked at here, several support either intersection or subtraction. I'm not sure why, if they go to the lengths of having one syntax, they don't add the other; you can emulate these things by combinations of intersecting with the negation and so on, but it's much more obvious if you have dedicated syntax. There are a couple of regex engines that also support a symmetric difference. I'm not really sure what the use case for that is, but UTS #18, the Unicode regular expressions spec, describes it, and a couple of places have implemented it. This is a table view of that, showing which engine has which feature in a more visual form - thanks to Mathias, who put together this nice table. The star here, and the shrug for Java subtraction, is basically saying that they don't have syntax for it, but they document that you can get it by combining intersection with something else. And then you can see there are places that do one but not the other, and at this point ECMAScript regexes can only do union - not even union with a nested class - and we would like to fill in that bottom row. I'm not sure that we need symmetric difference - we are basically not proposing to add that - but otherwise, subtraction, intersection, and nested classes would be handy. And we think they would be handy, in summary, because regular expressions with these character classes become a lot more readable and more intuitive, and because of that they help avoid the errors that come with hard-coded classes, which are hard to check and hard to keep up to date. Otherwise people might be tempted to just stick with simple character classes that don't really support internationalization, or to use lookaheads, which are unintuitive, not easy to use, and hurt performance.
So, as Mathias said, this is not yet ready for deciding on something concrete - "this is what it's going to look like" - but if we could get a thumbs up for continuing this work, that would be great. If people have concerns and think this doesn't fit in ECMAScript regexes, I would like to hear what the line of argument for that is, and we can think about it. But basically what we are asking for is a go-ahead to make a real proposal for adding these features to JavaScript/ECMAScript regular expressions. MED suggested that I put in an example of something that has a lot of ranges. So here we have all the letters that are not lowercase: these are almost a hundred thirty thousand characters, with hundreds and hundreds of ranges. Imagine you do this, and then you don't just have this one character class, but this kind of expression, and you have to write it out and someone has to make sense of it. Mathias?

MB: Yeah, at this point I think we can open the discussion and see if there's anything on the queue.

WH: Long ago during the presentation you had a bunch of slides in which you used /(: …)/. As far as I can tell that just evaluates to a capturing group whose first character is :. What did you mean here?

MWS: I have to admit that I'm not a complete regex maven, but I've seen suggestions on Stack Overflow with and without the grouping, and I remember there was a reason why the grouping was needed when you were doing a plus or star operator after it. It was something about how these things get evaluated.

WH: I'm not familiar with that. As far as I can tell it's just a capturing group that requires its first character to be a colon.

MWS: So I think the high point of this kind of slide is that we need a lookahead assertion to emulate the intersection or the subtraction. I had the impression from some of the Stack Overflow answers that the grouping was also useful to do some of the work that needed to be done. But if that's not the case, then I can remove that part from the example.

BSH: I think ‘?:’ actually means that it doesn't really capture, it only groups.

MWS: I'm sorry for not copying it correctly.

WH: Okay. I thought you were trying to introduce a new syntax which I wasn’t familiar with here.

MWS: That's just a typo; I'm sorry for any confusion. So I need a question mark before the colon.

WH: Yeah, okay. I'm also next on the queue. There are all kinds of syntactic issues with this proposal. /[[0-9]]/ currently means a character class containing '[' or one of the ASCII digits, followed by a ] character. Also, /a--b/ means a range starting with a and ending with a -, or a b character. Introducing those would be breaking changes. I want to get an idea of where you want to go with this. Is your intent to do this without breaking changes to the language or not?

MB: We're definitely not interested in making breaking changes to the language. So again, with the examples we used, we're not married to any particular syntax, but if we want to pursue this in some way we either need syntax that is backwards compatible, i.e. something that currently throws an exception, or we would need to introduce a new regular expression flag. Those are the two options.

WH: I’m concerned because all of all of the examples you gave are breaking changes — they currently mean something different.

MWS: Yeah, I think there are degrees of breaking changes. Regular expression engines have commonly extended their syntax using these doubled operators like -- or &&.

WH: -- inside a character class means an endpoint of a range is a minus sign. So that's already used.

MED: I mean, there are a variety of ways to make it less likely. For example, you can require that the -- only occurs between character classes for ranges, so you have either a \p{…} class or a square bracket on each side. That's a possibility. I don't know if we want to go deep on that, but I think the choices a lot of the regular expression engines have made to reduce backwards compatibility problems, even without a flag, have been pretty successful. People roll these out without a lot of problems, but there is always the option of using a flag to make it really sure.

WH: I worry that trying to specify syntax in which -- only works if you have a \p on one of the sides will make the syntax very irregular and extremely difficult to understand.

MED: I think it's a balance. You can have a flag, you could have a single ampersand, you can do all sorts of things. Or you can make breakage less likely and make it easier to migrate expressions by also using syntax like doubled characters that are unlikely to appear; it's unlikely that you would have a double - between character classes, and that's the place where it's really useful.

WH: -- is used quite a bit. Things like the nested square brackets also mean something already. If you nest square brackets the first closing square bracket will end your character class while the redundant opening square bracket is just a square bracket literal.

MED: I don't want to talk too much, but a lot of programming languages have faced exactly the same problem, and I think we can learn from the steps that they've taken.

WH: I'm really worried about breaking syntax. But I do support the concept of doing operations character classes like unions and intersections and whatnot. Also, to clarify, you're proposing to do this only for single characters and not character sequences, right?

MWS: That is a separate proposal, but they could be combined in the future: if both of these proposals are accepted, they would naturally combine. For example, if you have the set of all emoji, which is a sequences property, and then you remove the emoji flag sequences as a subtraction, that would work and make sense.

WH: In some cases yes, in some cases no, but I don't want to rathole on that right now.

MB: Waldemar, just to quickly respond: what I think I'm hearing you say is that you would prefer to add a new flag rather than us trying to find syntax that somehow doesn't break anything. Is that correct?

WH: Yes. I want a syntax in which it’s possible for mere mortals to understand when you can use it.

AKI: You know this is regular expressions, right?

WH: Which is why this is important.

MF: On the addition of syntax for this feature: while I support this feature because I think that it's a problem best solved within the browser (who has the appropriate Unicode data), I don't think that it's a common enough problem that we should really be worried too much about adding syntax for this, especially in the Pattern grammar where it's very hard to do so. I would prefer an API for doing set operations on regular expressions.

MB: Re: how common this is, I'd like to clarify that the examples we've used in the repo and these slides are taken from a code search within Google's internal monorepo. So these are all examples that are being used in production today, not made-up examples.

MF: Oh, yeah. I don't doubt that. It definitely is happening, and happening often enough to warrant inclusion in the language, and it's inconvenient enough to do within a library. I just think that it doesn't happen commonly enough to warrant addition to the regular expression Pattern grammar, which as we discussed just a minute ago is already hard enough for users to comprehend. If you're doing something here on the more advanced side of regular expression use, I think you can use an API to construct your regular expression, especially if we get better ways to construct regular expressions in the future, like the template tag for regex construction we were discussing in IRC, which would make it really easy to compose character classes. That's my opinion.

MB: Okay, so that would be a third option. So far we've been talking about a) somehow finding magical syntax that is backwards compatible, or b) adding a new flag, but there's also c) not doing either of those and adding an API instead. Okay, I see.

MWS: I don't know about complicating the grammar. It seems like a lot of other regex engines have added this, which tells me that they had motivation to do so, and it seems like they document it in ways that aren't all that confusing, I think.

MED: Yeah, I think this would be the least complicated addition to regex if you look at all of the possible syntaxes and complications that are used in rege-

MF: You say that but you've proposed something that's ambiguous. I don't know how you could make that claim.

MED: No, it's not ambiguous. If you have a flag, it's not ambiguous. If you have a flag, you can prevent the backwards compatibility issue. I'm not sure why you say it's ambiguous.

MF: I say that without the flag. Yes, with the flag, it's not ambiguous. But then now you have the mental burden of having to know which flags are enabled when you're trying to read a regular expression.

MED: I mean, this is something that ECMAScript did in the past, when it went to using Unicode capabilities and eventually the u flag. I don't think you have to use that flag anymore.

MF: Yeah, we have the u flag, and I don't think anyone would argue it wasn't worthwhile adding that flag, because it serves such an incredibly useful purpose.

KG: Just a quick comment there. You do still have to use the u flag and you will have to use the u flag forever because we never change the meaning of anything ever. So if we introduce a new flag here you will have to continue to use that new flag for this syntax forever.

MED: Although that new flag could subsume the u flag as well, if you wanted to keep the number of flags in an expression down. If it were a new flag, we could say that it also implies the u flag.

MF: I don't think the burden is the number of flags. It's the number of individual states that you have to be aware of when reading a regex.

MLS: So MF, when you say API, you're thinking about an API to construct the pattern? I just want to clarify. [transcription error]

MF: The API I'm talking about is doing the like union and intersection operations on character classes and resulting in some representation of the code points that would or would not be matched by that character class and then being able to construct a regex from it via a separate API.

MLS: I see some problems with that. Consider a regular expression that has multiple logical operations of union, subtraction, and intersection: you'd almost have to have a formatting kind of API that takes some special syntax and constructs the pattern, or the API would need the ability to compose separate patterns so you can do each piece individually. I think that may be just as problematic as coming up with syntax that is non-breaking.

MF: I believe so. I believe MB has a library that does, effectively, this. MB, can you speak on that point?

MB: Yeah. Okay. I have a library called Regenerate, for "regular expression generate". It's meant to be used at build time (instead of at run time), partly because performance can be a problem depending on how you implement this; my implementation is pretty basic. It gives you an API to easily operate on sets of code points, making it very easy to do union or subtraction or all of these things, and then to finally toString() that resulting set into a regular expression pattern, with different options to customize the output depending on whether or not you want to use the u flag. I think there's also precedent for this in ICU, and MWS can definitely speak more about that; UnicodeSet is the name of that API. So this is what I was imagining when I heard your proposal: conceptually it could be an API that accepts a string describing the pattern, like the patterns we had in the slides, and then produces either a regular expression pattern or some other type of object that you can then combine, as MLS said, into the larger regular expression pattern. Does anyone want to speak about UnicodeSet and how ICU handles this?

MWS: Yeah, I can speak to that a little bit. The UnicodeSet was, I think, cooked up by MED over 20 years ago together with [?], who was working on the transliterator service in ICU, which is basically a rule-based way of transforming strings, for example from Russian Cyrillic to Russian Latin, but you can also do other things with that syntax. It introduced the notion of the UnicodeSet, which is like a regular expression character class. The rules in the transliterator are a lot like regular expressions, but you potentially have hundreds of these kinds of rules, and the sets are used mostly for the context before and after something that is to be replaced, a bit like a lookahead assertion, but more about describing the context. That was probably one of the early places that supported Unicode properties in the syntax, and one of the early places, something like 20 years ago, that supported the set operations. That has been very fruitful in the transliterator framework. It's also been used as a standalone feature in lots and lots of places. You can create a UnicodeSet based on one of those patterns, and people do that all the time, and then say: given a string, tell me how far I can go from the beginning of the string, or from some offset, with characters that are in the set, like whitespace characters or the [?] letter sets that I showed as an example, and give me the end point of that. So that's been a very useful and very popular feature, and people are successfully writing their own patterns for these things, which are really just regex character classes in a standalone implementation. MED, do you want to add something?

MED: I think you covered it quite well. The advantage of UnicodeSets is that you have both: you can create them from a string representation, and you can also perform all of these set operations on them, so I can define one UnicodeSet, later subtract it from another, produce a third one, and so on. That capability turns out to be very useful. It's as if you had sets of characters, but with a much more compact representation, much easier to process, and a smaller footprint.

AKI: Yeah, I think we're ready to move on to Richard.

RGN: Yep, we had some sleep deprived chatter in IRC and basically at least convinced ourselves that there is room for syntax here (e.g., /\U[\p{N}--\p{Nd}]/u). I don't want to get into the concrete parts of it. Obviously that's a later stage concern, but it's not prima facie dead.

KG: Yeah, just as RGN said, the details of the syntax are definitely a later stage concern.

SFC: It looks like there's positive sentiment toward this, which is good. I also just wanted to note that the set operations were one of the biggest areas where ECMAScript regular expressions were farthest behind regular expression engines in other programming languages, but there's going to be room to extend this to some of the other features that Unicode regular expressions also support, the biggest of which is multi-character sets, which are important for things like emoji. I think this was alluded to a little earlier in this conversation, but I like the idea that Richard posted about the set notation here. That could also be extensible to other areas, so I just wanted to get that thought out there, because I think it would be really exciting if ECMAScript added not only set notation, but also did it in a way that lets us add other Unicode regular expression features at the same time.

Sentiment meter:

  • Strong positive: SFC, RGN
  • Positive: DE, KG, BSH

MLS: There already is a sequence property proposal.

SFC: There is a sequence property proposal, but that's different from sequence character sets, and the sequence property proposal is indefinitely blocked on the syntax. My hope is that if we were to agree on adopting Unicode character class behavior, then this could also help us unblock the sequence property proposal. That's really one of the reasons we're looking at this problem: the sequence property proposal is not as useful as it could be without support for all of these other features, which is in large part why Markus and Mathias are bringing this proposal forward.

WH: Since you brought up sequence properties: if you intend to go to sequence properties, I think it's crucial to consider the syntax of sequence properties together with this, because there are things which will come up in the combination of the two that will not come up if you consider this alone and sequence properties alone.

SFC: Yeah, I agree. Would you recommend basically having one unified proposal that introduces both sequence sets as well as the Unicode set operations, one proposal that introduces all these features? Or is it a better process to have each of these as separate proposals?

WH: What I would recommend is that you find out if the committee is favorable towards considering sequence properties as part of this. If you get an unfavorable reaction, well, then you'll have to figure out what to do. If you get a favorable reaction towards considering sequence properties, then we'll want to figure out what the syntax should be for sequence properties inside character classes — well, what used to be character classes, what will now be sequence classes. And in particular the issue will come up of how you define singleton sequences within sequence classes.

MLS: You also have the issue of negation and things like that.

WH: Yes, you’ll have that issue too. Back to the point I was making: Singleton sequences is the main thing of interest here and that will probably drive you away from some syntactic choices you might have made if you didn't consider them.

MWS: If I might jump in here for a moment: in the UnicodeSet in ICU, we have supported what you call sequences, which we just call strings, as part of a set for something like 15 years. At the time, we added syntax that wasn't quite backward compatible, just using curly brace bracketing around the string as a single element of the set. I understand that that particular syntax is too disruptive. There is a recommendation in the Unicode regular expression spec for something like \q{ and then the string, but in general there are definitely options for supporting something like that, and these kinds of things do make sense. We use them, for example, in the locale data, the CLDR data, for things like the set of characters that you need to write a language, and that can include sequences, not just single code points. The other thing: someone mentioned negation, and for UTS 18 I really have to credit MED on this one. He came up with a way of resolving some internal negations so that in the end you can test and make sure that you don't end up with a negated set that contains multi-character strings, because that just doesn't work. But it is possible (if you wish to do that) to have some negation on the inside and have it be resolved in case that's permissible. For example, you could have a double negation which then falls out and gets resolved away.
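(For reference, the \q{…} notation MWS mentions from the Unicode regular expression recommendations would look roughly like this. This is pseudocode, not valid ECMAScript regex syntax.)

```
[a-z\q{ch}\q{ll}]  # a "sequence class" containing the letters a-z
                   # plus the multi-character strings "ch" and "ll"
```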

WH: Yes, my thinking about this is that too much of the syntax within the character classes we have today is already used for existing behaviors. I would prefer to start with a cleaner slate, which we can get with a flag that lets us define a straightforward uniform syntax for doing sequence classes, and not have to worry about doing really bizarre contortions to avoid breaking stuff.

MED: And I think that's a perfectly reasonable direction to take. A lot of times you make decisions for backwards compatibility, and ten years down the line people regret it because it gets so ugly.

MLS: And, with the u flag, we have syntax available to us, because the currently unused escapes are syntax errors with the u flag, so we can introduce new escapes for new constructs.
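(For reference, MLS's point can be observed today: with the u flag, unrecognized escapes are syntax errors rather than identity escapes, which reserves them for future syntax. Illustrative snippet, not from the slides.)

```javascript
// Without the u flag, Annex B treats '\q' as an identity escape
// that simply matches 'q'. With the u flag it is a SyntaxError,
// so escapes like this remain available for new constructs.
console.log(new RegExp("\\q").test("q")); // true
try {
  new RegExp("\\q", "u");
} catch (e) {
  console.log(e instanceof SyntaxError); // true
}
```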

MB: Yeah, we were discussing this on IRC. It was an interesting discussion and we could basically do \UnicodeSet{…}, which I think is quite elegant and readable. But anyway we can discuss this on the repository.

AKI: Yeah, I know that we've definitely had the space to discuss syntax here, but since that's not really an early-stage issue, we are just about at time.

MB: All right. Thanks everyone.

SFC: Thank you. This feedback was helpful. And I also recorded in the notes the sentiment meter.

KG: just for the notes. Are we officially calling this stage one?

AKI: So I'm going to just mention this doesn't to my knowledge have a repo.

MB: It does have a repo and slides, but we didn't provide those materials before the stage advancement deadline. It's fine, we're not asking for stage 1.

MBS: I believe, though I could be wrong, that historically those requirements didn't apply to stage 1, but someone can correct me if I'm wrong there. Actually, I think they do apply; the committee has the ability to refuse items added after the deadline. I don't remember now, but I believe we changed some details of that in a recent meeting. I would like to check before something goes up for stage advancement.

YSV: Stage 1 isn't quite as important, but if something doesn't get onto the agenda before the 10-day deadline, then, for example, member organizations that rely on that period for peer review won't be able to review it. Yeah, I think the deadline should apply to stage 1.

MWS: I apologize for putting this together so late. I was in Germany for three weeks with my parents and only got around to doing it early this week.

YSV: No problem. Thank you so much for the presentation. Okay.

AKI: Yes, thank you so much, and thanks for the conversation.

Conclusion/Resolution

  • No advancement due to late addition to the agenda