Editorial and formatting changes for clarity #120
* Editorial and formatting changes * Delete .gitignore * Added suggested changes * Minor updates suggested by Leo
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. |
Earlier, we experimented with defining the term feed in relation to dataset:
The above definition of feed was not included in this pull request. The changes in this pull request remove ambiguous uses of the word feed. Questions:
|
While I would personally rather see that documentation would be written in
Publication
Operator For some changes I think this goes in the wrong direction, specifically:
This means two totally different things in my opinion, and would only |
Thanks for your review. Here are responses.
I'm hesitant to use the word operator because "transit agency" has historically referred to the service brand, which is sometimes distinct from the operator. For example, RATP operates Capital MetroBus in Austin TX but we don't want to display "RATP" to riders in Austin because it doesn't mean anything to them.
I worry that "publication" has a very general meaning in English. As an alternative to dataset, what do you think of "feed iteration"?
I am guessing that "trip" in GTFS is equivalent to "ServiceJourney" in Transmodel? Would "journey" confuse those who are familiar with Transmodel terms? |
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
I'm the commit author and I'm okay with contributing. |
This discussion is really great, because it shows that we are not interested in an operator or agency, but instead in the brand. Would you agree that we should use brand instead?
For NeTEx a delivery starts with: <PublicationDelivery xmlns:mstns="http://www.netex.org.uk/netex" xmlns="http://www.netex.org.uk/netex" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:gml="http://www.opengis.net/gml/3.2" version="1.0"> I would say let's not try to find synonyms or in-betweeners.
Exactly.
It again depends on what you are trying to communicate here. Do you mean all the legs that make up the entire journey the rider takes to go from A to Z via B, C, D etc., or do you mean the leg that could be walking or a specific ServiceJourney/Trip that goes from A to B, B to C, etc.? This is why we need an unambiguous shared vocabulary. |
This is an area where the spec is ambiguous, but in practice agency_name generally means "transit brand". We might go with either "brand" or "transit brand". This would be more true to practice, but might necessitate changing how other data elements are presented in relation to agency.txt. For example, "Timezone where the transit brand is located" reads oddly compared to "Timezone where the transit agency is located." Even though "transit agency" isn't precisely accurate, it does follow the terminology embedded in the spec's file and field names. How about we alter the text to reference "brand" where that will make sense? Addressing branding completely will require changes to the spec.
By "journey" we mean all the legs that make up the entire journey a rider takes to go from A to Z via B, C, D, etc. As you point out this is potentially confusing. I propose we use the term "rider journey" to denote the entire trip from A to Z.
I'm opposed to the term "publication" because I don't think it means anything to most existing GTFS users. Of course I may be wrong about that and we should involve some more people in this discussion. |
IMO "transit brand" is more technically correct than "transit agency" but, due to the fact that agency is written into many field names, I think we lose more than we gain by shoehorning "transit brand" in. I should also note that MobilityData plans to add transit branding as a fully featured GTFS extension. So, for now, I propose we leave "transit agency" as it is and fix this issue with the transit branding extension. I've also replaced "journey" with "rider journey". |
changed "journey" to "rider journey"
Note @giocorti's earlier comment was edited. We propose to leave "transit agency" in for now, and later address questions of agency branding holistically. This follows the purpose of this spec modification -- to provide greater clarity without making substantive changes in meaning. If there are no other comments, I'd like to call the vote tomorrow. |
I'd like to call the vote on this change. This PR makes editorial and formatting changes to the GTFS specification. See the opening comment for a summary and motivation. The vote will close on Dec 26 at 23:59:59 UTC. |
-1 As mentioned before, there is still too much that is unclear, and we are again introducing new terms for subjects that have been heavily standardised. Obviously this isn't all black/white, but some changes are more controversial than others. |
@skinkie: Which changes can we remove or alter to get your support for the proposal? |
@antrim as we have discussed before "transit agency" is at this moment what it is described. But the change of itinerary to "Rider Journey" just becomes newspeak. |
@skinkie If we reverted back to "itinerary" instead of "rider journey" would that solve your concern? |
And add that brands are at this moment encoded as different agencies: sure. |
I've made the requested changes. "Itinerary" has been kept and "brand" vs "agency" is now discussed. I should also note that we've defined the term "service day" in this PR, and this was accidentally omitted from the initial change list. |
@giocorti thanks for this effort, I really appreciate it. With respect to service day, we currently define a service_id that is never used in the context of the words "calendar" or "calendar_dates". It suggests that it is some abstract grouping of both. My preference would be that it wouldn't be called service day but operational day. But I do want to ask another question, please don't take this as an offence. Where do you get your inspiration to get to these terms? For example compare the search queries:
|
@skinkie thanks for the feedback, and no offense taken at all! I'm using the term "service day" because it's already in the spec. I've specifically added it as a defined term because it appears in a number of places and it could be potentially confusing, as it doesn't correspond to an actual day (service days can be greater than 24 hours). So really the definition is just to make it explicit and obvious that a service day is not the same as a day. But you bring up a good point about "service day" vs "operational day". Unless someone else is opposed, I think that this is a change that should be incorporated. |
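To illustrate why a service day is not the same as a calendar day, here is a minimal Python sketch that reads a GTFS time value as an offset from the start of the service day. The function name `parse_gtfs_time` is hypothetical, and the sketch assumes the `HH:MM:SS` format with hours allowed to exceed 23. It ignores daylight saving subtleties.

```python
from datetime import timedelta


def parse_gtfs_time(value: str) -> timedelta:
    """Return a GTFS HH:MM:SS (or H:MM:SS) value as an offset from the
    start of the service day. Hours may be 24 or more (assumption: the
    value belongs to a trip that runs past midnight)."""
    hours, minutes, seconds = (int(part) for part in value.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid GTFS time: {value!r}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# A departure at 1:35 a.m. on the following calendar day still belongs
# to the previous service day, so it is written as 25:35:00.
late_departure = parse_gtfs_time("25:35:00")
print(late_departure > timedelta(days=1))
```

Keeping the value as an offset, rather than converting it to a clock time, avoids silently wrapping late-night trips back to the start of the same day.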
Please allow some time for feedback on the service day subject. If it is already used somewhere in the spec I am not opposed to use the term here. What I am eager to learn is where these terms originate from and where they started to deviate (etymology). |
My inspiration for these terms just comes from the general (American) English lexicon. In general I've just tried to use the most accurate and understandable word in order to minimize confusion for someone reading the reference. In some cases (such as "record") I've also consulted what is, IMO, the appropriate technical literature. I should explicitly state that I am not trying to mirror the terminology defined by some outside party or spec. Rather, I'm trying to make the reference as understandable and consistent as possible to a wide variety of readers. I admit that, as a native American English speaker and someone who is not well versed in other specs such as Transmodel, I do exhibit a linguistic bias towards vernacular American English. In some cases, such as empty vs blank, there is no real reason to choose one word over the other. I just wanted a single word to be consistently used. I'm more than happy to discuss my reasons for choosing specific words if there are any you're wondering about. |
How about we call the vote after the new year, since many people will be away for the holidays? This will let everyone weigh in and allow a discussion of the terminology. We have 30 days from the last vote to continue working on the proposal. |
+1 |
Good catch @prhod. Thanks. We definitely do not intend to change this behavior. It has been discussed in the past and the conclusion was to keep it as it is. I don't want to re-open this discussion in this thread. @giocorti Could you have a look at that? I see you define ID type as "A sequence of any UTF-8 characters which uniquely identifies an entity, but does not necessarily identify a specific record in a table." But is there a place where you define uniqueness? |
@LeoFrachet while not intended I don't mind to give my support for this great idea of non-continuous service dates ;-) |
Uniqueness is not explicitly defined anywhere in the spec. I've actually removed language that specified that IDs were "dataset unique" where "dataset unique" was defined as
I removed that language because it was factually incorrect in some cases. For example, shape_id in shapes.txt does not identify a distinct entity within a column. Of course, uniqueness is an important concept in GTFS so we may want to define it. But we'd also need to be careful about the exact language we use to do so, as it's easy to write something that isn't entirely accurate. |
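As a rough illustration of an ID that identifies an entity without identifying a single record, the Python sketch below groups the rows of a small shapes.txt sample by shape_id. The sample data and the `load_shapes` helper are hypothetical. Only the field names come from GTFS.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: shape "A" is described by two records.
SHAPES_TXT = """shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
A,30.1,-97.7,1
A,30.2,-97.7,2
B,30.3,-97.8,1
"""


def load_shapes(text: str) -> dict:
    """Map each shape_id to its (lat, lon) points ordered by sequence."""
    points = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        points[row["shape_id"]].append(
            (
                int(row["shape_pt_sequence"]),
                float(row["shape_pt_lat"]),
                float(row["shape_pt_lon"]),
            )
        )
    # Sort by sequence number, then drop it from the result.
    return {
        shape_id: [(lat, lon) for _, lat, lon in sorted(pts)]
        for shape_id, pts in points.items()
    }


shapes = load_shapes(SHAPES_TXT)
print(len(shapes), len(shapes["A"]))
```

Here there are three records but only two shapes, so shape_id is not unique within its column even though each value refers to exactly one shape.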
@giocorti the point with service_id is that in calendar_dates.txt it is defined multiple times. |
Added clarity and specificity by explicitly stating that a service_id may appear only once in calendar.txt.
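The difference between the two files can be sketched as a small check in Python. The sample data and the `repeated_values` helper are hypothetical, and the sketch assumes that calendar_dates.txt should contain each (service_id, date) pair at most once.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data; "WKDY" is wrongly listed twice in calendar.txt.
CALENDAR_TXT = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
WKDY,1,1,1,1,1,0,0,20190101,20191231
WKND,0,0,0,0,0,1,1,20190101,20191231
WKDY,1,1,1,1,1,0,0,20200101,20201231
"""

# The same service_id on several rows is expected here.
CALENDAR_DATES_TXT = """service_id,date,exception_type
WKDY,20190704,2
WKDY,20191225,2
"""


def repeated_values(text: str, fields: list) -> list:
    """Return the combinations of `fields` that occur on more than one row."""
    counts = Counter(
        tuple(row[field] for field in fields)
        for row in csv.DictReader(io.StringIO(text))
    )
    return sorted(key for key, count in counts.items() if count > 1)


# calendar.txt: a service_id may appear only once.
print(repeated_values(CALENDAR_TXT, ["service_id"]))
# calendar_dates.txt: check the (service_id, date) pair instead.
print(repeated_values(CALENDAR_DATES_TXT, ["service_id", "date"]))
```

Passing the key fields as a parameter keeps one helper usable for both files, since the set of fields that must be unique differs between them.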
@LeoFrachet I agree - I think that change is significant enough it requires a re-vote. |
I am calling a third vote (!) on this change, which @giocorti updated (c7f18ba Jan 20) during the second vote. This PR makes editorial and formatting changes to the GTFS specification. See the opening comment for a summary and motivation. The vote will close on Jan 17 at 23:59:59 UTC. For the four who had already cast their ballot, please revote: @skinkie, @abyrd, @barbeau, @prhod. |
+1 |
+1 |
+1 (Thanks for the clarification). Let's keep the non-continuous service dates for later ;-) |
Yes, good catch @prhod. +1 |
+1 |
The vote closed on Jan 17 at 23:59:59 UTC. We have 5 votes in favor of the change, from both GTFS producers and consumers, and no votes against. So this change passes. We'll get this merged. |
@LeoFrachet Yes. |
@googlebot. The pull request author (@giocorti) confirmed they are ok with contributing in this comment: #120 (comment) It looks as though that is the last step we need to pass the checks. |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
So there's good news and bad news. 👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there. 😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request. Note to project maintainer: This is a terminal state, meaning the |
I just updated the revision history in 61c15c5 and it looks like that confused the CLA bot again. I'm ok with my commits being contributed to this project (obviously). |
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) |
Motivation: Inconsistencies in the GTFS make it more difficult for newcomers to understand the specification and build software. Our motivation is to make GTFS more clear and consistent.
Summary: This PR makes the following editorial and formatting changes to the GTFS specification. With one possible exception (discussed below), this pull request merely changes the language of the GTFS reference, but does not change any of its meanings.
List of changes:
parent_station
stop_timezone
arrival_time and departure_time
Monday, ..., Sunday
timepoint
exact_times
Background discussion is in the original (@MobilityData) draft pull request: MobilityData#8