Store more information (including name handling rules) for each category #4906

Closed
bhousel opened this issue Feb 16, 2021 · 15 comments

Comments

@bhousel
Member

bhousel commented Feb 16, 2021

Moving this discussion to a new issue from the Greene King PR:

"What to do with names" is really the final unresolved thing about NSI that I need to figure out before making a real v5 release - you can see on openstreetmap/iD#8305 that we're running into this same thing with Amazon Lockers (they all have unique names), and on openstreetmap/id-tagging-schema#119 the same thing with cinema chains.

I see three ways to handle a name tag, depending on what kind of a feature we have:

  • "strict name" - NSI provides the correct name, and iD should enforce it (e.g. "Burger King")
  • "default name" - NSI provides a name that works as a suitable default, but users can change it (e.g. "Amazon Locker" / "Amazon Locker - Hemlock") - I think pubs are like this
  • "no name" - NSI doesn't contain a name tag and takes no opinion on what the user wants to do - bus routes are like this.

I think this information should live in NSI somewhere, so we can remove the iD preset-field hack. I also think that this should be a per-category decision (maybe items can override it), but this is tricky because NSI has quite a lot of categories.

So I'll be adjusting the file formats a bit to settle on a nicer way to capture more information for each category.
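
The three name-handling modes above, with a per-category setting and an optional per-item override, could be sketched roughly as follows. This is only an illustration: `NamePolicy`, `Category`, `Item`, `resolvePolicy`, and `suggestName` are hypothetical names and do not come from NSI or iD.

```typescript
// Hypothetical types and helpers; none of these names exist in NSI or iD.
type NamePolicy = 'strict' | 'default' | 'none';

interface Category {
  namePolicy: NamePolicy;    // per-category decision
}

interface Item {
  name?: string;             // name supplied by NSI, if any
  namePolicy?: NamePolicy;   // optional per-item override
}

// An item-level policy, when present, wins over the category-level one.
function resolvePolicy(category: Category, item: Item): NamePolicy {
  return item.namePolicy ?? category.namePolicy;
}

// Decide which value the editor should end up with in the name tag.
function suggestName(policy: NamePolicy, item: Item, existing?: string): string | undefined {
  switch (policy) {
    case 'strict':
      return item.name;               // always enforce the NSI name
    case 'default':
      return existing ?? item.name;   // keep the mapper's name, else fall back to NSI's
    case 'none':
      return existing;                // NSI takes no opinion
  }
}

const fastFood: Category = { namePolicy: 'strict' };
const lockers: Category = { namePolicy: 'default' };

const burgerKing: Item = { name: 'Burger King' };
const amazonLocker: Item = { name: 'Amazon Locker' };

console.log(suggestName(resolvePolicy(fastFood, burgerKing), burgerKing, 'BK'));
console.log(suggestName(resolvePolicy(lockers, amazonLocker), amazonLocker, 'Amazon Locker - Hemlock'));
```

Under these assumptions, the first call enforces "Burger King" and the second keeps the mapper's "Amazon Locker - Hemlock".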

Originally posted by @bhousel in #4902 (comment)

@kjonosm
Collaborator

kjonosm commented Feb 16, 2021

> "default name" - NSI provides a name that works as a suitable default, but users can change it: (e.g. "Amazon Locker" / "Amazon Locker - Hemlock") - I think pubs are like this

Some more potential candidates for "default names" based on my experience:

  • shop=car these usually have their own name, often in combination with the name of the brand of the cars they sell.
  • office=insurance small branch offices are often named after the actual broker (person) who runs it.
  • tourism=hotel hotels often combine brand name and location to make it easier to find them.
  • shop=travel_agency small branch offices are often named after the person who runs it.

I think your suggestion to split name handling into three categories works well. Is it enough to define a category based only on the feature tag, or do we need to categorize individual brands/features?

@bhousel
Member Author

bhousel commented Feb 16, 2021

> Some more potential candidates for "default names" based on my experience:

All good suggestions! I'd add amenity=cinema to that list too (see openstreetmap/id-tagging-schema#119).

> Is it enough to define a category based only on the feature tag or do we need to categorize individual brands/features?

I'm going to try to make it per-category, but allow override per-brand.

bhousel added a commit that referenced this issue Feb 18, 2021
This moves towards storing more information about categories in the data/* files
(re: #4906)
bhousel added a commit that referenced this issue Feb 18, 2021
@nuxper
Collaborator

nuxper commented Feb 18, 2021

I would actually expect "strict name" to be pretty rare. Even the full name of the McDonald's on the corner of my street is "McDonald's Lyon Bellecour". IMHO, we should suggest the base name but allow the user to modify it so they can enter the full name.

@bhousel
Member Author

bhousel commented Feb 18, 2021

> I would consider actually that "Strict name" would be pretty rare. Even the full name of the McDonald's on the corner of my street is "McDonald's Lyon Bellecour". IMHO : we should suggest the base name but allow the user to modify it so he can enter the full name.

I think most people would prefer that you don't map branded POIs this way. Would you be open to the idea of using branch=* tag as described on the wiki here to record the "Lyon Bellecour" string?

[Screenshot: Screen Shot 2021-02-18 at 10 53 09 AM]

@nuxper
Collaborator

nuxper commented Feb 18, 2021

Well "McDonald's Lyon Bellecour" is the name of the restaurant, as named by McDonald's ( https://www.restaurants.mcdonalds.fr/mcdonalds-lyon-bellecour , https://www.google.com/maps/place/McDonald's+Lyon+Bellecour/@45.7581072,4.8342199,19z/data=!4m5!3m4!1s0x47f4ea515bc1b983:0x286b4b8e903dba5a!8m2!3d45.7583331!4d4.8341668 )

It's not a branch of the brand.

I can check with other mappers, but I believe the practice around here is to specify the full name when known.

bhousel added a commit that referenced this issue Feb 18, 2021
Add skipCollection for categories which should not do it
(re: #4906)
@1ec5
Member

1ec5 commented Feb 19, 2021

I’d welcome additional nuance when it comes to name keys. A “default name” will be particularly useful for “community anchor institutions” that are branded by their communities just as much as by their parent organizations. For example, in the U.S., maps and directories conventionally list post offices and libraries by their individual branch names (“Springfield Post Office”) rather than their brands (“United States Post Office”) when space allows.

Besides community anchor institutions, there are some brands that make a point of using unique names for each store location whenever possible, even if competing chains don’t:

[Images: apple maps, apple.com]

> Well "McDonald's Lyon Bellecour" is the name of the restaurant, as named by McDonald's

In the U.S., McDonald’s uses store location names much less frequently than a chain like Apple. Do their branding practices differ significantly in France?

I’ve been using official_name pretty frequently for full names that only appear on receipts or in contexts where only one brand would ever be shown (like the McDonald’s store locator). I like that official_name makes it clear to data consumers how they would combine the branch and brand into a full name; sometimes the branch comes first (“Blue Ash Shell”) and other times the brand comes first (“Best Western Plus Airport Plaza”). Unfortunately, not many editors have a dedicated field for either official_name or branch yet. Also, there are brands like Applebee’s that already have a brand-wide official_name in this index (for what would technically be name+strapline).

Ultimately, it comes down to a matter of taste and which kind of data consumer one is optimizing for. In general, including a branch in name optimizes for lists and prose, while excluding branch from name optimizes for map labels.

@peternewman
Collaborator

> "default name" - NSI provides a name that works as a suitable default, but users can change it: (e.g. "Amazon Locker" / "Amazon Locker - Hemlock") - I think pubs are like this

I think in the UK this perhaps varies by brand/brewery for pubs. More subtly, the name is generally more important than the brand (indeed, the brewery can change without you necessarily noticing). For Amazon Lockers it's the other way round: you need to find an Amazon Locker first and then find your one. It's no good finding someone else's locker which also happens to be called Hemlock or, perhaps ironically, a locker from another company that chose "Amazon" for one of its locker refs.

As I mentioned in the Greene King PR:
"If we're only storing the identifier in the name, then aside from the lack of normalisation from a database perspective, making life harder for any data consumers"

I think this is particularly relevant for features where the name is almost some sort of ref (e.g. parcel lockers, bike docks). You don't want to have to deal with whether someone named it "Hemlock", "Amazon Hemlock", "Hemlock - Amazon Locker", "Amazon Locker - Hemlock", "Amazon Locker Hemlock", etc.

> I’ve been using official_name pretty frequently for full names that only appear on receipts or in contexts where only one brand would ever be shown (like the McDonald’s store locator). I like that official_name makes it clear to data consumers how they would combine the branch and brand into a full name; sometimes the branch comes first (“Blue Ash Shell”) and other times the brand comes first (“Best Western Plus Airport Plaza”).
>
> Ultimately, it comes down to a matter of taste and which kind of data consumer one is optimizing for. In general, including a branch in name optimizes for lists and prose, while excluding branch from name optimizes for map labels.

Here's a possibly controversial idea, since it may lean a bit too far towards tagging for the renderer. That said, people also seem to agree that we should be over-filling the name tag, presumably because it means even the most basic data consumer can get something useful.

Anyway, rather than name handling rules, how about name construction rules that extend your default and strict ones? Suppose we know the Best Western Plus brand uses brand + " " + branch whereas Shell uses branch + " " + brand, and we want Amazon Lockers tagged as brand + " - " + ref. As a first stage, it could just rebuild the name if the other data is present, and offer an upgrade if it's populated later. In future, the iD upgrade could also populate the branch/ref field where that value is currently only included in the name. Likewise for the official name.
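
A name construction rule of this kind could be sketched as a small template function. The `{brand}` / `{branch}` / `{ref}` placeholder syntax and the `buildName` helper are assumptions made for illustration; nothing like this is defined in NSI.

```typescript
// Hypothetical template syntax: "{key}" is replaced by the value of that tag.
type Tags = { [key: string]: string | undefined };

// Build a name from a template, or return undefined when a needed tag is missing
// (in which case the caller would leave the existing name alone).
function buildName(template: string, tags: Tags): string | undefined {
  let missing = false;
  const name = template.replace(/\{([^}]+)\}/g, (_match: string, key: string) => {
    const value = tags[key];
    if (value === undefined) {
      missing = true;
      return '';
    }
    return value;
  });
  return missing ? undefined : name;
}

// Brand first, then branch.
console.log(buildName('{brand} {branch}', { brand: 'Best Western Plus', branch: 'Airport Plaza' }));
// Branch first, then brand.
console.log(buildName('{branch} {brand}', { brand: 'Shell', branch: 'Blue Ash' }));
// Brand, separator, ref.
console.log(buildName('{brand} - {ref}', { brand: 'Amazon Locker', ref: 'Hemlock' }));
```

Returning `undefined` when a tag is missing matches the "first stage" idea: the name is only rebuilt once the other data is present.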

Also relevant:
gravitystorm/openstreetmap-carto#698
gravitystorm/openstreetmap-carto#1874

@bhousel
Member Author

bhousel commented Feb 23, 2021

I am thinking of adding support for "merge strategy" property like this:

"mergeStrategy": {    // per-key strategy to apply when merging tag values from NSI -> OSM
  // "create": NSI may set a default tag value on creation, but may not modify an existing tag value
  // "replace": NSI tag value always replaces OSM tag value (the default)
  // "ignore": NSI tag values will never update OSM tag value

  "create": ["^building", "^name", "^takeaway$"], 
  "replace": [".*"],
  "ignore": ["^payment:"]
},

We could define these things per-tree, per-category, or per-item.

It's a little weird, but I think it's flexible enough to let us do anything we need today.
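
To make the proposal concrete, here is a rough sketch of how a consumer might apply such a per-key strategy. It rests on two assumptions that the proposal does not spell out: the strategies are checked in the order ignore, create, replace (so that `".*"` acts as a fallback), and "ignore" means the NSI value is never written at all. `strategyFor` and `mergeTags` are hypothetical helpers.

```typescript
type Strategy = 'create' | 'replace' | 'ignore';
type MergeStrategy = Record<Strategy, string[]>;
type Tags = { [key: string]: string };

// The example strategy from the comment above.
const mergeStrategy: MergeStrategy = {
  create: ['^building', '^name', '^takeaway$'],
  replace: ['.*'],
  ignore: ['^payment:'],
};

// Assumed precedence: the most restrictive strategy is checked first.
const ORDER: Strategy[] = ['ignore', 'create', 'replace'];

function strategyFor(key: string, ms: MergeStrategy): Strategy {
  for (const strategy of ORDER) {
    if (ms[strategy].some((pattern) => new RegExp(pattern).test(key))) return strategy;
  }
  return 'replace'; // "replace" is described as the default
}

// Merge NSI tag values into existing OSM tags, key by key.
function mergeTags(osm: Tags, nsi: Tags, ms: MergeStrategy): Tags {
  const result: Tags = { ...osm };
  for (const [key, value] of Object.entries(nsi)) {
    const strategy = strategyFor(key, ms);
    if (strategy === 'ignore') continue;               // assumption: never written
    if (strategy === 'create' && key in osm) continue; // keep the existing OSM value
    result[key] = value;
  }
  return result;
}

const merged = mergeTags(
  { name: 'Amazon Locker - Hemlock', brand: 'amazon', 'payment:cash': 'yes' },
  { name: 'Amazon Locker', brand: 'Amazon', 'payment:cash': 'no', takeaway: 'yes' },
  mergeStrategy
);
console.log(merged);
```

With these inputs the existing name and payment:cash values survive, brand is replaced, and takeaway is added because it was not set before.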

I could imagine creating something in the future that would let us map a "nonstandard" name to a "standard" name/branch/ref using string replacement, but that sounds hard and I don't want to spend too much time on this right now.

@bryceco

bryceco commented Feb 23, 2021

@bhousel The existing addTags/removeTags for presets is fairly tricky to implement precisely. This looks to add another layer of complexity on top of it.

@bhousel
Member Author

bhousel commented Feb 23, 2021

> @bhousel The existing addTags/removeTags for presets is fairly tricky to implement precisely. This looks to add another layer of complexity on top of it.

Yeah, for sure. It's really an attempt to remove some of the complexity from iD's validator (which tags should be replaced) and user interface (which fields should be locked).

I'd still generate the presets the same way with addTags/removeTags - that interface won't change.

@bhousel
Member Author

bhousel commented Feb 24, 2021

I realized after talking to @bryceco that this mergeStrategy stuff was just too complex, so I got rid of it and just added a simpler property called preserveTags which we can use on the iD side in the validator:

"preserveTags": {
  "description": "(optional) For tags matching these patterns, NSI should not replace an existing value in OSM",
  "examples": ["^name"],
  "type": "array",
  "uniqueItems": true,
  "items": {
    "type": "string",
    "format": "regex"
  }
},

This property can be used as a per-category property, or a per-item property.
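
As a rough sketch of how a consumer such as a validator might honor this property: the `applyNsiTags` helper below is hypothetical and is not iD's actual code. It assumes that a preserved key is still filled in when OSM has no value for it, which is how the description ("should not replace an existing value") reads.

```typescript
type Tags = { [key: string]: string };

// Copy NSI tag values onto OSM tags, except where an existing OSM value
// sits under a key matching one of the preserveTags patterns.
function applyNsiTags(osm: Tags, nsi: Tags, preserveTags: string[] = []): Tags {
  const patterns = preserveTags.map((pattern) => new RegExp(pattern));
  const result: Tags = { ...osm };
  for (const [key, value] of Object.entries(nsi)) {
    const preserved = key in osm && patterns.some((re) => re.test(key));
    if (!preserved) result[key] = value;
  }
  return result;
}

// A pub whose mapper-entered name should survive the brand upgrade.
const upgraded = applyNsiTags(
  { amenity: 'pub', name: 'The Red Lion', brand: 'greene king' },
  { amenity: 'pub', name: 'Greene King', brand: 'Greene King' },
  ['^name']
);
console.log(upgraded);

// With nothing to preserve, the NSI name is still applied as a default.
const created = applyNsiTags({ amenity: 'pub' }, { name: 'Greene King' }, ['^name']);
console.log(created);
```

In the first call the name stays "The Red Lion" while brand is normalized; in the second the NSI name is used because there was no existing value.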

I added "preserveTags": ["^name"], to the following:

  • data/brands/amenity/bar.json
  • data/brands/amenity/cinema.json
  • data/brands/amenity/fuel.json
  • data/brands/amenity/pub.json
  • data/brands/amenity/social_centre.json
  • data/brands/amenity/vending_machine.json (the Amazon hubs/lockers)
  • data/brands/landuse/residential.json
  • data/brands/office/insurance.json
  • data/brands/shop/car.json
  • data/brands/shop/car_repair.json
  • data/brands/shop/electronics.json (the Apple store item)
  • data/brands/shop/motorcycle.json
  • data/brands/tourism/hotel.json
  • data/operators/amenity/hospital.json (also preserve the emergency tag)
  • data/operators/amenity/parking.json
  • data/operators/amenity/post_depot.json
  • data/operators/amenity/post_office.json

I added "preserveTags": ["^flag:name"], to the following:

  • data/flags/man_made/flagpole.json

@1ec5
Member

1ec5 commented Feb 26, 2021

Would it be useful to preserve flag:name too, since it could be in the local language?

@bhousel
Member Author

bhousel commented Feb 26, 2021

> Would it be useful to preserve flag:name too, since it could be in the local language?

I did! It’s the last one on the bulleted list.

@1ec5
Member

1ec5 commented Feb 26, 2021

Ah, I misinterpreted the ^ to mean “everything except” but overlooked that it’s the same as ^name above. Sorry for the noise.

@bhousel
Member Author

bhousel commented Feb 26, 2021

> Ah, I misinterpreted the ^ to mean “everything except” but overlooked that it’s the same as ^name above. Sorry for the noise.

No problem, I definitely want you to check my work! (in this case the ^ matches the beginning-of-input)
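
For readers who trip over the same thing, a short illustration of the two meanings of `^` in a regular expression. `matchesKey` is just a throwaway helper for this example.

```typescript
// Test a tag key against a preserveTags-style pattern.
function matchesKey(pattern: string, key: string): boolean {
  return new RegExp(pattern).test(key);
}

// Outside a character class, ^ anchors the match to the beginning of the input.
console.log(matchesKey('^name', 'name'));            // true
console.log(matchesKey('^name', 'name:fr'));         // true
console.log(matchesKey('^name', 'official_name'));   // false
console.log(matchesKey('^name', 'flag:name'));       // false
console.log(matchesKey('^flag:name', 'flag:name'));  // true

// Only inside square brackets does ^ mean "anything except these characters".
console.log(matchesKey('^[^n]', 'name'));            // false
console.log(matchesKey('^[^n]', 'flag:name'));       // true
```

So `^name` covers name and its name:* variants but leaves official_name and flag:name alone, which is why the flagpole category needs its own `^flag:name` pattern.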
