Store more information (including name handling rules) for each category #4906

bhousel · 2021-02-16T15:53:46Z

Moving this discussion to a new issue from the Greene King PR:

"What to do with names" is really the final unresolved thing about NSI that I need to figure out before making a real v5 release - you can see on openstreetmap/iD#8305 that we're running into this same thing with Amazon Lockers (they all have unique names), and on openstreetmap/id-tagging-schema#119 the same thing with cinema chains.

I see three ways to handle a name tag, depending on what kind of feature we have:

"strict name" - NSI provides the correct name, and iD should enforce it (e.g. "Burger King")
"default name" - NSI provides a name that works as a suitable default, but users can change it (e.g. "Amazon Locker" / "Amazon Locker - Hemlock") - I think pubs are like this
"no name" - NSI doesn't contain a name tag and takes no opinion on what the user wants to do - bus routes are like this.

I think this information should live in NSI somewhere, so we can remove the iD preset-field hack. I also think that this should be a per-category decision (maybe items can override it), but this is tricky because NSI has quite a lot of categories.
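
The three name handling modes listed above could be modeled roughly as follows. This is only a sketch: `NameMode` and `apply_name` are hypothetical names for illustration, not part of NSI or iD.

```python
from enum import Enum


class NameMode(Enum):
    STRICT = "strict"    # NSI name is enforced
    DEFAULT = "default"  # NSI name is only a starting point
    NONE = "none"        # NSI has no opinion on the name


def apply_name(mode, nsi_name, osm_tags):
    """Return a copy of osm_tags with the name handled according to mode."""
    tags = dict(osm_tags)
    if mode is NameMode.STRICT:
        # Always overwrite with the NSI name.
        tags["name"] = nsi_name
    elif mode is NameMode.DEFAULT:
        # Only fill in the name when the feature has none yet.
        tags.setdefault("name", nsi_name)
    # NameMode.NONE: leave the tags untouched.
    return tags
```

With this sketch, a "default name" feature that already carries "Amazon Locker - Hemlock" keeps that value, while a "strict name" feature is always reset to the NSI name.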

So I'll be adjusting the file formats a bit to settle on a nicer way to capture more information for each category.

Originally posted by @bhousel in #4902 (comment)


kjonosm · 2021-02-16T16:47:30Z

> "default name" - NSI provides a name that works as a suitable default, but users can change it (e.g. "Amazon Locker" / "Amazon Locker - Hemlock") - I think pubs are like this

Some more potential candidates for "default names" based on my experience:

shop=car: these usually have their own name, often combined with the brand of the cars they sell.
office=insurance: small branch offices are often named after the broker (person) who runs them.
tourism=hotel: hotels often combine brand name and location to make them easier to find.
shop=travel_agency: small branch offices are often named after the person who runs them.

I think your suggestion to split name handling into three categories works well. Is it enough to define a category based only on the feature tag or do we need to categorize individual brands/features?

bhousel · 2021-02-16T16:53:31Z

> Some more potential candidates for "default names" based on my experience:

All good suggestions! I'd add amenity=cinema to that list too (see openstreetmap/id-tagging-schema#119).

> Is it enough to define a category based only on the feature tag or do we need to categorize individual brands/features?

I'm going to try to make it per-category, but allow override per-brand.


nuxper · 2021-02-18T15:33:02Z

I actually think "strict name" would be pretty rare. Even the full name of the McDonald's on the corner of my street is "McDonald's Lyon Bellecour". IMHO we should suggest the base name but allow users to modify it so they can enter the full name.

bhousel · 2021-02-18T15:53:31Z

> I actually think "strict name" would be pretty rare. Even the full name of the McDonald's on the corner of my street is "McDonald's Lyon Bellecour". IMHO we should suggest the base name but allow users to modify it so they can enter the full name.

I think most people would prefer that you don't map branded POIs this way. Would you be open to the idea of using the branch=* tag, as described on the wiki, to record the "Lyon Bellecour" string?

nuxper · 2021-02-18T16:04:08Z

Well "McDonald's Lyon Bellecour" is the name of the restaurant, as named by McDonald's ( https://www.restaurants.mcdonalds.fr/mcdonalds-lyon-bellecour , https://www.google.com/maps/place/McDonald's+Lyon+Bellecour/@45.7581072,4.8342199,19z/data=!4m5!3m4!1s0x47f4ea515bc1b983:0x286b4b8e903dba5a!8m2!3d45.7583331!4d4.8341668 )

It's not a branch of the brand.

I can check with other mappers, but I believe the practice around here is to specify the full name when known.


1ec5 · 2021-02-19T00:07:02Z

I’d welcome additional nuance when it comes to name keys. A “default name” will be particularly useful for “community anchor institutions” that are branded by their communities just as much as by their parent organizations. For example, in the U.S., maps and directories conventionally list post offices and libraries by their individual branch names (“Springfield Post Office”) rather than their brands (“United States Post Office”) when space allows.

Besides community anchor institutions, there are some brands that make a point of using unique names for each store location whenever possible, even if competing chains don’t:

> Well "McDonald's Lyon Bellecour" is the name of the restaurant, as named by McDonald's

In the U.S., McDonald’s uses store location names much less frequently than a chain like Apple. Do their branding practices differ significantly in France?

I’ve been using official_name pretty frequently for full names that only appear on receipts or in contexts where only one brand would ever be shown (like the McDonald’s store locator). I like that official_name makes it clear to data consumers how they would combine the branch and brand into a full name; sometimes the branch comes first (“Blue Ash Shell”) and other times the brand comes first (“Best Western Plus Airport Plaza”). Unfortunately, not many editors have a dedicated field for either official_name or branch yet. Also, there are brands like Applebee’s that already have a brand-wide official_name in this index (for what would technically be name+strapline).

Ultimately, it comes down to a matter of taste and which kind of data consumer one is optimizing for. In general, including a branch in name optimizes for lists and prose, while excluding branch from name optimizes for map labels.

peternewman · 2021-02-21T17:52:18Z

> "default name" - NSI provides a name that works as a suitable default, but users can change it (e.g. "Amazon Locker" / "Amazon Locker - Hemlock") - I think pubs are like this

I think in the UK this perhaps varies by brand/brewery for pubs, or, more subtly, the name is generally more important than the brand (indeed, the brewery can change without you necessarily noticing). For Amazon Lockers it's the other way round: you need to find an Amazon Locker first and then find your one. It's no good finding someone else's locker which also happens to be called Hemlock, or, perhaps ironically, one where another company chose "Amazon" for one of their locker refs.

As I mentioned in the Greene King PR:
"If we're only storing the identifier in the name, then aside from the lack of normalisation from a database perspective, making life harder for any data consumers"

I think this is particularly relevant for things where the name is almost some sort of ref (e.g. parcel lockers, bike docks, etc.): you don't want to have to deal with whether someone named it "Hemlock", "Amazon Hemlock", "Hemlock - Amazon Locker", "Amazon Locker - Hemlock", "Amazon Locker Hemlock", etc.

> I’ve been using official_name pretty frequently for full names that only appear on receipts or in contexts where only one brand would ever be shown (like the McDonald’s store locator). I like that official_name makes it clear to data consumers how they would combine the branch and brand into a full name; sometimes the branch comes first (“Blue Ash Shell”) and other times the brand comes first (“Best Western Plus Airport Plaza”).
>
> Ultimately, it comes down to a matter of taste and which kind of data consumer one is optimizing for. In general, including a branch in name optimizes for lists and prose, while excluding branch from name optimizes for map labels.

Possibly controversial idea, as maybe this is tagging for the renderer a bit too much, although people also seem to agree that we should be over-filling the name tag, presumably because it means the most basic data consumer can get something useful.

Anyway, rather than name handling rules, how about name construction rules to extend your default and strict ones? For example, we might know that the Best Western Plus brand uses brand + " " + branch whereas Shell uses branch + " " + brand, and that we want Amazon Lockers tagged as brand + " - " + ref. As a first stage it could just rebuild the name if the other data is present, and offer an upgrade if it's populated later. In future the iD upgrade could also populate the branch/ref field where it's currently only included in the name. Likewise for the official name.
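
As a rough illustration of the name construction idea, here is a minimal sketch. The brands and orderings come from the examples in this comment; the template syntax, `NAME_TEMPLATES`, and `build_name` are assumptions made up for illustration and do not exist in NSI or iD.

```python
# Hypothetical per-brand templates keyed by the brand tag value.
NAME_TEMPLATES = {
    "Best Western Plus": "{brand} {branch}",
    "Shell": "{branch} {brand}",
    "Amazon Locker": "{brand} - {ref}",
}


def build_name(tags):
    """Rebuild the name from brand + branch/ref when all needed parts are present.

    Returns None when the brand has no template or a required tag is missing,
    so the caller can leave the existing name alone.
    """
    template = NAME_TEMPLATES.get(tags.get("brand"))
    if template is None:
        return None
    try:
        # format_map looks up each placeholder directly in the tag dict.
        return template.format_map(tags)
    except KeyError:
        return None
```

For example, `build_name({"brand": "Shell", "branch": "Blue Ash"})` yields "Blue Ash Shell", while a locker with no ref tag yields None, so nothing is rebuilt until the other data is populated.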

Also relevant:
gravitystorm/openstreetmap-carto#698
gravitystorm/openstreetmap-carto#1874


bhousel · 2021-02-23T16:45:16Z

I am thinking of adding support for "merge strategy" property like this:

"mergeStrategy": {    // per-key strategy to apply when merging tag values from NSI -> OSM
  // "create": NSI may set a default tag value on creation, but may not modify an existing tag value
  // "replace": NSI tag value always replaces OSM tag value (the default)
  // "ignore": NSI tag values will never update OSM tag value

  "create": ["^building", "^name", "^takeaway$"], 
  "replace": [".*"],
  "ignore": ["^payment:"]
},

We could define these things per-tree, per-category, or per-item.

It's a little weird, but I think it's flexible enough to let us do anything we need today.
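
To make the proposed semantics concrete, here is a minimal sketch of how a consumer might apply such a strategy. The proposal does not say which strategy wins when several patterns match the same key (".*" matches everything), so the precedence below (ignore, then create, then replace) is an assumption, as are the helper names `strategy_for` and `merge_tags`. It also assumes "ignore" means the NSI value is never written at all.

```python
import re

MERGE_STRATEGY = {
    "create": ["^building", "^name", "^takeaway$"],
    "replace": [".*"],
    "ignore": ["^payment:"],
}


def strategy_for(key, merge_strategy=MERGE_STRATEGY):
    # Assumed precedence: narrower strategies win over the catch-all "replace".
    for strategy in ("ignore", "create", "replace"):
        patterns = merge_strategy.get(strategy, [])
        if any(re.search(p, key) for p in patterns):
            return strategy
    return "replace"  # the proposal names "replace" as the default


def merge_tags(osm_tags, nsi_tags, merge_strategy=MERGE_STRATEGY):
    """Merge NSI tag values into a copy of the OSM tags."""
    merged = dict(osm_tags)
    for key, value in nsi_tags.items():
        strategy = strategy_for(key, merge_strategy)
        if strategy == "ignore":
            continue  # never touch this key
        if strategy == "create" and key in merged:
            continue  # keep the existing OSM value
        merged[key] = value
    return merged
```

Under these assumptions, an existing name such as "The Red Lion" survives a merge, a missing takeaway tag is filled in, and payment:* tags are left alone.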

I could imagine creating something in the future that would let us map a "nonstandard" name to a "standard" name/branch/ref using string replacement, but that sounds hard and I don't want to spend too much time on this right now.

bryceco · 2021-02-23T17:22:04Z

@bhousel The existing addTags/removeTags for presets is fairly tricky to implement precisely. This looks to add another layer of complexity on top of it.

bhousel · 2021-02-23T18:12:40Z

> @bhousel The existing addTags/removeTags for presets is fairly tricky to implement precisely. This looks to add another layer of complexity on top of it.

Yeah, for sure. It's really an attempt to remove some of the complexity from iD's validator (which tags should be replaced) and user interface (which fields should be locked).

I'd still generate the presets the same way with addTags/removeTags - that interface won't change.

bhousel · 2021-02-24T14:59:57Z

I realized after talking to @bryceco that this mergeStrategy stuff was just too complex, so I got rid of it and just added a simpler property called preserveTags which we can use on the iD side in the validator:

"preserveTags": {
  "description": "(optional) For tags matching these patterns, NSI should not replace an existing value in OSM",
  "examples": ["^name"],
  "type": "array",
  "uniqueItems": true,
  "items": {
    "type": "string",
    "format": "regex"
  }
},

This property can be used as a per-category property, or a per-item property.
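
A minimal sketch of how a consumer such as a validator might honor preserveTags. How a per-item list combines with a per-category list is not stated here, so this sketch assumes an item-level list, when present, takes the place of the category-level one. `merge_with_preserve` is a hypothetical helper, and Python's `re` stands in for the JavaScript regex engine, which behaves the same for simple patterns like `^name`.

```python
import re


def merge_with_preserve(osm_tags, nsi_tags, category_preserve=(), item_preserve=None):
    """Apply NSI tags to a copy of the OSM tags, keeping preserved existing values."""
    # Assumption: an item-level list overrides the category-level list.
    patterns = item_preserve if item_preserve is not None else category_preserve
    regexes = [re.compile(p) for p in patterns]
    merged = dict(osm_tags)
    for key, value in nsi_tags.items():
        # A key is preserved only if OSM already has a value for it.
        preserved = key in merged and any(r.search(key) for r in regexes)
        if not preserved:
            merged[key] = value
    return merged
```

With `category_preserve=["^name"]`, a pub already named "The Red Lion" keeps its name while its brand tag is still updated; a pub with no name at all still receives the NSI name.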

I added "preserveTags": ["^name"] to the following:

data/brands/amenity/bar.json
data/brands/amenity/cinema.json
data/brands/amenity/fuel.json
data/brands/amenity/pub.json
data/brands/amenity/social_centre.json
data/brands/amenity/vending_machine.json (the Amazon hubs/lockers)
data/brands/landuse/residential.json
data/brands/office/insurance.json
data/brands/shop/car.json
data/brands/shop/car_repair.json
data/brands/shop/electronics.json (the Apple store item)
data/brands/shop/motorcycle.json
data/brands/tourism/hotel.json
data/operators/amenity/hospital.json (also preserve the emergency tag)
data/operators/amenity/parking.json
data/operators/amenity/post_depot.json
data/operators/amenity/post_office.json

I added "preserveTags": ["^flag:name"] to the following:

data/flags/man_made/flagpole.json

1ec5 · 2021-02-26T08:20:47Z

Would it be useful to preserve flag:name too, since it could be in the local language?

bhousel · 2021-02-26T12:13:34Z

> Would it be useful to preserve flag:name too, since it could be in the local language?

I did! It’s the last one on the bulleted list.

1ec5 · 2021-02-26T15:49:34Z

Ah, I misinterpreted the ^ to mean “everything except” but overlooked that it’s the same as ^name above. Sorry for the noise.

bhousel · 2021-02-26T16:13:39Z

> Ah, I misinterpreted the ^ to mean “everything except” but overlooked that it’s the same as ^name above. Sorry for the noise.

No problem, I definitely want you to check my work! (in this case the ^ matches the beginning-of-input)
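
The two meanings of ^ can be checked quickly. This sketch uses Python's `re` module, which treats these simple patterns the same way as JavaScript regexes; the sample keys are just illustrative.

```python
import re

keys = ["name", "name:en", "old_name", "flag:name"]

# At the start of a pattern, "^" anchors the match to the beginning of input:
# "^name" matches keys that start with "name".
starts_with_name = [k for k in keys if re.search(r"^name", k)]

# "^flag:name" matches only keys that start with "flag:name".
starts_with_flag_name = [k for k in keys if re.search(r"^flag:name", k)]

# Inside square brackets, "^" negates the character class:
# "^[^n]" matches keys whose first character is anything except "n".
not_starting_with_n = [k for k in keys if re.search(r"^[^n]", k)]
```

So `^name` preserves name and name:en but not old_name or flag:name, which is why flag:name needs its own pattern.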

bhousel mentioned this issue Feb 16, 2021

Name-suggestion-index v6 openstreetmap/iD#8305

Merged

bhousel added a commit that referenced this issue Feb 18, 2021

Relax schamas, start refactor of cache and tree structures

953b553

This moves towards storing more information about categories in the data/* files (re: #4906)

bhousel added a commit that referenced this issue Feb 18, 2021

Convert data files

3f91a81

(re: #4906)

bhousel added a commit that referenced this issue Feb 18, 2021

Restore the filtering in build script,

58d638f

Add skipCollection for categories which should not do it (re: #4906)

bhousel mentioned this issue Feb 21, 2021

merge Softbank #4916

Merged

bhousel added a commit that referenced this issue Feb 23, 2021

Experimenting with introducing a tag merge strategy

db2475d

(re: #4906)

bhousel closed this as completed in deb8799 Feb 24, 2021

bhousel mentioned this issue Feb 24, 2021

Have the matcher handle generic/common words too #4924

Closed

This was referenced Feb 26, 2021

Pub names #4926

Closed

how should 'Brazier' be handled? #4928

Closed

Handle new format in name-suggestion-index #4931

Closed

Publish a new release or otherwise ensure iD can import recent updates to the NSI #4543

Closed

bhousel mentioned this issue Mar 20, 2021

Update charging_station.json #4985

Closed

bhousel mentioned this issue Oct 15, 2021

Store names which include the branch name #5500

Closed

bhousel mentioned this issue Jan 18, 2022

Unique fast-food names #6071

Closed
