Improve handling of poor address matches at the civicNumber, block, and street matchPrecision levels #255

Open
mraross opened this issue Jul 7, 2021 · 3 comments

Comments


mraross commented Jul 7, 2021

The match score range is really three separate ranges, one for each of the civicNumber, street, and locality matchPrecision levels:

| Match confidence | CivicNumber matchPrecision | Street matchPrecision | Locality matchPrecision |
|---|---|---|---|
| High | 90-100 | 72-78 | 58-68 |
| Medium | 80-89 | 69-71 | 48-57 |
| Low | 0-79 | 0-68 | 0-47 |
| Low confidence penalty | 32 | 10 | 0 |

For this analysis, the site, block, and unit matchPrecisions can all be treated as equivalent to civicNumber.
A low confidence match is given an extra penalty so that a high confidence locality match is ranked above it.
Low confidence penalty = matchPrecisionScore - locality.matchPrecisionScore
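
A minimal Python sketch of the rule above. The thresholds and penalties are copied from the table. The names `LOW_MAX`, `PENALTY`, `CIVIC_EQUIVALENTS`, and `adjusted_score` are hypothetical and are not taken from the geocoder's code. The sketch assumes the penalty is a flat subtraction applied only when the score falls in the Low range for its matchPrecision.

```python
# Upper bound of the Low confidence range per matchPrecision (from the table).
LOW_MAX = {"civicnumber": 79, "street": 68, "locality": 47}

# Penalty subtracted from a Low confidence score (from the table).
PENALTY = {"civicnumber": 32, "street": 10, "locality": 0}

# matchPrecisions treated as equivalent to civicNumber for this analysis.
CIVIC_EQUIVALENTS = {"site", "block", "unit"}


def adjusted_score(score: int, match_precision: str) -> int:
    """Return the score after the low confidence penalty (hypothetical helper)."""
    precision = match_precision.lower()
    if precision in CIVIC_EQUIVALENTS:
        precision = "civicnumber"
    # Only Low confidence matches are penalized; High and Medium pass through.
    if score <= LOW_MAX[precision]:
        return score - PENALTY[precision]
    return score


print(adjusted_score(74, "Block"))   # Low civic-level match, penalized by 32
print(adjusted_score(72, "Street"))  # High street match, unchanged
```

The penalties in the table equal the gap between the top of each range and the top of the locality range: 100 - 68 = 32 for civicNumber and 78 - 68 = 10 for street.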

Here are the match scores for 6207 Centre St Salmo BC adjusted using the low confidence penalty:

| Address | Score | After low confidence penalty | Match Precision |
|---|---|---|---|
| Cemetery Rd, Salmo, BC | 72 | 72 | Street |
| Salmo St, Kimberley, BC | 64 | 54 | Street |
| 6207 Salmon Valley Rd, Salmon Valley, BC | 74 | 42 | Block |
| 6207 Centre Rd, Niskonlith Lake, BC | 58 | 26 | Block |
| 6207 Central Rd, Hornby Island, BC | 56 | 24 | Block |

Without the low confidence penalty, 6207 Salmon Valley Rd, Salmon Valley is the best match.
With the low confidence penalty, Cemetery Rd, Salmo becomes the best match; it is also the only match in the correct locality.
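
The change in ranking can be reproduced with a short Python sketch. The candidate scores come from the table for 6207 Centre St Salmo BC. The names `candidates` and `penalized` are hypothetical, block matches are treated as civicNumber matches, and the penalty is assumed to apply only to scores in the Low range.

```python
# Low-range upper bounds and penalties per matchPrecision (from the issue).
LOW_MAX = {"civic": 79, "street": 68, "locality": 47}
PENALTY = {"civic": 32, "street": 10, "locality": 0}

# (address, raw score, matchPrecision); block is treated as civic here.
candidates = [
    ("Cemetery Rd, Salmo, BC", 72, "street"),
    ("Salmo St, Kimberley, BC", 64, "street"),
    ("6207 Salmon Valley Rd, Salmon Valley, BC", 74, "civic"),
    ("6207 Centre Rd, Niskonlith Lake, BC", 58, "civic"),
    ("6207 Central Rd, Hornby Island, BC", 56, "civic"),
]


def penalized(score, precision):
    """Subtract the penalty only when the score is in the Low range."""
    if score <= LOW_MAX[precision]:
        return score - PENALTY[precision]
    return score


# Rank by raw score, then by penalized score, highest first.
by_raw = sorted(candidates, key=lambda c: c[1], reverse=True)
by_adjusted = sorted(candidates, key=lambda c: penalized(c[1], c[2]), reverse=True)

print(by_raw[0][0])       # 6207 Salmon Valley Rd, Salmon Valley, BC
print(by_adjusted[0][0])  # Cemetery Rd, Salmo, BC
```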

@mraross mraross changed the title Handle poor matches at civic, block, and street levels better Improve handling of poor address matches at the civicNumber, block, and street levels Jul 9, 2021
@mraross mraross assigned gleeming and unassigned gleeming Jul 9, 2021
@mraross mraross changed the title Improve handling of poor address matches at the civicNumber, block, and street levels Improve handling of poor address matches at the civicNumber, block, and street matchPrecision levels Jul 9, 2021

mraross commented Jul 10, 2021

Here are the match scores for 110 Main St Kaslo, BC adjusted using the low confidence penalty:

| Address | Score | After low confidence penalty | Match Precision |
|---|---|---|---|
| Main St, Lardeau, BC | 73 | 73 | Street |
| Main St, Slocan, BC | 73 | 73 | Street |
| Rain Bow Dr, Kaslo, BC | 71 | 71 | Street |
| 110 Main St, Yahk, BC | 58 | 26 | Block |
| 110 Main St, Ahousaht, BC | 58 | 26 | Block |

In this case, the low confidence penalties have no effect on the ranking of matches. This is acceptable because there is no Main St in Kaslo, and both Lardeau and Slocan are neighbours of Kaslo.


mraross commented Jul 12, 2021

Here are the match scores for 627 ABERDEEN RD. MERRITT adjusted using the low confidence penalty:

| Address | Score | After low confidence penalty | Match Precision |
|---|---|---|---|
| Merritt, BC | 55 | 55 | Locality |
| 627 Aberdeen Dr, Parksville, BC | 57 | 25 | Block |
| 627 Aberdeen Dr, Kamloops, BC | 57 | 25 | Block |

This is a good example of locality hopping. Both civic address matches end up below Merritt thanks to the low confidence penalty.


mraross commented Jul 12, 2021

Here are the match scores for 990 Bob Lane, North Saanich, Port Alberni adjusted using the low confidence penalty:

| Address | Score | After low confidence penalty | Match Precision |
|---|---|---|---|
| Port Alberni, BC | 55 | 55 | Locality |
| 990 Bob Lane, North Saanich, BC | 63 | 32 | Block |
| Bob-O-Link Way, Nanaimo, BC | 59 | 49 | Street |

For this address, Port Alberni bubbles to the top, but 990 Bob Lane, North Saanich looks like a better match.
