Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Negative buffer operation returns EMPTY geometry #984

Closed
mwtoews opened this issue Nov 5, 2023 · 22 comments · Fixed by #1056
Closed

Negative buffer operation returns EMPTY geometry #984

mwtoews opened this issue Nov 5, 2023 · 22 comments · Fixed by #1056
Labels
Bug JTS Issue also appears in JTS

Comments

@mwtoews
Copy link
Contributor

mwtoews commented Nov 5, 2023

This bug was originally reported here: shapely/shapely#1932

The input MultiPolygon is valid, and identical behaviour is obtained with a single Polygon too.

With a recent build of GEOS 3.13.0dev:

$ cat > mp.wkt
MULTIPOLYGON (((833454.7163917861 6312507.405413097, 833455.3726665961 6312510.208920742, 833456.301153878 6312514.207390314, 833492.2432584754 6312537.770332065, 833493.0901320165 6312536.098774815, 833502.6580673696 6312517.561360772, 833503.9404352929 6312515.0542803425, 833454.7163917861 6312507.405413097)))
$ ./bin/geosop -a mp.wkt buffer N-3.7
POLYGON ((833459.564533443 6312511.903163322, 833459.5698499765 6312511.926058625, 833490.8265097003 6312532.417314233, 833498.3074367424 6312517.923378301, 833459.564533443 6312511.903163322))
$ ./bin/geosop -a mp.wkt buffer N-3.8
POLYGON EMPTY

A buffer of -3.8 should be fine. Here is the visual result with -3.7:
image

This bug applies to JTS too (using a recent-ish 1.20.0 SNAPSHOT).

@mwtoews mwtoews added JTS Issue also appears in JTS Bug labels Nov 5, 2023
@theroggy
Copy link

theroggy commented Feb 15, 2024

This looks like another example of the same problem?

image

Script:

from matplotlib import pyplot as plt
import shapely
import shapely.plotting

wkt = "Polygon ((182719.04521570954238996 224897.14115349075291306, 182807.02887436276068911 224880.64421749324537814, 182808.47314301913138479 224877.25002362736267969, 182718.38701137207681313 224740.00115247094072402, 182711.82697281913715415 224742.08599378637154587, 182717.1393717635946814 224895.61432328051887453, 182719.04521570954238996 224897.14115349075291306))"

poly = shapely.from_wkt(wkt)
poly_bufm = shapely.buffer(poly, distance=-5)

print(f"poly.is_valid: {poly.is_valid}")
print(f"poly_bufm.is_valid: {poly_bufm.is_valid}")

shapely.plotting.plot_polygon(poly)
shapely.plotting.plot_polygon(poly_bufm, color="red")
plt.show()

@theroggy
Copy link

theroggy commented Feb 15, 2024

And another one...

from matplotlib import pyplot as plt
import shapely
import shapely.plotting

wkt = "POLYGON ((189830.82937655927 236280.33651798425, 189826.6517640246 236284.73029061654, 189890.06437987628 236413.14271446358, 189896.17545283484 236412.43408390653, 189917.96066382117 236318.46340062976, 189917.6788285035 236317.90492723353, 189830.82937655927 236280.33651798425))"
poly = shapely.from_wkt(wkt)
poly_bufm5 = shapely.buffer(poly, distance=-5)

print(f"poly.is_valid: {poly.is_valid}")
print(f"poly_bufp5m5.is_valid: {poly_bufm5.is_valid}")

shapely.plotting.plot_polygon(poly)
shapely.plotting.plot_polygon(poly_bufm5, color="red")
plt.show()

@theroggy
Copy link

theroggy commented Mar 7, 2024

Another example, originally reported here: shapely/shapely#2009

from matplotlib import pyplot as plt
import shapely
import shapely.plotting

wkt = "POLYGON ((-8486160.859752608 4407005.311912118, -8486322.012133999 4419552.266313265, -8498821.965759974 4419382.682467878, -8498646.158633558 4406836.479565462, -8486160.859752608 4407005.311912118))"

poly = shapely.from_wkt(wkt)
poly_bufm = shapely.buffer(poly, distance=1e-11)

print(f"{poly.is_valid=}")
print(f"{poly_bufm.is_valid=}")
print(f"{poly_bufm=}")

shapely.plotting.plot_polygon(poly)
shapely.plotting.plot_polygon(poly_bufm, color="red")
plt.show()

@christaina
Copy link

Hello, I am being impacted by this bug as well (we are using geos through shapely/geopandas)

Here is some information about an example we found with this issue, if it is helpful:

  • we are buffering in a polygon projected to utm 0.5m, with cap style = flat, join style = mitre, mitre limit = 1.5
  • the polygon is valid prior to the buffer in operation
  • the geometry has gone through several other steps of geospatial processing prior to the buffer in. it is relatively complex by the time of the buffer in op. prior to the buffer in, we buffer out a large number of smaller polygons by 0.5m with the same buffer parameters as the buffer in operation, then we dissolve them together, and simplify with tolerance 0.1m
  • for the one example we discovered, each of the following modifications to our processing avoided the bug:
    • simplifying by 0.2m instead of 0.1m
    • changing parameters in buffer in operation to cap style = round, join style = round (we would like to avoid having to do this as a workaround as we dont want to round the corners of the geometry when processing)
    • buffering out/in by 0.6m instead of 0.5m

currently, we are trying to understand what characteristics of the geometry can trigger this bug so that we can modify our processing to avoid it. I appreciate if anyone has any guidance / information on what could be the culprit, intuition that we could test, or suggestions on how we can modify our code to robustly get around the issue. the concern with the modifications we tried above that resolved it on our one example, is that these workaround either modify our geometries in a way we dont want, or that the workarounds may not be robust when the code is run on a different dataset we process.

for context- we have several pieces of software that perform multiple steps of geometry processing (buffering, simplify, dissolve, spatial overlay, etc). the software is run to process thousands of different datasets, and each dataset contains thousands to millions of polygons. until the bug is resolved, we would like to find a robust/elegant way to get around the bug.

thank you!

@pramsey
Copy link
Member

pramsey commented Mar 13, 2024

Funny, I cannot replicate the original case, but your second case I can.

@dr-jts
Copy link
Contributor

dr-jts commented Mar 13, 2024

@christaina can you provide the polygon geometry and buffer parameters that are causing your issue?

I suspect there's (at least) a couple of things that are causing these problems. So it's helpful to have concrete data to analyze.

@pramsey
Copy link
Member

pramsey commented Mar 13, 2024

@dr-jts has tracked down the culprit, which in turn leads to a potential workaround. I've only tested on one geometry, but the idea is to run RemoveRepeatedPoints on the input, with the tolerance being the same as the negative buffer distance. So in PostGIS terms, this:

select st_astext(st_buffer(st_removerepeatedpoints(
  'Polygon ((182719.04521570954238996 224897.14115349075291306, 182807.02887436276068911 224880.64421749324537814, 182808.47314301913138479 224877.25002362736267969, 182718.38701137207681313 224740.00115247094072402, 182711.82697281913715415 224742.08599378637154587, 182717.1393717635946814 224895.61432328051887453, 182719.04521570954238996 224897.14115349075291306))'::geometry, 
  5), -5));

@dr-jts
Copy link
Contributor

dr-jts commented Mar 13, 2024

The culprit causing all these cases is the "inverted ring detection" heuristic introduced in locationtech/jts#878 (ported to GEOS in 85ead60).

This is a very narrow heuristic which only targets a "small" number of cases (although still infinite! ;). It is incorrectly being triggered by the cases in this issue, which all have the characteristic of being small almost-triangles, with additional vertices at each triangle apex.

I'm hopeful that adding a further refinement to the heuristic code will prevent it from flagging these kinds of cases as incorrect.

@christaina
Copy link

@christaina can you provide the polygon geometry and buffer parameters that are causing your issue?

I suspect there's (at least) a couple of things that are causing these problems. So it's helpful to have concrete data to analyze.

@dr-jts thank you for the responsiveness and trying to support. unfortunately I do not have permission to share the data for the polygon geometry (out of my control). let me know if there is any info about the characteristics of this example i can provide to be helpful though.

the buffer parameters we are using:

  • data projected in utm
  • buffer distance is -0.5 meters
  • cap style param passed is 'flat'
  • join style param is 'mitre'
  • mitre limit param is set to 1.5
  • any other param is set to whatever shapely/geopandas defaults to for the buffer op (we are using latest version iirc, or close) - https://shapely.readthedocs.io/en/latest/manual.html#object.buffer

@dr-jts
Copy link
Contributor

dr-jts commented Mar 13, 2024

@christaina you'll have to just try out the fix I'm working on then.

Can you provide images of the input polygons? Or perturb them via translation?

@dr-jts
Copy link
Contributor

dr-jts commented Mar 13, 2024

UPDATE: I have a fix in preparation for this issue.

@christaina
Copy link

christaina commented Mar 13, 2024

that is awesome to hear about the fix.

@christaina you'll have to just try out the fix I'm working on then.

Can you provide images of the input polygons? Or perturb them via translation?

probably not. let me know if there is anything about the input polygon i can verify on my end and report back on (e.g. "It is incorrectly being triggered by the cases in this issue, which all have the characteristic of being small almost-triangles, with additional vertices at each triangle apex." - i can check if the polygon has this pattern if youre interested. not sure if you have a more specific defn of what count as 'additional vertices at the triangle apex' , tho i have some idea). could probably share a zoomed in picture of a small subsection of the original polygon and its vertices. fyi the polygon itself is large and complex with maybe hundreds/thousands of vertices.

@dr-jts
Copy link
Contributor

dr-jts commented Mar 13, 2024

fyi the polygon itself is large and complex with maybe hundreds/thousands of vertices.

@christaina if the polygon has that many vertices then it's likely a different problem than what is causing this issue. There's not much we can do without having a reproducible example to test.

@christaina
Copy link

@dr-jts got it. yeah thats reasonable, thanks anyway for your help & replies.

by 'different problem' do you mean the this bug is unrelated to my issue, or do you mean that there are multiple possible causes for this bug and your fix is not guaranteed to address all potential scenarios?

@dr-jts
Copy link
Contributor

dr-jts commented Mar 13, 2024

by 'different problem' do you mean the this bug is unrelated to my issue, or do you mean that there are multiple possible causes for this bug and your fix is not guaranteed to address all potential scenarios?

I suspect it's a different bug/problem in the buffer algorithm. But it will be worth trying the fix for this issue when it lands.

@pramsey
Copy link
Member

pramsey commented Mar 13, 2024

@christaina if you strip away all attribution and do a location translation on the geometry, it's not really recoverable as identifiable data anymore. Consider anonymizing your geometry and submitting it, if you can.

@mwtoews
Copy link
Contributor Author

mwtoews commented Mar 13, 2024

E.g., to translate the above example to start at (0, 0):

from shapely import affinity, wkt

poly = wkt.loads("POLYGON ((182719.04521570954 224897.14115349075, 182807.02887436276 224880.64421749325, 182808.47314301913 224877.25002362736, 182718.38701137208 224740.00115247094, 182711.82697281914 224742.08599378637, 182717.1393717636 224895.61432328052, 182719.04521570954 224897.14115349075))")

poly0 = affinity.translate(poly, -poly.bounds[0], -poly.bounds[1])

print(poly0.bounds)
# (0.0, 0.0, 96.64617019999423, 157.1400010198122)

print(poly0.wkt)
# POLYGON ((7.218242890405236 157.1400010198122, 95.20190154362353 140.64306502230465, 96.64617019999423 137.24887115642196, 6.560038552939659 0, 0 2.084841315430822, 5.312398944457527 155.61317080957815, 7.218242890405236 157.1400010198122))

assert poly0.buffer(-5).is_empty

now poly0 has ambiguous coordinates. You can even chop off the decimals and get something that still demonstrates the issue:

poly1 = wkt.loads(wkt.dumps(poly0, rounding_precision=0))

print(poly1.bounds)
# (0.0, 0.0, 97.0, 157.0)

print(poly1.wkt)
# POLYGON ((7 157, 95 141, 97 137, 7 0, 0 2, 5 156, 7 157))

assert poly1.buffer(-5).is_empty

@christaina
Copy link

christaina commented Mar 13, 2024

[edited from original comment] - i was able to isolate which piece of my geometry is the problem and create a smaller polygon from it that can reproduce the issue. i used your code to modify the coordinates @mwtoews however it appears that translating the original coordinates resolves the issue in my example.

image

on the original we see the bug:
image

after translating the orignal data using your code, the buffer in produces the correct result:
image

here is the wkt of the ambigious geometry i created as shown above:
POLYGON ((8.199474338965956 0.2279206719249487, 0.5717645731638186 0.1361661963164806, 0.5295531218289398 3.2573571633547544, 0.3858843961497769 3.653283822350204, 0.2238492488395423 3.6574836056679487, 0.0762551225489005 4.506569623015821, 0.2614748299820349 4.864056357182562, 0 5.177353265695274, 0.1837987996987067 5.5559824258089066, 9.190854811691679 5.723026826046407, 9.194003739859909 5.4951130310073495, 11.244943040888757 5.523147692903876, 11.241794116329402 5.751061487942934, 19.86881524050841 5.868991915136576, 19.874942413181998 5.42551632784307, 20.141101929009892 5.075332430191338, 19.8842932167463 4.748719167895615, 19.888757351785898 4.425611754879355, 19.562379321898334 4.339304134249687, 19.55321845615981 4.058740854263306, 19.82954796490958 4.061927259899676, 19.83135298069101 3.9006921174004674, 20.13803375896532 3.782377396710217, 20.008533840009477 3.063924697227776, 19.841014401463326 3.061993000097573, 19.87714232603321 0.3684074049815536, 12.25031278852839 0.2766517363488674, 12.252921311883256 0.0487310625612736, 11.490193735866342 0.0395553829148412, 8.202083160926122 0, 8.199474338965956 0.2279206719249487))

@christaina
Copy link

christaina commented Mar 13, 2024

I suspect it's a different bug/problem in the buffer algorithm. But it will be worth trying the fix for this issue when it lands.

i see, good to know. so the reason why we thought this issue relates to our example is that we are certain the data should not be empty after the buffer in, and perturbing the parameters passed into the buffer in operation slightly (and not significantly enough to meaningfully change what the expected output should look like) returns the result we expect. Similarly to what the original reporter observed.

The data passed into the buffer operation is around 5000 square meters in size and looks roughly like an elongated rectangle (with a lot of vertices, and not a perfect rectangle), so theres no complexity in what we expect the output after buffer in should look like. Also, we are successfully able to process other polygons, with the same code, that are similar in size/structure in the same dataset, and others, without the issue occuring.

not sure if this info goes against your thought or is helpful. anyway, am trying to share more info, even tho it isnt enough to draw a specific conclusion. Dont mean to derail productive convo about the originally reported issue here.

@dr-jts
Copy link
Contributor

dr-jts commented Mar 14, 2024

This bug applies to JTS too (using a recent-ish 1.20.0 SNAPSHOT).

@mwtoews actually this particular case now works in JTS. It looks like the GEOS offset curve generation logic must be slightly different to the current JTS logic, since the generated curves are different.

image

This should be tracked down at some point, since it probably indicates a JTS fix that hasn't been applied to GEOS.

However, the fix in locationtech/jts#1038 should work for GEOS in any case.

@dr-jts dr-jts changed the title Negative buffer operation return EMPTY GEOMETRY Negative buffer operation returns EMPTY GEOMETRY Mar 15, 2024
@dr-jts dr-jts changed the title Negative buffer operation returns EMPTY GEOMETRY Negative buffer operation returns EMPTY geometry Mar 15, 2024
@dr-jts
Copy link
Contributor

dr-jts commented Mar 15, 2024

@christaina we still need the data to reproduce your issue. However, if translating the polygon allows the buffer operation to execute correctly, then you could build a workaround using that - if the expected result is missing, then translate, buffer, and translate back.

@christaina
Copy link

christaina commented Mar 15, 2024

@dr-jts thanks for the idea. I was trying to see if I can generate an ambiguous geometry to share with you that does reproduce the issue, by manually offsetting the centered one by some amount. that hasnt worked so far. I will try to do this more programmatically later and see if I have some success :)

However, if translating the polygon allows the buffer operation to execute correctly, then you could build a workaround using that - if the expected result is missing, then translate, buffer, and translate back.

i am wondering if we build a workaround based on this, how likely is it that across all the data we need to process, translating wont cause the issue to occur again. I am also wondering if you have thoughts on - if we translated every geometry to center using the bounds, liked i tried to get the polygon i shared with you, would this likely resolve the issue overall or is it more likely the issue will be triggered on different data points (i guess what im getting at is- do you have an opinion on what about translating the geom probably resolves the issue in my example?)

we would prefer to avoid having to check if its empty in the code, and then reprocessing, because we have a lot of places in the code where we buffer and so checking and reprocessing every time is not an elegant/straightforward modification (feels like bad practice?). also because the proccessing time of our data is already slow, so avoiding reprocessing again and instead implementing a workaround to avoid it up front would be ideal

misc related info i just discovered-

  • if I try to translate the original geometry by a fixed amount (e.g. ambiguous_bad_poly = affinity.translate(bad_poly, -100000, -100000) i can reproduce the bug up until a certain number. translating the coordinates by -1 reproduces the issue, so does -100k, but if I translate it by -200k or more, the bug no longer appears for my example.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug JTS Issue also appears in JTS
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants