Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

region-id inconsistent between depressions.shp and regions.shp #33

Open
tlohde opened this issue Mar 15, 2023 · 3 comments
Open

region-id inconsistent between depressions.shp and regions.shp #33

tlohde opened this issue Mar 15, 2023 · 3 comments

Comments

@tlohde
Copy link

tlohde commented Mar 15, 2023

Firstly, thank you for providing an excellent package/resource.

  • lidar version: 0.7.1
  • Python version: 3.11.0
  • Operating System: Windows 10

Description

Describe what you were trying to get done.

  • short-term goal, and where the problem is happening: aligning depression regions, with individual nested depressions

Tell us what happened, what went wrong, and what you expected to happen.

  • the region-id values in regions.shp/regions_info.csv that is generated using ExtractSinks() differ from those in the depressions.shp/depressions_info.csv generated using DelineateDepressions().
  • I would expect the region-id to be consistent between these.

What I Did

outdir='tmp/'
min_size = 50           # minimum number of pixels as a depression
min_depth = 30           # minimum depth as a depression
interval = 10         # slicing interval for the level-set method
bool_shp = False        # output shapefiles for each individual level

sink_path = ExtractSinks('sample.tif', min_size, outdir)

dep_id_path, dep_level_path = DelineateDepressions(sink_path, min_size, min_depth, interval, outdir, bool_shp)

# read in output and combine info csv with geometry from shapefile
# depressions
depressions = gpd.read_file('tmp/depressions.shp')
depressions = depressions.dissolve('id').reset_index() # makes for a tidier merge with depressions_info.csv (creates multipolygons)
dep_info = pd.read_csv('tmp/depressions_info.csv')
gdf = depressions.merge(dep_info, on='id')

# regions
regions = gpd.read_file('tmp/regions.shp').sort_values(by='id')
reg_info = pd.read_csv('tmp/regions_info.csv')
regions = regions.merge(reg_info, left_on='id', right_on='region-id').sort_values(by='id').reset_index(drop=True)

at this stage I would expect the region-id in regions to match that in gdf...however, they do not. Below is an illustration, which I think also provides a way to handle the discrepency (spatial join)

# read in raster (for plotting)
region_raster = rio.open_rasterio('tmp/region.tif')

# to enable comparison of region-ids intersection of depressions and regions shapefiles
overlaid = (gdf
            .dissolve('region-id')   # aggregate by region-id
            .reset_index() # to ensure result gdf has both region-id_1 and region-id_2 (where _2 is the value that correctly corresponds with those in regions.shp)
            .overlay(regions, keep_geom_type=False)
)

###### plotting

# random region number
R = 310

fig, axs = plt.subplots(figsize=[15,6],ncols=5)

# plot raster 
(region_raster==R).plot(ax=axs[0], add_colorbar=False)

# plot from regions shapefile
regions.loc[regions['region-id']==R].plot(ax=axs[1])

# plot from depreesion shapefile
gdf.loc[gdf['region-id']==R].plot(column='level', ax=axs[2])

# plot from intersection of depressions and regions
overlaid.loc[overlaid['region-id_1']==R].plot(ax=axs[3])

# plot from intersection of depressions and regions
overlaid.loc[overlaid['region-id_2']==R].plot(ax=axs[4])

# tidy up axes limits, and labels etc...
axs[0].set_xlim(axs[1].get_xlim())
axs[0].set_ylim(axs[1].get_ylim())
axs[0].set_aspect('equal')

axs[0].set_title(f'region:{R}\nfrom region.tif')
axs[1].set_title(f'region:{R}\nfrom regions.shp')
axs[2].set_title(f'region:{R}\nfrom depressions.shp')
axs[3].set_title(f'region:{overlaid.loc[overlaid["region-id_1"]==R,"region-id_2"].values[0]}\nfrom regions.shp')
axs[4].set_title(f'region:{overlaid.loc[overlaid["region-id_2"]==R,"region-id_1"].values[0]}\nfrom depressions.shp')

plt.subplots_adjust(wspace=0.35)

print(f"region: {R} in the depressions.shp file corresponds to region: {overlaid.loc[overlaid['region-id_1']==R,'region-id_2'].values[0]} in regions.shp")
print(f"region: {R} in the regions.shp file corresponds to region: {overlaid.loc[overlaid['region-id_2']==R,'region-id_1'].values[0]} in depressions.shp")

region: 310 in the depressions.shp file corresponds to region: 394 in regions.shp
region: 310 in the regions.shp file corresponds to region: 245 in depressions.shp

image

the question

So, i think the question is... is the discrepency between the two region_ids expected/normal?
If yes, is the use of .overlay() the best way to handle it and reconcile regions.shp/regions_info.csv and depressions.shp/depressions_info.csv, or is there an even more straightforward way?
If no, have I done something wrong?

thank you

@giswqs
Copy link
Member

giswqs commented Mar 17, 2023

Thank you for reporting. Did you try out the example notebook to see if it has the same issue? I suspect that might just be a raster-vector conversion issue rather than the lidar package

https://colab.research.google.com/github/giswqs/lidar/blob/master/examples/lidar_colab.ipynb

@tlohde
Copy link
Author

tlohde commented Mar 17, 2023

I can't seem to make the issue appearin the example notebook. I changed the sink parameters to:

min_size = 20 
min_depth = 0.5 
interval = 0.2

in order generate more regions, and the region-id appears to be consistent between the two.

I will have a closer look at my DEM and the raster-vector conversion. As it is, I think the spatial join is good enough for my needs.

@tlohde
Copy link
Author

tlohde commented Mar 17, 2023

Probelm solved/identified (i think)

The issue arose because regions of my DEM has regions where elevations are < 0 (and they are supposed to be, as I am dealing with a DEM that includes some ocean bathymetry). These areas were identified by ExtractSinks(), but not DelineateDepressions(), i.e. the there were a different number of unique region-ids in regions.shp/regions.tif and in depressions.shp/depressions.tif

Workaround

  • create a duplicate DEM that has offset that brings all elevations above 0.
  • use this in ExtractSinks()

potential solution

I have tried to follow what DelineateDepressions() is doing, and i think, a solution might be to allow level_image to be < 0.

Thank you,
tlohde

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants