PER_BAND mask and MaskedArray #580

Merged: vincentsarago merged 10 commits into dev from rioTilerV5 on Mar 27, 2023

@vincentsarago vincentsarago commented Mar 6, 2023

ref #579

Since the early versions of rio-tiler we have always used either the alpha band or dataset_mask as the mask, but in reality a mask can be PER_BAND (e.g. the nodata-masked pixels can differ from one band to another).

This PR sketches out the switch to numpy.ma.MaskedArray within the ImageData class and as the output of the rio_tiler.reader.read function.
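
To illustrate why a PER_BAND mask carries more information than a single dataset mask, here is a minimal NumPy-only sketch. The pixel values and the per-band nodata values (0 and 255) are made up for the example; nothing here calls rio-tiler itself.

```python
import numpy

# Two bands of 2x2 pixels. Assumption for this example only:
# nodata is 0 in the first band and 255 in the second band.
data = numpy.array(
    [
        [[0, 10], [20, 30]],
        [[5, 255], [255, 40]],
    ],
    dtype="uint8",
)
nodata_per_band = numpy.array([0, 255], dtype="uint8")

# PER_BAND mask: True marks a pixel that is invalid for that band only.
per_band_mask = data == nodata_per_band[:, None, None]
array = numpy.ma.MaskedArray(data, mask=per_band_mask)

# Collapsing to one dataset-wide mask (255 = valid in at least one band)
# hides the fact that each band has its own invalid pixels.
dataset_mask = numpy.logical_or.reduce(~array.mask) * numpy.uint8(255)
```

In this example every pixel is valid in at least one band, so the collapsed mask is 255 everywhere even though three band-level pixels are masked.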

ToDo

  • add more tests
  • update PointData class
  • update mosaic methods
  • documentation



 @attr.s
 class ImageData:
     """Image Data class.

     Attributes:
-        data (numpy.ndarray): pixel values.
-        mask (numpy.ndarray): rasterio mask values.
+        data (numpy.ma.MaskedArray): image values.

data -> array

def mask(self) -> numpy.ndarray:
    """Return Mask in form of rasterio dataset mask."""
    return numpy.logical_or.reduce(~self.array.mask) * numpy.uint8(255)

We can still use ImageData().data and ImageData().mask for compatibility.
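
A rough sketch of how such compatibility properties could sit on top of a single MaskedArray attribute. `ImageDataSketch` is a hypothetical stand-in written for this example, not the rio-tiler class, and it uses `numpy.ma.getmaskarray` so that an array without an explicit mask still works.

```python
import numpy


class ImageDataSketch:
    """Hypothetical stand-in for ImageData (not the rio-tiler implementation)."""

    def __init__(self, array: numpy.ma.MaskedArray):
        self.array = array

    @property
    def data(self) -> numpy.ndarray:
        """Raw pixel values, without the mask."""
        return self.array.data

    @property
    def mask(self) -> numpy.ndarray:
        """Rasterio-style mask: 255 where at least one band is valid, else 0."""
        valid = ~numpy.ma.getmaskarray(self.array)
        return numpy.logical_or.reduce(valid) * numpy.uint8(255)


arr = numpy.ma.MaskedArray(
    numpy.arange(8, dtype="uint8").reshape(2, 2, 2),
    mask=[[[True, False], [False, False]], [[True, True], [False, False]]],
)
img = ImageDataSketch(arr)
```

Here `img.data` has the same (bands, height, width) shape as the array, while `img.mask` is a 2D uint8 array.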

mask = ~numpy.logical_or.reduce(numpy.ma.getmaskarray(arr))
return cls(data, mask * numpy.uint8(255))

# TODO: Deprecate

from_array is not needed anymore

@@ -31,7 +31,7 @@ class Options(TypedDict, total=False):
     vrt_options: Optional[Dict]
     resampling_method: Optional[Resampling]
     unscale: Optional[bool]
-    post_process: Optional[Callable[[numpy.ndarray, numpy.ndarray], DataMaskType]]
+    post_process: Optional[Callable[[numpy.ma.MaskedArray], numpy.ma.MaskedArray]]

We should assume that the post_process method will only work with a MaskedArray.
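
For illustration, a callable matching the new `Callable[[numpy.ma.MaskedArray], numpy.ma.MaskedArray]` signature could look like the sketch below. The function name and the doubling operation are invented for the example.

```python
import numpy


def scale_post_process(arr: numpy.ma.MaskedArray) -> numpy.ma.MaskedArray:
    """Hypothetical post_process: double the values, keep the per-band mask."""
    # Arithmetic on a MaskedArray returns a MaskedArray with the same mask.
    return arr * 2


arr = numpy.ma.MaskedArray(
    numpy.array([[[1, 2], [3, 4]]], dtype="uint16"),
    mask=[[[False, True], [False, False]]],
)
out = scale_post_process(arr)
```

The mask travels with the data, so the callable no longer has to accept or return it separately.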

out_shape=(height, width) if height and width else None,
resampling=resampling,
boundless=boundless,
masked=True,

👋 dataset_mask

@@ -441,7 +437,7 @@ def point(
     resampling_method: Resampling = "nearest",
     unscale: bool = False,
     post_process: Optional[
-        Callable[[numpy.ndarray, numpy.ndarray], DataMaskType]
+        Callable[[numpy.ma.MaskedArray], numpy.ma.MaskedArray]

PointData will also be updated to use MaskedArray

else numpy.nan,
"minority": float(keys[counts.tolist().index(counts.min())].tolist())
if valid_pixels
else numpy.nan,
"unique": float(counts.size),
**dict(zip(percentiles_names, percentiles_values)),
"histogram": histogram,

Fix some issues when no valid data is present in the file
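
The guard in the snippet above (return NaN when `valid_pixels` is zero) can be sketched in isolation as follows. `majority_minority` is a hypothetical helper written for this example and is not the statistics function from the PR.

```python
import numpy


def majority_minority(arr: numpy.ma.MaskedArray) -> dict:
    """Hypothetical helper: majority/minority values among the valid pixels."""
    values = arr.compressed()  # 1D array holding only the unmasked pixels
    valid_pixels = values.size
    if not valid_pixels:
        # Nothing to count: avoid calling min()/max() on empty arrays.
        return {"majority": numpy.nan, "minority": numpy.nan, "valid_pixels": 0}

    keys, counts = numpy.unique(values, return_counts=True)
    return {
        "majority": float(keys[counts.argmax()]),
        "minority": float(keys[counts.argmin()]),
        "valid_pixels": int(valid_pixels),
    }


all_masked = numpy.ma.MaskedArray(numpy.zeros((2, 2)), mask=True)
some_valid = numpy.ma.MaskedArray([1, 1, 2, 3], mask=[False, False, False, True])
```

Calling the helper on `all_masked` returns NaN for both statistics instead of raising on an empty array.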

@vincentsarago vincentsarago self-assigned this Mar 6, 2023
arr = src.read(
masked=True,
out_shape=(src.count, int(src.height / 10), int(src.width / 10)),
)

@vincentsarago vincentsarago Mar 8, 2023

speed up tests by reducing the array size

     data = data * 2
-    return data, mask
+    data.mask = False  # set mask to False
+    return data

@vincentsarago vincentsarago Mar 8, 2023

post_process now requires a MaskedArray
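
For anyone holding on to an old-style `(data, mask)` callable, one possible migration path is a small adapter. This is only a sketch: `adapt_legacy_post_process` and `legacy` are hypothetical names, and the adapter assumes the legacy callable returns a 2D mask where 0 means masked.

```python
from typing import Callable, Tuple

import numpy

LegacyPostProcess = Callable[
    [numpy.ndarray, numpy.ndarray], Tuple[numpy.ndarray, numpy.ndarray]
]


def adapt_legacy_post_process(
    func: LegacyPostProcess,
) -> Callable[[numpy.ma.MaskedArray], numpy.ma.MaskedArray]:
    """Hypothetical helper wrapping a (data, mask) callable for MaskedArray input."""

    def wrapper(arr: numpy.ma.MaskedArray) -> numpy.ma.MaskedArray:
        # Collapse the per-band mask into the single 0/255 mask the old API used.
        valid = ~numpy.ma.getmaskarray(arr)
        old_mask = numpy.logical_or.reduce(valid) * numpy.uint8(255)
        data, mask = func(arr.data, old_mask)
        # Spread the returned 2D mask over every band (True = masked).
        new_mask = numpy.broadcast_to(mask == 0, data.shape).copy()
        return numpy.ma.MaskedArray(data, mask=new_mask)

    return wrapper


def legacy(data, mask):
    """Old-style callable: doubles the data and passes the mask through."""
    return data * 2, mask


arr = numpy.ma.MaskedArray(
    numpy.ones((2, 2, 2), dtype="uint8"),
    mask=[[[True, False], [False, False]], [[True, False], [False, True]]],
)
out = adapt_legacy_post_process(legacy)(arr)
```

Note that the round trip through a single 2D mask drops the per-band information, so rewriting the callable to work on the MaskedArray directly is the better long-term option.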

 if force_binary_mask:
-    mask = numpy.where(mask != 0, numpy.uint8(255), numpy.uint8(0))
+    pass
+    # mask = numpy.where(mask != 0, numpy.uint8(255), numpy.uint8(0))

This is the biggest breaking change! Now that we use rasterio's masked=True and numpy.ma.MaskedArray for the PER_BAND mask representation, we won't support non-binary masks. A mask value will always be True (masked, 0 in a rasterio-style mask) or False (valid, 255).

This is a major downside for people using non-binary alpha bands, but sometimes we have to make hard decisions. (ref: rasterio/rasterio#1721 (comment))
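
A small NumPy-only sketch of the information that gets lost. The alpha values are invented, and the rule "alpha == 0 means masked" is an assumption made for this example about how a boolean mask would be derived from an alpha band.

```python
import numpy

# Hypothetical 1-band image with an alpha band holding partial transparency.
data = numpy.array([[[10, 20, 30]]], dtype="uint8")
alpha = numpy.array([[0, 128, 255]], dtype="uint8")

# A boolean mask can only say "masked" or "valid". Assumption: alpha == 0 is
# masked, and any other value (including 128) is treated as fully valid.
mask = numpy.broadcast_to(alpha == 0, data.shape).copy()
array = numpy.ma.MaskedArray(data, mask=mask)

# Converting back to a rasterio-style mask yields only 0 or 255.
recovered = numpy.logical_or.reduce(~array.mask) * numpy.uint8(255)
```

The half-transparent value 128 comes back as 255, which is the behaviour change users with non-binary alpha bands need to plan for.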

@vincentsarago

🤔 we may hit #105 again

@vincentsarago vincentsarago marked this pull request as ready for review March 23, 2023 15:50
@vincentsarago vincentsarago added the breaking breaking change label Mar 23, 2023
@vincentsarago vincentsarago changed the title from "sketch use of MaskedArray with PER_BAND mask" to "PER_BAND mask and MaskedArray" Mar 23, 2023
@vincentsarago vincentsarago changed the base branch from main to dev March 24, 2023 20:23
@vincentsarago vincentsarago merged commit a5502ce into dev Mar 27, 2023
@vincentsarago vincentsarago deleted the rioTilerV5 branch March 27, 2023 13:45
vincentsarago added a commit that referenced this pull request Jun 1, 2023
* update readme

* PER_BAND mask and MaskedArray (#580)

* sketch use of MaskedArray with PER_BAND mask

* update PointData and fix tests

* update mosaics

* update nodata mask

* remove unused

* update changelog

* migration guide

* split resampling option to RasterIO and Warp options (#590)

* refactor Mask tests and Benchmark

* refactor Mask tests and Benchmark (#591)

* change arguments in dynamic_tiler.md (#592)

* change arguments

The argument  ("tile", {"z": "{z}", "x": "{x}", "y": "{y}"}) causes errors below.

File "/home/ubuntu/app/app.py", line 48, in tilejson
    tile_url = request.url_for("tile", {"z": "{z}", "x": "{x}", "y": "{y}"})
TypeError: HTTPConnection.url_for() takes 2 positional arguments but 3 were given

("tile", z='{z}', x='{x}', y='{y}') is correct.

* Update docs/src/advanced/dynamic_tiler.md

---------

Co-authored-by: Vincent Sarago <vincent.sarago@gmail.com>

* refactor MosaicMethods (#594)

* refactor MosaicMethods

* fix tests

* Optional boto3 (#597)

* make boto3 an optional dependency

* update migration

* update test dependencies

* add flake8 rules in ruff (#600)

* update dev version

* update mosaic example notebook custom pixel_selection class (#602)

* improve cutline handling (#598)

* Improve cutline handling

Closes #588

* handle projection

---------

Co-authored-by: vincentsarago <vincent.sarago@gmail.com>

* remove useless cache

* save benchmarks

* add benchmarking in docs

* fix nodata/mask/alpha forwarding for GCPS dataset (#604)

* remove boto3

* Allow clip-box to auto expand (#608)

* Allow clip-box to auto expand

* Add test

* Update tests/test_io_xarray.py

* Fix import order

* Changes from black linter

* Change default to true

* use auto_expand in part and update changelog

---------

Co-authored-by: Vincent Sarago <vincent.sarago@gmail.com>

* allow morecantile 4.0 (#606)

* allow morecantile 4.0

* set morecantile min version to 4.0

* update changelog

* forward statistics from STAC raster:bands (#611)

* forward statistics from STAC raster:bands

* forward tags as metadata and forward metadata when creating ImageData from list

* update changelog

* handle nodata in XarrayReader (#612)

* handle nodata in XarrayReader

* update changelog

* add AWS credential overrides for S3 stac (#613)

* add AWS credential overrides for S3 stac

* catch warnings

* deprecated model methods (#615)

* update readme

---------

Co-authored-by: TTY6335 <36815385+TTY6335@users.noreply.github.com>
Co-authored-by: Daniel Wiesmann <yellowcap@users.noreply.github.com>
Co-authored-by: Aimee Barciauskas <aimee@developmentseed.org>
Labels
5.0.0 breaking breaking change