Gnarly results that I can't explain #207

Open
StevenLangbroek opened this issue Jun 13, 2023 · 3 comments


StevenLangbroek commented Jun 13, 2023

Describe the bug
Hey hey! I'm working with a dataset called GSOC to calculate soil organic carbon for a geometry. Some geometries are producing... interesting :D results for us. If I run geoblaze.stats on the geometry, I get these results:

[
  {
    "count": 3,
    "valid": 3,
    "invalid": 0,
    "median": 43,
    "min": -3.3999999521443642e+38,
    "max": 43,
    "sum": -3.3999999521443642e+38,
    "range": 3.3999999521443642e+38,
    "mean": -1.1333333173814548e+38,
    "variance": 2.5688888165737062e+76,
    "std": 1.6027753481301446e+38,
    "histogram": {
      "43": {
        "n": 43,
        "ct": 2
      },
      "-3.3999999521443642e+38": {
        "n": -3.3999999521443642e+38,
        "ct": 1
      }
    },
    "modes": [
      43
    ],
    "mode": 43,
    "uniques": [
      -3.3999999521443642e+38,
      43
    ]
  }
]

That -3.3999999521443642e+38 doesn't seem right, and I have no idea what's causing it...

To Reproduce

// this is close to my house, found it by accident but the issue is prevalent in the dataset.
const geometry = [
  [
    [13.457568617507945, 52.49182485147867],
    [13.46005856478584, 52.492777796740285],
    [13.476492216820844, 52.487622982948125],
    [13.479195588151981, 52.48467710379617],
    [13.473148573333333, 52.48155772259062],
    [13.457568617507945, 52.49182485147867]
  ]
];
const gsoc = await geoblaze.parse('https://storage.googleapis.com/fao-maps-catalog-data/geonetwork/gsoc/GSOCmap/GSOCmap1.5.0.tif');
const stats = await geoblaze.stats(gsoc, geometry);

Expected behavior
I'm not sure, to be honest. QGIS doesn't give me values like this when I load the dataset and look up the coordinates...

Any help at all understanding what's happening here would be greatly appreciated <3

@StevenLangbroek
Author

I'm also not entirely sure whether .mean produces correct results... It seems that it doesn't actually pick the mean, but produces an average? Is that possible? Given these values, I had expected mean to be (pseudo-code alert):

const values = [-3.39e+38, 42, 42]
expect(mean(values)).toBe(values[1]);
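
For reference, the `mean` in the stats output above matches the arithmetic average (sum divided by count), while picking the middle value is what `median` reports. A minimal sketch with the three values from that output; `mean` and `median` here are hypothetical helpers written for illustration, not geoblaze's implementation:

```javascript
// Hypothetical helpers for illustration only (not geoblaze's internals).
const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;

const median = (values) => {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
};

// The three pixel values from the stats output above.
const values = [-3.3999999521443642e+38, 43, 43];

console.log(median(values)); // the middle value
console.log(mean(values));   // dominated by the huge negative outlier
```

So a single no-data pixel is enough to drag the mean far away from the real readings, while the median stays at the value most pixels share.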

@StevenLangbroek
Author

So, I've found the area in the dataset explorer and these are indeed "correct" values (although they're obviously gibberish). Is there a way to filter these out somehow? I couldn't find an appropriate API for it, but I'm assuming that's because I'm a dum-dum :D.

@DanielJDufour
Member

Hey, sorry about that. You've done nothing wrong. There's often weirdness when dealing with no-data values and Float32 numbers. There's a bit of a discrepancy between the no-data value provided in the GeoTIFF metadata ("-3.39999999999999996e+038\x00") and what can actually be represented as a Float32 and read back in JavaScript (-3.3999999521443642e+38). I think the issue is that whatever system writes the noDataValue doesn't adjust the value to the number of bits used in the encoding (but that's just a hunch). Unfortunately, it'd take some time to develop a proper fix.
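
The discrepancy can be reproduced in plain JavaScript. A minimal sketch, assuming the metadata string is read with `parseFloat` and the raster stores 32-bit floats (both are assumptions for illustration, not confirmed details of georaster's internals): rounding the declared value to Float32 with `Math.fround` yields the number that shows up in the pixels.

```javascript
// No-data value as written in the GeoTIFF metadata (trailing "\x00" omitted).
const declared = parseFloat("-3.39999999999999996e+038");

// Value actually observed in the pixel data, as reported by geoblaze.stats.
const observed = -3.3999999521443642e+38;

// Math.fround rounds a 64-bit double to the nearest 32-bit float.
const declaredAsFloat32 = Math.fround(declared);

console.log(declared === observed);          // false: the 64-bit values differ
console.log(declaredAsFloat32 === observed); // true: they agree at Float32 precision
```

An exact `===` comparison between the declared value and a pixel therefore never matches, which would explain why the no-data pixel is counted as valid.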

Fortunately, though, there are two possible workarounds. The easiest is to pass a filter function to the stats call:

const filter = value => value !== undefined && value !== -3.3999999521443642e+38;

geoblaze.stats(gsoc, geometry, undefined, filter);

The filter function is undocumented functionality, so I can't commit to maintaining that specific function parameter in the future. If you go this route, I'd recommend locking your geoblaze version in your package.json (if you haven't already).
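
If hardcoding the Float32 literal feels brittle, the filter could be derived from the declared no-data value instead. This is only a sketch under the assumption that the raster is Float32; `makeNoDataFilter` is a hypothetical helper, not part of geoblaze, and the `filter` argument is the undocumented parameter described above.

```javascript
// Hypothetical helper: builds a filter that rejects undefined values and
// any value equal to the no-data value at Float32 precision.
function makeNoDataFilter(noDataValue) {
  const noDataAsFloat32 = Math.fround(noDataValue);
  return (value) =>
    value !== undefined && Math.fround(value) !== noDataAsFloat32;
}

const filter = makeNoDataFilter(-3.4e38);

console.log(filter(43));                      // true: kept
console.log(filter(-3.3999999521443642e+38)); // false: dropped as no-data
console.log(filter(undefined));               // false: dropped

// Usage with the undocumented fourth argument shown above:
// const stats = await geoblaze.stats(gsoc, geometry, undefined, filter);
```

Comparing at Float32 precision only makes sense for Float32 rasters; for other pixel types a plain `!==` against the declared value is probably the safer choice.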

Alternatively, a solution that is guaranteed to keep working in the future is to correct the noDataValue after parsing, like so:

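import geoblaze from "geoblaze";
import parseGeoRaster from "georaster";

// url and geometry are the GeoTIFF URL and polygon from the report above
const georaster = await parseGeoRaster(url);
// overwrite the declared no-data value with the Float32 value the pixels actually contain
georaster.noDataValue = -3.3999999521443642e+38;
const stats = await geoblaze.stats(georaster, geometry);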

This is guaranteed to work because it's using georaster's public API, which I will always do my best to maintain.

Let me know if this helps. Happy to provide more assistance.

Also, thank you so much for alerting me to this issue and this great dataset. Now, with a publicly available GeoTIFF file in hand, I'll be able to write some tests and start thinking about how to solve this issue in the future.
