Readme updates
awxkee committed Jul 27, 2024
1 parent 3ddf131 commit 7648d49
Showing 3 changed files with 58 additions and 2 deletions.
44 changes: 43 additions & 1 deletion README.md
@@ -37,6 +37,36 @@ let resized = scaler.resize_rgba(
let resized_image = resized.as_bytes();
```

### Fastest path with SIMD

Although every implementation is fast, not all paths are implemented with SIMD, so some paths are slower than others.

`~` - Partially implemented

| | NEON | SSE | AVX |
|-----------------|------|-----|-----|
| RGBA (8 bit) | x | x | ~ |
| RGB (8 bit) | x | x | ~ |
| Plane (8 bit) | x | x | ~ |
| RGBA (8+ bit) | x | x | - |
| RGB (8+ bit) | x | x | - |
| Plane (10+ bit) | - | - | - |
| RGBA (f32) | x | x | x |
| RGB (f32) | x | x | ~ |
| Plane (f32) | x | x | ~ |
| RGBA (f16) | x | x | x |
| RGB (f16) | ~ | ~ | ~ |
| Plane (f16) | ~ | ~ | ~ |

#### Target features

The optional target features `fma`, `sse4.1`, `sse4.2`, `avx2`, `neon`, and `f16c` are available; enable them when compiling for a supported platform to get the full set of SIMD paths.

To enable full support for *f16*, the `half` feature should be used, and `f16c` should be enabled when targeting x86 platforms.

For NEON `f16` support, the `neon` target feature should be enabled, and the target platform is expected to be `aarch64`.

Even with the `half` feature enabled, `f16` processing will be slow if the platform does not support it or the corresponding target features are not enabled.
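To check which of the target features listed above were actually enabled for a given build, a small compile-time probe can help. The sketch below uses only the standard `cfg!` macro; the function name `enabled_target_features` is hypothetical and not part of pic-scale.

```rust
/// Returns the names of the listed target features that were enabled
/// at compile time for the current build.
/// Hypothetical helper for illustration; not part of pic-scale.
fn enabled_target_features() -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // Each `cfg!` expands to `true` or `false` at compile time.
    if cfg!(target_feature = "fma") {
        enabled.push("fma");
    }
    if cfg!(target_feature = "sse4.1") {
        enabled.push("sse4.1");
    }
    if cfg!(target_feature = "sse4.2") {
        enabled.push("sse4.2");
    }
    if cfg!(target_feature = "avx2") {
        enabled.push("avx2");
    }
    if cfg!(target_feature = "neon") {
        enabled.push("neon");
    }
    if cfg!(target_feature = "f16c") {
        enabled.push("f16c");
    }
    enabled
}

fn main() {
    println!("Enabled target features: {:?}", enabled_target_features());
}
```

Target features are usually passed through `RUSTFLAGS` with `-C target-feature=...` or `-C target-cpu=native`; running the probe with and without those flags shows the difference.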

### Performance

Example timing comparison with `fast-image-resize` for downscaling an RGB 4928x3279 image by a factor of two on x86_64 SSE.
@@ -81,7 +111,6 @@ M3 Pro. NEON
| pic-scale | 17.41 |
| fir sse | 25.82 |


Example timing comparison for downscaling an RGBA 4928x3279 10-bit image by a factor of two on *NEON* with alpha premultiplication.

| | Lanczos3 |
@@ -96,6 +125,19 @@ RGBA 4928x3279 10 bit downscale without premultiplying alpha *NEON*
| pic-scale | 45.09 |
| fir sse | 73.82 |

Example timing comparison for downscaling an RGBA 4928x3279 10-bit image by a factor of two on *SSE* with alpha premultiplication.

| | Lanczos3 |
|-----------|:--------:|
| pic-scale | 156.90 |
| fir sse | 150.65 |

RGBA 4928x3279 10 bit downscale without premultiplying alpha *SSE*

| | Lanczos3 |
|-----------|:--------:|
| pic-scale | 107.82 |
| fir sse | 113.51 |
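For readers unfamiliar with the "premultiplying alpha" step that separates the two sets of timings above, here is a minimal scalar sketch of premultiplication for a single high bit-depth RGBA pixel. It is an illustration under stated assumptions (a hypothetical `premultiply_rgba` function, unsigned integer channels, round-to-nearest), not pic-scale's or `fast-image-resize`'s actual implementation.

```rust
/// Premultiplies the color channels of one RGBA pixel by its alpha.
/// `bit_depth` is assumed to be in [1, 16]; channels are assumed to lie
/// in [0, 2^bit_depth - 1]. Hypothetical helper for illustration only.
fn premultiply_rgba(pixel: [u16; 4], bit_depth: u32) -> [u16; 4] {
    // Maximum channel value, e.g. 1023 for 10-bit data.
    let max = (1u32 << bit_depth) - 1;
    let alpha = pixel[3] as u32;
    // Scale a channel by alpha / max with round-to-nearest.
    // For bit_depth <= 16 the intermediate product fits in u32.
    let scale = |c: u16| ((c as u32 * alpha + max / 2) / max) as u16;
    [scale(pixel[0]), scale(pixel[1]), scale(pixel[2]), pixel[3]]
}

fn main() {
    // A half-transparent white pixel in 10-bit RGBA.
    let out = premultiply_rgba([1023, 1023, 1023, 511], 10);
    println!("{:?}", out);
}
```

Resizing premultiplied data avoids color bleeding from fully transparent pixels, at the cost of the extra premultiply and unpremultiply passes reflected in the timings.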

#### Example in sRGB

2 changes: 1 addition & 1 deletion app/src/main.rs
@@ -206,7 +206,7 @@ fn test_fast_image() {
&mut dst_image,
&ResizeOptions::new()
.resize_alg(ResizeAlg::Convolution(Lanczos3))
- .use_alpha(false),
+ .use_alpha(true),
)
.unwrap();

14 changes: 14 additions & 0 deletions src/scaler.rs
@@ -561,6 +561,11 @@ impl ScalingU16 for Scaler {
new_image.bit_depth = bit_depth;
return new_image;
}

if bit_depth < 1 || bit_depth > 16 {
panic!("Bit depth must be in [1, 16] but got {}", bit_depth);
}

let vertical_filters = self.generate_weights(store.height, new_size.height);
let horizontal_filters = self.generate_weights(store.width, new_size.width);

@@ -632,6 +637,10 @@ impl ScalingU16 for Scaler {
return new_image;
}

if bit_depth < 1 || bit_depth > 16 {
panic!("Bit depth must be in [1, 16] but got {}", bit_depth);
}

let pool = self
.threading_policy
.get_pool(ImageSize::new(new_size.width, new_size.height));
@@ -684,6 +693,11 @@ impl ScalingU16 for Scaler {
new_image.bit_depth = bit_depth;
return new_image;
}

if bit_depth < 1 || bit_depth > 16 {
panic!("Bit depth must be in [1, 16] but got {}", bit_depth);
}

let vertical_filters = self.generate_weights(store.height, new_size.height);
let horizontal_filters = self.generate_weights(store.width, new_size.width);
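
The same bit-depth check is repeated in all three hunks above. As a possible follow-up, it could be factored into one helper. The sketch below is an assumption-laden illustration: the name `validate_bit_depth`, the `u32` parameter type, and the choice to return a `Result` rather than panic are all hypothetical and not taken from this commit.

```rust
/// Checks that `bit_depth` lies in the inclusive range [1, 16].
/// Hypothetical helper; the commit itself panics inline instead.
fn validate_bit_depth(bit_depth: u32) -> Result<u32, String> {
    if (1..=16).contains(&bit_depth) {
        Ok(bit_depth)
    } else {
        // Same message text as the panic added in this commit.
        Err(format!(
            "Bit depth must be in [1, 16] but got {}",
            bit_depth
        ))
    }
}

fn main() {
    match validate_bit_depth(10) {
        Ok(depth) => println!("bit depth {} accepted", depth),
        Err(msg) => eprintln!("{}", msg),
    }
}
```

A caller that wants to keep the panicking behavior from this commit could call `validate_bit_depth(bit_depth).unwrap()` at the top of each method.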

