Commit

quantize_into_bins: refacto
unytics committed Jul 26, 2024
1 parent 8089d59 commit 061e77e
Showing 1 changed file with 7 additions and 11 deletions.
18 changes: 7 additions & 11 deletions bigfunctions/quantize_into_bins_with_labels.yaml
@@ -20,26 +20,23 @@ output:
examples:
- description: |
`55` is between `50` and `60` so it is in the second bin.
- Function returns `Wait for result exam` label.
+ --> Function returns `Wait for result exam` label.
arguments:
- "55"
- "[0, 50, 60, 90, 100]"
- "['Fail', 'Wait for result exam', 'Pass', 'Pass with mention']"
output: "Wait for result exam"
- description: |
Lower bounds are inclusive, so `50` is also in the second bin.
- Function returns `Wait for result exam` label.
+ --> Function returns `Wait for result exam` label.
arguments:
- "50"
- "[0, 50, 60, 90, 100]"
- "['Fail', 'Wait for result exam', 'Pass', 'Pass with mention']"
output: "Wait for result exam"
- description: |
`-10` is below the lowest bound.
- Function returns `UNDEFINED_INF`.
+ --> Function returns `UNDEFINED_INF`.
(It returns `UNDEFINED_SUP` if the value is above the upper bound).
arguments:
- "-10"
@@ -48,7 +45,6 @@ examples:
output: "UNDEFINED_INF"
- description: |
You can also pass `n + 1` labels instead of `n - 1` labels (where `n` is the number of bounds).
In that case, values below the first bound get the first label (instead of `UNDEFINED_INF`).
`-10` will then give `Lower than very bad!`.
arguments:
@@ -63,7 +59,7 @@ code: |
index as (
select cast(
replace(
- ML.BUCKETIZE(value, bounds),
+ ML.BUCKETIZE(value, bin_bounds),
'bin_',
''
)
@@ -73,9 +69,9 @@ code: |
padded_labels as (
select
case
- when array_length(labels) = array_length(bounds) + 1 then labels
- when array_length(labels) = array_length(bounds) - 1 then array_concat(['UNDEFINED_INF'], labels, ['UNDEFINED_SUP'])
- else error('len(labels) should be equal to len(bounds) + 1 OR equal to len(bounds) - 1')
+ when array_length(labels) = array_length(bin_bounds) + 1 then labels
+ when array_length(labels) = array_length(bin_bounds) - 1 then array_concat(['UNDEFINED_INF'], labels, ['UNDEFINED_SUP'])
+ else error('len(labels) should be equal to len(bin_bounds) + 1 OR equal to len(bin_bounds) - 1')
end as labels
)
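
For readers who want to check the binning rules described in the examples without running BigQuery, here is a minimal Python sketch of the same behaviour. It is not the function's implementation (that is the SQL above, built on `ML.BUCKETIZE`); it uses `bisect_right` as a stand-in, and the Python function name and the `ValueError` are assumptions made for illustration. Only the padding rule, the error message, and the inclusive lower bounds are taken from the diff.

```python
from bisect import bisect_right


def quantize_into_bins_with_labels(value, bin_bounds, labels):
    """Return the label of the bin containing `value` (lower bounds inclusive).

    Hypothetical Python mirror of the SQL in this commit, for illustration only.
    """
    n = len(bin_bounds)
    if len(labels) == n + 1:
        # Caller supplied labels for the two out-of-range bins as well.
        padded_labels = list(labels)
    elif len(labels) == n - 1:
        # Pad with default labels for values below / above all bounds.
        padded_labels = ["UNDEFINED_INF", *labels, "UNDEFINED_SUP"]
    else:
        raise ValueError(
            "len(labels) should be equal to len(bin_bounds) + 1 "
            "OR equal to len(bin_bounds) - 1"
        )
    # bisect_right counts the bounds <= value, which is exactly the index
    # of the bin in padded_labels when lower bounds are inclusive.
    return padded_labels[bisect_right(bin_bounds, value)]


bounds = [0, 50, 60, 90, 100]
grades = ["Fail", "Wait for result exam", "Pass", "Pass with mention"]
print(quantize_into_bins_with_labels(55, bounds, grades))   # Wait for result exam
print(quantize_into_bins_with_labels(50, bounds, grades))   # Wait for result exam
print(quantize_into_bins_with_labels(-10, bounds, grades))  # UNDEFINED_INF
```

With `n - 1` labels the out-of-range bins fall back to `UNDEFINED_INF` and `UNDEFINED_SUP`; with `n + 1` labels the first and last labels cover those bins instead.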
