feat(ustring): ustring hash collision protection #4350

Open
wants to merge 2 commits into
base: main

Commits on Jul 20, 2024

  1. feat(ustring): ustring hash collision protection

    The gist is that the ustring::strhash(str) function is modified to
    strip out the MSB from Strutil::strhash.  The rep entry is filed in
    the ustring table based on this hash.  So effectively, the computed
    hash is 63 bits, not 64.
    
    But the rep->hashed field holds the computed hash in its lower 63
    bits, and its MSB indicates whether this is the second (or later)
    entry in the table with the same 63-bit hash.
    
    ustring::hash() then is modified as follows: If the MSB is 0, the
    computed hash is the hash. If the MSB is 1, though, we DON'T use that
    hash, and instead we use the pointer to the unique characters, but
    with the MSB set (that's an invalid address by itself). Note that the
    computed hashes never have the MSB set, and the char*+MSB values
    always do, so ustring::hash() will never return the same value for
    two different ustrings.
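
The bit layout described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `Rep`, `strip_msb`, `make_hashed`, and `reported_hash` are hypothetical names, and the real rep structure and table logic are not shown in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not the actual ustring code.
constexpr std::uint64_t kMsb = std::uint64_t(1) << 63;

// Stand-in for a table entry. `hashed` holds the 63-bit computed hash in
// its low bits; the MSB flags a later entry with the same 63-bit hash.
struct Rep {
    std::uint64_t hashed;
    const char* chars;  // the unique characters for this entry
};

// Strip the MSB from a full 64-bit hash, leaving a 63-bit value.
constexpr std::uint64_t strip_msb(std::uint64_t full_hash)
{
    return full_hash & ~kMsb;
}

// Build the stored field from a 63-bit hash and a duplicate flag.
inline std::uint64_t make_hashed(std::uint64_t hash63, bool is_duplicate)
{
    assert((hash63 & kMsb) == 0);  // computed hashes never have the MSB set
    return is_duplicate ? (hash63 | kMsb) : hash63;
}

// The value a hash() call would report for this entry.
inline std::uint64_t reported_hash(const Rep& rep)
{
    if ((rep.hashed & kMsb) == 0)
        return rep.hashed;  // first entry with this hash: report it as-is
    // Duplicate: report the character address with the MSB forced on.
    auto addr = reinterpret_cast<std::uintptr_t>(rep.chars);
    return static_cast<std::uint64_t>(addr) | kMsb;
}
```

Because one branch always returns a value with the MSB clear and the other always returns a value with the MSB set, the two kinds of result cannot coincide.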
    
    But -- please note! -- that ustring::strhash(str) and
    ustring(str).hash() will only match (and also be the same value on
    every execution) if the ustring is the first to receive that hash,
    which should be approximately always. Probably always, in practice.
    
    But in the very improbable case of a hash collision, one of them (the
    second to be turned into a ustring) will be using the alternate hash
    based on the character address, which is neither the same as
    ustring::strhash(chars) nor expected to be the same constant on
    every program execution.
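
To make the collision case concrete, here is a toy interning table with a deliberately weak hash (the string length) so that collisions are easy to provoke. Everything here is an assumption for illustration: `ToyTable`, `toy_strhash`, and `intern_and_hash` are hypothetical, the real code uses Strutil::strhash and the ustring table, and thread safety is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>
#include <unordered_map>
#include <unordered_set>

constexpr std::uint64_t kTopBit = std::uint64_t(1) << 63;

// Deliberately weak stand-in for the real string hash: just the length,
// with the MSB stripped so the result fits in 63 bits.
inline std::uint64_t toy_strhash(const std::string& s)
{
    return std::uint64_t(s.size()) & ~kTopBit;
}

class ToyTable {
public:
    // Intern `s` and return the hash its entry would report.
    std::uint64_t intern_and_hash(const std::string& s)
    {
        auto found = entries_.find(s);
        if (found != entries_.end())
            return found->second;  // already interned: same value as before

        std::uint64_t h = toy_strhash(s);
        assert((h & kTopBit) == 0);
        storage_.push_back(s);
        const std::string& stored = storage_.back();

        std::uint64_t reported = h;
        if (!used_hashes_.insert(h).second) {
            // A different string already owns this 63-bit hash, so fall
            // back to the character address with the MSB set.
            auto addr = reinterpret_cast<std::uintptr_t>(stored.data());
            reported = static_cast<std::uint64_t>(addr) | kTopBit;
        }
        entries_.emplace(s, reported);
        return reported;
    }

private:
    // std::deque keeps element addresses stable across push_back.
    std::deque<std::string> storage_;
    std::unordered_map<std::string, std::uint64_t> entries_;
    std::unordered_set<std::uint64_t> used_hashes_;
};
```

In this sketch, interning "abc" and then "xyz" gives the first string its computed hash, while the second gets an address-based value that differs from `toy_strhash("xyz")` and may vary from run to run.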
    
    Signed-off-by: Larry Gritz <lg@larrygritz.com>
    lgritz committed Jul 20, 2024
    5c5ba90

Commits on Aug 15, 2024

  1. 3fa0a3b