We need a binary file that contains all of the unique keys of all the categorical features, in sequential order.
Some clarifications:
"Unique keys of all the categorical features" means that there should NOT be any duplicates, even across features.
The file size should be exactly the size of the key type (4 bytes for int32, 8 bytes for int64) * the number of unique keys. In other words, the file contains no separators.
The file should be named *.keyset
The number of unique keys is also equal to the sum of the embedding sizes (these can be generated using the get_embedding_sizes method in NVT).
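The constraints above can be checked mechanically. Below is a minimal sketch using only the Python standard library. `write_keyset` and `check_keyset` are hypothetical helper names (not part of NVT), and native byte order is an assumption, since this issue does not specify one.

```python
import os
import struct
import tempfile

# struct format codes for the two key types mentioned above.
KEY_FORMATS = {"int32": "i", "int64": "q"}


def write_keyset(keys, path, key_type="int64"):
    """Write keys back to back as raw binary, with no header and no separators."""
    keys = list(keys)
    if len(set(keys)) != len(keys):
        raise ValueError("keys must be unique, even across features")
    # "=" means native byte order with standard sizes and no padding (assumption).
    fmt = f"={len(keys)}{KEY_FORMATS[key_type]}"
    with open(path, "wb") as f:
        f.write(struct.pack(fmt, *keys))
    return len(keys)


def check_keyset(path, num_unique_keys, key_type="int64"):
    """Return True if the file size is exactly key size * number of keys and no key repeats."""
    code = KEY_FORMATS[key_type]
    key_size = struct.calcsize("=" + code)
    if os.path.getsize(path) != key_size * num_unique_keys:
        return False
    with open(path, "rb") as f:
        keys = struct.unpack(f"={num_unique_keys}{code}", f.read())
    return len(set(keys)) == len(keys)


# Example: three int32 keys should produce a 12-byte file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.keyset")
n = write_keyset([10, 20, 30], path, key_type="int32")
size_bytes = os.path.getsize(path)
ok = check_keyset(path, n, key_type="int32")
```

The size check comes first so that a file with stray separators or a wrong key type is rejected before any keys are decoded.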
For example:
Suppose we have "feature1": [key1, key2, key3], "feature2": [key1, key2], "feature3": [key1, key2, key3, key4].
The embedding sizes of feature1, feature2, and feature3 are 3, 2, and 4, respectively. So the total number of unique keys is 3 + 2 + 4 = 9.
Therefore, the .keyset file should contain the keys 1, 2, 3, 4, 5, 6, 7, 8, 9 (in binary format, with no separators).
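The arithmetic in this example can be sketched as follows. This is an illustration only: `sequential_keys` is a hypothetical helper, the plain name-to-size dict is an assumed input shape (the real output of get_embedding_sizes may be structured differently), and starting the keys at 1 simply mirrors the example above.

```python
import struct


def sequential_keys(embedding_sizes, start=1):
    """Return one globally unique key per embedding row, numbered consecutively."""
    total = sum(embedding_sizes.values())
    return list(range(start, start + total))


# Hypothetical per-feature embedding sizes, taken from the example above.
embedding_sizes = {"feature1": 3, "feature2": 2, "feature3": 4}

keys = sequential_keys(embedding_sizes)

# Pack as int64 ("q") in native byte order (assumption), with no separators.
payload = struct.pack(f"={len(keys)}q", *keys)
payload_size = len(payload)
```

With int64 keys, the nine keys occupy 9 * 8 = 72 bytes, which matches the file size rule stated in the clarifications.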
Generate and output a binary file containing the unique keys of all the categorical features in sequential order.
@yingcanw @jershi425 Please, add more details.
@oyilmaz-nvidia for viz.