Implement ngram-based blacklist filter #54

steffencruz · 2023-10-30T23:20:29Z

Introduces an n-gram-based blacklist. The blacklist is built on an n-gram counter with a sliding window (deque). Conceptually, when completions contain similar subsequences, the associated n-grams accumulate counts and significance. When the significance exceeds a threshold (Blacklist.boundary), all completions containing that n-gram are flagged and receive no reward. The sliding window allows the blacklist to adapt to the changing network.
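
To make the mechanism concrete, here is a minimal sketch of a sliding-window n-gram counter. It is not the code in this PR: the class name SlidingNgramCounter, the helper extract_ngrams, whitespace tokenization, and the choice of raw count as the significance score are all assumptions made for illustration.

```python
from collections import Counter, deque


def extract_ngrams(text, n_min=5, n_max=15):
    """Return every word n-gram of length n_min..n_max as a tuple."""
    words = text.lower().split()
    return [
        tuple(words[i : i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


class SlidingNgramCounter:
    """Hypothetical stand-in for Blacklist, for illustration only."""

    def __init__(self, max_size=1_000_000, boundary=1000, n_min=5, n_max=15):
        self.window = deque()    # n-grams in arrival order
        self.counts = Counter()  # n-gram -> occurrences inside the window
        self.max_size = max_size
        self.boundary = boundary
        self.n_min, self.n_max = n_min, n_max

    def add(self, completions):
        for completion in completions:
            for ngram in extract_ngrams(completion, self.n_min, self.n_max):
                self.window.append(ngram)
                self.counts[ngram] += 1
        # Forget the oldest n-grams once the window is over capacity.
        while len(self.window) > self.max_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def significance(self, ngram):
        # Assumption: raw count. The real score may also weight n-gram length.
        return self.counts.get(ngram, 0)

    def reward(self, completion):
        """Return 0.0 if any n-gram in the completion exceeds the boundary, else 1.0."""
        ngrams = extract_ngrams(completion, self.n_min, self.n_max)
        flagged = any(self.significance(g) > self.boundary for g in ngrams)
        return 0.0 if flagged else 1.0


# Tiny parameters so the behaviour is visible in a few lines.
counter = SlidingNgramCounter(max_size=100, boundary=2, n_min=2, n_max=3)
counter.add(["that is an excellent question"] * 3)
print(counter.reward("well that is an odd reply"))      # shares ('that', 'is')
print(counter.reward("a completely unrelated answer"))  # shares nothing
```

In this sketch the first call prints 0.0 and the second prints 1.0, because ('that', 'is') was seen three times and the boundary is 2.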

Runtime

Estimated runtime for adding to the counter is 25ms per step (50 completions).
Estimated runtime for rewarding completions has not been measured yet, but it requires a full pass over a dict of up to max_size (~1M) items.

Hyperparameters

There is a tradeoff between speed and memory consumption. I think 1M is okay for the queue size but 100k works too.
The length of the queue and throughput of the network (words per unit time) determine how rapidly the blacklist learns/forgets n-grams. I estimate that one step contains around 3400 completion words so a queue of length 1M should be completely flushed in 300 steps. This means that there will be a record of an n-gram for at most 300 steps or ~2.5 hours (@30s per step).
n_min = 5 and n_max = 15 seem reasonable to me. If you make n_min too large, exploits built from short phrases will go undetected.
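
The retention estimate above is simple arithmetic, sketched below. The function name flush_window is made up for this example. The second call is an assumption on my part: it applies the same formula to the ~12,000 n-grams per step quoted later in this thread, which would be the relevant rate if the queue holds one entry per n-gram rather than per word.

```python
def flush_window(queue_size, items_per_step, seconds_per_step=30):
    """Return (steps, hours) until a full queue has been completely replaced."""
    steps = queue_size / items_per_step
    hours = steps * seconds_per_step / 3600
    return steps, hours


# Figures from the description: 1M queue, ~3400 completion words per step.
steps, hours = flush_window(1_000_000, 3400)
print(f"{steps:.0f} steps, {hours:.2f} hours")  # 294 steps, 2.45 hours

# Same queue at ~12,000 entries per step (assumed: one entry per n-gram).
steps, hours = flush_window(1_000_000, 12_000)
print(f"{steps:.0f} steps, {hours:.2f} hours")  # 83 steps, 0.69 hours
```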

Notes

Long n-grams end up creating multiple entries in the blacklist because their subsequences are necessarily frequent too. E.g. ('this','is','an','example','long','sentence') also guarantees the presence of the n-gram ('an','example','long','sentence') (and all the other subsequences) with at least as many counts. Because of this, a common 14-gram also has a common 13-gram, 12-gram ... which tend to fill up the queue quite rapidly.
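
To put a number on this, the sketch below enumerates the contiguous sub-n-grams of a phrase. The helper contiguous_subgrams is hypothetical, and the count assumes all words in the phrase are distinct.

```python
def contiguous_subgrams(ngram, n_min=5):
    """All contiguous sub-n-grams of length n_min..len(ngram), including ngram itself."""
    size = len(ngram)
    return [
        ngram[i : i + n]
        for n in range(n_min, size + 1)
        for i in range(size - n + 1)
    ]


example = ('this', 'is', 'an', 'example', 'long', 'sentence')
subs = contiguous_subgrams(example, n_min=4)
print(('an', 'example', 'long', 'sentence') in subs)  # True

# A single repeated 14-gram with n_min = 5 occupies this many queue entries.
fourteen = tuple(f"w{i}" for i in range(14))
print(len(contiguous_subgrams(fourteen, n_min=5)))  # 55
```

So every occurrence of one common 14-gram adds 10 + 9 + ... + 1 = 55 entries to the queue when n_min = 5.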

TODO

Tracked experiments to determine optimal blacklist parameters (would be good to include raw significance scores in those runs rather than only the binary output)
Pass Blacklist parameters via config
Unit tests

steffencruz · 2023-10-31T01:07:20Z

Come to think of it, we can extend the effective time window by reducing the n-gram addition rate via stochastic sampling. Instead of adding all n-grams in a completion, we can simply select a random subset.

Right now around 12,000 n-grams are added per step (from 50 completions combined). We can likely reduce this by a factor of 5-10x by randomly selecting n-grams.
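
A rough sketch of what that sampling could look like, using only the standard library. The function sample_ngrams and its keep_fraction parameter are hypothetical names, not part of this PR.

```python
import random


def sample_ngrams(ngrams, keep_fraction=0.2, rng=None):
    """Keep a uniformly random subset of the n-grams from one step."""
    if not ngrams:
        return []
    rng = rng or random.Random()
    k = max(1, round(len(ngrams) * keep_fraction))
    return rng.sample(ngrams, k)


# Stand-in for the ~12,000 n-grams produced in one step.
step_ngrams = [("word", str(i)) for i in range(12_000)]
kept = sample_ngrams(step_ngrams, keep_fraction=0.2, rng=random.Random(0))
print(len(kept))  # 2400, i.e. a 5x reduction
```

One thing to keep in mind with this design: the expected count of any repeated n-gram shrinks by the same keep_fraction, so the boundary would likely need to be scaled down accordingly.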

steffencruz · 2023-10-31T01:16:28Z

import tqdm
import wandb
import numpy as np
import pandas as pd
import plotly.express as px
from prompting.validators.reward.blacklist import Blacklist

api = wandb.Api()

# pin a specific run for reproducibility
run = api.run('opentensor-dev/openvalidators/7rahrixe')

df = pd.DataFrame(run.history(samples=1000))

ngram_manager = Blacklist()

# get the batches
batches = df.loc[df.completions.notna(), 'completions']

# inject some test phrases into the completions to see how they would be detected by the blacklist
test_phrases = [
    {'phrase':'that is an excellent question', 'begin':0, 'end':200, 'probability':0.25},
    {'phrase':'Sure! I\'d be happy to help you with that enquiry', 'begin':200, 'end':300, 'probability':0.15},
    {'phrase':'Hell yeah! I\'m ready to do this right now!', 'begin':500, 'end':700, 'probability':0.05},
]
n_test_phrases = 0

save_every = 20
snapshots = []
for i, completions in enumerate(tqdm.tqdm(batches)):
    comps = []
    for completion in completions:
        c = completion
        for phrase in test_phrases:
            if (phrase.get('begin',0) <= i <= phrase.get('end', len(batches))) and phrase.get('probability',1)>np.random.rand():
                c += ' ' + phrase['phrase']
                n_test_phrases += 1
                break
        comps.append(c)
    ngram_manager.add(comps)

    if i%save_every == 0:
        snapshots.append({'step':i, 'deque_length':len(ngram_manager.deque), 'running_size':ngram_manager._running_size, 'significance':ngram_manager.most_significant(10), 'counts':ngram_manager.most_common(10)})

df_snapshots = pd.DataFrame(snapshots)

def make_top_ngrams(x, ntop=10):
    return pd.DataFrame( [{'rank': i, 'phrase':' '.join(xx[0]), 'significance':xx[1]} for i, xx in enumerate(x[:ntop], start=1)])

traces = []
ntop = 10
for idx, row in df_snapshots.iterrows():
    traces.append(make_top_ngrams(row.significance, ntop=ntop).assign(step=row.step))#, date=row.Date, name=row.name, block=row.block))

df_traces = pd.concat(traces)
colors = [px.colors.find_intermediate_color('rgb(0,0,250)', 'rgb(250,0,0)', (ntop-i)/ntop, colortype='rgb') for i in range(0, ntop+1)]
fig = px.line(df_traces, x='step', y='significance', color='rank', #markers=True,
        hover_name='phrase', #hover_data=['date', 'name', 'block'],
        color_discrete_sequence=colors,
        width=800, height=600, template='plotly_white'
        )
fig.update_traces(opacity=0.5)

which produces the following trace data

The injected phrases are clearly detected by the blacklist, which justifies a boundary of ~1000 given the default parameters.

The same trace data without the injected phrases looks like this

steffencruz · 2023-10-31T16:02:58Z

prompting/validators/reward/blacklist.py

+
+        # Check if any n-grams have significance above the boundary
+        for ngram in ngrams:
+            if scores.get(ngram) > self.boundary:


It would be of interest to include a self.blacklist attribute which contains all the ngrams with significance>self.boundary. It should not be a very large object (10s-100s items). This is something we could log to wandb regularly to introspect the blacklist.
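
For reference, a minimal sketch of how such an attribute could be derived, assuming scores is a dict mapping n-gram tuples to significance values as in the snippet above. The function names and the example scores are invented for illustration.

```python
def over_boundary(scores, boundary):
    """Return only the n-grams whose significance exceeds the boundary."""
    return {ngram: score for ngram, score in scores.items() if score > boundary}


def as_log_rows(blacklist):
    """Flatten the blacklist into rows that are easy to log as a table."""
    ranked = sorted(blacklist.items(), key=lambda item: item[1], reverse=True)
    return [{"phrase": " ".join(ngram), "significance": score} for ngram, score in ranked]


# Made-up scores, purely to show the shape of the data.
scores = {
    ("that", "is", "an", "excellent", "question"): 1500.0,
    ("i", "would", "be", "happy", "to"): 1200.0,
    ("the", "capital", "of", "france", "is"): 12.0,
}
blacklist = over_boundary(scores, boundary=1000)
for row in as_log_rows(blacklist):
    print(row)
```

Only the first two entries survive the filter here, which keeps the logged object small.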

p-ferreira

Things to be done:

Track raw score values (being done in Feature/track raw score #63)
Identify why we have some n-grams of 3
Fix branch conflicts

p-ferreira

LGTM, I would be ok to merge it once the branch conflicts have been fixed

steffencruz

Some minor comments, then it's good to go

Implement ngram-based blacklist filter

65d8eab

steffencruz requested review from ifrit98, p-ferreira, isabella618033 and Eugene-hu October 30, 2023 23:20

steffencruz commented Oct 31, 2023

steffencruz mentioned this pull request Nov 1, 2023

Penalty rewards #52

Merged

isabella618033 added 11 commits November 3, 2023 17:17

added tokenizer

d077233

memory save algo

0e03e20

boundry 1000 -> 3

adecfc8

comments

c1e3015

update comment

517f2d6

decoding token when return significance score , added fuzzywuzzy

3f4a353

adding half life to counter

e6be5ef

clean up

ef5280c

remove deque

dfc2d58

added reset function

ac22d2d

forward fix

bc00d7d

p-ferreira suggested changes Nov 7, 2023

isabella618033 added 7 commits November 7, 2023 20:49

significance gram size fix

55b7ce8

defining rewardresult dataclass and reward event

1ba9777

moved event addition into reward model apply function

22a201f

clean up relevence

270fb93

apply to blaclist

e0b0f9d

fixes

0aeddc2

changed get_reward returns for all

7330314

p-ferreira suggested changes Nov 8, 2023

p-ferreira added the 2.1.2 label Nov 8, 2023

p-ferreira marked this pull request as ready for review November 8, 2023 22:16

schema update

21bbd62

p-ferreira approved these changes Nov 8, 2023

black formatting

1d033af

steffencruz commented Nov 8, 2023

isabella618033 and others added 18 commits November 8, 2023 17:45

Spelling

c929dd7

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

Update comment default.

592c612

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

Spelling

254449e

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

Spelling

2473585

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

Merge branch 'features/ngram-blacklist' into feature/track_raw_score

9b318f0

Changes regarding steffen's comments

9660c1d

Merge branch 'features/ngram-blacklist' of https://github.com/openten…

4140dcd

…sor/text-prompting into features/ngram-blacklist

black formatting

27badfc

retain comments

abd0f17

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

00fe827

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

6b02563

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

2794c32

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

3531257

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

fda08af

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

c68a22a

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

retain comments

820e6a5

Co-authored-by: Steffen Cruz <steffenjcruz@gmail.com>

fixes

b300460

Merge branch 'feature/track_raw_score' of https://github.com/opentens…

93dc591

…or/text-prompting into feature/track_raw_score

p-ferreira mentioned this pull request Nov 9, 2023

2.1.2 Release #67

Merged

isabella618033 and others added 5 commits November 9, 2023 15:13

black format

87faaf5

black format

00a201b

black formatted

07882b7

Merge branch 'staging' into features/ngram-blacklist

1d8c3bc

Merge pull request #63 from opentensor/feature/track_raw_score

beab14a

Feature/track raw score

p-ferreira merged commit 0da7803 into staging Nov 9, 2023
4 checks passed
