I'm currently attempting to summarize an article and classify its relevancy. This worked fine on outlines 0.0.36, but upgrading to outlines 0.0.43 produces a validation error that did not occur before.
I have tried:
Manually specifying the tokenizer to avoid any dictionary bugs
Reducing the complexity of the prompt to JUST summarization, in order to make a minimal example (I have more complicated use cases which worked in 0.0.36)
The model seems unable to generate valid JSON: an "Invalid control character at" error occurs during pydantic validation.
notes:
Running on Ubuntu 22.04 (kernel #20~22.04.1-Ubuntu), an AWS instance with an A10G GPU, CUDA 12.1
llama_cpp_python==0.2.77
outlines==0.0.43
Steps/code to reproduce the bug:
from outlines import models, generate
import llama_cpp
from pydantic import BaseModel

# 0.0.36
# model = models.llamacpp(
#     "./models/bartowski_Meta-Llama-3-8B-Instruct-Q8_0.gguf",
#     n_ctx=8000,
#     n_gpu_layers=-1,  # to use GPU acceleration
# )

# 0.0.43
model = models.llamacpp(
    "bartowski/Meta-Llama-3-8B-Instruct-GGUF",
    "Meta-Llama-3-8B-Instruct-Q8_0.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
        "meta-llama/Meta-Llama-3-8B-Instruct"
    ),
    n_ctx=8000,
    n_gpu_layers=-1,  # to use GPU acceleration
)
class User(BaseModel):
    name: str
    last_name: str
    id: int

class RelevantSummary(BaseModel):
    relevant_summary: str

generator = generate.json(model, RelevantSummary)
result = generator(
"""<|start_header_id|>system<|end_header_id|><|eot_id|>## OBJECTIVE1. Write a detailed summary related to Product Announcements.2. Output your answer in JSON<|eot_id|><|start_header_id|>user<|end_header_id|>## ARTICLEVeriSilicon’s 2nd generation automotive ISP series IP passed ISO 26262 ASIL B and ASIL D certificationsLas Vegas, USA, January 8, 2024--VeriSilicon (688521.SH) today announced its Image Signal Processor (ISP) IP ISP8200-FS and ISP8200L-FS, designed for high-performance automotive applications, have been certified compliant with the ISO 26262 automotive functional safety standard, achieving ASIL B certification for random failures and ASIL D certification for systematic failures, respectively. The certifications were granted by ResilTech, a leading safety consultancy company. Building upon the 1st generation of ISO 26262 certified ISP IP, the ISP8200-FS series is updated with advanced ISP technologies and several crucial enhancements for automotive applications after multiple automotive customers’ engagements on the 1st generation version.ISP8200-FS series automotive ISP IP delivers high pixel throughputs from 1.6Giga to 2Giga pixel per second under different process technologies, supports up to 8 real-time or 16 camera streams from DDR with low latency technology based on multi-camera scheduling mechanism, and supplements the raw pixel processing pipelines for efficient AI processing. In addition, ISP8200-FS has a built-in FLEXA AI interface to capture automotive related ROI objects from AI processor for pedestrians, vehicles, traffic lights and signs detecting and processing.Since its launch, multiple global major automotive SoC vendors have adopted ISP8200-FS series IP in their products for in-cabin ADAS, the next generation autonomous driving, and unified autonomous driving applications.“ISP plays a pivotal role in the realm of autonomous driving. 
To meet the rapidly evolving demands of this industry, VeriSilicon is dedicated to providing our automotive customers with cutting-edge capabilities through our functional safety certified IPs,” said Wei-Jin Dai, Executive VP and GM of IP Division of VeriSilicon. “With adoption by multiple customers worldwide, our certified ISP8200-FS and ISP8200L-FS are specifically designed to cater to both primary application processor and the sensor fusion SoC requirements, including image, radar, and LiDAR capabilities. Minimizing latency from sensing to action is crucial in automotive applications. VeriSilicon offers a comprehensive solution with its Glass-to-Glass intelligent pixel processing functional safety IPs.”To explore our rich IP portfolios, we invite you to visit VeriSilicon’s booth at the Venetian Expo (Booth No.: Bassano 2701 & Bassano 2702) during the Consumer Electronics Show (CES) 2024, taking place from January 9 to January 12 in Las Vegas.## SUMMARY<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""
, max_tokens=5000)
print(result)
Expected result:
### Results from outlines==0.0.36
relevant_summary="Verisilicon's second generation automotive ISP series IP has passed ISO26262 ASIL B and ASIL D certifications. The ISP8200-FS and ISP8200L-FS IPs are designed for high-performance automotive applications, achieving ASIL B certification for random failures and ASIL D certification for systematic failures respectively. They deliver high pixel throughputs, support multiple camera streams with low latency, and have a built-in FLEXA AI interface. Multiple major automotive SoC vendors have adopted these IPs in their products for in-cabin ADAS, autonomous driving, and unified autonomous driving applications."
Error message:
$ python3 test_outlines.py
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Compiling FSM index for all state transitions: 76%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍ | 25/33 [00:03<00:01, 7.24it/s]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 1143, in parse_raw
obj = parse.load_str_bytes(
File "/usr/local/lib/python3.10/dist-packages/pydantic/deprecated/parse.py", line 49, in load_str_bytes
return json_loads(b) # type: ignore
File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.10/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 22 (char 21)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/test_outlines.py", line 32, in <module>
result = generator(
File "/home/ubuntu/.local/lib/python3.10/site-packages/outlines/generate/api.py", line 511, in __call__
return format(completions)
File "/home/ubuntu/.local/lib/python3.10/site-packages/outlines/generate/api.py", line 497, in format
return self.format_sequence(sequences)
File "/home/ubuntu/.local/lib/python3.10/site-packages/outlines/generate/json.py", line 50, in <lambda>
generator.format_sequence = lambda x: schema_object.parse_raw(x)
File "/usr/local/lib/python3.10/dist-packages/pydantic/main.py", line 1170, in parse_raw
raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for RelevantSummary
__root__
Invalid control character at: line 1 column 22 (char 21) [type=value_error.jsondecode, input_value='{"relevant_summary":"\nV...ion SoC requirements."}', input_type=str]
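The failure itself is easy to reproduce outside of outlines: JSON forbids unescaped control characters inside strings, so a literal newline in the generated value makes the whole document invalid. A minimal sketch (the string value here is a hypothetical stand-in for the generated summary):

```python
import json

# A raw (unescaped) newline inside a JSON string is illegal JSON;
# a conforming document must contain the two characters "\n" instead.
s = '{"relevant_summary":"\nVeriSilicon announced..."}'
try:
    json.loads(s)
except json.JSONDecodeError as e:
    # "Invalid control character" at line 1 column 22 (char 21),
    # i.e. the newline right after the opening quote of the value
    print(e)
```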
### Context for the issue:
I would like to improve the performance of my summarization and classification pipeline with the newer Llama 3 GGUF models. The current setup on the older outlines 0.0.36 also has some number-formatting issues.
No other issue has reported problems with Llama 3 GGUFs, but all of the finetunes I have tried exhibit the same failure. Either I'm doing something wrong, or there is a significant Llama 3 GGUF issue that deserves discussion. Thank you!
It appears the tokenizer represents token 198 differently between tokenizer.vocabulary() and tokenizer.decode():
>>> tokenizer.decode([198])
['\n']
>>> [(k, v) for k, v in tokenizer.vocabulary().items() if v == 198][0][0]
'Ċ'
This isn't the case for other tokens:
>>> tokenizer.decode([10])
['+']
>>> [(k, v) for k, v in tokenizer.vocabulary().items() if v == 10][0][0]
'+'
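This mismatch is characteristic of GPT-2-style byte-level BPE, where every raw byte is remapped to a printable unicode character before being stored in the vocabulary ('Ġ' stands for a space, 'Ċ' for a newline). A self-contained sketch of that mapping, following the reference GPT-2 tokenizer:

```python
def bytes_to_unicode():
    # Byte-to-unicode table used by GPT-2-style byte-level BPE tokenizers:
    # printable bytes map to themselves, everything else is shifted up into
    # the 256+ code-point range so every token is a valid unicode string.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))

byte_encoder = bytes_to_unicode()
byte_decoder = {v: k for k, v in byte_encoder.items()}

print(byte_encoder[ord("\n")])  # 'Ċ' -- what the vocabulary stores
print(chr(byte_decoder["Ċ"]))   # '\n' -- what decode() actually emits
print(byte_encoder[ord("+")])   # '+' -- printable bytes map to themselves
```

This explains why '+' (token 10) round-trips cleanly while '\n' (token 198) does not: printable bytes are identical under the mapping, control bytes are not.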
Inconsistent Tokens
from transformers import AutoTokenizer
from outlines.models.transformers import TransformerTokenizer

tokenizer = TransformerTokenizer(
    AutoTokenizer.from_pretrained("failspy/Meta-Llama-3-8B-Instruct-abliterated-v3")
)

bad_tokens = []
for vocab_token_str, token_id in tokenizer.vocabulary.items():
    decoded_token_str = tokenizer.decode([token_id])[0]
    if decoded_token_str != vocab_token_str:
        bad_tokens.append((decoded_token_str, vocab_token_str))

if bad_tokens:
    bad_tok_output = '\n'.join(map(repr, bad_tokens))
    raise Exception(f"Found {len(bad_tokens)} bad tokens: {bad_tok_output}")
Found these inconsistent tokens:
E Exception: Found 78029 bad tokens: (' ROOM', 'ĠROOM')
E (' 않는', 'ĠìķĬëĬĶ')
E (' Overse', 'ĠOverse')
E (' slov', 'Ġslov')
E ('�', 'æ¦')
E (' Infragistics', 'ĠInfragistics')
E ('�', 'çĻ')
E (' DIFF', 'ĠDIFF')
E (' 武', 'ĠæѦ')
E (' eighth', 'Ġeighth')
...
I'm looking into whether we should be constructing a "true vocabulary" by decoding each token.
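One way that could look (a sketch of the idea only, not outlines' actual implementation; the tokenizer interface is assumed to match the snippets above, with a vocabulary() dict and a decode() that takes a list of ids):

```python
def true_vocabulary(tokenizer):
    # Re-key every vocabulary entry by the string its token id actually
    # decodes to, so FSM construction sees '\n' rather than the vocab
    # surrogate 'Ċ'. Hypothetical helper, not part of outlines.
    true_vocab = {}
    for vocab_token_str, token_id in tokenizer.vocabulary().items():
        decoded = tokenizer.decode([token_id])[0]
        true_vocab[decoded] = token_id
    return true_vocab
```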
Edit:
It appears we already have a method to normalize:
class TransformerTokenizer(Tokenizer):
...
def convert_token_to_string(self, token: str) -> str:
from transformers.file_utils import SPIECE_UNDERLINE
string = self.tokenizer.convert_tokens_to_string([token])
Investigating the reason this failed to prevent a \n during generation.
Outlines/Python version information:
ubuntu@ip-:~$ python3 -c "import sys; print('Python', sys.version)"
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
pip freeze
aiohttp==3.9.5
aiosignal==1.3.1
amqp==5.2.0
annotated-types==0.6.0
anyio==4.3.0
astroid==3.2.2
asttokens==2.4.1
async-timeout==4.0.3
attrs==23.2.0
Automat==22.10.0
awscli==1.32.108
Babel==2.15.0
backports.tarfile==1.1.1
bcrypt==3.2.0
billiard==4.2.0
black==24.4.2
blessed==1.20.0
blinker==1.8.2
boto3==1.34.108
botocore==1.34.108
build==1.2.1
celery==5.4.0
certifi==2024.2.2
cffi==1.16.0
chalice==1.31.0
chardet==4.0.0
charset-normalizer==3.3.2
click==8.1.7
click-didyoumean==0.3.1
click-plugins==1.1.1
click-repl==0.3.0
cloud-init==24.1.3
cloudpickle==3.0.0
cmake==3.29.3
colorama==0.4.6
command-not-found==0.3
configobj==5.0.6
constantly==23.10.4
cryptography==42.0.7
cssselect==1.2.0
dask==2024.5.1
datasets==2.19.1
dbus-python==1.2.18
decorator==5.1.1
defusedxml==0.7.1
devscripts===2.22.1ubuntu1
dill==0.3.8
diskcache==5.6.3
distlib==0.3.8
distro==1.7.0
distro-info==1.1+ubuntu0.2
dnspython==2.6.1
docutils==0.16
dparse==0.6.3
ec2-hibinit-agent==1.0.0
email_validator==2.1.1
exceptiongroup==1.2.1
executing==2.0.1
fastapi==0.111.0
fastapi-cli==0.0.4
filelock==3.14.0
Flask==3.0.3
frozenlist==1.4.1
fsspec==2024.3.1
gpg==1.16.0
greenlet==3.0.3
h11==0.14.0
hibagent==1.0.1
httpcore==1.0.5
httpie==3.2.2
httplib2==0.22.0
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.23.2
hyperlink==21.0.0
idna==3.7
importlib_metadata==7.1.0
incremental==22.10.0
iniconfig==2.0.0
inquirer==2.10.1
inquirerpy==0.3.4
interegular==0.3.3
ipython==8.24.0
isort==5.13.2
itemadapter==0.9.0
itemloaders==1.2.0
itsdangerous==2.2.0
jaraco.classes==3.4.0
jaraco.context==5.3.0
jaraco.functools==4.0.1
jedi==0.19.1
jeepney==0.8.0
Jinja2==3.1.4
jmespath==1.0.1
joblib==1.4.2
jsonpatch==1.32
jsonpointer==2.0
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
keyring==25.2.1
kombu==5.3.7
lark==1.1.9
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
llama_cpp_python==0.2.77
llvmlite==0.42.0
lm-format-enforcer==0.10.1
locket==1.0.0
lxml==5.2.2
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib-inline==0.1.7
mccabe==0.7.0
mdurl==0.1.2
more-itertools==10.2.0
mpmath==1.3.0
msgpack==1.0.8
multidict==6.0.5
multiprocess==0.70.16
mypy==1.10.0
mypy-extensions==1.0.0
nest-asyncio==1.6.0
netifaces==0.11.0
networkx==3.3
nh3==0.2.17
ninja==1.11.1.1
numba==0.59.1
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-ml-py==12.550.52
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.5.40
nvidia-nvtx-cu12==12.1.105
oauthlib==3.2.2
olefile==0.46
openai==1.31.1
orjson==3.10.3
outlines==0.0.43
packaging==21.3
pandas==2.2.2
parsel==1.9.1
parso==0.8.4
partd==1.4.2
pathspec==0.12.1
pbr==6.0.0
pexpect==4.9.0
pfzy==0.3.4
pillow==10.3.0
pip-tools==7.4.1
pipdeptree==2.20.0
pipenv==2023.12.1
pkginfo==1.10.0
platformdirs==4.2.2
pluggy==1.5.0
prometheus-fastapi-instrumentator==7.0.0
prometheus_client==0.20.0
prompt-toolkit==3.0.43
Protego==0.3.1
protobuf==5.27.0
psutil==5.9.8
psycopg2-binary==2.9.9
ptyprocess==0.7.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
pyairports==2.1.1
pyarrow==16.1.0
pyarrow-hotfix==0.6
pyasn1==0.6.0
pyasn1_modules==0.4.0
pycairo==1.26.0
pycountry==24.6.1
pycparser==2.22
pydantic==2.7.1
pydantic_core==2.18.2
PyDispatcher==2.0.7
Pygments==2.18.0
PyGObject==3.42.1
PyHamcrest==2.0.2
PyJWT==2.8.0
pylint==3.2.1
pyOpenSSL==24.1.0
pyparsing==3.1.2
pyproject_hooks==1.1.0
pyrsistent==0.18.1
pyserial==3.5
PySocks==1.7.1
pytest==8.2.1
python-apt==2.4.0+ubuntu3
python-dateutil==2.9.0.post0
python-debian==0.1.43+ubuntu1.1
python-dotenv==1.0.1
python-editor==1.0.4
python-magic==0.4.24
python-multipart==0.0.9
pytz==2022.1
pyxdg==0.27
PyYAML==6.0.1
queuelib==1.7.0
ray==2.23.0
readchar==4.1.0
readme_renderer==43.0
redis==5.0.4
referencing==0.35.1
regex==2024.5.15
requests==2.31.0
requests-file==2.0.0
requests-toolbelt==1.0.0
rfc3986==2.0.0
rich==13.7.1
roman==3.3
rpds-py==0.18.1
rsa==4.7.2
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
s3transfer==0.10.1
safetensors==0.4.3
scikit-learn==1.5.0
scipy==1.13.1
Scrapy==2.11.2
SecretStorage==3.3.3
sentence-transformers==3.0.0
sentencepiece==0.2.0
service-identity==24.1.0
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
sos==4.5.6
SQLAlchemy==2.0.30
ssh-import-id==5.11
stack-data==0.6.3
starlette==0.37.2
sympy==1.12.1
systemd-python==234
testresources==2.0.1
threadpoolctl==3.5.0
tiktoken==0.7.0
tldextract==5.1.2
tokenizers==0.19.1
tomli==2.0.1
tomlkit==0.12.5
toolz==0.12.1
torch==2.3.0
tqdm==4.66.4
traitlets==5.14.3
transformers==4.41.2
triton==2.3.0
twine==5.1.0
Twisted==24.3.0
typer==0.12.3
typing_extensions==4.11.0
tzdata==2024.1
ubuntu-pro-client==8001
ufw==0.36.1
ujson==5.10.0
unattended-upgrades==0.1
unidiff==0.5.5
urllib3==2.2.1
uvicorn==0.29.0
uvloop==0.19.0
vine==5.1.0
virtualenv==20.26.2
vllm-flash-attn==2.5.8.post2
w3lib==2.1.2
wadllib==1.3.6
watchfiles==0.21.0
wcwidth==0.2.13
websockets==12.0
Werkzeug==3.0.3
xdg==5
xformers==0.0.26.post1
xxhash==3.4.1
yapf==0.40.2
yarl==1.9.4
zipp==3.18.2
zope.interface==6.4