Fix token attention kernel #474

Merged: 10 commits, Jul 25, 2024

flyinglandlord
Contributor

Problem description

The intermediate att_m_tensor passed between stage1 and stage2 of token attention should have dtype torch.float32, not torch.float16.
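
To illustrate why the dtype of an intermediate score buffer can matter, here is a minimal sketch using only the Python standard library. It is not the lightllm Triton kernel: it assumes that att_m_tensor holds attention scores that stage2 feeds into a softmax, and the helper `to_float16` and the score values are hypothetical. It emulates a float16 round trip with `struct` and shows that two distinct scores can collapse to the same value.

```python
import math
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical attention scores for three cached tokens (illustrative values).
scores = [20.01, 20.02, 20.0]

# Emulate storing the intermediate buffer in float16 between the two stages.
scores_fp16 = [to_float16(s) for s in scores]

probs_ref = softmax(scores)        # scores kept at higher precision
probs_fp16 = softmax(scores_fp16)  # scores rounded through float16

print("scores after float16 round trip:", scores_fp16)
print("softmax from original scores:   ", probs_ref)
print("softmax from float16 scores:    ", probs_fp16)
```

Near 20, adjacent float16 values are 2^-6 apart, so 20.01 and 20.02 round to the same number and the softmax can no longer tell those two tokens apart. With greedy decoding, a difference of this size could plausibly change which token is selected.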

Test case

data = {
        "inputs": """<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWhat is the correct answer to this question: trans-cinnamaldehyde was treated with methylmagnesium bromide, forming product 1.\n\n1 was treated with pyridinium chlorochromate, forming product 2.\n\n3 was treated with (dimethyl(oxo)-l6-sulfaneylidene)methane in DMSO at elevated temperature, forming product 3.\n\nhow many carbon atoms are there in product 3?\nChoices:\n(A)12\n(B)14\n(C)11\n(D)10\nFormat your response as follows: "The correct answer is (insert answer here)"<|im_end|>\n<|im_start|>assistant\n""",
        "parameters": {
            "temperature": 1,
            "max_new_tokens": 300,
            "stop_sequences": [
                "<|endofblock|>",
                "<|endofblock|><|im_end|>",
                "<|endoftext|>",
                "<|im_start|>"
            ],
            # "repetition_penalty": 1.05,
            "top_k": 1,
            "best_of": 1,
            "do_sample": True
        }
}

Output:

{'generated_text': ['The correct answer is (A)12\n'], 'count_output_tokens': 12, 'finish_reason': 'stop', 'prompt_tokens': 149}

yunqian and others added 10 commits June 10, 2024 05:47
_fix: style
Conflicts:
	lightllm/common/basemodel/layer_infer/template/transformer_layer_infer_cohere_template.py
	lightllm/models/cohere/layer_infer/post_layer_infer.py
	lightllm/models/cohere/layer_infer/transformer_layer_infer.py
	lightllm/models/cohere/model.py
	lightllm/models/cohere/triton_kernels/layernorm.py
	lightllm/server/router/model_infer/mode_backend/base_backend.py
@hiworldwzj hiworldwzj merged commit b5815a4 into ModelTC:main Jul 25, 2024
1 check passed