```
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[2024-05-29 07:52:10,639] [INFO] [datasets.<module>:58] [PID:33] PyTorch version 2.1.2+cu118 available.
[2024-05-29 07:52:11,408] [WARNING] [real_accelerator.py:162:get_accelerator] Setting accelerator to CPU. If you have GPU or other accelerator, we were unable to detect it.
[2024-05-29 07:52:11,408] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cpu (auto detect)
df: /root/.triton/autotune: No such file or directory

(axolotl ASCII-art banner)

****************************************
**** Axolotl Dependency Versions *****
  accelerate: 0.30.1
        peft: 0.11.1
transformers: 4.41.1
         trl: 0.8.6
       torch: 2.1.2+cu118
bitsandbytes: 0.43.1
****************************************
[2024-05-29 07:52:12,381] [WARNING] [axolotl.utils.config.models.input.hint_sample_packing_padding:747] [PID:33] [RANK:0] `pad_to_sequence_len: true` is recommended when using sample_packing
/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
[2024-05-29 07:52:12,607] [INFO] [axolotl.normalize_config:182] [PID:33] [RANK:0] GPU memory usage baseline: 0.000GB ()
[2024-05-29 07:52:12,930] [WARNING] [axolotl.cli.preprocess.do_cli:66] [PID:33] [RANK:0] preprocess CLI called without dataset_prepared_path set, using default path: last_run_prepared
You are using the default legacy behaviour of the tokenizer. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
[2024-05-29 07:52:13,343] [DEBUG] [axolotl.load_tokenizer:280] [PID:33] [RANK:0] EOS: 2 / </s>
[2024-05-29 07:52:13,343] [DEBUG] [axolotl.load_tokenizer:281] [PID:33] [RANK:0] BOS: 1 / <s>
[2024-05-29 07:52:13,343] [DEBUG] [axolotl.load_tokenizer:282] [PID:33] [RANK:0] PAD: 2 / </s>
[2024-05-29 07:52:13,343] [DEBUG] [axolotl.load_tokenizer:283] [PID:33] [RANK:0] UNK: 0 / <unk>
[2024-05-29 07:52:13,343] [INFO] [axolotl.load_tokenizer:294] [PID:33] [RANK:0] No Chat template selected. Consider adding a chat template for easier inference.
[2024-05-29 07:52:13,344] [INFO] [axolotl.load_tokenized_prepared_datasets:183] [PID:33] [RANK:0] Unable to find prepared dataset in last_run_prepared/8cc35674c453a287d7de953d7084a596
[2024-05-29 07:52:13,344] [INFO] [axolotl.load_tokenized_prepared_datasets:184] [PID:33] [RANK:0] Loading raw datasets...
[2024-05-29 07:52:13,344] [INFO] [axolotl.load_tokenized_prepared_datasets:193] [PID:33] [RANK:0] No seed provided, using default seed of 42
Repo card metadata block was not found. Setting CardData to empty.
[2024-05-29 07:52:14,186] [WARNING] [huggingface_hub.repocard.content:107] [PID:33] Repo card metadata block was not found. Setting CardData to empty.
Repo card metadata block was not found. Setting CardData to empty.
[2024-05-29 07:52:17,188] [WARNING] [huggingface_hub.repocard.content:107] [PID:33] Repo card metadata block was not found. Setting CardData to empty.
Tokenizing Prompts (num_proc=48):  26%|██▌       | 14203/54568 [00:00<00:01, 24731.47 examples/s]
[2024-05-29 07:52:19,618] [WARNING] [axolotl._tokenize:66] [PID:186] [RANK:0] Empty text requested for tokenization.
Tokenizing Prompts (num_proc=48): 100%|██████████| 54568/54568 [00:02<00:00, 21293.98 examples/s]
[2024-05-29 07:52:21,478] [INFO] [axolotl.load_tokenized_prepared_datasets:410] [PID:33] [RANK:0] merging datasets
[2024-05-29 07:52:21,520] [DEBUG] [axolotl.process_datasets_for_packing:188] [PID:33] [RANK:0] min_input_len: 40
[2024-05-29 07:52:21,554] [DEBUG] [axolotl.process_datasets_for_packing:190] [PID:33] [RANK:0] max_input_len: 810
Dropping Long Sequences (num_proc=48): 100%|██████████| 54568/54568 [00:00<00:00, 76185.19 examples/s]
Add position_id column (Sample Packing) (num_proc=48): 100%|██████████| 54568/54568 [00:00<00:00, 74018.34 examples/s]
[2024-05-29 07:52:24,248] [INFO] [axolotl.load_tokenized_prepared_datasets:423] [PID:33] [RANK:0] Saving merged prepared dataset to disk... last_run_prepared/8cc35674c453a287d7de953d7084a596
Saving the dataset (1/1 shards): 100%|██████████| 54568/54568 [00:00<00:00, 207263.24 examples/s]
[2024-05-29 07:52:24,539] [DEBUG] [axolotl.calculate_total_num_steps:299] [PID:33] [RANK:0] total_num_tokens: 182_913
[2024-05-29 07:52:24,548] [DEBUG] [axolotl.calculate_total_num_steps:312] [PID:33] [RANK:0] `total_supervised_tokens: 38_104`
[2024-05-29 07:52:27,786] [DEBUG] [axolotl.calculate_total_num_steps:364] [PID:33] [RANK:0] data_loader_len: 90
[2024-05-29 07:52:27,786] [INFO] [axolotl.calc_sample_packing_eff_est:370] [PID:33] [RANK:0] sample_packing_eff_est across ranks: [0.9923665364583333]
[2024-05-29 07:52:27,786] [DEBUG] [axolotl.calculate_total_num_steps:382] [PID:33] [RANK:0] sample_packing_eff_est: None
[2024-05-29 07:52:27,787] [DEBUG] [axolotl.calculate_total_num_steps:390] [PID:33] [RANK:0] total_num_steps: 360
[2024-05-29 07:52:27,852] [DEBUG] [axolotl.calculate_total_num_steps:299] [PID:33] [RANK:0] total_num_tokens: 10_466_111
[2024-05-29 07:52:28,321] [DEBUG] [axolotl.calculate_total_num_steps:312] [PID:33] [RANK:0] `total_supervised_tokens: 6_735_490`
[2024-05-29 07:52:31,909] [DEBUG] [axolotl.calculate_total_num_steps:364] [PID:33] [RANK:0] data_loader_len: 5124
[2024-05-29 07:52:31,909] [INFO] [axolotl.calc_sample_packing_eff_est:370] [PID:33] [RANK:0] sample_packing_eff_est across ranks: [0.997346948032543]
[2024-05-29 07:52:31,909] [DEBUG] [axolotl.calculate_total_num_steps:382] [PID:33] [RANK:0] sample_packing_eff_est: 1.0
[2024-05-29 07:52:31,909] [DEBUG] [axolotl.calculate_total_num_steps:390] [PID:33] [RANK:0] total_num_steps: 20496
[2024-05-29 07:52:31,921] [INFO] [axolotl.cli.preprocess.do_cli:74] [PID:33] [RANK:0] Success!
```
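Several of the warnings in this output map directly to Axolotl config keys. Below is a minimal, illustrative YAML sketch of those keys; the values are assumptions chosen for the example (in particular the `chatml` template), not settings taken from the config that produced this run:

```yaml
# Illustrative excerpt only; merge into the config passed to the preprocess CLI.
sample_packing: true        # this run packs samples (see the "Sample Packing" step above)
pad_to_sequence_len: true   # recommended by the warning whenever sample_packing is enabled
chat_template: chatml       # example value; addresses the "No Chat template selected" notice
seed: 42                    # make the default seed reported in the log explicit
```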
Preprocessed data path: `last_run_prepared` (the default `dataset_prepared_path`, since none was set in the config).
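To make that location explicit rather than relying on the default, `dataset_prepared_path` can be set in the config. A minimal sketch, using the same default directory reported in the log:

```yaml
# Avoids the "preprocess CLI called without dataset_prepared_path set" warning.
dataset_prepared_path: last_run_prepared
```

The subdirectory name (`8cc35674c453a287d7de953d7084a596` in this run) appears to be derived from a hash of the dataset configuration, so later runs with an unchanged dataset section should find the prepared dataset on disk and skip re-tokenization.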