Segmentation Fault When Running Example in README #4

Open
sidjha1 opened this issue Feb 3, 2024 · 2 comments

sidjha1 commented Feb 3, 2024

Hello, I'm interested in reproducing some of the results and eventually testing it with my own models. I was following the README and was able to run the following command without any issues.

python run_irera.py \
    --dataset_name esco_tech \
    --state_path ./results_precompiled/esco_tech_infer-retrieve-rank_00/program_state.json \
    --lm_config_path ./lm_config.json \
    --do_validation \
    --do_test 

However, when I run the command below (copied from the README), I get a segmentation fault.

(xmc) sidjha@guestrin-hgx-1:~/xmc.dspy$ python compile_irera.py \
>     --dataset_name esco_tech \
>     --ontology_name esco \
>     --prior_path ./data/esco/esco_priors.json \
>     --ontology_path ./data/esco/skills_en_label.txt \
>     --infer_signature_name infer_esco \
>     --rank_signature_name rank_esco \
>     --retriever_model_name sentence-transformers/all-mpnet-base-v2 \
>     --infer_student_model_name llama-2-13b-chat \
>     --infer_teacher_model_name gpt-3.5-turbo-instruct \
>     --rank_student_model_name gpt-4-1106-preview \
>     --rank_teacher_model_name gpt-4-1106-preview \
>     --infer_compile_metric_name rp10 \
>     --rank_compile_metric_name rp10 \
>     --prior_A 0 \
>     --rank_topk 50 \
>     --do_validation \
>     --do_test \
>     --optimizer_name left-to-right \
>     --lm_config_path ./lm_config.json 
./local_cache/compiler
dataset_name:  esco_tech
retriever_model_name:  sentence-transformers/all-mpnet-base-v2
infer_signature_name:  infer_esco
infer_student_model_name:  llama-2-13b-chat
infer_teacher_model_name:  gpt-3.5-turbo-instruct
rank_signature_name:  rank_esco
rank_student_model_name:  gpt-4-1106-preview
rank_teacher_model_name:  gpt-4-1106-preview
infer_compile:  True
infer_compile_metric_name:  rp10
rank_skip:  False
rank_compile:  True
rank_compile_metric_name:  rp10
prior_A:  0
rank_topk:  50
do_validation:  True
do_test:  True
prior_path:  ./data/esco/esco_priors.json
ontology_path:  ./data/esco/skills_en_label.txt
ontology_name:  esco
optimizer_name:  left-to-right
Dataset: esco_tech
# esco_tech: Total Validation size: 75
# esco_tech: Total Test size: 338
esco_tech: avg # ontology items per input (for validation set): 1.75
esco_tech: Q25, Q50, Q75, Q95 # ontology items per input (for validation set): 0.25    1.0
0.50    1.0
0.75    2.0
0.95    3.0
Name: label, dtype: float64
esco_tech: # Used Train size: 10
esco_tech: # Used Validation size: 65
esco_tech: # Used Test size: 338
/lfs/guestrin-hgx-1/0/sidjha/miniconda3/envs/xmc/lib/python3.10/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Going to sample between 1 and 2 traces per predictor.
Will attempt to train 10 candidate sets.
-3 range(0, 20)
-2 range(0, 20)
-1 range(0, 20)
 30%|█████████████████████████████▋                                                                     | 3/10 [00:01<00:02,  2.40it/s]
Bootstrapped 2 full traces after 4 examples in round 0.
Segmentation fault (core dumped)

I also get segmentation faults when running bash scripts/compile_left_to_right.sh.

KarelDO (Owner) commented Feb 7, 2024

I've not encountered this error. Do you know which part of the code throws this?
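One way to find out which part of the code is running when the crash happens is Python's standard-library faulthandler module, which dumps the Python traceback of every thread when the process receives a fatal signal such as SIGSEGV. The sketch below is only an illustration: the helper name enable_crash_traceback and the log file name are hypothetical, and it assumes the call is placed at the very top of compile_irera.py before any heavy imports.

```python
import faulthandler


def enable_crash_traceback(log_path: str = "segfault_traceback.log"):
    """Dump Python tracebacks for all threads to log_path on a fatal signal.

    Hypothetical helper for illustration; it is not part of the repository.
    The returned file object must stay open for as long as the handler is
    enabled, because faulthandler writes to its file descriptor at crash time.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Assumed usage at the top of the script that crashes:
#   crash_log = enable_crash_traceback()
#   ... rest of the script ...
```

The same handler can be enabled without editing any code by starting the interpreter with python -X faulthandler, followed by the script name and its usual arguments, in which case the traceback goes to stderr.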

tma15 commented Apr 17, 2024

I ran into the same segmentation fault; in my case it occurred in Evaluate._execute_multi_thread in dspy. I don't know the root cause, but it can be avoided by making num_threads in LeftToRightOptimizer default to 1:

class LeftToRightOptimizer:
    def __init__(
        self,
        modules_to_lms: dict[str, tuple],
        infer_compile: bool,
        infer_compile_metric_name: str,
        rank_compile: bool,
        rank_compile_metric_name: str,
    ):
        # TODO: add an optimization config
        self.modules_to_lms = modules_to_lms

        self.infer_compile = infer_compile
        self.infer_compile_metric = supported_metrics[infer_compile_metric_name]

        self.rank_compile = rank_compile
        self.rank_compile_metric = supported_metrics[rank_compile_metric_name]

        # compilation hyperparameters
        self.max_bootstrapped_demos = 2
        self.max_labeled_demos = 0
        self.max_rounds = 1
        self.num_candidate_programs = 10
        
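        # Default to a single thread to avoid the segfault; override via DSP_NUM_THREADS.
        # os.environ.get returns a string when the variable is set, so cast to int
        # (this requires `import os` at the top of the module).
        # self.num_threads = 8
        self.num_threads = int(os.environ.get('DSP_NUM_THREADS', 1))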
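If the thread count is going to come from the environment, a slightly more defensive reader may be worth having, since environment variables are always strings and a typo would otherwise surface much later inside the evaluation code. This is a minimal sketch, not the repository's implementation: the function name resolve_num_threads is hypothetical, and it assumes only that DSP_NUM_THREADS should hold a positive integer.

```python
import os


def resolve_num_threads(env_var: str = "DSP_NUM_THREADS", default: int = 1) -> int:
    """Return the thread count from the environment, or `default` if unset.

    Hypothetical helper for illustration. Raises ValueError when the
    variable is set to something other than a positive integer.
    """
    raw = os.environ.get(env_var)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{env_var} must be an integer, got {raw!r}") from None
    if value < 1:
        raise ValueError(f"{env_var} must be at least 1, got {value}")
    return value
```

With a helper like this, the constructor line could read self.num_threads = resolve_num_threads(), so the single-threaded workaround stays the default while a larger value can still be set explicitly, for example to check whether the crash reproduces.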
