Why is target_label 0 for the entire training dataset? #139

hecongqing opened this issue Jul 6, 2024 · 1 comment

@hecongqing

import logging

import torch
import torch.nn as nn
from transformers import AutoModelForSequenceClassification, PreTrainedModel

logger = logging.getLogger(__name__)


class RerankerModel(nn.Module):
    TRANSFORMER_CLS = AutoModelForSequenceClassification

    def __init__(self, hf_model: PreTrainedModel, train_batch_size: int = None):
        super().__init__()
        self.config = hf_model.config
        self.hf_model = hf_model
        self.train_batch_size = train_batch_size
        self.cross_entropy = nn.CrossEntropyLoss(reduction='mean')
        if train_batch_size:
            # target_label is a buffer of zeros, one entry per query in the batch,
            # used as the cross-entropy target over each query's group of documents
            self.register_buffer(
                'target_label',
                torch.zeros(self.train_batch_size, dtype=torch.long, device=self.hf_model.device)
            )
        for name, param in self.hf_model.named_parameters():
            # for some reason, ds zero 3 left some weights empty
            if 'modules_to_save' in name and param.numel() == 0:
                logger.warning(f'parameter {name}, shape {param.shape} is empty')
                param.data = nn.Linear(self.hf_model.config.hidden_size, 1).weight.data
                logger.warning('{} data: {}'.format(name, param.data.cpu().numpy()))

@MXueguang (Contributor)

Sorry for the late reply.

We do this because we place the positive document at index 0 of each query's group; the documents that follow are negatives, so a target label of 0 always points at the positive.
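As an illustration (a minimal sketch, not the repository's exact training loop; the group_size value and the reshape below are assumptions): if each query's document scores are grouped with the positive first, an all-zeros target makes cross-entropy treat index 0, the positive, as the correct class.

import torch
import torch.nn as nn

# Assume each query contributes `group_size` documents, scored one at a time by
# the reranker, with the positive document always placed first in its group.
batch_queries, group_size = 4, 8
logits = torch.randn(batch_queries * group_size, 1)           # one score per (query, doc) pair
scores = logits.view(batch_queries, group_size)               # one row of scores per query
target_label = torch.zeros(batch_queries, dtype=torch.long)   # index 0 == the positive doc in every row
loss = nn.CrossEntropyLoss(reduction='mean')(scores, target_label)
print(loss)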
