Different unique pair number for SetFitTrainer.train and Trainer.hyperparameter_search with same args #545

Open
HexadimensionalerAlp opened this issue Jul 25, 2024 · 0 comments

Hi, I trained a model with SetFitTrainer and afterwards started a hyperparameter optimization with the same parameters for comparison. The expected behaviour would be that both runs have the same number of unique pairs and therefore take roughly the same time. In reality, however, the direct train approach had 64240 unique pairs, 4015 optimization steps and took 30 minutes per epoch, while the hyperparameter optimization had 2039350 unique pairs, 127460 optimization steps and was about to take 19 hours.
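For reference, a rough back-of-the-envelope check of the numbers from the direct training run. This assumes the usual SetFit pair generation of 2 * num_iterations pairs per training sample (20 positive and 20 negative); the sample count is hypothetical, back-calculated from the reported pair count:

num_iterations = 20
batch_size = 16
num_train_samples = 1606  # hypothetical, inferred from 64240 / (2 * 20)

pairs_direct = 2 * num_iterations * num_train_samples
steps_direct = pairs_direct // batch_size

print(pairs_direct)  # 64240, matches the direct trainer.train() run
print(steps_direct)  # 4015, matches the reported optimization steps

The hyperparameter search run clearly does not follow this formula, which is what this issue is about.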

Training task:

from setfit import SetFitModel, SetFitTrainer
from sentence_transformers.losses import CosineSimilarityLoss

model = SetFitModel.from_pretrained(
    'sentence-transformers/paraphrase-mpnet-base-v2',
    multi_target_strategy='multi-output'
)

trainer = SetFitTrainer(
    model=model,
    train_dataset=datasets['train'],
    eval_dataset=datasets['validation'],
    loss_class=CosineSimilarityLoss,
    batch_size=16,
    num_iterations=20,
    num_epochs=1
)

trainer.train()

Optimization task:

from typing import Any, Dict, Union

from optuna import Trial
from setfit import SetFitModel, Trainer


def model_init(params: Dict[str, Any]) -> SetFitModel:
    params = params or {}
    max_iter = params.get('max_iter', 100)
    solver = params.get('solver', 'liblinear')
    params = {
        'head_params': {
            'max_iter': max_iter,
            'solver': solver
        }
    }

    return SetFitModel.from_pretrained(
        'sentence-transformers/paraphrase-mpnet-base-v2',
        multi_target_strategy='multi-output',
        **params
    )


def hp_space(trial: Trial) -> Dict[str, Union[float, int, str]]:
    return {
        "body_learning_rate": trial.suggest_float("body_learning_rate", 1e-5, 1e-5, log=True),
        "num_epochs": trial.suggest_int("num_epochs", 1, 1),
        "batch_size": trial.suggest_categorical("batch_size", [16]),
        "seed": trial.suggest_int("seed", 42, 42),
        "max_iter": trial.suggest_int("max_iter", 20, 20),
        "solver": trial.suggest_categorical("solver", ["liblinear"]),
    }


trainer = Trainer(
    train_dataset=datasets['train'],
    eval_dataset=datasets['validation'],
    model_init=model_init
)

best_run = trainer.hyperparameter_search(direction="maximize", hp_space=hp_space, n_trials=1)

The data consists of datasets with the columns 'text' and 'label', where 'text' is a string and 'label' is a tensor of the following format: [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]. That should not be relevant for this issue, though.
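For illustration, a minimal sketch of what such a dataset could look like (the texts and label vectors here are made up):

from datasets import Dataset

# Hypothetical example rows; 'label' is an 11-dimensional one-hot float vector.
datasets = {
    'train': Dataset.from_dict({
        'text': ['first example sentence', 'second example sentence'],
        'label': [
            [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        ],
    }),
    'validation': Dataset.from_dict({
        'text': ['a validation sentence'],
        'label': [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    }),
}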

In my understanding, both of them should be comparable in the complexity of the training task, as the parameters used are the same. What is the explanation for this behaviour, and is there a way to recreate the setup of the direct training run in the hyperparameter optimization?

Thank you in advance!
