from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# Specify the options for fine-tuning
training_args = TrainingArguments(
    output_dir='./results',          # directory where results are saved
    num_train_epochs=3,              # number of training epochs
    per_device_train_batch_size=16,  # training batch size
    per_device_eval_batch_size=64,   # evaluation batch size
    warmup_steps=500,                # warmup steps for the learning rate scheduler
    weight_decay=0.01,               # weight decay strength
    logging_dir='./logs',            # directory where logs are saved
    logging_steps=200,               # log every 200 steps
)
# Specify the pretrained model
model_pretrained = 'distilbert-base-uncased'

# Download the model, set num_labels, and move it to the device
model = AutoModelForSequenceClassification.from_pretrained(
    model_pretrained, num_labels=2
).to(device)

# Create a Trainer and pass it the model and the train/test datasets
trainer = Trainer(
    model=model,               # the pretrained Hugging Face model loaded above
    args=training_args,        # the training arguments defined above
    train_dataset=train_data,  # training data
    eval_dataset=test_data,    # test data
)

# Start training with the Trainer
trainer.train()
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_layer_norm.weight', 'vocab_layer_norm.bias', 'vocab_transform.weight', 'vocab_projector.bias', 'vocab_projector.weight', 'vocab_transform.bias']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['pre_classifier.bias', 'classifier.bias', 'classifier.weight', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
***** Running training *****
Num examples = 20031
Num Epochs = 3
Instantaneous batch size per device = 16
Total train batch size (w. parallel, distributed & accumulation) = 32
Gradient Accumulation steps = 1
Total optimization steps = 1878
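The step count the Trainer reports follows directly from the numbers above: 20,031 examples with a per-device batch size of 16 on two devices gives an effective batch size of 32, so each epoch takes ceil(20031 / 32) = 626 optimization steps, and 3 epochs take 1,878. A quick sanity check of that arithmetic:

```python
import math

num_examples = 20031
effective_batch_size = 32  # 16 per device x 2 devices, as reported in the log
num_epochs = 3

steps_per_epoch = math.ceil(num_examples / effective_batch_size)  # 626
total_steps = steps_per_epoch * num_epochs                        # 1878, matching the log
```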
Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: clee166. Use `wandb login --relogin` to force relogin
wandb version 0.13.7 is available! To upgrade, please run:
$ pip install wandb --upgrade
Tracking run with wandb version 0.13.3
Run data is saved locally in /home/jupyter/07-pytorch/wandb/run-20230109_175133-3uq49qn3
Saving model checkpoint to ./results/checkpoint-500
Configuration saved in ./results/checkpoint-500/config.json
Model weights saved in ./results/checkpoint-500/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-1000
Configuration saved in ./results/checkpoint-1000/config.json
Model weights saved in ./results/checkpoint-1000/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-1500
Configuration saved in ./results/checkpoint-1500/config.json
Model weights saved in ./results/checkpoint-1500/pytorch_model.bin
Training completed. Do not forget to share your model on huggingface.co/models =)
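Since the code above does not pass a `compute_metrics` callback, `trainer.train()` only reports the loss. To also get accuracy from `trainer.evaluate()`, you can supply one; a minimal sketch (assuming a binary classifier, with the callback receiving an `EvalPrediction` that unpacks into logits and labels) might look like:

```python
import numpy as np

def compute_metrics(eval_pred):
    # eval_pred unpacks into (logits, labels)
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)        # predicted class per example
    accuracy = float((preds == labels).mean())  # fraction of correct predictions
    return {'accuracy': accuracy}
```

Passing `compute_metrics=compute_metrics` when constructing the `Trainer` makes `trainer.evaluate()` report this metric alongside the evaluation loss.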