
# Performance of Different Prompt Tuning Methods

We report the performance of each method on widely used datasets. Note that we do not attempt to match the exact scores of the referenced papers when they use additional tricks such as data augmentation or prompt ensembling.

## Table Head Explanations

- **Prompt**: the configuration of the template.
- **LM**: the pre-trained language model used.
- **Ref**: the specific YAML file or tutorial script used to achieve the results.
- **Comment**: other noteworthy aspects of the experiment.


## Few-NERD

For dataset details, see https://arxiv.org/abs/2105.07464. "N-S" means N-shot.

| Prompt | LM | Ref | Comment | Acc (8-S) | MiF (8-S) |
| --- | --- | --- | --- | --- | --- |
| ManualT+ManualV | bert-base-cased | yaml | | 55.30 | 67.88 |
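
The ManualT+ManualV row corresponds to a plain manual-template plus manual-verbalizer pipeline. Below is a minimal sketch of that setup, assuming OpenPrompt's standard classification API; the template text and the label words are illustrative placeholders, not the exact configuration from the referenced yaml file:

```python
from openprompt.plms import load_plm
from openprompt.prompts import ManualTemplate, ManualVerbalizer
from openprompt import PromptForClassification

# Load the backbone used in the table above.
plm, tokenizer, model_config, WrapperClass = load_plm("bert", "bert-base-cased")

# Hypothetical template text; the real one lives in the referenced yaml.
template = ManualTemplate(
    tokenizer=tokenizer,
    text='{"placeholder":"text_a"} {"placeholder":"text_b"} is a {"mask"} entity.',
)

# Truncated to three classes for illustration; Few-NERD has 66 fine-grained types.
verbalizer = ManualVerbalizer(
    tokenizer=tokenizer,
    num_classes=3,
    label_words=[["person"], ["location"], ["organization"]],
)

model = PromptForClassification(plm=plm, template=template, verbalizer=verbalizer)
```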

## webnlg_2017

The evaluation scripts are from https://github.com/Yale-LILY/dart.

| Prompt | LM | Ref | Comment | BLEU-SEEN | BLEU-UNSEEN | BLEU-ALL |
| --- | --- | --- | --- | --- | --- | --- |
| Prf | t5-base, fix | tutorial2.2 | plm-dropout-off | 62.88 | 47.05 | 55.79 |
| Prf | t5-base, fix | tutorial2.2 | plm-dropout-on | 61.94 | 52.02 | 57.41 |
| Prf | gpt2-medium, fix | tutorial2.2 | plm-dropout-off | 62.97 | 43.43 | 54.21 |
| Prf | gpt2-medium, fix | tutorial2.2 | plm-dropout-on | 60.21 | 45.67 | 53.66 |
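
The Prf rows use prefix tuning with the PLM frozen ("fix"), and the plm-dropout-on/off comment refers to whether the frozen PLM keeps dropout active during training or is switched to eval mode. A minimal sketch, assuming the PrefixTuningTemplate and PromptForGeneration API used in the tutorial scripts; the template text is illustrative:

```python
from openprompt.plms import load_plm
from openprompt.prompts import PrefixTuningTemplate
from openprompt import PromptForGeneration

plm, tokenizer, model_config, WrapperClass = load_plm("t5", "t5-base")

# Prefix tuning: a small set of trainable prefix vectors; the PLM stays fixed.
template = PrefixTuningTemplate(
    model=plm,
    tokenizer=tokenizer,
    text='{"placeholder":"text_a"} {"mask"}',
)

# freeze_plm=True    -> "fix" in the table: only the prefix parameters train.
# plm_eval_mode=True -> "plm-dropout-off": the frozen PLM runs in eval mode,
# so its dropout layers are disabled even during training.
model = PromptForGeneration(
    plm=plm,
    template=template,
    freeze_plm=True,
    plm_eval_mode=True,
)
```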

## SuperGLUE

### All results

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial* | Generation Objective | 0.74 |

\* The following command reproduces all results:

```bash
python tutorial/4.1_all_tasks_are_generation.py --model t5-lm --plm_eval_mode --dataset $datasetname --template_id 0 --verbalizer_id 0 --seed 100 --prompt_lr 0.3 --optimizer Adafactor --warmup_step_prompt 0 --max_steps 20000 --eval_every_steps 500
```
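
In code, the Soft rows correspond roughly to a SoftTemplate on a frozen T5 backbone trained through the generation objective. A minimal sketch, assuming the SoftTemplate API used in the 4.1 tutorial; the soft-prompt length and the Hugging Face checkpoint name are assumptions:

```python
from openprompt.plms import load_plm
from openprompt.prompts import SoftTemplate
from openprompt import PromptForGeneration

# "t5-lg-lm-ad" in the tables refers to the T5-large LM-adapted checkpoint;
# the exact checkpoint name below is an assumption.
plm, tokenizer, model_config, WrapperClass = load_plm(
    "t5-lm", "google/t5-large-lm-adapt"
)

# Soft prompt: trainable embeddings prepended to the input sequence.
template = SoftTemplate(
    model=plm,
    tokenizer=tokenizer,
    text='{"placeholder":"text_a"} {"mask"}',
    num_tokens=100,  # assumed soft-prompt length
)

# --plm_eval_mode in the command above maps to plm_eval_mode=True here.
model = PromptForGeneration(
    plm=plm, template=template, freeze_plm=True, plm_eval_mode=True
)
```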

### Boolq

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | manual_0 | tutorial | Classification Objective | 0.833 |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial | Generation Objective | 0.825 |

### MultiRC

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | manual_0 | tutorial | Classification Objective | 0.812 |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial | Generation Objective | 0.797 |

### WiC

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | manual_0 | tutorial | Classification Objective | 0.701 |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial | Generation Objective | 0.650 |

### CB

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial | Generation Objective | 0.75 |

### RTE

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | manual_0 | tutorial | Classification Objective | 0.820 |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial | Generation Objective | 0.794 |

### WSC

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | gen_0* | tutorial | Generation Objective | 0.625 |

\* The verbalizer `[{"text": "Another word"}, {"meta": "span1_text"}]` might not be optimal; it is just meant to show a use case of the generation verbalizer.
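
A minimal sketch of how such a verbalizer is constructed, assuming the GenerationVerbalizer API from the tutorial scripts; here each class's target string is built from a rule over the example's fields rather than from a fixed label word:

```python
from openprompt.plms import load_plm
from openprompt.prompts import GenerationVerbalizer

plm, tokenizer, model_config, WrapperClass = load_plm(
    "t5-lm", "google/t5-large-lm-adapt"
)

# For WSC, class 0 produces a fixed string and class 1 copies the
# "span1_text" field from the example's meta dict.
label_words = [
    {"text": "Another word"},   # class 0: literal target text
    {"meta": "span1_text"},     # class 1: filled from example metadata
]

# is_rule=True: interpret each entry as a fill-in rule, not a fixed label word.
verbalizer = GenerationVerbalizer(
    tokenizer, classes=None, is_rule=True, label_words=label_words
)
```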

### COPA

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | gen_0* | tutorial | Generation Objective | 0.72 |

\* The verbalizer `[{"meta": "choice1"}, {"meta": "choice2"}]` differs from the verbalizer used in T5, `["True", "False"]`. Surprisingly, recovering the whole choice1/choice2 sentence is very easy for the LM and yields a much better result (0.72 vs. 0.60).
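
The construction is the same as in the WSC sketch above; only the label words differ. The two alternatives compared in the note are, roughly:

```python
# Generation-style label words: the target is the full choice sentence,
# copied from the example's metadata (0.72 in the table above).
copa_label_words = [{"meta": "choice1"}, {"meta": "choice2"}]

# T5-style label words: the target is a literal "True"/"False" string (0.60).
t5_style_label_words = ["True", "False"]

# Built as in the WSC sketch:
# GenerationVerbalizer(tokenizer, classes=None, is_rule=True,
#                      label_words=copa_label_words)
```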

### RECORD

| Prompt | LM | Template | Verbalizer | Ref | Comment | Validation Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Soft | t5-lg-lm-ad | manual_0 | gen_0 | tutorial | Generation Objective | 0.770 |
