


Published as a long paper at EMNLP 2023.

This dataset contains data for the attribution task in the paper "Evaluating and Modeling Attribution for Cross-Lingual Question Answering". It is based on data from the XOR-TyDiQA dataset and model predictions from the CoRA system.


XOR-AttriQA data:

We include data annotations for both the in-language and in-English settings in the in-language and in-english folders respectively. Use the data from the in-language setting to compare with results in our work. The data from the in-English setting was used to compare inter-annotator agreement between the two settings and is included only as a reference.

Data Fields

  • query: Query from XOR-QA.
  • query_language: Language of the query.
  • answers: Answers to the query provided by XOR-QA.
  • prediction: Prediction from the CoRA system.
  • prediction_correct: Whether the prediction was determined to match the answers.
  • query_translated_en: Query translated to English.
  • answers_translated_en: Answers translated to English.
  • prediction_translated_en: Prediction translated to English.
  • passage_in_language: Passage in language (see passage_retrieved_language to determine if translated or not).
  • passage_en: Passage in English (see passage_retrieved_language to determine if translated or not).
  • passage_retrieved_language: Language of retrieved passage.
  • intrepetability_vote: Ratio of votes on whether raters could understand the prediction.
  • ais_vote: Ratio of votes on whether the prediction is attributable to the passage and query.
  • interpretability: True or False on whether raters could understand the prediction.
  • ais: True or False on whether the prediction is attributable to the passage and query.
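
As a reading aid, the field names above can be collected into a small checker that reports which fields a loaded record lacks. This is only a sketch: the names are copied from the list above (including its spelling of intrepetability_vote), while the helper missing_fields and the idea that each record is a Python dict are illustrative assumptions, not part of the dataset release.

```python
from typing import Any, Dict, List

# Field names copied verbatim from the "Data Fields" list above.
EXPECTED_FIELDS: List[str] = [
    "query",
    "query_language",
    "answers",
    "prediction",
    "prediction_correct",
    "query_translated_en",
    "answers_translated_en",
    "prediction_translated_en",
    "passage_in_language",
    "passage_en",
    "passage_retrieved_language",
    "intrepetability_vote",
    "ais_vote",
    "interpretability",
    "ais",
]


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the expected field names that are absent from a record.

    Hypothetical helper: assumes each example has been loaded as a dict.
    """
    return [name for name in EXPECTED_FIELDS if name not in record]
```

Calling missing_fields on each loaded record is a cheap way to catch a mismatch between the files and the field list before running any experiments.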


The data can be used for both in-language and cross-lingual attribution. In attribution, the task is to take a query, a passage, and an answer (the model prediction) and determine whether the passage indeed shows that the provided answer is correct for the provided query.

For in-language attribution: Use the query, prediction, and passage_in_language fields to predict True or False. The gold label is in the ais field.

For cross-language attribution: Use the query, prediction, and passage_en fields to predict True or False. The gold label is again in the ais field.
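
The two settings above differ only in which passage field is paired with the query and prediction. A minimal sketch follows, assuming each record is a dict keyed by the field names listed earlier and that ais is stored as a Python boolean. The names build_example and accuracy and the predict callable are hypothetical, and plain accuracy is used only as a simple illustration, not necessarily the metric reported in the paper.

```python
from typing import Any, Callable, Dict, List, Tuple

# (query, prediction, passage): the three inputs of the attribution task.
Example = Tuple[str, str, str]


def build_example(record: Dict[str, Any], setting: str) -> Example:
    """Pick the passage field that matches the requested setting."""
    if setting == "in-language":
        passage = record["passage_in_language"]
    elif setting == "cross-language":
        passage = record["passage_en"]
    else:
        raise ValueError(f"unknown setting: {setting!r}")
    return (record["query"], record["prediction"], passage)


def accuracy(
    records: List[Dict[str, Any]],
    predict: Callable[[str, str, str], bool],
    setting: str,
) -> float:
    """Fraction of records where predict() agrees with the gold ais label.

    Assumes record["ais"] is already a Python bool.
    """
    if not records:
        raise ValueError("records must not be empty")
    correct = sum(
        predict(*build_example(record, setting)) == record["ais"]
        for record in records
    )
    return correct / len(records)
```

Here predict stands in for any attribution model that maps a query, a prediction, and a passage to True or False.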

For the task used in our paper, we used a mix of in-language and cross-language attribution, depending on which passages were retrieved from the CORA system. To reproduce this setting, use the passage_retrieved_language field to determine whether to select the passage in passage_in_language or the passage in passage_en.
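
The mixed setting described above can be sketched as a single selection function. This assumes that passage_retrieved_language holds the code "en" when the passage was retrieved in English; that value is an assumption and should be checked against the data. The helper name select_passage is hypothetical.

```python
from typing import Any, Dict


def select_passage(record: Dict[str, Any], english_code: str = "en") -> str:
    """Return the passage in the language it was originally retrieved in.

    Assumption: English retrievals are marked with `english_code` in
    passage_retrieved_language; any other value is treated as in-language.
    """
    if record["passage_retrieved_language"] == english_code:
        return record["passage_en"]
    return record["passage_in_language"]
```

Because both passage fields are present in every record, this picks the untranslated passage and ignores its translated counterpart.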

Lastly, we provide a validation set composed of 100 examples from each language (500 in total) and a training set composed of 50 high-confidence examples from each language (250 in total). An example is high-confidence when all raters agreed on the ais label.
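
The high-confidence criterion can be expressed as a filter on ais_vote. The sketch below assumes ais_vote is the fraction of raters who judged the prediction attributable, so unanimous agreement corresponds to a value of 0 or 1. That reading of the field, and the helper name is_high_confidence, are assumptions, and this is not necessarily how the released training set was built.

```python
import math
from typing import Any, Dict


def is_high_confidence(record: Dict[str, Any]) -> bool:
    """True when every rater gave the same ais judgment.

    Assumption: ais_vote is a ratio in [0, 1] of positive votes, so
    unanimity means the ratio is 0 (all "no") or 1 (all "yes").
    """
    vote = float(record["ais_vote"])
    return (
        math.isclose(vote, 0.0, abs_tol=1e-9)
        or math.isclose(vote, 1.0, abs_tol=1e-9)
    )
```

Applying this filter to the per-language files gives the pool of unanimous examples under that assumption.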

Number of Data Points per Split

In-Language Setting:

  • bn: 1407
  • fi: 659
  • ja: 954
  • ru: 634
  • te: 1066
  • train: 250
  • validation: 500

In-English Setting:

  • bn: 567
  • fi: 812
  • ja: 1262
  • ru: 790
  • te: 443


Citation

Benjamin Muller, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, and Xinyi Wang. Evaluating and Modeling Attribution for Cross-Lingual Question Answering. arXiv preprint arXiv:2305.14332.
