DREsS

Dataset for Rubric-based Essay Scoring


This is the official website of DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing (Yoo et al., 2024).

DREsS is a large-scale, standard dataset for rubric-based automated essay scoring. DREsS comprises three sub-datasets: DREsS_New, DREsS_Std., and DREsS_CASE. We collect DREsS_New, a real-classroom dataset with 1.7K essays authored by EFL undergraduate students and scored by English education experts. We also standardize existing rubric-based essay scoring datasets as DREsS_Std. We generate 20K synthetic samples of DREsS_CASE using CASE (corruption-based augmentation strategy for essays).

Dataset

Each essay in DREsS is scored from 1 to 5, in increments of 0.5, on three rubrics: content, organization, and language.

| Criteria | Description |
| --- | --- |
| Content | Paragraph is well-developed and relevant to the argument, supported with strong reasons and examples. |
| Organization | The argument is very effectively structured and developed, making it easy for the reader to follow the ideas and understand how the writer is building the argument. Paragraphs use coherence devices effectively while focusing on a single main idea. |
| Language | The writing displays sophisticated control of a wide range of vocabulary and collocations. The essay follows grammar and usage rules throughout the paper. Spelling and punctuation are correct throughout the paper. |
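
The scoring scale described above (1 to 5 in steps of 0.5, on three rubrics) can be checked programmatically before training or evaluation. The following is a minimal sketch, not part of the DREsS release; the function names and the dictionary layout of a score record are assumptions made for illustration.

```python
# Hypothetical helper: names and record layout are illustrative assumptions.
RUBRICS = ("content", "organization", "language")


def is_valid_score(score: float) -> bool:
    """Return True if the score lies in [1, 5] and is a multiple of 0.5."""
    return 1.0 <= score <= 5.0 and float(score * 2).is_integer()


def validate_scores(scores: dict) -> None:
    """Raise ValueError if any rubric is missing or off the scoring scale."""
    for rubric in RUBRICS:
        if rubric not in scores:
            raise ValueError(f"missing rubric: {rubric}")
        if not is_valid_score(scores[rubric]):
            raise ValueError(f"invalid {rubric} score: {scores[rubric]}")


# Example record with placeholder values.
validate_scores({"content": 3.5, "organization": 4.0, "language": 2.5})
```

A check like this is mainly useful for catching parsing mistakes, such as a total score read into a rubric column.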

Data attributes

| Column | Type | Description |
| --- | --- | --- |
| id | Integer | A unique identifier of each essay sample |
| source | String | [Optional] The original source of the essay sample (only for DREsS_Std.) |
| prompt | String | An essay prompt |
| essay | String | A student-written essay |
| score | Float | A rubric-based score of the essay (content, organization, language, total) |
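
The attributes above map naturally onto a typed record. The sketch below assumes a tab-separated file with one column per score (`content`, `organization`, `language`, `total`); the actual file format and column names of the released dataset are not stated here, so treat them as assumptions. The sample row is a made-up placeholder, not data from DREsS.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EssaySample:
    id: int
    prompt: str
    essay: str
    content: float
    organization: float
    language: float
    total: float
    source: Optional[str] = None  # only present for DREsS_Std.


def parse_row(row: dict) -> EssaySample:
    """Convert one raw row of strings into a typed EssaySample."""
    return EssaySample(
        id=int(row["id"]),
        source=row.get("source") or None,  # empty string becomes None
        prompt=row["prompt"],
        essay=row["essay"],
        content=float(row["content"]),
        organization=float(row["organization"]),
        language=float(row["language"]),
        total=float(row["total"]),
    )


def load_samples(handle) -> List[EssaySample]:
    """Read all rows from an open tab-separated file (assumed layout)."""
    return [parse_row(row) for row in csv.DictReader(handle, delimiter="\t")]


# Placeholder row for illustration only.
SAMPLE_TSV = (
    "id\tsource\tprompt\tessay\tcontent\torganization\tlanguage\ttotal\n"
    "1\t\tDescribe your hometown.\tMy hometown is small.\t3.5\t3.0\t4.0\t10.5\n"
)
samples = load_samples(io.StringIO(SAMPLE_TSV))
```

For a real file, the same `load_samples` function could be called on `open(path, newline="", encoding="utf-8")`, adjusting the delimiter and column names to match the release.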

Data statistics

| Subdata | Source | Content | Organization | Language |
| --- | --- | --- | --- | --- |
| DREsS_New | - | 2,279 | 2,279 | 2,279 |
| DREsS_Std. | ASAP P7 | 1,569 | 1,569 | 1,569 |
| | ASAP P8 | 723 | 723 | 723 |
| | ASAP++ P1 | 1,785 | 1,785 | 1,785 |
| | ASAP++ P2 | 1,799 | 1,799 | 1,799 |
| | ICNALE EE | 639 | 639 | 639 |
| DREsS_CASE | - | 8,307 | 31,086 | 792 |
| Total | | 17,101 | 39,880 | 9,586 |
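
As a sanity check, the per-rubric totals can be recomputed from the individual rows. This short sketch hard-codes the counts from the statistics table; the ICNALE EE language count is taken as 639, the value consistent with the stated language total of 9,586.

```python
# Sample counts per (subdata, source): (content, organization, language).
COUNTS = {
    ("DREsS_New", "-"): (2279, 2279, 2279),
    ("DREsS_Std.", "ASAP P7"): (1569, 1569, 1569),
    ("DREsS_Std.", "ASAP P8"): (723, 723, 723),
    ("DREsS_Std.", "ASAP++ P1"): (1785, 1785, 1785),
    ("DREsS_Std.", "ASAP++ P2"): (1799, 1799, 1799),
    ("DREsS_Std.", "ICNALE EE"): (639, 639, 639),
    ("DREsS_CASE", "-"): (8307, 31086, 792),
}

# Sum each rubric column across all rows.
totals = tuple(sum(column) for column in zip(*COUNTS.values()))
print(totals)  # (17101, 39880, 9586)
```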

Download

Please submit the consent form. After we review it, we will send you the dataset link by email.

Citation

@article{yoo2024dress,
      title={DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing}, 
      author={Haneul Yoo and Jieun Han and So-Yeon Ahn and Alice Oh},
      journal={arXiv preprint arXiv:2402.16733},
      year={2024},
}