DREsS

Dataset for Rubric-based Essay Scoring

🎉 DREsS is accepted to ACL 2025!


This is the official website of DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing (Yoo et al., 2025).

DREsS is a large-scale, standard dataset for rubric-based automated essay scoring. It comprises three sub-datasets: DREsS_New, DREsS_Std., and DREsS_CASE.

Dataset

Essays in DREsS are scored from 1 to 5, in increments of 0.5, on three rubrics: content, organization, and language.

| Criteria | Description |
| --- | --- |
| Content | Paragraph is well-developed and relevant to the argument, supported with strong reasons and examples. |
| Organization | The argument is very effectively structured and developed, making it easy for the reader to follow the ideas and understand how the writer is building the argument. Paragraphs use coherence devices effectively while focusing on a single main idea. |
| Language | The writing displays sophisticated control of a wide range of vocabulary and collocations. The essay follows grammar and usage rules throughout the paper. Spelling and punctuation are correct throughout the paper. |

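
The score scale described above (1 to 5 in steps of 0.5, per rubric) can be checked with a few lines of Python. This is a minimal sketch rather than part of the official DREsS tooling; the names `RUBRICS`, `valid_scores`, and `is_valid_rubric_score` are hypothetical helpers introduced here for illustration.

```python
# Hypothetical helpers for illustration; not part of the DREsS release.
RUBRICS = ("content", "organization", "language")


def valid_scores() -> list[float]:
    """Enumerate every allowed rubric score: 1.0, 1.5, ..., 5.0."""
    return [half_steps / 2 for half_steps in range(2, 11)]


def is_valid_rubric_score(score: float) -> bool:
    """Return True if score lies in [1, 5] and is a multiple of 0.5."""
    if not 1.0 <= score <= 5.0:
        return False
    # Doubling a valid score always yields a whole number (2 through 10).
    return float(score * 2).is_integer()


print(valid_scores())
print(is_valid_rubric_score(3.5), is_valid_rubric_score(3.25))
```

A check like this is useful when merging model predictions or new annotations with the dataset, since it catches values that fall off the 0.5 grid.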

Data attributes

| Column | Type | Description |
| --- | --- | --- |
| id | Integer | A unique identifier of each essay sample |
| source | String | [Optional] The original source of the essay sample (DREsS_Std. only) |
| prompt | String | An essay prompt |
| essay | String | A student-written essay |
| score | Float | A rubric-based score of the essay (content, organization, language, total) |


Data statistics

| Subdata | Source | Content | Organization | Language |
| --- | --- | --- | --- | --- |
| DREsS_New | - | 2,279 | 2,279 | 2,279 |
| DREsS_Std. | ASAP P7 | 1,569 | 1,569 | 1,569 |
| | ASAP P8 | 723 | 723 | 723 |
| | ASAP++ P1 | 1,785 | 1,785 | 1,785 |
| | ASAP++ P2 | 1,799 | 1,799 | 1,799 |
| | ICNALE EE | 639 | 639 | 639 |
| DREsS_CASE | - | 8,307 | 31,086 | 792 |
| Total | | 17,101 | 39,880 | 9,586 |
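
The totals row can be re-derived from the per-source counts. The sketch below hard-codes the counts from the statistics table; the ICNALE EE language count is entered as 639, which is the value that makes the language column sum to the stated total of 9,586. The variable names are hypothetical.

```python
# Per-source sample counts as (content, organization, language).
COUNTS = {
    ("DREsS_New", "-"): (2279, 2279, 2279),
    ("DREsS_Std.", "ASAP P7"): (1569, 1569, 1569),
    ("DREsS_Std.", "ASAP P8"): (723, 723, 723),
    ("DREsS_Std.", "ASAP++ P1"): (1785, 1785, 1785),
    ("DREsS_Std.", "ASAP++ P2"): (1799, 1799, 1799),
    ("DREsS_Std.", "ICNALE EE"): (639, 639, 639),
    ("DREsS_CASE", "-"): (8307, 31086, 792),
}

# Transpose the rows into columns and sum each rubric column.
totals = tuple(sum(column) for column in zip(*COUNTS.values()))
print(totals)  # (17101, 39880, 9586)
```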

Download

To access DREsS, please submit the consent form. After submitting it, you will be redirected to the download link for the dataset.

Citation

@inproceedings{yoo-etal-2025-dress,
    title = "{DRE}s{S}: Dataset for Rubric-based Essay Scoring on {EFL} Writing",
    author = "Yoo, Haneul  and
      Han, Jieun  and
      Ahn, So-Yeon  and
      Oh, Alice",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-long.659/",
    doi = "10.18653/v1/2025.acl-long.659",
    pages = "13439--13454",
    ISBN = "979-8-89176-251-0",
}