Getting Started | SCROLLS Benchmark

Downloading the Data

from datasets import load_dataset

# The SCROLLS tasks, each a configuration of the tau/scrolls dataset
scrolls_datasets = ["gov_report", "summ_screen_fd", "qmsum",
                    "narrative_qa", "qasper", "quality", "contract_nli"]

# Load every task, one entry per name in scrolls_datasets
data = [load_dataset("tau/scrolls", dataset) for dataset in scrolls_datasets]

Making a Leaderboard Submission

Create a comma-separated values (CSV) file with the header row Task,ID,Prediction, in which each subsequent row holds one prediction.
For example:

Task,ID,Prediction
qasper,8941956c4b67e2436bbaf372a120f358f50c377b,"English, German, French"
qasper,5b63fb32633223fa4ee214979860349242a11451,"sentiment classifiers"
...
quality,72790_5QFDYSRE_4,"No, he worked in the spaceport on Mercury."
...
summ_screen_fd,fd_Gilmore_Girls_01x13,"Rory's charity rummage sale is a disaste..."
...

We recommend using our conversion script to produce the CSV file from JSON prediction files, which avoids formatting discrepancies.
Log in to the website (using your Google account is recommended).
Upload your CSV file via the submission page.
Check your email for a confirmation message that your submission has been received; it should arrive within a few minutes.
Results will be sent by email within 24 hours. Valid public submissions will immediately appear on the leaderboard.

Each user is limited to 5 submissions per week and a total of 10 submissions per month.
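As a rough illustration of the submission format described above, the sketch below writes predictions to a CSV with the Task, ID, Prediction header using Python's standard csv module. It is not the benchmark's conversion script, which remains the recommended route. The function name write_submission and the (task, id, prediction) tuple layout are assumptions made for this example; the two sample rows are copied from the example rows in this guide.

```python
import csv
import io


def write_submission(predictions, out_file):
    """Write (task, id, prediction) triples as a Task,ID,Prediction CSV.

    `predictions` is any iterable of 3-tuples (hypothetical layout);
    `out_file` is a writable text file object.
    """
    writer = csv.writer(out_file, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(["Task", "ID", "Prediction"])  # header row from the guide
    for task, example_id, prediction in predictions:
        writer.writerow([task, example_id, prediction])


# Sample rows taken from the format example in this guide.
rows = [
    ("qasper", "8941956c4b67e2436bbaf372a120f358f50c377b",
     "English, German, French"),
    ("quality", "72790_5QFDYSRE_4",
     "No, he worked in the spaceport on Mercury."),
]

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_submission(rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, open the target with open(path, "w", newline="", encoding="utf-8") as the csv module documentation advises, and pass that file object as out_file. With QUOTE_MINIMAL, only fields containing commas, quotes, or line breaks are quoted, so the output may quote fewer fields than the example rows do while still parsing to the same values.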