Date | Model | Contributors | #Params | Input Length | Score (Average) | GovReport (R1/R2/RL) | SummScreenFD (R1/R2/RL) | QMSum (R1/R2/RL) | Qasper (F1) | NarrativeQA (F1) | QuALITY (EM-T/EM-H) | ContractNLI (EM) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
02/28/2023 | CoLT5 XL | Google Research | 5.3B | 16K | 43.51 | 61.3/32.2/33.8 | 36.4/10.1/21.7 | 36.2/12.9/24.2 | 53.9 | 31.1 | 48.1/43.8 | 88.4 |
03/07/2023 | LongT5 XL | LongT5 | 3B | 16K | 42.53 | 61.1/32.3/33.7 | 35.8/9.6/21.1 | 34.9/11.8/23.5 | 53.1 | 29.3 | 46.0/42.1 | 88.2 |
02/28/2023 | CoLT5 Large | Google Research | 1.46B | 16K | 41.04 | 60.7/31.3/32.9 | 36.7/10.6/22.0 | 34.9/11.5/23.1 | 49.8 | 27.7 | 39.9/36.8 | 88.7 |
03/07/2023 | LongT5 Large | LongT5 | 770M | 16K | 41.03 | 60.3/31.1/32.8 | 35.6/9.2/21.2 | 35.1/12.0/23.3 | 52.3 | 27.2 | 40.6/38.6 | 87.3 |
08/23/2022 | BART-LS | Meta AI | 460M | 16K | 39.76 | 59.4/29.8/30.8 | 37.7/10.2/21.5 | 35.1/11.0/22.0 | 48.7 | 26.2 | 37.8/34.0 | 87.1 |
03/07/2023 | LongT5 Base | LongT5 | 220M | 16K | 38.6 | 57.7/30.0/31.4 | 34.8/9.6/21.1 | 33.9/11.0/22.8 | 46.6 | 23.0 | 37.9/36.6 | 85.6 |
08/27/2022 | BART-large SLED | Ivgi et al. | 406M | 16K | 37.99 | 57.5/26.3/27.4 | 35.2/8.7/19.4 | 34.2/11.0/22.0 | 46.9 | 24.1 | 34.8/34.8 | 87.3 |
03/14/2022 | UL2 | Google Research | 20B | 2K | 37.87 | 53.6/26.1/28.8 | 32.9/7.8/19.4 | 31.1/8.5/20.4 | 37.6 | 24.2 | 45.8/40.7 | 88.7 |
02/28/2023 | CoLT5 Base | Google Research | 433M | 16K | 37.64 | 58.7/29.6/31.4 | 34.5/9.2/20.6 | 32.0/9.3/21.0 | 42.1 | 23.3 | 36.5/34.0 | 86.5 |
01/01/2022 | LED Base | SCROLLS team | 162M | 16K | 29.16 | 56.2/26.6/28.8 | 24.2/4.5/15.4 | 25.1/6.7/18.8 | 26.6 | 18.5 | 25.8/25.4 | 71.5 |
01/01/2022 | BART Base | SCROLLS team | 139M | 1K | 29.01 | 47.9/18.6/22.7 | 27.2/4.9/16.7 | 30.2/8.7/20.7 | 26.3 | 15.4 | 26.0/25.9 | 77.4 |
01/07/2022 | Naive | SCROLLS team | - | - | 19.35 | 45.3/17.9/20.8 | 19.6/1.8/11.0 | 14.2/2.0/9.3 | 3.4 | 1.5 | 25.2/26.1 | 66.0 |
* LongT5 rows are based on revised submissions that use a maximum output length of 1024 tokens for GovReport generation.
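
The Score (Average) column can be approximately reproduced from the per-dataset columns. The sketch below is not taken from this page. It assumes the aggregation rule described for the SCROLLS benchmark: each summarization dataset contributes the geometric mean of its R1/R2/RL scores, QuALITY contributes its EM-T value, and the seven per-dataset scores are averaged with equal weight. The function names are hypothetical, and small differences from the listed value are expected because the table entries are rounded.

```python
from statistics import mean


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    product = 1.0
    for v in values:
        product *= v
    return product ** (1.0 / len(values))


def scrolls_average(rouge_triples, single_scores):
    """Assumed aggregation: geometric mean of R1/R2/RL for each summarization
    dataset, then an unweighted mean over all per-dataset scores."""
    per_dataset = [geometric_mean(t) for t in rouge_triples] + list(single_scores)
    return mean(per_dataset)


# CoLT5 XL row from the table above.
colt5_xl = scrolls_average(
    rouge_triples=[
        (61.3, 32.2, 33.8),  # GovReport R1/R2/RL
        (36.4, 10.1, 21.7),  # SummScreenFD R1/R2/RL
        (36.2, 12.9, 24.2),  # QMSum R1/R2/RL
    ],
    # Qasper F1, NarrativeQA F1, QuALITY EM-T (assumed, not EM-H), ContractNLI EM
    single_scores=[53.9, 31.1, 48.1, 88.4],
)
print(f"{colt5_xl:.2f}")  # should land close to the listed 43.51
```

Under these assumptions the result agrees with the listed average to within rounding error, which suggests (but does not confirm) that EM-H is reported for reference only and does not enter the average.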