| Date | Model | Contributors | #Params | Input Length | Score (Average) | GovRep (R1/R2/RL) | SumScr (R1/R2/RL) | QMSum (R1/R2/RL) | Qspr (F1) | Nrtv (F1) | QALT (EM-T/H) | CNLI (EM) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 02/28/2023 | CoLT5 XL | Google Research | 5.3B | 16K | 43.51 | 61.3/32.2/33.8 | 36.4/10.1/21.7 | 36.2/12.9/24.2 | 53.9 | 31.1 | 48.1/43.8 | 88.4 |
| 03/07/2023 | LongT5 XL | LongT5 | 3B | 16K | 42.53 | 61.1/32.3/33.7 | 35.8/9.6/21.1 | 34.9/11.8/23.5 | 53.1 | 29.3 | 46.0/42.1 | 88.2 |
| 02/28/2023 | CoLT5 Large | Google Research | 1.46B | 16K | 41.04 | 60.7/31.3/32.9 | 36.7/10.6/22.0 | 34.9/11.5/23.1 | 49.8 | 27.7 | 39.9/36.8 | 88.7 |
| 03/07/2023 | LongT5 Large | LongT5 | 770M | 16K | 41.03 | 60.3/31.1/32.8 | 35.6/9.2/21.2 | 35.1/12.0/23.3 | 52.3 | 27.2 | 40.6/38.6 | 87.3 |
| 08/23/2022 | BART-LS | Meta AI | 460M | 16K | 39.76 | 59.4/29.8/30.8 | 37.7/10.2/21.5 | 35.1/11.0/22.0 | 48.7 | 26.2 | 37.8/34.0 | 87.1 |
| 03/07/2023 | LongT5 Base | LongT5 | 220M | 16K | 38.6 | 57.7/30.0/31.4 | 34.8/9.6/21.1 | 33.9/11.0/22.8 | 46.6 | 23.0 | 37.9/36.6 | 85.6 |
| 08/27/2022 | BART-large SLED | Ivgi et al. | 406M | 16K | 37.99 | 57.5/26.3/27.4 | 35.2/8.7/19.4 | 34.2/11.0/22.0 | 46.9 | 24.1 | 34.8/34.8 | 87.3 |
| 03/14/2022 | UL2 | Google Research | 20B | 2K | 37.87 | 53.6/26.1/28.8 | 32.9/7.8/19.4 | 31.1/8.5/20.4 | 37.6 | 24.2 | 45.8/40.7 | 88.7 |
| 02/28/2023 | CoLT5 Base | Google Research | 433M | 16K | 37.64 | 58.7/29.6/31.4 | 34.5/9.2/20.6 | 32.0/9.3/21.0 | 42.1 | 23.3 | 36.5/34.0 | 86.5 |
| 01/01/2022 | LED Base | SCROLLS team | 162M | 16K | 29.16 | 56.2/26.6/28.8 | 24.2/4.5/15.4 | 25.1/6.7/18.8 | 26.6 | 18.5 | 25.8/25.4 | 71.5 |
| 01/01/2022 | BART Base | SCROLLS team | 139M | 1K | 29.01 | 47.9/18.6/22.7 | 27.2/4.9/16.7 | 30.2/8.7/20.7 | 26.3 | 15.4 | 26.0/25.9 | 77.4 |
| 01/07/2022 | Naive | SCROLLS team | - | - | 19.35 | 45.3/17.9/20.8 | 19.6/1.8/11.0 | 14.2/2.0/9.3 | 3.4 | 1.5 | 25.2/26.1 | 66.0 |


* LongT5 rows are based on revised submissions that use a max-output-length of 1024 tokens for GovReport generations.
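
The Score (Average) column can be approximately reproduced from the per-dataset columns. The sketch below is not the official evaluation code. It assumes the aggregate is the plain mean over the seven datasets, where each ROUGE triple (R1/R2/RL) is first collapsed to its geometric mean and QALT contributes only its EM-T value (EM-H is left out). The function name `scrolls_average` and its parameters are hypothetical. Because the inputs are rounded to one decimal place, the result matches the reported score only to within a few hundredths.

```python
from statistics import geometric_mean, mean


def scrolls_average(rouge_triples, scalar_scores):
    """Approximate the leaderboard's aggregate score (assumed formula).

    rouge_triples: (R1, R2, RL) tuples for GovRep, SumScr and QMSum.
    scalar_scores: Qspr F1, Nrtv F1, QALT EM-T and CNLI EM.
    """
    # Assumption: each summarization dataset counts once, as the
    # geometric mean of its three ROUGE values.
    rouge_scores = [geometric_mean(triple) for triple in rouge_triples]
    # Assumption: all seven per-dataset scores are weighted equally.
    return mean(rouge_scores + list(scalar_scores))


# Values copied from the CoLT5 XL row; reported Score (Average) is 43.51.
colt5_xl = scrolls_average(
    rouge_triples=[(61.3, 32.2, 33.8), (36.4, 10.1, 21.7), (36.2, 12.9, 24.2)],
    scalar_scores=[53.9, 31.1, 48.1, 88.4],
)
print(f"CoLT5 XL, recomputed average: {colt5_xl:.2f}")
```

Running the same function on the LongT5 XL row gives a value close to its reported 42.53, which suggests the assumed formula is consistent with the table, though it does not confirm it.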

