Commit

add p values
davevanveen committed Sep 15, 2023
1 parent bffd728 commit 962b213
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion index.html
@@ -374,7 +374,7 @@
<table align=center width=800px>
<tr>
<td align=left width=800px>
-Clinical reader study. Top: Study design comparing the summarization of GPT-4 vs. that of human experts on three attributes: completeness, correctness, and conciseness. Bottom: Results. GPT-4 summaries are rated higher than human summaries on completeness for all three summarization tasks and on correctness overall. Radiology reports highlight a trade-off between correctness (better) and conciseness (worse) with GPT-4. Highlight colors correspond to a value’s location on the color spectrum. Asterisks denote statistical significance by Wilcoxon signed-rank test.
+Clinical reader study. Top: Study design comparing the summarization of GPT-4 vs. that of human experts on three attributes: completeness, correctness, and conciseness. Bottom: Results. GPT-4 summaries are rated higher than human summaries on completeness for all three summarization tasks and on correctness overall. Radiology reports highlight a trade-off between correctness (better) and conciseness (worse) with GPT-4. Highlight colors correspond to a value’s location on the color spectrum. Asterisks denote statistical significance by Wilcoxon signed-rank test &ast;p-value &lt; 0.05, &ast;&ast;p-value &lt;&lt; 0.001.
</td>
</tr>
</table>