Technical Validation of Plot Designs by Use of Deep Learning

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Technical Validation of Plot Designs by Use of Deep Learning. / Petersen, Anne Helby; Ekstrøm, Claus Thorn.

In: The American Statistician, Vol. 78, No. 2, 2024, p. 220-228.

Harvard

Petersen, AH & Ekstrøm, CT 2024, 'Technical Validation of Plot Designs by Use of Deep Learning', The American Statistician, vol. 78, no. 2, pp. 220-228. https://doi.org/10.1080/00031305.2023.2270649

APA

Petersen, A. H., & Ekstrøm, C. T. (2024). Technical Validation of Plot Designs by Use of Deep Learning. The American Statistician, 78(2), 220-228. https://doi.org/10.1080/00031305.2023.2270649

Vancouver

Petersen AH, Ekstrøm CT. Technical Validation of Plot Designs by Use of Deep Learning. The American Statistician. 2024;78(2):220-228. https://doi.org/10.1080/00031305.2023.2270649

Author

Petersen, Anne Helby ; Ekstrøm, Claus Thorn. / Technical Validation of Plot Designs by Use of Deep Learning. In: The American Statistician. 2024 ; Vol. 78, No. 2. pp. 220-228.

Bibtex

@article{26b1f34993f2447c86b046ca64a30b5a,
title = "Technical Validation of Plot Designs by Use of Deep Learning",
abstract = "When does inspecting a certain graphical plot allow for an investigator to reach the right statistical conclusion? Visualizations are commonly used for various tasks in statistics – including model diagnostics and exploratory data analysis – and though attractive due to its intuitive nature, the lack of available methods for validating plots is a major drawback. We propose a new technical validation method for visual reasoning. Our method trains deep neural networks to distinguish between plots simulated under two different data generating mechanisms (null or alternative), and we use the classification accuracy as a technical validation score (TVS). The TVS measures the information content in the plots, and TVS values can be used to compare different plots or different choices of data generating mechanisms, thereby providing a meaningful scale that new visual reasoning procedures can be validated against. We apply the method to three popular diagnostic plots for linear regression, namely scatter plots, quantile-quantile plots and residual plots. We consider various types and degrees of misspecification, as well as different within-plot sample sizes. Our method produces TVSs that increase with increasing sample size and decrease with increasing difficulty, and hence the TVS is a meaningful measure of validity.",
author = "Petersen, {Anne Helby} and Ekstr{\o}m, {Claus Thorn}",
year = "2024",
doi = "10.1080/00031305.2023.2270649",
language = "English",
volume = "78",
pages = "220--228",
journal = "The American Statistician",
issn = "0003-1305",
publisher = "American Statistical Association",
number = "2",

}

RIS

TY  - JOUR
T1  - Technical Validation of Plot Designs by Use of Deep Learning
AU  - Petersen, Anne Helby
AU  - Ekstrøm, Claus Thorn
PY  - 2024
Y1  - 2024
N2  - When does inspecting a certain graphical plot allow for an investigator to reach the right statistical conclusion? Visualizations are commonly used for various tasks in statistics – including model diagnostics and exploratory data analysis – and though attractive due to its intuitive nature, the lack of available methods for validating plots is a major drawback. We propose a new technical validation method for visual reasoning. Our method trains deep neural networks to distinguish between plots simulated under two different data generating mechanisms (null or alternative), and we use the classification accuracy as a technical validation score (TVS). The TVS measures the information content in the plots, and TVS values can be used to compare different plots or different choices of data generating mechanisms, thereby providing a meaningful scale that new visual reasoning procedures can be validated against. We apply the method to three popular diagnostic plots for linear regression, namely scatter plots, quantile-quantile plots and residual plots. We consider various types and degrees of misspecification, as well as different within-plot sample sizes. Our method produces TVSs that increase with increasing sample size and decrease with increasing difficulty, and hence the TVS is a meaningful measure of validity.
AB  - When does inspecting a certain graphical plot allow for an investigator to reach the right statistical conclusion? Visualizations are commonly used for various tasks in statistics – including model diagnostics and exploratory data analysis – and though attractive due to its intuitive nature, the lack of available methods for validating plots is a major drawback. We propose a new technical validation method for visual reasoning. Our method trains deep neural networks to distinguish between plots simulated under two different data generating mechanisms (null or alternative), and we use the classification accuracy as a technical validation score (TVS). The TVS measures the information content in the plots, and TVS values can be used to compare different plots or different choices of data generating mechanisms, thereby providing a meaningful scale that new visual reasoning procedures can be validated against. We apply the method to three popular diagnostic plots for linear regression, namely scatter plots, quantile-quantile plots and residual plots. We consider various types and degrees of misspecification, as well as different within-plot sample sizes. Our method produces TVSs that increase with increasing sample size and decrease with increasing difficulty, and hence the TVS is a meaningful measure of validity.
U2  - 10.1080/00031305.2023.2270649
DO  - 10.1080/00031305.2023.2270649
M3  - Journal article
VL  - 78
SP  - 220
EP  - 228
JO  - The American Statistician
JF  - The American Statistician
SN  - 0003-1305
IS  - 2
ER  - 

ID: 372698804