TY - GEN
T1 - Stratified sampling for case selection criteria for evaluating CAD
AU - Nishikawa, Robert M.
AU - Pesce, Lorenzo L.
PY - 2010
Y1 - 2010
N2 - Ideally, the outcome of any CAD performance assessment should predict how well the system would work if used clinically. In principle, if the selection process draws cases that are "representative" of the general patient population, the study design will be unbiased. In this study we explored the effect of stratified sampling on stand-alone and radiologists' performance using data from an observer study. Although our database was relatively small (50 cancer cases), no meaningful difference in performance was measured among the different stratified sampling schemes or against the whole dataset, nor was there any difference in the variance of the measured performance metrics. These results cast doubt on the usefulness of requiring stratified sampling, whose added cost does not seem justifiable without empirical evidence. We believe that it is more important to specify how cases should be collected than to try to define the range and frequency of the characteristics of patients and cancers to be included in the dataset, an approach that we suspect is prone to producing unrealistic samples.
AB - Ideally, the outcome of any CAD performance assessment should predict how well the system would work if used clinically. In principle, if the selection process draws cases that are "representative" of the general patient population, the study design will be unbiased. In this study we explored the effect of stratified sampling on stand-alone and radiologists' performance using data from an observer study. Although our database was relatively small (50 cancer cases), no meaningful difference in performance was measured among the different stratified sampling schemes or against the whole dataset, nor was there any difference in the variance of the measured performance metrics. These results cast doubt on the usefulness of requiring stratified sampling, whose added cost does not seem justifiable without empirical evidence. We believe that it is more important to specify how cases should be collected than to try to define the range and frequency of the characteristics of patients and cancers to be included in the dataset, an approach that we suspect is prone to producing unrealistic samples.
KW - breast cancer
KW - case selection
KW - computer-aided detection
KW - computer-aided diagnosis
KW - evaluation
KW - mammography
KW - observer study
UR - http://www.scopus.com/inward/record.url?scp=77954645362&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954645362&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-13666-5_72
DO - 10.1007/978-3-642-13666-5_72
M3 - Conference contribution
AN - SCOPUS:77954645362
SN - 3642136656
SN - 9783642136658
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 534
EP - 539
BT - Digital Mammography - 10th International Workshop, IWDM 2010, Proceedings
T2 - 10th International Workshop on Digital Mammography, IWDM 2010
Y2 - 16 June 2010 through 18 June 2010
ER -