Researchers Publish Chest X-Ray Dataset to Train AI Models
By HospiMedica International staff writers
Posted on 20 Feb 2019

Image: The CheXpert dataset of chest X-rays is designed for automated chest X-ray interpretation (Photo courtesy of Stanford University School of Medicine).
Researchers from the Stanford University School of Medicine (Stanford, CA, USA) have published CheXpert, a large dataset of chest X-rays and an accompanying competition for automated chest X-ray interpretation. The dataset features uncertainty labels and radiologist-labeled reference standard evaluation sets. Automated chest radiograph interpretation at the level of practicing radiologists could provide substantial benefit in many medical settings, from improved workflow prioritization and clinical decision support to large-scale screening and global population health initiatives.
CheXpert consists of 224,316 chest radiographs of 65,240 patients, performed at Stanford Hospital between October 2002 and July 2017 in both inpatient and outpatient centers, along with their associated radiology reports. The dataset was co-released with MIMIC-CXR, a large dataset of 371,920 chest X-rays associated with 227,943 imaging studies sourced from the Beth Israel Deaconess Medical Center between 2011 and 2016.
One of the main obstacles in the development of chest radiograph interpretation models has been the lack of datasets with strong radiologist-annotated ground truth and expert scores against which researchers can compare their models. CheXpert is expected to address that gap, making it easier to track the progress of models over time on a clinically important task.
The researchers have also developed and open-sourced the CheXpert labeler, an automated rule-based labeler that extracts observations from free-text radiology reports for use as structured labels for the images. This is expected to help other institutions extract structured labels from their own reports and release other large repositories of data, allowing cross-institutional testing of medical imaging models. The dataset is expected to help in the development and validation of chest radiograph interpretation models, with the aim of improving healthcare access and delivery worldwide.
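To give a rough sense of what rule-based labeling of free-text reports can look like, the following is a minimal Python sketch. It is not the CheXpert labeler itself: the keyword lists, the cue phrases, the numeric label encoding, and the function names `label_sentence` and `label_report` are all assumptions made for illustration. The sketch finds a mention of an observation in each sentence, then checks the text before the mention for negation or uncertainty cues.

```python
import re

# Hypothetical keyword lists; NOT the vocabulary of the real CheXpert labeler.
OBSERVATION_TERMS = {
    "Pneumonia": ["pneumonia"],
    "Pleural Effusion": ["pleural effusion", "effusion"],
    "Cardiomegaly": ["cardiomegaly", "enlarged heart"],
}

# Hypothetical cue phrases that may precede a mention.
NEGATION_RE = re.compile(r"\b(no|without|negative for)\b")
UNCERTAINTY_RE = re.compile(r"\b(possible|may|cannot exclude|suspicious for)\b")

# Label encoding chosen for this sketch; None means "not mentioned".
POSITIVE, NEGATIVE, UNCERTAIN = 1, 0, -1

# Assumed aggregation rule: positive beats uncertain, uncertain beats negative.
PRIORITY = {NEGATIVE: 0, UNCERTAIN: 1, POSITIVE: 2}


def label_sentence(sentence, terms):
    """Return a label for one observation in one sentence, or None."""
    text = sentence.lower()
    positions = [text.find(term) for term in terms if term in text]
    if not positions:
        return None
    prefix = text[:min(positions)]  # text before the first mention
    if UNCERTAINTY_RE.search(prefix):
        return UNCERTAIN
    if NEGATION_RE.search(prefix):
        return NEGATIVE
    return POSITIVE


def label_report(report):
    """Map each observation name to a label for the whole report."""
    labels = {name: None for name in OBSERVATION_TERMS}
    for sentence in re.split(r"[.!?]+\s*", report):
        for name, terms in OBSERVATION_TERMS.items():
            found = label_sentence(sentence, terms)
            if found is None:
                continue
            current = labels[name]
            if current is None or PRIORITY[found] > PRIORITY[current]:
                labels[name] = found
    return labels


if __name__ == "__main__":
    example = "No pleural effusion. Possible pneumonia in the left lower lobe."
    print(label_report(example))
```

A production labeler would need far richer vocabularies and proper handling of sentence structure. The point here is only that uncertain mentions can be kept as a separate label rather than being forced into positive or negative.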
Related Links:
Stanford University School of Medicine