AI and medicine: Italian study validates synthetic data with real-world cases

Palermo (ITALPRESS/GNA) – A study with a constructive message about AI: it does not replace reality, but it can help us understand it better when it is rigorously designed, controlled, and validated. The study, conducted by a research team at the IRCCS Galeazzi-Sant’Ambrogio Hospital in Milan with methodological and computational contributions from Marco Giacalone and Davide Lamartina, was published in the European Spine Journal and is indexed in PubMed.

The research used an AI-assisted computational method to expand and analyze a small clinical sample while staying anchored to real data. Starting from 123 real-world asymptomatic subjects, the team constructed a biologically plausible synthetic dataset of 10,000 cases to make the analysis of certain anatomical correlations of the spine more robust.

The most important part was not generating synthetic data, but demonstrating that the correlations identified in the expanded sample remained verifiable on real data through independent statistical procedures.

The research arose from the need to study spinopelvic alignment in asymptomatic subjects, an essential element for understanding physiological sagittal balance and defining useful reference values in the assessment of spinal deformities.

However, the availability of data on healthy subjects is limited by ethical, logistical, and radiation exposure constraints: it is neither simple nor always appropriate to expose healthy people to radiographic examinations for the sole purpose of expanding a research sample. In this context, the controlled generation of synthetic data can be a useful strategy for exploring anatomical relationships without losing touch with real data.

The 10,000 biologically plausible synthetic anatomical configurations were generated from the 123 real-world asymptomatic subjects. The correlations identified in the expanded sample were then verified on those 123 real cases through statistical bootstrapping, a procedure that assesses the robustness of results by repeatedly resampling the original data.
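To illustrate what a bootstrap check of a correlation looks like in practice, here is a minimal sketch in Python. It is not the authors' code: the function names, the number of resamples, the percentile interval, and the simulated placeholder values standing in for 123 subjects are all assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_correlation_ci(xs, ys, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the correlation of paired data.

    Each resample draws subjects with replacement from the original
    sample, so every estimate is computed on real measurements only.
    """
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    estimates.sort()
    low = estimates[int((alpha / 2) * n_resamples)]
    high = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Placeholder data: simulated values standing in for 123 subjects.
# These are NOT measurements from the study.
_rng = random.Random(42)
pi_demo = [_rng.gauss(52.0, 10.0) for _ in range(123)]
ss_demo = [0.6 * p + _rng.gauss(8.0, 5.0) for p in pi_demo]

ci_low, ci_high = bootstrap_correlation_ci(pi_demo, ss_demo)
print(f"r = {pearson_r(pi_demo, ss_demo):.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

If the interval obtained on the real sample excludes zero and has the same sign as the correlation seen in the synthetic dataset, the relationship is less likely to be an artifact of the generation step.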

For the study, standing radiographs of the entire spine were analyzed, recording demographic characteristics and multiple spinopelvic parameters, including pelvic incidence (PI), pelvic tilt (PT), sacral slope (SS), lumbar lordosis (LL), thoracic kyphosis (TK), and cervical alignment measurements. A probabilistic Gaussian resampling approach was used, guided by anatomical and biological constraints.
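The article does not describe the constraints the authors applied, so the following Python sketch only shows the general idea of constrained Gaussian resampling. It fits a bivariate Gaussian to PI and PT, derives SS from the established geometric identity PI = PT + SS, and rejects draws outside plausibility bounds. The bounds, function names, and simulated placeholder sample are hypothetical and are not taken from the study.

```python
import math
import random
import statistics

# Hypothetical plausibility bounds in degrees, for illustration only.
BOUNDS = {"PI": (20.0, 90.0), "PT": (-10.0, 40.0), "SS": (10.0, 70.0)}


def fit_bivariate(pi_vals, pt_vals):
    """Estimate means, standard deviations, and correlation of PI and PT."""
    m_pi, m_pt = statistics.fmean(pi_vals), statistics.fmean(pt_vals)
    s_pi, s_pt = statistics.stdev(pi_vals), statistics.stdev(pt_vals)
    cov = sum((a - m_pi) * (b - m_pt) for a, b in zip(pi_vals, pt_vals))
    cov /= len(pi_vals) - 1
    return m_pi, m_pt, s_pi, s_pt, cov / (s_pi * s_pt)


def generate_synthetic(pi_vals, pt_vals, n_cases, seed=0):
    """Draw correlated Gaussian (PI, PT) pairs and keep only plausible cases."""
    rng = random.Random(seed)
    m_pi, m_pt, s_pi, s_pt, rho = fit_bivariate(pi_vals, pt_vals)
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    cases = []
    while len(cases) < n_cases:
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pi = m_pi + s_pi * z1
        pt = m_pt + s_pt * (rho * z1 + residual * z2)
        case = {"PI": pi, "PT": pt, "SS": pi - pt}  # identity: PI = PT + SS
        # Rejection step: discard anatomically implausible draws.
        if all(BOUNDS[k][0] <= v <= BOUNDS[k][1] for k, v in case.items()):
            cases.append(case)
    return cases


# Placeholder sample: simulated values standing in for 123 real subjects.
# These are NOT measurements from the study.
_rng = random.Random(7)
real_pi = [_rng.gauss(52.0, 10.0) for _ in range(123)]
real_pt = [0.35 * p - 5.0 + _rng.gauss(0.0, 5.0) for p in real_pi]

synthetic = generate_synthetic(real_pi, real_pt, n_cases=10_000)
print(len(synthetic), "synthetic cases generated")
```

Deriving SS from PI and PT rather than sampling it independently keeps every synthetic case geometrically consistent, which is one simple way to encode an anatomical constraint.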

The correlations identified in the synthetic dataset were subsequently validated against data measured on real subjects. The result is a feasible and reproducible approach to studying spinopelvic relationships in small clinical samples. Combining synthetic expansion with validation on real data allows biologically plausible correlations to be identified and may help reduce the need for further imaging studies on healthy subjects.

The methodology can also serve as an exploratory tool in spine research and in other fields with limited datasets. The two-way workflow made it possible to verify on real data the correlations that emerged in the synthetic dataset, reducing the risk that the results were due to chance or to the limited sample size.

In summary, the study does not propose replacing clinical data with artificial data, but demonstrates how an AI-assisted computational method can be used to generate hypotheses, test them on real data, and preserve biological plausibility. The authors are Domenico Compagnone, Riccardo Cecchinato, Pedro Berjano, and Claudio Lamartina (IRCCS Galeazzi-Sant’Ambrogio Hospital, Milan) and Marco Giacalone (LUMSA Santa Silvia, Palermo).

GNA/ITALPRESS