TY - GEN
T1 - Multi-View Learning for Vertebrae Identification in Digitally Reconstructed Radiographs
AU - Ahmad, Iftikhar
AU - Ali, Shahzad
AU - Jung, Soon Ki
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Vertebrae localization and identification from Computed Tomography (CT) scans play a crucial role in the diagnosis of spine-related disorders. However, localization and labeling of vertebrae are laborious and challenging due to the complex anatomical structure of the spine, low contrast, and fuzzy boundaries in CT scans. This study introduces an encoder-decoder-based multi-view learning approach by training the model using distinct representations (views) for vertebra identification in digitally reconstructed radiographs (DRR). Multi-view learning aims to enhance model robustness, accuracy, and generalization capabilities by leveraging information from multiple digitally acquired DRR images. To generate the DRR images, we developed a simulation environment that produces multiple DRR views from a given CT scan. We employed a contrastive learning strategy for training the backbone network to enhance the learning of global representations across these multi-views. Subsequently, we trained a localization network to detect vertebrae centroids, followed by an identification network to classify each vertebra accordingly. Moreover, we validated our model on the VerSe 2019 dataset and outperformed other state-of-the-art (SOTA) methods.
AB - Vertebrae localization and identification from Computed Tomography (CT) scans play a crucial role in the diagnosis of spine-related disorders. However, localization and labeling of vertebrae are laborious and challenging due to the complex anatomical structure of the spine, low contrast, and fuzzy boundaries in CT scans. This study introduces an encoder-decoder-based multi-view learning approach by training the model using distinct representations (views) for vertebra identification in digitally reconstructed radiographs (DRR). Multi-view learning aims to enhance model robustness, accuracy, and generalization capabilities by leveraging information from multiple digitally acquired DRR images. To generate the DRR images, we developed a simulation environment that produces multiple DRR views from a given CT scan. We employed a contrastive learning strategy for training the backbone network to enhance the learning of global representations across these multi-views. Subsequently, we trained a localization network to detect vertebrae centroids, followed by an identification network to classify each vertebra accordingly. Moreover, we validated our model on the VerSe 2019 dataset and outperformed other state-of-the-art (SOTA) methods.
KW - digitally reconstructed radiographs
KW - multi-view learning
KW - spine
KW - vertebrae
KW - vertebrae identification
KW - vertebrae localization
UR - https://www.scopus.com/pages/publications/105017113226
U2 - 10.1109/HSI66212.2025.11142408
DO - 10.1109/HSI66212.2025.11142408
M3 - Conference contribution
AN - SCOPUS:105017113226
T3 - International Conference on Human System Interaction, HSI
BT - Proceeding - 17th International Conference on Human System Interaction, HSI 2025
PB - IEEE Computer Society
T2 - 17th International Conference on Human System Interaction, HSI 2025
Y2 - 16 July 2025 through 19 July 2025
ER -