Oral Presentation ESA-SRB 2023 in conjunction with ENSA

Comparison of M-TIRADS to Established Thyroid Classification Systems – A Real-World Retrospective Analysis of Current Thyroid Nodule Ultrasound Scoring Systems. (#167)

Scott McNeil 1,2, Ilona Lavender 3, Dee Nandurkar 3, Peter Coombs 3, Eldho Paul 4, Jennifer Wong 1,2
  1. Monash Health Endocrinology and Diabetes Unit, Melbourne, Victoria, Australia
  2. Department of Medicine / School of Clinical Sciences at Monash Health, Monash University, Melbourne, Victoria, Australia
  3. Ultrasound, Monash Imaging, Monash Health, Melbourne, Victoria, Australia
  4. Monash Centre for Health Research and Implementation, Monash University, Melbourne, Victoria, Australia

Aim:

To validate M-TIRADS and compare its diagnostic accuracy for malignancy in thyroid nodules against established thyroid classification systems.

Methods:

We conducted a retrospective analysis of 1795 patients presenting to a single centre for fine needle aspiration (FNA) of thyroid nodules between 2012 and 2022. Sonographic images were classified and scored using the American Thyroid Association (ATA) guidelines and the Korean, American College of Radiology and Monash Thyroid Imaging, Reporting and Data Systems (K-TIRADS, ACR-TIRADS and M-TIRADS, respectively); M-TIRADS additionally scores increased vascularity and size >4 cm. A total of 2211 thyroid nodules were biopsied, with cytopathological results reported using the Bethesda system. The outcomes of each classification system were then compared with the Bethesda results.

Results:

Overall, there were 2211 specimens, of which 2070 (94%) were classified as benign and 141 (6%) as malignant. ATA identified 128 of the 141 malignant nodules and K-TIRADS 127; ACR-TIRADS identified 95 and M-TIRADS 101. Of the nodules deemed intermediate-to-high risk, 1532 under the ATA guidelines and 1529 under K-TIRADS were benign. Based on the ATA guidelines, 1621 nodules were biopsied unnecessarily, compared with 1611 for K-TIRADS. Of the nodules recommended for biopsy, 1227 under ACR-TIRADS and 1368 under M-TIRADS were benign.

The sensitivity of M-TIRADS was 71.8% (95% confidence interval (CI), 63.7 – 79.1) and its specificity 27.8% (95% CI, 25.8 – 30.2). The sensitivity of ACR-TIRADS was 67.6% (95% CI, 59.2 – 75.2) and its specificity 35.3% (95% CI, 33.1 – 37.6). The sensitivity of ATA was 90.8% (95% CI, 84.9 – 95.0) and its specificity 10.0% (95% CI, 8.7 – 11.5). The sensitivity of K-TIRADS was 90.1% (95% CI, 84.0 – 94.5) and its specificity 10.6% (95% CI, 9.2 – 12.1).
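
As a rough check on the arithmetic, the sensitivities can be recomputed from the detection counts reported in the Results (malignant nodules identified out of 141). The Python sketch below is illustrative only: the function names are hypothetical, and the Wilson score interval is an assumption, since the abstract does not state which confidence interval method was used, so its bounds need not match the reported ones. The ratios for ATA and K-TIRADS reproduce the reported 90.8% and 90.1%; those for ACR-TIRADS and M-TIRADS come out slightly below the reported values, which may reflect a different denominator in the original analysis.

```python
from math import sqrt

# Counts taken from the Results: malignant nodules identified by each system.
TOTAL_MALIGNANT = 141
DETECTED = {"ATA": 128, "K-TIRADS": 127, "ACR-TIRADS": 95, "M-TIRADS": 101}


def sensitivity(true_positives: int, total_positives: int) -> float:
    """Proportion of malignant nodules that a system identified."""
    return true_positives / total_positives


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, not the abstract's)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


for system, detected in DETECTED.items():
    point = sensitivity(detected, TOTAL_MALIGNANT)
    low, high = wilson_interval(detected, TOTAL_MALIGNANT)
    print(f"{system}: sensitivity {point:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

Specificity is not recomputed here because the abstract does not report the true-negative counts needed for it.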

Conclusion:

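The diagnostic performance of M-TIRADS was similar to that of ACR-TIRADS; the addition of vascularity and size criteria did not yield higher sensitivity or specificity. Overall, each of the thyroid classification systems demonstrated strengths and weaknesses: ATA and K-TIRADS had the highest sensitivity (approximately 90%) but the lowest specificity (approximately 10%), whereas ACR-TIRADS and M-TIRADS were more specific but less sensitive. Significant limitations remain in using ultrasound characteristics for risk stratification.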