Background
There is interest in artificial intelligence algorithms (machine learning and deep learning) that provide automation of neuroimaging in an effort to improve accuracy, reduce bias and aid in clinical decision-making. Machine learning is used to evaluate candidate biomarkers for assessing multiple sclerosis and disease severity, with random forest models commonly performing well while deep learning remains less widely adopted. Translating these methods into clinically meaningful decision support tools is limited by insufficient data quality, quantity, and lack of shared standards. Advancing clinical impact will require robust, shared data resources and integrated analyses across multiple data modalities rather than isolated, single-purpose solutions.1
Structural imaging has the potential to reduce variance and improve diagnostic and prognostic inferences from MRI scans. However, the tools must be trained and validated in a manner that provides generalizability to broader populations. Most available tools rely on multivariate machine learning approaches, offering promise for personalized medicine, but currently available studies are limited by small sample sizes, limited methodological rigor, and lack of longitudinal validation. Study authors emphasize that imaging outputs must be interpreted within the broader clinical context and that future progress will depend on addressing challenges such as biological heterogeneity, scanner variability, data integration, and robust validation.2 Of eight tools identified, two were not approved for medical use and one had no associated references. Most of the tools were validated using a small number of cases and/or a single data set. The authors compared the tools based on the number of validation methods that had been conducted for each. None of the tools account for interscanner variability resulting from differences in the scanners (instrument magnetic field and acquisition parameters), and therefore the tools lack generalizability. The authors conclude that the majority of available tools make use of multivariate machine learning methods and have the potential to open new possibilities in personalized medicine. However, they caution that results should be interpreted with vigilance due to the limitations of these studies, especially those related to small sample size and poor methodology. They also caution that results must be interpreted in light of the patient’s clinical history and symptomatology.
The American Society of Functional Neuroradiology (ASFNR) and the American Society of Neuroradiology (ASNR) acknowledge the challenges with artificial intelligence in neurology and created an Artificial Intelligence Workshop Technology Workgroup.3 This group published a critical appraisal of artificial intelligence (AI)-enabled imaging tools using the levels of evidence system demonstrated in the American Journal of Neuroradiology. They call for critical appraisal of AI-enabled imaging tools throughout the life cycle, from development to implementation, using systematic, standardized, and objective approaches that can verify both the technical and clinical efficacy of the tool. A challenge for developing AI models is accessing comprehensive and large data sets that can be utilized to train the technology. These data sets should represent the intended population and provide a diverse group from which the data may be extrapolated. This paper provides a resource for clinicians to aid in critical assessment of AI technologies to ensure safe and effective implementation of such technology into healthcare practices.3
FDA cleared devices as of the publication date of this LCD include:
- NeuroQuant™ Medical Image Processing Software, which is registered as a Class II device under FDA 510(k) and intended for “automatic labeling, visualization and volumetric quantification of segmentable brain structures from a set of magnetic resonance images (MRI)”. NeuroQuant 4.0 uses AI modalities of machine learning and deep learning to aid in identifying complex patterns in imaging data.4
- Icobrain is registered as a Class II device under FDA 510(k) pathway.7-10 It is intended for “automatic labeling, visualization and volumetric quantification of segmental brain structures from a set of MRI images”. The predicate device is NeuroQuant.
- Icobrain aria, which is registered as a Class II device under FDA 510(k) pathway described as a software-only device for assisting radiologists with the detection and quantification of amyloid-related imaging abnormalities (ARIA) on brain MRI scans in patients undergoing an amyloid-beta directed antibody therapy. 5 Icobrain aria automatically processes inputs from brain MRI scans from 2 time points and calculates the ARIA-E (edema/sulcal effusion): the length of the longest axis computed from the segmented ARIA-E abnormalities, and the number of brain sites affected by ARIA-E; and ARIA-H (hemorrhage/superficial siderosis): which is the count of stable and new T2*-GRE hypo-intensities indicated as microhemorrhages or superficial siderosis.5 Using these measurements the ARIA radiographic severity is automatically derived based on deep learning technology and reported electronically. The intended use of the device is “as a computer-assisted detection and diagnosis software to be used as a concurrent reading aid to help trained radiologists in the detection, assessment, and characterization of ARIA. The software provides information about the presence, location, size, severity and changes of ARIA-E and ARIA-H. Patient management decisions should not be made solely based on analysis by Icobrain aria.” The device is not intended to replace radiologist review of images or clinical judgment and is not intended to be used to segment macro hemorrhages (diameter 10mm or more).6
- DeepBrain, which is registered as a Class II device under FDA 510(k) pathway.11 The device is intended for “automatic labeling, quantification and visualization of segmental brain structures from MRI images.” It is intended to be used by trained health professionals.
- Siemens Morphometry Analysis, which is registered as a Class II device under FDA 510(k) pathway.12 This product is a syngo-based post-acquisition imaging processing software for “viewing, manipulating, evaluating and analyzing MRI, MR PET, CT, PE, CT-PET and MR spectra using deep learning algorithms”.
Other FDA cleared devices were not found at the time of this LCD literature search.
The literature search for evidence related to quantitative analysis of brain MRI was conducted in PubMed and Hayes using the following search terms: Alzheimer’s, ARIA, imaging, artificial intelligence or AI, automated or software or computation or deep learning, machine learning or artificial neural network, multiple sclerosis, MRI and AI, brain or neurology, AI or MRI. Searches were also conducted under the names of known products that are commercially available in the United States, including NeuroQuant, NeuroGage, Icobrain, Icobrain aria, Jung Diagnostic, quantib, Qure, and volBrain. No randomized controlled trials (RCTs) were identified. Unpublished reports, posters, abstracts, case reports and small case series were omitted from the review unless there was no other evidence available to consider. Review papers were utilized in background and summary but not considered for evidence review. There were no guidelines or recommendations on the use of artificial intelligence assistive software tools for automated detection and quantification of the brain.
Alzheimer’s Disease
Amyloid beta (Aβ)-directed antibody therapies, such as aducanumab*, lecanemab and donanemab, are approved in the United States (U.S.) for the treatment of patients with mild cognitive dementia due to Alzheimer’s disease. There is evidence of slowing disease progression and improvement in clinical outcomes for those treated with mild cognitive impairment from Alzheimer’s dementia (AD); however, it is known that Aβ-directed antibody therapies increase the risk of ARIA and ARIA-related complications.13 ARIA complications include autoimmune or inflammatory conditions, seizures, and other disorders associated with extensive white matter pathology. Symptoms and signs of ARIA can include new focal neurological signs, headache, confusion, altered mental status, dizziness, nausea, vomiting, fatigue, blurred vision, vision disturbances, gait disturbance and/or seizure.
Radiologists must become familiar with the appearance of amyloid-related imaging abnormalities and develop imaging protocols to classify abnormalities related to Aβ-directed antibody therapies, and clinicians must also determine the optimal pathways for management of ARIA.14 This is an area of ongoing investigation, and several studies have contributed to the current knowledge of these challenges, but the literature remains sparse.14-18
Multiple grading schemes to determine the severity of ARIA have been proposed.15-18 The ARIA Radiographic Severity is categorized as mild, moderate and severe and this classification system was used in the pivotal clinical trials for anti-amyloid immunotherapies and by the FDA for drug approval. While becoming the accepted standard for classification there has been little peer-reviewed literature at the date of this LCD. 19,20,64,65 A comparison of the Barkhof Grand Total Scale (BGTS) and the 3- or 5-point Severity Scales of ARIA-E, SSAE-3/SSAE-5 respectively, demonstrated a high degree of correlation between the scales. Despite the strong correlation between SSAE-3, SSAE-5, and the BGTS for assessing ARIA-E, these approaches are not yet ready for routine clinical adoption. While threshold alignment across scales suggests the potential to translate existing BGTS based dosing suspension rules to SSAE based criteria, further validation is required.17
To monitor for ARIA, the Appropriate Use Criteria for Aducanumab* recommend MRI before the 5th, 7th, 9th and 12th infusions to improve detection.21 The criteria recommend discontinuation of aducanumab* for any macro-hemorrhage, more than 1 area of superficial siderosis, more than 10 microhemorrhages occurring since the initiation of treatment, more than 2 episodes of ARIA, severe symptoms of ARIA or development of any medical condition requiring anticoagulation. The protocol allows continuation of aducanumab* for mild ARIA-E or ARIA-H with monthly MRI monitoring and discontinuation for worsening symptoms.
The Appropriate Use Criteria for lecanemab recommend obtaining MRI scans at baseline and prior to the 5th, 7th, 14th and 26th infusions. The authors explain that 81% of ARIA-E occur early and resolve spontaneously within 4 months of radiographic detection.22 The criteria allow continuation of lecanemab for mild ARIA-E or ARIA-H with monthly MRI monitoring and discontinuation for worsening symptoms. Once the ARIA resolves or stabilizes, monthly imaging can be discontinued. The criteria state that the imaging should be read by knowledgeable MRI readers proficient in detection and interpretation of ARIA or clinicians skilled in the identification and interpretation of cerebrovascular lesions and ARIA.
The Appropriate Use Recommendations for donanemab state that MRI should be performed prior to the 2nd, 3rd, 4th and 7th infusions, and prior to the 12th infusion in those at high risk for ARIA. The protocol allows continuation of donanemab for mild ARIA-E or ARIA-H with monthly MRI monitoring and discontinuation for worsening symptoms.6
Since ARIA is a new entity, standard radiographic interpretation of ARIA is still under development. Those interpreting the imaging need education and training to ensure accuracy and consistency in reporting. The American College of Radiology, the Alzheimer’s Association, and the Radiological Society of North America are all offering training and continued medical education.19
Radiology
A retrospective report on ARIA reviewed the imaging of 262 subjects in Phase 2 studies of subjects with mild to moderate AD treated with bapineuzumab, a humanized monoclonal antibody against amyloid β. Two neuroradiologists independently reviewed 2572 MRI scans from the 262 participants. The readers were masked to the patient's treatment arm. Patients were included in the risk analysis (n=210) if they did not have evidence of ARIA-E in their pretreatment MRI, had received bapineuzumab, and had at least one MRI scan after treatment. Thirty-six patients (17%) developed ARIA-E during treatment with bapineuzumab, of whom 28 (78%) were asymptomatic and 8 were symptomatic. In fifteen of these 36 patients (42%), the ARIA-E detected on re-read of the MRI had not been detected previously. All of these patients were asymptomatic and had fewer brain regions involved (mean 1.3, SD=0.5) than patients identified during the clinical studies (mean 2.6, SD=2.4, p=0.0193). Thirteen of the patients whose ARIA-E findings were not detected during the clinical trial continued the bapineuzumab infusions for up to 2 years and remained asymptomatic. This study suggests that ARIA encompasses a spectrum of imaging findings with variable clinical relevance, including many subtle and asymptomatic cases that were not detected during clinical trials. The frequent co-occurrence of ARIA-E and ARIA-H supports a shared underlying pathophysiology, potentially related to transient vascular permeability and vascular amyloid. Associations with APOE ε4 status, treatment dose, and amyloid imaging further indicate a link between ARIA and amyloid burden and clearance, underscoring the need for continued investigation in ongoing amyloid modifying therapy studies.18
Using the same study population, investigators sought to describe the imaging characteristics of the ARIA-E and ARIA-H identified.15 The rate of ARIA-H was reported as 12.4% (26/210). They also found that 49% of those with ARIA-E also had associated ARIA-H. The authors conclude this may suggest a common pathophysiologic mechanism. All scans were reviewed by local MR imaging readers and subsequently independently reviewed by the same 2 neuroradiologists as part of the study protocol. They reported that the inter-reader kappa value of 0.76 indicated high inter-reader reliability, with 94% agreement between the neuroradiologists regarding the presence or absence of ARIA-E.
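For readers less familiar with the kappa statistic, the following is a minimal sketch of how raw agreement and Cohen's kappa relate for two readers making a present/absent call. The counts are hypothetical and are not taken from the study; they were chosen only so that the raw agreement is 94% and kappa comes out near 0.76, illustrating why kappa is lower than raw agreement when most scans are negative. The function name and arguments are illustrative assumptions.

```python
def cohens_kappa(both_pos, a_only, b_only, both_neg):
    """Return (observed agreement, Cohen's kappa) for two readers, binary call."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n
    # Marginal probability that each reader calls the finding present
    a_pos = (both_pos + a_only) / n
    b_pos = (both_pos + b_only) / n
    # Agreement expected by chance alone
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for 100 scans (NOT study data)
observed_agreement, kappa = cohens_kappa(both_pos=12, a_only=3, b_only=3, both_neg=82)
print(f"agreement = {observed_agreement:.2f}, kappa = {kappa:.2f}")
```

Because chance agreement is high when a finding is uncommon, kappa is the more conservative of the two figures.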
Using this same population, a retrospective analysis of 242 patients found that ARIA-E was detected more frequently by trained neuroradiologists than by local site radiologists.23 The MRI was performed in patients with mild to moderate AD in a Phase III trial of bapineuzumab. Seventy-six cases of ARIA-E were not detected on the initial read and were reported on the final MRI review with the expert radiologist, including 51 cases not identified by central/local readers. These represented low radiologic severity. A final read analysis found that the readers’ ability to detect ARIA-E improved as the study duration increased, with the majority of occurrences of ARIA-E later in the study identified by the local site radiologist, suggesting that the ability to detect ARIA improved with increasing experience. It is unclear whether the outside readers were using the same imaging criteria, what duration and type of training they received, or whether this finding resulted in any treatment changes between the groups. This supports the need for appropriate training for radiologists reading these studies and continued standardization of findings to ensure consistency between readers. The clinical significance of this finding is not determined, as none of those with mild ARIA-E had symptoms despite continuation of therapy, and the sample size was too small to generalize to a larger population.
A volumetric analysis of structural MRI images of the brain was tested and reported to have a high correlation with independent computer-aided manual segmentation for detection of atrophy changes seen in mild AD.24 Using the Open Access Series of Imaging Studies (OASIS) database, forty subjects with mild probable AD were compared to healthy controls. These images were processed by the NeuroQuant software package. The investigators reported that volumetric results obtained by the software offer a high correlation and could be of benefit in the evaluation of brain atrophy. The study is limited by very small sample size, lack of human subjects and correlation to expert readers in the same population and represents very low-quality evidence.
A study reviewing the MRIs of 122 patients with dementia compared NeuroQuant readings with radiology readings in an effort to determine whether the automated software could distinguish AD from other types of dementia. The study involved patients with various cognitive conditions including Dementia of the Alzheimer’s Type (DAT), other dementia types, mild cognitive impairment (MCI), and subjective cognitive impairment (SCI). NeuroQuant had moderate ability to differentiate DAT from non-dementia patients (SCI + MCI), with the hippocampus and amygdala showing moderate discriminatory power (AUC 0.80 and AUC 0.79, respectively). They concluded that the software could not be used alone, as the changes in brain segments were not specific for AD.25
A prospective study evaluated forty patients’ brain MRIs on 6 scanners from 5 institutions with both NeuroQuant quantitative analysis and neuroradiologist readings. Image processing was conducted with FAST-DL (FAST = accelerated scan, DL = deep learning), a Digital Imaging and Communications in Medicine (DICOM)-based convolutional neural network-dependent deep learning AI enhancement software product called SubtleMR. Clinical classification performance was compared to standard of care scans, FAST-DL and NeuroQuant. The authors reported FAST-DL was statistically superior to standard of care in subjective image quality for perceived signal to noise ratio (SNR), sharpness, imaging artifacts, anatomic/lesion conspicuity, and image contrast (all P values < 0.008), despite a 60% reduction in sequence scan time. They conclude that deep learning can provide sixty percent faster image acquisitions with statistically superior perceived image quality and with accuracy comparable to standard of care scans.26 This study is limited by small sample size, use of a vendor-based software for comparison, risk of bias and investigator conflicts of interest.
Clinical Validity (or Technical efficacy)
Sima et al.27 conducted a diagnostic study to assess the clinical performance of an AI-based software tool for assisting radiological interpretation of brain MRI scans in patients monitored for ARIA. This study enrolled sixteen US Board of Radiology certified radiologists to perform radiologic image interpretation with the software, referred to as assisted reads, and without the software, referred to as unassisted reads. A total of 199 cases, each consisting of a pre-dosing baseline and a post-dosing follow-up MRI of patients from the aducanumab* clinical trials PRIME (NCT01677572), EMERGE (NCT02484547), and ENGAGE (NCT02477800), were retrospectively evaluated. End points were the difference in diagnostic accuracy between assisted and unassisted detection of ARIA-E and ARIA-H independently, assessed using the area under the receiver operating characteristic curve (AUC). The mean (SD) age was 70.4 (7.2) years; 105 (52.8%) were female; 23 (11.6%) were Asian, 1 (0.5%) was Black, 157 (78.9%) were White, and 18 (9.0%) were other or unreported race and ethnicity. Among the sixteen radiological readers included, two were specialized neuroradiologists (12.5%), eleven were male individuals (68.8%), seven were individuals working in academic hospitals (43.8%), and they had a mean (SD) of 9.5 (5.1) years of experience. Radiologists assisted by the software were significantly better at detecting ARIA than unassisted radiologists, with a mean assisted AUC of 0.87 (95% CI, 0.84-0.91) for ARIA-E detection (AUC improvement of 0.05 [95% CI, 0.02-0.08]; P = .001) and 0.83 (95% CI, 0.78-0.87) for ARIA-H detection (AUC improvement of 0.04 [95% CI, 0.02-0.07]; P = .001). Sensitivity was higher in assisted reading compared with unassisted reading (87% vs 71% for ARIA-E detection; 79% vs 69% for ARIA-H detection). Specificity remained above 80% for the detection of both ARIA types.
The software had the greatest improvement in detection of mild cases (70% compared to 47%). The unassisted readers distinguished ARIA grades well but had higher inter-reader agreement with software assistance. Time for reading was similar in both groups. The authors concluded that radiological reading performance for ARIA detection and diagnosis was significantly better when using the AI-based assistive software. The study is limited by concerns for indirectness (small sample size), lack of generalizability (limited representation in the population), and risk of software errors due to computer learning from individual readers which may have been trained on discrepant data. The software was not trained to detect cerebral hemorrhages larger than 1 cm. The software assists but does not replace the need for qualified radiologists to read the studies. An additional validation study compared Icobrain to FreeSurfer with comparable results.28 It demonstrated that Icobrain dm provided more accurate and reliable segmented brain volume measurements than FreeSurfer, particularly evident in hippocampal segmentation, which showed higher Dice similarity coefficients (0.86–0.88 versus 0.80–0.83). The study concluded that Icobrain dm could significantly improve clinical diagnostics for AD, with a recommendation for further research into its effectiveness at earlier stages of the disease and differentiation from other dementia types. However, the lead author is employed by the device maker.
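The Dice similarity coefficient cited for hippocampal segmentation measures the overlap between two segmentations of the same structure. The sketch below shows the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), applied to two hypothetical sets of voxel coordinates; the masks are invented for illustration and do not represent either product's output.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity between two segmentations given as collections of voxel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical 2D masks of 100 voxels each; the manual mask is shifted by one voxel
automated = {(x, y) for x in range(0, 10) for y in range(0, 10)}
manual = {(x, y) for x in range(1, 11) for y in range(0, 10)}

dice = dice_coefficient(automated, manual)
print(f"Dice = {dice:.2f}")
```

A value of 1 indicates identical segmentations and 0 indicates no overlap, so a one-voxel shift of a small structure is enough to lower the score noticeably.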
Another study compared the MRIs from healthy controls (n=90) to those with subjective cognitive decline (n=930), mild cognitive impairment (n=357), and AD (n=820). Icobrain dm not only achieved a high area under the curve (AUC) of 0.914 for distinguishing healthy controls from AD dementia patients but also outperformed FreeSurfer in terms of diagnostic performance and processing time. Icobrain dm demonstrated an impressive specificity of 83.0% and sensitivity of 86.3%. NeuroQuant showed effectiveness in distinguishing Alzheimer's type dementia from non-dementia but was less reliable in differentiating it from other dementia types.29 FreeSurfer is not available clinically.
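The AUC values reported in these studies can be read as the probability that a randomly chosen patient receives a more abnormal score than a randomly chosen control. The following is a minimal sketch of that pairwise definition, assuming a hypothetical "atrophy score" in which higher means more abnormal; the scores are invented and are not study data.

```python
def auc_from_scores(patient_scores, control_scores):
    """AUC as the fraction of patient/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))


# Hypothetical atrophy scores (higher = more atrophy); NOT study data
patients = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.5, 0.3, 0.2]

auc = auc_from_scores(patients, controls)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.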
An empirical study in South Korea analyzed MRI for 98 patients with AD using the VUNO Med-DeepBrain AD (DBAD) deep learning algorithm. They compared the results of the DBAD imaging reads to those of three panels of medical experts (ME) and reported comparable accuracy (87.1% for DBAD and 84.3% for ME), sensitivity (93.3% for DBAD and 80.0% for ME), and specificity (85.5% for DBAD and 85.5% for ME).66
Clinical Utility (or Clinical efficacy)
There are no published studies to date on clinical utility. The software was found to be particularly helpful in improving detection of mild cases of ARIA-E; however, it has not been determined whether this improves patient outcomes.27 The Appropriate Use Recommendations allow continuation of aducanumab* for mild ARIA-E or ARIA-H with monthly MRI monitoring and discontinuation for worsening symptoms. There is a lack of evidence on whether continuation of medication in these cases correlates with improved outcomes or increases the risk of serious adverse events.
Multiple Sclerosis
Multiple sclerosis is an autoimmune demyelinating disease impacting the central nervous system and diagnosis is made by MRI findings, laboratory findings and clinical data. This has led to investigations into developing machine learning tools to aid in the diagnosis of multiple sclerosis (MS).
Several studies examine the connection between brain atrophy and disability progression in MS. The studies call for standardized assessment techniques to effectively utilize brain volume measurements in managing MS.58-63
Several review papers have identified multiple investigations for AI models and the potential this technology may bring but acknowledge these investigations did not yield a clinically usable model.31-34,55
A 2018 systematic review identified English language articles published between 2000 and 2018, screening 614 records and analyzing 85 studies, with 30 meeting inclusion criteria related to system scope, reasoning methods, and evaluation. The review found that although all reasoning methods demonstrated high efficiency in MS diagnosis, performance varied by methodological characteristics, leading many studies to combine approaches to address limitations and improve diagnostic accuracy, highlighting the broader potential of computational methods in MS clinical practice.35
A 2022 systematic review included 38 studies focusing on deep learning or AI to analyze any modality for the purpose of diagnosing MS.36 The majority of these studies utilized MRI (20 studies), with others employing OCT (6 studies), serum/CSF markers (6 studies), and motor function assessments (3 studies). AI, especially using MRI, shows promise in improving MS diagnosis, monitoring, and treatment methodologies. They conclude this is a growing field that could result in drastic improvements in the future, with larger multicenter studies across diverse populations and MS subtypes needed to validate these findings before clinical adoption.36 Another systematic review with meta-analysis from the same group in 2023 included 41 articles (n=5989) and reported a high precision in MS diagnosis for AI studies (95%CI: 88%, 97%), suggesting that AI can aid the clinician in accurate diagnosis of MS and has strong potential to improve MS diagnosis, particularly when based on MRI data.37 The meta-analysis is limited by very high heterogeneity, with overall I²=93%, limiting the validity of these results. The authors conclude that further research is needed to develop robust and generalizable algorithms that adequately address preprocessing, data augmentation, feature selection, optimization, regularization, and model design and architecture.
Another 2022 systematic review included 66 papers which addressed developing classifiers for MS identification or measuring its progression. They also acknowledge the potential benefits of this approach if applied appropriately and provide guidance for further research.1
A retrospective study analyzed MRIs of patients with MS using the Icobrain software platform; of 6826 MRIs, 1207 MRI pairs met inclusion criteria. The investigators reported that Icobrain could be utilized for percentage brain volume change based on strict selection criteria.38 Another study explored the potential role of Icobrain and use of an MS app to inform treatment changes in a small population. Given the heterogeneity of MS and the need for more personalized, data driven, and standardized care pathways, the findings demonstrate the potential and feasibility of linking multiple digital and AI tools into an overarching MS care pathway, while also highlighting limitations related to real world data reliance, patient engagement, and clinical implementation.39
Another 2021 study introduced a digital care management platform for MS, comprising the Icompanion app for patients and Icobrain ms for MRI analysis. The study highlighted the platform's capacity to enhance disease activity detection through patient-reported outcomes and improved MRI analytics, with patient interest in such digital tools peaking at 95.6% for MS apps and 98.2% for tracking MRI changes. However, the study relied on patient-reported outcomes, and the effectiveness of the platform was dependent on the patients’ digital literacy. The study was observational, limiting generalizability.56
Volumetric data for patients with MS were analyzed on the same and different-scanner MRI pairs. Of 6826 MRIs, 85% had appropriate volumetric sequences and 4446 serial MRI pairs were analyzed and 3335 (75%) met inclusion criteria. The percentage brain volume change (PBVC) of the included MRI pairs showed variance of 0.78 % for same-scanner pairs and 0.80 % for different-scanner pairs, but further selection of included MRI pairs with the best variance resulted in 1885 (42%) MRI pairs with PBVC variance of 0.34%. The authors acknowledge the challenges and limitations for brain volumetry measurements and need for standardization to perform adequately. The authors conclude that Icobrain should be utilized for PBVC determination only on selected MRIs with best alignment similarity and with strict selection criteria for the included MRI pairs to reduce PBVC variability.38
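As a reminder of what the PBVC figures describe, the sketch below computes percentage brain volume change from two volume measurements and the spread of PBVC across several MRI pairs. It uses the simple definition PBVC = 100 × (follow-up − baseline) / baseline; this is an assumption for illustration and is not a description of the registration-based method used by Icobrain. All volumes are hypothetical.

```python
import statistics


def pbvc(baseline_ml, followup_ml):
    """Percentage brain volume change between two scans (simple ratio definition)."""
    return 100.0 * (followup_ml - baseline_ml) / baseline_ml


# Hypothetical (baseline, follow-up) whole-brain volumes in mL; NOT study data
mri_pairs = [(1500.0, 1492.5), (1480.0, 1476.3), (1510.0, 1495.0), (1450.0, 1449.0)]

changes = [pbvc(b, f) for b, f in mri_pairs]
spread = statistics.variance(changes)  # sample variance of PBVC across pairs
print([round(c, 2) for c in changes], round(spread, 3))
```

Because expected annual changes are well under 1%, measurement variability of similar magnitude between pairs can obscure a true change, which is why the authors restricted use to MRI pairs with the best alignment.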
The integration of AI into the MS clinical setting has strong potential to transform diagnosis and prognosis, improving patient outcomes through enhanced decision support, efficiency, and care effectiveness. Continued research, data sharing, advances in computational technologies, and progress in interpretability, validation, regulatory, and ethical frameworks will be critical to enabling trustworthy and effective clinical adoption. 33
Brain Tumors
Two poster abstracts and several review papers were identified.40, 41, 42
This retrospective study developed and validated a deep learning model using multimodal MRI to segment enhancing and nonenhancing cellular tumor in patients with glioblastoma and to predict overall and progression free survival. While the model showed strong performance across internal and external cohorts, its reliance on retrospective data, advanced imaging sequences, and expert annotated reference standards highlights limitations related to generalizability, data availability, and real world clinical implementation. 41
Another abstract outlines the evolution of diffusion weighted imaging from its early success in stroke diagnosis to its expanded role in oncologic imaging, particularly in neuro oncology, while addressing common misconceptions about the biophysical basis of diffusion contrast. 42
A report on a novel AI-driven application to aid in brain tumor detection from MRI images describes the development of EfficientNetB2. This report focuses on the proposed technology and evaluation of performance in a non-clinical setting.44 EfficientNetB2 is not FDA cleared as a medical device.
Epilepsy
MRI images read by neuroradiologists were compared with NeuroQuant software analysis in 144 patients with temporal lobe epilepsy (TLE). The investigators found that NeuroQuant had similar specificity to neuroradiologist visual MR imaging analysis (90.4% versus 91.6%; P = .99) but a lower sensitivity (69.0% versus 93.0%, P < .001). Visual MR imaging analysis by a neuroradiologist with expertise in epilepsy had a higher sensitivity than did NeuroQuant analysis, likely due to the inability of NeuroQuant to evaluate changes in hippocampal T2 signal or architecture. The positive predictive value of NeuroQuant analysis was comparable with visual MR imaging analysis (84.0% versus 89.1%), whereas the negative predictive value was not comparable (79.8% versus 95.0%). While NeuroQuant showed comparable specificity and may be useful when results are positive, negative findings require further expert evaluation to avoid false negatives, limiting its use as a standalone diagnostic tool.45 They conclude the technology may aid in evaluations when a neuroradiologist is not available; however, product information states this is intended as an adjunct to, not a replacement for, the radiologist.
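The relationship between the four accuracy measures quoted above can be illustrated with a small sketch. The counts below are hypothetical and were not reconstructed from the study; they are chosen only to show how a test with good specificity but limited sensitivity ends up with a weaker negative predictive value. Function and variable names are illustrative assumptions.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased cases correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased cases correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical counts (NOT study data): 100 diseased and 100 non-diseased cases
metrics = diagnostic_metrics(tp=69, fp=10, fn=31, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this illustration the missed cases (false negatives) reduce the negative predictive value, which is the same pattern that leads the authors to recommend expert review of negative software results.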
A prospective study measured volumetric MRI data for 34 patients with TLE and compared them to 116 control subjects.46 Structural volumes were calculated using automated quantitative MRI analysis software (NeuroQuant). Results of the quantitative MRI were compared with visual detection of atrophy and histological specimens if available. Quantitative MRI results compared to visual inspection of the volumetric MRI studies by two experienced neuroradiologists showed concordance for hippocampal asymmetry of 91-97%. They reported the software discriminated patients with TLE from control subjects with high sensitivity (86.7%–89.5%) and specificity (92.2%–94.1%). The authors conclude that the software can provide “an expert eye” in centers that lack expertise; however, the FDA indications for this software indicate it is intended to be an adjunct to the radiologist reader, not a substitute. Limitations of the study include lack of generalizability to non-expert readers, small sample size, and lack of histological confirmation available for 12 patients (35%).
A retrospective report included 36 patients with mesial temporal sclerosis (MTS), which is important to detect in temporal lobe epilepsy because it often guides surgical intervention. One of the features of MTS is hippocampal volume loss. Using electronic medical records, researchers identified patients with proven MTS and analyzed their imaging with volumetric assessment software (NeuroQuant). They reported an estimated neuroradiologist accuracy of 72.6%, with a kappa statistic of 0.512 (95% CI, 0.388–0.787). They conclude that the NeuroQuant software compared favorably with trained neuroradiologists in predicting MTS.47
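A kappa statistic adjusts raw agreement for the agreement expected by chance, which is why a kappa near 0.5 can accompany an accuracy above 70%. The sketch below shows the standard Cohen's kappa calculation for two binary ratings; the function name and counts are hypothetical and are not drawn from the report, which does not describe how its statistic was computed.

```python
def cohens_kappa(both_pos, a_pos_b_neg, a_neg_b_pos, both_neg):
    """Return (observed agreement, Cohen's kappa) for two binary ratings."""
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Marginal probability that each rating is positive.
    a_pos = (both_pos + a_pos_b_neg) / n
    b_pos = (both_pos + a_neg_b_pos) / n
    # Agreement expected if the two ratings were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (not taken from any cited study).
observed, kappa = cohens_kappa(both_pos=20, a_pos_b_neg=5,
                               a_neg_b_pos=10, both_neg=15)
print(f"observed agreement: {observed:.2f}, kappa: {kappa:.2f}")
```

In this illustration the two ratings agree in 70% of cases, but half of that agreement would be expected by chance, so kappa is considerably lower than the raw agreement figure.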
Other literature identified was limited to review papers, case reports, and case series and was not included in this assessment.
Traumatic Brain Injury
Twenty MRI images from patients with mild to moderate traumatic brain injury (TBI) were analyzed with NeuroQuant automated software and compared with the attending radiologist’s interpretation. The investigators reported that the radiologist’s traditional approach found at least one sign of atrophy in 10.0% of patients, compared with 50.0% of patients for NeuroQuant, and concluded that NeuroQuant had higher sensitivity.48 A subsequent expanded study with 24 subjects found similar results.49 These studies are limited by very small sample sizes and by uncertainty as to whether the atrophy was caused by TBI or by other conditions that can produce similar findings. The authors state “we have never seen an MRI report on a patient that used a qualitative rating scale to assess level of atrophy or ventricular enlargement. With the rapid advances in computer-based technology, instead of focusing on understanding and developing the approach based on qualitative ratings, it may be more advantageous to focus on the computer-automated approaches.” In the absence of qualitative ratings, a comparison between the gold standard visual approach and the quantitative approach is not established, and the claim that the software was 50% more accurate is not validated.
Comparison Between Technologies
A review paper on AI for brain neuroanatomical segmentation in MRI concludes that accuracy is high and performance is fast overall. The application shows strong potential to enhance clinical diagnostics and surgical planning by improving the efficiency and accuracy of brain image analysis. Despite advances in automated segmentation, key challenges remain, including robustness to anatomical variability and maintaining high accuracy across diverse patient populations and imaging conditions.43 One of the challenges in developing automated volumetry software is the lack of a gold standard for brain measurements against which to establish whether the software output reflects true anatomy. Current software products measure brain segments with different methods and tools and therefore lack standardized measures for comparison. Efforts to understand the performance of different software products are ongoing, as consistency between programs is an important component of creating reliable standards that can be applied to clinical practice.
Multiple investigations compared the inter-method reliability of the NeuroQuant (NQ) and FreeSurfer (FS) computer-automated programs for measuring MRI brain volume. The two methods show high inter-method reliability yet often differ substantially in absolute volume estimates, making untransformed measures unsuitable for combined statistical analyses because method differences can masquerade as group effects. Linear transformations can reduce these discrepancies for some regions, but their usefulness is study specific, requires independent validation, and is inadequate when moderate or large effect size differences remain. While Bayesian regression methods appear to resolve most method-related differences, they are not yet widely available or routinely used, and the relative accuracy of NQ versus FS across brain regions remains unclear.50,51 Using 56 MRIs from patients with AD or MS, investigators compared results between NeuroQuant and volBrain automated brain analysis software and found high reliability except in the thalamus and amygdala, where reliability was poor. A study of 115 MRIs from patients with clinically isolated syndrome, each measured with both NeuroQuant and FMRIB's Integrated Registration and Segmentation Tool (FIRST), found some variability between the tools, with larger volumes achieving better agreement. The presence of MS lesions affected volume estimates and must be accounted for to obtain reliable results. Differences between groups varied by software, with FIRST identifying more significant regional differences, underscoring the need for more accurate, clinically robust MRI volumetric tools before routine clinical use.52 Another investigation compared NeuroQuant with DeepBrain (DB) and found significant differences in many brain regions.30 NeuroQuant reported significantly higher normative percentiles for the thalamus, putamen, and parietal lobe than DeepBrain, while DeepBrain reported higher values for the occipital lobe.
The study also found a significant correlation between these differences and cranial shape, indicating that skull morphology should be considered in brain volumetry assessments, especially across different ethnicities. A retrospective report compared images from 87 patients with memory impairment using FreeSurfer, NeuroQuant, and Heuron AD and found significant differences between the programs. Heuron AD is indicated for PET amyloid scans.53
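The NeuroQuant and FreeSurfer comparisons in this section found that two tools can agree closely in how they rank patients while still disagreeing in absolute volume. The sketch below illustrates that idea with an ordinary least squares linear transformation in plain Python. The volumes are hypothetical values invented for illustration, and the helper names are assumptions, not functions from any of the software packages discussed.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_linear_transform(x, y):
    """Least squares slope and intercept so that y is approximately slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical regional volumes (mL) for the same five scans from two tools.
method_a = [3.0, 3.4, 3.8, 4.2, 4.6]
method_b = [4.12, 4.55, 5.08, 5.52, 6.03]

r = pearson_r(method_a, method_b)
bias_before = mean([b - a for a, b in zip(method_a, method_b)])

slope, intercept = fit_linear_transform(method_a, method_b)
transformed_a = [slope * a + intercept for a in method_a]
bias_after = mean([b - t for t, b in zip(transformed_a, method_b)])

print(f"correlation: {r:.3f}")
print(f"mean difference before transform: {bias_before:.2f} mL")
print(f"mean difference after transform: {bias_after:.2f} mL")
```

Here the mean difference disappears after the transformation only because the transformation was fitted to the same scans it is applied to. As noted above, such a transformation is study specific and would need validation on independent data before pooled analysis.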
A study compared Structural Image Evaluation using Normalisation of Atrophy (SIENA) with the Icobrain longitudinal pipeline (Icobrain long) for assessing longitudinal whole brain atrophy (WBA) in MS patients. The investigators found a strong correlation between SIENA and Icobrain long (r = 0.805, p < 0.001), with Icobrain long consistently quantifying longitudinal WBA in MS patients. The study highlights that WBA can serve as a potential biomarker for neurodegeneration in MS, aiding individual patient management and treatment monitoring.57
Another study compared brain volumetric measurements in MS obtained with Structural Image Evaluation using Normalisation of Atrophy-Cross-sectional (SIENAX) with those from NeuroQuant and MSmetrix.54 SIENAX is widely used in cross-sectional MS studies, but its clinical applications are limited. The authors compared the performance of NeuroQuant and MSmetrix with that of SIENAX and concluded that the results were comparable.
FreeSurfer, volBrain, and FIRST are not FDA cleared and are used for research purposes.
*Aducanumab has been discontinued as a treatment for Alzheimer’s disease.