Medical Policy
Policy Num: 11.001.009
Policy Name: Noninvasive Techniques for the Evaluation and Monitoring of Patients with Chronic Liver Disease
Policy ID: [11.001.009] [Ac / B / M+ / P+] [2.04.41]
Last Review: November 10, 2025
Next Review: November 20, 2026
Related Policies:
5.01.13 - Pharmacologic Treatments for Metabolic Dysfunction-Associated Steatohepatitis
| Population Reference No. | Populations | Interventions | Comparators | Outcomes |
| 1 | Individuals: · With chronic liver disease | Interventions of interest are: · FibroSURE serum panels | Comparators of interest are: · Liver biopsy · Noninvasive radiologic methods · Other multianalyte serum assays | Relevant outcomes include: · Test validity · Morbid events · Treatment-related morbidity |
| 2 | Individuals: · With chronic liver disease | Interventions of interest are: · Multianalyte serum assays for liver function assessment other than FibroSURE | Comparators of interest are: · Liver biopsy · Noninvasive radiologic methods · Other multianalyte serum assays | Relevant outcomes include: · Test validity · Morbid events · Treatment-related morbidity |
| 3 | Individuals: · With chronic liver disease | Interventions of interest are: · Transient elastography | Comparators of interest are: · Liver biopsy · Other noninvasive radiologic methods · Multianalyte serum assays | Relevant outcomes include: · Test validity · Morbid events · Treatment-related morbidity |
| 4 | Individuals: · With chronic liver disease | Interventions of interest are: · Multiparametric magnetic resonance imaging | Comparators of interest are: · Liver biopsy · Other noninvasive radiologic methods · Multianalyte serum assays | Relevant outcomes include: · Test validity · Morbid events · Treatment-related morbidity |
| 5 | Individuals: · With chronic liver disease | Interventions of interest are: · Noninvasive radiologic methods other than transient elastography or multiparametric magnetic resonance imaging for liver fibrosis measurement | Comparators of interest are: · Liver biopsy · Other noninvasive radiologic methods · Multianalyte serum assays | Relevant outcomes include: · Test validity · Morbid events · Treatment-related morbidity |
Noninvasive techniques to monitor liver fibrosis are being investigated as alternatives to liver biopsy in patients with chronic liver disease. There are 2 options for noninvasive monitoring: (1) multianalyte serum assays with algorithmic analysis of either direct or indirect biomarkers; and (2) specialized radiologic methods, including magnetic resonance elastography, multiparametric magnetic resonance imaging (MRI), transient elastography, acoustic radiation force impulse imaging, and real-time tissue elastography.
For individuals who have chronic liver disease (CLD) who receive FibroSURE serum panels, the evidence includes systematic reviews of more than 30 observational studies (>5000 patients). Relevant outcomes are test validity, morbid events, and treatment-related morbidity. FibroSURE has been studied in populations with alcohol-associated liver disease (ALD), nonalcoholic fatty liver disease (NAFLD)/metabolic dysfunction-associated steatotic liver disease (MASLD), and viral hepatitis. There are established cutoffs, although they were not consistently used in validation studies. Given these limitations and the imperfect reference standard, it is difficult to interpret performance characteristics. However, for the purposes of deciding whether a patient has severe fibrosis or cirrhosis, FibroSURE results provide data sufficiently useful to determine therapy. Specifically, FibroSURE has been used as an alternative to biopsy to establish eligibility regarding the presence of fibrosis or cirrhosis in several randomized controlled trials (RCTs) that showed the efficacy of hepatitis C virus (HCV) treatments, which in turn demonstrated that the test can identify patients who would benefit from therapy. The evidence is sufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have CLD who receive multianalyte serum assays for liver function assessment other than FibroSURE, such as the Enhanced Liver Fibrosis (ELF) test and OWLiver panel, the evidence includes observational studies and systematic reviews. The ELF test shows high sensitivity but lower specificity for detecting advanced fibrosis in NAFLD/MASLD, especially at lower thresholds. The positive predictive value of the test improves with higher thresholds and greater disease prevalence. A systematic review conducted in support of the American Association for the Study of Liver Diseases (AASLD) Practice Guidelines (2024) reported conflicting data on the diagnostic accuracy of ELF compared with nonproprietary blood-based tests, such as the Fibrosis-4 (FIB-4) Index and the NAFLD/NASH fibrosis score (NFS), for the detection of fibrosis in NAFLD. The AASLD noted that in community-based and other low-prevalence cohorts, blood-based noninvasive tests are useful for excluding advanced fibrosis with high negative predictive value but require additional noninvasive tests to improve their positive predictive value. A multicenter cross-sectional study demonstrated high accuracy of the OWLiver panel for diagnosing MASH and advanced fibrosis in patients with obesity and type 2 diabetes, with consistent results across obesity levels and diabetes control. Further studies comparing the OWLiver panel to nonproprietary tests in larger and more diverse patient populations are necessary to confirm these findings. There is no direct evidence that either of these multianalyte serum assays improves health outcomes; further, it is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence on clinical validity. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have CLD who receive transient elastography (TE), the evidence includes many systematic reviews of more than 50 observational studies (>10,000 patients). Relevant outcomes are test validity, morbid events, and treatment-related morbidity. TE (FibroScan) has been studied in populations with viral hepatitis, NAFLD, and ALD. There are varying cutoffs for positivity. Failures of the test are not uncommon, particularly for those with high body mass index, but these failures often went undetected in analyses of the validation studies. Given these limitations and the imperfect reference standard, it can be difficult to interpret performance characteristics. However, for the purposes of deciding whether a patient has severe fibrosis or cirrhosis, the FibroScan results provide data sufficiently useful to determine therapy. In fact, FibroScan has been used as an alternative to biopsy to establish eligibility regarding the presence of fibrosis or cirrhosis in the participants of several RCTs. These trials showed the efficacy of HCV treatments, which in turn demonstrated that the test can identify patients who would benefit from therapy. The evidence is sufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have CLD who receive multiparametric magnetic resonance imaging (MMRI), the evidence includes several prospective and retrospective observational studies. Multiparametric MRI (e.g., LiverMultiScan) has been studied in mixed populations, including NAFLD, viral hepatitis, and ALD. Quantitative MRI provides various measures to assess liver fat content, fibrosis and inflammation. Various cutoffs have been utilized for positivity. Given these limitations and the imperfect reference standard, it can be difficult to interpret performance characteristics. Otherwise, multiparametric MRI performed similarly to transient elastography, and fewer technical failures of multiparametric MRI were reported. The prognostic ability of quantitative MRI to predict liver-related clinical events has been evaluated in 2 studies. Both studies reported positive correlations, but the confidence interval was wide. Larger cohorts with a longer follow-up time would be useful to further derive the prognostic characteristic of the test. Multiparametric MRI has been used to measure the presence of fibrosis or cirrhosis in patients who have achieved biochemical remission after treatment in small prospective studies. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
For individuals who have CLD who receive noninvasive radiologic methods other than TE or MMRI for liver fibrosis measurement, the evidence includes systematic reviews of observational studies and a comparative study with 5-year follow-up. Relevant outcomes are test validity, morbid events, and treatment-related morbidity. Other radiologic methods (e.g., magnetic resonance elastography [MRE], real-time tissue elastography [RTE], acoustic radiation force impulse [ARFI] imaging) may have similar performance for detecting significant fibrosis or cirrhosis. A systematic review conducted to support the AASLD Practice Guidelines (2024) reported that liver-stiffness measurement from MRE and shear wave elastography/ARFI (in addition to TE) shows high accuracy for the detection of liver fibrosis across various liver disease etiologies. Accuracy increased from F2-4 to F3-4 and was the highest for F4. In the comparative study, ARFI elastography was found to be at least as effective as liver histology in predicting liver-related survival, and was superior to both histology and the FIB-4 score in predicting certain liver-related complications. Studies have frequently included varying cutoffs that were not prespecified or validated. Given these limitations and the imperfect reference standard, it is difficult to interpret performance characteristics. There is no direct evidence that other noninvasive radiologic methods improve health outcomes; further, it is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence on clinical validity. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
Not applicable.
The objective of this evidence review is to determine whether the use of noninvasive techniques for detecting liver fibrosis compared with liver biopsy can improve the net health outcome in patients with chronic liver disease.
Multianalyte Assays
The FibroSURE multianalyte assay may be considered medically necessary for the evaluation of fibrosis staging in individuals with chronic liver disease.
FibroSURE multianalyte assays are considered investigational for monitoring of individuals with chronic liver disease.
Other multianalyte assays with algorithmic analyses are considered investigational for the evaluation or monitoring of individuals with chronic liver disease.
Noninvasive Imaging Technologies
Transient elastography (e.g., FibroScan) imaging may be considered medically necessary for the evaluation of individuals with chronic liver disease.
Transient elastography imaging is considered investigational for monitoring of individuals with chronic liver disease.
The use of other noninvasive imaging, including but not limited to magnetic resonance elastography, multiparametric magnetic resonance imaging, acoustic radiation force impulse imaging, or real-time tissue elastography, is considered investigational for the evaluation or monitoring of individuals with chronic liver disease.
Increased fibrosis stage has important prognostic implications in nonalcoholic fatty liver disease (NAFLD) (now metabolic dysfunction-associated steatotic liver disease, MASLD, see Background).
The American Association for the Study of Liver Diseases (AASLD) has developed an algorithm intended to be used by clinicians in need of a readily available and simple decision support tool for liver disease assessment (see below). The AASLD recommends that fibrosis staging begin with nonproprietary blood-based tests because of their wide availability and performance compared to proprietary tests. Nonproprietary tests include the Fibrosis-4 (FIB-4) Index and the NAFLD/NASH fibrosis score (NFS), which are used as initial blood-based tests to rule out advanced fibrosis. The FIB-4 Index estimates the likelihood of advanced liver fibrosis (scarring) by combining a patient's age with aspartate aminotransferase (AST), alanine aminotransferase (ALT), and platelet count values. A low FIB-4 score (typically <1.3 or <1.45) suggests a low risk of advanced fibrosis, while a high score (typically >2.67 or >3.25) indicates a high risk and may warrant further assessment, potentially a liver biopsy.
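As a minimal sketch of the arithmetic behind the FIB-4 Index, the snippet below uses the commonly published formula, age × AST / (platelet count × √ALT), with AST and ALT in U/L and platelets in 10^9/L. The formula is not spelled out in this policy and is stated here as an assumption. The function names are hypothetical, and the default cutoffs of 1.3 and 2.67 are only one of the threshold pairs mentioned above. This is an illustration, not a clinical decision tool.

```python
import math


def fib4_index(age_years: float, ast_u_l: float, alt_u_l: float,
               platelets_10e9_l: float) -> float:
    """Compute FIB-4 as (age x AST) / (platelets x sqrt(ALT)).

    Assumed formula (not given in the policy text); AST and ALT in U/L,
    platelet count in 10^9/L.
    """
    if alt_u_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def fib4_category(score: float, low: float = 1.3, high: float = 2.67) -> str:
    """Classify a FIB-4 score against illustrative low/high cutoffs."""
    if score < low:
        return "low risk of advanced fibrosis"
    if score > high:
        return "high risk of advanced fibrosis"
    # Scores between the two cutoffs are neither ruled out nor ruled in.
    return "indeterminate"


# Hypothetical patient: 50 years old, AST 40 U/L, ALT 36 U/L, platelets 200 x 10^9/L.
score = fib4_index(50, 40, 36, 200)
print(round(score, 2), fib4_category(score))
```

For this hypothetical patient the score falls between the two cutoffs, which is the situation in which secondary assessment would be considered.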
The NFS score is calculated using a formula that considers the following factors: age, body mass index (BMI), diabetes status, and blood test results (AST/ALT ratio, albumin, platelet count). The NFS is interpreted as follows:
The AASLD Practice Guidelines Committee commissioned a diverse group of experts across multiple disciplines in the field of adult and pediatric liver disease to develop guidelines and guidance statements along with a systematic review covering blood-based noninvasive tests to address specific clinically focused questions. Of these tests, FIB-4 was considered to have superior performance, particularly for the identification of F3-4 stages of fibrosis, which is the spectrum of fibrosis for which the tests were designed. NFS was considered an equivalent to FIB-4 in patients with NAFLD in the assessment of advanced fibrosis. FIB-4 thresholds of ≤1.30 and ≥2.67, and NFS thresholds of ≤-1.455 and ≥0.676, have been proposed as having higher predictive values for F3-4 in NAFLD. The AASLD recommends that in the appropriate clinical setting (i.e., low pre-test probability), both tests should suffice to rule out significant/advanced fibrosis.
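The NFS calculation and the thresholds quoted above can be sketched as follows. The coefficients are those of the commonly published NAFLD fibrosis score formula and are an assumption here, since the policy lists only the input factors. Platelets are taken in 10^9/L and albumin in g/dL. The function names are hypothetical, and the thresholds of -1.455 and 0.676 come from the paragraph above.

```python
def nafld_fibrosis_score(age_years: float, bmi_kg_m2: float, diabetes_or_ifg: bool,
                         ast_u_l: float, alt_u_l: float,
                         platelets_10e9_l: float, albumin_g_dl: float) -> float:
    """NFS using assumed published coefficients (not stated in the policy text)."""
    if alt_u_l <= 0:
        raise ValueError("ALT must be positive")
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if diabetes_or_ifg else 0)
            + 0.99 * (ast_u_l / alt_u_l)
            - 0.013 * platelets_10e9_l
            - 0.66 * albumin_g_dl)


def nfs_category(score: float, low: float = -1.455, high: float = 0.676) -> str:
    """Apply the lower and upper NFS thresholds quoted in the text."""
    if score <= low:
        return "advanced fibrosis unlikely"
    if score >= high:
        return "advanced fibrosis likely"
    # Values between the thresholds call for confirmatory testing.
    return "indeterminate"


# Hypothetical patient: 50 years, BMI 30, diabetes, AST 40, ALT 40,
# platelets 200 x 10^9/L, albumin 4.0 g/dL.
score = nafld_fibrosis_score(50, 30, True, 40, 40, 200, 4.0)
print(round(score, 3), nfs_category(score))
```

Keeping the thresholds as parameters makes it easy to substitute other cutoffs if a different guideline or population is being considered.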
Confirmatory testing (secondary assessment) such as noninvasive imaging technologies should be performed for patients with values between the lower and upper thresholds of these tests. Patients with FIB-4 scores less than 1.3 are unlikely to have advanced fibrosis. High-risk individuals, such as those with type 2 diabetes, medically complicated obesity, family history of cirrhosis, or more than mild alcohol consumption, should be screened for advanced fibrosis.
The AASLD Practice Guidelines Committee made an ungraded statement that in adults with CLD, either ultrasound-based elastography methods or magnetic resonance elastography (MRE) can be utilized to stage fibrosis. Depending on local availability and expertise, it is reasonable to perform MRE as an investigation when concomitant cross-sectional imaging is needed or for patients in whom the accuracy of US-based elastography might be compromised.
See the Codes table for details.
Both FibroSURE and FIBROSpect are offered exclusively by reference laboratories, where the global charge will reflect the cost of the underlying laboratory analysis, and then, in addition, the charge associated with the use of the proprietary algorithm to analyze the data.
State or federal mandates (eg, Federal Employee Program) may dictate that certain U.S. Food and Drug Administration approved devices, drugs, or biologics may not be considered investigational, and thus these devices may be assessed only by their medical necessity.
Benefits are determined by the group contract, member benefit booklet, and/or individual subscriber certificate in effect at the time services were rendered. Benefit products or negotiated coverages may have all or some of the services discussed in this medical policy excluded from their coverage.
Chronic liver disease (CLD) is associated with approximately 2 million deaths annually worldwide. CLD is a progressive deterioration of liver function lasting more than 6 months that adversely affects the synthesis of clotting factors and other proteins, the detoxification of harmful products of metabolism, and the excretion of bile. CLD is a continuous process of inflammation, destruction, and regeneration of liver parenchyma, which leads to fibrosis and cirrhosis. Multiple etiologies are associated with CLD, including toxin exposures, chronic alcohol abuse, infection, autoimmune diseases, and genetic and metabolic disorders. CLD is the ninth leading cause of death in the United States (U.S.). According to the National Center for Health Statistics of the U.S. Centers for Disease Control and Prevention, approximately 4.5 million adults had CLD and cirrhosis, representing 1.8% of the adult population. There were 52,222 deaths through 2023 (15.6 deaths per 100,000 population) from CLD and cirrhosis.1,
Steatosis (also known as fatty liver disease) is a condition caused by an excessive buildup of fat in the liver. Steatotic liver disease (SLD) is a generic term for the accumulation of lipids in liver parenchymal cells. Primary risk factors for SLD include alcohol, insulin resistance, and obesity. In 2023, a global consensus conference described 5 subclasses of SLD: metabolic dysfunction-associated steatotic liver disease (MASLD), formerly known as nonalcoholic fatty liver disease (NAFLD); alcohol-associated liver disease (ALD); SLD with specific etiology (e.g., drug-induced); cryptogenic SLD, and MASLD with increased alcohol intake (MetALD).2,
The Brunt-Kleiner scoring system and the NASH Clinical Research Network (CRN) scoring system (i.e., NAFLD Activity Score, NAS) are two of the most widely used methods for histologically assessing steatosis and fibrosis in MASLD. The Brunt-Kleiner system has four possible grades (0-3) and five possible stages (0-4). The NAS is an 8-point scale classifying the severity of steatosis (score: 0-3), lobular inflammation (score: 0-3), and ballooning (score: 0-2), with greater scores indicating more severe disease. Both systems determine the degree of steatosis based on the percentage of steatotic hepatocytes involved: normal <5%, mild =5% to 33%, moderate =34% to 66%, and severe >66%.
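The scoring rules just described can be sketched in a few lines. The function names are hypothetical. The steatosis bands and component ranges are taken from the paragraph above, and the handling of fractional percentages between bands (for example 33.5%) is an assumption made only so that the example is total.

```python
def steatosis_grade(percent_steatotic_hepatocytes: float) -> tuple:
    """Map percentage of steatotic hepatocytes to (score, label)."""
    p = percent_steatotic_hepatocytes
    if not 0 <= p <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if p < 5:
        return 0, "normal"
    if p <= 33:
        return 1, "mild"
    if p <= 66:  # assumption: values between 33 and 34 fall in this band
        return 2, "moderate"
    return 3, "severe"


def nafld_activity_score(steatosis: int, lobular_inflammation: int,
                         ballooning: int) -> int:
    """Sum the three NAS components after checking their allowed ranges."""
    if steatosis not in range(0, 4):
        raise ValueError("steatosis score must be 0-3")
    if lobular_inflammation not in range(0, 4):
        raise ValueError("lobular inflammation score must be 0-3")
    if ballooning not in range(0, 3):
        raise ValueError("ballooning score must be 0-2")
    return steatosis + lobular_inflammation + ballooning


# Hypothetical biopsy: 40% steatotic hepatocytes, inflammation 1, ballooning 2.
grade, label = steatosis_grade(40)
print(label, nafld_activity_score(grade, 1, 2))
```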
Fibrosis scores are generally disease-specific and technically cannot be unified across different CLDs. To achieve a unified approach, the American Association for the Study of Liver Diseases (AASLD) Practice Guidelines Committee incorporated the different fibrosis staging systems by consolidating them into a single framework. The AASLD defined three primary categories: "at least significant fibrosis," corresponding to fibrosis stage 2 or higher (F2-4); "at least advanced fibrosis," encompassing stages F3 and F4; and "cirrhosis," represented by stage F4 (Table 1).3,4,
| | | | Significant Fibrosis (F2-F4) | | |
| | | | | Advanced Fibrosis (F3-F4) | |
| Etiology | F0 | F1 | F2 | F3 | F4 |
| ALD | | | | | |
| MASLD [Brunt-Kleiner system] | | | | | |
| Viral and Autoimmune Hepatitis | | | | | |
| PBC and PSC | | | | | |
| Various etiologies [Metavir system] | | | | | |
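The three consolidated AASLD categories described before the table can be expressed as a simple mapping from fibrosis stage. The sketch below is illustrative only: the function name is hypothetical, and the stage is assumed to be supplied as an integer from 0 to 4 regardless of which disease-specific staging system produced it.

```python
from typing import List


def aasld_fibrosis_categories(stage: int) -> List[str]:
    """Return the consolidated categories that a fibrosis stage (F0-F4) falls into."""
    if stage not in range(0, 5):
        raise ValueError("fibrosis stage must be an integer from 0 to 4")
    categories = []
    if stage >= 2:
        categories.append("at least significant fibrosis")  # F2-F4
    if stage >= 3:
        categories.append("at least advanced fibrosis")  # F3-F4
    if stage == 4:
        categories.append("cirrhosis")  # F4 only
    return categories


for stage in range(5):
    print(f"F{stage}:", aasld_fibrosis_categories(stage) or ["below significant fibrosis"])
```

The categories are nested rather than mutually exclusive, which is why the function returns a list: a patient at stage F4 meets all three definitions.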
Alcohol-Associated Liver Disease
ALD is a major cause of liver disease worldwide, both on its own and as a co-factor in the progression of chronic viral hepatitis, MASLD, iron overload, and other liver diseases.5,6, ALD represents a spectrum of liver injury resulting from alcohol use, ranging from steatosis to steatohepatitis and cirrhosis. ALD progression relies on persistent alcohol use and factors such as genetics, sex, diet, and concurrent liver conditions.
Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD) formerly Nonalcoholic Fatty Liver Disease (NAFLD)
In 2023, the AASLD and other professional societies adopted new nomenclature for the spectrum of NAFLD. The new terminology reflected the role of metabolic dysfunction in the development of what is now termed MASLD. Given this recent nomenclature shift, this policy will continue to use the abbreviations NAFLD and NASH (nonalcoholic steatohepatitis) unless a publication specifically refers to MASLD or MASH (metabolic dysfunction-associated steatohepatitis).
MASLD is characterized by hepatic steatosis (>5%) along with at least one cardiometabolic risk factor, no other causes of SLD, and minimal or no alcohol consumption. MASH, a more severe subtype of MASLD, is a progressive liver disease characterized by the presence of at least 5% hepatic steatosis, along with hepatocellular damage and inflammation.7,2, This condition can develop into advanced liver fibrosis, cirrhosis, and hepatocellular carcinoma (HCC), all of which are linked to significant morbidity and mortality. In the U.S., MASH ranks among the leading causes of HCC and is the second most common reason for liver transplantation after hepatitis C.8, Once MASH advances to clinically significant fibrosis (stages F2 and F3), the risk of serious clinical outcomes rises. Cardiovascular incidents are the primary cause of death in individuals with MASH, with non-liver cancers being the second leading cause.9,10,
Infectious Etiologies
Hepatitis C Virus
Infection with hepatitis C virus (HCV) can lead to permanent liver damage. Prior to noninvasive testing, liver biopsy was typically recommended before the initiation of antiviral therapy. Repeat biopsies may be performed to monitor fibrosis progression. Liver biopsies are analyzed according to the most commonly used histologic scoring system known as the Metavir system. The Metavir system includes scores for fibrosis (Table 1) and necroinflammatory activity (which refers to a combination of cellular events in which tissue necrosis is accompanied by an inflammatory response). This activity is graded as A0 = no activity, A1 = mild activity, A2 = moderate activity, and A3 = severe activity.
Hepatitis B Virus
Most people who become infected with hepatitis B virus (HBV) recover fully, but a small portion develops chronic HBV, which can lead to permanent liver damage. Identification of liver fibrosis is needed to determine timing and management of treatment, and liver biopsy is the criterion standard for staging fibrosis. The Metavir grading system is applied to HBV.
Autoimmune Etiologies
Autoimmune liver diseases include autoimmune hepatitis (AIH), primary biliary cirrhosis (PBC), and primary sclerosing cholangitis (PSC). AIH is a rare, chronic inflammatory condition leading to liver parenchyma destruction by autoantibodies, commonly affecting women and associated with antinuclear antibodies, anti-smooth muscle antibodies, and hypergammaglobulinemia. PBC involves progressive autoimmune destruction of intrahepatic biliary channels, portal inflammation, and fibrosis, resulting in cholestatic jaundice, primarily in middle-aged women, with increased alkaline phosphatase. PSC, often linked to ulcerative colitis, is characterized by inflammation and fibrosis reducing intrahepatic and extrahepatic bile duct size, leading to bile duct strictures and cholestasis.
Genetic Etiologies
Alpha-1 antitrypsin deficiency, hereditary hemochromatosis, and Wilson disease are genetic etiologies of childhood-onset CLD. Alpha-1 antitrypsin deficiency is the most common. Hemochromatosis and Wilson disease are autosomal recessive conditions. Hemochromatosis involves HFE gene mutations causing excess iron deposition in the liver, and Wilson disease involves ATP7B gene mutations causing excess copper buildup.
Other Etiologies
A wide range of drugs and drug classes can cause hepatotoxicity. Various vascular abnormalities, including but not limited to Budd-Chiari syndrome, can also lead to advanced liver damage. Budd-Chiari syndrome is a rare vascular disorder caused by the obstruction of the hepatic venous outflow tract, which can be triggered by a hypercoagulable state resulting from specific medications. In 5% to 10% of cases, the cause is unknown (cryptogenic or idiopathic).11,
The diagnosis of non-neoplastic liver disease can be made from needle biopsy samples. In addition to establishing a disease etiology, liver biopsy can determine the degree of inflammation present and stage the degree of fibrosis (see Table 1).
Accurate assessment of the degree of hepatic fibrosis and steatosis is essential in predicting prognosis and making treatment recommendations in individuals with CLD. While liver biopsy has long been the reference standard for assessing fibrosis and steatosis, the procedure is costly, invasive, and carries a small, but important, risk of complications. The frequency of biopsy-related complications varies based on operator experience, underlying comorbidities, size of the needle, number of needle passes, and hemostatic abnormalities such as thrombocytopenia and/or prolonged prothrombin time.3,
Noninvasive Alternatives to Liver Biopsy
Multiple noninvasive blood-based biomarkers and imaging technologies have been developed to reduce the need for liver biopsies. The term “noninvasive liver disease assessment(s)” (NILDA) has been used to describe these tests. They have been developed to determine the presence and severity of liver fibrosis, steatosis, and clinically significant portal hypertension.3, They offer safer and more repeatable assessments for disease progression and treatment response.
Multianalyte Assays
Multianalyte tests for CLD typically combine several blood-based biomarkers and clinical data (like age, sex, BMI) into a proprietary algorithm to assess steatosis, fibrosis, or liver cancer risk. These assays are often used in conjunction with imaging technologies to provide a comprehensive, non-invasive assessment of liver status. Most commercially available laboratory-developed biomarker tests for liver fibrosis are regulated under the Clinical Laboratory Improvement Amendments standards. These laboratory-developed tests (LDTs) have not been cleared or approved by the Food and Drug Administration (FDA).
The FDA cleared the ADVIA Centaur Enhanced Liver Fibrosis (ELF) test for marketing in the U.S. as a novel Class II medical device following a De Novo review (513(f)(2) pathway, DEN190056).
Table 2 lists the proprietary algorithm-based serum markers for liver fibrosis which are currently available in the U.S.:
| Test (Manufacturer) | Description | Regulatory Status |
| FibroSURE (LabCorp) | | |
| FIBROSpect II (Prometheus Laboratories) | | |
| OWLiver panel (CIMA Sciences in partnership with Luxor Scientific) | | |
| Enhanced Liver Fibrosis (Siemens Healthineers) | | |
Noninvasive Imaging Technologies
Noninvasive imaging technologies to detect liver fibrosis or cirrhosis among patients with CLD are being evaluated as alternatives to liver biopsy. The noninvasive imaging technologies for review are transient elastography (TE), magnetic resonance elastography (MRE), acoustic radiation force impulse (ARFI) imaging, multiparametric magnetic resonance imaging (MRI), and real-time tissue elastography (RTE). Noninvasive imaging tests have been used in combination with multianalyte serum tests.
| Technology | Description | Device (Vendor, FDA Decision Date, 510(k) Number) |
| Ultrasound Technologies | | |
| ARFI imaging (shear wave elastography) | | |
| RTE | | |
| TE | | |
| Magnetic Resonance Technologies | | |
| MRE | | |
| MMRI | | |
Comprehensive treatment of CLD includes etiological management, lifestyle modifications, pharmacotherapy, nutritional support, prevention and management of complications, regular monitoring, and health education. For individuals with advanced liver disease, such as those experiencing cirrhosis or hepatic failure, liver transplantation may ultimately become the only effective option. Chronic hepatitis B or C can be treated with antivirals, such as lamivudine, entecavir, or tenofovir (for HBV), or direct-acting antivirals like sofosbuvir or ledipasvir/sofosbuvir (Harvoni) (for HCV).12,
There are two pharmacologic treatment options for MASH as adjuncts to lifestyle interventions.13,14,15, Lifestyle modification, including weight loss through a hypocaloric diet and physical activity, remains the cornerstone of MASH management and can reduce hepatic steatosis and improve insulin sensitivity.
For individuals with biopsy-confirmed MASH and fibrosis (≥F2), the FDA has granted accelerated approval for resmetirom (Rezdiffra, Madrigal Pharmaceuticals) and semaglutide (Wegovy, Novo Nordisk). These are prescribed in combination with diet and exercise for the treatment of adults with MASH and moderate to advanced liver fibrosis (stages F2 to F3). Resmetirom, a liver-specific thyroid hormone receptor beta-agonist, is the first FDA-approved drug for non-cirrhotic MASH with moderate to advanced fibrosis, demonstrating histological and biochemical benefits. Resmetirom is administered orally once daily.
Semaglutide, a glucagon-like peptide-1 (GLP-1) receptor agonist, is the second FDA-approved agent for MASH and is administered as a weekly subcutaneous injection. Both resmetirom and semaglutide are addressed in the related policy #5.01.13 (Pharmacologic Treatments for Metabolic Dysfunction-Associated Steatohepatitis).
This evidence review was created in June 2005 and has been updated regularly with searches of the PubMed database. The most recent literature update was performed through September 22, 2025.
Evidence reviews assess whether a medical test is clinically useful. A useful test provides information to make a clinical management decision that improves the net health outcome. That is, the balance of benefits and harms is better when the test is used to manage the condition than when another test or no test is used to manage the condition.
The first step in assessing a medical test is to formulate the clinical context and purpose of the test. The test must be technically reliable, clinically valid, and clinically useful for that purpose. Evidence reviews assess the evidence on whether a test is clinically valid and clinically useful. Technical reliability is outside the scope of these reviews, and credible information on technical reliability is available from other sources.
Liver biopsy is an imperfect reference standard. There is a high rate of sampling error, which can lead to underdiagnosis of liver disease.16,17, These errors will bias estimates of performance characteristics of the noninvasive tests to which it is compared, and therefore such errors must be considered in appraising the body of evidence. Mehta et al (2009) estimated that, even under the best-case scenario in which the sensitivity and specificity of liver biopsy are both 90% and the prevalence of significant disease (increased liver fibrosis, scored as Metavir ≥F2) is 40%, a perfect alternative marker would have a calculated area under the receiver operating characteristic (AUROC) curve of 0.90.18, Therefore, the effectiveness of alternative technologies may be underestimated. In fact, when the accuracy of biopsy is presumed to be 80%, a comparative technology with a measured AUROC curve of 0.76 may actually have an AUROC curve of 0.93 to 0.99 for diagnosing true disease.
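The size of this effect can be illustrated with a simplified calculation. The sketch below is not the method used by Mehta et al. It assumes a binary marker that is perfectly accurate for true disease, a biopsy whose errors are independent of the marker, and the convention that the AUROC of a binary test equals the mean of its sensitivity and specificity. The function name is hypothetical.

```python
def apparent_accuracy_of_perfect_marker(prevalence: float,
                                        biopsy_sensitivity: float,
                                        biopsy_specificity: float) -> tuple:
    """Apparent (sensitivity, specificity, AUROC) of a perfect marker
    when it is judged against an imperfect biopsy reference."""
    # Joint probabilities of true disease status and biopsy result.
    diseased_biopsy_pos = prevalence * biopsy_sensitivity
    diseased_biopsy_neg = prevalence * (1 - biopsy_sensitivity)
    healthy_biopsy_neg = (1 - prevalence) * biopsy_specificity
    healthy_biopsy_pos = (1 - prevalence) * (1 - biopsy_specificity)

    # The perfect marker is positive exactly when disease is truly present,
    # so it "disagrees" with biopsy wherever the biopsy itself is wrong.
    apparent_sensitivity = diseased_biopsy_pos / (diseased_biopsy_pos + healthy_biopsy_pos)
    apparent_specificity = healthy_biopsy_neg / (healthy_biopsy_neg + diseased_biopsy_neg)

    # Assumption: AUROC of a single-threshold binary test is (sens + spec) / 2.
    apparent_auroc = (apparent_sensitivity + apparent_specificity) / 2
    return apparent_sensitivity, apparent_specificity, apparent_auroc


sens, spec, auroc = apparent_accuracy_of_perfect_marker(0.40, 0.90, 0.90)
print(f"apparent sensitivity={sens:.3f}, specificity={spec:.3f}, AUROC={auroc:.3f}")
```

With the inputs quoted above (40% prevalence, 90% biopsy sensitivity and specificity), this simplified model gives an apparent AUROC of about 0.89 for a marker that is in fact perfect, which is in line with the ceiling of roughly 0.90 reported in the text.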
Due to a large number of primary studies published on this topic, this evidence review focuses on systematic reviews when available. The validation of multiple noninvasive tests is assessed individually in the following sections. Although options exist for performing systematic reviews with imperfect reference standards,19, most available reviews did not use any correction for the imperfect reference.
A systematic review by Crossan et al (2015) was performed for the UK National Institute for Health Research.20, The first objective of the review was to determine the diagnostic accuracy of different noninvasive liver tests compared with liver biopsy in the diagnosis and monitoring of liver fibrosis and cirrhosis in patients with HCV, HBV, NAFLD, and ALD. Reviewers selected 302 publications and presentations from 1998 to April 2012. Patients with HCV were the most common population included in the studies while patients with ALD were the least common. FibroScan and FibroTest were the most commonly assessed tests across liver diseases. Aspartate aminotransferase to platelet ratio index (APRI) was also widely assessed in HBV and HCV but not in NAFLD or ALD. The estimates of diagnostic accuracy for each test by disease are discussed in further detail in the following sections. Briefly, for diagnosing significant fibrosis (stage ≥F2) in HCV, the summary sensitivities and specificities were: FibroScan, 79% and 83%; FibroTest, 68% and 72%; APRI (low cutoff), 82% and 57%; ARFI imaging, 85% and 89%; HepaScore, 73% and 73%; FIBROSpect II, 78% and 71%; and FibroMeter, 79% and 73%, respectively. For diagnosing advanced fibrosis in HBV, the summary sensitivities and specificities were: FibroScan, 71% and 84% and FibroTest, 66% and 80%, respectively. There are no established or validated cutoffs for fibrosis stages across the diseases for most tests. For FibroTest, established cutoffs exist, but were used inconsistently across studies. Test failures or reference standard(s) were frequently not captured in analyses. Most populations included in the studies were from tertiary care settings that have more advanced disease than the general population, which would overestimate the prevalence of the disease and diagnostic accuracy. These issues likely cause overestimates of sensitivities and specificities.
The quality of the studies was generally rated as poor, with only 1.6% receiving a high-quality rating.
Houot et al (2016) reported on a systematic review funded by BioPredictive, the manufacturer of FibroTest.21, This review included 71 studies published between January 2002 and February 2014 with over 12,000 participants with HCV and HBV comparing the diagnostic accuracy of FibroTest, FibroScan, APRI, and fibrosis-4 (FIB-4) index. Included studies directly compared the tests and calculated median differences in the AUROC curve using Bayesian methods. There was no evaluation of the methodologic quality of the included studies. The Bayesian difference in AUROC curve for significant fibrosis (stage ≥F2) between FibroTest and FibroScan was based on 15 studies and estimated to be 0.06 (95% credible interval [CrI], 0.02 to 0.09) favoring FibroTest. The difference in AUROC curve for cirrhosis for FibroTest versus FibroScan was based on 13 studies and estimated to be 0.00 (95% CrI, -0.04 to 0.04). The difference for advanced fibrosis between FibroTest and APRI was based on 21 studies and estimated to be 0.05 (95% CrI, 0.03 to 0.07); for cirrhosis, it was based on 14 studies and estimated to be 0.05 (95% CrI, 0.00 to 0.11), both favoring FibroTest.
The purpose of noninvasive testing in individuals with CLD is to detect liver fibrosis so that individuals can avoid the potential adverse events of an invasive liver biopsy and receive appropriate treatment. The degree of liver fibrosis is an important factor in determining the appropriate approach for managing individuals with liver disease (e.g., ALD, NAFLD/MASLD, hepatitis).
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with CLD.
The test being considered is the FibroSURE serum panel.
The following tests and practices are currently being used to diagnose CLD: liver biopsy, noninvasive radiologic methods, and other multianalyte serum assays.
The general outcomes of interest are test validity, morbid events, and treatment-related morbidity. Follow-up over months to years is of interest to the relevant outcomes.
For the evaluation of the clinical validity of the tests within this review, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores).
Included a suitable reference standard.
Patient/sample clinical characteristics were described.
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
The diagnostic value of FibroSURE (FibroTest in Europe) has also been evaluated for the prediction of liver fibrosis in patients with ALD and NAFLD.22,23, Thabut et al (2006) reported the development of a panel of biomarkers (ASH FibroSURE [ASH Test]) for the diagnosis of alcoholic steatohepatitis (ASH) in patients with chronic ALD.24, Biomarkers were initially assessed in a training group of 70 patients, and a panel was constructed using a combination of the 6 biochemical components of the FibroTest-ActiTest plus AST. The algorithm was subsequently studied in 2 validation groups (1 prospective study for severe ALD, 1 retrospective study for nonsevere ALD) that included 155 patients and 299 controls. The severity of ASH (none, mild, moderate, severe) was blindly assessed from biopsy samples. In the validation groups, there were 28 (18%) cases of discordance between the diagnosis of ASH predicted by the ASH Test and biopsy; 10 (36%) were considered false-negatives of the ASH Test, and 11 were suspected failures of biopsy. Seven cases were indeterminate by biopsy. The AUROC curves were 0.88 and 0.89 in the validation groups. The median ASH Test value was 0.005 in controls, 0.05 in patients without or with mild ASH, 0.64 in the moderate ASH grade, and 0.84 in severe ASH grade 3. Using a cutoff value of 0.50, the ASH Test had a sensitivity of 80% and specificity of 84%, with PPVs and NPVs of 72% and 89%, respectively. Several authors had an interest in the commercialization of this test, and no independent studies on the diagnostic accuracy of ASH FibroSURE (ASH Test) were identified. In addition, it is not clear if the algorithm used in this study is the same as that used in the currently commercially available test, which includes 10 biochemicals.
FibroTest has been studied in patients with ALD. In the Crossan et al (2015) systematic review, 1 study described the diagnostic accuracy of the FibroTest for significant fibrosis (stage ≥ F2) or cirrhosis in ALD.20, With a high cutoff for positivity (0.7), the sensitivity and specificity for advanced fibrosis were 55% (95% CI, 47% to 63%) and 93% (95% CI, 85% to 97%) and for cirrhosis were 91% (95% CI, 82% to 96%) and 87% (95% CI, 81% to 91%), respectively. With a low cutoff for positivity (0.3), the sensitivity and specificity for advanced fibrosis were 84% (95% CI, 77% to 89%) and 65% (95% CI, 55% to 75%), respectively. The sensitivity and specificity for cirrhosis were 100% (95% CI, 95% to 100%) and 50% (95% CI, 42% to 58%), respectively.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No studies were identified that assessed clinical outcomes following the use of the ASH FibroSURE (ASH Test) in ALD and ASH.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Crossan et al (2015) published a systematic review that included 4 studies in the pooled estimate of the diagnostic accuracy of FibroSURE/FibroTest for advanced fibrosis (stage ≥F3) in NAFLD (MASLD).20, The summary sensitivities and specificities were 40% (95% CI, 24% to 58%) and 96% (95% CI, 91% to 98%), respectively. Only 1 included study reported accuracy for cirrhosis, with sensitivity and specificity of 74% (95% CI, 54% to 87%) and 92% (95% CI, 88% to 95%), respectively.
A systematic review conducted to support the AASLD Practice Guidelines (2024) did not identify any studies that examined the relationship between changes in FibroSURE/FibroTest and histological improvement in fibrosis among patients with MASLD or MASH.3,
Poynard et al (2006) reported the development of a panel of biomarkers (NASH FibroSURE [NASH Test]) for the prediction of nonalcoholic steatohepatitis (NASH) in patients with NAFLD.25, Biomarkers were initially assessed with a training group of 160 patients, and a panel was constructed using a combination of 13 of 14 parameters of the currently available test. The algorithm was subsequently studied in a validation group of 97 patients and 383 controls. Patients in the validation group were from a prospective multicenter study with hepatic steatosis at biopsy and suspicion of NAFLD. Histologic diagnoses used Kleiner et al’s scoring system, with 3 classes for NASH (NASH, borderline NASH, no NASH). The main endpoint was steatohepatitis, defined as a histologic NASH score of 5 or greater. The AUROC curve for the validation group was 0.79 for the diagnosis of NASH, 0.69 for the diagnosis of borderline NASH, and 0.83 for the diagnosis of no NASH. Results showed a sensitivity of 33% and specificity of 94% for NASH, with a PPV and NPV of 66% and 81%, respectively. For borderline NASH or NASH, sensitivity was 88%, specificity 50%, PPV 74%, and NPV 72%. Clinically significant discordance (2 class difference) was observed in 8 (8%) patients. None of the 383 controls were considered to have NASH by NASH FibroSURE (NASH Test). Authors proposed that this test would be suitable for mass screening for NAFLD in patients with obesity and diabetes.
An independent study by Lassailly et al (2011) attempted to prospectively validate the NASH Test (along with the FibroTest, SteatoTest, and ActiTest) in a cohort of 288 patients treated with bariatric surgery.24, Included were patients with severe or morbid obesity (body mass index, >35 kg/m²), at least 1 comorbidity for at least 5 years, and resistance to medical treatment. Excluded were patients with current excessive drinking, long-term consumption of hepatotoxic drugs, and positive screening for chronic liver diseases including hepatitis. Histology and biochemical measurements were centralized and blinded to other characteristics. The NASH Test provided a 3-category score for no NASH (0.25), possible NASH (0.50), and NASH (0.75). The prevalence of NASH was 6.9%, while the prevalence of NASH or possible NASH was 27%. The concordance rate between the histologic NASH score and the NASH Test was 43.1%, with a weak κ reliability test (0.14). In 183 patients categorized as possible NASH by the NASH Test, 124 (68%) were classified as no NASH by biopsy. In 15 patients categorized as NASH by the NASH Test, 7 (47%) were no NASH and 4 (27%) were possible NASH by biopsy. The NPV of the NASH Test for possible NASH or NASH was 47.5%. Authors suggested that the power of this study to validate agreement between the NASH Test and biopsy was low, due to the low prevalence of NASH. However, the results showed poor concordance between the NASH Test and biopsy, particularly for intermediate values.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
No studies were identified that assessed clinical outcomes following the use of the NASH FibroSURE (NASH Test) in NAFLD and NASH.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Following the initial research into FibroSURE (patients with liver fibrosis who had undergone biopsy),26, the next step in the development of this test was a further evaluation of the algorithm in a cross-section of patients, including patients with HCV participating in large clinical trials before and after the initiation of antiviral therapy. A study by Poynard et al (2003) focused on patients with HCV participating in a randomized study of pegylated interferon and ribavirin.27, From the 1530 participants, 352 patients with stored serum samples and liver biopsies at study entry and at 24-week follow-up were selected. The HCV FibroSURE score was calculated and then compared with the Metavir liver biopsy score. At a cutoff of 0.30, the HCV FibroSURE score had 90% sensitivity and 88% positive predictive value (PPV) for the diagnosis of Metavir F2 to F4 fibrosis; the specificity was 36%, and the negative predictive value (NPV) was 40%.
Poynard et al (2004) also evaluated discordant results in 537 patients who underwent liver biopsy and the HCV FibroSURE and ActiTest on the same day; discordance was attributed to either the limitations in the biopsy or serum markers.28, In this study, cutoff values were used for individual Metavir scores (ie, F0 to F4) and for combinations of Metavir scores (ie, F0 to F1, F1 to F2). The definition of a significant discordance between FibroTest and ActiTest and biopsy scores was at least 2 stages or grades in the Metavir system. Discordance was observed in 29% of patients. Risk factors for failure of the HCV FibroSURE scoring system were as follows: the presence of hemolysis, inflammation, possible Gilbert syndrome, acute hepatitis, drugs inducing cholestasis, or an increase in transaminases. Discordance was attributable to markers in 2.4% of patients, to the biopsy in 18%, and unattributed in 8.2% of patients. As noted in 2 reviews, the bulk of the research on HCV FibroSURE was conducted by researchers with an interest in the commercialization of the algorithm.29,30,
In the Crossan et al (2015) systematic review, FibroTest was the most widely validated commercial serum test.20, Seventeen studies were included in the pooled estimate of the diagnostic accuracy of FibroTest for significant fibrosis (stage ≥F2) in HCV. With varying cutoffs for positivity between 0.32 and 0.53, the summary sensitivity in HCV was 68% (95% confidence interval [CI], 58% to 77%) and specificity was 72% (95% CI, 70% to 77%). Eight studies were included for cirrhosis (stage F4) in HCV. The cutoffs for positivity ranged from 0.56 to 0.74 and the summary sensitivity and specificity were 60% (95% CI, 43% to 76%) and 86% (95% CI, 81% to 91%), respectively. Uninterpretable results were rare for tests based on serum markers.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from randomized controlled trials (RCTs). The primary benefit of the FibroSURE (FibroTest in Europe) for HCV is the ability to avoid liver biopsy in patients without significant fibrosis. There are currently no such published studies to demonstrate the effect on patient outcomes.
The FibroTest has been used as an alternative to biopsy for the purposes of establishing trial eligibility in terms of fibrosis or cirrhosis; several trials with FibroTest (ION-1,-3; VALENCE; ASTRAL-2, -3, -4) have established the efficacy of HCV treatments.31,32,33,34,35,36, For example, in the ASTRAL-2 and -3 trials, cirrhosis could be defined by a liver biopsy; a FibroScan or a FibroTest score of more than 0.75; or an APRI of more than 2.
These tests also need to be adequately compared with other noninvasive tests of fibrosis to determine their comparative efficacy. In particular, the proprietary, algorithmic tests should demonstrate superiority to other readily available, nonproprietary scoring systems to demonstrate that the tests improve health outcomes.
The FibroSURE test also has a potential effect on patient outcomes as a means to follow response to therapy. In this case, evidence needs to demonstrate that the use of the test for response to therapy impacts decision making and that these changes in management decisions lead to improved outcomes. It is not clear whether HCV FibroSURE could be used as an interval test in patients receiving therapy to determine whether an additional liver biopsy is necessary.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
While most multianalyte assay studies that have identified fibrosis have been conducted in patients with HCV, studies are also being conducted in patients with chronic HBV.37,38, Park et al (2013) compared liver biopsy with the FibroTest results obtained on the same day from 330 patients who had chronic HBV.39, Discordance was found in 30 (9.1%) patients; the FibroTest underestimated fibrosis in 25 of these patients and overestimated it in 5. Those with Metavir liver fibrosis stage F3 or F4 (15.4%) had a significantly higher discordance rate than those with stages F1 or F2 (3.0%; p<.001). The only independent factor for discordance on multivariate analysis was a Metavir stage F3 or F4 on liver biopsy (p<.001).
Salkic et al (2014) conducted a meta-analysis of studies on the diagnostic accuracy of FibroTest in chronic HBV.40, Included in the meta-analysis were 16 studies (n=2494) on liver fibrosis diagnosis and 13 studies (n=1754) on cirrhosis diagnosis. There was strong evidence of heterogeneity in the 16 fibrosis studies and evidence of heterogeneity in the cirrhosis studies. For significant liver fibrosis (Metavir F2 to F4) diagnosis using all of the fibrosis studies, the AUROC curve was 0.84 (95% CI, 0.78 to 0.88). At the recommended FibroTest threshold of 0.48 for a significant liver fibrosis diagnosis, the sensitivity was 60.9%, specificity was 79.9%, and the diagnostic odds ratio (OR) was 6.2. For liver cirrhosis (Metavir F4) diagnosis using all of the cirrhosis studies, the AUROC curve was 0.87 (95% CI, 0.85 to 0.90). At the recommended FibroTest threshold of 0.74 for cirrhosis diagnosis, the sensitivity was 61.5%, specificity was 90.8%, and the diagnostic OR was 15.7. While the results demonstrated FibroTest may be useful in excluding a diagnosis of cirrhosis in patients with chronic HBV, the ability to detect significant fibrosis and cirrhosis and exclude significant fibrosis is suboptimal.
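The diagnostic odds ratios quoted from Salkic et al can be approximately reproduced from the pooled sensitivity and specificity. The following is a minimal sketch, assuming the standard definition of the diagnostic OR, [sensitivity/(1 − sensitivity)] / [(1 − specificity)/specificity]; the function name is illustrative and is not taken from the cited meta-analysis. Small differences from the published figures are expected because the inputs are rounded.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Odds of a positive test in diseased patients divided by the odds in non-diseased patients."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    odds_positive_if_diseased = sensitivity / (1 - sensitivity)
    odds_positive_if_not_diseased = (1 - specificity) / specificity
    return odds_positive_if_diseased / odds_positive_if_not_diseased


# Pooled FibroTest estimates in chronic HBV, as quoted in the text above
dor_fibrosis = diagnostic_odds_ratio(0.609, 0.799)   # FibroTest threshold 0.48
dor_cirrhosis = diagnostic_odds_ratio(0.615, 0.908)  # FibroTest threshold 0.74

print(f"Significant fibrosis: diagnostic OR ~ {dor_fibrosis:.1f}")
print(f"Cirrhosis:            diagnostic OR ~ {dor_cirrhosis:.1f}")
```

Both values land close to the reported 6.2 and 15.7, which suggests the published odds ratios are consistent with the stated sensitivities and specificities.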
Xu et al (2014) reported on a systematic review and meta-analysis of studies assessing biomarkers to detect fibrosis in HBV.41, Included in the analysis of FibroTest were 11 studies (N=1640). In these 11 studies, AUROC curves ranged from 0.69 to 0.90. Heterogeneity in the studies was statistically significant.
Crossan et al (2015) published a systematic review which included 6 studies in the pooled estimate of the diagnostic accuracy of FibroTest for significant fibrosis (stage ≥F2) in HBV.20, The cutoffs for positivity ranged from 0.40 to 0.48, and the summary sensitivities and specificities were 66% (95% CI, 57% to 75%) and 80% (95% CI, 72% to 86%), respectively. The accuracy for diagnosing cirrhosis in HBV was based on 4 studies with cutoffs for positivity ranging from 0.58 to 0.74; sensitivities and specificities were 74% (95% CI, 25% to 96%) and 90% (95% CI, 83% to 94%), respectively.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
There are no studies evaluating the effect of this test on outcomes for patients with HBV. Of note, some researchers have suggested that different markers (eg, HBV FibroSURE) may be needed for this assessment in patients with hepatitis B.42,
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
For individuals who have CLD who receive FibroSURE serum panels, the evidence includes systematic reviews of more than 30 observational studies (>5000 patients). FibroSURE has been studied in populations with ALD, NAFLD, and viral hepatitis. There are established cutoffs, although they were not consistently used in validation studies. Given these limitations and the imperfect reference standard, it is difficult to interpret performance characteristics. However, for the purposes of deciding whether a patient has severe fibrosis or cirrhosis, FibroSURE results provide data sufficiently useful to determine therapy. Specifically, FibroSURE has been used as an alternative to biopsy to establish eligibility regarding the presence of fibrosis or cirrhosis in several RCTs that showed the efficacy of HCV treatments, which in turn demonstrated that the test can identify patients who would benefit from therapy.
Population Reference No. 2
The purpose of noninvasive testing in individuals with chronic liver disease is to detect liver fibrosis so that individuals can avoid the potential adverse events of an invasive liver biopsy and receive appropriate treatment. The degree of liver fibrosis is an important factor in determining the appropriate approach for managing individuals with liver disease (e.g., hepatitis, ALD, NAFLD).
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with chronic liver disease.
The tests being considered are multianalyte serum assays (other than FibroSURE).
The following tests and practices are currently being used to diagnose chronic liver disease: liver biopsy, noninvasive radiologic methods, and other multianalyte serum assays.
The general outcomes of interest are test validity, morbid events, and treatment-related morbidity. Follow-up over months to years is of interest to the relevant outcomes.
For the evaluation of the clinical validity of the tests within this review, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores).
Included a suitable reference standard.
Patient/sample clinical characteristics were described.
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Patel et al (2004) investigated the use of serum markers in an initial training set of 294 patients with HCV and further validated the resulting algorithm in a validation set of 402 patients.43, The algorithm was designed to distinguish between no or mild fibrosis (F0 to F1) and moderate-to-severe fibrosis (F2 to F4). With the prevalence of F2 to F4 disease of 52% and a cutoff value of 0.36, the PPVs and NPVs were 74.3% and 75.8%, respectively.
The published studies for this combination of markers continue to focus on test characteristics such as sensitivity, specificity, and accuracy.44,45,46, In Crossan et al (2015), the summary sensitivity for detecting significant fibrosis (stage ≥F2) in 5 studies of HCV with FIBROSpect II, with cutoffs ranging from 42 to 72, was 78% (95% CI, 49% to 93%) and the summary specificity was 71% (95% CI, 59% to 80%).20,
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
The issues of effect on patient outcomes are similar to those discussed for the FibroSURE (FibroTest in Europe). No studies were identified in the published literature in which the results of the FIBROSpect test were actively used in the management of the patient.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of FIBROSpect has not been established, a chain of evidence supporting the clinical utility of this test for this population cannot be constructed.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Enhanced Liver Fibrosis
The Enhanced Liver Fibrosis (ELF) score is based on a proprietary algorithm that combines three specific biomarkers (Table 2). By contrast, the nonproprietary scoring systems discussed below use a simplified formula that can be calculated to produce a score for the prediction of fibrosis.
Several systematic reviews have assessed the diagnostic accuracy of ELF in patients across various CLD etiologies. A meta-analysis by Vali et al (2020) of 11 studies using ELF tests in NAFLD for F3-4 noted a high sensitivity (0.93) but limited specificity (0.34) at the lower recommended threshold of 7.7 (Table 2); higher thresholds and F3-4 prevalence of at least 30% were required for increasing ELF positive predictive value to >0.8 for advanced fibrosis.47,
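The dependence of predictive values on disease prevalence noted above can be illustrated with a short calculation. This is a sketch only: it applies Bayes' rule to the sensitivity (0.93) and specificity (0.34) quoted for the 7.7 threshold, and the prevalence values in the loop are hypothetical choices for illustration, not figures from Vali et al.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# ELF at the lower threshold of 7.7 (sensitivity and specificity from the text)
ELF_SENSITIVITY, ELF_SPECIFICITY = 0.93, 0.34

for prevalence in (0.05, 0.30, 0.50):  # hypothetical F3-4 prevalences
    ppv, npv = predictive_values(ELF_SENSITIVITY, ELF_SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

Under these assumptions the PPV at the low threshold stays well below 0.8 even at 30% prevalence while the NPV remains high, which is in line with the observation that higher thresholds are needed to rule in advanced fibrosis.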
A systematic review, conducted in support of the American Association for the Study of Liver Diseases (AASLD) Practice Guidelines (2024),3, reported conflicting data on the diagnostic accuracy of ELF compared with nonproprietary blood-based tests such as FIB-4 and NFS for the detection of fibrosis in NAFLD. The AASLD noted that in community-based and other low prevalence cohorts, blood-based noninvasive tests are useful for excluding advanced fibrosis with high NPV but require additional noninvasive tests to improve their PPV.3,
OWLiver panel
The OWLiver panel is a serum-based noninvasive test used for the diagnosis of MASH and fibrosis (Table 2). Iruzubieta et al (2024) conducted a multicenter cross-sectional study that included 124 adult patients with biopsy-proven MASLD, overweight/obesity, and type 2 diabetes.48, TE, FIB-4, NFS, FibroScan-AST, and the OWLiver panel were performed. Sensitivity, specificity, PPV, NPV, and AUC were calculated. These four noninvasive tests were assessed individually and in sequential/parallel combinations. Thirty-five (28%) patients had early MASH and 66 (53%) had MASH with significant fibrosis ("at-risk" MASH). The OWLiver panel (OWLiver-MASH and MASEF® algorithm) correctly classified 86% as MASH, showing an accuracy, sensitivity, specificity, PPV, and NPV of 0.77, 0.86, 0.35, 0.85, and 0.36, respectively. Class III obesity, diabetes control, or gender did not affect the performance of the OWLiver panel (p >.1). Tests for at-risk MASH showed an AUC >0.70 except for NFS. The MASEF algorithm showed the highest accuracy and NPV for at-risk MASH (AUC 0.77 [0.68-0.85], NPV 72%) and advanced fibrosis (AUC 0.80 [0.71-0.88], NPV 92%). Combinations of tests for the identification of at-risk MASH did not provide any additional benefit over using the MASEF algorithm alone. Further studies involving larger patient groups are required to confirm these results and determine their relevance across broader, more heterogeneous study populations.
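The accuracy, PPV, and NPV reported for the OWLiver panel can be checked for internal consistency against the reported sensitivity, specificity, and case counts. The sketch below assumes that the 35 early MASH and 66 at-risk MASH patients together make up all MASH cases among the 124 participants, so that the remaining patients did not have MASH; that reading is an assumption and is not stated explicitly in the summary above.

```python
# Counts taken from the text; treating early + at-risk MASH as all MASH cases is an assumption
n_total = 124
n_mash = 35 + 66
prevalence = n_mash / n_total

# Reported OWLiver panel performance for MASH
sensitivity, specificity = 0.86, 0.35

# Expected proportions of the cohort in each cell of the 2x2 table
true_pos = sensitivity * prevalence
false_neg = (1 - sensitivity) * prevalence
true_neg = specificity * (1 - prevalence)
false_pos = (1 - specificity) * (1 - prevalence)

accuracy = true_pos + true_neg
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

print(f"prevalence={prevalence:.2f} accuracy={accuracy:.2f} PPV={ppv:.2f} NPV={npv:.2f}")
```

Under this assumption the derived values round to the reported 0.77, 0.85, and 0.36. The calculation also shows that the high PPV is driven largely by the high prevalence of MASH in this biopsied cohort, since specificity is only 0.35.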
Nonproprietary scoring systems have also been developed, including FIB-4, NAFLD fibrosis score (NFS), APRI, AST/ALT ratio, and the combined body mass index, AST/ALT ratio, and diabetes status (BARD) score (see Appendix).
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs. The primary benefit of the multianalyte serum assays is the ability to avoid liver biopsy.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
For individuals who have CLD who receive multianalyte serum assays for liver function assessment, such as the ELF test and OWLiver panel, the evidence includes observational studies and systematic reviews. The ELF test shows high sensitivity but lower specificity for detecting advanced fibrosis in NAFLD, especially at lower thresholds. Its PPV improves with higher thresholds and greater disease prevalence. A systematic review conducted in support of the AASLD Practice Guidelines (2024) reported conflicting data on the diagnostic accuracy of ELF compared with nonproprietary blood-based tests such as FIB-4 and NFS for the detection of fibrosis in NAFLD. The AASLD noted that in community-based and other low prevalence cohorts, blood-based noninvasive tests are useful for excluding advanced fibrosis with high NPV but require additional noninvasive tests to improve their PPV. A multicenter cross-sectional study demonstrated high accuracy of the OWLiver panel for diagnosing MASH and advanced fibrosis in patients with obesity and type 2 diabetes, with consistent results across obesity levels and diabetes control. Further studies comparing the OWLiver panel to nonproprietary tests in larger and more diverse patient populations are necessary to confirm these findings.
For individuals who have chronic liver disease (CLD) who receive FibroSURE serum panels, the evidence includes systematic reviews of more than 30 observational studies (>5000 patients). Relevant outcomes are test validity, morbid events, and treatment-related morbidity. FibroSURE has been studied in populations with alcohol-associated liver disease (ALD), nonalcoholic fatty liver disease (NAFLD)/metabolic dysfunction-associated steatotic liver disease (MASLD), and viral hepatitis. There are established cutoffs, although they were not consistently used in validation studies. Given these limitations and the imperfect reference standard, it is difficult to interpret performance characteristics. However, for the purposes of deciding whether a patient has severe fibrosis or cirrhosis, FibroSURE results provide data sufficiently useful to determine therapy. Specifically, FibroSURE has been used as an alternative to biopsy to establish eligibility regarding the presence of fibrosis or cirrhosis in several randomized controlled trials (RCTs) that showed the efficacy of hepatitis C virus (HCV) treatments, which in turn demonstrated that the test can identify patients who would benefit from therapy. The evidence is sufficient to determine that the technology results in an improvement in the net health outcome.
| [X] Medically Necessary | [ ] Investigational |
For individuals who have CLD who receive multianalyte serum assays for liver function assessment other than FibroSURE, such as the Enhanced Liver Fibrosis (ELF) test and OWLiver panel, the evidence includes observational studies and systematic reviews. The ELF test shows high sensitivity but lower specificity for detecting advanced fibrosis in NAFLD/MASLD, especially at lower thresholds. The positive predictive value of the test improves with higher thresholds and greater disease prevalence. A systematic review conducted in support of the American Association for the Study of Liver Diseases (AASLD) Practice Guidelines (2024) reported conflicting data on the diagnostic accuracy of ELF compared with nonproprietary blood-based tests such as the Fibrosis-4 (FIB-4) Index and NAFLD/NASH fibrosis score (NFS) for the detection of fibrosis in NAFLD. The AASLD noted that in community-based and other low prevalence cohorts, blood-based noninvasive tests are useful for excluding advanced fibrosis with high negative predictive value but require additional noninvasive tests to improve their positive predictive value. A multicenter cross-sectional study demonstrated high accuracy of the OWLiver panel for diagnosing MASH and advanced fibrosis in patients with obesity and type 2 diabetes, with consistent results across obesity levels and diabetes control. Further studies comparing the OWLiver panel to nonproprietary tests in larger and more diverse patient populations are necessary to confirm these findings. There is no direct evidence that either of these multianalyte serum assays improves health outcomes; further, it is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence on clinical validity. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
| [ ] Medically Necessary | [X] Investigational |
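The relationship noted above, in which blood-based tests show high negative predictive value in low-prevalence cohorts while positive predictive value rises with prevalence, follows from Bayes' theorem. The sketch below is illustrative only: the sensitivity and specificity values are hypothetical and are not taken from any study in this review, and the function name is an assumption introduced for the example.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics (NOT from any study cited in this policy).
SENS, SPEC = 0.90, 0.70

for prev in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(SENS, SPEC, prev)
    print(f"prevalence={prev:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

With the test characteristics held fixed, PPV climbs and NPV falls as prevalence increases, which is why a second noninvasive test is often suggested to confirm a positive result in community-based cohorts.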
The purpose of noninvasive testing in individuals with CLD is to detect liver fibrosis so that individuals can avoid the potential adverse events of an invasive liver biopsy and receive appropriate treatment. The degree of liver fibrosis is an important factor in determining the appropriate approach for managing individuals with liver disease (eg, hepatitis, ALD, NAFLD/MASLD).
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with CLD.
The test being considered is transient elastography (TE).
The following tests and practices are currently being used to diagnose chronic liver disease: liver biopsy, other noninvasive radiologic methods, and multianalyte serum assays.
The general outcomes of interest are test validity, morbid events, and treatment-related morbidity. Follow-up over months to years is of interest to the relevant outcomes.
For the evaluation of the clinical validity of the tests within this review, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores).
Included a suitable reference standard.
Patient/sample clinical characteristics were described.
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
There is extensive literature on the use of transient elastography (eg, FibroScan) to gauge liver fibrosis and cirrhosis. Summaries of systematic reviews are shown in Tables 5 and 6.
Duarte-Rojo et al (2025) conducted a systematic review to assess the evidence on the accuracy of TE, shear wave elastography (ARFI imaging), and MRE to stage liver fibrosis.49, This review was undertaken to support the AASLD guidelines on noninvasive imaging technologies for staging liver fibrosis in CLD. A comprehensive search was performed for studies (published through April 2022) assessing these methods for the identification of significant fibrosis (F2-4), advanced fibrosis (F3-4), or cirrhosis (F4), using histopathology as the standard of reference by liver disease etiology in adults or children. Two-hundred and forty (240) studies (N=61,193 patients) were included in this systematic review. Fifty-four studies (22%) reported the accuracy of TE for staging fibrosis in patients with NAFLD. For significant fibrosis (F2-4), a TE-liver stiffness measurement (LSM) cutoff value of 7 kPa yielded a sensitivity of 76% and a specificity of 73%, whereas for advanced fibrosis (F3-4), a cutoff of 10 kPa had a sensitivity of 82% and a specificity of 79% (see Table 3 for cut-off thresholds). To detect cirrhosis (F4), a TE-LSM cutoff of 13 kPa had a sensitivity of 90% and a specificity of 89%.
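One way to compare the TE cutoffs above on a common scale is to convert each sensitivity/specificity pair into positive and negative likelihood ratios. The sketch below uses only the pairs quoted above for NAFLD; the likelihood ratios themselves are derived here for illustration and are not reported in this review, and the dictionary labels and function name are assumptions.

```python
# Sensitivity/specificity pairs quoted above for TE-LSM in NAFLD.
TE_CUTOFFS = {
    "F2-4 at 7 kPa": (0.76, 0.73),
    "F3-4 at 10 kPa": (0.82, 0.79),
    "F4 at 13 kPa": (0.90, 0.89),
}


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for one sensitivity/specificity pair."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


for label, (sens, spec) in TE_CUTOFFS.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{label}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

A larger LR+ and a smaller LR- indicate a more informative result; on these figures the cirrhosis cutoff is the most discriminating of the three.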
Brener (2015) performed a health technology assessment summarizing many of the systematic reviews below.50, The assessment focused on reviews of the diagnostic accuracy and effect on patient outcomes of TE for liver fibrosis in patients with HCV, HBV, NAFLD, ALD, or cholestatic diseases. Fourteen systematic reviews of TE with biopsy reference standard shown below were included in the Brener assessment, summarizing more than 150 primary studies.51,52,53,54,55,56,57,58,59,60,61,62,63,64, There was variation in the underlying cause of liver disease and the cutoff values of TE stiffness used to define Metavir stages in the systematic reviews. There did not appear to be a substantial difference in diagnostic accuracy for one disease over any other. The reviews demonstrated that TE has good diagnostic accuracy compared with biopsy for the assessment of liver fibrosis and steatosis.
Crossan et al (2015) found that FibroScan was the noninvasive liver test most assessed in validation studies across liver diseases (37 studies in HCV, 13 in HBV, 8 in NAFLD, 6 in ALD).20, Cutoffs for positivity for fibrosis staging varied between diseases and were frequently not prespecified or validated: HCV, 5.2 to 10.1 kPa in the 37 studies for Metavir stages ≥F2; HBV, 6.3 to 8.9 kPa in 13 studies for stages ≥F2; NAFLD, 7.5 to 10.4 kPa in 8 studies for stages ≥F3; ALD, 11.0 to 12.5 kPa in 4 studies for stages ≥F3. Summary sensitivities and specificities by disease are shown in Table 6. The overall sensitivity and specificity for cirrhosis including all diseases (65 studies; cutoffs range, 9.2 to 26.5 kPa) were 89% (95% CI, 86% to 91%) and 89% (95% CI, 87% to 91%), respectively. The rate of uninterpretable results, when reported, with FibroScan (due to <10 valid measurements; success rate, <60%; interquartile range, >30%) was 8.5% in HCV and 9.6% in NAFLD.
| Study | Dates | Studies | N | Population |
| Bota et al (2013)51, | To May 2012 | 13 | 1163 | Chronic hepatitis |
| Cai et al (2021)65, | To Mar 2019 | 62 | NR | ALD, NAFLD |
| Chon et al (2012)52, | 2002 to Mar 2011 | 18 | 2772 | HBV |
| Crossan et al (2015)20, | 1998 to Apr 2012 | 66 | NR | HCV, HBV, NAFLD, ALD |
| Friedrich-Rust et al (2008)53, | 2002 to Apr 2007 | 50 | 11,275 | All causes of liver disease |
| Geng et al (2016)66, | To Jan 2015 | 57 | 10,569 | Multiple causes of liver disease |
| Jiang et al (2018)67, | To Dec 2017 | 11 | 1735 | NAFLD |
| Kwok et al (2014)54, | To Jun 2013 | 22 | 1047 | NAFLD |
| Li et al (2016)68, | Jan 2003 to Nov 2014 | 27 | 4386 | HBV |
| Njei et al (2016)69, | To Jan 2016 | 6 | 756 | HCV/HIV coinfection |
| Pavlov et al (2015)70, | To Aug 2014 | 14 | 834 | ALD |
| Poynard et al (2011)56, | Feb 2001 to Dec 2010 | 18 | 2714 | HBV |
| Shaheen et al (2007)57, | Jan 1997 to Oct 2006 | 12 | 1981 | HCV |
| Shi et al (2014)58, | To May 2013 | 9 | 1771 | All causes of steatosis |
| Steadman et al (2013)59, | 2001 to Jun 2011 | 64 | 6028 | HCV, HBV, NAFLD, CLD, liver transplant |
| Stebbing et al (2010)60, | NR, prior to Feb 2009 | 22 | 4625 | All causes of liver disease |
| Talwalkar et al (2007)61, | To Jan 2007 | 9 | 2083 | All causes of liver disease |
| Tsochatzis et al (2011)62, | To May 2009 | 40 | 7661 | All causes of liver disease |
| Tsochatzis et al (2014)63, | 1998 to Apr 2012 | 302 | NR | HCV, HBV, ALD, NAFLD |
| Xu et al (2015)71, | To Dec 2013 | 19 | 3113 | HBV |
| Xue-Ying (2020)64, | Jan 2008 to Dec 2018 | 81 | 32,694 | HBV |
| Significant Fibrosis (ie, Metavir Stages F2 to F4) | Cirrhosis (ie, Metavir Stage F4) | |||||
| Study | Population | Studies/ Sample Size | AUROC (95% CI) Sensitivity (95% CI) Specificity (95% CI) | Studies/ Sample Size | AUROC (95% CI) Sensitivity (95% CI) Specificity (95% CI) | |
| Bota et al (2013)51, | Multiple diseases | 10/1016 | 0.87 (0.83 to 0.89) 78% (72% to 83%) 84% (75% to 90%) | 13/1163 | 0.93 (0.91 to 0.95) 89% (80% to 94%) 87% (82% to 91%) | |
| HCV | 4/NR | NR 92% (78% to 97%) 86% (82% to 90%) | ||||
| Cai et al (2021)65, | ALD/NAFLD | 40/2569 | 0.86 (0.83 to 0.89) 77% (73% to 81%) 82% (78% to 86%) | 34/914 | 0.95 (0.92 to 0.96) 91% (87% to 94%) 86% (83% to 89%) | |
| Chon et al (2012)52, | Chronic HBV | 12/2000 | 0.86 (0.86 to 0.86) 74.3% (NR) 78.3% (NR) | 16/2614 | 0.93 (0.93 to 0.93) 84.6% (NR) 81.5% (NR) | |
| Crossan et al (2015)20, | HCV | 37/NR | NR 79% (74% to 84%) 83% (77% to 88%) | 36/NR | NR 89% (84% to 92%) 91% (89% to 93%) | |
| HBV | 13/NR | NR 71% (62% to 78%) 84% (74% to 91%) | 19/NR | NR 86% (79% to 91%) 85% (78% to 89%) | ||
| NAFLD | 4/NR | NR 96% (83% to 99%) 89% (85% to 92%) | ||||
| ALD | 1/NR | NR 81% (70% to 88%) 92% (76% to 98%) | 4/NR | NR 87% (64% to 96%) 82% (67% to 91%) | ||
| Friedrich-Rust et al (2008)53, | Multiple diseases | 25/3685 | 0.84 (0.82 to 0.86) NR NR | 25/4557 | 0.94 (0.93 to 0.95) NR NR | |
| HCV | NR | 0.84 (0.80 to 0.86) NR NR | ||||
| Geng et al (2016)66, | Multiple diseases | 0.93 (NR) 81% (79% to 83%) 88% (87% to 89%) | ||||
| Jiang et al (2018)67, | NAFLD | 10/NR | 0.85 (0.82 to 0.88) 77% (70% to 84%) 80% (74% to 84%) | 11/NR | 0.96 (0.93 to 0.97) 90% (73% to 97%) 91% (87% to 94%) | |
| Kwok et al (2014)54, | NAFLD | 7/800 | 0.83 (0.79 to 0.87) 79% (72% to 84%) 75% (71% to 79%) | 57/10,569 | 0.96 (0.94 to 0.99) 92% (82% to 97%) 92% (86% to 98%) | |
| Li et al (2016)68, | HBV | 19/NR | 0.88 (0.85 to 0.91) 81% (76% to 85%) 82% (71% to 87%) | 24/NR | 0.93 (0.91 to 0.95) 86% (82% to 90%) 88% (84% to 90%) | |
| Njei et al (2016)69, | HCV/HIV | 6/756 | NR 97% (82% to 91%) 64% (45% to 79%) | 6/756 | NR 90% (74% to 91%) 87% (80% to 92%) | |
| Pavlov et al (2015)70, | ALD | 7/338 | NR 94% (86% to 97%) 89% (76% to 95%) | 7/330 | NR 95% (87% to 98%) 71% (56% to 82%) | |
| Poynard et al (2011)56, | HBV | 4/NR | 0.84 (0.78 to 0.89) NR NR | NR | 0.93 (0.87 to 0.99) NR NR | |
| Shaheen et al (2007)57, | HCV | 4/NR | 0.84 (0.78 to 0.89) NR NR | NR | 0.93 (0.87 to 0.99) NR NR | |
| Shi et al (2014)58, | No summary statistics reported. Concluded that transient elastography controlled attenuation parameter has good sensitivity and specificity for diagnosing steatosis, but it has limited utility. | |||||
| Steadman et al (2013)59, | Multiple diseases | 45/NR | 0.88 (0.84 to 0.90) 80% (76% to 83%) 81% (77% to 85%) | 49/NR | 0.94 (0.91 to 0.96) 86% (82% to 89%) 89% (87% to 91%) | |
| HBV | 5/710 | 0.81 (0.78 to 0.84) 77% (68% to 84%) 72% (55% to 85%) | 8/1092 | 0.86 (0.82 to 0.89) 67% (57% to 75%) 87% (83% to 91%) | ||
| HCV | 13/2732 | 0.89 (0.86 to 0.91) 76% (61% to 86%) 86% (77% to 92%) | 12/2887 | 0.94 (0.92 to 0.96) 85% (77% to 91%) 91% (87% to 93%) | ||
| NAFLD | 5/630 | 0.78 (0.74 to 0.82) 77% (70% to 83%) 75% (70% to 79%) | 4/469 | 0.96 (0.94 to 0.97) 92% (77% to 98%) 95% (88% to 98%) | ||
| Stebbing et al (2010)60, | Multiple diseases | 17/3066 | NR 72% (71% to 72%) 82% (82% to 83%) | 17/4052 | NR 84% (84% to 85%) 95% (94% to 95%) | |
| Talwalkar et al (2007)61, | Multiple diseases | 7/>1100 | 0.87 (0.83 to 0.91) 70% (67% to 73%) 84% (80% to 88%) | 9/2083 | 0.96 (0.94 to 0.98) 87% (84% to 90%) 91% (89% to 92%) | |
| Tsochatzis et al (2011)62, | Multiple diseases | 31/5919 | NR 79% (74% to 82%) 78% (72% to 83%) | 30/6530 | NR 83% (79% to 86%) 89% (87% to 91%) | |
| HCV | 14/NR | NR 78% (71% to 84%) 80% (71% to 86%) | 11/NR | NR 83% (77% to 88%) 90% (87% to 93%) | ||
| HBV | 4/NR | NR 84% (67% to 93%) 78% (68% to 85%) | 6/NR | NR 80% (61% to 91%) 86% (82% to 94%) | ||
| Tsochatzis et al (2014)63, | HCV | 37/NR | 0.87 (0.83 to 0.90) 79% (74% to 84%) 83% (77% to 88%) | 36/NR | 0.96 (0.94 to 0.97) 89% (84% to 92%) 91% (89% to 93%) | |
| HBV | 13/NR | 0.83 (0.76 to 0.90) 71% (62% to 78%) 84% (74% to 91%) | 13/NR | 0.92 (0.89 to 0.96) 86% (79% to 91%) 85% (78% to 89%) | ||
| NAFLD | 4/NR | 0.96 (0.94 to 0.99) 96% (83% to 99%) 89% (85% to 92%) | ||||
| ALD | 6/NR | 0.90 (0.87 to 0.94) 86% (76% to 92%) 83% (74% to 89%) | ||||
| Xu et al (2015)71, | HBV | 14/2318 | 0.82 (0.78 to 0.86) NR NR | 18/2996 | 0.91 (0.89 to 0.93) NR NR | |
| Xue-Ying (2020)64, | HBV | 29/5035 | 0.83 (0.80 to 0.86) 72% (68% to 76%) 82% (77% to 86%) | NR/NR | NR NR NR | |
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
There are currently no published studies that directly demonstrate the effect of TE (e.g., FibroScan) on patient outcomes.
FibroScan is used extensively in practice to make management decisions. In addition, FibroScan was used as an alternative to biopsy to diagnose fibrosis or cirrhosis to establish trial eligibility in several trials (ION-1,-3; VALENCE; ASTRAL-2, -3, -4) that confirmed the efficacy of HCV treatments.31,32,33,34,35,36, For example, in the VALENCE trial, cirrhosis could be defined by liver biopsy or a confirmatory FibroTest or FibroScan result at 12.5 kPa or greater. In VALENCE, FibroScan was used to determine cirrhosis in 74% of the participants. In a retrospective, multicenter analysis of 7256 chronic HCV patients by Abdel Alem et al (2019), both transient elastography and FIB-4 were found to be predictors of treatment failure to sofosbuvir-based treatment regimens with an NPV of 95%.72,
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
For individuals who have chronic liver disease who receive TE (e.g., FibroScan), the evidence includes many systematic reviews of more than 50 observational studies (>10,000 patients). TE has been studied in populations with viral hepatitis, NAFLD/MASLD, and ALD. There are varying cutoffs for positivity. Failures of the test are not uncommon, particularly for those with high body mass index, but these failures often went undetected in analyses of the validation studies. Given these limitations and the imperfect reference standard, it can be difficult to interpret performance characteristics. However, for the purposes of deciding whether a patient has severe fibrosis or cirrhosis, the FibroScan results provide data sufficiently useful to determine therapy. In fact, FibroScan has been used as an alternative to biopsy to establish eligibility regarding the presence of fibrosis or cirrhosis in the participants of several RCTs. These trials showed the efficacy of HCV treatments, which in turn demonstrated that the test can identify patients who would benefit from therapy.
The purpose of noninvasive testing in individuals with CLD is to detect liver fibrosis so that individuals can avoid the potential adverse events of an invasive liver biopsy and receive appropriate treatment. The degree of liver fibrosis is an important factor in determining the appropriate approach for managing individuals with liver disease (eg, hepatitis, ALD, NAFLD/MASLD).
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with CLD.
The test being considered is multiparametric MRI (e.g., LiverMultiScan).
The following tests and practices are currently being used to diagnose chronic liver disease: liver biopsy, other noninvasive radiologic methods, and multianalyte serum assays.
The general outcomes of interest are test validity, morbid events, and treatment-related morbidity. Follow-up over months to years is of interest to the relevant outcomes.
For the evaluation of the clinical validity of the tests within this review, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores).
Included a suitable reference standard.
Patient/sample clinical characteristics were described.
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Azizi et al (2024) published a systematic review comparing the diagnostic accuracy of MRI proton density fat fraction (MRI-PDFF) with liver biopsy.73, Tables 7 and 8 summarize study characteristics and results, respectively. The authors concluded that MRI-PDFF has high diagnostic accuracy, though its accuracy declines slightly as the severity of hepatic steatosis increases.
| Study | Dates | Studies | N (Range) | Population | Index tests | Reference Standard |
| Azizi et al (2024)73, | Until January 2024 | 22 | 2844 (19 to 497) | Patients with MASLD and hepatic steatosis | MRI-PDFF | Histology |
| Index Test | Steatosis | ||
| Azizi et al (2024)73, | AUC Sensitivity Specificity | ||
| Grade ≥1 | Grade ≥2 | Grade 3 | |
| Total studies (n) | 17 (2454) | 16 (1726) | 12 (1469) |
| Index Test Threshold | 5.7 | NR | NR |
| MRI-PDFF | 0.97 0.93 0.93 | 0.91 0.79 0.90 | 0.91 0.76 0.89 |
Tables 8 and 9 summarize studies that have evaluated the diagnostic accuracy of multiparametric MRI, which incorporates assessment of proton density fat‐fraction, T2*, and T1 mapping to characterize liver fat, iron, fibrosis, and inflammation. Generally, technical failures were less common with MRI than transient elastography.74,75,76,
| Study | Population | Design | Index Test(s) | Reference Standard | Timing of Reference and Index Tests |
| Beyer et al (2021)74, | N=580 patients with suspected NAFLD/NASH | Retrospective evaluation of patients from 2 clinical trials | MRI PDFF (LMS-IDEAL)* CAP (FibroScan) | Liver biopsy | Not reported |
| Imajo et al (2021)75, | N=145 patients with suspected NASH | Prospective, observational | MRI liver fat* MRI cT1 measurements* MRI cT1 + PDFF* MRE VCTE-LSM (FibroScan) CAP (FibroScan) 2D-SWE | Liver biopsy | All performed at first clinical visit |
| McDonald et al (2018)76, | N=149 patients with known or suspected liver disease | Prospective, validation cohort | MRI cT1* ELF test TE (FibroScan) | Liver biopsy | Liver biopsy performed within 2 weeks of noninvasive assessments |
| Significant Fibrosis | Steatosis | Advanced NASH (NAS ≥4 and ≥F2) | ||||||||
| Study | Population | Test | AUROC (95% CI) Sensitivity Specificity | Test | AUROC (95% CI) Sensitivity Specificity | Test | AUROC (95% CI) Sensitivity Specificity | |||
| Grade ≥1 | Grade ≥2 | Grade ≥3 | ||||||||
| Beyer et al (2021)74, | Suspected NAFLD/NASH | - | - | MRI PDFF (LMS-IDEAL)* | 1.0 (0.99 to 1.00) 99% 100% | 0.77 (0.73 to 0.82) 72% 72% | 0.81 (0.76 to 0.87) 68% 81% | - | - | |
| - | - | CAP (FibroScan) | 0.95 (0.91 to 0.99) 89% 100% | 0.60 (0.55 to 0.65) 78% 41% | 0.63 (0.57 to 0.70) 61% 59% | - | - | |||
| Stage ≥2 | ||||||||||
| Imajo et al (2021)75, | Suspected NASH | MRE | 0.92 (0.87 to 0.97) NR NR | MRI liver fat* | 0.92 (0.87 to 0.98) NR NR | 0.86 (0.80 to 0.93) NR NR | - | MRI cT1* | 0.74 (0.66 to 0.82) NR NR | |
| VCTE-LSM | 0.88 (0.81 to 0.95) NR NR | CAP (FibroScan) | 0.75 (0.58 to 0.92) NR NR | 0.68 (0.59 to 0.78) NR NR | - | MRI liver fat* | 0.71 (0.63 to 0.80) NR NR | |||
| 2D-SWE | 0.87 (0.76 to 0.99) NR NR | MRE | 0.66 (0.57 to 0.75) NR NR | |||||||
| MRI cT1* | 0.62 (0.49 to 0.74) NR NR | VCTE-LSM | 0.64 (0.54 to 0.74) NR NR | |||||||
| Stage ≥3 | Stage ≥5 | |||||||||
| McDonald et al (2018)76, | Known or suspected liver disease (unselected) | MRI cT1* | 0.72 (0.63 to 0.80) 88% 51% | 0.72 (0.64 to 0.81) 71% 64% | ||||||
| ELF test | 0.70 (0.61 to 0.78) 49% 77% | 0.68 (0.57 to 0.79) 19% 91% | ||||||||
| TE | 0.84 (0.76 to 0.91) NR NR | 0.86 (0.79 to 0.93) NR NR | ||||||||
Jayaswal et al (2020) compared the prognostic value of MRI cT1 measurements, transient elastography, and multianalyte serum assays in a cohort of 197 patients with compensated chronic liver disease.77, Patients who were referred for a clinically indicated liver biopsy, or with a known diagnosis of liver cirrhosis, were eligible. At baseline, patients underwent multiparametric MRI scans, transient elastography, and blood tests. Additionally, all patients received a liver biopsy and had their fibrosis rated on the Ishak scale; results of the biopsies informed clinical care. The most common underlying disease states were NAFLD (n=85, 43%), viral hepatitis (n=50, 25%), and ALD (n=22, 11%). The primary endpoint was a composite of ascites, variceal bleeding, hepatic encephalopathy, hepatocellular carcinoma, liver transplantation and mortality. Binary cutoff values were predefined. Patients were followed for a median of 43 months. Over this period, 14 new clinical events were recorded, including 11 deaths. The prognostic value of the noninvasive testing is summarized in Table 10. Technical failures were also reported (eg, poor quality scan); reliable measurements were obtained in 182 of 197 (92%) patients for multiparametric MRI and in 121 of 160 (76%) patients for transient elastography (transient elastography was additionally not attempted in 37 patients). The study was limited by having variable follow-up periods and the effect of patients being censored at different time points was not taken into account, so sensitivities, specificities, PPVs, and NPVs should be interpreted cautiously. The CI for the survival analysis was wide likely due to the relatively small number of new clinical events observed.
| Test, Binary Cutoff | Cox Regression Analysis, HR (95% CI) | Sensitivity | Specificity | Positive Predictive Value | Negative Predictive Value |
| Liver cT1 >825 ms | 9.91 (1.287 to 76.24) | 92.3 | 47.3 | 11.9 | 98.8 |
| Transient elastography >8 kPa | 7.79 (0.974 to 62.3) | 88.9 | 51.8 | 12.9 | 98.3 |
| FIB-4 >1.45 | 4.11 (0.91 to 18.56) | 84.6 | 47.7 | 10.9 | 97.6 |
| APRI >1 | 2.645 (0.886 to 7.9) | 46.2 | 79.2 | 14.3 | 95.1 |
| AST/ALT >1 | 6.093 (1.673 to 22.19) | 76.9 | 65.6 | 14.3 | 97.4 |
| Ishak >F4 (liver biopsy) | 12.64 (2.8 to 57.08) | 84.6 | 73.9 | 20.4 | 98.4 |
Pavlides et al (2016) evaluated whether data obtained from multiparametric MRI were predictive of all-cause mortality and liver-related clinical events.78, Patients who were referred for a clinically indicated liver biopsy, or with a diagnosis of liver cirrhosis on MRI scan, were eligible. Liver-related clinical events were defined as liver-related death, hepatocellular carcinoma, and new hepatic decompensation (ie, clinically evident ascites, variceal bleeding, and hepatic encephalopathy). Patients received multiparametric MRI and liver cT1 values were mapped into a Liver Inflammation and Fibrosis (LIF) score. One hundred twenty-three patients were recruited to the study; 6 were excluded due to claustrophobia or incomplete MRI data. Of the 117 patients who had complete MRI data, follow-up data were available for 112; the study reported outcomes on these 112 patients. The most common underlying disease states were NAFLD (35%), viral hepatitis (30%), and ALD (10%). Over a median follow-up time of 27 months, 10 patients had a liver-related clinical event and 6 patients died. No patients who had a LIF <2 (no or mild liver disease) developed a clinical event. Ten of 56 (18%) patients with a LIF ≥2 (moderate or severe liver disease) experienced a clinical event. A study limitation is the use of LIF scores, which are no longer used in clinical practice. The authors further described the study as a small proof-of-principle study.
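To illustrate why an event rate based on 10 of 56 patients is imprecise, the following sketch computes an approximate 95% confidence interval for that proportion. The interval is not reported in the source; the Wilson score method and the function name are assumptions chosen for this example.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts quoted above: 10 clinical events among 56 patients with LIF >= 2.
low, high = wilson_interval(10, 56)
print(f"10/56 = {10 / 56:.1%}, approximate 95% CI {low:.1%} to {high:.1%}")
```

The resulting interval spans roughly a threefold range, consistent with the authors' description of the work as a small proof-of-principle study.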
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs. The primary benefit of multiparametric MRI for chronic liver disease is the ability to avoid liver biopsy in patients without significant fibrosis. There are currently no such published studies to demonstrate the effect on patient outcomes.
Multiparametric MRI has been used as an alternative to biopsy for measuring fibrosis or cirrhosis in clinical trials. Phase 2 clinical trials have used multiparametric MRI to measure the therapeutic efficacy of investigational treatments for NASH79, and NAFLD.80,
The utility of multiparametric MRI to provide clinically useful information on the presence and extent of liver fibrosis and inflammation has been evaluated in smaller prospective studies. Specifically, it has been evaluated in the setting of biochemical remission in liver diseases where noninvasive testing for continued disease activity could further aid in direct management of patients as a prognostic marker of future liver-related complications. Quantitative multiparametric MRI has been used to measure disease burden after treatment in patients with chronic HCV81, and autoimmune hepatitis.82,83,84,85,
Currently, there is no evidence demonstrating that use of the test to assess response to therapy affects decision making, or that any resulting changes in management lead to improved outcomes.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
For individuals who have chronic liver disease who receive multiparametric MRI, the evidence includes several prospective and retrospective observational studies. Multiparametric MRI (eg, LiverMultiScan) has been studied in mixed populations, including NAFLD/MASLD, viral hepatitis, and ALD. Quantitative MRI provides various measures assessing both liver fat content and fibrosis and inflammation. Various cutoffs have been utilized for positivity. Generally, multiparametric MRI performed similarly to transient elastography, and fewer technical failures of multiparametric MRI were reported. Given these limitations and the imperfect reference standard, it can be difficult to interpret performance characteristics. The prognostic ability of quantitative MRI to predict liver-related clinical events has been evaluated in 2 studies; both reported positive correlations with wide CIs. Larger cohorts with a longer follow-up time would be useful to further derive the prognostic ability. Additionally, multiparametric MRI has been used to measure the presence of fibrosis or cirrhosis in the patients who have achieved biochemical remission after treatment in small prospective studies.
Population Reference No. 5
The purpose of noninvasive testing in individuals with CLD is to detect liver fibrosis so that individuals can avoid the potential adverse events of an invasive liver biopsy and receive appropriate treatment. The degree of liver fibrosis is an important factor in determining the appropriate approach for managing individuals with liver disease (e.g., hepatitis, ALD, NAFLD/MASLD).
The following PICO was used to select literature to inform this review.
The relevant population of interest is individuals with chronic liver disease.
The tests being considered are other noninvasive imaging methods, including MRE, ARFI, and RTE (see Table 3).
The following tests and practices are currently being used to diagnose chronic liver disease: liver biopsy, other noninvasive radiologic methods, and multianalyte serum assays.
The general outcomes of interest are test validity, morbid events, and treatment-related morbidity. Follow-up over months to years is of interest to the relevant outcomes.
For the evaluation of the clinical validity of the tests within this review, studies that meet the following eligibility criteria were considered:
Reported on the accuracy of the marketed version of the technology (including any algorithms used to calculate scores).
Included a suitable reference standard.
Patient/sample clinical characteristics were described.
Patient/sample selection criteria were described.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Duarte-Rojo et al (2025) conducted a systematic review to assess the evidence on the accuracy of transient elastography (TE), shear wave elastography (ARFI imaging), and magnetic resonance elastography to stage liver fibrosis.49, This review was undertaken to support the AASLD guidelines on noninvasive imaging technologies for staging liver fibrosis in CLD. A comprehensive search was performed for studies (published through April 2022) assessing these methods for the identification of significant fibrosis (F2-4), advanced fibrosis (F3-4), or cirrhosis (F4), using histopathology as the standard of reference by liver disease etiology in adults or children. Two-hundred and forty (240) studies (N=61,193 patients) were included in this systematic review. Regarding pSWE (see Table 3), 8 studies reported its accuracy to stage fibrosis in NAFLD. For significant fibrosis (F2-4), a pSWE-LSM cutoff value of 1.2 m/s showed a sensitivity of 85%–90% and a specificity of 36%–90%, whereas for advanced fibrosis (F3-4), the 1.5 m/s threshold had a sensitivity of 70% and a specificity of 92%. To detect cirrhosis (F4), pSWE-LSM at a cutoff of 2 m/s had a sensitivity of 75%–90% and a specificity of 67%–90%. Regarding 2D-SWE, 11 studies reported its accuracy to stage fibrosis in NAFLD. For significant fibrosis (F2-4), using a 2D-SWE-LSM cutoff value of 7.4 kPa, the sensitivity was 85% and the specificity was 79%, whereas for advanced fibrosis (F3-4), the 8.4 kPa threshold had a sensitivity of 90% and a specificity of 79%. To detect cirrhosis (F4), 2D-SWE-LSM at a cutoff value of 10 kPa had a sensitivity of 83%–92% and a specificity of 76%–90%.
Tables 11 and 12 summarize the characteristics and results of systematic reviews that have assessed the diagnostic accuracy of ARFI imaging.
| Study | Dates | Studies | N | Population |
| Bota et al (2013)51, | To May 2012 | 6 | 518 | Chronic hepatitis |
| Crossan et al (2015)20, | 1998 to Apr 2012 | 4 | NR | HCV |
| Guo et al (2015)86, | To Jun 2013 | 15 | 2128 | Multiple diseases |
| Hu et al (2017)87, | To Apr 2016 | 23 | 2691 | Chronic HBV or HCV |
| Lin et al (2020)88, | To Apr 2019 | 29 | NR | Non-viral liver disease |
| Jiang et al (2018)67, | To Dec 2017 | 9 | 982 | NAFLD |
| Liu et al (2015)89, | To Jul 2014 | 7 | 723 | NAFLD |
| Nierhoff et al (2013)90, | 2007 to Feb 2012 | 36 | 3951 | Multiple diseases |
| Significant Fibrosis (ie, Metavir Stages F2 to F4) | Cirrhosis (ie, Metavir Stage F4) | ||||
| Study | Population | Studies/ Sample Size | AUROC (95% CI) Sensitivity (95% CI) Specificity (95% CI) | Studies/ Sample Size | AUROC (95% CI) Sensitivity (95% CI) Specificity (95% CI) |
| Bota et al (2013)51, | Chronic hepatitis | 6/518 | 0.88 (0.83 to 0.93) NR NR | 0.92 (0.87 to 0.98) NR NR | |
| Crossan et al (2015)20, | HCV | 4/NR | NR 85% (69% to 94%) 89% (72% to 97%) | ||
| Guo et al (2015)86, | Multiple diseases | 13/NR | NR 76% (73% to 78%) 80% (77% to 83%) | 14/NR | NR 88% (84% to 91%) 80% (81% to 84%) |
| Hu et al (2017)87, | HBV, HCV | 15/NR | 88% (85% to 91%) 75% (69% to 78%) 85% (81% to 89%) | ||
| Jiang et al (2018)67, | NAFLD | 6/NR | 0.86 (0.83 to 0.89) 70% (59% to 79%) 84% (79% to 88%) | 7/NR | 0.95 (0.93 to 0.97) 89% (60% to 98%) 91% (82% to 95%) |
| Liu et al (2015)89, | NAFLD | 7/723 | NR 80% (76% to 84%) 85% (81% to 89%) | ||
| Lin et al (2020)88, | Non-viral liver disease | 23/NR | 0.87 (0.83 to 0.89) 79% (73% to 83%) 81% (75% to 86%) | 14/NR | 0.94 (0.92 to 0.96) 89% (79% to 95%) 89% (85% to 92%) |
| Nierhoff et al (2013)90, | Multiple diseases | 26/NR | 0.83 (0.80 to 0.86) NR NR | 27/NR | 0.91 (0.89 to 0.93) NR NR |
The previously introduced 5-year observational study by Kluppel et al (2023) compared the prognostic value of ARFI elastography, the FIB-4 score, and liver biopsy.91, ARFI was significantly better than FIB-4 at predicting liver-related death within 5 years (p=.02), but it did not differ significantly from biopsy (p=.83). For predicting liver decompensation or variceal bleeding, ARFI outperformed both biopsy (p=.02) and FIB-4 (p=.003). However, there was no significant difference between ARFI and biopsy (p=.33) or FIB-4 (p=.14) in predicting hepatocellular carcinoma.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
There are currently no published studies that directly demonstrate the effect of ARFI imaging on patient outcomes.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of ARFI imaging has not been established, a chain of evidence supporting the clinical utility of this test for this population cannot be constructed.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Duarte-Rojo et al (2025) conducted a systematic review to assess the evidence on the accuracy of transient elastography (TE), shear wave elastography (ARFI imaging), and magnetic resonance elastography to stage liver fibrosis.49, This review was undertaken to support the AASLD guidelines on noninvasive imaging technologies for staging liver fibrosis in CLD. A comprehensive search was performed for studies (published through April 2022) assessing these methods for the identification of significant fibrosis (F2-4), advanced fibrosis (F3-4), or cirrhosis (F4), using histopathology as the standard of reference by liver disease etiology in adults or children. Two-hundred and forty (240) studies (N=61,193 patients) were included in this systematic review.
Twelve studies reported MRE accuracy to stage fibrosis in NAFLD. For significant fibrosis (F2-4), an MRE-LSM cutoff value of 3.4 kPa yielded a sensitivity of 78% and a specificity of 90%, whereas for advanced fibrosis (F3-4), with a cutoff of 3.7 kPa, the sensitivity was 82%–93% and the specificity was 90%–95%. To detect cirrhosis (F4), an MRE-LSM cutoff value of 6.7 kPa had a sensitivity of 91% and a specificity of 95%.
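As an illustration of how pooled sensitivity and specificity translate into quantities that bear on an individual test result, the following minimal sketch computes likelihood ratios and predictive values for the MRE cirrhosis (F4) estimates quoted above. The prevalence value is a hypothetical assumption chosen only for illustration and is not reported in the review; the function names are likewise illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) via Bayes' rule at an assumed disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Pooled MRE estimates for cirrhosis (F4) at the 6.7 kPa cutoff, from the text.
SENS_F4, SPEC_F4 = 0.91, 0.95
# Hypothetical prevalence of cirrhosis in the tested population (assumption).
ASSUMED_PREVALENCE = 0.10

lr_pos, lr_neg = likelihood_ratios(SENS_F4, SPEC_F4)
ppv, npv = predictive_values(SENS_F4, SPEC_F4, ASSUMED_PREVALENCE)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f} at {ASSUMED_PREVALENCE:.0%} prevalence")
```

Because predictive values depend on prevalence, the same sensitivity and specificity would yield different PPV and NPV in a referral population than in a screening population.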
Tables 13 and 14 summarize the characteristics and results of systematic reviews that have assessed the diagnostic accuracy of MRE. MRE has been studied primarily in hepatitis and NAFLD.
| Study | Dates | Studies | N | Population |
| Crossan et al (2015)20, | 1998 to Apr 2012 | 3 | NR | CLD |
| Guo et al (2015)86, | To Jun 2013 | 11 | 982 | Multiple diseases |
| Singh et al (2015)92, | 2003 to Sep 2013 | 12 | 697 | Chronic hepatitis |
| Singh et al (2016)93, | To Oct 2014 | 9 | 232 | NAFLD |
| Xiao et al (2017)94, | To 2016 | 5 | 628 | NAFLD |
| | | Significant Fibrosis (ie, Stages F2 to F4) | | Cirrhosis (ie, Stage F4) | |
| Study | Population | Studies/ Sample Size | AUROC (95% CI) Sensitivity (95% CI) Specificity (95% CI) | Studies/ Sample Size | AUROC (95% CI) Sensitivity (95% CI) Specificity (95% CI) |
| Crossan et al (2015)20, | CLD | 3/NR | NR 94% (13% to 100%) 92% (72% to 98%) | ||
| Guo et al (2015)86, | Multiple diseases | 9/NR | NR 87% (84% to 90%) 94% (91% to 97%) | | NR 93% (88% to 96%) 91% (88% to 93%) |
| Singh et al (2015)92, | Chronic hepatitis | 12/697 | 0.84 (0.76 to 0.92) 73% (NR) 79% (NR) | 12/697 | 0.92 (0.90 to 0.94) 91% (NR) 81% (NR) |
| Singh et al (2016)93, | NAFLD | 9/232 | 0.87 (0.82 to 0.93) 79% (76% to 90%) 81% (72% to 91%) | 9/232 | 0.91 (0.76 to 0.95) 88% (82% to 100%) 87% (77% to 97%) |
| Xiao et al (2017)94, | NAFLD | 3/384 | 0.88 (0.83 to 0.92) 73.2% (65.7% to 87.3%) 90.7% (85.0% to 95.7%) | 3/384 | 0.92 (0.80 to 1.00) 86.6% (80.0% to 90.9%) 93.4% (91.4% to 94.5%) |
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
There are currently no published studies that directly demonstrate the effect of MRE on patient outcomes.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of MRE has not been established, a chain of evidence supporting the clinical utility of this test for this population cannot be constructed.
A test must detect the presence or absence of a condition, the risk of developing a condition in the future, or treatment response (beneficial or adverse).
Kobayashi et al (2015) published the results of a meta-analysis assessing RTE for staging liver fibrosis.95, The authors selected 15 studies (N=1626) published through December 2013, including patients with multiple liver diseases and healthy adults. A bivariate random-effects model was used to estimate summary sensitivity and specificity. The summary AUROC, sensitivity, and specificity were 0.69, 79% (95% CI, 75% to 83%), and 76% (95% CI, 68% to 82%) for detection of significant fibrosis (stage ≥F2), and 0.72, 74% (95% CI, 63% to 82%), and 84% (95% CI, 79% to 88%) for detection of cirrhosis, respectively. Reviewers found evidence of heterogeneity due to differences in study populations, scoring methods, and cutoffs for positivity. They also found evidence of publication bias based on funnel plot asymmetry.
Hong et al (2014) reported on the results of a meta-analysis evaluating RTE for staging fibrosis in multiple diseases.96, Thirteen studies (N=1,347) published between April 2000 and April 2014 that used a liver biopsy or transient elastography as the reference standard were included. Different quantitative methods were used to measure liver stiffness in the included studies: Liver Fibrosis Index (LFI), Elasticity Index, elastic ratio 1 (ER1), and elastic ratio 2. For predicting significant fibrosis (stage ≥F2), the pooled sensitivities for LFI and ER1 were 78% (95% CI, 70% to 84%) and 86% (95% CI, 80% to 90%), respectively. The specificities were 63% (95% CI, 46% to 78%) and 89% (95% CI, 83% to 94%) and the AUROCs were 0.79 (95% CI, 0.75 to 0.82) and 0.94 (95% CI, 0.92 to 0.96), respectively. For predicting cirrhosis (stage F4), the pooled sensitivities of LFI, ER1, and elastic ratio 2 were 79% (95% CI, 61% to 91%), 96% (95% CI, 87% to 99%), and 79% (95% CI, 61% to 91%), respectively. The specificities were 88% (95% CI, 81% to 93%) for LFI, 89% (95% CI, 83% to 93%) for ER1, and 88% (95% CI, 81% to 93%) for elastic ratio 2, and the AUROCs were 0.85 (95% CI, 0.81 to 0.87), 0.93 (95% CI, 0.94 to 0.98), and 0.92 (95% CI, not reported), respectively. Pooled estimates for Elasticity Index were not performed due to insufficient data.
A test is clinically useful if the use of the results informs management decisions that improve the net health outcome of care. The net health outcome can be improved if patients receive correct therapy, more effective therapy, or avoid unnecessary therapy or testing.
Direct evidence of clinical utility is provided by studies that have compared health outcomes for patients managed with and without the test. Because these are intervention studies, the preferred evidence would be from RCTs.
There are currently no published studies that directly demonstrate the effect of RTE on patient outcomes.
Indirect evidence on clinical utility rests on clinical validity. If the evidence is insufficient to demonstrate test performance, no inferences can be made about clinical utility.
Because the clinical validity of RTE has not been established, a chain of evidence supporting the clinical utility of this test for this population cannot be constructed.
The use of ARFI imaging has been evaluated in viral hepatitis and NAFLD. ARFI imaging has potential advantages over FibroScan. ARFI can be implemented on a standard ultrasound machine, may be more applicable for assessing complications such as ascites, and may be more applicable in obese patients. ARFI imaging appears to have similar diagnostic accuracy to FibroScan, but there are fewer data available on performance characteristics. Validation studies have used varying cutoffs for positivity.
Magnetic resonance elastography (MRE) has a high success rate and is highly reproducible. The diagnostic accuracy also appears to be high. In particular, MRE has high diagnostic accuracy for the detection of fibrosis in NAFLD, independent of BMI and degree of inflammation. However, further validation is needed to determine standard cutoffs and confirm performance characteristics because the confidence intervals for the estimates are wide. MRE is also not widely available. RTE has been evaluated in multiple diseases with varying scoring methods and cutoffs. Although data are limited, the accuracy of RTE appears to be similar to that of FibroScan for the evaluation of significant liver fibrosis, but lower for the evaluation of cirrhosis. There was evidence of publication bias in the systematic review, and the diagnostic accuracy may be overestimated.
A systematic review conducted to inform the AASLD Practice Guidelines (2024) reported that liver-stiffness measurement from shear wave elastography/ARFI and MRE (in addition to TE) shows acceptable to outstanding accuracy for the detection of liver fibrosis across various liver disease etiologies. Accuracy increased from F2-4 to F3-4 and was the highest for F4. Given these limitations and the imperfect reference standard, it is difficult to interpret performance characteristics. There is no direct evidence that other noninvasive radiologic methods improve health outcomes; further, it is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence on clinical validity.
For individuals who have CLD who receive transient elastography (TE), the evidence includes many systematic reviews of more than 50 observational studies (>10,000 patients). Relevant outcomes are test validity, morbid events, and treatment-related morbidity. TE (FibroScan) has been studied in populations with viral hepatitis, NAFLD, and ALD. There are varying cutoffs for positivity. Failures of the test are not uncommon, particularly for those with high body mass index, but these failures often went undetected in analyses of the validation studies. Given these limitations and the imperfect reference standard, it can be difficult to interpret performance characteristics. However, for the purposes of deciding whether a patient has severe fibrosis or cirrhosis, the FibroScan results provide data sufficiently useful to determine therapy. In fact, FibroScan has been used as an alternative to biopsy to establish eligibility regarding the presence of fibrosis or cirrhosis in the participants of several RCTs. These trials showed the efficacy of HCV treatments, which in turn demonstrated that the test can identify patients who would benefit from therapy. The evidence is sufficient to determine that the technology results in an improvement in the net health outcome.
| [X] Medically Necessary | [ ] Investigational |
For individuals who have CLD who receive multiparametric magnetic resonance imaging (MMRI), the evidence includes several prospective and retrospective observational studies. Multiparametric MRI (e.g., LiverMultiScan) has been studied in mixed populations, including NAFLD, viral hepatitis, and ALD. Quantitative MRI provides various measures to assess liver fat content, fibrosis and inflammation. Various cutoffs have been utilized for positivity. Given these limitations and the imperfect reference standard, it can be difficult to interpret performance characteristics. Otherwise, multiparametric MRI performed similarly to transient elastography, and fewer technical failures of multiparametric MRI were reported. The prognostic ability of quantitative MRI to predict liver-related clinical events has been evaluated in 2 studies. Both studies reported positive correlations, but the confidence interval was wide. Larger cohorts with a longer follow-up time would be useful to further derive the prognostic characteristic of the test. Multiparametric MRI has been used to measure the presence of fibrosis or cirrhosis in patients who have achieved biochemical remission after treatment in small prospective studies. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
| [ ] Medically Necessary | [X] Investigational |
For individuals who have CLD who receive noninvasive radiologic methods other than TE or MMRI for liver fibrosis measurement, the evidence includes systematic reviews of observational studies and a comparative study with 5-year follow-up. Relevant outcomes are test validity, morbid events, and treatment-related morbidity. Other radiologic methods (e.g., magnetic resonance elastography [MRE], real-time tissue elastography [RTE], acoustic radiation force impulse [ARFI] imaging) may have similar performance for detecting significant fibrosis or cirrhosis. A systematic review conducted to support the AASLD Practice Guidelines (2024) reported that liver-stiffness measurement from MRE and shear wave elastography/ARFI (in addition to TE) shows high accuracy for the detection of liver fibrosis across various liver disease etiologies. Accuracy increased from F2-4 to F3-4 and was the highest for F4. In the comparative study, ARFI elastography was found to be at least as effective as liver histology in predicting liver-related survival, and was superior to both histology and the FIB-4 score in predicting certain liver-related complications. Studies have frequently included varying cutoffs not prespecified or validated. Given these limitations and the imperfect reference standard, it is difficult to interpret performance characteristics. There is no direct evidence that other noninvasive radiologic methods improve health outcomes; further, it is not possible to construct a chain of evidence for clinical utility due to the lack of sufficient evidence on clinical validity. The evidence is insufficient to determine that the technology results in an improvement in the net health outcome.
| [ ] Medically Necessary | [X] Investigational |
The purpose of the following information is to provide reference material. Inclusion does not imply endorsement or alignment with the evidence review conclusions.
While the various physician specialty societies and academic medical centers may collaborate with and make recommendations, input received does not represent an endorsement or position statement by the physician specialty societies or academic medical centers, unless otherwise noted.
Guidelines or position statements will be considered for inclusion in 'Supplemental Information' if they were issued by, or jointly by, a US professional society, an international society with US representation, or National Institute for Health and Care Excellence (NICE). Priority will be given to guidelines that are informed by a systematic review, include strength of evidence ratings, and include a description of management of conflict of interest.
In 2018, the practice guidelines on the diagnosis and management of nonalcoholic fatty liver disease (NAFLD), developed by the American Gastroenterological Association (AGA), the American Association for the Study of Liver Diseases (AASLD), and the American College of Gastroenterology, stated that “NFS [NAFLD fibrosis score] or FIB-4 [Fibrosis-4] index are clinically useful tools for identifying NAFLD patients with a higher likelihood of having bridging fibrosis (stage 3) or cirrhosis (stage 4).”97, This guideline also cited vibration-controlled transient elastography (VCTE) and magnetic resonance elastography (MRE) as “clinically useful tools for identifying advanced fibrosis in patients with NAFLD.”
A 2022 consensus-based clinical care pathway was published by the AGA on risk stratification and management of NAFLD, including some recommendations regarding the use of noninvasive testing for individuals with chronic liver disease.98, Among individuals with increased risk of NAFLD or nonalcoholic steatohepatitis (NASH)-related fibrosis (i.e., individuals with type 2 diabetes, ≥2 metabolic risk factors, or an incidental finding of hepatic steatosis or elevated aminotransferases), assessment with a nonproprietary fibrosis scoring system such as FIB-4 is recommended, although aspartate transaminase to platelet ratio index can be used in lieu of FIB-4 scoring. Depending on the fibrosis score, imaging-based testing for liver stiffness may be warranted with transient elastography (FibroScan), although bidimensional shear wave elastography or point shear wave elastography are also imaging options included in the clinical care pathway.
In 2023, the AGA published an expert review on the role of noninvasive tests [NITs] in the evaluation and management of NAFLD.99, The following practice advice statements were made.
A 2023 updated practice guidance focused on the clinical assessment and management of NAFLD and hepatic steatosis issued by the AASLD included the following guidance statements on the use of noninvasive techniques for diagnosis and management of NAFLD and hepatic steatosis.7,
All patients with hepatic steatosis or clinically suspected NAFLD based on the presence of obesity and metabolic risk factors should undergo primary risk assessment with FIB-4
In patients with pre-DM [diabetes mellitus], T2DM, or 2 or more metabolic risk factors (or imaging evidence of hepatic steatosis), primary risk assessment with FIB-4 should be repeated every 1–2 years
Although standard ultrasound can detect hepatic steatosis, it is not recommended as a tool to identify hepatic steatosis due to low sensitivity across the NAFLD spectrum
CAP [controlled attenuation parameter] as a point-of-care technique may be used to identify steatosis. MRI-PDFF [proton density fat fraction] can additionally quantify steatosis
If FIB-4 is ≥ 1.3, VCTE, MRE, or ELF [Enhanced Liver Fibrosis] may be used to exclude advanced fibrosis
Improvement in ALT or reduction in liver fat content by imaging in response to an intervention can be used as a surrogate for histological improvement in disease activity
The 2023 guidance recommends that patients with hepatic steatosis or NAFLD/MASLD based on the presence of obesity and metabolic risk factors should undergo primary risk assessment with the FIB-4 index, as this is considered the most valid noninvasive test.7, Patients with FIB-4 scores less than 1.3 are unlikely to have advanced fibrosis. High-risk individuals, such as those with type 2 diabetes, medically complicated obesity, family history of cirrhosis, or more than mild alcohol consumption, should be screened for advanced fibrosis. VCTE or ultrasound-based methods such as ARFI are favored over MRE as initial secondary assessments due to cost considerations. The ELF test is approved for prognostication when advanced fibrosis is suspected, although it can be ordered for secondary risk assessment, particularly because the availability of elastography may be limited in some settings.
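The FIB-4 threshold described above can be illustrated with a minimal sketch. The guidance summarized here does not restate the FIB-4 formula; the code assumes the commonly published definition (age × AST, divided by platelet count × the square root of ALT). The function names and patient values are hypothetical, and the sketch is not a clinical decision tool.

```python
import math

# Threshold cited in the guidance for considering secondary assessment.
FIB4_THRESHOLD = 1.3


def fib4_index(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l):
    """Compute FIB-4 using the commonly published formula (an assumption here):
    (age x AST) / (platelet count x sqrt(ALT))."""
    return (age_years * ast_u_per_l) / (
        platelets_10e9_per_l * math.sqrt(alt_u_per_l)
    )


def secondary_assessment_may_be_used(fib4):
    """True when FIB-4 is at or above 1.3, the point at which the guidance
    states VCTE, MRE, or ELF may be used to exclude advanced fibrosis."""
    return fib4 >= FIB4_THRESHOLD


# Hypothetical patient values, chosen only to make the arithmetic easy to follow.
example = fib4_index(age_years=60, ast_u_per_l=40, alt_u_per_l=36,
                     platelets_10e9_per_l=200)
print(round(example, 2), secondary_assessment_may_be_used(example))
```

For the hypothetical values shown, the numerator is 60 × 40 = 2400 and the denominator is 200 × 6 = 1200, giving a FIB-4 of 2.0, which is above the 1.3 threshold.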
A 2024 publication from the AASLD describes the impact of new nomenclature on the AASLD practice guidance on NAFLD and hepatic steatosis described above.2, Briefly, available data suggest a near-complete overlap (99%) between the metabolic dysfunction-associated steatotic liver disease (MASLD)-defined population and the historical NAFLD-defined population. Therefore, all recommendations on the clinical assessment and management of NAFLD and NASH can be applied to patients with MASLD and metabolic dysfunction-associated steatohepatitis (MASH). Additionally, data from biomarker validation studies among patients with NAFLD and NASH are applicable to patients with MASLD and MASH, respectively, until further guidance.
A 2022 joint clinical practice guideline issued by the American Association of Clinical Endocrinology and AASLD included the following recommendations on the use of noninvasive techniques for diagnosis of NAFLD with clinically significant fibrosis (stage F2 to F4)100,:
Clinicians should use liver fibrosis prediction calculations to assess the risk of NAFLD with liver fibrosis. The preferred noninvasive initial test is the FIB-4 (Grade B, Level 2 evidence)
Clinicians should consider high-risk individuals with an indeterminate or high FIB-4 score for further workup with transient elastography or an enhanced liver fibrosis test, as available (Grade B, Level 2 evidence)
Clinicians should prefer the use of transient elastography as best validated to identify advanced disease and predict liver-related outcomes. Alternative imaging approaches may be considered, including shear wave elastography (less well validated) and/or magnetic resonance elastography (most accurate but with a high cost and limited availability; best if ordered by liver specialist for selected cases) (Grade B, Level 2 evidence).
In 2024, the AASLD published 2 guidelines focused on blood-based and imaging-based noninvasive liver disease assessment (NILDA) of hepatic fibrosis and steatosis.3,4, Recommendations are provided in Table 15 and include guidance for individuals with various etiologies of chronic liver disease, including hepatocellular (hepatitis C virus [HCV], HCV/HIV, hepatitis B virus [HBV], HCV/HBV, HBV/HIV, NAFLD, alcohol-associated liver disease [ALD]) and cholestatic disorders (primary sclerosing cholangitis [PSC], primary biliary cholangitis [PBC]).
| Blood-based |
|
| Imaging-based |
|
In 2016, the NICE published guidance on the assessment and management of NAFLD.101, The guidance did not reference elastography. The guidance recommended the enhanced liver fibrosis test to test for advanced liver fibrosis, utilizing a cutoff enhanced liver fibrosis score of 10.51.
In 2024, the AASLD published 2 guidelines focused on blood-based and imaging-based NILDA of hepatic fibrosis and steatosis.3,4, Recommendations regarding the use of these noninvasive assessments for patients with HBV and HCV are found in Table 16.
In 2020, the American Association for the Study of Liver Diseases and Infectious Diseases Society of America guidelines for testing, managing, and treating hepatitis C virus (HCV) recommended that, for counseling and pretreatment assessment purposes, the following should be completed:
“Evaluation for advanced fibrosis using noninvasive markers and/or elastography, and rarely liver biopsy, is recommended for all persons with HCV infection to facilitate decision making regarding HCV treatment strategy and determine the need for initiating additional measures for the management of cirrhosis (eg, hepatocellular carcinoma screening). Rating: Class I, Level A [evidence and/or general agreement; data derived from multiple randomized trials, or meta-analyses]”102,
The guidelines noted that there are several NITs to stage the degree of fibrosis in patients with HCV. Tests included indirect serum biomarkers, direct serum biomarkers, and VCTE. The guidelines asserted that no single method is recognized to have high accuracy alone and careful interpretation of these tests is required.
A 2023 update of this guideline includes noninvasive liver markers such as HCV FibroSure, FIB-4, and FibroScan in their simplified treatment algorithm for HCV.103, Specific recommendations for a preferred noninvasive testing strategy are not provided.
In 2017, the NICE published updated guidance on the management and treatment of patients with hepatitis B virus.104, The guidance recommends offering transient elastography as the initial test in adults diagnosed with chronic hepatitis B, to inform the antiviral treatment decision (Table 17).
| Transient Elasticity Score | Antiviral Treatment |
| >11 kPa | Offer antiviral treatment |
| 6 to 10 kPa | Offer liver biopsy to confirm fibrosis level prior to offering antiviral treatment |
| <6 kPa plus abnormal ALT | Offer liver biopsy to confirm fibrosis level prior to offering antiviral treatment |
| <6 kPa plus normal ALT | Do not offer antiviral treatment |
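The decision table above can be expressed as a small lookup, shown here only as an illustrative sketch of the table as printed. The function name and return strings are assumptions for illustration. Scores the table does not list (for example, values above 10 kPa up to and including 11 kPa) return `None` rather than a guessed action.

```python
BIOPSY = ("Offer liver biopsy to confirm fibrosis level "
          "prior to offering antiviral treatment")


def te_score_action(stiffness_kpa, alt_abnormal):
    """Map a transient elastography score (kPa) and ALT status to the action
    listed in the table. Returns None for scores the table does not cover."""
    if stiffness_kpa > 11:
        return "Offer antiviral treatment"
    if 6 <= stiffness_kpa <= 10:
        return BIOPSY
    if stiffness_kpa < 6:
        # Below 6 kPa, the listed action depends on whether ALT is abnormal.
        return BIOPSY if alt_abnormal else "Do not offer antiviral treatment"
    return None  # Score falls in a range not listed in the table.


# Hypothetical inputs, one per table row plus one uncovered value.
for kpa, alt in [(12.5, False), (8.0, True), (5.0, True), (5.0, False), (10.5, False)]:
    print(kpa, alt, "->", te_score_action(kpa, alt))
```

Returning `None` for unlisted scores keeps the sketch faithful to the table instead of filling the gap between its rows.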
In 2024, the AASLD published 2 guidelines focused on blood-based and imaging-based NILDA of hepatic fibrosis and steatosis.3,4, Recommendations regarding the use of these noninvasive assessments for patients with chronic liver disease, including hepatocellular (HCV, HCV/HIV, HBV, HCV/HBV, HBV/HIV, NAFLD, ALD) and cholestatic disorders (PSC, PBC) are found in Table 16.
In 2020, the American College of Radiology appropriateness criteria rated ultrasound shear wave elastography as an 8 (usually appropriate) for the diagnosis of liver fibrosis in patients with chronic liver disease.105, The criteria noted that high-quality data can be difficult to obtain in obese patients, and assessments of liver stiffness can be confounded by parenchyma, edema, inflammation, and cholestasis.
A 2020 U.S. Preventive Services Task Force Recommendation Statement for HCV screening notes that a diagnostic evaluation for fibrosis stage or cirrhosis with a noninvasive test reduces the risk for harm compared to a liver biopsy.106, This statement does not give preference to a specific noninvasive test.
There is no national coverage determination. In the absence of a national coverage determination, coverage decisions are left to the discretion of local Medicare carriers.
Some currently ongoing and unpublished trials that might influence this review are listed in Table 17.
| NCT No. | Trial Name | Planned Enrollment | Completion Date |
| Ongoing | |||
| NCT06592820a | Shear Wave Elastography Registry Study (SW) | 300 | Mar 2027 (recruiting) |
| NCT06463366 | Multi-parametric Magnetic Resonance Imaging for the Precise Diagnosis and Quantitative Study of Liver Steatosis, Inflammation, and Fibrosis in Chronic Liver Disease. | 100 | Sep 2025 (recruiting) |
| NCT04365855 | The Olmsted NAFLD Epidemiology Study (TONES) | 800 | Jun 2028 (recruiting) |
| NCT04550481 | Role of Lisinopril in Preventing the Progression of Non-Alcoholic Fatty Liver Disease, RELIEF-NAFLD Study | 45 | Sep 2026 (not recruiting) |
| Unpublished | |||
| NCT03789825 | Screening for Liver Fibrosis. A Population-based Study in European Countries. The ''LiverScreen'' Project. | 30000 | Jan 2025 |
| NCT04435054 | Screening for NAFLD-related Advanced Fibrosis in High Risk popuLation: Optimization of the Diabetology Pathway Referral Using Combinations of Non-invAsive Biological and elastogRaphy paramEters | 1000 | Sep 2024 |
| Codes | Number | Description |
|---|---|---|
| | 0002M | Liver disease, ten biochemical assays (ALT, A2-macroglobulin, apolipoprotein A-1, total bilirubin, GGT, haptoglobin, AST, glucose, total cholesterol and triglycerides) utilizing serum, prognostic algorithm reported as quantitative scores for fibrosis, steatosis and alcoholic steatohepatitis (ASH) |
| | 0003M | Liver disease, ten biochemical assays (ALT, A2-macroglobulin, apolipoprotein A-1, total bilirubin, GGT, haptoglobin, AST, glucose, total cholesterol and triglycerides) utilizing serum, prognostic algorithm reported as quantitative scores for fibrosis, steatosis and nonalcoholic steatohepatitis (NASH) |
| | 0166U | Liver disease, 10 biochemical assays (α2-macroglobulin, haptoglobin, apolipoprotein A1, bilirubin, GGT, ALT, AST, triglycerides, cholesterol, fasting glucose) and biometric and demographic data, utilizing serum, algorithm reported as scores for fibrosis, necroinflammatory activity, and steatosis with a summary interpretation |
| | 0344U | Hepatology (nonalcoholic fatty liver disease [NAFLD]), semiquantitative evaluation of 28 lipid markers by liquid chromatography with tandem mass spectrometry (LC-MS/MS), serum, reported as at-risk for nonalcoholic steatohepatitis (NASH) or not NASH |
| | 81517 | Liver disease, analysis of 3 biomarkers (hyaluronic acid [HA], procollagen III amino terminal peptide [PIIINP], tissue inhibitor of metalloproteinase 1 [TIMP-1]), using immunoassays, utilizing serum, prognostic algorithm reported as a risk score and risk of liver fibrosis and liver-related clinical events within 5 years |
| | 81596 | Infectious disease, chronic hepatitis C virus (HCV) infection, six biochemical assays (ALT, A2-macroglobulin, apolipoprotein A-1, total bilirubin, GGT, and haptoglobin) utilizing serum, prognostic algorithm reported as scores for fibrosis and necroinflammatory activity in liver |
| | 83883 | Nephelometry, each analyte not elsewhere specified (no specific code for FIBROSpect) |
| | 76391 | Magnetic resonance (eg, vibration) elastography |
| | 76981 | Ultrasound, elastography; parenchyma (eg, organ) |
| | 76982 | Ultrasound, elastography; first target lesion |
| | 76983 | Ultrasound, elastography; each additional target lesion (List separately in addition to code for primary procedure) |
| | 91200 | Liver elastography, mechanically induced shear wave (eg, vibration), without imaging, with interpretation and report |
| ICD-10-CM | K70.0-K77 | Liver diseases code range (fibrosis is K74.0) |
| | R94.5 | Abnormal results of liver function tests |
| ICD-10-PCS | Not applicable. There are no ICD procedure codes for laboratory tests. | |
| Type of Service | Medicine | |
| Place of Service | Outpatient |
| Date | Action | Description |
| 11/10/25 | Annual Review | Policy updated with literature review through September 22, 2025; references added. Substantive policy statement edits; intent unchanged. See related policy Pharmacologic Treatments for Metabolic Dysfunction-Associated Steatohepatitis. |
| 12/12/24 | Annual Review | Policy updated with literature review through September 27, 2024; references added. Policy statements unchanged. |
| 12/13/23 | Annual Review | Policy updated with literature review through September 25, 2023; references added. Policy statements unchanged. Added CPT code 81517 Liver disease, analysis of 3 biomarkers (eff 01/01/2024). Deleted 0014M Liver disease, analysis of 3 biomarkers (deleted eff 12/31/2023). |
| 12/05/22 | Annual review | Policy updated with literature review through September 12, 2022; references added. Minor editorial refinements to policy statements; intent unchanged. |
| 12/07/21 | Annual review | Policy updated with literature review through October 4, 2021; references added. Multiparametric magnetic resonance imaging added as investigational for the evaluation or monitoring of patients with chronic liver disease. |
| 12/07/20 | Annual review | Policy updated with literature review through September 17, 2020; references added. Policy statements unchanged. |
| 12/17/19 | Annual review | No changes. |
| 11/14/17 | | |
| 08/12/16 | | |
| 01/12/15 | | |
| 07/10/14 | | |
| 09/17/13 | | |
| 08/15/12 | | |
| 03/20/12 | | |
| 06/29/09 | | iCES |
| 02/16/07 | | |
| 12/11/06 | Created | New policy |