ESR statement on the stepwise development of imaging biomarkers
Insights into Imaging volume 4, pages 147–152 (2013)
Development of imaging biomarkers is a structured process in which new biomarkers are discovered, verified, validated and qualified against biological processes and clinical end-points. The validation process concerns not only the determination of sensitivity and specificity but also the measurement of reproducibility. Reproducibility assessments and standardisation of the acquisition and data analysis methods are crucial when imaging biomarkers are used in multicentre trials for assessing response to treatment. Quality control in multicentre trials can be performed with the use of imaging phantoms. The cost-effectiveness of imaging biomarkers also needs to be determined. Many imaging biomarkers are being developed, but there are still unmet needs, for example in the detection of tumour invasiveness.
• Using imaging biomarkers to streamline drug discovery and the monitoring of disease progression is a major advance in healthcare.
• The qualification and technical validation of imaging biomarkers pose unique challenges in that accuracy, methods, standardisation and reproducibility must be strictly monitored.
• The clinical value of new biomarkers is of the highest priority in terms of patient management, assessing risk factors and disease prognosis.
There is increasing interest in developing the quantitative imaging of biomarkers in personalised medicine. Biomarkers are defined as “characteristics that are objectively measured and evaluated as indicators of normal biological processes, pathological processes, or pharmaceutical responses to a therapeutic intervention”. Broadly, biomarkers fall into two categories: bio-specimen biomarkers, including molecular biomarkers and genetic biomarkers, and bio-signal biomarkers or imaging biomarkers. Bio-specimen biomarkers are obtained by removing a sample from a patient. Examples of these molecular biomarkers are genes and proteins detected from fluids or tissue samples. Bio-signal biomarkers remove no material from the patient, but rather detect and analyse an electromagnetic, photonic or acoustic signal emitted by the patient. These imaging biomarkers have the advantage of being non-invasive, spatially resolved and repeatable. They are of particular interest if they can overcome the limitations of the established histological “gold standards”. Indeed, invasive reference examinations, such as biopsy, can be inconclusive, are not representative of the whole tissue (a major limitation when assessing malignant tumours, which are known to be heterogeneous) and carry a non-negligible risk of morbidity and mortality.
Genetic biomarkers indicate whether a disease may occur, but they are usually poorly suited to assessing the presence and stage of a disease. Similar to molecular biomarkers, imaging biomarkers can be used for early detection of diseases, staging and grading, and predicting or assessing the response to treatment. However, because of their relatively lower cost compared with imaging, molecular biomarkers may be more appropriate for disease screening and early detection than imaging biomarkers. With their high sensitivity, molecular biomarkers could also detect subclinical stages of disease before any morphological or functional change is detectable on imaging. In contrast, imaging biomarkers are often more useful than molecular biomarkers for disease staging and grading and for assessing tumour response, because localised information is crucial.
As with new drugs, biomarkers must pass through a development pipeline that runs from discovery, through verification in different laboratories, to validation and qualification before they can be used in clinical routine. Validation includes the determination of the accuracy and the precision (reproducibility) of the biomarker, and standardisation concerns both acquisition and analysis. Qualification, defined as a “graded, fit-for-purpose evidentiary process linking a biomarker with biological processes and clinical end-points”, is a validation process in large cohorts of patients involving multiple centres, similar to phase III clinical trials, to obtain regulatory approval as surrogate endpoints. A more extensive path to biomarker development has been reported. The first step is the proof of concept, which defines any specific change relevant to the disease that can be studied using the available imaging and computational techniques. The relationship between this change and the presence, grading and response to treatment of the disease constitutes the proof of mechanism. The images needed to extract the biomarker must be appropriate (in terms of resolution, signal and contrast behaviour). Preparation of images relates to improving the data before the analysis (such as segmentation, filtering, interpolation or registration). The analysis and modelling of the signal, by numerical fitting of a mathematical model, allow the needed information to be extracted (such as structural, physical, chemical, biological and functional properties). After this voxel-by-voxel computation, the spatial distribution of the biomarker can be depicted by parametric images, defined as derived secondary images whose pixels represent the distribution of values of a given parameter. Multivariate parametric images obtained by statistical modelling of the relevant parameters allow the reduction of data and a clear definition of the disease target.
The abnormal values should be defined and measured through histogram analysis. A pilot test on a small sample of subjects, with and without the disease, has to be performed to validate the process—also called proof of principle—and to evaluate the influence of potential variations related to age, sex or any other source of biases. Finally, proofs of efficacy and effectiveness on larger and well-defined series of patients will show the ability of a biomarker to measure the clinical endpoint (Fig. 1).
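As an illustration of the histogram analysis mentioned above, the following minimal Python sketch summarises the voxel values of a parametric map with a few descriptors. The function name `histogram_descriptors`, the choice of descriptors and the example values are assumptions made for illustration; they are not prescribed by this statement.

```python
import statistics


def histogram_descriptors(values):
    """Summarise voxel values of a parametric map (hypothetical helper).

    Returns the mean, sample standard deviation, range, 10th percentile,
    median, 90th percentile and excess kurtosis of the values.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two voxel values are required")
    mean = statistics.fmean(values)
    # Population central moments, used for the kurtosis estimate.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # Nine cut points dividing the sorted data into ten equal groups.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return {
        "mean": mean,
        "sd": statistics.stdev(values),
        "range": (min(values), max(values)),
        "p10": deciles[0],
        "median": statistics.median(values),
        "p90": deciles[8],
        "excess_kurtosis": m4 / m2 ** 2 - 3 if m2 > 0 else float("nan"),
    }


# Hypothetical voxel values, in arbitrary units.
example_values = [float(v) for v in range(1, 12)]
print(histogram_descriptors(example_values))
```

Which descriptors are reported (full histogram, quartiles or deciles) remains a choice to be made for each clinical problem.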
Before being routinely used in the clinic, imaging biomarkers must be validated. Determining the accuracy implies calculating the sensitivity and specificity of the biomarker when compared with a biological process, such as tumour necrosis, which can be assessed at histopathological examination.
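The accuracy calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that each subject has a binary biomarker reading and a binary reference result (for example, necrosis present or absent at histopathology); the function name and the example data are hypothetical.

```python
def sensitivity_specificity(biomarker_positive, reference_positive):
    """Return (sensitivity, specificity) from paired boolean readings."""
    if len(biomarker_positive) != len(reference_positive):
        raise ValueError("inputs must have the same length")
    tp = fp = tn = fn = 0
    for test, truth in zip(biomarker_positive, reference_positive):
        if truth and test:
            tp += 1  # disease present, biomarker positive
        elif truth and not test:
            fn += 1  # disease present, biomarker negative
        elif not truth and test:
            fp += 1  # disease absent, biomarker positive
        else:
            tn += 1  # disease absent, biomarker negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical readings for eight subjects.
imaging = [True, True, False, True, False, False, True, False]
histology = [True, True, True, False, False, False, True, False]
print(sensitivity_specificity(imaging, histology))  # (0.75, 0.75)
```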
This validation process is challenging because changes in tissue properties due to diseases that are measured by imaging, such as the diffusion coefficients at DW-MRI or the mechanical properties at MR elastography, are only indirectly linked to structural changes such as necrosis, cellularity, fibrosis and vascular architecture. Moreover, the functional properties that are measured may be influenced by other co-existing factors, such as inflammation, perfusion, permeability and interstitial pressure. For example, the apparent diffusion coefficient (ADC) is decreased in chronic liver disease. This ADC decrease has been shown to be influenced by increased fibrosis, inflammation and steatosis, as well as by decreased perfusion [6–9]. Equating what is measured by imaging and what is occurring at the cellular level in tissue is a difficult task because our understanding of the biophysical underpinnings of many imaging biomarkers, such as diffusion measurements of in vivo systems, remains partial [10, 11].
To help in this understanding, pre-validation studies are conducted in animal models of the disease of interest, where histopathological analysis and other invasive reference examinations can be easily conducted. More basic ex vivo research in tissues, phantoms or theoretical models may also help in understanding the relationship between signal formation and the underlying physiopathology. The transition to the patient then has to be made, and the biomarker validated once again using small-cohort and then large-cohort clinical studies.
The ultimate goal for an imaging biomarker is to understand its predictivity so well that it can become a surrogate for clinical outcome. One primary end-point in therapy assessment studies is patient survival. No imaging biomarker, even the familiar “response evaluation criteria in solid tumours” (RECIST), universally employed in oncology drug development, is widely accepted as a surrogate for survival. The RECIST criteria can be used to define time to progression, but an increase in time to progression as a result of therapy is not necessarily a surrogate of improved overall survival, as shown by the Avastin (bevacizumab) story. In 2011, the FDA withdrew approval for the combined use of Avastin and chemotherapy for the treatment of metastatic breast cancer because preliminary licensing was predicated on future demonstration of improvement in survival or quality of life, neither of which was forthcoming when clinical trials were completed.
Surrogacy can only be reliably established with a large number of adequately powered clinical studies using a variety of interventions, and with the aid of meta-analyses. This is a daunting goal, which constitutes the very last step in biomarker qualification.
Repeatability (measurements at short intervals on the same subjects using the same equipment in the same centres) and reproducibility (measurements at short intervals on the same subjects using different facilities in the same and different centres) studies must be conducted for image acquisition and image analysis. These studies have to be performed with the same observer (intra-observer variability) and with different observers (inter-observer variability). Repeatability and reproducibility are particularly important to assess if the imaging biomarkers are to be used in longitudinal studies, for example for treatment follow-up, to ensure that the changes in parameter are caused by a response to treatment and not by inherent technical or physiological variation. The reproducibility will affect the diagnostic usefulness of the biomarker. As an example, it is known that perfusion parameters are markedly variable between subjects. Therefore, it has been reported that post-therapy decrease of Ktrans should at least be in the 30–50 % range to represent a significant therapy-induced change, whereas for ADC at DW-MRI a change of 10–20 % would be sufficient. Reproducibility studies are now very often included in scientific papers, as advised by the “standards for reporting of diagnostic accuracy” (STARD) criteria, and should ideally include Bland-Altman plots and results of coefficients of repeatability [16, 17].
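The Bland-Altman statistics and the coefficient of repeatability mentioned above can be computed as in the following minimal Python sketch. It assumes exactly two measurements per subject and approximately normally distributed differences; the function names and the example ADC values are hypothetical.

```python
import math
import statistics


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) of agreement for paired data."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def coefficient_of_repeatability(first, second):
    """Return 1.96 * sqrt(2) * within-subject SD for two repeats per subject."""
    diffs = [a - b for a, b in zip(first, second)]
    # With two repeats, the within-subject variance is sum(d^2) / (2n).
    within_variance = sum(d * d for d in diffs) / (2 * len(diffs))
    return 1.96 * math.sqrt(2 * within_variance)


# Hypothetical test-retest ADC values (x10^-3 mm^2/s) in five subjects.
scan_1 = [1.10, 1.25, 0.98, 1.40, 1.15]
scan_2 = [1.14, 1.21, 1.02, 1.36, 1.19]
print(bland_altman(scan_1, scan_2))
print(coefficient_of_repeatability(scan_1, scan_2))
```

Under these assumptions, a change observed in an individual patient that is smaller than the coefficient of repeatability cannot be distinguished from measurement variation.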
Standardisation relates to the establishment of norms or requirements about technical aspects. In the development of imaging biomarkers, two main aspects should be considered.
Standardisation of image acquisition: similar acquisition parameters should be used across imaging platforms when these parameters affect the results of the biomarker. For example, the calculation of ADC depends on the number and choice of the gradient “b” values. A collaborative paper by Padhani et al. lays the foundation for acquisition standardisation, notably by recommending that monoexponential assessments of ADC should use two b values above 100 s/mm².
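To show why the choice of b values enters the result directly, here is a minimal Python sketch of a two-point monoexponential ADC estimate. It assumes the standard model S(b) = S0 · exp(−b · ADC), with b in s/mm² and ADC in mm²/s; the function name and the example signals are hypothetical.

```python
import math


def adc_two_point(signal_low_b, signal_high_b, b_low, b_high):
    """Estimate ADC (mm^2/s) from signals at two b values (s/mm^2).

    Follows from S(b) = S0 * exp(-b * ADC):
    ADC = ln(S_low / S_high) / (b_high - b_low).
    """
    if b_high <= b_low:
        raise ValueError("b_high must be larger than b_low")
    if signal_low_b <= 0 or signal_high_b <= 0:
        raise ValueError("signals must be positive")
    return math.log(signal_low_b / signal_high_b) / (b_high - b_low)


# Hypothetical noise-free signals simulated with ADC = 1.0e-3 mm^2/s.
true_adc = 1.0e-3
s_150 = 1000.0 * math.exp(-150 * true_adc)
s_800 = 1000.0 * math.exp(-800 * true_adc)
print(adc_two_point(s_150, s_800, b_low=150, b_high=800))
```

In practice the estimate is computed voxel by voxel and is affected by noise, so acquisitions with more than two b values are usually fitted by regression rather than with this closed-form expression.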
Moreover, DW-MRI is very sensitive to motion. Motion correction schemes are thus advised for DW-MRI acquisition. However, it is still unclear which scheme is optimal. As an example for upper abdominal studies, some consider that free breathing acquisition produces reliable enough data, even with a better reproducibility than breath-hold, and that a respiratory-triggered scheme produces less reproducible data, while others recommend using tracking-only navigator techniques [19–21].
Standardisation of image analysis: volume and region of interest (ROI) determinations and parameter calculation (mathematical models) should be standardised. In tumour perfusion imaging, it has been shown that the ROI placements in the vascular input and in the tumour influence the results and reproducibility of the parameter measurements. To take motion into account, rigid and non-rigid registration of images at different time points can be used. In heterogeneous lesions such as tumours, imaging biomarkers are frequently calculated as spatially resolved parametric maps. We need to define how to handle the histogram that displays the obtained values. Descriptive statistics such as mean value, standard deviation and range can be obtained directly from the histogram. The main drawback of this approach is its clear tendency to underestimate the changes in body tissues and organs, since the values indicative of disease, or of its most relevant manifestations, are minimised. For this reason, percentiles are used in some settings to obtain a better relationship with the most relevant predictive clinical variables. The optimal type of approach must be defined for each problem (complete histogram, partial histogram in quartiles, partial histogram in deciles). A further approach involves the analysis of the heterogeneity in the spatial distribution of a biomarker provided by its parametric image. To this end, distribution shape statistics such as kurtosis can be used [23–26]. Finally, the choice of the mathematical model that is used to calculate the quantitative parameters also has a major influence on the results that are obtained [27, 28]. Standardisation procedures are currently being developed [18, 29, 30]. It is important that standardisation be a collaborative effort of academia and industry.
Standardisation of data reporting should also be performed. For example, to describe the liver elasticity in cirrhosis, different units (Young modulus in kPa, shear modulus in kPa, wave speed in m/s) and different cut-off values are currently used [31–33]. Standardisation of these data would improve the communication between research groups.
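The three ways of reporting elasticity listed above are related by simple formulas under strong assumptions. The following minimal Python sketch assumes a purely elastic, isotropic, incompressible and locally homogeneous tissue with a density of 1,000 kg/m³, so that the shear modulus is μ = ρc² and the Young modulus is E = 3μ. The function names are hypothetical, and viscosity and frequency dependence are ignored, so values from different techniques may still not be directly comparable.

```python
TISSUE_DENSITY_KG_M3 = 1000.0  # assumed soft-tissue density


def shear_modulus_kpa_from_speed(speed_m_s, density=TISSUE_DENSITY_KG_M3):
    """Shear modulus in kPa from shear wave speed in m/s (mu = rho * c^2)."""
    return density * speed_m_s ** 2 / 1000.0  # Pa to kPa


def young_modulus_kpa_from_shear(shear_kpa):
    """Young modulus in kPa for an incompressible medium (E = 3 * mu)."""
    return 3.0 * shear_kpa


# Hypothetical shear wave speed of 2.0 m/s.
mu = shear_modulus_kpa_from_speed(2.0)
print(mu, young_modulus_kpa_from_shear(mu))  # 4.0 12.0
```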
Adequate phantoms could be used to validate, on a day-to-day basis, that the biomarker stays robust and to avoid any drift in the machine, acquisition or processing protocol. The advantage of using phantoms is that the sequence can be optimised in detail before being performed in patients (which is particularly suited to CT studies, to limit the radiation imposed on the patient), and distribution of the same phantom across imaging platforms allows control of the quality and standardisation of the procedures. Multicentre quality control studies have already been conducted using a simple, ice-water filled, DW-MRI phantom containing tubes of solutions of known diffusion coefficients, which allowed for comparing machines and centres. For ultrasound and CT, phantoms ranging from simple gels with inclusions of different shapes and sizes (for control of tumour size measurement) to complex thoracic models including vasculature inserts (to test perfusion acquisitions) are available [30, 35]. Mechanically induced motion of these phantoms can also be realised. Another possibility is to simulate images based on computerised phantoms. This computerised phantom dataset can even incorporate deformation information mimicking respiration of patients.
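A day-to-day phantom check of the kind described above could look like the following minimal Python sketch. The function name, the 5 % tolerance and the example values are assumptions chosen for illustration; an actual trial would take the reference value from the phantom specification and the tolerance from its own protocol.

```python
def phantom_qc(measured, reference, tolerance_percent=5.0):
    """Compare a phantom measurement with its known reference value.

    Returns (deviation in percent, True if within tolerance).
    """
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    deviation = 100.0 * (measured - reference) / reference
    return deviation, abs(deviation) <= tolerance_percent


# Hypothetical daily diffusion coefficient readings (x10^-3 mm^2/s)
# against a hypothetical reference value of 1.10 for the phantom fluid.
for day, value in enumerate([1.10, 1.12, 1.21], start=1):
    deviation, ok = phantom_qc(value, reference=1.10)
    print(day, round(deviation, 1), "pass" if ok else "drift suspected")
```

Running the same check with the same phantom at every participating centre gives a simple way to compare machines before patient data are pooled.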
When imaging biomarkers are validated for use in drug development studies or clinical trials, several additional points should be considered. First, the imaging biomarker should bring new information on top of existing diagnostic tools or existing risk factors and have the potential to modify patient management. The coronary artery calcium score, one of the most evaluated cardiovascular imaging biomarkers, is not only associated with the risk of future cardiovascular events but also improves the traditional classification of risk by shifting patients from intermediate to high risk categories. It is likely that a panel of biomarkers will be required to achieve the high accuracy required at the clinical level.
Second, the imaging biomarker should be completely non-invasive, so as not to lose the advantage of safe imaging methods over invasive reference examinations. Third, the imaging biomarker should be cost-effective. If the biomarker is to be added to the routine clinical examination without further burdening the public health system with increased costs of care, its diagnostic advantages have to offset its cost. The imaging biomarker should also be easy to implement in the clinic, meaning that the machinery must already exist or be easily available, that no specific expertise should be required from hospital employees, and that the parameter must be easy to measure and interpret. Few guidelines currently exist for imaging biomarker use [40, 41]. Developing, evaluating and implementing guidelines, together with other agencies, may be an important task for the biomarkers subcommittee of the ESR.
Biomarkers also have potential for industry as pharmacodynamic markers and even surrogate endpoints for targeted clinical phase I to III studies. Development of new biomarkers was identified as the highest priority for scientific effort by the FDA to ease the marketing of newly developed drugs.
Development of new biomarkers
Given the difficulties in the qualification and standardisation of existing imaging biomarkers, is there a need to develop additional ones? The answer is yes; for example, in the field of oncology, where the palette of reasonably well-understood biomarkers has major gaps. The hallmarks of cancer include sustaining proliferative signalling, evading growth suppressors, resisting cell death, enabling replicative immortality, inducing angiogenesis, activating invasion and metastasis, reprogramming of energy metabolism and evading immune destruction. Regarding angiogenesis, there are useful biomarkers utilising MRI, CT, ultrasound or PET. For drugs affecting the deregulated cellular energetics of the Warburg effect, FDG-PET offers an obvious assessment. For cellularity, proliferation and apoptosis, a joint public-private partnership between the EU and pharmaceutical companies called “Quantitative Imaging in Cancer: Connecting Cellular Processes (QuIC-ConCePT)” is currently devoted to the validation of imaging biomarkers, namely ADC at DW-MRI, [18F]3′-deoxy-3′-fluorothymidine PET (FLT-PET) and isatin-5-sulphonamide PET ([18F]ICMT-11), an apoptosis radiotracer with subnanomolar affinity for caspase-3 [12, 45, 46]. However, we currently do not have good markers for activation of invasion and appearance of metastasis before these events become macroscopically evident. Thus, development of new imaging biomarkers is still needed.
The European Society of Radiology and its related European Institute for Biomedical Imaging Research (EIBIR) should have a relevant role in coordinating future developments of biomarkers and in the assessment and validation of imaging biomarkers as surrogate end points.
Biomarkers Definitions Working Group (2001) Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther 69(3):89–95
Waterton JC, Pylkkanen L (2012) Qualification of imaging biomarkers for oncology drug development. Eur J Cancer 48(4):409–415
European Society of Radiology (2010) White paper on imaging biomarkers. Insights Imaging 1(2):42–45
Wagner JA, Williams SA, Webster CJ (2007) Biomarkers and surrogate end points for fit-for-purpose development and regulatory evaluation of new drugs. Clin Pharmacol Ther 81(1):104–107
Marti Bonmati L, Alberich-Bayarri A, Garcia-Marti G, Sanz Requena R, Pérez Castillo C, Carot Sierra JM, Herrera M (2012) Imaging biomarkers, quantitative imaging, and bioengineering. Radiol 54(3):269–278
Lewin M, Poujol-Robert A, Boelle PY et al (2007) Diffusion-weighted magnetic resonance imaging for the assessment of fibrosis in chronic hepatitis C. Hepatology 46(3):658–665
Luciani A, Vignaud A, Cavet M et al (2008) Liver cirrhosis: intravoxel incoherent motion MR imaging–pilot study. Radiology 249(3):891–899
Bonekamp S, Torbenson MS, Kamel IR (2011) Diffusion-weighted magnetic resonance imaging for the staging of liver fibrosis. J Clin Gastroenterol 45(10):885–892
Leitao HS, Doblas S, d’Assignies G, Garteiser P, Daire JL, Paradis V, Geraldes CF, Vilgrain V, Van Beers BE (2012) Fat deposition decreases diffusion parameters at MRI: a study in phantoms and patients with liver steatosis. Eur Radiol 23(2):461–467
Le Bihan D, Urayama S, Aso T, Hanakawa T, Fukuyama H (2006) Direct and fast detection of neuronal activation in the human brain with diffusion MRI. PNAS 103(21):8263–8268
Xu J, Does MD, Gore JC (2011) Dependence of temporal diffusion spectra on microstructural properties of biological tissues. Magn Reson Imaging 29(3):380–390
Sinkus R, Van Beers BE, Vilgrain V, DeSouza N, Waterton JC (2012) Apparent diffusion coefficient from magnetic resonance imaging as a biomarker in oncology drug development. Eur J Cancer 48(4):425–431
Yablonskiy DA, Sukstanskii AL (2010) Theoretical models of the diffusion weighted MR signal. NMR Biomed 23(7):661–681
Eisenhauer EA, Therasse P, Bogaerts J et al (2009) New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 45(2):228–247
Padhani AR, Khan AA (2010) Diffusion-weighted (DW) and dynamic contrast-enhanced (DCE) magnetic resonance imaging (MRI) for monitoring anticancer therapy. Target Oncol 5(1):39–52
Bossuyt PM, Reitsma JB, Bruns DE et al (2003) Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Radiology 226(1):24–28
Barnhart HX, Barboriak DP (2009) Applications of the repeatability of quantitative imaging biomarkers: a review of statistical analysis of repeat data sets. Transl Oncol 2(4):231–235
Padhani AR, Liu G, Koh DM et al (2009) Diffusion-weighted magnetic resonance imaging as a cancer biomarker: consensus and recommendations. Neoplasia 11(2):102–125
Taouli B, Koh DM (2010) Diffusion-weighted MR imaging of the liver. Radiology 254(1):47–66
Kwee TC, Takahara T, Koh DM, Nievelstein RA, Luijten PR (2008) Comparison and reproducibility of ADC measurements in breathhold, respiratory triggered, and free-breathing diffusion-weighted MR imaging of the liver. J Magn Reson Imaging 28(5):1141–1148
Ivancevic MK, Kwee TC, Takahara T et al (2009) Diffusion-weighted MR imaging of the liver at 3.0 Tesla using tracking only navigator echo (TRON): a feasibility study. J Magn Reson Imaging 30(5):1027–1033
Zussman B, Jabbour P, Talekar K, Gorniak R, Flanders AE (2011) Sources of variability in computed tomography perfusion: implications for acute stroke management. Neurosurg Focus 30(6):E8
Rajaraman S, Rodriguez JJ, Graff C et al (2011) Automated registration of sequential breath-hold dynamic contrast-enhanced MR images: a comparison of three techniques. Magn Reson Imaging 29(5):668–682
Wagner M, Doblas S, Daire JL, Paradis V, Haddad N, Leitao H, Garteiser P, Vilgrain V, Sinkus R, Van Beers BE (2012) Diffusion-weighted MR imaging for the regional characterization of liver tumors. Radiology 264(2):464–472
Moffat BA, Chenevert TL, Lawrence TS et al (2005) Functional diffusion map: a noninvasive MRI biomarker for early stratification of clinical brain tumor response. PNAS 102(15):5524–5529
Yang X, Knopp MV (2011) Quantifying tumor vascular heterogeneity with dynamic contrast-enhanced magnetic resonance imaging: a review. J Biomed Biotechnol 732848:1–12
Buckley DL (2002) Uncertainty in the analysis of tracer kinetics using dynamic contrast-enhanced T1-weighted MRI. Magn Reson Med 47(3):601–606
Michoux N, Huwart L, Abarca-Quinones J et al (2008) Transvascular and interstitial transport in rat hepatocellular carcinomas: dynamic contrast-enhanced MRI assessment with low- and high-molecular weight agents. J Magn Reson Imaging 28(4):906–914
Leach MO, Brindle KM, Evelhoch JL et al (2005) The assessment of antiangiogenic and antivascular therapies in early-stage clinical trials using magnetic resonance imaging: issues and recommendations. Br J Cancer 92(9):1599–1610
Buckler AJ, Schwartz LH, Petrick N et al (2010) Data sets for the qualification of volumetric CT as a quantitative imaging biomarker in lung cancer. Opt Express 18(14):15267–15282
Huwart L, Sempoux C, Vicaut E et al (2008) Magnetic resonance elastography for the noninvasive staging of liver fibrosis. Gastroenterology 135(1):32–40
Friedrich-Rust M, Nierhoff J, Lupsor M et al (2012) Performance of Acoustic Radiation Force Impulse imaging for the staging of liver fibrosis: a pooled meta-analysis. J Viral Hepat 19(2):e212–e219
Degos F, Perez P, Roche B et al (2010) Diagnostic accuracy of FibroScan and comparison to liver fibrosis biomarkers in chronic viral hepatitis: a multicenter prospective study (the FIBROSTIC study). J Hepatol 53(6):1013–1021
Chenevert TL, Galban CJ, Ivancevic MK et al (2011) Diffusion coefficient measurement using a temperature-controlled fluid for quality control in multicenter studies. J Magn Reson Imaging 34(4):983–987
Lee YC, Fullerton GD, Baiu C, Lescrenier MG, Goins BA (2011) Preclinical multimodality phantom design for quality assurance of tumor size measurement. BMC Med Phys 11:1
Szegedi M, Rassiah-Szegedi P, Fullerton G, Wang B, Salter B (2010) A proto-type design of a real-tissue phantom for the validation of deformation algorithms and 4D dose calculations. Phys Med Biol 55(13):3685–3699
Wilhjelm JE, Jespersen SK, Falk E, Sillesen H (2006) The challenges in creating reference maps for verification of ultrasound images. Ultrasonics 4(Suppl 1):e141–e146
Wang TJ (2011) Assessing the role of circulating, genetic, and imaging biomarkers in cardiovascular risk prediction. Circulation 123(5):551–565
Polonsky TS, McClelland RL, Jorgensen NW et al (2010) Coronary artery calcium score and risk classification for coronary heart disease prediction. JAMA 303(16):1610–1616
Wahl RL, Jacene H, Kasamon Y, Lodge MA (2009) From RECIST to PERCIST: Evolving Considerations for PET response criteria in solid tumors. J Nucl Med 50(Suppl 1):122S–150S
Cummings J, Ward TH, Dive C (2010) Fit-for-purpose biomarker method validation in anticancer drug development. Drug Discov Today 15(19–20):816–825
Richter WS (2006) Imaging biomarkers as surrogate endpoints for drug development. Eur J Nucl Med Mol Imaging 33(Suppl 1):6–10
Woodcock J, Woosley R (2008) The FDA critical path initiative and its influence on new drug development. Annu Rev Med 59:1–12
Hanahan D, Weinberg RA (2011) Hallmarks of cancer: the next generation. Cell 144(5):646–674
Soloviev D, Lewis D, Honess D, Aboagye E (2012) [(18)F]FLT: an imaging biomarker of tumour proliferation for assessment of tumour response to treatment. Eur J Cancer 48(4):416–424
Nguyen QD, Challapalli A, Smith G, Fortt R, Aboagye EO (2012) Imaging apoptosis with positron emission tomography: ‘bench to bedside’ development of the caspase-3/7 specific radiotracer [(18)F]ICMT-11. Eur J Cancer 48(4):432–440
This paper was kindly prepared by the ESR Subcommittee on Imaging Biomarkers (Chairperson: Bernard Van Beers. Research Committee Chairperson: Luis Martí-Bonmatí. Members: Marco Essig, Thomas Helbich, Celso Matos, Wiro Niessen, Anwar Padhani, Harriet C. Thoeny, Siegfried Trattnig, Jean-Paul Vallée. Co-opted members: Peter Brader, Nicolas Grenier) on behalf of the European Society of Radiology (ESR) and with the help of Sabrina Doblas, INSERM U773, Paris, France.
It was approved by the ESR Executive Council in December 2012.
European Society of Radiology (ESR). ESR statement on the stepwise development of imaging biomarkers. Insights Imaging 4, 147–152 (2013). https://doi.org/10.1007/s13244-013-0220-5