Evaluation of five multi-steroid LC-MS/MS methods used for routine clinical analysis

madman · Dec 16, 2023

Abstract

Objectives

A liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based interlaboratory comparison study was performed for nine steroid analytes with five participating laboratories. The sample set contained 40 pooled samples of human serum generated from preanalyzed leftovers. To obtain a well-balanced distribution across the reference interval of each steroid, the leftovers first underwent a targeted mixing step.

Methods

All participants measured the sample set once using their own multianalyte protocols and calibrators. Four participants used in-house developed measurement platforms, three of them with IVD-CE certified calibrators; the fifth lab used a complete LC-MS kit from an IVD manufacturer. All labs reported results for 17-hydroxyprogesterone, androstenedione, cortisol, and testosterone, and four labs reported results for 11-deoxycortisol, corticosterone, cortisone, dehydroepiandrosterone sulfate (DHEAS), and progesterone.

Results

Good or acceptable overall comparability was found in Bland‒Altman and Passing‒Bablok analyses. Mean bias against the overall mean remained less than ±10 % except for DHEAS, androstenedione, and progesterone at one site and for cortisol and corticosterone at two sites (max. −18.9 % for androstenedione). The main analytical problems uncovered by this study included a bias not previously identified in proficiency testing, operator errors, non-supported matrix types, and higher inaccuracy and imprecision at the lower ends of measuring intervals.

Conclusions

This study shows that inter-method comparison is essential for monitoring the validity of an assay and should serve as an example of how external quality assessment could work in addition to organized proficiency testing schemes.

Introduction

Modern laboratory diagnostics increasingly rely on liquid chromatography-mass spectrometry (LC-MS/MS) methods for measuring steroid hormones. Over the past two decades, the possibilities and limitations of this technology have become well understood [1–3]. Metrological traceability has been demonstrated for several key analytes, and the IVD industry has established certified assay formats [1, 4, 5]. In addition, the in-house development of measurement methods (lab-developed tests, LDTs) is widespread in the field of endocrinology. For example, measuring prepubertal estradiol and testosterone levels would require the measurement range of commercially available kits to be extended significantly at its lower end; as a result, clinical needs remain unmet in certain populations (e.g., in pediatrics) [6, 7]. In addition, the sensitivity of IVD kits is often limited by poorly optimized chromatography and mass spectrometry settings [8]. LDTs that use certified and traceable commercial IVD calibration systems represent a mixed form; this approach minimizes the bias that the production of calibrator samples might otherwise introduce into the method.

Overall, the situation is complex, but the interlaboratory monitoring schemes paint a satisfactory picture; foremost among them is the United Kingdom National External Quality Assessment Service (UK NEQAS) proficiency testing (PT) scheme for steroids, which works with real patient materials [9]. Furthermore, mass spectrometric laboratories achieve precision comparable to or better than that of high-throughput automated immunoassay analyzers. The decisive advantage of mass spectrometric examination procedures, however, is that their analytical design can be made essentially free of interference, which suppresses the patient-specific systematic errors that arise in ligand-binding assays from nonspecific measurement of the measurand (“cross-reactivity”).

In the present study, five laboratories from Switzerland and Germany, designated Lab A–Lab E, attempted to test a central hypothesis that explains the advantage of using mass spectrometry in routine clinical diagnostics. All participating laboratories possess multianalyte LC-MS/MS methods for diagnostic use; all laboratories are accredited according to either ISO 15189 or ISO 17025. Most of the laboratories use IVD-certified calibrator materials from one vendor (Supplementary Table S1). The aims of this study were threefold. First, the study investigated the extent to which random and systematic errors contribute to the results. Second, evaluations were performed to determine whether these figures of merit were comparable to the interlaboratory scatter found in the UK NEQAS PT scheme. Third, the desirable total allowable error (TAE) derived from biological variation (BV) data was compared to the experimentally found intra- and interlaboratory errors to determine whether the observed errors exceed the TAE goals [10, 11]. The role of IVD-certified calibrator materials is of special interest since the use of these materials may lead to reduced interlaboratory deviations [12–15]. To test these hypotheses as realistically as possible, one participant purposefully prepared 40 multianalyte pooled serum samples that covered the reference interval of the individual analytes as completely as possible. In this respect, the approach differs from the recently published HarmoSter study, which focused more on targeted individual sample analyses and material comparisons [12, 13].
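
As a rough illustration of the third aim, the sketch below derives a TAE goal from BV data. It assumes the conventional biological-variation model (allowable imprecision and bias as fixed fractions of the within- and between-subject CVs, combined with a 1.65 coverage factor); the excerpt does not state which formula the authors used, and the function name and input values are hypothetical.

```python
import math

# Fractions of biological variation conventionally used for each quality tier
# (assumed here; the excerpt only names the "desirable" and "optimal" tiers).
TIER_FACTORS = {
    "optimal": (0.25, 0.125),
    "desirable": (0.50, 0.250),
    "minimum": (0.75, 0.375),
}

def total_allowable_error(cv_within, cv_between, tier="desirable"):
    """Return the TAE goal in percent.

    cv_within  -- within-subject biological CV in percent
    cv_between -- between-subject biological CV in percent
    """
    k_imprecision, k_bias = TIER_FACTORS[tier]
    allowable_cv = k_imprecision * cv_within
    allowable_bias = k_bias * math.sqrt(cv_within**2 + cv_between**2)
    # 1.65 is the one-sided 95 % coverage factor applied to imprecision.
    return 1.65 * allowable_cv + allowable_bias

# Hypothetical BV inputs, not taken from the study or any database.
print(round(total_allowable_error(10.0, 20.0), 2))
print(round(total_allowable_error(10.0, 20.0, tier="optimal"), 2))
```

Under this model the optimal goal is exactly half the desirable goal, because both tier factors are halved.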

Limitations of this study are the low number of participating laboratories and the restricted sample number, which could affect the validity of the results. However, the results of this study were substantiated by comparison to the UK NEQAS sample set and to other studies mentioned above. Furthermore, since the sample set was not value-assigned by reference methods, individual measurement deviation (bias) could not be calculated against a true value but only against the overall mean.

Conclusions

This study provides a good overview that compares and describes the current state of multi-steroid LC-MS/MS LDTs. The results provide valuable feedback to the study participants on their respective methods. In addition, generalized conclusions on the performance of LDT solutions in modern routine laboratories could be obtained. The results of this study show that, despite remarkable instrumental heterogeneity, the LC-MS/MS assays provided comparable measurement results. The result scatter was within the generally accepted performance limit (TAE) based on the BV of the endogenously present analytes. Thus, it can be concluded that the application of LC-MS/MS for steroid analysis in clinical laboratories has matured to the point that an increased risk for patients from LDTs cannot be identified [34]. This observation is especially true when considering risks associated with the limitations of currently widely used fully automated immunoassays, such as cross-reactivities or limited assay sensitivities [35].

Table 1: Overview of measurement results. Sample numbers (n) with concentrations above the LOQ of the individual laboratories (Labs A–E); laboratory-specific and overall mean of found sample concentrations and concentration ranges, with comparison to biological reference intervals; mean bias against the overall mean with SD of bias; and imprecision statistics, including the number of samples exceeding desirable or optimal TAE goals based on biological variation data.

Figure 1: Passing‒Bablok regression analysis comparing single lab results with the overall mean.

Table 2: Results from Passing‒Bablok regression analysis comparing the overall mean and single measurements of each laboratory (Labs A–E). Laboratory results for the slope parameter are bolded if they are significantly different from the other laboratory results (non-overlapping 95 % CIs).
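
For readers unfamiliar with the method, here is a minimal sketch of a Passing-Bablok point estimate (slope and intercept only, no confidence intervals). It follows the commonly described procedure of taking a shifted median of all pairwise slopes; it is not the authors' software, it assumes positively correlated data, and the function name is made up for illustration.

```python
import math
from itertools import combinations
from statistics import median

def passing_bablok(x, y):
    """Return (slope, intercept) of a Passing-Bablok fit of y on x."""
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # identical points carry no slope information
        if dx == 0:
            slopes.append(math.inf if dy > 0 else -math.inf)
            continue
        slope = dy / dx
        if slope != -1:  # slopes of exactly -1 are excluded by the method
            slopes.append(slope)
    slopes.sort()
    n = len(slopes)
    # Offset the median by the number of slopes below -1.
    k = sum(1 for s in slopes if s < -1)
    if n % 2:
        b = slopes[(n - 1) // 2 + k]
    else:
        b = 0.5 * (slopes[n // 2 - 1 + k] + slopes[n // 2 + k])
    a = median(yi - b * xi for xi, yi in zip(x, y))
    return b, a

# Hypothetical data: overall mean (x) versus one lab's results (y),
# with a single gross outlier in the last sample.
overall_mean = [1.0, 2.0, 3.0, 4.0, 5.0]
lab_results = [1.0, 2.0, 3.0, 4.0, 50.0]
print(passing_bablok(overall_mean, lab_results))
```

Because the estimate is a median of pairwise slopes, the single outlier in the example barely moves the fit, which is the main reason this regression is preferred over ordinary least squares for method comparison.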

Figure 2: Bland‒Altman-styled scatter plots showing the percent difference of single results of each laboratory against the overall mean (y-axis) drawn against the overall mean (x-axis). Consequently, the mean difference against the overall mean is always 0.0 %, as illustrated with a blue line. The overall 1.96 SD interval (dashed red line) is compared with the total allowable error (green line) derived from biological variation data (see Supplementary Table S2).
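
The quantities plotted in this figure can be sketched as follows. The data are invented, the function names are hypothetical, and the sketch assumes every lab reported every sample (results below a lab's LOQ are not handled).

```python
from statistics import mean, stdev

def percent_differences(results):
    """Map each lab to its percent differences from the per-sample overall mean.

    results -- dict of lab name -> list of concentrations, same sample order.
    """
    labs = list(results)
    n_samples = len(results[labs[0]])
    overall = [mean(results[lab][i] for lab in labs) for i in range(n_samples)]
    return {
        lab: [100.0 * (results[lab][i] - overall[i]) / overall[i]
              for i in range(n_samples)]
        for lab in labs
    }

def limits_of_agreement(diffs):
    """Return (mean, lower, upper) of the pooled differences, using 1.96 SD."""
    pooled = [d for lab_diffs in diffs.values() for d in lab_diffs]
    centre = mean(pooled)
    half_width = 1.96 * stdev(pooled)
    return centre, centre - half_width, centre + half_width

# Hypothetical results for three labs and three samples.
results = {
    "Lab A": [10.0, 20.0, 30.0],
    "Lab B": [11.0, 19.0, 33.0],
    "Lab C": [9.0, 21.0, 27.0],
}
diffs = percent_differences(results)
print({lab: round(mean(d), 2) for lab, d in diffs.items()})  # mean bias per lab
print(limits_of_agreement(diffs))
```

Within each sample the differences from the overall mean sum to zero, which is why the pooled mean difference sits at 0.0 % by construction; only the per-lab mean bias and the width of the 1.96 SD interval are informative.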

Figure 3: Comparison of interlaboratory CV values obtained from this study (white dots) and from the UK NEQAS proficiency testing scheme (black dots) as a function of the mean sample concentration. The green line represents the optimal total allowable error (TAE), and the red line represents the desirable TAE, which was derived from the EFLM database or from the literature [8, 22–24]. LC-MS group CV data from UK NEQAS PT distributions 470–494 were used, and only samples in the concentration range of the study samples were included. Study samples: all analytes, n=40; UK NEQAS samples: 17-OH-progesterone, n=56; androstenedione, n=63; cortisol, n=82; testosterone (female and male), n=150; DHEAS, n=54; progesterone, n=22.
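
A minimal sketch of the per-sample interlaboratory CV shown in this figure, and of counting samples above a TAE line. The numbers and function names are hypothetical, and the sample SD (n − 1 denominator) is an assumption, since the excerpt does not say how the CV was computed.

```python
from statistics import mean, stdev

def interlab_cv(values):
    """Percent CV of one sample's results across laboratories."""
    return 100.0 * stdev(values) / mean(values)

def count_exceeding(samples, tae_limit):
    """Count samples whose interlaboratory CV exceeds a TAE limit (percent)."""
    return sum(1 for values in samples if interlab_cv(values) > tae_limit)

# Hypothetical results: each inner list is one sample measured by three labs.
samples = [
    [9.0, 10.0, 11.0],
    [10.0, 10.0, 10.0],
    [4.8, 5.0, 5.2],
]
print([round(interlab_cv(s), 1) for s in samples])
print(count_exceeding(samples, tae_limit=5.0))
```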
