madman
Super Moderator
Abstract
Objectives
A mass spectrometry (LC-MS/MS)-based interlaboratory comparison study was performed for nine steroid analytes with five participating laboratories. The sample set contained 40 pooled samples of human serum generated from preanalyzed leftovers. To obtain a well-balanced distribution across the reference intervals of each steroid, the leftovers first underwent a targeted mixing step.
Methods
All participants measured the sample set once using their own multianalyte protocols and calibrators. Four participants used in-house developed measurement platforms, three of them with IVD-CE certified calibrators; the fifth lab used a complete LC-MS kit from an IVD manufacturer. All labs reported results for 17-hydroxyprogesterone, androstenedione, cortisol, and testosterone, and four labs reported results for 11-deoxycortisol, corticosterone, cortisone, dehydroepiandrosterone sulfate (DHEAS), and progesterone.
Results
Good or acceptable overall comparability was found in Bland-Altman and Passing-Bablok analyses. Mean bias against the overall mean remained within ±10 % except for DHEAS, androstenedione, and progesterone at one site and for cortisol and corticosterone at two sites (max. −18.9 % for androstenedione). The main analytical problems uncovered by this study included a bias not previously identified in proficiency testing, operator errors, non-supported matrix types, and higher inaccuracy and imprecision at the lower ends of measuring intervals.
Conclusions
This study shows that inter-method comparison is essential for monitoring the validity of an assay and should serve as an example of how external quality assessment could work in addition to organized proficiency testing schemes.
Introduction
Modern laboratory diagnostics increasingly rely on liquid chromatography-mass spectrometry (LC-MS/MS) methods for measuring steroidal hormones. Over the past two decades, the possibilities and limitations of this technology have become well understood [1–3]. Metrological traceability has been demonstrated for several key analytes, and the IVD industry has established certified assay formats [1, 4, 5]. In addition, the in-house development of measurement methods (lab-developed tests, LDTs) is widespread in the field of endocrinology. For example, to target prepubertal estradiol and testosterone levels, the measurement range of commercially available kits would have to be extended significantly at its lower end; as a result, clinical needs remain unmet in certain populations (e.g., in pediatrics) [6, 7]. In addition, the sensitivity of IVD kits is often limited by poorly optimized chromatography and mass spectrometry settings [8]. A hybrid approach is offered by LDTs that use certified and traceable commercial IVD calibration systems to minimize the bias that the production of calibrator samples can introduce into the method.
Overall, the situation is complex, but the interlaboratory testing schemes that monitor it give a satisfactory picture; foremost among these is the United Kingdom National External Quality Assessment Service (UK NEQAS) proficiency testing (PT) scheme for steroids, which works with real patient materials [9]. Furthermore, mass spectrometric laboratories achieve precision comparable to or better than that of high-throughput automated immunoassay analyzers. The decisive advantage of mass spectrometric examination procedures, however, is that they allow an analytical design that is in principle free of interference, so that patient-specific systematic errors due to nonspecific measurement of the measurand in ligand-binding assays (“cross-reactivity”) can be suppressed.
In the present study, five laboratories from Switzerland and Germany, designated Lab A–Lab E, attempted to test a central hypothesis that explains the advantage of using mass spectrometry in routine clinical diagnostics. All participating laboratories possess multianalyte LC-MS/MS methods for diagnostic use; all laboratories are accredited according to either ISO 15189 or ISO 17025. Most of the laboratories use IVD-certified calibrator materials from one vendor (Supplementary Table S1). The aims of this study were threefold. First, the study was performed to investigate the extent to which random and systematic errors contribute to the results. Second, evaluations were performed to determine whether these figures of merit were comparable to the interlaboratory scatter found in the UK NEQAS PT scheme. Third, the desirable total allowable error (TAE) derived from biological variation (BV) data was compared to the experimentally found intra- and interlaboratory errors to reveal whether the numbers exceed the TAE goals [10, 11]. The role of IVD-certified calibrator materials is of special interest since their use may lead to reduced interlaboratory deviations [12–15]. To test these hypotheses as realistically as possible, one participant purposefully prepared 40 multianalyte pooled serum samples that covered the reference interval of the individual analytes as completely as possible. In this respect, the approach differs from the recently published HarmoSter study, which focused more on targeted individual sample analyses and material comparisons [12, 13].
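The excerpt does not say how the desirable TAE was derived from the BV data. For orientation only, a widely used formulation of the desirable specification combines the within-subject variation $CV_I$ and the between-subject variation $CV_G$ as shown below; whether the authors applied exactly this expression is an assumption, and the symbols are introduced here for illustration.

```latex
\begin{align}
  I_{\text{desirable}}   &\le 0.5 \, CV_I \\
  B_{\text{desirable}}   &\le 0.25 \sqrt{CV_I^{2} + CV_G^{2}} \\
  TAE_{\text{desirable}} &\le 1.65 \, I_{\text{desirable}} + B_{\text{desirable}}
                          = 1.65 \cdot 0.5 \, CV_I + 0.25 \sqrt{CV_I^{2} + CV_G^{2}}
\end{align}
```

Here $I$ is the allowable imprecision and $B$ the allowable bias, both in percent. An observed total error above $TAE_{\text{desirable}}$ would indicate that the goal is exceeded.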
Limitations of this study are the low number of participating laboratories and the restricted sample number, which could affect the validity of the results. However, the results of this study were substantiated by comparison to the UK NEQAS sample set and to other studies mentioned above. Furthermore, since the sample set was not value-assigned by reference methods, individual measurement deviation (bias) could not be calculated against a true value but only against the overall mean.
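To make "bias against the overall mean" concrete, the following minimal Python sketch computes, for one analyte, each lab's mean percent deviation from the per-sample all-lab mean, together with Bland-Altman style limits of agreement (mean ± 1.96 SD of the percent differences). The function names, the data layout, and the demo numbers are hypothetical and not taken from the study. The sketch also assumes every lab reported every sample, which did not hold for all analytes here.

```python
from statistics import mean, stdev


def percent_bias_vs_overall_mean(results):
    """Mean percent bias of each lab against the per-sample all-lab mean.

    results: list of rows, one row per sample, one column per lab
    (hypothetical layout; all values for a single analyte).
    Returns (mean_bias_per_lab, percent_differences_per_lab).
    """
    n_labs = len(results[0])
    per_lab_diffs = [[] for _ in range(n_labs)]
    for row in results:
        consensus = mean(row)  # overall mean of all labs for this sample
        for lab, value in enumerate(row):
            per_lab_diffs[lab].append(100.0 * (value - consensus) / consensus)
    return [mean(d) for d in per_lab_diffs], per_lab_diffs


def limits_of_agreement(diffs):
    """Bland-Altman style limits: mean difference +/- 1.96 * sample SD."""
    centre = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return centre - spread, centre + spread


# Invented demo data: 3 samples (rows) measured by 3 labs (columns).
demo = [
    [10.0, 11.0, 9.0],
    [20.0, 22.0, 18.0],
    [30.0, 33.0, 27.0],
]
bias, diffs = percent_bias_vs_overall_mean(demo)
loa_lab2 = limits_of_agreement(diffs[1])
```

With the invented demo data, the three labs come out at 0 %, +10 %, and −10 % mean bias. Because the reference is the consensus of the participants rather than a reference-method value, a bias shared by all labs would not show up in these numbers.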
Conclusions
This study provides a good overview that compares and describes the current state of multi-steroid LC-MS/MS LDTs. The results provide valuable feedback to the study participants on their respective methods. In addition, generalized conclusions on the performance of LDT solutions in modern routine laboratories could be obtained. The results of this study show that, despite remarkable instrumental heterogeneity, the LC-MS/MS assays provided comparable measurement results. The result scatter was within the generally accepted performance limit (TAE) based on the BV of the endogenously present analytes. Thus, it can be concluded that the application of LC-MS/MS for steroid analysis in clinical laboratories has matured to the point that an increased risk for patients from LDTs cannot be identified [34]. This observation is especially true when considering risks associated with the limitations of currently widely used fully automated immunoassays, such as cross-reactivities or limited assay sensitivities [35].