5:00pm - 5:20pm
A Novel Approach to Outlier Identification in Bioassays
Hannes Buchner^{1}, Robert Reidy^{1}, Michael Matiu^{1}, Johannes Solzin^{2}, Alexander Berger^{2}, Armin Boehrer^{2}, Erich Bluhmki^{2,3}
^{1}Staburo GmbH, Munich, Germany; ^{2}Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach a.d. Riss, Germany; ^{3}Hochschule Biberach, University of Applied Sciences, 88400 Biberach, Germany
Introduction:
Human-based identification of unusual or out-of-place measurements from bioassays can suffer from bias as well as inconsistency. With this subjective approach, outlier identification can differ considerably both between analysts and within the same analyst.
To minimise the intrinsic subjectivity in outlier detection, statistical methods should be applied as a classification system in order to achieve consistency [1]. However, classical statistical outlier testing procedures (ROUT, Rosner, Grubbs, Dixon, kSD, Hampel), like statistical hypothesis tests in general, cannot distinguish between relevant and irrelevant differences. This leads to a significant number of irrelevant cases where the human expert needs to overrule the outlier tool. Our proposed approach is a mixed approach in which outliers are identified using all major existing methods in combination with a derived ‘action limit’, which defines a region where observations are not assessed for irregularity.
Method:
The variability between two external and experienced reviewers was assessed using a total of 62 plates from 5 assays with 24 replicates per plate of standard and samples. Using these data, we investigated whether the statistical approach is needed in addition to the action limit rule, and whether the action limit is needed to estimate the EC50 reliably.
The action limit approach defines an exception region centred on a four-parameter logistic model fitted to the bioassay measurements by concentration. The action limit itself is the margin around the model mean, calculated directly from an average variance within a relevant range of assay values. This variance is obtained using historical data from each assay of interest.
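The exception region described above can be illustrated with a minimal Python sketch. It assumes the four-parameter logistic curve has already been fitted; the parameter values, the example variances and, in particular, the multiplier that converts the average historical variance into a margin are hypothetical choices, since the abstract does not state how the margin is scaled.

```python
import math
from statistics import mean


def four_pl(conc, lower, upper, ec50, slope):
    """Mean response of a four-parameter logistic curve at concentration conc > 0."""
    return lower + (upper - lower) / (1.0 + (conc / ec50) ** slope)


def action_limit(historical_variances, multiplier=3.0):
    """Margin around the fitted curve from the average historical variance.

    The multiplier is an assumption made for this sketch only.
    """
    return multiplier * math.sqrt(mean(historical_variances))


def needs_outlier_test(conc, response, params, margin):
    """True if the observation lies outside the exception region."""
    return abs(response - four_pl(conc, *params)) > margin


# Hypothetical fitted parameters: lower, upper, EC50, slope.
params = (0.1, 2.0, 10.0, 1.5)
margin = action_limit([0.01, 0.02, 0.015])

# (concentration, response) pairs; only points outside the band
# would be passed on to a statistical outlier test.
points = [(1.0, 1.95), (10.0, 1.05), (100.0, 0.9)]
flagged = [p for p in points if needs_outlier_test(p[0], p[1], params, margin)]
```

In this sketch only the third point leaves the band around the curve, so it alone would be handed to one of the statistical outlier methods; the other two are never assessed.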
Results:
Independent reviewer 1 identified 27 outliers while reviewer 2 identified 53. The intersection of these sets of points was 17, while the union of the sets amounts to 63 identified outliers. Not applying the action limit leads to 56% additional points identified as outliers depending on the outlier method. 40 points fell outside of the action limit, with between 32% and 66% categorised as statistical outliers depending on the detection method. When averaging over models which resulted in at least 1 outlier, the mixed approach resulted in a 14-38% average difference in the EC50 estimate depending on the detection method, while the absence of the action limit resulted in a 4-23% average change. When averaging over all fitted models, the average change in the EC50 estimate was 1.3-2.7% and 1.7-2.8% for the mixed and non-mixed approach respectively.
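The reviewer counts can be checked with a short Python sketch. The inclusion-exclusion step uses only the figures reported above; the Jaccard index is an additional agreement measure introduced here for illustration and is not part of the abstract.

```python
# Counts reported for the two independent reviewers.
reviewer_1 = 27
reviewer_2 = 53
flagged_by_both = 17

# Inclusion-exclusion: points flagged by at least one reviewer.
flagged_by_either = reviewer_1 + reviewer_2 - flagged_by_both

# Jaccard index: share of all flagged points on which both reviewers agree.
jaccard = flagged_by_both / flagged_by_either
```

The union of 63 matches the reported figure, and the two reviewers agree on only about 27% of all points flagged by either of them.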
Conclusions:
As expected, traditional reviewer-based approaches to outlier detection lead to greatly inconsistent interpretations. The proposed mixed method is a new strategy that significantly reduces the false outlier identification rate. This, in turn, minimises subjectivity as compared to the known standard methods. The approach better reflects the USP requirements [1], thereby also reducing costs.
References:
[1] The United States Pharmacopeial Convention – Article 1034
5:20pm - 5:40pm
How to handle deviating control values in dose-response curves
Franziska Kappenberg^{1}, Jan Hengstler^{2}, Jörg Rahnenführer^{1}
^{1}TU Dortmund, Germany; ^{2}Leibniz Research Centre for Working Environment and Human Factors at the Technical University of Dortmund (IfaDo), Germany
In many toxicological assays a response variable is repeatedly measured under different conditions, for example for a negative control and increasing concentrations of a compound. A fitted dose-response curve (DRC) can be used to determine the concentration where a specific effect level is attained. Usually data are normalized before curve fitting in order to have an initial value of 100% corresponding to the response value of the negative control.
In some cases problems arise because the response value of the control does not fit the left asymptote of the fitted DRC, so that the asymptote does not correspond to an effect of 100%. The concentration at which the curve attains a given value can then no longer be interpreted properly.
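The interpretation problem can be made concrete with a minimal Python sketch. It assumes a four-parameter log-logistic curve as the DRC; the function names and all numerical values are hypothetical and serve only to show how a left asymptote below 100% shifts the concentration at which a given response level is reached.

```python
from statistics import mean


def normalize_to_control(responses, control_values):
    """Express responses as a percentage of the mean negative-control response."""
    control_mean = mean(control_values)
    return [100.0 * r / control_mean for r in responses]


def conc_at_level(level, left, right, ec50, slope):
    """Concentration at which the curve
    f(x) = right + (left - right) / (1 + (x / ec50) ** slope)
    attains the given response level.
    """
    if not min(left, right) < level < max(left, right):
        raise ValueError("level must lie strictly between the asymptotes")
    return ec50 * ((left - right) / (level - right) - 1.0) ** (1.0 / slope)


# Left asymptote matches the control (100%): the 50% level is reached at the EC50.
matching = conc_at_level(50.0, left=100.0, right=0.0, ec50=10.0, slope=2.0)

# Left asymptote at 90%: the absolute 50% level is reached earlier,
# while half of the asymptote (45%) is still reached at the EC50.
absolute_50 = conc_at_level(50.0, left=90.0, right=0.0, ec50=10.0, slope=2.0)
relative_50 = conc_at_level(45.0, left=90.0, right=0.0, ec50=10.0, slope=2.0)
```

With a deviating asymptote, "the concentration with 50% response" is ambiguous: in this sketch it is about 8.9 relative to the control and 10 relative to the fitted asymptote.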
In a simulation study we analyse different methods for dealing with the problem of deviating control values. A decision rule is derived which of the presented methods should be used, depending on different parameters, such as the number of concentrations, the number of replicates per concentration, the variance of the replicates and the difference between the left asymptote and the control value.
Results from the simulation study are applied to a toxicity assay in which the effect of a compound on the vitality of cells is measured for several replicates of increasing concentrations for three donors.
5:40pm - 6:00pm
New Approaches for Bivariate Quantitative Dose-Response - A Screening Study from Hormone Research and Development
Reinhard Meister
Beuth Hochschule für Technik Berlin, Germany
This paper addresses an important question in hormone research. For 90 years, the problem of safely administering hormones such as estrogen orally has remained unsolved, as the hormone is immediately metabolized at every passage of blood through the liver. The metabolites of estrogen are compounds that aggressively induce adverse effects. Therefore, the concept of prodrugs, which enable the safe transfer of a substance without it being metabolized in the liver, is very attractive. We have contributed to a recent publication (Elger et al. 2016) reporting the results of a screening series of dose-response experiments with estrogen-carrying prodrugs. The statistical challenges and proposals for solutions will be presented.
In summary, we will give a very condensed look at new dose-response models and, in addition, a promising extension of the Bland-Altman ideas (see e.g. Bland JM, Altman DG 1986) to dose-response setups and a representation of such models in a function space.
It turns out that our approach enables the comparison of benefit-risk ratios of a treatment and a competitor in a general setting, without the assumption of parallel dose-response curves that is commonly made when calculating relative efficiencies.
Our proposal goes back to the roots of bioassay (see e.g. Finney (1952)) along the pragmatic attitude of Fisher (1947), who advocated the log transform as a convenient way of turning inference on ratios into inference on differences.
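The role of the log transform can be sketched in a few lines of Python. This is only an illustration of the identity log(a/b) = log(a) - log(b); the function names and data are hypothetical and do not represent the models proposed in the talk.

```python
import math
from statistics import mean


def log_ratio(a, b):
    """The log of a ratio equals the difference of the logs."""
    return math.log(a) - math.log(b)


def geometric_mean_ratio(treatment, competitor):
    """Ratio of geometric means, obtained as a back-transformed
    difference of mean log responses."""
    difference = mean(math.log(t) for t in treatment) - mean(
        math.log(c) for c in competitor
    )
    return math.exp(difference)


# Hypothetical positive responses for a treatment and a competitor.
ratio = geometric_mean_ratio([2.0, 8.0], [1.0, 4.0])
```

Any method for inference on a difference of means can thus be applied on the log scale and its result back-transformed to a statement about a ratio.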
References
Fisher, R.A. (1947) The analysis of covariance method for the relation between a part and the whole. Biometrics, 3, 65–68.
Finney, D. J., Ed. (1952). Probit Analysis. Cambridge, England, Cambridge University Press.
Bland J.M., Altman D.G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i, 307-310.
6:00pm - 6:20pm
Adaptive designs in preclinical dose finding studies
Konrad Neumann^{1}, Samuel Knauss^{2}, Ulrike Grittner^{1}
^{1}Institute of Biometry and Clinical Epidemiology, Charité, Universitätsmedizin Berlin, Germany; ^{2}Department of Experimental Neurology, Charité Universitätsmedizin Berlin, Germany
Preclinical research is widely criticized for its wastage of test animals. The recently established Charité 3R center tries to address growing concerns in society. “3R” is an animal protection principle and stands for “Replace”, “Reduce” and “Refine”. Novel study designs could contribute to the efforts to reduce the number of animals in preclinical trials.
The need for a large number of test animals is particularly high in dose finding studies, since they are usually performed with many treatment arms. Moreover, differences in effect size between the experimental groups are small, which calls for an even larger sample size.
In the talk, we will propose an adaptive two-stage design that has the potential to reduce the number of test animals needed.
The study design starts with four groups (a control group and three groups with different dose levels) in the first stage. After an interim analysis, at least one, but possibly two, dose levels are dropped, or the study is terminated for futility. The second and final stage then has only two or three study groups. The basic idea of adaptive testing goes back to Fisher's p-value combination criterion [1]. Here we generalize and modify it for the needs of the proposed experimental design.
Preclinical trials often lack power. That could be one reason why translation of promising preclinical findings into clinical application often fails (cf. [2] for a discussion of this issue in neurological stroke research). We demonstrate that the proposed adaptive design reduces the number of animals needed, particularly in the case of underpowered studies, which unfortunately occur so often in preclinical animal experiments.
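The basic combination criterion can be sketched in Python as follows. This shows only the plain Fisher combination of two independent stage-wise p-values, not the generalized and modified version proposed in the talk; the significance level and the futility bound are hypothetical values chosen for illustration.

```python
import math


def fisher_combined_p(p1, p2):
    """Combined p-value of two independent stage-wise p-values in (0, 1].

    Under the null hypothesis, -2 * ln(p1 * p2) follows a chi-square
    distribution with 4 degrees of freedom, whose upper tail probability
    has the closed form q * (1 - ln(q)) with q = p1 * p2.
    """
    q = p1 * p2
    return q * (1.0 - math.log(q))


def two_stage_decision(p1, p2=None, alpha=0.025, futility=0.5):
    """Decision after stage 1 (p2 is None) or after stage 2.

    alpha and futility are assumptions for this sketch. Stopping for
    futility can only lower the type I error, so the unadjusted
    combination test stays valid, although conservative.
    """
    if p1 >= futility:
        return "stop for futility"
    if p2 is None:
        return "continue to stage 2"
    if fisher_combined_p(p1, p2) <= alpha:
        return "reject H0"
    return "do not reject H0"
```

For example, stage-wise p-values of 0.05 and 0.05 combine to about 0.017 and lead to rejection at the assumed level of 0.025, whereas 0.1 and 0.1 combine to about 0.056 and do not.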
References
1. P. Bauer; J. Röhmel: An Adaptive Method for Establishing a Dose-Response Relationship; Statistics in Medicine, Vol 14; 1595-1607 (1995).
2. O’Collins VE; Macleod MR; Donnan GA; Horky LL; van der Worp BH; Howells DW: 1,026 experimental treatments in acute stroke; Annals of Neurology (2006) Mar, 59 (3), 467-477.
