Accelerated Bayesian Experimental Design for Chemical Kinetic Models
14 Citations · 94 References · 20 Related Papers
Abstract:
The optimal selection of experimental conditions is
essential in maximizing the value of data for inference and
prediction, particularly in situations where experiments are
time-consuming and expensive to conduct. A general Bayesian
framework for optimal experimental design with nonlinear
simulation-based models is proposed. The formulation accounts for
uncertainty in model parameters, observables, and experimental
conditions. Straightforward Monte Carlo evaluation of the objective
function - which reflects expected information gain
(Kullback-Leibler divergence) from prior to posterior - is
intractable when the likelihood is computationally intensive.
Instead, polynomial chaos expansions are introduced to capture the
dependence of observables on model parameters and on design
conditions. Under suitable regularity conditions, these expansions
converge exponentially fast. Since both the parameter space and the
design space can be high-dimensional, dimension-adaptive sparse
quadrature is used to construct the polynomial expansions.
Stochastic optimization methods will then be used to maximize the expected utility. While this approach is broadly
applicable, it is demonstrated on a chemical kinetic system with
strong nonlinearities. In particular, the Arrhenius rate parameters
in a combustion reaction mechanism are estimated from observations
of autoignition. Results show multiple order-of-magnitude speedups
in both experimental design and parameter
inference.

Keywords:
Uncertainty Quantification
Kullback–Leibler divergence
Polynomial Chaos
Autoignition temperature
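The nested Monte Carlo estimator of expected information gain described in the abstract can be sketched on a toy scalar problem. The forward model, uniform prior, and noise level below are illustrative stand-ins, not the paper's kinetic model:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_lik(y, theta, d, sigma=0.1):
    # Gaussian observation model y = G(theta, d) + noise; G is a toy
    # nonlinear forward model standing in for a kinetics simulation.
    g = theta**3 * d**2 + theta * np.exp(-np.abs(0.2 - d))
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - g) / sigma) ** 2

def expected_information_gain(d, n_outer=200, n_inner=200):
    """Nested Monte Carlo estimate of E_y[ D_KL(posterior || prior) ]."""
    thetas = rng.uniform(0.0, 1.0, size=n_outer)           # prior draws
    ys = (thetas**3 * d**2 + thetas * np.exp(-np.abs(0.2 - d))
          + 0.1 * rng.standard_normal(n_outer))            # simulated data
    inner = rng.uniform(0.0, 1.0, size=n_inner)            # fresh prior draws
    # Log evidence: log p(y|d) ~= log mean_m p(y | theta_m, d)
    ll_inner = log_lik(ys[:, None], inner[None, :], d)     # (n_outer, n_inner)
    log_evid = np.log(np.mean(np.exp(ll_inner), axis=1))
    # EIG estimator: mean of log p(y|theta, d) - log p(y|d)
    return np.mean(log_lik(ys, thetas, d) - log_evid)

eig = expected_information_gain(0.8)
```

The double loop over prior samples is exactly what makes this objective expensive when each likelihood evaluation requires a full simulation, motivating the surrogate-based acceleration the abstract proposes.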
Abstract: Expected gain in Shannon information is commonly suggested as a Bayesian design evaluation criterion. Because estimating expected information gains is computationally expensive, examples in which they have been successfully used in identifying Bayes optimal designs are both few and typically quite simplistic. This article discusses in general some properties of estimators of expected information gains based on Markov chain Monte Carlo (MCMC) and Laplacian approximations. We then investigate some issues that arise when applying these methods to the problem of experimental design in the (technically nontrivial) random fatigue-limit model of Pascual and Meeker. An example comparing follow-up designs for a laminate panel study is provided.

Key Words: Bayesian optimal design; Laplacian approximation; Markov chain Monte Carlo (MCMC)
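In the linear-Gaussian special case, the Laplace approximation this abstract studies is exact, which makes it a convenient sanity check. The one-parameter sketch below (symbols a, s0, sn are invented for illustration) gives the expected information gain in closed form:

```python
import numpy as np

# Linear-Gaussian toy: theta ~ N(0, s0^2), y = a*theta + N(0, sn^2).
# The Laplace (Gaussian) approximation of the posterior is exact here.
def laplace_eig(a, s0=1.0, sn=0.5):
    # Posterior precision = prior precision + a^2 / noise variance;
    # it does not depend on the observed data in this model.
    post_var = 1.0 / (1.0 / s0**2 + a**2 / sn**2)
    # Expected KL(posterior || prior) reduces to half the log variance
    # ratio, i.e. 0.5 * ln(1 + a^2 s0^2 / sn^2).
    return 0.5 * np.log(s0**2 / post_var)
```

An uninformative experiment (a = 0) yields zero gain, and the gain grows monotonically with the signal-to-noise ratio, matching the intuition behind using information gain as a design criterion.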
Citations (162)
Summary: Data from experiments in steady state enzyme kinetic studies and radioligand binding assays are usually analysed by fitting non-linear models developed from biochemical theory. Designing experiments for fitting non-linear models is complicated by the fact that the variances of parameter estimates depend on the unknown values of these parameters, and Bayesian optimal exact design for non-linear least squares analysis is often recommended. It has been difficult to implement Bayesian L-optimal exact design, but we show how it can be done by using a computer algebra package to invert the information matrix, sampling from the prior distribution to evaluate the optimality criterion for candidate designs and implementing an exchange algorithm to search for candidate designs. These methods are applied to finding optimal designs for the motivating applications in biological kinetics, in the context of which some practical problems are discussed. A sensitivity study shows that the use of a prior distribution can be essential, as is careful specification of that prior.
Fisher information
Optimal design
Design of experiments
Design matrix
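A minimal sketch of the procedure outlined above — sampling from a prior, evaluating an L-optimality criterion, and searching with an exchange algorithm — assuming a Michaelis–Menten rate law as the nonlinear kinetic model (the parameter values, prior, and candidate grid are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Michaelis-Menten rate v = Vmax * x / (Km + x): a stand-in for the
# enzyme-kinetics models discussed in the abstract.
def jacobian(x, vmax, km):
    # Sensitivities of v with respect to (Vmax, Km).
    return np.stack([x / (km + x), -vmax * x / (km + x) ** 2], axis=-1)

def l_criterion(design, prior_draws):
    # Bayesian L-optimality (here with L = I, i.e. A-optimality):
    # prior-averaged trace of the inverse Fisher information matrix.
    vals = []
    for vmax, km in prior_draws:
        J = jacobian(design, vmax, km)
        M = J.T @ J
        vals.append(np.trace(np.linalg.inv(M + 1e-10 * np.eye(2))))
    return np.mean(vals)

candidates = np.linspace(0.1, 5.0, 50)            # substrate concentrations
prior = np.column_stack([rng.normal(1.0, 0.1, 64),
                         rng.normal(0.5, 0.05, 64)])

# Exchange algorithm: repeatedly swap each design point for whichever
# candidate most improves the criterion.
design = rng.choice(candidates, size=4, replace=False)
start = l_criterion(design, prior)
for _ in range(3):
    for i in range(len(design)):
        trial = design.copy()
        scores = []
        for c in candidates:
            trial[i] = c
            scores.append(l_criterion(trial, prior))
        design[i] = candidates[int(np.argmin(scores))]
best = l_criterion(design, prior)
```

Each exchange step can only keep or lower the criterion, since the current point is itself among the candidates, so the search is monotone even though it may stop at a local optimum.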
Citations (11)
With the advancements in modeling and numerical algorithms, decision-making supported by modeling and simulation has become more mainstream than ever. Even though computational power has continually increased, in most engineering applications the problem of optimal design under uncertainty has become prohibitively expensive due to the long runtimes of single simulations. The obvious solution is to reduce the complexity of the model by employing different assumptions, constructing in this way an approximate model. The calibration of these simpler models requires a large number of runs of the complex model, which may still be too expensive and inefficient for the task at hand. In this paper, we study the problem of optimal data collection to efficiently learn the model parameters of an approximate model in the context of Bayesian analysis. The paper emphasizes the influence of model discrepancy on the calibration of the approximate model and hence the choice of optimal designs. Model discrepancy is modeled using a Gaussian process in this study. The optimal design is obtained as a result of an information-theoretic sensitivity analysis: the preferred design is the one where the statistical dependence between the model parameters and observables is the highest possible. In this paper, the statistical dependence between random variables is quantified by mutual information and estimated using a k-nearest-neighbor-based approximation. As a model problem, a convective-dispersion model is calibrated to approximate the physics of Burgers' equation in a limited time domain of interest.
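The k-nearest-neighbor mutual-information estimate mentioned above can be illustrated with the Kraskov–Stögbauer–Grassberger (KSG) estimator. This sketch assumes SciPy is available for the digamma function, and checks against correlated Gaussians, for which the true mutual information is known in closed form:

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(2)

def ksg_mutual_information(x, y, k=3):
    """Kraskov-Stoegbauer-Grassberger (algorithm 1) kNN estimate of I(X;Y)."""
    n = len(x)
    dx = np.abs(x[:, None] - x[None, :])
    dy = np.abs(y[:, None] - y[None, :])
    dz = np.maximum(dx, dy)                 # Chebyshev metric in (x, y) space
    np.fill_diagonal(dz, np.inf)            # exclude self-distances
    eps = np.sort(dz, axis=1)[:, k - 1]     # distance to the k-th neighbour
    # Count neighbours strictly within eps in each marginal space.
    nx = np.sum(dx < eps[:, None], axis=1) - 1
    ny = np.sum(dy < eps[:, None], axis=1) - 1
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

# Correlated Gaussians: true I(X;Y) = -0.5 * ln(1 - rho^2) ~ 0.511 nats.
rho = 0.8
x = rng.standard_normal(1000)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(1000)
mi = ksg_mutual_information(x, y)
```

The brute-force pairwise distance matrices keep the sketch short; a production implementation would use a k-d tree for the neighbor searches.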
Citations (11)
Optimal experimental design (OED) seeks experiments expected to yield the most useful data for some purpose. In practical circumstances where experiments are time-consuming or resource-intensive, OED can yield enormous savings. We pursue OED for nonlinear systems from a Bayesian perspective, with the goal of choosing experiments that are optimal for parameter inference. Our objective in this context is the expected information gain in model parameters, which in general can only be estimated using Monte Carlo methods. Maximizing this objective thus becomes a stochastic optimization problem. This paper develops gradient-based stochastic optimization methods for the design of experiments on a continuous parameter space. Given a Monte Carlo estimator of expected information gain, we use infinitesimal perturbation analysis to derive gradients of this estimator. We are then able to formulate two gradient-based stochastic optimization approaches: (i) Robbins-Monro stochastic approximation, and (ii) sample average approximation combined with a deterministic quasi-Newton method. A polynomial chaos approximation of the forward model accelerates objective and gradient evaluations in both cases. We discuss the implementation of these optimization methods, then conduct an empirical comparison of their performance. To demonstrate design in a nonlinear setting with partial differential equation forward models, we use the problem of sensor placement for source inversion. Numerical results yield useful guidelines on the choice of algorithm and sample sizes, assess the impact of estimator bias, and quantify tradeoffs of computational cost versus solution quality and robustness.
Robustness
Stochastic Approximation
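The Robbins–Monro recursion named in the abstract can be sketched on a toy expected-utility surface; the quadratic utility and step-size schedule below are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(3)

# Robbins-Monro iteration d_{k+1} = d_k + a_k * g_k, where g_k is an
# unbiased but noisy sample of the gradient of the expected utility.
# Toy utility U(d) = -(d - 2)^2 stands in for expected information gain.
def noisy_grad(d):
    return -2.0 * (d - 2.0) + rng.standard_normal()

d = 0.0
for k in range(1, 2001):
    a_k = 1.0 / k        # step sizes: sum a_k diverges, sum a_k^2 converges
    d += a_k * noisy_grad(d)
```

With this step-size schedule the iterates converge to the maximizer d = 2 despite never seeing an exact gradient, which is what makes the method suitable when the objective is only available as a Monte Carlo estimate.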
Citations (72)
Computer models usually have a variety of parameters that can (and need to) be tuned so that the model better reflects reality. This problem is called calibration and is an inverse problem. We assume that we have a set of observed responses to given inputs in a physical system, together with a computer model of that physical system which depends on unknown parameters. It is often the case that many more simulations can be run than experiments conducted, so we typically have many more simulation results (at various parameter values) than experimental results (at the "true" parameter value). In this paper, we use Maximum Likelihood Estimation (MLE) to calibrate model parameters. We assume that the response data is vector-valued, e.g. a response is given as a function of time. We approximate the underlying models with Gaussian Processes (GPs) and fit the parameters of the GPs with MLE. Specifically, we propose a decomposition approach to identify the basis vectors that allows for efficient calculation of the parameters. Experimental data is then used to calibrate the model parameters. This approach is demonstrated on one test problem.
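A minimal version of the core step — fitting a GP hyperparameter by maximizing the log marginal likelihood — might look as follows; the scalar data, squared-exponential kernel, and grid search are simplifications of the decomposition method the abstract actually proposes:

```python
import numpy as np

rng = np.random.default_rng(4)

def log_marginal_likelihood(X, y, ell, sn=0.1):
    # Zero-mean GP with squared-exponential kernel: log p(y | X, ell).
    K = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / ell**2)
    K += sn**2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.sum(np.log(np.diag(L)))
            - 0.5 * len(X) * np.log(2 * np.pi))

# Synthetic "simulation output": a smooth signal plus observation noise.
X = np.linspace(0, 5, 30)
y = np.sin(X) + 0.1 * rng.standard_normal(30)

# MLE over the length-scale by a simple grid search (a real code would
# use a gradient-based optimizer).
grid = np.linspace(0.1, 3.0, 30)
ell_hat = grid[np.argmax([log_marginal_likelihood(X, y, ell) for ell in grid])]
```

The Cholesky factorization is the standard way to evaluate this likelihood stably; its cubic cost in the number of runs is precisely what the paper's basis-vector decomposition is designed to mitigate for vector-valued responses.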
Citations (2)
Estimation of parameter sensitivities for stochastic chemical reaction networks is an important and challenging problem. Sensitivity values are important in the analysis, modeling and design of chemical networks. They help in understanding the robustness properties of the system and also in identifying the key reactions for a given outcome. In a discrete setting, most of the methods that exist in the literature for the estimation of parameter sensitivities rely on Monte Carlo simulations along with finite difference computations. However, these methods introduce a bias in the sensitivity estimate, and in most cases the size or direction of the bias remains unknown, potentially damaging the accuracy of the analysis. In this paper, we use the random time change representation of Kurtz to derive an exact formula for parameter sensitivity. This formula allows us to construct an unbiased estimator for parameter sensitivity, which can be efficiently evaluated using a suitably devised Monte Carlo scheme. The existing literature contains only one method to produce such an unbiased estimator. This method was proposed by Plyasunov and Arkin and is based on the Girsanov measure transformation. Through a couple of examples, we compare our method to this existing method. Our results indicate that our method can be much faster than the existing method when computing sensitivity with respect to a reaction rate constant which is small in magnitude. This rate constant could correspond to a reaction which is slow in the reference time-scale of the system. Since many biological systems have such slow reactions, our method can be a useful tool for sensitivity analysis.
Robustness
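The finite-difference baseline that the abstract contrasts against can be sketched with Gillespie's SSA on a birth-death network, whose mean has a closed form for comparison; the rates, horizon, and sample sizes below are arbitrary illustrative values:

```python
import numpy as np

# Finite-difference sensitivity estimate for a birth-death network
# (birth at rate k, death at rate g*X), simulated with Gillespie's SSA.
def ssa_mean(k, g=1.0, T=5.0, n_paths=2000, seed=0):
    rng = np.random.default_rng(seed)   # common random numbers across k
    totals = 0.0
    for _ in range(n_paths):
        t, x = 0.0, 0
        while True:
            rates = np.array([k, g * x])
            total = rates.sum()
            t += rng.exponential(1.0 / total)   # time to next reaction
            if t > T:
                break
            x += 1 if rng.random() < rates[0] / total else -1
        totals += x
    return totals / n_paths

# Central finite difference: unbiased only in the limit h -> 0, which is
# exactly the bias issue the paper's estimator avoids.
h, k0 = 0.1, 2.0
sens = (ssa_mean(k0 + h) - ssa_mean(k0 - h)) / (2 * h)
# Exact sensitivity: E[X(T)] = (k/g)(1 - exp(-g*T)), so dE/dk is known.
exact = (1 - np.exp(-5.0)) / 1.0
```

Even with common random numbers the estimate carries both Monte Carlo noise and an O(h^2) bias, illustrating why an exact, unbiased sensitivity formula is valuable.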
Citations (0)
We consider a class of misspecified dynamical models where the governing term is only approximately known. Under the assumption that observations of the system's evolution are accessible for various initial conditions, our goal is to infer a non-parametric correction to the misspecified driving term such as to faithfully represent the system dynamics and devise system evolution predictions for unobserved initial conditions. We model the unknown correction term as a Gaussian Process and analyze the problem of efficient experimental design to find an optimal correction term under constraints such as a limited experimental budget. We suggest a novel formulation for experimental design for this Gaussian Process and show that approximately optimal (up to a constant factor) designs may be efficiently derived by utilizing results from the literature on submodular optimization. Our numerical experiments exemplify the effectiveness of these techniques.
Submodular set function
Parametric model
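The submodular-optimization results referenced above are typically exploited through greedy selection, which for a monotone submodular objective is within a (1 − 1/e) factor of optimal. A sketch for a GP information-gain objective (toy kernel, grid, and budget, not the paper's formulation):

```python
import numpy as np

# Greedy selection of design points maximizing the GP information gain
# log det(I + K_S / sn^2), a monotone submodular set function.
def kernel(X):
    return np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2)

X = np.linspace(0, 10, 40)     # candidate experiment locations
K = kernel(X)
sn2 = 0.01                     # observation noise variance
budget = 5                     # limited experimental budget

chosen = []
for _ in range(budget):
    best, best_gain = None, -np.inf
    for j in range(len(X)):
        if j in chosen:
            continue
        S = chosen + [j]
        # Information gain of the candidate set S.
        _, logdet = np.linalg.slogdet(np.eye(len(S)) + K[np.ix_(S, S)] / sn2)
        if logdet > best_gain:
            best, best_gain = j, logdet
    chosen.append(best)
```

Because adding a point near an already-chosen one yields diminishing returns under this kernel, the greedy design naturally spreads observations across the domain.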
Citations (1)
This paper reviews the literature on Bayesian experimental design. A unified view of this topic is presented, based on a decision-theoretic approach. This framework casts criteria from the Bayesian literature of design as part of a single coherent approach. The decision-theoretic structure incorporates both linear and nonlinear design problems and it suggests possible new directions to the experimental design problem, motivated by the use of new utility functions. We show that, in some special cases of linear design problems, Bayesian solutions change in a sensible way when the prior distribution and the utility function are modified to allow for the specific structure of the experiment. The decision-theoretic approach also gives a mathematical justification for selecting the appropriate optimality criterion.
Bayesian experimental design
Decision theory
Citations (1,799)
Uncertainty Quantification
Kronecker product
Citations (169)
Computer experiments have emerged as a popular tool for studying the relationship between a response variable and factors that affect the response when a computational model of this relationship is available. They have proven to be particularly useful in applications where properly designed physical experiments are infeasible. This thesis considers two problems that occur in the design and analysis of computer experiments. The first problem is approximation of the Pareto front and set in a multiple-output computer experiment. The second problem extends the calculation of sensitivity indices of input factors to a broader class of models than have been studied previously.
To solve the first problem, several new design criteria for approximating the Pareto front are developed. The resulting sequential designs generalize the well-known expected improvement approach for optimization of a single-objective function. The new methods are compared to previously proposed expected improvement generalizations for multiobjective optimization from the literature. The comparisons are based on both theoretical considerations and empirical results from using the sequential design criteria on several test functions and engineering applications.
In the sensitivity index problem, formulas are derived for calculating empirical Bayesian estimates of sensitivity indices for Gaussian process models with an arbitrary polynomial mean structure and three parametric correlation families. The use of a polynomial mean has the potential to provide more accurate estimates of sensitivity indices when the computer output has a large-scale polynomial trend. Additionally, when combined with a compactly supported correlation function and parameter space restrictions that force a particular degree of sparsity on the correlation matrix at the design, the polynomial mean assumption allows one to estimate sensitivity indices for computer experiments with a large number of runs. In such large-design applications, estimates based on the standard constant mean Gaussian processes with a power exponential correlation function can be computationally infeasible. Examples are presented that exhibit the accuracy of the estimates under these nonstandard modeling assumptions.
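The expected-improvement criterion that the sequential designs above generalize has a standard closed form under a Gaussian predictive distribution; a single-objective sketch (for minimization):

```python
import numpy as np
from math import erf

# Expected improvement EI(x) = (f_best - mu) * Phi(z) + sigma * phi(z),
# with z = (f_best - mu) / sigma, where the GP predicts N(mu, sigma^2)
# at x and f_best is the best observed objective value so far.
def expected_improvement(mu, sigma, f_best):
    z = (f_best - mu) / sigma
    Phi = 0.5 * (1 + erf(z / np.sqrt(2)))            # standard normal CDF
    phi = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)   # standard normal PDF
    return (f_best - mu) * Phi + sigma * phi
```

The first term rewards predicted improvement over the incumbent and the second rewards predictive uncertainty, which is the exploitation/exploration balance the multiobjective generalizations in the thesis must preserve on the Pareto front.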
Citations (30)