When we want to validate a statistical model, we usually simulate data via a Monte Carlo method. For example, we might use R code such as:
n <- 1e3
u <- rnorm(n, mean = 0, sd = 10)
x <- rnorm(n, mean = 1.5*u, sd = 10)
y <- rnorm(n, mean = -1*x + -2.5*u, sd = 10)
where all parameters are fixed values (which I chose arbitrarily) and known to us. This setup conforms well to the Frequentist philosophy of statistics, which assumes that a parameter has a fixed value while the data are randomly generated.
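Under the Frequentist reading, validation usually means repeating the simulation many times and counting how often a confidence interval contains the fixed value. The following is a minimal sketch of that idea, not a definitive procedure. The model under validation (`lm(y ~ x + u)`), the seed, the number of repetitions `n_sims`, and the focus on the coefficient of `x` (fixed at -1 in the code above) are all my assumptions for illustration.

```r
set.seed(1)          # assumption: fixed only so this sketch is reproducible
n_sims <- 200        # hypothetical number of repeated simulations
n <- 1e3
true_beta_x <- -1    # the fixed coefficient of x used in the simulation above

covered <- replicate(n_sims, {
  # Same data-generating process as above, with fixed parameter values
  u <- rnorm(n, mean = 0, sd = 10)
  x <- rnorm(n, mean = 1.5*u, sd = 10)
  y <- rnorm(n, mean = -1*x + -2.5*u, sd = 10)

  # Assumed model under validation: linear regression adjusting for u
  fit <- lm(y ~ x + u)
  ci <- confint(fit, "x", level = 0.95)

  # Does the 95% confidence interval contain the fixed true value?
  ci[1] <= true_beta_x && true_beta_x <= ci[2]
})

# Long-run coverage: share of simulations whose interval contains the truth
coverage <- mean(covered)
print(coverage)
```

If the model is adequate, `coverage` should be close to the nominal 0.95.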
Meanwhile, fixed parameters in a simulation seem to be at odds with the Bayesian philosophy of statistics, which assumes a parameter is a random variable. If we stick with this Bayesian philosophy, we might simulate data as follows:
n <- 1e3
u <- rnorm(n, mean = rnorm(1, mean = 0, sd = 1), sd = rexp(1, rate = 1))
x <- rnorm(n, mean = rnorm(1, mean = 0, sd = 1)*u, sd = rexp(1, rate = 1))
y <- rnorm(n, mean = rnorm(1, mean = 0, sd = 1)*x + rnorm(1, mean = 0, sd = 1)*u, sd = rexp(1, rate = 1))
where rnorm(1, mean = 0, sd = 1) and rexp(1, rate = 1) are just arbitrary choices. The point is to make these parameters random variables, so that the value actually drawn is unknown to us (unless we set a seed, which the above code deliberately does not). The performance of a model might then be evaluated by how much its posterior overlaps the true parameter distribution.
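"Overlap" can be made concrete in more than one way. The following sketch shows one possible reading, under simplifying assumptions that are mine and not part of the code above: a single slope with no confounder, a noise sd treated as known, and a N(0, 1) prior that matches the true parameter distribution, so that the posterior has a closed form. It draws one posterior sample per simulated dataset, pools the samples, and computes the overlap coefficient (the area shared by two densities) between the pooled posterior and the true parameter distribution. The names `n_sims`, `sigma`, `post_draws`, and `overlap` are hypothetical.

```r
set.seed(42)      # assumption: fixed only so this sketch is reproducible
n_sims <- 2000    # hypothetical number of simulated datasets
n <- 1e3
sigma <- 1        # assumption: noise sd treated as known by the model

post_draws <- replicate(n_sims, {
  # The parameter is itself a random draw from its true distribution
  b_true <- rnorm(1, mean = 0, sd = 1)
  x <- rnorm(n, mean = 0, sd = 1)
  y <- rnorm(n, mean = b_true * x, sd = sigma)

  # Conjugate normal posterior for the slope under a N(0, 1) prior
  post_prec <- 1 + sum(x^2) / sigma^2
  post_mean <- (sum(x * y) / sigma^2) / post_prec

  # Keep one posterior draw per simulated dataset
  rnorm(1, mean = post_mean, sd = sqrt(1 / post_prec))
})

# Overlap coefficient between the pooled posterior draws and the true
# parameter distribution N(0, 1): 0 means disjoint, 1 means identical
dens <- density(post_draws, from = -5, to = 5, n = 1024)
step <- dens$x[2] - dens$x[1]
overlap <- sum(pmin(dens$y, dnorm(dens$x, mean = 0, sd = 1))) * step
print(overlap)
```

When the model matches the data-generating process, the pooled posterior draws follow the true parameter distribution, so `overlap` should be close to 1. A misspecified model would be expected to give a smaller value.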
Validating a model this way may be more useful than the Frequentist approach of examining the long-run coverage of a fixed parameter value, if the Bayesian philosophy of statistics makes more sense for a given task than the Frequentist one. For example, suppose an analyst wants to compare how well two models quantify the uncertainty of her estimate, expressed as the probability that a parameter takes specific values (perhaps because repeated sampling is unrealistic, which is often the case in social science). The Bayesian philosophy would then make more sense, and she could compare the two models by looking at which one covers more of the true parameter distribution.
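As a rough illustration of such a comparison, the sketch below draws every parameter from the same distributions as the Bayesian simulation above, but stores the drawn coefficient of x so it can be checked afterwards. It then asks how often each model's 90% credible interval contains the drawn value. Everything else is an assumption of mine: the two candidate models (one adjusting for u, one omitting it), N(0, 1) priors on the coefficients, no intercept, a noise sd treated as known so the posterior is available in closed form, and the hypothetical helpers `posterior_x` and `covers`.

```r
set.seed(7)       # assumption: fixed only so this sketch is reproducible
n_sims <- 300     # hypothetical number of simulated datasets
n <- 1e3

# Closed-form posterior of the first coefficient (the one on x), assuming
# N(0, 1) priors on all coefficients, no intercept, and a known noise sd.
posterior_x <- function(X, y, sigma) {
  prec <- crossprod(X) / sigma^2 + diag(ncol(X))
  V <- solve(prec)
  m <- V %*% crossprod(X, y) / sigma^2
  list(mean = m[1, 1], sd = sqrt(V[1, 1]))
}

# Does the central credible interval contain the drawn parameter value?
covers <- function(post, truth, level = 0.9) {
  half_width <- qnorm(1 - (1 - level) / 2) * post$sd
  abs(truth - post$mean) <= half_width
}

results <- replicate(n_sims, {
  # Draw the parameters at random, keeping the ones needed for the check
  b_xy <- rnorm(1, mean = 0, sd = 1)
  s_y  <- rexp(1, rate = 1)
  u <- rnorm(n, mean = rnorm(1, mean = 0, sd = 1), sd = rexp(1, rate = 1))
  x <- rnorm(n, mean = rnorm(1, mean = 0, sd = 1)*u, sd = rexp(1, rate = 1))
  y <- rnorm(n, mean = b_xy*x + rnorm(1, mean = 0, sd = 1)*u, sd = s_y)

  c(with_u    = covers(posterior_x(cbind(x, u), y, s_y), b_xy),
    without_u = covers(posterior_x(cbind(x), y, s_y), b_xy))
})

# Share of simulations in which each model's interval contains the draw
coverage <- rowMeans(results)
print(coverage)
```

Under these assumptions, the model that adjusts for u should cover the drawn value in roughly 90% of simulations, while the model that omits u should do so far less often.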