class: center, middle, inverse, title-slide

# Beyond Bayes: What We Can Do with a Partial Prior
## @ GSO Seminar
### Yixuan Qiu
### 04/06/2017
Joint work with Prof. Lingsong Zhang and Prof. Chuanhai Liu

---

# Outline

- Prologue — About Bayes
- Motivation — A Partial Prior
- Problem — Partially Specified Bayesian Models
- Foundation — The Inferential Models
- Methodology — The Partial Bayes Method
- Application — NBA Three-Point Shots

---
class: inverse, center, middle

# Prologue

---

# A Picture I Took Today at HAAS

![](images/posters.jpg)

---

# A Closer Look...

![](images/posters3.jpg)

---

# About Bayes

- Bayes' Theorem provides a great framework for statistical inference
- Knowledge = Prior + Data

![](images/bayes.jpg)

---

# Bayesian Procedure

- Interested in the parameter `\(\theta\)`
- Prior: `\(\pi(\theta)\)`
- Data: `\(f(x|\theta)\)`
- .emph[Posterior]
`$$q(\theta|x)=\frac{\pi(\theta)f(x|\theta)}{\int \pi(\theta)f(x|\theta) \mathrm{d}\theta}$$`

--

- What if `\(\pi(\theta)\)` is .warning[NOT] known?

--

- What if `\(\pi(\theta)\)` is .warning[partially] known?

---
class: inverse, center, middle

# Motivation

---

# A Partial Prior

- Consider a normal hierarchical model
- There are `\(n\)` groups of data, mutually independent, each with group mean `\(\mu_i\)`
- The observation in each group is `\(X_i|\mu_i\sim N(\mu_i,1)\)`
- The group means have a common prior `\(\mu_i\sim N(\mu,1)\)`
- We are interested in `\(\mu_1\)`, and want to construct an interval estimate for it
- However, `\(\mu\)` is .warning[unknown]

---

# A Partial Prior cont.

- A diagram of the partial prior

![](images/pb_diagram.png)

---

# Some Attempts

- Frequentist: `\(X_1-\mu_1\sim N(0,1)\)`

--

- .warning[Totally ignores the information from the prior and the other groups]

--

- Bayes: `\(\mu_1|\{X_1,\ldots,X_n\}\sim N\left(\frac{1}{2}X_1+\frac{1}{2}\mu,\frac{1}{2}\right)\)`

--

- .warning[Cannot proceed since `\(\mu\)` is unknown]

--

- Empirical Bayes: `\(\hat{\mu}=\bar{X}\)`, `\(\mu_1|\{X_1,\ldots,X_n\}\overset{approx}{\sim} N\left(\frac{1}{2}X_1+\frac{1}{2}\bar{X},\frac{1}{2}\right)\)`

--

- .warning[Uncertainty quantification is inaccurate] (see the sketch on the next slide)
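---

# Some Attempts cont.

A minimal simulation sketch of the three intervals above, under assumed settings (`n = 50`, true `\(\mu = 2\)`, 95% level; all values hypothetical):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
n, mu = 50, 2.0                     # assumed settings for illustration
mu_i = rng.normal(mu, 1.0, size=n)  # group means: mu_i ~ N(mu, 1)
x = rng.normal(mu_i, 1.0)           # observations: X_i | mu_i ~ N(mu_i, 1)

z = norm.ppf(0.975)
# Frequentist: X_1 - mu_1 ~ N(0, 1); ignores the prior entirely
freq = (x[0] - z, x[0] + z)
# Bayes (oracle, mu known): mu_1 | data ~ N((X_1 + mu)/2, 1/2)
c = (x[0] + mu) / 2
bayes = (c - z * np.sqrt(0.5), c + z * np.sqrt(0.5))
# Empirical Bayes: plug in mu_hat = X_bar, but the posterior variance
# does not account for the estimation error in mu_hat
c_eb = (x[0] + x.mean()) / 2
eb = (c_eb - z * np.sqrt(0.5), c_eb + z * np.sqrt(0.5))
print(freq, bayes, eb)
```

The EB interval is shorter than the frequentist one but treats `\(\hat{\mu}\)` as if it were the true `\(\mu\)`, which is why its coverage can fall below the nominal level.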
---
class: inverse, center, middle

# Problem

---

# Partially Specified Bayesian Model

- The parameter `\(\theta\)` is partitioned into two blocks `\(\theta=(\tilde{\theta},\theta^{*})\)`
- If the prior were fully given, `\(\pi_0(\theta)=\pi(\tilde{\theta}|\theta^{*})\pi^{*}(\theta^{*})\)`
- .emph[In PB models, only `\(\pi(\tilde{\theta}|\theta^{*})\)` is available]

<table class="table">
  <tr>
    <td>Sampling Model</td>
    <td>\(X|\theta\sim f(x|\theta)\)</td>
  </tr>
  <tr>
    <td>Parameter Partition</td>
    <td>\(\theta=(\tilde{\theta},\theta^{*})\)</td>
  </tr>
  <tr>
    <td>Partial Prior</td>
    <td>\(\tilde{\theta}|\theta^{*}\sim\pi(\tilde{\theta}|\theta^{*})\)</td>
  </tr>
  <tr>
    <td>Component without Prior</td>
    <td>\(\theta^{*}\)</td>
  </tr>
  <tr>
    <td>Parameter of Interest</td>
    <td>\(\eta=h(\tilde{\theta})\)</td>
  </tr>
</table>

---

# Inference Tools for PB Models

- PB models reside in the "middle" ground between the Bayesian and frequentist frameworks
- Neither of the two is ideal for tackling such problems
- We need new tools and techniques for the inference
- The .emph[Inferential Models] theory (Martin and Liu, 2013) provides a promising framework for PB models

---
class: inverse, center, middle

# Foundation

---

# The Inferential Models

The Inferential Models (IMs) framework is:

- A new paradigm for statistical inference
  - .emph[Parallel to Bayesian and Frequentist]
- Compatible with Bayesian and Frequentist
  - .emph[Able to reproduce the results of these two]
- Designed for exact inference
  - .emph[Interval estimators have guaranteed coverage probability]

--

- Was born in Purdue Statistics!

---

# The IM Procedure

IM has a three-step procedure for inference:

- .emph[Association]: Connects the parameter, the observed data, and an unobserved auxiliary random variable through an association function
- .emph[Prediction]: Uses a random set to predict the unobserved auxiliary random variable
- .emph[Combination]: Transforms the uncertainty from the auxiliary space to the parameter space

--

- .warning[Don't panic — I will give examples shortly]

---

# The IM Outputs

IM has two output quantities:

- The .emph[belief] function: Quantifies how much evidence supports that an assertion is true
  - .emph["Direct" evidence]
- The .emph[plausibility] function: Quantifies how much evidence does not support that an assertion is false
  - .emph["Indirect" evidence]

---

# A Simple Example

Given `\(X\sim N(\theta, 1)\)`, we want to do inference on `\(\theta\)`

--

- A-step: `\(X=\theta+Z,\ Z\sim N(0,1)\)`

--

- P-step: Use a random interval `\(\mathcal{S}=(-|V|,|V|),\ V\sim N(0,1)\)` to predict the true value of `\(Z\)`

--

- C-step: Since `\(\theta=X-Z\)`, given the observed data `\(x\)`, we use `\(\Theta_x(\mathcal{S})=(x-|V|,x+|V|)\)` to cover `\(\theta\)`

---

# A Simple Example cont.

For any assertion on `\(\theta\)`, for example `\(A=\{\theta: 1<\theta<2\}\)`

- Belief:
`$$\begin{align*}
\mathsf{bel}_{x}(A) & =P\{\Theta_{x}(\mathcal{S})\subseteq A|\Theta_{x}(\mathcal{S})\ne\varnothing\}\\
 & =P\{x-|V|>1,x+|V|<2\}
\end{align*}$$`
- Plausibility: `\(\mathsf{pl}_{x}(A)=1-\mathsf{bel}_{x}(A^{c})\)`
- .emph[Plausibility Interval]: `\(\mathsf{PR}_{x}(\alpha)=\{\theta:\mathsf{pl}_{x}(\{\theta\})>\alpha\}\)`
- Comparable to the Bayesian credible interval and the frequentist confidence interval (a numerical sketch follows)
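---

# A Simple Example in Code

A minimal Monte Carlo sketch of the belief and plausibility computations above (the observed value `x = 1.3` is hypothetical):

```python
import numpy as np
from scipy.stats import norm

x = 1.3                                   # hypothetical observed data
V = np.abs(np.random.default_rng(1).normal(size=200_000))

# Belief in A = {1 < theta < 2}: the random set (x - |V|, x + |V|)
# lies entirely inside A (the set always contains x, so it is nonempty)
bel_A = np.mean((x - V > 1.0) & (x + V < 2.0))

# Plausibility of a singleton {theta}: the random set covers theta
def pl(theta):
    return np.mean(V > abs(x - theta))

# Plausibility interval {theta : pl > alpha}; in this model it has the
# closed form x -+ z_{alpha/2}, matching the frequentist interval
alpha = 0.05
pr = (x - norm.ppf(1 - alpha / 2), x + norm.ppf(1 - alpha / 2))
print(bel_A, pl(1.5), pr)
```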
---

# Extensions of IMs

- Marginal IM: Dealing with nuisance parameters
- Conditional IM: Combining information

--

A Bayesian example: `\(\theta\sim Exp(1),\ X,Y|\theta\overset{indep}{\sim} N(\theta,1)\)`

- Association
`$$\begin{cases} \theta=e\\ X=\theta+Z_{1}\\ Y=\theta+Z_{2} \end{cases}\Rightarrow\begin{cases} \theta & =e\\ X & =e+Z_{1}\\ X-Y & =Z_{1}-Z_{2} \end{cases}$$`
- `\(e+Z_1\)` and `\(Z_1-Z_2\)` are fully observed
- Use `\(e|\{e+Z_1=x,Z_1-Z_2=x-y\}\)` to predict `\(e\)`

---
class: inverse, center, middle

# Methodology

---

# The Partial Bayes Method

- Uses IMs to solve PB models
- Recall the partial prior example with `\(n=2\)`:
`$$\mu_1,\mu_2\overset{indep}{\sim}N(\mu,1),\ X_i|\mu_i\overset{indep}{\sim} N(\mu_i,1),\ i=1,2$$`
- Association
`$$\begin{cases} X_{1}=\mu_{1}+e_{1}=(\mu+\varepsilon_{1})+e_{1}\\ X_{2}=\mu_{2}+e_{2}=(\mu+\varepsilon_{2})+e_{2} \end{cases}\Rightarrow\begin{cases} X_{1} & =\mu_{1}+e_{1}\\ X_{2}-X_{1} & =\varepsilon_{2}-\varepsilon_{1}+e_{2}-e_{1} \end{cases}$$`
- Use `\(e_1|\{\varepsilon_{2}-\varepsilon_{1}+e_{2}-e_{1}=x_2-x_1\}\equiv N\left(\frac{1}{4}(x_{1}-x_{2}),\frac{3}{4}\right)\)` to predict `\(e_1\)`

---

# The Normal Hierarchical Model

- Generally, for `\(X_{i}|\mu_i\sim N(\mu_{i},\sigma^{2})\)` and `\(\mu_{i}\sim N(\mu,\tau_{0}^{2})\)`, where `\(\sigma^2\)` and `\(\tau_0^2\)` are known constants, the plausibility interval for `\(\mu_1\)` is (a computational sketch follows)
`$$\frac{\tau_{0}^{2}}{\tau_{0}^{2}+\sigma^{2}}X_{1}+\frac{\sigma^{2}}{\tau_{0}^{2}+\sigma^{2}}\overline{X}\pm z_{\alpha/2}\sigma\sqrt{1-\frac{n-1}{n}\cdot\frac{\sigma^{2}}{\tau_{0}^{2}+\sigma^{2}}}$$`

![](images/eb_pb.png)
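---

# The Normal Hierarchical Model cont.

A small sketch that evaluates the plausibility interval above (the data vector and variance settings are hypothetical):

```python
import numpy as np
from scipy.stats import norm

def pb_interval(x, sigma2, tau02, alpha=0.05):
    """Plausibility interval for mu_1 with known sigma^2 and tau_0^2,
    following the closed-form expression on the previous slide."""
    x = np.asarray(x, dtype=float)
    n = x.size
    w = tau02 / (tau02 + sigma2)            # shrinkage weight on X_1
    center = w * x[0] + (1.0 - w) * x.mean()
    half = norm.ppf(1 - alpha / 2) * np.sqrt(sigma2) * \
        np.sqrt(1.0 - (n - 1) / n * sigma2 / (tau02 + sigma2))
    return center - half, center + half

# hypothetical observations, one per group
print(pb_interval([2.1, 0.4, 1.7, 2.9, 1.2], sigma2=1.0, tau02=1.0))
```

The center matches the empirical Bayes estimate, while the width sits between the oracle Bayes and frequentist widths, reflecting the extra uncertainty from not knowing `\(\mu\)`.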
---

# Other Studied Models

- Normal hierarchical model in which both parameters of the prior are unknown
- Poisson hierarchical model (used in the application part)
- Two-sample binomial model: `\(X\sim B(m,p_1),\ Y\sim B(n,p_2)\)`, where the prior is on `\(\delta=p_1-p_2\sim \pi(\delta)\)`, and we want to do inference on `\(\delta\)`

---
class: inverse, center, middle

# Application

---

# Basketball Three-Point Shot

.left69[
- Rewards the highest score for a single attempt
- Valuable for a team that has very limited offensive possessions
- The choice of the player who will make the attempt is crucial
- .emph[How to evaluate a shooter's performance?]
]

.right30[
![](images/three_point_shot.jpg)
]

???
Image credit: http://m.bizhizu.cn/pic/1471.html

---

# NBA Three-Point Shooters Data

- Data obtained from the official NBA website
- 2015-2016 regular season
- Three players selected from each team
- Data retrieved from each player's last ten games within the season

---

# Overview of Data Set

---

# Estimate Success Rate

- Point estimator: `\(\hat{p}_{i}=X_{i}/n_{i}\)`, where `\(X_i\)` is the number of three-point shots made in `\(n_i\)` attempts by player `\(i\)`

--

- .emph[How to quantify the uncertainty? 1/5 vs. 10/50]
- .emph[Any other assumptions we can make to improve the estimator?]

--

- Players in the league may share some common characteristics
- `\(X_i|p_i\sim Pois(n_i p_i)\)`
- `\(p_i\sim Exp(\theta)\)`
- We leave `\(\theta\)` unspecified
- Do inference on `\(p_i\)`

---

# Results

![](images/nba.png)

---

# Results

![](images/players.png)

---

# Summary

- The partial prior problem calls for methodology beyond Bayes
- We formalize such problems as Partially Specified Bayesian Models
- Based on the Inferential Models theory, we develop the Partial Bayes method to solve PB models
- The Partial Bayes method provides exact inference results for any finite sample size

---
class: inverse

# Acknowledgement

- I would like to thank Prof. Lingsong Zhang and Prof. Chuanhai Liu for their invaluable supervision
- Thanks to Yaowu Liu for the extensive discussions on the IM theory
- Thanks to Zach Hass and Nathan Hankey for their great help with the basketball example
- Thanks to Will Eagan for organizing the GSO seminar
- Thanks to the audience for coming today