class: center, middle, inverse, title-slide .title[ # A review of statistical computing with TMB ] .author[ ### Andrea Havron
NOAA Fisheries, OST ] --- layout: true .footnote[U.S. Department of Commerce | National Oceanic and Atmospheric Administration | National Marine Fisheries Service] <style type="text/css"> code.cpp{ font-size: 14px; } code.r{ font-size: 14px; } </style> --- # Template Model Builder <img src="data:image/png;base64,#static/tmb-wordcloud.png" width="45%" style="display: block; margin: auto;" /> Wordcloud generated from: ([Kristensen et al., 2015](https://backend.orbit.dtu.dk/ws/portalfiles/portal/123599315/Publishers_version.pdf)) --- # Statistical Computing Review <img src="data:image/png;base64,#static/MLE_triangle.png" width="65%" style="display: block; margin: auto;" /> --- # Maximum Likelihood Inference #### What is the likelihood of getting 30 heads in 100 coin flips? .pull-left[ 1. **Specify the model** <br><br> `\(y ~ \sim Binomial(n, p)\)` ] --- # Maximum Likelihood Inference #### What is the likelihood of getting 30 heads in 100 coin flips? .pull-left[ 1. Specify the model 2. **Calculate the likelihood**<br><br> `\begin{align} L(p; n, y) &= \frac{n!}{y!(n-y)!}p^y(1-p)^{n-y} \end{align}` ] .pull-right[ `\(y = dbinom(x = 30, n = 100, p)\)`
] --- # Maximum Likelihood Inference #### What is the likelihood of getting 30 heads in 100 coin flips? .pull-left[ 1. Specify the model 2. Calculate the likelihood 3. **Calculate the negative log-likelihood** `\begin{align} -\ell(p; n, y) = &-[ln\big(\frac{n!}{y!(n-y)!}\big) + yln(p)\\ &+ (n-y)ln(1-p)] \end{align}` ] .pull-right[ `\(nll = -log(dbinom(x = 30, n = 100, p))\)`
] --- # Maximum Likelihood Inference #### What is the likelihood of getting 30 heads in 100 coin flips? .pull-left[ 1. Specify the model 2. Calculate the likelihood 3. Calculate the negative log-likelihood `\begin{align} -\ell(p; n, y) = &-[ln\big(\frac{n!}{y!(n-y)!}\big) + yln(p)\\ &+ (n-y)ln(1-p)] \end{align}` 4. **Calculate the derivative w.r.t. `\(p\)`** `\begin{align} \frac{d(\ell(p; n, y))}{dp} &= \frac{y}{p}- \frac{n-y}{1-p} \end{align}` ] .pull-right[ `\(nll = -log(dbinom(x = 30, n = 100, p))\)`
] --- # Maximum Likelihood Inference #### What is the likelihood of getting 30 heads in 100 coin flips? .pull-left[ 1. Specify the model 2. Calculate the likelihood 3. Calculate the negative log-likelihood 4. Calculate the derivate wrt p 5. **Set to `\(0\)` and solve for MLE**<br> `\begin{align} 0 &= \frac{y}{p}- \frac{n-y}{1-p} \\ E[p] &= \frac{y}{n} \\ E[y] &= np \end{align}` ] .pull-right[ `\(nll = -log(dbinom(x = 30, n = 100, p))\)`
] --- # Maximum Likelihood Inference #### What is the likelihood of getting 30 heads in 100 coin flips? .pull-left[ 1. Specify the model 2. Calculate the likelihood 3. Calculate the negative log-likelihood 4. Calculate the derivate wrt p 5. Set to `\(0\)` and solve for MLE `\begin{align} 0 &= \frac{y}{p}- \frac{n-y}{1-p}, \hspace{2mm} E[y] = np \\ \end{align}` 6. **Approximate `\(Var[p]\)` using the second derivative**<br> `\begin{align} &-\frac{y}{p^2} - \frac{(n-y)}{(1-p)^2} = -\frac{np}{p^2} - \frac{(n-np)}{(1-p)^2}\\ &= -\frac{n}{p} - \frac{n}{1-p}\\ l''(p) &= -\frac{n(1-p)}{p} \end{align}` ] .pull-right[ `\begin{align} Var[p] &\approx -\frac{1}{l''(p)} \approx \frac{p(1-p)}{n}\\ SE[p] &\approx \sqrt{ .3(.7)/100} \approx .00458 \end{align}`
] --- # Multivariate asymptotics * The second derivative measures the curvature of the likelihood surface * For models with N parameters, the curvature is represented by an NxN **Hessian** matrix of 2nd partial derivatives * Inverting the negative Hessian gives us a covariance matrix .pull-left[ `\begin{align} (\mathbb{H}_{f})_{i,j} &= \frac{\partial^2f}{\partial \theta_{i}, \partial x\theta_{j}} = \frac{-1}{Var(\Theta)} \end{align}` <br> [**What causes a singular covariance matrix?**](https://andrea-havron.shinyapps.io/mvnplot/) ] .pull-right[ <img src="data:image/png;base64,#00_00_Statistical_Computing_Review_files/figure-html/mvnorm-1.png" width="80%" /> ] --- # Automatic Differentiation <img src="data:image/png;base64,#static/AD_triangle.png" width="65%" style="display: block; margin: auto;" /> --- # What is AD? <br> <table> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:left;"> Automatic Differentiation </th> <th style="text-align:left;"> Symbolic Differentiation </th> <th style="text-align:left;"> Numerical Differentiation </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> </td> <td style="text-align:left;"> Derivatives calculated automatically using the chain rule </td> <td style="text-align:left;"> Computer program converts function into exact derivative function </td> <td style="text-align:left;"> Approximation that relies on finite differences </td> </tr> <tr> <td style="text-align:left;"> Efficiency: </td> <td style="text-align:left;"> forward mode: O(n); reverse mode: O(m) </td> <td style="text-align:left;"> O(2n) </td> <td style="text-align:left;"> O(n) </td> </tr> <tr> <td style="text-align:left;"> Accuracy: </td> <td style="text-align:left;"> Machine Precision </td> <td style="text-align:left;"> Exact </td> <td style="text-align:left;"> Trade-off between truncation error and round-off error </td> </tr> <tr> <td style="text-align:left;"> Higher order derivatives: </td> <td style="text-align:left;"> Easy </td> <td style="text-align:left;"> Difficult due to complexity </td> <td style="text-align:left;"> Difficult due to error accumulation </td> </tr> </tbody> </table> *O(n): Order of operations is based on the number of parameters <br> *O(m): Order of operations is based on the number of outputs, in MLE inference, this is the nll --- # Computational Graph (Tape) <br> .three-column-left[ ```cpp //Program v1: x = ? v2: y = ? v3: a = x * y v4: b = sin(y) v5: z = a + b ``` ] .three-column[ <img src="data:image/png;base64,#static/comp-graph.png" width="100%" style="display: block; margin: auto auto auto 0;" /> ] .three-column-left[ ```cpp //Reverse Mode dz = ? da = dz db = dz dx = yda dy = xda + cos(y)db ``` ] --- # Reverse Mode .pull-left[ **Static (TMB: CppAD, TMBad)**<br> The graph is constructed once before execution .p[ - Less flexibility with conditional statements that depend on parameters. - Atomic functions can be used when conditional statements depend on parameters - High portability - Graph optimization possible ]] .pull-right[ **Dynamic (Stan: Stan Math Library, ADMB: AUTODIF)**<br> The graph is defined as forward computation is executed at every iteration .p[ - Flexibility with conditional statements - Optimization routine implemented into executable - Less time for graph optimization ] ] --- # TMB AD Systems <br> .pull-left[ **CppAD** - [CppAD package](https://coin-or.github.io/CppAD/doc/cppad.htm) - Developed and maintained by the CppAD community ```r TMB::compile("mymodel.cpp") TMB::compile("mymodel.cpp", framework = "CppAD") ``` ] .pull-right[ **TMBad**<br> - TMBad is available with TMB 1.8.0 and higher - Improved efficiencies over CppAD - Developed and maintained by Kasper Kristensen ```r TMB::compile("mymodel.cpp", framework = "TMBad") ``` ] --- # R and C++ Type Systems <img src="data:image/png;base64,#static/Type_Triangle.png" width="65%" style="display: block; margin: auto;" /> --- # Dynamic vs. Static Typing <br> .bluebox[ **Type System** <br> A logical system comprising a set of rules that assigns a property called a type (for example, integer, floating point, string) to every "term" (a word, phrase, or other set of symbols) <br> Source: [Wikipedia](https://en.wikipedia.org/wiki/Type_system) ] .pull-left[ **R: Dynamic** .p[ - Type checking occurs at run time - The values and types associated with names can change - Change in type tends to be implicit ]] .pull-right[ **C++: Static** .p[ - Type checking occurs at compile time - The values associated with a given name can be limited to just a few types and may be unchangeable - Change in type tends to be explicit ]] --- # Dynamic vs. Static Typing <br> .pull-left[ **R: Dynamic** - Flexible - Easy to debug at runtime - Code is easier to read and write - More errors detected later in development - More tests are needed to detect errors - Slower to run ```r `+` <- `-` 1 + 1 ``` ``` ## [1] 0 ``` ] .pull-right[ **C++: Static** - Errors detected early in development - Fewer errors occur at runtime - Optimized code runs faster - Code is harder to read and write - Difficult to debug <br><br> <img src="data:image/png;base64,#static/RStudioBomb.png" width="65%" style="display: block; margin: auto;" /> ] --- # What is the Template in TMB? <br> **Templated C++** <br> * Generic programming * Allows developer to write functions and classes that are independent of Type * Templates are expanded at compile time .pull-left[ ```cpp template <class Type> Type add(Type x, Type y){ return x + y; } int main(){ int a = 1; int b = 2; double c = 1.1; double d = 2.1; int d = add(a,b); double e = add(c,d); } ``` ] .pull-right[ ```cpp int add(int x, int y){ return x + y; } double add(double x, double y){ return x + y; } ``` ]