{"id":923,"date":"2024-03-16T11:29:05","date_gmt":"2024-03-16T15:29:05","guid":{"rendered":"https:\/\/www.clayford.net\/statistics\/?p=923"},"modified":"2024-03-16T11:29:05","modified_gmt":"2024-03-16T15:29:05","slug":"a-note-on-cardeltamethod","status":"publish","type":"post","link":"https:\/\/www.clayford.net\/statistics\/a-note-on-cardeltamethod\/","title":{"rendered":"A note on car::deltaMethod"},"content":{"rendered":"<blockquote>\n<p>As we would typically estimate the success probability <em>p<\/em> with the observed success probability <span class=\"math inline\">\\(\\hat{p} = \\sum_iX_i\/n\\)<\/span>, we might consider using <span class=\"math inline\">\\(\\frac{\\hat{p}}{1 &#8211; \\hat{p}}\\)<\/span> as an estimate of <span class=\"math inline\">\\(\\frac{p}{1 &#8211; p}\\)<\/span> (the odds). But what are the properties of this estimator? How might we estimate the variance of <span class=\"math inline\">\\(\\frac{\\hat{p}}{1 &#8211; \\hat{p}}\\)<\/span>? Moreover, how can we approximate its sampling distribution? Intuiton abandons us, and exact calculation is relatively hopeless, so we have to rely on an approximation. The Delta Method will allow us to obtain reasonable, approximate answers to our questions. (<a href=\"https:\/\/www.routledge.com\/Statistical-Inference\/Casella-Berger\/p\/book\/9781032593036\">Casella and Berger<\/a>, p.\u00a0240)<\/p>\n<\/blockquote>\n<p>Most statistics books that teach the Delta Method work a few examples where they manually derive the standard error of a nonlinear function of some statistic. This requires some calculus and algebra. The result is a closed-form formula we could ostensibly we use in a function to estimate the standard error of a statistic, such as estimated odds, which is a function of an estimated proportion. 
I want to document how we can use the <code>deltaMethod()<\/code> function in the {car} package to do this work for us.<\/p>\n<p>Casella and Berger show that the estimated variance of the odds estimator is <span class=\"math inline\">\\(\\frac{\\hat{p}}{n(1 - \\hat{p})^3}\\)<\/span> (p.\u00a0242), so the estimated standard error is its square root. If we don\u2019t know this offhand or have a function available to us, we can use the <code>deltaMethod()<\/code> function to derive this standard error on-the-fly as we analyze data. For example, let\u2019s say we observe 19 successes out of 30 trials, an estimated probability of about 0.63, but we want to express that as odds and obtain a confidence interval on the estimated odds.<\/p>\n<p>To begin we load the {car} package. Next we need to store our probability estimate in a <em>named<\/em> vector. I gave it the name \u201cp\u201d. After that, we need to estimate the variance of the probability estimate, which in this case is the familiar <span class=\"math inline\">\\(\\hat{p}(1 - \\hat{p})\/n\\)<\/span>. Finally we use the <code>deltaMethod()<\/code> function. The first argument is our named vector containing the estimated probability. The second argument is the function of our estimate expressed as a <em>character string<\/em>. Notice this is the odds. The third argument is the estimated variance of our original estimate.<\/p>\n<pre class=\"r\"><code>library(car)\r\np_hat &lt;- c(&quot;p&quot; = 19\/30)\r\nvar_p &lt;- p_hat*(1 - p_hat)\/30\r\ndeltaMethod(p_hat, g. = &quot;p\/(1-p)&quot;, vcov. = var_p)<\/code><\/pre>\n<pre><code>##           Estimate      SE   2.5 % 97.5 %\r\n## p\/(1 - p)  1.72727 0.65441 0.44466 3.0099<\/code><\/pre>\n<p>So our estimated odds are about 1.73 with a 95% confidence interval of [0.44, 3.01]. 
The reported standard error agrees with the calculation using the formula provided in Casella and Berger.<\/p>\n<pre class=\"r\"><code>sqrt(p_hat\/(30*(1 - p_hat)^3))<\/code><\/pre>\n<pre><code>##         p \r\n## 0.6544077<\/code><\/pre>\n<p>In <em><a href=\"https:\/\/www.routledge.com\/Foundations-of-Statistics-for-Data-Scientists-With-R-and-Python\/Agresti-Kateri\/p\/book\/9780367748456\">Foundations of Statistics for Data Scientists<\/a><\/em>, Agresti and Kateri use the Delta Method to derive the variance of square-root-transformed Poisson counts. They show that the square root of a Poisson random variable with a \u201clarge mean\u201d has an approximate standard error of 1\/2. Again we can use the <code>deltaMethod()<\/code> function with data to derive this on-the-fly.<\/p>\n<p>Below we simulate 10,000 observations from a Poisson distribution with mean 25. Then we estimate the mean and assign it to a named vector. Finally we use the <code>deltaMethod()<\/code> function to show the result is indeed about 1\/2. Notice we simply have to provide the transformation as a character string in the second argument. Notice also that the variance we supply is the variance of a single count, <code>var(y)<\/code>, not the variance of the sample mean, because the claim concerns the square root of a single Poisson random variable.<\/p>\n<pre class=\"r\"><code>set.seed(123)\r\ny &lt;- rpois(10000, 25)\r\nm &lt;- c(&quot;m&quot; = mean(y))\r\ndeltaMethod(m, g. = &quot;sqrt(m)&quot;, vcov. = var(y))<\/code><\/pre>\n<pre><code>##         Estimate      SE   2.5 % 97.5 %\r\n## sqrt(m)  4.99967 0.49747 4.02465 5.9747<\/code><\/pre>\n<p>Of course the <code>deltaMethod()<\/code> function was really designed to take fitted model objects and estimate the standard error of functions of coefficients. See its help page for a few examples. But I wanted to show it could also be used for more pedestrian textbook examples.<\/p>\n<p><strong>References<\/strong><\/p>\n<ul>\n<li>Agresti, A. and Kateri, M. (2022) <em>Foundations of Statistics for Data Scientists<\/em>. CRC Press.<\/li>\n<li>Casella, G. and Berger, R.L. (2002) <em>Statistical Inference. 
2nd Edition<\/em>, Duxbury Press, Pacific Grove.<\/li>\n<li>Fox, J. and Weisberg, S. (2019) <em>An R Companion to Applied Regression<\/em>, Third edition. Sage, Thousand Oaks, CA. <a href=\"https:\/\/socialsciences.mcmaster.ca\/jfox\/Books\/Companion\/\" class=\"uri\">https:\/\/socialsciences.mcmaster.ca\/jfox\/Books\/Companion\/<\/a>.<\/li>\n<li>R Core Team (2024) <em>R: A Language and Environment for Statistical Computing<\/em>. R Foundation for Statistical Computing, Vienna, Austria. <a href=\"https:\/\/www.R-project.org\/\" class=\"uri\">https:\/\/www.R-project.org\/<\/a>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>As we would typically estimate the success probability p with the observed success probability \\(\\hat{p} = \\sum_iX_i\/n\\), we might consider&#8230; <a class=\"read-more\" href=\"https:\/\/www.clayford.net\/statistics\/a-note-on-cardeltamethod\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[80],"class_list":["post-923","post","type-post","status-publish","format-standard","hentry","category-expectation","tag-delta-method"],"_links":{"self":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts\/923","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/comments?post=923"}],"version-history":[{"count":1,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts\/923\/revisions"}],"predecessor-version":[{"id":924,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts\/923
\/revisions\/924"}],"wp:attachment":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/media?parent=923"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/categories?post=923"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/tags?post=923"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}