{"id":1011,"date":"2025-03-23T18:24:34","date_gmt":"2025-03-23T22:24:34","guid":{"rendered":"https:\/\/www.clayford.net\/statistics\/?p=1011"},"modified":"2025-03-24T06:25:06","modified_gmt":"2025-03-24T10:25:06","slug":"some-notes-on-cronbachs-alpha","status":"publish","type":"post","link":"https:\/\/www.clayford.net\/statistics\/some-notes-on-cronbachs-alpha\/","title":{"rendered":"Some notes on Cronbach&#8217;s alpha"},"content":{"rendered":"<p>I was reading <a href=\"https:\/\/www.bmj.com\/content\/bmj\/314\/7080\/572.full.pdf\">this BMJ Statistics Note<\/a> by Bland and Altman on Cronbach\u2019s alpha and wanted to jot down a few notes.<\/p>\n<p>These two excerpts were worth copying and pasting:<\/p>\n<ul>\n<li><em>Many quantities of interest in medicine, such as anxiety or degree of handicap, are impossible to measure explicitly. Instead, we ask a series of questions and combine the answers into a single numerical value. Often this is done by simply adding a score from each answer.<\/em> (The questions are usually referred to as \u201citems\u201d.)<\/li>\n<li><em>When items are used to form a scale they need to have internal consistency. The items should all measure the same thing, so they should be correlated with one another.<\/em><\/li>\n<\/ul>\n<p>Cronbach\u2019s alpha is a statistic that allows us to assess the internal consistency of a scale. It usually falls between 0 and 1, although it can be negative when the items covary negatively on average. The closer to 1, the more internal consistency the scale has.<\/p>\n<p>The basic idea is to look at the ratio of the sum of the k item variances to the variance of the total scores. For example, say we have a survey with k = 10 questions (or items) to measure anxiety, where each question is scored from 1 (no anxiety) to 4 (high anxiety), and the 10 item scores are summed to give an anxiety score. To assess internal consistency, we would give this survey to a bunch of people who are known to have anxiety and collect their responses. 
We would next calculate the variances of the 10 items as well as the variance of the total scores. We would then sum the 10 item variances and divide by the variance of the total scores to get a ratio. This ratio is at the core of the formula for Cronbach\u2019s alpha.<\/p>\n<p>To derive Cronbach\u2019s alpha, it helps to consider the two extremes this ratio could take. The easiest to consider is the case when <em>all k items are independent<\/em> and thus the variance of their sum is equal to the sum of their variances. No need to account for covariance since it\u2019s 0. That is,<\/p>\n<p><span class=\"math display\">\\[<br \/>\ns^2_T = \\sum_{i=1}^{k}s^2_i<br \/>\n\\]<\/span><\/p>\n<p>In this case, the ratio of sum of variances to variance of sums is 1: <span class=\"math inline\">\\(\\frac{\\sum_{i=1}^{k}s^2_i}{s^2_T} = 1\\)<\/span><\/p>\n<p>At the other extreme, <em>all items have identical variance and are perfectly correlated<\/em>. In this case we cannot simply sum the item variances to calculate the variance of the sums. The expression for <span class=\"math inline\">\\(s^2_T\\)<\/span> is more complicated. We need to include covariances. This comes out to<\/p>\n<p><span class=\"math display\">\\[<br \/>\ns^2_T = \\sum_{i=1}^{k}s^2_i + 2\\sum_{i&lt;j}\\text{Cov}(X_i,X_j)<br \/>\n\\]<\/span><\/p>\n<p>However, since all items have identical variances and are perfectly correlated, the covariance for any two variables is the same as their respective variances. 
Recall the formula for covariance for two variables:<\/p>\n<p><span class=\"math display\">\\[<br \/>\n\\text{Cov}(X,Y) = \\rho\\sigma_X\\sigma_Y<br \/>\n\\]<\/span><\/p>\n<p>If <span class=\"math inline\">\\(\\rho=1\\)<\/span> and X and Y have equal variance, then the product of their respective standard deviations is equal to the variance of X or Y.<\/p>\n<p>To calculate <span class=\"math inline\">\\(s^2_T\\)<\/span> in this case we need to calculate two parts: (1) the sum of the variances and (2) twice the sum of the covariances.<\/p>\n<p>If we have k items with equal variances, the sum of the variances is simply <span class=\"math inline\">\\(k\\sigma^2\\)<\/span>.<\/p>\n<p>If we have k items, then we have <span class=\"math inline\">\\(\\frac{k(k-1)}{2}\\)<\/span> distinct pairs of items, each with covariance <span class=\"math inline\">\\(\\sigma^2\\)<\/span>. Twice the sum of the covariances is <span class=\"math inline\">\\(2\\frac{k(k-1)}{2}\\sigma^2\\)<\/span>, which simplifies to <span class=\"math inline\">\\(k(k-1)\\sigma^2\\)<\/span>.<\/p>\n<p>Therefore we have<\/p>\n<p><span class=\"math display\">\\[<br \/>\ns^2_T = k\\sigma^2 + k(k-1)\\sigma^2<br \/>\n\\]<\/span><\/p>\n<p>which simplifies to<\/p>\n<p><span class=\"math display\">\\[<br \/>\ns^2_T = k^2\\sigma^2<br \/>\n\\]<\/span><\/p>\n<p>Now when we form the ratio of sum of variances to variance of sums we get <span class=\"math inline\">\\(\\frac{k\\sigma^2}{k^2\\sigma^2}\\)<\/span>, which simplifies to <span class=\"math inline\">\\(\\frac{1}{k}\\)<\/span>.<\/p>\n<p>So at one extreme, where all k items are independent, we get a ratio of 1, and at the other extreme, where all k items have equal variance and perfect correlation, we get a ratio of <span class=\"math inline\">\\(1\/k\\)<\/span>. If we subtract this ratio from 1, we get 0 at one extreme and <span class=\"math inline\">\\(1 - 1\/k\\)<\/span>, or <span class=\"math inline\">\\((k-1)\/k\\)<\/span>, at the other. 
Therefore, if we multiply by <span class=\"math inline\">\\(k\/(k-1)\\)<\/span> after subtracting from 1, we\u2019ll still get 0 at one extreme but 1 at the other. This leads to the general formula for Cronbach\u2019s alpha:<\/p>\n<p><span class=\"math display\">\\[<br \/>\n\\alpha = \\frac{k}{k-1} \\Big(1 - \\frac{\\sum_{i=1}^{k}s^2_i}{s^2_T}\\Big)<br \/>\n\\]<\/span><\/p>\n<p>Here\u2019s a simple demonstration of the case where all items have identical variance and are perfectly correlated.<\/p>\n<pre class=\"r\"><code>it1 &lt;- c(3, 3, 4, 2)\r\nit2 &lt;- c(3, 3, 4, 2)\r\nit3 &lt;- c(3, 3, 4, 2)\r\nd &lt;- data.frame(it1, it2, it3)\r\nd<\/code><\/pre>\n<pre><code>##   it1 it2 it3\r\n## 1   3   3   3\r\n## 2   3   3   3\r\n## 3   4   4   4\r\n## 4   2   2   2<\/code><\/pre>\n<p>There are three items, and each of the four subjects gave the same response to every item. Now we sum the scores to add a total to the data frame and then take the variance of all 4 columns. Of course all three item variances are equal.<\/p>\n<pre class=\"r\"><code>d$tot &lt;- apply(d, 1, sum)\r\nvars &lt;- sapply(d, var)\r\nvars<\/code><\/pre>\n<pre><code>##       it1       it2       it3       tot \r\n## 0.6666667 0.6666667 0.6666667 6.0000000<\/code><\/pre>\n<p>If we sum the three item variances and divide by the variance of the sum, we get 1\/3 since k = 3.<\/p>\n<pre class=\"r\"><code>sum(vars[1:3])\/vars[&quot;tot&quot;]<\/code><\/pre>\n<pre><code>##       tot \r\n## 0.3333333<\/code><\/pre>\n<p>Plugging this ratio into the formula with k = 3 gives <span class=\"math inline\">\\(\\alpha = \\frac{3}{2}\\big(1 - \\frac{1}{3}\\big) = 1\\)<\/span>, the upper extreme.<\/p>\n<p>If all items are independent, the sum of variances equals the variance of sums. 
We can simulate a large amount of independent data to demonstrate this.<\/p>\n<pre class=\"r\"><code>X &lt;- matrix(data = rnorm(1e6*5), ncol = 5)\r\nsum(apply(X,2,var)) # sum of variances<\/code><\/pre>\n<pre><code>## [1] 4.995923<\/code><\/pre>\n<pre class=\"r\"><code>var(apply(X,1,sum)) # variance of sums<\/code><\/pre>\n<pre><code>## [1] 5.002947<\/code><\/pre>\n<p>The two values are nearly equal, so the ratio is close to 1 and alpha is close to 0 (about 0.002 for these values).<\/p>\n<p>The lower bound of 0 for Cronbach\u2019s alpha is basically theoretical. It seems very unlikely that one would ever get exactly 0 in real life.<\/p>\n<p>Bland and Altman conclude with this useful observation:<\/p>\n<p><em>Cronbach\u2019s alpha has a direct interpretation. The items in our test are only some of the many possible items which could be used to make the total score. If we were to choose two random samples of k of these possible items, we would have two different scores each made up of k items. The expected correlation between these scores is <span class=\"math inline\">\\(\\alpha\\)<\/span>.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I was reading this BMJ Statistics Note by Bland and Altman on Cronbach\u2019s alpha and wanted to jot down a&#8230; <a class=\"read-more\" href=\"https:\/\/www.clayford.net\/statistics\/some-notes-on-cronbachs-alpha\/\">Read 
more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[88],"class_list":["post-1011","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-cronbachs-alpha"],"_links":{"self":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts\/1011","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/comments?post=1011"}],"version-history":[{"count":3,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts\/1011\/revisions"}],"predecessor-version":[{"id":1014,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/posts\/1011\/revisions\/1014"}],"wp:attachment":[{"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/media?parent=1011"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/categories?post=1011"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.clayford.net\/statistics\/wp-json\/wp\/v2\/tags?post=1011"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}