I found you after reading several articles you wrote on analyzing categorical data using R, posted on the UVa Library Services website. On two occasions now I was at a loss trying to understand what my professor (or Agresti) was talking about (or why it mattered). Then I read your articles working through basic analyses, and afterwards engaging with the finer details in my notes and the textbook made a lot more sense. Thanks for taking the time to write so clearly. I appreciate it.

Regards,

Kyle

]]>I found your blog in a search for resources addressing the generation of (fake) data for analysis. Are you aware of any texts that treat this subject at an introductory level, including understanding and choosing appropriate probability distributions?

]]>Best Vasilis ]]>

hist(rf(n = 1000, df1 = 3, df2 = 15))

But the following produces something close to Normal:

hist(replicate(n = 1000, sum(rf(n = 20, df1 = 3, df2 = 15))))

Notice the difference between the two. The first is just a histogram of 1000 F statistics. The second is a histogram of 1000 sums of 20 random F statistics.
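To put a rough number on that difference, here is a minimal sketch that compares the sample skewness of the two sets of draws. The `skewness` helper and the seed are my own additions for illustration, not part of the commands above, and the exact values will vary with the seed. A symmetric distribution has skewness near 0, so the sums should come out much less skewed than the individual F statistics, though probably not exactly symmetric with only 20 terms in each sum.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

# Hypothetical helper: sample skewness, the third central moment
# divided by the cubed standard deviation
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# 1000 individual F statistics
single <- rf(n = 1000, df1 = 3, df2 = 15)

# 1000 sums, each of 20 independent F statistics
sums <- replicate(n = 1000, sum(rf(n = 20, df1 = 3, df2 = 15)))

skewness(single)  # clearly positive: right-skewed
skewness(sums)    # noticeably closer to 0
```

The CLT is a statement about sums (or means) of many independent draws, which is why only the second quantity moves toward a Normal shape.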

Hope that helps.

]]>I have some thoughts. According to the CLT, random sampling of any statistic will eventually result in a normal distribution of that statistic. But why does this not apply to F statistics? I bootstrapped the F statistic over 10000 samples, and I still got a skewed distribution of F statistics.

]]>