About

This is my modest space on the internet where I write about statistics and using R. I assume anyone reading this site is probably looking for help, hence the tagline “help with learning statistics.” I find that teaching statistical concepts helps me better understand the concepts myself. So this blog is mostly about me trying to solidify and expand my statistical knowledge. But if it helps others, so much the better!

Thanks,
Clay Ford

14 thoughts on “About”

  1. Matan Gilbert

    Hello Mr. Ford,

    I found your blog in a search for resources addressing the generation of (fake) data for analysis. Are you aware of any texts that treat this subject at an introductory level (including understanding and choosing appropriate probability distributions, etc.)?

  2. Clay Ford Post author

    You may want to check out the book An Introduction to Statistical Computing: A Simulation-based Approach by Voss. I have never read it and it looks expensive, but it appears it might be a resource for generating fake data. See chapter 2, Simulating Statistical Models.

  3. Kyle

    Hi Clay,

    I found you after having read several articles you wrote on analyzing categorical data using R, posted on the UVa Library Services website. On two occasions now I was at a loss trying to understand what my professor (or Agresti) was talking about (or why it mattered). Then I read your articles working through basic analyses, and afterwards engaging with the finer details in my notes and the textbook made a lot more sense. Thanks for taking the time to write so clearly. I appreciate it.

    Regards,

    Kyle

  4. ss

    Hi Clay, I just saw your R code on GitHub about effects plots: https://github.com/clayford/effects_pkg. Both the R script and the R Markdown file give errors when I run the code. So far the R script fails because the data set babies is not available, and the R Markdown file gives several errors. Would you have a chance to update them? I would really appreciate it!

    1. Clay Ford Post author

      Hi. I just saw this comment. I have updated the GitHub repo so the Rmd file compiles and have added the babies data set. The effects package has been updated quite a bit since 2016 but the code still seems to work ok.

      Thanks,
      Clay

  5. Kyle Grealis

    Hi, Clay! I see that it’s been a while since a reply was posted here, so I’m hoping that you’ll still see this. I was working with the {pwr} package in R, was reading through the vignette, and I’m hoping it’s you that wrote it. If possible, I do have a couple of questions… at your convenience!

    Thank you!
    Kyle

  6. jp

    So I was reading your article on the problems with r2. My question is: can an r2 value be too high with linear data, or is that only a problem with non-linear data?

    1. Clay Ford Post author

      I’m not sure I follow your question. I’m also not sure what article you’re talking about.

      1. jp

        I was referring to the article “Is R-squared Useless?” The University of Virginia blog attributes authorship to you. In it, you reviewed the arguments made by Cosma Shalizi against the r2 metric. In particular, I was asking about the demonstration of how r2 can be arbitrarily high even with an incorrect model. You used non-linear data to prove the point, but my question is whether the same issue can arise with linear data. Secondly, I wanted to know how well, in your opinion, adjusted r2 values can compensate for these alleged failings.

        Link to article in question
        https://library.virginia.edu/data/articles/is-r-squared-useless

        1. Clay Ford Post author

          I forgot about that article. That was almost 9 years ago. Where did the time go?

          Sure, I imagine the issue could arise with linear data. Fit a highly flexible non-linear model to linear data and you can get high R-squared values despite the model being wrong. That’s an example of overfitting the data. Adjusted R-squared in this case wouldn’t fix the fact that you fit the wrong model.
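
A minimal sketch of the overfitting point, in Python with only the standard library. Everything here is an illustrative assumption rather than anything taken from the article: the simulated data (a true straight line plus noise), the seed, and the "flexible" model, which is a hypothetical step function that predicts the mean of y within each bin of x. With one bin per observation, the step function reproduces the data exactly, so its in-sample R-squared is 1 even though the data were generated from a line.

```python
import random


def r_squared(y, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def fit_line(x, y):
    """Fitted values from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi for xi in x]


def fit_step(x, y, n_bins):
    """Fitted values from a step function: the mean of y in each bin of x.

    Assumes x is already sorted and len(x) is divisible by n_bins.
    """
    size = len(x) // n_bins
    fitted = []
    for start in range(0, len(x), size):
        chunk = y[start:start + size]
        fitted.extend([sum(chunk) / len(chunk)] * len(chunk))
    return fitted


# Simulated data (assumed for illustration): a true line plus Gaussian noise.
rng = random.Random(1)
n = 40
x = [i / 4 for i in range(n)]
y = [1 + 2 * xi + rng.gauss(0, 5) for xi in x]

r2_line = r_squared(y, fit_line(x, y))              # the correct model form
r2_step_20 = r_squared(y, fit_step(x, y, n_bins=20))  # two points per bin
r2_connect = r_squared(y, fit_step(x, y, n_bins=n))   # one point per bin

print(f"straight line:        R^2 = {r2_line:.3f}")
print(f"step function (20):   R^2 = {r2_step_20:.3f}")
print(f"step function ({n}):   R^2 = {r2_connect:.3f}")
```

The one-bin-per-point fit has zero residuals by construction, which is why its R-squared is 1. That says nothing about how it would predict new data, and the straight line remains the model that matches how the data were generated.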

