Wrangling "R": Rrangling!

I've been trying to figure out an easy way to run Pearson's chi-squared test in R when verifying goodness of fit. I thought the function chisq.test() would help!

Problems?

  • There's no way to tell chisq.test() the correct degrees of freedom ("df"), and it cannot figure this out itself. My stats text tells me to reduce df by 1 for each parameter estimated from the observations. If I have 10 buckets of counts AND I have calculated the mean and sd from the data, df = 10 - 1 - 2 = 7, NOT the 9 assumed by chisq.test().
  • In the test distribution, again according to my text, no expected count should be less than 5. All chisq.test() does is warn you that the results may be unreliable. It doesn't merge the small buckets for you, so the pre-processing is left up to you.

So here's how to use R's chisq.test()
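
As a rough sketch of the two workarounds (not necessarily the exact procedure from the full post), the snippet below fits a normal distribution to a made-up sample, merges any bucket whose expected count is under 5, and recomputes the p-value with the reduced df. The sample, the bucket layout, and the helper merge_small() are my own assumptions for illustration; only chisq.test(), pchisq(), pnorm(), cut() and table() are base R.

```r
set.seed(42)                          # made-up sample, for illustration only
x <- rnorm(500, mean = 10, sd = 2)

# Two parameters are estimated from the data; each costs one degree of freedom.
m <- mean(x)
s <- sd(x)
n_est <- 2

# 10 buckets: 8 inner ones plus two open-ended tails, so probabilities sum to 1.
breaks   <- c(-Inf, seq(m - 2 * s, m + 2 * s, length.out = 9), Inf)
observed <- as.vector(table(cut(x, breaks)))
p        <- diff(pnorm(breaks, mean = m, sd = s))

# Hypothetical helper: fold the smallest bucket into a neighbour until every
# expected count (n * p) is at least min_exp.
merge_small <- function(obs, p, n, min_exp = 5) {
  while (length(p) > 2 && any(n * p < min_exp)) {
    i <- which.min(p)
    j <- if (i == 1) 2 else i - 1     # neighbour that absorbs bucket i
    obs[j] <- obs[j] + obs[i]
    p[j]   <- p[j] + p[i]
    obs <- obs[-i]
    p   <- p[-i]
  }
  list(obs = obs, p = p)
}

b   <- merge_small(observed, p, length(x))
fit <- chisq.test(b$obs, p = b$p)     # reports df = (number of buckets) - 1

# Recompute the p-value by hand with the df reduced for the estimated parameters.
df_adj <- length(b$obs) - 1 - n_est
p_adj  <- unname(pchisq(fit$statistic, df = df_adj, lower.tail = FALSE))
```

The statistic itself (fit$statistic) is unchanged; only the reference distribution differs, so p_adj can never be larger than fit$p.value.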

GI => GO (IV) Generalized Extreme Value Distribution

As part of my procedure for back-testing, I validate the data before using it. One of the validation steps is to check for unusually large daily changes in the raw close.

The changes on rollover days form a subset of daily changes, and they obviously have the potential to be substantially different from regular daily changes.

Daily price changes are not normally distributed; typically a log-normal model is used. Even so, in a series of 4,000 daily price changes, there will often be several 4-sigma deviations (which, under a normal distribution, should happen less than 1 time in 10,000).
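
To put numbers on that claim, here is a quick back-of-the-envelope check in R. It assumes the changes are independent and exactly normal, which is precisely the assumption being questioned, and the variable names are mine.

```r
# Two-sided probability of a move beyond 4 standard deviations under a normal model.
p4 <- 2 * pnorm(-4)                   # roughly 6.3e-05, i.e. under 1 in 10,000

n <- 4000                             # length of the series of daily changes
expected_count <- n * p4              # about 0.25 such days expected

# Chance of seeing at least one 4-sigma day in the whole series.
p_at_least_one <- 1 - (1 - p4)^n
```

So a normal model expects roughly a quarter of a 4-sigma day in 4,000 observations; seeing several is strong evidence of fatter tails.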

Can the Generalized Extreme Value Distribution (GEV) help?