Wednesday, August 9, 2017

Multiple Regression Explained



How to interpret multiple regression

Regression is useful for making a predictive model. Let's say there's a positive linear correlation between Factor K and Outcome N, but you suspect that Factors L and M also contribute to Outcome N.


Make up a story: say that Factors K, L, and M represent intelligence, persistence, and amount of sleep per night, and that Outcome N is a course grade.

So, to test the relative impacts of Factors K, L, and M on Outcome N, you can feed the factors into a regression model one at a time and test whether each one improves the fit. Say a correlation between Factor K and Outcome N yields a Pearson's r of .64, which squares to an R² of .4096.

But when you run a regression testing the effect of Factors K and L together on Outcome N, you find an R² of .5625, with a significant change in R². That means that Factors K and L together do a better job of explaining Outcome N than Factor K alone.

Then you run a regression with Factors K, L, and M together and find an R² of .5929, with no significant change in R². This means that Factor M does not help to explain the relationship: Outcome N is explained mostly by Factors K and L, and Factor M is an unimportant predictor of Outcome N.
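If you'd like to see the idea in code, here's a minimal sketch of that kind of hierarchical comparison. The data are simulated (the coefficients, sample size, and random seed are all invented for the demo), and it uses plain NumPy least squares rather than a dedicated stats package:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Simulated predictors: K and L actually drive the outcome, M is pure noise
K = rng.normal(size=n)
L = rng.normal(size=n)
M = rng.normal(size=n)
N = 0.6 * K + 0.4 * L + rng.normal(scale=0.5, size=n)

def r_squared(y, *predictors):
    """Fit ordinary least squares with an intercept and return R^2."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_k   = r_squared(N, K)        # Factor K alone
r2_kl  = r_squared(N, K, L)     # adding Factor L: R^2 jumps
r2_klm = r_squared(N, K, L, M)  # adding Factor M: R^2 barely moves
print(r2_k, r2_kl, r2_klm)
```

In practice you'd also test whether each change in R² is statistically significant (e.g., with an F test for the R² change), which stats software reports for you.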

Voilà! There's regression in a nutshell! 

And if you're confused about the math...remember in middle school or high school math, when you learned about "rise over run" and the formula y = mx + b? Yeah, that's a simple linear regression. With multiple regression, you can add multiple terms, such that y = ax1 + bx2 + cx3 + ... + z. But it's still the same concept, just with more predictors than that lone "mx" term.
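To make the "rise over run" connection concrete, here's a tiny sketch (the numbers are invented) showing that fitting a straight line just recovers m and b:

```python
import numpy as np

# Made-up points that lie exactly on y = 2x + 1 (slope m = 2, intercept b = 1)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

m, b = np.polyfit(x, y, 1)  # degree-1 polynomial fit = simple linear regression
print(m, b)  # recovers the slope and intercept from the data
```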

In case you missed it, there are some fantastic, easy-to-use, and FREE stats programs available now! I review them here.
For more help explaining statistical concepts and when to use them, please download my freely available PDF guide here!
https://drive.google.com/open?id=0B4ZtXTwxIPrjUzJ2a0FXbHVxaXc

Saturday, August 5, 2017

When to use a chi-square

Not clear about when you should use a chi-square vs. when to use a t test? 

First, you should check out my free, downloadable PDF, A Practical Guide to Psych Stats.

Now that that's out of the way: if you're still not sure, how about a tasty example? 
Let's say that we want to know whether a bag of Original Skittles has a truly random distribution of colors. If so, we’d expect to find roughly equal numbers of red, green, purple, yellow, and orange Skittles, right? 

A chi-square goodness-of-fit test [that is, a one-variable chi-square] can help us evaluate this. If there are 18 red, 13 green, 18 purple, 19 yellow, and 17 orange, the chi-square goodness-of-fit test tells us whether this distribution is different enough from an even distribution of 17 apiece (85 Skittles / 5 colors) that we can reject the notion that the colors are evenly distributed. 
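Those made-up Skittle counts can also be checked by hand. This sketch computes the goodness-of-fit statistic directly; the p-value uses the closed-form chi-square survival function, which is available here because the degrees of freedom (5 colors − 1 = 4) happen to be even:

```python
import math

observed = [18, 13, 18, 19, 17]            # red, green, purple, yellow, orange
expected = sum(observed) / len(observed)   # 85 Skittles / 5 colors = 17 apiece

# Goodness-of-fit statistic: sum of (observed - expected)^2 / expected
chi2 = sum((o - expected) ** 2 / expected for o in observed)

# For even df, P(X >= x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
df = len(observed) - 1
half = chi2 / 2
p = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
print(chi2, p)
```

Here the statistic works out to about 1.29 with p ≈ .86, nowhere near significance, so we can't reject the idea that the colors are evenly distributed.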

If you're really curious about my made-up numbers, by the way, here's a straightforward, easy-to-use online calculator to help you: http://www.socscistatistics.com/tests/goodnessoffit/Default2.aspx

***
Now, let's say we’re looking for differences in the proportion of red Skittles to the other colors in a bag of Original vs. a bag of Tropical Skittles. 



In this case, we have two categorical variables [bag type: Original vs. Tropical, and Skittle color], so we would need a chi-square test for independence. The additional variable makes the calculation a little more complex (but not if you use statistical software to handle the dirty work! 😊), but ultimately we're asking much the same question as before: is the distribution of colors the same in each bag?
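As a sketch of the independence version (the Tropical counts below are entirely invented; the Original counts reuse the example above), we build the expected count for each cell from the row and column totals and compare:

```python
import math
import numpy as np

# Rows: bag (Original, Tropical); columns: red, green, purple, yellow, orange
table = np.array([
    [18, 13, 18, 19, 17],
    [25, 14, 16, 15, 16],   # made-up Tropical counts
])

# Expected count for each cell = row total * column total / grand total
row_totals = table.sum(axis=1, keepdims=True)
col_totals = table.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / table.sum()

chi2 = ((table - expected) ** 2 / expected).sum()
df = (table.shape[0] - 1) * (table.shape[1] - 1)  # (2 - 1) * (5 - 1) = 4

# Same even-df survival function as in the goodness-of-fit example
half = chi2 / 2
p = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
print(chi2, df, p)
```

With these made-up numbers the test again comes out nonsignificant, i.e., no evidence that color proportions differ between the two bags.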

