Fearless Psychological Science: psych stats

Tuesday, May 29, 2018

Lies, Damned Lies, and Statistics

In an interesting post, Michael Batnick, the Irrelevant Investor, makes a critical point about the oft-overlooked limitations of data in the world of behavioral finance: http://theirrelevantinvestor.com/2018/04/04/the-limits-to-data/

“Using Excel shows you how a robot should allocate its lottery winnings.
It doesn't show you that 70% of human lottery winners go bankrupt.”

Darwin famously didn't trust complicated mathematics ("I have no faith in anything short of actual measurement and the Rule of Three," he wrote in a letter). He wasn't wrong: complex procedures can obscure what's going on 'under the hood.' This can render a formula's weaknesses virtually invisible.

Have you heard about the studies showing that irrelevant neuroscientific information in a research summary makes people rate the conclusion as more credible? The same seems to go for math—when people see some complex, technical information, they'd often rather just believe it instead of thinking critically.

By Signe Wilkinson, for the Philadelphia Daily News
http://www.bcps.org/offices/lis/researchcourse/images/statisticsirony.gif

Multiple Regression Explained

How to interpret multiple regression

Regression is useful for making a predictive model. Let's say there's a positive linear correlation between Factor K and Outcome N, but you suspect that Factors L and M also contribute to Outcome N.

http://www.statsoft.com/textbook/graphics/anima4.gif

Make up a story—say, that Factors K, L, and M represent intelligence, persistence, and amount of sleep per night and N refers to a course grade.

So, to test the relative impacts of Factors K, L, and M on Outcome N, you can feed each factor into a regression model, and test whether each factor increases the fit. For example, say that a correlation between Factor K and Outcome N yields a Pearson's r of .64, and therefore an R² of .4096 (R² is simply r squared when there is only one predictor).

But, when you run a regression testing the effect of Factors K and L on Outcome N, you find an R² of .5625, with a significant change in the R² value. That means that Factors K and L together do a better job of explaining Outcome N than Factor K alone.

Then, you run a regression with Factors K, L, and M together, and find an R² of .5929, with no significant change. This means that Factor M does not add much explanatory power beyond Factors K and L. Outcome N is explained mostly by Factors K and L; Factor M is an unimportant predictor of Outcome N.
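For readers who want to see where a "significant change" verdict comes from, here is a minimal sketch of the F test for a change in R-squared between nested models. The R-squared values are the ones from the story above; the sample size of 30 is purely an assumption made for illustration (the example never states one), and the function name f_change is hypothetical.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the change in R-squared between two nested models.

    n is the sample size; k_reduced and k_full are the numbers of
    predictors in the smaller and larger models.
    """
    df_num = k_full - k_reduced          # predictors added
    df_den = n - k_full - 1              # residual df of the larger model
    f = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f, df_num, df_den


N_STUDENTS = 30  # ASSUMPTION: the example above does not give a sample size

r2_k = round(0.64 ** 2, 4)   # Pearson's r of .64, squared -> .4096
r2_kl = 0.5625               # Factors K and L
r2_klm = 0.5929              # Factors K, L, and M

f_add_l, *df_l = f_change(r2_k, r2_kl, N_STUDENTS, 1, 2)
f_add_m, *df_m = f_change(r2_kl, r2_klm, N_STUDENTS, 2, 3)

print(f"Adding L: F({df_l[0]}, {df_l[1]}) = {f_add_l:.2f}")
print(f"Adding M: F({df_m[0]}, {df_m[1]}) = {f_add_m:.2f}")
```

With this assumed sample size, the critical F at the .05 level is roughly 4.2 for both tests, so adding Factor L clears the bar and adding Factor M does not, which matches the story. With a much larger sample, the same small bump in R-squared from Factor M could come out significant, so the verdict depends on the sample size as well as on the R-squared values.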

Voilà! There's regression in a nutshell!

And, if you're confused about the math... remember in middle school or high school math, when you learned about "rise over run" and learned the formula y = mx + b? Yeah, that's a simple linear regression. With multiple regression, you can add multiple terms, such that y = ax₁ + bx₂ + cx₃ + ... + z, where z is the intercept (the job that b did in the simpler formula). But it's still the same concept, just with more predictors than that lone "mx" term.
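If it helps to see "rise over run" in action, here is a small sketch that fits y = mx + b by ordinary least squares and then shows that a multiple-regression prediction is just a longer weighted sum. The data are made up, and the function names fit_line and predict are hypothetical, chosen only for this illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope (m) and intercept (b) for y = mx + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def predict(coefficients, intercept, predictors):
    """Multiple-regression prediction: a weighted sum plus an intercept."""
    return sum(c * x for c, x in zip(coefficients, predictors)) + intercept


# Made-up data, for illustration only
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

m, b = fit_line(x_values, y_values)
print(f"y = {m:.1f}x + {b:.1f}")

# With a single predictor, the multiple-regression formula collapses to mx + b
print(predict([m], b, [3]))
```

The predict function takes any number of coefficients, so going from one predictor to three changes the length of the lists, not the idea.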

In case you missed it, there are some fantastic, easy-to-use, and FREE stats programs available now! I review them here.

For more help explaining statistical concepts and when to use them, please download my freely available PDF guide here!

https://drive.google.com/open?id=0B4ZtXTwxIPrjUzJ2a0FXbHVxaXc

Saturday, August 5, 2017

When to use a chi-square

Not clear about when you should use a chi-square vs. when to use a t test?

First, you should check out my free, downloadable PDF, A Practical Guide to Psych Stats.

Now that that's out of the way—if you're still not sure, how about a tasty example?

https://68.media.tumblr.com/df823e9a5429bc683f0d56b95b8d8d80/tumblr_oi945q8yMm1tsozubo1_1280.gif

Let's say that we want to know whether the colors in a bag of Original Skittles are evenly distributed. If so, we'd expect to find roughly equal numbers of red, green, purple, yellow, and orange Skittles, right?

A chi-square goodness-of-fit test [that is, a one-variable chi-square] can help us evaluate this. If there are 18 red, 13 green, 18 purple, 19 yellow, and 17 orange, the chi-square goodness-of-fit test tells us whether this distribution is different enough from an even distribution of 17 apiece (85 Skittles / 5 colors) that we can reject the notion that the colors are evenly distributed.

If you're really curious about my made-up numbers, by the way, here's a straightforward, easy-to-use online calculator to help you: http://www.socscistatistics.com/tests/goodnessoffit/Default2.aspx
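The made-up Skittles counts above can also be checked by hand. Here is a minimal sketch using only Python's standard library; the closed-form p-value shortcut it uses is valid for 4 degrees of freedom specifically, which happens to be what five colors give us.

```python
import math

observed = {"red": 18, "green": 13, "purple": 18, "yellow": 19, "orange": 17}

total = sum(observed.values())          # 85 Skittles
expected = total / len(observed)        # 17 per color if evenly distributed

chi_square = sum((count - expected) ** 2 / expected
                 for count in observed.values())
df = len(observed) - 1                  # 5 colors -> 4 degrees of freedom

# For df = 4 the chi-square upper-tail probability has a closed form:
# p = exp(-x/2) * (1 + x/2). This shortcut applies to df = 4 only.
half = chi_square / 2
p_value = math.exp(-half) * (1 + half)

print(f"chi-square({df}) = {chi_square:.2f}, p = {p_value:.2f}")
```

The statistic works out to about 1.29 with a p-value of about .86, so these counts give us no reason to reject the idea that the colors are evenly distributed.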

***

Now, let's say we’re looking for differences in the proportion of red Skittles to the other colors in a bag of Original vs. a bag of Tropical Skittles.

https://upload.wikimedia.org/wikipedia/en/7/7b/Skittles-Tropical-Small.jpg

In this case, we have two categorical variables [type of Skittles: Original vs. Tropical; and color: red vs. not red], so we would need a chi-square test for independence. The additional variable makes the calculation a little more complex (but not if you use statistical software to handle the dirty work! 😊), but ultimately, we're asking a similar question as before: is the proportion of red Skittles roughly the same in each type of bag?
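To make the two-variable version concrete, here is a sketch of the chi-square test for independence on a 2 x 2 table, again with only the standard library. The Original counts reuse the 18 red out of 85 from the earlier example; the Tropical counts are invented for illustration, and no continuity correction is applied.

```python
import math

# Rows: bag type; columns: (red, not red).
# Original counts come from the earlier example (18 red out of 85);
# the Tropical counts are MADE UP for illustration.
table = {
    "Original": (18, 67),
    "Tropical": (10, 75),
}

row_totals = {bag: sum(counts) for bag, counts in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

chi_square = 0.0
for bag, counts in table.items():
    for observed, col_total in zip(counts, col_totals):
        # Expected count if bag type and color were unrelated
        expected = row_totals[bag] * col_total / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(col_totals) - 1)   # (2 - 1) * (2 - 1) = 1

# For df = 1 the upper-tail probability is erfc(sqrt(x / 2)).
# No continuity (Yates) correction is applied in this sketch.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square({df}) = {chi_square:.2f}, p = {p_value:.3f}")
```

With these invented counts, the statistic is about 2.74 and the p-value lands just under .10, so at the usual .05 cutoff we would not conclude that the proportion of red differs between the two bags.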

Thursday, April 6, 2017

A replacement for SPSS?

Could this program be the end of SPSS?

I have previously recommended JASP as a useful—and free!—statistical software package. I stand by that recommendation (nay, I'm doubling down on it!) as JASP has the following advantages:

A slick, easy-to-grasp user interface
All of the major types of statistical test, including one-sample, repeated-measures, and independent-samples t tests, ANOVA, ANCOVA, correlation, regression, and even the chi-square test for independence [i.e. the two-variable chi-square]. It even has a module for structural equation modeling, for those who conduct such analyses!
Bayesian analogues to each of the above tests
A simple, one-click method to run these tests, which makes it an ideal instructional tool (and useful for many basic research needs as well).
It's a no-cost, open-source, cross-platform (Windows, MacOS, Linux) program, so there are zero barriers to personal use.
It launches pretty quickly, and runs extremely fast—even on low-powered computers.

JASP Screenshot from my own personal computer

This image is freely available for use; just cite http://psychsci.blogspot.com/2017/04/a-replacement-for-spss.html

The recent [March 21, 2017] release of version 0.8.1.1 has made JASP even more useful than it was in the past! Here's the latest major change to the program:

Data synchronization that (finally!) allows you to edit your data from within the program itself. You can sync a .csv file, .sav file, or .ods [LibreOffice spreadsheet] file.

By the way, LibreOffice is a great, free alternative to Microsoft Office. I encourage you to check it out and shed the shackles of expensive, closed-source software!

Now that you can edit the data in a window in the statistical program itself (via data synchronization, which can be turned off if you so desire), and since a previous build allowed users to integrate JASP output with their OSF page, I think that JASP has finally become good enough to provide many researchers with all the statistical capability they need!

SPSS can still perform some of the more esoteric/advanced statistical procedures that JASP cannot, such as multi-level modeling. But since such procedures tend to be used relatively infrequently (at least in experimental social science research such as my own), JASP can probably handle the bulk of your analytical load.

Further, as a stats instructor, this tool is my secret weapon! I am encouraging students to use this program for an APA-style paper for which they have to run a handful of analyses to answer different questions.

This semester, I asked my Stats students how they felt about SPSS, and they generally weren't too fond of the program due to its complexity, pickiness, and uninformative error messages (not to mention an appearance that's stuck in the 1990s).

After I demonstrated JASP in class, the students seemed far more impressed with the free and open JASP than they were with the costly SPSS!

EDIT 5/30/2018: Just discovered that JASP is now available as an online resource, according to this blog post on the JASP website and a Tweet by E.J. Wagenmakers. You have to sign up for a RollApp account if you want to use this option.

And, if for some reason you're not a fan of JASP, a similar (and also free!) option called jamovi is under development. You can type data straight into this one, whereas I don't believe that's an option yet in JASP. jamovi lacks some features that JASP incorporates, but it's still a nice stats program for use in the classroom (or for your research)!

jamovi screenshot from my own personal computer

This image is freely available for use; just cite http://psychsci.blogspot.com/2017/04/a-replacement-for-spss.html

Should IBM be worried about SPSS adoption rates? Maybe...

Intrigued? Here is the website; you can download JASP by clicking the "Download" tab and selecting the version that's appropriate for your operating system.

Or you can just follow this link instead. Hey, what do you have to lose?

Try out this free stats program and see if it meets your needs! If it doesn't, you can just uninstall it...

If you'd like to try jamovi, here's the link for that.

Monday, February 27, 2017

Stat-ception II: How to fix statistics in psychology

Stat-ception Part II

I'm a star!

OK, my public speaking skills may not exactly have made me a star (yet!), but I AM on YouTube! I've included a link to my recent (Feb 2017) Cognition Forum presentations, which cover the weaknesses of current statistical practice in psychology as well as my current thinking about easily and immediately implementable solutions to ameliorate those weaknesses.

The first video goes into depth about the issues; the second describes my proposed solutions to those problems.

https://www.youtube.com/playlist?list=PLvPJKAgYsyoKcGOCKEYT2GyzK0yLVXvzN

For your viewing pleasure, I've also embedded the videos here:

Any feedback or advice is welcome!

I've also made the slideshows available on Google Drive. Here's the link to the first slideshow, so you can follow along: https://drive.google.com/file/d/0B4ZtXTwxIPrjTktiMGdoQ3JBSHM/view. And here's the link to the slideshow for the second video as well: https://drive.google.com/file/d/0B4ZtXTwxIPrjalZxdFJfUWNKTVU/view?usp=sharing

A draft of my manuscript on the topic (intended for eventual publication) is freely available for download at https://osf.io/preprints/psyarxiv/hp53k/. Since I'm an advocate of the open science movement, it's only right that I make my own work publicly available, which is why I uploaded these videos (and my manuscript) to public repositories.

You may not trust my own take on these issues, in which case I commend you for your skepticism! In the videos, I made numerous references to Ziliak & McCloskey (2009), Gigerenzer (2004), and Open Science Collaboration (2015)--all are worth reading, for anyone who cares about scientific integrity and the research process. All three works were highly influential in my thinking on this topic, though I cited a variety of other papers as well in my aforementioned manuscript.

You may disagree with my recommendations in the second video, and if so, that's okay! How to address the limitations of NHST and fix science is absolutely a discussion worth having; I advance my own ideas in the spirit of jump-starting such a discussion.

So, please put your thoughts in the comments, and share my work with colleagues who may be interested in the topic!

Monday, February 20, 2017

Stat-ception: Everything you think you know about psych stats is wrong!

In the spirit of open science, I have posted a video of a talk on statistical practice that I gave in the Cognition Forum at Bowling Green State University.

This talk was in 2 parts; the first part summarizes many of the common objections to null hypothesis significance testing (NHST) that thinkers have made over the decades, and the second part goes over my current recommendations to tackle the problem.

Part I is available at https://youtu.be/JgZZkMJhPvI; Part II is forthcoming! I've also embedded the video right here:

You can view and download the full slideshow at https://drive.google.com/open?id=0B4ZtXTwxIPrjTktiMGdoQ3JBSHM. The free (and very easy-to-use!) statistical program JASP can be found at https://jasp-stats.org/. JASP is useful if you want to run the analysis on the precision-vs-oomph example that I discuss at the end of the video (at the 39:41 mark).

I have already tackled some of the issues with NHST on more than one occasion in prior posts here, and I have also provided a practical guide to psych stats as a freely available educational resource!

There are a variety of excellent papers on the topic of statistical practice in social science fields; my working paper on the subject summarizes them. In the interest of open science, I've made this working paper available at https://osf.io/preprints/psyarxiv/hp53k/. Other great resources on the topic include Gigerenzer (2004) and Ziliak & McCloskey (2009), which are also freely available.
