Date: February 3, 2015 at 12:55:59 PM EST
Subject: Sample Stats Question
From: ** <**@gmail.com>
Hello,
I hope all is well and trust that you are having a great day so far. I hate to bother you but I have a stats question that I need help with: How can you tell which group has the best readers when they have the following information:
Group A- 130, 140, 170, 170, 190, 200, 215, 225, 240, 250
Group B- 188, 189, 193, 193, 193, 194, 194, 195, 195, 196
Group A-mean (193), median (195), mode (170)
Group B- mean (193), median(193.5), mode (193)
Why?
This is for my own personal use and understanding of this subject matter so anything you could say and redirect me would be greatly appreciated.
Any feedback that you could give me to help understand this better would be greatly appreciated.
Thanks,
Right up there with the “nothing works” emails from students, though I suppose in that case at least you know it has something to do with a course you’re teaching.
I work as a programmer. I get emails such as “Something doesn’t work” all the time.
After reading another great addition to the continuing saga of weird emails, I received this email the instant I clicked over to my email. Are you playing a trick on us, when we click the link you mail us a random message? That aside, if you’ve got game, I’ll forward the message.
“Hello,
We are Forestville Basketball team in Australia,We are looking for Point guard,Shooting guard,Small forward,Power forward and Center with small to mid sized contracts($1200-$6000/per month)Interested players should forward game highlight or film.
We are interested to hear from any players who can add to the strength of our Men and Women. The Australian Basketball office will be pleased to assist players and parents with the complexities.Please get back to us We will contact you on receipt.
Regards
Mr R Dikel.
Team Manager.”
Are you required to have a “hot streak” to apply? Or must you be a believer?
The only interesting thing about this scam is figuring out how it works. Probably, this is a variation on the ‘shipping cost’ scam, with your airline ticket instead (you pay, they “reimburse” you with a stolen credit card — which you don’t find out about until weeks later).
Yeah this has to be a scam. I mean, why would you need to recruit a Center when you have 6’10” Pero Vasiljavic? That guy owns the paint.
http://basketball.australiabasket.com/team/Australia/Forestville_Eagles/3967
Goooooooooo Eagles!!!!1!
Ignoring the point of this post and just focusing on the question, it seems really stupid. Assuming the numbers represent reading scores, then the best five readers (out of twenty in both groups) are all in Group A. So, “which group has the best readers”? Group A has _all_ of the best readers.
Doesn’t look like a random sample. Rank ordered, group A has the five best plus the five worst, while group B has the middle ten. You can even eliminate the assumption, since if scores represent better reading or poorer reading, group A still has all of the extremes–no context needed.
…err almost.
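The ranking claim is easy to check directly. The following is a minimal sketch, assuming the numbers are reading scores where higher is better (the email never says so); the variable names are only for illustration.

```r
# Scores from the email; ASSUMPTION: higher score = better reader
a <- c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250)
b <- c(188, 189, 193, 193, 193, 194, 194, 195, 195, 196)

pooled <- data.frame(
  group = rep(c("A", "B"), each = 10),
  score = c(a, b)
)

# Sort all twenty readers from highest to lowest score
ranked <- pooled[order(pooled$score, decreasing = TRUE), ]

top5    <- head(ranked, 5)  # five highest scores
bottom5 <- tail(ranked, 5)  # five lowest scores

print(top5)
print(table(bottom5$group))
```

The five highest scores all come from Group A. Four of the five lowest do too, with a single Group B score (188) in the bottom five, which is where the "almost" comes from.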
> You can even eliminate the assumption, since if scores represent better reading or poorer reading, group A still has all of the extremes–no context needed.
I noticed that too. I suspect that’s supposed to be the clever aspect of it.
If ever there was a situation that calls for a KS test, this is it (though I suspect there aren’t enough n to differentiate the samples–maybe an exact KS test (there must be such a thing)).
KS? Oh yuck. Just do BEST.
For the good of my soul I thought I would run it through BEST. There was no credible difference. The “Distribution – Difference of Means” graph was roughly symmetric (zero was in the “95% Highest Density Interval”, so I guess it is “credible”). Also, the “hairy caterpillar” graph was “hairy” (by my definition–I guess once you start allowing anyone to impose their private beliefs on a statistical procedure, Rorschach interpretation of vague shapes is acceptable).
I also ran a ks test:
Two-sample Kolmogorov-Smirnov test
data: d1 and d2
D = 0.5, p-value = 0.1641
alternative hypothesis: two-sided
Warning message:
In ks.test(d1, d2) : cannot compute exact p-value with ties
which confirmed my suspicion that there would not be sufficient power to distinguish the groups at the .05 level. A more exact test might, however, which was my original thought. BEST, while not clearly WORST, appears to have no real advantages for this problem.
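For anyone who wants to rerun the KS test above, here is a sketch using the scores from the email. The names a and b are my own (the output above used d1 and d2), and exact = FALSE is my choice: it requests the asymptotic p-value, which is what R falls back to when ties rule out the exact one. The statistic D is also recomputed by hand as the largest gap between the two empirical CDFs.

```r
a <- c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250)
b <- c(188, 189, 193, 193, 193, 194, 194, 195, 195, 196)

# Two-sample KS test with the asymptotic p-value.
# suppressWarnings() guards against tie warnings, which vary by R version.
ks <- suppressWarnings(ks.test(a, b, exact = FALSE))
print(ks)

# D by hand: the largest vertical gap between the two empirical CDFs,
# evaluated at every observed score
Fa <- ecdf(a)
Fb <- ecdf(b)
grid <- sort(unique(c(a, b)))
D_by_hand <- max(abs(Fa(grid) - Fb(grid)))
print(D_by_hand)
```

The gap peaks at a score of 196, where all of Group B but only half of Group A has been counted.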
Look at what BEST shows for the difference of SDs. (I’d rather see the difference of log-SDs, but then, I’m not John Kruschke.)
The difference of SD’s is well within the credibility interval, though the distribution does have a somewhat longer right-hand tail. The dynamic of this exchange has become typical of discussions I’ve had with Bayesians–have you tried this (yes, didn’t work), well, what about this (yes, it didn’t work). But there are indications (there always are–Andrew’s blog is devoted to researchers finding indications and running with them in almost a reductio ad absurdum manner).
My overall point is that while Andrew’s email was without context it really asks a very fundamental statistical question–when are two distributions different? Is BEST really better than a KS test–this simple example suggests not.
Maybe you’re having trouble reading the BEST graphs? My run yielded
SD Group 1:
Mean: 46.6, 95% HDI (23.9, 75.0)
SD Group 2:
Mean: 2.87, 95% HDI (1.27, 4.70)
Difference of SDs:
Mean: 43.8, 95% HDI (21.3, 72.2)
BEST clearly indicates that the two groups have very different dispersions. Not sure how you missed that…?
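As a rough sanity check that needs no MCMC, the plain sample standard deviations can be computed directly. This is only a sketch and not a substitute for BEST: BEST fits t distributions to each group, so its posterior summaries for the scale need not equal the sample SDs. The variable names are my own.

```r
a <- c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250)
b <- c(188, 189, 193, 193, 193, 194, 194, 195, 195, 196)

# Plain sample standard deviations (n - 1 denominator)
sd_a <- sd(a)
sd_b <- sd(b)

print(c(sd_a = sd_a, sd_b = sd_b, difference = sd_a - sd_b))
```

Even this crude summary puts the two spreads more than an order of magnitude apart.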
You’re right about the difference of SD’s–I read it correctly but misinterpreted it (for some reason I didn’t think that the mean of differences should be zero–my mind is going). Obviously simulation techniques give insight into the problem, but I ran Bartlett’s test and got
Bartlett test of homogeneity of variances
data: x and g
Bartlett’s K-squared = 35.2465, df = 1, p-value = 2.905e-09
which also shows the difference in spread.
As I recall the KS test does have low power, but this is a pretty stunning display of that.
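The Bartlett result quoted above can be reproduced with a few lines. This is a sketch using the scores from the email; the names scores and g are my own stand-ins for the x and g in the output.

```r
a <- c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250)
b <- c(188, 189, 193, 193, 193, 194, 194, 195, 195, 196)

scores <- c(a, b)
g <- factor(rep(c("A", "B"), each = 10))  # group labels

# Bartlett's test of equal variances across the two groups
bt <- bartlett.test(scores, g)
print(bt)
```

Keep in mind that Bartlett's test is sensitive to departures from normality, so the tiny p-value is best read alongside a plot of the data.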
(Of course I ran BEST before I pointed you to it — it’s trivial to do thanks to the fully awesome Rasmus Bååth. I was laconic because I thought it would be obvious that the information one gets from BEST is much richer than from the KS test.)
numeric:
> As I recall the ks test does have low power but this is a pretty stunning display of that.
I am always unsure how folks get type one error and power without knowing the groups were randomized?
That is, selective samples from the same population won’t have the same distribution – so that can’t be the Null?
If it’s taken as the Null, the distribution of the p-values will be very far from Uniform(0,1).
If the Null is defined on the basis that both samples are from the same population (essentially taken as or deemed random samples from the same distribution) the amount of actual selection impacts the alternative. So what is the alternative?
If it’s just the impact of selective sampling, why would one assume anything remotely additive as might be represented by a shift in the mean?
Given one is usually interested in an effect of something after selection of the sample, how would the impact of selective sampling be _backed out_ (i.e. it could increase or decrease power)?
But without a well defined alternative, one can’t calculate power.
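To make the point concrete: power only becomes computable once an alternative is written down. The sketch below simply assumes one, namely two normal populations with a common mean and very different standard deviations, sampled at random. Every number in it (mean, SDs, number of simulations, seed) is an assumption chosen for illustration, not something the email supports.

```r
set.seed(1)    # arbitrary seed, for reproducibility only

n      <- 10   # per-group sample size, as in the email
n_sims <- 1000 # number of simulated data sets (arbitrary)
mu     <- 193  # ASSUMED common population mean
sd_a   <- 40   # ASSUMED SD of the wide group
sd_b   <- 2.5  # ASSUMED SD of the narrow group

# Simulate one pair of samples and record whether each test rejects at .05
one_sim <- function() {
  x <- rnorm(n, mu, sd_a)
  y <- rnorm(n, mu, sd_b)
  g <- factor(rep(c("A", "B"), each = n))
  c(ks       = ks.test(x, y)$p.value < 0.05,
    bartlett = bartlett.test(c(x, y), g)$p.value < 0.05)
}

rejections <- replicate(n_sims, one_sim())  # 2 x n_sims logical matrix
power <- rowMeans(rejections)               # estimated rejection rates
print(power)
```

Under this assumed alternative the KS test rejects far less often than Bartlett's test. Change the assumed alternative (or drop the random-sampling assumption) and the numbers change with it, which is the whole point.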
(OK now I am waiting for Andrew to post a section from the NY telephone directory to see how many comments that generates.)
We all agree that the centers are about the same but the spreads are different. And clearly the maximum and upper ranks are in group A. And the questioner most likely understands that. But they are in a situation where only the measures of central tendency are asked for. I would present the 75th percentile which in this case is 225 for group A and 195 for group B. This is the upper line of the box on the box plot and gives a sense of the ‘top of the middle.’
This problem is not without context. People in education and healthcare are now asked to describe things quantitatively with few legacy tools or personal understanding. Clearly these are from two different types of classes or methods. I had to do this once with a biology problem and the scientist found it the best way to describe his specimens of particulates.
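The "upper line of the box" values above can be reproduced with Tukey's hinges, which is what R's box plots use. A small sketch, with variable names of my own:

```r
a <- c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250)
b <- c(188, 189, 193, 193, 193, 194, 194, 195, 195, 196)

# fivenum() returns min, lower hinge, median, upper hinge, max;
# the 4th element is the top edge of the box in a box plot
upper_hinge <- c(A = fivenum(a)[4], B = fivenum(b)[4])
print(upper_hinge)

# quantile() interpolates by default, so its 75th percentile
# can differ slightly from the hinge
print(sapply(list(A = a, B = b), quantile, probs = 0.75))

# The box plot itself
boxplot(list(A = a, B = b), horizontal = TRUE, xlab = "score")
```

The hinge and the interpolated percentile are both reasonable summaries; they just follow different conventions for where the quartile falls between two data points.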
I feel like this ignores the question. The question is, “which group has the best readers.” The best readers are in Group A. There is nothing to be gained by trying to characterize the distributions the groups are drawn from.
The sender could well be so statistically inexperienced that s/he would benefit from someone pointing out that dispersion is a thing.
Group A has the best best reader, and group B has the best worst reader.
This post made my day.
> Any feedback that you could give me to help understand this better would be greatly appreciated.
Heh. You’re pretty much at liberty there.
suppose the ‘scores’ represent the number of words the reader recalled reading 10 minutes after completing a passage. if the correct answer is 193, then group B has the ‘best’ readers.
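Under that reading, "best" means closest to the target rather than highest. A sketch, taking 193 as the hypothetical correct answer from the comment above:

```r
a <- c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250)
b <- c(188, 189, 193, 193, 193, 194, 194, 195, 195, 196)

target <- 193  # HYPOTHETICAL "correct" value; not stated in the email

# Mean absolute distance from the target, per group
mean_abs_error <- c(A = mean(abs(a - target)),
                    B = mean(abs(b - target)))
print(mean_abs_error)
```

Group B misses the target by under 2 on average, against 33 for Group A.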
it’s pretty likely this was the process used to generate this data…
Anscombe, anyone?
R code
library(ggplot2)
# Reading scores for the two groups, as given in the email
dd1 <- data.frame(
  group = factor(rep(c("a", "b"), each = 10)),
  score = c(130, 140, 170, 170, 190, 200, 215, 225, 240, 250,
            188, 189, 193, 193, 193, 194, 194, 195, 195, 196)
)
# One row of points per group makes the difference in spread obvious
ggplot(dd1, aes(score, group, colour = group)) +
  geom_point()
jrkrideaku:
That just spoils the fun and we would have missed out on the insightful comments.
When folks start running named tests and recalling power rumors, I just want to reach for a graph or a simulation.
(Actually ran Anscombe’s Quartet of ‘Identical’ Simple Linear Regressions (it’s in R) and distributed to folks to inoculate them against being bullied by output from fancy procedures.)
(Actually ran Anscombe’s Quartet of ‘Identical’ Simple Linear Regressions (it’s in R) and distributed to folks YESTERDAY to inoculate them against being bullied by output from fancy procedures.)
Maybe I should graph my comments before posting.
which fancy procedures did you use on anscombe’s quartet?
Not sure what you are asking but
A description of Anscombe’s quartet is here http://en.wikipedia.org/wiki/Anscombe%27s_quartet
Now, it’s often easier to convince people of problems that clearly arise in simple situations, so showing the problem with simple linear regression may well be more effective than showing it in, say, multivariate repeated measures analysis of sparse data.
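For anyone who wants to try the quartet, it ships with R as the anscombe data set. A minimal sketch of the demonstration (the fitting and plotting choices here are mine, not necessarily what was handed out):

```r
data(anscombe)  # columns x1..x4 and y1..y4, from the datasets package

# Fit the same simple linear regression to each of the four x/y pairs
fits <- lapply(1:4, function(i) {
  lm(as.formula(paste0("y", i, " ~ x", i)), data = anscombe)
})

# Collect intercepts and slopes into one small table
coefs <- t(sapply(fits, function(f) unname(coef(f))))
dimnames(coefs) <- list(paste0("set", 1:4), c("intercept", "slope"))
print(round(coefs, 3))

# The fitted lines nearly coincide; the scatterplots do not
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
  abline(fits[[i]])
}
par(op)
```

The coefficient table looks like four copies of the same regression, and only the plots show how different the four data sets are.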
Mea culpa, mea culpa.
How can you tell which group has the best readers? As is, you can’t. Without the context and more information, you can’t tell a thing. For instance, we’re all making the assumption that the scores are actually reading scores. That’s actually an assumption. They could be lat/lon sets for all we know :-)
I think this is a great question for an advanced course, with the right answer being “what the hell is going on here?”
A bunch of intelligent answers to a stupid question…
I mean, the group that got an A obviously did better than the group that got a B.
#AgainstTheGrain