Walk a Crooked Mile

Posted on December 24, 2017 9:50 AM by Andrew

An academic researcher writes:

I was wondering if you might have any insight or thoughts about a problem that has really been bothering me. I have taken a winding way through academia, and I am seriously considering a career shift that would allow me to do work that more directly translates to societal good and more readily incorporates quantitative as well as qualitative methodologies. Statistics have long been my “intellectual first language”—I believe passionately in the possibilities they open, and I’d love to find a place (ideally outside of the university) where I could think more about how they allow us to find beauty in uncertainty.

The problem is that, as you know, many sectors seem to apply quantitative methods of convenience rather than choosing analytical frames that are appropriate to the questions at hand. Even when I was in psychology I was bothered by the widescale application of parametric tests and small sample sizes to questions where they did not seem appropriate, and your writings suggest to me not only that I was right in sensing a problem in certain research trends, but that its magnitude is now being felt (with increasing horror) across the sciences. Yet when I look at, for example, programs in public health or epidemiology, much of the quantitative training still seems to concentrate on the same old methods and p-testing that seem so problematic.

So, I suppose I have 2 questions for you, one general and one specific. First, what advice do you have for the “quantitatively mindful and concerned” who are trying to negotiate an increasingly chaotic datascape? Do you have advice for, for example, junior researchers who have to juggle learning all these new methodologies with the tenure and advancement expectations set by an older generation who may be more comfortable with research paradigms that are now harder to justify than they were 25 years ago? And second, do you see any areas of social or biomedical research that you think have done a good (or at least better) job incorporating good data and statistical practices, where non-normal distributions and strange relationships between samples and populations are acknowledged and addressed openly?

My reply: I’m not sure how hard it is to use innovative methods. I remember when I started doing applied statistics, thirty years ago, that people warned me that subject-matter experts might not trust anything Bayesian. But that hasn’t happened to me, at least not directly. I mean, sure, the world is full of buggy-whip operators and ignorant haters. There’s always gonna be someone invoking non-existent data or non-existent principles as a way of telling you what to do, or what not to do. But ultimately it’s a live-and-let-live world out there. The buggy-whip guy, in all his clueless and pontificating ways, still wasn’t trying to stop other pollsters from using MRP; it’s not like he got an injunction against YouGov to stop them from winning 2017. And the “no Bayesians in foxholes” guy, ok, sure, he did try to get me fired, but he didn’t try to stop statistics journals from publishing my work. Even with all his attitude problems, he recognized that researchers gonna research, and he didn’t get in the way of me plying my trade once I’d done him the favor of no longer hanging around his workplace.

So, I guess what I’m saying is, use the methods that you think will best solve your problem, and stay focused on the three key leaps of statistics:
– Extrapolating from sample to population
– Extrapolating from control to treatment conditions
– Extrapolating from observed data to underlying constructs of interest.
Whatever methods you use, consider directly how they address these issues.

P.S. These Westlake titles continue to work!

17 thoughts on “Walk a Crooked Mile”

Dale Lehman on December 24, 2017 at 10:05 am said:

This does not answer either of the 2 questions, but these are good places to look if you (or any readers) want to move into using data for “good” purposes:

https://dssg.uchicago.edu/
http://www.datakind.org/

Bob Carpenter on December 24, 2017 at 10:43 am said:

My advice is always the same. Find a good group to work with – good peers and at least one good mentor – that’s how you learn and build cool things. That’s what grad school choice is about. It’s also what drew me out of industry and back into academia to work with Andrew; other great colleagues and a really fun project are what’s kept me here.

Academia’s often a generational fight. And often simultaneously a clan fight. Different institutions (departments, journals, conferences) play by different rules, as do different generations. Basically, different groups of people adopt different aesthetics, goals, and conventions. I had a much worse time than Andrew with paradigm suppression (as in NSF program managers refusing to send grants to reviewers on the basis that we weren’t doing linguistics). Luckily, I had a much better time at my local institution (computational approaches to linguistics were on the rise and seeing more and more real applications, and Carnegie Mellon is nothing if not sympathetic to applying computation to a problem). I saw others like me go into more hostile “traditional” departments and basically have experiences like Andrew’s.

- Thanatos Savehn on December 24, 2017 at 11:43 am said:
  
  Confirmation, perhaps, of your observation: http://www.nber.org/papers/w21788
  
- Keith O'Rourke on December 25, 2017 at 7:04 am said:
  
  > Find a good group to work with—good peers and at least one good mentor—that’s how you learn and build cool things.
  Definitely agree – all else is manageable with the good group and un-manageable with the not so good group.
  
  - Sameera Daniels on December 25, 2017 at 8:10 am said:
    
    Keith, does the term ‘surprisal’ have any relevance in statistics? I saw it used in Sander Greenland’s recent letter about the use of ‘statistical significance’. I plan to ask Sander as well.
    
    - Corey on December 25, 2017 at 10:39 am said:
      
      https://en.wikipedia.org/wiki/Self-information
    - Sameera Daniels on December 25, 2017 at 10:47 am said:
      
      Thank you. ‘Surprisal’ seems to have many connotations and uses in different disciplines. For example, defined as a ‘surprise’, act of surprise, or state of being surprised. That is to say even non-technical. But it is difficult for me to ascertain precisely what it could connote in statistics.
    - Keith O'Rourke on December 26, 2017 at 1:11 pm said:
      
      Specifically for Sander’s view, this might help: Statistical Training Needs to Address Cognitive Limitations and Biases https://ww2.amstat.org/meetings/ssi/2017/onlineprogram/AbstractDetails.cfm?AbstractID=304110 (the slides can be downloaded).
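
For readers following the self-information link above, here is a minimal sketch of the definition: the surprisal of an event with probability p is -log2(p), measured in bits, so rarer events carry more surprise. Applying it to a p-value, as done here, is an assumption about how the term is used in the letter mentioned in this thread, not something the thread itself states, and the function name `surprisal_bits` is made up for illustration.

```python
import math


def surprisal_bits(p: float) -> float:
    """Self-information of an event with probability p, in bits.

    Hypothetical helper for illustration only; computes log2(1/p),
    which equals -log2(p).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return math.log2(1.0 / p)


if __name__ == "__main__":
    # Treating a p-value as the probability is an assumption of this sketch.
    for p in (0.5, 0.25, 0.05, 0.005):
        print(f"p = {p:<6} -> {surprisal_bits(p):.2f} bits")
```

Read this way, p = 0.05 corresponds to a little over 4 bits, roughly the surprise of seeing four heads in a row from a fair coin.
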
jrc on December 24, 2017 at 3:39 pm said:

“First, what advice do you have for the “quantitatively mindful and concerned” who are trying to negotiate an increasingly chaotic datascape?…”

I think for my first year or so out of graduate school I really, really wanted someone to help me answer this question. And then I realized that no one had a good answer for me and I had to find my own way to navigate the tenure-getting world without publishing research I was ashamed to put my name on. Of course that’s true of just about every hard decision in life, but you’d think there’d be better answers out there somewhere.

Now, as I hit the border of “beginning researcher” and have become somewhat more able to find projects that I think are worthy of attention, do not require p-hacking or just-so stories, and that are actually publishable, I wish I had a good and concise response for you. I don’t. I have a few notes, but they probably aren’t helpful:

1) do stuff that is interesting no matter what the result is. If the project is only interesting if you “find an effect”, then drop it. a point estimate and a narrow confidence interval should be enough to make the result interesting.

2) try to estimate some actual parameter or coefficient that really matters, and estimate it really well – smartly, carefully, and/or in new ways that produce precise estimates. Someone will want to publish the best estimate of something. Your contribution doesn’t have to be the first to find something, it can just do the best at estimating it (see: polling for instance). Whoever did it first surely did it bad – improvement is a publishable contribution.

3) don’t aim for the absolute top journals, and you won’t have to play that bullshit game. sure, papers in lower-tier journals do it too, but it is because those authors think the only way to write a paper is to write a hype-rap, because they want to publish in top journals and that is how you do it (mostly). my experience refereeing at slightly lower journals is that they already don’t believe super-BIG results (why would it go to that journal) and are more sympathetic to well-written, carefully-articulated and interesting work that is not all hyped-up nonsense. Of course some of those journals like hyped-up nonsense, but find the good-but-not-top journals in your field that want to publish good work, and aim to publish there.

4) much like estimating something better, pseudo-methodological papers dressed up as empirical papers can work too – just write them carefully and be clear that the empirical results are suggestive but inconclusive, and that the contribution is in the methods/approach/perspective of your analysis. If it is smart, other people will want to use that perspective, and journals will want to publish it. I think people under-estimate the publishable value of well-written methods-y work – if it is carefully thought-out and written, people will read it and cite it and teach it (because it clarifies something for them), and journals will want to publish it.

Like I said, these are just tactics, not specific suggestions, but maybe they’ll be helpful. As for question 2: “do you see any areas of social or biomedical research that you think have done a good (or at least better) job incorporating good data and statistical practices, where non-normal distributions and strange relationships between samples and populations are acknowledged and addressed openly?”

1) I think Demography broadly construed does good stuff like this – they know about sampling and populations and there is plenty of data so you don’t see so much small-N nonsense. Other nonsense sure, but no field is perfect. I think Demography work tends to at least be clearly motivated, and graphical presentation of data is generally expected as part of the analysis.

2) I think the within-field heterogeneity in quality overwhelms the between-field…there is really good and really terrible work in all disciplines. Find the good work and see what use you can make of it – the results sure, but more the intellectual/analytical framework and the statistical methods.

I wish I had better answers for you. It is all just really hard. Sometimes I’m tempted to say that it would be easier to unlearn everything you’ve learned about the failures of the practice of social science and just p-hack and over-interpret your way to fame and tenure. But I can’t really believe that or I’d do it myself. Who would want a job as a fraud (at least if they knew they were being a fraud)? Academia on your own intellectual terms can be wonderful, but I understand that I may not be able to keep doing it on my own terms, and if that comes to pass, there are plenty of other things any of us well-trained data-people can do for a living. Just be glad you don’t have a degree in Comp Lit or Philosophy…

So in conclusion: do you; try to do you well; believe in your own ability to do something other than academia, and I suspect you’ll find a way to do interesting work that you believe in, enjoy and that pays you enough to live a pretty decent life (again, sorry Humanities people…).

Good luck!

- Martha (Smith) on December 25, 2017 at 12:06 am said:
  
  +1
  
- Sameera Daniels on December 25, 2017 at 1:01 pm said:
  
  Feyerabend I hear.
  
Garrett on December 24, 2017 at 4:59 pm said:

I would also add “extrapolating from past to future” to cover time series analysis, unless you feel that’s already captured in the sample-to-population jump.

- Andrew on December 24, 2017 at 10:11 pm said:
  
  Garrett:
  
  I did mean to include time series extrapolation in my sample-to-population jump, but it’s good to clarify this point, that extrapolation can be in space or time.
  
Rahul on December 25, 2017 at 1:05 am said:

My suggestion is get out of academia & *use* the best of academic methods to solve problems outside of the academic environment.

If possible, don’t depend on Journals and reviewers or managers to be arbitrators of whether you did well. Find some problem, where the feedback on your performance is more organically evident.

- Bob Carpenter on January 2, 2018 at 6:03 pm said:
  
  I’d also suggest choosing a problem existing systems (including humans) are bad at.
  
  You don’t need to leave academia to solve real world problems. It can help, though, as the reward structure is different.
  
  Most of us have to rely on managers of some stripe—we don’t have the funds to do this alone.
  
  I don’t think much is “organically evident” without some statistical perspective. Sometimes a system is just overwhelmingly better, but often we have to settle for incremental improvements which are much harder to measure.
  
Lydia Maniatis on December 25, 2017 at 2:12 am said:

Gosh, what is a “non-existent principle”? Is it something no one believes in?

- Lydia Maniatis on December 25, 2017 at 2:15 am said:
  
  Who needs “subject matter experts,” anyway? Who needs facts? Who needs prediction? A discussion for another day.
  

Statistical Modeling, Causal Inference, and Social Science
