“Starting at the beginning again can be exhausting and stressful. But, opportunities are finally coming into focus . . .”

Ashley Steel writes:

Walking away from science or walking away with science?

This is an essay about career transitions and the value of statistical thinking in, perhaps, surprising places. It is written in hopes of opening a conversation.

When my father, a kind and distinguished academic physician, gave me a chemistry set for my 12th birthday, it came with a notebook in which he had neatly written out on separate pages “introduction”, “methods”, “results”, “discussion.” I was only 17 or so when he called me into his office to question me as to why I had not yet published a peer-reviewed paper. He was holding a stack of applications and, apparently, several candidates for whatever position he was trying to fill had published scientific papers when they were in high school. When I tell co-workers that “publish or perish” is in my blood, I’m not sure they really understand how deep the roots run. And yet, one year ago, I sent a resignation letter to my scientific supervisor, packed up my entire house, and moved across the world for a position in international development.

A lot of people wanted to know why, including me. It was obviously the right decision but it was a difficult one to articulate, even in my own head. There were personal reasons such as the desire to live internationally and to be part of a global community. Those were the easy reasons. More problematic was an internal fear that I was running in the wrong direction. I was succeeding well enough by all the metrics (my h-index, my ResearchGate score, the number of pages of my curriculum vitae), but the public’s mistrust of science, consistent and costly failures in statistical thinking, and the rapid pace of climate change were proving the entire endeavor to be, somehow, ineffective. The world was not broken for lack of scientific papers. The world was broken despite them.

When I was offered my new position, it was difficult to unearth much detail from the HR department. To gain insight, I sent an e-mail to the most senior person in my potential new department whose contact information I could find on-line. Within hours, we were on the phone. The job opportunity and the fast response had caught me a bit off guard and, in fact, on vacation in Scotland. From the back corner of a café-with-wireless in Glasgow, I plugged one ear with a finger, tried to sound both organized and dignified, and asked as many questions as I could. I learned a good deal about the new job but I was unable to convey my deepest concern.

After we hung up, I wrote this in an e-mail: “My deepest concern is a potential loss of my identity as a scientist. It’s a bit silly on some level as I would gain a new identity. However, in all honesty, being a scientist is more of an identity than nationality or even topic of scientific research.… Would it be possible to be at [this job] and have an identity as someone who publishes the occasional paper, contributes to publications, and associates with students and universities? Although it is a challenge, I think it is both possible and in the best interest of [the agency] and students/universities. I am wondering if I am perhaps naïve to think this?” When we got back on the phone, she quite earnestly explained to me that the agency I was contemplating joining was a knowledge-generating agency and not a science agency.
Knowledge without science? For some inexplicable reason, this question motivated rather than undermined my desire to join the agency. Not because I thought there could be knowledge without science but because it was so clear in that statement that I had much to learn and, likely, much to contribute. I could bring my obsession with statistical thinking and apply it in a new way.

Six months later I was sitting in a community center with a dirt floor in southeast Zambia engaged in what I call “my second job interview”. I was initiating a relatively small field project and a senior leader from our partner agency had come across the border from Zimbabwe to meet me while I was in-country. I figured out as we were talking that his purpose in that community center was not to nail down project sampling details as I had originally thought but, rather, to ascertain whether I was worth his time at all. He had a lot of questions, mostly aimed at uncovering whether I had arrived in Zambia thinking I could save the world and whether I realized just how large the problem of poverty reduction really was. Sometimes I am tongue-tied but, occasionally, I know exactly what I want to say. Perhaps because these questions had been fermenting in my brain for half a year or more, it just bubbled out of me. I told him that so many well-meaning efforts ended in unintended negative consequences. There were only three things I was absolutely sure were positive: education, data, and collaboration. I must have passed the second interview because, now, I have a project in Zambia. And, I was on my way to, finally, being able to articulate my new mission.

As a professional research scientist, I was involved in education. I was educating undergraduates, who otherwise had lots of existing opportunity, about statistical thinking. I miss that but those students surely will get educated whether I am there or not. I was also educating interns and other scientists through mentoring and statistical consultation. My hobby was judging at science fairs. Some of these things are not possible in my new position but, with a little creativity, it is certainly possible to contribute to science and statistical education in new ways. Again, mentoring interns and informal statistics education are open doors. What about simply supporting and inspiring my field crew in Zambia to really understand random sampling or to go on for more formal education? What about designing a lecture series on statistical thinking that is relevant to staff at national statistics offices or to the knowledge-without-science thinkers at international agencies? Hard, but not impossible.

Data. I love honest data and I believe they are the only way to knowledge. Thoughtfully structured observations, compiled, analyzed, and well-communicated, have been among the few things ever to change the world in a positive way. There are mountains of data available in disciplines and arenas rarely touched by trained statisticians. Science is the way we can make those data useful. Clearly, a scientific approach to data is priceless in development work.

What about collaboration? When done well, science involves miraculous collaborations, but collaboration, and the fostering of collaboration, have tremendous value nearly everywhere. I recently read a beautiful interview with a Tlingit native in which she recalls her father telling a politician to be more like a tree – holding hands in the roots, joining hands to prevent avalanches and soil erosion. I struggle to come up with an example in which collaboration is not positive. Sure, it is possible for two people or a group of people to conspire for sinister purposes. But, almost always, when people come together across belief systems, across continents, and across disciplines, the sharing of knowledge and the coupling of insight is positive. The newest and biggest challenges in science demand this type of cross-pollination and, although it is messy and complex, scientists rapidly gain experience in working across technical languages to generate new ideas that are bigger and better than the sum of the little ideas each individual scientist brings to a project. All stereotypes of nerdy self-serving scientists aside, collaborating effectively is something we know how to do and we know how to do well!

On approximately the one-year anniversary of the knowledge-without-science debacle, I’m hoping to open the conversation about career transitions and statistical opportunities in unexpected places. First, I am asking myself if I did the right thing. No doubts there: I am still sure I did the right thing. Instead, I am asking myself if I can finally articulate the reasons why it is a good thing. The answer, if you haven’t guessed it already, is that I was in no way walking away from science; rather, I was walking away with science. Scientific, data-driven processes are in my blood. I suppose this is similar to the way an artist might feel about color. By choosing to step away from the world of h-indexes and impact factors, I have dropped the metric-centric approach to science and am now trekking into new territory with old skills and experiences at my disposal, looking for exciting ways to inject science education where it is uncommon, advocate for unbiased data and analysis, and collaborate across worlds.

Perhaps a scientific career can be a process that begins with one’s first observations about the world and then moves from elementary science class to master’s thesis to post-doc. The next phases of a career are often about publishing simple papers, asking questions worthy of big grants, publishing more papers, and then publishing synthesis papers. Finally, one has to figure out how to leverage all that experience to make an impact in the larger world. Of course, many people get a master’s degree or a PhD and go on to use that scientific and statistical training directly in a wide variety of disciplines. Happily, those career paths exist, perhaps increasingly. Here, however, I am talking about those individuals who choose, as the main activity of the first 10 or more years of their career, science as a verb: the process of making structured observations in order to answer questions and then publishing descriptions of these endeavors in scientific journals.

Some of these scientific careers culminate in ever more impressive keynote lectures and others culminate in advocacy for facts and scientific content. Scientific careers, it turns out, might also culminate in advocacy for the process of science, the actions of science, in new and diverse arenas. Science pollinators perhaps? As a discipline, we have valued the careers of those who stay inside traditional confines over those who wander into new worlds. There are inspirational exceptions among scientific communicators (e.g., David Suzuki) and policy-makers (e.g., Jane Lubchenco), but the vast majority of scientists work mainly with other scientists. Science has, perhaps more than other disciplines, been cordoned off to universities, research institutes, and think tanks. Looking at the state of the world today, I think we can all agree that this has not worked well.

I’ve spent a lot of my career dreaming about how wonderful it would be if lawyers, nurses, politicians, journalists, teachers, economists, social workers, writers, and international NGO staff had had better training in the process of scientific inquiry. I still dream about this and believe we should keep up the work in science and statistical education. Many great improvements have already been made. Science fairs, for example, can provide hands-on experiences that may be a reference point for life. But we can do more. We can incentivize and inspire humans long-trained in the process of scientific inquiry and statistical thinking to venture out beyond the scientific comfort zone, bringing their science with them. Just imagine an invasion of Congress by ex-science nerds, a hospital run by doctors and nurses who deeply understand p-values, or an NGO full of staff demanding confidence intervals and high-quality assessments of project effectiveness.

It’s not easy. It will demand some “significant” humility. Stepping into a world where no one really cares about your ResearchGate score or understands a little statistical joke can bring on an identity crisis. I speak from experience. And most of us don’t actually know how to do any other kind of job. Who knew? Starting at the beginning again can be exhausting and stressful. But, opportunities are finally coming into focus and, while I certainly don’t yet speak from the experience of having made an impact, I am watching a few role models carefully with hope and inspiration. It’s a path rarely discussed and not well-understood amongst those of us who trade in peer-reviewed papers and conference proceedings and so I am sharing my thinking, happy for the warnings and insights of others. Knowledge without science makes no sense. The world desperately needs an injection of science and statistical thinking well beyond the confines of the publish-or-perish world.

96 thoughts on ““Starting at the beginning again can be exhausting and stressful. But, opportunities are finally coming into focus . . .””

  1. It is weird how she keeps saying “science and statistical thinking”, as if they must always go together and are of equal importance.

    Statistics is just another minor tool that scientists may (but not must) use.

    • She doesn’t actually say what she was doing in academia, but my impression was she was a statistician or biostatistician or some such thing. She talks about what kind of teaching she did as “I was educating undergraduates, who otherwise had lots of existing opportunity, about statistical thinking,” so basically teaching some kind of stats or applied stats type course…

      Anyway, I guess it depends on what people include under the “statistics” umbrella, but if you make that umbrella wide enough to include what I do when I do what I think of as statistics, then I’d argue you can’t do science without statistics. Of course, what I do is I build mathematical models of scientific processes, and then I use Bayesian methods to fit those models to data and try to figure out what went right and what went wrong, lather rinse repeat.

      You can do science all day long without doing a t-test or a Wilcoxon rank test or an F-test or whatever crap they teach in Stats 101. You can’t do science at all without logic and model building and comparing ideas to data in some logically meaningful way.

      • I’d argue you can’t do science without statistics.

        So then Newton wasn’t doing science? Deriving a prediction from some assumptions and comparing it to observation does not require statistics.

        MCMC is definitely a useful method for fitting a model, but ideally there would not even be any parameters to fit.

        • As I say it depends on what you put under the definition of “statistics”. In the broadest definition I can think of, “statistics” means “collecting data and comparing it to some kind of predictions”.

          You’re not doing statistics if you don’t have data, and you’re not doing statistics if you don’t have some kind of model, even if your model is just “data under condition A and condition B are the same to within some kind of error tolerance”.

          I don’t think statistics *requires* probability. For example, graphing data and looking at it by eye and saying “there seems to be something wrong with our predictions” or “our predictions do a good job of matching the data” or the like, is statistics in my view.
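
          Here is a minimal Scheme sketch of that broad view (my illustration; the function name and the toy numbers are hypothetical): statistics as a bare comparison of predictions to data within an error tolerance, with no probability anywhere.

          ;; Check whether a prediction function matches observed (x . y)
          ;; pairs to within a stated tolerance.
          (define (all-within-tolerance? predict data tol)
            (if (null? data)
                #t
                (and (< (abs (- (cdr (car data)) (predict (car (car data))))) tol)
                     (all-within-tolerance? predict (cdr data) tol))))

          ;; e.g., an eyeball-style check of the guess y = 2x against observations:
          (all-within-tolerance? (lambda (x) (* 2 x))
                                 '((1 . 2.1) (2 . 3.9) (3 . 6.2))
                                 0.5)  ;; => #t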

        • Feynman’s cartoon of science is more or less:

          1) Guess what happens
          2) Compute the consequences of your guess
          3) Compare the consequences to experiment

          https://www.youtube.com/watch?v=b240PGCMwV0

          In my view the statistics comes into number 3, where you design experiments so that they are meaningfully similar to situations where the guess applies, you collect the data, and you compare the consequences of the guess to the data.

          The guessing and computing are separate parts of science.

          Of course, guesses often come from having seen a bunch of data, so there’s data collection outside of number 3 too… The Census Bureau does that kind of thing… I don’t consider the data collection they do to be statistics, really. I mean, data collection is not automatically statistics; you must be doing some comparisons to make it more than “data collection”.

        • So statistics is the step in science where you compare your prediction to observation? I am pretty sure there are very few people who think of such a vague definition when they hear “statistics”.

        • I think most people are befuddled by statistics anyway. I mean, most people just think of “calculating averages” or “running t-tests”, but I don’t think we should define what statistics is on the basis of what ignorant people think of when they hear the term.

          There’s a whole corner of the internet about what typical people think of science: https://www.iflscience.com/editors-blog/scientists-are-sharing-the-worst-stock-photos-of-their-jobs-and-theyre-hilarious/

        • So that would technically exclude, for example, Bayesian models in which there is no sufficient statistic. Unless by “numerical quantity derived from a dataset” we include each individual data point itself… which seems fairly facile.

          It would also exclude plotting of data and predictions against each other, again unless you include every individual data point and the plot containing them as “numerical quantities derived from a dataset”.

          I think the historical development starts from the 1770 sense, “the science dealing with data about the condition of a [political] state or community” (https://www.etymonline.com/search?q=statistics).

          From this, probably, some mathematicians wanted to talk about derived quantities and coined the term “a statistic” for “a numerical quantity computed from a dataset”, and now we’re forming kind of a “backronym”-type definition: statistics is the study of quantities computed from a dataset, which seems to be putting the mathematical formalism first and the purpose second.

          To me, although the original development of early techniques in statistics was motivated by trying to understand economics and demographics and things, the basic concept is clearly using data to inform scientific ideas about something you’re studying. That’s more or less part 3 in the Feynman cartoon.

        • So that would technically exclude, for example, Bayesian models in which there is no sufficient statistic. Unless by “numerical quantity derived from a dataset” we include each individual data point itself… which seems fairly facile.

          You mean like ABC (approximate Bayesian computation)? That uses statistics.

        • It occurs to me though that maybe a big part of the “statistics wars” fought in the last few years is that different groups are cross-talking about what is even included in stats.

          I mean there certainly are plenty of people who think of statistics as basically “that set of push-button analyses you should run to check to see if you have followed the god-given procedure for doing science and can claim a ‘true discovery’” (this is kind of the cartoon of statistics held by biologists such as my wife’s colleagues)

          vs. the people who think “statistics is basically the logical application of data to making an argument about whether models and predictions are accurate and reliable when tested in the real world” (this is more or less the area of discussion of this blog, right?)

        • Sure, ABC uses statistics, but if I give you a sample from a Cauchy distribution with location parameter a and scale 1 and you try to do inference on a, you’ll have to use a likelihood p(d1,d2,d3,…,dn | a) in which the “sufficient statistic” is the entire dataset. It’s no good just to calculate some numerical quantity derived from the dataset.
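
          To make that concrete, here is a minimal sketch in Scheme (my illustration, not part of the original comment; the function name, the pi constant, and the toy data are hypothetical). The Cauchy log-likelihood is a function of the location a and of every individual data point, so no fixed-dimension summary can stand in for the whole dataset:

          ;; log p(d1,...,dn | a) = sum over i of -log(pi * (1 + (di - a)^2)),
          ;; i.e., the Cauchy(location a, scale 1) log-likelihood.
          (define pi 3.14159265358979)
          (define (cauchy-log-lik data a)
            (apply + (map (lambda (d)
                            (- (log (* pi (+ 1 (expt (- d a) 2))))))
                          data)))

          ;; Comparing candidate locations requires the entire dataset each time:
          (cauchy-log-lik '(-1.2 0.3 0.5 14.7) 0.0)
          (cauchy-log-lik '(-1.2 0.3 0.5 14.7) 1.0)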

        • You could still use the median to characterize a sample from a Cauchy distribution. You can also use a mean, but it won’t have the expected behavior.

        • The likelihood is a quantity derived from the dataset of course, but it’s not derived *just from the dataset* it’s also derived from the model. Whereas the average can be calculated without reference to any model for example. Normally people talk about a “statistic” as something like the average where the quantity can be computed independent of a model.

        • “Formally, statistical theory defines a statistic as a function of a sample where the function itself is independent of the unknown estimands” (the wiki article you quoted)

          So, the likelihood or the probability density function of a Bayesian model don’t count because they are functions of the unknown parameters for example.

          This view of statistics basically excludes Bayes, it’s a “sampling theory” only view. So I can’t buy into it.

          But I can see where if someone already has this view of what statistics is, and then someone comes along with a load of Bayes, it’d feel like someone coming and dumping a load of chemicals in your pool… you’d be sort of “hey, what’s all this crap, I didn’t order this”, which might explain all the “methodological terrorism” type stuff.

        • “Formally, statistical theory defines a statistic as a function of a sample where the function itself is independent of the unknown estimands” (the wiki article you quoted)

          So, the likelihood or the probability density function of a Bayesian model don’t count because they are functions of the unknown parameters for example.

          This view of statistics basically excludes Bayes, it’s a “sampling theory” only view. So I can’t buy into it.

          Every Bayesian model I have seen uses the concepts of mean, variance, min/max, etc. That is “using statistics”. I don’t know where that “formal” definition came from…

        • The cartoon is based on C. S. Peirce’s concepts of abduction, deduction, and induction.

          One perhaps important feature that gets left out by the cartoon is that they are better thought of as phases rather than separate activities: everything involves a bit of each, and the relative weights change as inquiry proceeds.

        • Anoneuoid:

          the distinction in Bayes vs Frequentist/Sampling Theory usage of “mean” and “variances” etc is in whether we’re talking about things that can be calculated knowing only the numerical values in the dataset and some constants, such as sample means, which are “statistics” in the formal sense, vs working with unknown parameters which can never be calculated exactly and are generally thought devices anyway, such as population means, or location parameters for error distributions. These are “unknown parameters”.

          Bayesian analysis operates on probability distributions over parameters, and on data. Sometimes these data can be summarized by a single “sufficient statistic” like the sample mean. So if we’re talking about a Bayesian fit to a normal distribution with known variance, you can get a posterior if instead of sending you a whole dataset, I just send you the sample mean (and the sample size).

          But other times, for other models like the Cauchy model, you can’t just work with a sufficient statistic (for example, the median is not “sufficient”; it doesn’t summarize the inference entirely); instead you need the entire data set.

          Claiming that “statistics” is the study of the mathematical behavior of numerical functions calculated from samples is *far far* too narrow a definition for the discipline; it’s the kind of definition an abstract probabilist who has never collected real data in their life might give.

          You might get away with “statistics is the application of probability to the analysis of real world data” or something, but even that is too narrow in my opinion, as it’s possible to do analysis of data without probability theory, as you note Newton and Kepler did, and as is just plotting data with a pencil on graph paper and drawing a line through it with a ruler like we used to do in my high school physics class. Drawing that line is “statistics” in my opinion, even though we don’t do any formal mathematical calculations to get the line.

        • the distinction in Bayes vs Frequentist/Sampling Theory usage of “mean” and “variances” etc is in whether we’re talking about things that can be calculated knowing only the numerical values in the dataset and some constants, such as sample means, which are “statistics” in the formal sense, vs working with unknown parameters which can never be calculated exactly and are generally thought devices anyway, such as population means, or location parameters for error distributions. These are “unknown parameters”.

          Yes. The wiki page made a distinction there but I don’t see why (and there is no source, it was probably just a definition in the back of some stats 101 textbook). It is both “using statistics” to me.

        • The wiki article cites:

          DeGroot and Schervish. “Definition of a Statistic”. Probability and Statistics. International Edition. Third Edition. Addison Wesley. 2002. ISBN 0-321-20473-5. Pages 370 to 371.

          But there is a good reason why that distinction is made. It has to do with the observability/computability of the quantity. You can calculate a sample mean as an objective observable fact about a data set. In other words, you can supply a function that takes the data set, and only the data set, and returns a number. For example, using the Scheme dialect of Lisp as a very basic functional language, the mean function is

          (lambda (data) (/ (apply + data) (length data)))

          Any number that comes from a function which immediately evaluates to a number once a dataset is provided qualifies as a “statistic” of that data.

          No such function is available in a Bayesian analysis… instead what you have is a function that takes a data set, and returns a function which takes a model and some parameters and returns a value

          (lambda (data) (lambda (model paramvals) (model data paramvals)))

          the result of evaluating this function with a dataset is a function… a very general function that returns different values for different models and different parameter values…

          Since there are an infinity of statistical model assumptions that could be made about any given dataset, it’s useful to distinguish between things that are essentially “bound variables” once you’ve observed the data vs functions of some free variables (model and paramvals).

          the definition of “statistic” emphasizes the computationally bound nature of the quantity once the dataset is given.

          https://en.wikipedia.org/wiki/Free_variables_and_bound_variables
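
          To make the bound-vs-free contrast concrete, here is a hypothetical usage sketch of the two functions above (the names sample-mean and likelihood-of are mine):

          ;; The statistic is fully bound once the data arrive: it evaluates
          ;; immediately to a number.
          (define sample-mean
            (lambda (data) (/ (apply + data) (length data))))
          (sample-mean '(1 2 3 6))  ;; => 3

          ;; The Bayesian object is not: binding the data still leaves free
          ;; variables, namely the model and its parameter values.
          (define likelihood-of
            (lambda (data) (lambda (model paramvals) (model data paramvals))))
          (define lik-given-data (likelihood-of '(1 2 3 6)))
          ;; lik-given-data is still a function, awaiting a model and paramvals
          ;; before it yields a number.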

          This is a pretty fundamental aspect of the logic/computing/mathematics of “a statistic” as understood by “Sampling Theory / Frequentism”

          Sampling Theory basically takes “statistics” as the study of the mathematical connections between the output of “statistic functions” when applied to datasets, and the parameters of random number generators that generate datasets.

          In my opinion this is at the heart of confusion regarding Bayes vs Frequentist ideas.

          I’m hoping Bob Carpenter will chime in here with helpful thoughts from a computer science perspective…

        • I think Daniel considers the comparison between data and prediction to be important to science and to be included in ‘statistics’, which I think is certainly a reasonable position although there are other reasonable positions too.

          But there are many aspects to science other than simply comparing data to predictions. Michelson and Morley designed an experiment to measure the speed of light in different directions (not statistics), then made the measurements (not statistics), and then quantified the difference by direction (in terms of diffraction fringes or something if I recall correctly) and compared those to both zero and to predictions from the aether theory (statistics). Motivated by those results and others, Einstein developed the Theory of Relativity (not statistics) and used it to calculate the expected visual displacement of a star on a photographic plate during an eclipse (not statistics). Someone took a photograph through a telescope (not statistics) and compared that prediction to what was observed (statistics) as a test of the theory. If we adopt Daniel’s definition, statistics is an important part of science but there is a lot of science that isn’t statistics.

        • Pretty much this. Statistics plays some role in experiment design. So for example when you design an experiment to take photos during an eclipse for the purpose of checking Einstein, you may be doing statistics to the extent that you are determining whether the particular photos to be taken are relevant to the question or not. How close should the star be to the edge of the eclipsing object in order for the effect to be detectable, etc.

          When we have a pile of data collected by others, like for example the Census, we have to do statistical thinking to decide whether that data can answer some question we have, like perhaps whether homelessness has increased through time or not…

  2. she quite earnestly explained to me that the agency I was contemplating joining was a knowledge-generating agency and not a science agency.

    Could you elaborate on what a “knowledge-generating agency” is? Is there a website we could get more information?

    • Pretty sure this is E. Ashley Steel of the Pacific Northwest Research Station. If so, I think she’s with the FAO currently. Their website is fao.org

      The FAO is a (tiny) bit like the all-but-dead USDA Economic Research Service, which was recently defunded and moved out of Washington D.C. by the Trump administration. In a recent op-ed in the Washington Post, one of their former researchers described the agency thus: “At ERS, we studied all aspects of food production, occupying an obscure but important niche: Many of our research topics wouldn’t make for an exciting academic tenure file, but had huge implications for policy.” For example, “according to [their] models, food assistance programs were often a positive multiplier for local economies.”

      This is what I think Ashley Steel (with whom my graduate advisors worked and who, from all that I hear from those who know her, is a truly excellent researcher and human being) means by a “knowledge-generating agency” – one that produces findings that (should) result in better management of natural resources or better policy, as opposed to simply aiming to produce peer-reviewed research that makes a splash but doesn’t result in any real-world changes. A similar knowledge-generating agency might be the National Weather Service, which every single day produces knowledge (weather forecasts) that is not “science” in the sense of peer-reviewed papers (of course they do that too).

      Unrelated, but it is worth reading that op-ed by the former ERS scientist here: https://www.washingtonpost.com/outlook/2019/10/21/white-house-didnt-like-my-agencys-research-so-it-sent-us-missouri/

  3. Thanks to Dr. Steel for the story and to Andrew for sharing!

    I couldn’t help but notice, especially given our recent discussions on the “science-as-rhetorical-gloss” set of people, that the way Dr. Steel talks about “science”, in terms of how papers are formatted and in terms of publication metrics, is really quite distressing. Science *shouldn’t* be about those things, it should be about “knowledge generation”, but it is clear that many people who enter the field with the hopes of generating knowledge end up stuck in a cycle of doing things that “seem sciency” but don’t actually act to advance knowledge. So really it is no wonder that she has found something outside of academic science that is far more rewarding and does more to advance knowledge than she was able to do while following the forms and rituals of academic science.

    It reminds me a bit of an old Rod Serling play about a writer (“The Velvet Alley”) who sets out to do nothing more than be a good and impactful artist but, when he experiences success, begins to compromise in order to hold onto the trappings of that success. The goal shifts toward maintaining certain symbols of success/status because now those are things you can lose if you don’t behave.

    I know I’m preaching to the choir here, but I think this story underscores how scientific training and practice often emphasizes form at the expense of the goals of science to advance knowledge. I understand her point is that people with scientific training are valuable in many settings, but I would say instead that the umbrella of “science” should be expanded to include those people who don’t always follow the forms of academia. In other words, Dr. Steel seems like the sort of person who *should* have a home in academic science and it is a failure of the field that she does not.

    • This is what I took away from the post as well, thanks for stating it so clearly. Steel seems to have been embedded in academia as a rent-seeking game from an early age (why haven’t you published a peer-reviewed paper at age 17? uh…). Eventually she found that the rent-seeking game wasn’t actually interesting or rewarding; instead she wanted to “generate knowledge” and maybe also to “fix the world” (“the world was broken despite them”).

      I doubt very much that in the period between say 1900 and 1950 there was much concern that science wasn’t a “knowledge generating” endeavor, or that things in the world weren’t developing due to science and the resulting technology.

  4. “I struggle to come up with an example in which collaboration is not positive.”

    Wow! Competition has been at least as beneficial as collaboration to human welfare! All purely collaborative organizations (UN, World Bank, IMF, charities) require continual cash influx from the competitive world of business to pursue collaboration so, ultimately, it’s competition that makes purely collaborative organizations possible – including science itself.

    For me, one of the greatest inventions of competition is something so simple hardly anyone gives it a second thought: ibuprofen. For many years I got horrible headaches that only ibuprofen could stop. And yes it’s a product of collaboration, but it’s also a product of competition. Science might make discoveries and that’s all well and good, but without businesses to develop, produce, market and deliver them at a reasonable cost, science would be, at the very least, much less effective.

    “I’ve spent a lot of my career dreaming about how wonderful it would be if lawyers, nurses, politicians, journalists, teachers, economists, social workers, writers, and international NGO staff had had better training in the process of scientific inquiry.”

    Personally, I think the western world is suffering not from too little understanding of the scientific method, but from growing ignorance of the fundamental role that business plays in improving human welfare, combined with ridiculous and unachievable expectations of the scientific method.

    • In fact, it’s probably these unachievable expectations of science that are driving some of the problems we’re seeing in science, with respect to abuse of statistics, poorly designed studies, etc.

    • Saying “competition is important too” is not “an example in which collaboration is not positive”.

      IMHO collaboration is not positive when the project itself is meaningless. In that case, collaboration is just wasting more and more resources. For example the whole discussion about “candidate genes” a few days ago. https://statmodeling.stat.columbia.edu/2019/10/17/heres-an-interesting-story-right-in-your-sweet-spot/

      collaborating on understanding how 5-HTTLPR works was a MASSIVE problem, because it *didn’t do* any of the things people thought it did, and it took decades to figure that out, and it was not the collaborators who figured it out (except in the broadest sense that maybe all of science is collaboration in some sense, but that’s too broad to be very meaningful)

      ——

      By the same token, competition between businesses to be the best at seeking rent from the government and extracting cash from broken policies is not helpful either. The finance industry figured out that it could extract cash from people by selling them mortgages that they had no chance of repaying, and then, when the crash inevitably came, by extracting cash from the government in bailouts. There’s nothing good about the resulting real-world harm to individuals in the US. However, the finance industry is at more or less the same fraction of GDP as it was in 2006:

      https://fred.stlouisfed.org/series/VAPGDPFI

      Similarly, for example Pharma making a killing off selling Tamiflu stockpiles to the UK or whatever is an example of one company out-competing others to extract rent from the government.

      Competition works well to drive well-being when the dollar amounts received are strongly correlated with actual well being. In the presence of highly asymmetric information, principal-agent problems, crony capitalism, and outright corruption, businesses can get rich without providing real value.

      Asymmetric information is one of the biggest problems. Snake oil salesmen are the classic example, but how much of today’s Pharma is super fancy snake oil? How much of technology today is people selling you things they convince you are going to be beneficial to you but in reality just spy on you, have terrible security, and dramatically increase the risk of identity theft, financial fraud, vote fraud, or whatever?

      • How much of things you buy today would be free to not only share with your friends, but make new derivative works from if industries hadn’t bought continual extensions of Copyright?

        It’s useful to look for example at this graph: https://en.wikipedia.org/wiki/File:Tom_Bell%27s_graph_showing_extension_of_U.S._copyright_term_over_time.svg

        It shows the steady progress of rent seeking over the last 200 years or so. If copyright terms were the same as under the early Constitution, in 1790 or so, they’d last about 30 years. That means audio recordings from the earliest days of audio recording up through about 1999 would all be public domain today, and I’d be listening to a massive collection of mid-20th-century jazz recordings without paying tribute to corporate overlords. Of course, the dollars I do pay look like “GDP” so they go towards the “economic good” in the accounting, which is about as much of a bald-faced lie as I can think of. It’s like counting money paid to mob extortion rackets as GDP.

        So, I’m not against competition, or acknowledging the role it can play, I just think it’s important to remember the reality of how corporate profits come about in many industries today.

        • “In the presence of highly asymmetric information, principal-agent problems, crony capitalism, and outright corruption, businesses can get rich without providing real value.”

          In general, yes, I agree, for business to function as a provider of benefits it has to work within a “fair” legal structure without corruption. (or at least a structure that’s mostly fair with minimal corruption). I personally don’t think a business has to provide “value” to justify getting rich, as long as it does no damage.

          In principle I agree with everything in your comments, although we might differ on exactly how those concepts would be applied.

          You’re preaching to the converted on copyright. Obviously there’s a benefit to some protection, but the current situation is ridiculous. Should be the same as patents.

        • I don’t understand the providing-value thing. When you spend money you get a good or service. If the good or service doesn’t accomplish the thing you were led to believe it would accomplish, then by taking your money and giving you a lousy good the business has done harm. Only if, after the fact, you agree that you “got your money’s worth” did the business not do harm.

          Or at least someone paying attention should have been likely to make the correct assessment about the good. If you clearly state in your advertising “this electronic device has a number of known security vulnerabilities and sends pictures of your naked children to pornographers” but you bought it anyway… buyer beware can apply. Surprisingly few devices have that warning, though, even though it may be true for MOST IP cameras these days, for example.

        • When I say “do no damage” I mean large scale societal damage (i.e., bank collapses; companies bilking the govt etc).

          Whether or not someone gets the value they expected out of any individual item is a minor legal matter. You can’t lie in advertising. But it doesn’t matter how many regulations you write or how many thousands of apparatchiks you hire to enforce them, our beautiful language will leave room for inference. Most things you can return if you don’t like them.

    • Or, alternatively, perhaps the world of business scrapes so much money from anything that would hope to be collaborative because it is based on a model of getting money, patenting products, and cornering markets. Although I’m guessing you would claim that all of this business practice is necessary and good (which, of course, I would vehemently disagree with). In an economic system where competition is the widespread norm, how could one possibly expect collaborative efforts aimed at the greater good to compete with those focused on money-grabbing and making shareholders happy? That doesn’t mean collaboration doesn’t work; it just means our system is loaded against it. Not really sure what your point is aside from demonstrating your own political colors.

    • Uh, jim, she never said that nothing except collaboration is positive. If I say “All X is good”, that doesn’t mean “only X is good.”

      Actually I can think of many examples of collaboration that are not positive, and it’s not hard. But I take this as a bit of pro-collaboration puffery rather than a literal claim. We all know what she means. Don’t pretend you don’t.

      • It’s a fair point that she claims only that collaboration is positive, and says nothing about competition, positive or negative.

        Just the same, it’s inaccurate to omit competition as an equivalent driver of “good” in the world. I doubt she did that intentionally. People tend not to recognize the benefit of competition. It’s taken for granted.

        • I thought it was important to point out the benefits of competition because I feel that there is a starry-eyed view of science floating around in the ether that’s woefully inaccurate and ultimately damaging to both science and society.

          I’d go so far as to say that a person can do at least as much good in the world running a business as they can working for an NGO, but that of course depends on the business and the NGO. I’m inclined to think the best thing NGOs can do in the third world is what some have already decided on – just hand out cash and trust that people will mostly use it for the right thing.

          Last but not least – and it’s only vaguely related, but I wonder – oh, never mind. If I say that people will have a friggin’ coronary or at least a hernia. I don’t want anyone to get hurt. :)

        • I agree lots of good has been done by people who run companies, and by the companies that are run by those people. Lots of bad done by other companies and other people. And it’s fine with me if you want to make that point, lord knows. I encourage you to not have a chip on your shoulder about it, though. If the potential good done by competitive capitalism doesn’t come up when in a particular post or a particular context, you don’t have to bring it up every time. This is intended as friendly advice, not an attempt to stifle your free speech.

  5. I think we have several large overlapping umbrellas here: science, academia, collaboration, competition. I found the essay uplifting and somewhat inspirational – but I do wonder whether the key is really “science.” There are many schools of thought – the scientific process is not the only route to knowledge, unless we expand the definition to include any structured and rigorous pursuit of knowledge. These are all practiced to some extent in academia, though most of us see as many failings as successes in that practice. Similarly, we see both competition and collaboration practiced inside and outside of academia (for me, the greatest disappointment in academia is how little collaboration takes place compared with competition).

    I could paraphrase her essay, for my own experience, by replacing “science” with “business.” Some of my most satisfying work has been in business – I’ve seen more cooperation take place than in academia, and with far less of the academic ritual which has outlived its usefulness. I think the world needs more knowledge, better appreciation of data (from measurement through communication), more appreciation of different modes of inquiry, and attention to removing the many artificial rituals we have set up which stand in the way of these things. Defining science and statistics is largely irrelevant to this. Distinguishing between collaboration and competition is largely irrelevant to this. We need all of that – and more. But first we have to admit our failure, despite the many successes, to harness all our knowledge to solving some of the most daunting problems facing us. We can (and should, to an extent) celebrate the many victories, but I think the experience the author is conveying is a reminder of just how far our created “scientific” world is from many of the world’s continuing problems.

    • You might say it’s the alignment of activity with utility that is at the heart of the matter. It doesn’t matter how many peer-reviewed publications you make, or whether they’re even correct or provide knowledge, unless the knowledge is somehow proceeding toward something good. For example, if you have many publications that show that it’s absolutely the case that your drug accomplishes some unimportant surrogate goal toward cancer treatment, this isn’t going to qualify in my mind unless it ultimately results in less cancer death or cancer suffering.

      The same goes in business: making enormous amounts of money selling snake oil by making people believe that snake oil is somehow going to make them healthier does not qualify as success. Nor does making enormous quantities of money by convincing the government to make it illegal for other people to produce competing products.

  6. I like the concept of bringing some energy and rigour to enable local, bottom-up solutions (which I think the author is doing).
    I hate the idea of the ‘expert scientist’ turning up armed with the latest top-down solutions.

  7. “Just imagine an invasion of Congress by ex-science nerds, a hospital run by doctors and nurses who deeply understand p-values, or an NGO full of staff demanding confidence intervals and high-quality assessments of project effectiveness. ”

    I had the same nightmare once. Considered turning it into a Hollywood script: something like “The Day After Tomorrow” except with well-meaning stat nerds replacing the bad weather.

  8. Thanks for the comments. I was hoping to start a discussion. Success. I decided to let the comments pile up for a full 24 hours before jumping in. Instead of replying here and there, I will just make one (give or take) over-arching reply.

    First, I am amazed, although I really should not be, at the interest in figuring out who I am. I didn’t think at all about it – neither the idea of staying anonymous nor the idea of being more specific. It’s a good reminder that the message is always interpreted in light of who the messenger is (or who we think they are). With that, I will say that Dalton had it right, and I must now add that I wrote this in my personal capacity and that the views I expressed and am about to express do not reflect those of my current or former agencies. I guess that is the real reason for developing a habit of anonymity – as a government or United Nations employee one generally has to be very careful what is said publicly.

    I estimate, from the timing of replies, that most discussants are in North America, and this is something to consider. Is that a reflection of who reads the blog or of the culture of commenting? Is the perspective described in the discussion representative of the larger scientific / statistical perspective?

    So that brings me to point two, which I never thought would be the subject of the conversation generated by this essay but I am so happy that it is. Statistics versus science! A good reason to write something down and share it with people who don’t know you is to uncover your own assumptions. And I assumed we all basically agreed on the following: statistics is the backbone of science. By that I mean statistics, the discipline, and not statistics, a collection of numbers.

    In some groups, sure, there is the perception that statistics is the process of correctly selecting a procedure from a series of pull-down menus or the act of model building and fitting or the act of testing. But here?! I definitely have another essay to write! For now, let me just respond to the first comment: “It is weird how she keeps saying ‘science and statistical thinking’, as if they must always go together and are of equal importance. Statistics is just another minor tool that scientists may (but not must) use.” It is not weird! And I fully, as in 100% +/- 0%, disagree with the 2nd sentence! Statistics is the process of making correct inference from observation. Science is the structured process of making observations to learn about the world. Statistical thinking starts from asking a well-formulated question and has, as a foundation, the consideration of which observations to make and then to what population one can make inference from those observations. My question above (previous paragraph) is statistical in nature. Sampling Theory is statistical (right? We all agree on this I hope) and, furthermore, accurately communicating the correct inference is clearly part of statistical thinking (this blog reflects statistical thinking and the great writing on fivethirtyeight.com is statistical thinking). Any inference that gets made correctly and then communicated and applied incorrectly is clearly a failure, in large part, of statistical thinking.

    Science without statistics? Natural history, archeology (at times) and there are a few more. Sometimes people say “qualitative research” or “focus groups” are not statistical. Ridiculous. There are no calculations but there is a need for correctly making and communicating inference from observations and for understanding the inferential implications of what is being observed. There are activities in the statistical discipline that don’t necessarily feed into science (as in a structured way to learn about the world from observations) but they too are few. The biggest errors made in science, I would argue, are from a failure to incorporate statistical thinking, which is, as I said, the backbone of the whole enterprise.

    And my third point is just to note that I was not really unhappy in my past nor particularly caught up in the metrics. I was unsatisfied despite being successful enough. I was aiming at some sort of sarcasm and a description of the gap between on-the-ground impact and impact-factor (that was good, right?). Happily, I did not judge my success by the number of pages in my CV. It was intended as a joke reminiscent of the old jokes about the value of a PhD dissertation being measured by how far down the stairs it went when it was thrown. That was back in the day when one could envision a dissertation as being produced on paper. Sigh – I have some communication work to do. I really enjoyed the comments that better described the true purpose of science.

    Anyway, I was trying to create a picture of a world in which we all share a set of connections, norms of behavior, and indications of success (ridiculously encapsulated in the metrics), but in which we all talk to each other – mostly. It’s not so much that I am no longer in academia as that I am no longer in “the fold” where we all sort of agree on how things get done. At the next 20 meetings I will attend, often on official statistics, indicators of sustainable development, or data collection in the field, there is unlikely to be anyone who has ever heard of Andrew Gelman. Right? It seems like a small world but that is because we all mostly stay inside a small world.

    Last and then I should go back to the meeting that I snuck out of to write this, thanks! I have already heard privately from folks who have made or are considering making similar choices. And thanks for encouraging me to write my “Essay on why statistical thinking is at the heart of everything we think we know” or some such. And for the place where surprising and interesting conversations are nearly a guarantee.

    p.s. for this entire comment which was written without long hours of editing and angsting, please just consider a final statement of something like “for almost every mu”. Yes, you can probably think of a counter-example to something I wrote but if I defined everything perfectly and added a million disclaimers, I would be a lawyer and that is not a career transition I am prepared to consider.

    • Thanks for the follow up comment.

      Your original post reminded me of me about 10 years ago, when I lost my Scientist appointment and moved into a regulatory agency that publicly claims to be science-based. A simple definition of science is what scientists do. I think artists have it right – once an artist, always an artist. It’s just a way of thinking that one should not lose just because of what they are currently doing as a day job.

      Now, I felt bad for the first few years but got over it with time. Most of us don’t get thrown out of the fold; rather, instead of publishing papers we comment on blogs, and instead of presenting at meetings we (try to) ask scientifically profitable questions, etc. I have also found that by attending even one meeting every year or so you can keep a number of colleagues as colleagues.

      Also, there can be real advantages to being outside academia, and adjunct university appointments are easy to get if you want them. Also, your citation counts continue to increase as more and more people claim they have read your papers (perhaps just to please journal reviewers).

      Hope you stay in touch here.

        • And considering when he was making the prediction, we see Halley came up with these values:

          Year, Ascension, Inclination, Perihelion, NearSun, logNearSun, DatePerihelion, NodePerihelion, Direction
          1531, 19.25.0, 17.56.0, 1.39.00, 56700, 9.753583, 1531.08.24.21.18.5, 107.46.00, retro
          1607, 20.21.0, 17.20.0, 2.16.00, 58680, 9.768490, 1607.10.16.03.50.0, 108.05.00, retro
          1682, 21.16.3, 17.56.0, 2.52.45, 58328, 9.765877, 1682.09.04.07.39.0, 108.23.45, retro

          From that we can see these values are similar, but not exactly so. He says:

          And, indeed, there are many things which make me believe that the comet which Apian observed in the year 1531, was the same with that which Kepler and Longomontanus took notice of and described in the Year 1607, and which I myself have seen return, and observed in the year 1682.

          All the elements agree, and nothing seems to contradict this my opinion, besides the inequality of the periodic revolutions. Which inequality is not so great neither, as that it may not be owing to physical causes. For the motions of Saturn is so disturbed by the rest of the planets, especially Jupiter, that the periodic time of that planet is uncertain for some whole days together. How much therefore will a comet be subject to such like errors, which rises almost four times higher than Saturn, and whole velocity, tho increased but a very little, would be sufficient to change its orbit, from an elliptical to a parabolical one… Hence I dare venture foretell, that it will return again in the year 1758.

          So Halley is most concerned about systematic errors, not statistical errors. Later he goes on to mention some other observations that “are too rude and unskillful, for anything of certainty to be drawn from them, in so nice a manner.” It isn’t clear to me how he came to that conclusion.

          I would say that the orbital elements are statistics I guess.

        • Making the judgement “Which inequality is not so great neither, as that it may not be owing to physical causes.” is statistics (IMHO), that is, it’s attributing an error to some particular kind of cause

          so is “Hence I dare venture foretell, that it will return again in the year 1758” which is making a quantitative prediction from a model with some uncertainty (evidently to within an error of about a year). If the comet came December 29th 1757 no one would have had real reason to accuse Halley of being wrong. Quantitative measures of accuracy are the realm of statistics.

          Also, when you say “So Halley is most concerned about systematic errors, not statistical errors,” you are in my opinion limiting “statistical errors” to too small a scope; you’re thinking “random errors”. The process of deciding whether a thing is “systematic” (which is to say, basically a bias) vs “unsystematic” (meaning essentially “variance”) is itself a statistics question. Like in the case where you have a biased sample of voters and you do MRP: the whole point is to “account for systematic differences between the sample and the population of interest”, right? That’s certainly something a statistician is rightfully concerned about.
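
          To make the reweighting half of that point concrete in miniature, here is a toy poststratification sketch in Python. All the numbers are made up, and real MRP would first smooth the group estimates with a regression; this keeps only the “P” step.

            # Toy poststratification: a sample that over-represents one group gives
            # a biased naive estimate; reweighting to known population shares fixes
            # it. All numbers here are hypothetical.
            pop_share = {"young": 0.5, "old": 0.5}               # known population shares
            sample = {"young": (100, 0.60), "old": (300, 0.40)}  # group: (n, mean support)

            n_total = sum(n for n, _ in sample.values())
            naive = sum(n * m for n, m in sample.values()) / n_total
            adjusted = sum(pop_share[g] * m for g, (_, m) in sample.items())

            print(naive)     # 0.45: dragged toward the over-sampled "old" group
            print(adjusted)  # 0.50: the systematic sample/population difference is removed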

          But there’s something to your point that “typical” discussion of statistics revolves around all these random-sampling-type issues. And because of that I don’t usually call myself a “statistician”, though most of the services I offer are about these “statistics writ large” ideas we’re discussing here. I generally tell people that I “do mathematical modeling, data analysis, decision making, and statistics” because I want to ensure that they understand the broader scope. (This is often the point at which people think “ooh, math is hard”, turn off their brains, and their eyes glaze over; unfortunately, the people I can help the most are the ones who know the least about what is possible… a difficult marketing issue.)

          Generally there’s a tension between statistician as model builder vs statistician as analyzer. Most statisticians aren’t going to formulate pharmacokinetic models, or write agent-based models for cell migration, or describe biomechanics models for evaluating bone fracture risk, or come up with tidal mixing equations for pollution transport… instead they’ll have some “subject matter expert” build those models, and then try to bolt some data analysis voodoo on top…

          On the other hand, people like Andrew do build their own models to describe voting or suicide risk or whatnot… so sometimes the statistician *is* the model builder. I think this is most common in the social sciences.

          I think there’s a blurring between Feynman’s parts 1 and 2 (guess the answer, and compute the consequences) and part 3 (compare to experiment). Certainly the compare-to-experiment part is done these days using statistical thinking. Sometimes even parts 1 and 2 may be done by the same person. Ideally, as Keith says, each part is a phase of a continuous whole, and individuals participate in all the phases to some extent.

          The thing that makes me not a typical scientist is that I don’t study one particular topic: I’m not a biologist, though I’ve published several biology papers, and I’m not a geoscientist, though I’ve published a major paper on soil earthquake liquefaction. And I’m not a biomechanist or biokinesiologist, though I’m working with people who do gait analysis for stroke victims… I’m not a civil engineer in the sense that I don’t have a license and don’t perform the services people think of, like grading and paving and structure design and so forth, but I have a PhD in Civil Engineering, and a major concern of mine is how to arrange the community use of resources: pollution, water, education facilities, hospitals, whatever. I’ve worked for several lawyers doing analysis of damages in construction projects, but I’m not a construction project manager…

          What is the name for a person who goes from group to group helping them do a better job of building mathematical models and logically deducing things from data they’ve collected?

          The closest standard term we have is statistician, but it’d be great if we could come up with a commonly understood term that encompasses my job. Let me know ;-)

        • How about polymath? It seems this was the main point you were trying to get across from your self-indulgent monologue. Perhaps a tiny dose of humility would do you some good.

        • Making the judgement “Which inequality is not so great neither, as that it may not be owing to physical causes.” is statistics (IMHO); that is, it’s attributing an error to some particular kind of cause.

          Well, I don’t see that being statistics at all.

          Anyway, the reason I care is that I have seen that scientists gave statisticians the authority to make the rules for how to extract useful info from their data. This has obviously been an unmitigated disaster, with millions of generic studies designed to see if there is a difference between groups, etc.

        • “I have seen that scientists gave statisticians the authority to make the rules for how to extract useful info from their data.”

          Yes, this has been an unmitigated disaster in many fields. But when a mathematician tells me that there’s a proof that any infinitely smooth function on [0,1] can be approximated arbitrarily well uniformly on the interval by a polynomial, I more or less trust them (the Weierstrass approximation theorem, illustrated numerically below: https://en.wikipedia.org/wiki/Stone%E2%80%93Weierstrass_theorem).

          I haven’t sat down and read the proof and verified it myself. I’ve basically “given mathematicians the authority to make the rules for how to logically use mathematical structures”… but this hasn’t been an unmitigated disaster.

          The way I see it, we’ve got a lot of BAD / illogical statistics out there, right in the Stats 101 textbooks even, hence the importance of places like this blog.
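
          As a quick numerical check of that theorem, here is a short Python sketch. It uses least-squares Chebyshev fits rather than the constructive proof, but the uniform error still shrinks rapidly with degree:

            import numpy as np

            # Polynomial approximation of a smooth function on [0, 1]: the maximum
            # (uniform) error drops rapidly as the degree increases.
            x = np.linspace(0.0, 1.0, 2001)
            f = np.exp(np.sin(3.0 * x))  # an arbitrary infinitely smooth function

            for deg in (2, 5, 10, 15):
                coef = np.polynomial.chebyshev.chebfit(x, f, deg)
                max_err = np.max(np.abs(f - np.polynomial.chebyshev.chebval(x, coef)))
                print(deg, max_err)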

        • I haven’t sat down and read the proof and verified it myself. I’ve basically “given mathematicians the authority to make the rules for how to logically use mathematical structures”… but this hasn’t been an unmitigated disaster.

          The way I see it, we’ve got a lot of BAD / illogical statistics out there, right in the Stats 101 textbooks even, hence the importance of places like this blog.

          I mean if you give someone trained in statistics a bunch of dates and coordinates of comet observations, will they come up with something like this that allows us to see the same object reappearing at approximately the same interval: https://i.ibb.co/mBThp1s/synopsisofastron00hall-0013.jpg

          I don’t think of doing something like that as “statistical thinking”.

        • If a person came to a masters student in statistics with that table of numbers and a hypothesis generated in “step 1” of Feynman’s cartoon

          “I think some comets might be objects orbiting the sun which return periodically”: do you think they’d fail to detect periodicity in those numbers, or that they would have a hard time coming up with a prediction for the next return of Halley’s comet, or that they wouldn’t be able to decide whether they had made a good prediction when 1758 finally rolled around?

          Sure, step 1 is definitely more than “statistics”; it’s a creative part of science that may be done by statisticians but might also be done by scientists or other “subject matter experts”.

          But the part 3, where you look at the data *with the hypothesis in mind* and try to see if you can find evidence for or against the hypothesis, and try to make predictions after “fitting” the model, and then decide what extra data to collect and how to evaluate whether it approximately “fit the predictions” or not… those are all routine ideas in statistics, aren’t they?
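
          For what it’s worth, the fit-and-predict step really is small. Here is a minimal Python sketch, using rough decimal-year conversions of the perihelion dates from Halley’s table above:

            # Perihelion passages from Halley's table, as rough decimal years:
            # 1531 Aug 24, 1607 Oct 16, 1682 Sep 4.
            perihelia = [1531.65, 1607.79, 1682.68]

            periods = [b - a for a, b in zip(perihelia, perihelia[1:])]
            mean_period = sum(periods) / len(periods)
            half_spread = (max(periods) - min(periods)) / 2

            print(periods)                                      # ~76.1 and ~74.9 years
            print(f"{mean_period:.1f} +/- {half_spread:.1f}")   # ~75.5 +/- 0.6 years
            print(f"next return: ~{perihelia[-1] + mean_period:.0f}")  # ~1758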

        • Or at least, they seem like routine ideas in *Bayesian* statistics… I could see how you might get completely snowed if you tried to model Halley’s table of numbers as a sample from some kind of Random Number Generator.

        • If a person came to a masters student in statistics with that table of numbers

          You don’t get those numbers; that is an intermediate step. You get about 25 entries like the following (made concrete as a hypothetical record type after the list):

          Date
          Latitude of observer
          Longitude of observer
          Azimuth of comet
          Altitude of comet
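
          A sketch of that schema as a record type; the field names and the sample values are mine, purely illustrative:

            from dataclasses import dataclass
            from datetime import datetime

            @dataclass
            class CometSighting:
                when: datetime    # date (and time) of the observation
                obs_lat: float    # latitude of observer, degrees
                obs_lon: float    # longitude of observer, degrees
                azimuth: float    # compass bearing of the comet, degrees
                altitude: float   # angle of the comet above the horizon, degrees

            # One made-up entry; the fit works from ~25 of these per apparition.
            obs = CometSighting(datetime(1682, 8, 26, 21, 0), 51.5, 0.0, 285.0, 20.0)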

        • Well, if you include some information about the position of the earth relative to the sun in some coordinate system, then all of those numbers in that table are statistics of the data set you just mentioned ;-)

        • Well, if you include some information about the position of the earth relative to the sun in some coordinate system, then all of those numbers in that table are statistics of the data set you just mentioned ;-)

          I mentioned that above:

          I would say that the orbital elements are statistics I guess.

          So yes, we are summarizing the data for each comet sighting into a smaller set of values that I would call statistics. But the nature of those values is domain-specific; it isn’t something general like the mean/median/etc. I don’t think they were arrived at via a process I would call “statistical thinking”.

        • Well now we really are just arguing over the meaning of words I think. So Steel and I think that this idea of analyzing datasets in whatever way makes sense is a kind of “statistical thinking” and you think that when you hear “statistical thinking” it means something much more narrow… ok, I guess we come from different communities and have different but related meanings for words…

          In any case, the world needs people who think about how data can be collected and used to weed through which scientific ideas turn out to be actually true or effective or whatever vs which don’t, and who can explicate what the appropriate logical arguments are.

          I agree with you that the “typical statistics toolbox” from Stats 101, 102, 201, 202 etc is often not particularly good for this purpose.

          re matt: well, it may seem self-indulgent and self-aggrandizing, but I assure you it wasn’t intended to be that way. It really is hard for me to tell people what I do in ways that they can understand. “What do you do for a living?” “Oh, I’m a polymath” doesn’t get you many clients; neither does a long description of a bunch of seemingly unrelated projects from fields as diverse as construction, cell biology, finance, and industrial accident analysis…

          Certainly if I were to tell them I’m a statistician, that’d give them the wrong impression, which is relevant to the question Anoneuoid and I were debating. People would probably think that I design surveys and analyze them or something. I’m sorry if it seemed self-aggrandizing; I assure you it’s not intended that way. It’s more of a “maybe we really have identified a kind of activity that needs a name, and for which statistician isn’t sufficient” sort of thing. I also hate “data science”; I certainly don’t study data the way, say, “planetary scientists” study the earth and the moons of Jupiter and things.

          The fact that I’ve worked on a bunch of different kinds of projects is more or less a negative for the typical client; they usually are looking for “an expert in X”… meh

        • So Steel and I think that this idea of analyzing datasets in whatever way makes sense is a kind of “statistical thinking” and you think that when you hear “statistical thinking” it means something much more narrow…

          Ok, how about this. I’ve been asking some physics questions on stack exchange recently. I don’t think anyone is using “statistical thinking” here: https://physics.stackexchange.com/questions/508573/why-does-this-simple-equation-predict-the-venus-surface-temperature-so-accuratel

          It is about a simple no-free-parameter model that seems to work, to within plus or minus a reasonable percentage of the correct answer, for every available data point (which is only two, until some planet is terraformed or an exoplanet is studied in detail). No statistics, no one is asking for statistics, etc. All the discussion is about the logic/assumptions and how to interpret the outputs/inputs.

          At the very least you have to agree that whatever role “statistical thinking” plays in there is a tiny proportion of the whole. Yet in fields like biomed and psych, statistical output tends to dominate the entire discussion for some reason.

        • Most of the thinking there is physics and dimensional analysis, I agree. The model is actually nothing but parameters, all of which are determined by estimators from external data sets. I mean, you can’t calculate the mass of Venus or the Earth from first principles just considering, say, geometry, right? You’d at least need to know what the mean mass density of the planet is, and that doesn’t come from pure logic…

          your estimate is:

          Tv = (de/dv)^(1/2) * Te + hp * gamma_e * (mv/me)

          From first principles, without knowing a bunch of measurements, you have 5 parameters: de/dv, Te, hp, gamma_e, and mv/me. None of these can be determined by pure logic, whereas, for example, the exponent 1/2 is determined by logical considerations from your assumptions.

          Fortunately you have external sources where these are estimated from data to high precision. When you have very high precision estimates of things, you can just ignore the uncertainty in them.

          Finally, we don’t expect this to have accuracy of, say, 18 decimal places, so when we find out it’s good to 5% or whatever, we are happy because it’s within the realm of what we might expect (basically a prior on model error).

          So I see a lot of “statistical thinking” of the Bayesian kind at least: previous measurements give us delta-function-like priors over some of the parameters, so we replace them with a point estimate, and model error puts a lot of prior weight on a few-percent errors… so we consider the final result to be accurate to within the expectations.
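
          A toy numerical version of that plug-in argument; the prediction function and every number here are stand-ins, not the actual Venus calculation:

            import numpy as np

            rng = np.random.default_rng(1)

            def predict(a, b):
                # stand-in for any deterministic prediction formula
                return np.sqrt(a) * 288.0 + b

            # "Delta-function-like" priors: ~0.1% relative uncertainty
            a = rng.normal(1.383, 0.0014, 100_000)
            b = rng.normal(400.0, 0.4, 100_000)

            preds = predict(a, b)
            print(preds.std() / preds.mean())  # well under 1%: safe to plug in point estimates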

        • It’s a really super nice example though. It has all the cartoon components, right?

          1) You made some educated guesses about physical processes

          2) You computed the consequences of the guesses

          3) You used high precision data to plug in numbers to your computation and found that it worked well.

          If you had less high-precision data, like in the 30%-errors-on-Venus issue, and you did a Bayesian model, you’d find:

          “either the model works well and we can get a much improved estimate of the mass and distance for venus, or the data is close to accurate and the model doesn’t work that well”

          which would come in the form of a posterior distribution with a dependency structure between the model errors and the parameters that describe the true mass and distance.
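
          A toy grid-posterior version of that dependency structure (pure NumPy; the prediction function and all numbers are made up, purely to exhibit the trade-off):

            import numpy as np

            # One observed temperature, explained by a parameter m plus a model error e.
            m_grid = np.linspace(0.4, 1.2, 201)
            e_grid = np.linspace(-100.0, 100.0, 201)
            M, E = np.meshgrid(m_grid, e_grid)

            def predict(m):
                return 339.0 + 487.0 * m  # stand-in prediction, in K

            y_obs = 735.0  # "measured" value, with a 5 K measurement sd
            prior = np.exp(-0.5 * ((M - 0.815) / 0.25) ** 2) * np.exp(-0.5 * (E / 50.0) ** 2)
            lik = np.exp(-0.5 * ((y_obs - (predict(M) + E)) / 5.0) ** 2)
            post = prior * lik
            post /= post.sum()

            # Posterior correlation between m and e: strongly negative, i.e.
            # "either the parameter is off or the model error is large".
            mm, me = (post * M).sum(), (post * E).sum()
            cov = (post * (M - mm) * (E - me)).sum()
            sd_m = np.sqrt((post * (M - mm) ** 2).sum())
            sd_e = np.sqrt((post * (E - me) ** 2).sum())
            print(cov / (sd_m * sd_e))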

        • “High precision” isn’t really the right way to put it. There are all sorts of “rule of thumb” numbers involved, like T_e = 288, or the lapse rates and temperatures of Titan and Venus, which can only be based on a few measurements during select seasons and at a couple of latitudes at most. Even the Titan Wikipedia page gives two values that differ by about 5%:

          The average surface temperature is about 98.29 K… The net effect is that the surface temperature (94 K)

          https://en.wikipedia.org/wiki/Climate_of_Titan

          But I think it works out that the potential for systematic/other error swamps the statistical error, so a rough estimate is sufficient.

        • It’s a really super nice example though. It has all the cartoon components, right?

          To me, the most interesting part is that it “makes no sense” given the way atmospheres are usually understood. Ie, it is otherwise surprising. Eg, the fact that the lapse rate is a linear function of mass (for these types of atmospheres) is apparently surprising to one of the people answering it.

        • You also may be calculating the Venus lapse rate wrong?

          The pressure on Venus is ~.1 atm at ~65 km altitude where it is ~-30 K. The surface is ~462 K. That gives an average lapse rate of (462 – 75)/50 = 7.56 K/km.

          shouldn’t that be (462 − (−30))/65 = 7.57? It looks like you typed the wrong thing but calculated the right number, at least.

          The fact that you’re using a linear equation for the lapse rate, and it’s working ok for 3 data points, isn’t strong evidence that it really is linear here, because the data points are kind of at the endpoints of this line, especially given your point about the errors in the masses and so forth. I mean, you could do an expansion (a + b*(mv/me) + c*(mv/me)^2 + d*(mv/me)^3), and then insert a model for the uncertainty in mv/me, and find a bunch of different curves that fit more or less similarly well… later, if you find some other planet, say about half the mass of Venus, in another solar system… you might find that it has a lapse rate about equal to the V value… so then suddenly your model would be favoring something that looks more like a square-root law, for example, and still passes between V and E…

          I guess the point is that without some “statistical thinking” we can’t really assess the goodness of this model realistically; we may be over-confident in the model based on its apparently working well with 3 data points, which are incapable of detecting much nonlinearity because they are kind of clustered together… To me this is statistical thinking.
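
          A minimal numeric sketch of that point (made-up numbers, not real planetary data): with one small body and two clustered points, a line in m and a line in m^(1/3) both fit within plausible error bars, yet they extrapolate quite differently.

            import numpy as np

            x = np.array([0.02, 0.82, 1.00])  # hypothetical mass ratios
            y = 1.0 + 6.0 * x                 # pretend the true law is linear
            sigma = 0.5                       # assumed +/- measurement error on y

            def ols(u, y):
                """Least squares for y = a + b*u; returns (a, b)."""
                A = np.column_stack([np.ones_like(u), u])
                (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
                return a, b

            a1, b1 = ols(x, y)           # linear in mass
            a2, b2 = ols(np.cbrt(x), y)  # linear in mass^(1/3)

            resid = y - (a2 + b2 * np.cbrt(x))
            print(np.max(np.abs(resid)))     # ~0.31: inside the assumed error bars

            # The two "equally good" fits disagree at an unobserved intermediate mass:
            x_new = 0.4
            print(a1 + b1 * x_new)           # ~3.4
            print(a2 + b2 * np.cbrt(x_new))  # ~4.7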

        • It’d be a really nice example problem to put into Stan: do measurement-error parameters for the masses and temperatures, fit a more complicated function for the lapse rate, put a prior on the model error sizes, and get a posterior distribution on the lapse rate function to assess how much remaining uncertainty there is in the form of this function under reasonable assumptions. You could then assess, for example, whether you have good evidence for a linear relationship or not.

          If you don’t call that statistics… well, then we really are talking past each other. That kind of thing is the WHOLE REASON that Stan was written.

        • You also may be calculating the Venus lapse rate wrong?

          Yes, it was a Kelvin to Celsius issue: (735 – 243)/65 = 7.57

          The fact that you’re using a linear equation for the lapse rate, and it’s working ok for 3 data points, isn’t strong evidence that it really is linear here, because the data points are kind of at the endpoints of this line

          The alternative proposed was that the lapse rate should be a function of m^(1/3), which doesn’t fit as well, but it is really no big deal. Linear is just the simplest function, and it is good enough for these purposes.

          I guess the point is that without some “statistical thinking” we can’t really assess the goodness of this model realistically; we may be over-confident in the model based on its apparently working well with 3 data points, which are incapable of detecting much nonlinearity because they are kind of clustered together… To me this is statistical thinking.

          Well, yes, it is so few data points that it could be “coincidence”, as some people said. But can statistics really answer that question for you (this is the famous “is it real” question)?

          The issue is that it is derived from some pretty basic physics considerations and then makes the simplest possible assumptions about the relationship between Earth and similar atmospheres. No one thinks it is “true”, but it really shouldn’t even come close if the standard assumptions are ok.

        • You could then assess, for example, whether you have good evidence for a linear relationship or not.

          See, but this is exactly what I don’t care about. You want to add this tuneable parameter that will soak up the error, but for what reason? We know the model is just some approximation already.

        • Well, if you don’t care whether you’re right, then you never even need to collect data at all ;-)

          what I think you’re saying is that according to what you’d have thought before doing this exercise, the method should have *big* and *systematic* errors in temperature prediction. The fact that it doesn’t when you use any old O(1) function of the mass ratio indicates that the “standard” methods are the ones that are wrong. You’ve basically done a bayesian model comparison, and left the “standard stuff” which evidently would predict something much different, having very little posterior probability relative to this simple model. You know the simple model isn’t particularly correct, but it’s not far wrong… so you accomplished your goal, which was to compare data to theory and filter out theories… again statistics…

        • The fact that it doesn’t when you use any old O(1) function of the mass ratio indicates that the “standard” methods are the ones that are wrong. You’ve basically done a bayesian model comparison, and left the “standard stuff” which evidently would predict something much different, having very little posterior probability relative to this simple model. You know the simple model isn’t particularly correct, but it’s not far wrong… so you accomplished your goal, which was to compare data to theory and filter out theories… again statistics…

          I look at it this way. Let’s pretend that something more impressive than two objects were fit reasonably well by that equation, say 100. Then we would assume that the equation must be deducible in some way from the standard assumptions used for more complex models. It should be some special case, etc. The actual comparison to the observations is only a tiny part of the process and needn’t be very thorough.

    • By science without statistics I meant more what Kepler did, or Newton. It was definitely quantitative… and very accurately and precisely so, much more than most of what people use statistics for today. I’m using those examples because they were before statistics really became a thing.

      If you can read some of their work and point out where you see statistics being used, that could help. Galileo and Halley would probably be good examples of statistics-free science too.

      • Well, Newton and Kepler maybe didn’t do things you’d call statistics, but least absolute deviation and least squares were invented in the 1700s by people like Laplace and Gauss, specifically to handle astronomical data and enable things like navigation in the open ocean or doing land surveys.

        • I think a rough understanding of the uncertainty, as seen in my Halley’s comet example above, is sufficient in many cases. Call that “using statistics” if you want, but it seems like something else to me.

          Statistical methods are more useful when both of these are small relative to the measurement error:

          A) The systematic error
          B) The “ROPE” (Region of Practical Equivalence – Kruschke)

          Ie, if your observations could be getting swamped by systematic error anyway, you don’t need more than an order-of-magnitude estimate of the measurement error. Same if there is a wide range of values you would consider functionally equivalent. Also, A and B may not be independent considerations.
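
          A tiny illustration of point A, with made-up numbers: once a plausible systematic error dwarfs the standard error, collecting more data stops helping.

            import math

            # Hypothetical: measurement sd of 1.0, possible systematic bias of ~5.0.
            for n in (10, 100, 10_000):
                se = 1.0 / math.sqrt(n)      # standard error of the mean shrinks with n
                total = math.hypot(se, 5.0)  # rough combined error, bias-dominated
                print(n, round(se, 3), round(total, 2))
            # se falls from ~0.32 to 0.01, but the total error stays ~5 regardless.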

    • Hi Ashley,

      I didn’t mean to “out” you (your name was right there at the top so I didn’t think it was too anonymous!). Thank you for the essay and rejoinder. It’s interesting for me to think about the attitudes you wrote about and how they related to the academic-industrial complex that generates too many PhDs for too few positions, given that they’re your words. We’ve never met (I think), but I’m familiar with your name because some of your collaborators were my brilliant M.S. advisers (Lisa and Kelly – hopefully that’s anonymous enough for them but sufficient for you). Actually, I think you were in some weird way my “model” for what a graduate student should be. In my first month of graduate school, Lisa lent me “Scientific Method for Ecological Research” which walks through the process of formulating a research question using you (or someone with your name) as an example. I learned from Lisa how to come up with a research question. For me, that has a direct relationship to Statistics because I was taught that a well-formatted research question can be (almost) directly translated into a mathematical relationship between a measurable thing and some other set of measurable things.

      So it’s interesting for me to read about your disenchantment with the publish-or-perish model of what constitutes a scientific career and how that relates to my mental model of a good scientist and the pursuit of an academic career. PhDs are not easy things to obtain. They require foremost dedication, persistence, and a willingness to suspend other life-goals (like an income). Secondarily, talent and insight can sometimes help one obtain one (you have these, not all PhDs do). I do not have a PhD. I decided to stop at two Masters degrees (Statistics being one) despite being encouraged by the department head of my other degree to pursue a PhD. Quite simply, the cost in blood, sweat, tears, and most of all TIME, did not compute for me. It was my opinion that PhDs these days take longer than ever to get, and to get one you have to fulfill a somewhat narrow set of requirements that don’t necessarily translate to any meaningful impact outside of your number of citations.

      I’m with an agency now (that I won’t name, but it probably isn’t hard to figure out), which also means I have to be somewhat careful in what I say, lest it be interpreted as me representing something other than my own personal views. It is a “science agency” but, I think, also a “knowledge-generating” agency. I’m very happy with the work I’m privileged to do every day, but that’s because we are tasked with answering specific research questions with management implications. Yes, we must publish to be relevant to a certain extent, but that does not always mean in a peer-reviewed journal. Our job is to investigate and communicate. Hopefully what we communicate informs decisions made by those invested with the power to make decisions (it ain’t me!). But I’m happy that in the position I’m in that line seems a bit more direct.

      Anyway, good luck in your new position! If I do see you at a conference at some point in the future, I’ll be sure to say hello.

      • No worries! Who doesn’t like to get semi-anonymous compliments on public blogs?! I fully agree that generating a well-formatted research question is a foundational part of, eventually, making a strong statistical conclusion. Happy to imagine that my MS and this way of thinking have been a model for your vision of research. I also had brilliant advisors, like Dr. Ford, who gave me great advice and support during those early days. Do say hello whenever there is a chance. Not hard to figure out how to reach me if we don’t have the chance to meet in person. Best wishes for doing science that makes an impact!

    • “Statistics is the process of making correct inference from observation”

      That’s news to most of the scientific world. I think it would be news to a lot of statisticians.

      “Natural history, archeology (at times) and there are a few more…”

      Evolution, Plate Tectonics, the rise of civilization, the Solar System, the formation of the planets… just the small stuff. I mean, with statistics, you get the really important stuff like… like… normal distributions!

  9. EAS wrote above:

    Anoneuoid says:
    October 21, 2019 at 4:53 pm
    Is there a stats 101 class that starts off with eyeballing the data as an example of statistics?
    Happily, yes! https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1505657

    Figure 1 (which I assume is what you were referring to) shows a histogram and attempts to explain the sampling distribution of the mean. This is statistics (ie, it uses the concept of a mean).
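
    For reference, the concept behind that figure fits in a few lines of Python:

      import numpy as np

      # Sampling distribution of the mean: 10,000 samples of size 30 from a
      # skewed population; the means cluster near 1.0 with sd ~ 1/sqrt(30).
      rng = np.random.default_rng(0)
      means = rng.exponential(1.0, size=(10_000, 30)).mean(axis=1)
      print(means.mean(), means.std())  # ~1.0 and ~0.18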

    I meant plotting theoretically predicted values vs observed values and “eyeballing” the fit to see if it is good enough. Imagine you are Kepler comparing your laws to Brahe’s data; no one really used statistics at the time.
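
    Something like this minimal sketch, with hypothetical numbers and matplotlib assumed available:

      import matplotlib.pyplot as plt

      # "Eyeball the fit": predicted vs observed values, with a dashed y = x
      # line marking perfect agreement. All numbers are made up.
      predicted = [1.0, 2.1, 2.9, 4.2, 5.0]
      observed = [1.1, 2.0, 3.1, 3.9, 5.2]

      plt.scatter(predicted, observed)
      plt.axline((0.0, 0.0), slope=1.0, linestyle="--")
      plt.xlabel("theoretically predicted")
      plt.ylabel("observed")
      plt.show()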

    * As an aside, the solution to poorly reproducible studies is simply to make replicating each other’s work a standard practice. People will quickly figure out what needs to be recorded, etc., to allow others to repeat what they did. It has nothing to do with statistics, contrary to what this survey apparently suggests:

    In the previously mentioned poll on the reproducibility crisis, the number one factor needed for boosting reproducibility in science, cited by just over 50% of those surveyed, was “better understanding of statistics.”

    https://www.tandfonline.com/doi/full/10.1080/00031305.2018.1505657

    • A couple of thoughts: (1) There is, of course, more to our course than what is described in the article. I was not referring to a particular figure but to the underlying concepts. It’s true that we don’t start exactly with a graph to explore, but we do start with a result and we ask the students to explain it. Then we look at a graph. (2) There is a lot more to the replicability crisis than writing exactly enough to replicate a study. There are many statistical issues related to p-hacking, the desk-drawer effect, and more. So, in fact, I strongly believe that the replicability crisis has everything to do with statistical understanding.
