“AI promised to revolutionize radiology but so far it’s failing”

Gary Smith points us to this news article:

Geoffrey Hinton is a legendary computer scientist . . . Naturally, people paid attention when Hinton declared in 2016, “We should stop training radiologists now, it’s just completely obvious within five years deep learning is going to do better than radiologists.” The US Food and Drug Administration (FDA) approved the first AI algorithm for medical imaging that year and there are now more than 80 approved algorithms in the US and a similar number in Europe.

OK, so far no surprise. But then:

Yet, the number of radiologists working in the US has gone up, not down, increasing by about 7% between 2015 and 2019. Indeed, there is now a shortage of radiologists that is predicted to increase over the next decade. What happened? The inert AI revolution in radiology is yet another example of how AI has overpromised and under delivered. . . .

Radiology—the analysis of images for signs of disease—is a narrowly defined task that AI might be good at, but image recognition algorithms are often brittle and inconsistent. . . . only about 11% of radiologists used AI for image interpretation in a clinical practice. Of those not using AI, 72% have no plans to do so while approximately 20% want to adopt within five years. The reason for this slow diffusion is poor performance. . . .

Interesting. I’m not sure what to think here. AI will only get better, not worse, so it seems reasonable to suppose that in the not-too-distant future it will be useful, at the very least as an aid to radiologists. A lot of work has to go into making any system useful in practice, but there’s lots and lots of money in radiology, so I’d think that someone could be put on the job of building a useful tool.

Here’s an analogy to a much simpler, but still not trivial, problem. Nearly twenty years ago some colleagues and I came up with an improved analysis of serial dilution assays. The default software on these lab machines was using various seat-of-the-pants statistical methods that were really inefficient, averaging data inappropriately and declaring observations “below detection limit” when they were actually carrying useful information. We took the same statistical model that was used in that software and just fit it better. It wasn’t super-hard but there were various subtle twists, and we published our method in the journal Biometrics. I thought this would revolutionize dilution assays.

Well, no. In the 17 years since that paper appeared, it’s been cited only 61 times. And the assay machines still use some version of the same old crappy software.

Why? It’s simple. We haven’t provided a clean alternative. It’s not enough to have a published paper showing how to fit the model. You need to program it in, and you need the program to handle all the bad things that might happen to the data, and we haven’t done that. Way back when, we got an NIH grant to implement our model, but in the meantime things changed at the lab and we didn’t have access to a stream of data to try out ideas, and everything was a bit harder to implement than we thought, and we got tangled up in some issues with the raw data, and . . . well, we’ve returned to the problem recently so maybe we’ll make some progress. But the point is that it’s hard to come up with a usable replacement, even in a problem as clean and clearly defined as a numerical laboratory assay. There are lots of dilution assays being done in the world, so in theory I think there would be money available to increase the efficiency of the estimates, but it hasn’t happened. The radiology story is different in that there’s more money but the problems, both technical and institutional, are more difficult.
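
For readers curious what “fitting the same model better” looks like in miniature, here’s a toy sketch in Python (my own illustration, not the code or exact parameterization from the Biometrics paper): fit a standard four-parameter dilution curve by maximum likelihood, treating readings at or below the detection limit as censored observations instead of throwing them away. The curve form, the fixed noise level, and the simulated data are all assumptions made for the sake of the sketch.

    import numpy as np
    from scipy import optimize, stats

    def curve(x, b1, b2, b3, b4):
        # Expected reading at concentration x: a four-parameter logistic curve
        # (an assumed parameterization, chosen for illustration).
        return b1 + b2 / (1.0 + (x / b3) ** (-b4))

    def neg_log_lik(params, x, y, limit, sigma=0.05):
        b1, b2, b3, b4 = params
        if b2 <= 0 or b3 <= 0 or b4 <= 0:
            return np.inf                      # keep the curve parameters sensible
        mu = curve(x, b1, b2, b3, b4)
        censored = y <= limit
        # Ordinary readings contribute a normal density; "below detection limit"
        # readings contribute the probability of falling at or below the limit.
        ll = stats.norm.logpdf(y[~censored], mu[~censored], sigma).sum()
        ll += stats.norm.logcdf(limit, mu[censored], sigma).sum()
        return -ll

    # Simulated serial 1:2 dilutions in duplicate, with a detection limit.
    rng = np.random.default_rng(0)
    x = np.repeat(2.0 ** -np.arange(8), 2)
    y = curve(x, 0.1, 2.0, 0.2, 1.3) + rng.normal(0, 0.05, x.size)
    limit = 0.15
    y_reported = np.maximum(y, limit)          # what the machine's software hands you

    fit = optimize.minimize(neg_log_lik, x0=[0.1, 2.0, 0.5, 1.0],
                            args=(x, y_reported, limit), method="Nelder-Mead")
    print(fit.x)   # curve estimates that use the censored readings instead of dropping them

That’s the easy part. Everything described above (the bad things that can happen to real data, the plumbing, the validation) is what this sketch skips, and that’s where the effort goes.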

But I guess Hinton was wrong when he said, “We should stop training radiologists now.” You can be good at science and engineering but still not be the right person to forecast employment trends.

71 thoughts on ““AI promised to revolutionize radiology but so far it’s failing””

  1. “You can be good at science and engineering but still not be the right person to forecast employment trends.”

    It’s not so easy to forecast employment trends, even for people who study forecasting or for people who study employment or for people who study trends.

    The best coordination tool we have is prices. Radiologists are still making a lot of money, which is encouraging people to study radiology and become radiologists. If image recognition starts reducing the income of radiologists, then the marginal radiologists will start doing something else.

    • Also just because the best societal outcome is to use some technology doesn’t mean that’s the outcome that will occur. You think radiologists will want to have their top 1% jobs eliminated and just go on to be paid $15/hr to plug images into software tools? Hell no, they’re going to argue that people should die because they don’t use the superior tools, just so they can keep paying the mortgage on their six-bedroom houses and upkeep on their fishing boats. Of course they won’t put it like that. But it’ll amount to the same thing.

      • Radiologist here. People always seem to confuse radiologists with a machine which gets presented a set of images and comes up with a diagnosis. This is one of the reasons why I am not afraid of being replaced in the 40 years left in my career.

        • Suppose there were a machine: you give it a set of images and it writes a textual radiologist-type report. It provably does better than, say, 85% of human radiologists at that job.

          What would be the value added by a random radiologist? Real actual question, not rhetorical. I’d like to know.

        • Isn’t this mostly just “a machine which gets presented a set of images and comes up with a diagnosis”? I thought Maarten was arguing that radiologists do much more than that?

          (This also sounds like a pretty fantastic machine, in multiple senses of the word.)

        • Sometimes the question for the radiologist is pretty general, like “What’s wrong with this person?” Or “…this spleen?” Or “…these blood vessels?” AI has a lot of potential to answer very specific questions but has shown very little skill in reading between the lines (or making sense of anatomy that’s unusual). I work in medical imaging and believe it’ll be a long time before AI-based diagnostic tools replace physicians fully… decades, if ever. But there’s a lot of potential for building better lab-based diagnostic assays along the way where AI will indeed make major inroads.

        • Although Google does pretty well replacing doctors.

          Most people fail – surely doctors included – when they try to follow a simple decision tree. The tree leads them to the right answer, but they reject the correct answer for all kinds of reasons.

        • Assuming it gets to that point (which I’d wager is 20+ years away, optimistically), then sure, maybe radiology is no longer a relevant field of medicine. Here’s the catch though: radiologists are still MDs. Absolute worst case is that we’d suddenly have a bunch more GPs, which would be wonderful because we need a lot more GPs.

        • While some proportion of radiologists might be able to re-train themselves as primary care providers, I think you are greatly underestimating the difficulty of making a transition between specialties after you have been practicing one for a long time. Most radiologists have forgotten the vast majority of whatever they knew about primary care (and vice versa), and over the course of even just a decade the changes (in both specialties) are pretty large. It would take an enormous amount of training to get up to speed, and I suspect that relatively few would be willing to do it. I think you’d see a lot of early retirements, and many leaving medicine altogether for another career. I also think the number of radiologists who would try to switch to other specialties that are more closely related to their practice would dwarf the number going into primary care.

        • Suppose there were a machine that does all the crap that I have to do every day to get by; and on top of that it pretends to be me. Meanwhile I haven’t got crap to do so I go out *looking* for trouble. But I’m thrilled because I haven’t got anything at all to do. The essential workers now have to scramble on top of each other like the grunions for the chance they might be lined up for some work — changing diapers maybe; though that too can be done by intelligent machines if we put our minds to it. You know what will happen? August 1914, when they marched off to the Somme like it was a holiday. Because they were *bored*. Except this one is of world scope. Radiologists and diaper-changers alike. So that the fanatics have a tub to thump; which is to say, the apocalypse and the after-world will really seem like a vacation from the shameful scrambling around for shit-work.

        • I’m not a radiologist, but I used to work with radiation oncologists and I felt that reading scans was a very small part of their work.

        • True. There are many types of radiologists with varying degrees of patient and medical peer contact. Working in a hospital is very different from working in the back office of an imaging clinic.

      • “Also just because the best societal outcome is to use some technology doesn’t mean that’s the outcome that will occur. ”

        However, it almost certainly *would* occur if it weren’t for one thing: government regulation.

        • Chris Wilson said: “This One Weird Trick!”

          Not at all! :)

          There are plenty of policies that could be changed today that have negative social outcomes and don’t even require new technology. But some people are deeply invested in protecting their own turf and/or beliefs, so they’d rather not see any regulations pulled back, no matter how beneficial.

          In the case of radiology, if there were a functional AI alternative, we can be sure that the regulatory state would be used by doctors to protect their turf, as we already see with so many other licensing requirements.

        • I pointed out a good example below. Long-haul rail could probably operate entirely autonomously right now, but the jobs in the cab are protected.

        • Jim:

          In the example of my dilution assay modeling, the problem wasn’t government regulation. The problem was just that nobody was motivated to put in the effort to get the better method working. In theory, I’d think that one of the companies that makes the assay machines would be motivated to do it, as it should give them a competitive advantage to get more efficient inferences—better estimates using less data!—but I guess they’re not really set up to do this: they don’t have internal research departments that could do it, and there’s nobody from the outside pitching to them. Back in 2004 or so when we wrote our paper, I thought of trying to contact the assay company and sell them on the idea of our method, but it seemed simpler to get an NIH grant to do the work. Then for various reasons we didn’t get it done. In retrospect, if we’d been working directly with the company we might have made more progress. On the other hand, it’s risky to do this within a corporate environment, because what if they own the IP and then decide not to release it? Anyway, yes, there were technical challenges, but I still think the problem would not be so hard to solve by a qualified team working on it full time.

          I think there must be lots and lots of examples like this: moderately low-hanging fruit that don’t get picked because nobody’s quite set up to do the work.

        • > the problem would not be so hard to solve by a qualified team working on it full time.
          At least in some cases I am aware of, it was too hard to convince decision makers that funding a qualified team working on it full time would have a reasonable payback. And even if that happens, it is hard for them to ensure the team is adequately qualified.

          And then there is the inertia of the adequately qualified wanting to stick with what they learned in grad school or are already familiar with: http://www.stat.columbia.edu/~gelman/research/published/authorship2.pdf

        • There’s a sort of Keynesian Beauty Contest thing going on too.

          Even if you yourself are certain that a new approach is better, your analysis is usually for some audience that might not know/agree. You can try to convince them about the approach AND about whatever subject-matter conclusion you’re drawing from the data…but now you’re having two arguments rather than one.

          I think this, rather than straight-up ignorance, arrogance, or sloth, is a big factor in why less-good methods tend to stick around.

        • “I think there must be lots and lots of examples like this: moderately low-hanging fruit that don’t get picked because nobody’s quite set up to do the work.”

          I agree. The degree to which the “fruit gets picked” is probably directly proportional to the cost/effort/benefit for a company. Investors don’t go dumping piles of development cash into technological improvements that will generate a marginal profit on small volume years in the future. :)

        • jim:

          You are right. And industry or market disruption is only possible where the costs or potential profits are misunderstood.

        • Ah, if only there were no pesky antitrust regulation, then we could all be enjoying the advantages of internet explorer 6. Just think of all the toolbars we’ve missed out on! The revenue we could have diverted to enterprising young malware distributors!

          https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguish

          I agree that regulatory capture is one of many underhanded tactics for incumbent corporate players to defend inferior products and eliminate competition. That government regulation is the *only* such tactic that works is the sort of deranged opinion that can only be held by a truly blind ideologue or one who’s never worked for a large private sector firm.

        • Indeed, the realities of market power, and real-world behavior of large firms, constitute the F-35 sized hole in libertarian ideology…reference intended ;)

        • “constitute the F-35 sized hole in libertarian ideology”

          Only insofar as regulators choke up the market to give a few large firms a massive advantage. :) Regulation always benefits large firms because they have the scale to manage it.

          Here in Seattle a pile of new regulations on landlords over the last decade is driving small landlords out of business. Once they’re gone, the new corporate landlords will bring in the lawyers and master the regulation to their advantage. Meanwhile, homelessness is still on the rise despite all the happy new restrictions intended to screw landlords.

          Yup, regulatory capture is why excessive market power shouldn’t be allowed to develop in the first place…which is why you need strong democratic institutions and muscular anti-trust enforcement. Weakened institutions and weak government – the stated aim of libertarians – lead directly to greater market power for corporations, and greater reliance on “good old boy” networks (aka the equivalent of medieval, hereditary gentry) to run things at the local scale. So getting back to your original quip: no, it is definitely NOT the case that government regulation is why technological developments don’t invariably lead to the socially optimal outcomes. Not in the slightest.

          In short, you need minimally two categories to have a hope of a good outcome in this project of self-governance: good regulation (no child labor, no chattel slavery, food safety, drug purity, can’t dump surplus dioxins in the local aquifer, can’t form cartels, etc. etc. etc.) versus bad regulation (excessive guild protectionism, excessive red tape ensuring that only major firms with compliance departments can succeed).

        • “That government regulation is the *only* such tactic that works:”

          Did anyone say that here? :) It *is* the only such tactic that’s directly implemented and controlled by the government.

          “we could all be enjoying the advantages of internet explorer 6.”

          Not really sure what you’re getting at there. The attack on MSFT might have saved Apple, but it didn’t create competition in the browser market and it didn’t create Google (or Chrome), Amazon, Facebook, Adobe, Salesforce or most of the other major tech companies we have today, and these days AAPL is having its own troubles.

        • “That government regulation is the *only* such tactic that works:”

          Did anyone say that here? :)

          You did.

          https://statmodeling.stat.columbia.edu/2021/06/07/ai-promised-to-revolutionize-radiology-but-so-far-its-failing/#comment-1857719

          “Also just because the best societal outcome is to use some technology doesn’t mean that’s the outcome that will occur. “

          However, it almost certainly *would* occur if it weren’t for one thing: government regulation.

          The best thing will almost always happen except for “one thing”.

          Every so often, when you’ve said something idiotic, the best course of action is actually not to double down.

          Not really sure what you’re getting at there. The attack on MSFT might have saved Apple

          As a corollary, when you don’t know what you’re talking about, the best course of action is actually not to make things up. I’m referring to the first example on the page I linked, the 1998 antitrust case United States v. Microsoft, the explicit subject of which was competition in the web browser market. Microsoft was engaged in practices such as barring OEMs from bundling other web browsers with their machines, intentionally delaying access to APIs for software developers they viewed as competition, and paying content providers to develop their technology with their piece of shit ActiveX, specifically to break functionality on non-Internet Explorer browsers. If Microsoft could strong-arm web companies into using proprietary Silverlight or whatever, we would absolutely have a less competitive browser market. Another practice barred by the case was Microsoft breaking the cross-platform Java runtime. Whatever your opinion of Java’s garbage-collected runtime or verbose language design, a functioning cross-platform Java runtime is of core importance to lots of important software. I’m guessing that by the “attack on Microsoft” that “saved Apple,” you’re referring to the out-of-court settlement wherein Microsoft agreed to buy $150 million of Apple stock. Entirely separate. Not even on the page I linked.

      • > You think radiologists will want to have their top 1% jobs eliminated and just go on to be paid $15/hr to plug images into software tools? Hell no, they’re going to argue that people should die because they don’t use the superior tools, just so they can keep paying the mortgage on their six-bedroom houses and upkeep on their fishing boats.

        If the argument is that a tool that makes the job easier will make the job go away, there are counterexamples to that. Programming now is probably easier than it was 50 years ago. It remains a lucrative job, and there are probably more programmers now than there were then. Obviously automation can go the other way too, but I don’t think the idea that the radiologist labor market is the strongest force here holds water.

        From the article:

        > Furthermore, only 40 of the more than 80 radiology algorithms currently cleared by the FDA, along with 27 in-house tools, were utilized by respondents. Only 34% of these were used for image interpretation; the other applications included work list management, image enhancement, operations, and measurements. The bottom line: only about 11% of radiologists used AI for image interpretation in a clinical practice. Of those not using AI, 72% have no plans to do so while approximately 20% want to adopt within five years.

        So from the article we have 3x as much AI in 5 years? That’s pretty substantial growth. They’re reading it glass half empty but to me that seems glass quite full lol.

        And then:

        > when we collect data from Stanford Hospital, then we train and test on data from the same hospital, indeed, we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions. It turns out [that when] you take that same model, that same AI system, to an older hospital down the street, … [this causes] the performance of [the] AI system to degrade significantly.

        I don’t know about radiology or ML, but that really sounds like a very solvable problem. The companies need a lot of labeled data from a lot of hospitals.

        • As a trained MD and engineer, and an ML practitioner, I am fully confident saying AI won’t replace radiology anytime in the next 20+ years at least (actually, not in my lifetime is even more likely). Same problem with fully automated cars. The current neural-net-based approaches to ML can’t solve either problem. Even if we solve the technical problems, there are still the human dimensions to both of them. No, AI won’t be replacing radiologists any time soon. Getting to a fully automated, machine-only transport system is possible. But AI co-driving with humans? We are still decades away.
          More data, as you seem to be suggesting, is not the answer. The tech just isn’t there to actually solve all elements (technical, human, social) of the problem.
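
          To make the quoted “hospital down the street” problem concrete: the gap shows up as soon as you evaluate with whole hospitals held out instead of a random split of pooled data. Here is a toy sketch with made-up numbers (synthetic features, an assumed site-dependent calibration shift, nothing resembling a real imaging pipeline), just to show the shape of the problem:

          import numpy as np
          from sklearn.ensemble import RandomForestClassifier
          from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

          rng = np.random.default_rng(0)
          n, n_sites = 2000, 5
          hospital = rng.integers(0, n_sites, size=n)     # which site each image came from
          signal = rng.normal(size=n)                     # latent "pathology" signal
          X = np.column_stack([
              signal + 0.8 * hospital,                    # reading shifted by each site's calibration
              hospital + 0.1 * rng.normal(size=n),        # artifact that effectively identifies the scanner
              rng.normal(size=(n, 3)),                    # irrelevant features
          ])
          y = (signal > 0).astype(int)                    # ground-truth label

          model = RandomForestClassifier(n_estimators=200, random_state=0)

          # Random split of the pooled data: every site is represented in training.
          within_site = cross_val_score(model, X, y, cv=5).mean()

          # Leave one hospital out: each test fold is a site the model has never seen.
          cross_site = cross_val_score(model, X, y, groups=hospital,
                                       cv=LeaveOneGroupOut()).mean()

          print(f"pooled random-split accuracy:    {within_site:.2f}")
          print(f"leave-one-hospital-out accuracy: {cross_site:.2f}")

          The first number looks publishable; the second is what deployment feels like, and just piling up more images from the same few sites does not close that gap.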

  2. It is hard for me to imagine a worse use of AI than radiology.

    I got to experience the AI revolution in real time. Early in my career, I taught myself how to interpret infrared spectra of finished products, mostly plastics and lubricants and such.

    First it was just me comparing my results to printed spectra in fat books. Then the first computers led to the FTIR revolution (Fourier Transform IR) and brought primitive search-match software, which only worked on pure substances. Reading spectra is like learning a new language, and many (most) labs didn’t have someone who could do it. So right from the beginning of search-match software, some labs simply gave the customer the search-match results list. In the real world, the results were useless because a typical plastic has five constituents, all of which respond to infrared differently, and the correct plastic would not appear on the search-match list.

    As time went on, the software got better of course. By the time I retired, there was a function that could deconvolute up to three spectra superimposed over each other, better than I could. It was only possible because of vast computing power compared to what had been available, but still worked by brute force to a large degree.

    Thinking about radiology – amorphous shapes in grayscale – I can imagine that the software will eventually get there, but it might be the last field where it happens.

    • Coming from the chemical industry I appreciate what you are saying here but I think there’s a difference:

      Deducing component identity from a mixture’s IR signature was never easy, even for a human.

      OTOH in the case of radiology we are probably only trying to get an AI algorithm to do as well as a human radiologist already does.

    • “It is hard for me to imagine a worse use of AI than radiology.” That seems odd to me. Computers can now be trained to recognize many patterns and phenomena much, much better than humans can. Why would radiology be different?

      Matt: my father spent much of his career at the U.S. Department of Agriculture, where he worked on ways to extract useful information from satellite images. He got tired of people assuming that you could determine the type and health of a crop based on its spectrum, so he wrote a little program: feed in a spectrum (from a LANDSAT image of a specific location, say) and the program would spit out a bunch of spectra that looked just like it and tell you what crops they were. A spectrum from a field of young, slightly dry corn might turn out to be virtually indistinguishable from that of mature soybeans or whatever. I guess mostly the chlorophyll absorption just dominates and everything else is small background that is modified by atmospheric conditions etc. anyway. At any rate, as a practical matter, given the spectral resolution available at the time, you couldn’t even tell what species the crops were.

      • > He got tired of people assuming that you could determine the type and health of a crop based on its spectrum

        I would have guessed something like this is why the satellites were taking pictures in the first place. Is this true and the spectra didn’t end up being that useful? Or were they put up for something else and everyone just thought this would be one easy addition to the software? Or what?

        • I wish I had asked my dad more about this.

          I think it’s true that the satellite designers thought the spectra would be useful for more things than they turned out to be. But even with their limitations for determining the status of crops, they were still useful for lots of other stuff. If all you have is a photograph — by which I mean red, green, and blue channels, which is the same as a spectrum that just gives you the average in each of three wavelength ranges — you can’t always even distinguish sand from dry grass, but that’s easy with even limited spectral data.

      • “‘It is hard for me to imagine a worse use of AI than radiology.’ That seems odd to me. Computers can now be trained to recognize many patterns and phenomena much, much better than humans can. Why would radiology be different?”

        Radiology is only different by degree. (In addition to infrared spectroscopy, I was also a radiographer, including real-time later in life.) Image recognition by AI is incredibly powerful, no doubt about it. But it is optimized for problems like “Where’s Waldo.” By that I mean you can make the image as complicated as you like, but Waldo’s face is always the same and modern computers could find it nearly instantaneously. In comparison, radiography has a host of problems. It reduces 3D to 2D. It causes parallax. Your brain is pretty good at recognizing and accounting for parallax, but it is challenging for AI. X-rays are in grayscale, so no shading from perspective, no color differences, etc. If you think about accounting for 3D parallax on a 2D image, using density variations in gray scale, you get an idea of the difficulty.

        I think that before we get good AI for radiography, we will have to implement widespread usage of confocal X-ray imaging.

  3. Posts like this keep reminding me of an article published by MIT Tech Review last fall. It discusses how the Duke University Health System implemented Sepsis Watch, a Duke University deep-learning model, to better detect sepsis in its patients. The takeaway is essentially that the tool was only successful because of its bespoke implementation process, which relied heavily on hospital nurses and their understanding of how the hospital actually operated. A technology is not introduced to a blank slate, it is folded into a web of existing structural and social hierarchies and systems that, more often than not, it disrupts. A new tech only works when it has sufficient buy-in. That only comes when the tech complements and is an outgrowth of these underlying systems. I think the same could be said of AI’s place in radiology.

    Article: https://www.technologyreview.com/2020/10/02/1009267/ai-reduced-hospital-deaths-in-the-real-world/

    • Agreed, there is a “user-centered design” process behind integrating AI into health settings effectively (and probably most other settings for that matter). How to elicit and apply domain knowledge from the various people the AI is supposed to help is a part that doesn’t get discussed as much (at least not in the ML lit) but which is obviously critical.

    • Ben B:

      This “web of existing structural and social hierarchies” is also why many companies are unable to adapt to a changing competitive landscape. Required buy-in can simply reinforce and concretize current maladaptive structures and processes.

        • Hmmm. (How many are Benjamin B? Bens I’ve known include Bentley, Bennett, and Benedicte as well as Benjamin …)

  4. No field in medicine saw more technological innovation than radiology during my career. I remember being slack-jawed with amazement when I saw my first head CT scan in a patient with melanoma and a suspected brain metastasis; the image was so clear, so obvious! This was on the second scanner in the US in 1976, and it obtained images in 5 mm slices. Image quality and precision have increased dramatically since then. We can image normal-sized lymph nodes and determine whether they contain cancer with pretty good reliability. MRI and PET also generate a huge amount of data. The amount of data is growing rapidly, and there is no way the number of human readers can keep pace. I think that AI is going to be a necessity. We can’t build using guys with picks and shovels; we can’t do actuarial work with paper and pencil, and we can’t do radiology with guys looking at a viewbox in a dark room. The amount of information generated by the tools of 2020 is too great, and the tools are getting better. There are aircraft that require adjustments in trim faster than humans can respond, and we have devices crawling around Mars that are too far away for immediate human oversight; these things require machine learning. To get useful information out of the fantastic imaging tools we have requires AI tools. Development will take more time than predicted by an enthusiast, but it is coming.

    • Has the development post 1976 been as amazing?

      Most of the advances in CT, MRI, etc. so far seem to be what I would call digital signal processing, sensors, algorithms, and physics, rather than what passes for AI.

      Basically we get increasingly precise and high-resolution images coded by various attributes.

      I guess the AI part is what would allow the diagnosis to be made automatically and that does not seem to have progressed much?

      I mean even a macroscopic thing like a femur fracture still gets manually reported by a human radiologist?

  5. Without actually seeing the performance data, not linked anywhere as far as I can tell, it’s hard to know where the conclusion actually lies on the spectrum between (i) “AI has overpromised and under delivered” and (ii) people resist better, non-human tools that might make their jobs redundant. I can easily imagine either.

    • Andrew:

      Looks like your 2004 article fell victim to the first law of wing walking: the natural and normatively rational reluctance of scientists and practitioners to let go of what they’re holding until they are holding something at least as secure.

      John

    • I completely agree that the story here could be either (i) or (ii), or some combination of the two. The practitioner self-interest point (ii) is particularly important in healthcare because the economic pressure toward lower costs is relatively weak (sometimes even inverted) in that sector, at least in the US.

      Of course, the other sector that AI was scheduled to have revolutionized by now was transportation. The downward cost pressure in that sector is relentless, so that seems like a more straightforward case of AI over-hype.

      I’m curious if others think this is just a slight timeline adjustment or some more fundamental barrier. I don’t claim to know much about AI, but some surprising failures (e.g., the crazy volume of obviously fake reviews on Amazon) give me pause.

    • Raghu:

      Yes, there are a lot of steps between technology and implementation. Here’s a quick list:

      1. Making the technology work at all.
      2. Making it reliable.
      3. Implementing it so it is easy to use.
      4. Getting a track record so people trust it.
      5. Overcoming economic barriers.

      When writing the above post, I was thinking of all these difficulties, and that’s one reason I gave the example of my own technology with the serial dilution assays, which succeeded with #1 but not #2, 3, 4, and 5.

      In other settings, it can be possible to start with #5 and then try to go backward to reach #1. For example, that Tesla autopilot thing which doesn’t work but has somehow succeeded with #3, 4, and 5. Or various scammy things like health care databases that cost zillions of dollars and have to be thrown out, but they succeed in the sense that someone cons the purchasing officer at some major company to pay 2 million dollars for it.

    • (i) “AI has overpromised and under delivered”

      This! AI today is nowhere near ready to replace radiologists and won’t be there even in 10 years. I say this as an ML practitioner and a trained MD.

      Yes, the medical world is slow to change, and necessarily so. Imaging hardware and modalities are constantly improving incrementally. The likely path to AI usefulness in radiology is through finding limited use cases, with well-defined problems that current technology can actually handle, where AI can augment those incremental improvements.

  6. A major problem with the scaling of research projects in this space is the lack of enough open data and stringent review processes. Algorithms that supposedly ‘worked’ were later found to be keying on rulers in the images (a physician would put a ruler only next to a more suspect lesion, or the ruler would be in an image from a specialist provider, so those lesions had already passed an earlier screening). Much of this was written up a couple years ago: https://www.statnews.com/2019/10/23/advancing-ai-health-care-trust/

  7. I suspect one of the bumps in the road here was the position that, given we can’t understand how people make decisions (the explanations they give are just stories rather than facts), we should not expect or even try to understand machine learning models/representations.

    However, models are deductive and can be fully understood (in terms of what they imply) unless they get too complex. And even models as complex as deep neural nets do get understood well enough in certain cases, e.g., no green pixels: can’t be a cow. So they are almost ideally placed to be found untrustworthy.

    So while dismissing interpretability as an unreasonable expectation may have worked initially – hey, the accuracy-versus-interpretability trade-off myth persists and there is still an “explainable” AI (XAI) industry – it was a self-defeating strategy. Eventually people do understand, at least whenever they can.

  8. Here are two ways that AI, as applied, could get worse with time.

    1) Radiologists can exercise quality control over images. Fuzzy image? Obvious artifacts? The radiologist can insist on getting another image, and has the clout to insist on decent image quality. Fast forward 20 years, to a world with AI reading the images. Now the “AI radiologist” is just a machine with no clout. Over time people get sloppy about image quality and the AI just has to do the best it can with poor images. Accuracy goes down.

    2) Along comes a respiratory virus. Lots of patients have mild scarring in their lungs. But FDA certification is an expensive one-way street. Accuracy has gone down, but the old, certified software doesn’t get decertified. Meanwhile, upgrading the software is technically quite demanding and nobody wants to spend the large sums of money needed to get the upgraded software certified, so it doesn’t happen.

    Getting something that is good in theory to be good when rolled out nationwide is hard. Ensuring that the deployed version stays good is harder still.

  9. Reminds me of some work related to visual query refinement interfaces for pathologists working with deep learning based image search algorithms, e.g. https://dl.acm.org/doi/pdf/10.1145/3290605.3300234. The idea there seems to be that what possible diagnoses are relevant and what visual features should be analyzed closely will depend in part on information the pathologist has about the patient and case at the specific point of the search, so they can lose trust in fully automated methods.

  10. I don’t believe this has much to do with ‘performance’ at all. It seems to me that the places where AI has the hardest time making a real impact are places where qualitative results are most important, rather than quantitative ones. Getting self-driving cars is a totally qualitative goal (does it need to avoid 100% of accidents? handle 100% of situations? These are impossible goals for any human driver, but any other quantitative threshold seems irresponsible). At a certain point someone needs to say “yeah, this is good enough,” and there is not really a good indication of where that point should be, so it’s impossible to reach.

    Diagnosing from radiology scans seems like it could be quantitative, but it’s not. It’s only quantitative when you have a database of labeled data, which is not reflective of the real world at all. In the real world success is measured by something like “did the patient get better or avoid some bad outcome,” where getting better involves a huge number of things above and beyond detecting some anomaly on a scan, most of which have more to do with interpersonal interactions or institutional practices. It’s not at all surprising to me that adoption of a complex new tool to marginally improve this one aspect of an already complex process isn’t happening. And it’s not plainly obvious to me that the world in which it DOES happen would have better outcomes (because of the rest of the complex process that has nothing to do with image processing).

    This is not to mention that any of these ‘automated’ systems require constant upkeep, usually by highly trained professionals who demand salaries near the top of the earning ladder. You need a pretty big improvement to justify that amount of additional overhead. Maybe this could work if you can replace a whole team of radiologists with one ML engineer, but there is no incremental path to this. And it’s not like any given hospital system has millions of radiologists to potentially replace, so the scaling potential is really limited.

    AI seems to dominate in certain areas (advertising, finance) where a small quantitative improvement is both important and measurable. This is also the world in which AI researchers operate (optimizing some model accuracy metric, and being rewarded with papers in flashy journals). Given that advancements in this area seem to consist mostly of adding complexity (and therefore cost), I’m not sure it will ever reach the point where the benefits scale enough to outweigh the costs in the real world for things like healthcare.

    Of course, people much smarter than me are making billion dollar bets to the contrary.

  11. I don’t know enough about radiology to vouch for the closeness of the analogy, but a dozen or so years ago there was a tremendous amount of talk about advances in autonomy eliminating long-haul trucking as a profession in the very near future (3 to 5 years). Perhaps not coincidentally, recruiting people for the field has gotten more difficult.

    • The problem for the LHT industry is that there is already a competing technology w/ much lower labor cost per ton: railroads. Railroads would probably already be operating without crew if the regulators were out of the picture.

  12. Interesting that nobody (as far as I see in the comments) mentioned the best replicated result of research in decision making in psychology:
    Clinical versus statistical prediction,
    namely the research of Paul E. Meehl and colleagues, which started in 1954.
    Statistical (mechanical, actuarial) prediction was almost always better than the clinical (intuitive) judgments of trained clinicians. Even models taken from the expert clinicians were better than the clinicians themselves.
    See also the references at the homepage archiving Meehl’s work at the University of Minnesota:
    https://meehl.umn.edu/publications/category-0
    See also the latest meta-analysis there, and the slow process of accepting these results. I guess this will also apply to radiology.
    Best regards
    Werner

  13. AI indeed is not doing well in radiology.

    However, AI is doing well in ophthalmology. Of the FDA-approved AI algorithms, I believe only 2 are autonomous AIs, as opposed to assistive AIs, and they are both for screening for diabetic retinopathy.

    One of those AI tools, EyeArt, is being used successfully in the US for community screening programs. There was a good webinar about it recently, which is, ironically, not on the EyeArt website. I was able to find a link on Twitter, https://twitter.com/EyenukInc/status/1395381595251560456?s=20 (It requires registration, but after registering you get a link to the recording, which may work: https://zoom.us/rec/play/oB_Q_EMGU_DTNqHdL2yM6zIYBEmaShiAGX0GnNRnkDUZGn5-NAq1-F3R_MTLI1Gq4CjhMBhi_rxq4QPU.d2zzaGVHBFgW3KxJ?continueMode=true&_x_zm_rtaid=s45jfveOQL2emkH7uuGL0g.1623279951087.4f079ebec742e4e88d95c78a4f9dea3f&_x_zm_rhtaid=647)

  14. 1) Radiologists have to sign off on any AI-produced diagnostic reports. AI reports need a stamp from human beings.
    2) Radiologists also do radiotherapy, which is not AI.
    3) As has always happened whenever a new tech is implemented in the medical field, it induces new and greater demand and raises health care costs, often significantly.
    4) New machines and new tech will lead to more jobs for high-end workers, but will replace low-end workers.
    5) Finally, the AI field is full of self-congratulatory research. You always need to take their claims with a grain of salt.

  15. Radiology has had AI for decades, for example in mammo and chest XR. And we still have to override the vast majority of machine calls.
    It’s just not very good, and in some cases absolutely terrible. And they’ve been at this software for at least 25 years.
    I’m not even mentioning the liability issues. Will the software company carry a malpractice policy? Who will be responsible for the (enormous) number of mistakes?
    Finally, the machines cannot even reliably make the findings. But the real job of the radiologist is to integrate those into the big picture of the medical history and correlate them with a multitude of other tests. And for that the machines are useless.
    Not worried a bit about the future of radiologists. Actually just the opposite.
