“MIT Built a Theranos for Plants”

This news article by Tom McKay is hilarious:

The prestigious multidisciplinary MIT Media Lab built a “personal food computer” that worked so poorly that demos had to be faked Theranos-style . . . According to Business Insider, the project—a plastic hydroponic grow box filled with “advanced sensors and LED lights” that would supposedly make it possible to replicate crop conditions from any part of the globe—was a sham, with MIT’s Open Agriculture Initiative director Caleb Harper resorting to faking demos . . . According to Business Insider, Harper directed an email requesting comment to an MIT spokesperson, who “didn’t provide a comment.”

No comment, huh? The “Media Lab” should be able to do better than that, no? If that’s all they can do, they could just as well hire the media team from Cornell. You’re the goddam media lab, dudes: you should be able to work with the media, no?

Ahhh, MIT . . . what’s happened to you? Lots has changed since 1986, I guess.

29 thoughts on “MIT Built a Theranos for Plants”

  1. I’m not bothered by washing dust off the leaves to get a better picture for publication. Let’s give them a pass for that one.

    As for the rest: WTF? They put dozens of machines in schools and “at most two grew a plant”, how is that even possible? A plant needs air, water, and light. That’s it. (For years I thought orchids were especially finicky, from reading Nero Wolfe novels, but that wasn’t true, they’re easy too). I mean, even if the box did LITERALLY NOTHING you’d be able to grow a plant in it, as long as you’re allowed to water the plant.

    • In my experience it is quite challenging to grow a plant in a box.

      Without airflow, the plants get spindly and fall over. Powdery mildew is a big problem as well. You have to spray with insecticidal soap to prevent aphids. There are no pollinators unless you put the box outside and open it up, and now you are not growing in a box anymore. The list goes on.

      I agree that there is a WTF here, but it is more like: couldn’t they find one person who had actually tried to grow a plant in a box before? How about one of those Biosphere terranauts?

    • Dusting off the leaves for a photo shoot might be fine except how would the leaves get dusty on a plant grown in the box? They wouldn’t, and the plant wasn’t, so the plant was put in the box and dusted to hide from funders the fact that it wasn’t grown in the box.

    • “how is that even possible?”

      I remember reading about this ages ago. One of the problems was that the boxes had a Unix computer in them, and getting each box set up required someone who could do Unix systems maintenance, required an internet connection to download the software, and never worked. Typical computer-related stupidity.

  2. Not to defend this stupidity, but personally I feel that “no comment” is very often a good response where the media is concerned, even when there’s nothing to hide.

    The average journalist today has very low standards. Much harder to misquote a no comment.

  3. I wouldn’t speak to a journalist or media person on any topic for any reason, period. As Rahul says, there’s just no future in letting someone else interpret, paraphrase or simply misrepresent your position.

    There is no media person, anywhere, whose interest is perfectly aligned with mine. Someone’s quest for more page views or clicks is not going to necessarily involve quoting me accurately or completely.

    • People who are on the public payroll have a responsibility to explain their actions to the media. If they don’t like what’s written, they can always complain to the media organization or file suit.

      The allegation of a “Theranos Style” fake is a serious allegation. If I had this story I’d take it up the ladder before publishing it. The prof doesn’t want to talk, well, why not go to dept chair or dean? When they get it that they’ll be on the spot too, things might look different to them.

      Or maybe not. The reality is no one really gives a hoot about anything anymore. The USPS has been bleeding billions for a decade but no one did anything – after all, if the USPS is managed by 400-odd congressional reps, who could possibly be held responsible for anything?
      No one cares because voters don’t punish corrupt and incompetent public officials. But hey you can always try.

      • Well I don’t like it if someone misinterprets something I say. I try to be clear and would rather be understood, naturally. But journalists make a habit of deliberately and publicly misrepresenting what people say, in order to suit their own ends.

        Here’s a test. If a media person asks you for a comment on something, ask if they will let you proofread, correct and have final approval of any remarks they attribute to you. If they aren’t willing to do that, they can’t be trusted.

        • Since I disagree with much of your other posts, I thought I should chirp in here: you are exactly correct.

          This is old news. My father (whose day jobs included being the safety officer for Harvard’s Chemistry department) once taught a summer school course at Boston University’s music school. Teaching his side job*, since classical musicians don’t know squat about their instruments. The local press got wind of it and wrote it up. And made a complete mess of just about everything. Father was initially aghast and livid, but calmed down and decided that telling people about the course, even if the details were all wrong, was maybe useful. And that he’d never again talk to a reporter.

          *: https://pbase.com/davidjl/image/121610792

    • Pete:

      Most of our posts are on a 6-month lag, and then I rescheduled many posts originally planned for the spring because they were displaced by coronavirus material. So that’s how we end up with year-old material sometimes.

  4. “The “Media Lab” should be able to do better than that, no?”

    Uh, no. As an MIT type from way back (EECS ’76), I’ve _NEVER_ been able to figure out why the Media Lab exists. To the best I can tell, it’s always been pure, unadulterated hot air. This plant box thing is a well-known disaster, and that group (or a related one) has been accused of illegally discharging industrial waste into the local sewage system. And then there was the Epstein thing. They do invite in some kewl people, occasionally. Sputniko! was there for a while.

    “Ahhh, MIT . . . what’s happened to you? Lots has changed since 1986, I guess.”

    MIT has always had a penchant for making really stupid mistakes. IAP my freshman year (January 1973), the Mech. E. department decided to make the world’s largest yoyo and run it from the Green Building (the tallest building on campus at the time). That January had record cold temperatures, and it froze up and didn’t work. And it turns out the US Navy has a much larger yoyo that it swings from a helicopter and actually works.

    I recently started checking out MIT course videos, and the first two I looked at were terrible. The Comp. Sci. one was just hideous (it uses Python, and doesn’t understand why that’s a bad idea*). And the math one was a very famous mathematician blathering on and on not explaining what he was doing. His fans (people who already knew the material) loved it. But the course linked below is a blast. The instructor is frigging superb. So we’re not all bad. But the screw-ups are embarrassing.

    https://ocw.mit.edu/courses/physics/8-04-quantum-physics-i-spring-2013/

    *: I use Python because it handles Unicode correctly and transparently (C++ doesn’t, sheesh). But it’s an insane language: “begin” is spelled with a single colon, and “end” is spelled with _negative one tabs_. And it’s got worse variable semantics problems than 1960s Lisp.

      • The begin and end bits are nice, but Julia’s Unicode handling is overmuch. It’s theoretically possible to do string processing with a variable-length encoding, but it’s seriously bad for one’s sanity. Python gets this one right: it bites the bullet and says all internal strings are 32-bit bytes holding real Unicode code point values (and converted to/from UTF-8 or whatever on I/O). Julia’s basic string handling appears to be on UTF-8 variable-length (8-bit byte) strings, with conversion to/from fixed width representation provided, but it’s the user’s job to use it. Dunno if Julia can actually do string operations on 32-bit arrays: I got to the point where I found the description of the functions for counting bytes vs. counting characters in variable-width strings, laughed hysterically, and ran like hell.

        I realize that most programmers will freak out when told* strings are 32-bit bytes, but strings have to be: emoji are outside the BMP in Unicode and thus require 32 bits. It’s time we got over it. (I was considering writing my own (C++) UTF-8 to 16-bit conversion routines that discarded stuff outside the BMP. It would have been fine for what I’m doing, but I’d rather be doing the domain programming that’s the actual object of the game than low-level systems programming. (It sounds as though Julia’s UTF-8 to 16-bit conversion does just that (maybe).) And Python actually worked for said domain programming.)

        (Translation: Thanks for the heads up. I might not have bothered looking at Julia otherwise: it clearly gets a lot right and is worth paying attention to.)

        *: Python gets around this by simply not telling you what they’re doing. “We handle the strings. Don’t worry about it. They just work.” they say.

        • Converting everything to a 4-byte format will work fine for small stuff. But if you want to read in an 8GB corpus of UTF-8 text and do computing on it, turning it into 32GB is not necessarily helpful. Julia usually aims for capacity to do very heavy lifting.

          Julia’s methodology is to use its strong typing system. AbstractChar is a supertype, Char is a subtype that uses the 32 bit representation you mention. codepoint(c) gives you the number that represents the character in the unicode standard.

          Julia makes a distinction between the length of a string (in characters) and the size of a string (in bytes).

          length(str) # number of characters
          sizeof(str) # number of bytes

          If you want to hand a Julia string to C you convert it to UTF-8 bytes with Core.String:

          Core.String(myjuliastring) # a byte buffer using UTF-8

          The usual issue someone might have is that they want to index strings using indexing notation, like mystr[3]

          This could be problematic for example if the first character is a 1 byte character, and the next one is a 2 byte character, then there is no character at mystr[3].

          The problem is usually solved by not indexing but instead iterating, which is more “julian” anyway.

          for char in mystr
              # work with char here
          end

          or by collecting the indexes and iterating over the collection:

          for i in eachindex(mystr)
              c = mystr[i]  # i is always a valid character index; Julia strings are immutable, so read rather than assign
          end

          if you need to step forward and backward by characters you use nextind and prevind

          more details here: https://docs.julialang.org/en/v1/manual/strings/#String-Basics-1

          In Julia it’s more rare to write loops as

          for i in 1:100

          end

          rather than

          for i in {{some collection here}}

          end

          so the issues of sanity don’t arise as much
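
          The indexing hazard described in the comment above (a byte position that lands in the middle of a multi-byte character) is not specific to Julia; it applies to any UTF-8 byte buffer. Here is a minimal Python sketch of the idea. The sample string and the helper names `char_start_offsets` and `char_at` are assumptions made for illustration, and offsets are 0-based here, whereas Julia's are 1-based.

```python
# Illustrative string (an assumption, not from the thread): a 1-byte
# character, then a 2-byte character, then another 1-byte character.
text = "aéb"
data = text.encode("utf-8")


def char_start_offsets(encoded):
    """Return the byte offsets at which a UTF-8 character begins.

    Continuation bytes have the bit pattern 10xxxxxx, so any byte that
    does not match that pattern starts a new character.
    """
    return [i for i, b in enumerate(encoded) if (b & 0xC0) != 0x80]


def char_at(encoded, offset):
    """Decode the character starting at a byte offset, or raise ValueError."""
    starts = char_start_offsets(encoded)
    if offset not in starts:
        raise ValueError(f"byte offset {offset} is inside a character")
    position = starts.index(offset)
    end = starts[position + 1] if position + 1 < len(starts) else len(encoded)
    return encoded[offset:end].decode("utf-8")


print(len(text), len(data))      # characters vs. bytes
print(char_start_offsets(data))  # only these offsets are safe to index
```

          Byte offset 2 falls inside the two-byte character, so `char_at(data, 2)` raises `ValueError`, which roughly mirrors what happens with `mystr[3]` in the Julia example.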

        • “But if you want to read in an 8GB corpus of UTF-8 text and do computing on it, turning it into 32GB is not necessarily helpful. Julia usually aims for capacity to do very heavy lifting.”

          That assumes your file is predominantly ASCII. But if you are working with, say, Japanese, each char is 3 bytes on disk, 4 bytes in memory. (With a 16-bit trash-the-emoji-and-math-symbols encoding, it’d be 2 bytes in memory.) And my files are only 100 and 50 MB anyway. They might become twice that, but that largely works for what I’m doing. If a word doesn’t appear in one or the other, it really wasn’t used in that period (pre-war literature vs. recent writing). And if I wrote a web-scraper to, say, download (notable subsets of) the Japanese text from the Japanese Wikipedia, 3 vs 4 bytes would be the least of my problems.
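
          The per-character sizes mentioned above (3 bytes in UTF-8, 2 in a 16-bit encoding, 4 in a 32-bit encoding) can be checked with a short Python sketch. The sample strings and the helper name `encoded_sizes` are assumptions chosen for illustration, and the sketch only compares encodings; it says nothing about how any particular interpreter stores strings internally.

```python
# Sample strings are assumptions chosen for illustration.
samples = {
    "ascii": "plant",
    "japanese": "植物",
    "emoji": "🌱",
}


def encoded_sizes(s):
    """Return the byte length of s in three Unicode encodings (no BOM)."""
    return {
        "utf-8": len(s.encode("utf-8")),
        "utf-16": len(s.encode("utf-16-le")),
        "utf-32": len(s.encode("utf-32-le")),
    }


for name, s in samples.items():
    print(name, len(s), encoded_sizes(s))
```

          For the two-character Japanese sample this gives 6, 4, and 8 bytes, while the single emoji needs 4 bytes in all three encodings because it lies outside the BMP.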

          “Julia makes a distinction between the length of a string (in characters) and the size of a string (in bytes).”

          Yes. This is what I don’t want to see. I see you’ve noticed that indexing is O(n) instead of O(1), but incrementing an index requires reading and testing the data. You are talking dozens of instructions instead of one. Really. Variable-width characters aren’t nice.

          “so the issues of sanity don’t arise as much”

          Sounds like something for a famous last words collection…

        • > You are talking dozens of instructions instead of one.

          Well, dozens of CPU instructions (Julia is compiled to machine code) instead of hundreds or thousands of instructions (Python is compiled to bytecode and the bytecode is interpreted).

          Julia will probably wind up being 30x faster I’d guess.

          I copied some random Wikipedia text from the front page in Japanese, and then looked at the machine code to iterate over it and do nothing:

          julia> @benchmark ((txt) -> for c in txt nothing end)(jptxt)
          BenchmarkTools.Trial:
          memory estimate: 0 bytes
          allocs estimate: 0
          ————–
          minimum time: 662.409 ns (0.00% GC)
          median time: 663.119 ns (0.00% GC)
          mean time: 669.148 ns (0.00% GC)
          maximum time: 1.336 μs (0.00% GC)
          ————–
          samples: 10000
          evals/sample: 159

          julia> length(jptxt)
          195

          So it’s about 3.4 nanoseconds per character to iterate over Japanese. If you have, say, 100MB of 3-byte Japanese characters, pure reading through it would take

          julia> 100e6/3 * 663e-9/195
          0.11333333333333334

          0.113 seconds

        • “Well, dozens of CPU instructions (Julia is compiled to machine code) instead of hundreds or thousands of instructions (Python is compiled to bytecode and the bytecode is interpreted).”

          Well, no. The trick to Python is to use built-in functions, not to do it yourself. Counting substrings, sorting, hashing. All that happens at C speed. {/begin cheapshot} On a sensible data structure. {/end cheapshot}

          If you have to do the low-level stuff yourself, you don’t use Python*.

          Also, small tests don’t scale. You are looking at stuff that probably lives in your CPU cache. And you aren’t looking at, say, extracting substrings based on positions of delimiters: x <- find(startdelim), y <- find(enddelim), frobnitz E dictionary, count the occurrences of each dictionary entry word in my two corpuses, take a weighted average, discard words too rare to bother with, and output a somewhat smaller dictionary annotated with the weighted frequency and a parameter that indicates the ratio of that word’s usage in pre-war literature vs. current writing. It.Just.Worked. Run time wasn’t a problem.

          Mostly, I just use it to see how obnoxious the author I’m currently reading is about using obscure vocabulary…
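
          The workflow described above (count dictionary words in two corpora, take a weighted average, drop rare words, and annotate each word with an old-versus-new usage ratio) might look roughly like this in Python. This is only a sketch under stated assumptions, not the commenter's actual program: the function name `usage_report`, its parameters, and the defaults are all hypothetical, and both corpora are assumed to be non-empty lists of already-segmented words.

```python
from collections import Counter


def usage_report(dictionary, old_corpus, new_corpus, min_count=2, old_weight=0.5):
    """Annotate dictionary words with a weighted frequency and a usage ratio.

    All names and defaults are hypothetical. Each corpus is a non-empty
    list of already-segmented words. Words whose combined count is below
    min_count are dropped as too rare to bother with.
    """
    old_counts = Counter(old_corpus)  # counting happens in C inside Counter
    new_counts = Counter(new_corpus)
    report = {}
    for word in dictionary:
        old_n, new_n = old_counts[word], new_counts[word]
        if old_n + new_n < max(min_count, 1):
            continue
        old_freq = old_n / len(old_corpus)
        new_freq = new_n / len(new_corpus)
        weighted = old_weight * old_freq + (1 - old_weight) * new_freq
        # Share of usage in the old corpus: 1.0 = only old, 0.0 = only new.
        ratio = old_freq / (old_freq + new_freq)
        report[word] = (weighted, ratio)
    return report


old = ["a", "b", "a", "c"]
new = ["a", "d", "d", "d"]
print(usage_report(["a", "b", "d", "z"], old, new))
```

          The counting itself is delegated to `collections.Counter`, which is the point made above about leaning on built-ins rather than writing the low-level loops by hand.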

          **: This shouldn’t have been a surprise. I spent most of my CS career working in interpreted Lisp on machines in the .3 MIPS (1970) to 1 MIPS (1990) range…

          Tokyo, Japan. Where even the footnotes have footnotes.

        • > If you have to do the low-level stuff yourself, you don’t use Python

          Indeed, this is precisely the problem that Julia aims to solve. They call it the “two language problem”

          If you are an old LISP hacker from way back, you’ll love Julia. The community considers Julia a LISP dialect.

        • The readme for that string library has some interesting snark. Words such as “faster” “safer” and “much easier to use than to have deal with nextind and prevind, etc. ”

          ROFL. I seem to have called it correctly on the native string handling…

          “This alleviates a common source of bugs in Julia, where people are unfamiliar with the difference between indexing by the byte or codeunit offset, instead of by the codepoints (characters). Using these types in your code can help speed things up.”

  5. I read a Twitter thread by Dr Sarah Taber on this back in September ’19 when the Epstein scandal reached MIT. She’s got lots of experience with indoor agriculture. I quote:
    “I used to work with room-sized versions of those. Back in 2001. They’re called growth chambers, they’ve been standard plant physiology research tools for decades, & they ain’t new at all.”
    https://twitter.com/SarahTaber_bww/status/1171895657872941056?s=20

    • Anon:

      I followed the link, and I can understand Taber’s annoyance! Also, she writes this: “if you think a lot of the high-profile science world seems to be useless, stupid crap, YOU’RE ABSOLUTELY RIGHT.” I guess this is similar to stock market bubbles.

      Compare to, say, movies. To have a successful movie, you have to sell lots of tickets. Advertising and promotion will do some of that for you, but, ultimately, to sell lots of tickets you need to have lots of people actually decide they want to go to your movie. Yes, some bad movies can make lots of money, but it’s hard to make a successful movie that nobody wants to see.

      In contrast, the funding and evaluation systems for science and for companies are indirect. Publicity and the “pal review” system can keep junk science afloat for decades, even if there’s nothing there. And even when these bubbles finally burst, it often seems more like luck than anything else: for example, if Brian Wansink hadn’t done that infamous blog, maybe we’d still be seeing him on TV, NPR, etc. Not to mention the “undead” science such as critical positivity ratio which continues to get hyped even years after it’s been debunked. And MIT’s various unsavory entities such as Undark and the Media Lab haven’t gone anywhere.

      Companies, too, can stay inflated for a long time. To keep a stock afloat, you don’t have to do the equivalent of selling tickets to the movie. You just have to convince some gullible rich people to keep giving you money. I’m not saying it’s easy to convince gullible rich people to give you money, and for that matter I’m not saying it’s easy to do what Wansink or the embodied cognition people did and build an entire scientific subfield out of thin air. It’s hard work, and you need special talents to do it. Unscrupulousness and ignorance aren’t, in and of themselves, enough. What I am saying is that you can get success in these fields without actually producing a functional product. Again, with movies it’s different, because at the end of the day you need to produce something that people are willing to sit and watch.

      • To borrow from another post, journalists at the WSJ, NPR, etc. need a checklist for evaluating the plausibility and newsworthiness of research. Although, perverse incentives come into play here for journalists, too: if a news org properly evaluates bad research, it gets one “dog bites man” story; if it reports credulously on “important” research, which turns out to be a fraud, it gets two sensational stories! Hard to imagine an editor proclaiming, “Darn it, we’ve been duped by a scientist again! Now we don’t have the credibility to report on the scandal we helped create….”

  6. Funny, the article refers to the Media Lab as prestigious and almost immediately says recent scandals have hurt its credibility. Maybe it should say, “Prestigious MIT’s low-credibility Media Lab….”
