Talk – “AI” follow-up talk about labour and academia
I gave a follow-up talk to an earlier talk about “AI” at the University of Bristol TARG research group meeting on 22 November 2024. As usual, there was lots of stuff I couldn't fit into the talk, so I'm putting it here, plus further reading, a transcript, and a video recording of the talk.
The slides are published on Zenodo with DOI 10.5281/zenodo.11051128 listed under the “30 minute version”.
I will try to gather here:
- the video recording;
- short summary;
- further reading collected when developing the talk; and
- a transcript of the talk.
I'll try to clean up this post with more context and details on a best-effort basis.
Video recording
There is a live video recording made during my 22 November 2024 talk which is viewable on the Internet Archive. The video is also embedded here (click the “CC” icon for subtitles):
Short summary
Please see the notes for my original “AI” talk for additional information.
Aware of the irony, I was curious how a large language model (LLM) could take the transcript of my talk (see below) and infer a short summary. The following is what Claude 3.5 Sonnet produced, with some edits by me:
This talk came from my conversation with Jennifer Ding at the Turing Institute about which underlying issues around “AI” technology deserve more attention versus the overhyped aspects. While I acknowledge that new technologies like “AI” can bring positive changes – such as a helpful Speech Schema Filling Tool that helps chemists record experimental metadata in real time as they run experiments – I wanted to focus on several key concerns.
The first observation I made is how “AI”-generated content is affecting academia. I shared examples including a published paper that began with “Certainly, here's a possible introduction...” (clearly ChatGPT-generated) and most amusingly, a paper featuring an anatomically incorrect lab rat with comically oversized genitals that somehow made it through peer review. I've also noted evidence of academics using “AI” tools for both writing and reviewing papers, and even PhD programs where applicants and reviewers use “AI” to convert application letters between bullet points and prose.
I emphasized that words really matter in this discussion. “AI” has become more of a marketing term than a technical term of art, and I pointed to how papers from just before the “AI” hype rarely used the term for the same technologies. I argue that this ambiguous language serves as a smokescreen, shifting power to those who control these tools.
This led me to discuss how “AI” often masks human exploitation. I shared examples including Kenyan sweatshop workers traumatized by moderating graphic content for ChatGPT, their Indian counterparts manually tracking purchases in ostensibly automated Amazon Fresh supermarkets, and bus drivers in “driverless” buses who must remain hypervigilant for that 1% chance of needing to intervene. As Kate Crawford notes, “AI” is “neither artificial nor intelligent” – it's not replacing labor but rather making it more invisible (which Lilly Irani also discussed in depth).
For scientific research, I see several concerns. There's a growing trend of papers proposing to replace human participants with large language models or suggesting complete automation of the scientific process – with one paper proudly claiming it could produce entire research projects from ideation to paper publication for just USD 15 each. I warn that building science on top of opaque and unaccountable “AI” systems risks turning science into alchemy.
While some suggest banning “AI” in academic publishing (following incidents like the well-endowed lab rat paper), I caution that focusing solely on “AI” (“solely” being the key word) might entrench deeper problems like the broken peer review system and publish-or-perish culture. For example, publishing companies might offer proprietary “AI”-generated paper detection tools, which would make us more reliant on them and further consolidate their power without tackling why researchers feel pressured to publish fake papers in the first place.
My key message is that “AI” often highlights existing problems rather than creating new ones. Instead of fixating on “AI” itself, we should address underlying issues in research culture, from job security to toxic workloads. I concluded by recommending resources like the Mystery AI Hype Theater 3000 podcast and the book “AI Snake Oil” for those interested in deeper exploration of these themes.
Further reading
Please see the notes for my original “AI” talk for links and references in addition to what's here.
- [report] Amazon’s AI Cameras Are Punishing Drivers for Mistakes They Didn’t Make: https://www.vice.com/en/article/amazons-ai-cameras-are-punishing-drivers-for-mistakes-they-didnt-make/
- [report] Amazon Fresh kills “Just Walk Out” shopping tech—it never really worked: https://arstechnica.com/gadgets/2024/04/amazon-ends-ai-powered-store-checkout-which-needed-1000-video-reviewers/
- [report] Look, no hands! My trip on Seoul's self-driving bus: https://www.bbc.co.uk/news/business-68823705
- [podcast] Mystery AI Hype Theater 3000: https://www.dair-institute.org/maiht3k/
- [editorial] The advent of human-assisted peer review by AI – in Nature Biomedical Engineering: https://doi.org/10.1038/s41551-024-01228-0
- Words matter; they affect the way we think about issues:
- [essay] Stefano Quintarelli is a former Italian member of parliament who said that instead of “AI”, we could call those technologies “Systematic Approaches to Learning Algorithms and Machine Inferences (SALAMI)”: https://blog.quintarelli.it/2019/11/lets-forget-the-term-ai-lets-call-them-systematic-approaches-to-learning-algorithms-and-machine-inferences-salami/
- [podcast] Completely randomly, I heard another “AI” replacement term “Technical Oriented Artificial StupidiTy (TOAST)” coined by Chris Roberts in the middle of a gaming podcast (19:31 into the video): https://www.youtube.com/live/ADYB-QJGheA?feature=shared&t=1171
- I didn't get to talk about the environmental costs of scaling (or the urge to scale up) “AI” technology; Timnit Gebru of the DAIR Institute touches on this and other issues in this interview (57:58 into the video):
Books
Hanna, A., & Bender, E. M. (2025). The AI Con—How to fight big tech’s hype and create the future we want. Harper. https://thecon.ai/
Narayanan, A., & Kapoor, S. (2024). AI Snake Oil: What artificial intelligence can do, what it can’t, and how to tell the difference. Princeton University Press. https://press.princeton.edu/books/hardcover/9780691249131/ai-snake-oil
Academic literature
Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3), 337–351. https://doi.org/10.1017/pan.2023.2
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? 🦜. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21) (pp. 610–623). Association for Computing Machinery. https://doi.org/10.1145/3442188.3445922
Gu, J., Liu, L., Wang, P., & Theobalt, C. (2021). StyleNeRF: A style-based 3D-aware generator for high-resolution image synthesis. arXiv, 2110.08985. https://doi.org/10.48550/arXiv.2110.08985
Transcript
This started from my conversation with Jennifer Ding at the Turing Institute. And we were talking about: what are some of the underlying issues around “AI” technology that we feel should be surfaced a little more rather than some of the stuff that we think is a little overhyped? And I'm gonna go over a lot of those problems today.
Before I get into it, I want to do something I always emphasize in talks like this, which is that I think for any kind of technology, it can bring about a lot of change in how we do things and how we organize ourselves. And it's not a matter of saying: oh, you know, let's just not use it. There's a potential for “AI” technologies, right? Because if you think about it, when the printing press came around, you don't want to ban the printing press just because you're afraid that the scribes are gonna go out of business. We hopefully can work together to find a way to realize the potential of a new technology.
And I think a positive example that I'd like to share before jumping to everything else is this tool that Shern Tee shared with me. It's called the Speech Schema Filling Tool. So it was developed by chemists for use in their experiments. And what happens is that as you do your experiments, you talk into the microphone on your computer, and the large language model on it will use your audio input to do a speech-to-text conversion and fill in your lab notebook with what you're saying. But what's really cool about it is that the tool will also parse what you're saying and record relevant metadata in a structured data format to go with your lab notebook. So there's a very well-structured metadata set to go with the particular experiment that you're doing. And I think as long as you're happy to talk through your experiment as you're doing it, this tool is so helpful for improving the quality of the data that you're capturing, helping make your experiments more reproducible and so on, right?
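(If you're curious what a pipeline like this might look like under the hood, here's a minimal sketch. The use of Whisper for transcription and the regex-based field extraction below are my own assumptions for illustration only; they are not details of the actual Speech Schema Filling Tool.)

```python
# Illustrative sketch only: the real Speech Schema Filling Tool is its own project.
# Whisper plus regex-based field extraction is just an assumption about how a
# "speak your experiment, get structured metadata" pipeline could be put together.
import json
import re

import whisper  # pip install openai-whisper (also needs ffmpeg installed)


def transcribe(audio_path: str) -> str:
    """Turn a spoken lab-note recording into plain text."""
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"]


def extract_metadata(text: str) -> dict:
    """Pull a few example fields (a toy 'schema') out of the transcribed notes."""
    patterns = {
        "temperature_c": r"(\d+(?:\.\d+)?)\s*degrees",
        "mass_mg": r"(\d+(?:\.\d+)?)\s*milligrams?",
        "solvent": r"dissolved? in\s+([a-z][a-z ]*)",
    }
    metadata = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        metadata[field] = match.group(1).strip() if match else None
    return metadata


if __name__ == "__main__":
    # "experiment_audio.wav" is a hypothetical recording made at the bench.
    notes = transcribe("experiment_audio.wav")
    record = {"notes": notes, "metadata": extract_metadata(notes)}
    with open("experiment_metadata.json", "w") as fh:
        json.dump(record, fh, indent=2)  # structured metadata alongside the free-text notes
```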
So there are certainly really good uses of what people are calling “AI” technologies these days. Having said all of that, obviously there's also a lot of concern that we've seen over the past couple of years, such as in terms of how people publish papers, right? This is a classic one I think Marcus shared a while back, where if you look at the paper, starting right from the first sentence in the introduction, it says: “Certainly, here's a possible introduction for your topic.” And I think it's pretty clear that this probably came from ChatGPT, which is one of the more commonly used so-called “AI” tools today to generate text.
However, this is not my favorite one. So my favorite paper is this one. I don't know if some of you have seen it. I see some of you smiling, so you know what I'm getting to. First of all, this was published in Frontiers back in February [2024]. If you look at the text, a lot of it looks fairly generic and probably “AI”-generated. But the most dramatic part is one of the figures, which is of a lab rat. And most of the lab rat looks kind of like a normal rat, but it's got these giant genitals sticking out of it. The phallus is so long that it extends beyond the figure.
I just love how a figure like this would get past the peer reviewers, it gets past the editors, it gets past the copyeditors of the journal and gets published. Now, for the record, it was retracted by the publisher pretty soon afterwards. But not before everyone on the internet got copies of the PDF and archived it. That's how I was able to get this amazing picture of this lab rat, which I love. And you can also see a lot of weirdly spelled words that annotate this figure. So definitely check it out. I think this is one of the classics that's come out of some of the papers we've seen over the past couple of years.
And in addition to generating these papers, we are also seeing some evidence that academics are using these tools to generate the peer reviews that they write. And to be honest, I can kind of relate to what these academics are going through because who has time, right, to do a really good peer review these days? And in higher education, of course, we know that some students feel really tempted to use these sort of [large] language models to generate their essays, and we're also seeing that some instructors are using the same tools to grade and mark the essays.
You know, there's an anecdote I heard about a PhD program that was recruiting students, I think it was in the US. They found that a lot of the applicants to the PhD program didn't have time to write so many cover letters for their applications. So they would write a few bullet points saying what they want in their cover letters and use a large language model to turn them into the cover letter. And then the professors on the program, who have so many applications to sift through, ask the same tool to translate it back into bullet points so that it's quicker for them to skim through.
So a lot of interesting use cases here, but I just wanna use this to set the stage to talk about three things today. The first one is that I think words really matter when we talk about so-called “AI” technologies, because there's a lot of ambiguity in the language. The second is that this ambiguity can become really problematic, because it allows so-called “AI” to become a smokescreen that distracts us from the underlying issues that I think are more important to tackle. And lastly, I will try to bring all of this back to scientific research and think about what this means for scientific research, and maybe what it doesn't mean.
Okay, so what do I mean by words matter? Well, I think it's very important for us to realize that so-called “AI”, as we colloquially use it today, is very much just a marketing term and not a technical term of art!
To illustrate this point, I really like this paper. It's called “A style-based 3D-aware generator for high-resolution image synthesis.” And you can see that you can use this tool to generate very realistic-looking photos of people. And I use this example because I searched through the whole paper, including the title, and other than one of the affiliations of the first author, there's no mention of “artificial intelligence” in this paper at all.
And if you look at the publication date, it's 2022, just before all of the hype around “AI” started. And I think if this paper had been published just a year later, the text would have been filled with references to “artificial intelligence”. And I think this is really important because it comes back to the point that a lot of the terminology we're using today around these technologies is marketing language, like hallucinations or reasoning skills or training these models.
First of all, it really anthropomorphizes this technology. Kind of like how humans have a tendency to recognize faces in things, I feel using this terminology misleads us into recognizing intelligence in these tools as well. And I think that can be really problematic.
Another way to think about it is that when we are using our word processors to type up our papers, there's spellcheck, right? And spellcheck is basically a statistical model that takes an input and infers, in this case, the possible correct spelling for the word you're trying to spell. And this is not to minimize the amazing amount of work that's gone into these artificial intelligence technologies, but roughly speaking, large language models are also a very, very sophisticated form of statistical modeling that takes text as input and infers a natural-looking output.
And I think Emily Bender describes it really well when she calls these models “stochastic parrots”, because parrots might repeat words back to you, but they are literally incapable of understanding what they're saying. And this also applies to all of these “artificial intelligence” technologies.
And I think this ambiguous language is the feature, not the bug, because it's not just a matter of linguistics or semantics or nitpicking. We know from history that ambiguous language shifts power to people who hold control over those tools and technologies. And I feel that the powerful people behind so-called “AI” are using this ambiguous language as a smokescreen to distract us from the very real problems underneath it.
So I think it was just last year that there was this union formed in Kenya, because there were so many sweatshop workers in Kenya who were hired by the company behind ChatGPT, and also by Facebook and other companies, to, well, as you can see here, make the models less toxic.
So what they do is that they're constantly looking at outputs for the most egregious stuff, such as descriptions of sexual abuse, murder, suicide, and other really graphic details. And they're basically tweaking the model inputs whenever something really graphic comes out [so that] the statistical inferences from these large language models are slightly less offensive.
And they were so traumatized by doing this kind of sweatshop work all day, every day, trying to keep ChatGPT working, that they were able to actually form a union. And I think this is important because that chemistry example I gave you earlier was one of the “AI” assisting humans, right? But actually, a lot of the exploitation comes in when you have a human-assisted “AI”, such as these sweatshop workers.
Another one is, of course, Amazon Fresh. I took this picture of the Amazon Fresh store. This one is just south of Aldgate East Station in London. And I know some of you know this... So the selling point for Amazon Fresh is that you walk in, pick up whatever you wanna buy, and you just walk out. And they use really advanced “artificial intelligence” so that all of the cameras in the shop will figure out what you bought and automatically charge your Amazon account.
But it also came out in the news this year [2024] that all of the so-called “artificial intelligence” was actually Amazon hiring sweatshop workers in India whose sole job is to watch all of those cameras and manually tag what people are buying in these shops, while everyone thinks it's actually the “artificial intelligence” technology doing all of those things.
And actually, Amazon shut down the whole thing soon afterwards, and they're shifting Amazon Fresh to a model where, rather than having all of those cameras watch you, whenever you grab an item, you have to manually scan it into your cart before you take it out.
And the other example that I think is very, very telling is this piece of news that was in the BBC earlier this year [2024] about this new driverless bus route that was started in Seoul in South Korea. So what happens is that this bus is supposed to be completely driverless, right? And you can see a picture of this guy sitting in the [driver's seat].
So I like this picture, by the way, of how this person actually also has his feet up to indicate that he doesn't even have his feet on the pedal. And I wanna use this example to say that all of what I've been showing to you so far are cases of human-assisted “AI”.
And what this driver has to do, you might be asking, “Okay, if this bus is completely driverless, why do you still need someone to sit there?” So what happens is that this driver will sit in the driver's seat. They don't usually have to do anything, like 99% of the time they can just sit and watch the bus drive itself, but this bus driver has to be super vigilant the whole time. Just in case, you know, in that 1% of the situations where the driverless bus makes a mistake, this driver has to immediately react and come in and actually make an adjustment to whatever the bus is doing.
So this driver actually has to be more vigilant than they usually would be if they were just driving a regular bus. And this is what we're also seeing, of course, with the Amazon delivery drivers who are [monitored by] the so-called “artificial intelligence” system. You know, it's constantly watching the drivers on these trucks as they make their deliveries.
And they're under so much pressure because on one hand, Amazon is constantly pressuring them into making their delivery quotas. On the other hand, this “artificial intelligence” disciplinary system is constantly watching their behavior, such as watching their eyeballs [to track] where they're looking. There's also some evidence that the camera is watching their lips because apparently some drivers, they would whistle or sing a tune as they're driving, and apparently that's a bad thing and you'll get marks taken off and you might not get your bonus at the end of the week. So they're constantly being disciplined like this.
Or they have to deal with these inhuman competing demands. And in these examples, it's like, you know, us humans, we're basically mindless bodies where the “AI” acts as the head to discipline us and make us do exactly what it wants us to do.
And it comes back to my point where if we think of it as an “artificial intelligence”, then we attribute agency to this technology. And that distracts us from the Jeff Bezos-es behind the technology who are actually using it to exert that power over us. And I think that's really dangerous, right?
And I think Kate Crawford describes it really well, where so-called “artificial intelligence” is neither artificial nor intelligent. And the use of this technology in the ways that I just described, you know, it's not really replacing labor. It is displacing labor and making it even more invisible to us.
And this is why I think words matter, because they have so much epistemic power over how we think about things. And often the language used around “artificial intelligence” distracts us from all of these underlying problems. Because, you know, if the “AI” on that driverless bus, let's say, hallucinates and makes a mistake, who are you gonna blame? We might blame, you know, that driver who wasn't vigilant enough to catch that 1% chance of the bus making a mistake, but is that really the issue here?
And that's where I'd like to try to bring this back to scientific research. So what does what we do as academic scientists have to do with any of this, right? Well, first of all, I'm kind of concerned about how even in academic scientific research, there is already sometimes a tendency to exploit.
So this is a paper that I actually cited in my previous research, where it talks about crowdsourcing the work that we do in science, whether it's data collection or data processing, to online volunteers. And I want to first say that sometimes this can be done really well. For instance, a lot of this is integrated into science outreach and science education and science engagement, where as part of your engagement activity, the participants get to do part of the science and help you analyze data. And that can be mutually beneficial. But in papers like this, you often see language like “crowdsourcing”, right? Which frames all of this free labor you've recruited as a way to shorten the time to perform the work for you, or to lower the cost of labor for the academic who's running the project.
And I think there's a little bit of a danger here where we are perpetuating some of the exploitation, especially now that I am asked to review papers about this kind of crowdsourcing work from time to time, and the way they talk about the participants makes me concerned about where this is going, in terms of various technologies where we might accidentally perpetuate this smokescreen that I keep talking about.
The second thing is that because the language around “AI” is so misleading, we get papers like this, which are basically saying: it's so costly and labor intensive to recruit participants for your project, so why don't we replace them with large language models, which will never get tired of our interview questions? We don't need to give them any compensation and we can get as many participants as we want in our study because, you know, they're as good as the real thing anyway, right? So I think that's pretty problematic.
Another one is this editorial talking about human-assisted peer review by “AI”, where they actually want to use these models to do peer reviews. And of course, what this particular editorial in this Nature journal is claiming is: “oh, it's gonna save so much work for the actual peer reviewer because the 'AI' is gonna do all of it”, and then the human just needs to come in at the end and briefly check that peer review to see if it's okay.
But this sounds so much like that bus driver to me, and I feel we're seeing a lot of really high-profile papers like this. There's one that I didn't get to stick into the slides in time, which is literally proposing using “AI” to completely take over the scientific discovery process, where you're gonna use the large language model for question generation, to design and conduct the experiment, analyze the result, write a paper, and then get another large language model to come in to peer review that paper.
And at the end of the abstract (I really wish I had put the abstract here), it says this saves so much money: “We calculated on average that if you outsource this entire thing to our 'AI' tool, it will be able to produce all of that scientific research for you at a cost of $15 per paper.” And I think that says a lot about how there's so much misunderstanding and hype around these technologies that high-profile papers like this are starting to appear.
And I think Lisa Messeri described it really well: if we develop this kind of reliance and we think that “AI” technology is actually sentient and intelligent, then doing science this way will give us illusions of understanding. And this is a fantastic paper I suggest you check out.
Okay, now as someone who has been an open research advocate for a long time, another thing that's talked about in “AI” circles right now is that we should really make a lot of these “AI” tools open source. And I think there are good reasons for that. But in the context of open research, there's a lot of messiness there as well.
So you might have heard of Llama 2, one of the large language models released by Meta last year. They called it an “open source” large language model. But if you actually click to download the model, it comes with a ton of restrictions on what you can do with it and a lot of limitations. And a lot of parts of it are completely opaque and you're not allowed to see what the model is doing. So it certainly doesn't meet the industry definition of open source as it has been established for software.
Now, the Open Source Initiative has been working on this issue for a long time. And actually just a few weeks ago, they released the first version of an open source “AI” definition. And I think it's really important for academic researchers to be part of this process as well.
But in any case, what happens in practice is that there was another study published earlier this year where they looked at dozens of the popularly used large language models these days and scored them on their openness using 14 different criteria. And the overwhelming majority of them come not only with a ton of restrictions, but also a lot of black boxes where you're not really allowed to know what's actually happening inside these models.
So you can see that ChatGPT is right there at the very bottom as one of the most black-box large language models that we're using. And I think there's a real danger here: with all of this hype around so-called “artificial intelligence” and all the talk about completely integrating that into the science that we do, we're building all of the science on top of this “AI” technology.
I think what's gonna happen is that we won't end up doing science anymore. We will be doing alchemy! Because it's built on top of this completely opaque system. And I think that's a fundamental danger to the future of doing science.
And I want to quickly bring us back to this very well-endowed lab rat that I mentioned at the beginning, because I know that in response to papers like this, some people are saying, okay, so of course, you know, we should certainly ban the use of “AI” technologies in the creation of papers. So maybe we should just completely cut “AI” out of the paper writing process, right?
And I think that's understandable to a large degree, but I think there's a question about whether, and what kind of, problems we're actually solving if we focus on dealing with the “AI” part of it. Because I'm concerned that fixing the “AI” part might actually entrench deeper problems.
In this case, the broken peer review system, the publish-or-perish culture, right? Where these publishing monopolies... because, given what we've seen in higher education in terms of detecting fake essays written by students, I wouldn't be surprised if one of those big publishers released some proprietary “AI” tool saying, “hey, if you publish a journal with us, then we'll let you use our proprietary 'AI' tool to detect fake paper submissions.”
That might seem to superficially solve the problem, but I think the deeper risk of focusing on the “AI” part is that, in this example, we will become even more reliant on these huge publishers and cede even more power to them, right? And I think that's what I'm really concerned about, because solutions like this don't really get at the actual problems leading to why people want, well, not necessarily want, but feel pressured into publishing those fake papers.
So I think a core message that I've got from these examples is that “AI” highlights existing problems that we have. And it's important for us to be aware of deeper problems in our research culture. And it could be really long-standing issues like job security or the toxic workloads that we have to put up with, right? And think about all of those lecturers who have to live in tents because they can't afford anything more than that.
And it's important to realize that “AI” didn't create these problems just as “AI” didn't create the sweatshops that I mentioned earlier.
So to wrap things up, I think the main message I want to send today is that words really matter when we talk about these technologies. And we should be very sensitive in understanding what those words really mean. And instead of thinking about “AI”, we should think about these deeper underlying issues that have plagued us for so long, because, you know, very often “AI” is NOT the problem. It highlights existing problems, and we should reflect on and focus on those underlying issues.
If we only focus on “AI”, it risks making those problems even worse. Okay, so that's the bulk of my talk, but if I've piqued your interest a little bit, I will leave you with some further reading, one of which is this one about generative “AI” and the automating of academia. The lead author is Richard Watermeyer based right here in Bristol. It's a fantastic read.
But if you're tired of reading yet another paper, I mentioned Emily Bender earlier. So Emily Bender and Alex Hanna host an incredible podcast called Mystery AI Hype Theater 3000, where every week they look at one of these so-called “AI” papers like the ones that I just showed you and tear it apart. And it's both very depressing and very entertaining at the same time.
Or if you'd like to read, these two Princeton professors, they wrote a book called “AI Snake Oil,” again, along the veins of what I'm talking about today. And I think it's really informative in terms of how we think about how we want to adapt our research culture in light of this new technology.
So that's some additional material that I think is useful. And in the interest of doing open research, I've published these slides, the transcript, additional notes, and all of the references to Zenodo. So you can look at that and remix and use it if you want.
And I also want to just give a shout out to Jennifer Ding from the Turing Institute and Shern [Tee], and everyone from the Turing Way community who's helped me develop this talk.
So that's what I have for you today. And thank you for coming.
Unless otherwise stated, all original content in this post is shared under the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0).