<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>naclscrg</title>
    <link>https://naclscrg.writeas.com/</link>
    <description>Various notes from my research. For some context, see: https://www.penonek.com/</description>
    <pubDate>Wed, 22 Apr 2026 08:32:23 +0000</pubDate>
    <item>
      <title>Open research in social sciences</title>
      <link>https://naclscrg.writeas.com/open-research-in-social-sciences?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[Recently I had the chance to consider good examples of open research in social sciences. Once again, with thanks to the various open research communities I&#39;m a part of and in the interests of my enquiry taking the form of open research, here are some relevant resources/examples. On a meta level, it was interesting what people suggested, which says something about their perspectives on the matter.&#xA;!--more--&#xA;From The Turing Way&#xA;&#xA;Anne reminded me the importance of recognising/avoiding what I would call the trap of &#34;performative objectivity&#34;, i.e.: &#xA;&#xA;  ...traditional &#39;social science&#39; (which can often fall prey to the same ideas of being &#39;objective&#39; and &#39;removed&#39; from the communities they are studying or working with, in their desire to be more aligned with the hard as opposed to &#39;soft&#39; sciences)&#xA;&#xA;This then reminded me of the excellent book Data Feminism: &#xA;&#xA;https://data-feminism.mitpress.mit.edu/&#xA;&#xA;This book was a great read, and contains — among other things — great examples of projects which avoid that trap, and creative ways to share and conceptualise &#34;data&#34;. &#xA;&#xA;With that in mind, here&#39;s a good example of community-led research: &#xA;&#xA;https://grassrootsjusticenetwork.org/resources/community-action-guide-on-community-led-research/&#xA;&#xA;Engaged and public anthropology can be creative in its outputs (as opposed to &#34;traditional&#34; academic outputs), such as an open city documentary festival: &#xA;&#xA;https://opencitylondon.com/about-us/&#xA;&#xA;Or a set of walking tours: &#xA;&#xA;https://open-city.org.uk/events&#xA;&#xA;Anne also mentioned that Paz Bernaldo from Open Life Sciences and Beth Duckles from &#34;organisational mycology&#34; (so curious what this term means!) 
may have insights, too.&#xA;&#xA;Beth later responded with some very useful examples, saying that: &#xA;&#xA;  ...participant action research (PAR) and community based participatory research (CBPR) are another set of methodological approaches that seeks to reconsider subject/object of the research by including the folks &#34;being studied&#34; in the research leadership, essentially trying to lessen the power inequalities and make the research more community led. It comes out of a social justice lens, particularly Paolo Friere&#39;s work.&#xA;&#xA;One of which is a piece of open participatory research that makes &#34;the research process, data collection and analysis open to those who were able and interested in being a part of the process&#34;: &#xA;&#xA;https://zenodo.org/records/8015576&#xA;&#xA;There are also online repositories for publishing open social science outputs, such as SOCARXIV and SOAR: &#xA;&#xA;https://socopen.org/&#xA;&#xA;https://www.gesis.org/ssoar&#xA;&#xA;Beth also linked to a discussion about &#34;open social science&#34;: &#xA;&#xA;https://blogs.lse.ac.uk/impactofsocialsciences/2022/01/11/eight-components-for-open-social-science-an-agenda-for-cultural-change/&#xA;&#xA;And Angela Okune worked for a long time with a community in Kenya, published on the Platform for Experimental, Collaborative Ethnography (PECE: pronounced “peace”): &#xA;&#xA;https://worldpece.org/&#xA;&#xA;Anne also curated a list of tools for social science researchers: &#xA;&#xA;https://open-source-social-science.github.io&#xA;&#xA;Even though it&#39;s a list of tools, tools affect the questions we could ask and I think Anne&#39;s list can serve as an inspiration for what kinds of open research one could do! 
&#xA;&#xA;From FORRT&#xA;&#xA;Priya reminded me of the wonderful work she and others at the UK Reproducibility Network did to collate examples of open research across disciplines: &#xA;&#xA;https://www.ukrn.org/disciplines/&#xA;&#xA;https://doi.org/10.31219/osf.io/3r8hb&#xA;&#xA;Aleksandra shared a platform her lab established to share psychological methods/tools translated into Serbian. To me, this is a great example not because it&#39;s about psychology research, but because that it&#39;s a community effort at translation, making resources accessible to a different audience: &#xA;&#xA;https://www.repopsi.f.bg.ac.rs/en/&#xA;&#xA;Flavio shared a great paper about &#34;Teaching open and reproducible scholarship: a critical review of the evidence base for current pedagogical methods and their outcomes&#34;. An important reminder that pedagogy/teaching is a key component of open research: &#xA;&#xA;https://doi.org/10.1098/rsos.221255&#xA;&#xA;Also, FORRT resources on adopting open research and replication: &#xA;&#xA;https://forrt.org/adopting&#xA;&#xA;https://forrt.org/replication-hub/&#xA;&#xA;On a more meta level, Flavio noted that FORRT itself might be a good social sciences example of people coming together as a community to build something.&#xA;&#xA;From NASA TOPS&#xA;&#xA;Christine shared two very cool resources. 
&#xA;&#xA;SEEKCommons, which seeks to promote &#34;the &#39;commons&#39; in science and technology with an emphasis on collaborative socio-environmental research&#34;: &#xA;&#xA;https://seekcommons.org/about.html&#xA;&#xA;And the ICPSR (Inter-university Consortium for Political and Social Research), which &#34;provides leadership and training in data access, curation, and methods of analysis for the social science research community&#34;: &#xA;&#xA;https://www.icpsr.umich.edu/web/pages/about/&#xA;&#xA;Citizen science&#xA;&#xA;There&#39;s also lots of work in the citizen/community science circles that may be good examples of open research in social sciences. &#xA;&#xA;For example, the classic story I always tell is about Public Lab: &#xA;&#xA;https://publiclab.org/&#xA;&#xA;Where their famous open source balloon mapping kit — originally for mapping the spread of the 2010 Deepwater Horizon oil spill — was adapted by those in the Bourj Al Shamali refugee camp to see their urban space from above for the first time: &#xA;&#xA;https://placesjournal.org/article/camp-code/&#xA;&#xA;The story in this article is an inspiring example of open research. And, the article itself is open research by Claudia Martinez Mansell, sharing her work in a public way. 
&#xA;&#xA;Other resources&#xA;&#xA;A great talk about open qualitative research by Natasha Mauthner, Professor of Social Science Philosophy and Method at Newcastle University: &#xA;https://oercommons.org/courseware/lesson/134043/overview&#xA;&#xA;Qualitative research software&#xA;&#xA;QualCoder is an open source replacement for proprietary qualitative research software like NVivo: &#xA;&#xA;https://github.com/ccbogel/QualCoder&#xA;&#xA;Some QualCoder learning resources: &#xA;&#xA;https://guides.library.illinois.edu/c.php?g=997192&amp;p=10050831#s-lg-box-31727131&#xA;https://shsulibraryguides.org/az/scworkshops/qualitative-data-coding-intro-to-taguette-and-qualcoder&#xA;https://ndporter.github.io/open-qualitative-research-qualcoder/04-qualitative-data-analysis.html&#xA;https://guides.temple.edu/qda/qualcoder&#xA;&#xA;And there&#39;s also Taguette, which seems much more user-friendly: &#xA;&#xA;https://www.taguette.org/&#xA;&#xA;To complement Taguette, I also learned of a new tool that lets you view your codes in different ways: &#xA;&#xA;https://qdb.n.gardella.cc/&#xA;&#xA;Acknowledgements&#xA;&#xA;In alphabetical order.&#xA;&#xA;The Turing Way&#xA;&#xA;Anne Lee Steele, Beth Duckles&#xA;&#xA;Framework for Open and Reproducible Research Training (FORRT)&#xA;&#xA;Aleksandra Lazić, Flavio Azevedo, Priya Silverstein&#xA;&#xA;NASA TOPS community&#xA;&#xA;Christine Kirkpatrick&#xA;&#xA;openresearch&#xA;&#xA;----------&#xA;&#xA;Unless otherwise stated, all original content in this post is shared under the Creative Commons Attribution-ShareAlike 4.0 International license: https://creativecommons.org/licenses/by-sa/4.0/ ]]&gt;</description>
      <content:encoded><![CDATA[<p>Recently I had the chance to consider good examples of open research in social sciences. Once again, with thanks to the various open research communities I&#39;m a part of, and in the interests of my enquiry taking the form of open research, here are some relevant resources/examples. On a meta level, it was interesting to see what people suggested, which says something about their perspectives on the matter.
</p>

<h2 id="from-the-turing-way">From The Turing Way</h2>

<p>Anne reminded me of the importance of recognising/avoiding what I would call the trap of “performative objectivity”, i.e.:</p>

<blockquote><p>...traditional &#39;social science&#39; (which can often fall prey to the same ideas of being &#39;objective&#39; and &#39;removed&#39; from the communities they are studying or working with, in their desire to be more aligned with the hard as opposed to &#39;soft&#39; sciences)</p></blockquote>

<p>This then reminded me of the excellent book Data Feminism:</p>

<p><a href="https://data-feminism.mitpress.mit.edu/">https://data-feminism.mitpress.mit.edu/</a></p>

<p>This book was a great read, and contains — among other things — great examples of projects which avoid that trap, and creative ways to share and conceptualise “data”.</p>

<p>With that in mind, here&#39;s a good example of community-led research:</p>

<p><a href="https://grassrootsjusticenetwork.org/resources/community-action-guide-on-community-led-research/">https://grassrootsjusticenetwork.org/resources/community-action-guide-on-community-led-research/</a></p>

<p>Engaged and public anthropology can be creative in its outputs (as opposed to “traditional” academic outputs), such as an open city documentary festival:</p>

<p><a href="https://opencitylondon.com/about-us/">https://opencitylondon.com/about-us/</a></p>

<p>Or a set of walking tours:</p>

<p><a href="https://open-city.org.uk/events">https://open-city.org.uk/events</a></p>

<p>Anne also mentioned that Paz Bernaldo from Open Life Sciences and Beth Duckles from “organisational mycology” (so curious what this term means!) may have insights, too.</p>

<p>Beth later responded with some very useful examples, saying that:</p>

<blockquote><p>...participant action research (PAR) and community based participatory research (CBPR) are another set of methodological approaches that seeks to reconsider subject/object of the research by including the folks “being studied” in the research leadership, essentially trying to lessen the power inequalities and make the research more community led. It comes out of a social justice lens, particularly Paulo Freire&#39;s work.</p></blockquote>

<p>One example she shared is a piece of open participatory research that makes “the research process, data collection and analysis open to those who were able and interested in being a part of the process”:</p>

<p><a href="https://zenodo.org/records/8015576">https://zenodo.org/records/8015576</a></p>

<p>There are also online repositories for publishing open social science outputs, such as SocArXiv and SSOAR:</p>

<p><a href="https://socopen.org/">https://socopen.org/</a></p>

<p><a href="https://www.gesis.org/ssoar">https://www.gesis.org/ssoar</a></p>

<p>Beth also linked to a discussion about “open social science”:</p>

<p><a href="https://blogs.lse.ac.uk/impactofsocialsciences/2022/01/11/eight-components-for-open-social-science-an-agenda-for-cultural-change/">https://blogs.lse.ac.uk/impactofsocialsciences/2022/01/11/eight-components-for-open-social-science-an-agenda-for-cultural-change/</a></p>

<p>And Angela Okune, who worked for a long time with a community in Kenya, published on the Platform for Experimental, Collaborative Ethnography (PECE, pronounced “peace”):</p>

<p><a href="https://worldpece.org/">https://worldpece.org/</a></p>

<p>Anne also curated a list of tools for social science researchers:</p>

<p><a href="https://open-source-social-science.github.io">https://open-source-social-science.github.io</a></p>

<p>Even though it&#39;s a list of tools, <a href="https://sparcopen.org/impact-story/often-overlooked-sharing-of-hardware-is-a-missing-link-in-open-science-puzzle/">tools affect the questions we could ask</a> and I think Anne&#39;s list can serve as an inspiration for what kinds of open research one could do!</p>

<h2 id="from-forrt">From FORRT</h2>

<p>Priya reminded me of the wonderful work she and others at the UK Reproducibility Network did to collate examples of open research across disciplines:</p>

<p><a href="https://www.ukrn.org/disciplines/">https://www.ukrn.org/disciplines/</a></p>

<p><a href="https://doi.org/10.31219/osf.io/3r8hb">https://doi.org/10.31219/osf.io/3r8hb</a></p>

<p>Aleksandra shared a platform her lab established to share psychological methods/tools translated into Serbian. To me, this is a great example not because it&#39;s about psychology research, but because it&#39;s a community effort at translation, making resources accessible to a different audience:</p>

<p><a href="https://www.repopsi.f.bg.ac.rs/en/">https://www.repopsi.f.bg.ac.rs/en/</a></p>

<p>Flavio shared a great paper about “Teaching open and reproducible scholarship: a critical review of the evidence base for current pedagogical methods and their outcomes”. An important reminder that pedagogy/teaching is a key component of open research:</p>

<p><a href="https://doi.org/10.1098/rsos.221255">https://doi.org/10.1098/rsos.221255</a></p>

<p>Also, FORRT resources on adopting open research and replication:</p>

<p><a href="https://forrt.org/adopting">https://forrt.org/adopting</a></p>

<p><a href="https://forrt.org/replication-hub/">https://forrt.org/replication-hub/</a></p>

<p>On a more meta level, Flavio noted that FORRT itself might be a good social sciences example of people coming together as a community to build something.</p>

<h2 id="from-nasa-tops">From NASA TOPS</h2>

<p>Christine shared two very cool resources.</p>

<p>SEEKCommons, which seeks to promote “the &#39;commons&#39; in science and technology with an emphasis on collaborative socio-environmental research”:</p>

<p><a href="https://seekcommons.org/about.html">https://seekcommons.org/about.html</a></p>

<p>And the ICPSR (Inter-university Consortium for Political and Social Research), which “provides leadership and training in data access, curation, and methods of analysis for the social science research community”:</p>

<p><a href="https://www.icpsr.umich.edu/web/pages/about/">https://www.icpsr.umich.edu/web/pages/about/</a></p>

<h2 id="citizen-science">Citizen science</h2>

<p>There&#39;s also a lot of work in citizen/community science circles that may offer good examples of open research in social sciences.</p>

<p>For example, the classic story I always tell is about Public Lab:</p>

<p><a href="https://publiclab.org/">https://publiclab.org/</a></p>

<p>Their famous open source balloon mapping kit — originally used to map the spread of the 2010 Deepwater Horizon oil spill — was adapted by residents of the Bourj Al Shamali refugee camp to see their urban space from above for the first time:</p>

<p><a href="https://placesjournal.org/article/camp-code/">https://placesjournal.org/article/camp-code/</a></p>

<p>The story in this article is an inspiring example of open research. <em>And</em>, the article itself is open research by Claudia Martinez Mansell, sharing her work in a public way.</p>

<h2 id="other-resources">Other resources</h2>

<p>A great talk about open qualitative research by Natasha Mauthner, Professor of Social Science Philosophy and Method at Newcastle University:
<a href="https://oercommons.org/courseware/lesson/134043/overview">https://oercommons.org/courseware/lesson/134043/overview</a></p>

<h3 id="qualitative-research-software">Qualitative research software</h3>

<p><strong>QualCoder</strong> is an open source replacement for proprietary qualitative research software like NVivo:</p>

<p><a href="https://github.com/ccbogel/QualCoder">https://github.com/ccbogel/QualCoder</a></p>

<p>Some QualCoder learning resources:</p>
<ul><li><a href="https://guides.library.illinois.edu/c.php?g=997192&amp;p=10050831#s-lg-box-31727131">https://guides.library.illinois.edu/c.php?g=997192&amp;p=10050831#s-lg-box-31727131</a></li>
<li><a href="https://shsulibraryguides.org/az/scworkshops/qualitative-data-coding-intro-to-taguette-and-qualcoder">https://shsulibraryguides.org/az/scworkshops/qualitative-data-coding-intro-to-taguette-and-qualcoder</a></li>
<li><a href="https://ndporter.github.io/open-qualitative-research-qualcoder/04-qualitative-data-analysis.html">https://ndporter.github.io/open-qualitative-research-qualcoder/04-qualitative-data-analysis.html</a></li>
<li><a href="https://guides.temple.edu/qda/qualcoder">https://guides.temple.edu/qda/qualcoder</a></li></ul>

<p>And there&#39;s also <strong>Taguette</strong>, which seems much more user-friendly:</p>

<p><a href="https://www.taguette.org/">https://www.taguette.org/</a></p>

<p>To complement Taguette, I also learned of a new tool that lets you view your codes in different ways:</p>

<p><a href="https://qdb.n.gardella.cc/">https://qdb.n.gardella.cc/</a></p>

<h2 id="acknowledgements">Acknowledgements</h2>

<p>In alphabetical order.</p>

<h3 id="the-turing-way-https-the-turing-way-netlify-app"><a href="https://the-turing-way.netlify.app/">The Turing Way</a></h3>

<p>Anne Lee Steele, Beth Duckles</p>

<h3 id="framework-for-open-and-reproducible-research-training-forrt-https-forrt-org"><a href="https://forrt.org/">Framework for Open and Reproducible Research Training (FORRT)</a></h3>

<p>Aleksandra Lazić, Flavio Azevedo, Priya Silverstein</p>

<h3 id="nasa-tops-community">NASA TOPS community</h3>

<p>Christine Kirkpatrick</p>

<p><a href="https://naclscrg.writeas.com/tag:openresearch" class="hashtag"><span>#</span><span class="p-category">openresearch</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/open-research-in-social-sciences</guid>
      <pubDate>Mon, 12 May 2025 15:19:54 +0000</pubDate>
    </item>
    <item>
      <title>Talk - &#34;AI&#34; follow-up talk about labour and academia</title>
      <link>https://naclscrg.writeas.com/talk-ai-is-not-the-problem-follow-up?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[I gave a follow up talk to an earlier talk about &#34;AI&#34; at the University of Bristol TARG research group meeting on 22 November 2024. As usual, lots of stuff I couldn&#39;t fit into the talk, so I&#39;m putting them here plus further reading, a transcript, and video recording of the talk.&#xA;&#xA;The slides are published on Zenodo with DOI 10.5281/zenodo.11051128 listed under the &#34;30 minute version&#34;. &#xA;!--more--&#xA;I will try to gather here: &#xA;&#xA;the video recording;&#xA;short summary; &#xA;further reading collected when developing the talk; and&#xA;a transcript of the talk.&#xA;&#xA;I&#39;ll try to clean up this post with more context and details on a best-effort basis.&#xA;&#xA;Video recording&#xA;&#xA;There is a live video recording made during my 22 November 2024 talk which is viewable on the Internet Archive. The video is also embedded here (click the &#34;CC&#34; icon for subtitles): &#xA;&#xA;iframe src=&#34;https://archive.org/embed/AI-is-not-the-problem-2024-11-22&#34; width=&#34;640&#34; height=&#34;480&#34; frameborder=&#34;0&#34; webkitallowfullscreen=&#34;true&#34; mozallowfullscreen=&#34;true&#34; allowfullscreen/iframe&#xA;&#xA;Short summary&#xA;&#xA;Please see the notes for my original &#34;AI&#34; talk for additional information.&#xA;&#xA;Aware of the irony, I was curious how a large language model (LLM) could take the transcript of my talk (see below) and infer a short summary. The following is what Claude 3.5 Sonnet produced, with some edits by me: &#xA;&#xA;This talk came from my conversation with Jennifer Ding at the Turing Institute about which underlying issues around &#34;AI&#34; technology deserve more attention versus the overhyped aspects. 
While I acknowledge that new technologies like &#34;AI&#34; can bring positive changes - such as a helpful Speech Schema Filling Tool that helps chemists record experimental metadata in real time as they run experiments - I wanted to focus on several key concerns.&#xA;&#xA;The first observation I made is how &#34;AI&#34;-generated content is affecting academia. I shared examples including a published paper that began with &#34;Certainly, here&#39;s a possible introduction...&#34; (clearly ChatGPT-generated) and most amusingly, a paper featuring an anatomically incorrect lab rat with comically oversized genitals that somehow made it through peer review. I&#39;ve also noted evidence of academics using &#34;AI&#34; tools for both writing and reviewing papers, and even PhD programs where applicants and reviewers use &#34;AI&#34; to convert application letters between bullet points and prose.&#xA;&#xA;I emphasized that words really matter in this discussion. &#34;AI&#34; has become more of a marketing term than a technical term of art, and I pointed to how papers from just before the &#34;AI&#34; hype rarely used the term for the same technologies. I argue that this ambiguous language serves as a smokescreen, shifting power to those who control these tools.&#xA;&#xA;This led me to discuss how &#34;AI&#34; often masks human exploitation. I shared examples including Kenyan sweatshop workers traumatized by moderating graphic content for ChatGPT, their Indian counterparts manually tracking purchases in ostensibly automated Amazon Fresh supermarkets, and bus drivers in &#34;driverless&#34; buses who must remain hypervigilant for that 1% chance of needing to intervene. As Kate Crawford notes, &#34;AI&#34; is &#34;neither artificial nor intelligent&#34; - it&#39;s not replacing labor but rather making it more invisible (which Lilly Irani also discussed in depth).&#xA;&#xA;For scientific research, I see several concerns. 
There&#39;s a growing trend of papers proposing to replace human participants with large language models or suggesting complete automation of the scientific process - with one paper proudly claiming it could produce entire research projects from ideation to paper publication for just USD 15 each. I warn that building science on top of opaque and unaccountable &#34;AI&#34; systems risks turning science into alchemy.&#xA;&#xA;While some suggest banning &#34;AI&#34; in academic publishing (following incidents like the well-endowed lab rat paper), I caution that focusing solely on &#34;AI&#34; (&#34;solely&#34; being the key word) might entrench deeper problems like the broken peer review system and publish-or-perish culture. For example, publishing companies might offer proprietary &#34;AI&#34;-generated paper detection tools, which would make us more reliant on them and further consolidate their power without tackling why researchers feel pressured to publish fake papers in the first place.&#xA;&#xA;My key message is that &#34;AI&#34; often highlights existing problems rather than creating new ones. Instead of fixating on &#34;AI&#34; itself, we should address underlying issues in research culture, from job security to toxic workloads. I concluded by recommending resources like the Mystery AI Hype Theater 3000 podcast and the book &#34;AI Snake Oil&#34; for those interested in deeper exploration of these themes.&#xA;&#xA;P.S. Note that a newer book, &#34;The AI Con&#34;, is about to be published in 2025: https://thecon.ai/&#xA;&#xA;Further reading&#xA;&#xA;Please see the notes for my original &#34;AI&#34; talk for links and references in addition to what&#39;s here. 
&#xA;&#xA;[report] Amazon’s AI Cameras Are Punishing Drivers for Mistakes They Didn’t Make: https://www.vice.com/en/article/amazons-ai-cameras-are-punishing-drivers-for-mistakes-they-didnt-make/&#xA;[report] Amazon Fresh kills “Just Walk Out” shopping tech—it never really worked: https://arstechnica.com/gadgets/2024/04/amazon-ends-ai-powered-store-checkout-which-needed-1000-video-reviewers/&#xA;[report] Look, no hands! My trip on Seoul&#39;s self-driving bus: https://www.bbc.co.uk/news/business-68823705&#xA;[podcast] Mystery AI Hype Theater 3000: https://www.dair-institute.org/maiht3k/&#xA;[editorial] The advent of human-assisted peer review by AI - in Nature Biomedical Engineering: https://doi.org/10.1038/s41551-024-01228-0&#xA;Words matter, they affect the way we think about issues: &#xA;  [essay] Stefano Quintarelli is a former Italian member of parliament who said that instead of &#34;AI&#34;, we could call those technologies &#34;Systematic Approaches to Learning Algorithms and Machine Inferences (SALAMI)&#34;: https://blog.quintarelli.it/2019/11/lets-forget-the-term-ai-lets-call-them-systematic-approaches-to-learning-algorithms-and-machine-inferences-salami/&#xA;  [podcast] Completely randomly, I heard another &#34;AI&#34; replacement term &#34;Technical Oriented Artificial StupidiTy (TOAST)&#34; coined by Chris Roberts in the middle of a gaming podcast (19:31 into the video): https://www.youtube.com/live/ADYB-QJGheA?feature=shared&amp;t=1171&#xA;I didn&#39;t get to talk about the environmental costs of scaling (or the urge to scale up) &#34;AI&#34; technology, Timnit Gebru of the DAIR Institute touches on this and other issues in this interview (57:58 into the video): https://youtu.be/nh7-ZNBql38?feature=shared&amp;t=3478&#xA;&#xA;Books&#xA;&#xA;Hanna, A., &amp; Bender, E. M. (2025). The AI Con—How to fight big tech’s hype and create the future we want. Harper. https://thecon.ai/&#xA;&#xA;Narayanan, A., &amp; Kapoor, S. (2024). 
AI Snake Oil: What artificial intelligence can do, what it can’t, and how to tell the difference. Princeton University Press. https://press.princeton.edu/books/hardcover/9780691249131/ai-snake-oil&#xA;&#xA;Academic literature&#xA;&#xA;Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., &amp; Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3), 337–351. https://doi.org/10.1017/pan.2023.2&#xA;&#xA;Bender, E. M., Gebru, T., McMillan-Major, A., &amp; Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT &#39;21). Association for Computing Machinery, New York, New York, United States, 610–623. https://doi.org/10.1145/3442188.3445922&#xA;&#xA;Gu, J., Liu, L., Wang, P., &amp; Theobalt, C. (2021). StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis. arXiv, 2110.08985. https://doi.org/10.48550/arXiv.2110.08985&#xA;&#xA;Transcript&#xA;&#xA;This started from my conversation with Jennifer Ding at the Turing Institute. And we were talking about: what are some of the underlying issues around &#34;AI&#34; technology that we feel should be surfaced a little more rather than some of the stuff that we think is a little overhyped? And I&#39;m gonna go over a lot of those problems today.&#xA;&#xA;Before I get into it, I want to do something I always emphasize in talks like this, which is that I think for any kind of technology, it can bring about a lot of change in how we do things and how we organize ourselves. And it&#39;s not a matter of saying: oh, you know, let&#39;s just not use it. There&#39;s a potential for &#34;AI&#34; technologies, right? Because if you think about it, when the printing press came around, you don&#39;t want to ban the printing press just because you&#39;re afraid that the scribes are gonna go out of business. 
We hopefully can work together to find a way to realize the potential of a new technology.&#xA;&#xA;And I think a positive example that I&#39;d like to share before jumping to everything else is this tool that Shern Tee shared with me. It&#39;s called the Speech Schema Filling Tool. So it was developed by chemists for use in their experiments. And what happens is that as you do your experiments, you talk into the microphone on your computer and the large language model on it will use your audio input to do a speech to text conversion and fill in your lab notebook with what you&#39;re saying. But what&#39;s really cool about it is that the tool will also parse what you&#39;re saying and record relevant metadata into a structured data format to go with your lab notebook. So there&#39;s a very well-structured metadata set to go with the particular experiment that you&#39;re doing. And I think as long as you&#39;re happy to talk through your experiment as you&#39;re doing it, this tool is so helpful for you to improve the quality of the data that you&#39;re capturing, helping make your experiments more reproducible and so on, right?&#xA;&#xA;So there are certainly really good uses, of what people are calling &#34;AI&#34; technologies these days. Having said all of that, obviously there&#39;s also a lot of concern that we&#39;ve seen over the past couple of years, such as in terms of how people publish papers, right? This is a classic one I think Marcus shared a while back where if you look at the paper, starting right from the first sentence in the introduction, it says: &#34;Certainly, here&#39;s a possible introduction for your topic.&#34; And I think it&#39;s pretty clear that this probably came from ChatGPT, which is one of the more commonly used so-called &#34;AI&#34; tools today to generate text.&#xA;&#xA;However, this is not my favorite one. So my favorite paper is this one. I don&#39;t know if some of you have seen it. 
I see some of you smiling, so you know what I&#39;m getting to. First of all, this was published in Frontiers back in February [2024]. If you look at the text, a lot of it looks fairly generic and probably &#34;AI&#34;-generated. But the most dramatic part is one of the figures, which shows a lab rat. And most of the lab rat looks kind of like a normal rat, but it&#39;s got these giant genitals sticking out of it. The phallus is so long that it extends beyond the figure.&#xA;&#xA;I just love how a figure like this would get past the peer reviewers, past the editors, past the copyeditors of the journal and get published. Now, for the record, it was retracted by the publisher pretty soon afterwards. But not before everyone on the internet had grabbed copies of the PDF and archived it. That&#39;s how I was able to get this amazing picture of this lab rat, which I love. And you can also see a lot of weirdly spelled words that annotate this figure. So definitely check it out. I think this is one of the classics that&#39;s come out of some of the papers we&#39;ve seen over the past couple of years.&#xA;&#xA;And in addition to generating these papers, we are also seeing some evidence that academics are using these tools to generate the peer reviews that they write. And to be honest, I can kind of relate to what these academics are going through because who has time, right, to do a really good peer review these days? And in higher education, of course, we know that some students feel really tempted to use these sorts of [large] language models to generate their essays, and we&#39;re also seeing that some instructors are using the same tools to grade and mark the essays.&#xA;&#xA;You know, there&#39;s an anecdote I heard about a PhD program that was recruiting students, I think it was in the US, where they found that a lot of the applicants to the PhD program didn&#39;t have time to write so many cover letters in the application. 
So they would write a few bullet points saying what they want in their cover letters. They use a large language model to turn it into the cover letter. And then the professors on the program, because they have so many applications to sift through, ask the same tool to translate it back into bullet points so that it&#39;s quicker for them to skim through.&#xA;&#xA;So a lot of interesting use cases here, but I just wanna use this to set the stage to talk about three things today. So the first one is that I think words really matter when we talk about so-called &#34;AI&#34; technologies because there&#39;s a lot of ambiguity in the language. And that can become really problematic because it allows so-called &#34;AI&#34; to become a smokescreen that distracts us from the underlying issues that I think are more important to tackle. And lastly, I will try to bring all of this back to scientific research and think about what this means for scientific research and maybe what it doesn&#39;t mean.&#xA;&#xA;Okay, so what do I mean by words matter? Well, I think it&#39;s very important for us to realize that so-called &#34;AI&#34;, as we colloquially use it today, is very much just a marketing term and not a technical term of art!&#xA;&#xA;To illustrate this point, I really like this paper. It&#39;s called &#34;A style-based 3D-aware generator for high-resolution image synthesis.&#34; And you can see that you can use this tool to generate very realistic-looking photos of people. And I use this example because I searched through the whole paper, including the title, and other than one of the affiliations of the first author, there&#39;s no mention of &#34;artificial intelligence&#34; in this paper at all.&#xA;&#xA;And if you look at the publication date, it&#39;s 2022, just before all of the hype around &#34;AI&#34; started. 
And I think if this paper had been published just a year later, the text would be filled with references to &#34;artificial intelligence&#34;. And I think this is really important because it comes back to the point that a lot of the terminology we&#39;re using today around these technologies is marketing language, like hallucinations or reasoning skills or training these models.&#xA;&#xA;First of all, it really anthropomorphizes this technology, and it gives us a sense kind of like how humans have a tendency to recognize faces in things. And I feel using this terminology misleads us into recognizing intelligence in these tools as well. And I think that can be really problematic.&#xA;&#xA;Another way to think about it is that when we are using our word processors to type up our papers, there&#39;s spellcheck, right? And spellcheck is basically a statistical model that takes an input and infers, in this case, the possible correct spelling for the word you&#39;re trying to spell. And this is not to minimize the amazing amount of work that&#39;s gone into these artificial intelligence technologies, but roughly speaking, large language models are also a very, very sophisticated form of statistical modeling that takes text as input and infers a natural-looking output.&#xA;&#xA;And I think Emily Bender describes it really well when she calls these models &#34;stochastic parrots&#34;, because parrots, they might repeat words back to you, but they are literally incapable of understanding what they&#39;re saying. And this also applies to all of these &#34;artificial intelligence&#34; technologies.&#xA;&#xA;And I think this ambiguous language is the feature, not the bug, because it&#39;s not just a matter of linguistics or semantics or nitpicking, but we know from history that ambiguous language shifts power to people who hold control over those tools and technologies. 
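That &#34;statistical model&#34; framing can be made concrete with a toy sketch in Python: a bigram counter that &#34;infers&#34; the most frequently observed next word. This is a deliberate oversimplification of what large language models do, and the corpus here is made up:

```python
from collections import Counter, defaultdict

# Toy bigram model (illustrative only; real large language models are
# vastly more sophisticated): count which word follows which in a tiny
# made-up corpus, then "infer" the most likely next word.
corpus = "the cat sat on the mat and the cat slept".split()

following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the word most frequently observed after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # prints "cat" ("cat" follows "the" twice, "mat" once)
```

There is no understanding anywhere in that loop, only counting and lookup; scale the counting up by many orders of magnitude and you get closer to the spirit (though not the mechanics) of the models being marketed as intelligent.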
And I feel that the powerful people behind so-called &#34;AI&#34; are using this ambiguous language as a smokescreen to distract us from the very real problems underneath it.&#xA;&#xA;So, I think it was just last year that a union was formed in Kenya by the many sweatshop workers there who were hired by the company behind ChatGPT and also Facebook and other companies to, well, as you can see here, make the models less toxic.&#xA;&#xA;So what they do is constantly look at outputs for the most egregious stuff, such as descriptions of sexual abuse, murder, suicide, and other really graphic details. And they&#39;re basically tweaking the model inputs whenever something really graphic comes out [so that] the statistical inferences from these large language models are slightly less offensive.&#xA;&#xA;And they&#39;re so traumatized by this and doing this kind of sweatshop work all day, every day, trying to keep ChatGPT working that they were able to actually form a union. And I think this is important because that chemistry example I gave you earlier was an example of &#34;AI&#34; assisting humans, right? But actually, a lot of the exploitation comes in when you have a human-assisted &#34;AI&#34;, such as these sweatshop workers.&#xA;&#xA;Another one is, of course, Amazon Fresh. I took this picture of the Amazon Fresh store. This one is just south of Aldgate East Station in London. And I know some of you know this... So the selling point for Amazon Fresh is that you walk in, pick up whatever you wanna buy, and you just walk out. 
And they use really advanced &#34;artificial intelligence&#34; so that all of the cameras in the shop will figure out what you bought and automatically charge your Amazon account.&#xA;&#xA;But it also came out in the news this year [2024] that all of the so-called &#34;artificial intelligence&#34; was actually Amazon hiring sweatshop workers in India whose sole job is to watch all of those cameras and manually tag what people are buying in these shops, while everyone thinks it&#39;s actually the &#34;artificial intelligence&#34; technology doing all of those things.&#xA;&#xA;And actually, Amazon shut down the whole thing soon afterwards, and they&#39;re shifting Amazon Fresh to a model where, rather than having all of those cameras watch you, whenever you grab an item, you have to manually scan it into your cart before you take it out.&#xA;&#xA;And the other example that I think is very, very telling is this piece of news that was in the BBC earlier this year [2024] about this new driverless bus route that was started in Seoul in South Korea. So what happens is that this bus is supposed to be completely driverless, right? And you can see a picture of this guy sitting in the [driver&#39;s seat].&#xA;&#xA;So I like this picture, by the way, of how this person actually also has his feet up to indicate that he doesn&#39;t even have his feet on the pedal. And I wanna use this example to say that all of what I&#39;ve been showing to you so far are cases of human-assisted &#34;AI&#34;.&#xA;&#xA;And you might be asking, &#34;Okay, if this bus is completely driverless, why do you still need someone to sit there?&#34; So what happens is that this driver will sit in the driver&#39;s seat. They don&#39;t usually have to do anything, like 99% of the time they can just sit and watch the bus drive itself, but this bus driver has to be super vigilant the whole time. 
Just in case, you know, in that 1% of the situations where the driverless bus makes a mistake, this driver has to immediately react and come in and actually make an adjustment to whatever the bus is doing.&#xA;&#xA;So this driver actually has to be more vigilant than they would be if they were just driving a regular bus. And this is what we&#39;re also seeing, of course, with the Amazon delivery drivers who are [monitored by] the so-called &#34;artificial intelligence&#34; system. You know, it&#39;s constantly watching the drivers on these trucks as they make their deliveries.&#xA;&#xA;And they&#39;re under so much pressure because on one hand, Amazon is constantly pressuring them into making their delivery quotas. On the other hand, this &#34;artificial intelligence&#34; disciplinary system is constantly watching their behavior, such as watching their eyeballs [to track] where they&#39;re looking. There&#39;s also some evidence that the camera is watching their lips because apparently some drivers would whistle or sing a tune as they&#39;re driving, and apparently that&#39;s a bad thing and you&#39;ll get marks taken off and you might not get your bonus at the end of the week. So they&#39;re constantly being disciplined like this.&#xA;&#xA;Or they have to deal with these inhuman competing demands. And in these examples, it&#39;s like, you know, us humans, we&#39;re basically mindless bodies where the &#34;AI&#34; acts as the head to discipline us and make us do exactly what it wants us to do.&#xA;&#xA;And it comes back to my point where if we think of it as an &#34;artificial intelligence&#34;, then we attribute agency to this technology. And that distracts us from the Jeff Bezos-es behind the technology who are actually using it to exert that power over us. And I think that&#39;s really dangerous, right?&#xA;&#xA;And I think Kate Crawford describes it really well, where so-called &#34;artificial intelligence&#34; is neither artificial nor intelligent. 
And the use of this technology in the ways that I just described, you know, it&#39;s not really replacing labor. It is displacing labor and making it even more invisible to us.&#xA;&#xA;And this is why I think words matter because they have so much epistemic power over how we think about things. And often the language of &#34;artificial intelligence&#34; distracts us from all of these underlying problems. Because, you know, if the &#34;AI&#34; on that driverless bus, you know, let&#39;s say hallucinates and makes a mistake, who are you gonna blame? We might blame, you know, that driver who wasn&#39;t vigilant enough to catch that 1% chance of the bus making a mistake, but is that really the issue here?&#xA;&#xA;And that&#39;s where I&#39;d like to try to bring this back to scientific research. So what does what we do as academic scientists have to do with any of this, right? Well, first of all, I&#39;m kind of concerned about how even in academic scientific research, there is already sometimes a tendency to exploit.&#xA;&#xA;So this is a paper that I actually cited in my previous research where it talks about crowdsourcing the work that we do in science, whether it&#39;s data collection or data processing, to online volunteers. And I want to first say that sometimes this can be done really well. For instance, a lot of this is integrated into science outreach and science education and science engagement, where as part of your engagement activity, the participants get to do part of the science and help you analyze data. And this can be mutually beneficial, but in papers like this, you often see language like &#34;crowdsourcing&#34;, right? 
Language which frames participants as free labor that shortens the time needed to perform the work for you, or lowers the cost of labor for the academic who&#39;s running the project.&#xA;&#xA;And I think there&#39;s a little bit of a danger here of perpetuating some of that exploitation. I am now regularly asked to review papers about this kind of crowdsourcing work, and the way they talk about the participants makes me concerned about where this is going, and about how we might accidentally perpetuate this smokescreen that I keep talking about.&#xA;&#xA;The second thing is that because the language around &#34;AI&#34; is so misleading, we get papers like this one, which basically says: it&#39;s so costly and labor-intensive to recruit participants in your project, so why don&#39;t we replace them with large language models that will never get tired of our interview questions? We don&#39;t need to give them any compensation and we can get as many participants as we want in our study because, you know, they&#39;re as good as the real thing anyway, right? So I think that&#39;s pretty problematic.&#xA;&#xA;Another one talks about human-assisted peer review by &#34;AI&#34;, where they actually want to use these models to do peer reviews. And what this particular editorial in a Nature journal is claiming is: &#34;oh, it&#39;s gonna save so much work for the actual peer reviewer because the &#39;AI&#39; is gonna do all of it&#34;, and then the human just needs to come in at the end and briefly check that peer review to see if it&#39;s okay.&#xA;&#xA;But this sounds so much like that bus driver to me, and I feel we&#39;re seeing a lot of really high-profile papers like this. 
There&#39;s one that I didn&#39;t get to stick into the slide in time, which literally proposes using &#34;AI&#34; to completely take over the scientific discovery process, where you&#39;re gonna use the large language model for question generation, to design and conduct the experiment, analyze the result, write a paper, and then get another large language model to come in to peer review that paper.&#xA;&#xA;And at the end of the abstract (I really wish I had put the abstract here), it says this saves so much money: &#34;We calculated on average that if you outsource this entire thing to our &#39;AI&#39; tool, it will be able to produce all of that scientific research for you at a cost of $15 per paper.&#34; And I think that says a lot about how there&#39;s so much misunderstanding and hype around these technologies that high-profile papers like this are starting to appear.&#xA;&#xA;And I think Lisa Messeri described it really well, where if we develop this kind of reliance and we think that &#34;AI&#34; technology is actually sentient and intelligent, then doing science this way will give us illusions of understanding. And this is a fantastic paper I suggest you check out.&#xA;&#xA;Okay, now as someone who has been an open research advocate for a long time, another thing that&#39;s talked about in &#34;AI&#34; circles right now is that we should really make a lot of these &#34;AI&#34; tools open source. And I think there are good reasons for that. But in the context of open research, there&#39;s a lot of messiness there as well.&#xA;&#xA;So you might have heard of Llama 2, one of the large language models released by Meta last year. They called it an &#34;open source&#34; large language model. But if you actually click to download the model, it comes with a ton of restrictions and limitations on what you can do with it. 
And a lot of parts of it are completely opaque and you&#39;re not allowed to see what the model is doing. So it certainly doesn&#39;t meet the industry definition of open source as it has been established for software.&#xA;&#xA;Now, the Open Source Initiative has been working on this issue for a long time. And actually just a few weeks ago, they released the first version of an open source &#34;AI&#34; definition. And I think it&#39;s really important for academic researchers to be part of this process as well.&#xA;&#xA;But in any case, what happens in practice is that there was another study published earlier this year where they looked at dozens of the popularly used large language models and scored them on their openness using 14 different criteria. And the overwhelming majority of them come not only with a ton of restrictions, but also a lot of black boxes where you&#39;re not really allowed to know what&#39;s actually happening inside these models.&#xA;&#xA;So you can see that ChatGPT is right there at the very bottom as one of the most black-box large language models that we&#39;re using. And I think there&#39;s a real danger here: with all of this hype around so-called &#34;artificial intelligence&#34; and all the talk about completely integrating it into the science that we do, we&#39;re building all of that science on top of this &#34;AI&#34; technology.&#xA;&#xA;I think what&#39;s gonna happen is that we won&#39;t end up doing science anymore. We will be doing alchemy! Because it&#39;s built on top of this completely opaque system. And I think that&#39;s a fundamental danger to the future of doing science.&#xA;&#xA;And I want to quickly bring us back to this very well-endowed lab rat that I mentioned at the beginning, because I know that in response to papers like this, some people are saying, okay, so of course, you know, we should certainly ban the use of &#34;AI&#34; technologies in the creation of papers. 
So maybe we should just completely cut &#34;AI&#34; out of the paper writing process, right?&#xA;&#xA;And I think that&#39;s understandable to a large degree, but I think there&#39;s a question about what problems, if any, we&#39;re actually solving if we focus on dealing with the &#34;AI&#34; part of it. Because I&#39;m concerned that fixing &#34;AI&#34; might actually entrench deeper problems.&#xA;&#xA;In this case: the broken peer review system, the publish-or-perish culture, right? And these publishing monopolies... because, given what we&#39;ve seen in higher education in terms of detecting fake essays written by students, I wouldn&#39;t be surprised if one of those big publishers released some proprietary &#34;AI&#34; tool saying, &#34;hey, if you publish in a journal with us, then we&#39;ll let you use our proprietary &#39;AI&#39; tool to detect fake paper submissions.&#34;&#xA;&#xA;That might seem to superficially solve the problem, but I think the deeper risk of focusing on &#34;AI&#34; is that in this example, we will become even more reliant on these huge publishers and cede even more power to them, right? And I think that&#39;s what I&#39;m really concerned about because solutions like this don&#39;t really get at the actual problems leading to why people want, well, not necessarily want, but feel pressured into publishing those fake papers.&#xA;&#xA;So I think a core message that I&#39;ve got from these examples is that &#34;AI&#34; highlights existing problems that we have. And it&#39;s important for us to be aware of deeper problems in our research culture. And it could be really long-standing issues like job security or the toxic workloads that we have to put up with, right? 
And think about all of those lecturers who have to live in tents because they can&#39;t afford anything more than that.&#xA;&#xA;And it&#39;s important to realize that &#34;AI&#34; didn&#39;t create these problems, just as &#34;AI&#34; didn&#39;t create the sweatshops that I mentioned earlier.&#xA;&#xA;So to wrap things up, I think the main message I want to send today is that words really matter when we talk about these technologies. And we should be very sensitive in understanding what those words really mean. And instead of thinking about &#34;AI&#34;, we should think about these deeper underlying issues that have plagued us for so long because, you know, very often &#34;AI&#34; is NOT the problem. It highlights existing problems and we should reflect on and focus on those underlying issues.&#xA;&#xA;If we only focus on &#34;AI&#34;, it risks making those problems even worse. Okay, so that&#39;s the bulk of my talk, but if I&#39;ve piqued your interest a little bit, I will leave you with some further reading, one of which is this one about generative &#34;AI&#34; and the automating of academia. The lead author is Richard Watermeyer, based right here in Bristol. It&#39;s a fantastic read.&#xA;&#xA;But if you&#39;re tired of reading yet another paper, I mentioned Emily Bender earlier. So Emily Bender and Alex Hanna host an incredible podcast called Mystery AI Hype Theater 3000, where every week they look at one of these so-called &#34;AI&#34; papers like the ones that I just showed you and tear it apart. And it&#39;s both very depressing and very entertaining at the same time.&#xA;&#xA;Or if you&#39;d like to read more, two Princeton professors wrote a book called &#34;AI Snake Oil&#34;, again along the lines of what I&#39;m talking about today. 
And I think it&#39;s really informative in terms of how we might want to adapt our research culture in light of this new technology.&#xA;&#xA;So that&#39;s some additional material that I think is useful. And in the interest of doing open research, I&#39;ve published these slides, the transcript, additional notes, and all of the references to Zenodo. So you can look at that and remix and use it if you want.&#xA;&#xA;And I also want to just give a shout out to Jennifer Ding from the Turing Institute and Shern Tee, and everyone from the Turing Way community who&#39;s helped me develop this talk.&#xA;&#xA;So that&#39;s what I have for you today. And thank you for coming.&#xA;&#xA;----------&#xA;&#xA;#talks #AI&#xA;&#xA;----------&#xA;&#xA;&lt;p xmlns:cc=&#34;http://creativecommons.org/ns#&#34;&gt;Unless otherwise stated, all original content in this post is shared under the &lt;a href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;_blank&#34; rel=&#34;license noopener noreferrer&#34;&gt;Creative Commons Attribution-ShareAlike 4.0 International&lt;/a&gt; license.&lt;/p&gt; ]]&gt;</description>
      <content:encoded><![CDATA[<p>I gave a follow up talk to <a href="https://write.as/naclscrg/talk-ai-is-not-the-problem">an earlier talk about “AI”</a> at the University of Bristol TARG research group meeting on 22 November 2024. As usual, lots of stuff I couldn&#39;t fit into the talk, so I&#39;m putting them here plus further reading, a transcript, and video recording of the talk.</p>

<p>The <strong>slides are <a href="https://doi.org/10.5281/zenodo.11051128">published on Zenodo</a> with DOI <a href="https://doi.org/10.5281/zenodo.11051128">10.5281/zenodo.11051128</a></strong>, listed under the “30 minute version”.</p>

<p>I will try to gather here:</p>
<ul><li>the <a href="#video-recording"><strong>video recording</strong></a>;</li>
<li><a href="#short-summary"><strong>short summary</strong></a>;</li>
<li><a href="#further-reading"><strong>further reading</strong></a> collected when developing the talk; and</li>
<li>a <a href="#transcript"><strong>transcript</strong></a> of the talk.</li></ul>

<p>I&#39;ll try to clean up this post with more context and details on a best-effort basis.</p>

<h2 id="video-recording">Video recording</h2>

<p>There is a live video recording made during my 22 November 2024 talk which is <a href="https://archive.org/details/AI-is-not-the-problem-2024-11-22">viewable on the Internet Archive</a>. The video is also embedded here (click the “CC” icon for subtitles):</p>

<iframe src="https://archive.org/embed/AI-is-not-the-problem-2024-11-22" width="640" height="480" frameborder="0" allowfullscreen=""></iframe>

<h2 id="short-summary">Short summary</h2>

<p>Please see the <a href="https://write.as/naclscrg/talk-ai-is-not-the-problem/">notes for my original “AI” talk</a> for additional information.</p>

<p>Aware of the irony, I was curious how a large language model (LLM) could take the transcript of my talk (see below) and infer a short summary. The following is what Claude 3.5 Sonnet produced, with some edits by me:</p>

<p>This talk came from my conversation with Jennifer Ding at the Turing Institute about which underlying issues around “AI” technology deserve more attention versus the overhyped aspects. While I acknowledge that new technologies like “AI” can bring positive changes – such as a helpful Speech Schema Filling Tool that helps chemists record experimental <em>meta</em>data in real time as they run experiments – I wanted to focus on several key concerns.</p>
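<p>I haven&#39;t seen the tool&#39;s internals, but the speech-to-structured-metadata idea can be sketched in a few lines of Python. The record schema, function name, and regex-based extraction below are purely illustrative stand-ins; the actual tool parses the transcript with a language model, not pattern matching:</p>

```python
import json
import re

def extract_metadata(transcript: str) -> dict:
    """Pull quantity/unit pairs out of a spoken experiment description
    into a structured record. Hypothetical schema and regex, standing in
    for what an LLM-based parser would produce."""
    pattern = r"(\d+(?:\.\d+)?)\s*(mg|ml|g|l)\b"
    quantities = [
        {"value": float(value), "unit": unit}
        for value, unit in re.findall(pattern, transcript, flags=re.IGNORECASE)
    ]
    # Keep the free-text notes alongside the machine-readable metadata.
    return {"notes": transcript, "quantities": quantities}

record = extract_metadata("Added 25 mg of sodium chloride to 10 ml of water.")
print(json.dumps(record, indent=2))
```

<p>The point is the shape of the output: the same utterance ends up both as human-readable lab-notebook prose and as a structured record that makes the experiment easier to reproduce.</p>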

<p>The first observation I made is how “AI”-generated content is affecting academia. I shared examples including a published paper that began with “Certainly, here&#39;s a possible introduction...” (clearly ChatGPT-generated) and most amusingly, a paper featuring an anatomically incorrect lab rat with comically oversized genitals that somehow made it through peer review. I&#39;ve also noted evidence of academics using “AI” tools for both writing and reviewing papers, and even PhD programs where applicants and reviewers use “AI” to convert application letters between bullet points and prose.</p>

<p>I emphasized that <strong>words really matter</strong> in this discussion. <strong>“AI” has become more of a marketing term than a technical term of art</strong>, and I pointed to how papers from just before the “AI” hype rarely used the term for the same technologies. I argue that this <strong>ambiguous language serves as a smokescreen, shifting power to those who control these tools</strong>.</p>

<p>This led me to discuss how “AI” often masks human exploitation. I shared examples including Kenyan sweatshop workers traumatized by moderating graphic content for ChatGPT, their Indian counterparts manually tracking purchases in ostensibly automated Amazon Fresh supermarkets, and bus drivers in “driverless” buses who must remain hypervigilant for that 1% chance of needing to intervene. As Kate Crawford notes, <strong>“AI” is “neither artificial nor intelligent” – it&#39;s not replacing labor but rather making it more invisible</strong> (which Lilly Irani also discussed in depth).</p>

<p>For scientific research, I see several concerns. There&#39;s a growing trend of papers proposing to replace human participants with large language models or suggesting complete automation of the scientific process – with one paper proudly claiming it could produce entire research projects from ideation to paper publication for just USD 15 each. I warn that <strong>building science on top of opaque and unaccountable “AI” systems risks turning science into alchemy</strong>.</p>

<p>While some suggest banning “AI” in academic publishing (following incidents like the well-endowed lab rat paper), I caution that <strong>focusing <em>solely</em> on “AI” (“solely” being the key word) might entrench deeper problems</strong> like the broken peer review system and publish-or-perish culture. For example, publishing companies might offer proprietary “AI”-generated paper detection tools, which would make us <em>more</em> reliant on them and further consolidate their power, without tackling why researchers feel pressured to publish fake papers in the first place.</p>

<p>My key message is that “AI” often highlights existing problems rather than creating new ones. <strong>Instead of fixating on “AI” itself, we should address underlying issues in research culture</strong>, from job security to toxic workloads. I concluded by recommending resources like the Mystery AI Hype Theater 3000 podcast and the book “AI Snake Oil” for those interested in deeper exploration of these themes.</p>

<p>P.S. Note that a newer book, “The AI Con”, is about to be published in 2025: <a href="https://thecon.ai/">https://thecon.ai/</a></p>

<h2 id="further-reading">Further reading</h2>

<p>Please see the <a href="https://write.as/naclscrg/talk-ai-is-not-the-problem/">notes for my original “AI” talk</a> for links and references in addition to what&#39;s here.</p>
<ul><li>[report] <strong>Amazon’s AI Cameras Are Punishing Drivers</strong> for Mistakes They Didn’t Make: <a href="https://www.vice.com/en/article/amazons-ai-cameras-are-punishing-drivers-for-mistakes-they-didnt-make/">https://www.vice.com/en/article/amazons-ai-cameras-are-punishing-drivers-for-mistakes-they-didnt-make/</a></li>
<li>[report] <strong>Amazon Fresh</strong> kills “Just Walk Out” shopping tech—it never really worked: <a href="https://arstechnica.com/gadgets/2024/04/amazon-ends-ai-powered-store-checkout-which-needed-1000-video-reviewers/">https://arstechnica.com/gadgets/2024/04/amazon-ends-ai-powered-store-checkout-which-needed-1000-video-reviewers/</a></li>
<li>[report] Look, no hands! My trip on <strong>Seoul&#39;s self-driving bus</strong>: <a href="https://www.bbc.co.uk/news/business-68823705">https://www.bbc.co.uk/news/business-68823705</a></li>
<li>[podcast] Mystery AI Hype Theater 3000: <a href="https://www.dair-institute.org/maiht3k/">https://www.dair-institute.org/maiht3k/</a></li>
<li>[editorial] The advent of human-assisted peer review by AI – in Nature Biomedical Engineering: <a href="https://doi.org/10.1038/s41551-024-01228-0">https://doi.org/10.1038/s41551-024-01228-0</a></li>
<li><strong>Words matter</strong>, they affect the way we think about issues:
<ul><li>[essay] Stefano Quintarelli is a former Italian member of parliament who said that instead of “AI”, we could call those technologies “<strong>S</strong>ystematic <strong>A</strong>pproaches to <strong>L</strong>earning <strong>A</strong>lgorithms and <strong>M</strong>achine <strong>I</strong>nferences (<strong>SALAMI</strong>)”: <a href="https://blog.quintarelli.it/2019/11/lets-forget-the-term-ai-lets-call-them-systematic-approaches-to-learning-algorithms-and-machine-inferences-salami/">https://blog.quintarelli.it/2019/11/lets-forget-the-term-ai-lets-call-them-systematic-approaches-to-learning-algorithms-and-machine-inferences-salami/</a></li>
<li>[podcast] Completely randomly, I heard another “AI” replacement term “<strong>T</strong>echnical <strong>O</strong>riented <strong>A</strong>rtificial <strong>S</strong>tupidi<strong>T</strong>y (<strong>TOAST</strong>)” coined by Chris Roberts in the middle of a gaming podcast (19:31 into the video): <a href="https://www.youtube.com/live/ADYB-QJGheA?feature=shared&amp;t=1171">https://www.youtube.com/live/ADYB-QJGheA?feature=shared&amp;t=1171</a></li></ul></li>
<li>I didn&#39;t get to talk about the <strong>environmental costs</strong> of scaling (or the urge to scale up) “AI” technology; Timnit Gebru of the DAIR Institute touches on this and other issues in this interview (57:58 into the video): <iframe class="embedly-embed" src="//cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2Fnh7-ZNBql38%3Ffeature%3Doembed%26start%3D3478&display_name=YouTube&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3Dnh7-ZNBql38&image=https%3A%2F%2Fi.ytimg.com%2Fvi%2Fnh7-ZNBql38%2Fhqdefault.jpg&type=text%2Fhtml&schema=youtube" width="640" height="360" scrolling="no" title="YouTube embed" frameborder="0" allow="monetization; autoplay; fullscreen; encrypted-media; picture-in-picture" allowfullscreen="true"></iframe></li></ul>

<h3 id="books">Books</h3>

<p>Hanna, A., &amp; Bender, E. M. (2025). <strong>The AI Con</strong>—How to fight big tech’s hype and create the future we want. Harper. <a href="https://thecon.ai/">https://thecon.ai/</a></p>

<p>Narayanan, A., &amp; Kapoor, S. (2024). <strong>AI Snake Oil</strong>: What artificial intelligence can do, what it can’t, and how to tell the difference. Princeton University Press. <a href="https://press.princeton.edu/books/hardcover/9780691249131/ai-snake-oil">https://press.princeton.edu/books/hardcover/9780691249131/ai-snake-oil</a></p>

<h3 id="academic-literature">Academic literature</h3>

<p>Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., &amp; Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. <em>Political Analysis</em>, 31(3), 337–351. <a href="https://doi.org/10.1017/pan.2023.2">https://doi.org/10.1017/pan.2023.2</a></p>

<p>Bender, E. M., Gebru, T., McMillan-Major, A., &amp; Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. In <em>Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT &#39;21)</em>. Association for Computing Machinery, New York, NY, USA, 610–623. <a href="https://doi.org/10.1145/3442188.3445922">https://doi.org/10.1145/3442188.3445922</a></p>

<p>Gu, J., Liu, L., Wang, P., &amp; Theobalt, C. (2021). StyleNeRF: A style-based 3D-aware generator for high-resolution image synthesis. <em>arXiv</em>, 2110.08985. <a href="https://doi.org/10.48550/arXiv.2110.08985">https://doi.org/10.48550/arXiv.2110.08985</a></p>

<h2 id="transcript">Transcript</h2>

<p>This started from my conversation with Jennifer Ding at the Turing Institute. And we were talking about: what are some of the underlying issues around “AI” technology that we feel should be surfaced a little more rather than some of the stuff that we think is a little overhyped? And I&#39;m gonna go over a lot of those problems today.</p>

<p>Before I get into it, I want to do something I always emphasize in talks like this: any kind of technology can bring about a lot of change in how we do things and how we organize ourselves. And it&#39;s not a matter of saying: oh, you know, let&#39;s just not use it. There is potential in “AI” technologies, right? Because if you think about it, when the printing press came around, you wouldn&#39;t want to ban the printing press just because you were afraid that the scribes were gonna go out of business. Hopefully we can work together to find a way to realize the potential of a new technology.</p>

<p>And I think a positive example that I&#39;d like to share before jumping to everything else is this tool that Shern Tee shared with me. It&#39;s called the Speech Schema Filling Tool. So it was developed by chemists for use in their experiments. And what happens is that as you do your experiments, you talk into the microphone on your computer and the large language model on it will use your audio input to do a speech to text conversion and fill in your lab notebook with what you&#39;re saying. But what&#39;s really cool about it is that the tool will also parse what you&#39;re saying and record relevant metadata into a structured data format to go with your lab notebook. So there&#39;s a very well-structured metadata set to go with the particular experiment that you&#39;re doing. And I think as long as you&#39;re happy to talk through your experiment as you&#39;re doing it, this tool is so helpful for you to improve the quality of the data that you&#39;re capturing, helping make your experiments more reproducible and so on, right?</p>

<p>So there are certainly really good uses, of what people are calling “AI” technologies these days. Having said all of that, obviously there&#39;s also a lot of concern that we&#39;ve seen over the past couple of years, such as in terms of how people publish papers, right? This is a classic one I think Marcus shared a while back where if you look at the paper, starting right from the first sentence in the introduction, it says: “Certainly, here&#39;s a possible introduction for your topic.” And I think it&#39;s pretty clear that this probably came from ChatGPT, which is one of the more commonly used so-called “AI” tools today to generate text.</p>

<p>However, this is not my favorite one. So my favorite paper is this one. I don&#39;t know if some of you have seen it. I see some of you smiling, so you know what I&#39;m getting to. First of all, this was published in Frontiers back in February [2024]. If you look at the text, a lot of it looks fairly generic and probably “AI”-generated. But the most dramatic part is one of the figures, which shows a lab rat. Most of it looks like a normal rat, but it&#39;s got these giant genitals sticking out of it. The phallus is so long that it extends beyond the figure.</p>

<p>I just love how a figure like this got past the peer reviewers, past the editors, past the copyeditors of the journal, and got published. Now, for the record, it was retracted by the publisher pretty soon afterwards, but not before everyone on the internet got copies of the PDF and archived it. That&#39;s how I was able to get this amazing picture of this lab rat, which I love. And you can also see a lot of weirdly spelled words annotating this figure. So definitely check it out. I think this is one of the classics that&#39;s come out of the papers we&#39;ve seen over the past couple of years.</p>

<p>And in addition to generating these papers, we are also seeing some evidence that academics are using these tools to generate the peer reviews that they write. And to be honest, I can kind of relate to what these academics are going through because who has time, right, to do a really good peer review these days? And in higher education, of course, we know that some students feel really tempted to use these sort of [large] language models to generate their essays, and we&#39;re also seeing that some instructors are using the same tools to grade and mark the essays.</p>

<p>You know, there&#39;s an anecdote I heard about a PhD program that was recruiting students, I think it was in the US. They found that a lot of the applicants didn&#39;t have time to write so many cover letters for their applications. So they would write a few bullet points saying what they wanted in their cover letters, and use a large language model to turn them into the cover letter. And then the professors on the program, who have so many applications to sift through, would ask the same tool to translate it back into bullet points so that it&#39;s quicker for them to skim through.</p>

<p>So a lot of interesting use cases here, but I just wanna use this to set the stage to talk about three things today. The first is that words really matter when we talk about so-called “AI” technologies, because there&#39;s a lot of ambiguity in the language. The second is that this ambiguity can become really problematic, because it allows so-called “AI” to become a smokescreen that distracts us from the underlying issues that I think are more important to tackle. And lastly, I will try to bring all of this back to scientific research and think about what this means for scientific research, and maybe what it doesn&#39;t mean.</p>

<p>Okay, so what do I mean by words matter? Well, I think it&#39;s very important for us to realize that so-called “AI”, as we colloquially use it today, is very much just a marketing term and not a technical term of art!</p>

<p>To illustrate this point, I really like this paper. It&#39;s called “A style-based 3D-aware generator for high-resolution image synthesis.” And you can see that you can use this tool to generate very realistic-looking photos of people. And I use this example because I searched through the whole paper, including the title, and other than one of the affiliations of the first author, there&#39;s no mention of “artificial intelligence” in this paper at all.</p>

<p>And if you look at the publication date, it&#39;s 2022, just before all of the hype around “AI” started. And I think if this paper had been published just a year later, the text would be filled with references to “artificial intelligence”. And I think this is really important because it comes back to the point that a lot of the terminology we&#39;re using today around these technologies consists of marketing terms, like hallucinations, or reasoning skills, or training these models.</p>

<p>First of all, it really anthropomorphizes this technology. Kind of like how humans have a tendency to recognize faces in things, I feel using this terminology misleads us into recognizing intelligence in these tools as well. And I think that can be really problematic.</p>

<p>Another way to think about it is that when we are using our word processors to type up our papers, there&#39;s spellcheck, right? And spellcheck is basically a statistical model that takes an input and infers, in this case, the possible correct spelling for the word you&#39;re trying to spell. And this is not to minimize the amazing amount of work that&#39;s gone into these artificial intelligence technologies, but roughly speaking, large language models are also a very, very sophisticated form of statistical modeling that takes text as input and infers a natural-looking output.</p>

<p>And I think Emily Bender describes it really well when she calls these models “stochastic parrots”, because parrots might repeat words back to you, but they are literally incapable of understanding what they&#39;re saying. And this also applies to all of these “artificial intelligence” technologies.</p>

<p>And I think this ambiguous language is the feature, not the bug, because it&#39;s not just a matter of linguistics or semantics or nitpicking. We know from history that ambiguous language shifts power to the people who hold control over those tools and technologies. And I feel that the powerful people behind so-called “AI” are using this ambiguous language as a smokescreen to distract us from the very real problems underneath it.</p>

<p>I think it was just last year that a union was formed in Kenya, because there were so many sweatshop workers there who were hired by the company behind ChatGPT, and also by Facebook and other companies, to, well, as you can see here, make the models less toxic.</p>

<p>So what these workers do is constantly look at model outputs for the most egregious stuff, such as descriptions of sexual abuse, murder, suicide, and other really graphic details. And they&#39;re basically tweaking the model inputs whenever something really graphic comes out, [so that] the statistical inferences from these large language models are slightly less offensive.</p>

<p>And they&#39;re so traumatized by doing this kind of sweatshop work all day, every day, trying to keep ChatGPT working, that they actually formed a union. And I think this is important because that chemistry example I gave you earlier was one of “AI” assisting humans, right? But actually, a lot of the exploitation comes in when you have humans assisting the “AI”, such as these sweatshop workers.</p>

<p>Another one is, of course, Amazon Fresh. I took this picture of an Amazon Fresh store; this one is just south of Aldgate East Station in London. And I know some of you know this... So the selling point for Amazon Fresh is that you walk in, pick up whatever you wanna buy, and you just walk out. And they use really advanced “artificial intelligence”, where all of the cameras in the shop figure out what you bought and automatically charge your Amazon account.</p>

<p>But it also came out in the news this year [2024] that behind all of the so-called “artificial intelligence” was actually Amazon hiring sweatshop workers in India, whose sole job is to watch all of those cameras and manually tag what people are buying in these shops, while everyone thinks it&#39;s the “artificial intelligence” technology doing all of those things.</p>

<p>And Amazon shut down the whole thing soon afterwards. They&#39;re now shifting Amazon Fresh to a model where, rather than having all of those cameras watch you, whenever you grab an item, you have to manually scan it into your cart before you take it out.</p>

<p>And the other example that I think is very, very telling is this piece of news from the BBC earlier this year [2024] about a new driverless bus route that started in Seoul, South Korea. This bus is supposed to be completely driverless, right? And you can see a picture of this guy sitting in the [driver&#39;s seat].</p>

<p>So I like this picture, by the way, because this person even has his feet up to show that he doesn&#39;t have his feet on the pedals. And I wanna use this example to say that all of what I&#39;ve been showing you so far are cases of human-assisted “AI”.</p>

<p>You might be asking: if this bus is completely driverless, why do you still need someone to sit there? Well, this driver sits in the driver&#39;s seat and doesn&#39;t usually have to do anything; like 99% of the time they can just sit and watch the bus drive itself. But they have to be super vigilant the whole time, because in that 1% of situations where the driverless bus makes a mistake, the driver has to immediately react, come in, and actually make an adjustment to whatever the bus is doing.</p>

<p>So this driver actually has to be more vigilant than they would be if they were just driving a regular bus. And this is what we&#39;re also seeing, of course, with the Amazon delivery drivers who are [monitored by] the so-called “artificial intelligence” system. You know, it&#39;s constantly watching the drivers on these trucks as they make their deliveries.</p>

<p>And they&#39;re under so much pressure because on one hand, Amazon is constantly pressuring them into making their delivery quotas. On the other hand, this “artificial intelligence” disciplinary system is constantly watching their behavior, such as watching their eyeballs [to track] where they&#39;re looking. There&#39;s also some evidence that the camera is watching their lips because apparently some drivers, they would whistle or sing a tune as they&#39;re driving, and apparently that&#39;s a bad thing and you&#39;ll get marks taken off and you might not get your bonus at the end of the week. So they&#39;re constantly being disciplined like this.</p>

<p>So they have to deal with these inhuman competing demands. And in these examples, it&#39;s like we humans are basically mindless bodies, where the “AI” acts as the head to discipline us and make us do exactly what it wants us to do.</p>

<p>And it comes back to my point: if we think of it as an “artificial intelligence”, then we attribute agency to the technology. And that distracts us from the Jeff Bezos-es behind the technology who are actually using it to exert that power over us. And I think that&#39;s really dangerous, right?</p>

<p>And I think Kate Crawford describes it really well, where so-called “artificial intelligence” is neither artificial nor intelligent. And the use of this technology in the ways that I just described, you know, it&#39;s not really replacing labor. It is displacing labor and making it even more invisible to us.</p>

<p>And this is why I think words matter: they have so much epistemic power over how we think about things. And often the language of “artificial intelligence” distracts us from all of these underlying problems. Because, you know, if the “AI” on that driverless bus, let&#39;s say, hallucinates and makes a mistake, who are you gonna blame? We might blame that driver who wasn&#39;t vigilant enough to catch that 1% chance of the bus making a mistake, but is that really the issue here?</p>

<p>And that&#39;s where I&#39;d like to bring this back to scientific research. What does what we do as academic scientists have to do with any of this, right? Well, first of all, I&#39;m kind of concerned that even in academic scientific research, there is already sometimes a tendency to exploit.</p>

<p>This is a paper that I actually cited in my previous research, which talks about crowdsourcing the work that we do in science, whether it&#39;s data collection or data processing, to online volunteers. And I want to first say that sometimes this can be done really well. For instance, a lot of this is integrated into science outreach, education, and engagement, where as part of your engagement activity the participants get to do part of the science and help you analyze data. That can be mutually beneficial. But in papers like this, you often see language like “crowdsourcing”, right? Which frames all of this free labor as a way to shorten the time to perform the work for you, or to lower the cost of labor for the academic who&#39;s running the project.</p>

<p>And I think there&#39;s a bit of a danger here of perpetuating some of that exploitation. From time to time I&#39;m asked to review papers about this kind of crowdsourcing work, and the way they talk about the participants makes me concerned about where this is going, in terms of various technologies where we might accidentally perpetuate this smokescreen that I keep talking about.</p>

<p>The second thing is that because the language around “AI” is so misleading, we get papers like this one, which is basically saying: it&#39;s so costly and labor-intensive to recruit participants for your project, so why don&#39;t we replace them with large language models, who will never get tired of our interview questions? We don&#39;t need to give them any compensation, and we can get as many participants as we want in our study, because they&#39;re as good as the real thing anyway, right? So I think that&#39;s pretty problematic.</p>

<p>Another one talks about human-assisted peer review by “AI”, where they actually want to use these models to do peer reviews. And the claim in this particular editorial, in a Nature journal, is: “oh, it&#39;s gonna save so much work for the actual peer reviewer, because the &#39;AI&#39; is gonna do all of it”; the human just needs to come in at the end and briefly check that peer review to see if it&#39;s okay.</p>

<p>But this sounds so much like that bus driver to me. And we&#39;re seeing a lot of really high-profile papers like this. There&#39;s one that I didn&#39;t get to stick into the slides in time, which literally proposes using “AI” to completely take over the scientific discovery process: you use a large language model for question generation, to design and conduct the experiment, analyze the results, and write a paper, and then get another large language model to come in and peer review that paper.</p>

<p>And at the end of the abstract (I really wish I had put the abstract here), it says this saves so much money: “We calculated on average that if you outsource this entire thing to our &#39;AI&#39; tool, it will be able to produce all of that scientific research for you at a cost of $15 per paper.” And I think that says a lot about how much misunderstanding and hype there is around these technologies, that high-profile papers like this are starting to appear.</p>

<p>And I think Lisa Messeri describes it really well: if we develop this kind of reliance and think that “AI” technology is actually sentient and intelligent, then doing science this way will give us illusions of understanding. That&#39;s a fantastic paper I suggest you check out.</p>

<p>Okay, now, as someone who has been an open research advocate for a long time: another thing that&#39;s talked about in “AI” circles right now is that we should really make a lot of these “AI” tools open source. And I think there are good reasons for that. But in the context of open research, there&#39;s a lot of messiness there as well.</p>

<p>So you might have heard of Llama 2, one of the large language models released by Meta last year. They called it an “open source” large language model. But if you actually go to download the model, it comes with a ton of restrictions on what you can do with it and a lot of limitations. And a lot of parts of it are completely opaque: you&#39;re not allowed to see what the model is doing. So it certainly doesn&#39;t meet the industry definition of open source as it has been established for software.</p>

<p>Now, the Open Source Initiative has been working on this issue for a long time. And actually just a few weeks ago, they released the first version of an open source “AI” definition. And I think it&#39;s really important for academic researchers to be part of this process as well.</p>

<p>But in any case, what happens in practice? There was another study published earlier this year, where they looked at dozens of the popularly used large language models and scored them on their openness using 14 different criteria. And the overwhelming majority of them come not only with a ton of restrictions, but also a lot of black boxes, where you&#39;re not really allowed to know what&#39;s actually happening inside these models.</p>

<p>So you can see that ChatGPT is right there at the very bottom, as one of the most black-box large language models in use. And I think there&#39;s a real danger here: with all of this hype around so-called “artificial intelligence”, and all the talk about completely integrating it into the science that we do, we&#39;re building all of our science on top of this “AI” technology.</p>

<p>I think what&#39;s gonna happen is that we won&#39;t end up doing science anymore. We will be doing alchemy! Because it&#39;s built on top of this completely opaque system. And I think that&#39;s a fundamental danger to the future of doing science.</p>

<p>And I want to quickly bring us back to this very well-endowed lab rat that I mentioned at the beginning, because I know that in response to papers like this, some people are saying, okay, so of course, you know, we should certainly ban the use of “AI” technologies in the creation of papers. So maybe we should just completely cut “AI” out of the paper writing process, right?</p>

<p>And I think that&#39;s understandable to a large degree, but I wonder what problems we are actually solving if we focus on dealing with the “AI” part of it. Because I&#39;m concerned that “fixing” the “AI” might actually entrench deeper problems.</p>

<p>In this case, the broken peer review system and the publish-or-perish culture, right? Where these publishing monopolies... because, given what we&#39;ve seen in higher education in terms of finding fake essays written by students, I wouldn&#39;t be surprised if one of those big publishers released some proprietary “AI” tool, saying: “hey, if you publish in a journal with us, then we&#39;ll let you use our proprietary &#39;AI&#39; tool to detect fake paper submissions.”</p>

<p>That might seem to superficially solve the problem, but the deeper risk is that, in this example, we would become even more reliant on these huge publishers and cede even more power to them, right? And that&#39;s what I&#39;m really concerned about, because solutions like this don&#39;t really get at the actual problems leading to why people want, well, not necessarily want, but feel pressured into publishing those fake papers.</p>

<p>So I think a core message from these examples is that “AI” highlights existing problems. And it&#39;s important for us to be aware of the deeper problems in our research culture. These could be really long-standing issues like job security or the toxic workloads we have to put up with, right? Think about all of those lecturers who have to live in tents because they can&#39;t afford anything more than that.</p>

<p>And it&#39;s important to realize that “AI” didn&#39;t create these problems, just as “AI” didn&#39;t create the sweatshops that I mentioned earlier.</p>

<p>So to wrap things up: the main message I want to send today is that words really matter when we talk about these technologies, and we should be very sensitive in understanding what those words really mean. Instead of thinking about “AI”, we should think about the deeper underlying issues that have plagued us for so long, because very often “AI” is NOT the problem. It highlights existing problems, and we should reflect on and focus on those underlying issues.</p>

<p>If we only focus on “AI”, it risks making those problems even worse. Okay, so that&#39;s the bulk of my talk, but if I&#39;ve piqued your interest a little bit, I will leave you with some further reading, one of which is this one about generative “AI” and the automating of academia. The lead author is Richard Watermeyer based right here in Bristol. It&#39;s a fantastic read.</p>

<p>But if you&#39;re tired of reading yet another paper, I mentioned Emily Bender earlier. So Emily Bender and Alex Hanna host an incredible podcast called Mystery AI Hype Theater 3000, where every week they look at one of these so-called “AI” papers like the ones that I just showed you and tear it apart. And it&#39;s both very depressing and very entertaining at the same time.</p>

<p>Or if you&#39;d rather read a book, these two Princeton professors wrote one called “AI Snake Oil”, along the lines of what I&#39;m talking about today. And I think it&#39;s really informative in terms of how we might want to adapt our research culture in light of this new technology.</p>

<p>So that&#39;s some additional material that I think is useful. And in the interest of doing open research, I&#39;ve published these slides, the transcript, additional notes, and all of the references to Zenodo. So you can look at that and remix and use it if you want.</p>

<p>And I also want to just give a shout out to Jennifer Ding from the Turing Institute and Shern Tee, and everyone from the Turing Way community who&#39;s helped me develop this talk.</p>

<p>So that&#39;s what I have for you today. And thank you for coming.</p>

<hr/>

<p><a href="https://naclscrg.writeas.com/tag:talks" class="hashtag"><span>#</span><span class="p-category">talks</span></a> <a href="https://naclscrg.writeas.com/tag:AI" class="hashtag"><span>#</span><span class="p-category">AI</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/talk-ai-is-not-the-problem-follow-up</guid>
      <pubDate>Sat, 08 Feb 2025 15:54:31 +0000</pubDate>
    </item>
    <item>
      <title>Resist the urge to quantify scientific research assessment</title>
      <link>https://naclscrg.writeas.com/dont-quantify-assessments?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[Alarmingly, a recent article titled &#34;DeSci Labs launches novelty scores for scientific manuscripts&#34; (which I saw shared in this post) describes a new: &#xA;&#xA;  ...mathematical model scores feature which is an objective measure of novelty for scientific work.&#xA;!--more--&#xA;https://pharmaceuticalmanufacturer.media/pharma-manufacturing-news/latest-pharmaceutical-manufacturing-news/desci-labs-launches-novelty-scores-for-scientific-manuscript/&#xA;&#xA;The article says: &#xA;&#xA;  ...evaluating the novelty of scientific manuscripts and grant applications takes centre stage in the scientific peer review process. The primary reason work is rejected by editors of high-impact journals or funding agencies is because referees think it is not novel enough. However, the current peer review process is subjective, slow, labour-intensive, and prone to bias and inaccuracy [...] The release of these novelty scores [...] means there is now an objective, automated measurement of one of the core parts of the peer review process.&#xA;&#xA;As a general principle, I assume goodwill. With that in mind, it is with genuine, all due respect that I find this development to be deeply alarming. &#xA;&#xA;First of all, how can there possibly be an &#34;objective&#34; measure of novelty?????&#xA;&#xA;Secondly, while it&#39;s great to see on DeSci Labs&#39;s about page some laudable goals like enabling FAIRness, open science, developing open source software, and preserving scientific outputs (which I care deeply about), the same page also speaks of securing USD 6.5 million in &#34;seed funding&#34;, accelerating science, using &#34;Web3&#34; technology, and to &#34;accelerate growth and enhance customer loyalty&#34;. 
To me, this reeks of techno-solutionism and -accelerationism.&#xA;&#xA;Third, the underlying math is published in Nature: &#xA;&#xA;https://doi.org/10.1038/s41467-023-36741-4&#xA;&#xA;To me, all three of the above speak volumes about the state of scientific research culture, and not in a good way... 😩&#xA;&#xA;Contrast this with the excellent essay on &#34;The Limits of Data&#34; by C. Thi Nguyen recently shared with the Turing Way community by Shern Tee: &#xA;&#xA;https://doi.org/10.58875/LUXD6515&#xA;&#xA;Which reminds us: &#xA;&#xA;  ...policymakers and data users should remember that not everything is as tractable to the methodologies of data. It is tempting to act as if data-based methods simply offer direct, objective, and unhindered access to the world—that if we follow the methods of data, we will banish all bias, subjectivity, and unclarity from the world. The power of data is vast scalability; the price is context. We need to wean ourselves off the pure-data diet, to balance the power of data-based methodologies with the context-sensitivity and flexibility of qualitative methods and local experts with deep but nonportable understanding. 
Data is powerful but incomplete; don’t let it entirely drown out other modes of understanding.&#xA;&#xA;I hope the work on reforming academic research culture and #metaresearch could include diverse and skeptical voices in addition to simply developing new quantitative &#34;metrics&#34;. ]]&gt;</description>
      <content:encoded><![CDATA[<p>Alarmingly, a recent article titled “<a href="https://pharmaceuticalmanufacturer.media/pharma-manufacturing-news/latest-pharmaceutical-manufacturing-news/desci-labs-launches-novelty-scores-for-scientific-manuscript/">DeSci Labs launches novelty scores for scientific manuscripts</a>” (which I saw shared in <a href="https://mastodon.social/@hannaSH/113434991374602986">this post</a>) describes a new:</p>

<blockquote><p>...mathematical model scores feature which is an objective measure of novelty for scientific work.

<a href="https://pharmaceuticalmanufacturer.media/pharma-manufacturing-news/latest-pharmaceutical-manufacturing-news/desci-labs-launches-novelty-scores-for-scientific-manuscript/">https://pharmaceuticalmanufacturer.media/pharma-manufacturing-news/latest-pharmaceutical-manufacturing-news/desci-labs-launches-novelty-scores-for-scientific-manuscript/</a></p></blockquote>

<p>The article says:</p>

<blockquote><p>...evaluating the novelty of scientific manuscripts and grant applications takes centre stage in the scientific peer review process. The primary reason work is rejected by editors of high-impact journals or funding agencies is because referees think it is not novel enough. However, the current peer review process is subjective, slow, labour-intensive, and prone to bias and inaccuracy [...] The release of these novelty scores [...] means there is now an objective, automated measurement of one of the core parts of the peer review process.</p></blockquote>

<p>As a general principle, I assume goodwill. With that in mind, and with genuine, all due respect: I find this development <em>deeply alarming</em>.</p>

<p>First of all, how can there possibly be an “objective” measure of novelty?</p>

<p>Secondly, while it&#39;s great to see on <a href="https://desci.com/about">DeSci Labs&#39;s about page</a> some laudable goals like enabling FAIRness, open science, developing open source software, and preserving scientific outputs (all of which I care deeply about), the same page also speaks of securing USD 6.5 million in “seed funding”, accelerating science, using “Web3” technology, and aiming to “accelerate growth and enhance customer loyalty”. To me, this reeks of techno-solutionism and techno-accelerationism.</p>

<p>Third, the underlying mathematics is published in <em>Nature Communications</em>:</p>

<p><a href="https://doi.org/10.1038/s41467-023-36741-4">https://doi.org/10.1038/s41467-023-36741-4</a></p>
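<p>For readers unfamiliar with how such metrics work: these scores typically treat novelty as the statistical rarity of combinations (of references, topics, or keywords) relative to a prior corpus. Here is a deliberately toy sketch of that idea; this is my own illustration with hypothetical function and variable names, not DeSci Labs&#39; actual model.</p>

```python
from collections import Counter
from itertools import combinations
import math

def novelty_score(paper_topics, corpus):
    """Toy 'novelty' metric: average surprisal of a paper's pairwise
    topic combinations, measured against a corpus of prior papers.
    Illustrative only -- real models are more elaborate, but share the
    same core move: novelty = statistical atypicality."""
    pair_counts = Counter()
    for topics in corpus:
        pair_counts.update(frozenset(p) for p in combinations(sorted(set(topics)), 2))
    total = sum(pair_counts.values())
    pairs = [frozenset(p) for p in combinations(sorted(set(paper_topics)), 2)]
    if not pairs:
        return 0.0
    # Rarer pairings -> higher surprisal -> higher "novelty" (add-one smoothing)
    return sum(-math.log((pair_counts[p] + 1) / (total + 1)) for p in pairs) / len(pairs)

corpus = [["ml", "biology"], ["ml", "chemistry"], ["ml", "biology"]]
# An unusual pairing scores higher than a common one:
print(novelty_score(["poetry", "chemistry"], corpus) > novelty_score(["ml", "biology"], corpus))
```

<p>Note what such a model cannot see: the choice of corpus, the topic taxonomy, and the smoothing are all human judgement calls baked into the resulting “objective” number.</p>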

<p>To me, all three of the above speak volumes about the state of scientific research culture, and not in a good way... 😩</p>

<p>Contrast this with the <em>excellent</em> essay on “The Limits of Data” by C. Thi Nguyen recently shared with the Turing Way community by Shern Tee:</p>

<p><a href="https://doi.org/10.58875/LUXD6515">https://doi.org/10.58875/LUXD6515</a></p>

<p>Which reminds us:</p>

<blockquote><p>...policymakers and data users should remember that not everything is as tractable to the methodologies of data. It is tempting to act as if data-based methods simply offer direct, objective, and unhindered access to the world—that if we follow the methods of data, we will banish all bias, subjectivity, and unclarity from the world. The power of data is vast scalability; the price is context. We need to wean ourselves off the pure-data diet, to balance the power of data-based methodologies with the context-sensitivity and flexibility of qualitative methods and local experts with deep but nonportable understanding. Data is powerful but incomplete; don’t let it entirely drown out other modes of understanding.</p></blockquote>

<p>I hope the work on reforming academic research culture and <a href="https://naclscrg.writeas.com/tag:metaresearch" class="hashtag"><span>#</span><span class="p-category">metaresearch</span></a> will include diverse and skeptical voices, rather than simply developing new quantitative “metrics”.</p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt="CC"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt="BY"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt="SA"></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/dont-quantify-assessments</guid>
      <pubDate>Tue, 24 Dec 2024 20:43:57 +0000</pubDate>
    </item>
    <item>
      <title>Talk - Open source hardware for more equitable open science</title>
      <link>https://naclscrg.writeas.com/talk-open-source-hardware-for-more-equitable-open-science?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[Since 2023, I&#39;ve given several variations of my talk about open source hardware as a key component of open science. Here, I will share extra notes on what didn&#39;t fit in the talk, a transcript, further reading/resources, and a recording of the talk. &#xA;!--more--&#xA;This note is structured as follows, please scroll down to the section you&#39;re looking for. &#xA;&#xA;Recording&#xA;Transcript&#xA;Further reading/resources&#xA;&#xA;Recording&#xA;&#xA;I&#39;ve given several variations of this talk with multiple recordings. For now, here is the recording of an early iteration I gave at the Edinburgh Open Research Conference in mid-2023 (click on the &#34;Presentation Video&#34; link on the page): &#xA;&#xA;https://doi.org/10.2218/eor.2023.8112&#xA;&#xA;I will try to put other recordings here on a best effort basis. &#xA;&#xA;Transcript&#xA;&#xA;I will put a transcript of the talk here as soon as I can.&#xA;&#xA;Further reading/resources&#xA;&#xA;The official Open Source Hardware Definition: https://www.oshwa.org/definition/&#xA;OreSat open source cubesats: https://www.oresat.org/&#xA;Public Lab is the group which developed the open source balloon mapping platform in response to the 2010 Deepwater Horizon oil spill: https://publiclab.org/&#xA;Story about capturing photographic evidence of dumping toxic waste in the Mississippi River: https://publiclab.org/notes/eustatic/05-28-2013/kite-photos-of-ongoing-coal-pollution-in-plaquemines-parish-la&#xA;Claudia Martinez Mansell is a humanitarian worker and independent researcher who worked at the Bourj Al Shamali refugee camp in Lebanon. It&#39;s the community there that remixed the Public Lab balloon mapping platform for use in their camp. Relevant reading: &#xA;  https://placesjournal.org/article/camp-code/&#xA;  https://publiclab.org/notes/clauds/04-28-2016/camp-code-how-to-navigate-a-refugee-settlement&#xA;&#xA;Peer-reviewed papers&#xA;&#xA;Arancio, J. (2023). 
From inequalities to epistemic innovation: Insights from open science hardware projects in Latin America. Environmental Science &amp; Policy, 150, 103576. https://doi.org/10.1016/j.envsci.2023.103576&#xA;  Associated article: https://sparcopen.org/impact-story/often-overlooked-sharing-of-hardware-is-a-missing-link-in-open-science-puzzle/&#xA;Burke, N., Müller, G., Saggiomo, V., Hassett, A. R., Mutterer, J., Ó Súilleabháin, P., Zakharov, D., Healy, D., Reynaud, E. G., &amp; Pickering, M. (2024). EnderScope: A low-cost 3D printer-based scanning microscope for microplastic detection. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 382(2274), 20230214. https://doi.org/10.1098/rsta.2023.0214&#xA;Collins, J. T., Knapper, J., Stirling, J., Mduda, J., Mkindi, C., Mayagaya, V., Mwakajinga, G. A., Nyakyi, P. T., Sanga, V. L., Carbery, D., White, L., Dale, S., Lim, Z. J., Baumberg, J. J., Cicuta, P., McDermott, S., Vodenicharski, B., &amp; Bowman, R. (2020). Robotic microscopy for everyone: The OpenFlexure microscope. Biomedical Optics Express, 11(5), 2447–2460. https://doi.org/10.1364/BOE.385729&#xA;Grant, S. D., Cairns, G. S., Wistuba, J., &amp; Patton, B. R. (2019). Adapting the 3D-printed Openflexure microscope enables computational super-resolution imaging (No. 8:2003). F1000Research. https://doi.org/10.12688/f1000research.21294.1&#xA;Hsing, P.-Y., Johns, B., &amp; Matthes, A. (2024). Ecology and conservation researchers should adopt open source technologies. Frontiers in Conservation Science, 5. https://doi.org/10.3389/fcosc.2024.1364181&#xA;Pearce, J. M. (2020). Economic savings for scientific free and open source technology: A review. HardwareX, 8, e00139. https://doi.org/10.1016/j.ohx.2020.e00139&#xA;Thaler, A., Sturdivant, K., Neches, R., &amp; Levenson, J. (2024). The OpenCTD: A low-cost, open-source CTD for collecting baseline oceanographic data in coastal waters. Oceanography. 
https://doi.org/10.5670/oceanog.2024.60&#xA;&#xA;Useful guides&#xA;&#xA;UNESCO Open Science Toolkit guide on &#34;Supporting open hardware for open science&#34;: https://doi.org/10.54677/LUMO4515&#xA;Report - Creating an Open-source Hardware Ecosystem for Research and Sustainable Development: https://doi.org/10.5281/zenodo.8301858&#xA;Report - Supporting Open Science Hardware in Academia: Policy Recommendations for Science Funders and University Managers: https://doi.org/10.5281/zenodo.8030028&#xA;Open Know-How is a specification for including detailed metadata with your open source hardware project so that its designs are more machine readable, interoperable, and reproducible: https://www.internetofproduction.org/openknowhow&#xA;DIN SPEC 3105 is a specification for good practices in publishing and peer reviewing open source hardware designs: https://www.beuth.de/en/technical-rule/din-spec-3105-1/324805763&#xA;&#xA;Relevant organisations&#xA;&#xA;Gathering for Open Science Hardware (GOSH): https://openhardware.science/&#xA;Open Source Hardware Association: https://www.oshwa.org/&#xA;Open Science Hardware Foundation: https://opensciencehardware.org/&#xA;Internet of Production Alliance: https://www.internetofproduction.org/&#xA;Open Hardware Makers provide mentoring and training on how to develop and support open source hardware: https://openhardware.space/&#xA;IO Rodeo sells open source hardware for scientific research, including the OpenFlexure microscope: https://www.iorodeo.com/&#xA;&#xA;#talks #opensource #openresearch ]]&gt;</description>
      <content:encoded><![CDATA[<p>Since 2023, I&#39;ve given several variations of my talk about open source <strong>hardware</strong> as a key component of open science. Here, I will share extra notes on what didn&#39;t fit into the talk, a transcript, further reading/resources, and a recording of the talk.</p>

<p>This note is structured as follows; please scroll down to the section you&#39;re looking for.</p>
<ul><li>Recording</li>
<li>Transcript</li>
<li>Further reading/resources</li></ul>

<h2 id="recording">Recording</h2>

<p>I&#39;ve given several variations of this talk with multiple recordings. For now, here is the recording of an early iteration I gave at the Edinburgh Open Research Conference in mid-2023 (click on the “Presentation Video” link on the page):</p>

<p><a href="https://doi.org/10.2218/eor.2023.8112">https://doi.org/10.2218/eor.2023.8112</a></p>

<p>I will try to put other recordings here on a best-effort basis.</p>

<h2 id="transcript">Transcript</h2>

<p>I will put a transcript of the talk here as soon as I can.</p>

<h2 id="further-reading-resources">Further reading/resources</h2>
<ul><li>The official Open Source Hardware Definition: <a href="https://www.oshwa.org/definition/">https://www.oshwa.org/definition/</a></li>
<li>OreSat open source cubesats: <a href="https://www.oresat.org/">https://www.oresat.org/</a></li>
<li>Public Lab is the group which developed the open source balloon mapping platform in response to the 2010 Deepwater Horizon oil spill: <a href="https://publiclab.org/">https://publiclab.org/</a></li>
<li>Story about capturing photographic evidence of dumping toxic waste in the Mississippi River: <a href="https://publiclab.org/notes/eustatic/05-28-2013/kite-photos-of-ongoing-coal-pollution-in-plaquemines-parish-la">https://publiclab.org/notes/eustatic/05-28-2013/kite-photos-of-ongoing-coal-pollution-in-plaquemines-parish-la</a></li>
<li>Claudia Martinez Mansell is a humanitarian worker and independent researcher who worked at the Bourj Al Shamali refugee camp in Lebanon. It&#39;s the community there that remixed the Public Lab balloon mapping platform for use in their camp. Relevant reading:
<ul><li><a href="https://placesjournal.org/article/camp-code/">https://placesjournal.org/article/camp-code/</a></li>
<li><a href="https://publiclab.org/notes/clauds/04-28-2016/camp-code-how-to-navigate-a-refugee-settlement">https://publiclab.org/notes/clauds/04-28-2016/camp-code-how-to-navigate-a-refugee-settlement</a></li></ul></li></ul>

<h3 id="peer-reviewed-papers">Peer-reviewed papers</h3>
<ul><li>Arancio, J. (2023). From inequalities to epistemic innovation: Insights from open science hardware projects in Latin America. <em>Environmental Science &amp; Policy</em>, 150, 103576. <a href="https://doi.org/10.1016/j.envsci.2023.103576">https://doi.org/10.1016/j.envsci.2023.103576</a>
<ul><li>Associated article: <a href="https://sparcopen.org/impact-story/often-overlooked-sharing-of-hardware-is-a-missing-link-in-open-science-puzzle/">https://sparcopen.org/impact-story/often-overlooked-sharing-of-hardware-is-a-missing-link-in-open-science-puzzle/</a></li></ul></li>
<li>Burke, N., Müller, G., Saggiomo, V., Hassett, A. R., Mutterer, J., Ó Súilleabháin, P., Zakharov, D., Healy, D., Reynaud, E. G., &amp; Pickering, M. (2024). EnderScope: A low-cost 3D printer-based scanning microscope for microplastic detection. <em>Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences</em>, 382(2274), 20230214. <a href="https://doi.org/10.1098/rsta.2023.0214">https://doi.org/10.1098/rsta.2023.0214</a></li>
<li>Collins, J. T., Knapper, J., Stirling, J., Mduda, J., Mkindi, C., Mayagaya, V., Mwakajinga, G. A., Nyakyi, P. T., Sanga, V. L., Carbery, D., White, L., Dale, S., Lim, Z. J., Baumberg, J. J., Cicuta, P., McDermott, S., Vodenicharski, B., &amp; Bowman, R. (2020). Robotic microscopy for everyone: The OpenFlexure microscope. <em>Biomedical Optics Express</em>, 11(5), 2447–2460. <a href="https://doi.org/10.1364/BOE.385729">https://doi.org/10.1364/BOE.385729</a></li>
<li>Grant, S. D., Cairns, G. S., Wistuba, J., &amp; Patton, B. R. (2019). Adapting the 3D-printed Openflexure microscope enables computational super-resolution imaging (No. 8:2003). <em>F1000Research</em>. <a href="https://doi.org/10.12688/f1000research.21294.1">https://doi.org/10.12688/f1000research.21294.1</a></li>
<li>Hsing, P.-Y., Johns, B., &amp; Matthes, A. (2024). Ecology and conservation researchers should adopt open source technologies. <em>Frontiers in Conservation Science</em>, 5. <a href="https://doi.org/10.3389/fcosc.2024.1364181">https://doi.org/10.3389/fcosc.2024.1364181</a></li>
<li>Pearce, J. M. (2020). Economic savings for scientific free and open source technology: A review. <em>HardwareX</em>, 8, e00139. <a href="https://doi.org/10.1016/j.ohx.2020.e00139">https://doi.org/10.1016/j.ohx.2020.e00139</a></li>
<li>Thaler, A., Sturdivant, K., Neches, R., &amp; Levenson, J. (2024). The OpenCTD: A low-cost, open-source CTD for collecting baseline oceanographic data in coastal waters. <em>Oceanography</em>. <a href="https://doi.org/10.5670/oceanog.2024.60">https://doi.org/10.5670/oceanog.2024.60</a></li></ul>

<h3 id="useful-guides">Useful guides</h3>
<ul><li>UNESCO Open Science Toolkit guide on “Supporting open hardware for open science”: <a href="https://doi.org/10.54677/LUMO4515">https://doi.org/10.54677/LUMO4515</a></li>
<li>Report – Creating an Open-source Hardware Ecosystem for Research and Sustainable Development: <a href="https://doi.org/10.5281/zenodo.8301858">https://doi.org/10.5281/zenodo.8301858</a></li>
<li>Report – Supporting Open Science Hardware in Academia: Policy Recommendations for Science Funders and University Managers: <a href="https://doi.org/10.5281/zenodo.8030028">https://doi.org/10.5281/zenodo.8030028</a></li>
<li>Open Know-How is a specification for including detailed metadata with your open source hardware project so that its designs are more machine readable, interoperable, and reproducible: <a href="https://www.internetofproduction.org/openknowhow">https://www.internetofproduction.org/openknowhow</a></li>
<li>DIN SPEC 3105 is a specification for good practices in <em>publishing</em> and <em>peer reviewing</em> open source hardware designs: <a href="https://www.beuth.de/en/technical-rule/din-spec-3105-1/324805763">https://www.beuth.de/en/technical-rule/din-spec-3105-1/324805763</a></li></ul>

<h3 id="relevant-organisations">Relevant organisations</h3>
<ul><li>Gathering for Open Science Hardware (GOSH): <a href="https://openhardware.science/">https://openhardware.science/</a></li>
<li>Open Source Hardware Association: <a href="https://www.oshwa.org/">https://www.oshwa.org/</a></li>
<li>Open Science Hardware Foundation: <a href="https://opensciencehardware.org/">https://opensciencehardware.org/</a></li>
<li>Internet of Production Alliance: <a href="https://www.internetofproduction.org/">https://www.internetofproduction.org/</a></li>
<li>Open Hardware Makers provide mentoring and training on how to develop and support open source hardware: <a href="https://openhardware.space/">https://openhardware.space/</a></li>
<li>IO Rodeo <strong>sells</strong> open source hardware for scientific research, including the OpenFlexure microscope: <a href="https://www.iorodeo.com/">https://www.iorodeo.com/</a></li></ul>

<p><a href="https://naclscrg.writeas.com/tag:talks" class="hashtag"><span>#</span><span class="p-category">talks</span></a> <a href="https://naclscrg.writeas.com/tag:opensource" class="hashtag"><span>#</span><span class="p-category">opensource</span></a> <a href="https://naclscrg.writeas.com/tag:openresearch" class="hashtag"><span>#</span><span class="p-category">openresearch</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt="CC"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt="BY"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt="SA"></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/talk-open-source-hardware-for-more-equitable-open-science</guid>
      <pubDate>Mon, 02 Dec 2024 16:36:30 +0000</pubDate>
    </item>
    <item>
      <title>A digital preservation workflow for academic research</title>
      <link>https://naclscrg.writeas.com/a-digital-preservation-workflow-for-academic-research?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[As part of the Data Lifeboat meeting I attended in November 2024, I&#39;m jotting down some rough, high-level thoughts on what a good digital preservation workflow might be. &#xA;!--more--&#xA;I am writing this as a stream of consciousness from my experience as an academic researcher. There are certainly things I missed or that I will think of later. &#xA;&#xA;The workflow is organised below into three stages: Pre research; during research; and post research. Within each one I&#39;ll write down what would be good to happen at that stage. &#xA;&#xA;Pre research&#xA;&#xA;Start with a research &#34;data&#34; management plan. I&#39;m using the term data very broadly here to mean the artefacts that result from a research projects, which could be (but not limited to) general notes, numerical data, interview transcripts, audio/video recordings, artwork, lab notebooks, etc. &#xA;&#xA;When writing the plan, think about: &#xA;&#xA;What artefacts do you anticipate from the research? Which will be shared and preserved? Remember things may be be produced throughout, not just at the end. &#xA;How will artefacts be shared and preserved? Any anticipated barriers? How might they be overcome? How could this be done in a way that ensures, as much as possible, that they can be human and machine readable years later?&#xA;Where will they be preserved? Make sure the appropriate digital repositories are in place. &#xA;When do you expect each output to be produced? Will the &#34;how&#34; be ready at those times? Closely related is for how long (what timescales) do you hope for they to be preserved? 10 years? 20 years? 100 years?!&#xA;Who will take on the responsibility of carrying out this plan? &#xA;&#xA;From experience, I know that a big challenge is not just coming up with such a plan, but to budget the time, resources, and labour to implement it. In academic research, I think this is an underappreciated point. 
At least from my scientific background, there are many scientists who scramble to prepare and publish data (usually because an academic journal requires them to publish data) at the last minute, and end up doing a poor job at digital preservation. &#xA;&#xA;During research&#xA;&#xA;During the course of a research project, remember to do good documentation. In my view, it is especially important to write down things like spontaneous learnings (&#34;what are we learning along the way?&#34;) or to note deviations from the research plan. &#xA;&#xA;Documentation could also be informal, like rehearsal notes for performing arts or daily lab notebooks for an experimental scientist. Blog posts are also good. &#xA;&#xA;Regularly check in with the original data management plan to see if it is being followed or if changes are needed. &#xA;&#xA;Post research&#xA;&#xA;In my view, a post-mortem is a critical exercise in any research project. This is true, too, for reflecting on how well a project&#39;s digital preservation plan/data management plan worked. Some questions to ask: &#xA;&#xA;Did we produce the digital artefacts we anticipated at the beginning? &#xA;What was the experience of sharing and preserving those artefacts? Any points of friction? &#xA;What would we do differently next time? &#xA;How will we preserve and shared what we learned from this post-mortem to inform future efforts?&#xA;&#xA;Another meta issue I see in academic research is the lack of appreciation, and highlighting of, the reuse of digitally preserved material. At least from what I&#39;ve seen, there&#39;s lots of talk in #openresearch circles about sharing and how to do it well, but far less on using what others have shared! &#xA;&#xA;I think if we do a good job of telling stories about the use of shared stuff, then we can more effectively make a case for digitally preserving said stuff and reducing #intellectualpoverty. 
]]&gt;</description>
      <content:encoded><![CDATA[<p>As part of the Data Lifeboat meeting I attended in November 2024, I&#39;m jotting down some rough, high-level thoughts on what a good digital preservation workflow might be.</p>

<p>I am writing this as a stream of consciousness from my experience as an academic researcher. There are certainly things I missed or that I will think of later.</p>

<p>The workflow is organised below into three stages: pre-research, during research, and post-research. Within each one, I&#39;ll write down what would be good to happen at that stage.</p>

<h2 id="pre-research">Pre-research</h2>

<p>Start with a research “data” management plan. I&#39;m using the term data very broadly here to mean the <strong>artefacts</strong> that result from a research project, which could include (but are not limited to) general notes, numerical data, interview transcripts, audio/video recordings, artwork, lab notebooks, etc.</p>

<p>When writing the plan, think about:</p>
<ul><li><strong>What</strong> artefacts do you anticipate from the research? Which will be shared and preserved? Remember things may be produced throughout, not just at the end.</li>
<li><strong>How</strong> will artefacts be shared and preserved? Any anticipated barriers? How might they be overcome? How could this be done in a way that ensures, as much as possible, that they can be human and machine readable years later?</li>
<li><strong>Where</strong> will they be preserved? Make sure the appropriate digital repositories are in place.</li>
<li><strong>When</strong> do you expect each output to be produced? Will the “how” be ready at those times? Closely related: <strong>for how long</strong> (on what timescales) do you hope they will be preserved? 10 years? 20 years? 100 years?!</li>
<li><strong>Who</strong> will take on the responsibility of carrying out this plan?</li></ul>
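<p>To make the “how” concrete: one simple, durable practice is to ship a plain-text manifest with checksums alongside the artefacts, so both humans and machines can verify them years later. A minimal sketch follows; this is my own illustration, and the function and field names are hypothetical, not any standard.</p>

```python
import hashlib
import pathlib

def build_manifest(root):
    """Record each artefact's relative path, size, and SHA-256 digest,
    so future users (and machines) can check the files are intact."""
    root = pathlib.Path(root)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            entries.append({
                "path": str(path.relative_to(root)),
                "bytes": path.stat().st_size,
                "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            })
    return {"manifest_version": 1, "artefacts": entries}
```

<p>Writing this out as JSON or CSV next to the deposited data costs almost nothing, and gives the “for how long” question some teeth: anyone can later re-run the checksums to detect silent corruption.</p>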

<p>From experience, I know that a big challenge is not just coming up with such a plan, but <strong>budgeting the time, resources, and labour to implement it</strong>. In academic research, I think this is an underappreciated point. At least from my scientific background, I know many scientists who scramble to prepare and publish data at the last minute (usually because an academic journal requires it), and end up doing a poor job of digital preservation.</p>

<h2 id="during-research">During research</h2>

<p>During the course of a research project, remember to do good documentation. In my view, it is especially important to write down things like spontaneous learnings (“what are we learning along the way?”) or to note deviations from the research plan.</p>

<p>Documentation could also be informal, like rehearsal notes for performing arts or daily lab notebooks for an experimental scientist. Blog posts are also good.</p>

<p>Regularly check in with the original data management plan to see if it is being followed or if changes are needed.</p>

<h2 id="post-research">Post-research</h2>

<p>In my view, a post-mortem is a critical exercise in any research project. This is true, too, for reflecting on how well a project&#39;s digital preservation plan/data management plan worked. Some questions to ask:</p>
<ul><li>Did we produce the digital artefacts we anticipated at the beginning?</li>
<li>What was the experience of sharing and preserving those artefacts? Any points of friction?</li>
<li>What would we do differently next time?</li>
<li><strong>How will we preserve and share what we learned from this post-mortem to inform future efforts?</strong></li>

<p>Another meta issue I see in academic research is the lack of appreciation for, and highlighting of, the <strong>reuse</strong> of digitally preserved material. At least from what I&#39;ve seen, there&#39;s lots of talk in <a href="https://naclscrg.writeas.com/tag:openresearch" class="hashtag"><span>#</span><span class="p-category">openresearch</span></a> circles about <em>sharing</em> and how to do it well, but far less on <em>using</em> what others have shared!</p>

<p>I think if we do a good job of telling stories about the <em>use</em> of shared stuff, then we can more effectively make a case for digitally preserving said stuff and reducing <a href="https://naclscrg.writeas.com/tag:intellectualpoverty" class="hashtag"><span>#</span><span class="p-category">intellectualpoverty</span></a>.</p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt="CC"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt="BY"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt="SA"></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/a-digital-preservation-workflow-for-academic-research</guid>
      <pubDate>Mon, 11 Nov 2024 15:23:54 +0000</pubDate>
    </item>
    <item>
      <title>Visual accessibility notes</title>
      <link>https://naclscrg.writeas.com/visual-accessibility-notes?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[I recently posted threads and received helpful responses from the Turing Way Slack group discussing visual #accessibility both for #datavisualisation and text. &#xA;!--more--&#xA;Visualisations&#xA;&#xA;My original prompt was: &#xA;&#xA;  Visual accessibility question: Is converting color to greyscale (or black and white) an adequate test of color blind accessibility (e.g. if I convert a data visualization with color to greyscale and check if I can still understand it)? If no, what&#39;s a good test or rule of thumb?&#xA;&#xA;Short answer: No. &#xA;&#xA;Here are the useful responses I got for which I&#39;m grateful (bold emphasis mine): &#xA;&#xA;Liz Hare: Good question! I don&#39;t think that would work because of the way colors are perceived.  There are a few different approaches you could take. You could use some secondary code like texture or text labels. Also, it depends on how you are working. I know there are colorblind-friendly color palette packages for R. And don&#39;t forget the alt text.&#xA;Alycia Crall: I’ve always found this testing tool very helpful: https://webaim.org/resources/contrastchecker/&#xA;Hao Ye: Why this doesn&#39;t work:&#xA;  converting color to grayscale is dimensional reduction (3 color axes -  1 axis of brightness)&#xA;  the conversion method is probably based on the perceptual attributes of an average human with 3 cone types&#xA;  someone who deviates from that, e.g. by not having a particular cone, will perceive relative brightness differently than what the grayscale conversion produces&#xA;DavidPS: If you open the image with Firefox, and right-click on it you will see a &#34;inspect accessibility properties&#34; button. Clicking on it you will see a simulate button. 
There you can try many different types of colour accessibility issues.&#xA;Anne Lee Steele: @DavidPS - I&#39;ve previously used this extension in another context: https://addons.mozilla.org/en-GB/firefox/addon/let-s-get-color-blind/, I didn&#39;t realise that this is now built in to the browser, amazing!&#xA;Shern Tee: Given that a &#34;grayscale-legible&#34; chart is not necessarily color-accessible -- what about the reverse? That is, do schemes that account for different colour perceptions also tend to make charts more grayscale-legible? Or is that not generally ensured? I ask because I don&#39;t have different colour perception (as far as I know!), but I do frequently print papers in grayscale. I&#39;m guilty of assuming that grayscale legibility would equal colour accessibility. I&#39;ve often encouraged my students to consider being as thoughtful as possible -- not just using colour palettes but line-dashing, symbol shapes, and explicit labels to clarify information -- but I wonder now if there&#39;s no overlap, or some overlap!&#xA;  Hao Ye: @Shern Tee - I think so, based on arguments that a set of colors that are distinct under different color perception modes, would probably have to rely on brightness that is agnostic to any specific color channels, and thus render as distinct when converted to grayscale. I would probably have to do some linear algebra to check for sure! &#xA;&#xA;Fonts&#xA;&#xA;Original prompt: &#xA;&#xA;  As a follow up, over the years I&#39;ve noted some open source fonts designed for accessibility:&#xA;  Atkinson Hyperlegible by the Braille Institute (source code): https://brailleinstitute.org/freefont (expanded forks here and here)&#xA;  Inclusive Sans (source code): https://www.oliviaking.com/inclusive-sans (now here: https://www.oliviaking.com/inclusivesans/feature)&#xA;  * OpenDyslexic: https://github.com/antijingoist/opendyslexic&#xA;  My question is: While I like the idea of accessible fonts (e.g. 
I like good distinction between 0,o,O or 1,i,l), I don&#39;t know how to critically evaluate them. What should one consider when choosing a font for visual accessibility?&#xA;&#xA;Liam McGee gave a useful response from the perpsective of dyslexic accessibility. The short version is that ostensibly dyslexic accessible fonts might not be that useful after all. &#xA;&#xA;With regards to dyslexia (according to Liam): &#xA;&#xA;They don&#39;t have much evidential backup... [see] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629233/ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5934461/&#xA;In general, using particular fonts are shown to help particular people with dyslexia (sans-serif is better for some, some like comic sans, distressingly) but I have yet to see evidence for a single font being generally helpful.&#xA;https://www.linkedin.com/pulse/dyslexic-myths-presented-truths-gareth-ford-williams/ is worth reading on the subject.&#xA;also https://www.linkedin.com/posts/christophestrobbeat-the-bbc-20-fonts-were-tested-for-readability-activity-7001490480043540480-7idF/&#xA;And... https://dyslexiaida.org/do-special-fonts-help-people-with-dyslexia/&#xA;https://link.springer.com/article/10.1007/s11881-018-0164-z&#xA;And, more nuanced: https://onlinelibrary.wiley.com/doi/10.1002/dys.1527&#xA;&#xA;More generally: &#xA;&#xA;...distinguishability is important, as is kerning.&#xA;A guide to understanding what makes a typeface accessible - And how to make more informed design decisions: https://medium.com/the-readability-group/a-guide-to-understanding-what-makes-a-typeface-accessible-and-how-to-make-informed-decisions-9e5c0b9040a0&#xA;Don&#39;t overlook more general typography such as leading and margins. 
https://en.wikipedia.org/wiki/TheElementsofTypographicStyle is an excellent reference for this.&#xA;  Which informed a thesis style: https://bitbucket.org/amiede/classicthesis/wiki/Home&#xA;&#xA;Liam also insightfully noted that &#34;Accessibility is just aesthetics with a more sensitive gauge… where the consequence of a lack of clarity, harmony and structure is greater to some people than to others... But good typography and layout is definitely an accessibility aid.&#34; Great point!&#xA;&#xA;Liam also mentioned the 2:3 aspect ratio which is &#34;12mm off the side of A4 (so 198x297)&#34;, where &#34;2:3... cut in half, it&#39;s 3:4... cut in half, 2:3. Like a musical harmonic.&#34;&#xA;&#xA;Other than the above, I note that SIL publishes various open source fonts, including Charis SIL (&#34;optimized for readability&#34;) and Andika (for the needs of &#34;beginning readers&#34;), both with wide character coverage for various languages. What&#39;s cool is that SIL hosts a TypeTuner which allows you to customise font features (e.g. whether to have slashes through 0s and 7s) and download their fonts with those features enabled. 
&#xA;&#xA;Also, Atkinson Hyperlegible had a new release in early 2025 called Next (wider character coverage) and Mono (official monospace version!): &#xA;https://www.brailleinstitute.org/freefont/&#xA;&#xA;Alt-text&#xA;&#xA;Great guide on how to compose alt-text for images: &#xA;https://www.perkins.org/resource/how-write-alt-text-and-image-descriptions-visually-impaired/&#xA;&#xA;Which has a great visual example: &#xA;Visual depiction of elements which should go into alt-text image captions&#xA;&#xA;----------&#xD;&#xA;&#xD;&#xA; p xmlns:cc=&#34;http://creativecommons.org/ns#&#34; Unless otherwise stated, all original content in this post is shared under the a href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;blank&#34; rel=&#34;license noopener noreferrer&#34; style=&#34;display:inline-block;&#34;Creative Commons Attribution-ShareAlike 4.0 International/a licensea href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;_blank&#34; rel=&#34;license noopener noreferrer&#34; style=&#34;display:inline-block;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1&#34; alt=&#34;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1&#34; alt=&#34;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1&#34; alt=&#34;&#34;/a/p ]]&gt;</description>
      <content:encoded><![CDATA[<p>I recently posted threads and received helpful responses from the Turing Way Slack group discussing visual <a href="https://naclscrg.writeas.com/tag:accessibility" class="hashtag"><span>#</span><span class="p-category">accessibility</span></a> both for <a href="https://naclscrg.writeas.com/tag:datavisualisation" class="hashtag"><span>#</span><span class="p-category">datavisualisation</span></a> and text.
</p>

<h2 id="visualisations">Visualisations</h2>

<p>My original prompt was:</p>

<blockquote><p>Visual accessibility question: Is converting color to greyscale (or black and white) an adequate test of color blind accessibility (e.g. if I convert a data visualization with color to greyscale and check if I can still understand it)? If no, what&#39;s a good test or rule of thumb?</p></blockquote>

<p>Short answer: No.</p>

<p>Here are the useful responses I got for which I&#39;m grateful (bold emphasis mine):</p>
<ul><li>Liz Hare: Good question! I don&#39;t think that would work because of the way colors are perceived.  There are a few different approaches you could take. You could use some <strong>secondary code like texture or text labels</strong>. Also, it depends on how you are working. I know there are <strong>colorblind-friendly color palette packages for R</strong>. And don&#39;t forget the <strong>alt text</strong>.</li>
<li>Alycia Crall: I’ve always found this testing tool very helpful: <a href="https://webaim.org/resources/contrastchecker/">https://webaim.org/resources/contrastchecker/</a></li>
<li>Hao Ye: Why this <strong>doesn&#39;t</strong> work:
<ul><li>converting color to grayscale is dimensional reduction (3 color axes –&gt; 1 axis of brightness)</li>
<li>the conversion method is probably based on the perceptual attributes of an average human with 3 cone types</li>
<li>someone who deviates from that, e.g. by not having a particular cone, will perceive relative brightness differently than what the grayscale conversion produces</li></ul></li>
<li>DavidPS: <strong>If you open the image with Firefox, and right-click on it you will see a “inspect accessibility properties”</strong> button. Clicking on it you will see a <strong>simulate button</strong>. There you can try many different types of colour accessibility issues.</li>
<li>Anne Lee Steele: @DavidPS – <strong>I&#39;ve previously used this extension in another context: <a href="https://addons.mozilla.org/en-GB/firefox/addon/let-s-get-color-blind/">https://addons.mozilla.org/en-GB/firefox/addon/let-s-get-color-blind/</a></strong>, I didn&#39;t realise that this is now built in to the browser, amazing!</li>
<li>Shern Tee: Given that a “grayscale-legible” chart is not necessarily color-accessible — <strong>what about the reverse?</strong> That is, do schemes that account for different colour perceptions also tend to make charts more grayscale-legible? Or is that not generally ensured? I ask because I don&#39;t have different colour perception (as far as I know!), but I do frequently print papers in grayscale. I&#39;m guilty of assuming that grayscale legibility would equal colour accessibility. I&#39;ve often encouraged my students to consider being as thoughtful as possible — not just using colour palettes but line-dashing, symbol shapes, and explicit labels to clarify information — but I wonder now if there&#39;s no overlap, or some overlap!
<ul><li>Hao Ye: @Shern Tee – I think so, based on arguments that a set of colors that are distinct under different color perception modes, would probably have to rely on brightness that is agnostic to any specific color channels, and thus render as distinct when converted to grayscale. I would probably have to do some linear algebra to check for sure!</li></ul></li></ul>
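<p>Hao Ye&#39;s dimensionality-reduction point can be made concrete with a small sketch. This uses the standard ITU-R BT.601 luma weights (a common grayscale conversion); the two example colours are my own illustration, not from the thread:</p>

```python
# Grayscale conversion is a lossy dimensionality reduction
# (3 colour axes -> 1 brightness axis): two clearly different
# colours can map to nearly the same gray value.

def luma(r, g, b):
    """Approximate perceived brightness (0-255) via BT.601 weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b

pure_red   = (255, 0, 0)  # luma ~76.2
dark_green = (0, 130, 0)  # luma ~76.3

# Nearly identical in grayscale, yet a classic red/green confusion
# pair -- so a grayscale check tells you little about colour-blind
# accessibility.
print(luma(*pure_red), luma(*dark_green))
```

<p>So a chart that survives grayscale conversion can still fail for red/green colour vision, and vice versa, which matches the &#34;Short answer: No&#34; above.</p>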

<h2 id="fonts">Fonts</h2>

<p>Original prompt:</p>

<blockquote><p>As a follow up, over the years I&#39;ve noted some open source fonts designed for accessibility:
* Atkinson Hyperlegible by the Braille Institute (source code): <a href="https://brailleinstitute.org/freefont">https://brailleinstitute.org/freefont</a> (expanded forks here and here)
* Inclusive Sans (source code): <a href="https://www.oliviaking.com/inclusive-sans">https://www.oliviaking.com/inclusive-sans</a> (now here: <a href="https://www.oliviaking.com/inclusivesans/feature">https://www.oliviaking.com/inclusivesans/feature</a>)
* OpenDyslexic: <a href="https://github.com/antijingoist/opendyslexic">https://github.com/antijingoist/opendyslexic</a>
My question is: While I like the idea of accessible fonts (e.g. I like good distinction between 0,o,O or 1,i,l), I don&#39;t know how to critically evaluate them. What should one consider when choosing a font for visual accessibility?</p></blockquote>

<p>Liam McGee gave a useful response from the perspective of dyslexic accessibility. The short version is that ostensibly dyslexia-friendly fonts might not be that useful after all.</p>

<p>With regards to dyslexia (according to Liam):</p>
<ul><li>They don&#39;t have much evidential backup... [see] <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629233/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629233/</a> <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5934461/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5934461/</a></li>
<li>In general, particular fonts are shown to help particular people with dyslexia (sans-serif is better for some, some like comic sans, distressingly) but I have yet to see evidence for a single font being generally helpful.</li>
<li><a href="https://www.linkedin.com/pulse/dyslexic-myths-presented-truths-gareth-ford-williams/">https://www.linkedin.com/pulse/dyslexic-myths-presented-truths-gareth-ford-williams/</a> is worth reading on the subject.</li>
<li>also <a href="https://www.linkedin.com/posts/christophestrobbe_at-the-bbc-20-fonts-were-tested-for-readability-activity-7001490480043540480-7idF/">https://www.linkedin.com/posts/christophestrobbe_at-the-bbc-20-fonts-were-tested-for-readability-activity-7001490480043540480-7idF/</a></li>
<li>And... <a href="https://dyslexiaida.org/do-special-fonts-help-people-with-dyslexia/">https://dyslexiaida.org/do-special-fonts-help-people-with-dyslexia/</a></li>
<li><a href="https://link.springer.com/article/10.1007/s11881-018-0164-z">https://link.springer.com/article/10.1007/s11881-018-0164-z</a></li>
<li>And, more nuanced: <a href="https://onlinelibrary.wiley.com/doi/10.1002/dys.1527">https://onlinelibrary.wiley.com/doi/10.1002/dys.1527</a></li></ul>

<p>More generally:</p>
<ul><li>...distinguishability is important, as is kerning.</li>
<li>A guide to understanding what makes a typeface accessible – And how to make more informed design decisions: <a href="https://medium.com/the-readability-group/a-guide-to-understanding-what-makes-a-typeface-accessible-and-how-to-make-informed-decisions-9e5c0b9040a0">https://medium.com/the-readability-group/a-guide-to-understanding-what-makes-a-typeface-accessible-and-how-to-make-informed-decisions-9e5c0b9040a0</a></li>
<li>Don&#39;t overlook more general typography such as leading and margins. <a href="https://en.wikipedia.org/wiki/The_Elements_of_Typographic_Style">https://en.wikipedia.org/wiki/The_Elements_of_Typographic_Style</a> is an excellent reference for this.
<ul><li>Which informed a thesis style: <a href="https://bitbucket.org/amiede/classicthesis/wiki/Home">https://bitbucket.org/amiede/classicthesis/wiki/Home</a></li></ul></li></ul>

<p>Liam also insightfully noted that “Accessibility is just aesthetics with a more sensitive gauge… where the consequence of a lack of clarity, harmony and structure is greater to some people than to others... But good typography and layout is definitely an accessibility aid.” Great point!</p>

<p>Liam also mentioned the 2:3 aspect ratio which is “12mm off the side of A4 (so 198x297)”, where “2:3... cut in half, it&#39;s 3:4... cut in half, 2:3. Like a musical harmonic.”</p>

<p>Other than the above, I note that <a href="https://software.sil.org/fonts/">SIL publishes various open source fonts</a>, including <a href="https://software.sil.org/charis/">Charis SIL</a> (“optimized for readability”) and <a href="https://software.sil.org/andika/">Andika</a> (for the needs of “beginning readers”), both with wide character coverage for various languages. What&#39;s cool is that SIL hosts a <a href="https://scripts.sil.org/ttw/fonts2go.cgi">TypeTuner</a> which allows you to customise font features (e.g. whether to have slashes through <code>0</code>s and <code>7</code>s) and download their fonts with those features enabled.</p>

<p>Also, Atkinson Hyperlegible had a new release in early 2025 called Next (wider character coverage) and Mono (official monospace version!):
<a href="https://www.brailleinstitute.org/freefont/">https://www.brailleinstitute.org/freefont/</a></p>

<h2 id="alt-text">Alt-text</h2>

<p>Great guide on how to compose alt-text for images:
<a href="https://www.perkins.org/resource/how-write-alt-text-and-image-descriptions-visually-impaired/">https://www.perkins.org/resource/how-write-alt-text-and-image-descriptions-visually-impaired/</a></p>

<p>Which has a great visual example:
<img src="https://www.perkins.org/wp-content/uploads/2023/07/capybara_alt_text.png.webp" alt="Visual depiction of elements which should go into alt-text image captions"/></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/visual-accessibility-notes</guid>
      <pubDate>Thu, 22 Aug 2024 10:10:03 +0000</pubDate>
    </item>
    <item>
      <title>Studying collective action problems in academic research</title>
      <link>https://naclscrg.writeas.com/studying-collective-action-problems-in-academic-research?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[A question that came from a recent conversation: Is there published (meta)research on solving collective action problems in academic research?&#xA;&#xA;!--more--&#xA;&#xA;Context&#xA;&#xA;We&#39;ve been doing many interviews over the past 1.5 years with different stakeholders in academia, and one of the most common barriers to changing behavior (such as doing more open research or changing research culture) is that &#34;no one else is doing it and it doesn&#39;t benefit me&#34;, but actually if everyone does it, then everyone benefits. Is solving such collective action problems something that has been studied in the academic context? If so, where and by whom?&#xA;&#xA;I posed this question to the Turing Way and NASA TOPS Slack groups. Here&#39;s my attempt at collecting the responses so far. &#xA;&#xA;Turing Way&#xA;&#xA;So far, I haven&#39;t heard from someone who knows of research specifically about collective action problems in academia. But, a few theoretical frameworks were suggested as ways to examine the problem. &#xA;&#xA;Agent-based modelling of individual vs collective behaviour&#xA;&#xA;(from Shern Tee)&#xA;&#xA;You may find the Stanford Encyclopaedia of Philosophy entry useful: &#34;Agent-Based Modeling in the Philosophy of Science&#34; https://plato.stanford.edu/entries/agent-modeling-philscience/#TheoDiveInceStruScie1&#xA;  Unfortunately it doesn&#39;t directly answer the question of collective action problems. 
But (because I am a straitjacketed physicist) I find myself thinking about these situations as agent-based: a model simulation that shows agents doing things that are individually rational, but as a whole cause problems for science, is a demonstration of one possible model of collective action failure.&#xA;This paper is a more concise overview of the above link: https://compass.onlinelibrary.wiley.com/doi/10.1111/phc3.12855&#xA;An agent-based model of peer review, studying how scientists might want to trade-off work publishing papers with work reviewing papers: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096663/&#xA;This PhD thesis describes agent-based modelling of the academic publishing system -- actors being journals and scientists: https://nova.newcastle.edu.au/vital/access/services/Download/uon:31339/ATTACHMENT01&#xA;&#xA;This is interesting to me in the sense that I first of agent-based modelling in my intro ecology course during undergrad, but haven&#39;t considered it in the context of collective human behaviour. &#xA;&#xA;Organisational theory&#xA;&#xA;(from Liam McGee)&#xA;&#xA;Off the top of my head, and this might be a bit too general, but I quite like https://uk.sagepub.com/en-gb/eur/a-very-short-fairly-interesting-and-reasonably-cheap-book-about-studying-organizations/book276268 as a rapid intro to the competing models in organisational theory. Lots of references to explore. There&#39;s a book from the same series on &#34;Studying Leadership&#34;, which may also be of interest in approaching this problem.&#xA;&#xA;Economic theories&#xA;&#xA;(from Liam McGee)&#xA;&#xA;Other route might be down alternate economic theories (ones that seek to resolve the tragedy of the commons, like doughnut economics, or Ole Bjerg&#39;s stuff). 
Or Kahneman and Tversky from a Experimental Psych/CogSci perspective.&#xA;&#xA;Religions&#xA;&#xA;(from Liam McGee)&#xA;&#xA;Another interesting direction is to understand how the pro-social behaviour of various world religions works -- a practical use case on that here: https://www.linkedin.com/pulse/church-cheese-frog-hugh-mason/ -- comments on that article worth a dig too.&#xA;&#xA;The Collective Action in Science Committee&#xA;&#xA;(Julien Colomb) You may ask the people behind. https://freeourknowledge.org/committee/&#xA;&#xA;Note: &#xA;&#xA;I see that it proposes the model of “We will all do X (the ‘action’) when Y people have pledged (the ‘threshold’)”. This reminds me of the National Popular Vote Interstate Compact in the United States: https://en.wikipedia.org/wiki/NationalPopularVoteInterstateCompact&#xA;&#xA;Thinking about common pool resources&#xA;&#xA;(from Jonah Duckles)&#xA;&#xA;A bit of a different tack on collective action, but still VERY related is Elanor Ostrom&#39;s work on Common Pool Resources, detailed in her book &#34;Governing the Commons&#34;. If you think about the work of an academic as working to advocate for and gather common pool resources (grant money) for themselves, I think it is an informative model for imagining a way that grant money could be considered less &#34;contested&#34; and more of a common pool of resources. The open science movement does kind of implicitly treat information as a common pool resource. Ostrom&#39;s work, I think, helps think about ways to build structures and systems around governing it for the benefit of many.  
A summary of &#34;Governing the Commons&#34; is her 8-point &#34;Design principles illustrated by long-enduring Common Pool Resource (CPR) institutions&#34; which is under the Research header on the Wikipedia page about her.&#xA;&#xA;Thinking about &#34;doers&#34; and &#34;thinkers&#34; &#xA;&#xA;(from Anne Lee Steele)&#xA;&#xA;I think there are a few ways to approach this question: as sometimes the people doing the collection action &amp; organising may not necessarily being the ones studying it, and vice versa. (Similarly for example: the work of community management is different from the act of studying communities!)&#xA;&#xA;Regarding broader theories and ideas of social change at the individual level, the trans-theotical model (coming from medicine) is a very popular one: https://sphweb.bumc.bu.edu/otlt/MPH-Modules/SB/BehavioralChangeTheories/BehavioralChangeTheories6.html&#xA;&#xA;There&#39;s also the studies of &#39;innovation diffusion&#39; that talks about how systems change more broadly, studied by quite a few folks: https://en.wikipedia.org/wiki/Diffusionofinnovations#Process&#xA;&#xA;Regarding the tension between &#39;doers&#39; and &#39;thinkers&#39; (which of course is not necessarily cut and dry), it might be helpful to think through a few examples:&#xA;&#xA;Organisers of collective action (for example - there are so many!):&#xA;https://movementecology.org.uk/&#xA;https://scienceforthepeople.org/&#xA;https://techworkerscoalition.org/&#xA;&#xA;Studies of collective action:&#xA;&#xA;The Logic of Collective Action: Public Goods and the Theory of Groups by Mancur Olson is one I&#39;ve heard cited quite a bit&#xA;Elinor Ostrom (as @Jonah Duckles mentioned!) 
also wrote about collective action theory: https://academic.oup.com/edited-volume/28345/chapter-abstract/215160451?redirectedFrom=fulltext&#xA;Institutional ethnography has been used to understand the roles, rituals and practices of all sorts of different environments, including academic spaces: https://blogs.lse.ac.uk/highereducation/2023/11/17/are-we-proper-institutional-ethnographers/&#xA;More broadly, I&#39;ve also seen how some studies of neoliberalism in academic institutions affect collectivising practices - stumbled upon this interesting piece: https://discovery.dundee.ac.uk/en/publications/revealing-the-manifestations-of-neoliberalism-in-academia-academi&#xA;&#xA;Hope this helps!&#xA;&#xA;Note: &#xA;&#xA;Interestingly, the innovation diffusion model by Rogers is cited in the Center for Open Science&#39;s theory for behaviour change: https://www.annualreviews.org/content/journals/10.1146/annurev-psych-020821-114157#f3&#xA;&#xA;NASA TOPS&#xA;&#xA;Similar to the Turing Way responses, nothing specific to academia here. But there&#39;s a very interesting one about learning from climate action suggested by Jamaica Jones: &#xA;&#xA;  This is such an interesting question! I am not sure if it&#39;s exactly what you are looking for, but you might find Sheila Jasanoff&#39;s work to be informative. She contributed a chapter to a book called  Human Choice and Climate Change that may be relevant. I also found another climate change-focused citation that seems similarly aligned: the article is called &#34;Doing What Others Do: Norms, Science, and Collective Action on Global Warming&#34;, by Bolsen et al.&#xA;&#xA;Here&#39;s the Bolsen et al. 
paper: https://web.archive.org/web/20240522100857/http://eprints.lse.ac.uk/64670/1/LeeperDoing what others do2016.pdf&#xA;&#xA;For the book Human Choice and Climate Change, it&#39;s available to borrow online from the Internet Archive: &#xA;&#xA;https://archive.org/details/humanchoiceclima0001unse&#xA;&#xA;It reminds me of my past life studying environmental sciences and learning about the concept of collective action problems and the tragedy of the commons. I wonder if anyone&#39;s done research on how to take lessons solving collective action problems in one domain (e.g. climate action) and applying them to another (e.g. academia)?&#xA;&#xA;#metaresearch #ideas&#xA;&#xA;----------&#xD;&#xA;&#xD;&#xA; p xmlns:cc=&#34;http://creativecommons.org/ns#&#34; Unless otherwise stated, all original content in this post is shared under the a href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;blank&#34; rel=&#34;license noopener noreferrer&#34; style=&#34;display:inline-block;&#34;Creative Commons Attribution-ShareAlike 4.0 International/a licensea href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;blank&#34; rel=&#34;license noopener noreferrer&#34; style=&#34;display:inline-block;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1&#34; alt=&#34;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1&#34; alt=&#34;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1&#34; alt=&#34;&#34;/a/p ]]&gt;</description>
      <content:encoded><![CDATA[<p>A question that came from a recent conversation: Is there published (meta)research on solving collective action problems in academic research?</p>



<h2 id="context">Context</h2>

<p>We&#39;ve been doing many interviews over the past 1.5 years with different stakeholders in academia, and one of the most common barriers to changing behavior (such as doing more open research or changing research culture) is that “no one else is doing it and it doesn&#39;t benefit me”, but actually if everyone does it, then everyone benefits. Is solving such <strong>collective action problems</strong> something that has been studied in the academic context? If so, where and by whom?</p>

<p>I posed this question to the Turing Way and NASA TOPS Slack groups. Here&#39;s my attempt at collecting the responses so far.</p>

<h2 id="turing-way">Turing Way</h2>

<p>So far, I haven&#39;t heard from someone who knows of research specifically about collective action problems <em>in academia</em>. But, a few theoretical frameworks were suggested as ways to examine the problem.</p>

<h3 id="agent-based-modelling-of-individual-vs-collective-behaviour">Agent-based modelling of individual vs collective behaviour</h3>

<p>(from Shern Tee)</p>
<ul><li>You may find the Stanford Encyclopaedia of Philosophy entry useful: “Agent-Based Modeling in the Philosophy of Science” <a href="https://plato.stanford.edu/entries/agent-modeling-philscience/#TheoDiveInceStruScie_1">https://plato.stanford.edu/entries/agent-modeling-philscience/#TheoDiveInceStruScie_1</a>
<ul><li>Unfortunately it doesn&#39;t directly answer the question of collective action problems. But (because I am a straitjacketed physicist) I find myself thinking about these situations as agent-based: a model simulation that shows agents doing things that are individually rational, but as a whole cause problems for science, is a demonstration of one possible model of collective action failure.</li></ul></li>
<li>This paper is a more concise overview of the above link: <a href="https://compass.onlinelibrary.wiley.com/doi/10.1111/phc3.12855">https://compass.onlinelibrary.wiley.com/doi/10.1111/phc3.12855</a></li>
<li>An agent-based model of peer review, studying how scientists might want to trade-off work publishing papers with work reviewing papers: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096663/">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096663/</a></li>
<li>This PhD thesis describes agent-based modelling of the academic publishing system — actors being journals and scientists: <a href="https://nova.newcastle.edu.au/vital/access/services/Download/uon:31339/ATTACHMENT01">https://nova.newcastle.edu.au/vital/access/services/Download/uon:31339/ATTACHMENT01</a></li></ul>
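<p>Shern Tee&#39;s idea of &#34;agents doing things that are individually rational, but as a whole cause problems&#34; can be sketched as a toy public-goods game. All numbers and names here are my own illustrative assumptions, not from any of the cited papers:</p>

```python
# Toy public-goods game: each of n researchers either "shares" openly
# (paying cost c, adding benefit b to a pot split equally among all)
# or "hoards". Because an individual's cut of their own contribution
# (b/n) is less than c, hoarding dominates individually -- yet if
# everyone shares, everyone ends up better off than if no one does.

def payoff(i_share, others_sharing, n=10, b=5.0, c=1.0):
    """Payoff for one agent, given how many of the other n-1 share."""
    total_sharing = others_sharing + (1 if i_share else 0)
    public_pot = total_sharing * b / n  # equal cut for everyone
    return public_pot - (c if i_share else 0.0)

# Whatever the others do, hoarding beats sharing for the individual...
for k in range(10):
    assert payoff(False, k) > payoff(True, k)

# ...yet universal sharing beats universal hoarding collectively.
assert payoff(True, 9) > payoff(False, 0)
```

<p>This is the simplest possible &#34;model of collective action failure&#34; in Shern&#39;s sense; the agent-based models in the links above add realistic structure (peer review workloads, journals, reputations) on top of the same basic tension.</p>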

<p>This is interesting to me in the sense that I first heard of agent-based modelling in my intro ecology course during undergrad, but haven&#39;t considered it in the context of collective human behaviour.</p>
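<p>To make the “individually rational, collectively harmful” idea concrete, here&#39;s a minimal sketch of what such an agent-based model could look like (all payoffs and probabilities below are invented for illustration, not taken from any of the papers above):</p>

```python
import random

def simulate(n_agents=100, rounds=50, seed=42):
    """Toy agent-based model: each round, every agent chooses between a
    privately rewarded action ('publish') and a collectively needed one
    ('review'). All numbers here are made up for illustration."""
    random.seed(seed)
    reviews_needed = 30   # reviews the system needs each round
    backlog = 0           # unreviewed papers accumulate here
    for _ in range(rounds):
        # Individually rational: publishing pays more, so most agents publish.
        choices = ["publish" if random.random() < 0.9 else "review"
                   for _ in range(n_agents)]
        shortfall = reviews_needed - choices.count("review")
        backlog += max(0, shortfall)
    return backlog

print(simulate())  # a large backlog: collective failure from individual rationality
```

<p>Even this toy version shows the shape of the argument: no single agent is behaving badly, yet the review backlog grows without bound.</p>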

<h3 id="organisational-theory">Organisational theory</h3>

<p>(from Liam McGee)</p>
<ul><li>Off the top of my head, and this might be a bit too general, but I quite like <a href="https://uk.sagepub.com/en-gb/eur/a-very-short-fairly-interesting-and-reasonably-cheap-book-about-studying-organizations/book276268">https://uk.sagepub.com/en-gb/eur/a-very-short-fairly-interesting-and-reasonably-cheap-book-about-studying-organizations/book276268</a> as a rapid intro to the competing models in organisational theory. Lots of references to explore. There&#39;s a book from the same series on “Studying Leadership”, which may also be of interest in approaching this problem.</li></ul>

<h3 id="economic-theories">Economic theories</h3>

<p>(from Liam McGee)</p>
<ul><li>Another route might be down alternate economic theories (ones that seek to resolve the tragedy of the commons, like doughnut economics, or Ole Bjerg&#39;s stuff). Or Kahneman and Tversky from an Experimental Psych/CogSci perspective.</li></ul>

<h3 id="religions">Religions</h3>

<p>(from Liam McGee)</p>
<ul><li>Another interesting direction is to understand how the pro-social behaviour of various world religions works — a practical use case on that here: <a href="https://www.linkedin.com/pulse/church-cheese-frog-hugh-mason/">https://www.linkedin.com/pulse/church-cheese-frog-hugh-mason/</a> — comments on that article worth a dig too.</li></ul>

<h3 id="the-collective-action-in-science-committee">The Collective Action in Science Committee</h3>

<p>(from Julien Colomb) You may ask the people behind it: <a href="https://freeourknowledge.org/committee/">https://freeourknowledge.org/committee/</a></p>

<p>Note:</p>

<p>I see that it proposes the model of “We will all do X (the ‘action’) when Y people have pledged (the ‘threshold’)”. This reminds me of the National Popular Vote Interstate Compact in the United States: <a href="https://en.wikipedia.org/wiki/National_Popular_Vote_Interstate_Compact">https://en.wikipedia.org/wiki/National_Popular_Vote_Interstate_Compact</a></p>
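<p>This pledge mechanism has the same shape as Granovetter-style threshold models of collective behaviour. A minimal sketch (the threshold values are made-up numbers, only meant to show the dynamics):</p>

```python
def cascade(thresholds):
    """Each agent acts once the number already acting meets their personal
    threshold; iterate until nothing changes. Returns how many end up acting."""
    acting = 0
    while True:
        now_acting = sum(1 for t in thresholds if t <= acting)
        if now_acting == acting:
            return acting
        acting = now_acting

# A few zero-threshold "first movers" can tip the whole group...
print(cascade([0, 1, 2, 3, 4]))  # 5
# ...but with no one willing to move first, the pledge never triggers.
print(cascade([1, 1, 2, 3, 4]))  # 0
```

<p>Which is presumably the point of making the threshold explicit: no one has to act alone.</p>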

<h3 id="thinking-about-common-pool-resources">Thinking about common pool resources</h3>

<p>(from Jonah Duckles)</p>

<p>A bit of a different tack on collective action, but still VERY related is Elinor Ostrom&#39;s work on Common Pool Resources, detailed in her book “Governing the Commons”. If you think about the work of an academic as working to advocate for and gather common pool resources (grant money) for themselves, I think it is an informative model for imagining a way that grant money could be considered less “contested” and more of a common pool of resources. The open science movement does kind of implicitly treat information as a common pool resource. Ostrom&#39;s work, I think, helps us think about ways to build structures and systems around governing it for the benefit of many. A summary of “Governing the Commons” is her 8-point “Design principles illustrated by long-enduring Common Pool Resource (CPR) institutions”, which is under the Research header on the <a href="https://en.wikipedia.org/wiki/Elinor_Ostrom">Wikipedia page about her</a>.</p>
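<p>Note: the tragedy-of-the-commons dynamic that Ostrom&#39;s design principles guard against can be sketched in a toy simulation (every number below is invented; it&#39;s only meant to show how unmanaged extraction collapses a shared pool):</p>

```python
def commons(pool=100.0, agents=10, take=2.0, regen=0.15, cap=200.0, rounds=40):
    """Toy common-pool resource: each round every agent harvests `take`
    units, then the pool regrows by `regen`, capped at `cap`."""
    for _ in range(rounds):
        pool = max(0.0, pool - agents * take)  # everyone harvests
        pool = min(cap, pool * (1 + regen))    # regrowth, capped
    return pool

print(commons(take=2.0))  # 0.0 – heavy harvesting collapses the pool
print(commons(take=1.0))  # 200.0 – a lighter take is sustainable
```

<p>As I understand it, Ostrom&#39;s point is that communities avoid the first outcome not only through privatisation or top-down control, but through self-governed rules – hence the eight design principles.</p>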

<h3 id="thinking-about-doers-and-thinkers">Thinking about “doers” and “thinkers”</h3>

<p>(from Anne Lee Steele)</p>

<p>I think there are a few ways to approach this question: as sometimes the people doing the collective action &amp; organising may not necessarily be the ones studying it, and vice versa. (Similarly, for example: the work of community management is different from the act of studying communities!)</p>

<p>Regarding broader theories and ideas of social change at the individual level, the transtheoretical model (coming from medicine) is a very popular one: <a href="https://sphweb.bumc.bu.edu/otlt/MPH-Modules/SB/BehavioralChangeTheories/BehavioralChangeTheories6.html">https://sphweb.bumc.bu.edu/otlt/MPH-Modules/SB/BehavioralChangeTheories/BehavioralChangeTheories6.html</a></p>

<p>There&#39;s also the study of &#39;innovation diffusion&#39;, which talks about how systems change more broadly and has been taken up by quite a few folks: <a href="https://en.wikipedia.org/wiki/Diffusion_of_innovations#Process">https://en.wikipedia.org/wiki/Diffusion_of_innovations#Process</a></p>

<p>Regarding the tension between &#39;doers&#39; and &#39;thinkers&#39; (which of course is not necessarily cut and dried), it might be helpful to think through a few examples:</p>

<p>Organisers of collective action (for example – there are so many!):</p>
<ul><li><a href="https://movementecology.org.uk/">https://movementecology.org.uk/</a></li>
<li><a href="https://scienceforthepeople.org/">https://scienceforthepeople.org/</a></li>
<li><a href="https://techworkerscoalition.org/">https://techworkerscoalition.org/</a></li></ul>

<p>Studies of collective action:</p>
<ul><li>The Logic of Collective Action: Public Goods and the Theory of Groups by Mancur Olson is one I&#39;ve heard cited quite a bit</li>
<li>Elinor Ostrom (as @Jonah Duckles mentioned!) also wrote about collective action theory: <a href="https://academic.oup.com/edited-volume/28345/chapter-abstract/215160451?redirectedFrom=fulltext">https://academic.oup.com/edited-volume/28345/chapter-abstract/215160451?redirectedFrom=fulltext</a></li>
<li>Institutional ethnography has been used to understand the roles, rituals and practices of all sorts of different environments, including academic spaces: <a href="https://blogs.lse.ac.uk/highereducation/2023/11/17/are-we-proper-institutional-ethnographers/">https://blogs.lse.ac.uk/highereducation/2023/11/17/are-we-proper-institutional-ethnographers/</a></li>
<li>More broadly, I&#39;ve also seen studies of how neoliberalism in academic institutions affects collectivising practices – stumbled upon this interesting piece: <a href="https://discovery.dundee.ac.uk/en/publications/revealing-the-manifestations-of-neoliberalism-in-academia-academi">https://discovery.dundee.ac.uk/en/publications/revealing-the-manifestations-of-neoliberalism-in-academia-academi</a></li></ul>

<p>Hope this helps!</p>

<p>Note:</p>

<p>Interestingly, the innovation diffusion model by Rogers is cited in the Center for Open Science&#39;s theory for behaviour change: <a href="https://www.annualreviews.org/content/journals/10.1146/annurev-psych-020821-114157#f3">https://www.annualreviews.org/content/journals/10.1146/annurev-psych-020821-114157#f3</a></p>

<h2 id="nasa-tops">NASA TOPS</h2>

<p>Similar to the Turing Way responses, nothing specific to academia here. But there&#39;s a very interesting one about learning from climate action suggested by Jamaica Jones:</p>

<blockquote><p>This is such an interesting question! I am not sure if it&#39;s exactly what you are looking for, but you might find Sheila Jasanoff&#39;s work to be informative. She contributed a chapter to a book called  <a href="https://dare.uva.nl/search?identifier=a5049982-46ac-4b89-9e50-9e2bb4cee4f5"><em>Human Choice and Climate Change</em></a> that may be relevant. I also found another climate change-focused citation that seems similarly aligned: the article is called “<a href="https://journals-sagepub-com.pitt.idm.oclc.org/doi/full/10.1177/1532673X13484173"><em>Doing What Others Do: Norms, Science, and Collective Action on Global Warming</em></a>”, by Bolsen et al.</p></blockquote>

<p>Here&#39;s the Bolsen et al. paper: <a href="https://web.archive.org/web/20240522100857/http://eprints.lse.ac.uk/64670/1/Leeper_Doing%20what%20others%20do_2016.pdf">https://web.archive.org/web/20240522100857/http://eprints.lse.ac.uk/64670/1/Leeper_Doing what others do_2016.pdf</a></p>

<p>For the book <em>Human Choice and Climate Change</em>, it&#39;s available to borrow online from the Internet Archive:</p>

<p><a href="https://archive.org/details/humanchoiceclima0001unse">https://archive.org/details/humanchoiceclima0001unse</a></p>

<p>It reminds me of my past life studying environmental sciences and learning about the concept of collective action problems and the tragedy of the commons. I wonder if anyone&#39;s done research on how to take lessons from solving collective action problems in one domain (e.g. climate action) and apply them to another (e.g. academia)?</p>

<p><a href="https://naclscrg.writeas.com/tag:metaresearch" class="hashtag"><span>#</span><span class="p-category">metaresearch</span></a> <a href="https://naclscrg.writeas.com/tag:ideas" class="hashtag"><span>#</span><span class="p-category">ideas</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/studying-collective-action-problems-in-academic-research</guid>
      <pubDate>Mon, 29 Jul 2024 13:42:54 +0000</pubDate>
    </item>
    <item>
      <title>Talk - AI is not the problem - thinking about outcomes (updated)</title>
      <link>https://naclscrg.writeas.com/talk-ai-is-not-the-problem?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[On 25 April 2024, I gave a talk at the Open Science &amp; Societal Impact conference titled &#34;AI is not the problem - thinking about outcomes&#34;. It was co-created with Jennifer Ding of the Turing Way who is the real AI expert here, and wrote a great post about an outcomes-based approach to AI. There&#39;s extra stuff I couldn&#39;t fit into the talk, so I&#39;m putting them here plus a transcript and video recording of the talk.&#xA;!--more--&#xA;Note that I have a follow up talk focused on labour and academia in November 2024. &#xA;&#xA;The slides are published on Zenodo with DOI: 10.5281/zenodo.11051128&#xA;&#xA;I also tweaked this talk linking it to reproducibility in science at the Reproducibility by Design symposium on 26 June 2024 (at the life sciences department at the University of Bristol), kindly organised by Nick and Fiona.&#xA;&#xA;I will try to gather: &#xA;&#xA;general notes; &#xA;other resources/further reading collected when developing the talk; and&#xA;a transcript of the talk (with reproducibility addendum).&#xA;&#xA;I&#39;ll try to clean up this post with more context and details on a best-effort basis.&#xA;&#xA;There is a video recording (of the April 2024 version) which is saved in a Zenodo item and viewable on the Internet Archive. 
The video is also embedded here (click the &#34;CC&#34; icon for subtitles): &#xA;&#xA;iframe src=&#34;https://archive.org/embed/AI-is-not-the-problem-2024-04-25&#34; width=&#34;640&#34; height=&#34;480&#34; frameborder=&#34;0&#34; webkitallowfullscreen=&#34;true&#34; mozallowfullscreen=&#34;true&#34; allowfullscreen/iframe&#xA;&#xA;Further reading&#xA;&#xA;The talk cites various people and resources: &#xA;&#xA;Open Source Initiative&#39;s community process for defining open source &#34;AI&#34;&#xA;  https://opensource.org/deepdive&#xA;The Turing Way community&#xA;  https://book.the-turing-way.org/&#xA;Infamous paper with figure of lab rat with giant genitals (later retracted) (full citation below)&#xA;  PDF: https://web.archive.org/web/20240324051904/https://cdn.arstechnica.net/wp-content/uploads/2024/02/fcell-11-1339390-1.pdf&#xA;  https://arstechnica.com/science/2024/02/scientists-aghast-at-bizarre-ai-rat-with-huge-genitals-in-peer-reviewed-article/&#xA;Kate Crawford on &#34;Artificial intelligence is neither artificial nor intelligent&#34;&#xA;  https://link.springer.com/article/10.1007/s43681-021-00115-7&#xA;  https://www.technologyreview.com/2021/04/23/1023549/kate-crawford-atlas-of-ai-review/&#xA;  https://www.theguardian.com/technology/2021/jun/06/microsofts-kate-crawford-ai-is-neither-artificial-nor-intelligent&#xA;  https://nicospage.eu/unethical-academics-ai-and-peer-review&#xA;  https://www.technologyreview.com/2021/04/23/1023549/kate-crawford-atlas-of-ai-review/&#xA;&#34;Invisible&#34; Kenyan sweatshop workers keeping Meta and OpenAI&#39;s tools running&#xA;  https://time.com/6247678/openai-chatgpt-kenya-workers/&#xA;  who have now unionised: https://time.com/6275995/chatgpt-facebook-african-workers-union/&#xA;Lilly Irani on *&#34;AI&#34; displacing instead of replacing labour &#xA;  https://www.publicbooks.org/justice-for-data-janitors/&#xA;  https://quote.ucsd.edu/lirani/white-house-nyu-ainow-summit-talk-the-labor-that-makes-ai-magic/&#xA;Speech 
Schema Filling tool for hands-free electronic lab notebooks&#xA;  https://github.com/hampusnasstrom/speech-schema-filling&#xA;  https://www.linkedin.com/posts/juliaschumannas-part-of-the-2024-llm-hackathon-for-applications-activity-7194416033728724993-tHYj&#xA;Some evidence strongly suggesting that some academics may be auto-generating their peer reviews&#xA;  https://nicospage.eu/unethical-academics-ai-and-peer-review&#xA;CNN report - Teachers are using &#34;AI&#34; to grade essays&#xA;  https://www.cnn.com/2024/04/06/tech/teachers-grading-ai/index.html&#xA;Mozilla Foundation report on AI&#xA;  https://foundation.mozilla.org/en/research/library/accelerating-progress-toward-trustworthy-ai/whitepaper/&#xA;&#xA;And here are the academic literature cited in the talk or are relevant: &#xA;&#xA;Ball, P. (2023). Is AI leading to a reproducibility crisis in science? Nature, 624(7990), 22–25. https://doi.org/10.1038/d41586-023-03817-6&#xA;&#xA;RETRACTED Guo, X., Dong, L., &amp; Hao, D. (2024). Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway. Frontiers in Cell and Developmental Biology, 11. https://doi.org/10.3389/fcell.2023.1339390 (original PDF)&#xA;&#xA;Hicks, M. T., Humphries, J., &amp; Slater, J. (2024). ChatGPT is bullshit. Ethics and Information Technology, 26(2), 1–10. https://doi.org/10.1007/s10676-024-09775-5&#xA;&#xA;Liesenfeld, A., &amp; Dingemanse, M. (2024). Rethinking open source generative AI: open-washing and the EU AI Act. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 1774–1787. https://doi.org/10.1145/3630106.3659005&#xA;&#xA;Messeri, L., &amp; Crockett, M. J. (2024). Artificial intelligence and illusions of understanding in scientific research. Nature, 627(8002), 49–58. https://doi.org/10.1038/s41586-024-07146-0&#xA;&#xA;Sauermann, H., &amp; Franzoni, C. (2015). Crowd science user contribution patterns and their implications. 
Proceedings of the National Academy of Sciences, 201408907. https://doi.org/10.1073/pnas.1408907112&#xA;&#xA;Watermeyer, R., Lanclos, D., &amp; Phipps, L. (2024). Does generative AI help academics to do more or less? Nature, 625(7995), 450–450. https://doi.org/10.1038/d41586-024-00115-7&#xA;&#xA;Watermeyer, R., Phipps, L., Lanclos, D., &amp; Knight, C. (2024). Generative AI and the automating of academia. Postdigital Science and Education, 6(2), 446–466. https://doi.org/10.1007/s42438-023-00440-6&#xA;&#xA;White, M., Haddad, I., Osborne, C., Yanglet, X.-Y. L., Abdelmonsef, A., &amp; Varghese, S. (2024). The model openness framework: Promoting completeness and openness for reproducibility, transparency, and usability in artificial intelligence (arXiv:2403.13784). arXiv. https://doi.org/10.48550/arXiv.2403.13784&#xA;&#xA;Widder, D. G., West, S., &amp; Whittaker, M. (2023). Open (for business): Big tech, concentrated power, and the political economy of open AI (SSRN Scholarly Paper 4543807). https://dx.doi.org/10.2139/ssrn.4543807&#xA;&#xA;RETRACTED Zhang, M., Wu, L., Yang, T., Zhu, B., &amp; Liu, Y. (2024). The three-dimensional porous mesh structure of Cu-based metal-organic-framework—Aramid cellulose separator enhances the electrochemical performance of lithium metal anode batteries. Surfaces and Interfaces, 46, 104081. https://doi.org/10.1016/j.surfin.2024.104081&#xA;&#xA;Transcript&#xA;&#xA;Thank you for the introduction. For this talk, I’m going to stay on a high level, and offer my reflections on how to situate &#34;AI&#34; in open science as it relates to wider society. There is a lot of understandable concern about how this technology will affect scientific practice. &#xA;&#xA;And we&#39;ve seen some pretty egregious examples in academic science. 
Last month this engineering paper published by Elsevier made the rounds because as soon as you start reading the introduction, you’ll see that it starts with “Certainly, here is a possible introduction for your topic…” This is very likely a sentence generated by ChatGPT, a chatbot based on large language models, and brings into doubt the rigour of the rest of the paper.&#xA;&#xA;I think the most dramatic example is one published by Frontiers in February 2024, where it’s pretty obvious that much of the contents are AI-generated, with a dramatic figure of a lab rat with giant gonads. You can also see some gibberish text in the annotations.&#xA;&#xA;What’s remarkable is that these papers were seen by peer reviewers, editors, and copyeditors and were still published.&#xA;&#xA;On the other side of this is that there is growing evidence of academics using tools like ChatGPT to write their peer reviews.&#xA;&#xA;And in higher education, we know that some students would use generative AI to write their essays. But now some instructors are using the same tools to grade those essays.&#xA;&#xA;With that in mind, there are three things I’d like to cover today.&#xA;&#xA;The first is that words matter. A lot. With all of the hype around “AI” right now, it’s important to realise that this is a big umbrella marketing term (instead of a technical term of art) for a bunch of different technologies.&#xA;&#xA;And I really appreciate how Kate Crawford reminds us that these technologies are neither artificial nor intelligent. What we call AI is built on human labour, and it is certainly not intelligent in the way humans are.&#xA;&#xA;In the context of open science, there are calls for open source AI that is transparent, reproducible, and reusable by others. I agree with this, but what counts as open source or open AI is also not clearly defined.&#xA;&#xA;Last year Meta released a large language model called Llama 2 and marketed it as open source. 
However, the license for Llama 2 actually came with many restrictions on who can use it and how they can use it. We can agree or disagree with these restrictions, but these restrictions mean that Llama 2 is categorically not open source as it has been widely defined for software.&#xA;&#xA;There’s this paper by Widder, Whittaker, and West in 2023 about how ambiguity in words like AI and open source AI has created an opening for the big players to openwash their products. What happens here is that the word “open” becomes a very fuzzy term that feels good, while meaning very little at the same time. And this furthers the power that these big players hold over technology and society.&#xA;&#xA;All of this is to say that what people call open source AI is often neither open, artificial, nor intelligent! For the purposes of today’s meeting, I think this is a major problem because when a term is taken to mean everything, it ends up meaning nothing. &#xA;&#xA;And the societal impact of this ambiguity is that the wider public will trust science even less than they already do. &#xA;&#xA;What this means in practice is that we should be clear about what we mean when talking about AI. If there’s a specific underlying concept like machine learning, training large language models, and so on, then let us use more specific terms.&#xA;&#xA;There is also cross-cutting work to collaboratively define terms like open source AI, and I believe the scientific research community should absolutely be part of this conversation. The Open Source Initiative is one of the leaders on this and I encourage everyone to check it out.&#xA;&#xA;Having said that. Even though having clearly defined terminology can help us conceptualise and communicate issues around artificial intelligence, it is a necessary but insufficient step for addressing those issues. Because effective communication doesn’t solve problems by itself. &#xA;&#xA;Yes, words matter, and outcomes also matter. 
And once again, there is a lot of work in this space on topics ranging from reproducibility, which is important in scientific research, to others like democracy, trustworthiness, inclusion, and accountability, to safety. &#xA;&#xA;I really like the work by the Mozilla Foundation, such as their thinking about trustworthy AI and the need for openness, competition, and accountability. There are so many outcomes for us to consider, and to make things more concrete, I want to focus on a real world example which challenges us to think more deeply about what outcomes we want to see. &#xA;&#xA;To make this point we should realise that what’s often called “artificial intelligence” is foundationally similar to autocorrect/spell check. In this case, your typing input is fed into a statistical model that suggests the correct spelling for a word. Now, I know this is simplifying things a bit, and not to minimise the amazing math and computer science research that went into it, but the large language models underlying much of generative AI today are – on a high level – autocorrect systems that run some very, very sophisticated statistics on your input to produce natural-feeling outputs. 
It’s important to know this because enormous amounts of human labour go into labelling the huge datasets used to train these models.&#xA;&#xA;Around this time last year (2023), workers for the companies behind ChatGPT, TikTok, and Facebook formed a union in response to the horrible working conditions they had to put up with.&#xA;&#xA;What’s behind the “artificial intelligence” façade is that many of them are sweatshop workers who manually label training data.&#xA;&#xA;For ChatGPT, these sweatshop workers were hired to tag and filter text that describes extremely graphic details like sexual abuse, murder, suicide, or torture.&#xA;&#xA;This reminds us of how “artificial intelligence” is neither artificial nor intelligent, and it has become a smokescreen for deeper issues like how labour is not being replaced by machines when in fact it is being displaced and made even more invisible.&#xA;&#xA;So, when we think about what outcomes we want to see, we must consider underlying problems like outsourcing, labour rights, or colonialism. &#xA;&#xA;But what does this have to do with scientific research?&#xA;&#xA;Well, there are similar things happening, where what some people call “crowd science” is used as a research methodology, where academic scientists crowdsource data collection and data labelling to online volunteers. &#xA;&#xA;To be clear, there are positive things that can come from this, for example some scientists build crowdsourcing into science outreach and engagement activities, and there are ways to integrate crowd science into science education.&#xA;&#xA;However, I’ve reviewed many scientific papers about this over the years, and some are really focused on how crowdsourcing is a way to shorten the time needed to process data, and to lower costs for the scientist. &#xA;&#xA;Right now, a lot of this is being used to train machine learning models and other AI applications. 
And I feel there is a risk that parts of the scientific community are inadvertently perpetuating not just the hype around AI, but also the exploitation of people.&#xA;&#xA;I give these examples because I think that we, as members of the scientific community, should go outside of the ivory tower and engage with wider efforts to think about what outcomes we’d like to see in a world with AI. For instance, what can we learn from labour movements to inform more equitable practices when doing crowd science? &#xA;&#xA;This is just one possibility for thinking about outcomes for science.&#xA;&#xA;And the third thing I want to cover is what AI means for open science. To do this I want to take us back to this extraordinary generated figure of a lab rat. One response that we might have to AI-generated papers or peer reviews is to ban the use of AI tools for scientific papers. Some publishers and journals have already implemented these policies. But I’m concerned about whether, and which, problems we actually solve if we focus on dealing with AI.&#xA;&#xA;I fear that we might inadvertently think that we’ve “solved” the problem, when we are entrenching a much deeper problem.&#xA;&#xA;For example, I wouldn’t be surprised if one of the big academic publishers would release a new proprietary tool for detecting AI-generated text in submitted papers and reviews, and tie this feature into journals that they publish. On one hand, maybe the tool is really effective and would weed out these junk papers. &#xA;&#xA;But “solutions” like this might concentrate even more power into these huge publishers, who are a big part of why peer review is so broken in the first place. And in this case, I think fixing peer review is more important than dealing with AI.&#xA;&#xA;I think the broader lesson is that we should support existing open science efforts. 
For example, there are many tools to help fix peer review, such as preregistration, publishing Registered Reports, publishing preprints followed by open post-publication peer review. Groups like PREreview or journals like the Journal of Open Source Software have been doing this work for years. &#xA;&#xA;We also have to tackle even deeper problems like job precarity in academic research, where some researchers move from one short term job to another, or professors who live in tents. And many of us have to deal with toxic workloads where we are expected to do even more for less pay.&#xA;&#xA;And what’s most important to realise is that AI didn’t create these problems, just like how AI didn’t create sweatshops. &#xA;&#xA;So what I want to suggest is that AI is not the problem. At least it often isn’t.&#xA;&#xA;Instead, AI reminds us of existing systemic problems. And if we only focus on AI, then we risk making those problems much worse.&#xA;&#xA;So, these are the three suggestions I want to make today: &#xA;&#xA;Words matter, and we should work to clearly define key terms such as AI or open source AI. This is not only to make communication easier, but also to increase societal trust of scientific institutions. 
But this alone is not enough.&#xA;Because we should also reflect on what outcomes we want to see for underlying issues.&#xA;With the understanding that AI is very often not the cause of these problems, and if we focus too much on AI we risk making things worse.&#xA;&#xA;I hope there was something useful in this talk and that it can provoke more conversations.&#xA;&#xA;And if you’re interested in continuing the conversation, I want to point to the Turing Way community.&#xA;&#xA;The Turing Way started as an online guide on open science practices, but over the past five years has turned into a global community of concerned researchers who reflect on some of the issues I talked about today.&#xA;&#xA;For example, last year my co-author Jennifer Ding led a Turing Way Fireside Chat about open source AI, and the labour issues behind it.&#xA;&#xA;I invite you to visit the Turing Way to talk about AI or other open science and open research topics.&#xA;&#xA;With that, thank you very much for coming to my little show and tell today.&#xA;&#xA;addendum on reproducibility&#xA;&#xA;Here are the additional points I made about reproducibility at the Bristol life sciences Reproducibility by Design symposium on 26 June 2024: &#xA;&#xA;There are possible good uses of so-called &#34;AI&#34; to help with reproducibility (not everything is doom and gloom!).&#xA;&#xA;For example, my colleague Shern Tee pointed me to the &#34;Speech Schema Filling&#34; tool made by Näsström, Götte, and Schumann (2024). This tool was developed by and for chemists to help them better document their experiments. &#xA;&#xA;It uses speech recognition and a large language model running locally on your computer, so that you talk through each step in your experiment as you are doing it, and this tool records everything into an electronic lab notebook. 
&#xA;&#xA;The remarkable thing is that this language model actually parses what you are saying and records the details of your experiment into a standardized structured data format (for chemistry) that can go with your lab notebook (see this example). &#xA;&#xA;I think this is super cool because as long as you’re willing to talk into a microphone as you work, this tool makes documentation so much easier, and helps with data quality and reproducibility. &#xA;&#xA;That said, considering that so-called &#34;AI&#34; and &#34;open source AI&#34; are neither open, artificial, nor intelligent, there is a recent conference paper (just published June 2024) where they sampled 40 of the commonly used large language models for generative AI. &#xA;&#xA;They evaluated the &#34;openness&#34; of these models with 14 measures of availability of underlying materials, documentation, and access (see Figure 2 in: https://doi.org/10.1145/3630106.3659005). The overwhelming majority of them are highly closed source, so you have no idea what&#39;s happening under the hood. Notably Meta&#39;s Llama 2 which was marketed as &#34;open source&#34; is 6 from the bottom, and OpenAI&#39;s ChatGPT comes in last place. &#xA;&#xA;I think this is bad for reproducibility, especially if we integrate them into the scientific process. And unfortunately we are starting to see this happen. &#xA;&#xA;For example, I&#39;ve seen real papers in real, highly prestigious journals proposing things such as (paraphrased): &#xA;&#xA;Recruiting human participants is hard. Let&#39;s replace (some of) them with chat bots who will never get tired of our interview questions. &#xA;Let&#39;s use &#34;AI&#34; to design and run scientific experiments...&#xA;...or to make inferences, predictions, or even decisions. &#xA;&#xA;In my view, if we build our science on top of the really opaque &#34;AI&#34; which most of the popularly used ones are, then we are not doing science. We&#39;d be doing alchemy*. 
(not to mention we would become even more beholden to Big Tech who holds power over that technology)&#xA;&#xA;And this alchemy would give us &#34;illusions of understanding&#34; as wonderfully described by Messeri &amp; Crockett (2024) (https://doi.org/10.1038/s41586-024-07146-0). I believe this is a great risk to science. &#xA;&#xA;----------&#xA;&#xA;This talk is open source and I published it on Zenodo.org with this DOI (10.5281/zenodo.11051128) along with a transcript, and I encourage you to check it out, fork it, turn it into what you like, and visit the Turing Way community where we can continue these conversations. &#xA;&#xA;#talks #AI&#xA;&#xA;----------&#xD;&#xA;&#xD;&#xA; p xmlns:cc=&#34;http://creativecommons.org/ns#&#34; Unless otherwise stated, all original content in this post is shared under the a href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;blank&#34; rel=&#34;license noopener noreferrer&#34; style=&#34;display:inline-block;&#34;Creative Commons Attribution-ShareAlike 4.0 International/a licensea href=&#34;https://creativecommons.org/licenses/by-sa/4.0/&#34; target=&#34;_blank&#34; rel=&#34;license noopener noreferrer&#34; style=&#34;display:inline-block;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1&#34; alt=&#34;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1&#34; alt=&#34;&#34;img style=&#34;height:22px!important;margin-left:3px;vertical-align:text-bottom;&#34; src=&#34;https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1&#34; alt=&#34;&#34;/a/p ]]&gt;</description>
<content:encoded><![CDATA[<p>On 25 April 2024, I gave a talk at the <a href="https://aesisnet.com/events/openscience2024.html">Open Science &amp; Societal Impact</a> conference titled “<strong>AI is not the problem – thinking about outcomes</strong>”. It was co-created with Jennifer Ding of the Turing Way, who is the real AI expert here and wrote a great post about an <a href="https://jending.medium.com/what-are-the-outcomes-of-openness-in-ai-c57ccdbce896">outcomes-based approach to AI</a>. There&#39;s extra material I couldn&#39;t fit into the talk, so I&#39;m putting it here, along with a transcript and video recording of the talk.

Note that I have a <a href="https://write.as/naclscrg/talk-ai-is-not-the-problem-follow-up">follow-up talk focused on labour and academia in November 2024</a>.</p>

<p>The <strong>slides are <a href="https://doi.org/10.5281/zenodo.11051128">published on Zenodo</a> with DOI: <a href="https://doi.org/10.5281/zenodo.11051128">10.5281/zenodo.11051128</a></strong></p>

<p>I also tweaked this talk, linking it to reproducibility in science, for the Reproducibility by Design symposium on 26 June 2024 (at the life sciences department at the University of Bristol), kindly organised by <a href="https://orcid.org/0000-0001-7342-2771">Nick</a> and <a href="https://orcid.org/0009-0008-1617-9822">Fiona</a>.</p>

<p>I will try to gather:</p>
<ul><li>general notes;</li>
<li>other resources/<strong>further reading</strong> collected when developing the talk; and</li>
<li>a transcript of the talk (with reproducibility addendum).</li></ul>

<p>I&#39;ll try to clean up this post with more context and details on a best-effort basis.</p>

<p>There is a video recording (of the April 2024 version) which is saved in a <a href="https://doi.org/10.5281/zenodo.11051128">Zenodo item</a> and <a href="https://archive.org/details/AI-is-not-the-problem-2024-04-25">viewable on the Internet Archive</a>. The video is also embedded here (click the “CC” icon for subtitles):</p>

<iframe src="https://archive.org/embed/AI-is-not-the-problem-2024-04-25" width="640" height="480" frameborder="0" allowfullscreen=""></iframe>

<h2 id="further-reading">Further reading</h2>

<p>The talk cites various people and resources:</p>
<ul><li>Open Source Initiative&#39;s community process for <strong>defining open source “AI”</strong>
<ul><li><a href="https://opensource.org/deepdive">https://opensource.org/deepdive</a></li></ul></li>
<li>The <strong>Turing Way</strong> community
<ul><li><a href="https://book.the-turing-way.org/">https://book.the-turing-way.org/</a></li></ul></li>
<li>Infamous paper with figure of <strong>lab rat with giant genitals</strong> (later retracted) (full citation below)
<ul><li>PDF: <a href="https://web.archive.org/web/20240324051904/https://cdn.arstechnica.net/wp-content/uploads/2024/02/fcell-11-1339390-1.pdf">https://web.archive.org/web/20240324051904/https://cdn.arstechnica.net/wp-content/uploads/2024/02/fcell-11-1339390-1.pdf</a></li>
<li><a href="https://arstechnica.com/science/2024/02/scientists-aghast-at-bizarre-ai-rat-with-huge-genitals-in-peer-reviewed-article/">https://arstechnica.com/science/2024/02/scientists-aghast-at-bizarre-ai-rat-with-huge-genitals-in-peer-reviewed-article/</a></li></ul></li>
<li>Kate Crawford on <strong>“Artificial intelligence is neither artificial nor intelligent”</strong>
<ul><li><a href="https://link.springer.com/article/10.1007/s43681-021-00115-7">https://link.springer.com/article/10.1007/s43681-021-00115-7</a></li>
<li><a href="https://www.technologyreview.com/2021/04/23/1023549/kate-crawford-atlas-of-ai-review/">https://www.technologyreview.com/2021/04/23/1023549/kate-crawford-atlas-of-ai-review/</a></li>
<li><a href="https://www.theguardian.com/technology/2021/jun/06/microsofts-kate-crawford-ai-is-neither-artificial-nor-intelligent">https://www.theguardian.com/technology/2021/jun/06/microsofts-kate-crawford-ai-is-neither-artificial-nor-intelligent</a></li>
</ul></li>
<li><strong>“Invisible” Kenyan sweatshop workers</strong> keeping Meta and OpenAI&#39;s tools running
<ul><li><a href="https://time.com/6247678/openai-chatgpt-kenya-workers/">https://time.com/6247678/openai-chatgpt-kenya-workers/</a></li>
<li>who have now unionised: <a href="https://time.com/6275995/chatgpt-facebook-african-workers-union/">https://time.com/6275995/chatgpt-facebook-african-workers-union/</a></li></ul></li>
<li>Lilly Irani on <strong>“AI” <em>displacing</em> instead of <em>replacing</em> labour</strong>
<ul><li><a href="https://www.publicbooks.org/justice-for-data-janitors/">https://www.publicbooks.org/justice-for-data-janitors/</a></li>
<li><a href="https://quote.ucsd.edu/lirani/white-house-nyu-ainow-summit-talk-the-labor-that-makes-ai-magic/">https://quote.ucsd.edu/lirani/white-house-nyu-ainow-summit-talk-the-labor-that-makes-ai-magic/</a></li></ul></li>
<li>Speech Schema Filling tool for <strong>hands-free electronic lab notebooks</strong>
<ul><li><a href="https://github.com/hampusnasstrom/speech-schema-filling">https://github.com/hampusnasstrom/speech-schema-filling</a></li>
<li><a href="https://www.linkedin.com/posts/juliaschumann_as-part-of-the-2024-llm-hackathon-for-applications-activity-7194416033728724993-tHYj">https://www.linkedin.com/posts/juliaschumann_as-part-of-the-2024-llm-hackathon-for-applications-activity-7194416033728724993-tHYj</a></li></ul></li>
<li>Some evidence strongly suggesting that <strong>some academics may be auto-generating their peer reviews</strong>
<ul><li><a href="https://nicospage.eu/unethical-academics-ai-and-peer-review">https://nicospage.eu/unethical-academics-ai-and-peer-review</a></li></ul></li>
<li>CNN report – Teachers are <strong>using “AI” to grade essays</strong>
<ul><li><a href="https://www.cnn.com/2024/04/06/tech/teachers-grading-ai/index.html">https://www.cnn.com/2024/04/06/tech/teachers-grading-ai/index.html</a></li></ul></li>
<li><strong>Mozilla Foundation report on AI</strong>
<ul><li><a href="https://foundation.mozilla.org/en/research/library/accelerating-progress-toward-trustworthy-ai/whitepaper/">https://foundation.mozilla.org/en/research/library/accelerating-progress-toward-trustworthy-ai/whitepaper/</a></li></ul></li></ul>

<p>And here is the academic literature cited in the talk or otherwise relevant:</p>

<p>Ball, P. (2023). Is AI leading to a reproducibility crisis in science? Nature, 624(7990), 22–25. <a href="https://doi.org/10.1038/d41586-023-03817-6">https://doi.org/10.1038/d41586-023-03817-6</a></p>

<p><strong>RETRACTED</strong> Guo, X., Dong, L., &amp; Hao, D. (2024). Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway. Frontiers in Cell and Developmental Biology, 11. <a href="https://doi.org/10.3389/fcell.2023.1339390">https://doi.org/10.3389/fcell.2023.1339390</a> (<a href="https://web.archive.org/web/20240324051904/https://cdn.arstechnica.net/wp-content/uploads/2024/02/fcell-11-1339390-1.pdf">original PDF</a>)</p>

<p>Hicks, M. T., Humphries, J., &amp; Slater, J. (2024). ChatGPT is bullshit. Ethics and Information Technology, 26(2), 1–10. <a href="https://doi.org/10.1007/s10676-024-09775-5">https://doi.org/10.1007/s10676-024-09775-5</a></p>

<p>Liesenfeld, A., &amp; Dingemanse, M. (2024). Rethinking open source generative AI: open-washing and the EU AI Act. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 1774–1787. <a href="https://doi.org/10.1145/3630106.3659005">https://doi.org/10.1145/3630106.3659005</a></p>

<p>Messeri, L., &amp; Crockett, M. J. (2024). Artificial intelligence and illusions of understanding in scientific research. Nature, 627(8002), 49–58. <a href="https://doi.org/10.1038/s41586-024-07146-0">https://doi.org/10.1038/s41586-024-07146-0</a></p>

<p>Sauermann, H., &amp; Franzoni, C. (2015). Crowd science user contribution patterns and their implications. Proceedings of the National Academy of Sciences, 201408907. <a href="https://doi.org/10.1073/pnas.1408907112">https://doi.org/10.1073/pnas.1408907112</a></p>

<p>Watermeyer, R., Lanclos, D., &amp; Phipps, L. (2024). Does generative AI help academics to do more or less? Nature, 625(7995), 450–450. <a href="https://doi.org/10.1038/d41586-024-00115-7">https://doi.org/10.1038/d41586-024-00115-7</a></p>

<p>Watermeyer, R., Phipps, L., Lanclos, D., &amp; Knight, C. (2024). Generative AI and the automating of academia. Postdigital Science and Education, 6(2), 446–466. <a href="https://doi.org/10.1007/s42438-023-00440-6">https://doi.org/10.1007/s42438-023-00440-6</a></p>

<p>White, M., Haddad, I., Osborne, C., Yanglet, X.-Y. L., Abdelmonsef, A., &amp; Varghese, S. (2024). The model openness framework: Promoting completeness and openness for reproducibility, transparency, and usability in artificial intelligence (arXiv:2403.13784). arXiv. <a href="https://doi.org/10.48550/arXiv.2403.13784">https://doi.org/10.48550/arXiv.2403.13784</a></p>

<p>Widder, D. G., West, S., &amp; Whittaker, M. (2023). Open (for business): Big tech, concentrated power, and the political economy of open AI (SSRN Scholarly Paper 4543807). <a href="https://dx.doi.org/10.2139/ssrn.4543807">https://dx.doi.org/10.2139/ssrn.4543807</a></p>

<p><strong>RETRACTED</strong> Zhang, M., Wu, L., Yang, T., Zhu, B., &amp; Liu, Y. (2024). The three-dimensional porous mesh structure of Cu-based metal-organic-framework—Aramid cellulose separator enhances the electrochemical performance of lithium metal anode batteries. Surfaces and Interfaces, 46, 104081. <a href="https://doi.org/10.1016/j.surfin.2024.104081">https://doi.org/10.1016/j.surfin.2024.104081</a></p>

<h2 id="transcript">Transcript</h2>

<p>Thank you for the introduction. For this talk, I’m going to stay on a high level, and offer my reflections on how to situate “AI” in open science as it relates to wider society. There is a lot of understandable concern about how this technology will affect scientific practice.</p>

<p>And we&#39;ve seen some pretty egregious examples in academic science. Last month this engineering paper published by Elsevier made the rounds because as soon as you start reading the introduction, you’ll see that it starts with “Certainly, here is a possible introduction for your topic…” This is very likely a sentence generated by ChatGPT, a chatbot based on large language models, and brings into doubt the rigour of the rest of the paper.</p>

<p>I think the most dramatic example is one published by Frontiers in February 2024, where it’s pretty obvious that much of the contents are AI-generated, with a dramatic figure of a lab rat with giant gonads. You can also see some gibberish text in the annotations.</p>

<p>What’s remarkable is that these papers were seen by peer reviewers, editors, and copyeditors and were still published.</p>

<p>On the other side of this is that there is growing evidence of academics using tools like ChatGPT to <em>write</em> their peer reviews.</p>

<p>And in higher education, we know that some students would use generative AI to write their essays. But now some instructors are using the same tools to grade those essays.</p>

<p>With that in mind, there are three things I’d like to cover today.</p>

<p>The first is that words matter. A lot. With all of the hype around “AI” right now, it’s important to realise that this is a big umbrella marketing term (instead of a technical term of art) for a bunch of different technologies.</p>

<p>And I really appreciate how Kate Crawford reminds us that these technologies are neither artificial nor intelligent. What we call AI is built on human labour, and it is certainly not intelligent in the way humans are.</p>

<p>In the context of open science, there are calls for open source AI that is transparent, reproducible, and reusable by others. I agree with this, but what counts as open source or open AI is also not clearly defined.</p>

<p>Last year Meta released a large language model called Llama 2 and marketed it as open source. However, the license for Llama 2 actually came with many restrictions on who can use it and how they can use it. We can agree or disagree with these restrictions, but these restrictions mean that Llama 2 is categorically not open source as it has been widely defined for software.</p>

<p>There’s this paper by Widder, Whittaker, and West in 2023 about how ambiguity in words like AI and open source AI has created an opening for the big players to openwash their products. What happens here is that the word “open” becomes a very fuzzy term that feels good, while meaning very little at the same time. And this furthers the power that these big players hold over technology and society.</p>

<p>All of this is to say that what people call open source AI is often neither open, artificial, nor intelligent! For the purposes of today’s meeting, I think this is a major problem because when a term is taken to mean everything, it ends up meaning nothing.</p>

<p>And the societal impact of this ambiguity is that the wider public will trust science even less than they already do.</p>

<p>What this means in practice is that we should be clear about what we mean when talking about AI. If there’s a specific underlying concept like machine learning, training large language models, and so on, then let us use more specific terms.</p>

<p>There is also cross-cutting work to collaboratively define terms like open source AI, and I believe the scientific research community should absolutely be part of this conversation. The Open Source Initiative is one of the leaders on this and I encourage everyone to check it out.</p>

<p>Having said that, even though clearly defined terminology can help us conceptualise and communicate issues around artificial intelligence, it is a necessary but insufficient step for addressing those issues, because effective communication doesn’t solve problems by itself.</p>

<p>Yes, words matter, and outcomes also matter. And once again, there is a lot of work in this space, on topics ranging from reproducibility (which is important in scientific research) to democracy, trustworthiness, inclusion, accountability, and safety.</p>

<p>I really like the work by the Mozilla Foundation, such as their thinking about trustworthy AI and the need for openness, competition, and accountability. There are so many outcomes for us to consider, and to make things more concrete, I want to focus on a real-world example which challenges us to think more deeply about what outcomes we want to see.</p>

<p>To make this point we should realise that what’s often called “artificial intelligence” is foundationally similar to autocorrect/spell check. In this case, your typing input is fed into a statistical model that suggests the correct spelling for a word. Now, I know this is simplifying things a bit, and I don’t mean to minimise the amazing math and computer science research that went into it, but the large language models underlying much of generative AI today are – on a high level – autocorrect systems that run some very, very sophisticated statistics on your input to produce natural-feeling outputs. It’s important to know this because enormous amounts of human labour go into labelling the huge datasets used to train these models.</p>
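<p>To make the autocorrect analogy concrete, here is a purely illustrative toy sketch (not how any production model works): a bigram model counts which word tends to follow which, then suggests the most frequent follower. The large language models behind generative AI are statistically far more sophisticated, but the spirit is similar.</p>

```python
from collections import Counter, defaultdict

# Toy "autocorrect": learn, from a tiny corpus, which word most often
# follows each word, and suggest that word as the continuation.
corpus = "open science is open research and open science is iterative".split()

following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def suggest_next(word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(suggest_next("open"))  # "science" (seen twice, vs "research" once)
```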

<p>Around this time last year (2023), workers for the companies behind ChatGPT, TikTok, and Facebook formed a union in response to the horrible working conditions they had to put up with.</p>

<p>What’s behind the “artificial intelligence” façade is that many of them are sweatshop workers who manually label training data.</p>

<p>For ChatGPT, these sweatshop workers were hired to tag and filter text that describes extremely graphic details like sexual abuse, murder, suicide, or torture.</p>

<p>This reminds us of how “artificial intelligence” is neither artificial nor intelligent, and it has become a smokescreen for deeper issues like how labour is not being replaced by machines when in fact it is being displaced and made even more invisible.</p>

<p>So, when we think about what outcomes we want to see, we must consider underlying problems like outsourcing, labour rights, or colonialism.</p>

<p>But what does this have to do with scientific research?</p>

<p>Well, similar things are happening in what some people call “crowd science”: a research methodology where academic scientists crowdsource data collection and data labelling to online volunteers.</p>

<p>To be clear, there are positive things that can come from this, for example some scientists build crowdsourcing into science outreach and engagement activities, and there are ways to integrate crowd science into science education.</p>

<p>However, I’ve reviewed many scientific papers about this over the years, and some are really focused on how crowdsourcing is a way to shorten the time needed to process data, and to lower costs for the scientist.</p>

<p>Right now, a lot of this is being used to train machine learning models and other AI applications. And I feel there is a risk that parts of the scientific community are inadvertently perpetuating not just the hype around AI, but also the exploitation of people.</p>

<p>I give these examples because I think that we, as members of the scientific community, should go outside of the ivory tower and engage with wider efforts to think about what outcomes we’d like to see in a world with AI. For instance, what can we learn from labour movements to inform more equitable practices when doing crowd science?</p>

<p>This is just one possibility for thinking about outcomes for science.</p>

<p>And the third thing I want to cover is what AI means for open science. To do this I want to take us back to this extraordinary generated figure of a lab rat. One response that we might have to AI-generated papers or peer reviews is to ban the use of AI tools for scientific papers. Some publishers and journals have already implemented these policies. But I’m concerned about whether, and which, problems we actually solve if we focus on dealing with AI.</p>

<p>I fear that we might inadvertently think that we’ve “solved” the problem, when we are entrenching a much deeper problem.</p>

<p>For example, I wouldn’t be surprised if one of the big academic publishers would release a new proprietary tool for detecting AI generated text in submitted papers and reviews, and tie this feature into journals that they publish. On one hand, maybe the tool is really effective and would weed out these junk papers.</p>

<p>But “solutions” like this might concentrate even more power into these huge publishers, who are a big part of why peer review is so broken in the first place. And in this case, I think fixing peer review is more important than dealing with AI.</p>

<p>I think the broader lesson is that we should support existing open science efforts. For example, there are many tools to help fix peer review, such as preregistration, publishing Registered Reports, publishing preprints followed by open post-publication peer review. Groups like PREreview or journals like the Journal of Open Source Software have been doing this work for years.</p>

<p>We also have to tackle even deeper problems like job precarity in academic research, where some researchers move from one short term job to another, or professors who live in tents. And many of us have to deal with toxic workloads where we are expected to do even more for less pay.</p>

<p>And what’s most important to realise is that AI didn’t create these problems, just like how AI didn’t create sweatshops.</p>

<p>So what I want to suggest is that AI is not the problem. At least it often isn’t.</p>

<p>Instead, AI reminds us of existing systemic problems. And if we only focus on AI, then we risk making those problems much worse.</p>

<p>So, these are the three suggestions I want to make today:</p>
<ul><li>Words matter, and we should work to clearly define key terms such as AI or open source AI. This is not only to make communication easier, but also to increase societal trust of scientific institutions. But this alone is not enough.</li>
<li>Because we should also reflect on what outcomes we want to see for underlying issues.</li>
<li>With the understanding that AI is very often not the cause of these problems, and if we focus too much on AI we risk making things worse.</li></ul>

<p>I hope there was something useful in this talk and that it can provoke more conversations.</p>

<p>And if you’re interested in continuing the conversation, I want to point to the Turing Way community.</p>

<p>The Turing Way started as an online guide on open science practices, but over the past five years has turned into a global community of concerned researchers who reflect on some of the issues I talked about today.</p>

<p>For example, last year my co-author Jennifer Ding led a Turing Way Fireside Chat about open source AI, and the labour issues behind it.</p>

<p>I invite you to visit the Turing Way to talk about AI or other open science and open research topics.</p>

<p>With that, thank you very much for coming to my little show and tell today.</p>

<h3 id="addendum-on-reproducibility">Addendum on reproducibility</h3>

<p>Here are the additional points I made about reproducibility at the Bristol life sciences Reproducibility by Design symposium on 26 June 2024:</p>

<p>There are possible good uses of so-called “AI” to help with reproducibility (not everything is doom and gloom!).</p>

<p>For example, my colleague Shern Tee pointed me to the “<a href="https://github.com/hampusnasstrom/speech-schema-filling">Speech Schema Filling</a>” tool made by Näsström, Götte, and Schumann (2024). This tool was developed by and for chemists to help them better document their experiments.</p>

<p>It uses speech recognition and a large language model running locally on your computer, so that you <em>talk</em> through each step in your experiment as you are doing it, and this tool records everything into an electronic lab notebook.</p>

<p>The remarkable thing is that this language model actually <em>parses</em> what you are saying and records the details of your experiment into a <em>standardized structured data format</em> (for chemistry) that can go with your lab notebook (see <a href="https://nomad-lab.eu/prod/v1/oasis/gui/user/uploads/upload/id/_3d9bVH6Qa2vnhLGA3U5rw/entry/id/GMER924PLNU_Bz8sqeD9-INx322m">this example</a>).</p>
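<p>As a rough illustration of what “schema filling” means here, one spoken step can be mapped onto a structured record. This is a hypothetical sketch only: the real tool uses a local large language model, for which the regular expression below is a crude, invented stand-in.</p>

```python
import re

# Hypothetical stand-in for the language model: pull an action, amount,
# unit, and substance out of one transcribed utterance.
STEP = re.compile(
    r"(?P<action>add|heat|stir)\s+(?P<amount>[\d.]+)\s*(?P<unit>ml|g)\s+(?:of\s+)?(?P<substance>\w+)",
    re.IGNORECASE,
)

def fill_schema(utterance):
    """Return a dict of named fields for one lab step, or None if no match."""
    match = STEP.search(utterance)
    return match.groupdict() if match else None

entry = fill_schema("Now I add 5 ml of ethanol to the flask")
# entry == {'action': 'add', 'amount': '5', 'unit': 'ml', 'substance': 'ethanol'}
```

The point of the structured output is that every step lands in the same machine-readable shape, which is what makes the resulting lab notebook searchable and comparable across experiments.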

<p>I think this is super cool because as long as you’re willing to talk into a microphone as you work, this tool makes documentation so much easier, and helps with data quality and reproducibility.</p>

<p>That said, considering that so-called “AI” and “open source AI” are neither open, artificial, nor intelligent, there is a recent conference paper (just published June 2024) where they sampled 40 of the commonly used large language models for generative AI.</p>

<p>They evaluated the “openness” of these models with 14 measures of availability of underlying materials, documentation, and access (see Figure 2 in: <a href="https://doi.org/10.1145/3630106.3659005">https://doi.org/10.1145/3630106.3659005</a>). The overwhelming majority of them are highly closed source, so you have no idea what&#39;s happening under the hood. Notably, Meta&#39;s Llama 2, which was marketed as “open source”, is sixth from the bottom, and OpenAI&#39;s ChatGPT comes in last place.</p>
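<p>The general shape of that kind of evaluation can be sketched like this. Everything below is invented for illustration (a toy rubric with three dimensions and two made-up models); the actual paper rates each system on 14 dimensions with its own rubric.</p>

```python
# Toy openness rubric: rate each model on a few dimensions, map the
# ratings to numbers, and rank by the total. All names and ratings
# here are invented for illustration.
SCORES = {"open": 1.0, "partial": 0.5, "closed": 0.0}

models = {
    "model-a": {"training_data": "open", "weights": "open", "paper": "partial"},
    "model-b": {"training_data": "closed", "weights": "partial", "paper": "closed"},
}

def openness_total(ratings):
    """Sum the numeric score of every dimension's rating."""
    return sum(SCORES[rating] for rating in ratings.values())

ranked = sorted(models, key=lambda name: openness_total(models[name]), reverse=True)
print(ranked)  # most open first: ['model-a', 'model-b']
```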

<p>I think this is bad for reproducibility, especially if we integrate them into the scientific process. And unfortunately we are starting to see this happen.</p>

<p>For example, I&#39;ve seen real papers in real, highly prestigious journals proposing things such as (paraphrased):</p>
<ul><li>Recruiting human participants is hard. Let&#39;s replace (some of) them with chatbots that will never get tired of our interview questions.</li>
<li>Let&#39;s use “AI” to design and run scientific experiments...</li>
<li>...or to make inferences, predictions, or even <em>decisions</em>.</li></ul>

<p>In my view, if we build our science on top of really opaque “AI” – which most of the popularly used models are – then we are not doing science. We&#39;d be doing <em>alchemy</em>. (Not to mention that we would become even more beholden to the Big Tech companies who hold power over that technology.)</p>

<p>And this alchemy would give us “illusions of understanding” as wonderfully described by Messeri &amp; Crockett (2024) (<a href="https://doi.org/10.1038/s41586-024-07146-0">https://doi.org/10.1038/s41586-024-07146-0</a>). I believe this is a great risk to science.</p>

<hr/>

<p>This talk is open source and I published it on Zenodo.org with this DOI (<a href="https://doi.org/10.5281/zenodo.11051128">10.5281/zenodo.11051128</a>) along with a transcript, and I encourage you to check it out, fork it, turn it into what you like, and visit the Turing Way community where we can continue these conversations.</p>

<p><a href="https://naclscrg.writeas.com/tag:talks" class="hashtag"><span>#</span><span class="p-category">talks</span></a> <a href="https://naclscrg.writeas.com/tag:AI" class="hashtag"><span>#</span><span class="p-category">AI</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/talk-ai-is-not-the-problem</guid>
      <pubDate>Thu, 25 Apr 2024 10:09:26 +0000</pubDate>
    </item>
    <item>
      <title>Talk - The critical role of open source in open research</title>
      <link>https://naclscrg.writeas.com/critical-role-of-open-source-in-open-research?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[On 20 March 2024, I gave a talk at the Open Source for Innovation in Universities event titled &#34;The critical role of open source in open research&#34; (open source slides published to Zenodo). Like last time, it was informed by incredible feedback I received from various open research communities, especially Malvika of the Turing Way who first connected me to the organisers. There&#39;s extra stuff I couldn&#39;t fit into the talk, so I&#39;m putting them here.&#xA;&#xA;!--more--&#xA;&#xA;I&#39;m posting: &#xA;&#xA;a few general notes; &#xA;other resources/further reading suggested by Turing Way members; and&#xA;a transcript of my talk.&#xA;&#xA;I&#39;ll try to clean up this post with more context and details on a best-effort basis.&#xA;&#xA;There is a video recording which is saved in the Zenodo item, viewable on YouTube, and embedded here: &#xA;&#xA;iframe width=&#34;560&#34; height=&#34;315&#34; src=&#34;https://www.youtube-nocookie.com/embed/MFKmZmp7HmI?si=stsNsVycBEqGKjAx&#34; title=&#34;YouTube video player&#34; frameborder=&#34;0&#34; allow=&#34;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share&#34; referrerpolicy=&#34;strict-origin-when-cross-origin&#34; allowfullscreen/iframe&#xA;&#xA;General notes&#xA;&#xA;In-person verbal feedback was positive, though I didn&#39;t get to use as much time preparing it as I wanted. I was also running out of time near the end, and wish I could have talked about the Turing Way more!&#xA;&#xA;This time, I also opened a Turing Way GitHub issue #3570, to track the development of this talk.&#xA;&#xA;As expected, I wasn&#39;t able to fit everything in, but also thank you to Sarah Gibson, Julien Colomb, Esther Plomp for your feedback earlier to help me prepare! I&#39;m also grateful to the organisers Michael Meagher and Clare Dillon who gathered a great group of warm and interesting people for this event. 
:) Special thanks to Malvika Sharan for the several meetings we had to structure this talk.&#xA;&#xA;a note about creating a transcript&#xA;&#xA;For my FOSDEM lightning talk, I typed what I wanted to say directly into the presenter notes in my slides before the talk. However, this time I just didn&#39;t have time to do that.&#xA;&#xA;So, I tried using my phone to make a live audio recording as I gave the presentation. Then, I used the open source Whisper.cpp automatic speech recognition tool with its open-ish ggml-small.en model to generate a transcript.&#xA;&#xA;Then, I copied that transcript into the presenter notes of the final slides published to Zenodo.&#xA;&#xA;In the end, I think this method works, but is still time-consuming. The generated transcript is a huge text file that I had to manually split into paragraphs, and copy and paste individual chunks of text into their corresponding presenter notes. This is also what&#39;s below in the &#34;Transcript&#34; section.&#xA;&#xA;Will I continue to use Whisper.cpp in the future? Yes, I think its text transcription is remarkably accurate and is getting better. Though there are still paper cuts in the user experience that add some work for me.&#xA;&#xA;Other resources/examples&#xA;&#xA;Thanks to Sarah Gibson and Julien Colomb for the suggested examples: &#xA;&#xA;The Gorgas tracker as mentioned in this post and described in Arancio (2023).&#xA;CERN&#39;s White Rabbit project. Also see this interview about it.&#xA;The Python and R ecosystems vs MATLAB and SPSS in days past.&#xA;JupyterHub, specifically the QGreenland project (WARNING: Medium link). 
I really like this one because it&#39;s not just one piece of open source hardware, but an entire stack that could only work well when all components are open source and remixable.&#xA;&#xA;Transcript&#xA;&#xA;Note: This transcript is lightly edited for clarity, such as by removing the &#34;uh&#34;s and &#34;you know&#34;s, or &#34;ah&#34;s.&#xA;&#xA;Thank you so much for that introduction, Clare. I&#39;m really excited to be here with you today. It&#39;s really quite a privilege to be speaking to you. And as Clare mentioned, I am a member of the Turing Way community, which I will come back to near the end of the talk. But today, I&#39;d like to share some of my own reflections being not only an advocate for open research in the academic community over the past several years, but also as a member of the open source community. I very much think of my talk as a kind of &#34;yes, and...&#34; kind of presentation. And it&#39;s also intentionally provocative with the intention of stimulating, new thinking around what kind of opportunities can we consider when it comes to open source technologies and open research.&#xA;&#xA;I want to start very briefly by focusing on the term open research and make kind of a subtle point here. So I consider open research to cover a very wide and diverse array of different research disciplines. And a lot of the examples I&#39;d like to share today come from my experience advocating for open science, which I consider to be a very important component of open research, but it&#39;s not all of open research. So there&#39;s a subtle difference between the terms and I&#39;d just like to delineate the two, even though most of what I&#39;m talking about today comes from the open science world.&#xA;&#xA;With that said, the structure of my talk today, I&#39;d like to start with my reflections on some of the core values of open science, why open science is so important, including in academic research. 
Very briefly on a lot of the invisible infrastructure of technology that underlies the scientific research that we do, followed by I think the biggest part of my talk today, which are the additional motivations for open source technologies to enable open science. And I&#39;d like to bring up the hardware component as well because we&#39;ve heard a lot about software. And finally, I will talk about some of the communities that have been so lucky to be a part of over the years that discusses a lot of the things in my talk today.&#xA;&#xA;So, open science. I&#39;ve talked about open science to so many people over the years, and what I have learned is that... &#xA;&#xA;...if you ask 10 people what open science means, they will tell you, yes, I know what it is, but they will give you 10 different answers. So I&#39;d just like to set the scene a little bit for my talk today to establish a common understanding just to help with the conversation.&#xA;&#xA;And one of the initiatives that I&#39;ve been really privileged to be a part of is the drafting of the UNESCO Recommendation on Open Science that was ratified in 2021. I had a very small role to play in this, but it was a huge privilege to be part of the process and it produced an amazing document.&#xA;&#xA;It&#39;s really long, but I recommend you check it out. And part of it defines open science to mean a set of practices for reproducibility, transparency, sharing, and collaboration from the increased opening of scientific contents towards and processes. Again, I think this is an amazing document, but this definition is also quite a mouthful, right? So I tried to reflect on: is there kind of like an essence to this definition?&#xA;&#xA;And what I came to is actually the difference between science and alchemy. So what do I mean by this?&#xA;&#xA;I was inspired to think about this by a very provocative digital rights author called Cory Doctorow. 
He writes a lot about these kinds of fundamental values underlying open research and open science and open source. And he said, if we think about how alchemists used to work: superficially, they were running experiments, they had some research questions, they took lots of notes, and they were actually learning along the way. But the thing with alchemists is that they kept what they knew a secret from each other for 500 years. Because of that secrecy, they didn&#39;t advance the state of the art very much. And because of that, every single one of them had to learn in the hardest possible way that drinking mercury is a bad idea.&#xA;&#xA;I think this really hits at the core of the difference between science and alchemy because science is a fundamentally iterative process where we are always building on knowledge shared by other people and what came before. So in a way, for us to be responsible scientists, we have to continue to share what we have learned with other people to build upon our successes and failures. So I think to do good science is to do open science, and I think that&#39;s what open science is really about.&#xA;&#xA;Another way to think about this is what I think of as intellectual humility because I&#39;ve been an academic researcher for like 15 years now. And reflecting on these years of research, I realized that whatever little bit I&#39;ve added to our collective body of knowledge, I was able to do that because of everything that I&#39;ve learned from the people who came before me. So as researchers, we really didn&#39;t get here on our own. It&#39;s really built on top of what everyone else has shared with us. &#xA;&#xA;And it is with all of this in mind that I think open science really comes with four fundamental freedoms, where for any piece of knowledge, it should come with the freedoms for anyone to use it, study it, modify it, and continue to share it with other people to continue that iterative cycle. 
So this is how I like to think of open science. And that&#39;s the first thing I wanted to cover today.&#xA;&#xA;The next thing I wanted to quickly establish is that for this science to happen, we&#39;re making use of so much shared technical infrastructure today. I remember many years ago I was at this hackathon with Arfon Smith from GitHub. And he was the person who gave me that lifetime Pro subscription to a GitHub that I&#39;m still getting dividends from to this day. It is a platform that comes with amazing features.&#xA;&#xA;But at the same time, I also remember how a couple of years ago there was this big GitHub outage for a couple of hours. And it is when things like this happen that we realize how reliant we have become on the software and hardware infrastructure in our lives. Because when they break, when we hurt, that&#39;s when we realize our reliance on these things. &#xA;&#xA;And it&#39;s important to think about this because it reminds us to reflect on who gets to have a say in how this infrastructure works and how that infrastructure can work for us as researchers, and how we live out our lives. So this invisible infrastructure is really important. And this kind of centralization that&#39;s happening, I think, is a challenge that open source technologies can tackle.&#xA;&#xA;So I&#39;ve been thinking about a lot of the motivations for open source, including a lot of the reasons that people have talked about today. And I&#39;d like to go over some examples. I want to talk about hardware, but will start with a software example that I think is amazing, which is...&#xA;&#xA;...the QGreenland project. So I thought this project was so cool because it started out as a bunch of academic scientists who share a common theme, which is that they all study Greenland. It could be meteorologists, geologists, and a lot of other scientists. 
And they developed this common software platform for analyzing geospatial data about Greenland.&#xA;&#xA;And they built it on top of open source software called QGIS. It is a geographical information system, so that they can pull all of the geospatial data about Greenland into one place. They have a whole suite of tools built on top of QGIS to analyze that data. And the whole stack is called QGreenland. And what happened was that this project became successful. And last year in 2023, they wanted to run a training workshop for other researchers to learn about how to use QGreenland for their scientific research. &#xA;&#xA;But one problem they encountered was that if they have 20 scientists in the room coming to this workshop, all with their laptops and their different operating systems and configurations, it takes so much time to just get people on the same page to install QGIS, get it running, and then put QGreenland on top of it. That takes so much time from the actual training they wanted to do.&#xA;&#xA;So they thought, okay, can we reduce this friction a little bit?&#xA;&#xA;And the solution they came up with was that they started with JupyterHub, which is kind of like a server-hosted version of the Python-based Jupyter computational notebook that a lot of data scientists use.&#xA;&#xA;But they were able to make some additions to Jupyter and tweak it so that instead of just running Python, they&#39;re running an entire Linux desktop environment on top of JupyterHub.&#xA;&#xA;And with that, they can then install QGIS into that Linux environment, and then they put the whole QGreenland geospatial data platform on top of that.&#xA;&#xA;And once they put all of this together into one package, they serve it from their server so that the participants in the workshop can just open up their web browsers, go to a particular URL, and the whole package runs as a web page inside their browser. 
And this saves so much time in the workshop because they don&#39;t need to set QGIS up on every individual computer.&#xA;&#xA;Now, the reason I love this example is that all of these components, they are open source to begin with, and they demonstrate the FAIR principles of open science. Now, I think a lot of you know what FAIR stands for, but just so we&#39;re on the same page, FAIR stands for... &#xA;&#xA;...Findable, Accessible, Interoperable, and Reusable. And this is a big thing in open science, and I think QGreenland demonstrates all of it. Because of this open source publishing online, it&#39;s easy for people to find it. The way they set it up is really accessible. It&#39;s interoperable because the components are open source, and they were able to tweak the components to interact with each other. And, of course, it&#39;s reusable because other scientists can adapt it to different research contexts. And I think this is a demonstration of how the FAIR principles that are so important to open science are enabled by open source technologies.&#xA;&#xA;Okay, so this is a software example, but if you look at the UNESCO recommendation on open science, it talks about several main pillars of open science, including the usual suspects like open access publications, open data, open educational resources (I think this one is really important!), and of course, open source software code.&#xA;&#xA;In addition to that, the recommendation emphasizes that hardware is a really important part of open science as well. So I like to focus a bit on the open source hardware side of things.&#xA;&#xA;And if you really think about it, hardware underpins so much of scientific research. It was literally hardware that took people to the moon. 
That&#39;s how much we rely on hardware to do science.&#xA;&#xA;It can be huge pieces of equipment like the Large Hadron Collider,&#xA;&#xA;Or it can be something seemingly simple, but equally integral to the research infrastructure, like microscopes that we use in so many labs today.&#xA;&#xA;Now, the thing with hardware is that it&#39;s very often closed source, like a lot of software. &#xA;&#xA;And some of the challenges with that is that it&#39;s not reproducible in a scientific way. There&#39;s vendor lock-in, which was mentioned before. There&#39;s forced obsolescence, and there are very high costs. The cost is not only in terms of a very expensive piece of equipment. It&#39;s also the very high switching costs, where if you decide there&#39;s another equipment you want to use, but since it&#39;s not open source and there&#39;s no interoperability, it&#39;s very difficult for you to switch to a different platform.&#xA;&#xA;And this causes a lot of global inequalities in research. I personally know some scientists in some global south countries who really want to have a particular piece of instrument in their lab, but the one manufacturer that makes it simply do not sell it in their country.&#xA;&#xA;And even if they somehow get access to buy it, the cost is so high that they cannot afford it. &#xA;&#xA;And if they somehow scrunch together the money to be able to afford to buy it, once they have it, they won&#39;t be able to get any support on it. 
They cannot maintain it themselves.&#xA;&#xA;And it just becomes prohibitively difficult for a lot of researchers in different places around the world.&#xA;&#xA;So I think when it comes to the social impact of our technologies, it&#39;s really important to be mindful of a lot of the global inequalities that come with the technologies of today.&#xA;&#xA;So in contrast to that, open source hardware is defined as hardware whose design is available so that anyone, again, can study, modify, distribute, make, and sell hardware based on that design. And there are a lot of examples, actually, in scientific research.&#xA;&#xA;An amazing one that I know about is the Open Source Imaging Initiative. So this is a consortium of universities across Europe, including some companies, I believe, who came together to create a completely open source MRI machine for medical scanning and diagnosis.&#xA;&#xA;And if you know anything about MRI machines, you know how complicated and intricate they are. And they&#39;re actually creating an open source version of it that&#39;s becoming successful!&#xA;&#xA;Open source hardware has been to space. Researchers in the U.S., they&#39;ve developed the ORESAT, which is an open source CubeSat, that became a common platform for scientists across the U.S. to build on top of for remote sensing applications.&#xA;&#xA;It&#39;s been launched several times already, and I think they have more launches scheduled.&#xA;&#xA;But the example that I&#39;d really love to talk about is the OpenFlexure microscope. So this is a lab-grade microscope, originally developed by researchers at the University of Bath in the U.K. (I think their team is based in Glasgow now). The point is it&#39;s completely open source and modular, and you can 3D print most of the microscope yourself.&#xA;&#xA;It comes with a lot of features, starting with the basic ones like bright field imaging, or fluorescence imaging. 
But because it is fully open source, there was a separate research team in a different part of the world that looked at the designs, and they actually enhanced it and improved it to greatly increase the resolution for fluorescence imaging.&#xA;&#xA;And this is something that people weren&#39;t able to do with the closed-source microscopes that they used before.&#xA;&#xA;These are just a couple of features, but what&#39;s also really cool is that this open-source microscope, if you want to build it yourself, the cost of doing so is only about 200 US dollars.&#xA;&#xA;Now, for those of you who have used and bought microscopes for use in the lab before, you will know that these microscopes often cost an order of magnitude more than OpenFlexure for doing the same thing, and I think that&#39;s absolutely remarkable.&#xA;&#xA;And because of its low cost and because it&#39;s open source, again, as an example, researchers in several sub-Saharan countries were able to take the OpenFlexure design to locally produce and maintain that microscope for malaria diagnosis when they weren&#39;t able to do it before.&#xA;&#xA;And in addition to this, it has actually prompted the formation of some small businesses in those countries to locally produce and sell these microscopes, and it&#39;s again becoming a new business model that&#39;s enabled by open source technology.&#xA;&#xA;Okay, so to kind of build on some of the points made earlier, Joshua Pearce is a researcher in this area, and he calculated that open source technologies, including hardware, can provide economic savings of up to 87% compared to functionally equivalent proprietary tools.&#xA;&#xA;And again, my other point is that in addition to the savings, it creates new kinds of businesses.&#xA;&#xA;So I have a bit of a background in molecular biology, and I&#39;ve used PCR machines a lot. And there&#39;s a company that sells these Ninja PCR machines for US$500. 
Again, if you have bought this for labs before, you&#39;ll know that they typically cost an order of magnitude more. So it&#39;s amazing how open source not only lower costs, but creates new kinds of businesses as well.&#xA;&#xA;Okay, so I talked about some of the benefits of open source technology just now. And to build on Clare&#39;s point earlier, I think we&#39;re faced with so many global challenges today, whether that&#39;s climate change or pandemics or other problems. And they&#39;re so big and urgent that I think open source technology is what enables the inclusive and rapid innovation needed to address these really urgent issues.&#xA;&#xA;And to bring it back to my earlier point, I truly believe that we simply don&#39;t have time to be alchemists anymore. We cannot afford to be alchemists. And I think this is a huge motivator for why open source is so important and critical to open research.&#xA;&#xA;Now, with all of that said, here actually comes what might be the most provocative part of my talk today. So, you know, again, we&#39;ve seen so many motivations for open source, like the collaboration that happens, faster innovation that&#39;s so critical to solve problems of today, the lower costs and business opportunities, and so many other benefits, right?&#xA;&#xA;But I feel they are just the tip of the iceberg in terms of why open source is so important. And there are some underlying values that I think really adds a lot to the value proposition of open source.&#xA;&#xA;In my view, that could be things like the autonomy and agency that we can have over the technology that we use and the freedom to use it for our purposes. And I think these are the things that also underpin why open source is so important.&#xA;&#xA;Dr. Julieta Arancio is a researcher of open source technologies, and I think she characterizes it really well, where technology really affects the way we think about research questions. 
&#xA;&#xA;And when a piece of technology and the tool that we use is closed source, it means that rather than being enablers of our creativity, we end up doing what the available tech lets us do. &#xA;&#xA;Because the people behind that technology gets to dictate what you can do with that technology. And what that means is, in this context, is that closed source technology also implies a certain kind of epistemic power behind it in terms of what knowledge we are allowed to have and what we can use that knowledge for.&#xA;&#xA;And the risks with closed source technology and the challenges with it is that, depending on how you wield that epistemic power, unfortunately, sometimes it leads to a kind of intellectual poverty. Because only certain people get to have certain pieces of knowledge and not other people. Some people get to make use of that knowledge in certain ways, while other people don&#39;t get to do that.&#xA;&#xA;So I think intellectual poverty is an unfortunate side effect that sometimes come from closed source technologies. And this is where the value proposition of open-source technology really comes in.&#xA;&#xA;This is not only convenient and amazing in terms of the collaboration and innovation that happens, there is also an ethical underpinning to it that makes it even more attractive&#xA;and adds to the value that we already have.&#xA;&#xA;And this connects with open research, because open research is not only about publishing open outputs, whether that&#39;s open access papers, open data, and open source software or hardware designs, it is also about how we hold that epistemic power together in a more equitable way.&#xA;&#xA;Such as those scientists I told you about in the Global South who couldn&#39;t do what they wanted to do in their research. 
And I think this is, you know, in addition to all of the benefits that we talked about, a very important value proposition.&#xA;&#xA;If you think some of these conversations are interesting, I&#39;d like to share with you some of the communities in which these conversations are happening.&#xA;&#xA;I&#39;d like to start on the hardware side of things. Over the past few years, I have been very lucky to be part of a group called the Gathering for Open Science Hardware, also known as GOSH.&#xA;&#xA;And this is a network of researchers, hundreds of researchers from across the world, literally from every continent, except maybe Antarctica, who come together to think about the important role of open source hardware in scientific research.&#xA;&#xA;And we have done a lot of interesting work, such as last year we created a policy toolkit for UNESCO on the role of open source hardware in scientific research, which was just published at the end of last year, that provides a lot of policy guidance on the national level for research and innovation policy.&#xA;&#xA;I mentioned Julieta just now. She is the author of an amazing report called Supporting Open Science Hardware in Academia. This report is geared towards scientific research funders and technology transfer offices in universities to provide some guidance on how universities can enact policies to support people to work on open source technologies in research and also ways to spin off that development into successful business models around open source. So I think this is a remarkable report that I highly recommend you to check out.&#xA;&#xA;So that&#39;s the hardware side of things, but if you go more broadly than that, I think this is where the Turing way comes in. 
So let me tell you a little bit about this community.&#xA;&#xA;It started back in 2019 initially as a book, an online book, made with Jupyter, by the way,about data science and best practices around how to do data science in an open and reproducible way.&#xA;&#xA;Now it started off as this book, but the founders of the Turing way, they thought: &#34;We are not the only experts here, so can we invite other people to help us co-create this book together?&#34;&#xA;&#xA;And they started a distributed collaboration process that eventually turned into scientists and researchers from around the world contributing to this book, not only in terms of data science, but other aspects of open science and open source as well.&#xA;&#xA;And it&#39;s grown into a huge book, hundreds of pages long, and because of how the book brought together scientists from different backgrounds, it&#39;s grown into a very vibrant community where a lot of conversations are happening around what open research means, what open science means, and what open source means for this work.&#xA;&#xA;So we talk about things like what I presented in my talk today. 
There&#39;s also talk about diverse roles in research, such as the important role of research software engineers in scientific research that&#39;s not recognized enough, or things like localization.&#xA;&#xA;So many things about open science and open source are in English right now, but can we translate that to different languages and what does that mean for people from different backgrounds and social backgrounds as well?&#xA;&#xA;With the hope that eventually we can galvanize a cultural shift in terms of how we think about technology and how we think about research so that, again, we can hold this power together in a more equitable way and think of new opportunities for research and innovation.&#xA;&#xA;So I think it&#39;s remarkable how over the past five years there has been more than 450 contributors to the Turing way, not just in terms of pull requests to the repository&#xA;and adding to the book, but also all of the richness that&#39;s been brought into the conversations that&#39;s been held together by this community.&#xA;&#xA;It&#39;s a really amazing place, and I highly recommend you check out the Turing Way if you&#39;re interested in having these discussions and connecting to other researchers in this space.&#xA;&#xA;And I think this really shows how open research and open source can connect to each other in a mutually beneficial way.&#xA;&#xA;Okay, so I&#39;m just about running out of time, but before I end, of course I&#39;d like to thank all of the people who have helped me so much to come here today to share some of these reflections with you.&#xA;&#xA;First of all, of course, there&#39;s the Turing way community. Malvika Sharan is one of the co-founders of the Turing Way. 
We had a lot of interesting discussions on how to make this a more provocative talk.&#xA;&#xA;I&#39;d like to thank Bri, our community coordinator in GOSH, for contributing a lot of the thoughts on open source hardware, and of course the organizers, including Michael and Clare, for having me here today to share some of these reflections with you, and all of you for putting up with the last 27 minutes and 50 seconds of my talk.&#xA;&#xA;The meta-commentary is that this talk is open source and it&#39;s published on Zenodo.org with this DOI, and I encourage you to check it out, fork it, turn it into what you like, and visit the Turing Way and GOSH communities where we can continue to have these conversations. &#xA;&#xA;#talks #opensource #openresearch&#xA;&#xA;----------&#xD;&#xA;&#xD;&#xA;Unless otherwise stated, all original content in this post is shared under the Creative Commons Attribution-ShareAlike 4.0 International license: https://creativecommons.org/licenses/by-sa/4.0/ ]]&gt;</description>
      <content:encoded><![CDATA[<p>On 20 March 2024, I gave a talk at the <a href="https://web.archive.org/web/20240304132500/https://www.eventbrite.ie/e/open-source-for-innovation-in-universities-tickets-830091424797">Open Source for Innovation in Universities</a> event titled “<strong>The critical role of open source in open research</strong>” (open source slides <a href="https://doi.org/10.5281/zenodo.10828119">published to Zenodo</a>). <a href="https://write.as/naclscrg/epistemic-and-disciplinary-diversity">Like last time</a>, it was informed by incredible feedback I received from various open research communities, especially Malvika of the Turing Way, who first connected me to the organisers. There&#39;s extra material I couldn&#39;t fit into the talk, so I&#39;m putting it here.</p>



<p>I&#39;m posting:</p>
<ul><li>a few general notes;</li>
<li>other resources/further reading suggested by Turing Way members; and</li>
<li>a transcript of my talk.</li></ul>

<p>I&#39;ll try to clean up this post with more context and details on a best-effort basis.</p>

<p>There is a video recording which is saved in the <a href="https://doi.org/10.5281/zenodo.10828119">Zenodo item</a>, <a href="https://www.youtube.com/watch?v=MFKmZmp7HmI">viewable on YouTube</a>, and embedded here:</p>

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/MFKmZmp7HmI?si=stsNsVycBEqGKjAx" title="YouTube video player" frameborder="0" allowfullscreen=""></iframe>

<h2 id="general-notes">General notes</h2>

<p>In-person verbal feedback was positive, though I didn&#39;t get to spend as much time preparing the talk as I wanted. I was also running out of time near the end, and wish I could have talked about the Turing Way more!</p>

<p>This time, I also opened a Turing Way <a href="https://github.com/the-turing-way/the-turing-way/issues/3570">GitHub issue #3570</a>, to track the development of this talk.</p>

<p>As expected, I wasn&#39;t able to fit everything in, but thank you to Sarah Gibson, Julien Colomb, and Esther Plomp for your earlier feedback that helped me prepare! I&#39;m also grateful to the organisers Michael Meagher and Clare Dillon, who gathered a great group of warm and interesting people for this event. :) Special thanks to Malvika Sharan for the several meetings we had to structure this talk.</p>

<h3 id="a-note-about-creating-a-transcript">a note about creating a transcript</h3>

<p>For my <a href="https://write.as/naclscrg/epistemic-and-disciplinary-diversity">FOSDEM lightning talk</a>, I typed what I wanted to say directly into the presenter notes in my slides <em>before</em> the talk. However, this time I just didn&#39;t have time to do that.</p>

<p>So, I tried using my phone to make a live audio recording as I gave the presentation. Then, I used the open source <a href="https://github.com/ggerganov/whisper.cpp">Whisper.cpp</a> automatic speech recognition tool with its open-ish <code>ggml-small.en</code> model to generate a transcript.</p>

<p>Then, I copied that transcript into the presenter notes of the final slides <a href="https://doi.org/10.5281/zenodo.10828119">published to Zenodo</a>.</p>

<p>In the end, I think this method works, but is still time-consuming. The generated transcript is a huge text file that I had to manually split into paragraphs, and copy and paste individual chunks of text into their corresponding presenter notes. This is also what&#39;s below in the “Transcript” section.</p>

<p>Will I continue to use Whisper.cpp in the future? Yes, I think its text transcription is remarkably accurate and is getting better, though there are still paper cuts in the user experience that add some work for me.</p>
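<p>In case it helps anyone, here is a minimal sketch of how this workflow could be partly scripted. To be clear, this is an illustrative assumption rather than the exact commands I ran: the binary name <code>main</code>, the file paths, and the five-sentences-per-chunk heuristic are all placeholders.</p>

```python
import re
import subprocess
from pathlib import Path


def transcribe(audio_wav: Path, model: Path) -> str:
    """Run Whisper.cpp on a 16 kHz mono WAV file and return the raw transcript.

    Assumes Whisper.cpp's command-line binary (here called ``main``) is on
    PATH; with ``-otxt`` it writes the transcript next to the input file.
    """
    subprocess.run(
        ["main", "-m", str(model), "-f", str(audio_wav), "-otxt"],
        check=True,
    )
    return Path(f"{audio_wav}.txt").read_text()


def split_transcript(text: str, sentences_per_chunk: int = 5) -> list[str]:
    """Group a wall-of-text transcript into paragraph-sized chunks."""
    # Naive sentence boundary: sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    return [
        " ".join(sentences[i : i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]
```

<p>The sentence-splitting heuristic is deliberately naive, so the resulting chunks would still need manual review before being pasted into the corresponding presenter notes.</p>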

<h2 id="other-resources-examples">Other resources/examples</h2>

<p>Thanks to Sarah Gibson and Julien Colomb for the suggested examples:</p>
<ul><li>The <a href="https://github.com/healthinnovation/gorgas_tracker">Gorgas tracker</a> as mentioned in <a href="https://drexel.edu/coas/news-events/news/2021/July/open-science-hardware-accelerates-innovation-democratizes-science/">this post</a> and described in <a href="https://doi.org/10.1016/j.envsci.2023.103576">Arancio (2023)</a>.</li>
<li>CERN&#39;s <a href="https://ohwr.org/project/white-rabbit/wikis/home">White Rabbit</a> project. Also see <a href="https://www.openmake.de/blog/2022/10/20/2022-10-06-interview-white-rabbit/">this interview</a> about it.</li>
<li>The Python and R ecosystems vs MATLAB and SPSS in days past.</li>
<li>JupyterHub, specifically the <a href="https://blog.jupyter.org/desktop-gis-software-in-the-cloud-with-jupyterhub-ddced297019a">QGreenland project</a> (WARNING: Medium link). I really like this one because it&#39;s not just one piece of open source software, but an entire <em>stack</em> that could only work well when all components are open source and remixable.</li></ul>

<h2 id="transcript">Transcript</h2>

<p>Note: This transcript is lightly edited for clarity, such as by removing the “uh”s and “you know”s, or “ah”s.</p>

<p>Thank you so much for that introduction, Clare. I&#39;m really excited to be here with you today. It&#39;s really quite a privilege to be speaking to you. And as Clare mentioned, I am a member of the Turing Way community, which I will come back to near the end of the talk. But today, I&#39;d like to share some of my own reflections on being not only an advocate for open research in the academic community over the past several years, but also a member of the open source community. I very much think of my talk as a kind of “yes, and...” presentation. And it&#39;s also intentionally provocative, with the intention of stimulating new thinking around what kinds of opportunities we can consider when it comes to open source technologies and open research.</p>

<p>I want to start very briefly by focusing on the term open research and make kind of a subtle point here. So I consider open research to cover a very wide and diverse array of different research disciplines. And a lot of the examples I&#39;d like to share today come from my experience advocating for open science, which I consider to be a very important component of open research, but it&#39;s not all of open research. So there&#39;s a subtle difference between the terms and I&#39;d just like to delineate the two, even though most of what I&#39;m talking about today comes from the open science world.</p>

<p>With that said, for the structure of my talk today, I&#39;d like to start with my reflections on some of the core values of open science and why open science is so important, including in academic research. Very briefly, I&#39;ll touch on a lot of the invisible infrastructure of technology that underlies the scientific research that we do, followed by what I think is the biggest part of my talk today, which is the additional motivations for open source technologies to enable open science. And I&#39;d like to bring up the hardware component as well, because we&#39;ve heard a lot about software. And finally, I will talk about some of the communities that I have been so lucky to be a part of over the years, which discuss a lot of the things in my talk today.</p>

<p>So, open science. I&#39;ve talked about open science to so many people over the years, and what I have learned is that...</p>

<p>...if you ask 10 people what open science means, they will tell you, yes, I know what it is, but they will give you 10 different answers. So I&#39;d just like to set the scene a little bit for my talk today to establish a common understanding just to help with the conversation.</p>

<p>And one of the initiatives that I&#39;ve been really privileged to be a part of is the drafting of the UNESCO Recommendation on Open Science that was ratified in 2021. I had a very small role to play in this, but it was a huge privilege to be part of the process and it produced an amazing document.</p>

<p>It&#39;s really long, but I recommend you check it out. Part of it defines open science as a set of practices for reproducibility, transparency, sharing, and collaboration, arising from the increased opening of scientific content, tools, and processes. Again, I think this is an amazing document, but this definition is also quite a mouthful, right? So I tried to reflect on whether there is an essence to this definition.</p>

<p>And what I came to is actually the difference between science and alchemy. So what do I mean by this?</p>

<p>I was inspired to think about this by a very provocative digital rights author called Cory Doctorow. He writes a lot about these kinds of fundamental values underlying open research and open science and open source. And he pointed out that, superficially, alchemists worked the way scientists do: they were running experiments, they had research questions, they took lots of notes, and they were actually learning along the way. But the thing with alchemists is that they kept what they knew a secret from each other for 500 years. Because of that secrecy, they didn&#39;t advance the state of the art very much. And because of that, every single one of them had to learn in the hardest possible way that drinking mercury is a bad idea.</p>

<p>I think this really hits at the core of the difference between science and alchemy because science is a fundamentally iterative process where we are always building on knowledge shared by other people and what came before. So in a way, for us to be responsible scientists, we have to continue to share what we have learned with other people to build upon our successes and failures. So I think to do good science is to do open science, and I think that&#39;s what open science is really about.</p>

<p>Another way to think about this is what I call intellectual humility, because I&#39;ve been an academic researcher for about 15 years now. Reflecting on these years of research, I realized that whatever little bit I&#39;ve added to our collective body of knowledge, I was able to add because of everything that I&#39;ve learned from the people who came before me. So as researchers, we really didn&#39;t get here on our own. Our work is built on top of what everyone else has shared with us.</p>

<p>And it is with all of this in mind that I think open science really comes with four fundamental freedoms, where for any piece of knowledge, it should come with the freedoms for anyone to use it, study it, modify it, and continue to share it with other people to continue that iterative cycle. So this is how I like to think of open science. And that&#39;s the first thing I wanted to cover today.</p>

<p>The next thing I wanted to quickly establish is that for this science to happen, we&#39;re making use of so much shared technical infrastructure today. I remember many years ago I was at a hackathon with Arfon Smith from GitHub. He was the person who gave me the lifetime Pro subscription to GitHub that I&#39;m still getting dividends from to this day. It is a platform that comes with amazing features.</p>

<p>But at the same time, I also remember how a couple of years ago there was a big GitHub outage for a couple of hours. It is when things like this happen that we realize how reliant we have become on the software and hardware infrastructure in our lives. Because when they break, we hurt, and that&#39;s when we realize our reliance on these things.</p>

<p>And it&#39;s important to think about this because it reminds us to reflect on who gets to have a say in how this infrastructure works and how that infrastructure can work for us as researchers, and how we live out our lives. So this invisible infrastructure is really important. And this kind of centralization that&#39;s happening, I think, is a challenge that open source technologies can tackle.</p>

<p>So I&#39;ve been thinking about a lot of the motivations for open source, including a lot of the reasons that people have talked about today. And I&#39;d like to go over some examples. I want to talk about hardware, but I will start with a software example that I think is amazing, which is...</p>

<p>...the QGreenland project. I thought this project was so cool because it started out as a bunch of academic scientists who share a common theme: they all study Greenland. They could be meteorologists, geologists, and a lot of other kinds of scientists. And they developed this common software platform for analyzing geospatial data about Greenland.</p>

<p>And they built it on top of an open source software package called QGIS. It is a geographical information system, which lets them pull all of the geospatial data about Greenland into one place. They have a whole suite of tools built on top of QGIS to analyze that data, and the whole stack is called QGreenland. What happened was that this project became successful, and last year, in 2023, they wanted to run a training workshop for other researchers to learn how to use QGreenland for their scientific research.</p>

<p>But one problem they encountered was that if they have 20 scientists in the room coming to this workshop, all with their own laptops and their different operating systems and configurations, it takes so much time just to get everyone on the same page: to install QGIS, get it running, and then put QGreenland on top of it. That takes so much time away from the actual training they wanted to do.</p>

<p>So they thought, okay, can we reduce this friction a little bit?</p>

<p>And the solution they came up with was to start with JupyterHub, which is a server-hosted version of the Python-based Jupyter computational notebook that a lot of data scientists use.</p>

<p>But they were able to make some additions to Jupyter and tweak it so that instead of just running Python, they&#39;re running an entire Linux desktop environment on top of JupyterHub.</p>

<p>And with that, they can then install QGIS into that Linux environment, and then they put the whole QGreenland geospatial data platform on top of that.</p>

<p>And once they put all of this together into one package, they serve it from their server so that the participants in the workshop, they can just open up their web browsers, go to a particular URL, and the whole package runs as a web page inside their browser. And this saves so much time in the workshop because they don&#39;t need to set QGIS up on every individual computer.</p>
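<p>As an aside, a setup like this can be sketched roughly as follows. This is my own illustration of the general approach, not the QGreenland team&#39;s actual configuration, and the package names (including jupyter-remote-desktop-proxy) are assumptions on my part:</p>

<pre><code># Rough provisioning sketch for a JupyterHub user image that serves an
# entire Linux desktop, with QGIS installed, as a page in the web browser.
# Package names are assumptions, not the actual QGreenland configuration.

# A lightweight desktop environment plus a VNC server inside the image
apt-get update
apt-get install -y xfce4 tigervnc-standalone-server

# QGIS itself, onto which the QGreenland data package is loaded
apt-get install -y qgis

# A proxy that exposes the desktop session as a tab in Jupyter, so that
# workshop participants only need a web browser and a URL
pip install jupyter-remote-desktop-proxy
</code></pre>

<p>The point of this design is that the setup cost is paid once, on the server, instead of once per laptop in the room.</p>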

<p>Now, the reason I love this example is that all of these components, they are open source to begin with, and they demonstrate the FAIR principles of open science. Now, I think a lot of you know what FAIR stands for, but just so we&#39;re on the same page, FAIR stands for...</p>

<p>...Findable, Accessible, Interoperable, and Reusable. This is a big thing in open science, and I think QGreenland demonstrates all of it. Because it is open source and published online, it&#39;s easy for people to find. The way they set it up is really accessible. It&#39;s interoperable because the components are open source, and they were able to tweak the components to interact with each other. And, of course, it&#39;s reusable because other scientists can adapt it to different research contexts. I think this is a demonstration of how the FAIR principles that are so important to open science are enabled by open source technologies.</p>

<p>Okay, so this was a software example, but if you look at the UNESCO recommendation on open science, it talks about several main pillars of open science, including the usual suspects like open access publications, open data, open educational resources (I think this one is really important!), and of course, open source software.</p>

<p>In addition to that, the recommendation emphasizes that hardware is a really important part of open science as well. So I&#39;d like to focus a bit on the open source hardware side of things.</p>

<p>And if you really think about it, hardware underpins so much of scientific research. It was literally hardware that took people to the moon. That&#39;s how much we rely on hardware to do science.</p>

<p>It can be huge pieces of equipment like the Large Hadron Collider,</p>

<p>Or it can be something seemingly simple, but equally integral to the research infrastructure, like microscopes that we use in so many labs today.</p>

<p>Now, the thing with hardware is that it&#39;s very often closed source, like a lot of software.</p>

<p>And some of the challenges with that are that it&#39;s not reproducible in a scientific way. There&#39;s vendor lock-in, which was mentioned before. There&#39;s forced obsolescence, and there are very high costs. The cost is not only that of a very expensive piece of equipment; there are also very high switching costs, where if there&#39;s another piece of equipment you would rather use, the lack of open source and interoperability makes it very difficult for you to switch to a different platform.</p>

<p>And this causes a lot of global inequalities in research. I personally know scientists in some Global South countries who really want to have a particular instrument in their lab, but the one manufacturer that makes it simply does not sell it in their country.</p>

<p>And even if they somehow get access to buy it, the cost is so high that they cannot afford it.</p>

<p>And if they somehow scrape together the money to be able to afford it, once they have it, they won&#39;t be able to get any support for it, and they cannot maintain it themselves.</p>

<p>And it just becomes prohibitively difficult for a lot of researchers in different places around the world.</p>

<p>So I think when it comes to the social impact of our technologies, it&#39;s really important to be mindful of a lot of the global inequalities that come with the technologies of today.</p>

<p>So in contrast to that, open source hardware is defined as hardware whose design is available so that anyone, again, can study, modify, distribute, make, and sell hardware based on that design. And there are a lot of examples, actually, in scientific research.</p>

<p>An amazing one that I know about is the Open Source Imaging Initiative. This is a consortium of universities across Europe, along with some companies, I believe, who came together to create a completely open source MRI machine for medical scanning and diagnosis.</p>

<p>And if you know anything about MRI machines, you know how complicated and intricate they are. And they&#39;re actually creating an open source version of it that&#39;s becoming successful!</p>

<p>Open source hardware has been to space. Researchers in the U.S. have developed OreSat, an open source CubeSat that became a common platform for scientists across the U.S. to build on top of for remote sensing applications.</p>

<p>It&#39;s been launched several times already, and I think they have more launches scheduled.</p>

<p>But the example that I&#39;d really love to talk about is the OpenFlexure microscope. So this is a lab-grade microscope, originally developed by researchers at the University of Bath in the U.K. (I think their team is based in Glasgow now). The point is it&#39;s completely open source and modular, and you can 3D print most of the microscope yourself.</p>

<p>It comes with a lot of features, starting with the basic ones like bright field imaging, or fluorescence imaging. But because it is fully open source, there was a separate research team in a different part of the world that looked at the designs, and they actually enhanced it and improved it to greatly increase the resolution for fluorescence imaging.</p>

<p>And this is something that people weren&#39;t able to do with the closed-source microscopes that they used before.</p>

<p>These are just a couple of features, but what&#39;s also really cool is that this open-source microscope, if you want to build it yourself, the cost of doing so is only about 200 US dollars.</p>

<p>Now, for those of you who have used and bought microscopes for use in the lab before, you will know that these microscopes often cost an order of magnitude more than OpenFlexure for doing the same thing, and I think that&#39;s absolutely remarkable.</p>

<p>And because of its low cost and because it&#39;s open source, again, as an example, researchers in several sub-Saharan countries were able to take the OpenFlexure design to locally produce and maintain that microscope for malaria diagnosis, when they weren&#39;t able to do that before.</p>

<p>And in addition to this, it has actually prompted the formation of some small businesses in those countries to locally produce and sell these microscopes, and it&#39;s again becoming a new business model that&#39;s enabled by open source technology.</p>

<p>Okay, so to build on some of the points made earlier, Joshua Pearce is a researcher in this area, and he calculated that open source technologies, including hardware, can provide economic savings of up to 87% compared to functionally equivalent proprietary tools.</p>

<p>And again, my other point is that in addition to the savings, it creates new kinds of businesses.</p>

<p>So I have a bit of a background in molecular biology, and I&#39;ve used PCR machines a lot. There&#39;s a company that sells these Ninja PCR machines for US$500. Again, if you have bought PCR machines for labs before, you&#39;ll know that they typically cost an order of magnitude more. So it&#39;s amazing how open source not only lowers costs, but creates new kinds of businesses as well.</p>

<p>Okay, so I talked about some of the benefits of open source technology just now. And to build on Clare&#39;s point earlier, I think we&#39;re faced with so many global challenges today, whether that&#39;s climate change or pandemics or other problems. And they&#39;re so big and urgent that I think open source technology is what enables the inclusive and rapid innovation needed to address these really urgent issues.</p>

<p>And to bring it back to my earlier point, I truly believe that we simply don&#39;t have time to be alchemists anymore. We cannot afford to be alchemists. And I think this is a huge motivator for why open source is so important and critical to open research.</p>

<p>Now, with all of that said, here actually comes what might be the most provocative part of my talk today. So, you know, again, we&#39;ve seen so many motivations for open source, like the collaboration that happens, faster innovation that&#39;s so critical to solve problems of today, the lower costs and business opportunities, and so many other benefits, right?</p>

<p>But I feel they are just the tip of the iceberg in terms of why open source is so important. There are some underlying values that I think really add a lot to the value proposition of open source.</p>

<p>In my view, that could be things like the autonomy and agency that we can have over the technology that we use and the freedom to use it for our purposes. And I think these are the things that also underpin why open source is so important.</p>

<p>Dr. Julieta Arancio is a researcher of open source technologies, and I think she characterizes it really well: technology really affects the way we think about research questions.</p>

<p>And when a piece of technology, the tool that we use, is closed source, it means that rather than the tool being an enabler of our creativity, we end up doing only what the available tech lets us do.</p>

<p>Because the people behind that technology get to dictate what you can do with it. What that means, in this context, is that closed source technology also implies a certain kind of epistemic power in terms of what knowledge we are allowed to have and what we can use that knowledge for.</p>

<p>And the risks with closed source technology and the challenges with it is that, depending on how you wield that epistemic power, unfortunately, sometimes it leads to a kind of intellectual poverty. Because only certain people get to have certain pieces of knowledge and not other people. Some people get to make use of that knowledge in certain ways, while other people don&#39;t get to do that.</p>

<p>So I think intellectual poverty is an unfortunate side effect that sometimes comes from closed source technologies. And this is where the value proposition of open source technology really comes in.</p>

<p>Open source is not only convenient and amazing in terms of the collaboration and innovation that happens; there is also an ethical underpinning to it that makes it even more attractive and adds to the value that we already have.</p>

<p>And this connects with open research, because open research is not only about publishing open outputs, whether that&#39;s open access papers, open data, and open source software or hardware designs, it is also about how we hold that epistemic power together in a more equitable way.</p>

<p>Think of those scientists I told you about in the Global South who couldn&#39;t do what they wanted to do in their research. I think this is, in addition to all of the benefits that we talked about, a very important value proposition.</p>

<p>If you think some of these conversations are interesting, I&#39;d like to share with you some of the communities in which these conversations are happening.</p>

<p>I&#39;d like to start on the hardware side of things. Over the past few years, I have been very lucky to be part of a group called the Gathering for Open Science Hardware, also known as GOSH.</p>

<p>And this is a network of researchers, hundreds of researchers from across the world, literally from every continent, except maybe Antarctica, who come together to think about the important role of open source hardware in scientific research.</p>

<p>And we have done a lot of interesting work. For example, we created a policy toolkit for UNESCO on the role of open source hardware in scientific research, published at the end of last year, that provides a lot of policy guidance at the national level for research and innovation policy.</p>

<p>I mentioned Julieta just now. She is the author of an amazing report called Supporting Open Science Hardware in Academia. This report is geared towards scientific research funders and technology transfer offices in universities, providing guidance on how universities can enact policies to support people working on open source technologies in research, and also on ways to spin off that development into successful business models around open source. So I think this is a remarkable report that I highly recommend you check out.</p>

<p>So that&#39;s the hardware side of things, but if you go more broadly than that, I think this is where the Turing Way comes in. So let me tell you a little bit about this community.</p>

<p>It started back in 2019, initially as an online book, made with Jupyter, by the way, about data science and best practices around how to do data science in an open and reproducible way.</p>

<p>Now it started off as this book, but the founders of the Turing Way thought: “We are not the only experts here, so can we invite other people to help us co-create this book together?”</p>

<p>And they started a distributed collaboration process that eventually turned into scientists and researchers from around the world contributing to this book, not only in terms of data science, but other aspects of open science and open source as well.</p>

<p>And it&#39;s grown into a huge book, hundreds of pages long, and because of how the book brought together scientists from different backgrounds, it&#39;s grown into a very vibrant community where a lot of conversations are happening around what open research means, what open science means, and what open source means for this work.</p>

<p>So we talk about things like what I presented in my talk today. There&#39;s also talk about diverse roles in research, such as the important role of research software engineers in scientific research that&#39;s not recognized enough, or things like localization.</p>

<p>So many things about open science and open source are in English right now, but can we translate them into different languages, and what does that mean for people from different linguistic and social backgrounds?</p>

<p>With the hope that eventually we can galvanize a cultural shift in terms of how we think about technology and how we think about research so that, again, we can hold this power together in a more equitable way and think of new opportunities for research and innovation.</p>

<p>So I think it&#39;s remarkable how over the past five years there have been more than 450 contributors to the Turing Way, not just in terms of pull requests to the repository and additions to the book, but also in all of the richness that&#39;s been brought into the conversations held together by this community.</p>

<p>It&#39;s a really amazing place, and I highly recommend you check out the Turing Way if you&#39;re interested in having these discussions and connecting to other researchers in this space.</p>

<p>And I think this really shows how open research and open source can connect to each other in a mutually beneficial way.</p>

<p>Okay, so I&#39;m just about running out of time, but before I end, of course I&#39;d like to thank all of the people who have helped me so much to come here today to share some of these reflections with you.</p>

<p>First of all, of course, there&#39;s the Turing way community. Malvika Sharan is one of the co-founders of the Turing Way. We had a lot of interesting discussions on how to make this a more provocative talk.</p>

<p>I&#39;d like to thank Bri, our community coordinator in GOSH, for contributing a lot of the thoughts on open source hardware, and of course the organizers, including Michael and Clare, for having me here today to share some of these reflections with you, and all of you for putting up with the last 27 minutes and 50 seconds of my talk.</p>

<p>The meta-commentary is that this talk is open source and it&#39;s published on Zenodo.org with this DOI, and I encourage you to check it out, fork it, turn it into what you like, and visit the Turing Way and GOSH communities where we can continue to have these conversations.</p>

<p><a href="https://naclscrg.writeas.com/tag:talks" class="hashtag"><span>#</span><span class="p-category">talks</span></a> <a href="https://naclscrg.writeas.com/tag:opensource" class="hashtag"><span>#</span><span class="p-category">opensource</span></a> <a href="https://naclscrg.writeas.com/tag:openresearch" class="hashtag"><span>#</span><span class="p-category">openresearch</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/critical-role-of-open-source-in-open-research</guid>
      <pubDate>Mon, 08 Apr 2024 17:37:42 +0000</pubDate>
    </item>
    <item>
      <title>Talk - Representing epistemic and disciplinary diversity in open research</title>
      <link>https://naclscrg.writeas.com/epistemic-and-disciplinary-diversity?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[On 10 February 2024, I gave a lightning talk at FOSDEM 2024&#39;s Open Research Online Devroom titled &#34;Representing epistemological and disciplinary diversity in open research discourse&#34; (slides and video shared here). I later gave a tweaked version of this talk to introduce a UK Reproducibility Network online workshop on this topic on 31 March 2025.&#xA;!--more--&#xA;It was informed by incredible feedback I received from various open research communities. There&#39;s so much good stuff I couldn&#39;t fit them into a 10-minute lightning talk, so I&#39;m putting them here.&#xA;&#xA;I&#39;m posting: &#xA;&#xA;a video recording of the latest version; &#xA;a transcript of the talk; &#xA;additional discussions that didn&#39;t fit; and&#xA;other resources/further reading that couldn&#39;t fit (scroll down to see!).&#xA;&#xA;I&#39;ll try to clean up this post with more context and details on a best-effort basis.&#xA;&#xA;Video&#xA;&#xA;The version of this talk given on 31 March 2025 to introduce an online workshop for the UK Reproducibility Network train-the-trainer community. It was recorded which you can watch here: &#xA;&#xA;iframe src=&#34;https://archive.org/embed/baking-in-disciplinary-and-epistemic-diversity-2025-03-31&#34; width=&#34;640&#34; height=&#34;480&#34; frameborder=&#34;0&#34; webkitallowfullscreen=&#34;true&#34; mozallowfullscreen=&#34;true&#34; allowfullscreen/iframe&#xA;&#xA;Or download the slides and recording directly from Zenodo: &#xA;&#xA;https://doi.org/10.5281/zenodo.10643245&#xA;&#xA;Transcript&#xA;&#xA;My very sciency background started in ecology and environmental research 15-ish years ago, including evaluating impacts from a major marine oil spill to wildlife. During those years, I heard about a fellow PhD student getting their thesis criticised by a committee member (i.e. 
examiner) because their work is not &#34;reproducible&#34; and not structured with explicit &#34;hypotheses&#34; and tests of those hypotheses. &#xA;&#xA;I remember myself wondering why should scientific research be defined by reproducible experiments? The oil spill I studied is fundamentally not reproducible. And even if it were, it&#39;s probably not ethical to reproduce it! I also didn&#39;t conduct any experiments. Does that mean I am doing bad science?&#xA;&#xA;In the years since, I&#39;ve become an advocate for open science as a way to do good science, where I think what makes good science different from alchemy is best summed up by Cory Doctorow, who said: “Alchemists kept what they knew a secret for 500 years. They didn’t advance the art very much… and each of them learned in the hardest way possible that drinking mercury is a bad idea.”&#xA;&#xA;I also learned that the Latin origins of the word science comes from Latin &#34;scientia&#34;, i.e. “knowledge”.&#xA;&#xA;This prompted me to reflect more deeply on &#34;open research&#34; instead of just &#34;open science&#34;; and how open research could encompass diverse ways of learning and organising what we know.&#xA;&#xA;For the purposes of my lightning talk, this is what I mean by epistemic diversity. My fear is that conversations about open research is not representative of epistemic diversity.&#xA;&#xA;For example, we encourage people to share open data, but what does that mean to, say, a scholar of medieval literature? &#xA;&#xA;There is much focus on &#34;reproducibility&#34;, but would a law professor care about this?&#xA;&#xA;We have made progress on publishing open access papers, but does that mean anything to a musicologist?&#xA;&#xA;I think it is possible for us to shoehorn these concepts into these disciplines (e.g. perhaps medieval books or sheet music are the &#34;data&#34;, and the law professor should document their reasoning and arguments in a &#34;reproducible&#34; way), but why should we? 
What if these researchers get to define research - and open research - in their terms instead? What might open research look like then?&#xA;&#xA;Another way to look at this is to ask: who gets to decide how to conceptualise and talk about research? Who holds the epistemic power to define the terms of the conversation? In my view this is not an abstract problem. &#xA;&#xA;Last year, I had a long conversation with a new history professor who has an interest in open research. And he told me about how groups about open research are often dominated by researchers from very STEM-focused subjects like the life sciences. And he doesn’t see any researcher who looks like him. He told me that there is resentment and a sense of exclusion among other historians.&#xA;&#xA;In other words, despite good intentions, the epistemic power of the loudest voices in open research discourse might have accidentally caused epistemic injustice which systemically excluded some researchers.&#xA;&#xA;People often ask &#34;what can I do as an individual&#34;?&#xA;&#xA;A core part of open research is diversity and inclusion and over the years I&#39;ve learned much (and have much more to learn!) on how words matter when it comes to dimensions like race, gender, or accessibility. I suggest applying that same sensitivity to epistemic diversity.&#xA;&#xA;For example, the word &#34;manuscript&#34; could mean a paper submitted to a scientific journal, or a physical piece of ancient paper that a historian studies. I hear the terms &#34;lab&#34; and &#34;PI&#34; a lot in open research communities, but there are research disciplines whose social structures are very different and don&#39;t use these terms at all. Or, sometimes I hear talk of &#34;STEM&#34; vs &#34;non-STEM&#34; research, but that is itself a very STEM-centric view.&#xA;&#xA;And, of course, the conflation of &#34;science&#34; and &#34;research&#34; as if they&#39;re the same thing. 
If we can be mindful of epistemic diversity when talking about open research, then maybe we can start to avoid excluding people from the conversation. &#xA;&#xA;Finally, as someone from a very sciency background, I am in a privileged position. My imagination is constrained and it&#39;s not my place to authoritatively declare what we should do. That&#39;s why I&#39;m hesitant to prescribe &#34;say x instead of y&#34;.&#xA;&#xA;Instead, there may be value in elevating under-represented groups through gatherings like focus groups or workshops, where we can learn from epistemologically diverse researchers directly on how to make open research more inclusive.&#xA;&#xA;Over the past weeks, I received amazing suggestions on where this topic could go, and my lightning talk is just the tip of that iceberg.&#xA;&#xA;In the meantime, I’d like to give thanks to them, including: The Turing Way community, Framework for Open and Reproducible Research Training, Nowhere Lab, Gathering for Open Science Hardware, and the organisers of this FOSDEM Open Research Devroom!&#xA;&#xA;Additional discussions&#xA;&#xA;In no particular order (don&#39;t have time to organise ATM), here are some other ideas which came up in the Turing Way, FORRT, Nowhere Lab, or GOSH communities (acknowledgements at the end of this post): &#xA;&#xA;Start with &#39;the idea of open scholarship and then narrow to &#34;open science&#34; or &#34;open research&#34; if needed, depending on who I&#39;m talking to&#39;.&#xA;&#34;...it’s not just an issue in the open science/research movements, but interdisciplinary fields in general and any inter-/multi-disciplinary attempts to change research practices need to adopt epistemological flexibility &amp; tolerance towards other&#34;&#xA;There is interest from the FORRT community to write something about this.&#xA;Also possible for me to re-present this lightning talk at an upcoming Nowhere Lab meeting, maybe in March 2024.&#xA;Classification can be confusing for some 
researchers, especially those who don&#39;t fit in typical &#34;STEM&#34; boxes.&#xA;Sabina Leonelli&#39;s work on open science and philosophy of science, linked to below.&#xA;Harding’s sciences from below and Fricker’s epistemic injustices are also useful reading (see below).&#xA;There is a Digital Humanities group for the UK and Ireland: https://digitalhumanities-uk-ie.org&#xA;    Including research software engineering: https://digitalhumanities-uk-ie.org/community-interest-groups/research-software-engineering/&#xA;There are different levels of abstraction when talking about epistemic diversity, which is a separate deep dive on its own.&#xA;&#39;Beware the &#34;pageant effect&#34;: you&#39;re likely to learn the amazing successes of an unfamiliar discipline before learning about its flaws and failures.&#39;&#xA;Connections to science, technology, and society (STS) studies, link to epistemic power and (in)justice.&#xA;We had a conversation about the good intentions, usefulness, but also limitations of the CRediT taxonomy for research contributor roles (readings below), though there is further work on improving it, such as SCoRO.&#xA;&#xA;Suggested readings&#xA;&#xA;(I copied these citations directly from their respective websites, so the citation styles vary, sorry!)&#xA;&#xA;Peter Branney, Kate Reid, Nollaig Frost, Susan Coan, Amy Mathieson &amp; Maxine Woolhouse (2019) A context-consent meta-framework for designing open (qualitative) data studies, Qualitative Research in Psychology, 16:3, 483-502, DOI: 10.1080/14780887.2019.1605477&#xA;&#xA;Knorr Cetina, Karin. Epistemic Cultures: How the Sciences Make Knowledge, Cambridge, MA and London, England: Harvard University Press, 1999. https://doi.org/10.4159/9780674039681&#xA;&#xA;Farran, E. K., Silverstein, P., Ameen, A. A., Misheva, I., &amp; Gilmore, C. (2020). Open Research: Examples of good practice, and resources across disciplines. 
https://doi.org/10.31219/osf.io/3r8hb&#xA;&#xA;Fricker, Miranda, Epistemic Injustice: Power and the Ethics of Knowing (Oxford, 2007; online edn, Oxford Academic, 1 Sept. 2007), https://doi.org/10.1093/acprof:oso/9780198237907.001.0001&#xA;&#xA;Harding, S. (2008) Sciences from Below - Feminisms, Postcolonialities, and Modernities. Duke University Press. https://www.dukeupress.edu/sciences-from-below/&#xA;&#xA;Hartmann, H., Darda, K. M., PhD, Meletaki, V., Ilchovska, Z., Corral-Frías, N. S., Hofer, G., … Sauvé, S. A. (2023, September 11). Incorporating feminist practices into  (psychological) science - the why, the what and the how. https://doi.org/10.31219/osf.io/2rcuz&#xA;&#xA;Jasanoff, S. (Ed.). (2004). States of Knowledge: The Co-Production of Science and the Social Order (1st ed.). Routledge. https://doi.org/10.4324/9780203413845&#xA;&#xA;Latour, B., &amp; Woolgar, S. (1986). Laboratory Life: The Construction of Scientific Facts. Princeton University Press. https://doi.org/10.2307/j.ctt32bbxc&#xA;&#xA;Bruno Latour. (1988) Science in Action, How to Follow Scientists and Engineers through Society. Harvard University Press. https://www.hup.harvard.edu/books/9780674792913&#xA;&#xA;Leonelli, S. (2022). Open Science and Epistemic Diversity: Friends or Foes? Philosophy of Science, 89(5), 991–1001. doi:10.1017/psa.2022.45&#xA;&#xA;Plomp, Esther. 2023. “Valuing a Broad Range of Research Contributions through Team Infrastructure Roles: Why CRediT Is Not Enough.” Commonplace, December. https://doi.org/10.21428/6ffd8432.f92deec7&#xA;&#xA;Pownall, M., Talbot, C. V., Henschel, A., Lautarescu, A., Lloyd, K. E., Hartmann, H., Darda, K. M., Tang, K. T. Y., Carmichael-Murphy, P., &amp; Siegel, J. A. (2021). Navigating Open Science as Early Career Feminist Researchers. Psychology of Women Quarterly, 45(4), 526-539. https://doi.org/10.1177/03616843211029255&#xA;&#xA;Reddy, G., &amp; Amer, A. (2023). 
Precarious engagements and the politics of knowledge production: Listening to calls for reorienting hegemonic social psychology. British Journal of Social Psychology, 62(Suppl. 1), 71–94. https://doi.org/10.1111/bjso.12609&#xA;&#xA;Sichani, A.-M., Ahnert, R., Baker, J., Beavan, D., Ciula, A., Crouch, S., De Roure, D., Francois, P., Hetherington, J., Jeffries, N., McGillivray, B., Ridge, M., Terras, M., Tupman, C., Turner, M., Weinzierl, M., Willcox, P., Winters, J., Wynne, M., &amp; Smithies, J. (2023). iDAH Research Software Engineering (RSE) Steering Group Working Paper (v.1.0). Zenodo. https://doi.org/10.5281/zenodo.8177926&#xA;&#xA;Steltenpohl, C. N., Lustick, H., Meyer, M. S., Lee, L. E., Stegenga, S. M., Reyes, L. S., &amp; Renbarger, R. L. (2023). Rethinking Transparency and Rigor from a Qualitative Open Science Perspective. Journal of Trial &amp; Error. https://doi.org/10.36850/mr7&#xA;&#xA;More resources&#xA;&#xA;The UK Reproducibility Network has done good work on this, such as: &#xA;&#xA;Event by the UK Reproducibility Network: How relevant is the open research and scholarship agenda to the arts, humanities and social science disciplines? 
(warning: YouTube link) https://www.youtube.com/watch?v=L6TEyElbTqE&#xA;Preprint titled &#34;Open Research: Examples of good practice, and resources across disciplines&#34;: https://doi.org/10.31219/osf.io/3r8hb &#xA;Working paper 6: https://doi.org/10.31219/osf.io/chyd4 &#xA;Working paper 7: https://doi.org/10.31219/osf.io/c78qu &#xA;&#xA;👇 And there&#39;s more: &#xA;&#xA;Humanities Commons: https://hcommons.org/&#xA;&#xA;Replicable History Project: https://ljmu.libcal.com/event/4130747&#xA;&#xA;Works by Karen Barad: https://en.wikipedia.org/wiki/Karen_Barad&#xA;&#xA;Joint meeting of the European Association for the Study of Science and Technology (EASST) and the Society for Social Studies of Science (4S): https://www.easst4s2024.net/&#xA;&#xA;FOSDEM talk: FLOSS meets Social Science Research (and lived to tell the tale): https://archive.fosdem.org/2021/schedule/event/open_research_floss_meet_social_science/&#xA;&#xA;SCoRO, the Scholarly Contributions and Roles Ontology: http://purl.org/spar/scoro&#xA;&#xA;Research Software Engineering in the Arts and Humanities: https://digitalhumanities-uk-ie.org/community-interest-groups/research-software-engineering/&#xA;&#xA;Acknowledgements&#xA;&#xA;Turing Way&#xA;&#xA;Anne Lee Steele, Bastian Greshake Tzovaras, Esther Plomp, Jason Hills, Julien Colomb, Liz Hare, Malvika Sharan, Marion Weinzierl, Maya Anderson-González, Richard J. 
Acton, Sarada Mahesh, Shern Tee&#xA;&#xA;Framework for Open and Reproducible Research Training (FORRT)&#xA;&#xA;Crystal Steltenpohl, Flavio Azevedo, Katja Rogers&#xA;&#xA;Nowhere Lab&#xA;&#xA;Gavin Taylor, Priya Silverstein&#xA;&#xA;Gathering for Open Science Hardware (GOSH)&#xA;&#xA;Brianna Johns (original co-author of this talk!), Laura Olalde&#xA;&#xA;UK Reproducibility Network&#xA;&#xA;Steve Boneham, Joe Cornelli, Natasha Mauthner, Stefana Juncu&#xA;&#xA;#talks #epistemicdiversity #openresearch&#xA;&#xA;----------&#xA;&#xA;Unless otherwise stated, all original content in this post is shared under the Creative Commons Attribution-ShareAlike 4.0 International license: https://creativecommons.org/licenses/by-sa/4.0/ ]]&gt;</description>
      <content:encoded><![CDATA[<p>On 10 February 2024, I gave a lightning talk at FOSDEM 2024&#39;s <a href="https://research-fosdem.github.io/2024-online-schedule.html">Open Research Online Devroom</a> titled “<strong>Representing epistemological and disciplinary diversity in open research discourse</strong>” (slides and video shared <a href="https://doi.org/10.5281/zenodo.10643245">here</a>). I later gave a tweaked version of this talk to introduce a UK Reproducibility Network online workshop on this topic on 31 March 2025.

It was informed by incredible feedback I received from various open research communities. There&#39;s so much good stuff that I couldn&#39;t fit it all into a 10-minute lightning talk, so I&#39;m putting it here.</p>

<p>I&#39;m posting:</p>
<ul><li>a <strong>video recording</strong> of the latest version;</li>
<li>a <strong>transcript</strong> of the talk;</li>
<li><strong>additional discussions</strong> that didn&#39;t fit; and</li>
<li><strong>other resources/further reading</strong> that couldn&#39;t fit (<em>scroll down to see</em>!).</li></ul>

<p>I&#39;ll try to clean up this post with more context and details on a best-effort basis.</p>

<h2 id="video">Video</h2>

<p>This is the version of the talk given on 31 March 2025 to introduce an online workshop for the UK Reproducibility Network train-the-trainer community. It was recorded, and you can watch it here:</p>

<iframe src="https://archive.org/embed/baking-in-disciplinary-and-epistemic-diversity-2025-03-31" width="640" height="480" frameborder="0" allowfullscreen=""></iframe>

<p>Or download the slides and recording directly from Zenodo:</p>

<p><a href="https://doi.org/10.5281/zenodo.10643245">https://doi.org/10.5281/zenodo.10643245</a></p>

<h2 id="transcript">Transcript</h2>

<p>My very sciency background started in ecology and environmental research 15-ish years ago, including evaluating the impacts of a major marine oil spill on wildlife. During those years, I heard about a fellow PhD student getting their thesis criticised by a committee member (i.e. examiner) because their work was not “reproducible” and not structured with explicit “hypotheses” and tests of those hypotheses.</p>

<p>I remember wondering: why should scientific research be defined by reproducible experiments? The oil spill I studied is fundamentally not reproducible. And even if it were, it&#39;s probably not ethical to reproduce it! I also didn&#39;t conduct any experiments. Did that mean I was doing bad science?</p>

<p>In the years since, I&#39;ve become an advocate for open science as a way to do good science, where I think what makes good science different from alchemy is best summed up by Cory Doctorow, who said: “Alchemists kept what they knew a secret for 500 years. They didn’t advance the art very much… and each of them learned in the hardest way possible that drinking mercury is a bad idea.”</p>

<p>I also learned that the word science comes from the Latin “scientia”, i.e. “knowledge”.</p>

<p>This prompted me to reflect more deeply on “open research” instead of just “open science”; and how open research could encompass diverse ways of learning and organising what we know.</p>

<p>For the purposes of my lightning talk, this is what I mean by epistemic diversity. <strong>My fear is that conversations about open research are not representative of epistemic diversity</strong>.</p>

<p><em>For example, we encourage people to share open data, but what does that mean to, say, a scholar of medieval literature</em>?</p>

<p><em>There is much focus on “reproducibility”, but would a law professor care about this</em>?</p>

<p><em>We have made progress on publishing open access papers, but does that mean anything to a musicologist</em>?</p>

<p>I think it is possible for us to shoehorn these concepts into these disciplines (e.g. perhaps medieval books or sheet music <em>are</em> the “data”, and the law professor should document their reasoning and arguments in a “reproducible” way), but why should we? What if these researchers get to define research – and open research – <em>in their terms</em> instead? What might open research look like then?</p>

<p>Another way to look at this is to ask: who gets to decide how to conceptualise and talk about research? Who holds the epistemic power to define the terms of the conversation? In my view this is not an abstract problem.</p>

<p>Last year, I had a long conversation with a new history professor who has an interest in open research. And he told me about how open research groups are often dominated by researchers from very STEM-focused subjects like the life sciences. And he doesn’t see any researcher who <em>looks like him</em>. He told me that there is resentment and a sense of exclusion among other historians.</p>

<p>In other words, despite good intentions, the epistemic power of the loudest voices in open research discourse might have accidentally caused epistemic injustice which systemically excluded some researchers.</p>

<p>People often ask “what can I do as an individual”?</p>

<p>A core part of open research is diversity and inclusion, and over the years I&#39;ve learned much (and have much more to learn!) about how words matter when it comes to dimensions like race, gender, or accessibility. I suggest applying that same sensitivity to epistemic diversity.</p>

<p>For example, the word “manuscript” could mean a paper submitted to a scientific journal, or a physical piece of ancient paper that a historian studies. I hear the terms “lab” and “PI” a lot in open research communities, but there are research disciplines whose social structures are very different and don&#39;t use these terms at all. Or, sometimes I hear talk of “STEM” vs “non-STEM” research, but that is itself a very STEM-centric view.</p>

<p>And, of course, the conflation of “science” and “research” as if they&#39;re the same thing. If we can be mindful of epistemic diversity when talking about open research, then maybe we can start to avoid excluding people from the conversation.</p>

<p>Finally, as someone from a very sciency background, I am in a privileged position. My imagination is constrained and it&#39;s not my place to authoritatively declare what we should do. That&#39;s why I&#39;m hesitant to prescribe “say x instead of y”.</p>

<p>Instead, there may be value in elevating under-represented groups through gatherings like focus groups or workshops, where we can learn from epistemologically diverse researchers directly on how to make open research more inclusive.</p>

<p>Over the past weeks, I received amazing suggestions on where this topic could go, and my lightning talk is just the tip of that iceberg.</p>

<p>In the meantime, I’d like to give thanks to them, including: The Turing Way community, Framework for Open and Reproducible Research Training, Nowhere Lab, Gathering for Open Science Hardware, and the organisers of this FOSDEM Open Research Devroom!</p>

<h2 id="additional-discussions">Additional discussions</h2>

<p>In no particular order (don&#39;t have time to organise ATM), here are some other ideas which came up in the Turing Way, FORRT, Nowhere Lab, or GOSH communities (acknowledgements at the end of this post):</p>
<ul><li>Start with &#39;the idea of open scholarship and then narrow to “open science” or “open research” if needed, depending on who I&#39;m talking to&#39;.</li>
<li>”...it’s not just an issue in the open science/research movements, but interdisciplinary fields in general and any inter-/multi-disciplinary attempts to change research practices need to adopt epistemological flexibility &amp; tolerance towards other”</li>
<li>There is interest from the FORRT community to write something about this.</li>
<li>Also possible for me to re-present this lightning talk at an upcoming Nowhere Lab meeting, maybe in March 2024.</li>
<li>Classification can be confusing for some researchers, especially those who don&#39;t fit in typical “STEM” boxes.</li>
<li>Sabina Leonelli&#39;s work on open science and philosophy of science, linked to below.</li>
<li>Harding’s sciences from below and Fricker’s epistemic injustices are also useful reading (see below).</li>
<li>There is a Digital Humanities group for the UK and Ireland: <a href="https://digitalhumanities-uk-ie.org">https://digitalhumanities-uk-ie.org</a>
<ul><li>Including research software engineering: <a href="https://digitalhumanities-uk-ie.org/community-interest-groups/research-software-engineering/">https://digitalhumanities-uk-ie.org/community-interest-groups/research-software-engineering/</a></li></ul></li>
<li>There are different levels of abstraction when talking about epistemic diversity, which is a separate deep dive on its own.</li>
<li>&#39;Beware the “pageant effect”: you&#39;re likely to learn the amazing successes of an unfamiliar discipline before learning about its flaws and failures.&#39;</li>
<li>Connections to science, technology, and society (STS) studies, link to epistemic power and (in)justice.</li>
<li>We had a conversation about the good intentions, usefulness, but also limitations of the <a href="https://credit.niso.org/">CRediT taxonomy</a> for research contributor roles (readings below), though there is further work on improving it, such as SCoRO.</li></ul>

<h2 id="suggested-readings">Suggested readings</h2>

<p>(I copied these citations directly from their respective websites, so the citation styles vary, sorry!)</p>

<p>Peter Branney, Kate Reid, Nollaig Frost, Susan Coan, Amy Mathieson &amp; Maxine Woolhouse (2019) A context-consent meta-framework for designing open (qualitative) data studies, Qualitative Research in Psychology, 16:3, 483-502, DOI: 10.1080/14780887.2019.1605477</p>

<p>Knorr Cetina, Karin. Epistemic Cultures: How the Sciences Make Knowledge, Cambridge, MA and London, England: Harvard University Press, 1999. <a href="https://doi.org/10.4159/9780674039681">https://doi.org/10.4159/9780674039681</a></p>

<p>Farran, E. K., Silverstein, P., Ameen, A. A., Misheva, I., &amp; Gilmore, C. (2020). Open Research: Examples of good practice, and resources across disciplines. <a href="https://doi.org/10.31219/osf.io/3r8hb">https://doi.org/10.31219/osf.io/3r8hb</a></p>

<p>Fricker, Miranda, Epistemic Injustice: Power and the Ethics of Knowing (Oxford, 2007; online edn, Oxford Academic, 1 Sept. 2007), <a href="https://doi.org/10.1093/acprof:oso/9780198237907.001.0001">https://doi.org/10.1093/acprof:oso/9780198237907.001.0001</a></p>

<p>Harding, S. (2008) Sciences from Below – Feminisms, Postcolonialities, and Modernities. Duke University Press. <a href="https://www.dukeupress.edu/sciences-from-below/">https://www.dukeupress.edu/sciences-from-below/</a></p>

<p>Hartmann, H., Darda, K. M., PhD, Meletaki, V., Ilchovska, Z., Corral-Frías, N. S., Hofer, G., … Sauvé, S. A. (2023, September 11). Incorporating feminist practices into  (psychological) science – the why, the what and the how. <a href="https://doi.org/10.31219/osf.io/2rcuz">https://doi.org/10.31219/osf.io/2rcuz</a></p>

<p>Jasanoff, S. (Ed.). (2004). States of Knowledge: The Co-Production of Science and the Social Order (1st ed.). Routledge. <a href="https://doi.org/10.4324/9780203413845">https://doi.org/10.4324/9780203413845</a></p>

<p>Latour, B., &amp; Woolgar, S. (1986). Laboratory Life: The Construction of Scientific Facts. Princeton University Press. <a href="https://doi.org/10.2307/j.ctt32bbxc">https://doi.org/10.2307/j.ctt32bbxc</a></p>

<p>Bruno Latour. (1988) Science in Action, How to Follow Scientists and Engineers through Society. Harvard University Press. <a href="https://www.hup.harvard.edu/books/9780674792913">https://www.hup.harvard.edu/books/9780674792913</a></p>

<p>Leonelli, S. (2022). Open Science and Epistemic Diversity: Friends or Foes? Philosophy of Science, 89(5), 991–1001. doi:10.1017/psa.2022.45</p>

<p>Plomp, Esther. 2023. “Valuing a Broad Range of Research Contributions through Team Infrastructure Roles: Why CRediT Is Not Enough.” Commonplace, December. <a href="https://doi.org/10.21428/6ffd8432.f92deec7">https://doi.org/10.21428/6ffd8432.f92deec7</a></p>

<p>Pownall, M., Talbot, C. V., Henschel, A., Lautarescu, A., Lloyd, K. E., Hartmann, H., Darda, K. M., Tang, K. T. Y., Carmichael-Murphy, P., &amp; Siegel, J. A. (2021). Navigating Open Science as Early Career Feminist Researchers. Psychology of Women Quarterly, 45(4), 526-539. <a href="https://doi.org/10.1177/03616843211029255">https://doi.org/10.1177/03616843211029255</a></p>

<p>Reddy, G., &amp; Amer, A. (2023). Precarious engagements and the politics of knowledge production: Listening to calls for reorienting hegemonic social psychology. British Journal of Social Psychology, 62(Suppl. 1), 71–94. <a href="https://doi.org/10.1111/bjso.12609">https://doi.org/10.1111/bjso.12609</a></p>

<p>Sichani, A.-M., Ahnert, R., Baker, J., Beavan, D., Ciula, A., Crouch, S., De Roure, D., Francois, P., Hetherington, J., Jeffries, N., McGillivray, B., Ridge, M., Terras, M., Tupman, C., Turner, M., Weinzierl, M., Willcox, P., Winters, J., Wynne, M., &amp; Smithies, J. (2023). iDAH Research Software Engineering (RSE) Steering Group Working Paper (v.1.0). Zenodo. <a href="https://doi.org/10.5281/zenodo.8177926">https://doi.org/10.5281/zenodo.8177926</a></p>

<p>Steltenpohl, C. N., Lustick, H., Meyer, M. S., Lee, L. E., Stegenga, S. M., Reyes, L. S., &amp; Renbarger, R. L. (2023). Rethinking Transparency and Rigor from a Qualitative Open Science Perspective. Journal of Trial &amp; Error. <a href="https://doi.org/10.36850/mr7">https://doi.org/10.36850/mr7</a></p>

<h2 id="more-resources">More resources</h2>

<p>The UK Reproducibility Network has done good work on this, such as:</p>
<ul><li>Event by the UK Reproducibility Network: How relevant is the open research and scholarship agenda to the arts, humanities and social science disciplines? (warning: YouTube link) <iframe class="embedly-embed" src="//cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FL6TEyElbTqE%3Ffeature%3Doembed&display_name=YouTube&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DL6TEyElbTqE&image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FL6TEyElbTqE%2Fhqdefault.jpg&key=d932fa08bf1f47efbbe54cb3d746839f&type=text%2Fhtml&schema=youtube" width="640" height="360" scrolling="no" title="YouTube embed" frameborder="0" allow="monetization; autoplay; fullscreen; encrypted-media; picture-in-picture" allowfullscreen="true"></iframe></li>
<li>Preprint titled “Open Research: Examples of good practice, and resources across disciplines”: <a href="https://doi.org/10.31219/osf.io/3r8hb">https://doi.org/10.31219/osf.io/3r8hb</a></li>
<li>Working paper 6: <a href="https://doi.org/10.31219/osf.io/chyd4">https://doi.org/10.31219/osf.io/chyd4</a></li>
<li>Working paper 7: <a href="https://doi.org/10.31219/osf.io/c78qu">https://doi.org/10.31219/osf.io/c78qu</a></li></ul>

<p>👇 And there&#39;s more:</p>

<p>Humanities Commons: <a href="https://hcommons.org/">https://hcommons.org/</a></p>

<p>Replicable History Project: <a href="https://ljmu.libcal.com/event/4130747">https://ljmu.libcal.com/event/4130747</a></p>

<p>Works by Karen Barad: <a href="https://en.wikipedia.org/wiki/Karen_Barad">https://en.wikipedia.org/wiki/Karen_Barad</a></p>

<p>Joint meeting of the European Association for the Study of Science and Technology (EASST) and the Society for Social Studies of Science (4S): <a href="https://www.easst4s2024.net/">https://www.easst4s2024.net/</a></p>

<p>FOSDEM talk: FLOSS meets Social Science Research (and lived to tell the tale): <a href="https://archive.fosdem.org/2021/schedule/event/open_research_floss_meet_social_science/">https://archive.fosdem.org/2021/schedule/event/open_research_floss_meet_social_science/</a></p>

<p>SCoRO, the Scholarly Contributions and Roles Ontology: <a href="http://purl.org/spar/scoro">http://purl.org/spar/scoro</a></p>

<p>Research Software Engineering in the Arts and Humanities: <a href="https://digitalhumanities-uk-ie.org/community-interest-groups/research-software-engineering/">https://digitalhumanities-uk-ie.org/community-interest-groups/research-software-engineering/</a></p>

<h2 id="acknowledgements">Acknowledgements</h2>

<h3 id="turing-way-https-the-turing-way-netlify-app"><a href="https://the-turing-way.netlify.app/">Turing Way</a></h3>

<p>Anne Lee Steele, Bastian Greshake Tzovaras, Esther Plomp, Jason Hills, Julien Colomb, Liz Hare, Malvika Sharan, Marion Weinzierl, Maya Anderson-González, Richard J. Acton, Sarada Mahesh, Shern Tee</p>

<h3 id="framework-for-open-and-reproducible-research-training-forrt-https-forrt-org"><a href="https://forrt.org/">Framework for Open and Reproducible Research Training (FORRT)</a></h3>

<p>Crystal Steltenpohl, Flavio Azevedo, Katja Rogers</p>

<h3 id="nowhere-lab-https-nowherelab-com"><a href="https://nowherelab.com/">Nowhere Lab</a></h3>

<p>Gavin Taylor, Priya Silverstein</p>

<h3 id="gathering-for-open-science-hardware-gosh-https-openhardware-science"><a href="https://openhardware.science/">Gathering for Open Science Hardware (GOSH)</a></h3>

<p>Brianna Johns (original co-author of this talk!), Laura Olalde</p>

<h3 id="uk-reproducibility-network-https-www-ukrn-org"><a href="https://www.ukrn.org">UK Reproducibility Network</a></h3>

<p>Steve Boneham, Joe Cornelli, Natasha Mauthner, Stefana Juncu</p>

<p><a href="https://naclscrg.writeas.com/tag:talks" class="hashtag"><span>#</span><span class="p-category">talks</span></a> <a href="https://naclscrg.writeas.com/tag:epistemicdiversity" class="hashtag"><span>#</span><span class="p-category">epistemicdiversity</span></a> <a href="https://naclscrg.writeas.com/tag:openresearch" class="hashtag"><span>#</span><span class="p-category">openresearch</span></a></p>

<hr/>

<p>Unless otherwise stated, all original content in this post is shared under the <a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">Creative Commons Attribution-ShareAlike 4.0 International</a> license<a href="https://creativecommons.org/licenses/by-sa/4.0/" target="_blank" rel="license noopener noreferrer" style="display:inline-block;"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1" alt=""><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1" alt=""></a></p>
]]></content:encoded>
      <guid>https://naclscrg.writeas.com/epistemic-and-disciplinary-diversity</guid>
      <pubDate>Sat, 10 Feb 2024 09:11:24 +0000</pubDate>
    </item>
  </channel>
</rss>