Training the Next Generation of Scientists to Standardize Human Oversight in AI Interactions

Published 2026-05-07
tags: #understandingLLMs

In the Summer of 2025, I taught a remote upper-level course called AI as a Research Partner. Before this course, many students told me that they felt that AI was too generic to be useful or was cheating/shortcutting work, and while that can certainly be the case, I don't think it has to be.

I thought really hard about how I could design a course to model what I consider a healthy working relationship with AI, and I presented my thoughts in a poster at the Teaching and Learning Center's Teaching Excellence Showcase on May 7th, 2026. Below you'll find the resources from my poster along with some extra commentary and expanded assessment details.

Any extra commentary that was not originally on the poster will appear like this!

Introduction

As Large Language Models (LLMs) become a ubiquitous component of the scientific workflow, the pedagogical challenge shifts from teaching students how to operate AI proficiently to teaching them how to actively partner and co-create with these models. One strategy for building healthy human-AI partnerships is to establish habitual human oversight over these tools. I present a three-stage scaffolding framework implemented in a Summer 2025 distance-learning course for undergraduate biology students (AI as a Research Partner), designed to move learners from passive consumption to active intellectual interaction. These assignments, which build on each other, are designed to model how human ideas and agency can persist while treating AI as a tool for work, as a partner for thinking/writing, and as a catalyst for synthesis and creativity. By centering the "Human-in-the-Loop" (HITL) through these scaffolded interactions, the course moves beyond basic prompt engineering to cultivate a professional identity of director rather than user.

Stage 0: AI proficiency

Most students have used enough AI to understand that it can be smart at times and confidently wrong at others. In my class, before we started working on best practices when using LLMs, I thought it was worth setting the foundation for how LLMs work, where they fall short, and how a human who understands both the subject matter and the model's limitations can get the best out of these tools.

Although it's not 100% necessary to understand the fine details of "pre-training" and the "transformer" to understand hallucination, I think it certainly helps. I didn't want to spend too much time explaining neural networks and matrix multiplication (this is a Biology class after all), but I did try to explain the basics in broad strokes with the videos I recorded for the class. If any of you are looking for a deeper dive into how LLMs work, check out this playlist and this video specifically. Much of my personal understanding of LLMs comes from that playlist.

Goals

Understand strengths and weaknesses of LLMs
Establish the importance of humans in human-AI interactions

Resources

"We are baking in human bias into these models whether we like it or not. Our bias as humans is directly reflected in the chatbots and how they function."

"Hallucinations are incredibly convincing. They're also incredibly wrong. And Large Language Models can do this because they're not rooted in knowledge. They're just rooted in statistical information in how certain words give other words meaning without actually knowing what that meaning is."

Assessments

Post in discussion boards what more you would like to learn about LLMs
Chat with an LLM and share your chat log with the class

I think that including chat logs as part of assessments is one way that we can prevent "AI cheating" in the future. It's so much better to invite chatbots to the conversation so that we can assess how our students are interacting with them to produce work than to guess whether students are using AI or not. Also, having students share their chats with the entire class promotes accountability and normalizes the idea that AI interactions are part of the intellectual work itself, something worth examining together rather than hiding.

Student feedback

Students showed great interest in how hallucinations can occur despite vast amounts of training data
Students were generally interested in the training process
Some were surprised by the financial cost of training LLMs

Stage 1: AI as a tool

The problem with most engagement with LLMs is that users are used to querying rather than chatting. Most of us treat chatbots like a Google search, and while that works for some low-stakes cases, we can get more out of LLMs if we provide additional context and instruction. If garbage in equals garbage out, rich context in equals rich output. This idea is also why I thought it was very important to discuss the idea that "AI is not an expert" early. Also, it's too easy to trust a system that speaks so confidently and with so much authority. We need to bake a healthy distrust into every interaction we have with LLMs. Both goals, building healthy skepticism and providing rich context, are addressed in this stage through the creation of custom chatbots.

Goals

Encourage students to disagree with LLMs
Create custom, reusable chatbots with defined roles and instructions

Resources

"You want to tell the AI who it is. The reason you want to do that is because the AI has been trained on billions of documents. Telling it who it is gives it the right context from which to start."
"To work with AI interactively, you have to push back. And this is really where the magic happens."

Assessments

Do the following on a piece of scratch paper:

List what you'd want in a research partner.
List what you don't want in a research partner.

Using this information, make a prompt using the following guidance:

Tell the chatbot who you are and who you want the chatbot to be
List what makes a good research partner
Ask what it thinks about your definition and what other things make a good research partner
Then, choose one thing it mentioned that you disagree with, tell it your opinion, and ask it to expand.
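
The prompt guidance above can be sketched as a small template. This is only an illustration, not something I handed out in class: the function names (build_partner_prompt, build_disagreement_followup) and the example values are hypothetical, and the resulting text would simply be pasted into whatever chatbot you use.

```python
def build_partner_prompt(who_i_am, chatbot_role, wants, avoids):
    """Assemble the opening prompt from the two scratch-paper lists."""
    want_lines = "\n".join(f"- {item}" for item in wants)
    avoid_lines = "\n".join(f"- {item}" for item in avoids)
    return (
        f"I am {who_i_am}. I want you to act as {chatbot_role}.\n\n"
        f"Here is what I think makes a good research partner:\n{want_lines}\n\n"
        f"Here is what I do not want in a research partner:\n{avoid_lines}\n\n"
        "What do you think of my definition, and what else makes a good "
        "research partner?"
    )


def build_disagreement_followup(claim, my_opinion):
    """Assemble the follow-up that pushes back on one of the chatbot's points."""
    return (
        f'You said: "{claim}". I disagree. {my_opinion} '
        "Please expand on your reasoning and respond to my objection."
    )


# Example values are made up for illustration.
prompt = build_partner_prompt(
    who_i_am="a senior biology major studying coral bleaching",
    chatbot_role="a critical but supportive research partner",
    wants=["asks clarifying questions", "points out gaps in my logic"],
    avoids=["agrees with everything I say"],
)
print(prompt)
```

The point of keeping the two lists as separate inputs is that the student's own definition of a good partner comes first, before the chatbot has said anything.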

I want to force my students to disagree with the chatbots early. It's often in these disagreements that users clarify what they really think. Articulating an objection forces you to examine your own reasoning in a way that passive acceptance rarely does.

Student feedback

The process of making custom chatbots was straightforward; primary effort involves crafting high-quality prompt characteristics
LLMs can be used as a tool to identify logical gaps, challenge arguments, and avoid "tunnel vision"

Stage 2: AI as a partner

This is a natural extension of the prompting and custom chatbot work we did in Stage 1. Now, instead of just making a decent digital research partner, we're going to use one to help us write. If you watch the included video, you'll notice that I spent a good amount of time discussing the role of a writer vs. an editor. The writer generates original ideas and takes ownership of the argument. The editor refines and responds to what's already there. AI works best in the editor role, which means humans need to show up as the writer first. The basic lesson is pretty simple: instead of asking AI to write something for you, ask it to comment on your own writing.

Goals

Model how LLMs can assist, not replace, human thoughts in writing through recursive co-writing

Resources

"We want to be able to use computers to help us clarify our thoughts, to help us broaden our thoughts, but that really means that we need to put our own ideas and thoughts first."
"I don't use large language models or chatbots to actually write for me. I actually use them to help me think, to help me improve my clarity of thought, to help me improve my own writing."

Assessments

Write an outline of your introductory paragraph on a scientific topic.
Paste your outline into the chatbot prompt and ask the following questions:
- what is missing?
- what should I be aware of as I'm writing this for a broad audience?
- any other questions you think are appropriate?
Using the feedback from the chatbot, update your outline and write the paragraph without AI's help.
Copy and paste your paragraph back into the chatbot and ask for strengths and weaknesses as well as how it could be improved.
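
The two feedback rounds above can be sketched as a pair of prompt builders. This is a minimal sketch under my own assumptions: the names (OUTLINE_QUESTIONS, outline_feedback_prompt, paragraph_feedback_prompt) are hypothetical, and the writing step in between is deliberately left out because that part is done by the human without AI.

```python
# Questions taken from the assignment; students may add their own.
OUTLINE_QUESTIONS = [
    "What is missing?",
    "What should I be aware of as I'm writing this for a broad audience?",
]


def outline_feedback_prompt(outline, extra_questions=()):
    """Round 1: ask the chatbot to comment on the student's own outline."""
    questions = list(OUTLINE_QUESTIONS) + list(extra_questions)
    question_lines = "\n".join(f"- {q}" for q in questions)
    return (
        "Here is my outline for an introductory paragraph:\n"
        f"{outline}\n\n"
        "Do not rewrite it. Please answer these questions:\n"
        f"{question_lines}"
    )


def paragraph_feedback_prompt(paragraph):
    """Round 2: ask for strengths, weaknesses, and possible improvements."""
    return (
        "Here is the paragraph I wrote from my revised outline:\n"
        f"{paragraph}\n\n"
        "What are its strengths and weaknesses, and how could it be improved?"
    )


# Example outline is made up for illustration.
round_one = outline_feedback_prompt(
    "1. Coral reefs support biodiversity\n2. Bleaching is increasing",
    extra_questions=["Is the order of my points logical?"],
)
print(round_one)
```

The "Do not rewrite it" line is my own addition; it is one way to keep the chatbot in the editor role rather than the writer role.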

I think that this kind of assignment works best once students have already gone through the process of learning how to write and why to write. Keep in mind, my students were mostly seniors and well-exposed to good writing practices already. Almost every student found some advice from AI that they ignored or flat-out disagreed with (and I think that is a very healthy way to approach AI-assisted writing).

Student feedback

Student Agency: AI enhanced work without "stealing" the student's voice; the student remains the leader of the conversation.
Using AI in this way facilitated a "back-and-forth" process to generate alternative phrasing, even when students disagreed with initial suggestions.
Increased Accessibility: Served as a vital alternative for students with scheduling conflicts or social anxiety regarding in-person writing centers.

Stage 3: AI as a catalyst

I knew that I wanted to have my students dig into the environmental impact of these large language models. I also anticipated that some of my Biology students would have a lot of opinions and things to say about the issue. But I really struggled to come up with a final project that addressed this using the tools and skills we'd built over the previous weeks.

The final project was actually inspired by having my students listen to the Hard Fork podcast episode below. As a form of assessment, I asked them what questions they would pose to an environmentalist who specializes in this area. And then I thought...wait...why not program a custom chatbot with that knowledge and let them actually conduct the interview themselves? And why not interview a big tech CEO as well?

Reflecting on it now, I think this project worked on a level beyond AI literacy. Roleplaying conversations with people who hold fundamentally different values (a tech optimist, an environmentalist, a domain scientist) is practice for something students will encounter throughout their careers. Engaging productively with people who disagree with you is one of the hardest skills to build. Sneaking it in through a custom chatbot assignment felt like a small win.

Goals

Apply new skills to explore the environmental cost of LLMs
Use LLMs to navigate complex, multi-stakeholder discourse

Resources

"Google and Microsoft actually put out reports saying that they're not meeting their own sustainability targets. They essentially dropped the ball on their own energy and carbon goals because of AI."

Blog posts from Big Tech CEOs

Sam Altman, OpenAI: The Gentle Singularity
Sundar Pichai, Google and Alphabet: AI Action Summit
- "As AI continues to improve, it will spur innovation, opportunity and growth in economies around the world, and drive an explosion in knowledge, learning, creativity, and productivity that will shape the future in exciting ways."
Dario Amodei, Anthropic: Machines of Loving Grace
Mark Zuckerberg, Meta: Personal Superintelligence

I highly recommend giving Dario's essay a read. It's a relatively long piece, but it's refreshing to see a tech CEO engage seriously with the downsides and risks of AI rather than leading with product announcements. He's still deeply optimistic about AI's potential, so if you're looking for a skeptical take, this isn't it. But the optimism feels earned rather than performed. By comparison, the other pieces on this list read more like PR: long on vision and short on self-examination, in my opinion.

Assessments

Write a fictional blog post entitled "The Future of AI in Research." Make 3 custom chatbots that you instruct to role-play as 3 different professionals: a big tech CEO, an environmental specialist, and a leading scientist in your field. You will create these 3 different chatbots, provide them documents and whatever instructions you think are appropriate, and then interact with each chatbot and perform a fake interview. After generating these fake interviews, use all the tools we've learned in this class to write your article. Make sure to use quotations from your interview in the article.
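
One way to think about the three chatbots is as three small configurations: a role, the documents you give it, and your instructions. The sketch below is only an illustration under my own assumptions. The Persona class, its fields, and the example instructions are hypothetical, and the document entries are placeholders for whatever sources a student chooses to provide.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """A hypothetical description of one role-play chatbot."""
    role: str
    documents: list = field(default_factory=list)
    instructions: str = ""

    def system_prompt(self):
        """Render the persona as instructions to paste into a custom chatbot."""
        docs = "\n".join(f"- {d}" for d in self.documents) or "- (none provided)"
        return (
            f"You are role-playing as {self.role} in a mock interview.\n"
            f"Base your answers on these documents:\n{docs}\n"
            f"Additional instructions: {self.instructions}"
        )


# The three roles come from the assignment; everything else is an example.
personas = [
    Persona(
        role="a big tech CEO",
        documents=["A CEO blog post of the student's choice"],
        instructions="Stay optimistic, but answer hard questions directly.",
    ),
    Persona(
        role="an environmental specialist",
        documents=["Notes from the assigned podcast episode"],
        instructions="Focus on energy use and sustainability targets.",
    ),
    Persona(
        role="a leading scientist in my field",
        instructions="Speak from the perspective of a working researcher.",
    ),
]

for persona in personas:
    print(persona.system_prompt())
    print()
```

Writing the personas out this explicitly also makes them easy to share alongside the interview chat logs, so the instructions themselves can be assessed.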

Given this was a final project, I wanted to capture as many artifacts of my students' interaction with AI as possible. A final product was going to look polished with AI's help, but I wanted to assess how deeply students thought about the instructions they gave to their chatbots and I wanted to see how they questioned those bots as well. To capture as much of this as possible, I provided them with the following template.

General takeaways

What students took from the course

One question I asked my students at the end of the course was whether or not they felt optimistic about AI use in the future, and I received varying responses here. Some students had their eyes opened to a new future of knowledge work while others still had major concerns about the effect of AI on entry-level workers, human creativity, and the environment. But one comment stood out to me: one student was very happy that we were having these conversations now. There are some things that we, as individual users, cannot control regarding the future of AI. But it's so important to have exposure if you want to have an informed opinion. At the end of the course, I think it was clear to my students that using AI is both easier (e.g., the technical skills of making a custom chatbot) and more difficult (e.g., providing chatbots with the best context for the best output) than expected.

What I would change in the future

Initially, I was very hesitant about going through the background information of how LLMs work, but given the many questions about how hallucinations occur, I think it would be a great service to my students to dive into this deeper. These students are likely not going to become AI engineers, but many felt unresolved tension surrounding "why AIs lie" and "what does it actually mean to train an LLM." Additionally, many students mentioned that they would have appreciated more time to discuss the dark side of AI. We briefly talked about the environmental impact, but I did not have enough time to get into what we currently know about the ethics of where these data come from (stolen work and underpaid work from third-world countries), how AI has negatively affected work productivity (work slop and falling asleep at the wheel), how AI has negatively affected knowledge workers (AI brain fry), the nation's response to more data centers, etc. There's also so much that has happened with AI agents in the past year that I truly think we've moved from "cool product demos" to "useful tools."

I think the core of this course would stay roughly the same though. My major goal for this whole course was to show my students that there can be a future where humans can work with AI collaboratively. General public sentiment about LLMs currently falls in the AI doomerism camp. And while there are many good reasons to be cautious of this technology, I don't think it serves our best interest to be afraid of it. This technology is here, and in the right hands and with the right training, this technology can be very powerful and impactful. My hope is that if we continue to educate ourselves and others about this technology, that power will be distributed among the masses and not concentrated in a handful of big tech companies.