Inside UXR

37. How do I conduct research on AI-based products?

Drew Freeman and Joe Marcantano Episode 37

In this episode of Inside UXR, Drew and Joe dive into the complexities of researching AI-based products. They explore how AI research differs from traditional UX research, from recruitment challenges and user biases to the importance of trust and accuracy in AI-generated outputs. They also discuss why diary studies and co-design sessions are particularly effective for understanding AI adoption. Whether you’re testing a new AI feature or a fully AI-driven product, this episode will help you navigate the unique challenges of researching AI.


Send your questions to InsideUXR@gmail.com

Visit us on LinkedIn, or our website, at www.insideUXR.com

Credits:
Art by Kamran Hanif
Theme music by Nearbysound
Voiceover by Anna V



Drew Freeman: Welcome Joe. How are you doing on this last episode of this recording session?

Joe Marcantano: I am doing well. I'm getting ready to go out and shovel the driveway after this, because that's how things are in March in Denver.

Drew Freeman: I do not envy you.

Joe Marcantano: At least when I get snow, it melts. When you get snow, it hangs out for months sometimes.

Drew Freeman: The last couple of years, climate change has not been friendly to snow sticking around, which is a bummer for me because I do like snow-based outdoor activities. So it's a little sad for me, but it is what it is, and there's nothing we can do about that on this podcast.

Joe Marcantano: I'm not at all surprised by that given that you and I both enjoy the same winter sport as well.

Drew Freeman: Okay, so actually switching gears to the topic that we're here to talk about today, this, I think should be a really interesting one because Joe, this is actually something that you have done a lot of professional work on. So our question today is how do I do research on products that are AI based or have a lot of AI features?

Joe Marcantano: This is a really good question. You're right, it's something I've done a fair amount of work on: products that are either AI at their core or have new AI features. And I'm sure folks have said this before about other products, "this product is completely new and we have to adjust how we research." But that really is the case with AI. I think that we do have to make some adjustments, make some accommodations, in how we're doing research on products that are AI-based.

Drew Freeman: It's interesting you say that, because I'm not sure I've done any work that was specifically on an AI product, but my thought was that at its core, research is still research, so I'm not going to change my approach too much. So tell me where I might be wrong in that, and tell me where you think I should change my approach.

Joe Marcantano: Well, the first thing, and this change will likely be forced on researchers, it's not one that we're going to consciously choose to make. You know, normally when you recruit for any new product, you can put an NDA out and you're likely good, right? You might have some leaks to the general public, but it is what it is. A lot of the companies that are developing AI tools are playing things so close to the chest that they're only going to let you do testing with employees. They may not let you do recruitment of the general public. And so your recruitment pool is already, I want to say tainted, but that's not really the right word. Maybe biased is the right word. Presuming that this is a company that's developing an AI product, the presumption is that the employees are already going to be a little more tech savvy, so they're not really representative of the general population.

Drew Freeman: That's interesting. I didn't realize that these AI companies were being that locked down with their recruitment.

Joe Marcantano: Yeah, I did some recruitment for a client once, and they had said you can only use employees of our company, we're not going to let you recruit outside people. One of the ways that we got around it, number one, is we just flagged it in the findings. We said, hey, the recruitment base, based on the limitation we were given, is not representative of the general public, so take everything here with a grain of salt. The other thing that we did, though, is we tried to recruit from non-technical roles within that company.

Drew Freeman: That makes a lot of sense.

Joe Marcantano: Yeah, think like HR, think event planning, maybe even like facilities, right?

Drew Freeman: Like absolutely. I've done the same thing.

Joe Marcantano: Yeah. So you can try your best to get folks who aren't in those kind of tech-savvy jobs. Just keep in mind, though, that odds are, just by being exposed to the other people who are at the one end of incredibly tech savvy,

00:05:00

Joe Marcantano: they are going to naturally drift that way a little bit. Just because they're going to be exposed to that life and that kind of experience a little more often than somebody who's in facilities and works at, you know, a school district maybe, or whatever other non-technical field.

Drew Freeman: Yeah, that's been my experience too. Even if they don't consider themselves particularly tech savvy, it seems like they just kind of pick things up through osmosis.

Joe Marcantano: Yeah. I mean, there's tons of research showing that we are all, as human beings, influenced by our environment and the people in it, right? There are lots of studies out there that show that if you hang out with people who tend to eat less healthy, then you're going to eat less healthy. And it's the same kind of thing. If I'm around 100 people who have the newest iPhone, I'm more likely to get the newest iPhone.

Drew Freeman: Okay, so that is one of the big differences with recruitment and with privacy and confidentiality. What are some other differences between researching an AI product and researching a widget?

Joe Marcantano: The other thing is that, and this comes up with products more than just AI, but there are a lot of folks out there who have very strongly held notions and beliefs about AI in general. And without getting into whether those beliefs are right or wrong, correct or incorrect, there are folks out there who think that because of the environmental impact of AI, all AI is bad. And there are folks out there who think that AI is going to take people's jobs, and therefore AI is bad. There are also people at the other end who say AI is the best thing since sliced bread, it's going to make everyone's lives easier, it's going to make everything faster and better. Those are often values-based beliefs that are held very, very strongly. And if I'm doing research on a widget, it's unlikely that I'm going to run into those beliefs.

Drew Freeman: So how do you kind of not factor out those beliefs, but how do you kind of account for that in your research?

Joe Marcantano: So the first thing is kind of understanding who you want to recruit. Let's say my company has developed a new AI feature on a product. Who do I want to recruit? And this is a question you need to ask your stakeholders, right? Are we trying to recruit people who we suspect will be adopters, or are we trying to recruit some of those people who are maybe very, very strongly against AI, to the point that they might churn over our deployment of an AI product? Just knowing where we're aiming our recruitment criteria is really, really helpful. The other thing is that when you're screening these folks, don't just ask, "Do you use AI? Do you have a strongly held belief about AI?" Those questions are kind of transparent; people can see through them. Ask instead how often they've used an AI product, and then give an open-ended prompt: "Tell me about the last time you used an AI feature." Make people describe what they're doing, and then you can evaluate based on those responses.

Drew Freeman: I think it's important here to say that often you might be trying to recruit a mix of people. You might want some early adopters, you might want some people who are skeptical and hesitant, and you might want people who are all the way in between.

Joe Marcantano: Yeah, usually it's a mix that you want and it's, you know, just like when you're thinking about ages or education or gender, you kind of want to touch everybody a little bit there. Same thing here. This is just another factor that you want to kind of touch everyone on.

Drew Freeman: Although I have done recruitment and research specifically for early adopters, focusing only on people who were enthusiastic about AI.

Joe Marcantano: I have as well. And a lot of times what I'm seeing is that this isn't even thought of, right? Like, let's say I'm developing a new AI feature for HR software. The recruitment criteria might just be folks who use HR software, because they want to see, will they even use it? They don't even want to ask about AI. That may mean that during the recruitment process we don't ask them, but during the intro, I'm absolutely asking, because I want to know how familiar, how comfortable they are with an AI product or feature.

Drew Freeman: For

00:10:00

Drew Freeman: sure. Yeah. You always want to know, like, what's your experience level with this, with whatever I'm testing? Do you come in with some level of experience already?

Joe Marcantano: Yeah.

Drew Freeman: Okay. So do you find that certain research methods tend to lend themselves better to research on AI products, or is it really the same as any other research method selection process?

Joe Marcantano: So this actually touches on two previous episodes. I have found that diary studies tend to be best, and I'll get into why in a second, with the added caveat that if you are building AI into an existing product, it's co-design sessions that are best, or co-designs with diary studies, like a combination study. Because often you are trying to add the AI into that product without changing the product fundamentally. You want it to be an addition, not a rebuild of the product.

Drew Freeman: As an example, you might want to add AI-generated emails or responses into an email product.

Joe Marcantano: Yeah, I'm not looking to fundamentally redesign email. I'm looking to add to email's existing functionality.

Drew Freeman: So why are diary studies the best, or maybe the more commonly used, method in those situations?

Joe Marcantano: So this was something we talked about in the last episode a little bit. But the learning curve on AI products is steep, pretty steep, right? It's not something that you can pick up on day one and be an expert on. It requires some time, it requires some nuance, and frankly it requires some exploration. More than you can do in 60 minutes. You know, I, like just about everybody out there, use ChatGPT on occasion. The things that I use ChatGPT for today are very, very different than the things I used ChatGPT for even three months ago.

Drew Freeman: Same. I legitimately did research on how to write better prompts for large language models, and watched videos, read articles. And that's not something that you can do in a 60-minute IDI.

Joe Marcantano: Yeah, it's both how I'm prompting, right? How am I writing my prompt, how am I crafting it? But it's also that the things I'm asking it to do are way different. Like, I am planning a trip next month to Europe and I'm going to have several stops, and I used ChatGPT to help me create kind of a tentative itinerary of how many days I should spend in each location. I could tell it the things I wanted to do, and it was able to come up with the right timeline. I would have never dreamed of asking ChatGPT to help me do that three or six months ago. It's a problem that I wouldn't have used this tool to tackle. All this to say, the way folks use AI on day one, even an AI feature, is going to be very different than day five, day 10, day 50. And the user's journey encompasses all of those things, not just the one-hour snapshot.

Drew Freeman: Okay, so let's talk about, and I might not even have enough AI expertise to frame this question correctly, so take that with a grain of salt, but how do you do research differently, or is it different at all, when the product is an AI product at its core, something like a large language model like ChatGPT or Gemini, versus an existing product that is adding an AI feature into what it already is?

Joe Marcantano: The research questions are slightly different there, right? So if I am doing research on a product that is AI at its core, my research questions are going to be around how do I teach people better prompts? What kind of problems are they tackling? I'm going to really, really focus on the interaction, right? Are folks getting the answers they need? Is it working in a way they expect?

Drew Freeman: Do people understand what is happening?

Joe Marcantano: Yeah. How can I nudge them to write better prompts? These kind of things. When you're on an AI product, you are essentially creating a new workflow,

00:15:00

Joe Marcantano: right? There was not a tool that allowed me to easily come up with a travel itinerary based on my specific interests, scouring the Internet like that, aside from me manually doing it.

Drew Freeman: I was going to say that workflow was called Travel Agent.

Joe Marcantano: Exactly. Yeah, you'd consult an expert. But when I add an AI feature into an existing product, I am looking to supplement or enhance existing workflows. So at that point it becomes, using the email example, I want to understand how folks wrote emails before. How did they tackle difficult subjects, how did they polish them, how did they edit them? Then I want to understand how they'll use the AI feature to accomplish those same things. Is it easier or more difficult? Is it worth the time it takes to write a good prompt, or is that good prompt just as much effort as writing a good email? There's a comparison point there that, while it exists on a new AI tool, isn't as straight-line a comparison as comparing the workflows within the same product.

Drew Freeman: In your experience, have you found it easier or harder in one of these cases, or is it just different?

Joe Marcantano: It's just different. One of the things that I ran into is that abandonment of the AI feature is pretty high if it doesn't get it right the first time. The example I always think of is, lots of people, I'm sure, have an older relative who tried voice-to-text once 10 years ago, it didn't work well, and they haven't used it since. Whereas if you started using voice-to-text later on in its life cycle, it works fairly well and you're probably using it a little more. The same kind of thing can happen with AI features. If I try to use it to help me write an email and it doesn't work the first time, or the first couple of times, it's unlikely that I will give it another try. And so, especially if you're doing diary studies, if I don't give people specific topics to focus on for each entry, not necessarily a specific task, but a specific topic, they, after day one or entry one, might just stick to the really superficial stuff because they just don't trust it to do the other things.

Drew Freeman: What you're describing is essentially exactly how my experience with large language models went. I tried it out because I like tech: that's interesting, let's see what it can do. I didn't have the creativity or the experience of what it could do. I was very underwhelmed with the underwhelming prompts that I gave it, and it took a while, until I saw more people using it in cool ways, before I went back to it.

Joe Marcantano: Yeah. One of the things that I saw a lot of is what I called, like, tire kicking. It used to be back in the day when you went and bought a car, you kind of kicked the tires. And I never understood that. To me, it was always like looking at the engine. If you're not a car person, I'm like, yep, that's an engine, but that.

Drew Freeman: Feels like a tire.

Joe Marcantano: Yeah. So folks would essentially tire-kick the AI. Let's say it's an email AI feature. They might say, delete my new email. Now, that is a task that the AI probably cannot do; it was not designed to do it. And frankly, it is easier for you as the user to just click delete. But these folks were testing it. They wanted to see if they could trust it with the little things before they gave it the big things to do.

Drew Freeman: How do you try to account for that? How do you try to incorporate that into research or do you at all?

Joe Marcantano: So that's where diary studies fit in really well, because I can give people not specific tasks, but little nudges, right? So sticking with our email example, I might say, you know, today, focus on time management. What things might you do? What prompts might you ask the AI to help you manage your time better? And so then somebody might type in, what are my most pressing things today? Or whatever, right? Instead of just saying, hey, have at it, I'm giving people specific silos to

00:20:00

Joe Marcantano: kind of think about as they are prompting the AI.

Drew Freeman: Okay, that makes a lot of sense. So the last place that I want to take our conversation is: are there any differences that you've seen when it comes to data analysis or reporting on AI-specific research?

Joe Marcantano: Yeah, so I have seen some additional things you have to factor in. The first being the performance of the AI. You know, AIs are finicky. You can type the same prompt into ChatGPT twice and sometimes get two different answers. They're not always right. We've all seen the news stories of the lawyers who used ChatGPT to write their motions, and it references fictitious cases, right? It can hallucinate. So how the AI performs, how that specific model performs, like factually, is it accurate or not, can be a huge factor in how successful the participants viewed their interaction with the model.

Drew Freeman: Okay, so what about reporting? What kinds of things have you changed as you're reporting out to stakeholders and, and crafting a story?

Joe Marcantano: Yeah, so I'm always including the accuracy, right? Like, did participants feel that this wasn't useful because it was inaccurate, or because it didn't do the right things? I'm also providing information on trust. How much do folks inherently trust the output? How much do they feel like they have to verify? And then I'm also reporting on things like tech savviness, how early of adopters they are, and whether they talked about any strongly held beliefs about AI. I'm trying to provide more context. If my n was 8 and five of those people said that they think, like, morally, AI is not a good thing, that's going to cloud those results a little bit. It doesn't necessarily mean my results aren't valid, but it means I want to provide some additional context to my stakeholders.

Drew Freeman: Mm. So I know that this episode really just scratched the surface of the researching-AI topic. So hopefully, listeners, this sparked some ideas and sparked some questions for you, and we would absolutely love it if you want to continue this conversation. Send those questions and those ideas to InsideUXR@gmail.com.

Joe Marcantano: Yeah, this is something that I think is still evolving a little bit. And as researchers, we're being a little scrappy and kind of figuring this out and all the different nuances, and I don't think it's settled territory yet.

Drew Freeman: 100%. All right, so thank you everyone for listening today. My ask of you today is to share this episode with a friend who you think might like it. That really helps us reach new listeners, helps us grow, and really just makes us feel good about continuing to do the podcast.

Joe Marcantano: Yeah, it's always exciting. We're still at a point where we can see when new people start downloading, because they'll go back through and listen to older episodes. And so when we see a certain number of downloads in a day, and it's clearly somebody going back through and catching up, that's always really exciting for us, to learn that we've picked up somebody new.

Drew Freeman: Yeah. Before this recording session, Joe and I were just chatting about the number of listens and the number of downloads we've had this week. All right, so again, thank you everybody for listening. Really appreciate it. Please give us a like or a subscribe or a share; that's really helpful to us. And like we just mentioned, we do see it and we do appreciate it. With that, I'm Drew Freeman.

Joe Marcantano: And I'm Joe Marcantano.

Drew Freeman: And we'll see you next time.

00:24:07

