How can I leverage AI to understand user behavior?
This interview is part of Kameleoon's Expert FAQs series, where we interview leaders in data-driven CX optimization and experimentation. Florent Buisson is an applied behavioral economist and author of Behavioral Data Analysis with R and Python: Customer-Driven Data for Real Business Results. After starting and leading the behavioral science team at Allstate Insurance, he most recently led product experimentation at the online marketplace Cars.com.
Can AI be used to better understand user behavior?
We’ve seen the emergence of generative AI tools like ChatGPT and the breakneck speed at which they are improving. People in marketing and product are wondering whether these tools can help them understand the behavior of users, design better experiments, and ultimately build better content and products. The short answer is yes.
These tools are already helpful and will only get better. For instance, I use Perplexity AI as a substitute for Google Search because it provides excellent, up-to-date summaries on any topic. While researching Marketing Mix Modeling tools recently, I asked Perplexity AI, “What do you know about the Robyn MMM tool?” and its answer was an excellent starting point for digging deeper.
We’re also seeing more specialized use cases being developed, such as “synthetic audiences” that simulate how your target audience would respond to a survey or experiment.
Today’s performance is the worst it will ever be, and use cases that seemed impossible a month ago might be “unlocked” by the next generation of models.
How should I prepare user data before using AI to analyze it?
It’s been said that data scientists spend 80% of their time cleaning and preparing data and only 20% analyzing it. But now, you can give a CSV file to ChatGPT and ask it to clean the data for you. Will all the data preparation work disappear? Unfortunately, no.
If data cleaning and prepping were simply a matter of dropping duplicates, outliers, and rows with missing data, a few lines of Python code would have sorted this out a long time ago.
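To make that point concrete, here is roughly what such naive, mechanical cleaning looks like in pandas; the file name, column names, and thresholds are all hypothetical. None of these steps require judgment, which is exactly why they are not where the real work lies.

```python
import pandas as pd

# Hypothetical export of user sessions; file and column names are illustrative.
df = pd.read_csv("sessions.csv")

# The "obvious" cleaning steps that a few lines of code can automate:
df = df.drop_duplicates()                     # remove duplicate rows
df = df.dropna(subset=["session_duration"])   # drop rows missing a key field

# Crude outlier removal: keep values within 3 standard deviations of the mean.
mean = df["session_duration"].mean()
std = df["session_duration"].std()
df = df[(df["session_duration"] - mean).abs() <= 3 * std]
```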
The problem is that “cleaning data” is a misleading representation of what data scientists and analysts actually do. It suggests that data has good and bad parts and that we must remove the “dirty” parts.
In reality, the goal of the preparatory stage is to form an implicit mental model of the data and how it was generated.
Then, that mental model can be used to transform the data to make it more accurate and useful. For instance, data might be missing at random because of a faulty sensor or a broken tag, or it might be missing because the user refused to provide it. In the second case, removing the rows with missing data would bias our analysis away from privacy-conscious users.
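As a sketch of how you might check that mechanism before dropping anything (the column names here are made up), compare an outcome between rows that have the field and rows that don’t:

```python
import pandas as pd

# Hypothetical survey export: "income" is sometimes missing, "converted" is an outcome.
df = pd.read_csv("survey.csv")

missing_income = df["income"].isna()

# Compare the outcome between respondents who did and did not provide income.
# A large gap suggests the data is NOT missing at random, so dropping those rows
# would bias the analysis (for example, against privacy-conscious users).
print(df.groupby(missing_income)["converted"].mean())
print(f"{missing_income.mean():.1%} of rows have missing income")
```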
Another example is that I once encountered a price variable where the data was normally distributed around $10,000 but with two isolated peaks at $1,000 and $100,000, far from the rest of the distribution. I hypothesized that these were data entry errors, where the user got the number of zeroes wrong. At least for now, these situations can only be caught and corrected by humans exerting their best judgment and double-checking within the organization.
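A simple check along those lines, with invented table and column names: flag prices roughly an order of magnitude away from the typical value and hand the list to someone who can verify them.

```python
import pandas as pd

# Hypothetical transactions table with a manually entered "price" column.
df = pd.read_csv("transactions.csv")

typical_price = df["price"].median()   # around $10,000 in the example above

# Flag prices roughly an order of magnitude away from the typical value; these are
# candidates for "wrong number of zeroes" entry errors, but only a human with
# domain knowledge can confirm and correct them.
suspects = df[(df["price"] < typical_price / 5) | (df["price"] > typical_price * 5)]
print(suspects[["order_id", "price"]])
```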
Don’t let the apparent ease of preparing data with ChatGPT fool you. To paraphrase The Princess Bride: “You keep using this ChatGPT-cleaned data, but I don’t think it means what you think it means.” Instead, embrace the fact that data preparation is a human-intensive phase where you should use your judgment.
How can I reduce biases when using AI with behavioral data?
One particular challenge with AI is that it can replicate biases in its training data.
I worked with an HR team to develop an algorithm that selected job candidates to interview. I know, I know, but before you send me hate mail, hear me out. We knew past decisions might reflect biases and prejudices from recruiters, so we looked at the variables the model used to ensure that it was not unduly favoring or rejecting candidates based on race or gender.
The good news is that a model doesn’t know what it doesn’t know. If you withhold information such as a candidate’s name, it can be less biased and more objective than human beings. But simply assuming that an algorithm won’t be biased because it doesn’t care about race and gender is, in my opinion, misguided techno-utopism.
Tackle the issue head-on and ensure you’re not getting biased outcomes. This can, however, be difficult if you rely on generative AI to do the entire analysis behind a curtain. When dealing with behavioral data, you should look for potential biases in your training data.
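As a minimal sketch of what tackling it head-on can look like in practice (the file and column names are invented), keep the protected attributes out of the model’s features but retain them in an audit table, then compare selection rates across groups:

```python
import pandas as pd

# Hypothetical audit table: model scores plus demographic fields that were kept OUT
# of the model's features but retained for auditing.
audit = pd.read_csv("candidate_scores.csv")   # columns: score, gender, race

# Suppose the top 20% of scores get an interview invitation.
audit["selected"] = audit["score"] >= audit["score"].quantile(0.8)

# Selection rates by group; large gaps warrant digging into proxy variables
# (zip code, school, employment gaps) that may encode race or gender.
print(audit.groupby("gender")["selected"].mean())
print(audit.groupby("race")["selected"].mean())
```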
How can I ensure I don’t get fooled when AI answers incorrectly?
Generative AI tools can give inaccurate responses and fabricate answers, known as “hallucinations.” It is easy to be misled by these, even when you know hallucinations are a possibility, because the answers can be so compelling.
I still remember the first time ChatGPT invented an academic paper to answer my question about behavioral science literature. It seemed entirely plausible, so I went on a wild goose chase, even though I knew about potential hallucinations.
Fortunately, most people have experience dealing with an automated system that can make egregious errors: GPS. Initially, people follow GPS instructions blindly, even when it means committing an infraction such as driving the wrong way down a one-way street. But after a couple of scares, we develop a level of sanity-checking that lets us benefit from the tool without getting into trouble. The same applies to generative AI.
We can use AI to develop hypotheses or write code snippets, but we must confirm and validate them. In that sense, the right question is not “Is AI accurate?” but “Is the human-computer system I form with the AI accurate as a whole?”
Critical thinking and adaptability are crucial to working with AI. I believe a vital step in this process is to form a good mental model of what the AI is doing, not the details of the math but the high-level blocks and steps involved.
How can I work with AI to do sentiment analysis of qualitative user data?
It helps to recognize that generative AI is not a natural-language processing (NLP) algorithm; instead, it generates words based on its gigantic text database and the prompt provided.
When we talk about using generative AI for sentiment analysis of, say, customer feedback, there are two possibilities: either it’s summarizing the text and adding sentiment tags, or it’s summarizing the output of an NLP model.
For instance, Amazon shows summaries of user reviews on each product page, which provide a general sense of what people are saying. Generative AI latches on to the frequency of words or synonyms appearing in reviews. Ultimately, this is just a smarter version of a word cloud. It provides broad strokes, not accurate measures.
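To illustrate the “smarter word cloud” idea, here is a toy frequency count over a few made-up review snippets; frequent terms hint at themes, but frequency is not a calibrated measure of sentiment.

```python
from collections import Counter
import re

# A few made-up review snippets, just to illustrate the idea.
reviews = [
    "Battery life is great, and the screen is great too",
    "Terrible battery, returned it after a week",
    "Great value overall, but the battery could be better",
]

stopwords = {"is", "it", "and", "the", "a", "too", "but", "after", "could", "be"}
words = [
    w
    for review in reviews
    for w in re.findall(r"[a-z']+", review.lower())
    if w not in stopwords
]

# Broad strokes only: frequent terms hint at themes ("battery", "great"),
# but frequency says nothing precise about how users feel.
print(Counter(words).most_common(5))
```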
Alternatively, more recent versions of tools such as ChatGPT use a multi-step approach: translate your request into code, run the code on the data, and translate its output back into understandable language and graphs. This can speed up a workflow, but the results are only as good as the underlying machine learning model.
You’ll need to pay attention to the accuracy of the underlying model (true and false positives, etc.), distinguish between training and testing performance, and so on.
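As a minimal sketch of that evaluation discipline, here is a toy sentiment classifier (the feedback snippets and labels are invented) judged on held-out data rather than on the data it was trained on:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Tiny, made-up labeled feedback (1 = positive, 0 = negative), just to show the workflow.
texts = [
    "love the new search filters", "checkout was painless", "great selection",
    "site keeps crashing", "support never replied", "prices felt misleading",
    "photos load quickly", "found my car in minutes", "confusing navigation",
    "slow page loads",
]
labels = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]

# Hold out data the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, random_state=0
)

vectorizer = TfidfVectorizer()
model = LogisticRegression(max_iter=1000)
model.fit(vectorizer.fit_transform(X_train), y_train)

# Evaluate on the held-out set: precision and recall surface the false positives
# and false negatives that a single accuracy number would hide.
predictions = model.predict(vectorizer.transform(X_test))
print(classification_report(y_test, predictions, zero_division=0))
```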
I believe that generative AI, like previous generations of machines and tools, offers an opportunity to reduce effort and grunt work for human beings. I don’t think there’s any accountant out there yearning to return to bookkeeping without computers.
But this makes critical judgment and nuanced domain expertise more important, not less. In the case of data science, this means refining your “data sense”: a deep understanding of what the data means and doesn’t mean, and how it relates to the real world and user behaviors.
--
If you’d like to keep up with all the industry's top experimentation-related news, sign up for our newsletter or find industry leaders to follow in our Expert Directory.