Casey Lewis, author of the Gen Z newsletter After School, has spent years searching the internet in vain for a simple mini dress. The closest thing to perfection, she said in one newsletter, was a 2020 Everlane dress that is no longer available.
Recently, she tried a new approach. She asked ShopWith AI, an AI shopping assistant that combines OpenAI’s ChatGPT with proprietary technology, to recommend a black mini dress.
“In fact, several brands have come out that I’ve never heard of before,” Lewis said in an interview. “Honestly, I might have ordered one of these dresses because it was so cute and seemed exactly what I was looking for, and I probably wouldn’t have found it otherwise.”
In recent months, companies including Shopify, Mercari, and KNXT, an online retailer run by Kering, have debuted ChatGPT-based shopping assistants. Zalando plans to release its own soon.
These chatbots are a step up from the previous generation of automated online assistants. Those older assistants drew from a set of scripted responses, but the hope for this new wave is that they’ll be smart, flexible and responsive, like a real salesperson. A shopper can ask questions like “What should I wear to an October wedding in New York?” and receive an answer that takes into account current trends, location and weather, along with links to products from the seller’s catalog. The shopper can also give the bot feedback and guidance to get closer to the products they are looking for.
This future is still largely theoretical. While these generative AI bots are able to provide consistent, personalized answers to shoppers’ questions, along with relevant product recommendations, they can still feel robotic and have limitations.
Using ShopWith AI, which lets you shop by putting questions to celebrity personas such as Kim Kardashian, Lewis said she still felt like she was talking to a bot. The brand recommendations were more useful, but those are largely a human’s work: ShopWith AI co-founder Sarah Stark chose the list of brands and boutiques from which the AI picks products.
“It’s still a very sophisticated approach. These are not the main brands that appear on the first three pages of Google search results,” Stark said.
To better understand how well the new group of AI shopping assistants works, BoF staff tested several of them. We gave the same prompts to different bots to see how the results compared, and we asked a variety of questions to get an overall picture of their abilities. Here’s what we found.
The bots understood the prompts and were mostly able to recommend appropriate items, recognizing that linen is good for warmer weather and showing a general grasp of dress codes for different occasions. But overall, reviewers were disappointed with the product suggestions themselves. Sometimes the AI’s effectiveness came down to the non-technological question of whether a retailer had clothes in stock that matched a query. Of course, not every store will have items for every occasion, but a human associate could at least tell a shopper definitively to look elsewhere, instead of leaving them guessing whether the problem is how they framed the question or whether the assistant knows the answer.
One tester, who asked what to wear to a boat party in Europe this summer after a friend put the question to her, said the bots’ recommendations were generally not very useful, although she thought Shopify’s suggestions seemed a bit more curated. She ended up sending her friend one of these, a white maxi dress with buttons down the front.
KNXT’s Madeline bot, which offers products from Kering brands and others owned by the Pinault family holding company behind Kering, had the biggest search problems. It offered a Balenciaga look that the tester didn’t think suited a summer boat party. (When another reviewer asked about Bottega Veneta menswear, it kept suggesting Balenciaga.)
A reviewer looking for graduation dresses for a ceremony in Chicago in June liked KNXT’s suggestions the most in terms of style, but found the interface clunky. Shopify’s bot was quick and easy to use, but she didn’t like the products it recommended, which included festival outfits that left her confused.
On the other hand, she and another tester, who asked about clothes her husband might wear to a cricket match in London, noticed that the Shopify bot was the only one that seemed to take the weather into account. For the cricket match in London, it chose a raincoat as one of the options.
Many reviewers said that the product recommendations from Mercari’s Merchat did not seem very curated, perhaps because Mercari is a second-hand marketplace and relies on the offerings of individual sellers. They also noted that it asks a lot of questions, apparently trying to get more context – “too many questions for my attention in Gen Z,” noted the youngest reviewer.
The tester shopping for her husband, however, appreciated that it asked for more information about him, as he is 6ft 3in and she had ideas about what would best suit him. She rejected some of its product suggestions, including shorts, which she noted people wear less often in London. She felt the bot’s perspective, including its language, was generally centered on the United States. “I can’t say I’m convinced,” she concluded of shopping with an AI assistant.
A reviewer looking for prom dresses was also unimpressed with the ShopWith AI options, although she was more impressed when she asked its Lenny Kravitz persona what to wear to a concert. The outfit actually looked like something Kravitz might choose, she said.
The conversational interface used by these shopping assistants had the advantage of letting reviewers ask for suggestions instead of just searching for products. But the text responses often sounded like marketing copy, and conversational abilities won’t matter much if the bots can’t offer the right products.
It is still early days for shopping assistants with generative artificial intelligence
While these bots may be rough right now, they represent the early days of generative AI shopping assistants and in theory should improve as the technology advances, which it is doing quickly. The companies behind them acknowledge that they are meant more for testing and learning than as finished products.
“One of the reasons for leaving early was the opportunity to see what user feedback is, how people will interact with it, and how they will understand it,” said Miqdad Jaffer, Shopify’s product director. “The places where we would like to make improvements are in the type of results returned, the types of conversations you have, and then the possibility of further filtering.”
Kering similarly called KNXT a place to “test innovative digital experiences” in a statement. John Lagerling, CEO of Mercari US, said in an email that the company is learning and iterating on a regular basis and expects to keep improving the quality of its product recommendations.
But the theoretical strength of these bots is that the underlying large language models are designed in a way that lets them predict user intent well. Robert Hetu, vice president and analyst at Gartner, a technology research and consulting firm, said that previous chatbots were good at simple tasks like reporting the status of an order, but were not good at helping with purchases because they required users to supply a lot of information to guide them. Shopping assistants powered by generative AI, on the other hand, are more likely to succeed because they are much better at accurately predicting customer intent.
However, it is not an easy skill.
“Of course, this is a work in progress,” said Jake Stark, co-founder of ShopWith AI. “It’s a tough, personalized search.”
Lewis, who has described herself as a “technology optimist”, said AI shopping assistants will not replace her conventional search at this stage. But that could change if tech giants like Google and Microsoft, which are integrating generative AI into their products, channel their vast data, knowledge and resources in this direction.
“It hasn’t replaced search, but I’m sure it might in the future,” she said. “Maybe they’ll do it.”