7 Ultimate Chatbot Datasets for E-commerce
 — April 20, 2022

Since the emergence of the pandemic, businesses have begun to more deeply understand the importance of using the power of AI to lighten the workload of customer service and sales teams. 

As people spend more and more of their time online (especially on social media and chat apps) and do their shopping there, too, companies have been flooded with messages through these channels. Today, people expect brands to respond quickly to their inquiries, whether simple questions, complex requests or sales assistance (think product recommendations), via their preferred channels.

This is where conversational AI comes in. By automating responses to simple requests, AI chatbots can free up time for customer care and sales associates to focus on high-value transactions. This leads to better CX and helps grow sales!

But for all the value chatbots can deliver, they have also predictably become the subject of a lot of hype. With all this excitement, first-generation chatbot platforms like Chatfuel, ManyChat and Drift have popped up, promising to help clients build their own chatbots in 10 minutes. Does this snap-of-the-fingers formula set off alarm bells in your head? It should. The age-old saying “buy nice or buy twice” doesn’t just apply to clothes; it’s also 100% true when talking about AI (artificial intelligence), especially with all the quick, cheap builds flooding the market and giving the technology a bad rap.

Building a state-of-the-art chatbot (or conversational AI assistant, if you’re feeling extra savvy) is no walk in the park. First, you need to start with a solid dataset. Second, if you think you have enough data, odds are you need more. AI is not a magical button you can press to fix all of your problems; it’s an engine that needs to be built meticulously and fueled by loads of data. If you want your chatbot to last for the long haul and be a strong extension of your brand, you need to start by choosing the right tech company to partner with.

At Heyday, we know that, much like Rome, great chatbots aren’t built in a day. As we’ve said, to get your brand where you want it to be in the AI world, you’re going to need data, a whole lot of it and of all kinds. Here are the seven types of data you need to get your hands on:

1. Product data feeds, in which a brand or store’s products are listed, are the backbone of any great chatbot. By integrating with e-commerce platform databases like Shopify, Magento or Demandware, Heyday’s AI chatbot solution can effectively fetch the right product information, including attributes, tags, availability, sizes, descriptions or store location, to answer customers’ most pressing questions and guide them towards the right product recommendations.

By tapping into product catalogs and only surfacing a curated selection of items relevant to a given customer’s needs, you’re creating a more personalized shopping experience, which leads to increased sales and customer satisfaction.


Conversational interfaces are the new search mode, but for them to deliver on their promise, they need to be fed with highly structured and easily actionable data.
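
To make "highly structured and easily actionable" a little more concrete, here is a minimal sketch of a chatbot narrowing a product feed down to a short, relevant selection. The `Product` fields and the `find_products` helper are hypothetical names chosen for illustration; they are not taken from Shopify, Magento, Demandware or Heyday's own schema.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    # Hypothetical fields; a real feed would follow the e-commerce platform's schema.
    title: str
    tags: set[str] = field(default_factory=set)
    sizes: set[str] = field(default_factory=set)
    in_stock: bool = True


def find_products(feed, tag, size=None, limit=3):
    """Return up to `limit` in-stock products matching a tag and an optional size."""
    matches = [
        p for p in feed
        if p.in_stock and tag in p.tags and (size is None or size in p.sizes)
    ]
    return matches[:limit]


# Illustrative sample data only.
feed = [
    Product("Trail Runner", {"shoes", "running"}, {"8", "9"}),
    Product("City Sneaker", {"shoes", "casual"}, {"9", "10"}),
    Product("Road Racer", {"shoes", "running"}, {"9"}, in_stock=False),
]

for product in find_products(feed, tag="running", size="9"):
    print(product.title)
```

The point of the sketch is that every filter (availability, tag, size) only works if the feed exposes that attribute in a consistent, machine-readable form.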

2. Historical data teaches us that, sometimes, the best way to move forward is to look back. The past is often the greatest teacher, and information gathered from call centres or email support threads gives us concrete insight into the overall scope of conversations a brand has had with its customers over time, good and bad alike.

Although phone, email and messaging are vastly different mediums for interacting with a customer, they all provide invaluable data and direct feedback on how a company is doing in the eye of the most prized beholder.

Gleaning information about what people are looking for from these types of sources can provide a stable foundation on which to build a solid AI project. If we look at the work Heyday did with Danone, for example, historical data was pivotal, as the company gave us an export with 18 months’ worth of various customer conversations. Sifting through all of this back-and-forth allowed us to identify topics relevant to the brand, create question clusters and pinpoint edge cases and problems, all of which fed the AI assistant with a decent training dataset starting on day one.
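
As a simplified illustration of grouping past conversations into question clusters, the sketch below buckets messages by keyword. This is an assumption-laden stand-in, not the method used on any real project: the topic names, keywords and `cluster_questions` helper are hypothetical, and a production system would more likely rely on proper language-understanding models than on keyword matching.

```python
from collections import defaultdict

# Hypothetical topic keywords; a real project would derive these from the data.
TOPIC_KEYWORDS = {
    "shipping": {"shipping", "delivery", "deliver"},
    "returns": {"return", "refund", "exchange"},
}


def cluster_questions(messages):
    """Group messages under the first topic whose keywords they mention."""
    clusters = defaultdict(list)
    for message in messages:
        words = set(message.lower().replace("?", "").split())
        topic = next(
            (name for name, keywords in TOPIC_KEYWORDS.items() if words & keywords),
            "uncategorized",  # potential edge cases to review by hand
        )
        clusters[topic].append(message)
    return dict(clusters)


# Illustrative sample messages only.
history = [
    "How much is shipping?",
    "Can I return my order?",
    "Do you deliver on weekends?",
    "Is this product gluten free?",
]
clusters = cluster_questions(history)
```

Messages that land in the "uncategorized" bucket are the interesting ones: they hint at topics or edge cases the assistant does not cover yet.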



3. Customer relationship management (CRM) data is pivotal to any personalization effort, not to mention it’s the cornerstone of any sustainable AI project. Using a person’s previous experience with a brand helps create a virtuous circle that starts with the CRM feeding the AI assistant conversational data. On the flip side, the chatbot then feeds historical data back to the CRM to ensure that the exchanges are framed within the right context and include relevant, personalized information.

The end goal here is to create a 100% holistic customer profile for each user and give everyone a V.I.P. experience, while also maintaining visibility of the customer’s preferences throughout the company, for AI assistants as well as internal teams.

Building this synergy between all the moving parts is how you’ll successfully create a smart and seamless shopping experience.

4. FAQ and knowledge base data is information that is already at your disposal: the content that exists on your website today. This kind of data helps you provide spot-on answers to your most frequently asked questions, like opening hours, shipping costs or return policies.

Mobile customers are increasingly impatient to find answers to their questions as soon as they land on your homepage. However, most FAQs are buried in the site’s footer or in a sub-section, which makes them inefficient and underleveraged. By tapping into the company’s existing knowledge base, AI assistants can be trained to answer repetitive questions and make the information more readily available. Users should be able to get immediate access to basic information, and fixing this issue will quickly smooth out a surprisingly common hiccup in the shopping experience.
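
One very rough way to picture an assistant answering from an existing FAQ is simple word-overlap matching, sketched below. The FAQ entries, the `answer` helper and the `min_overlap` threshold are all hypothetical; a trained assistant would use far more robust language understanding than this.

```python
import re

# Hypothetical FAQ entries; real ones would come from the site's knowledge base.
FAQ = {
    "What are your opening hours?": "We are open 9am to 6pm, Monday to Saturday.",
    "How much does shipping cost?": "Shipping is free on orders over $50.",
    "What is your return policy?": "Returns are accepted within 30 days.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def answer(question, min_overlap=2):
    """Return the answer whose FAQ question shares the most words with the query."""
    query = tokenize(question)
    best = max(FAQ, key=lambda q: len(query & tokenize(q)))
    if len(query & tokenize(best)) < min_overlap:
        return None  # no confident match; better to hand off than to guess
    return FAQ[best]


print(answer("What's the return policy?"))
```

Returning `None` when nothing matches well enough leaves room for a fallback, such as handing the conversation to a human.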

5. Contextual data allows your company to take a local approach on a global scale. AI assistants should be culturally relevant and adapt to local specifics to be useful. For example, a bot serving a North American company will want to be aware of dates like Black Friday, while another built in Israel will need to consider Jewish holidays.

Context is everything when it comes to sales, since you can’t buy an item from a closed store, and business hours are continually affected by local happenings, including religious, bank and federal holidays. Bots need to know the exceptions to the rule and that there is no one-size-fits-all model when it comes to hours of operation.
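
The "exceptions to the rule" idea can be sketched as a regular weekly schedule plus a table of date-specific overrides. The hours, dates and the `hours_for` helper below are made up for illustration and would differ for every store and region.

```python
from datetime import date, time

# Hypothetical schedule: weekday number (Monday=0) -> (open, close).
# Sunday (6) is absent, which this sketch treats as closed.
REGULAR_HOURS = {d: (time(9), time(18)) for d in range(0, 6)}

# Hypothetical local exceptions; None means closed all day.
EXCEPTIONS = {
    date(2022, 12, 25): None,
    date(2022, 11, 25): (time(6), time(22)),  # extended Black Friday hours
}


def hours_for(day):
    """Return (open, close) for a date, or None if the store is closed."""
    if day in EXCEPTIONS:
        return EXCEPTIONS[day]
    return REGULAR_HOURS.get(day.weekday())


print(hours_for(date(2022, 11, 25)))
```

Checking the exceptions table before the regular schedule is what lets a local holiday override the default hours for a single store without touching the rest.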

6. Third-party application programming interfaces (APIs), including booking engines like Sabre for hotels and UPS’ delivery tracking system, for example, help make your chatbot smarter and more useful by giving you access to their functionality for use on your site. This may be a lot of invisible back-end work, but these APIs need to be integrated seamlessly if you want your AI assistant to be able to fetch the right information and deliver it back to a customer in the blink of an eye. In the world of e-commerce, speed is everything, and a time-consuming glitch at this point in the process can mean the difference between a user clicking the purchase button or moving along to a different site.

7. Internal team data is last on this list, but certainly not least. Providing a human touch when necessary is still a crucial part of the online shopping experience, and brands that use AI to enhance their customer service teams are the ones that come out on top. To keep this part of the operation running smoothly, the AI assistant needs to know which agents are available at all times, which departments they work in and what their sales specialties are, in order to appropriately redirect an escalated customer interaction.

When a chatbot can’t answer a question or if the customer requests human assistance, the request needs to be processed swiftly and put into the capable hands of your customer service team without a hitch. Remember, the more seamless the user experience, the more likely a customer will be to want to repeat it.
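
To show why availability, department and specialty data matter for a handoff, here is a minimal routing sketch. The `Agent` fields and the `route` helper are hypothetical and only illustrate the idea; they do not describe how any particular platform assigns conversations.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical fields describing a human agent.
    name: str
    department: str
    specialties: set
    available: bool


def route(agents, department, topic):
    """Pick an available agent in the department, preferring a topic specialist."""
    candidates = [a for a in agents if a.available and a.department == department]
    specialists = [a for a in candidates if topic in a.specialties]
    # Fall back from specialists to any candidate, then to None if nobody is free.
    return (specialists or candidates or [None])[0]


# Illustrative sample data only.
agents = [
    Agent("Ana", "sales", {"skincare"}, True),
    Agent("Ben", "sales", {"footwear"}, True),
    Agent("Cy", "support", {"returns"}, False),
]

chosen = route(agents, "sales", "footwear")
print(chosen.name if chosen else "No agent available")
```

When `route` returns `None`, the assistant knows nobody is free and can offer an alternative, such as a callback or an email follow-up, instead of leaving the customer waiting.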

Data is the fuel your AI assistant needs to run on

A smooth combination of these seven types of data is essential if you want to have a chatbot that’s worth your (and your customers’) time. Without integrating all these types of data, your AI assistant will be useless: much like a car with an empty gas tank, it won’t get very far. Without tons of input data, your chatbot will be reduced to a basic workflow decision tree or the text message equivalent of an interactive voice response phone tree, the thought of which alone can bring up enough memories of endless button-activated menus to turn users off completely.

A broad mix of data types is the backbone of any top-notch business chatbot. Here at Heyday, we often start by performing a data audit for our clients to identify the readily available sources of data that we can tap into, so that your AI assistant’s most valuable asset (you guessed it, that’s your data) provides enough fuel to get to market. Though AI is an ever-changing and evolving entity that is continuously learning from every interaction, starting with a strong foundational database is crucial when trying to turn a newbie chatbot into your team’s MVP.

Looking to find out what data you’re going to need when building your own AI-powered chatbot? Contact us for a free consultation session and we can talk about all the data you’ll want to get your hands on.
