How can I use AI to Categorize Product Data?

Is there a better way to leverage AI to categorize product data?

Have you ever tried searching for a product on your favorite online shopping site, only to be disappointed when you couldn’t find what you were looking for? Most product site search engines rely on accurate product categorization attributes to narrow the search results for a user.

In this article we’re going to look at the impact that proper categorization has on search and how it’s now possible to automate product categorization with a machine learning model.

What is Categorization?

Categorization starts with a well-designed product category taxonomy. The product taxonomy defines how product types relate to one another. The first couple of levels of a product taxonomy contain broad category labels. For a grocery taxonomy, the top levels might be organized by departments within the store. It’s a logical representation of the way that a shopper would look for a given product in the physical store. A taxonomy is often referred to as a “product tree”, with each product category referred to as a “branch” and each individual item referred to as a “leaf” on that branch.

Grocery taxonomy example:

  1. Meat & Seafood

    1. Fresh Meat

      1. Ribs

      2. Smoked Ham

      3. Specialty Meat

      4. Kosher Meat

      5. ...

    2. Fresh Seafood

    3. Packaged Meat

    4. Packaged Seafood

  2. Produce

  3. Deli

  4. Bakery

  5. Adult Beverages

  6. Beverages

  7. Floral

  8. ...
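One way to picture the product tree is as a nested mapping. The sketch below is only an illustration: the names `TAXONOMY`, `category_paths` and `is_valid_path` are hypothetical, and the entries elided with "..." in the list above are left out rather than guessed.

```python
# Hypothetical in-memory form of the grocery taxonomy listed above.
# Each key is a branch; an empty dict means no child branches were listed.
# Entries shown as "..." in the article are intentionally omitted.
TAXONOMY = {
    "Meat & Seafood": {
        "Fresh Meat": {
            "Ribs": {},
            "Smoked Ham": {},
            "Specialty Meat": {},
            "Kosher Meat": {},
        },
        "Fresh Seafood": {},
        "Packaged Meat": {},
        "Packaged Seafood": {},
    },
    "Produce": {},
    "Deli": {},
    "Bakery": {},
    "Adult Beverages": {},
    "Beverages": {},
    "Floral": {},
}


def category_paths(tree, prefix=()):
    """Yield every branch in the tree as a tuple path starting at the root."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from category_paths(children, path)


def is_valid_path(tree, path):
    """Return True if the path names an existing branch of the taxonomy."""
    node = tree
    for name in path:
        if name not in node:
            return False
        node = node[name]
    return True


if __name__ == "__main__":
    # Print each branch as a breadcrumb, e.g. "Meat & Seafood > Fresh Meat > Ribs".
    for path in category_paths(TAXONOMY):
        print(" > ".join(path))
```

Storing each category as a full path from the root makes it cheap to check that a product has been filed on a branch that actually exists.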

Before a new product can be added to the online product catalog, it needs to be categorized into the correct level of the product taxonomy. This is easy enough for a human to do for a single product. However, when you have thousands and thousands of products, it becomes a tedious process.

Why is Categorization Important?

The science of search has evolved over the last two decades. Determining the searcher’s intent from one or two words is not a simple process, and we’re not going to dive into that in this article. However, in the specific use case of product search for an ecommerce website, most shoppers will include the object of their intent as part of the search input. In most cases this data can be used to quickly narrow the result set based on the product taxonomy. After all, the consumer isn’t looking for organic lettuce in the seafood section, nor would they be looking for seafood in the produce section. So one method to quickly reduce the search breadth is to narrow the search to a specific sub-branch of the product taxonomy.

The downside is that improperly categorized products can become “lost”. When a product is mis-categorized on the wrong branch of the taxonomy, the search engine may either (1) not find the product or (2) relegate the mis-categorized product to the bottom of the search results.

Don’t believe me? Try this: go to your favorite ecommerce provider, search for something, and then go to the last page of the search results. What do you find there? Don’t let this happen to your product catalog.
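A small sketch can make the effect concrete. It assumes a toy catalog and a naive title-matching search. The names `Product`, `CATALOG` and `search` are hypothetical, and real search engines rank results in far more sophisticated ways. The point is only that narrowing to a branch hides anything filed on the wrong branch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    title: str
    category: tuple  # path in the taxonomy, e.g. ("Produce",)


# Hypothetical catalog. The last item is deliberately filed on the wrong branch.
CATALOG = [
    Product("Organic Romaine Lettuce", ("Produce",)),
    Product("Iceberg Lettuce", ("Produce",)),
    Product("Atlantic Salmon Fillet", ("Meat & Seafood", "Fresh Seafood")),
    Product("Organic Butter Lettuce", ("Meat & Seafood", "Fresh Seafood")),
]


def search(catalog, query, branch=None):
    """Return products whose title contains every word of the query.

    If branch is given, only products filed under that branch are considered.
    """
    words = query.lower().split()
    results = []
    for product in catalog:
        # Skip products whose category path does not start with the branch.
        if branch is not None and product.category[:len(branch)] != tuple(branch):
            continue
        title = product.title.lower()
        if all(word in title for word in words):
            results.append(product)
    return results


if __name__ == "__main__":
    everywhere = search(CATALOG, "organic lettuce")
    in_produce = search(CATALOG, "organic lettuce", branch=("Produce",))
    print([p.title for p in everywhere])
    print([p.title for p in in_produce])
```

Searching the whole catalog finds both organic lettuces, but once the search is narrowed to the Produce branch, the butter lettuce that was filed under Fresh Seafood no longer appears.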

In addition, the product category for a given catalog item can help define the product schema that should be employed to display the product information for the consumer on the product data page. The schema can also help define the meaning of generic product attributes, depending on the product type.

What is ATOM?

ATOM is the product categorization service from IceCream Labs. We developed ATOM as an API service which can be accessed automatically from your product information manager. ATOM takes a product title or description as an input and outputs the recommended product category for the item. ATOM is powered by a machine learning model that has been trained on millions of product records. It’s constantly learning as it processes new data.

With ATOM, you can properly categorize or validate a new product item before accepting it into your production product catalog.
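To give a feel for what "title in, category out" means, here is a minimal sketch of a text classifier used as a validation gate. It is not ATOM's API or model, which are not described here. It is a tiny Naive Bayes stand-in written with the Python standard library, and the training titles, the class `NaiveBayesCategorizer` and the function `validate` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a title and split it into word tokens."""
    return text.lower().split()


class NaiveBayesCategorizer:
    """Tiny multinomial Naive Bayes text classifier with add-one smoothing."""

    def fit(self, titles, categories):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.category_counts = Counter(categories)
        self.vocabulary = set()
        for title, category in zip(titles, categories):
            tokens = tokenize(title)
            self.word_counts[category].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, title):
        """Return the category with the highest log-probability for the title."""
        total_docs = sum(self.category_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, doc_count in self.category_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # log prior
            for token in tokenize(title):
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical labeled examples; a real model would need far more data.
TRAINING = [
    ("smoked baby back ribs", "Fresh Meat"),
    ("bone-in smoked ham", "Fresh Meat"),
    ("wild atlantic salmon fillet", "Fresh Seafood"),
    ("raw jumbo shrimp", "Fresh Seafood"),
    ("organic romaine lettuce", "Produce"),
    ("fresh organic carrots", "Produce"),
]

model = NaiveBayesCategorizer().fit(*zip(*TRAINING))


def validate(title, supplied_category):
    """Return (accepted, predicted): accept only if the categories agree."""
    predicted = model.predict(title)
    return predicted == supplied_category, predicted


if __name__ == "__main__":
    print(validate("organic butter lettuce", "Fresh Seafood"))
```

In a workflow like this, an item whose supplied category disagrees with the prediction would be held back for review instead of going straight into the production catalog.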

To learn more about ATOM, or see a demo, contact our sales team: sales@icecreamlabs.com

The core components of Artificial Intelligence

The massive surge in AI and the buzz surrounding it sometimes make it difficult to get a handle on the technology and the pieces around it. Here we make a small attempt to explain the core components, services and pieces of AI.


Graphics Processing Units or GPUs - Nvidia, the largest maker of GPUs, had its niche in the gaming market. Its cards made such a lasting impression that no gamer would want to be caught without one. That changed with the arrival of Bitcoin and the heavy interest in blockchain. GPUs were no longer restricted to gaming: people relied on them to mine bitcoins.
With the current buzz around AI and the rush to adopt it, people have realized the benefit of using GPUs for AI and deep learning. GPUs are essentially high-end graphics cards that plug into regular servers. The cards, together with an extensive software stack, make it easy to process large volumes of data in complex AI models. GPUs are expensive, but they can cut processing time from months to days. Now you would rarely see a deep learning developer without a GPU. Nvidia, for its part, has built an extensive software stack to support the use of these GPUs. Its CUDA platform and the libraries built on it are a great resource.


With the hardware in place, frameworks and libraries are what machine learning applications are built on. MATLAB was widely used for experimentation, and the R programming language was also heavily used.
When Python libraries became available, developers switched almost immediately. scikit-learn is the Python framework most developers use for traditional machine learning.
The big push towards deep learning has led to some great competing frameworks. Most recently, TensorFlow from Google has gained popularity, as has PyTorch. Others in the same field are Keras, Caffe and Theano.
Every framework has its pros and cons. Adoption is really driven by the community available to support people, since all of these frameworks are free and open source rather than commercially sold. TensorFlow is pushing the limits with awesome features for handling text, images and more, and has been widely used within Google. Our teams here are currently in love with PyTorch.


All the big cloud providers, such as Amazon Web Services (AWS), Google Cloud Platform and Microsoft Azure, have a portfolio of AI or machine learning APIs offered as cloud services. They offer text classification, sentiment analysis, image classification, etc. These services can be plugged in to solve simple problems within a developer’s application. For instance, as a broad, high-level check, you can use a sentiment analysis API to see whether your customer feedback is negative or positive.

AI Applications

These are the applications used by customers, which leverage AI to solve specific problems. Siri and Alexa are ideal examples. Another great example is the recommendation engines that we see on Amazon, Walmart, etc. Some of the most common applications are found in cars that leverage technology from Mobileye, with features such as lane assist, parking sensors, and pedestrian or obstacle detection systems.
Our teams at IceCream Labs have spent the better part of the last 18 months building applications focused on catalog management and merchandising for retailers and brands.


We leverage GPUs, TensorFlow, PyTorch, Keras and Caffe. We don’t use the standard cloud APIs, as they are not sufficient for the problems we solve. Our application, CatalogIQ, can intelligently score the quality of product content, automatically classify products, generate keywords, and improve content for SEO and search.

By leveraging secure private clouds, we can do this seamlessly for anything from 1 product to 100 million products. That, really, is the power of AI.