How Amazon Governs Product Information for 350M+ Products

This article is the first in a series that aims to distill the knowledge I gained while working at Hopstack. Its topic is something I dove into when I was the lead PM for Hopstack Ignite.

I'm willing to bet that everyone reading this has made at least one purchase from Amazon in their lifetime. Amazon has made it so convenient to get products from around the world to your doorstep within 1-10 days that it's become ubiquitous in the lifestyle of the average internet-connected individual.

However, powering this convenience is a monumental task. Running an ecommerce business comes with a series of challenges, including but not limited to supply chain, last-mile delivery, address validation (especially in a country like India), inventory management, and maximizing operational efficiency. But one key issue, one that sits upstream of all the aforementioned challenges, is managing product information.

Based on the sources I referred to while writing this article, Amazon reportedly has more than 350 million unique products that can be sold on its platform across all geographies. This means that Amazon has to source, manage, and maintain the accuracy of information for each of these products in near-real-time, for all sellers and across all its marketplaces. This includes information like:

  • Images

  • Product Names

  • Description

  • Features

  • A+ Content

  • Dimensions (Weight, Length x Width x Height of Product, LxWxH of Box)

  • Identifiers (Barcode Numbers like UPC, EAN, Model Numbers, Variant Numbers, etc.)

Ensuring that all product pages that refer to the same product show the same information is of paramount importance, and Amazon knows that. This is because showing incomplete or inconsistent information can dissuade customers from buying from a particular seller. This dissuasion can be triggered by something as simple as a difference in product name.
For example, if one seller writes their product name as "sony ps5" and the other writes it as "Sony PlayStation®5 Digital Edition (Slim)", which one would you be more likely to buy? (For me, it would be the latter. It's more descriptive and gives me a sense of trust that the seller knows what they are selling.)

So, we've established that incomplete product information is bad for sellers. We can also reasonably estimate that not every single seller is going to keep their product information up-to-date. 100% uniformity in any billion-scale operation is unheard of, and frankly, improbable enough to be impossible.

As a platform, Amazon has a stronger incentive than any individual seller to keep product information accurate, because it wants customers to complete the sale on Amazon. Amazon doesn't care much about which seller fulfills the order; it just wants the purchase to happen on its website instead of a competitor's.

So having said this, how does Amazon ensure the accuracy of product information for every single seller across the globe?

Amazon's Approach: Catalogs & Listings

There are two main types of sellers that sell products on Amazon:

  1. Vendors - These are the manufacturers of the goods.

  2. Third-party sellers - These are external parties who have authorization from the manufacturers to sell goods.

An example of this would be a Sony PlayStation 5 being sold by Sony themselves (vendor) as well as their authorized distributors (third-party sellers) on Amazon.

A PS5 on Amazon US is sold by Sony itself + five other third-party sellers

Since the number of third-party sellers is uncapped, and Amazon's primary goal is to maintain the product-identifying information for a better user experience, they decoupled this information into two main categories:

  1. Catalog Entries

  2. Listings

Catalog Entries

The Amazon catalog stores and maintains product-level information about each one of those 350 million products. This includes information that is uniform across all marketplaces for a product, such as:

  1. Name of the Product

  2. Description and other A+ content

  3. Universal identifiers like GTIN, MPN, EAN (Europe), UPC, ISBN (Books) and so on

  4. Dimensions - gross and net weight, height, width, length

  5. Dangerous goods classification based on UN classification (global) (pages 3 to 5) or Department of Transportation criteria (US only)

The onus of maintaining this information is on Amazon. They maintain this information with the help of vendors and trademark owners. If someone is self-branding a generic product AND owns the trademark for the product they wish to sell, they will submit the above information to Amazon to be able to sell it.
But if a seller wants to sell an iPhone 15, they will use the catalog information uploaded by Apple onto Amazon. They do not have to create their own.

Each entry in this catalog is identified by a unique identifier, which Amazon calls the ASIN (Amazon Standard Identification Number). It's a 10-character alphanumeric code, and one is assigned to every catalog entry.

Amazon went with a 10-character code for all products because the first product category they started with (books) relied on the International Standard Book Number (ISBN), which was also a 10-character code at the time. The ISBN has since been revised to 13 digits.
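
To make the two identifier formats concrete, here is a minimal sketch in Python. The function names are my own, and the ASIN check only tests the "10 alphanumeric characters" shape described above (assuming uppercase letters); it is not Amazon's validation logic. The ISBN conversion follows the published ISBN-13 rule: prepend 978, drop the old check character, and recompute the check digit with alternating weights of 1 and 3.

```python
import re

# Shape check only: 10 alphanumeric characters, as described in the article.
# Assumption: letters are uppercase. This is not Amazon's own validation.
ASIN_SHAPE = re.compile(r"[A-Z0-9]{10}")


def looks_like_asin(code: str) -> bool:
    """Return True if the string has the 10-character alphanumeric shape."""
    return ASIN_SHAPE.fullmatch(code) is not None


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-character ISBN to its 13-digit form."""
    chars = isbn10.replace("-", "")
    if len(chars) != 10:
        raise ValueError("expected a 10-character ISBN")
    # Drop the old check character and prepend the 978 prefix.
    body = "978" + chars[:9]
    if not body.isdigit():
        raise ValueError("the first nine ISBN characters must be digits")
    # ISBN-13 check digit: weights alternate 1, 3 across the 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)
```

For instance, `isbn10_to_isbn13("0306406152")` returns `"9780306406157"`, and a made-up ASIN-shaped string such as `"B0EXAMPLE1"` passes `looks_like_asin` while `"sony ps5"` does not.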

Listing

Listings are seller-specific entries built on top of catalog entries. They contain information that typically varies from one seller of a product to another, such as:

  • SKU ID - Seller-specific identifier i.e. what does a seller call this product in their operations,

  • Pricing information - Cost Price, Sale Price, transport costs, etc,

  • Lot ID, Batch ID, expiry date, production date, etc. - to gauge the freshness of perishable products,

  • Variant information - Color, taste, size, texture, material, region, etc.

The onus of maintaining this information is on the seller. And because of the information split, sellers are incentivized (and obligated by Amazon Seller Guidelines) to maintain this information accurately.

Each catalog entry (ASIN) will have at least one associated listing for any seller that's selling that product on Amazon. These listings are uniquely identified by an FNSKU ID. FNSKU stands for Fulfillment Network Stock Keeping Unit. It's an identifier that uniquely identifies the product as well as the seller selling it. It is also usually a 10-character alphanumeric code and must be applied to every product that is sold through Amazon. You may have seen an FNSKU barcode applied to products that you have purchased through Amazon. Here's an example:

In summary, Amazon maintains accurate product information for all its products in its catalog (uniquely identified by ASIN), and every seller who wants to sell a particular product creates a listing based on the catalog entry (uniquely identified by an FNSKU ID).
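
The split summarized above can be sketched as a small data model. This is an illustration under my own assumptions, not Amazon's actual schema: the class names, field names, identifiers, seller IDs, and prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    asin: str          # one per product, shared by every seller
    name: str
    description: str


@dataclass
class Listing:
    fnsku: str         # identifies the product *and* the seller
    asin: str          # points back at the shared catalog entry
    seller_id: str
    sku: str           # the seller's own name for the item
    price: float


# Hypothetical data: one catalog entry, two sellers listing against it.
catalog = {
    "B0EXAMPLE1": CatalogEntry(
        "B0EXAMPLE1",
        "Sony PlayStation 5 Digital Edition (Slim)",
        "Game console",
    ),
}

listings = [
    Listing("X00EXAMPL1", "B0EXAMPLE1", "seller-a", "ps5", 449.0),
    Listing("X00EXAMPL2", "B0EXAMPLE1", "seller-b", "PS5-SLIM-DIG", 455.0),
]


def offers_for(asin: str) -> list[tuple[str, str, float]]:
    """Join every listing for an ASIN with the shared catalog name."""
    entry = catalog[asin]
    return [
        (entry.name, listing.seller_id, listing.price)
        for listing in listings
        if listing.asin == asin
    ]
```

Because the product name lives only on the catalog entry, every offer returned by `offers_for` carries the same name regardless of what each seller calls the item internally (`ps5` versus `PS5-SLIM-DIG`).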

Some other salient features of catalog and listings include:

  1. A seller can have more than one listing for the same catalog entry.

    1. Example - They may have one listing for wristbands that they sell via FBA and another one for FBM. They may also have different listings for different conditions i.e. a brand-new iPhone and a refurbished iPhone.

  2. A seller can edit existing listings i.e. prices can be changed, fulfilment channels can be changed, marketplaces can be changed, etc.

  3. A catalog entry CANNOT be edited by anyone except Amazon, and such edits are infrequent. In the rare case that one happens, changes can be expected in the name, description, and rarely, the ASIN.
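
The three rules above can be sketched with Python dataclasses. This is only an analogy for who may change what, not a description of how Amazon enforces it; the identifiers, seller ID, and prices are made up, and `try_rename` is a hypothetical helper.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class CatalogEntry:
    # Frozen: stands in for "sellers cannot edit the catalog entry".
    asin: str
    name: str


@dataclass
class Listing:
    # Mutable: the seller owns and maintains this record.
    fnsku: str
    asin: str
    seller_id: str
    channel: str      # "FBA" or "FBM"
    condition: str
    price: float


entry = CatalogEntry("B0EXAMPLE2", "Wristband")

# Rule 1: one seller, two listings for the same catalog entry.
fba = Listing("X00EXAMPL3", entry.asin, "seller-a", "FBA", "new", 9.99)
fbm = Listing("X00EXAMPL4", entry.asin, "seller-a", "FBM", "new", 8.99)

# Rule 2: the seller can edit their own listing.
fbm.price = 8.49


# Rule 3: an attempted edit to the catalog entry is rejected.
def try_rename(e: CatalogEntry, new_name: str) -> bool:
    try:
        e.name = new_name
    except FrozenInstanceError:
        return False
    return True
```

Here `try_rename(entry, "wristband cheap")` returns `False` and leaves `entry.name` untouched, while the price change on the seller's own listing goes through.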

Edge Case: Commingled Inventory

Amazon also has the concept of commingled inventory.
When items are commingled, the physical item is not attached to the seller; only the item's identity is. When an order is placed for that item through a specific seller, the order is credited to that seller, but the unit that ships might be any one of the identical units currently stored in Amazon's warehouses.
As a result, these products do not have a separate FNSKU label generated for them. In this case, the FNSKU label will be the same as the ASIN label.
In some cases, the uniqueness of a product can also be determined from its manufacturer-provided barcodes like UPC/EAN (the barcodes that you see on branded products, even the ones you buy from non-Amazon retailers like shops, other ecommerce sites, etc.)
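
A minimal sketch of the commingling idea, assuming a single pool of units per product: stock is counted per ASIN rather than per seller, and a sale is credited to whichever seller the customer ordered from. The function names, the ASIN, and the seller IDs are hypothetical, and a real system would also need to track how many units each seller contributed, which this sketch leaves out.

```python
from collections import Counter

# Units in the warehouse, pooled per product rather than per seller.
pooled_stock = Counter()
# Number of sales credited to each seller.
credited_sales = Counter()


def receive(asin: str, units: int) -> None:
    """Inbound units join the shared pool; the sender is not stored per unit."""
    pooled_stock[asin] += units


def fulfil(asin: str, seller_id: str) -> bool:
    """Ship any identical unit from the pool and credit the chosen seller."""
    if pooled_stock[asin] <= 0:
        return False
    pooled_stock[asin] -= 1
    credited_sales[seller_id] += 1
    return True


receive("B0EXAMPLE1", 3)            # sent in by seller-a
receive("B0EXAMPLE1", 2)            # sent in by seller-b
fulfil("B0EXAMPLE1", "seller-b")    # customer bought from seller-b
```

After this runs, the pool holds four units and `seller-b` has one credited sale, even though nothing records whose physical unit actually left the shelf.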

Conclusion

This is the invisible dance that occurs in the background of the largest online retailer in the world. It enables Amazon to clock an average daily sales revenue of $1.6 billion from products sold by an army of close to 10 million sellers worldwide, to 310 million active customers like (potentially) you and me.

One key thing that I've seen in my limited experience in business so far is that a lot of effort goes into making something feel effortless. This might seem like a monumentally complex approach to readers who haven't had a lot of exposure to this field, but the effort Amazon puts in ensures a simpler and safer shopping experience for its users.
It protects users from:

  1. Incomplete and inaccurate information, potentially resulting in unintended transactions,

  2. Getting stuck in long support loops to reverse the consequences of these unintended transactions (which may be irreversible in a lot of cases),

  3. Taking uninformed buying decisions.

As much as I've developed a deep disdain for working with Amazon infrastructure as a software solutions provider in the past, Jeff Bezos' obsession with customer centricity is the main reason why Amazon is an industry pioneer.

And considering that I don't have to deal with the solutions provider side of the coin anymore, you can expect my Prime subscription to auto-renew for the foreseeable future.

—————

Sources