How the CMS works and why.

Promoted uses all of your data to optimize your marketplace. To send data about items (also called "content") and users to Promoted efficiently and asynchronously, use the Content API to interact with the Content Management System (CMS).

If you have information to send to Promoted that is not streaming user engagement and is "static" or "stable" (not rapidly changing or computed dynamically on every Request), then you should use the CMS.

System Design for Engineers

The CMS has a simple design. It is a key-value store (similar to MongoDB) with string keys, JSON-formatted string values, and a REST API. It is attached to a feature store and a variety of update hook functions that transform the data into an efficient binary format for machine learning and production use. The CMS data is kept in a live production cache (similar to Redis) with some expiration limits to save resources and reduce production latency.

The "Users" and "Items" CMSs share the same design but are separate tables. "Users" is kept separate because it may contain user PII, and isolating user data makes GDPR and SOC 2 compliance easier to manage.

By default, we put "contents" referenced by "content ID" in the Item CMS. However, Promoted can load any kind of record from the Item CMS, such as brand features, query features, or sometimes even order information keyed by order ID. The simple key-value JSON design, combined with an efficient serving layer, makes the CMS highly flexible. When in doubt about sending data to Promoted, send it to the Item CMS with a unique key (consider a type prefix).
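
As a rough illustration of the type-prefix suggestion, the sketch below builds prefixed keys so that a brand and an order sharing the same numeric ID do not collide. The `make_key` helper and the `:` separator are assumptions made for this example; they are not a format that Promoted prescribes.

```python
def make_key(record_type: str, record_id: str) -> str:
    """Build a type-prefixed key such as 'brand:123'.

    The ':' separator is an arbitrary choice for this sketch.
    """
    if not record_type or not record_id:
        raise ValueError("record_type and record_id must be non-empty")
    return f"{record_type}:{record_id}"


# The same raw ID yields distinct keys for different record types.
brand_key = make_key("brand", "123")
order_key = make_key("order", "123")
```

Whatever convention you choose, keep it stable, because the key is the only handle for overwriting the record later.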

When Promoted reads records from the CMS, each "source" is a separate table. Promoted reads across all sources with the same ID and unions the values together for use in Delivery.
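
The read path described above can be pictured with a small in-memory sketch. The `tables` dictionary, the source names, and `read_merged` are hypothetical stand-ins, and the behavior when two sources write the same field is an assumption, since this page does not specify it.

```python
from typing import Any, Dict

# Hypothetical stand-in for the CMS: one "table" per source, keyed by content ID.
tables: Dict[str, Dict[str, Dict[str, Any]]] = {
    "catalog": {"item-1": {"title": "Red shoe", "brand": "Acme"}},
    "inventory": {"item-1": {"in_stock": True, "quantity": 12}},
}


def read_merged(content_id: str) -> Dict[str, Any]:
    """Union the values stored under the same ID across all sources."""
    merged: Dict[str, Any] = {}
    # Assumption: on a field-name clash, the later source name (sorted) wins.
    for source in sorted(tables):
        merged.update(tables[source].get(content_id, {}))
    return merged


record = read_merged("item-1")
```

Keeping field names disjoint across sources avoids depending on any particular conflict rule.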

PUT versus PATCH

Use PUT. Promoted supports PATCH. It works as expected: it updates the JSON record with the patch. Managing the distributed state of complex records over time is error-prone and Promoted strongly advises against using PATCH. PATCH is supported because some customers require it due to limitations beyond their control. If you have no such limitations, use PUT.

Why PUT

PUT overwrites the record with the value provided, so you know exactly the state of the record: the state that you wrote. If there is a mistake or change, simply PUT again.
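
To make the difference concrete, here is a minimal local simulation, not the Content API itself. It assumes PATCH behaves like a shallow merge of the supplied fields, which is a simplification chosen for illustration; the `put` and `patch` helpers and the record fields are hypothetical.

```python
import copy
from typing import Any, Dict

Store = Dict[str, Dict[str, Any]]


def put(store: Store, key: str, value: Dict[str, Any]) -> None:
    """Overwrite the whole record: the stored state is exactly what was written."""
    store[key] = copy.deepcopy(value)


def patch(store: Store, key: str, partial: Dict[str, Any]) -> None:
    """Assumed PATCH semantics for this sketch: shallow-merge fields into the record."""
    store.setdefault(key, {}).update(copy.deepcopy(partial))


put_store: Store = {}
patch_store: Store = {}
initial = {"title": "Red shoe", "on_sale": True}
put(put_store, "item-1", initial)
put(patch_store, "item-1", initial)

# The sale ends, so the writer's new record no longer contains "on_sale".
new_record = {"title": "Red shoe"}
put(put_store, "item-1", new_record)      # stale field is gone
patch(patch_store, "item-1", new_record)  # stale field survives the merge
```

After the second write, the PUT store holds only what the writer last sent, while the PATCH store still carries `on_sale` from the earlier write. Clearing it would require the writer to remember the field existed and remove it explicitly.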

Managing Multiple Writers

Use a separate source for each writer and use PUT. That way, your different writers do not have to coordinate to manage a unified record state.

Data Format

Send data as schema-free JSON. Promoted's systems will automatically transform the data into features appropriate for automated machine learning, blending and allocation rules, targeting, and reporting. [More information on how feature transformations work.] The more data you send to Promoted, the better Promoted can optimize your search, ads, and marketplace. There is no required schema for the data you send: send us anything and everything.

Common Integration Patterns

The most common integration is to:

  1. Upload a batch of content items
  2. Upload a batch of users

The ideal integration with Content API is for your servers to listen to database updates (through change data capture or directly in your API) and write to Content API immediately after a Content or User record changes. Promoted has a streaming system, so updates take effect in the Delivery system within seconds. Fresher data improves the accuracy of our predictions. We recommend that you send the entire record via PUT rather than sending diffs via PATCH. Break records into different "sources" with separate writers to manage state.
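
A change-driven writer might look roughly like the sketch below, which builds a PUT request carrying the full record whenever a row changes. The base URL, path layout, `source` query parameter, and `x-api-key` header are all hypothetical placeholders; take the real endpoint and authentication details from the Content API reference. The request is only constructed, not sent, so the example runs offline.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

BASE_URL = "https://content-api.example.com"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"  # placeholder credential


def build_put_request(content_id: str, source: str, record: Dict[str, Any]) -> urllib.request.Request:
    """Build a PUT that replaces the whole record for (content_id, source)."""
    path = urllib.parse.quote(content_id, safe="")
    query = urllib.parse.urlencode({"source": source})
    return urllib.request.Request(
        url=f"{BASE_URL}/content/{path}?{query}",  # hypothetical URL layout
        data=json.dumps(record).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "x-api-key": API_KEY},
    )


def on_row_changed(row: Dict[str, Any]) -> urllib.request.Request:
    """Hypothetical change-data-capture hook: send the entire row, not a diff."""
    return build_put_request(str(row["id"]), "catalog", row)


request = on_row_changed({"id": "item-1", "title": "Red shoe", "price": 59.0})
# urllib.request.urlopen(request) would send it; omitted here on purpose.
```

Because each event carries the complete record, a dropped or reordered event is repaired by the next PUT for the same key.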

The latency from a CMS write to its effect in Delivery is roughly one to a few seconds. This latency matters for ads, particularly for bids, budgets, and live state.

You can also send writes in daily batches or from a manually run script, again using PUT. Large batch PUTs are useful for backfilling or for periodically realigning record state.

See tutorial for a how-to guide.


Writing to Content API from multiple sources

In your system, content data may be processed by multiple processes and distributed across various tables. Instead of requiring a unified writer to Content API or attempting partial Content API updates, we support multiple writers, each identified by a different source. Each document in the Content Store is keyed by contentId and source, and Promoted internally merges these documents. This mechanism is analogous to the column or document grouping used in other database systems.
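
The write side of this design can be sketched with a dictionary keyed by the (contentId, source) pair. Everything here (the `store`, the `put_document` and `merged_view` helpers, and the source names) is a hypothetical in-memory model for illustration, not Promoted's implementation.

```python
from typing import Any, Dict, Tuple

# Hypothetical model of the Content Store: one document per (contentId, source).
store: Dict[Tuple[str, str], Dict[str, Any]] = {}


def put_document(content_id: str, source: str, document: Dict[str, Any]) -> None:
    """Each writer overwrites only its own (contentId, source) document."""
    store[(content_id, source)] = dict(document)


def merged_view(content_id: str) -> Dict[str, Any]:
    """Merge every source's document for one content ID."""
    merged: Dict[str, Any] = {}
    for key in sorted(store):
        if key[0] == content_id:
            merged.update(store[key])
    return merged


# Two independent writers, one source each.
put_document("item-1", "catalog", {"title": "Red shoe", "brand": "Acme"})
put_document("item-1", "pricing", {"price": 59.0, "on_sale": True})

# The pricing writer later replaces its document; the catalog data is untouched.
put_document("item-1", "pricing", {"price": 49.0})
view = merged_view("item-1")
```

The pricing writer cleared `on_sale` simply by omitting it from its next PUT, without knowing anything about the catalog writer's fields.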

While we provide the option to partially update documents via the PATCH method, we strongly recommend using distinct sources. The syntax and nuances of clearing values with PATCH can be intricate and susceptible to errors.

There isn't a cap on the number of sources. However, for optimal performance, we suggest maintaining a minimal, fixed number of sources for each content ID. If your requirements exceed ten sources for a single content ID, please consult with the Promoted engineering team to discuss your specific needs.