5 Cases Where NoSQL Shines in Ad Tech
Digital advertising is undeniably a data-driven industry, one where the value of NoSQL databases is demonstrated each day. In fact, every one of the ad tech companies Thumbtack has worked with relies upon NoSQL databases to enable segments of its mission-critical functions.
Many markets have technology demands unique to their business, and the tech that drives advertising sales is certainly one of them. Operational activities such as cookie matching (and other user-tracking techniques); user, site, and audience profiling; fraud detection; and campaign data caching, together with analytical needs such as campaign performance reporting, real-time classification and segmentation pipelines, and user history analysis, exemplify the high-load, real-time, data-intensive workloads specific to ad tech.
These activities need to be executed with unprecedented speed and scale, which places significant demand on database systems. Here are 5 common ad tech platform activities where traditional RDBMSs struggle to meet throughput and latency requirements:
Real-time bidding (RTB) systems like AppNexus, Chango, and SiteScout rely upon stored user profiles (campaign data), which need to be read within fractions of a second during auctions to inform which publishing space to bid on, what price to bid, and finally which version of a creative ad to serve.
Data management platforms like Marin Software’s Audience Marketing Suite handle massive classification/segmentation pipelines, often driven by continuous machine learning processes, to classify billions of users, demographic information, site traffic, and desired audiences, and then serve that information quickly to drive both direct and programmatic ad sales.
To continuously gather more information about users through cookie matching techniques, targeting platforms use functions like AdForm’s Universal Tag Management, which work either in the background, via a Hadoop-like bulk process, or in real time with fast key/value storage like Couchbase, Aerospike, or Redis.
Online ad fraud detection services like Forensiq process up to 1 trillion bid requests per month. Technically, it looks like this: logs and user behavior statistics go into Hadoop; MapReduce jobs analyze them and create profiles for users, devices, sites, etc., which are then stored in a fast K/V database like Aerospike. The real-time bidder can then verify the user request against the real-time data in the database to determine whether it is fraudulent.
Complex data queries, like those used by PlaceIQ’s Consumer Insights platform, and other cases such as time series data, traversal data, graphs, or nested structures cannot be handled effectively using relational data models. Almost all ad tech components use these types of structures from time to time, depending upon which algorithms are being used.
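The fraud detection flow described above (a batch job builds profiles, a fast K/V store holds them, and the bidder checks each request against them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how any named vendor implements it: the log records, the `build_profiles` and `check_bid_request` helpers, and the click-rate rule are all hypothetical, and a plain Python dict stands in for a store like Aerospike.

```python
from collections import defaultdict

# Hypothetical log records: (device_id, site, clicked)
LOGS = [
    ("dev-1", "news.example", False),
    ("dev-1", "blog.example", True),
    ("dev-2", "news.example", True),
    ("dev-2", "news.example", True),
    ("dev-2", "news.example", True),
    ("dev-2", "news.example", True),
]


def build_profiles(logs):
    """Batch step (stands in for the MapReduce job): aggregate per-device stats."""
    counts = defaultdict(lambda: {"requests": 0, "clicks": 0})
    for device_id, _site, clicked in logs:
        counts[device_id]["requests"] += 1
        counts[device_id]["clicks"] += int(clicked)
    return {device: dict(stats) for device, stats in counts.items()}


def is_suspicious(profile, max_click_rate=0.9, min_requests=3):
    """Illustrative rule only: a very high click rate over enough requests."""
    if profile is None or profile["requests"] < min_requests:
        return False
    return profile["clicks"] / profile["requests"] > max_click_rate


# Assumption: a plain dict stands in for the fast key/value database.
kv_store = build_profiles(LOGS)


def check_bid_request(device_id):
    """Real-time step: one key lookup, then a cheap rule check."""
    return is_suspicious(kv_store.get(device_id))
```

The point of the split is that the expensive aggregation happens offline, so the work done during an auction is reduced to a single key lookup, which is the access pattern key/value stores are built for.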
But Wait: RDBMS Is Still Best in Class for Many Use Cases
This does not mean, however, that all data should be migrated to NoSQL data storage, or that NoSQL solutions are one size fits all (they are not). There are many instances where datasets are best served by a traditional RDBMS. Billing information, which is transactional, needs to be highly durable, and doesn’t require high-volume processing, is best served by the ACID qualities at the heart of RDBMS.
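To make the billing case concrete, here is a small sketch of the atomicity that ACID provides, using Python's built-in `sqlite3` module as a stand-in for a production RDBMS. The `accounts` table, the account names, and the `charge` helper are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('advertiser', 5000), ('publisher', 0)")
conn.commit()


def charge(conn, amount_cents):
    """Move funds from advertiser to publisher atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance_cents = balance_cents + ? "
                "WHERE id = 'publisher'",
                (amount_cents,),
            )
            conn.execute(
                "UPDATE accounts SET balance_cents = balance_cents - ? "
                "WHERE id = 'advertiser'",
                (amount_cents,),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


def balances(conn):
    """Return the current balances as {account_id: balance_cents}."""
    return dict(conn.execute("SELECT id, balance_cents FROM accounts"))
```

If the second update fails, the first is rolled back as well, so the publisher is never credited for money the advertiser did not pay. That all-or-nothing guarantee is what makes a relational database a natural fit for billing data.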
Another category where NoSQL is not necessarily the right solution is information that needs to be accessed or analyzed by different departments in different ways. In other words, flexible access patterns are crucial for the range of analytical tasks (e.g., accounts payable, ad ops, and sysops departments need different reports generated from the same data set). This is where Hadoop, Storm, and Spark solutions are coming into play.
Coming in Part Two:
An overview of the 4 basic considerations to help in the decision-making process about what and when to migrate to a more modern database, and the key differences between popular NoSQL solutions.