Maximise advertising impact and return through real-time queries!
As a core part of monetising internet content and services, our SAPO web portal and its associated pages, which currently receive more than 2M visitors per day, serve graphical ads. Customers buy packs of advertisement prints (impressions), subject to a set of configurations and rules governing ad consumption. The algorithm behind the AdServer then decides which advertisement to print for each user on a specific page at any given time, attempting to maximise returns. For this decision, the AdServer relies on the history of advertisement printing events, web analytics and user-specific information.

The normal running of the AdServer produces large amounts of information. Each printing event contains, to name a few items, timestamps, geo-referencing, the visited URL, ISP and DNS data, user-specific data, and the keywords used in the search that led to the page where the ad is served. The dataset is highly relational and at times heterogeneous. On top of printing events there are traditional web analytics events, where we aggregate relational logging information describing user flow on our web pages.

The overall number of events relevant to the AdServer is on the order of 3,000 per second, generating over 100 GB/day, or more than 36.5 TB/year. This poses serious difficulties on all three dimensions of big data: volume, velocity and variety.
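To make the scale concrete, the quoted figures can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes) and treats "over 100 GB/day" as exactly 100 GB/day, so the derived average event size is a rough lower bound rather than a measured value.

```python
# Figures quoted in the text; the variable names are illustrative only.
EVENTS_PER_SECOND = 3_000
BYTES_PER_DAY = 100 * 10**9      # 100 GB/day, assuming decimal gigabytes

SECONDS_PER_DAY = 24 * 60 * 60

# Total events produced in one day at the quoted rate.
events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY

# Average size of one event implied by the two quoted figures (derived, not stated).
avg_bytes_per_event = BYTES_PER_DAY / events_per_day

# Yearly volume in decimal terabytes (365 days).
tb_per_year = BYTES_PER_DAY * 365 / 10**12

print(f"{events_per_day:,} events/day")
print(f"~{avg_bytes_per_event:.0f} bytes per event on average")
print(f"{tb_per_year:.1f} TB/year")
```

Under these assumptions the stream amounts to roughly 259 million events per day at a few hundred bytes each, which is consistent with the 36.5 TB/year figure.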
LeanBigData Enablers as Differentiators:
- Run efficient and real-time ad-hoc SQL and non-SQL queries over the stream which will inform high-level decisions on the design and inner workings of the AdServer.
- Use CEP analysis over the stream to trigger higher-level events in real time leading to richer and more reactive forms of serving and optimising advertisements.
- Analyse ad campaigns in real time and react immediately, helping to avoid dead periods where the AdServer runs out of ads due to slower data analysis.
- Reduce hardware and operational complexity contributing to cost reduction and more sustainable operations.
- Produce more insightful visual dashboards of performance indicators.
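As an illustration of the CEP and real-time campaign analysis points in the list above, the following minimal sketch shows how a sliding-window rule over printing events could raise a higher-level "campaign nearly exhausted" event before the AdServer runs out of ads. It is not the LeanBigData CEP engine or its API: the `PrintEvent` fields, the `ExhaustionDetector` class, the alert name and the window and horizon parameters are all hypothetical and chosen only for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PrintEvent:
    """Hypothetical, simplified printing event (field names are assumptions)."""
    timestamp: float   # seconds
    campaign_id: str


class ExhaustionDetector:
    """Emits an alert when a campaign's remaining prints would be used up
    within `horizon_s` seconds at the rate observed over the last `window_s`."""

    def __init__(self, purchased, window_s=60.0, horizon_s=600.0):
        self.remaining = dict(purchased)          # campaign_id -> prints left
        self.window_s = window_s
        self.horizon_s = horizon_s
        self.recent = {cid: deque() for cid in purchased}

    def on_event(self, ev):
        """Consume one event; return an alert dict or None.
        Simplification: events for unknown campaigns are not handled."""
        times = self.recent[ev.campaign_id]
        times.append(ev.timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while times and times[0] <= ev.timestamp - self.window_s:
            times.popleft()
        self.remaining[ev.campaign_id] -= 1

        rate = len(times) / self.window_s         # prints per second in window
        projected = rate * self.horizon_s         # prints expected within horizon
        if self.remaining[ev.campaign_id] <= projected:
            return {
                "type": "CAMPAIGN_NEARLY_EXHAUSTED",
                "campaign_id": ev.campaign_id,
                "remaining": self.remaining[ev.campaign_id],
            }
        return None


# Usage: one print per second for a campaign with 100 purchased prints.
detector = ExhaustionDetector({"c1": 100}, window_s=10.0, horizon_s=30.0)
alerts = []
for i in range(80):
    alert = detector.on_event(PrintEvent(timestamp=float(i), campaign_id="c1"))
    if alert is not None:
        alerts.append(alert)
```

A production CEP rule would typically be declared in the engine's own query language and run over the distributed stream; the point here is only the shape of the logic: per-campaign state, a time window, and a derived event that downstream components can react to.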