Using Couchbase and XDCR for Real-Time Advertising Applications: Couchbase Connect 2014
Joe Lichtenberg
VP Advertising and Analytics, Mirror Image
Couchbase Connect
October 21, 2014
Using Couchbase and XDCR for Real-Time Advertising Applications
Media Delivery
Content Logic
Mirror Image 1.0 (c. 1997): Content Delivery Network
Media Delivery | Edge Computing
Content Logic | Content Logic
Mirror Image 2.0: Edge Computing + Media Delivery
Media Delivery | Edge Computing
Content Logic | Content Logic
Dynamic Delivery: Edge Computing + Media Delivery
Synchronized Operations Worldwide
Obligatory Marketing Slide
Extensive functional capabilities
• Edge Computing
• Geo-Distributed Database*
• Live and On-Demand Video
• Object Delivery
• SSL Delivery
• Token-Based Access Control
• Reporting and Analytics
Personalized expert customization and support
• Knowledgeable, dedicated, in-house support team
• Fully staffed, 24 x 7 Network Operations Center (NOC)
• Expert professional services resources
High capacity, worldwide, real time infrastructure
• High capacity, centralized server model
• Exceptional performance, scalability, and availability with SLA guarantees
• Elastic scaling within and across geographies
• Worldwide coverage
• 15+ years of experience
Mirror Image’s Dynamic Delivery Network powers hundreds of billions of real-time requests for mission critical applications throughout the Advertising and Advertising Technology ecosystem
Why Focus on Advertising and Ad-Tech?
• Lots of prospects
• Requires fast response times to each visitor’s browser / device
Ad Tech Ecosystem
Source: Luma Partners
Source: www.iab.net/data
Ad Tech Ecosystem – Another View
Request/Response/Feedback Cycle
• Request Logic
– Customized behavior based on explicit and derived request attributes such as IP address, user-agent, query parameters, cookie values, geo-location, …
• Response Logic
– Personalized JavaScript, XML, HTML; transparent pixel GIFs; cookie modifications; transformations, token substitutions, …
• Edge Data Sets
– Shared data sets such as IAB Spiders & Bots, WURFL (Device DB), IP-GeoLocation
– Customer’s Key-Value Data
• Customized Data Collection
– Delivery of log files to customers as part of feedback cycle
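The cycle above can be sketched in a few lines of Python. This is only an illustrative sketch, not Mirror Image's implementation: the `Request` class, the function names, and the tiny in-memory tables standing in for the edge data sets (bots list, IP-geolocation, customer key-value data) are all hypothetical.

```python
from dataclasses import dataclass, field
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-ins for the shared edge data sets. A real deployment
# would use the IAB Spiders & Bots list, a device DB, and IP-geolocation data.
BOT_USER_AGENTS = {"examplebot/1.0"}
GEO_BY_IP_PREFIX = {"203.0.113.": "AU", "198.51.100.": "US"}
CUSTOMER_KV = {"user:abc123": "segment-sports"}


@dataclass
class Request:
    url: str
    ip: str
    user_agent: str
    cookies: dict = field(default_factory=dict)


def derive_attributes(req):
    """Request logic: combine explicit and derived request attributes."""
    query = parse_qs(urlparse(req.url).query)
    country = next(
        (cc for prefix, cc in GEO_BY_IP_PREFIX.items() if req.ip.startswith(prefix)),
        "unknown",
    )
    return {
        "is_bot": req.user_agent.lower() in BOT_USER_AGENTS,
        "country": country,
        "campaign": query.get("campaign", [""])[0],
        "user_id": req.cookies.get("uid"),
    }


def build_response(attrs):
    """Response logic: choose a body and cookie changes, plus a log record."""
    if attrs["is_bot"]:
        # Bots get an empty response; the log record still feeds the feedback cycle.
        return {"body": "", "set_cookies": {}, "log": {**attrs, "served": False}}
    segment = CUSTOMER_KV.get(f"user:{attrs['user_id']}", "default")
    body = f"var segment = '{segment}'; var country = '{attrs['country']}';"
    return {"body": body, "set_cookies": {"seg": segment}, "log": {**attrs, "served": True}}
```

A call such as `build_response(derive_attributes(req))` returns the personalized body, the cookie modifications, and a log record of the kind that would be delivered back to the customer.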
Edge Computing Flow
• IAB Spiders & Bots Data
• Mobile Device Data Sets
• IP – Geolocation Data Sets
• Custom Key-Value Data In Memory*
What Problems Does The CS Implementation Solve?
• Customers and prospects have requirements that we work with their large and growing data sets in real-time at the edge of the internet
• Our “flat file” implementation required moving the entire contents of the files into memory on each server at startup, and customers were bumping up against data size limitations
Edge Computing Flow
• IAB Spiders & Bots Data
• Mobile Device Data Sets
• IP – Geolocation Data Sets
• Custom Key-Value Data In Memory
• Couchbase Key-Value Distributed Database
Geo-Distributed Database: Use Cases
• Real-time GUID database
• Real-time cookie matching
• Contextual targeting
• Ad fraud / brand safety
• Cross-device user matching / audience de-duplication
• In-session execution of batch-developed analytics
• Any web or mobile database-backed application with geographically dispersed users requiring fast end-to-end response times
Geo-Distributed Database: Requirements
• High performance lookup and write capabilities at the edge
• Ability to manage large custom data sets for each customer application
• Low latency replication (aka XDCR)
• Ability to replicate to different regions per application / different data per region
• Key-value lookups only – no need for complex queries or SQL
• Reliability
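The requirement to replicate different applications to different regions can be pictured as a simple routing table. The sketch below is an assumption-laden illustration, not a Couchbase or XDCR API: the application names, region names, site names, and both functions are hypothetical.

```python
# Hypothetical plan: which regions each customer application replicates to.
REPLICATION_PLAN = {
    "cookie-match": {"north-america", "europe"},
    "device-graph": {"asia-pacific"},
}

# Hypothetical edge sites and the region each one belongs to.
EDGE_SITES = {
    "edge-nyc": "north-america",
    "edge-lon": "europe",
    "edge-sin": "asia-pacific",
}


def replication_targets(application):
    """Return the edge sites that should receive an application's data."""
    regions = REPLICATION_PLAN.get(application, set())
    return sorted(site for site, region in EDGE_SITES.items() if region in regions)


def applications_at(site):
    """Return the applications whose data is present at a given edge site."""
    region = EDGE_SITES[site]
    return sorted(app for app, regions in REPLICATION_PLAN.items() if region in regions)
```

Here `replication_targets("cookie-match")` would list the North American and European edges only, so each region holds just the data its applications need.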
Implementation Details
• Master hub sites have a 3-node Couchbase cluster for reliability
• Edge sites start with a single instance for real-time read/write access, and scale out via clustering based on demand
• Edges can fail over to nearby edges for availability
• Limited number of buckets
– Multi-tenant
– Geographic regions
– Customer-specific bucket(s) are optional
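Two of the points above, failover to nearby edges and multi-tenant buckets, can be illustrated with a small sketch. It uses an in-memory stand-in rather than a Couchbase client, and everything in it (`EdgeStore`, `tenant_key`, `read_with_failover`, the `tenant::key` naming scheme) is a hypothetical assumption, not the scheme Mirror Image uses.

```python
class EdgeUnavailable(Exception):
    """Raised when an edge site cannot serve a request."""


class EdgeStore:
    """In-memory stand-in for the key-value instance at one edge site."""

    def __init__(self, name, data=None, up=True):
        self.name = name
        self.data = dict(data or {})
        self.up = up

    def get(self, key):
        if not self.up:
            raise EdgeUnavailable(self.name)
        return self.data.get(key)


def tenant_key(tenant, key):
    """Namespace a key so several tenants can share one bucket (assumed scheme)."""
    return f"{tenant}::{key}"


def read_with_failover(edges, tenant, key):
    """Try the local edge first, then nearby edges in the order given."""
    for edge in edges:
        try:
            return edge.name, edge.get(tenant_key(tenant, key))
        except EdgeUnavailable:
            continue  # this edge is down; fall through to the next nearest one
    raise EdgeUnavailable("all edges unavailable")
```

Passing the edges ordered by proximity means a request is served locally when possible and only pays the extra latency of a neighboring site when the local instance is down.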
Implementation Details
• Phase 1 (in production): Read-only edges; master hubs; data import / ETL servers
• Phase 2: Extends read-only behavior to virtual locations
• Phase 3: Adds write support on edges, bi-directional XDCR back to each master hub
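The Phase 3 data flow, where an edge write travels back to its master hub and out to the other edges, can be pictured with a toy model. This is a sketch under loud assumptions: `Site` and `StarNetwork` are hypothetical names, replication is modeled as an immediate dictionary copy (XDCR itself replicates asynchronously), and conflicting writes from two edges are not modeled at all.

```python
class Site:
    """A hub or edge site holding a key-value data set."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class StarNetwork:
    """Toy star topology: every edge replicates through one master hub."""

    def __init__(self, hub, edges):
        self.hub = hub
        self.edges = list(edges)

    def write_at_hub(self, key, value):
        """Phase 1 style: data enters at the hub and fans out to read-only edges."""
        self.hub.data[key] = value
        for edge in self.edges:
            edge.data[key] = value

    def write_at_edge(self, edge, key, value):
        """Phase 3 style: an edge write goes back to the hub, then to other edges."""
        edge.data[key] = value        # local write at the edge
        self.hub.data[key] = value    # edge -> hub replication
        for other in self.edges:      # hub -> remaining edges
            if other is not edge:
                other.data[key] = value
```

After `write_at_edge(edge_a, "user:1", "seen")`, the hub and every other edge hold the same value, which is the end state the bi-directional setup aims for.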
GDD “Marchitecture” - Read Only
GDD “Marchitecture” - Read/Write
Master Hub Configuration
Customizable ETL Servers
FTP Upload Storage
Couchbase Edge Instance
ETL Master Couchbase 3-Node Cluster
ECF Server Farm
Distributed Star Network Architecture
GDD Key Benefits (or “Marketing Slide #2”)
• Globally distributed, consistent data -> High performance globally
• Schema-less data model -> Flexibility
• Scalable -> Handles traffic and data growth and spikes
• Fully managed service -> Including setup, maintenance, replication, synchronization, backup, security, and 24x7 monitoring
• Subscription-based billing model -> No CAPEX, pay as you go
• Uses Couchbase Server -> Proven; active developer community
• Data privacy -> No conflicts of interest with our customers