Post on 21-Feb-2017
Real-time Platform for Second Look Use Case using Spark and Kafka
Ivy Lu
Capital One
How closely do you look at credit card statements?
• If your answer is not closely enough, then you probably aren’t alone!
• Recent research from Capital One reveals some of the real costs of these unexpected charges.
Take a Second Look
• Fraud vs. non-fraud but unexpected charges
• Types of charges you may be missing
• Customers are incorrectly charged $150 on average per year
• Recurring Transactions: spikes in monthly recurring bills
• Duplicate Charges: multiple swipes at the same merchant
• Generous Tips: tip higher than average tipping behavior
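The three charge types above could each be expressed as a simple rule. The following is a minimal sketch only, not the detection logic used in the Second Look program: the `Txn` record, the function names, and every threshold (a 1.5x spike ratio, a 10-minute duplicate window, a 1.5x tip ratio) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


# Hypothetical transaction record; the field names are assumptions.
@dataclass
class Txn:
    merchant: str
    amount: float          # total charged, including any tip
    time: datetime
    tip: float = 0.0


def is_recurring_spike(current: float, past_amounts: List[float],
                       ratio: float = 1.5) -> bool:
    """Flag a recurring bill that exceeds the historical mean by `ratio`."""
    if not past_amounts:
        return False
    mean = sum(past_amounts) / len(past_amounts)
    return current > ratio * mean


def is_duplicate_charge(txn: Txn, recent: List[Txn],
                        window: timedelta = timedelta(minutes=10)) -> bool:
    """Flag a charge matching merchant and amount of one seen within `window`."""
    return any(
        r.merchant == txn.merchant
        and r.amount == txn.amount
        and abs(txn.time - r.time) <= window
        for r in recent
    )


def is_generous_tip(txn: Txn, avg_tip_rate: float, ratio: float = 1.5) -> bool:
    """Flag a tip whose rate (tip / pre-tip amount) exceeds `ratio` times
    the customer's average tip rate."""
    base = txn.amount - txn.tip
    if base <= 0 or txn.tip <= 0:
        return False
    return txn.tip / base > ratio * avg_tip_rate
```

In this sketch the historical inputs (`past_amounts`, `avg_tip_rate`) would plausibly come from batch data, while the transaction being checked arrives in real time.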
Email and Mobile UI
Second Look Program (initial phase)
• Launch:
– Email: August 2015
– Mobile push notification: January 2016
• Coverage:
– Auto-enroll for credit card customers
– Tens of thousands of alerts sent per day
– Tens of thousands of customers reached per day
– Several different types of alerts
Real Time Pipeline (current phase)
Microservices
• Distributed
• Decoupled jobs
Real-Time + Batch Data
• Batch Data
– high volume
– relatively slow
• Real-Time Data
– medium-low volume
– fast
Deduplication
• Cause of Duplication
– At-least-once at data source
– Spark, Kafka job
• Deduplicate at database
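One common way to deduplicate at the database is to make the write idempotent: key each record on a unique event identifier and ignore repeats. The sketch below illustrates that idea with SQLite from the Python standard library. The deck does not say which database or schema is used, so the table, the column names, and the `event_id` key are all assumptions.

```python
import sqlite3


def open_alert_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a store whose primary key rejects repeated event ids.
    Table and column names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS alerts ("
        " event_id TEXT PRIMARY KEY,"
        " customer_id TEXT NOT NULL,"
        " alert_type TEXT NOT NULL)"
    )
    return conn


def save_alert(conn: sqlite3.Connection, event_id: str,
               customer_id: str, alert_type: str) -> bool:
    """Insert the alert; return True if it was new, False if a duplicate."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO alerts (event_id, customer_id, alert_type)"
        " VALUES (?, ?, ?)",
        (event_id, customer_id, alert_type),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted, 0 when the key already existed.
    return cur.rowcount == 1
```

With a write like this, a redelivered event from an at-least-once source becomes a no-op, and a downstream step could send a notification only when `save_alert` returns `True`.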
Checkpointing
• Goal: achieving zero data-loss (at least once)
• Spark checkpoint vs. Kafka offset
• Connect to Kafka using Spark’s Direct Stream approach and store offsets back to ZooKeeper
Ref: http://aseigneurin.github.io/2016/05/07/spark-kafka-achieving-zero-data-loss.html
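The at-least-once guarantee comes from the ordering: process a batch first, and store the new offset only after the whole batch has succeeded. The following sketch shows that ordering in plain Python, without Spark or Kafka. A list stands in for a Kafka partition and an in-memory `OffsetStore` stands in for ZooKeeper; both, along with `run_batch` and its parameters, are hypothetical names for illustration.

```python
from typing import Callable, Dict, List


class OffsetStore:
    """In-memory stand-in for an external offset store such as ZooKeeper."""

    def __init__(self) -> None:
        self._offsets: Dict[int, int] = {}

    def load(self, partition: int) -> int:
        # Start from offset 0 when nothing has been stored yet.
        return self._offsets.get(partition, 0)

    def save(self, partition: int, next_offset: int) -> None:
        self._offsets[partition] = next_offset


def run_batch(log: List[str], partition: int, store: OffsetStore,
              process: Callable[[str], None], batch_size: int = 2) -> int:
    """Process one batch starting at the stored offset.

    The offset is saved only after every record in the batch has been
    processed, so a failure mid-batch leaves the old offset in place and
    the batch is replayed on the next run. Returns the number of records
    processed.
    """
    start = store.load(partition)
    batch = log[start:start + batch_size]
    for record in batch:
        process(record)  # may raise; the offset is then NOT advanced
    store.save(partition, start + len(batch))
    return len(batch)
```

If `process` fails partway through a batch, records handled before the failure are processed again on the retry. No data is lost, but duplicates are possible, which is why deduplication is needed downstream.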
Social Media Feedback
“Thank you @CapitalOne for your 'take a second look' email! You saved me money!!!”
@josedunham
“@CapitalOne Wow love the email I got about a possible fraudulent charge. A restaurant added on a tip without my permission and you caught it”
@ryan_babypro98
“Gotta love when a #CreditCard Company lets you know when there are higher charges then normal on your account thanks @CapitalOne YourTheBest”
@AnnButlerDesign
“Thank you @CapitalOne for the catching of an over charge that I otherwise May not had noticed. Must give credit where credit is due. Rock on”
@Jasonmjarrett
Word Cloud of email feedback
Thank You
Ivy.Lu@capitalone.com