Telecom & Spend Analytics Arindam Guptaray
Few words about You!. . .
• Name
• Background – Engineering, Arts, Commerce
• Work Experience
• Expectations from this Class
• WHY ARE YOU SPENDING YOUR WEEKENDS HERE?
Few words about me . . .
• B. TECH FROM IIT KHARAGPUR.
• MBA (FINANCE) FROM UNIV. OF MINNESOTA, CARLSON SCHOOL.
• HAVE WORKED IN ANALYTICS FOR 10 + YEARS
• 7 YEARS OF TELECOM ANALYTICS WITH PROVIDERS LIKE:
• AIRTEL
• IDEA
• TELEFONICA
• ETISALAT
• HAVE WORKED IN SPEND ANALYTICS FOR 3 YEARS
Email: [email protected]
LinkedIn: http://www.linkedin.com/in/agriit
Agenda
DATE TOPICS
22nd June
• WHY TELECOM ANALYTICS
• AREAS OF TELECOM ANALYTICS
• DATA SOURCES FOR TELECOM ANALYTICS
• ASSIGNMENT
28th June
• ASSIGNMENT WALK-THRU
29th June
• SPEND ANALYTICS
• ASSIGNMENT BASED ON REAL LIFE PROJECT
5th July
• PROJECT WALK-THRU
What is Analytics . . .
“Not everything that can be counted counts, and not everything that counts can be counted.”
- Albert Einstein
“Analytics is like the Game of Bridge. You can learn the rules of Bridge from a textbook, but when you
are actually playing, it’s a totally different ball game.”
- Arindam Guptaray
• Analytics can be Deterministic or Probabilistic.
• In real life you will never get clean data.
• An analyst should be able to tell you something about your data that you don’t know.
• A great analyst will be able to answer the follow up question:
“So what?”
Why Telecom Analytics . . .
• COMPETITION
• AVAILABILITY OF INTERLINKED DATA SOURCES
• DEMANDING CUSTOMER
• GOVERNMENT REGULATIONS
• AVAILABILITY OF CHEAP HARDWARE FOR ANALYTICS
• HIGH VOLUME OF DATA MAKES IT SUITABLE FOR STATISTICAL
METHODS.
Post Paid Billing Process . . .
Pre Paid Billing Process . . .
Data Sources for Telecom Analytics . . .
• SWITCH CDR
• IN CDR
• RATED CDR (POSTPAID)
• SUBSCRIBERS INFORMATION (PREPAID, POSTPAID)
• SUBSCRIBER SERVICES
• RATE PLANS
• DIAL DIGIT LOOKUP
• TOWER LOCATIONS INFORMATION
• CRM DATABASES
• PAYMENT INFORMATION
Switch Call Data Record ( Switch CDR) . . .
• For every mobile transaction a record is generated in Switch. These are called
Switch Call Data Records or Switch Call Detail Records (CDR). The key fields of a
Switch CDR are:
• Called Number: The number receiving the Call.
• Calling Number: The number initiating the Call.
(A number is called an MSISDN: Mobile Subscriber Integrated Services Digital
Network Number)
• IMSI (International Mobile Subscriber Identity): The unique subscriber identity
stored on the SIM card.
• IMEI (International Mobile Station Equipment Identity): The unique identifier for
your mobile phone.
• Cell ID: The unique ID of the cell tower transmitting the call.
• Call Status: Success, Failure, Missed.
• Call Time: The time of the call.
• Call Duration: Duration of the call.
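The key fields above can be pictured as one record per call. The sketch below is only an illustration: the comma-separated layout, the field order, the timestamp format and all sample values are assumptions, not a real switch vendor format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SwitchCDR:
    """One Switch CDR holding the key fields listed above (names are illustrative)."""
    calling_number: str   # MSISDN that initiated the call
    called_number: str    # MSISDN that received the call
    imsi: str             # subscriber identity stored on the SIM
    imei: str             # handset identifier
    cell_id: str          # tower that carried the call
    call_status: str      # "Success", "Failure" or "Missed"
    call_time: datetime
    duration_sec: int


def parse_switch_cdr(line: str) -> SwitchCDR:
    """Parse one comma-separated record (hypothetical layout)."""
    calling, called, imsi, imei, cell_id, status, ts, duration = line.strip().split(",")
    return SwitchCDR(calling, called, imsi, imei, cell_id, status,
                     datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), int(duration))


# Made-up sample record, for illustration only.
sample = ("919800000001,919800000002,404100000000001,350000000000001,"
          "C1001,Success,2014-06-22 10:15:00,125")
cdr = parse_switch_cdr(sample)
print(cdr.calling_number, cdr.call_status, cdr.duration_sec)
```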
IN Call Data Record ( IN CDR) . . .
• FOR EVERY PRE-PAID TRANSACTION A CDR IS GENERATED IN THE IN
(INTELLIGENT NETWORK). THE KEY FIELDS OF AN IN CDR ARE:
• CALLED NUMBER, CALLING NUMBER, DURATION, CALL TIME.
• RATE PLAN: THE RATE PLAN TO WHICH THE SUBSCRIBER BELONGS.
• CHARGE: AMOUNT CHARGED TO THE SUBSCRIBER.
• RECHARGE AMOUNT: AMOUNT RECHARGED.
• DEDICATED BALANCE: CAPTURES FREE TALK-TIME, SMSS, DATA
USAGE, ETC.
Rated Call Data Record ( Rated CDR) . . .
• ALL OF THE SWITCH POSTPAID CDRS ARE PASSED TO A
RATING ENGINE, WHERE THEY ARE RATED BASED ON
THE SUBSCRIBER'S RATE PLAN. THE KEY FIELDS OF A RATED CDR ARE:
• CALLED NUMBER, CALLING NUMBER, DURATION, CALL
TIME.
• RATE PLAN: THE RATE PLAN TO WHICH THE
SUBSCRIBER BELONGS.
• CHARGE: AMOUNT CHARGED TO THE SUBSCRIBER.
Subscriber Information . . .
• POSTPAID AND PREPAID SUBSCRIBERS ARE USUALLY STORED SEPARATELY.
THESE ARE THE KEY FIELDS.
• MSISDN
• IMSI
• FIRST USE DATE
• LAST USE DATE
• ACTIVE IN HLR (HOME LOCATION REGISTER)
• ACTIVATION DATE
• DEACTIVATION DATE
• ACCOUNT NUMBER
Subscriber Services . . .
CONTAINS THE SERVICES THAT SUBSCRIBERS SUBSCRIBE TO. FOR
EXAMPLE:
UNLIMITED OFF PEAK SMS
2 GB OF DATA USAGE
FREE ROAMING
INTERNATIONAL CALLING
THESE ARE THE KEY FIELDS:
MSISDN
IMSI
SERVICE NAME
ACTIVATION DATE OF THE SERVICE
DE-ACTIVATION DATE OF THE SERVICE
Rate Plans . . .
• THE RATE PLAN TABLE CONTAINS INFORMATION ABOUT HOW A CALL WILL BE
CHARGED. THE KEY FIELDS ARE:
• DIAL DIGIT CODE: DETERMINES THE TYPE OF THE CALL AND THE DESTINATION
COUNTRY.
• CHARGES PER UNIT USAGE
• PROVIDER OF THE OTHER PARTY INVOLVED IN THE CALL
• TYPE OF HANDSET OF THE OTHER PARTY (LANDLINE, MOBILE)
• START TIME(HOUR) AND END TIME(HOUR) WHEN THE RATE IS
APPLICABLE.
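A minimal sketch of how a rating engine might use such a table: find the row that matches the dial digit code and the hour of the call, then multiply by the usage. The rates, the per-minute pulse and the function names are assumptions made for illustration; the remaining rate plan fields (provider, handset type) are left out to keep it short.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rate:
    dial_digit_code: str
    start_hour: int        # inclusive
    end_hour: int          # exclusive
    charge_per_min: float


# Hypothetical rate plan rows.
RATES = [
    Rate("91", 0, 8, 0.50),
    Rate("91", 8, 20, 1.00),
    Rate("91", 20, 24, 0.50),
    Rate("1", 0, 24, 6.00),
]


def find_rate(dial_digit_code: str, hour: int) -> Optional[Rate]:
    """Return the first row whose code and time band match the call."""
    for rate in RATES:
        if rate.dial_digit_code == dial_digit_code and rate.start_hour <= hour < rate.end_hour:
            return rate
    return None


def rate_call(dial_digit_code: str, hour: int, duration_sec: int) -> float:
    """Charge a call, assuming billing in whole minutes (rounded up)."""
    rate = find_rate(dial_digit_code, hour)
    if rate is None:
        raise LookupError(f"no rate for code {dial_digit_code} at hour {hour}")
    return math.ceil(duration_sec / 60) * rate.charge_per_min


print(rate_call("91", 10, 125))   # peak-hour call of 125 seconds
print(rate_call("91", 22, 60))    # off-peak call of 60 seconds
```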
Dial Digit Lookup
• The telecom provider needs to determine the following:
o Whether the call is a Local, STD or ISD call.
o Provider for the other party involved in the call.
• For countries where number portability is allowed, there are services
available from which you can determine the provider. This is not a direct
data source. The key fields available in this table are:
• Dial Digit Code
• Country of other party
• Provider of the other party
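One common way to resolve a dialed number against such a table is a longest-prefix match: try the longest leading digits first and fall back to shorter ones. The sketch below assumes this approach; the table contents are made up apart from the country codes, and "Provider A" is a placeholder.

```python
# Hypothetical lookup table: dial digit code -> (country, provider).
DIAL_DIGITS = {
    "91": ("India", "Unknown"),
    "9198": ("India", "Provider A"),
    "1": ("USA/Canada", "Unknown"),
    "44": ("United Kingdom", "Unknown"),
}


def lookup_dial_digits(number: str):
    """Return (code, country, provider) for the longest matching prefix, or None."""
    digits = number.lstrip("+")
    for end in range(len(digits), 0, -1):
        entry = DIAL_DIGITS.get(digits[:end])
        if entry is not None:
            return digits[:end], entry[0], entry[1]
    return None


print(lookup_dial_digits("+919812345678"))
print(lookup_dial_digits("919912345678"))
```

Where number portability applies, the provider found this way is only a first guess, which is why the slide points to an external service for the final answer.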
Cell Information . . .
INFORMATION ABOUT THE CELL TOWER USED FOR THE CALL. THE KEY FIELDS ARE:
• LATITUDE
• LONGITUDE
• COVERAGE RADIUS
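With latitude, longitude and coverage radius, an analyst can estimate how far apart two towers are and whether their coverage areas overlap. The sketch below uses the haversine formula on a spherical Earth; the tower tuples and the overlap rule are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def coverage_overlaps(tower_a, tower_b) -> bool:
    """Towers are (latitude, longitude, radius_km); true if their circles touch or overlap."""
    distance = haversine_km(tower_a[0], tower_a[1], tower_b[0], tower_b[1])
    return distance <= tower_a[2] + tower_b[2]


# Made-up towers: two close together, one far away.
t1 = (12.9716, 77.5946, 2.0)
t2 = (12.9800, 77.6000, 2.0)
t3 = (13.0827, 80.2707, 2.0)
print(coverage_overlaps(t1, t2), coverage_overlaps(t1, t3))
```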
Payment Information . . .
KEY FIELDS ARE:
• ACCOUNT NUMBER
• PAYMENT HISTORY
• PAYMENT TYPE (CASH, CHECK, CREDIT CARD, BANK ACCOUNT)
• PAYMENT DATE
• DUE DATE
• PAYMENT STATUS
CRM Database. . .
THE CRM DATABASE CONTAINS DEMOGRAPHIC, GEOGRAPHIC AND OTHER
FINANCIAL INFORMATION ABOUT THE CUSTOMER. SOME OF THE
ATTRIBUTES OF A CRM RECORD ARE:
ACCOUNT NUMBER
GENDER
DATE OF BIRTH
MONTHLY INCOME
WORK NUMBER
HOME NUMBER
ADDRESS
HLR ( Home Location Register) . . .
• The home location register (HLR) is a central database that contains details of
each mobile phone subscriber that is authorized to use the GSM core network.
The key fields of an HLR are:
MSISDN
IMSI
CUSTOMER TYPE (PRE-PAID, POST PAID)
LIST OF SERVICES (like 3G, Roaming, ISD)
TAP files (TAP IN, TAP OUT). . .
• The usage by a subscriber in a visited network is captured in a file called the TAP
(Transferred Account Procedure) file for GSM. This file is transferred to the home
network. A TAP file contains details of the calls made by the subscriber viz.
location, calling party, called party, time of call and duration, etc. The TAP files
are rated as per the tariffs charged by the visited operator. The home operator
then bills these calls to its subscribers and may charge a mark-up/tax applicable
locally. The key fields of a TAP file are:
• Called Number: The number receiving the Call.
• Calling Number: The number initiating the Call.
• Type of Service (PREPAID, POSTPAID)
• IMSI (International Mobile Subscriber Identity): The subscriber SIM card
number.
• Call Time: The time of the call.
• Call Duration: Duration of the call.
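As a small worked example of the billing step described above, the sketch below applies a mark-up and then a local tax to the charge received in the TAP file. The rates, the order in which they are applied and the rounding rule are all assumptions; actual practice depends on the operator and the roaming agreement.

```python
from decimal import Decimal, ROUND_HALF_UP


def retail_roaming_charge(tap_charge: Decimal, markup_rate: Decimal, tax_rate: Decimal) -> Decimal:
    """Charge billed to the subscriber: visited-network charge, plus mark-up, plus tax."""
    marked_up = tap_charge * (Decimal(1) + markup_rate)   # assumption: mark-up first
    total = marked_up * (Decimal(1) + tax_rate)           # assumption: tax on the marked-up amount
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical figures: 10.00 charged by the visited operator, 15% mark-up, 12% tax.
charge = retail_roaming_charge(Decimal("10.00"), Decimal("0.15"), Decimal("0.12"))
print(charge)
```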
Detecting Subscriber Fraud . . .
• High number of calls to Black Listed numbers
• High Roaming charges
• High Internet Usage
• High number of VAS calls
• Frequent Change of Address
• Pre-Subscription Check:
• Verify address
• Verify home number
• Set Credit Limits
• Check PAN number, UID against Credit Violations
• Check IMEI against Black Listed IMEI
• Check for matching names with black listed customers.
• Check for matching PIN codes.
• Check for addresses from notorious localities.
• Match subscriber usage profile with black listed subscribers :
• Called numbers
• Matching tower locations
• Calling patterns (short calls, long calls)
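The first indicator on this slide, a high number of calls to black listed numbers, can be sketched as a simple count per calling number. The threshold, the sample numbers and the function name are assumptions; a real system would tune the threshold and combine it with the other indicators.

```python
from collections import Counter


def flag_blacklist_callers(calls, blacklist, threshold):
    """calls: iterable of (calling_number, called_number) pairs.

    Returns {calling_number: count} for subscribers whose calls to
    black listed numbers reach the threshold.
    """
    counts = Counter(calling for calling, called in calls if called in blacklist)
    return {subscriber: n for subscriber, n in counts.items() if n >= threshold}


# Made-up call log and black list.
blacklist = {"900", "901"}
calls = [
    ("A", "900"), ("A", "901"), ("A", "900"), ("A", "555"),
    ("B", "900"), ("B", "555"),
    ("C", "555"),
]
print(flag_blacklist_callers(calls, blacklist, threshold=3))
```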
Detecting Recharge Voucher Fraud . . .
• Unusual top-ups
• High number of recharges in a given time-period
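"High number of recharges in a given time-period" can be checked with a sliding window over each subscriber's recharge timestamps. The sketch below assumes the recharge records are (MSISDN, timestamp) pairs; the window length and the allowed count are placeholders to be set by the analyst.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def max_recharges_in_window(timestamps, window):
    """Largest number of recharges that fall within any span shorter than `window`."""
    if window <= timedelta(0):
        raise ValueError("window must be positive")
    ordered = sorted(timestamps)
    best = start = 0
    for end, current in enumerate(ordered):
        # Shrink the window from the left until it is shorter than `window`.
        while current - ordered[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


def flag_frequent_rechargers(recharges, window, max_allowed):
    """recharges: iterable of (msisdn, timestamp). Returns a sorted list of flagged MSISDNs."""
    by_subscriber = defaultdict(list)
    for msisdn, ts in recharges:
        by_subscriber[msisdn].append(ts)
    return sorted(m for m, stamps in by_subscriber.items()
                  if max_recharges_in_window(stamps, window) > max_allowed)


# Made-up recharge history.
base = datetime(2014, 6, 22, 10, 0)
recharges = [
    ("A", base), ("A", base + timedelta(minutes=10)),
    ("A", base + timedelta(minutes=20)), ("A", base + timedelta(minutes=90)),
    ("B", base), ("B", base + timedelta(hours=5)),
]
print(flag_frequent_rechargers(recharges, timedelta(hours=1), max_allowed=2))
```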
Detecting Pre-paid Balance Fraud . . .
• Track employees with a high number of manual balance changes
• Subscribers with high balances
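One way to decide what counts as a "high number" of manual balance changes is to compare each employee with their peers. The sketch below flags employees whose count exceeds a multiple of the median count; the multiple of 3, the record layout and the sample data are assumptions.

```python
from collections import Counter
from statistics import median


def flag_heavy_adjusters(changes, multiple=3.0):
    """changes: iterable of (employee_id, msisdn, amount) manual balance changes.

    Returns a sorted list of employees whose number of changes is more
    than `multiple` times the median across all employees.
    """
    counts = Counter(employee for employee, _msisdn, _amount in changes)
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(emp for emp, n in counts.items() if n > multiple * typical)


# Made-up audit log: E3 makes far more manual changes than E1 and E2.
changes = ([("E1", "S1", 50)]
           + [("E2", "S2", 20)] * 2
           + [("E3", "S9", 500)] * 10)
print(flag_heavy_adjusters(changes))
```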
Project: Detecting Fraudsters (Cont.)
• Every 5 days they undertake an audit to see whether these
Fraudsters have joined their network. They review the list of
subscribers who have made calls to the same people as these three
fraudsters and in the same time frame.
• Please use a statistical method (Naïve Bayesian Classifier or Decision
Tree) to identify if any of these subscribers is Sally, Vince or Virginia.
• Please provide the following:
• R, SAS or MATLAB code used to determine the subscriber.
• Name of the probable caller in an additional column in the
Audit CallLog excel file and the confidence in terms of
probability.
• Name of the fraudster, if any.
• Note that the company needs to be absolutely sure that the person is
a fraudster. Matching of calling patterns for 1 or 2 days is not proof
enough. You should also have a high percentage of confidence
(probability) when you identify this person.
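The project asks for R, SAS or MATLAB; purely to illustrate the idea, here is a minimal Naïve Bayes sketch in Python that scores an audited subscriber's called numbers against each fraudster's call history. The call histories, the uniform prior and the Laplace smoothing are assumptions, and the called numbers are placeholders. Note that this sketch always distributes 100% of the probability across the three names, so it cannot by itself say "none of them"; that decision still needs evidence over several days, as the note above stresses.

```python
import math
from collections import Counter


def train(call_logs):
    """call_logs: {name: [called numbers]} -> (per-name counts, set of all numbers seen)."""
    counts = {name: Counter(calls) for name, calls in call_logs.items()}
    vocab = set().union(*call_logs.values())
    return counts, vocab


def posterior(counts, vocab, observed, alpha=1.0):
    """P(name | observed called numbers), with a uniform prior and Laplace smoothing."""
    log_scores = {}
    for name, ctr in counts.items():
        # One extra smoothing bucket covers numbers never seen in training.
        total = sum(ctr.values()) + alpha * (len(vocab) + 1)
        score = math.log(1.0 / len(counts))
        for number in observed:
            score += math.log((ctr.get(number, 0) + alpha) / total)
        log_scores[name] = score
    top = max(log_scores.values())
    weights = {name: math.exp(s - top) for name, s in log_scores.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


# Made-up histories; "N1".."N7" stand in for called numbers.
history = {
    "Sally": ["N1", "N1", "N2", "N3"],
    "Vince": ["N4", "N4", "N5", "N1"],
    "Virginia": ["N6", "N7", "N6", "N2"],
}
counts, vocab = train(history)
probs = posterior(counts, vocab, observed=["N1", "N2", "N1"])
best = max(probs, key=probs.get)
print(best, round(probs[best], 2))
```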
Assignment (20 marks) . . .
Analyze the sample phone bill using Tableau and present the key observations.
Slide 1: Descriptive Analysis
• Time of the call: Peak, Off-Peak
• Data Usage
• SMS Usage
• Called Numbers
• Called Destinations
• Type of Usage (Roaming, Local, International Roaming)
Slide 2: Compare and select the best plan for this person from the team members' plans.
Slide 3: Try predicting the following, with justification:
• Gender
• Approximate Age
• Kind of phone
• Occupation