Bing Phone Book Service Arch Spec

BING PHONE BOOK SERVICE

Ashish Shah, SunitaS, VinaSura, BhJoshi

Abstract: An overview of the Bing Phone Book Service. The first version is meant to support the Skype Dialer scenarios.


1 Contents

2 Goals
3 Caller ID Functionality and Graph Inference
4 Spam Inference for Blocked Callers
5 High Level Workflows
6 Architecture Overview
7 High Level Design Overview
  7.1 Client Service Workflows
  7.2 Inferencing Engine Workflow
8 Performance and Scale Targets for V1 Release
  8.1 Lookup Performance Goals
  8.2 Server Call Load
  8.3 Address book Sync Load
  8.4 Storage Capacity
9 Bing Phone Book Data Storage Model
  9.1 Target Storage Capacity Requirements
  9.2 Bing Phone Book Service Storage Design
  9.3 Bing Phone Book Service Data Storage
10 Geo Scale-Out Model
  10.1 Centralized
  10.2 Federated Model
11 Security
  11.1 Data Privacy
  11.2 Authenticating the Client Application and User
  11.3 Transport Security
12 Backup/Disaster Recovery Requirements
13 Bing Phone Book Client Design Details
  13.1 Overview - Periodic Uploads and Downloads
  13.2 Caching
  13.3 NameLookup
  13.4 Blocked Phone Number handling
  13.5 Special Scenario Handling
    13.5.1 LowBatteryConditions
    13.5.2 Shutdown
    13.5.3 Lost Phone Scenario
    13.5.4 Dual Sim Scenario
  13.6 Client Side Performance Metrics
    13.6.1 Metrics
    13.6.2 Client Side Metrics Framework
  13.7 Other Client Side Telemetry
14 Bing Phone Book Service Design Details
  14.1 Deployment Topology and Sizing
  14.2 REST APIs
  14.3 Name Inferencing Engine
  14.4 Spammer Prediction
  14.5 SQL Azure Storage Data Layer Overview
    14.5.1 Options
    14.5.2 Phone Number Entity Table Schema
    14.5.3 PhoneNumberDataTable Schema
    14.5.4 PhoneNumberUpdated Table Schema
    14.5.5 SQL Maintenance
  14.6 Security
    14.6.1 Implicit Grant Type Sequence
    14.6.2 Using Skype Dialer Authentication – Multiple Resource Server Support
15 Server Side Performance Metrics
16 Monitoring


Bing Phone Book Service for Skype Dialer

2 Goals

The main purpose of this service for the V1 release is to provide the following functionality to the Skype Dialer:

1) Provide Caller Id lookup by phone number for incoming and outgoing calls (for people as well as local business phone numbers).

2) Provide a crowd-sourced spam detection and tagging mechanism for incoming callers.

3) Provide round-trippable storage for the incoming call blocklist for all registered users, with implicit portability across the user's devices.

Design goals for the service:

1. Be agnostic of the calling application. There’s no implicit or explicit dependency in the design on Skype infrastructure or coupling with the Skype Dialer. The same service can be leveraged in the future for another dialer endpoint, e.g., for the native Windows Phone dialer.

2. Be agnostic of the device ecosystem and client OS – specifically no dependency has been taken on Google services in the service.

3. Keep design scalable to global audience and not specific to India. Design and implementation should scale beyond India market without any code level changes. There may be market specific config files which can be extended to new markets as needed.

4. Client API components should be designed in a way that calling application can enforce required SLAs for battery and network/data usage.

5. Ensure appropriate security of user private data.

3 Caller ID Functionality and Graph Inference

To serve its primary purpose of providing caller ID information for phone numbers, the phone book service relies on three inputs:

1. Information from registered users (as entered when the user first signs up for Skype Dialer).

2. Local business information seeded from the Bing Local index. E.g., we have already seeded ~5M local business numbers for India into the caller ID service. More can be seeded for other markets as we scale the service globally.

3. High confidence name inference on the phone-number linked graph crowdsourced from registered users' address books. Because different users may store the same number under different names in their phone books, we cannot implicitly trust the uploaded name association from a single address book. However, a high confidence inference can be drawn from a group of uploaded names from many address books.


[Figure: name inference over the crowdsourced phone-number graph. In the original picture, registered users and high confidence inferred names are shown in two different colors.]

4 Spam Inference for Blocked Callers

The Dialer provides functionality to block incoming calls from specific phone numbers. The phone book service leverages this functionality to create a crowd-sourced inference for spam callers. Every time a user blocks an incoming number, they contribute one spam vote for that number into the inference system. We upload the user's blocklist to the phone book service at regular intervals.

Aggregate information from user block lists is used to do a simple inference based on the number of users that have blocked a specific number. As the system matures, we can do more sophisticated spam inference based on whether the number is a toll-free number, the trust level of the originating user, known tele-marketing prefixes, etc.

Once the spam inference runs, it adds a spam flag and spam vote count for all numbers that were inferred as spam originators. The same information is returned from the caller Id lookup call to the phone book service and used by the Dialer to show spam status for incoming callers.

The service can also pre-load a set of known top spammers as a seeded (non-inferred) list. For India, we have sourced a list of ~500 top spam numbers and seeded it into the phone book service.
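The vote-counting inference described in this section can be sketched as follows. This is a minimal illustration, not the service implementation: the function name, the result shape and the threshold value are assumptions, and each user is assumed to contribute at most one vote per number.

```python
from collections import Counter


def infer_spam(blocklists, seeded_spammers, threshold):
    """Aggregate per-user blocklists into spam votes (hypothetical helper).

    blocklists: dict mapping a user ID to the numbers that user has blocked.
    seeded_spammers: set of pre-loaded, non-inferred spam numbers.
    threshold: vote count a number must exceed to be flagged (assumed value).
    """
    votes = Counter()
    for numbers in blocklists.values():
        for number in set(numbers):  # one vote per user per number
            votes[number] += 1

    result = {}
    for number, count in votes.items():
        is_spam = count > threshold or number in seeded_spammers
        result[number] = {"spam_count": count, "is_spam": is_spam}
    for number in seeded_spammers:
        # Seeded numbers are flagged even if no user has blocked them yet.
        result.setdefault(number, {"spam_count": 0, "is_spam": True})
    return result


example = infer_spam(
    {"u1": ["+911", "+912"], "u2": ["+911"], "u3": ["+911", "+911"]},
    seeded_spammers={"+919"},
    threshold=2,
)
print(example["+911"])
```

The spam flag and vote count produced here correspond to the fields the caller Id lookup returns to the Dialer.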

5 High Level Workflows

Based on the above functionality, there are four basic app workflows that are relevant for the Caller Id service:

- New User Registration
  o Dialer will register new users with the caller ID service using a normalized phone number as the user ID, no password required.
  o User is authenticated with MSA using an MSA token.

- Phone Book Sync
  o Dialer will sync (upload) the user's local phone book to the caller ID service at regular intervals, including the user's contacts with phone numbers from all sources (local phone book, outlook.com address book, facebook contacts, linkedin contacts, etc.).

- Caller Id Lookup
  o For all incoming and outgoing calls where the call number is not in the user's local phone book, Dialer will call the service for a caller Id lookup (with the normalized phone number as key).
  o If a phone number entry exists in the cloud directory, the Caller ID service will return:
      Registered name (if available)
      Inferred name (if available)
      Spam flag and spam count (if available)

- Phone Number Blocking
  o Every time the user blocks an incoming number, Dialer will update the caller ID service with that information. Internally, every block action translates into a spam count increment for the blocked number.

6 Architecture Overview

For interface simplicity and wire protocol abstraction, we have provided a client Java library for the phone book service, which under the covers calls RESTful web service APIs hosted on Azure. Currently we have a client library implementation for Android (4.0+). The same library can be easily ported over to Windows Phone, iOS, etc.

The Bing Phone Book Service is an Azure PaaS service hosting RESTful endpoints in Azure Web Roles. Multiple service instance roles will be used for load-balancing and high availability, leveraging Azure's auto-scale feature for load elasticity while maintaining end-to-end performance SLAs. The inference engine will be hosted in an Azure Worker Role. The SNR APIs provide a REST endpoint

The current version of the service uses SQL Azure as its data store (to be swapped out with Bing Object Store for higher scalability post-V1 release). SQL Azure provides high availability and reliability by design; however, it has limits on the storage it can support. The SQL store will be partitioned and sized according to V1 scale and performance requirements.


In addition, an Azure Storage Account is provisioned to support persistence of logging/diagnostic data. Other scale out options and storage design options are discussed later.

7 High Level Design Overview

The Bing Phone Book Service must support the four workflows described earlier in the spec. Essentially it needs to:

1) Store the phone book data, both contact information and block list information from the registered users

2) Host an Inferencing engine that does name inferencing for numbers uploaded by the registered users

3) Build the Top Spammer list

The Bing Phone Book Data Layer abstracts the storage from the service. The PhoneNumberUpdateTable stores the contact data and block list information that is synced to the cloud. The PhoneNumberEntityTable is the table that the name inferencing engine creates and the one that is used for CallerID lookup. The Top Spammer list is built and refreshed on the server. It is also cached on the server.

7.1 Client Service Workflows

The high level interactions between the Client, the Bing Phone Book Service and the Bing Phone Book Data Layer are captured here. They are discussed in detail in the Client Design section.


7.2 Inferencing Engine Workflow

8 Performance and Scale Targets for V1 Release

Below is a list of assumptions that will be used to validate the design and drive the scale and performance test metrics. Note that the numbers are purposefully kept at the higher end to ensure that the query latency for lookup calls is acceptable under adverse load conditions.

We will assume a scale target of 10M registered users for the V1 release.

8.1 Lookup Performance Goals

The initial end-to-end latency goal for the caller ID lookup is for 90th percentile lookups to be within 150ms round-trip time on Wi-Fi and LTE/3G network connections, and 250ms round-trip time for 2G network connections. Round-trip times are measured from the time the client library API is invoked to having the caller Id response available in the Dialer.

8.2 Server Call Load

We will assume that on average a single user will receive or make 20 calls per day. Of these, we will further assume that at most 10 calls will result in a caller ID lookup (i.e., the number is not in the user's local phone book). Lastly, we will assume that most of these calls will be made during daytime, spanning a ~12 hour period.

The above assumptions dictate an average rate of ~250 QPS (queries per second) for the caller ID lookup operation for every 1M registered users (10 * 1M / (12 * 3600) ≈ 231, rounded up). For 10M registered users, that translates to a target capacity of ~2500 QPS.
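As a quick check of the lookup load arithmetic, the sketch below recomputes the rate from the stated assumptions (10 lookups per user per day, spread over a 12 hour window). The function and parameter names are illustrative only.

```python
SECONDS_PER_HOUR = 3600


def lookup_qps(users, lookups_per_user_per_day=10, active_hours=12):
    """Average caller ID lookups per second, assuming an even spread over the active window."""
    return users * lookups_per_user_per_day / (active_hours * SECONDS_PER_HOUR)


print(round(lookup_qps(1_000_000)))   # 231, which the spec rounds up to ~250
print(round(lookup_qps(10_000_000)))  # 2315, which the spec rounds up to ~2500
```

The rounded-up targets leave some headroom over the computed average, consistent with keeping the numbers at the higher end.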


8.3 Address book Sync Load

We will upload/download phone book data at most twice a day. For 10M registered users, this implies up to 20 million sync calls per day. Assuming that all the sync calls are staggered, the maximum sync rate at the service would be ~230 Sync TopSpammer and Sync Blocked List calls per second.

However, the client to service sync is designed to only upload incremental changes to the user's address book. We expect only 1/4th of the sync calls on the client to actually have data to upload (and hence result in a service call). Assuming that all the sync calls are staggered with equal distribution over a 24 hour period, the above implies an average rate of ~60 QPS for sync calls to the service.
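The two sync rates can be recomputed the same way. This is a sketch under the stated assumptions (10M users, two syncs per day, even distribution over 24 hours); the names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def sync_qps(users, syncs_per_day=2, fraction_with_changes=1.0):
    """Average sync calls per second reaching the service."""
    return users * syncs_per_day * fraction_with_changes / SECONDS_PER_DAY


print(round(sync_qps(10_000_000)))                              # 231 (upper bound)
print(round(sync_qps(10_000_000, fraction_with_changes=0.25)))  # 58 (incremental uploads only)
```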

8.4 Storage Capacity

We will assume that on average users will have 250 uploadable contacts with phone numbers in their device phone book (including contacts synced from various cloud services such as email services, facebook, etc.). Of these, we will assume that a user will contribute up to 100 unique new contacts to the service. This number will start to taper off as the service and the size of the phone directory grow.

9 Bing Phone Book Data Storage Model

9.1 Target Storage Capacity Requirements

The total predicted size for the Phone Number Data Table is around 250 GB (10M users * 250 entries * 100 bytes/entry). The total predicted size for the Phone Number Entity Table is calculated based on the assumption that each of the 10M users will contribute at most 100 unique contacts, and is expected to be around 200 GB (10M users * 100 unique entries per user * 200 bytes per entry). Including the other table sizes, we predict the storage capacity requirements to be around 500 GB.
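The storage estimates follow directly from the per-user assumptions. The sketch below reproduces them, assuming decimal gigabytes (10^9 bytes); the helper name is illustrative.

```python
GB = 10**9  # decimal gigabytes (assumption)


def table_size_gb(users, entries_per_user, bytes_per_entry):
    """Estimated table size in GB for a given per-user contribution."""
    return users * entries_per_user * bytes_per_entry / GB


data_table = table_size_gb(10_000_000, 250, 100)    # Phone Number Data Table
entity_table = table_size_gb(10_000_000, 100, 200)  # Phone Number Entity Table
print(data_table, entity_table, data_table + entity_table)  # 250.0 200.0 450.0
```

The two main tables account for 450 GB, which leaves roughly 50 GB of the ~500 GB estimate for the remaining tables.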

9.2 Bing Phone Book Service Storage Design

The Phone Number Entity Table must be able to serve queries at a low latency to be able to meet the latency SLAs. The bulk of writes will be into the Phone Number Data Table. Note that only numbers for which an inferred name is calculated are stored or inserted into the Phone Number Entity Table. The Phone Number Entity Table insertion rate after an inferencing run should reduce over time but may be detrimental to the lookup performance during the initial onboarding of new users.

9.3 Bing Phone Book Service Data Storage

The current V1 implementation is based on SQL Azure. An appropriate number of Standard or Premium instances will be configured to support the latency requirements.

Based on the limited calculations for the V1 release, it is clear that SQL Azure is not the correct long term storage solution for the Bing Phone Book Service, since the SQL Azure data size limit is 250 GB for Standard and 500 GB for Premium. The Bing Phone Book Service Data Layer abstraction is important for this reason. If size and latency requirements dictate, an Azure File Table Storage plugin will be built for V1.

The schema and other storage-related details are discussed in more detail in the Bing Phone Book Service Design Details.


Azure Table Store and Object Store are under evaluation as the long term storage strategy for the Bing Phone Book Service.

10 Geo Scale-Out Model

As the Bing Phone Book Service grows to serve different geographies, the following options can be considered for Geo Scale.

10.1 Centralized

In this model, the inference engine will run in one region, but the lookup data is replicated to different regions for read scale out and for keeping the lookup latencies low.

The phone data is uploaded to the Service Stamp in the closest region. This data needs to be made available to the stamp that will host the inferencing engine. Another way to think about this is as two services: a Front End service that performs caller id lookup and phone data collection, and a Backend service (offline) that hosts the inferencing engine and prepares the global phone number entity table. The phone number entity table needs to be replicated from Region A to other regions, in the picture shown above.


10.2 Federated Model

In this model, the Inferencing Engine would run in each region. One option is to use a Phone Number Prefix to Region Map (is this feasible to build in the mobile world?) where each phone number is mapped to a region. A phone number upload is forwarded to the owning stamp based on this Phone Number Map. A further simplification would be to simply drop the numbers that are not owned by the region. It is unlikely that people will store a significant number of non-local numbers in their contacts.

The federated model seems suitable since the name inferencing should largely run locally and most numbers that are uploaded from the Phone Contacts are likely to be local. This would imply that potentially little data has to flow across data centers as compared to the centralized model. The disadvantage would be that the inferencing of names would be local only. Cost-wise, the federated model seems to be more desirable.
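A prefix-to-region routing step for the federated model could look like the sketch below. Whether such a map is feasible to build is an open question in this spec, so everything here is hypothetical: the map contents, the region names, longest-prefix matching, and the choice to drop unmapped numbers.

```python
def region_for(number, prefix_map):
    """Return the region owning a normalized number, using longest-prefix match."""
    for length in range(len(number), 0, -1):
        region = prefix_map.get(number[:length])
        if region is not None:
            return region
    return None  # no known owner


def route_upload(numbers, local_region, prefix_map, drop_foreign=True):
    """Group uploaded numbers by owning region.

    With drop_foreign=True only locally owned numbers are kept (the
    simplification described above); otherwise the other groups would be
    forwarded to their stamps. Unmapped numbers are dropped in both modes.
    """
    routed = {}
    for number in numbers:
        region = region_for(number, prefix_map)
        if region is None or (drop_foreign and region != local_region):
            continue
        routed.setdefault(region, []).append(number)
    return routed


# Hypothetical map: country codes as prefixes, made-up region names.
PREFIX_MAP = {"+91": "india", "+1": "us"}
uploads = ["+919800000001", "+14255550100", "+442000000000"]
print(route_upload(uploads, "india", PREFIX_MAP))
print(route_upload(uploads, "india", PREFIX_MAP, drop_foreign=False))
```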

11 Security

11.1 Overview

11.2 Data Privacy

Data privacy is ensured through Storage Layer access restrictions. Access to the Data Store is restricted to the Service and a few restricted users. The public APIs that the service exposes do not provide access to any tenant-specific data, but only to the data that the inferencing and spam prediction engines generate. The Bing Phone Book service will follow the recommended security guidelines.

For SQL Azure DBs, firewall rules can be set up to restrict access to a set of IP addresses, so that only the Bing Phone Book service Web and Worker roles have access to the DB. The V1 implementation supports firewalling. In addition, SQL Azure only supports SQL Server authentication, so care needs to be taken that the password is not compromised.

With Azure Tables, standard Azure Protection mechanisms around use of primary/secondary key, key rollover and key regeneration will be used to protect access to the data. Other operational procedures will be used to contain key management operations and visibility to a restricted set of personnel. In addition, configuration within a private VNET and Azure RBAC security investigations are in progress to harden the access to the Data Store.

11.3 Authenticating the Client Application and User (RPS)

The Skype Dialer uses the Relying Party Suite (RPS) authentication protocol.

With RPS, the Dialer App can get tickets for a given service once the user is logged into MSA and their access token has been retrieved.

Next Steps:

Choose Authentication Policy: MBI_SSL (24 hour refresh, no force signin on expiry, ticket type compact)

<AuthPolicy>MBI</AuthPolicy> in rpcserver.xml

Live Auth configuration properties in web.config

Smart Client Protocol setup

11.4 Authenticating the Client Application and User (OAUTH)

The phone book service needs to authenticate the client application and the user to ensure that unauthorized or rogue users are not invoking the service end points with potentially malicious intent.

We assume that the Skype Dialer application requires a Microsoft Account and that the Bing Phone Book Service will use the Windows Live Service as the Authentication Server.

In the OAuth Terminology, the Bing Phone Book client library would be the Client and Bing Phone Book Service would be the Resource Server, as shown in the picture below. However, it is the Skype Dialer Application which actually drives the user authentication process with the Authorization Server. Investigations are in progress as to how OAUTH supports multiple resource servers and if we can leverage that.


For desktop and mobile applications, the OAUTH implicit grant type sequence is recommended.

For the Bing Phone Book Service, the resources made available through its Public API do not relate to a specific user. In that context, a grant type of Client Credentials for generic application access might suffice. If user authorization is required, then the Implicit Grant request sequence for OAUTH, as described above, may be used. [TBD. Does the grant type of client credentials require the client secret?]

Refer to Support for Implicit Grant OAuth 2.0 for Windows Live/ Microsoft Account Service for a description of the implicit grant flow.

The details of the OAuth Implicit grant type are discussed in the Security Sections under Service Design Details.

11.5 Transport Security

HTTPS is used to provide server authentication, which prevents man-in-the-middle attacks, and to encrypt communication between the client and server, which protects against eavesdropping on or tampering with the contents. OAUTH 2.0, in any case, requires SSL security, for the reasons described above.

12 Backup/Disaster Recovery Requirements

TBD


13 Bing Phone Book Client Design Details

13.1 Overview - Periodic Uploads and Downloads

A mobile device (or SIM) is registered with the Bing Phone Book Service (BPBS) when the user first uses the Skype Dialer. Subsequently, the Bing PhoneBook Client connects to the Bing Phone Book Service to upload phone book data, to resolve phone numbers to names and to get spammer information.

The Skype Dialer application calls the Bing PhoneBook Client periodically to upload the user's Contacts and BlockedList to the Bing Phone Book Service and to download the TopSpammerList. Diffs are maintained on the client so that subsequently only changes in the contact list are uploaded.

Currently there is no support for change notifications, which would let the client avoid calculating the diff between the current Contact List and the last uploaded Contact List. Also, the frequency of these uploads and downloads is not configurable independently for ContactList, BlockedList and TopSpammerList.

These uploads and downloads are staggered across users by hashing the user's phone number and deriving from the hash a time of day for syncing the phone book data to the Bing Phone Book Service.
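One way to derive a stable time of day from a phone number hash is sketched below. The spec does not name the hash function or the mapping, so SHA-256, the modulo mapping and the evenly spaced sync windows are all assumptions for illustration.

```python
import hashlib

SECONDS_PER_DAY = 86_400


def sync_times(phone_number, syncs_per_day=2):
    """Return the daily sync times for a user, as seconds since midnight.

    The first sync falls at a hash-derived offset inside the first window,
    and later syncs repeat at the same offset in each following window.
    """
    window = SECONDS_PER_DAY // syncs_per_day
    digest = hashlib.sha256(phone_number.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:8], "big") % window
    return [offset + i * window for i in range(syncs_per_day)]


for seconds in sync_times("+919876543210"):
    print(f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}")
```

A cryptographic digest is used rather than Python's built-in hash() because string hashing in Python is randomized per process by default, so the schedule would otherwise change across restarts.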

13.2 Caching

The Phone Book Service Client caches the last X numbers looked up and the top spammer list on the client. The spammer list in the cache is updated every time the client syncs with the service. [Vinay, what is the caching mechanism? Can you add the details? Why is the top spammer list not cached on the server as well?]

13.3 NameLookup

On an incoming call, the PhoneBookClient checks the local cache to see if the number is on the local blocked list or the spam list and if the lookup can be resolved locally. If not, it calls the Azure Service if the network mode allows it.
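The local-first lookup order can be sketched as follows. The caching mechanism is still an open question in this spec, so the least-recently-used cache below is only one possible reading of "the last X numbers looked up"; the class, the function, the result shapes and the fetch_remote callback are hypothetical and do not reflect the client library API.

```python
from collections import OrderedDict


class LookupCache:
    """Bounded cache of recent lookups; evicts the least recently used entry."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._entries = OrderedDict()

    def get(self, number):
        if number not in self._entries:
            return None
        self._entries.move_to_end(number)  # mark as most recently used
        return self._entries[number]

    def put(self, number, entity):
        self._entries[number] = entity
        self._entries.move_to_end(number)
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop the oldest entry


def lookup(number, blocked, top_spammers, cache, network_allowed, fetch_remote):
    """Resolve a number locally first; call the service only if allowed."""
    if number in blocked:
        return {"status": "blocked"}
    if number in top_spammers:
        return {"status": "spam"}
    entity = cache.get(number)
    if entity is None and network_allowed:
        entity = fetch_remote(number)  # stands in for the REST lookup call
        if entity is not None:
            cache.put(number, entity)
    if entity is None:
        return None  # unresolved: offline, or unknown to the service
    return {"status": "resolved", "entity": entity}
```

With network_allowed set to False the function never invokes fetch_remote, which is how a low battery or no-network setting could be honored.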

13.4 Blocked Phone Number handling

When a number is marked as blocked, the BingPhoneBook client calls the BingPhoneBookService synchronously. If the number hasn't been uploaded yet, will AddToBlockList handle that appropriately?

13.5 Special Scenario Handling

13.5.1 LowBatteryConditions

Dialer Application will call SetNetworkStatus with the appropriate settings.

13.5.2 Shutdown

Dialer application calls shutdown. The current CallerIdService instance becomes invalid after this call. A new instance should be requested via GetCallerIdServiceInstance.

13.5.3 Lost Phone Scenario

13.5.4 Dual Sim Scenario

Nitesh Jain, 06/10/15: We should mention that this is basically to support offline scenario.
Nitesh Jain, 06/10/15: Also, new SIM scenario
Bharat Joshi, 06/09/15: Dialer app calls shutdown. Current instance becomes invalid post this call and new instance should be requested via GetCallerIdServiceInstance.
Nitesh Jain, 06/10/15: Think we need to add more details here, on how we honor low battery settings. Also do we honor low battery settings or no network settings?
Sunita Shrivastava, 06/10/15: Fixed.
Sunita Shrivastava, 06/10/15: (no comment text)
Vinay Surana, 06/09/15: "When any number is marked as blocked" not just local contact list
Bharat Joshi, 06/09/15: Yes.
Sunita Shrivastava, 06/10/15: Fixed.
Bharat Joshi, 06/09/15: Both blocked and spam list. Service will be called if the network mode allows it.
Vinay Surana, 06/09/15: should we just call it LookUp?

13.6 Client Side Performance Metrics

13.6.1 Metrics

Here is a list of client side performance metrics that the instrumentation will support:

13.6.2 Client Side Metrics Framework

13.7 Other Client Side Telemetry

The Bing Phone Book client will use the same client telemetry framework as the Skype Dialer Client does.

14 Bing Phone Book Service Design Details

14.1 Deployment Topology and Sizing

- 3 Instances of Web Role: A1
- 1 Instance of Worker Role: A0
- 1 Instance of MFC Role: A0 -> this could be folded as optional in the web role?

14.2 REST APIs

API               | Type | Request                                                 | Response
Register          | POST | UserName, UserPhoneNumber, UserAppToken(UserSkypeToken) |
SyncContacts      | POST | UserPhoneNumber, Contacts                               |
SyncBlockedList   | POST | UserPhoneNumber, BlockedList                            |
GetTopSpammers    | GET  | UserPhoneNumber, N                                      | PhoneNumberEntityList
LookupPhoneNumber | GET  | UserPhoneNumber, PhoneNumberToLookup                    | PhoneNumberEntity

The REST GET APIs, in this case GetTopSpammers, will be designed to leverage HTTP and server side caching. [Are we sending the last modified date or an ETag header? Not critical initially, since the frequency of the call is not that high, but with a high volume of users it might still be useful.]
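If the ETag route were chosen, the conditional GET handling for GetTopSpammers could look roughly like this. The choice between ETag and Last-Modified is still open in this spec, so this is only a sketch: the function names, the hash-of-body ETag and the tuple-shaped response are assumptions, and no web framework is involved.

```python
import hashlib
import json


def make_etag(payload):
    """Derive a quoted ETag from the serialized payload (assumed scheme)."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def get_top_spammers_response(top_spammers, if_none_match=None):
    """Return (status, headers, body) for a GetTopSpammers request.

    When the client's If-None-Match value equals the current ETag, the list
    has not changed and a bodiless 304 is returned instead of the full list.
    """
    etag = make_etag(top_spammers)
    if if_none_match == etag:
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, top_spammers


status, headers, body = get_top_spammers_response(["+911", "+912"])
status_again, _, _ = get_top_spammers_response(["+911", "+912"], headers["ETag"])
print(status, status_again)  # 200 304
```

Because the top spammer list is refreshed on the server only periodically, most client syncs would hit the 304 path and skip the download.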

Comment: Currently, the order of parameters is switched. Is there a reason for that? It would be nice to be consistent.

14.3 Name Inferencing Engine

The name inferencing engine evaluates multiple rows for a given phone number in the PhoneNumberDataTable for the list of phone numbers in the PhoneNumbersUpdatedTable and generates an inferred name.

The current algorithm is fairly simple. It generates an inferred name for both full name and first name, if more than some threshold of entities (current threshold is 5) have the same name set as the full name or the first name.


Currently we do not receive or use the first and last name information from the local contact list.
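The threshold rule can be sketched as below. This is an illustration under stated assumptions rather than the engine's actual code: names are normalized by collapsing whitespace and case-folding, and since separate first and last names are not received, the first name is taken to be the first token of the uploaded name.

```python
from collections import Counter


def infer_names(uploaded_names, threshold=5):
    """Infer a full name and first name for one phone number.

    uploaded_names: names under which different users stored the number.
    A name is inferred only if more than `threshold` uploads agree on it.
    """
    full_counts = Counter()
    first_counts = Counter()
    for name in uploaded_names:
        normalized = " ".join(name.split()).casefold()  # assumed normalization
        if not normalized:
            continue
        full_counts[normalized] += 1
        first_counts[normalized.split(" ")[0]] += 1

    def winner(counts):
        if not counts:
            return None
        name, count = counts.most_common(1)[0]
        return name if count > threshold else None

    return {"full_name": winner(full_counts), "first_name": winner(first_counts)}


names = ["Ravi Kumar"] * 3 + ["Ravi Sharma"] * 3
print(infer_names(names))  # {'full_name': None, 'first_name': 'ravi'}
```

In the example, no single full name crosses the threshold, but six uploads agree on the first name, so only the first name is inferred.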

Q: is the inferencing engine doing bulk reads/updates? It is important that the Bing Phone Book Service serves lookups at low latency even while the Inferencing engine is updating entries. Potentially this can cause latency issues.

14.4 Spammer Prediction

The spam count for a number is incremented if it is found on the blocked list of a given user. If the spam count for a number is greater than the threshold (what is this value currently?), it is assumed to be spam. The top spammer list is built by the Bing Phone Book Service. Is this done periodically? Does it cache this on the server side?

14.5 SQL Azure Storage Data Layer Overview

Do we have an estimate of what percentage of these numbers would be spam?

14.5.1 Options Following options are plausible and were evaluated:

SQL Azure

SQL Azure is not a desirable option for more than 500 GB of data, from both a performance and a COGS perspective.

Azure Tables

Azure Tables essentially provides a simple table indexed by partition key and row key. A single batch request may contain at most 100 entities (and must not exceed 4 MB in size). Azure Tables supports a query filter option but not an ordering operation.

For the lookup table, the partition key could be a hash of the first digits of the phone number, and the row key the phone number itself.

For the data table as well, the partition key could be a hash of the phone number, and the row key the row ID.
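The key scheme proposed above might look like the following sketch. The number of leading digits, the number of hash buckets, and the use of SHA-256 are all assumptions chosen for illustration; the spec fixes none of them.

```python
import hashlib

PREFIX_DIGITS = 5      # assumption: how many leading digits feed the hash
NUM_PARTITIONS = 1024  # assumption: number of hash buckets


def digits_only(phone_number):
    """Strip formatting such as '+', spaces, and dashes."""
    return "".join(ch for ch in phone_number if ch.isdigit())


def _bucket(text):
    """Map text to a stable, zero-padded partition bucket name."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return format(int(digest, 16) % NUM_PARTITIONS, "04d")


def lookup_table_keys(phone_number):
    """(PartitionKey, RowKey) for the lookup table: hash of first digits, full number."""
    digits = digits_only(phone_number)
    return _bucket(digits[:PREFIX_DIGITS]), digits


def data_table_keys(phone_number, row_id):
    """(PartitionKey, RowKey) for the data table: hash of the number, row id."""
    return _bucket(digits_only(phone_number)), str(row_id)
```

Under this sketch, numbers sharing a prefix land in the same lookup partition, while all data rows for one phone number share a data-table partition, which keeps them eligible for a single batched read.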

To calculate top spammers, a worker role would need to read the spammers into memory, using batched reads. The worker role would eventually write the resulting top spammers list into a separate table. The top spammers list can also be cached on the server side.
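Because Azure Tables cannot order results, the worker role has to do the ranking itself. One way to do so without holding every spammer in memory is a bounded min-heap over the batches, sketched below. The batch shape (lists of `(phone_number, spam_count)` tuples) and the function name are assumptions.

```python
import heapq


def top_spammers(batches, n):
    """Return the n entries with the highest spam count, highest first.

    `batches` is any iterable of lists of (phone_number, spam_count) tuples,
    standing in for the results of successive batched reads.
    """
    if n <= 0:
        return []
    heap = []  # min-heap of (spam_count, phone_number), never larger than n
    for batch in batches:
        for phone_number, spam_count in batch:
            item = (spam_count, phone_number)
            if len(heap) < n:
                heapq.heappush(heap, item)
            elif item > heap[0]:
                # Replace the current smallest entry with the larger one.
                heapq.heapreplace(heap, item)
    return [(phone, count) for count, phone in sorted(heap, reverse=True)]
```

Memory use stays proportional to N rather than to the total number of spammers, at the cost of one full scan per recalculation.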

Object Store

TBD.

Three tables in SQL Azure are used by the Bing Phone Book Service:

PhoneNumberEntityTable: Lookups are served from this table.

PhoneNumberDataTable: All sync data (contacts, suggestions) is inserted into this table.

PhoneNumberUpdated: Used by the inferencing engine to detect whether a phone number entry in the data table has updates.

14.5.2 Phone Number Entity Table Schema

Salient Details

A filtered non-clustered index on the 'IsSpam' column is used to improve the performance of ordering spammers by spam count.

A separate table, PhoneNumberUpdated, is used by the inferencing engine to limit re-inferencing to impacted numbers only.

| ColumnName | ColumnType | IsNullable | Comments |
| --- | --- | --- | --- |
| PhoneNumber | Nvarchar(20) | Not null | Primary key |
| RegisteredName | Nvarchar(max) | Null | Registered name of the phone number if registered, otherwise null |
| InferredName | Nvarchar(max) | Null | Inferred name of the phone number, if any, produced by the inference model |
| IsSpam | Bit | Not null | Whether the phone number was detected as spam by spam inference (is this a sparse column?) |
| SpamCount | Bigint | Not null | Spam count for the phone number |
| Locality | Nvarchar(50) | Null | Locality of the phone number (mainly inserted for Bing Local Entities) |
| City | Nvarchar(30) | Null | City of the phone number (mainly inserted for Bing Local Entities) |
| UserAppToken | Nvarchar(50) | Null | A unique token given by the app for the corresponding phone number. WHAT IS THIS FOR? |
| LastUpdatedTime | Datetime | Not null | The time when this row was last updated |

Indexes on PhoneNumberEntityTable:

| Column | IndexType | Comments |
| --- | --- | --- |
| PhoneNumber | Clustered | As it is the primary key |
| UserAppToken | Non-clustered | To look up the phone number and other info for a given user app token |
| IsSpam | Non-clustered, filtered (where IsSpam=1) | To easily get top spammers |
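The effect of the filtered index can be tried locally with SQLite, which supports the same idea under the name "partial index". This is a stand-in sketch, not the SQL Azure DDL: the index name `IX_TopSpammers` is hypothetical, only three of the table's columns are modeled, and keying the index on SpamCount (rather than IsSpam) is an assumption made so that the ORDER BY can use it.

```python
import sqlite3


def create_entity_table():
    """Create a reduced PhoneNumberEntityTable with a filtered (partial) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE PhoneNumberEntityTable (
               PhoneNumber TEXT PRIMARY KEY,
               IsSpam      INTEGER NOT NULL,
               SpamCount   INTEGER NOT NULL
           )"""
    )
    # Only rows with IsSpam = 1 are indexed, so the index stays small.
    conn.execute(
        "CREATE INDEX IX_TopSpammers ON PhoneNumberEntityTable (SpamCount) "
        "WHERE IsSpam = 1"
    )
    return conn


def get_top_spammers(conn, n):
    """Return up to n (PhoneNumber, SpamCount) rows, highest spam count first."""
    return conn.execute(
        "SELECT PhoneNumber, SpamCount FROM PhoneNumberEntityTable "
        "WHERE IsSpam = 1 ORDER BY SpamCount DESC LIMIT ?",
        (n,),
    ).fetchall()
```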

14.5.3 PhoneNumberDataTable Schema

| ColumnName | ColumnType | IsNullable | Comments |
| --- | --- | --- | --- |
| RowId | Bigint | Not null | Primary key; required so that the table has at least one primary key |
| PhoneNumber | Nvarchar(20) | Not null | |
| ContactName | Nvarchar(max) | Null | |
| TypeOfPhoneNumber | Int | Not null | The type of phone number: Unknown = 0, Mobile = 1, Home = 2, Work = 3, Company = 4, HomeFax = 5, WorkFax = 6 |
| Source | Nvarchar(max) | Null | Source of the phone number, e.g. Facebook, Outlook/Spam |
| LastUpdatedTime | DateTime | Not null | The time when this row was last updated |
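The TypeOfPhoneNumber values listed in the schema map naturally onto an integer enumeration. A small sketch follows; the helper `parse_phone_type` and its fallback to Unknown for unrecognized values are assumptions, not behavior described in the spec.

```python
from enum import IntEnum


class TypeOfPhoneNumber(IntEnum):
    """Integer codes stored in the TypeOfPhoneNumber column."""
    UNKNOWN = 0
    MOBILE = 1
    HOME = 2
    WORK = 3
    COMPANY = 4
    HOME_FAX = 5
    WORK_FAX = 6


def parse_phone_type(value):
    """Convert a stored integer to the enum; unrecognized values map to UNKNOWN."""
    try:
        return TypeOfPhoneNumber(value)
    except ValueError:
        return TypeOfPhoneNumber.UNKNOWN
```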

Indexes on PhoneNumberDataTable:

| Column | IndexType | Comments |
| --- | --- | --- |
| RowId | Non-clustered | Primary key |
| PhoneNumber | Clustered | For fast aggregation on PhoneNumber, used in inferencing |

14.5.4 PhoneNumberUpdated Table Schema

| ColumnName | ColumnType | IsNullable | Comments |
| --- | --- | --- | --- |
| PhoneNumber | Nvarchar(20) | Not null | Primary key |
| HasUpdates | Bit | Not null | Set to 1 if there are updates in the data table; reset to 0 by the inferencing engine once done |

Indexes:

| Column | IndexType | Comments |
| --- | --- | --- |
| PhoneNumber | Clustered | As it is the primary key |
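The HasUpdates flag acts as a dirty marker between the sync path and the inferencing engine. The following in-memory sketch shows that handshake; the class and method names are hypothetical, and a dictionary stands in for the PhoneNumberUpdated table.

```python
class UpdateTracker:
    """In-memory stand-in for the PhoneNumberUpdated table."""

    def __init__(self):
        self.has_updates = {}  # PhoneNumber -> HasUpdates flag

    def mark_updated(self, phone_number):
        """Called by the sync path whenever rows for this number change."""
        self.has_updates[phone_number] = True

    def pending(self):
        """Phone numbers whose data rows changed since the last inference run."""
        return sorted(p for p, flag in self.has_updates.items() if flag)

    def run_inference(self, infer):
        """Re-infer only impacted numbers, resetting each flag once done."""
        results = {}
        for phone_number in self.pending():
            results[phone_number] = infer(phone_number)
            self.has_updates[phone_number] = False
        return results
```

A second call to `run_inference` with no intervening syncs does no work, which is the point of keeping the flag in a separate table.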

14.5.5 SQL Maintenance

Do we force-run UPDATE STATISTICS? When? How do we manage the index fragmentation that will be caused by repeated updates? In our case, how bad will the index fragmentation be?

14.6 Security

14.6.1 Implicit Grant Type Sequence

Before a client application can request access to resources on a resource server, it must register with the authorization server associated with that resource server. At registration, the authorization server assigns the client application a client ID and a client secret. The client ID and secret are unique to the client application on that authorization server. It is important that the client identity and the secret generated by the provider are not shared with anyone. In our case, since the app is a deployed Java application, it is preferable not to use the secret. For this reason, the OAuth Implicit Grant Type is recommended for mobile or desktop applications.

During the time of registration, the client also registers a redirect URI. This redirect URI is used when a resource owner grants authorization to the client application. When a resource owner has successfully authorized the client application via the authorization server, the resource owner is redirected back to the client application, to the redirect URI.

Note that the authorization service will only redirect users to a registered URI, which helps prevent some attacks. Any HTTP redirect URIs must be protected with TLS security, so the service will only redirect to URIs beginning with "https". This prevents tokens from being intercepted during the authorization process.
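The two redirect rules above (registered URIs only, and "https" only) can be expressed as a short check. This is a sketch under the assumption that registered URIs are compared by exact string match; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_allowed_redirect(redirect_uri, registered_uris):
    """Accept a redirect URI only if it is registered and uses https.

    Assumption: registered URIs are matched exactly, with no wildcard or
    prefix matching.
    """
    if urlsplit(redirect_uri).scheme != "https":
        return False
    return redirect_uri in registered_uris
```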

An implicit authorization grant is similar to an authorization code grant, except that the access token is returned to the client application as soon as the user has finished the authorization. The access token is thus returned when the user agent is redirected to the redirect URI.

This of course means that the access token is accessible in the user agent, or native application participating in the implicit authorization grant. The access token is not stored securely on a web server.

Furthermore, the client application need only send its client ID to the authorization server. If the client were to send its client secret too, the client secret would have to be stored in the user agent or native application too. That would make it vulnerable to hacking.
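The client side of this sequence can be sketched as follows. The endpoint URL, client ID, and scope passed in are placeholders supplied by the caller, and the `state` parameter is an addition from the OAuth 2.0 specification rather than something this document describes. Note that the request carries only the client ID, never the client secret, and that the token comes back in the URI fragment.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Build the implicit-grant authorization request; returns (url, state)."""
    state = secrets.token_urlsafe(16)  # random value to tie the response to this request
    query = urlencode({
        "response_type": "token",  # "token" selects the implicit grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return authorize_endpoint + "?" + query, state


def extract_access_token(redirected_url, expected_state):
    """Read the access token from the fragment of the redirect URI."""
    fragment = parse_qs(urlsplit(redirected_url).fragment)
    if fragment.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; response does not belong to this request")
    if "error" in fragment:
        raise ValueError("authorization failed: " + fragment["error"][0])
    return fragment["access_token"][0]
```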

14.6.2 Using Skype Dialer Authentication – Multiple Resource Server Support

15 Server Side Performance Metrics

16 Monitoring

Azure Diagnostic Framework and Azure Monitoring Framework will be used to instrument, persist and monitor the service.