TREX Analysis 1
Uploaded by anmol-pareek
8/10/2019 TREX Analysis 1
TREX Analysis
-
SPL Screening Details
A step-by-step process consisting of the steps below-
Word in a word-
(SPL Name- Huji, BP Name- Shuji): SPL name is fully contained in BP name, so screening moves on to the next step
(SPL Name- Hujin, BP Name- Shuji): SPL name is not fully contained in BP name, so there is no block; screening does not move to the next step and the partner is released
Word match % (70%)-
Direction of match: BP Name (Source) to SPL (Target)
(SPL Name- Huji, BP Name- Shuji): match rate 4/5 = 80% > 70% (threshold), moves to phrase match %
(SPL Name- Huji, BP Name- Shujies): match rate 4/7 = 57% < 70% (threshold), partner is released (no block)
Phrase match % (51%)-
Direction of match: SPL Name (Target) to BP Name (Source)
(SPL Name- Huji, BP Name- Shuji Private Contractors): match rate 1/1 = 100% > 51% (threshold), moves to Delisting check
(SPL Name- Huji Private Contractors, BP Name- Shuji): match rate 1/3 = 33.3% < 51% (threshold), partner is released
Delisting-
Looks at end-dated SPL listings and releases the partner even if the above 3 steps are a match
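The four steps above can be sketched in Python. This is only an illustration that reproduces the slide's worked examples, not the actual screening implementation. It assumes the word match rate is the SPL word's length divided by the BP word's length when the SPL word is contained in the BP word, and that the phrase match rate is the share of SPL words that pass the word threshold. The function names and the valid_to and today parameters are hypothetical.

```python
from datetime import date

WORD_THRESHOLD = 0.70    # word match % from the slide
PHRASE_THRESHOLD = 0.51  # phrase match % from the slide


def word_match_rate(spl_word, bp_word):
    """Share of the BP word covered by the SPL word (BP -> SPL direction)."""
    spl_word, bp_word = spl_word.lower(), bp_word.lower()
    # Word in a word: the SPL word must be fully contained in the BP word.
    if spl_word not in bp_word:
        return 0.0
    return len(spl_word) / len(bp_word)


def phrase_match_rate(spl_name, bp_name):
    """Share of SPL words that match some BP word (SPL -> BP direction)."""
    spl_words = spl_name.split()
    bp_words = bp_name.split()
    if not spl_words:
        return 0.0
    matched = sum(
        1 for s in spl_words
        if any(word_match_rate(s, b) >= WORD_THRESHOLD for b in bp_words)
    )
    return matched / len(spl_words)


def spl_blocks(spl_name, bp_name, valid_to=None, today=None):
    """Return True if the partner is blocked, False if it is released."""
    # Word in a word, word match % and phrase match % (assumed formulas).
    if phrase_match_rate(spl_name, bp_name) < PHRASE_THRESHOLD:
        return False
    # Delisting: an end-dated SPL listing releases the partner even on a match.
    today = today or date.today()
    if valid_to is not None and valid_to < today:
        return False
    return True


print(spl_blocks("Huji", "Shuji"))                      # 4/5 word match, 1/1 phrase match
print(spl_blocks("Huji", "Shujies"))                    # 4/7 word match, below 70%
print(spl_blocks("Huji Private Contractors", "Shuji"))  # 1/3 phrase match, below 51%
```

Under these assumptions the Huji/Shuji pair is blocked, while the Shujies and Huji Private Contractors examples are released, as on the slide.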
-
TREX Screening Details
A step-by-step process consisting of the steps below-
TREX algorithm-
Choice of algorithm: F (Fuzzy Logic) or G (Enhanced Logic)
(BP Name- Shuji, SPL Name- Huji): no match with F, match with G (Fuzzy Logic F checks whether the first letter is the same in SPL and BP)
Fuzzy Logic also allows for a specific level of misspelling
Missing letters (SORD vs. SWORD)
Inverted letters (SOCIETY vs. SOCEITY)
Additional letters (Laboratories vs. Laboratory)
Word match % (70%)-
Direction of match: BP Name (Source) to SPL (Target)
(SPL Name- Huji, BP Name- Shuji): match rate 4/5 = 80% > 70% (threshold), moves to phrase match %
(SPL Name- Huji, BP Name- Shujies): match rate 4/7 = 57% < 70% (threshold), partner is released
Phrase match % (51%)-
Direction of match: SPL Name (Target) to BP Name (Source)
(SPL Name- Huji, BP Name- Shuji Private Contractors): match rate 1/1 = 100% > 51% (threshold), moves to Delisting check
(SPL Name- Huji Private Contractors, BP Name- Shuji): match rate 1/3 = 33.3% < 51% (threshold), partner is released
Delisting-
Looks at end-dated SPL listings and releases the partner even if the above 3 steps are a match
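The three kinds of misspelling listed under the TREX algorithm (missing, inverted and additional letters) correspond to the edit operations of a Damerau-Levenshtein style distance. The sketch below uses the optimal string alignment variant purely as an illustration. The slides do not say which algorithm TREX uses internally, so the distance function, the max_edits tolerance and the require_same_first_letter flag are all assumptions. The flag mimics the first-letter check the slide attributes to algorithm F.

```python
def osa_distance(a, b):
    """Optimal string alignment distance between two words.

    Counts insertions, deletions, substitutions and swaps of adjacent letters.
    """
    a, b = a.lower(), b.lower()
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # missing letter
                d[i][j - 1] + 1,         # additional letter
                d[i - 1][j - 1] + cost,  # substituted letter
            )
            # Inverted (swapped adjacent) letters count as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def fuzzy_match(bp_word, spl_word, max_edits=1, require_same_first_letter=True):
    """Hypothetical matcher: the flag stands in for the F-style first-letter check."""
    if require_same_first_letter and bp_word[:1].lower() != spl_word[:1].lower():
        return False
    return osa_distance(bp_word, spl_word) <= max_edits


print(fuzzy_match("SORD", "SWORD"))        # missing letter
print(fuzzy_match("SOCIETY", "SOCEITY"))   # inverted letters
print(fuzzy_match("Shuji", "Huji"))        # first letters differ
print(fuzzy_match("Shuji", "Huji", require_same_first_letter=False))
```

With the first-letter check on, Shuji and Huji do not match. With it off they do, which mirrors the F versus G behaviour described above without claiming to reproduce either algorithm.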
-
Differences between SPL & TREX
Functionality | SPL | TREX | Impact to match rate/ Quality
Word in a word (BP Name- SORD, SPL Name- SWORD) | No block | Block | TREX increases hit rate
Word Match | | | No Impact
Phrase Match | | | No Impact
Delisting (SPL Validity Date) | | |
TREX Algorithm (BP Name- Word, SPL Name- SWORD) | Block | No block (search algorithm F); Block (search algorithm G) |
-
Testing Approach 1
Sandbox copied from production (July 2nd)
Ran SPL screening in sandbox
Ran 3 cycles of TREX screening in sandbox (different configurations)
Picked 25 partners blocked by SPL & 25 by TREX (Algorithm- F, Match Rate- 85%)
TREX Algorithm Used- G, Match Rate- 70%
 | SPL Screening | TREX Screening
Blocked HR Partners | 14 | 246
Blocked Logistics Partners | 1540 | 8221
TREX Algorithm Used- F, Match Rate- 80%
 | SPL Screening | TREX Screening
Blocked HR Partners | 14 | 147
Blocked Logistics Partners | 1540 | 5103
TREX Algorithm Used- F, Match Rate- 85%
 | SPL Screening | TREX Screening
Blocked HR Partners | 14 | 0
Blocked Logistics Partners | 1540 | 667
 | In TREX | In SPL
True Positive | 18 (72%) | 10 (40%)
False Positive | 5 (20%) | 8 (33%)
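To see how strongly the TREX configuration drives block volume, the counts from the three cycles above can be expressed as a multiple of the SPL baseline. The snippet below is a small sketch using only the figures from this slide. The dictionary layout and the function name are illustrative choices, not part of any screening tool.

```python
# Blocked-partner counts from the three sandbox cycles of Testing Approach 1.
SPL_BASELINE = {"HR": 14, "Logistics": 1540}
TREX_RUNS = {
    ("G", 70): {"HR": 246, "Logistics": 8221},
    ("F", 80): {"HR": 147, "Logistics": 5103},
    ("F", 85): {"HR": 0, "Logistics": 667},
}


def trex_to_spl_ratio(algorithm, match_rate, partner_type):
    """TREX blocks as a multiple of SPL blocks for one configuration."""
    trex_blocks = TREX_RUNS[(algorithm, match_rate)][partner_type]
    return trex_blocks / SPL_BASELINE[partner_type]


for (algorithm, rate), counts in TREX_RUNS.items():
    for partner_type in counts:
        ratio = trex_to_spl_ratio(algorithm, rate, partner_type)
        print(f"Algorithm {algorithm} at {rate}%, {partner_type}: {ratio:.2f}x SPL")
```

For logistics partners the multiple drops from roughly 5.3x at G/70% to below 1x at F/85%, which is why the choice of algorithm and match rate matters for the hit rate.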
-
Testing Approach 2
Ran screening (TREX in sandbox & SPL in production) in tandem from August 4th to August 10th
Statistics from 5 consecutive days are taken into account
TREX configuration (Algorithm- F, Match Rate- 70%); match percentage is the same in production and sandbox
Picked up 27 partners from TREX & 23 from SPL
Date of Screening | Blocked Partners in TREX | Blocked Partners in SPL | Examples
08/04 | 8 | 8 |
08/05 | 42 | 39 |
08/06 | 546 | 175 | Dimon matched against Dixon, Damon; Adel matched against Aden, Axel
08/07 | 108 | 34 |
08/08 | 78 | 126 | Hua matched against Chua; Hum matched against Humi & Shum
 | In TREX | In SPL
True Positive | 6 (22.2%) | 3 (13.04%)
False Positive | 21 (77.8%) | 20 (86.96%)
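The percentages in the true/false positive table follow directly from the sample sizes (27 partners reviewed from TREX, 23 from SPL). A minimal check, with a hypothetical helper name:

```python
def sample_rates(true_positives, false_positives):
    """Return (true positive %, false positive %) for a reviewed sample."""
    reviewed = true_positives + false_positives
    return (100 * true_positives / reviewed, 100 * false_positives / reviewed)


trex_tp, trex_fp = sample_rates(6, 21)  # 27 partners picked from TREX
spl_tp, spl_fp = sample_rates(3, 20)    # 23 partners picked from SPL

print(f"TREX: {trex_tp:.1f}% true positive, {trex_fp:.1f}% false positive")
print(f"SPL:  {spl_tp:.2f}% true positive, {spl_fp:.2f}% false positive")
```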
-
Testing Approach 3
Picked up negative partners from production (156 in number) and screened them in sandbox (with TREX algorithm set at match rate 80%)
Uploaded the production MKDATA file (Aug 19-22) into sandbox and ran the screening with TREX algorithm (match rate 80%)
All the negative partners from production are blocked in TREX screening (at 80%)
Over a period of 5 days, TREX blocked 326 partners vs 292 in SPL (an increase of about 11.6%)
Screening | Negative Partners | Aug 19 | Aug 20 | Aug 21 | Aug 22 | Total
Total Blocked in SPL | 156 | 3 | 44 | 51 | 38 | 292
Total Blocked in TREX | 156 | 25 | 34 | 72 | 39 | 326
Common Blocks | 156 | 3 | 30 | 46 | 37 | 272
Blocked in SPL Alone | 0 | 0 | 14 | 5 | 1 | 20
Blocked in TREX Alone | 0 | 22 | 4 | 26 | 2 | 54
False Positives in SPL | | | 14 | 5 | 1 | 20
False Positives in TREX | | 22 | 4 | | 2 | 28
 | In TREX | In SPL
True Positive | 26 (48.15%) | 0 (0%)
False Positive | 28 (51.85%) | 20 (100%)
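The derived rows of the table can be recomputed from the three "total" rows, which is a useful consistency check on the figures. The sketch below uses only the numbers from this slide. The variable names are illustrative.

```python
columns = ["Negative Partners", "Aug 19", "Aug 20", "Aug 21", "Aug 22"]
spl = [156, 3, 44, 51, 38]      # Total Blocked in SPL
trex = [156, 25, 34, 72, 39]    # Total Blocked in TREX
common = [156, 3, 30, 46, 37]   # Common Blocks

# Blocks raised by only one of the two screenings.
spl_alone = [s - c for s, c in zip(spl, common)]
trex_alone = [t - c for t, c in zip(trex, common)]

# Relative increase in total blocks when moving from SPL to TREX.
increase_pct = 100 * (sum(trex) - sum(spl)) / sum(spl)

# Of the TREX-only blocks, 28 were reported as false positives.
trex_false_positives = 28
trex_true_positives = sum(trex_alone) - trex_false_positives
trex_tp_pct = 100 * trex_true_positives / sum(trex_alone)

print(dict(zip(columns, spl_alone)))
print(dict(zip(columns, trex_alone)))
print(f"TREX blocks {increase_pct:.1f}% more partners than SPL")
print(f"{trex_true_positives} of {sum(trex_alone)} TREX-only blocks are true positives ({trex_tp_pct:.2f}%)")
```

The recomputed "alone" rows and totals agree with the table, and the increase works out to 34 extra blocks on a base of 292.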
-
Summary/ Way Forward
Observations-
No. of Matches-
TREX is returning more matches: on average 10% more at match rate 80% than SPL at match rate 70%
Quality of Matches-
For the sample data analyzed, TREX returned fewer false positives (concluded after 3 rounds of testing; even though the sample population is small, TREX is consistently showing an improvement in the quality of matches vs SPL)
Not a single genuine block is getting missed in TREX (confirmed as part of Negative Partner screening)
Recommendations-
Based on our observations (after multiple rounds of testing and analysis) we can say that TREX is consistently delivering a higher quality of matches than SPL (fewer false positives, more true positives) and will provide a boost to the quality of our SPL screening
TREX is increasing the hit rate, which can be reduced because TREX gives us-
The ability to increase both word match % and phrase match %, due to higher assurance that truly similar words will match
Setting TREX sensitivity at a sufficiently high level will prevent truly dissimilar words from matching
Business can take a call on the match rate (testing completed at 80% for 5 days) (testing approach 3)