Unbiased Offline Recommender Evaluation for Missing-Not-At...
Transcript of Unbiased Offline Recommender Evaluation for Missing-Not-At...
![Page 1: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/1.jpg)
Unbiased Offline Recommender Evaluation for Missing-Not-At-Random Implicit Feedback
1
Funders:
Longqi Yang Yin Cui Yuan Xuan Chenyang Wang Serge Belongie Deborah Estrin
![Page 2: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/2.jpg)
2
Offline Evaluation of Recommendation Algorithm
( , )( , )
( , )… …
user-item interactions
R
recommendation algorithms
rewards
…
![Page 3: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/3.jpg)
…
R
recommendation algorithms
rewards
…
3
Offline Evaluation of Recommendation Algorithm
( , )( , )
( , )…
user interaction history
Pros:• Cost effective.• Efficient.• Iterate faster.• Experiment before deployment.
![Page 4: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/4.jpg)
…
R
recommendation algorithms
rewards
…
4
Offline Evaluation of Recommendation Algorithm
( , )( , )
( , )…
user interaction history
Pros:• Cost effective.• Efficient.• Iterate faster.• Experiment before deployment.
Cons:• The data is Missing-Not-At-Random (MNAR)
![Page 5: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/5.jpg)
5
Offline Evaluation procedure
user !
item "
user ! interacted with item "
![Page 6: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/6.jpg)
6
Offline Evaluation procedure
train/test
![Page 7: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/7.jpg)
7
Offline Evaluation procedure
1. Train and validate a recommendation model
2. Averaged performance over held-out (user, item) interaction pairs
(Average-Over-All)
![Page 8: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/8.jpg)
8
Offline Evaluation procedure
1. Train and validate a recommendation model
(policy) !
2. Averaged performance over held-out (user, item) interaction pairs
(Average-Over-All)
Rating-based recommendation systems
Implicit feedback-based recommendation systems
![Page 9: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/9.jpg)
9
Previous work: Average-Over-All is biased for rating-based recommendation systems, because ratings are MNAR
[Marlin et al. 09], [Schnabel et al. 16], [Steck 10], [Steck 11], and [Steck 13]
![Page 10: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/10.jpg)
10
Previous work: Average-Over-All is biased for rating-based recommendation systems, because ratings are MNAR
[Marlin et al. 09], [Schnabel et al. 16], [Steck 10], [Steck 11], and [Steck 13]
Previous work: Average-Over-All is unbiased for implicit feedback-based recommendation systems, because implicit
feedback is missing uniformly at random.[Lim 15]
![Page 11: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/11.jpg)
11
This work: Average-Over-All is biased for implicit feedback-based recommendation systems, because
implicit feedback is NOT missing uniformly at random.
![Page 12: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/12.jpg)
12
This work: Average-Over-All is biased for implicit feedback-based recommendation systems, because
implicit feedback is NOT missing uniformly at random.
Popularity bias (Users are more likely to be exposed to popular items)
trending recommendation
![Page 13: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/13.jpg)
13
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0
![Page 14: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/14.jpg)
14
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0
![Page 15: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/15.jpg)
15
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0
![Page 16: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/16.jpg)
16
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0
![Page 17: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/17.jpg)
17
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0
![Page 18: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/18.jpg)
18
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0Any sensible evaluation
![Page 19: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/19.jpg)
19
A Hypothetical Example
Popular Items Long-tail Items
# of liked items(over all items)
# of liked items(over observations)
1 10
10 1
:
:
Algorithm 1 Performance
Algorithm 2 Performance
0.8
0.75 0.75
0
Average-Over-All
![Page 20: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/20.jpg)
20
Formalize Reward !
R(Z) =1
|U|X
u2U
1
|Su|X
i2Su
c(Zu,i)Ideal evaluation:
Item rankings predicted by an algorithm<latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit>
![Page 21: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/21.jpg)
21
Formalize Reward !
R(Z) =1
|U|X
u2U
1
|Su|X
i2Su
c(Zu,i)Ideal evaluation:
Predicted ranking of item i for user u
Items liked by user u among the entire item set
Reward for (u, i) pair
scoring metric<latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit>
Item rankings predicted by an algorithm<latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit>
![Page 22: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/22.jpg)
22
Formalize Reward !
R(Z) =1
|U|X
u2U
1
|Su|X
i2Su
c(Zu,i)Ideal evaluation:
Predicted ranking of item i for user u
Items liked by user u among the entire item set
Reward for user u
scoring metric<latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit>
Item rankings predicted by an algorithm<latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit>
![Page 23: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/23.jpg)
23
Formalize Reward !
R(Z) =1
|U|X
u2U
1
|Su|X
i2Su
c(Zu,i)Ideal evaluation:
Predicted ranking of item i for user u
Items liked by user u among the entire item set scoring metric<latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit><latexit sha1_base64="kRMqKdaDY8ZBXiHQ/ORnrxYWwbE=">AAAB9XicdVDLSgMxFM3UV62vqks3wSK4Gmamj6m7ghuXFewD2rFk0kwbmkmGJKOUof/hxoUibv0Xd/6N6UNQ0QMXDufcy733hAmjSjvOh5VbW9/Y3MpvF3Z29/YPiodHbSVSiUkLCyZkN0SKMMpJS1PNSDeRBMUhI51wcjn3O3dEKir4jZ4mJIjRiNOIYqSNdKuwkJSPYEy0pHhQLDl2vVop+z507HLNLXtVQzzHr1/UoGs7C5TACs1B8b0/FDiNCdeYIaV6rpPoIENSU8zIrNBPFUkQnqAR6RnKUUxUkC2unsEzowxhJKQpruFC/T6RoVipaRyazhjpsfrtzcW/vF6qo3qQUZ6kmnC8XBSlDGoB5xHAIZUEazY1BGFJza0Qj5FEWJugCiaEr0/h/6Tt2a5ju9deqVFZxZEHJ+AUnAMX+KABrkATtAAGEjyAJ/Bs3VuP1ov1umzNWauZY/AD1tsnKXGS5Q==</latexit>
Item rankings predicted by an algorithm<latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit><latexit sha1_base64="ZyVWiPK9uixN125Ap0AIBWyDfRc=">AAACEHicdVDPaxNBGJ1Nq42xaqzHXoYGsaewG3YxuQW81FuF5gckIXw7+yUZMjO7zHxbCCF/ghf/FS8eWopXj978b5ykEWzRBwOP977H8F5aKOkoDH8FlYPDJ0+Pqs9qz49fvHxVf33Sd3lpBfZErnI7TMGhkgZ7JEnhsLAIOlU4SJcftv7gGq2TubmiVYETDXMjZ1IAeWlaf/eRUHMLZinN3HGfzaQgzHi64mA4qHluJS30tN4Im504iaOEh804jJMo8iRqd8J2wqNmuEOD7XE5rf8cZ7koNRoSCpwbRWFBkzVYkkLhpjYuHRYgljDHkacGNLrJeldow996JeOz3PpniO/UvxNr0M6tdOovNdDCPfa24r+8UUmz9mQtTVESGnH/0axUnHK+XYdn0qIgtfIEhC8uBRcLsOAnsa7mR/jTlP+f9FvNyPNPrUY33s9RZafsjJ2ziL1nXXbBLlmPCfaZfWU37Db4EnwL7oLv96eVYJ95wx4g+PEb612dHA==</latexit>
Reward for the algorithm<latexit sha1_base64="58zCSwWB0/MRbmW4ak6tFuTVoiw=">AAACAXicdVDLSgMxFM3UV62vqhvBTbAIrkpa+3InuHGpYlWopWTSO21oZjIkd5RS6sZfceNCEbf+hTv/xvQhqOhZhMM593Jzjh8raZGxDy81Mzs3v5BezCwtr6yuZdc3LqxOjIC60EqbK59bUDKCOkpUcBUb4KGv4NLvHY38yxswVuroHPsxNEPeiWQgBUcntbJbZ3DLTZsG2lDsAuWqo43EbtjK5lie1Ur7rEZZvlIqFytVR9x7UGa0kGdj5MgUJ63s+3VbiySECIXi1jYKLMbmgBuUQsEwc51YiLno8Q40HI14CLY5GCcY0l2nTD4R6AjpWP2+MeChtf3Qd5Mhx6797Y3Ev7xGgkGtOZBRnCBEYnIoSBRFTUd10LY0IFD1HeHCBZeCii43XKArLeNK+EpK/ycXxXzB8dNi7rA0rSNNtskO2SMFUiWH5JickDoR5I48kCfy7N17j96L9zoZTXnTnU3yA97bJ5YylvE=</latexit><latexit sha1_base64="58zCSwWB0/MRbmW4ak6tFuTVoiw=">AAACAXicdVDLSgMxFM3UV62vqhvBTbAIrkpa+3InuHGpYlWopWTSO21oZjIkd5RS6sZfceNCEbf+hTv/xvQhqOhZhMM593Jzjh8raZGxDy81Mzs3v5BezCwtr6yuZdc3LqxOjIC60EqbK59bUDKCOkpUcBUb4KGv4NLvHY38yxswVuroHPsxNEPeiWQgBUcntbJbZ3DLTZsG2lDsAuWqo43EbtjK5lie1Ur7rEZZvlIqFytVR9x7UGa0kGdj5MgUJ63s+3VbiySECIXi1jYKLMbmgBuUQsEwc51YiLno8Q40HI14CLY5GCcY0l2nTD4R6AjpWP2+MeChtf3Qd5Mhx6797Y3Ev7xGgkGtOZBRnCBEYnIoSBRFTUd10LY0IFD1HeHCBZeCii43XKArLeNK+EpK/ycXxXzB8dNi7rA0rSNNtskO2SMFUiWH5JickDoR5I48kCfy7N17j96L9zoZTXnTnU3yA97bJ5YylvE=</latexit><latexit sha1_base64="58zCSwWB0/MRbmW4ak6tFuTVoiw=">AAACAXicdVDLSgMxFM3UV62vqhvBTbAIrkpa+3InuHGpYlWopWTSO21oZjIkd5RS6sZfceNCEbf+hTv/xvQhqOhZhMM593Jzjh8raZGxDy81Mzs3v5BezCwtr6yuZdc3LqxOjIC60EqbK59bUDKCOkpUcBUb4KGv4NLvHY38yxswVuroHPsxNEPeiWQgBUcntbJbZ3DLTZsG2lDsAuWqo43EbtjK5lie1Ur7rEZZvlIqFytVR9x7UGa0kGdj5MgUJ63s+3VbiySECIXi1jYKLMbmgBuUQsEwc51YiLno8Q40HI14CLY5GCcY0l2nTD4R6AjpWP2+MeChtf3Qd5Mhx6797Y3Ev7xGgkGtOZBRnCBEYnIoSBRFTUd10LY0IFD1HeHCBZeCii43XKArLeNK+EpK/ycXxXzB8dNi7rA0rSNNtskO2SMFUiWH5JickDoR5I48kCfy7N17j96L9zoZTXnTnU3yA97bJ5YylvE=</latexit><latexit sha1_base64="58zCSwWB0/MRbmW4ak6tFuTVoiw=">AAACAXicdVDLSgMxFM3UV62vqhvBTbAIrkpa+3InuHGpYlWopWTSO21oZjIkd5RS6sZfceNCEbf+hTv/xvQhqOhZhMM593Jzjh8raZGxDy81Mzs3v5BezCwtr6yuZdc3LqxOjIC60EqbK59bUDKCOkpUcBUb4KGv4NLvHY38yxswVuroHPsxNEPeiWQgBUcntbJbZ3DLTZsG2lDsAuWqo43EbtjK5lie1Ur7rEZZvlIqFytVR9x7UGa0kGdj5MgUJ63s+3VbiySECIXi1jYKLMbmgBuUQsEwc51YiLno8Q40HI14CLY5GCcY0l2nTD4R6AjpWP2+MeChtf3Qd5Mhx6797Y3Ev7xGgkGtOZBRnCBEYnIoSBRFTUd10LY0IFD1HeHCBZeCii43XKArLeNK+EpK/ycXxXzB8dNi7rA0rSNNtskO2SMFUiWH5JickDoR5I48kCfy7N17j96L9zoZTXnTnU3yA97bJ5YylvE=</latexit>
![Page 24: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/24.jpg)
24
Formalize Reward !
Average-Over-All: RAOA(Z) =1
|U|X
u2U
1
|S⇤u|
X
i2S⇤u
c(Zu,i)
Items liked by user u (observed)
![Page 25: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/25.jpg)
25
Formalize Bias
Ou,i = 1 if (u, i) is observed, and Ou,i = 0 otherwise
EO
hRAOA(Z)
i6= R(Z)
Ou,i ⇠ B(1, Pu,i)
![Page 26: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/26.jpg)
26
RIPS(Z|P ) =1
|U|X
u2U
1
|Su|X
i2S⇤u
c(Zu,i)
Pu,i
RAOA(Z) =1
|U|X
u2U
1
|S⇤u|
X
i2S⇤u
c(Zu,i)
Inverse-Propensity-Scoring (IPS)
![Page 27: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/27.jpg)
27
RIPS(Z|P ) =1
|U|X
u2U
1
|Su|X
i2S⇤u
c(Zu,i)
Pu,i
RAOA(Z) =1
|U|X
u2U
1
|S⇤u|
X
i2S⇤u
c(Zu,i)
EO
hRIPS(Z|P )
i= R(Z)
Inverse-Propensity-Scoring (IPS)
![Page 28: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/28.jpg)
28
RIPS(Z|P ) =1
|U|X
u2U
1
|Su|X
i2S⇤u
c(Zu,i)
Pu,i
Self-Normalized Inverse-Propensity-Scoring (SNIPS)[Swaminathan et al.15]
RSNIPS(Z|P ) =1
|U|X
u2U
1Pi2S⇤
u
1Pu,i
X
i2S⇤u
c(Zu,i)
Pu,i
![Page 29: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/29.jpg)
29
Estimating Propensity Scores
Factor: Popularity bias (Users are more likely to be exposed to popular items)
Assumptions:
• User-independence assumption
• Two-steps assumption
• User preference is not affected by item presentation
Pu,i = P (Ou,i = 1) = P (O⇤,i = 1) = P⇤,i
P⇤,i = P select⇤,i · P interact|select
⇤,i
P interact|select⇤,i = P interact
⇤,i
![Page 30: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/30.jpg)
30
Estimating Propensity Scores
Popularity bias model [Steck 11]:
P select⇤,i / (n⇤
i )�
Observed item popularity
![Page 31: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/31.jpg)
31
Estimating Propensity Scores
Popularity bias model [Steck 11]:
P select⇤,i / (n⇤
i )�
P⇤,i / (n⇤i )( �+1
2 )
Estimated from known online content
serving policy
![Page 32: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/32.jpg)
32
Measuring bias in recommender evaluation(Yahoo! music rating dataset)
ModelAverage-Over-All
!SNIPS(# = 1.5)
!SNIPS(# = 2.0)
!SNIPS(# = 2.5)
!SNIPS(# = 3.0)
U-CML 0.401 0.270 0.260 0.253 0.248
A-CML 0.399 0.274 0.264 0.258 0.253
BPR 0.380 0.275 0.268 0.262 0.258
PMF 0.386 0.267 0.259 0.252 0.248
Mean Absolute Error (MAE), Recall
!SNIPS produces significantly lower
MAE
![Page 33: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/33.jpg)
33
Measuring bias in recommender evaluation(Yahoo! music rating dataset)
ModelAverage-Over-All
!SNIPS(# = 1.5)
!SNIPS(# = 2.0)
!SNIPS(# = 2.5)
!SNIPS(# = 3.0)
U-CML 0.401 0.270 0.260 0.253 0.248
A-CML 0.399 0.274 0.264 0.258 0.253
BPR 0.380 0.275 0.268 0.262 0.258
PMF 0.386 0.267 0.259 0.252 0.248
Mean Absolute Error (MAE), Recall
!SNIPS produces significantly lower
MAE
The accuracy of recommending popular items is a significant overestimation of the true recommendation performance
![Page 34: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/34.jpg)
34
Please come to our poster or refer to our paper for:
• Proofs
• Experimental details.
• More experiments.
• Deeper analysis of the unbiased evaluator.
![Page 35: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/35.jpg)
35
Conclusions and Future Work
RSNIPS(Z|P ) =1
|U|X
u2U
1Pi2S⇤
u
1Pu,i
X
i2S⇤u
c(Zu,i)
Pu,i
EO
hRIPS(Z|P )
i= R(Z)
• Understanding variance of evaluators.
• Propensity estimation (e.g., incorporate auxiliary
user and item information).
• Debias training of recommendation systems (e.g.,
[Liang et al. 16]).
![Page 36: Unbiased Offline Recommender Evaluation for Missing-Not-At …ylongqi/presentation/YangCXWBE18Slid… · [Marlin et al. 09], [Schnabel et al. 16], [Steck10], [Steck11], and [Steck13]](https://reader036.fdocuments.in/reader036/viewer/2022071103/5fdca973086ead00c35128f9/html5/thumbnails/36.jpg)
36
http://www.openrec.aiGithub link, documents, and tutorials
Longqi YangPh.D. candidate
Computer Science, Cornell Tech, Cornell University
Email: [email protected]
Web: bit.ly/longqi
Twitter: @ylongqi
Connected Experiences Lab
http://cx.jacobs.cornell.edu/
Small Data Lab
http://smalldata.io/
Funders: