Score-Based Models
LECTURE 14
CS236: Deep Generative Models
Summary
Score-based models
• A model that represents the score function $s_\theta(\mathbf{x})$.
• Score estimation: training the score-based model from datapoints.
  • Score matching: not scalable for training deep score-based models.

(Figure: a score network maps inputs $x_1, x_2, x_3$ to score estimates $s_{\theta,1}, s_{\theta,2}, s_{\theta,3}$.)
Denoising score matching
• Pros:
  • Much more scalable than score matching.
  • Connection to optimal denoising.
• Con: estimates the score of the noise-perturbed data distribution, not the noise-free one.

(Figure: the data distribution, the perturbation distribution/kernel, and the resulting noise-perturbed data distribution.)
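A minimal PyTorch sketch of this loss (not the lecture's code); `score_model` stands for any network mapping a perturbed input to an estimated score:

```python
import torch

def dsm_loss(score_model, x, sigma):
    """Denoising score matching with Gaussian kernel q_sigma(x~|x) = N(x~; x, sigma^2 I)."""
    z = torch.randn_like(x)
    x_tilde = x + sigma * z                  # sample from the perturbation kernel
    # The kernel's score, -(x_tilde - x)/sigma^2 = -z/sigma, is the regression target.
    target = -z / sigma
    score = score_model(x_tilde)
    return 0.5 * ((score - target) ** 2).flatten(1).sum(dim=-1).mean()
```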
Sliced score matching
• Projection distribution can be Gaussian or Rademacher.
• Pros:
  • Much more scalable than score matching.
  • Estimates the score of the true (noise-free) data distribution.
• Con: slower than denoising score matching.
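A sketch of the sliced objective under the same assumptions (`score_model` is a hypothetical score network). The extra backward pass per projection is what makes it slower than denoising score matching:

```python
import torch

def ssm_loss(score_model, x, n_projections=1):
    """Sliced score matching: estimate the Jacobian term with random projections v."""
    x = x.clone().requires_grad_(True)
    loss = 0.0
    for _ in range(n_projections):
        v = torch.randn_like(x)              # Gaussian; Rademacher also works
        score = score_model(x)
        # v^T grad_x (v^T s_theta(x)): one vector-Jacobian product per projection.
        vjp = torch.autograd.grad((score * v).sum(), x, create_graph=True)[0]
        loss = loss + ((vjp * v).flatten(1).sum(dim=-1)
                       + 0.5 * (score * v).flatten(1).sum(dim=-1) ** 2).mean()
    return loss / n_projections
```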
Score-based generative modeling
Data samples → (score matching) → Scores → (Langevin dynamics) → New samples
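The sampling half of the pipeline, as a sketch (step size and iteration count are illustrative):

```python
import torch

@torch.no_grad()
def langevin_dynamics(score_model, x_init, step_size=1e-4, n_steps=1000):
    """Langevin MCMC: x <- x + (eps/2) * s_theta(x) + sqrt(eps) * z, with z ~ N(0, I)."""
    x = x_init.clone()
    for _ in range(n_steps):
        z = torch.randn_like(x)
        x = x + 0.5 * step_size * score_model(x) + step_size ** 0.5 * z
    return x
```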
Pitfalls
• Manifold hypothesis: when data lie on a low-dimensional manifold, the data score is undefined.
• Score matching fails in low data density regions.
• Langevin dynamics fail to weight different modes correctly.

(Figure: data points and their scores $\nabla_\mathbf{x} \log p_{\text{data}}(\mathbf{x})$.)
Gaussian perturbation
• The solution to all pitfalls: Gaussian perturbation!
• Manifold + noise: perturbed data no longer lie on a low-dimensional manifold.
• Score matching on noisy data.

(Figure: CIFAR-10 vs. CIFAR-10 perturbed with $\mathcal{N}(0, 0.0001)$ noise.)
Challenge in low data density regions
(Figure: estimated scores are inaccurate in low data density regions.)
Song and Ermon. “Generative Modeling by Estimating Gradients of the Data Distribution.” NeurIPS 2019.
Adding noise to data
(Figure: after adding noise, score estimates become accurate across the space.)
Song and Ermon. “Generative Modeling by Estimating Gradients of the Data Distribution.” NeurIPS 2019.
Perturbed data scores provide useful directional information for Langevin MCMC.
Multi-scale Noise Perturbation
• Trade-off between data quality and score estimation accuracy.
• Multi-scale noise perturbations: $\sigma_1 > \sigma_2 > \cdots > \sigma_{L-1} > \sigma_L$.
Trading off Data Quality and Estimation Accuracy
(Figure: as noise grows, data quality worsens while score estimation improves; red encodes error.)
Using multiple noise scales
Annealed Langevin Dynamics: Joint Scores to Samples
• Sample using $\sigma_1, \sigma_2, \ldots, \sigma_L$ sequentially with Langevin dynamics, as sketched below.
• Anneal down the noise level.
• Samples are used as initialization for the next noise level.
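A sketch of the procedure, using the step-size schedule $\alpha_i \propto \sigma_i^2$ from Song and Ermon (2019); `score_model(x, sigma)` is assumed to be a noise-conditional network:

```python
import torch

@torch.no_grad()
def annealed_langevin(score_model, x_init, sigmas, eps=2e-5, n_steps=100):
    """Annealed Langevin dynamics over noise levels sigma_1 > ... > sigma_L."""
    x = x_init.clone()
    for sigma in sigmas:                         # sorted from largest to smallest
        alpha = eps * (sigma / sigmas[-1]) ** 2  # step size shrinks with the noise level
        for _ in range(n_steps):
            z = torch.randn_like(x)
            x = x + 0.5 * alpha * score_model(x, sigma) + alpha ** 0.5 * z
    return x                                     # each level is initialized from the previous one
```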
Annealed Langevin dynamics
Comparison to vanilla Langevin dynamics

(Figure: samples from vanilla Langevin dynamics vs. annealed Langevin dynamics.)
Joint Score Estimation via Noise Conditional Score Networks
(Figure: a Noise Conditional Score Network (NCSN) takes both the input $x_1, x_2$ and the noise level $\sigma$ as input, producing score estimates $s_{\theta,1}, s_{\theta,2}$; a single network is shared across the scales $\sigma_1, \sigma_2, \sigma_3$.)
Training noise conditional score networks
• Which score matching loss?
  • Sliced score matching?
  • Denoising score matching?
• Denoising score matching is naturally suitable, since the goal is to estimate the scores of perturbed data distributions.
• Train with a weighted combination of denoising score matching losses.
Choosing noise scales
• Maximum noise scale: should be roughly as large as the maximum pairwise distance between datapoints, so that Langevin dynamics can move between all modes.
• Minimum noise scale: should be sufficiently small to control the noise in final samples.
Choosing noise scales
• Key intuition: adjacent noise scales should have sufficient overlap to facilitate transitioning across noise scales in annealed Langevin dynamics.
• A geometric progression with sufficient length (see the sketch below).
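For concreteness, a geometric schedule can be built in one line (the endpoints and length here are illustrative, not the paper's tuned values):

```python
import math
import torch

sigma_max, sigma_min, L = 50.0, 0.01, 232   # illustrative values
sigmas = torch.exp(torch.linspace(math.log(sigma_max), math.log(sigma_min), L))
# Adjacent scales satisfy sigma_{i+1} / sigma_i = constant < 1.
```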
Choosing the weighting function
• Weighted combination of denoising score matching losses.
• How to choose the weighting function $\lambda(\sigma_i)$?
• Goal: balancing the different score matching losses in the sum → choose $\lambda(\sigma_i) = \sigma_i^2$.
Training noise conditional score networks
• Sample a mini-batch of datapoints.
• Sample a mini-batch of noise scale indices.
• Sample a mini-batch of Gaussian noise.
• Estimate the weighted mixture of score matching losses.
• Take a stochastic gradient descent step.
• As efficient as training a single unconditional score-based model. (One training step is sketched below.)
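A sketch of the loss for one training step (`score_model(x, sigma)` is a hypothetical noise-conditional network; `sigmas` is a tensor of noise scales on the same device as `x`):

```python
import torch

def ncsn_loss(score_model, x, sigmas):
    """Weighted denoising score matching loss with lambda(sigma) = sigma^2."""
    idx = torch.randint(0, len(sigmas), (x.shape[0],), device=x.device)
    sig = sigmas[idx]                             # one noise scale per datapoint
    sigma = sig.view(-1, *([1] * (x.dim() - 1)))  # broadcast over data dims
    z = torch.randn_like(x)                       # mini-batch of Gaussian noise
    x_tilde = x + sigma * z
    target = -z / sigma                           # denoising score matching target
    score = score_model(x_tilde, sig)
    per_example = 0.5 * ((score - target) ** 2).flatten(1).sum(dim=-1)
    return (sig ** 2 * per_example).mean()        # lambda(sigma) = sigma^2 weighting
```

Each call yields the loss for one stochastic gradient step (`loss.backward()` followed by an optimizer step).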
Experiments: Sampling
Experiments: Sampling
High Resolution Image Generation
Using an infinite number of noise scales
• $t \in [0, 1]$: continuous index of the perturbed distributions.
Compact representation of infinite distributions
• Stochastic process → marginal probability densities $\{p_t(\mathbf{x})\}_{t \in [0,1]}$.
• Stochastic differential equation:
$$\mathrm{d}\mathbf{x} = \underbrace{\mathbf{f}(\mathbf{x}, t)\,\mathrm{d}t}_{\text{deterministic drift}} + \underbrace{g(t)\,\mathrm{d}\mathbf{w}}_{\text{infinitesimal white noise}}$$
Score-based generative modeling via SDEs
Score-based generative modeling via SDEs
Time reversal of the SDE requires only the score function $\nabla_\mathbf{x} \log p_t(\mathbf{x})$!
Score-based generative modeling via SDEs
• Time-dependent score-based model: $s_\theta(\mathbf{x}, t) \approx \nabla_\mathbf{x} \log p_t(\mathbf{x})$.
• Training: a weighted combination of denoising score matching losses across $t$.
• Reverse-time SDE:
$$\mathrm{d}\mathbf{x} = \left[\mathbf{f}(\mathbf{x}, t) - g(t)^2\, \nabla_\mathbf{x} \log p_t(\mathbf{x})\right] \mathrm{d}t + g(t)\, \mathrm{d}\bar{\mathbf{w}}$$
• Sampling: Euler-Maruyama discretization (analogous to Euler's method for ODEs); a sketch follows the citation below.
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
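A sketch of Euler-Maruyama for the reverse-time SDE; `f` and `g` stand for the forward SDE's drift and diffusion coefficients, and `score_model(x, t)` approximates the time-dependent score:

```python
import torch

@torch.no_grad()
def euler_maruyama_sampler(score_model, x_T, f, g, n_steps=1000, T=1.0):
    """Integrate dx = [f(x,t) - g(t)^2 s_theta(x,t)] dt + g(t) dw from t = T to 0."""
    x = x_T.clone()
    dt = -T / n_steps                        # negative: integrating backward in time
    t = T
    for _ in range(n_steps):
        drift = f(x, t) - g(t) ** 2 * score_model(x, t)
        x = x + drift * dt + g(t) * abs(dt) ** 0.5 * torch.randn_like(x)
        t = t + dt
    return x
```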
Predictor-Corrector sampling methods
• Predictor: numerical SDE solver.
• Corrector: score-based MCMC. (A sketch follows.)
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
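A sketch combining the two, with the corrector step size set by a signal-to-noise heuristic similar to the paper's; `f`, `g`, and `score_model` are as in the Euler-Maruyama sketch above:

```python
import torch

@torch.no_grad()
def pc_sampler(score_model, x_T, f, g, n_steps=1000, n_corrector=1, snr=0.16, T=1.0):
    """Alternate a reverse-SDE predictor step with Langevin corrector steps."""
    x = x_T.clone()
    dt = -T / n_steps
    t = T
    for _ in range(n_steps):
        # Predictor: one Euler-Maruyama step of the reverse-time SDE.
        drift = f(x, t) - g(t) ** 2 * score_model(x, t)
        x = x + drift * dt + g(t) * abs(dt) ** 0.5 * torch.randn_like(x)
        t = t + dt
        # Corrector: Langevin MCMC targeting p_t(x).
        for _ in range(n_corrector):
            grad = score_model(x, t)
            z = torch.randn_like(x)
            eps = 2 * (snr * z.norm() / grad.norm()) ** 2   # SNR-based step size
            x = x + eps * grad + (2 * eps) ** 0.5 * z
    return x
```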
Results on predictor-corrector sampling
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
High-Fidelity Generation for 1024x1024 Images
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
Probability flow ODE: turning the SDE into an ODE
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
• Probability flow ODE (an ordinary differential equation):
$$\frac{\mathrm{d}\mathbf{x}}{\mathrm{d}t} = \mathbf{f}(\mathbf{x}, t) - \frac{1}{2}\, g(t)^2 \underbrace{\nabla_\mathbf{x} \log p_t(\mathbf{x})}_{\text{score function}}$$
Solving the Reverse ODE for Sampling
• More efficient samplers via black-box ODE solvers.
• Exact likelihood, even though models are trained with score matching.
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
NFE = number of score function evaluations
• Predictor-corrector: > 1000 NFE.
• ODE: ≈ 100 NFE. (A black-box solver sketch follows.)
Solving the Reverse ODE for Sampling
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
(Figure: models trained with score matching can be sampled with black-box ODE solvers.)
Probability flow ODE: latent space manipulation
Temperature scaling
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
Probability flow ODE: uniquely identifiable encoding
• Uniquely identifiable encoding.
• No trainable parameters in the probability flow ODE!
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
(Figure: two independently trained flow models or VAEs encode the same input differently, while two independently trained score-based models via the probability flow ODE produce the same encoding.)
Controllable Generation
• Conditional reverse-time SDE via unconditional scores:
$$\nabla_\mathbf{x} \log p_t(\mathbf{x} \mid y) = \underbrace{\nabla_\mathbf{x} \log p_t(\mathbf{x})}_{\substack{\text{unconditional score,} \\ \text{trained without } y}} + \underbrace{\nabla_\mathbf{x} \log p_t(y \mid \mathbf{x})}_{\substack{\text{trained separately or} \\ \text{specified with domain knowledge}}}$$
where $y$ is the control signal.

Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
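A sketch of assembling the conditional score via this decomposition, assuming a hypothetical time-dependent classifier `classifier(x, t)` that returns class logits:

```python
import torch

def conditional_score(score_model, classifier, x, t, y):
    """grad_x log p_t(x|y) = grad_x log p_t(x) + grad_x log p_t(y|x)."""
    x = x.detach().requires_grad_(True)
    log_probs = classifier(x, t).log_softmax(dim=-1)
    selected = log_probs[torch.arange(x.shape[0], device=x.device), y].sum()
    guidance = torch.autograd.grad(selected, x)[0]    # grad_x log p_t(y|x)
    return score_model(x, t).detach() + guidance
```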
Controllable Generation: class-conditional generation
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
• $y$ is the class label.
• $p_t(y \mid \mathbf{x})$ is a time-dependent classifier.
Controllable Generation: inpainting
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
• $y$ is the masked image.
• $p_t(y \mid \mathbf{x})$ can be approximated without training.
Controllable Generation: colorization
Song, Sohl-Dickstein, Kingma, Kumar, Ermon, Poole. “Score-Based Generative Modeling through Stochastic Differential Equations.” ICLR 2021.
• $y$ is the gray-scale image.
• $p_t(y \mid \mathbf{x})$ can be approximated without training.
Controllable Generation: colorization
Resolution: 1024x1024
Controllable generation: text-guided generation
• “An abstract painting of computer science”
• “A painting of the starry night by van Gogh”
https://colab.research.google.com/drive/1FuOobQOmDJuG7rGsMWfQa883A9r4HxEO?usp=sharing
Conclusion
• Gradients of distributions (scores) can be estimated easily.
• Flexible architecture choices: no need to be normalized or invertible.
• Stable training: no minimax optimization.
• Better or comparable sample quality to GANs.
  • State-of-the-art performance on CIFAR-10 and other datasets.
  • Scalable to 1024x1024 resolution for image generation.
• Exact likelihood computation.
  • State-of-the-art likelihood on CIFAR-10.
• Sample editing via manipulating latent codes.
• Uniquely identifiable encoding.