Facial Feature Analysis For Model Based Coding
Eric Larson, December 2007
Image Coding and Analysis Laboratory, Oklahoma State University
What is model-based coding?
Facial Analysis
Dealing with Dynamic Bandwidths
Solving a MOP quickly
An application-specific NSGA-II, with a deterministic search
Results
Conclusion
Alternative to sending raw video footage
Creation of “essential” parameters needed to reconstruct a scene
A real-time analysis nightmare
Copyright by Microsoft
Very Low Bit Rate Teleconferencing
Gaming
Man-Machine Interaction
Video Telephony
Telephony for the deaf
Image Courtesy of Dr. Peter Eisert [3]
Analysis (by Synthesis)
Image Courtesy of Dr. Peter Eisert [3]
Images Courtesy of Dr. Peter Eisert [4]
Generously, Instituto Superior Technico
ISTface [22]
Gradient-based approximation is not robust
Complication of direct optimization: handled by reducing FAPs
Do not address problem of dynamic bandwidth
Image Courtesy of J. Ahlberg [17]
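Quality Objective Function:
$$\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^2}{D}\right), \qquad D = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} E(m,n)^2$$
FAP Number Objective Function:
$$N_{FAP} = \sum_{i=1}^{18} FAP(i), \quad \text{where } FAP(i) = \begin{cases} 1, & \text{if fap set } i \text{ is used} \\ 0, & \text{if fap set } i \text{ is not used} \end{cases}$$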
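The two objectives can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes E(m, n) is the per-pixel difference between the original frame and the frame synthesized from the model, that frames are 8-bit grayscale stored as nested lists, and that FAP set usage is given as 18 boolean flags. The function names are hypothetical.

```python
import math

def mean_squared_error(reference, synthesized):
    """D: mean of the squared pixel errors E(m, n) over an M x N frame."""
    rows = len(reference)
    cols = len(reference[0])
    total = 0.0
    for m in range(rows):
        for n in range(cols):
            err = reference[m][n] - synthesized[m][n]  # assumed meaning of E(m, n)
            total += err * err
    return total / (rows * cols)

def psnr(reference, synthesized, peak=255.0):
    """Quality objective: PSNR in dB (to be maximized)."""
    d = mean_squared_error(reference, synthesized)
    if d == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / d)

def fap_count(used_flags):
    """FAP number objective: how many FAP sets are used (to be minimized)."""
    return sum(1 for used in used_flags if used)
```

The two values pull against each other: switching on more FAP sets generally lets the synthesized face match the frame more closely, but costs more bits per frame.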
Use NSGA-II for the multiple objective optimization
Assign a premature stopping criterion
Choose bandwidth
Select FAP sets
Use deterministic algorithm
Tournament selection used for crossover
Parents and children combined, sorted according to:
▪ Domination
▪ Nearest Neighbor
Repeat
From [7], NSGA-II
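The domination sort can be illustrated with a small Python sketch. It assumes each individual is scored as a (PSNR in dB, number of FAP sets) pair, with PSNR maximized and FAP count minimized. It uses a simple repeated-peeling loop in place of the faster bookkeeping described in [7], and it leaves out the nearest-neighbor (crowding) step. The function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in both objectives and strictly better in one.

    a and b are (psnr_db, fap_count) pairs: higher PSNR is better,
    lower FAP count is better.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def sort_into_fronts(points):
    """Group indices of points into successive non-dominated fronts."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        # The current front holds every point not dominated by another remaining point.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts
```

For example, with scores `[(30.0, 10), (28.0, 5), (27.0, 10), (30.0, 12)]` the first two individuals form the first front, because neither beats the other on both objectives, and the last two fall into the second front.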
while {a search direction of improvement can be found}
    for {each dimension, step 20 units}
        ▪ if the step is favorable, another step is made
        ▪ else, choose next dimension
    find direction of steepest descent from original point and improved point
    until {step size scaling constant < 0.0001}
        take step in the steepest descent direction
        ▪ if the new point is favorable, increase step size by two
        ▪ else, decrease step size by a factor of ten
    update starting individual with new individual
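A rough Python sketch of this deterministic search follows. It is a reading of the slide under several assumptions, not the author's code: the objective is written as a cost to minimize (for example, negative PSNR), each dimension is probed in both the positive and negative direction, "increase step size by two" is read as doubling, and the line search runs until the scaling constant drops below 0.0001. The function name and parameters are hypothetical.

```python
def deterministic_search(cost, start, step=20.0, min_scale=1e-4):
    """Coordinate probing followed by a line search along the improvement direction."""
    current = list(start)
    current_cost = cost(current)
    while True:
        # Exploratory phase: walk each dimension in fixed steps while it helps.
        trial, trial_cost = list(current), current_cost
        for dim in range(len(trial)):
            for sign in (1.0, -1.0):  # assumption: both directions are tried
                while True:
                    candidate = list(trial)
                    candidate[dim] += sign * step
                    c = cost(candidate)
                    if c < trial_cost:
                        trial, trial_cost = candidate, c
                    else:
                        break
        if trial_cost >= current_cost:
            return current, current_cost  # no direction of improvement found
        # Direction from the original point to the improved point.
        direction = [t - c for t, c in zip(trial, current)]
        best, best_cost = trial, trial_cost
        scale = 1.0
        while scale >= min_scale:
            candidate = [b + scale * d for b, d in zip(best, direction)]
            c = cost(candidate)
            if c < best_cost:
                best, best_cost = candidate, c
                scale *= 2.0   # favorable: lengthen the step
            else:
                scale /= 10.0  # unfavorable: shorten the step
        # Update the starting individual and repeat.
        current, current_cost = best, best_cost
```

Because the probing step is fixed at 20 units, the search stops once no single-dimension step of that size improves the cost, so it returns a nearby good point rather than an exact optimum. That matches its role here as a quick refinement of individuals already found by NSGA-II.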
Pareto fronts
Max Bandwidth (Uncompressed) | Frame No. | Selected FAP Sets^a | Best PSNR | Mean PSNR (over 3 runs) | Mean Function Evaluations
Medium (~4.8 Kbits/s at 25 fps)^b | 0 | 0(3), 1(2), 2, 4, 5(2), 6, 9, 10, 11(2), 12, 13, 15(3), 16(3) | 30.57 dB | 30.36 dB | 779
 | 1 | 0(3), 1(2), 2, 4, 5(3), 6(2), 8, 9, 10, 11, 12, 13, 14(2), 15(3), 16(3), 17(3) | 35.14 dB | 32.54 dB | 690
 | 2 | 0(2), 1, 2, 4, 5(2), 6(2), 7, 8, 10, 11(2), 12(2), 13(2), 14(3), 15(3), 16(3), 17(2) | 32.09 dB | 29.50 dB | 392
 | 3 | 0(2), 1, 2(2), 5, 6(2), 7(2), 8, 9(2), 11(2), 12, 13(2), 14(2), 15(3), 16(3), 17 | 33.20 dB | 29.99 dB | 561
 | 4 | 0(2), 1, 2(2), 3, 5(2), 6(2), 7, 8, 10(2), 11(2), 13(2), 14(2), 15(3), 16(2), 17(2) | 32.98 dB | 28.14 dB | 415
 | 5 | 0(2), 1(2), 2(2), 3, 6(2), 7(2), 8, 9(2), 10(2), 11, 12(3), 13, 14(3), 15(3), 16(2), 17(2) | 32.90 dB | 28.73 dB | 299
 | 6 | 0(2), 1(3), 2, 5, 7(2), 8(3), 9, 10(2), 11, 12, 14, 15(3), 16(3), 17 | 32.13 dB | 30.89 dB | 748
 | 7 | 0(3), 1(2), 4, 5, 6, 7(3), 8(3), 11(2), 12(2), 13, 14, 15(3), 16(3), 17(2) | 31.91 dB | 29.51 dB | 445
 | 8 | 0(3), 2, 4, 5(2), 6(2), 8, 9, 11(2), 12(2), 13(2), 14(2), 15(3), 16(3), 17(2) | 30.97 dB | 29.53 dB | 726
 | 9 | 0(3), 1(2), 3, 5(2), 6(2), 7, 8, 9(2), 10(2), 11(2), 12(2), 14, 15(3), 16(3), 17 | 30.96 dB | 28.99 dB | 451
 | 10 | 0(3), 2, 3, 5, 6, 7, 8, 9, 10(2), 11(2), 12(2), 13(2), 14(2), 15(3), 16(3), 17(2) | 30.21 dB | 28.80 dB | 527
Low (~2.4 Kbits/s at 25 fps)^b | 0 | 0, 7, 8(2), 11(2), 14(2), 15(3), 16(2) | 29.95 dB | 27.13 dB | 573
 | 1 | 0, 5, 8, 11(2), 12, 14, 15(3), 16(2), 17(3) | 33.23 dB | 29.46 dB | 595
 | 2 | 8, 10, 11, 12(2), 13(2), 14, 15(3), 16(2), 17(3) | 32.02 dB | 27.21 dB | 773
 | 3 | 2, 5, 6, 8, 9, 12(2), 14, 15(3), 16, 17 | 28.77 dB | 24.34 dB | 808
 | 4 | 1, 9(2), 10, 11, 12(2), 14(2), 15(3), 17(3) | 22.99 dB | 22.80 dB | 745
 | 5 | 1, 2, 4, 5, 6, 9, 11, 12, 14, 15(3), 16(2), 17 | 29.25 dB | 26.93 dB | 446
 | 6 | 2, 5, 6, 9(2), 10, 11(2), 12, 14(2), 15(2), 16(3), 17 | 29.67 dB | 25.75 dB | 376
 | 7 | 1, 2, 7, 8, 9, 10, 12, 14, 15(3), 16(3), 17 | 29.01 dB | 28.41 dB | 386
 | 8 | 1, 3, 9, 12, 13, 15, 16(3) | 28.97 dB | 23.98 dB | 529
 | 9 | 0, 5, 9, 10, 11, 12, 15(2), 16(3), 17 | 28.79 dB | 25.93 dB | 694
 | 10 | 3, 5(2), 6(2), 9, 10(2), 12, 15, 16(3) | 27.56 dB | 24.25 dB | 226
Histogram of all resultant individuals
Video Sequence: frames 90 to 120 (every third frame), each shown at Low and Medium bandwidth
Deficiencies can be traced back to selection of PSNR
Future work should include error functions like SSIM or eigenfaces
Algorithm works
Accentuates the details of PSNR
1. D. Pearson, “Developments in model-based image coding,” Proceedings of the IEEE, Vol. 83, No. 6, June 1995.
2. I. Pandzic, J. Ahlberg, M. Wzorek, P. Rudol, and M. Mosmondor, “Faces Everywhere: Towards Ubiquitous Production and Delivery of Face Animation,” Proceedings of the 2nd international conference on mobile and ubiquitous media, 2003.
3. P. Eisert, “MPEG-4 facial animation in video analysis and synthesis,” International Journal of Imaging Systems and Technology, June 2003.
4. P. Eisert, “Very Low Bit Rate Coding,” Doctoral Thesis, November 2000.
5. J. D. Schaffer, “Multiple objective optimization with vector evaluated genetic algorithms,” 1st international conference on genetic algorithms, 1985.
6. K. Deb, “Multi-objective genetic algorithms: problems, difficulties, and construction of test problems,” Evolutionary Computation, 1999.
7. K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, 2002.
8. F. I. Parke, “Parameterized Models for Facial Animation,” IEEE Transactions on Computer Graphics and Animation, 1982.
9. R. Forchheimer and T. Kronander, “Image coding – from waveforms to animation,” IEEE Transactions on Acoustics, Speech, and Signal Processing, 37:1212, 1989.
10. C. S. Choi, K. Aizawa, H. Harashima, and T. Takebe, “Analysis and synthesis of facial image sequences in model-based image coding,” IEEE Transactions on Circuits and Systems for Video Technology, June 1994.
11. M. Buck, “Model based image sequence coding,” Motion Analysis and Image Sequence Coding, Ch. 10, Kluwer Academic Publishing, 1993, pp. 285-315.
12. N. Diehl, “Object motion estimation and segmentation on image sequences,” Signal Processing: Image Communications, Vol. 3, No. 1, February 1991, pp. 23-56.
13. K. Aizawa, H. Harashima, and T. Saito, “Model-based analysis-synthesis image coding (MBASIC) system for a person’s face,” Signal Processing: Image Communication, Vol. 1, pp. 139-152, 1989.
14. I. S. Pandzic and R. Forchheimer, “MPEG-4 Facial Animation: the Standard, Implementation, and Applications,” 1st Ed., John Wiley and Sons, 2002, pp. 3-41.
15. J. Ahlberg and R. Forchheimer, “Face Tracking for model-based coding and face animation,” International Journal on Imaging Systems Technology, Wiley Periodicals, Vol. 13, pp. 8-22, 2003.
16. F. Dornaika and J. Ahlberg, “Fast and Reliable Active Appearance Model Search for 3D Face Tracking,” Proceedings of Mirage 2003, March 2003.
17. F. Dornaika and J. Ahlberg, “Fitting 3D Face Models for Tracking and Active Appearance Model Training,” Image and Vision Computing 24 (2006), Science Direct, 2006.
18. E. F. Carter, “The Generation and Application of Random Numbers,” Forth Dimensions, Vol. XVI, Nos. 1 & 2, Forth Interest Group, Oakland, California, 1994.
19. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, Vol. 220, No. 4598, pp. 671-680, 1983.
20. T. Edgar, D. Himmelblau, and L. Lasdon, Optimization of Chemical Processes, 2nd Edition, McGraw-Hill, New York, NY, 2001.
21. G. Reklaitis, A. Ravindran, and K. Ragsdell, Engineering Optimization, Methods and Applications, 2nd Edition, John Wiley and Sons, New York, NY, 2006.
22. ISTface, Program from Instituto Superior Technico, standard FAP animation sequence, “wow25.fap”.
23. J. Jiang, A. Alwan, P. A. Keating, and T. A. Edward Jr., “On the relationship between face movements, tongue movements, and speech acoustics,” EURASIP Journal on Applied Signal Processing, 2002.
24. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity,” IEEE Trans. Image Process. 13, 600–612 (2004).