Multimedia Workloads versus SPEC Benchmarks
Christopher Martinez, Mythri Pinnamaneni, and Eugene John
University of Texas – San Antonio
Outline
Motivation
Multimedia Workloads
Cycles Per Instruction
Branch Prediction
Cache Performance
Conclusion
Motivation
Common workloads for the home user now focus on entertainment
For the home user, entertainment performance is the selling point
There are many media benchmarks, but can SPEC benchmarks give some insight into entertainment applications?
Objective
Understand the performance characteristics of multimedia workloads
Compare them against SPEC CPU 2000
Multimedia Workloads
Codecs used include MP3, AAC, MPEG2 (DVD), Windows Media (DVD, HD), and MPEG4
Examine multimedia playback and creation (decoding/encoding)
Multimedia Workloads
Decoding
  MP3/AAC – iTunes, Winamp, RealPlayer
  Video – Windows Media Player
Encoding
  MP3 – iTunes, Windows Media Player, RealPlayer
  AAC – iTunes, RealPlayer
  Video – Windows Encoder
Multimedia Workloads
MP3 files used a bitrate of 128 kbps
AAC files used a bitrate of 128 kbps
Video files used presets from the applications
Video was a TV capture of a football game
Audio encoding was done on Beethoven's Symphonie Pastorale
Audio playback was done on “Boulevard of Broken Dreams” by Green Day
Performance
Performance is based on common measurements: cycles per instruction (CPI), uops per instruction, branch prediction, and cache hit rate
Use the on-chip performance counters of the Pentium 4 processor
Use VTune to capture the on-chip counters
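The measurements listed above are all ratios of raw event counts. A minimal sketch of that arithmetic follows, assuming the counts have already been exported from the profiler. The field names and the sample numbers are hypothetical and are not VTune event names or measured data.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Raw event counts for one run (hypothetical field names)."""
    cycles: int
    instructions: int
    uops: int
    branches: int
    branch_mispredicts: int
    l1_accesses: int
    l1_misses: int


def derive_metrics(s: CounterSample) -> dict:
    """Turn raw counts into the ratios used on these slides."""
    return {
        "cpi": s.cycles / s.instructions,
        "uops_per_instr": s.uops / s.instructions,
        "branch_pct": 100.0 * s.branches / s.instructions,
        "prediction_rate_pct": 100.0 * (1 - s.branch_mispredicts / s.branches),
        "l1_hit_rate_pct": 100.0 * (1 - s.l1_misses / s.l1_accesses),
    }


# Made-up counts, chosen only to exercise the arithmetic.
sample = CounterSample(
    cycles=2_000_000, instructions=1_000_000, uops=1_500_000,
    branches=100_000, branch_mispredicts=5_000,
    l1_accesses=400_000, l1_misses=20_000,
)
metrics = derive_metrics(sample)
```

Keeping the raw counts and deriving the ratios afterwards makes it easy to recompute a metric under a different definition later.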
CPI
Our tests were performed on a Pentium 4, which is capable of executing 6 micro-operations (uops) per cycle
Audio decoding CPI: 1.85 – 3.55
Audio encoding CPI: 1.40 – 2.11
Video decoding CPI: 1.96 – 2.56
Video encoding CPI: 1.82 and 2.08
Integer SPEC 2000 CPI: 1.16 – 8.54
Floating-point SPEC 2000 CPI: 4.72 – 8.31
CPI
[Figure: CPI per workload; vertical axis: CPI, 0 to 4]
uops
Audio decoding uops: 1.38 – 1.71
Audio encoding uops: 1.30 – 1.41
Video decoding uops: 1.28 – 1.43
Video encoding uops: 1.29 – 1.31
SPEC 2000 integer uops: 1.29 – 2.11
SPEC 2000 float uops: 1.32 – 2.48
Branch Prediction
SPEC benchmarks have a larger percentage of branch instructions than media applications
Audio decoding: 12% branch instructions
Audio encoding: 7% branch instructions
Video decoding and encoding: 8% branch instructions
SPEC: 13% – 20% branch instructions
Branch Prediction
Media and SPEC benchmarks both exhibit high branch prediction rates
Prediction rates are 94% or higher in most cases
For media applications there is a high correlation between misprediction and CPI
Branch Prediction
[Figure: CPI and branch mispredictions (br miss) for workloads 1 through 31; vertical axis 0 to 14]
Cache Performance
The Pentium 4 processor has a two-level cache
  1st level: 16 KB; 2nd level: 1 MB
Multimedia deals with data in a linear fashion
  Audio/video must be played in order
  This sequential data should allow for high hit rates
Because the SPEC benchmarks cover a wide range of applications, not all of them will resemble the media hit rates
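To illustrate why in-order data tends to hit in the cache, here is a toy simulation of a 16 KB cache fed with sequential and with scattered 4-byte reads. It is only a sketch: the direct-mapped organization, the 64-byte line size, and the access patterns are assumptions made for illustration, not a model of the actual Pentium 4 cache or of the measured workloads.

```python
import random


def hit_rate(addresses, cache_bytes=16 * 1024, line_bytes=64):
    """Hit rate of a direct-mapped cache over a sequence of byte addresses."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # block currently held by each cache line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // line_bytes     # memory block that holds this byte
        index = block % num_lines      # cache line the block maps to
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block        # miss: fill the line with this block
        total += 1
    return hits / total


N = 64 * 1024                          # number of 4-byte reads (256 KB of data)
sequential = [4 * i for i in range(N)]

rng = random.Random(0)                 # fixed seed for repeatability
scattered = [4 * rng.randrange(4 * 1024 * 1024) for _ in range(N)]  # 16 MB span

seq_rate = hit_rate(sequential)
rand_rate = hit_rate(scattered)
```

In this toy model a sequential stream misses once per 64-byte line and then hits on the remaining 15 reads, so `seq_rate` is 15/16, while the scattered stream almost never reuses a line.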
1st Level Cache Performance
Multimedia workloads had 1st level cache hit rates of 93% and higher
Half of the SPEC benchmarks had similar 1st level hit rates
The remaining SPEC benchmarks had considerably worse hit rates
1st Level Cache Performance
[Figure: CPI and 1st level cache misses (L1 miss) for workloads 1 through 31; vertical axis 0 to 50]
2nd Level Cache Performance
For all multimedia applications the 2nd level cache had a hit rate of 99.8% or greater
Only 5 of the 14 SPEC benchmarks had similar 2nd level hit rates
Most of the remaining SPEC benchmarks had 98% or higher, but 2 SPEC benchmarks had 86%
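A small difference in 2nd level hit rate matters more than it looks, because every 2nd level miss goes to main memory. The sketch below estimates the fraction of cache accesses that reach memory. It assumes the 2nd level hit rate is a local rate (measured over accesses that already missed in the 1st level), and it reuses the 93% 1st level figure for both cases purely to isolate the 2nd level effect. Both are assumptions, not statements from the measurements.

```python
def fraction_reaching_memory(l1_hit_pct, l2_hit_pct):
    """Fraction of accesses that miss in both cache levels.

    Assumes l2_hit_pct is a local hit rate, i.e. measured only over
    the accesses that missed in the 1st level cache.
    """
    l1_miss = 1 - l1_hit_pct / 100
    l2_miss = 1 - l2_hit_pct / 100
    return l1_miss * l2_miss


# 93% L1 with 99.8% L2 (media-like) versus 93% L1 with 86% L2.
media_like = fraction_reaching_memory(93.0, 99.8)
low_l2 = fraction_reaching_memory(93.0, 86.0)
ratio = low_l2 / media_like
```

Under these assumptions the 86% case sends roughly 70 times as many accesses to memory as the 99.8% case.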
2nd Level Cache Performance
[Figure: CPI and 2nd level cache misses (L2 miss) for workloads 1 through 31; vertical axis 0 to 16]
Conclusion
Audio and video have similar ranges in CPI, uops per instruction, and uops per cycle
SPEC programs exhibit performance characteristics over a much larger range than media, i.e., the SPEC suites are very diverse
Conclusion
Both audio and video are comparable to SPEC in 2nd level cache performance
Half of the SPEC benchmarks resemble audio and video in 1st level cache performance
SPEC benchmarks can give some insight into the performance of media applications
CPI
Workload                          CPI
iTunes MP3/AAC Decode             1.85 / 1.98
WMV DVD/HD Decode                 1.96 / 2.14
Video Encode Pass1/Pass2          2.02 / 1.82
RealPlayer MP3 Encode             2.02
iTunes MP3 Encode                 2.07
gcc / crafty / parser             1.81 / 1.81 / 1.86
bzip2                             2.06
Encode WMP MP3 / Real AAC         1.66 / 1.71
Encode iTunes AAC                 1.40
gzip / vortex / gap               1.52 / 1.32 / 1.40
CPI
Workload                          CPI
Winamp MP3 Decode                 3.11
Real MP3 Decode                   3.55
vpr                               3.17
twolf                             3.36
Winamp AAC Decode                 2.43
MPEG2                             2.38
MPEG4                             2.59
Real AAC Decode                   2.82
eon                               2.53
uops
Workload                          uops/instr
MP3 Decode RealPlayer & iTunes    1.54
AAC Decode RealPlayer / iTunes    1.57 / 1.61
vortex                            1.60
parser                            1.52
gap                               1.53
twolf                             1.56
uops
Workload                          uops/instr
Encode MP3 WMP / Real / iTunes    1.49 / 1.38 / 1.41
Encode AAC Real / iTunes          1.38 / 1.30
Winamp AAC Decode                 1.38
MPEG2 / MPEG4 / WMV DVD           1.43 / 1.37 / 1.28
WMV HD / Pass1 / Pass2            1.31 / 1.31 / 1.29
gzip / mcf / vpr                  1.35 / 1.29 / 1.46
art / crafty / perlbmk            1.32 / 1.31 / 1.48
bzip2                             1.42
uops
Besides a similar number of uops, one can also look at the cycles needed to complete each uop
Workload             Cycle/uop   CPI
iTunes AAC Encode    1.08        1.40
gcc                  1.05        1.81
gzip                 1.13        1.52
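The three quantities on this slide are tied together by CPI = (cycles per uop) × (uops per instruction). As a quick sketch of that identity, the snippet below recovers uops per instruction from the two columns above. The function name is chosen here for illustration only.

```python
def uops_per_instr(cpi, cycles_per_uop):
    """Rearranged identity: uops/instr = CPI / (cycles per uop)."""
    return cpi / cycles_per_uop


# (cycles per uop, CPI) pairs copied from the slide above.
rows = {
    "iTunes AAC Encode": (1.08, 1.40),
    "gcc": (1.05, 1.81),
    "gzip": (1.13, 1.52),
}

implied = {
    name: round(uops_per_instr(cpi, cyc_per_uop), 2)
    for name, (cyc_per_uop, cpi) in rows.items()
}
```

For iTunes AAC Encode and gzip the implied values (1.30 and 1.35) match the uops/instr figures listed on the earlier uops slides.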
uops
Workload                          Cycle/uop
Decode Real MP3 / AAC             2.30 / 1.80
Decode Winamp MP3 / AAC           1.80 / 1.76
vpr / twolf                       2.17 / 2.15
Decode iTunes AAC / MP3           1.23 / 1.20
parser / eon                      1.22 / 1.20
Pass1 / Pass2                     1.59 / 1.42
bzip2                             1.44
Encode MP3 Real / iTunes          1.47 / 1.47
Branch Prediction
Audio Decoding
Workload      % of branches   Prediction rate (%)   Mispredict/instr
Winamp MP3    12.8            94.92                 0.0065
Real MP3      9.41            91.50                 0.0080
iTunes MP3    11.76           97.84                 0.0025
Winamp AAC    16.85           96.88                 0.0053
Real AAC      13.02           95.26                 0.0060
iTunes AAC    12.81           98.16                 0.0024
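The last column of the audio decoding table can be approximated from the first two, assuming the prediction rate is measured per branch: mispredicts per instruction ≈ (fraction of instructions that are branches) × (1 − prediction rate). The sketch below checks that assumption against the six audio decoding rows. The function name is illustrative.

```python
def mispredicts_per_instr(branch_pct, prediction_rate_pct):
    """Branch fraction times misprediction fraction (both given in percent)."""
    return (branch_pct / 100) * (1 - prediction_rate_pct / 100)


# (% of branches, prediction rate %, reported mispredict/instr)
audio_decoding = {
    "Winamp MP3": (12.8, 94.92, 0.0065),
    "Real MP3": (9.41, 91.50, 0.0080),
    "iTunes MP3": (11.76, 97.84, 0.0025),
    "Winamp AAC": (16.85, 96.88, 0.0053),
    "Real AAC": (13.02, 95.26, 0.0060),
    "iTunes AAC": (12.81, 98.16, 0.0024),
}

# Absolute difference between the estimate and the reported value.
errors = {
    name: abs(mispredicts_per_instr(b, p) - reported)
    for name, (b, p, reported) in audio_decoding.items()
}
```

For these rows the estimate agrees with the reported column to within 0.0002 mispredicts per instruction.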
Branch Prediction
Audio Encoding
Workload      % of branches   Prediction rate (%)   Mispredict/instr
WMP MP3       9.08            96.96                 0.0028
Real MP3      10.42           95.86                 0.0043
iTunes MP3    0.53            94.87                 0.0055
Real AAC      7.74            94.52                 0.0043
iTunes AAC    7.68            95.36                 0.0035
Branch Prediction
Video
Workload          % of branches   Prediction rate (%)   Mispredict/instr
MPEG2 (DVD)       8.91            92.93                 0.0063
MPEG4             8.28            96.76                 0.0027
WMV DVD           5.12            95.86                 0.0021
WMV HD            9.89            96.30                 0.0018
WMV HD - Pass1    6.31            94.69                 0.0033
WMV HD - Pass2    9.28            95.46                 0.0042
Branch Prediction
Workload      % of branches   Prediction rate (%)   Mispredict/instr
gcc           21.84           96.91                 0.0067
gzip          19.10           94.89                 0.0097
mcf           24.25           95.78                 0.0102
vortex        21.22           99.75                 0.0005
vpr           16.57           92.86                 0.0118
art           14.21           99.21                 0.0011
equake        11.00           98.21                 0.0020
parser        20.80           96.65                 0.0074
crafty        15.76           94.20                 0.0091
eon           13.45           97.12                 0.0039
gap           17.51           98.57                 0.0025
perlbmk       21.18           98.56                 0.0031
bzip2         14.83           94.35                 0.0084
twolf         16.48           88.39                 0.0019
Branch Prediction
The high correlation between branch prediction and CPI can give insight into improvements
When new CPU enhancements show improvement in SPEC, a similar or higher gain will be observed in multimedia applications
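As a rough illustration of the correlation claim, the sketch below computes a Pearson coefficient between mispredicts per instruction and CPI, using only the six audio decoding workloads from the tables in this deck. It is a small-sample check under that restriction, not a reproduction of the analysis over all 31 workloads.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# (mispredict/instr, CPI) for the audio decoding workloads in this deck.
audio_decoding = {
    "Winamp MP3": (0.0065, 3.11),
    "Real MP3": (0.0080, 3.55),
    "iTunes MP3": (0.0025, 1.85),
    "Winamp AAC": (0.0053, 2.43),
    "Real AAC": (0.0060, 2.82),
    "iTunes AAC": (0.0024, 1.98),
}

mispredicts = [m for m, _ in audio_decoding.values()]
cpis = [c for _, c in audio_decoding.values()]
r = pearson(mispredicts, cpis)
```

For these six points the coefficient comes out well above 0.9, in line with the correlation stated on the slide.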