Chapter 5. Outline (2nd part)
-
Chapter 5. Outline (2nd part)
  Virtual Machines
  Xen VM: Design and Performance
  AMD Opteron Memory Hierarchy
  Opteron Memory Performance vs. Pentium 4
  Fallacies and Pitfalls
  Conclusion
-
Virtual Machine Monitors (VMMs)
  Virtual machine monitor (VMM) or hypervisor is the software that supports VMs
  VMM determines how to map virtual resources to physical resources
  A physical resource may be time-shared, partitioned, or emulated in software
  VMM is much smaller than a traditional OS; the isolation portion of a VMM is 10,000 lines of code
-
VMM Overhead?
  Depends on the workload
  User-level processor-bound programs (e.g., SPEC) have zero virtualization overhead
    They run at native speeds since the OS is rarely invoked
  I/O-intensive workloads are OS-intensive: they execute many system calls and privileged instructions, which can result in high virtualization overhead
    For System VMs, the goal of the architecture and VMM is to run almost all instructions directly on native hardware
  If an I/O-intensive workload is also I/O-bound, processor utilization is low since it is waiting for I/O, so processor virtualization can be hidden and virtualization overhead is low
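The workload dependence can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not something taken from the slides: it charges a fixed trap cost for every privileged instruction, runs everything else at native speed, and treats I/O wait time as unaffected by processor virtualization. The function names and parameters are hypothetical.

```python
def virtualization_slowdown(privileged_fraction, trap_cost_cycles, native_cpi=1.0):
    """Return virtualized CPU time divided by native CPU time (toy model).

    Assumption: each privileged instruction traps to the VMM and costs
    trap_cost_cycles extra cycles; all other instructions execute
    directly on the hardware at native_cpi cycles each.
    """
    extra_cpi = privileged_fraction * trap_cost_cycles
    return (native_cpi + extra_cpi) / native_cpi


def visible_slowdown(cpu_slowdown, cpu_utilization):
    """Slowdown seen by a workload that keeps the CPU busy only part of the time.

    Assumption: time spent waiting for I/O is not stretched by processor
    virtualization, so only the CPU-busy share of the run grows.
    """
    return cpu_utilization * cpu_slowdown + (1.0 - cpu_utilization)
```

With no privileged instructions the model gives a slowdown of 1 (the processor-bound case). With a made-up 0.1% of instructions trapping at 1000 cycles each, CPU time doubles, yet a workload that is CPU-busy only 10% of the time sees roughly a 10% slowdown, which is the "hidden" case for I/O-bound workloads.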
-
Requirements of a Virtual Machine Monitor
  A VM monitor
    Presents a SW interface to guest software,
    Isolates the state of guests from each other, and
    Protects itself from guest software (including guest OSes)
  Guest software should behave on a VM exactly as if running on the native HW
    Except for performance-related behavior or limitations of fixed resources shared by multiple VMs
  Guest software should not be able to change the allocation of real system resources directly
  Hence, the VMM must control everything, even though the guest VM and OS currently running are temporarily using them
    Access to privileged state, address translation, I/O, exceptions and interrupts
-
Requirements of a Virtual Machine Monitor
  VMM must be at a higher privilege level than the guest VMs, which generally run in user mode
    Execution of privileged instructions is handled by the VMM
  E.g., timer interrupt: the VMM suspends the currently running guest VM, saves its state, handles the interrupt, determines which guest VM to run next, and then loads its state
    Guest VMs that rely on a timer interrupt are provided with a virtual timer and an emulated timer interrupt by the VMM
  Requirements of system virtual machines are the same as for paged virtual memory:
    At least 2 processor modes, system and user
    A privileged subset of instructions available only in system mode; they trap if executed in user mode
    All system resources controllable only via these instructions
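The timer-interrupt sequence above can be sketched as a small simulation. This is a minimal illustration, not how any particular VMM is written: the class names are hypothetical, processor state is modelled as a plain dictionary, and round-robin is an assumed scheduling policy (the slide does not say how the next guest is chosen).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GuestVM:
    name: str
    registers: dict = field(default_factory=dict)  # saved processor state
    virtual_timer_ticks: int = 0                   # emulated timer interrupts delivered


class TimerVMM:
    def __init__(self, guests):
        self.run_queue = deque(guests)
        self.current = self.run_queue.popleft()    # guest currently on the CPU

    def on_timer_interrupt(self, cpu_registers):
        # 1. Suspend the running guest and save its state.
        self.current.registers = dict(cpu_registers)
        # 2. Handle the interrupt: deliver an emulated tick to its virtual timer.
        self.current.virtual_timer_ticks += 1
        # 3. Determine which guest runs next (round-robin, an assumption).
        self.run_queue.append(self.current)
        self.current = self.run_queue.popleft()
        # 4. Load the next guest's state onto the CPU.
        return dict(self.current.registers)
```

A caller would pass in the register contents at the moment of the interrupt and resume execution with the dictionary that comes back.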
-
ISA Support for Virtual Machines
  If VMs are planned for during design of the ISA, it is easy to reduce both the instructions executed by the VMM and the time it takes to emulate them
    An ISA is virtualizable if it can execute the VM directly on the real machine while letting the VMM retain ultimate control of the CPU: "direct execution"
    Since VMs have been considered for desktop/PC server apps only recently, most ISAs were created ignoring virtualization, including 80x86 and most RISC architectures
  VMM must ensure that the guest system only interacts with virtual resources, so a conventional guest OS runs as a user-mode program on top of the VMM
    If the guest OS accesses or modifies information related to HW resources via a privileged instruction (e.g., reading or writing the page table pointer), it will trap to the VMM
    If not, the VMM must intercept the instruction and support a virtual version of the sensitive information as the guest OS expects
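The trap-to-VMM path for a privileged instruction can be sketched as follows. This is a toy model under assumptions: the opcode names, the exception class, and the per-guest dictionary are all hypothetical, and "direct execution" is reduced to returning a string.

```python
class PrivilegedInstructionTrap(Exception):
    """Raised when a privileged opcode is attempted in user mode."""

    def __init__(self, opcode, operand):
        super().__init__(opcode)
        self.opcode = opcode
        self.operand = operand


# Hypothetical privileged opcodes that touch the page table pointer.
PRIVILEGED_OPCODES = {"read_ptbr", "write_ptbr"}


def execute_in_user_mode(opcode, operand=None):
    """Direct execution: ordinary instructions run, privileged ones trap."""
    if opcode in PRIVILEGED_OPCODES:
        raise PrivilegedInstructionTrap(opcode, operand)
    return f"executed {opcode}"


class TrapAndEmulateVMM:
    def __init__(self):
        # Guest name -> the page table pointer that guest believes it has set.
        self.virtual_ptbr = {}

    def run(self, guest, opcode, operand=None):
        try:
            return execute_in_user_mode(opcode, operand)
        except PrivilegedInstructionTrap as trap:
            return self.emulate(guest, trap)

    def emulate(self, guest, trap):
        # The VMM keeps a virtual copy of the sensitive state for each guest.
        if trap.opcode == "write_ptbr":
            self.virtual_ptbr[guest] = trap.operand
            return None
        if trap.opcode == "read_ptbr":
            return self.virtual_ptbr.get(guest, 0)
        raise NotImplementedError(trap.opcode)
```

Each guest sees the value it wrote, while the real page table pointer stays under the VMM's control.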
-
Impact of VMs on Virtual Memory
  How to virtualize virtual memory if each guest OS in every VM manages its own set of page tables?
  VMM separates real and physical memory
    Makes real memory a separate, intermediate level between virtual memory and physical memory
    Some use the terms virtual memory, physical memory, and machine memory to name the 3 levels
    Guest OS maps virtual memory to real memory via its page tables, and VMM page tables map real memory to physical memory
  VMM maintains a shadow page table that maps directly from the guest virtual address space to the physical address space of the HW
    Rather than pay an extra level of indirection on every memory access
    VMM must trap any attempt by the guest OS to change its page table or to access the page table pointer
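A shadow page table is the composition of the two mappings described above. The sketch below models page tables as dictionaries of page numbers, which is an assumption made for brevity; real tables are multi-level structures with protection bits, and the function names are hypothetical.

```python
def build_shadow_page_table(guest_page_table, vmm_page_table):
    """Compose guest (virtual -> real) and VMM (real -> physical) mappings."""
    shadow = {}
    for virtual_page, real_page in guest_page_table.items():
        if real_page in vmm_page_table:  # skip real pages the VMM has not backed
            shadow[virtual_page] = vmm_page_table[real_page]
    return shadow


def translate_two_level(virtual_page, guest_page_table, vmm_page_table):
    """Translation that pays the extra level of indirection on every access."""
    return vmm_page_table[guest_page_table[virtual_page]]


def translate_shadow(virtual_page, shadow_page_table):
    """Translation through the shadow table: a single lookup."""
    return shadow_page_table[virtual_page]
```

Because the shadow table is derived from the guest's table, it must be rebuilt or patched whenever the guest changes a mapping, which is why the VMM has to trap those updates.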
-
ISA Support for VMs & Virtual Memory
  IBM 370 architecture added an additional level of indirection that is managed by the VMM
    Guest OS keeps its page tables as before, so the shadow pages are unnecessary
    (AMD Pacifica proposes the same improvement for 80x86)
  To virtualize a software TLB, the VMM manages the real TLB and has a copy of the contents of the TLB of each guest VM
    Any instruction that accesses the TLB must trap
    TLBs with Process ID tags support a mix of entries from different VMs and the VMM, thereby avoiding flushing of the TLB on a VM switch
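The benefit of ID tags can be seen by comparing two toy TLBs. This is a sketch only: capacity limits and replacement are ignored, and the class and method names are hypothetical.

```python
class UntaggedTLB:
    """TLB without ID tags: flushed on every address-space switch."""

    def __init__(self):
        self.entries = {}  # virtual page -> physical page
        self.flushes = 0

    def switch_address_space(self):
        self.entries.clear()
        self.flushes += 1

    def insert(self, virtual_page, physical_page):
        self.entries[virtual_page] = physical_page

    def lookup(self, virtual_page):
        return self.entries.get(virtual_page)  # None means a TLB miss


class TaggedTLB:
    """TLB whose entries carry an ID tag, so entries from several VMs can coexist."""

    def __init__(self):
        self.entries = {}  # (tag, virtual page) -> physical page
        self.current_tag = 0

    def switch_address_space(self, tag):
        self.current_tag = tag  # no flush needed

    def insert(self, virtual_page, physical_page):
        self.entries[(self.current_tag, virtual_page)] = physical_page

    def lookup(self, virtual_page):
        return self.entries.get((self.current_tag, virtual_page))
```

After switching away and back, the tagged TLB still hits on the first VM's entries, while the untagged TLB has to refill from scratch.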
-
Impact of I/O on Virtual Machines
  I/O is the most difficult part of virtualization
    Increasing number of I/O devices attached to the computer
    Increasing diversity of I/O device types
    Sharing of a real device among multiple VMs
    Supporting the many device drivers that are required, especially if different guest OSes are supported on the same VM system
  Give each VM generic versions of each type of I/O device driver, and let the VMM handle real I/O
  Method for mapping a virtual to a physical I/O device depends on the type of device:
    Disks are partitioned by the VMM to create virtual disks for guest VMs
    Network interfaces are shared between VMs in short time slices, and the VMM tracks messages for virtual network addresses to ensure that guest VMs only receive their messages
-
Example: Xen VM
  Xen: open-source System VMM for the 80x86 ISA
    Project started at University of Cambridge, GNU license model
  Original vision of a VM is running an unmodified OS
    Significant wasted effort just to keep the guest OS happy
  "Paravirtualization": small modifications to the guest OS to simplify virtualization
  3 examples of paravirtualization in Xen:
    1. To avoid flushing the TLB when invoking the VMM, Xen is mapped into the upper 64 MB of the address space of each VM
    2. Guest OS is allowed to allocate pages; Xen just checks that it didn't violate protection restrictions
    3. To protect the guest OS from user programs in the VM, Xen takes advantage of the 4 protection levels available in 80x86
      Most OSes for 80x86 keep everything at privilege level 0 or 3
      Xen VMM runs at the highest privilege level (0)
      Guest OS runs at the next level (1)
      Applications run at the lowest privilege level (3)
-
Xen changes for paravirtualization
  Port of Linux to Xen changed 3000 lines, or 1% of 80x86-specific code
    Does not affect application-binary interfaces of the guest OS
  OSes supported in Xen 2.0:
http://wiki.xensource.com/xenwiki/OSCompatibility
-
Xen and I/O
  To simplify I/O, privileged VMs are assigned to each hardware I/O device: "driver domains"
    Xen jargon: "domains" = virtual machines
  Driver domains run physical device drivers, although interrupts are still handled by the VMM before being sent to the appropriate driver domain
  Regular VMs ("guest domains") run simple virtual device drivers that communicate with the physical device drivers in driver domains over a channel to access physical I/O hardware
  Data is sent between guest and driver domains by page remapping
-
Xen Performance
  Performance relative to native Linux for Xen, for 6 benchmarks from the Xen developers
  Slide 6: User-level processor-bound programs? I/O-intensive workloads? I/O-bound and I/O-intensive?
Chart: Performance relative to native Linux
(raw result, with the ratio to native Linux in parentheses)

  Benchmark              Linux (native)   Xen (VM)       VMware Workstation 3.2 (VM)   User Mode Linux
  SPEC INT2000 (score)   567 (1.00)       567 (1.00)     554 (0.98)                    550 (0.97)
  Linux build time (s)   263 (1.00)       271 (0.97)     334 (0.79)                    535 (0.49)
  OSDB-IR (tup/s)        172 (1.00)       158 (0.92)     80 (0.47)                     65 (0.38)
  OSDB-OLTP (tup/s)      1714 (1.00)      1633 (0.95)    199 (0.12)                    306 (0.18)
  dbench (score)         418 (1.00)       400 (0.96)     310 (0.74)                    111 (0.27)
  SPEC WEB99 (score)     518 (1.00)       514 (0.99)     150 (0.29)                    172 (0.33)
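The ratios plotted for this slide can be recomputed from the raw benchmark results in the slide's spreadsheet data. The sketch below copies those raw numbers; the only assumption added here is that Linux build time is a lower-is-better metric (so its ratio is inverted), which is consistent with the published 0.97 for Xen. The names `RAW_RESULTS` and `relative_performance` are hypothetical.

```python
# Raw results copied from the slide's spreadsheet data.
# The boolean says whether a larger number means better performance.
RAW_RESULTS = {
    "SPEC INT2000 (score)": (True, {"Linux": 567, "Xen": 567, "VMware": 554, "UML": 550}),
    "Linux build time (s)": (False, {"Linux": 263, "Xen": 271, "VMware": 334, "UML": 535}),
    "OSDB-IR (tup/s)": (True, {"Linux": 172, "Xen": 158, "VMware": 80, "UML": 65}),
    "OSDB-OLTP (tup/s)": (True, {"Linux": 1714, "Xen": 1633, "VMware": 199, "UML": 306}),
    "dbench (score)": (True, {"Linux": 418, "Xen": 400, "VMware": 310, "UML": 111}),
    "SPEC WEB99 (score)": (True, {"Linux": 518, "Xen": 514, "VMware": 150, "UML": 172}),
}


def relative_performance(benchmark, system):
    """Performance of `system` relative to native Linux (1.0 means equal)."""
    higher_is_better, results = RAW_RESULTS[benchmark]
    native, measured = results["Linux"], results[system]
    # For a time-based metric, taking longer means lower relative performance.
    return measured / native if higher_is_better else native / measured
```

For example, `relative_performance("Linux build time (s)", "Xen")` is 263/271, which rounds to 0.97.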
-
Xen Performance, Part II
  A subsequent study noticed that the Xen experiments were based on 1 Ethernet network interface card (NIC), and that the single NIC was a performance bottleneck
Chart: Receive throughput (Mbits/sec) vs. number of network interface cards

  Number of NICs   Linux   Xen-privileged driver VM ("driver domain")   Xen-guest VM + driver VM
  1                942     942                                          849
  2                1882    1878                                         849
  3                2462    1539                                         849
  4                2446    1593                                         849
-
Xen Performance, Part III
  > 2X instructions for guest VM + driver VM
  > 4X L2 cache misses
  12X to 24X Data TLB misses
Chart: Event count relative to Xen-privileged driver domain, 1 NIC runs
(raw count, with the ratio to the driver domain in parentheses)

  Event          Linux          Xen-privileged driver VM only   Xen-guest VM + driver VM   Xen-guest VM + driver VM (2 CPUs)
  Instructions   13005 (0.86)   15147 (1.00)                    34227 (2.26)               34815 (2.30)
  L2 misses      3198 (1.00)    3200 (1.00)                     13835 (4.32)               10470 (3.27)
  I-TLB misses   4232 (0.62)    6878 (1.00)                     11965 (1.74)               12159 (1.77)
  D-TLB misses   1119 (0.09)    13087 (1.00)                    24626 (1.88)               27262 (2.08)

  D-TLB misses relative to Linux: 11.7, 22.0, 24.4
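The "2X", "4X" and "12X to 24X" figures on this slide follow from the raw hardware event counts in the slide's spreadsheet data. The sketch below copies those counts and normalizes them to a chosen baseline; the shortened system labels and the helper name are hypothetical.

```python
SYSTEMS = (
    "Linux",
    "Xen driver VM",
    "Xen guest + driver VM",
    "Xen guest + driver VM (2 CPUs)",
)

# Raw event counts for the 1-NIC runs, in the same order as SYSTEMS.
EVENT_COUNTS = {
    "Instructions": (13005, 15147, 34227, 34815),
    "L2 misses": (3198, 3200, 13835, 10470),
    "I-TLB misses": (4232, 6878, 11965, 12159),
    "D-TLB misses": (1119, 13087, 24626, 27262),
}


def relative_counts(event, baseline):
    """Event counts for every system divided by the baseline system's count."""
    counts = EVENT_COUNTS[event]
    base = counts[SYSTEMS.index(baseline)]
    return [count / base for count in counts]
```

Normalizing to the driver VM reproduces the plotted bars (for instructions: 0.86, 1.00, 2.26, 2.30), while normalizing D-TLB misses to native Linux gives about 11.7, 22.0 and 24.4, which is where the 12X to 24X range comes from.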
-
Xen Performance, Part IV
  1. > 2X instructions: page remapping and page transfer between driver and guest VMs, and communication between the 2 VMs over a channel
  2. 4X L2 cache misses: Linux uses a zero-copy network interface that depends on the ability of the NIC to do DMA from different locations in memory
    Since Xen does not support "gather DMA" in its virtual network interface, it can't do true zero-copy in the guest VM
  3. 12X to 24X Data TLB misses: 2 Linux optimizations
    Superpages for part of the Linux kernel space: a 4 MB page lowers TLB misses versus using 1024 4 KB pages. Not in Xen
    PTEs marked global are not flushed on a context switch, and Linux uses them for its kernel space. Not in Xen
  Future Xen may address 2. and 3., but is 1. inherent?
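The superpage point is simple arithmetic: the number of TLB entries needed to cover a region is the region size divided by the page size. A minimal sketch, with a hypothetical helper name:

```python
KB = 1024
MB = 1024 * KB


def tlb_entries_needed(region_bytes, page_bytes):
    """Number of pages, and hence TLB entries, needed to map a region."""
    return -(-region_bytes // page_bytes)  # ceiling division
```

Covering 4 MB of kernel space takes 1024 entries with 4 KB pages but a single entry with one 4 MB superpage, so far fewer TLB misses are possible when superpages are available.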
-
Protection and Instruction Set Architecture
  Example problem: the 80x86 POPF instruction loads the flag registers from the top of the stack in memory
    One such flag is Interrupt Enable (IE)
    In system mode, POPF changes IE
    In user mode, POPF simply changes all flags except IE
    Problem: the guest OS runs in user mode inside a VM, so it expects to see a changed IE, but it won't
  Historically, IBM mainframe HW and VMM took 3 steps:
    1. Reduce the cost of processor virtualization
      Intel/AMD proposed ISA changes to reduce this cost
    2. Reduce interrupt overhead cost due to virtualization
    3. Reduce interrupt cost by steering interrupts to the proper VM directly without invoking the VMM
  2. and 3. not yet addressed by Intel/AMD; in the future?
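The POPF problem can be shown with a small model of the flags register. This sketch follows the slide's two-mode description and is a simplification: on real 80x86 hardware the decision depends on the current and I/O privilege levels, and the function name and mode strings are hypothetical. The interrupt-enable flag is bit 9 of the flags register.

```python
IE_FLAG = 1 << 9     # interrupt-enable bit in the 80x86 flags register
CARRY_FLAG = 1 << 0  # an ordinary flag, for comparison


def popf(current_flags, value_on_stack, mode):
    """Return the new flags after POPF, following the slide's simplified rule."""
    if mode == "system":
        return value_on_stack  # all flags, including IE, are loaded
    # User mode: every flag except IE is loaded. IE keeps its old value and
    # no trap is raised, so a VMM never gets the chance to intervene.
    ie_unchanged = current_flags & IE_FLAG
    return (value_on_stack & ~IE_FLAG) | ie_unchanged
```

A guest OS running in user mode that executes POPF to enable interrupts gets back flags with IE still clear, silently, which is exactly the behavior that breaks trap-and-emulate virtualization.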
-
80x86 VM Challenges
  18 instructions cause problems for virtualization:
    Read control registers in user mode that reveal that the guest operating system is running in a virtual machine (such as POPF), and
    Check protection as required by the segmented architecture but assume that the operating system is running at the highest privilege level
  Virtual memory: 80x86 TLBs do not support process ID tags, so it is more expensive for the VMM and guest OSes to share the TLB
    Each address space change typically requires a TLB flush
-
Intel/AMD address 80x86 VM Challenges
  Goal is direct execution of VMs on 80x86
  Intel's VT-x
    A new execution mode for running VMs
    An architected definition of the VM state
    Instructions to swap VMs rapidly
    Large set of parameters to select the circumstances where a VMM must be invoked
    VT-x adds 11 new instructions to 80x86
  Xen 3.0 plan proposes to use VT-x to run Windows on Xen
  AMD's Pacifica makes similar proposals
    Plus an indirection level in the page table like IBM VM 370
  Ironic adding a new mode
    If OSes start using the mode in the kernel, the new mode would cause performance problems for the VMM since it is 100 times too slow
-
AMD Opteron Memory Hierarchy
  12-stage integer pipeline yields a maximum clock rate of 2.8 GHz; fastest memory is PC3200 DDR SDRAM
  48-bit virtual and 40-bit physical addresses
  I and D cache: 64 KB, 2-way set associative, 64-B block, LRU
  L2 cache: 1 MB, 16-way, 64-B block, pseudo LRU
  Data and L2 caches use write back, write allocate
  L1 caches are virtually indexed and physically tagged
  L1 I TLB and L1 D TLB: fully associative, 40 entries
    32 entries for 4 KB pages and 8 for 2 MB or 4 MB pages
  L2 I TLB and L2 D TLB: 4-way, 512 entries of 4 KB pages
  Memory controller allows up to 10 cache misses
    8 from D cache and 2 from I cache
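The cache parameters above determine how an address splits into tag, index and block offset. The sketch below works this out under two assumptions: all sizes are powers of two, and the tag is simply the physical address bits left over after the index and offset. The function name is hypothetical.

```python
def cache_geometry(capacity_bytes, ways, block_bytes, physical_address_bits=40):
    """Derive set count and address field widths for a set-associative cache."""
    sets = capacity_bytes // (ways * block_bytes)
    offset_bits = block_bytes.bit_length() - 1  # log2 for a power of two
    index_bits = sets.bit_length() - 1          # log2 for a power of two
    tag_bits = physical_address_bits - index_bits - offset_bits
    return {
        "sets": sets,
        "index_bits": index_bits,
        "offset_bits": offset_bits,
        "tag_bits": tag_bits,
    }
```

For the 64 KB, 2-way, 64-B-block L1 caches this gives 512 sets, a 9-bit index, a 6-bit block offset and a 25-bit tag; for the 1 MB, 16-way L2 it gives 1024 sets, a 10-bit index and a 24-bit tag.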
-
Example on TLB / L1 Cache
  See Fig 5.18, 5.19 for AMD Opteron
  Figure: TLB and L1 cache lookup from a virtual address (field widths in bits)
    Virtual address (48) = virtual page number (35) + page offset (13)
    Virtual page number (35) = TLB tag (28) + TLB index (7)
    TLB: TLB tag (28), TLB data (27); total TLB tag size (28672), total TLB data size (27648)
    On a TLB hit, the TLB data is the physical page number (27)
    Page offset (13) = L1 index (7) + block offset (6)
    L1 cache: L1 tag (27), L1 data (512); total L1 tag size (13824), total L1 data size (262144)
    On an L1 hit, the L1 tag matches the physical page number; otherwise the access goes to the L2 cache
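The field widths in this figure are mutually consistent, which can be checked with a few lines of arithmetic. In the sketch below the constants are the widths printed in the figure; the entry counts (1024 TLB entries, 512 L1 blocks) are not stated on the slide and are inferred here by dividing the printed totals by the per-entry widths. The function name is hypothetical.

```python
VIRTUAL_ADDRESS_BITS = 48
PAGE_OFFSET_BITS = 13
TLB_INDEX_BITS = 7
L1_INDEX_BITS = 7
BLOCK_OFFSET_BITS = 6

# Derived widths, matching the figure's labels.
VIRTUAL_PAGE_NUMBER_BITS = VIRTUAL_ADDRESS_BITS - PAGE_OFFSET_BITS  # 35
TLB_TAG_BITS = VIRTUAL_PAGE_NUMBER_BITS - TLB_INDEX_BITS            # 28

# Entry counts inferred from the printed totals (an assumption).
TLB_ENTRIES = 28672 // TLB_TAG_BITS  # total TLB tag bits / bits per tag
L1_BLOCKS = 13824 // 27              # total L1 tag bits / bits per tag


def split_virtual_address(virtual_address):
    """Split an address into (TLB tag, TLB index, L1 index, block offset)."""
    page_offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    virtual_page_number = virtual_address >> PAGE_OFFSET_BITS
    tlb_index = virtual_page_number & ((1 << TLB_INDEX_BITS) - 1)
    tlb_tag = virtual_page_number >> TLB_INDEX_BITS
    l1_index = page_offset >> BLOCK_OFFSET_BITS
    block_offset = page_offset & ((1 << BLOCK_OFFSET_BITS) - 1)
    return tlb_tag, tlb_index, l1_index, block_offset
```

Because the L1 index and block offset together fit exactly in the 13-bit page offset, the cache can be indexed with untranslated bits while the TLB lookup proceeds in parallel, and the physical tag comparison happens once both finish.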
-
Opteron Memory Hierarchy Performance
  For SPEC2000
    I cache misses per instruction are 0.01% to 0.09%
    D cache misses per instruction are 1.34% to 1.43%
    L2 cache misses per instruction are 0.23% to 0.36%
  Commercial benchmark ("TPC-C-like")
    I cache misses per instruction are 1.83% (100X!)
    D cache misses per instruction are 1.39% (about the same)
    L2 cache misses per instruction are 0.62% (2X to 3X)
  How does this compare to the ideal CPI of 0.33?
-
CPI breakdown for Integer Programs
  CPI above base attributable to memory: about 50%
  L2 cache misses: about 25% overall (50% of memory CPI)
  Assumes misses are not overlapped with the execution pipeline or with each other, so the pipeline stall portion is a lower bound
Chart2
0.33333333330.106240.2604266667
0.33333333330.096530.2901366667
0.33333333330.003690.4829766667
0.33333333330.151050.3756166667
0.33333333330.137180.3894866667
0.33333333330.296490.2501766667
0.33333333330.565210.1014566667
0.33333333330.302830.3838366667
0.33333333330.346250.6004166667
0.33333333331.204060.2426066667
0.33333333330.931910.5847566667
0.33333333331.251510.9851566667
Base CPI
Max Memory CPI
Min Pipeline Stall
CPI
Sheet3
Base CPIPipeline StallMemory CPITotal CPI% CPI-base in Memory
TPC-C like0.330.921.312.5759%
perlbmk0.330.240.130.7035%
crafty0.330.200.190.7249%
eon0.330.480.000.821%
gzip0.330.320.210.8639%
gap0.330.370.150.8629%
vortex0.330.170.380.8869%
bzip20.330.060.611.0091%
gcc0.330.360.331.0248%
parser0.330.540.401.2843%
vpr0.33(0.01)1.461.78101%
twolf0.330.511.011.8566%52%
TPC-C0.330.921.312.5759%53%
CFP2000 Avg0.330.330.491.1560%
sixtrack0.330.260.030.6312%
mesa0.330.350.090.7821%
wupwise0.330.180.310.8363%
mgrid0.330.220.340.8960%
applu0.330.010.620.9798%
facerec0.330.030.711.0796%
galgel0.330.170.571.0777%
apsi0.330.260.581.1769%
ammp0.330.280.581.1968%
fma3d0.330.430.571.3457%
lucas0.330.520.881.7363%
swim0.331.000.551.8835%
equake0.331.210.802.3540%
art0.330.961.733.0364%59%
Sheet3
000
000
000
000
000
000
000
000
000
000
000
000
Base CPI
Pipeline Stall
Memory CPI
CPI
Sheet2
0.33333333330.26238666670.03428
0.33333333330.35187666670.09479
0.33333333330.18241666670.31425
0.33333333330.22061666670.33605
0.33333333330.01378666670.62288
0.33333333330.03115666670.70551
0.33333333330.16735666670.56931
0.33333333330.25520666670.58146
0.33333333330.27824666670.57842
0.33333333330.43484666670.57182
0.33333333330.52112666670.87554
0.33333333331.00030666670.54636
0.33333333331.21468666670.80198
0.33333333330.96347666671.73319
Base CPI
Min Pipeline Stall
Max Memory CPI
CPI
CPI breakdown
BenchmarkAvg CPIIcache missesDcache missesL2 missesITLB L1 missesDTLB L1 missesITLB L2 missesDTLB L2 misses
per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.
TPC-C like2.5718.3413.896.183.259.000.091.71
CINT2000 total1.300.9014.273.570.2512.470.001.06
Miss Penalty771607777
Peak CPI0.3333333333
memory CPI
TPC-C like1.310.130.100.990.020.060.000.01
CINT2000 total0.770.010.100.570.000.090.00.01
Base CPI0.33
Memory CPI (no overlap)0.77
Pipeline Stalls (no overlap)0.19
BenchmarkAvg CPIIcache missesDcache missesL2 missesITLB L1 missesDTLB L1 missesITLB L2 missesDTLB L2 misses
Base CPIMemory CPIPipeline StallL2 Memory% Memory L2per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.
TPC-C like0.331.310.920.9975%2.5718.3413.896.183.259.000.091.71
Miss Pen771607777
CINT2000 total0.330.770.190.5774%1.300.9014.273.570.2512.470.001.06INT
gzip0.330.210.320.028%0.860.0116.030.100.0111.060.000.09INT
vpr0.331.46(0.01)0.9263%1.780.0223.365.730.0150.520.003.22INT
gcc0.330.330.360.1444%1.021.9419.040.900.794.530.000.19INT
mcf0.3318.20(5.47)16.6191%13.060.02148.90103.820.0150.490.0026.98INT
crafty0.330.190.200.015%0.723.154.050.060.1618.070.000.01INT
parser0.330.400.540.2153%1.280.0814.801.340.0111.560.000.65INT
eon0.330.000.480.00%0.820.060.450.000.010.050.000.00INT
perlbmk0.330.130.240.0754%0.701.362.410.430.933.510.000.31INT
gap0.330.150.370.0960%0.860.764.270.580.053.380.000.33INT
vortex0.330.380.170.1949%0.883.675.861.170.6815.780.001.38INT
bzip20.330.610.060.4778%1.000.0110.572.940.008.170.000.63INT
twolf0.331.010.510.7271%1.850.0826.184.490.0214.790.000.01INT
52%
CFP2000 total0.330.490.330.3674%1.150.0813.432.260.013.700.000.79FP
wupwise0.330.310.180.2785%0.830.006.561.660.000.220.000.17FP
swim0.330.551.000.3259%1.880.0130.872.020.000.590.000.41FP
mgrid0.330.340.220.2264%0.890.0116.541.350.000.350.000.25FP
applu0.330.620.010.5588%0.970.018.483.410.002.420.000.13FP
mesa0.330.090.350.0222%0.780.031.580.130.018.780.000.17FP
galgel0.330.570.170.3867%1.070.0118.632.380.007.620.000.67FP
art0.331.730.961.3276%3.030.0056.968.270.001.200.000.41FP
equake0.330.801.210.5366%2.350.0637.293.300.001.200.000.59FP
facerec0.330.710.030.6389%1.070.019.313.940.001.210.000.20FP
ammp0.330.580.280.3866%1.190.0216.582.370.008.610.003.25FP
lucas0.330.880.520.7080%1.730.0017.354.360.004.800.003.27FP
fma3d0.330.570.430.4885%1.340.2011.843.020.050.360.000.21FP
sixtrack0.330.030.260.0375%0.630.030.530.160.010.660.000.01FP
apsi0.330.580.260.4068%1.170.5013.812.480.0110.370.001.69FP
71%
CINT2000 Avg0.330.190.771.30
mcf0.33(5.47)18.20
Base CPIMin Pipeline StallMax Memory CPITotal CPI
TPC-C like0.330.921.312.57
perlbmk0.330.240.130.70
crafty0.330.200.190.72
eon0.330.480.000.82
gzip0.330.320.210.86
gap0.330.370.150.86
vortex0.330.170.380.88
bzip20.330.060.611.00
gcc0.330.360.331.02
parser0.330.540.401.28
vpr0.33(0.01)1.461.78
twolf0.330.511.011.85
TPC-C0.330.921.312.57
CFP2000 Avg0.330.330.491.15
sixtrack0.330.260.030.63
mesa0.330.350.090.78
wupwise0.330.180.310.83
mgrid0.330.220.340.89
applu0.330.010.620.97
facerec0.330.030.711.07
galgel0.330.170.571.07
apsi0.330.260.581.17
ammp0.330.280.581.19
fma3d0.330.430.571.34
lucas0.330.520.881.73
swim0.331.000.551.88
equake0.331.210.802.35
art0.330.961.733.03
CPI breakdown
0.33333333330.106240.2604266667
0.33333333330.096530.2901366667
0.33333333330.003690.4829766667
0.33333333330.151050.3756166667
0.33333333330.137180.3894866667
0.33333333330.296490.2501766667
0.33333333330.565210.1014566667
0.33333333330.302830.3838366667
0.33333333330.346250.6004166667
0.33333333331.204060.2426066667
0.33333333330.931910.5847566667
0.33333333331.251510.9851566667
Base CPI
Max Memory CPI
Min Pipeline Stall
CPI
Sheet1
0.33333333330.034280.2623866667
0.33333333330.094790.3518766667
0.33333333330.314250.1824166667
0.33333333330.336050.2206166667
0.33333333330.622880.0137866667
0.33333333330.705510.0311566667
0.33333333330.569310.1673566667
0.33333333330.581460.2552066667
0.33333333330.578420.2782466667
0.33333333330.571820.4348466667
0.33333333330.875540.5211266667
0.33333333330.546361.0003066667
0.33333333330.801981.2146866667
0.33333333331.733190.9634766667
Base CPI
Max Memory CPI
Min Pipeline Stall
CPI
Opteron SPEC CPU2000 Rates
BenchmarkAvg CPIIcache missesDcache missesL2 missesITLB L1 missesDTLB L1 missesITLB L2 missesDTLB L2 misses
per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.
TPC-C like2.5718.3413.896.183.259.000.091.71
CINT2000 total1.300.9014.273.570.2512.470.001.06
Miss Penalty771607227
Peak CPI0.3333333333
memory CPI
TPC-C like1.270.130.100.990.020.020.000.01
CINT2000 total0.710.010.100.570.000.020.00.01
Base CPI0.33
Memory CPI (no overlap)0.71
Pipeline Stalls (no overlap)0.26
Miss columns (Icache through DTLB L2) are misses per 1k instr.
Miss Pen (Icache, Dcache, L2, ITLB L1, DTLB L1, ITLB L2, DTLB L2): 7, 7, 160, 2, 2, 7, 7
Benchmark       Base CPI  Memory CPI  Pipeline Stall  L2 Memory  % Memory L2  Avg CPI  Icache  Dcache  L2      ITLB L1  DTLB L1  ITLB L2  DTLB L2  Suite
TPC-C like      0.33      1.25        0.99            0.99       79%          2.57     18.34   13.89   6.18    3.25     9.00     0.09     1.71
CINT2000 total  0.33      0.71        0.26            0.57       80%          1.30     0.90    14.27   3.57    0.25     12.47    0.00     1.06     INT
perlbmk         0.33      0.11        0.26            0.07       65%          0.70     1.36    2.41    0.43    0.93     3.51     0.00     0.31     INT
crafty          0.33      0.10        0.29            0.01       10%          0.72     3.15    4.05    0.06    0.16     18.07    0.00     0.01     INT
eon             0.33      0.00        0.48            0.0        0%           0.82     0.06    0.45    0.00    0.01     0.05     0.00     0.00     INT
gzip            0.33      0.15        0.38            0.02       11%          0.86     0.01    16.03   0.10    0.01     11.06    0.00     0.09     INT
gap             0.33      0.14        0.39            0.09       68%          0.86     0.76    4.27    0.58    0.05     3.38     0.00     0.33     INT
vortex          0.33      0.30        0.25            0.19       63%          0.88     3.67    5.86    1.17    0.68     15.78    0.00     1.38     INT
bzip2           0.33      0.57        0.10            0.47       83%          1.00     0.01    10.57   2.94    0.00     8.17     0.00     0.63     INT
gcc             0.33      0.30        0.38            0.14       48%          1.02     1.94    19.04   0.90    0.79     4.53     0.00     0.19     INT
parser          0.33      0.35        0.60            0.21       62%          1.28     0.08    14.80   1.34    0.01     11.56    0.00     0.65     INT
vpr             0.33      1.20        0.24            0.92       76%          1.78     0.02    23.36   5.73    0.01     50.52    0.00     3.22     INT
twolf           0.33      0.93        0.58            0.72       77%          1.85     0.08    26.18   4.49    0.02     14.79    0.00     0.01     INT
mcf             0.33      17.94       (5.22)          16.61      93%          13.06    0.02    148.90  103.82  0.01     50.49    0.00     26.98    INT
                                                                 58%
CFP2000 total   0.33      0.47        0.35            0.36       77%          1.15     0.08    13.43   2.26    0.01     3.70     0.00     0.79     FP
sixtrack        0.33      0.03        0.27            0.03       83%          0.63     0.03    0.53    0.16    0.01     0.66     0.00     0.01     FP
mesa            0.33      0.05        0.40            0.02       41%          0.78     0.03    1.58    0.13    0.01     8.78     0.00     0.17     FP
wupwise         0.33      0.31        0.18            0.27       85%          0.83     0.00    6.56    1.66    0.00     0.22     0.00     0.17     FP
mgrid           0.33      0.33        0.22            0.22       65%          0.89     0.01    16.54   1.35    0.00     0.35     0.00     0.25     FP
applu           0.33      0.61        0.03            0.55       89%          0.97     0.01    8.48    3.41    0.00     2.42     0.00     0.13     FP
galgel          0.33      0.53        0.21            0.38       72%          1.07     0.01    18.63   2.38    0.00     7.62     0.00     0.67     FP
facerec         0.33      0.70        0.04            0.63       90%          1.07     0.01    9.31    3.94    0.00     1.21     0.00     0.20     FP
apsi            0.33      0.53        0.31            0.40       75%          1.17     0.50    13.81   2.48    0.01     10.37    0.00     1.69     FP
ammp            0.33      0.54        0.32            0.38       71%          1.19     0.02    16.58   2.37    0.00     8.61     0.00     3.25     FP
fma3d           0.33      0.57        0.44            0.48       85%          1.34     0.20    11.84   3.02    0.05     0.36     0.00     0.21     FP
lucas           0.33      0.85        0.55            0.70       82%          1.73     0.00    17.35   4.36    0.00     4.80     0.00     3.27     FP
swim            0.33      0.54        1.00            0.32       59%          1.88     0.01    30.87   2.02    0.00     0.59     0.00     0.41     FP
equake          0.33      0.80        1.22            0.53       66%          2.35     0.06    37.29   3.30    0.00     1.20     0.00     0.59     FP
art             0.33      1.73        0.97            1.32       77%          3.03     0.00    56.96   8.27    0.00     1.20     0.00     0.41     FP
                                                                 74%
Benchmark     Base CPI  Min Pipeline Stall  Max Memory CPI  Total CPI
CINT2000 Avg  0.33      0.26                0.71            1.30
mcf           0.33      (5.47)              18.20
TPC-C like    0.33      0.99                1.25            2.57
perlbmk       0.33      0.26                0.11            0.70
crafty        0.33      0.29                0.10            0.72
eon           0.33      0.48                0.00            0.82
gzip          0.33      0.38                0.15            0.86
gap           0.33      0.39                0.14            0.86
vortex        0.33      0.25                0.30            0.88
bzip2         0.33      0.10                0.57            1.00
gcc           0.33      0.38                0.30            1.02
parser        0.33      0.60                0.35            1.28
vpr           0.33      0.24                1.20            1.78
twolf         0.33      0.58                0.93            1.85
TPC-C         0.33      0.99                1.25            2.57
CFP2000 Avg   0.33      0.35                0.47            1.15
sixtrack      0.33      0.26                0.03            0.63
mesa          0.33      0.35                0.09            0.78
wupwise       0.33      0.18                0.31            0.83
mgrid         0.33      0.22                0.34            0.89
applu         0.33      0.01                0.62            0.97
facerec       0.33      0.03                0.71            1.07
galgel        0.33      0.17                0.57            1.07
apsi          0.33      0.26                0.58            1.17
ammp          0.33      0.28                0.58            1.19
fma3d         0.33      0.43                0.57            1.34
lucas         0.33      0.52                0.88            1.73
swim          0.33      1.00                0.55            1.88
equake        0.33      1.21                0.80            2.35
art           0.33      0.96                1.73            3.03
Opteron SPEC CPU2000 Rates
[Charts: CPI stacked by Base CPI, Max Memory CPI, and Min Pipeline Stall; plotted values not recoverable]
BenchmarkAvg CPIIcache missesDcache missesL2 missesITLB L1 missesDTLB L1 missesITLB L2 missesDTLB L2 missesIcache missesDcache missesL2 missesITLB L1 missesDTLB L1 missesITLB L2 missesDTLB L2 misses
per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.
TPC-C like2.5718.3413.896.183.259.000.091.711.83%1.39%0.62%0.33%0.90%0.01%0.17%
%%%%%%%
CINT2000 total1.300.9014.273.570.2512.470.001.06INT0.09%1.43%0.36%0.03%1.25%0.00%0.11%
164.gzip0.860.0116.030.100.0111.060.000.09INT0.001%1.603%0.010%0.001%1.106%0.000%0.009%
175.vpr1.780.0223.365.730.0150.520.003.22INT0.002%2.336%0.573%0.001%5.052%0.000%0.322%
176.gcc1.021.9419.040.900.794.530.000.19INT0.194%1.904%0.090%0.079%0.453%0.000%0.019%
181.mcf13.060.02148.90103.820.0150.490.0026.98INT0.002%14.890%10.382%0.001%5.049%0.000%2.698%
186.crafty0.723.154.050.060.1618.070.000.01INT0.315%0.405%0.006%0.016%1.807%0.000%0.001%
197.parser1.280.0814.801.340.0111.560.000.65INT0.008%1.480%0.134%0.001%1.156%0.000%0.065%
252.eon0.820.060.450.000.010.050.000.00INT0.006%0.045%0.000%0.001%0.005%0.000%0.000%
253.perlbmk0.701.362.410.430.933.510.000.31INT0.136%0.241%0.043%0.093%0.351%0.000%0.031%
254.gap0.860.764.270.580.053.380.000.33INT0.076%0.427%0.058%0.005%0.338%0.000%0.033%
255.vortex0.883.675.861.170.6815.780.001.38INT0.367%0.586%0.117%0.068%1.578%0.000%0.138%
256.bzip21.000.0110.572.940.008.170.000.63INT0.001%1.057%0.294%0.000%0.817%0.000%0.063%
300.twolf1.850.0826.184.490.0214.790.000.01INT0.008%2.618%0.449%0.002%1.479%0.000%0.001%
CFP2000 total1.150.0813.432.260.013.700.000.79FP0.01%1.34%0.23%0.00%0.37%0.00%0.08%
168.wupwise0.830.006.561.660.000.220.000.17FP0.000%0.656%0.166%0.000%0.022%0.000%0.017%
171.swim1.880.0130.872.020.000.590.000.41FP0.001%3.087%0.202%0.000%0.059%0.000%0.041%
172.mgrid0.890.0116.541.350.000.350.000.25FP0.001%1.654%0.135%0.000%0.035%0.000%0.025%
173.applu0.970.018.483.410.002.420.000.13FP0.001%0.848%0.341%0.000%0.242%0.000%0.013%
177.mesa0.780.031.580.130.018.780.000.17FP0.003%0.158%0.013%0.001%0.878%0.000%0.017%
178.galgel1.070.0118.632.380.007.620.000.67FP0.001%1.863%0.238%0.000%0.762%0.000%0.067%
179.art3.030.0056.968.270.001.200.000.41FP0.000%5.696%0.827%0.000%0.120%0.000%0.041%
183.equake2.350.0637.293.300.001.200.000.59FP0.006%3.729%0.330%0.000%0.120%0.000%0.059%
187.facerec1.070.019.313.940.001.210.000.20FP0.001%0.931%0.394%0.000%0.121%0.000%0.020%
188.ammp1.190.0216.582.370.008.610.003.25FP0.002%1.658%0.237%0.000%0.861%0.000%0.325%
189.lucas1.730.0017.354.360.004.800.003.27FP0.000%1.735%0.436%0.000%0.480%0.000%0.327%
191.fma3d1.340.2011.843.020.050.360.000.21FP0.020%1.184%0.302%0.005%0.036%0.000%0.021%
200.sixtrack0.630.030.530.160.010.660.000.01FP0.003%0.053%0.016%0.001%0.066%0.000%0.001%
301.apsi1.170.5013.812.480.0110.370.001.69FP0.050%1.381%0.248%0.001%1.037%0.000%0.169%
2.020.41.01.713.00.70.01.6Min0.000%0.045%0.000%0.000%0.005%0.000%0.000%
2.2229.31.02.7325.02.40.02.2Max0.367%14.890%10.382%0.093%5.052%0.000%2.698%
Inverse0.77Icache missesDcache missesL2 missesITLB L1 missesDTLB L1 missesITLB L2 missesDTLB L2 misses
0.87per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.per 1k instr.
0.38910505842.57Not MCF0.000%0.045%0.000%0.000%0.005%0.000%0.000%
Notes:NOT MCF0.367%5.696%0.827%0.093%5.052%0.000%0.327%
- Data measured on uniprocessor Opteron based systems
- Metrics are averaged over the entire benchmark run
- Only SPEC CPU2000 base results are given. All CPU2000 benchmarks run in 64-bit mode.
- TPC-C like benchmark is run in 32-bit mode.
- Contact: [email protected] 512-602-5581 or 512-576-9485 mobile
-
CPI breakdown for Floating Pt. Programs
CPI above base attributable to memory: 60%
L2 cache misses: 40% overall (70% of memory CPI)
Assumes misses are not overlapped with the execution pipeline or with each other, so the memory portion is an upper bound and the pipeline stall portion is a lower bound
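The percentages on this slide can be checked against the CFP2000 totals in the spreadsheet data. The following is a minimal sketch, assuming the rounded CFP2000 values from the CPI breakdown table (total CPI 1.15, memory CPI 0.49, L2 memory CPI 0.36) and a base CPI of 1/3; the function and key names are illustrative only.

```python
# CFP2000 totals as they appear (rounded) in the CPI breakdown table.
BASE_CPI = 1 / 3        # "Peak CPI" of 0.33
TOTAL_CPI = 1.15
MEMORY_CPI = 0.49       # upper bound: assumes misses do not overlap
L2_MEMORY_CPI = 0.36    # part of MEMORY_CPI caused by L2 misses


def cpi_shares(total, base, memory, l2_memory):
    """Split the CPI above base into memory and pipeline-stall parts."""
    above_base = total - base
    stall = above_base - memory  # lower bound on pipeline stalls
    return {
        "pipeline_stall": stall,
        "memory_share": memory / above_base,
        "l2_share_of_memory": l2_memory / memory,
        "l2_share_overall": l2_memory / above_base,
    }


shares = cpi_shares(TOTAL_CPI, BASE_CPI, MEMORY_CPI, L2_MEMORY_CPI)
for name, value in shares.items():
    print(f"{name}: {value:.2f}")
```

With these inputs the memory share comes out at about 60% of CPI above base, and the L2 share lands in the low 70s as a percentage of memory CPI and the low 40s overall, in line with the rounded figures on the slide.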
Chart4
[Chart data, y-axis CPI; rows follow the order of the CFP2000 rows in the Sheet3 table]
Benchmark  Base CPI      Max Memory CPI  Min Pipeline Stall
sixtrack   0.3333333333  0.03428         0.2623866667
mesa       0.3333333333  0.09479         0.3518766667
wupwise    0.3333333333  0.31425         0.1824166667
mgrid      0.3333333333  0.33605         0.2206166667
applu      0.3333333333  0.62288         0.0137866667
facerec    0.3333333333  0.70551         0.0311566667
galgel     0.3333333333  0.56931         0.1673566667
apsi       0.3333333333  0.58146         0.2552066667
ammp       0.3333333333  0.57842         0.2782466667
fma3d      0.3333333333  0.57182         0.4348466667
lucas      0.3333333333  0.87554         0.5211266667
swim       0.3333333333  0.54636         1.0003066667
equake     0.3333333333  0.80198         1.2146866667
art        0.3333333333  1.73319         0.9634766667
Sheet3
Benchmark    Base CPI  Pipeline Stall  Memory CPI  Total CPI  % CPI-base in Memory  (unlabeled)
TPC-C like   0.33      0.92            1.31        2.57       59%
perlbmk      0.33      0.24            0.13        0.70       35%
crafty       0.33      0.20            0.19        0.72       49%
eon          0.33      0.48            0.00        0.82       1%
gzip         0.33      0.32            0.21        0.86       39%
gap          0.33      0.37            0.15        0.86       29%
vortex       0.33      0.17            0.38        0.88       69%
bzip2        0.33      0.06            0.61        1.00       91%
gcc          0.33      0.36            0.33        1.02       48%
parser       0.33      0.54            0.40        1.28       43%
vpr          0.33      (0.01)          1.46        1.78       101%
twolf        0.33      0.51            1.01        1.85       66%                   52%
TPC-C        0.33      0.92            1.31        2.57       59%                   53%
CFP2000 Avg  0.33      0.33            0.49        1.15       60%
sixtrack     0.33      0.26            0.03        0.63       12%
mesa         0.33      0.35            0.09        0.78       21%
wupwise      0.33      0.18            0.31        0.83       63%
mgrid        0.33      0.22            0.34        0.89       60%
applu        0.33      0.01            0.62        0.97       98%
facerec      0.33      0.03            0.71        1.07       96%
galgel       0.33      0.17            0.57        1.07       77%
apsi         0.33      0.26            0.58        1.17       69%
ammp         0.33      0.28            0.58        1.19       68%
fma3d        0.33      0.43            0.57        1.34       57%
lucas        0.33      0.52            0.88        1.73       63%
swim         0.33      1.00            0.55        1.88       35%
equake       0.33      1.21            0.80        2.35       40%
art          0.33      0.96            1.73        3.03       64%                   59%
Sheet3
[Chart: CPI stacked by Base CPI, Pipeline Stall, and Memory CPI; plotted values not recoverable]
Sheet2
[Chart data, y-axis CPI: the same 14 floating-point rows as Chart4, with columns ordered Base CPI, Min Pipeline Stall, Max Memory CPI]
CPI breakdown
Benchmark                     Avg CPI  Icache misses  Dcache misses  L2 misses  ITLB L1 misses  DTLB L1 misses  ITLB L2 misses  DTLB L2 misses
                                       (all miss columns are per 1k instr.)
TPC-C like                    2.57     18.34          13.89          6.18       3.25            9.00            0.09            1.71
CINT2000 total                1.30     0.90           14.27          3.57       0.25            12.47           0.00            1.06
Miss Penalty                           7              7              160        7               7               7               7
Peak CPI                      0.3333333333
Memory CPI
TPC-C like                    1.31     0.13           0.10           0.99       0.02            0.06            0.00            0.01
CINT2000 total                0.77     0.01           0.10           0.57       0.00            0.09            0.00            0.01
Base CPI                      0.33
Memory CPI (no overlap)       0.77
Pipeline Stalls (no overlap)  0.19
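The "Memory CPI" rows above appear to be each miss rate multiplied by its miss penalty and divided by 1000, with the pipeline stall taken as whatever CPI remains after base and memory. The following is a sketch under that reading; the function names are illustrative, and the inputs are the rounded values from the table.

```python
# Misses per 1000 instructions, in the column order of the table:
# Icache, Dcache, L2, ITLB L1, DTLB L1, ITLB L2, DTLB L2.
MISSES_PER_1K = {
    "TPC-C like": [18.34, 13.89, 6.18, 3.25, 9.00, 0.09, 1.71],
    "CINT2000 total": [0.90, 14.27, 3.57, 0.25, 12.47, 0.00, 1.06],
}
AVG_CPI = {"TPC-C like": 2.57, "CINT2000 total": 1.30}
MISS_PENALTY = [7, 7, 160, 7, 7, 7, 7]  # "Miss Penalty" row, in cycles
PEAK_CPI = 1 / 3                        # "Peak CPI" row


def memory_cpi(misses_per_1k, penalties):
    """Upper-bound memory CPI: every miss stalls for its full penalty."""
    return sum(m * p for m, p in zip(misses_per_1k, penalties)) / 1000


def pipeline_stall_cpi(avg_cpi, mem_cpi, base_cpi=PEAK_CPI):
    """CPI left over after base and memory; a lower bound on stalls."""
    return avg_cpi - base_cpi - mem_cpi


for name, misses in MISSES_PER_1K.items():
    mem = memory_cpi(misses, MISS_PENALTY)
    stall = pipeline_stall_cpi(AVG_CPI[name], mem)
    print(f"{name}: memory CPI {mem:.2f}, pipeline stall {stall:.2f}")
```

Because no overlap between misses is assumed, a benchmark with very high miss rates can end up with a negative "stall" value, which is how the parenthesized entries for mcf and vpr arise in the tables.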
Miss columns (Icache through DTLB L2) are misses per 1k instr.
Miss Pen (Icache, Dcache, L2, ITLB L1, DTLB L1, ITLB L2, DTLB L2): 7, 7, 160, 7, 7, 7, 7
Benchmark       Base CPI  Memory CPI  Pipeline Stall  L2 Memory  % Memory L2  Avg CPI  Icache  Dcache  L2      ITLB L1  DTLB L1  ITLB L2  DTLB L2  Suite
TPC-C like      0.33      1.31        0.92            0.99       75%          2.57     18.34   13.89   6.18    3.25     9.00     0.09     1.71
CINT2000 total  0.33      0.77        0.19            0.57       74%          1.30     0.90    14.27   3.57    0.25     12.47    0.00     1.06     INT
gzip            0.33      0.21        0.32            0.02       8%           0.86     0.01    16.03   0.10    0.01     11.06    0.00     0.09     INT
vpr             0.33      1.46        (0.01)          0.92       63%          1.78     0.02    23.36   5.73    0.01     50.52    0.00     3.22     INT
gcc             0.33      0.33        0.36            0.14       44%          1.02     1.94    19.04   0.90    0.79     4.53     0.00     0.19     INT
mcf             0.33      18.20       (5.47)          16.61      91%          13.06    0.02    148.90  103.82  0.01     50.49    0.00     26.98    INT
crafty          0.33      0.19        0.20            0.01       5%           0.72     3.15    4.05    0.06    0.16     18.07    0.00     0.01     INT
parser          0.33      0.40        0.54            0.21       53%          1.28     0.08    14.80   1.34    0.01     11.56    0.00     0.65     INT
eon             0.33      0.00        0.48            0.0        0%           0.82     0.06    0.45    0.00    0.01     0.05     0.00     0.00     INT
perlbmk         0.33      0.13        0.24            0.07       54%          0.70     1.36    2.41    0.43    0.93     3.51     0.00     0.31     INT
gap             0.33      0.15        0.37            0.09       60%          0.86     0.76    4.27    0.58    0.05     3.38     0.00     0.33     INT
vortex          0.33      0.38        0.17            0.19       49%          0.88     3.67    5.86    1.17    0.68     15.78    0.00     1.38     INT
bzip2           0.33      0.61        0.06            0.47       78%          1.00     0.01    10.57   2.94    0.00     8.17     0.00     0.63     INT
twolf           0.33      1.01        0.51            0.72       71%          1.85     0.08    26.18   4.49    0.02     14.79    0.00     0.01     INT
                                                                 52%
CFP2000 total   0.33      0.49        0.33            0.36       74%          1.15     0.08    13.43   2.26    0.01     3.70     0.00     0.79     FP
wupwise         0.33      0.31        0.18            0.27       85%          0.83     0.00    6.56    1.66    0.00     0.22     0.00     0.17     FP
swim            0.33      0.55        1.00            0.32       59%          1.88     0.01    30.87   2.02    0.00     0.59     0.00     0.41     FP
mgrid           0.33      0.34        0.22            0.22       64%          0.89     0.01    16.54   1.35    0.00     0.35     0.00     0.25     FP
applu           0.33      0.62        0.01            0.55       88%          0.97     0.01    8.48    3.41    0.00     2.42     0.00     0.13     FP
mesa            0.33      0.09        0.35            0.02       22%          0.78     0.03    1.58    0.13    0.01     8.78     0.00     0.17     FP
galgel          0.33      0.57        0.17            0.38       67%          1.07     0.01    18.63   2.38    0.00     7.62     0.00     0.67     FP
art             0.33      1.73        0.96            1.32       76%          3.03     0.00    56.96   8.27    0.00     1.20     0.00     0.41     FP
equake          0.33      0.80        1.21            0.53       66%          2.35     0.06    37.29   3.30    0.00     1.20     0.00     0.59     FP
facerec         0.33      0.71        0.03            0.63       89%          1.07     0.01    9.31    3.94    0.00     1.21     0.00     0.20     FP
ammp            0.33      0.58        0.28            0.38       66%          1.19     0.02    16.58   2.37    0.00     8.61     0.00     3.25     FP
lucas           0.33      0.88        0.52            0.70       80%          1.73     0.00    17.35   4.36    0.00     4.80     0.00     3.27     FP
fma3d           0.33      0.57        0.43            0.48       85%          1.34     0.20    11.84   3.02    0.05     0.36     0.00     0.21     FP
sixtrack        0.33      0.03        0.26            0.03       75%          0.63     0.03    0.53    0.16    0.01     0.66     0.00     0.01     FP
apsi            0.33      0.58        0.26            0.40       68%          1.17     0.50    13.81   2.48    0.01     10.37    0.00     1.69     FP
                                                                 71%
Benchmark     Base CPI  Min Pipeline Stall  Max Memory CPI  Total CPI
CINT2000 Avg  0.33      0.19                0.77            1.30
mcf           0.33      (5.47)              18.20
TPC-C like    0.33      0.92                1.31            2.57
perlbmk       0.33      0.24                0.13            0.70
crafty        0.33      0.20                0.19            0.72
eon           0.33      0.48                0.00            0.82
gzip          0.33      0.32                0.21            0.86
gap           0.33      0.37                0.15            0.86
vortex        0.33      0.17                0.38            0.88
bzip2         0.33      0.06                0.61            1.00
gcc           0.33      0.36                0.33            1.02
parser        0.33      0.54                0.40            1.28
vpr           0.33      (0.01)              1.46            1.78
twolf         0.33      0.51                1.01            1.85
TPC-C         0.33      0.92                1.31            2.57
CFP2000 Avg   0.33      0.33                0.49            1.15
sixtrack      0.33      0.26                0.03            0.63
mesa          0.33      0.35                0.09            0.78
wupwise       0.33      0.18                0.31            0.83
mgrid         0.33      0.22                0.34            0.89
applu         0.33      0.01                0.62            0.97
facerec       0.33      0.03                0.71            1.07
galgel        0.33      0.17                0.57            1.07
apsi          0.33      0.26                0.58            1.17
ammp          0.33      0.28                0.58            1.19
fma3d         0.33      0.43                0.57            1.34
lucas         0.33      0.52                0.88            1.73
swim          0.33      1.00                0.55            1.88
equake        0.33      1.21                0.80            2.35
art           0.33      0.96                1.73            3.03
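CPI breakdown
[Chart data, y-axis CPI; rows follow the order of the integer and TPC-C rows in the Base CPI / Min Pipeline Stall / Max Memory CPI table]
Benchmark  Base CPI      Max Memory CPI  Min Pipeline Stall
perlbmk    0.3333333333  0.10624         0.2604266667
crafty     0.3333333333  0.09653         0.2901366667
eon        0.3333333333  0.00369         0.4829766667
gzip       0.3333333333  0.15105         0.3756166667
gap        0.3333333333  0.13718         0.3894866667
vortex     0.3333333333  0.29649         0.2501766667
bzip2      0.3333333333  0.56521         0.1014566667
gcc        0.3333333333  0.30283         0.3838366667
parser     0.3333333333  0.34625         0.6004166667
vpr        0.3333333333  1.20406         0.2426066667
twolf      0.3333333333  0.93191         0.5847566667
TPC-C      0.3333333333  1.25151         0.9851566667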
-
Pentium 4 vs. Opteron Memory Hierarchy
*Clock rate for this comparison in 2005; faster versions existed
-
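Misses Per Instruction: Pentium 4 vs. Opteron
D cache misses: P4 has 2.3X to 3.4X as many as Opteron (geometric means for SPECint2000 and SPECfp2000)
L2 cache misses: P4 has 0.5X to 1.5X as many as Opteron (geometric means for SPECint2000 and SPECfp2000)
Note: Same ISA, but not same instruction count
[Chart: ratio of misses per instruction, Pentium 4/Opteron; a ratio above 1 means Opteron is better, below 1 means Pentium 4 is better; annotated values 2.3X, 3.4X, 0.5X, 1.5X]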
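Chart1
[Chart data: Ratio of MPI: Pentium 4/Opteron]
Benchmark  Suite        D cache: P4/Opteron  L2 cache: P4/Opteron
gzip       SPECint2000  3.4990584483         3.4651969004
vpr        SPECint2000  1.7701590667         0.3413784983
gcc        SPECint2000  1.6279772058         0.8163025337
mcf        SPECint2000  1.233981236          0.1755664967
crafty     SPECint2000  4.6841592898         0.19326934
wupwise    SPECfp2000   4.6691421702         1.9451292097
swim       SPECfp2000   5.0158603661         6.1920687672
mgrid      SPECfp2000   1.309903446          0.723184898
applu      SPECfp2000   3.5857053494         0.4991951275
mesa       SPECfp2000   3.9695673646         1.6074951627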
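The 2.3X to 3.4X and 0.5X to 1.5X figures quoted on the slide appear to be geometric means of the per-benchmark ratios, taken separately over the five integer and five floating-point programs. The following is a sketch of that calculation, assuming the Chart1 ratios rounded to four places; the dictionary and function names are illustrative.

```python
import math

# Pentium 4 / Opteron miss ratios from the Chart1 data, rounded to 4 places.
# Order: gzip, vpr, gcc, mcf, crafty / wupwise, swim, mgrid, applu, mesa.
D_CACHE = {
    "SPECint2000": [3.4991, 1.7702, 1.6280, 1.2340, 4.6842],
    "SPECfp2000": [4.6691, 5.0159, 1.3099, 3.5857, 3.9696],
}
L2_CACHE = {
    "SPECint2000": [3.4652, 0.3414, 0.8163, 0.1756, 0.1933],
    "SPECfp2000": [1.9451, 6.1921, 0.7232, 0.4992, 1.6075],
}


def geometric_mean(ratios):
    """Geometric mean computed via the mean of the natural logs."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


for label, table in (("D cache", D_CACHE), ("L2 cache", L2_CACHE)):
    for suite, ratios in table.items():
        print(f"{label} {suite}: {geometric_mean(ratios):.2f}X")
```

A geometric mean is the usual choice for averaging ratios because it treats a 2X advantage and a 0.5X disadvantage symmetrically, which an arithmetic mean does not.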
D & L2 MPI, SPEC
             Pentium 4, 3.2 GHz                                Opteron, 2.8 GHz                                  P4/O
             12K      16 K     (unlabeled)  2 MB      L1/L2    64 K     64 K     (unlabeled)  1 MB      L1/L2
Benchmark    I cache  D cache               L2 cache           I cache  D cache               L2 cache           I cache  D cache  L2 cache
164.gzip     0.05     56.09    5.61         0.35      162      0.01     16.03    1.60         0.10      160      5.0      3.5      3.5
175.vpr      0.35     41.35    4.14         1.96      21       0.02     23.36    2.34         5.73      4        17.6     1.8      0.3
176.gcc      3.99     31.00    3.10         0.73      42       1.94     19.04    1.90         0.90      21       2.1      1.6      0.8
181.mcf      0.08     183.74   18.37        18.23     10       0.02     148.90   14.89        103.82    1        3.8      1.2      0.2
186.crafty   5.55     18.97    1.90         0.01      1,636    3.15     4.05     0.41         0.06      68       1.8      4.7      0.2
168.wupwise  0.05     30.63    3.06         3.23      9        0.01     6.56     0.66         1.66      4        9.4      4.7      1.9
171.swim     0.07     154.84   15.48        12.51     12       0.01     30.87    3.09         2.02      15       6.8      5.0      6.2
172.mgrid    0.02     21.67    2.17         0.98      22       0.01     16.54    1.65         1.35      12       1.8      1.3      0.7
173.applu    0.03     30.41    3.04         1.70      18       0.01     8.48     0.85         3.41      2        2.9      3.6      0.5
177.mesa     0.51     6.27     0.63         0.21      30       0.03     1.58     0.16         0.13      12       16.9     4.0      1.6
MIN          0.02     6.27     0.63         0.01      9.49     0.01     1.58     0.16         0.06      1.43     1.76     1.23     0.18
MAX          5.55     183.74   18.37        18.23     1,635.96 3.15     148.90   14.89        103.82    160.30   17.63    5.02     6.19
Benchmark  I Pent4  I Opt  D Pent4  D Opt   L2 Pent4  L2 Opt
gzip       0.05     0.01   56.09    16.03   0.35      0.10
vpr        0.35     0.02   41.35    23.36   1.96      5.73
gcc        3.99     1.94   31.00    19.04   0.73      0.90
mcf        0.08     0.02   183.74   148.90  18.23     103.82
crafty     5.55     3.15   18.97    4.05    0.01      0.06
wupwise    0.05     0.01   30.63    6.56    3.23      1.66
swim       0.07     0.01   154.84   30.87   12.51     2.02
mgrid      0.02     0.01   21.67    16.54   0.98      1.35
applu      0.03     0.01   30.41    8.48    1.70      3.41
mesa       0.51     0.03   6.27     1.58    0.21      0.13

Benchmark  I cache  D cache  L2 cache
gzip       5.0      3.5      3.5
vpr        17.6     1.8      0.3
gcc        2.1      1.6      0.8
mcf        3.8      1.2      0.2
crafty     1.8      4.7      0.2
wupwise    9.4      4.7      1.9
swim       6.8      5.0      6.2
mgrid      1.8      1.3      0.7
applu      2.9      3.6      0.5
mesa       16.9     4.0      1.6
Benchmark  D cache: P4/Opteron  L2 cache: P4/Opteron
gzip       3                    3.5
vpr        2                    0.3
gcc        2                    0.8
mcf        1                    0.2
crafty     5                    0.2
wupwise    5                    1.9
swim       5                    6.2
mgrid      1                    0.7
applu      4                    0.5
mesa       4                    1.6
Median     3.5                  0.8
geomean    2.76                 0.86
gstdev
SPECratio2000base
Benchmark   P4     Opteron  Ratio  Ratio Op/P4
gzip        1066   1284     0.83   1.20
vpr         1103   1244     0.89   1.13
gcc         1880   1431     1.31   0.76
mcf         1962   1163     1.69   0.59
crafty      1183   1899     0.62   1.61
wupwise     2597   2838     0.92   1.09
swim        2640   2253     1.17   0.85
mgrid       1462   1723     0.85   1.18
applu       1526   2122     0.72   1.39
mesa        1403   1840     0.76   1.31
All GM int  1584   1743     0.94   1.10
All GM fp   1843   1976     1.08   1.07
GM 5 int    1,387  1,382    1.00   1.00
GM 5 fp     1,846  2,122    0.87   1.15
D & L2 MPI, SPEC
[Chart: Misses per Instruction; series I Pent4, I Opt, D Pent4, D Opt, L2 Pent4, L2 Opt; plotted values not recoverable]
New GeoMean
[Chart: series I cache, D cache, L2 cache; plotted values not recoverable]
GeoMean
[Chart: Ratio of MPI: Pentium 4/Opteron; groups SPECint2000, SPECfp2000; series D cache: P4/Opteron, L2 cache: P4/Opteron; plotted values not recoverable]
SPEC2000 Pent4
[Chart: Ratio of MPI: Pentium 4/Opteron; series D cache: P4/Opteron, L2 cache: P4/Opteron, SPECRatio; plotted values not recoverable]
SPEC2000 Opt
Benchmark  Itanium 2 SPECRatio  ln(Sample)  ln(GM)  (ln(S)-ln(GM))**2
gzip       0.83                 (0.19)      0.00    0.04
vpr        0.89                 (0.12)      0.00    0.02
gcc        1.31                 0.27        0.00    0.07
mcf        1.69                 0.52        0.00    0.27
crafty     0.62                 (0.47)      0.00    0.23
Bold are beyond 1 St.Dev. Range

N                       5
1/N                     0.2
Product                 1.0163530546
GM (old)                1.00  (GM calculated old way)

GM, Gstdev, Range calculated before:
Sum ln(S)               0.02
div n                   0.00
GM (via ln)             1.00  (exp)
StDev(Sum(ln(S))        0.39  (St Dev)
exp(StDev(Sum(ln(S)))   1.48  (Gstdev)
GM/Gstdev               0.68  (low)
GM*Gstdev               1.49  (high)

GM, Gstdev, Range calculated as on web page:
Sum[(ln(S)-ln(GM))**2]  0.62
Sum[]/n                 0.12
SQRT(Sum[]/n)           0.35
exp(SQRT(Sum[]/n))      1.42  (Gstdev)
GM/Gstdev               0.71  (low)
GM*Gstdev               1.43  (high)

Benchmark  Itanium 2 SPECRatio  ln(Sample)  ln(GM)  (ln(S)-ln(GM))**2
wupwise    0.92                 (0.09)      0.00    0.01
swim       1.17                 0.16        0.00    0.02
mgrid      0.85                 (0.16)      0.00    0.03
applu      0.72                 (0.33)      0.00    0.11
mesa       0.76                 (0.27)      0.00    0.08
Bold are beyond 1 St.Dev. Range

N                       5
1/N                     0.2
Product                 0.4988998701
GM (old)                0.87  (GM calculated old way)

GM, Gstdev, Range calculated before:
Sum ln(S)               (0.70)
div n                   (0.14)
GM (via ln)             0.87  (exp)
StDev(Sum(ln(S))        0.19  (St Dev)
exp(StDev(Sum(ln(S)))   1.21  (Gstdev)
GM/Gstdev               0.72  (low)
GM*Gstdev               1.05  (high)

GM, Gstdev, Range calculated as on web page:
Sum[(ln(S)-ln(GM))**2]  0.25
Sum[]/n                 0.05
SQRT(Sum[]/n)           0.22
exp(SQRT(Sum[]/n))      1.25  (Gstdev)
GM/Gstdev               0.70  (low)
GM*Gstdev               1.09  (high)
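The two Gstdev columns in these worksheets seem to differ only in the divisor of the standard deviation of the logs: n - 1 for the "calculated before" figures and n for the "as on web page" figures. The following is a sketch of both, assuming the five rounded integer ratios from the first worksheet; the variable names are illustrative.

```python
import math
import statistics

# The five rounded SPECRatio ratios from the first worksheet above.
RATIOS = {"gzip": 0.83, "vpr": 0.89, "gcc": 1.31, "mcf": 1.69, "crafty": 0.62}

logs = [math.log(r) for r in RATIOS.values()]

# Geometric mean: exponentiate the arithmetic mean of the logs.
gm = math.exp(statistics.mean(logs))

# Geometric standard deviation: exponentiate the standard deviation of the logs.
gstdev_sample = math.exp(statistics.stdev(logs))       # divides by n - 1
gstdev_population = math.exp(statistics.pstdev(logs))  # divides by n

for label, gsd in (("n - 1", gstdev_sample), ("n", gstdev_population)):
    low, high = gm / gsd, gm * gsd
    print(f"divisor {label}: GM {gm:.2f}, Gstdev {gsd:.2f}, "
          f"range {low:.2f} to {high:.2f}")
```

Because the inputs here are already rounded to two places, the low and high ends of the range can differ in the last digit from the worksheet values.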
Pentium 4 / Opteron Ratio: D cache
Benchmark  Ratio  ln
gzip       3.5    1.25
vpr        1.8    0.57
gcc        1.6    0.49
mcf        1.2    0.21
crafty     4.7    1.54
wupwise    4.7    1.54
swim       5.0    1.61
mgrid      1.3    0.27
applu      3.6    1.28
mesa       4.0    1.38

         All 10            gzip to crafty  wupwise to mesa
n        10                5               5
1/n      0.1               0.2             0.2
Product  25450.2245091978  58.2843586326   436.6561648144
Sum ln   10.14             4.07            6.08
div n    1.01              0.81            1.22
GM       2.76              2.25            3.37
exp      2.76              2.25            3.37
St Dev   0.56              0.56            0.54
Gstdev   1.76              1.75            1.72
low      1.57              1.29            1.96
high     4.84              3.95            5.82

Pentium 4 / Opteron Ratio: L2 cache
Benchmark  Ratio  ln
gzip       3.5    1.24
vpr        0.3    (1.07)
gcc        0.8    (0.20)
mcf        0.2    (1.74)
crafty     0.2    (1.64)
wupwise    1.9    0.67
swim       6.2    1.82
mgrid      0.7    (0.32)
applu      0.5    (0.69)
mesa       1.6    0.47

         All 10        gzip to crafty  wupwise to mesa
n        10            5               5
1/n      0.1           0.2             0.2
Product  0.2290200047  0.0327657288    6.9896203502
Sum ln   (1.47)        (3.42)          1.94
div n    (0.15)        (0.68)          0.39
GM       0.86          0.50            1.48
exp      0.86          0.50            1.48
St Dev   1.19          1.24            0.98
Gstdev   3.30          3.45            2.66
low      0             0               1
high     3             2               4

Pentium 4 / Opteron Ratio: SPEC
Benchmark  Ratio  ln
gzip       0.83   (0.19)
vpr        0.89   (0.12)
gcc        1.31   0.27
mcf        1.69   0.52
crafty     0.62   (0.47)
wupwise    0.92   (0.09)
swim       1.17   0.16
mgrid      0.85   (0.16)
applu      0.72   (0.33)
mesa       0.76   (0.27)

         All 10       gzip to crafty  wupwise to mesa
n        10           5               5
1/n      0.1          0.2             0.2
Product  0.507058407  1.0163530546    0.4988998701
Sum ln   (0.68)       0.02            (0.70)
div n    (0.07)       0.00            (0.14)
GM       0.93         1.00            0.87
exp      0.93         1.00            0.87
St Dev   0.30         0.39            0.19
Gstdev   1.35         1.48            1.21
low      1            1               1
high     1            1               1
SPECint2000
Benchmarks        Base Ref Time  Base Run Time  Base Ratio  Peak Ref Time  Peak Run Time  Peak Ratio
164.gzip          1400           131            1066        1400           132            1063
175.vpr           1400           127            1103        1400           125            1119
176.gcc           1100           58.5           1880        1100           59             1865
181.mcf           1800           91.8           1962        1800           91.7           1963
186.crafty        1000           84.5           1183        1000           84.5           1183
197.parser        1800           134            1338        1800           134            1340
252.eon           1300           64.8           2006        1300           64.8           2006
253.perlbmk       1800           93.2           1932        1800           93.2           1932
254.gap           1100           62.8           1751        1100           62.5           1760
255.vortex        1900           70.1           2709        1900           70.1           2709
256.bzip2         1500           124            1205        1500           125            1205
300.twolf         3000           183            1638        3000           183            1638
SPECint_base2000                                1584
SPECint2000                                                                               1585

SPECfp2000
Benchmarks        Base Ref Time  Base Run Time  Base Ratio  Peak Ref Time  Peak Run Time  Peak Ratio
168.wupwise       1600           61.6           2597        1600           61.6           2597
171.swim          3100           117            2640        3100           117            2640
172.mgrid         1800           123            1462        1800           123            1462
173.applu         2100           138            1526        2100           138            1526
177.mesa          1400           99.8           1403        1400           99.8           1403
178.galgel        2900           85.7           3384        2900           85.7           3384
179.art           2600           55.3           4702        2600           55.3           4702
183.equake        1300           47             2763        1300           47             2763
187.facerec       1900           98.7           1925        1900           98.7           1925
188.ammp          2200           196            1121        2200           196            1121
189.lucas         2000           91.4           2188        2000           91.4           2188
191.fma3d         2100           143            1471        2100           143            1471
200.sixtrack      1100           173            635         1100           173            635
301.apsi          2600           207            1257        2600           207            1257
SPECfp_base2000                                 1843
SPECfp2000                                                                                1843
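For readers unfamiliar with these columns: each SPEC CPU2000 ratio is 100 times the reference time divided by the run time, and the suite score is the geometric mean of the per-benchmark ratios. The following is a small sketch using the SPECint2000 base numbers above; the helper names are illustrative. Ratios recomputed from the printed run times can differ slightly from the published ones, presumably because the printed run times are rounded.

```python
import math

# Published SPECint2000 base ratios from the table above, in table order.
BASE_RATIOS = [1066, 1103, 1880, 1962, 1183, 1338,
               2006, 1932, 1751, 2709, 1205, 1638]


def spec_ratio(ref_time, run_time):
    """SPEC CPU2000 ratio: 100 * reference time / measured run time."""
    return 100 * ref_time / run_time


def composite(ratios):
    """Suite score: geometric mean of the per-benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# 164.gzip base: reference 1400 s, run 131 s (published ratio 1066).
print(f"gzip ratio from printed times: {spec_ratio(1400, 131):.0f}")
print(f"SPECint_base2000 from published ratios: {composite(BASE_RATIOS):.0f}")
```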
SPEC CINT2000 Summary
Fujitsu Siemens Computers CELSIUS M440, Intel Pentium 4 640
Wed Nov 30 11:08:35 2005
HARDWARE
--------
Hardware Vendor: Fujitsu Siemens Computers
Model Name: CELSIUS M440, Intel Pentium 4 640
CPU: Intel Pentium 4 processor 640
CPU MHz: 3200
FPU: Integrated
CPU(s) enabled: 1 core, 1 chip, 1 core/chip (Hyper-Threading Technology disabled)
CPU(s) orderable: 1
Parallel: No
Primary Cache: 12k micro-ops I + 16KBD on chip
Secondary Cache: 2048KB(I+D) on chip
L3 Cache: N/A
Other Cache: N/A
Memory: 4 GB (4x1 GB DDR2-533, 2rank, CL4-4-4, with ECC)
Disk Subsystem: Serial ATA 7200 rpm
Other Hardware:
SOFTWARE
--------
Operating System: Windows XP Professional, Service Pack 2
Compiler: Intel(R) C++ Compiler for 32-bit app., Version 9.0,
- Build 20050912Z Package ID: W_CC_C_9.0.024
Microsoft Visual Studio .NET 2003 (for libraries)
MicroQuill SmartHeap Library 7.4
File System: NTFS
System State: Default
NOTES
-----
Portability flags:
176.gcc: -Dalloca=_alloca /F10000000
186.crafty: -DNT_i386
253.perlbmk: -DSPEC_CPU2000_NTOS -DPERLDLL /MT
254.gap: -DSYS_HAS_CALLOC_PROTO -DSYS_HAS_MALLOC_PROTO
Feedback optimization:
+FDO: PASS1= -Qprof_gen PASS2= -Qprof_use
Baseline Tuning Flags:
for C programs:
-fast +FDO shlW32M.lib
for C++ program 252.eon:
-fast -Qcxx_features +FDO
Peak Tuning Flags:
164.gzip: -fast +FDO
175.vpr: -fast +FDO
176.gcc: -fast +FDO
181.mcf: -fast +FDO shlW32M.lib
186.crafty: -fast +FDO
197.parser: -fast +FDO
252.eon: -fast -Qcxx_features +FDO
253.perlbmk: -fast +FDO shlW32M.lib
254.gap: -fast +FDO
255.vortex: -fast +FDO shlW32M.lib
256.bzip2: -fast +FDO
300.twolf: -fast +FDO shlW32M.lib
Extra Libraries:
shlW32M.lib = MicroQuill SmartHeap Library 7.4
see www.microquill.com
The system bus runs at 800 MHz
For information about Fujitsu Siemens Computers in your country please see:
http://www.fujitsu-siemens.com/countries
-----------------------------------------------------------------------------
For questions about this result, please contact the tester.
For other inquiries, please contact webmaster@spec.org.
Copyright 1999-2005 Standard Performance Evaluation Corporation
Generated on Tue Dec 27 11:33:15 2005 by SPEC CPU2000 ASCII formatter v2.1
SPECint2000       Base      Base      Base      Peak      Peak      Peak      Base      Base      Base      Peak      Peak      Peak
Benchmarks        Ref Time  Run Time  Ratio     Ref Time  Run Time  Ratio     Ref Time  Run Time  Ratio     Ref Time  Run Time  Ratio
164.gzip          1400      83.1      1685      1400      82.8      1691      1400      109       1284      1400      90.7      1543*
175.vpr           1400      102       1373      1400      98.5      1421      1400      113       1244      1400      90.2      1552*
176.gcc           1100      55.9      1969      1100      57.2      1923      1100      76.9      1431      1100      53.6      2052*
181.mcf           1800      236       764       1800      148       1214      1800      155       1163      1800      119       1516*
186.crafty        1000      40.1      2492      1000      39.2      2553      1000      52.7      1899      1000      50        1998*
197.parser        1800      140       1285      1800      110       1632      1800      103       1741      1800      104       1733*
252.eon           1300      44.8      2905      1300      38.8      3352      1300      50.1      2593      1300      44.9      2895*
253.perlbmk       1800      95.6      1883      1800      83.3      2162      1800      87        2068      1800      85.2      2112*
254.gap           1100      60.7      1813      1100      60.7      1813      1100      60        1833      1100      60        1833*
255.vortex        1900      66        2877      1900      62        3065      1900      69        2753      1900      65.9      2881*
256.bzip2         1500      96.4      1556      1500      96.4      1556      1500      102       1476      1500      99.2      1513*
300.twolf         3000      189       1587      3000      149       2008      3000      171       1754      3000      133       2254*
SPECint_base2000  1743      1708
SPECint2000       1945      1940
SPECfp2000        Base      Base      Base      Peak      Peak      Peak
Benchmarks        Ref Time  Run Time  Ratio     Ref Time  Run Time  Ratio
168.wupwise       1600      56.4      2838      1600      49.1      3261
171.swim          3100      138       2253      3100      133       2326
172.mgrid         1800      104       1723      1800      96.8      1859
173.applu         2100      99        2122      2100      85.8      2448
177.mesa          1400      76.1      1840      1400      57.8      2423
178.galgel        2900      90        3222      2900      80.7      3596
179.art           2600      98.5      2639      2600      68.3      3805
183.equake        1300      76.2      1706      1300      69.2      1880
187.facerec       1900      71.4      2661      1900      71.8      2647
188.ammp          2200      134       1641      2200      125       1755
189.lucas         2000      108       1849      2000      102       1963
191.fma3d         2100      123       1713      2100      122       1720
200.sixtrack      1100      117       939       1100      112       979
301.apsi          2600      152       1711      2600      146       1781
SPECfp_base2000   1976
SPECfp2000        2191
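For the Opteron versus Pentium 4 comparison that this part of the chapter sets up, the composite scores can be put side by side. The sketch below assumes that the first pair of tables belongs to the Pentium 4 640 system and the second pair to the Opteron 154 system, as the summaries following each pair suggest. It uses the first of the two SPECint result sets listed in the second pair. The dictionary and function names are illustrative.

```python
# Composite scores copied from the tables above (higher is better).
PENTIUM4_640 = {
    "SPECint_base2000": 1584,
    "SPECint2000": 1585,
    "SPECfp_base2000": 1843,
    "SPECfp2000": 1843,
}
# Assumption: these are the Opteron 154 figures (first SPECint result set).
OPTERON_154 = {
    "SPECint_base2000": 1743,
    "SPECint2000": 1945,
    "SPECfp_base2000": 1976,
    "SPECfp2000": 2191,
}


def relative_scores(system, baseline):
    """Score of `system` divided by score of `baseline`, per metric."""
    return {metric: system[metric] / baseline[metric] for metric in system}


speedups = relative_scores(OPTERON_154, PENTIUM4_640)
for metric, factor in speedups.items():
    print(f"{metric:18s} {factor:.2f}x")
```

On these composite numbers the second system scores roughly 7% to 23% higher. The two reports also differ in compiler, operating system, and memory configuration, so the gap is not attributable to the processors alone.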
SPEC CFP2000 Summary
Advanced Micro Devices TYAN Tomcat K8E (S2865), AMD Opteron (TM) 154
Sun Dec 11 13:58:07 2005
Sun Dec 11 12:05:44 2005
HARDWARE
--------
Hardware Vendor: Advanced Micro Devices
Model Name: TYAN Tomcat K8E (S2865), AMD Opteron (TM) 154
CPU: AMD Opteron (TM) 154 (939 pin)
CPU MHz: 2800
FPU: Integrated
CPU(s) enabled: 1 core, 1 chip, 1 core/chip
CPU(s) orderable: 1
Parallel: No
Primary Cache: 64KBI + 64KBD on chip
Secondary Cache: 1024KB (I+D) on chip
L3 Cache: N/A
Other Cache: N/A
Memory: 2x512MB, DDR400 CL2
Disk Subsystem: IDE, 160 GB
Other Hardware: None
SOFTWARE
--------
Operating System: SuSE Linux Enterprise Server 9 for AMD64
Compiler: PathScale EKOPath(TM) Compiler Suite, Release 2.3
File System: Linux/ext3
System State: Multi-user, run level 3
NOTES
-----
Tested by Advanced Micro Devices
+FDO: PASS1= -fb_create fbdata PASS2= -fb_opt fbdata
Baseline optimization flags:
C programs: -Ofast +FDO
C++ programs: -Ofast +FDO
Portability Flags:
186.crafty: -DLINUX_i386
252.eon: -DHAS_ERRLIST -DSPEC_CPU2000_LP64
253.perlbmk: -DSPEC_CPU2000_LINUX_I386 -DSPEC_CPU2000_NEED_BOOL
-DSPEC_CPU2000_GLIBC22 -DSPEC_CPU2000_LP64
254.gap: -DSYS_IS_USG -DSYS_HAS_IOCTL_PROTO -DSYS_HAS_TIME_PROTO
-DSYS_HAS_CALLOC_PROTO -DSPEC_CPU2000_LP64
255.vortex: -DSPEC_CPU2000_LP64
Peak Tuning:
164.gzip: -O3 -ipa -WOPT:val=0 -OPT:unroll_size=0 +FDO
175.vpr: -O3 -ipa -m32 +FDO
176.gcc: -O3 -IPA:plimit=10000 -LNO:opt=0 -OPT:goto=off +FDO
181.mcf: -O3 -ipa -IPA:field_reorder=on -m32 +FDO
186.crafty: -Ofast -CG:local_fwd_sched=on -LNO:opt=0 -WOPT:val=0 +FDO
197.parser: -O3 -ipa -m32 -IPA:ctype=on +FDO
252.eon: -Ofast -CG:gcm=off:p2align_freq=1:prefetch=off -IPA:plimit=4000
-OPT:treeheight=on -TENV:X=4:frame_pointer=off -fno-exceptions
-LNO:fu=10:full_unroll_outer=on -GRA:optimize_boundary=on +FDO
253.perlbmk: -O2 -ipa -OPT:Ofast:transform_to_memlib=off
-fno-math-errno -IPA:plimit=10000 +FDO
254.gap: basepeak = true
255.vortex: -Ofast -OPT:goto=off -CG:p2align=on
-GRA:optimize_boundary=on -IPA:min_hotness=120 +FDO
256.bzip2: basepeak = true
300.twolf: -O2 -CG:gcm=off:p2align_freq=100000
-OPT:Ofast:unroll_times_max=8:unroll_size=256:alias=disjoint
-WOPT:mem_opnds=on -m32 +FDO
Corsair CMX512-3200XL memory used in Dual Channel configuration
Memory timings manually set in BIOS: CAS=2, Trcd=2, Tras=5, Trp=2
BIOS rev 3.01
The tested system can be assembled using a standard ATX case and an Antec True 550 watt EPS12V Power Supply