Computer System in Layman

Post on 07-Apr-2022


Presented by Ng Kheng Ghee, 01 Sept 2021

Today’s Content

● Processor, Core and CPU
● Memory
● Storage
● Network
● WorkStation vs Server vs HPC

Processor, Core, and CPU

The electronic circuitry that executes instructions comprising a computer program.

In layman's terms, the CPU is like a worker in your computer. A computer instruction is like a work instruction given to the worker. A computer program is like a compiled list of work instructions.

Processor, Core, and CPU

Processor - The physical processor chip installed in a socket on the motherboard.

Core - A processing core inside the processor.

CPU - A logical CPU in the processor, as seen by the operating system.

Each processor can consist of multiple cores.

The total number of CPUs available in a single server is based on:

● Number of processors (sockets) * number of cores per processor * number of threads per core (multithreading)

[Figure: output of the “lscpu” command on servers in the epyc partition]

● CPUs = Number of sockets * Cores per socket * Threads per core. This affects the performance of parallel processing.
● CPU frequency - Affects the performance of serial processing.
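As a rough sketch of the formula above, the snippet below multiplies the three numbers that lscpu reports. The function name and the example values (2 sockets, 24 cores per socket, 2 threads per core) are assumptions chosen for illustration, not the actual output from the epyc partition.

```python
def logical_cpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Return the number of logical CPUs the operating system would see."""
    return sockets * cores_per_socket * threads_per_core


# Hypothetical dual-socket server with 24-core processors and
# 2-way multithreading (values assumed for illustration only).
total = logical_cpus(sockets=2, cores_per_socket=24, threads_per_core=2)
print(total)  # 96
```

With multithreading switched off (1 thread per core), the same hypothetical server would show 48 CPUs.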

CPU cache

What is a CPU cache?

Cache = memory

In layman's terms, cache/memory is a space where your workers place their data and instructions.

Processor, Core, and CPU

● A hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost to access data from the main memory.

● A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations.

● For more information about cache mapping, please read about direct mapping and associative mapping.

Processor, Core, and CPU

What happens when the CPU tries to read from or write to memory?

1. The CPU looks for the data in the CPU cache.

2. On a cache hit, the CPU performs the read/write operation on the cache.

3. On a cache miss, the data is copied from the main memory into the cache, and the read/write operation is then performed on the cache.
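The three steps above can be sketched as a toy model. This is only an illustration: the class name TinyCache is hypothetical, and the eviction rule (least recently used, with write-back to main memory) is an assumption, since the slides do not name a policy.

```python
from collections import OrderedDict


class TinyCache:
    """Toy model of a CPU cache sitting in front of main memory."""

    def __init__(self, main_memory: dict, capacity: int):
        self.main_memory = main_memory
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> value, least recently used first
        self.hits = 0
        self.misses = 0

    def _load(self, address):
        """Look in the cache first; on a miss, copy the data from main memory."""
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                # Assumed policy: evict the least recently used entry and
                # write its value back to main memory.
                old_address, old_value = self.lines.popitem(last=False)
                self.main_memory[old_address] = old_value
            self.lines[address] = self.main_memory[address]

    def read(self, address):
        self._load(address)
        return self.lines[address]

    def write(self, address, value):
        self._load(address)
        self.lines[address] = value  # the write is performed on the cache


memory = {0: "a", 1: "b", 2: "c"}
cache = TinyCache(memory, capacity=2)
cache.read(0)        # miss: copied from main memory
cache.read(0)        # hit
cache.write(1, "B")  # miss, then the write happens on the cache
print(cache.hits, cache.misses)  # 1 2
```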

Layman presentation (Single Core)

[Diagram: a CPU with its L1 cache (32k) and L2 cache (512k)]

Notes:

● Architecture for CPU model: AMD EPYC 7F72 Processor
● CPU = Worker
● Cache = Working Space

Layman presentation (Cores with L3 cache)

[Diagram: CPUs, each with L1 cache (32k) and L2 cache (512k), together with an L3 cache (16MB)]

Layman presentation (Single Processor)

[Diagram: a single processor, including the Memory/IO die]

CPU Architecture for AMD EPYC Zen2 and Zen3

Processor, Core, and CPU - Multithreading

What is multithreading?

Multithreading is the ability of a CPU to provide multiple threads of execution concurrently, supported by the operating system.

Thread - A sequence of instructions that can be scheduled and executed independently.

How does multithreading work?

Multiple threads can be executed on a single core (in this simplified picture, not at the same instant). With multithreading, a single physical core is seen by the operating system as 2 logical CPUs.

Example of multithreading:

Thread A makes an IO request. While that request is pending, the processor core can execute thread B instead of sitting idle.

[Diagram: execution of threads A and B in a process over time, with no multithreading vs. with multithreading]
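The IO example can be imitated in software. The sketch below is an analogy only: it uses operating-system threads from Python and time.sleep as a stand-in for an IO wait, so it does not measure hardware multithreading. All function names are hypothetical.

```python
import threading
import time


def fake_io_job(wait_seconds: float) -> None:
    """Stand-in for a thread that spends its time waiting on IO."""
    time.sleep(wait_seconds)


def run_one_after_another(waits) -> float:
    """Run each job to completion before starting the next one."""
    start = time.perf_counter()
    for wait in waits:
        fake_io_job(wait)
    return time.perf_counter() - start


def run_overlapped(waits) -> float:
    """Start all jobs as threads so that their waiting times overlap."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io_job, args=(w,)) for w in waits]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


waits = [0.2, 0.2]  # thread A and thread B
serial = run_one_after_another(waits)
overlapped = run_overlapped(waits)
print(f"one after another: {serial:.2f}s, overlapped: {overlapped:.2f}s")
```

The overlapped run should take roughly as long as the longest single wait, because thread B makes progress while thread A is waiting.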

Processor, Core, and CPU - Multithreading


Does multithreading improve the performance of my HPC jobs?

The answer is NOT ALWAYS, and sometimes it even DEGRADES your job performance.

It depends on the nature of your computational job. Jobs with many IO requests or cache misses might see a performance gain.

Test your job with and without multithreading to compare its performance.

Processor, Core, and CPU

Tips for writing programs/code for HPC jobs and selecting the appropriate resources for HPC jobs.

1. Understand the execution model of your application/program: serial or parallel.

2. Parallelize your program/code whenever possible to utilize the available resources.

3. A single OS process can run on multiple CPUs. For the best performance, ensure every CPU executes only one thread at a time. Do not start more threads than the number of CPUs requested.

4. Ensure the parallel section of your program/code/application works. Always try a small use case before submitting large jobs that require a large number of CPUs.

5. Determine whether your program/code/application performs well with multithreading enabled.
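Tip 3 can be enforced in code. The sketch below is one possible approach, under the assumption that the cluster uses the Slurm scheduler, which sets the SLURM_CPUS_PER_TASK environment variable when --cpus-per-task is requested. The slides do not say which scheduler is in use, and both function names are hypothetical.

```python
import os


def allocated_cpus(env=None) -> int:
    """Best-effort guess at how many CPUs this job may use.

    Assumption: the job runs under Slurm with --cpus-per-task set, so
    SLURM_CPUS_PER_TASK is available. Otherwise fall back to the number
    of CPUs the operating system reports.
    """
    env = os.environ if env is None else env
    value = env.get("SLURM_CPUS_PER_TASK")
    if value and value.isdigit() and int(value) > 0:
        return int(value)
    return os.cpu_count() or 1


def thread_count(wanted: int, env=None) -> int:
    """Never start more threads than the CPUs available to the job."""
    return max(1, min(wanted, allocated_cpus(env)))


# A program that would like 64 threads inside a job that requested 8 CPUs.
print(thread_count(64, env={"SLURM_CPUS_PER_TASK": "8"}))  # 8
```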

Memory

Computer Memory - the storage space in the computer, where data is to be processed and instructions required for processing are stored.

In layman's terms, memory is like a bigger store room where your workers place their data and instructions.

Memory

(Repeat) What happens when the CPU tries to read from or write to memory?

1. The CPU looks for the data in the CPU cache.

2. On a cache hit, the CPU performs the read/write operation on the cache.

3. On a cache miss, the data is copied from the main memory into the cache, and the read/write operation is then performed on the cache.

Memory

What happens when the CPU tries to read from or write to memory?

[Diagram: L3 cache (16MB), Memory/IO, and memory]

Memory

Shared memory vs Distributed memory

Shared memory

Refers to a block of RAM that can be accessed by several different CPUs in a multiprocessor computer system.

Usually refers to the memory in a single node/server.

Shared Memory - Uniform Memory Access

[Diagram: eight RAM modules connected through the system bus]

What is Non-uniform Memory Access (NUMA)?

Memory

Non-uniform Memory Access

● A computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor.

● Under NUMA, a processor can access its own local memory faster than non-local memory.

Shared Memory

[Diagram: eight RAM modules and the system bus, grouped into two NUMA nodes]

Memory

Shared memory vs Distributed memory

Distributed memory

Refers to a multiprocessor computer system in which each processor has its own private memory.

Usually refers to the memory in multiple nodes/servers, with each node/server having its own memory.

Distributed Memory

[Diagram: several nodes, each with eight RAM modules and a NIC, connected to one another through the network]

Memory

Latency

Memory type                        Latency
CPU cache - L1, L2, L3 cache       Fastest
Local RAM - NUMA node              Fast
Local RAM - UMA                    Slow
Remote RAM - distributed memory    Slowest

Memory

Parallel programming model

● Shared memory - OpenMP
● Distributed memory - Message Passing Interface (MPI):
○ Open MPI
○ Intel MPI
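The difference between the two models can be sketched without OpenMP or MPI. The snippet below is an analogy only: it uses Python threads and a queue as stand-ins, and all names in it are hypothetical. In the shared style every worker writes into the same list. In the message-passing style each worker holds a private copy of its data and reports back only by sending a message.

```python
import queue
import threading

data = list(range(1, 101))
chunks = [data[0:50], data[50:100]]

# Shared-memory style: every worker writes into the same list.
shared_results = [0] * len(chunks)


def shared_worker(index: int) -> None:
    shared_results[index] = sum(chunks[index])


# Message-passing style: each worker gets a private copy of its chunk
# and communicates only by putting a message in the mailbox.
mailbox = queue.Queue()


def message_worker(private_chunk: list) -> None:
    mailbox.put(sum(private_chunk))


def run(target, args_list) -> None:
    """Start one thread per argument tuple and wait for all of them."""
    threads = [threading.Thread(target=target, args=args) for args in args_list]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run(shared_worker, [(0,), (1,)])
run(message_worker, [(list(chunk),) for chunk in chunks])

shared_total = sum(shared_results)
message_total = sum(mailbox.get() for _ in chunks)
print(shared_total, message_total)  # 5050 5050
```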

Memory

Tips for writing programs/code for HPC jobs and selecting the appropriate resources for HPC jobs.

1. Understand the memory requirements of your jobs and plan accordingly.

2. Enabling/implementing OpenMP in your program/code in a shared memory environment might improve your job performance.

3. Consider using MPI for a distributed memory environment.

4. If the CPUs and memory required can fit in a single node, request a single node only. Do not request multiple nodes, as the added latency might slow down your job.

5. Always try a small use case before submitting large jobs that require a large amount of memory.
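Tip 4 comes down to simple arithmetic. The sketch below estimates the smallest number of identical nodes that covers a job's CPU and memory needs. The function name and the node size (48 CPUs, 256 GB of RAM) are assumptions for illustration, so check the actual node specifications of your cluster.

```python
import math


def nodes_needed(cpus: int, memory_gb: float, node_cpus: int, node_memory_gb: float) -> int:
    """Smallest number of identical nodes that covers both CPU and memory needs."""
    by_cpu = math.ceil(cpus / node_cpus)
    by_memory = math.ceil(memory_gb / node_memory_gb)
    return max(1, by_cpu, by_memory)


# Hypothetical node size: 48 CPUs and 256 GB of RAM (assumed for illustration).
print(nodes_needed(cpus=32, memory_gb=200, node_cpus=48, node_memory_gb=256))  # 1
print(nodes_needed(cpus=32, memory_gb=600, node_cpus=48, node_memory_gb=256))  # 3
```

In the first case everything fits in one node, so a single-node request avoids network latency. In the second case the memory requirement alone forces the job across three nodes.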

Storage

Storage is also a form of memory, and consists of drives (HDD or SSD). The data stored on the drives is persistent.

In layman's terms, storage is a large warehouse that keeps your persistent data.

Storage

HDD vs SSD

Hard Disk Drive (HDD)

● Magnetic storage
● Rotating platter
● Inexpensive
● Slow compared to SSD

Storage

HDD vs SSD

Solid State Drive (SSD)

● Integrated circuit
● Flash memory
● Expensive
● Fast compared to HDD

How storage is connected to the CPU and memory (Simplified)

[Diagram: L3 cache, Memory/IO, memory, and storage, connected through the system bus]

Storage

Storage Type:

● Local Storage
○ Connected using SAS/SATA cables inside a node.
○ Lower latency.
○ Usually small in capacity.
● Network Attached Storage (NAS)
○ Connected using network cables.
○ Higher latency.
○ Can be large in capacity.

Storage

File system

A method and data structure that the operating system uses to control how data is stored and retrieved.

In layman's terms, the file system is like the warehouse manager that helps you manage your data on the storage drive.

File system type:

● Local file system
○ NTFS (Windows)
○ FAT32
○ EXT4 (Linux)
○ XFS (Linux)
○ HFS+ (Apple)
● Distributed file system
○ GPFS
○ Lustre
○ Ceph

Storage

Other Storage Technology:

● Redundant Array of Inexpensive Disks (RAID)
● Network File System (NFS)

Network

Computer Network

A set of computers sharing resources located on or provided by network nodes.

Network

Network Technology in HPC

Common Network Technology in HPC

● Ethernet - TCP/IP
● InfiniBand

Supported link speed

● 1, 10, 25, 40, 50, 100 Gbps
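To get a feel for what these link speeds mean, the sketch below computes the ideal time to move a data set across a link. It is a back-of-the-envelope estimate that ignores latency and protocol overhead. The function name and the 100 GB data set size are assumptions for illustration.

```python
def transfer_seconds(size_gigabytes: float, link_gbps: float) -> float:
    """Ideal time to move data over a link, ignoring latency and protocol overhead."""
    size_gigabits = size_gigabytes * 8  # 1 byte = 8 bits
    return size_gigabits / link_gbps


# Hypothetical 100 GB data set over each of the listed link speeds.
for speed in (1, 10, 25, 40, 50, 100):
    print(f"{speed:>3} Gbps: {transfer_seconds(100, speed):.0f} s")
```

Real transfers are slower than these figures, but the ratio between link speeds stays a useful first estimate.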

Network

Ethernet VS Infiniband

● Ethernet (TCP/IP)
○ Common implementation in networks
○ Requires CPU overhead
○ Higher latency

Network

Ethernet VS Infiniband

● InfiniBand
○ Application dependent
○ Hardware dependent
○ Remote Direct Memory Access (RDMA)
○ Lower latency

Workstation, Server, HPC Cluster

WorkStation

Server

HPC Cluster

What is the difference?

WorkStation, Server, and HPC Cluster

Processor
● Workstation: Usually a single processor.
● Compute Server: Multiprocessor.
● HPC Cluster: Consists of multiple servers with multiple processors.

Memory
● Workstation: 8-64GB RAM (depends on the processor model); may get more.
● Compute Server: 16GB-1TB RAM (depends on the processor model).
● HPC Cluster: Usually more than 1TB of memory available.

Storage
● Workstation: Local storage, local file system.
● Compute Server: Local storage or NAS, local file system.
● HPC Cluster: Local storage, NAS, distributed file system.

Network
● Workstation: Single network interface card (NIC) - 1Gbps; WiFi.
● Compute Server: One or more network interface cards, possibly with mixed link speeds. Can support up to 100Gbps.
● HPC Cluster: One or more network interface cards per server, possibly with mixed link speeds. Can support up to 100Gbps. Multiple servers interconnected.

Feel free to ask me any questions.

QnA

Thank you

If you have any further questions, please feel free to drop me an email:

ngkhengghee@um.edu.my

Or log in to our service desk and create a request.

Please note that there will be another training session at 3pm.