Operating Systems 1 (5/12) - Architectures (Unix)

Operating System Architectures - Unix Beuth Hochschule Summer Term 2014

Operating Systems I PT / FF 2014

Modern UNIX Systems

• System V Release 4 (SVR4) was a major milestone

• AT&T and Sun Microsystems (R.I.P.) combined the previously diverging Unix flavors

• Intention was to provide a uniform platform for commercial UNIX deployment

• Added preemptive kernel, virtual memory concepts, virtual file system support

• Solaris is the successor of Sun's SVR4-based UNIX release

• 4.4BSD was the final release from the University of California, Berkeley

• Meanwhile many successful derivatives, including Mac OS X

• Most modern UNIX kernels are monolithic

• All functional components of the kernel have access to all data and methods

• Loadable modules (object files) that can be linked to / unlinked from the kernel at runtime, stackable

System Programming in Unix

• Unix system interface is a mixture of C library, POSIX, and custom functions

• Linux

• POSIX 1003.1 (mostly) + Standard C library + SVR4 + BSD functions

• Every system call has a platform-dependent symbolic constant (asm-<arch>/unistd.h) and a symbolic name

• Classes: Process management, time-related functions, signal processing, scheduling, kernel modules, file system, memory management, IPC, network, monitoring, security

• Mac OS X

• BSD portion derived from FreeBSD (4.4BSD) + Standard C library + Objective-C specifics

• FreeBSD

• POSIX 1003.1 (mostly) + Standard C library + BSD functions
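The point about platform-dependent system call numbers can be made concrete with a small sketch. It calls getpid once through the usual wrapper and once through the generic libc `syscall()` entry point with a raw number. The helper name `raw_getpid` and the table `GETPID_NUMBER` are illustrative, and the numbers are assumptions taken from the common Linux tables for 64-bit x86 and ARM; check them against the unistd.h of the kernel in use.

```python
import ctypes
import os
import platform
import sys

# Assumed numbers of the getpid system call for 64-bit Linux processes.
# They differ per architecture, which is why libc hides them behind names.
GETPID_NUMBER = {"x86_64": 39, "aarch64": 172}


def raw_getpid():
    """Invoke getpid by number; return None where the table does not apply."""
    if not sys.platform.startswith("linux") or sys.maxsize <= 2**32:
        return None
    number = GETPID_NUMBER.get(platform.machine())
    if number is None:
        return None
    libc = ctypes.CDLL(None, use_errno=True)  # the already loaded C library
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(number))


# Both paths end in the same kernel handler, so the results should agree.
print("wrapper:", os.getpid(), "raw:", raw_getpid())
```

On any platform outside the small table the function returns None instead of guessing a number.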

Unix: Everything Is A File

• "The UNIX Time-Sharing System" - D. M. Ritchie and K. Thompson, 1974

Unix: Everything Is A File

• Hierarchical namespace of special files, ordinary files and directories

• Support for mountable sub trees in one hierarchy

• Today typically known as the Virtual File System (VFS) concept

• Each supported I/O device is associated with at least one special file in /dev

• Read and written like ordinary files, but access leads to device interaction

• Protection relies on filesystem mechanisms

• "Everything can have a file descriptor" is a better description than "Everything is a file" [Brown2007]

• /proc

• Special file system mounted by the kernel at boot time (since SVR4 / BSD)

• Representation of kernel information as files, possibility for interaction between user mode and kernel mode (e.g. ps tool)
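As a minimal sketch of the two ideas on this slide, the following code treats a device special file and a /proc entry with ordinary file operations. The helper names are illustrative; the `/proc/<pid>/status` layout with a `Pid:` field is the Linux one and is an assumption on other systems, where the second helper simply returns None.

```python
import os
import stat


def is_char_device(path):
    """True if path is a character special file such as /dev/null."""
    return stat.S_ISCHR(os.stat(path).st_mode)


def proc_status_field(field, pid="self"):
    """Read one 'Key: value' field from /proc/<pid>/status (Linux layout)."""
    path = f"/proc/{pid}/status"
    if not os.path.exists(path):
        return None
    with open(path) as status:
        for line in status:
            key, _, value = line.partition(":")
            if key == field:
                return value.strip()
    return None


# Writing to the special file uses the normal file API; the driver discards it.
with open("/dev/null", "wb") as sink:
    sink.write(b"discarded by the device driver")

print(is_char_device("/dev/null"), proc_status_field("Pid"))
```

Tools like ps obtain their information in a similar way, by reading the per-process entries below /proc.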

Linux

• Unix variant initially targeting the IBM PC, now broadly adopted

• Wide range of supported platforms, source code available as 'free' software

• "Free as in speech, not as in beer" [FSF]

• Monolithic kernel compiled per platform

• /linux/arch/* directory in the source code tree

• Kernel is extensible at run-time by loadable kernel modules (LKM)

• API / ABI for such modules is not stable - module binaries must match the kernel version being executed

• Support for versioning of kernel modules and 'tainting' of the kernel by non-GPL drivers

• Graphics system traditionally runs completely in user mode

Linux Kernel Components

Anatomy of a Linux System Call [Mauerer]

• Handler implementations in portable C code ("sys_" prefix), spread across the sources

• Example: sys_getuid(void) in kernel/timer.c

• Kernel code performs mode switch and conversion of function parameters

• Processor registers store system call parameters and system call number (architecture-specific assembler code)

• errno.h and errno-base.h define positive error codes; the kernel returns them as negative numbers to indicate an error

Call path: Application → libc → Kernel → Kernel Handler

Architecture-specific entry into kernel mode:

• int $0x80 call gate (IA32)

• SYSENTER / SYSEXIT (IA32, Pentium II and later)

• call_pal PAL_callsys (Alpha)

• sc (PowerPC)

• syscall (AMD64)
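The error convention from the bullet list can be observed from user space. The kernel handler returns a negated code such as -ENOENT; the libc wrapper turns that into a return value of -1 and stores the positive code in errno. The sketch below calls the libc `access()` wrapper directly through ctypes; the helper name `try_access` and the path are made up for illustration.

```python
import ctypes
import errno
import os

# use_errno=True lets ctypes capture the errno value set by the libc wrapper.
libc = ctypes.CDLL(None, use_errno=True)
libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
libc.access.restype = ctypes.c_int


def try_access(path):
    """Return (wrapper return value, errno) for access(path, F_OK)."""
    ctypes.set_errno(0)
    rc = libc.access(path.encode(), os.F_OK)
    return rc, ctypes.get_errno()


rc, err = try_access("/no/such/path/for/this/demo")  # hypothetical path
print(rc, errno.errorcode.get(err), os.strerror(err))
```

The same failure shows up in the strace output of this lecture as `= -1 ENOENT (No such file or directory)`.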

Anatomy of a Linux System Call

• strace tool, based on ptrace system call

• Interception on system call boundary

• Access to process address space possible

• Hardware-supported breakpoints possible

• Mac OS X: dtruss

• Solaris: truss

troeger@dfw:~$ strace -f -T pwd
execve("/bin/pwd", ["pwd"], [/* 14 vars */]) = 0 <0.000279>
brk(0) = 0x80d5000 <0.000012>
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000018>
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7761000 <0.000014>
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000015>
open("/etc/ld.so.cache", O_RDONLY) = 3 <0.000016>
fstat64(3, {st_mode=S_IFREG|0644, st_size=48165, ...}) = 0 <0.000012>
mmap2(NULL, 48165, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7755000 <0.000014>
close(3) = 0 <0.000011>
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory) <0.000015>
open("/lib/i686/cmov/libc.so.6", O_RDONLY) = 3 <0.000019>
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0n\1\0004\0\0\0"..., 512) = 512 <0.000013>
fstat64(3, {st_mode=S_IFREG|0755, st_size=1327556, ...}) = 0 <0.000012>
mmap2(NULL, 1337704, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb760e000 <0.000014>
mprotect(0xb774e000, 4096, PROT_NONE) = 0 <0.000017>
mmap2(0xb774f000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x140) = 0xb774f000 <0.000018>
mmap2(0xb7752000, 10600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7752000 <0.000015>
close(3) = 0 <0.000012>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb760d000 <0.000013>
set_thread_area({entry_number:-1 -> 6, base_addr:0xb760d8d0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 <0.000012>
mprotect(0xb774f000, 8192, PROT_READ) = 0 <0.000015>
mprotect(0xb777f000, 4096, PROT_READ) = 0 <0.000014>
munmap(0xb7755000, 48165) = 0 <0.000018>
brk(0) = 0x80d5000 <0.000011>
brk(0x80f6000) = 0x80f6000 <0.000012>
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3 <0.000023>
fstat64(3, {st_mode=S_IFREG|0644, st_size=108793664, ...}) = 0 <0.000011>
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb740d000 <0.000014>
mmap2(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0xf37) = 0xb7760000 <0.000014>
close(3) = 0 <0.000012>
getcwd("/net/pao/export/home/staff/troeger", 4096) = 35 <0.000016>
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0 <0.000011>
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775f000 <0.000013>
write(1, "/net/pao/export/home/staff/troeg"..., 35/net/pao/export/home/staff/troeger
) = 35 <0.000016>
close(1) = 0 <0.000011>
munmap(0xb775f000, 4096) = 0 <0.000016>
close(2) = 0 <0.000011>
exit_group(0) = ?
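The trace above has a regular shape: call name, arguments, return value, and with -T the time spent in the call in angle brackets. A minimal sketch of how such lines could be evaluated is shown below. The regular expression and the helper names `parse_call` and `time_per_call` are illustrative and only cover the simple one-line form seen here. Lines that do not fit are skipped, such as the shell prompt or the write call that is split by the program's own output.

```python
import re
from collections import defaultdict

# name(args) = return-value [error text] [<seconds>]
CALL = re.compile(r'^(\w+)\((.*)\)\s+=\s+(\S+)(.*?)\s*(?:<(\d+\.\d+)>)?$')


def parse_call(line):
    """Return (name, return value as text, seconds or None), or None."""
    match = CALL.match(line.strip())
    if match is None:
        return None
    name, _args, ret, _note, secs = match.groups()
    return name, ret, float(secs) if secs else None


def time_per_call(lines):
    """Sum the -T durations per system call name."""
    totals = defaultdict(float)
    for line in lines:
        parsed = parse_call(line)
        if parsed and parsed[2] is not None:
            totals[parsed[0]] += parsed[2]
    return dict(totals)


sample = [
    'open("/etc/ld.so.cache", O_RDONLY) = 3 <0.000016>',
    'close(3) = 0 <0.000011>',
    'close(3) = 0 <0.000012>',
    'exit_group(0) = ?',
]
print(time_per_call(sample))
```

Feeding the whole trace through `time_per_call` gives a rough picture of which calls dominate the start-up of even a trivial program like pwd.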

Linux Modules

• Support for dynamically loaded and linked binary kernel parts - modules

• Reduces size of the compiled monolithic kernel binary

• Allows driver integration without re-compilation of the kernel

• Also solves some GPL licensing issues with modern hardware drivers

• Modules are relocatable object files that are linked into the kernel

• Kernel has a table of registered functions with their addresses (/proc/kallsyms)

• Dynamic linker (ld.so) can load and re-locate the code accordingly (more later)

• modprobe tool, relies on insmod tool which uses the init_module system call

• Considers module dependencies determined by depmod utility (modules.dep)

• Kernel can trigger kmod daemon to automatically load a missing module (request_module)
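The symbol table mentioned above can be inspected as a text file. The sketch below assumes the usual /proc/kallsyms line layout of hexadecimal address, type letter, symbol name, and an optional module name in square brackets; the helper names and the sample symbols in the comments are illustrative. On many systems unprivileged readers only see zeroed addresses.

```python
def parse_kallsyms_line(line):
    """Split one line into (address, type, name, module or None)."""
    parts = line.split()
    if len(parts) < 3:
        return None
    address = int(parts[0], 16)
    module = parts[3].strip("[]") if len(parts) > 3 else None
    return address, parts[1], parts[2], module


def lookup(symbol, path="/proc/kallsyms"):
    """Find the first entry for a symbol; None if absent or unreadable."""
    try:
        with open(path) as table:
            for line in table:
                entry = parse_kallsyms_line(line)
                if entry and entry[2] == symbol:
                    return entry
    except OSError:
        return None
    return None


# Symbols that belong to a loaded module carry the module name in brackets.
print(parse_kallsyms_line("ffffffffc0a1b000 t demo_probe\t[demo_mod]"))
print(lookup("init_module"))
```

When a module is linked into the running kernel, its unresolved references are matched against exactly this kind of name-to-address table.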

Linux Modules

• Versioning

• (Binary) drivers have problems with updated kernel versions

• Optional solution is to generate signature checksums for kernel functions (genksyms)

• Module compilation stores the checksums of all used kernel functions in the module

• Kernel may become "tainted" if a module uses a symbol without demanding a specific version
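The versioning idea can be illustrated with a strongly simplified model. It is not the genksyms algorithm itself: here a CRC32 is computed over a whitespace-normalized prototype string, the "module" records the checksums it was built against, and the "loader" rejects symbols whose checksum has changed. All function names and prototypes are hypothetical.

```python
import zlib


def signature_checksum(prototype):
    """Checksum of a function prototype, ignoring whitespace differences."""
    normalized = " ".join(prototype.split())
    return zlib.crc32(normalized.encode())


def build_module(used_symbols, kernel_exports):
    """At compile time: record the checksum of every kernel function used."""
    return {name: signature_checksum(kernel_exports[name]) for name in used_symbols}


def check_module(required, kernel_exports):
    """At load time: list symbols that are missing or changed signature."""
    mismatches = []
    for name, checksum in required.items():
        prototype = kernel_exports.get(name)
        if prototype is None or signature_checksum(prototype) != checksum:
            mismatches.append(name)
    return mismatches


kernel_v1 = {"demo_alloc": "void *demo_alloc(unsigned long size)"}
kernel_v2 = {"demo_alloc": "void *demo_alloc(unsigned long size, int flags)"}

module = build_module(["demo_alloc"], kernel_v1)
print(check_module(module, kernel_v1), check_module(module, kernel_v2))
```

The benefit of checksums over plain version numbers is that a module stays loadable across kernel updates as long as the interfaces it actually uses are unchanged.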

Mac OS X / Darwin

• The core of Mac OS X is Darwin (its kernel is XNU)

• Kernel environment derived from FreeBSD + Mach

• Available as open source

• Mach components: Low-level functionality (IPC, SMP, virtual memory, paging, modularity)

• I/O Kit: Framework for simplified driver development

• Network Kernel Extensions (NKE)

• Add / remove kernel modules for networking without interruption or re-compilation

(C) developer.apple.com

Mac OS X / Darwin

• Switch between kernel and user mode is called boundary crossing

• Darwin supports several methods

• Mach IPC / RPC: low-level, low-latency, low bandwidth

• Mach Interface Generator (MIG) implements C API from interface description

• RPC routines are grouped in subsystems (e.g. virtual memory)

• BSD syscall: not pluggable, only intended for filesystem and networking

• BSD sysctl / sysctlbyname: supersedes the syscall interface, pluggable

• Typically used to read / write kernel variables

• BSD ioctl: sends commands directly to device drivers (/dev)

• Classical mechanism from BSD
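Of the boundary crossing methods listed, ioctl is the easiest to try from a script. The sketch below sends the FIONREAD command to a file descriptor and reads back how many bytes are waiting. To stay runnable without special hardware it uses a pipe instead of a device file in /dev, which is an assumption made for the demonstration; the call has the same shape for a driver. The helper names are illustrative.

```python
import fcntl
import os
import struct
import termios


def bytes_pending(fd):
    """Ask the kernel object behind fd how many bytes can be read right now."""
    # FIONREAD fills in a C int: pass a buffer of that size, get a filled copy.
    result = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", result)[0]


def demo():
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, b"hello")
        return bytes_pending(read_end)
    finally:
        os.close(read_end)
        os.close(write_end)


print(demo())
```

The command code and the layout of the argument buffer are a private contract between caller and driver, which is why ioctl is flexible but hard to keep type-safe.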

Summary

• Modern operating systems tackle three major tasks

• Hide complexity and heterogeneity of the underlying hardware

• Manage system resources

• Ensure flexibility, portability and security through layering

• Fundamental concepts are processes and virtual memory

• All operating systems use ring protection support from hardware to implement user mode and kernel mode

• Applications use system API to access kernel-mode functionality

• Operating systems have pluggability support for their hardware device drivers

• All operating systems have common roots in history