1
System Programming
Chapters 1 & 2
2
UNIX History
• Developed in the late 1960s and 1970s at Bell Labs
• UNICS – a pun on MULTICS (Multiplexed Information and Computing Service), which was supposed to support 1000 online users but handled only a few (barely 3). (MULTI -> UNiplexed.)
• Thompson writes the first version of UNICS in assembler for a PDP-7 in one month. It contains a new type of file system plus a kernel, shell, editor, and assembler (one week each).
• 1969: Thompson writes the interpretive language B, based on BCPL; Ritchie improves on B and calls the result “C”
• 1972: UNIX is rewritten in C to facilitate porting
3
UNIX History (cont)
• 1973 UNIX philosophy developed:
  – Write programs that do one thing and do it well
  – Write programs that work together
  – Write programs that handle text streams, because that is the universal interface
4
UNIX
[Figure: UNIX family tree]
• Bell Labs Research: First Edition, Sixth Edition, Seventh Edition
• Berkeley Software Distributions: 1BSD, …, 4.0BSD, 4.3BSD, 4.3BSD Tahoe, 4.3BSD Reno, 4.4BSD
• UNIX System Laboratories (USG/USDL/ATTIS/DSG/USO/USL): System V Release 2, 3; UNIX System V Release 4
• Also shown: XENIX, SunOS, Solaris, Solaris 2, Mach, Chorus
* POSIX.1 (IEEE, ISO) standard!
5
UNIX Today
• Supports many users running many programs at the same time, all sharing the same computer system
• Information sharing (which was Ken’s original goal in 1969)
• Geared towards facilitating the job of creating new programs
• UNIX system
  – Sun: SunOS
• UNIX-compatible systems
  – Solaris; SGI: IRIX; FreeBSD; Hewlett Packard: HP-UX; Apple: OS X (Darwin); GNU: Linux
6
UNIX Architecture
[Figure: UNIX layered architecture, from the users down to the hardware]
• Users
• User interface: shells, compilers, X, application programs, etc.
• System call interface
• UNIX kernel: CPU scheduling, signal handling, virtual memory, paging, swapping, file system, disk drivers, caching/buffering, etc.
• Kernel interface to the hardware
• Hardware: terminal controller, terminals, physical memory, device controller, devices such as disks, memory, etc.
7
Introduction
• Objective
  – Briefly describe services provided by various versions of the UNIX operating system.
• Logging In
  – /etc/passwd – local machine or NIS DB
    • root:x:0:1:Super-User:/root:/bin/tcsh
    • Login name, encrypted password, numeric user ID, numeric group ID, comment, home directory, shell program
  – /etc/shadow – holds the encrypted password when “x” is shown in the password field
8
Introduction
• Shell
  – Command interpreters
    • Built-in commands, e.g., umask
    • (External) commands, e.g., ls
[Figure: the shell process calls fork; the child process calls execve() and later exit, remaining a zombie process until the shell process collects it with wait.]
9
Introduction
• Shells
  – Bourne shell, /bin/sh
    • Steve Bourne at Bell Labs
  – C shell, /bin/csh
    • Bill Joy at Berkeley
    • Command-line editing, history, job control, etc.
  – KornShell, /bin/ksh
    • David Korn (successor of Bourne shell)
    • Command-line editing, job control, etc.
  – .cshrc (C shell startup file)
10
Introduction
• Filesystem
  – A hierarchical arrangement of directories and files, starting in root /
• File
  – No / or null char in filenames
  – . and ..
  – BSD: 255-char filenames (14 in the past)
  – Attributes – stat()
    • Type, size, owner, permissions, modification/access time, etc.
• Directory
  – Files with directory entries, e.g., { (filename, inode) }
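The attributes listed above can be read with stat(). The following is a minimal sketch, not the textbook's program: the helper name print_attributes and its return convention (1 for a directory, 0 for anything else, -1 on error) are assumptions made for illustration.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: print a few attributes of `path`.
 * Returns 1 if path is a directory, 0 if it is something else,
 * and -1 if stat() fails. */
int print_attributes(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) < 0)
        return -1;

    printf("type:        %s\n",
           S_ISDIR(sb.st_mode) ? "directory" :
           S_ISREG(sb.st_mode) ? "regular file" : "other");
    printf("size:        %lld bytes\n", (long long)sb.st_size);
    printf("owner uid:   %ld\n", (long)sb.st_uid);
    printf("permissions: %03o\n", (unsigned)(sb.st_mode & 0777));
    printf("modified:    %ld (seconds since the Epoch)\n", (long)sb.st_mtime);

    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

Calling print_attributes("/") from a main() should report a directory, since the root of the hierarchy always exists.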
11
Introduction – Files and Dir
• File
  – A sequence of bytes
• Directory
  – A file that includes info on how to find other files.
[Figure: example directory tree]
/
  vmunix
  dev: console, lp0, …
  bin: csh, …
  lib: libc.a, …
  usr: include, …
  etc: passwd, …
* Use command “mount” to show all mounted file systems!
12
Introduction
• Path name
  – Absolute path name
    • Starts at the root / of the file system
    • /user/john/fileA
  – Relative path name
    • Starts at the “current directory”, which is an attribute of the process accessing the path name.
    • ./dirA/fileB
• Links
  – Symbolic link – 4.3BSD
    • A file containing the path name of another file; it can cross file-system boundaries. (home/prof/cshih> ln -s ../cshih cshih-test)
  – Hard link
    • . or ..
13
Introduction
• Directories
  – /vmunix - binary boot image of UNIX
  – /dev - device special files, e.g., /dev/console
  – /bin - binaries of UNIX system programs
    • /usr/ucb - written by Berkeley instead of AT&T
    • /usr/local/bin - written at the local site
  – /lib - library files, e.g., those for C
  – /user - directories for users, e.g., /user/john
  – /etc - administrative files and programs, e.g., passwd
  – /tmp - temporary files
14
Introduction
• Program 1.1 – Page 5
  – List all the files in a directory
• Note
  – “apue.h”, err_sys() and err_quit() in Appendix B
  – opendir(), readdir(), closedir()
    • No ordering of directory entries
• Working directory
  – Goes with each process
• Home Directory
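A directory listing in the spirit of Program 1.1 can be sketched as below. This is not the book's listing: it avoids “apue.h”, and the function name list_directory is an assumption. Entries are printed in whatever order readdir() returns them.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: print every entry name in `dirname`.
 * Returns the number of entries printed, or -1 if the directory
 * cannot be opened. */
int list_directory(const char *dirname)
{
    DIR *dp = opendir(dirname);
    struct dirent *entry;
    int count = 0;

    if (dp == NULL) {
        perror(dirname);        /* report why opendir() failed */
        return -1;
    }
    while ((entry = readdir(dp)) != NULL) {
        printf("%s\n", entry->d_name);
        count++;
    }
    closedir(dp);
    return count;
}
```

Passing "." lists the working directory of the calling process, which ties the example back to the “working directory” bullet.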
15
Introduction – Input/Output
• Operations
  – open, close, read, write, truncate, lseek, dup, rename, chmod, chown, fcntl, ioctl, mkdir, chdir (cd), opendir, readdir, closedir, etc.
• File descriptor
[Figure: read(4, …) indexes the per-process table of opened files; each entry points into the system open file table, which points into the in-core i-node list. In-core i-nodes are synced with the on-disk i-nodes, which locate the data blocks.]
16
Introduction
• File descriptor
  – Standard input (stdin), standard output (stdout), standard error (stderr)
  – Connected to the terminal if nothing special is done.
• I/O redirection
  – ls > file.list
• Unbuffered I/O: open, close, read, write, lseek
  – Program 1.2 – Page 8
  – Copies stdin to stdout
  – STDIN_FILENO, STDOUT_FILENO in <unistd.h>, POSIX.1
  – ls | a.out > datafile
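The unbuffered copy loop can be sketched as follows. This is an illustrative variant rather than Program 1.2 itself: the function name copy_fd and the BUFFSIZE value are assumptions, and the loop also retries on EINTR and on short writes, which the textbook version may not do.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

#define BUFFSIZE 4096   /* assumed block size, chosen for illustration */

/* Hypothetical helper: copy everything readable on in_fd to out_fd
 * with read() and write(). Returns the number of bytes copied,
 * or -1 on error. */
ssize_t copy_fd(int in_fd, int out_fd)
{
    char buf[BUFFSIZE];
    ssize_t n, total = 0;

    while ((n = read(in_fd, buf, sizeof buf)) != 0) {
        ssize_t off = 0;

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        while (off < n) {       /* write() may transfer fewer bytes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

A main() that calls copy_fd(STDIN_FILENO, STDOUT_FILENO) behaves like the a.out in the pipeline shown on the slide.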
17
Introduction
• Advantages of standard I/O functions such as fgets() and printf()
  – No need to worry about the optimal block size: a buffered interface
  – Handling of line input
• Misc.
  – <stdio.h>
  – Figure 1.5 – Page 9
    • Copy stdin to stdout using standard I/O
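The same copy can be written with the standard I/O library, which hides the block size behind its own buffering. A minimal sketch follows; the name copy_lines and the MAXLINE value are assumptions for illustration.

```c
#include <stdio.h>

#define MAXLINE 4096    /* assumed maximum chunk length */

/* Hypothetical helper: copy `in` to `out` with fgets()/fputs().
 * Returns the number of fgets() chunks copied (one per line when
 * lines are shorter than MAXLINE), or -1 on error. */
long copy_lines(FILE *in, FILE *out)
{
    char line[MAXLINE];
    long chunks = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (fputs(line, out) == EOF)
            return -1;
        chunks++;
    }
    if (ferror(in))
        return -1;
    return chunks;
}
```

Calling copy_lines(stdin, stdout) gives the stdin-to-stdout behaviour the slide describes.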
18
Introduction
• Programs and Processes
  – Program – an executable file residing in a disk file
  – exec(), etc.
  – Process – an executing instance of a program
    • Unique Process ID
    • Figure 1.6 – Page 11
      – Print process ID // getpid()
19
Introduction
• Process Control
  – Three primary functions: fork(), exec(), waitpid()
  – Figure 1.7 – Page 12
    • Read commands from stdin and execute them
• Note
  – End-of-file: ^D
  – fork(), execlp(), waitpid()
  – No parsing of the input line
  – execlp(file, arg0, …, argn, (char *) 0)
    • If file does not contain a slash character, the path prefix for this file is obtained by a search of the directories passed in the PATH environment variable.
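The fork/execlp/waitpid pattern can be sketched for a single command as below. This is a simplified illustration, not Figure 1.7: the name run_command is an assumption, no arguments are passed, and the exit status 127 for a failed exec is a convention borrowed from the shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `cmd` (no arguments) in a child process
 * and wait for it. Returns the child's exit status, 127 if the
 * command could not be executed, or -1 if fork()/waitpid() fails or
 * the child was killed by a signal. */
int run_command(const char *cmd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: execlp() searches PATH because cmd has no slash. */
        execlp(cmd, cmd, (char *)0);
        _exit(127);             /* reached only if execlp() failed */
    }
    /* Parent: block until this particular child terminates. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A command loop would read a line from stdin, strip the newline, and pass it to run_command() until end-of-file (^D).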
20
Introduction
• ANSI C Features
  – Function Prototypes
    • ssize_t read(int, void *, size_t);
    • void *malloc(size_t);
  – Generic Pointers
    • void * - avoids type casting
  – Primitive System Data Types
    • ssize_t, pid_t, etc.
    • <sys/types.h> included in <unistd.h>
    • Prevents programs from depending on specific data types: each implementation chooses its proper data types by “typedef”!
21
Introduction
• Error Handling
  – errno in <errno.h> (sys/errno.h)
    • E.g., 15 error numbers for open()
    • #define ENOTTY 25 /* Inappropriate ioctl for device */
    • Never cleared if no error occurs
    • No value 0 for any error number
• Functions
  – char *strerror(int errnum) (<string.h>)
  – void perror(const char *msg) (<stdio.h>)
• Figure 1.8 – Page 15 – demo perror and strerror
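Both reporting functions can be exercised on a failing open(). The sketch below is illustrative and is not Figure 1.8; the helper name try_open and its return convention are assumptions. errno is saved immediately because later library calls may overwrite it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to open `path` read-only.
 * Returns 0 on success, or the errno value of the failed open()
 * after reporting it with both strerror() and perror(). */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        int saved = errno;      /* save before any other call */

        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;          /* perror() reads errno itself */
        perror("open");
        return saved;
    }
    close(fd);
    return 0;
}
```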
22
Introduction
• UNIX – A Layered Architecture
  – System Calls
    • Programmer interface to UNIX
    • Trap 40 – VAX 4.2BSD
    • R0 – error code
  – Categories
    • File Manipulation
      – Devices are special files under “/dev”!
    • Process Control
    • Information Manipulation
23
Introduction
• User Identification
  – Numeric value in /etc/passwd
    • 0 for root/superuser
  – Cannot be changed by the user; used for access permission control
• Group ID
  – Numeric value in /etc/passwd
  – /etc/group
• Supplementary Group IDs
  – /etc/group (4.2BSD allows 16 additional groups.)
  – adm::4:root,adm,daemo
• “ls -l” uses /etc/passwd and /etc/group
24
Introduction
• Signals
  – To notify a process that some condition has occurred
• Action
  – Ignore the signal
  – Execute the default action
    • E.g., for SIGFPE (divide by zero)
  – Provide a function
• Signal Generation
  – Terminal keys (^C -> SIGINT), kill – owner-only
• Program 1.8 – Page 19
  – Read commands and exec + signal SIGINT
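The “provide a function” action can be sketched with sigaction(). This is a minimal illustration rather than Program 1.8: the names on_sigint, catch_sigint, and sigint_count are assumptions, and the handler only increments a counter because very little is safe to do inside a signal handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counter updated by the handler; sig_atomic_t is the type that is
 * safe to modify from a signal handler. */
static volatile sig_atomic_t sigint_count = 0;

/* Hypothetical handler: just record that SIGINT arrived. */
static void on_sigint(int signo)
{
    (void)signo;
    sigint_count++;
}

/* Hypothetical helper: install on_sigint() for SIGINT.
 * Returns 0 on success, -1 on error. */
int catch_sigint(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);   /* block no extra signals in the handler */
    return sigaction(SIGINT, &sa, NULL);
}
```

After catch_sigint() succeeds, typing ^C at the terminal (or calling raise(SIGINT)) runs the handler instead of the default action, which would terminate the process.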
25
Introduction
• Time Values
  – Calendar time
    • In seconds since the Epoch (00:00:00 January 1, 1970, Coordinated Universal Time, i.e., UTC)
    • type time_t
  – Process time
    • In clock ticks (divided by CLK_TCK -> secs)
    • type clock_t
    • Clock time, user/system CPU time
> time grep _POSIX_SOURCE */*.h > /dev/null
0.25u 0.25s 0:03.51 14.2%
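Process time in clock ticks can be converted to seconds as sketched below. Because CLK_TCK is obsolete, this sketch asks sysconf(_SC_CLK_TCK) for the tick rate instead; the helper names ticks_to_seconds and cpu_seconds_used are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/times.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: convert clock ticks to seconds using the
 * run-time tick rate. Returns -1.0 if the rate is unavailable. */
double ticks_to_seconds(clock_t ticks)
{
    long per_sec = sysconf(_SC_CLK_TCK);

    return per_sec > 0 ? (double)ticks / (double)per_sec : -1.0;
}

/* Hypothetical helper: user + system CPU time consumed so far by
 * the calling process, in seconds, or -1.0 on error. */
double cpu_seconds_used(void)
{
    struct tms t;

    if (times(&t) == (clock_t)-1)
        return -1.0;
    return ticks_to_seconds(t.tms_utime + t.tms_stime);
}
```

Calendar time, by contrast, comes from time(NULL) as a time_t count of seconds since the Epoch and needs no tick conversion.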
26
Introduction
• System Calls vs Library Functions
  – System Calls
    • 50 for UNIX Version 7, 110 for 4.3+BSD, 120 for SVR4
  – UNIX Technique
    • A C library function with the same name exists for each system call
    • Differences
      – System calls are a fixed set; library functions offer more elaborate functionality
        • malloc() calls sbrk(): better management of allocated space
        • gmtime() converts the value returned by time(): seconds into broken-down time!
        • fgets() calls read(): unbuffered I/O -> buffered I/O
• Misc
  – Process control: fork(), exec(), wait() invoked directly from application code (vs system()).
27
Contents
1. Preface/Introduction
2. Standardization and Implementation
3. File I/O
4. Standard I/O Library
5. Files and Directories
6. System Data Files and Information
7. Environment of a Unix Process
8. Process Control
9. Signals
10. Inter-process Communication
28
Ariane 5
• A European rocket designed to launch commercial payloads (e.g., communications satellites) into Earth orbit.
• Successor to the successful Ariane 4 launchers.
• Ariane 5 can carry a heavier payload than Ariane 4.
• On June 4, 1996, Flight 501 was launched.
• Approximately 37 seconds after a successful lift-off, the Ariane 5 launcher lost control.
29
Flight 501 Failure
• The attitude and trajectory of the rocket are measured by a computer-based inertial reference system. This transmits commands to the engines to maintain attitude and direction.
• The software failed and this system and the backup system shut down.
• Diagnostic commands were transmitted to the engines which interpreted them as real data and which swiveled to an extreme position resulting in unforeseen stresses on the rocket.
30
What Caused the Failure?
• The software failure occurred when an attempt to convert a 64-bit floating-point number to a signed 16-bit integer caused the number to overflow.
• Why not Ariane 4?
  – The physical characteristics of Ariane 4 (a smaller vehicle) are such that it has a lower initial acceleration and slower build-up of horizontal velocity than Ariane 5.
– The value of the variable on Ariane 4 could never reach a level that caused overflow during the launch period.
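The kind of conversion involved can be illustrated in C with an explicit range check before narrowing. This is only an analogy: the Ariane flight software was not written in C, and the function name to_int16_checked is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: convert a double to a signed 16-bit integer
 * only if the value fits. Returns 0 and stores the truncated value
 * in *out on success; returns -1 and leaves *out untouched if the
 * value is NaN or outside [INT16_MIN, INT16_MAX]. */
int to_int16_checked(double value, int16_t *out)
{
    if (value != value)         /* NaN compares unequal to itself */
        return -1;
    if (value < (double)INT16_MIN || value > (double)INT16_MAX)
        return -1;
    *out = (int16_t)value;      /* in range: truncates toward zero */
    return 0;
}
```

A value that stays small on one vehicle passes the check, while a larger value on a faster vehicle is rejected instead of overflowing silently; what the caller does with the -1 is the design decision the slides go on to discuss.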
31
What Lesson Can We Learn?
• When porting software from one platform to another, we have to keep an eye on the platform’s assumptions and capabilities.
• Software reliability cannot be sacrificed.
  – An exact copy of the failed software was used as the backup copy.
  – The exception handler was removed to save computation power, which is a scarce resource on space missions.
• How do you make sure your program functions correctly from one Unix platform to another Unix platform?
32
Standardization and Implementation
• Why Standardization?
  – Proliferation of UNIX versions
• What should be done?
  – The specifications of limits that each implementation must define!
33
UNIX Standardization
• ANSI C
  – American National Standards Institute
  – ISO/IEC 9899:1990
    • International Organization for Standardization (ISO)
    • Syntax/semantics of C, a standard library
  – Purpose:
    • Provide portability of conforming C programs to a wide variety of OS’s.
  – 15 areas: Fig 2.1 – Page 27
34
UNIX Standardization
• ANSI C
  – <assert.h> - verify program assertion
  – <ctype.h> - char types
  – <errno.h> - error codes
  – <float.h> - floating-point constants
  – <limits.h> - implementation constants
  – <locale.h> - locale catalogs
  – <math.h> - mathematical constants
  – <setjmp.h> - nonlocal goto
  – <signal.h> - signals
  – <stdarg.h> - variable argument lists
  – <stddef.h> - standard definitions
  – <stdio.h> - standard I/O library
  – <stdlib.h> - utility functions
  – <string.h> - string operations
  – <time.h> - time and date
35
UNIX Standardization
• POSIX.1 (Portable Operating System Interface) developed by IEEE
  – Not restricted to Unix-like systems, and no distinction between system calls and library functions
  – Originally IEEE Standard 1003.1-1988
  – 1003.2: shells and utilities, 1003.7: system administration, > 15 other committees
  – Published as IEEE Std 1003.1-1990, ISO/IEC 9945-1:1990
  – New: the inclusion of symbolic links
  – No superuser notion
36
UNIX Standardization
• POSIX.1
  – <cpio.h> - cpio archive values
  – <dirent.h> - directory entries
  – <fcntl.h> - file control
  – <grp.h> - group file
  – <pwd.h> - passwd file
  – <tar.h> - tar archive values
  – <termios.h> - terminal I/O
  – <unistd.h> - symbolic constants
  – <utime.h> - file times
  – <sys/stat.h> - file status
  – <sys/times.h> - process times
  – <sys/types.h> - primitive system data types
  – <sys/utsname.h> - system name
  – <sys/wait.h> - process control
37
UNIX Standardization
• X/Open
  – An international group of computer vendors
  – Volume 2 of X/Open Portability Guide, Issue 3 (XPG3)
    • XSI System Interface and Headers
    • Based on IEEE Std. 1003.1-1988, with additions such as displaying text in different languages
    • Built on the draft of ANSI C
      – Some parts are out of date.
  – Solaris 2.4 – compliance with XPG4V2
    • man xpg4
38
UNIX Standardization
• FIPS (Federal Information Processing Standard) 151-1
  – IEEE Std. 1003.1-1988 & ANSI C
  – For the procurement of computers by the US government.
  – Required Features:
    • JOB_CONTROL, SAVED_IDS, NO_TRUNC, CHOWN_RESTRICTED, VDISABLE
    • NGROUPS_MAX >= 8; group IDs of new files and directories equal to those of their parent directory; environment variables HOME and LOGNAME defined for a login shell; interrupted read/write functions return the number of bytes transferred.
39
UNIX Implementation
[Figure: UNIX family tree, repeated with dates]
• Bell Labs Research: First Edition, Sixth Edition (1976), Seventh Edition (1979)
• Berkeley Software Distributions: 1BSD, …, 4.0BSD, 4.3BSD, 4.3BSD Tahoe, 4.3BSD Reno, 4.4BSD
• UNIX System Laboratories (USG/USDL/ATTIS/DSG/USO/USL): System V Release 2, 3; UNIX System V Release 4
• Also shown: XENIX, SunOS, Solaris, Solaris 2, Mach, Chorus
* POSIX.1 (IEEE, ISO) standard!
40
UNIX Implementation
• System V Release 4 - 1989
  – POSIX 1003.1 & X/Open XPG3
  – Merging of SVR3.2, SunOS, 4.3BSD, Xenix
  – SVID (System V Interface Definition)
    • Issue 3 specifies the functionality qualified for SVR4.
  – Contains a Berkeley compatibility library
    • For 4.3BSD counterparts
41
UNIX Implementation
• 4.2BSD - 1983
  – DARPA (Defense Advanced Research Projects Agency) wanted a standard research operating system for the VAX.
  – Networking support - remote login, file transfer (ftp), etc. Support for a wide range of hardware devices, e.g., 10Mbps Ethernet.
  – Higher-speed file system.
  – Revised virtual memory to support processes with large sparse address space (not part of the release).
  – Inter-process-communication facilities.
42
UNIX Implementation
• 4.3 BSD - 1986
  – Improvement of 4.2 BSD
    • Loss of performance because of many new facilities in 4.2 BSD.
    • Bug fixing, e.g., TCP/IP implementation.
    • New facilities such as TCP/IP subnet and routing support.
  – Backward compatibility with 4.2 BSD.
  – Second Version - 4.3 BSD Tahoe
    • Supports machines besides the VAX
  – Third Version - 4.3 BSD Reno
    • Freely redistributable implementation of NFS, etc.
43
UNIX Implementation
• 4.4 BSD - 1992
  – POSIX compatibility
  – Remedies deficiencies of 4.3 BSD
    • Support for numerous architectures such as 68K, SPARC, MIPS, PC.
    • New virtual memory better for large memory and less dependent on VAX architecture – Mach.
    • TCP/IP performance improvement and implementation of new network protocols.
    • Support of an object-oriented interface for numerous filesystem types, e.g., SUN NFS.
44
UNIX Implementation - Major UCB CSRG Distributions
• Major new facilities:
  – 3BSD, 4.0BSD, 4.2BSD, 4.4 BSD
• Bug fixes and efficiency improvement:
  – 4.1 BSD, 4.3BSD
• BSD Networking Software, Release 1.0 (from 4.3BSD Tahoe, 1989), 2.0 (from 4.3BSD Reno, 1991)
• Remark:
  – Standards define a subset of any actual system: compliance and compatibility
45
Limits – ANSI C, POSIX, XPG3, FIPS 151-1
• Compile-time options and limits (headers)
  – Job control?
  – Largest value of a short?
• Run-time limits related to file/dir
  – pathconf and fpathconf, e.g., the max # of bytes in a filename
• Run-time limits not related to file/dir
  – sysconf, e.g., the max # of opened files per process
• Remark: implementation-related
46
ANSI C Limits
• All compile-time limits - <limits.h>
  – Minimum acceptable values
    • E.g., CHAR_BIT, INT_MAX
  – Implementation-related
    • char (limits.h), float (FLT_MAX in float.h)
    • open (FOPEN_MAX & TMP_MAX in stdio.h)
#if defined(_CHAR_IS_SIGNED)
#define CHAR_MAX SCHAR_MAX
#elif defined(_CHAR_IS_UNSIGNED)
#define CHAR_MAX UCHAR_MAX
#endif
47
POSIX Limits
• 33 limits and constants
  – Invariant minimum values (POSIX defined in Figure 2.3 – Page 33, limits.h)
  – Corresponding implementation values (limits.h)
    • Invariant: SSIZE_MAX
    • Run-time increasable value: NGROUPS_MAX
    • Run-time invariant values, e.g., CHILD_MAX
    • Pathname variable values, e.g., LINK_MAX
  – Compile-time symbolic constants, e.g., _POSIX_JOB_CONTROL
  – Execution-time symbolic constants, e.g., _POSIX_CHOWN_RESTRICTED
  – Obsolete constant: CLK_TCK
48
POSIX Limits
• Limitation of POSIX
  – E.g., _POSIX_OPEN_MAX in <limits.h>
  – sysconf(), pathconf(), fpathconf() at run-time
  – Possibly indeterminate on some systems
    • E.g., OPEN_MAX under SVR4
49
XPG3 Limits
• 7 constants in <limits.h> - invariant minimum values in POSIX.1 terms
  – Dealing with message catalogs
    • NL_MSGMAX – 32767
• PASS_MAX
  – <limits.h>
    • Run-time invariant value in POSIX.1 terms
  – sysconf()
50
Run-Time Limits
• #include <unistd.h> (Figure 2.7 – Page 40: compile/run time limits)
• long sysconf(int name);
  – _SC_CHILD_MAX, _SC_OPEN_MAX, etc.
• long pathconf(const char *pathname, int name);
• long fpathconf(int filedes, int name);
  – _PC_LINK_MAX, _PC_PATH_MAX, _PC_PIPE_BUF, _PC_NAME_MAX, etc.
• Various names and restrictions on arguments (Page 35 and Figure 2.5)
• Return -1 and set errno if any error occurs.
  – EINVAL if the name is incorrect.
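Because -1 is returned both on error and when a limit is indeterminate, callers usually clear errno first and inspect it afterwards. The sketch below shows that pattern; the wrapper names query_sysconf and query_pathconf and their three-way return convention are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper around sysconf().
 * Returns 0 and stores the limit in *value when it is defined,
 * 1 when the limit is indeterminate (-1 with errno unchanged),
 * and -1 on error (for example EINVAL for an unknown name). */
int query_sysconf(int name, long *value)
{
    long v;

    errno = 0;                  /* so an error can be told apart */
    v = sysconf(name);
    if (v == -1)
        return errno != 0 ? -1 : 1;
    *value = v;
    return 0;
}

/* Hypothetical wrapper around pathconf(), same return convention. */
int query_pathconf(const char *path, int name, long *value)
{
    long v;

    errno = 0;
    v = pathconf(path, name);
    if (v == -1)
        return errno != 0 ? -1 : 1;
    *value = v;
    return 0;
}
```

For example, query_sysconf(_SC_OPEN_MAX, &v) and query_pathconf("/", _PC_NAME_MAX, &v) ask for two of the limits named on the slide.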
51
Run-Time Limits
• Example Program 2.1 – Page 38
  – Print sysconf and pathconf values (Fig 2.6 – Page 39)

Limit             SunOS 4.1.1   SVR4      4.3+BSD
CHILD_MAX         133           30        40
OPEN_MAX          64            64        64
LINK_MAX          32767         1000      32767
NAME_MAX          255           14/255    255
_POSIX_NO_TRUNC   1             nodef/1   1
52
Indeterminate Run-Time Limits –Two Cases
• Pathname
  – 4.3BSD: MAXPATHLEN in <sys/param.h>, PATH_MAX in <limits.h>
  – Program 2.2 – Page 42
    • Allocate space for a pathname vs getcwd
    • _PC_PATH_MAX is for relative pathnames
• Max # of Open Files – POSIX run-time invariant
  – NOFILE (<sys/param.h>), _NFILE (<stdio.h>)
  – sysconf(_SC_OPEN_MAX) – POSIX.1
  – getrlimit() & setrlimit() for SVR4 & 4.3+BSD
  – Program 2.3 – Page 43: OPEN_MAX!
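When PATH_MAX is indeterminate, one approach is to start from a guess and grow the buffer until getcwd() stops failing with ERANGE. The sketch below follows that idea; it is not Program 2.2, and the name alloc_cwd and the PATH_MAX_GUESS value are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define PATH_MAX_GUESS 1024     /* assumed starting size if indeterminate */

/* Hypothetical helper: return the current working directory in a
 * malloc()ed buffer (caller frees), or NULL on error. */
char *alloc_cwd(void)
{
    char *buf = NULL;
    size_t size;
    long limit = pathconf("/", _PC_PATH_MAX);

    /* The limit is relative to "/", so add one byte for the leading
     * slash; fall back to a guess when it is indeterminate. */
    size = (limit > 0) ? (size_t)limit + 1 : PATH_MAX_GUESS;

    for (;;) {
        char *bigger = realloc(buf, size);

        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        if (getcwd(buf, size) != NULL)
            return buf;
        if (errno != ERANGE) {  /* a real error, not a short buffer */
            free(buf);
            return NULL;
        }
        size *= 2;              /* buffer too small: double and retry */
    }
}
```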
53
MISC
• Feature Test Macro
  – POSIX only
    • cc -D_POSIX_SOURCE file.c
    • Or, #define _POSIX_SOURCE 1
  – ANSI C only
#ifdef __STDC__
void *myfunc(const char *, int);
#else
void *myfunc();
#endif
54
MISC
• Primitive System Data Types
  – Figure 2.8 – Page 45
    • Implementation-dependent data types
    • E.g., caddr_t, pid_t, ssize_t, off_t, etc.
  – <sys/types.h>
    • E.g., major_t, minor_t, caddr_t, etc.
    • Examples:
      – typedef char * caddr_t;
      – typedef ulong_t major_t;
(SVR4: 14 bits for the major device number, and 18 bits for the minor device number. Traditionally both fit in a short: 8 bits each.)
55
MISC
• Conflicts Between Standards
  – clock() in ANSI C and times() in POSIX.1
    • clock_t divided by CLOCKS_PER_SEC in <time.h> (while CLK_TCK became obsolete) – different values for clock_t
  – Implementation of POSIX functions
    • No assumption on the host operating system.
    • signal() in SVR4 is different from sigaction() in POSIX.1