19007586 Buffer Overflow Attack(2)


  • 8/8/2019 19007586 Buffer Overflow Attack(2)


    A

    SEMINAR REPORT ON

    BUFFER OVERFLOW ATTACKS

    SUBMITTED

    TO

    PUNE UNIVERSITY, PUNE

    FOR THE DEGREE

    OF

    BACHELOR OF COMPUTER ENGINEERING

    BY

    GHANSHYAM SATYANARAYAN SHARMA

    T.E. COMP (B)

    Roll No. 42

    UNDER THE GUIDANCE

    OF

    Mrs. MAYURA KINIKAR

    DEPARTMENT OF COMPUTER ENGINEERING

    MAHARASHTRA ACADEMY OF ENGINEERING

    ALANDI (D), PUNE-412105


    2008-2009

    Certificate

    This is to certify that Mr. GHANSHYAM SATYANARAYAN SHARMA has successfully submitted his seminar report on

    BUFFER OVERFLOW ATTACKS

    during the academic year 2008-2009 in partial fulfillment of the requirements for the Bachelor's Degree Program in Computer Engineering under Pune University, Pune.

    Prof. Mrs. MAYURA KINIKAR, Guide

    Prof. Mrs. Uma Nagraj, Head of Department, Computer Engineering

    Dr. J P George, Principal



    Acknowledgement

    I have great pleasure in presenting this report on BUFFER OVERFLOW ATTACKS. I take this opportunity to thank all those who have contributed to the successful completion of this report.

    My special thanks to Prof. Mrs. MAYURA KINIKAR for her suggestions on the presentation of this report. I am also grateful to her for resolving all my technical difficulties.

    Once again, I would like to thank all those who directly and indirectly contributed to this report.

    -Ghanshyam Sharma


    Table Of Contents

    Abstract

    1. Introduction

    1.1 Brief History

    1.2 What is a Buffer?

    2. Anatomy of a Buffer Overflow Attack

    2.1 What's a Stack?

    2.2 What's a Return Address?

    3. Smashing the Stack

    4. Heap

    5. Screen Shots Showing the Buffer Overflow Attack

    6. Defenses against Buffer Overflow Attacks

    6.1 Program Modification

    6.2 Modifying the Language and/or Compiler

    6.3 OS Kernel Stack Execution Privilege

    6.4 Safer C Library Support

    7. Future Scope

    8. Conclusion

    9. References

    Abstract:

    Buffer overflows are serious security bugs that are mentioned regularly in computer security bulletins. The design philosophy of the C language, in which flexibility and efficiency are given greater priority than safety, is one of the main reasons for buffer overflow attacks. Because of C's extensive use of pointers, arrays and pointer arithmetic without any bounds checking, programs sometimes access data that they aren't supposed to. By combining the C programming language's liberal approach to memory handling with specific UNIX file system permissions, UNIX and its flavors can be manipulated to grant unrestricted privilege to unknown and unregistered users. In this paper we examine this vulnerability and discuss approaches to mitigating it.


    1. Introduction:

    1.1 Brief History:

    Buffer overflows have been used successfully as a method of penetrating system security for over 12 years. One of the first buffer overflow attacks to attract widespread attention, because of its spectacular success, was Robert Morris's Internet Worm. In 1988 Morris released a program that succeeded in infecting thousands of Unix hosts on the Internet. One of the methods Morris used to gain access to a vulnerable system was a buffer overflow bug in the fingerd daemon. Once it gained access, Morris's program installed itself on the machine and used several methods to attempt to spread to other machines. Morris's original intent was to spread to other systems relatively slowly and undetected, without causing significant disruption on any of the affected machines. In this his attack failed completely: a programming error caused the worm to spread at a much higher rate than intended. Machines were infected and reinfected so rapidly that the worm overwhelmed the attacked systems. This caused the program to be detected immediately and turned it into the most devastating denial-of-service attack up to that time. Morris's program usually did not gain administrative (root) access, did not destroy any information on the penetrated system, and did not leave time bombs or other malicious code behind.

    From 1988 to 1996 the number of buffer overflow attacks remained relatively low. The known vulnerabilities were fixed, and because the attack method was little known and thought to be difficult to execute, few new vulnerabilities were discovered. This changed dramatically in 1996, when Levy published a very well written paper which showed that many programs were very likely to harbor buffer overflow vulnerabilities, and also demonstrated techniques for constructing buffer overflow attacks that were likely to succeed against a target program suspected of being vulnerable, even if the attacker had no access to the source code of the target program. The combination of these two factors stimulated attackers to a flurry of research activity, which led to the discovery of many new vulnerabilities. In addition, many of the attacks were automated, which permitted them to be carried out even by people with little or no knowledge. People who are relatively unsophisticated but interested in such attacks are often called Script Kiddies. Unfortunately, there are far too many script kiddies, who seem to have plenty of time on their hands, and also the energy, patience and persistence to keep hacking systems this way. The unhappy result is that these automated attacks have become a serious nuisance to the overworked system administrators responsible for maintaining the integrity of systems under continuous attack.

    1.2 What is a Buffer?

    A buffer is a temporary storage area in memory. It can be statically or dynamically allocated. A buffer is said to overflow if a program or routine tries to store more data in it than it can hold. If the data written into a buffer exceeds the limit specified by the program, the excess is written into adjacent memory. This can overwrite valid information and opens the door to what are known as buffer overrun attacks. Although an overflow may occur accidentally through programming mistakes and improper use of pointers, deliberately triggering one is a common and extremely dangerous attack on system security and private data. The attacker can write data containing code created to carry out actions that could, for example, damage the user's files, crash the system, change data, or disclose confidential information. Buffer overflow attacks are possible because of the design of the C programming language. Poor programming practices have only worsened the issue.


    The C language is a structured programming language that uses the function call as the unit of organization. Each time a function is called, the arguments to the function are copied to an area of memory called the stack. In assembly, you store things on the stack by pushing them and retrieve them by popping them off. All CPU architectures in common use support function calls through a stack mechanism and have a special register called the stack pointer. There are special instructions for carrying out PUSH and POP operations. There is also an instruction that takes an address off the stack and copies it into the program counter (also called the instruction pointer), the register that determines the address of the next instruction to execute. This mechanism is what enables the structured nature of C: calling a function always pushes the return address onto the stack.

    The problem with this design shows up within the called function. Any variables defined within the function are also stored in space allocated on the stack. For example, if a string, such as the name of a file to open, needs to be defined in the function, a number of bytes will be allocated on the stack. The function can then use this memory, and it is automatically deallocated when the function returns. This is undoubtedly a very efficient process and fully in keeping with the design philosophy of C. But C does no bounds checking when data is stored in this area, a loophole that an attacker can exploit.

    [Figure 1: Computer Memory. A buffer of variables (float x, char y, int z); each entry is a variable.]

    C functions that copy data without checking bounds are the main cause of these vulnerabilities. Calls to functions like strcat(), strcpy(), sprintf(), vsprintf(), bcopy(), gets() and scanf() can be exploited because these functions don't check whether the buffer allocated on the stack is large enough for the data copied into it. Some of these functions have suitable replacements (such as strncpy() and strncat() for strcpy() and strcat() respectively), while others don't. When such functions are called, data can be written to locations outside the space allocated for it. This results in an overflow.
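    As a hedged illustration of what a bounds-checked copy can look like, the sketch below wraps the standard snprintf() function. The helper name copy_bounded and its return convention are assumptions made for this example and do not come from the report.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the report): copies src into dst, which has
 * room for dst_size bytes, and always null-terminates the result.
 * Returns 1 if the whole string fit, 0 if it had to be truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int needed;

    if (dst_size == 0) {
        return 0;                       /* no room even for the terminator */
    }
    /* snprintf() writes at most dst_size bytes, including the '\0', and
     * returns the length the full string would have needed. */
    needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

    Called with a 16-byte destination and a longer source, this keeps the first 15 characters and reports the truncation, where strcpy() would have written past the end of the buffer.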

    2. Anatomy of a Buffer Overflow Attack:

    2.1 What's a Stack?

    The stack is where the software stores almost all temporary information, for example the return addresses of function calls and all local variables. What's really important is that functions can write to this space and modify any data on it.

    2.2 What's a Return Address?

    When a function is called, the system saves the location it was called from. When the function exits, the system reads this address and lets the program return to what it was doing before the call. If this address is maliciously altered, the program won't behave as it was programmed to.

    It is worth noting that the biggest problem is an attacker's ability to modify the return address. This is what makes it possible to make the code behave unexpectedly. In an important program like the Unix command su, simply being able to make the program jump into another part of itself could be enough to compromise the system. A stack smashing attack usually has two mutually dependent goals:

    Example:

    [Figure: Normal stack. A function call's frame holds the return pointer, Buffer 1 (local variable 1), Buffer 2 (local variable 2) and so on. For the code below, the frame of function() contains *str, ret address, sfp and buffer[16], drawn between the bottom of the stack (top of memory) and the top of the stack (bottom of memory). Labels: "Stack grows upward", "Buffer grows downward".]

    #include <string.h>

    void function(char *str);

    int main()
    {
        char large_string[256];
        int i;
        for (i = 0; i < 255; i++) {
            large_string[i] = 'A';
        }
        large_string[255] = '\0';   /* terminate the string for strcpy() */
        function(large_string);
    }

    void function(char *str)
    {
        char buffer[16];
        strcpy(buffer, str);        /* overflow: 255 characters into 16 bytes */
    }

    1) Insert attack code:

    The attacker enters, as the input string, executable binary code for the machine being attacked.

    2) Change the return address:

    On the stack, the return address of the function is stored above every local buffer. The attacker places arbitrary (and dangerous!) code in the buffer, writes past the end of the buffer up to the return address, and alters the return address to point to that code. When the function returns, it jumps to the code that has been placed in the buffer.

    The code most likely to fall victim to buffer overflow attacks is code that reads data using unsafe functions like gets() or moves data using functions like strcat(). Unfortunately, the local array and the function's return address are both stored on the stack.

    This is extremely dangerous because the attacker can feed the program hostile code instead of data and, with a simple trick, make the program execute it. This vulnerability is known as a "buffer overflow", and is a special case of the overflow problems.


    Writing a Buffer Overflow:

    Exploiting the vulnerabilities of C to carry out a buffer overflow requires knowledge of Intel x86 assembly programming and experience with debugging and disassembly tools such as gdb. The section below explains how buffers allocated dynamically on the stack are smashed. To understand this, it is essential to know how a function call in C is executed and the basics of structured programming.

    Procedure in C:

    C carries out instructions using procedures, which it calls functions. From one point of view, a procedure call alters the flow of control just as a goto jump does, but unlike a jump, when the procedure finishes its task it returns control to the statement or instruction following the call. This high-level abstraction is implemented with a stack. The stack is also used to dynamically allocate memory for the formal parameters that receive the arguments passed to the function.

    Stack region:

    A stack is a contiguous block of memory containing data. A register called the stack pointer (SP) points to the top of the stack. The bottom of the stack is at a fixed address. The kernel adjusts the size of the stack dynamically at run time. The CPU implements the stack instructions PUSH and POP, which are executed as the program runs.
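    The PUSH and POP behaviour described above can be sketched with a toy model in C. This is only an illustration under stated assumptions: the type toy_stack and the functions toy_init, toy_push and toy_pop are hypothetical names, and the array stands in for the block of memory, with the stack growing toward lower indices the way the x86 stack grows toward lower addresses.

```c
#include <stddef.h>

#define STACK_WORDS 8

/* Toy model, not real CPU state: a stack that grows toward lower indices. */
struct toy_stack {
    int    mem[STACK_WORDS];  /* the contiguous block of memory              */
    size_t sp;                /* index of the top item; STACK_WORDS if empty */
};

void toy_init(struct toy_stack *s)
{
    s->sp = STACK_WORDS;      /* the bottom of the stack is at a fixed place */
}

/* PUSH: move SP down one slot, then store. Returns 0 if the stack is full. */
int toy_push(struct toy_stack *s, int value)
{
    if (s->sp == 0) {
        return 0;
    }
    s->mem[--s->sp] = value;
    return 1;
}

/* POP: load the top item, then move SP back up. Returns 0 if empty. */
int toy_pop(struct toy_stack *s, int *out)
{
    if (s->sp == STACK_WORDS) {
        return 0;
    }
    *out = s->mem[s->sp++];
    return 1;
}
```

    Pushing 7, 6 and 5 in that order and then popping returns 5 first, which is why arguments pushed in reverse order come off the stack in their declared order.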

    The stack consists of logical stack frames that are pushed when calling a function and popped when returning. A stack frame contains the parameters to a function, its local variables, and the data necessary to recover the previous stack frame, including the value of the instruction pointer (IP) at the time of the function call. The stack pointer points to the top of the stack. In principle, local variables could be referenced by giving their offsets from the stack pointer. But as data is pushed onto and popped off the stack, these offsets change, and it is difficult for the compiler to keep track of all these changes. As a result, many compilers use a second register called the frame pointer (FP). The frame pointer points to a fixed address within a stack frame. Thanks to this property, the distance of local variables and parameters from the frame pointer does not change, and hence FP can be used to reference them.

    The first thing a procedure does is save the previous FP (so that it can be restored at procedure exit). It then copies SP (the stack pointer) into FP and advances SP to reserve space for the local variables. This is called the procedure prolog. Upon procedure exit, the stack is cleaned up again. This is called the procedure epilog.

    A simple example of how a stack frame is formed by a function call is shown below:

    void sample(int, int, int);

    int main(void)
    {
        sample(5, 6, 7);
        return 0;
    }

    void sample(int a, int b, int c)
    {
        char arr1[5];
        char arr[10];
    }

    The assembly code generated for the function call is as follows (as it appears on an Intel x86 CPU with UNIX as the operating system):

        pushl $7
        pushl $6
        pushl $5
        call sample

    When main() reaches the function call, the three arguments are first pushed onto the stack in reverse order, and then the function is called. The call instruction pushes the instruction pointer (IP), which at that moment holds the address of the instruction following the call, onto the stack. In other words, the return address of the function is stored on the stack. The called function then pushes the frame pointer onto the stack and copies the current SP into the frame pointer. Memory space is allocated for the local variables by subtracting their size from the stack pointer. This procedure prolog can be represented in assembly code as follows:

        pushl %ebp
        movl %esp,%ebp
        subl $N,%esp

    where ebp and esp are the registers holding the current frame pointer and the current stack pointer, and N stands for the number of bytes reserved for the local variables. The formation of the stack when the above C code is called is shown in figure 1 below:

    With this much information about how a function call is executed in C using a stack, we can now go into the details of how a buffer overflow attack is coded and how it can be used to execute arbitrary code. Let's consider another code snippet that results in a buffer overflow.

    void concat(char *);

    void main ()

    {

    char string [200];

    int ctr;

    for(ctr=0;ctr


    strcat(buff,str);

    }

    This program uses strcat(), a typically unsafe C string function. Here the function appends the contents of the string str to the end of buff without any bounds checking. This code produces a segmentation fault. A closer look at the stack layout in this case makes it clear how the return address is overwritten. We get a segmentation fault because we store 200 bytes in the buffer, an astronomical number considering that it can hold only 16 bytes. The 184 bytes after the allocated buffer get overwritten. This includes the FP, the return address, and even the pointer str. We have initialized every element of str to the character 'C', whose ASCII code is 0x43. Thus the return address is overwritten with 0x43434343. This address lies outside the process address space, so returning to it causes a segmentation fault. This knowledge can help us carry out arbitrary instructions, as shown with our first example.
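    The overwrite described above can be reproduced safely with a simulation. The sketch below does not touch the real stack: it models a frame as a 24-byte array with an assumed 32-bit layout (16-byte buffer, then saved frame pointer, then return address). The function name simulate_overflow, the offsets and the two initial addresses are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout of the simulated frame (illustrative only):
 *   bytes  0..15  buffer
 *   bytes 16..19  saved frame pointer
 *   bytes 20..23  return address                                  */
enum { BUF_OFF = 0, FP_OFF = 16, RET_OFF = 20, FRAME_SIZE = 24 };

/* Copies n bytes of input to the start of the simulated frame, as an
 * unchecked string copy would, but clamps n to the size of the array so
 * that the simulation itself never writes out of bounds. Returns the
 * value left in the simulated return-address slot. */
uint32_t simulate_overflow(const char *input, size_t n)
{
    unsigned char frame[FRAME_SIZE];
    uint32_t fp  = 0xbffff000u;            /* made-up saved frame pointer */
    uint32_t ret = 0x08048abcu;            /* made-up legitimate address  */

    memset(frame, 0, sizeof frame);
    memcpy(frame + FP_OFF, &fp, sizeof fp);
    memcpy(frame + RET_OFF, &ret, sizeof ret);

    if (n > FRAME_SIZE) {
        n = FRAME_SIZE;                    /* keep the simulation in bounds */
    }
    memcpy(frame + BUF_OFF, input, n);     /* the "overflowing" copy */

    memcpy(&ret, frame + RET_OFF, sizeof ret);
    return ret;
}
```

    With 16 bytes of input the return-address slot is untouched. With 200 bytes of 'C' it reads back as 0x43434343, the value named in the text.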

    Now, as we know, just past arr1 on the stack is the saved frame pointer FP, and past that is the return address, which therefore lies 4 bytes beyond the end of arr1. Stack memory is allocated only in multiples of the word size, which in this case is 4 bytes or 32 bits. Thus our 5-byte array arr1 actually occupies 8 bytes, or 2 words. Hence the return address is 12 bytes from the start of the array (8 bytes for arr1 plus 4 bytes for FP).

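    #include <stdio.h>

    void sample(int a, int b, int c)
    {
        char arr1[5];
        char arr[10];
        int *ret;
        ret = (int *)(arr1 + 12);   /* assumed location of the return address */
        (*ret) += 8;                /* advance it past the x=1 assignment */
    }

    int main(void)
    {
        int x;
        x = 0;
        sample(5, 6, 7);
        x = 1;
        printf("%d", x);
        getch();                    /* non-standard; declared in <conio.h> */
        return 0;
    }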

    What we have done is add 12 to the address of arr1. The result is the address where the return address is stored. To find out how much to add to the return address, first use a test value, compile the program and disassemble main() in gdb. The disassembly shows how many bytes of instructions must be skipped to jump over the assignment statement x=1. Both offsets depend on the compiler and the platform.

    3. Smashing the Stack

    One classification of buffer overflow attacks depends on where the buffer is allocated. If the buffer is a local variable of a function, it resides on the run-time stack. This is the type of attack examined in Levy's article, and it is by far the most prevalent form of buffer overflow attack.

    When a function is called in a C program, the activation record of the function must be pushed on the run-time stack before execution jumps to the actual code of the called function. In a C program the activation record consists of the following fields:

    space allocated for each parameter of the function;

    the return address;

    the dynamic link;

    space allocated to each local variable of the function.

    For convenience we will consider the address of the dynamic link field to be the base address of the activation record. The function must be able to access its parameters and local variables. This requires that, during the execution of the function, a register hold the base address of the activation record of the function, i.e. the address of the dynamic link field. Parameters are below this address on the stack, and local variables above. When the function returns, this register must be restored to its previous value, so that it points to the activation record of the calling function. To make this possible, the value of this register is saved in the dynamic link field when the function is called. Thus the dynamic link field of each activation record points to the dynamic link field of the previous activation record on the stack, which in turn points to the dynamic link field of the record before it, and so on, all the way to the bottom of the stack. The first activation record on the stack is that of main(). This chain of pointers is called the dynamic chain.

    In many C compilers the buffer grows towards the bottom of the stack. Thus, if the buffer overflows and the overflow is long enough, the return address will be corrupted (as well as everything else in between, including the dynamic link). If the buffer overflow overwrites the return address so that it points to the attack code, that code will be executed when the function returns. Thus, in this type of attack, the return address on the stack is used to hijack the control of the program.

    Overwriting the return address, as explained above, gives the attacker the means of hijacking the control of the program, but where should the attack code be stored? Most commonly it is stored in the buffer itself. Thus the payload string which is copied into the buffer will contain both the binary machine language attack code and the address of this code, which will overwrite the return address.

    There are a few difficulties that the attacker must overcome to carry out this plan. If the attacker has the source code of the attacked program, it may be possible to determine exactly how big the buffer is and how far it is from the return address, which determines how big the payload string must be. Also, the payload string cannot contain the null character, since this would abort the copying of the payload into the buffer. Some copying routines of the C library use carriage returns and new lines as delimiters instead, so these characters should similarly be avoided in the payload string.
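    The effect of an embedded null character can be sketched in a few lines of C. The helper below is hypothetical (its name is an assumption for this example); it reports how many bytes of a payload a strcpy()-style copy would transfer before stopping at the first zero byte.

```c
#include <string.h>

/* Returns how many bytes of a len-byte payload would be transferred by a
 * strcpy()-style copy, which stops at the first zero byte it meets. */
size_t bytes_copied_by_strcpy(const unsigned char *payload, size_t len)
{
    const unsigned char *nul = memchr(payload, 0, len);
    return nul ? (size_t)(nul - payload) : len;
}
```

    For the five bytes 0x41 0x42 0x00 0x43 0x44 the result is 2: everything after the zero byte is silently dropped, which is why a payload delivered through a string copy has to avoid that byte.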

    Access to the source code is nowadays quite common for many operating systems, e.g. Linux, OpenBSD, FreeBSD, and even Solaris. Levy shows, however, that there is no need to have access to the source, or even to know the exact details of how the attacked program works. The address of the attack code can be guessed, and through various techniques an approximate guess will do. For example, the attack code could start with a long run of no-operation (NOP) instructions, so that control can be passed to any of them and execution still reaches the crucial part of the attack code, which spawns the shell and comes after the NOPs. This technique was already used in the Morris worm. Similarly, the tail of the payload string could consist of the guessed address of the attack code repeated many times, so that one of the copies overwrites the return address. These techniques considerably increase the chances of guessing the address of the attack code closely enough for the attack to work. For more details see Levy's article.
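    Why an approximate guess is enough can be shown with a toy model that contains no real attack code. In the sketch below, a byte array stands in for memory, the value 0x90 (the x86 single-byte NOP opcode) fills the padding, and the hypothetical function slide mimics the CPU stepping over NOPs until it reaches the first other byte.

```c
#include <stddef.h>

#define NOP 0x90   /* x86 single-byte no-operation opcode */

/* Toy model: starting at index guess, step over NOP bytes the way the CPU
 * would execute them, and return the index of the first non-NOP byte.
 * Returns len if the end of the array is reached first. */
size_t slide(const unsigned char *mem, size_t len, size_t guess)
{
    size_t i = guess;
    while (i < len && mem[i] == NOP) {
        i++;
    }
    return i;
}
```

    If the first 48 bytes of a 64-byte array are NOPs, any guess from 0 to 48 ends at index 48. Without the padding, only the single exact address would do.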

    We now examine why buffer overflows are so common. Suppose that the buffer is a character array used to store strings. Most programs have string inputs or environment variables which can be used by the attacker to deliver the attack. The program must read this input and parse it in order to respond appropriately. Often, to parse the input, the program will first copy it into a local variable of a function. To do this the programmer reserves a buffer large enough for any reasonable input. To copy the input into the buffer the program will typically use a string copying function of the standard C library such as strcpy(). If done carelessly, this introduces a buffer overflow vulnerability. This pattern is so well established in the C programmer's repertoire that it makes it very likely that many programs will contain buffer overflow vulnerabilities.

    The problem arises partly because C represents strings in a dangerous way. The length of a string is determined by terminating the sequence of characters with a null character. This representation is convenient, because strings can have arbitrary length and yet can be processed efficiently. But it is also dangerous, because the scheme breaks down if a string is not null terminated, and because there is no way of knowing the length of the string before processing all its characters. The typical C culture emphasizes efficiency over correctness, prudence or safety, which compounds the problem. It would require a massive amount of education to change this well entrenched programming practice. A consequence is that buffer overflow vulnerabilities are unlikely to be eradicated at the source by not introducing them into programs in the first place. Not only will it be difficult to eliminate the vulnerability from the enormous quantity of software already deployed, but it seems likely that programmers will continue to write new vulnerable software.
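    One way to cope with a possibly unterminated string is to bound the search for the terminator by the size of the buffer. The following is a minimal sketch, assuming the caller knows that size; the function name checked_length is hypothetical. It is relevant, for instance, after strncpy(), which leaves the destination unterminated when the source does not fit.

```c
#include <string.h>

/* Returns the length of the string in buf if a terminating '\0' is found
 * within the first buf_size bytes, or -1 if the buffer is not terminated.
 * Unlike strlen(), the scan never reads past buf_size bytes. */
long checked_length(const char *buf, size_t buf_size)
{
    const char *nul = memchr(buf, '\0', buf_size);
    return nul ? (long)(nul - buf) : -1L;
}
```

    A caller can treat a result of -1 as an error instead of handing the buffer to functions that assume termination.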

    Miller studied the behavior of UNIX utilities when given random input in many distributions, both commercial and open source. His study is relevant to our discussion because, while unexpected input is not necessarily directly related to buffer overflows, the inability of programs to handle unexpected input comes from the same tendency of programmers to concentrate only on reasonable input that also leads to buffer overflow flaws. Attackers are not reasonable. On the contrary, they wish to exploit this blind spot of programmers, to find a hole in the program's logic that they can use for their own purposes. So Miller's study provides some evidence of how common buffer overflow problems are likely to be. Unfortunately, a large fraction of the utilities crashed under Miller's experiment: between a quarter and a third of those tested in the original study.

    Miller also gives us some insight into the speed with which vendors are improving the quality of their software, if at all, because he repeated the study five years later. His results show that progress is being made, but that it has been very modest.

    Another interesting result of Miller's is the confirmation of the widely held anecdotal belief that Open Source provides significantly higher quality software than commercial offerings. This seems to suggest the power of somewhat chaotic large-scale parallelism over better organized small-scale parallelism. The former is prevalent in the Open Source model, in which many pairs of eyes scrutinize the software but in a relatively uncoordinated way. The latter is characteristic of commercial organizations, with fewer pairs of eyes scrutinizing the software but in a much more systematic and organized fashion.

    Often the shell code to be executed is not already present in the binaries shipped with the UNIX distribution. The stack smasher therefore has to find ways to feed the shell code into the runtime environment. Stack smashers have devised creative ways to accomplish this.

    To inject the shell code into the running process, stack smashers supply the necessary shell code sequence through command line arguments, shell environment variables, and interactive input functions. Most stack smashing attacks depend upon shell code instructions to accomplish their task. These types of exploits depend on knowing at what address in memory the shell code will reside. For this reason, many stack smashers pad their shell code with NOP (no-operation) instructions, which gives the shell code a wider target in memory and makes its location easier to guess when manipulating the return address. This approach, combined with following the shell code with many instances of the guessed return address, is a common strategy in constructing stack smashing exploits. An additional approach, used when small programs with memory restrictions are exploited, is to store the shell code in an environment variable.

    Small Buffer Overflows:

    Often the declared buffer is small, i.e. the program allocates very little memory to it. In this case the entire shell code might not fit into the buffer. Furthermore, the return address might be overwritten by the shell code instructions themselves rather than by the address of the shell code. Also, the number of NOPs that can be placed at the front of the string might be so small that the chance of guessing their address is minuscule. A different approach has to be adopted to obtain an interactive shell from these programs. It requires control over the shell's environment variables. The shell code is placed in an environment variable, and the buffer is then overflowed with the address of this variable in memory. This method also increases the chances of the exploit working, since the environment variable holding the shell code can be made as large as desired. The environment variables are stored at the top of the stack when the program is started.


    4. Heap

    The final memory segment we need to cover is the heap. The heap is the memory area

    where you can allocate memory during the execution of a binary (by means of a C library

    function called malloc(), short for memory allocation). You (well, the programmer) can

    just say "I now need 5000 bytes of memory" and there it is, if you have been blessed by

    the operating system! This is particularly helpful if you can't predict how much space you

    will actually need, since this will depend on the input to the program (do you recall our

    discussion of fixed buffer length in the "So what's a Buffer Overflow, after all?" section?

    Great!). The counterpart to malloc() and the memory allocation is the free() function,

    which hands the memory back so that it can be reused.

    The heap is closely related to the already mentioned concept of a pointer in the C

    language: a variable that holds no real data, but another memory address. Part of the

    magic of malloc() is that it provides you with the lowest address of the memory region

    you have been granted (how could you otherwise access it?). Variables holding memory

    addresses are of pointer type, hence the address returned by malloc() is stored in such a

    pointer for future use. This can on the one hand be very useful, but has on the other hand

    been a constant source of various problems with C programs, in particular if pointers are

    not properly initialized or operations on pointers are done wrong.
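
    The allocation pattern described above can be sketched as follows. This is only a

    minimal illustration: the function name copy_to_heap is hypothetical, and the point is

    simply that the size is computed from the input at run time and that the pointer returned

    by malloc() is checked before it is used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of `input`, or NULL if the allocation fails.
 * The caller owns the result and must release it with free().
 * `copy_to_heap` is a hypothetical helper name used for illustration. */
char *copy_to_heap(const char *input)
{
    assert(input != NULL);

    size_t len = strlen(input);      /* size is only known at run time */
    char *buf = malloc(len + 1);     /* +1 for the terminating '\0' */
    if (buf == NULL) {
        return NULL;                 /* malloc() may refuse the request */
    }
    memcpy(buf, input, len + 1);     /* copy exactly what was allocated */
    return buf;
}
```

    A caller would write char *p = copy_to_heap(line); test p against NULL, use it, and

    finally call free(p) exactly once.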

    5. Screen Shots Showing the Buffer Overflow Attack:

    Screen 1

    Screen 2

    6. Defenses against Buffer Overflow Attacks:

    A centralized or decentralized approach can be taken to avoid stack smashing security

    vulnerabilities. To do so, changes must be implemented in the targeted programs

    themselves, in the operating system kernel, or in the framework of the C language. A

    centralized approach involves modification of system libraries or an operating system

    kernel while a decentralized approach involves the modification of privileged programs

    and/or C programming language compilers. We take a look at some of the decentralized

    and centralized approaches along with their pros and cons.

    6.1 Program Modification:

    To fix defective SUID root programs effectively, a number of modifications can be made

    to the program's source code to avoid stack-smashing vulnerabilities. Standard C byte

    copy or concatenation functions are crucial to most buffer overflow exploits. Also,

    functions that return a pointer to a result in static storage can be used in stack smashing

    exploits. In other terms, standard C function calls that copy strings without checking their

    length (for example strcpy(), strcat(), sprintf() and gets()) are insecure. Since many string

    functions perform no bounds checking on the buffer being appended or copied (whichever

    may be the case), they are susceptible to stack smashing attacks. In addition to replacing

    vulnerable functions it is also essential to check shell environment pointers and excessive

    command line arguments for invalid data. Stack smashers are creative and often hide shell

    code and other crucial exploit information in excessive command line arguments or

    environment variables. Thus, securing source code must be a comprehensive process to be

    effective, and all avenues of unauthorized input must be inspected and the input rejected

    if invalid.
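
    As a small illustration of such a source-level fix, an unchecked strcpy() call can be

    replaced by a copy that checks the length first. The sketch below is one possible shape,

    not a standard function: checked_copy is a hypothetical helper, and rejecting over-long

    input (instead of truncating it) is a design choice assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `src` into `dst` only if it fits, including the terminating '\0'.
 * Returns 0 on success and -1 if the input is too long, in which case
 * `dst` is left as an empty string.
 * `checked_copy` is a hypothetical helper name, not a library function. */
int checked_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';               /* reject rather than silently truncate */
        return -1;
    }
    memcpy(dst, src, len + 1);       /* len + 1 <= dst_size, so this fits */
    return 0;
}
```

    A privileged program would apply the same check to every command line argument and

    environment variable it copies, and stop with an error when the call returns -1.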

    6.2 Modifying the language and/or compiler:

    Many different language based and compiler based approaches have been adopted to

    guard against stack smashing vulnerabilities.

    A decentralized approach to preventing stack-smashing vulnerabilities is to

    redesign or modify how the C compiler on a given UNIX operating system

    handles vulnerable functions. However, it is important to note that, in most

    cases, these modifications to the C programming language are not trivial and

    involve root-level changes to the concepts and methodologies behind the

    language.

    A simple approach of this nature involves changes to the C compiler. An

    advantage of this approach is that it does not redesign the language itself. That is,

    it encourages secure programming without changing the code or its performance.

    A middle-of-the-road approach of this nature involves slight modifications to the

    compiler, which would modify only the dangerous functions in the C library

    and perform a stack integrity check before referencing the appropriate return

    value. If the integrity check fails, it would simply print a warning message and

    exit the affected program. The main disadvantage to this approach is that all

    dangerous functions would suffer a significant performance penalty, and like the

    previous approach, this does not consider possible bugs in programmer-defined

    functions since it is confined to the system libraries. Also, the code needed to

    implement these changes is written in assembly language, which is architecture

    dependent. It follows that the approach cannot be ported to other CPU

    architectures without being rewritten.

    An extreme approach to solving the problem of buffer overflows is to implement

    bounds checking. While this sounds the most foolproof and convenient, it is

    potentially the most dangerous. Having static bounds checking would reduce C's

    flexibility, simplicity and efficiency.

    To avoid this, another approach is to modify the way pointers are defined and

    manipulated in C. According to this new approach, a pointer would be declared

    by giving three parameters, the pointer itself and the upper and lower bounds of

    the address space which can be accessed using it. Separate bounds checking

    would then be unnecessary. In spite of this advantage, giving the compiler the

    additional information about the upper and lower bounds of the pointer address

    space would generate a sizeable overhead. This would increase the execution

    time by a factor of approximately 10 and the time for register allocation by a

    factor of 3.

    A unique approach to modifying the compiler in this manner was taken by

    Richard Jones and Paul Kelly at Imperial College in July 1995. Their approach

    involved modifying the compiler to perform the same type of bounds checking.

    However, the uniqueness lay in the fact that this involved no changes to

    the design of C or the representation of pointers in the language. Furthermore,

    there was an option to turn the bounds checking mode on or off in a given

    program. Thus not all programs suffered the overhead generated by the extra

    code added.

    Every pointer is represented with a new base pointer, k, that is derived from the original

    pointer, p, by using the formula p+2*k+1. Only one pointer is valid for a given region,

    and one can check whether a pointer arithmetic expression is valid by finding its base

    pointer's storage region. This is checked again to ensure that the expression's result

    points to the same storage region. In their implementation Jones and Kelly modified the

    front end of the GNU project's C compiler, gcc. Code was added to check pointer

    arithmetic and use, and to maintain a table of known allocated storage regions using

    splay trees for efficiency. The performance statistics are slightly unfavourable, and the

    change operates at a low level: it involves patching and recompiling the existing C

    compiler and its libraries. Furthermore, all previously compiled binaries must be deleted

    and recompiled with the new libraries. Once this is done, all binaries on the system will

    run with the checks in place. The performance penalties depend on the workload, as is

    shown in the statistics of 2 typical algorithms: a recursive Fibonacci generation and a

    pointer-intensive matrix multiplication.

    nfib (dumb doubly-recursive Fibonacci): no slowdown.

    o Execution time: same.

    o Compile-time: slowdown of 3 (very small)

    o Executable size: much larger due to inclusion of library.

    Matrix multiply (ikj, using array subscripting):

    o Execution time: slowdown of around 30 compared to unoptimised.

    o Compile-time: slowdown of around 2.

    o Executable size: roughly the same.

    In short, modifying the C language and its compiler involves making changes at a very

    non-trivial level. This, as we have seen, can lead to performance penalties of varying

    degrees, but considering the security threats that buffer overflows pose, some penalty

    should be tolerable.
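
    The idea of a pointer that carries its own upper and lower bounds, mentioned earlier in

    this section, can be imitated in ordinary C to see where the overhead comes from. The

    sketch below is only an illustration under stated assumptions: the type bounded_ptr and

    the functions bp_make, bp_advance and bp_store are hypothetical names, and a real

    compiler-based scheme would generate such checks automatically instead of relying on

    explicit calls.

```c
#include <assert.h>
#include <stddef.h>

/* A pointer together with the bounds of the region it may access.
 * All names here are illustrative; they do not come from any compiler. */
typedef struct {
    char *ptr;    /* current position */
    char *low;    /* lowest valid address */
    char *high;   /* one past the highest valid address */
} bounded_ptr;

bounded_ptr bp_make(char *base, size_t size)
{
    assert(base != NULL);
    bounded_ptr bp = { base, base, base + size };
    return bp;
}

/* Move the pointer by `delta` bytes; refuse to leave [low, high]. */
int bp_advance(bounded_ptr *bp, ptrdiff_t delta)
{
    ptrdiff_t pos = bp->ptr - bp->low;      /* distance from lower bound */
    ptrdiff_t size = bp->high - bp->low;    /* size of the whole region  */
    if (delta < -pos || delta > size - pos) {
        return -1;
    }
    bp->ptr += delta;
    return 0;
}

/* Store `value` at ptr[index]; return -1 instead of writing out of bounds. */
int bp_store(bounded_ptr bp, size_t index, char value)
{
    size_t room = (size_t)(bp.high - bp.ptr);
    if (index >= room) {
        return -1;
    }
    bp.ptr[index] = value;
    return 0;
}
```

    Every access now costs an extra comparison and every pointer occupies three words

    instead of one, which gives a feeling for why such schemes slow programs down.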

    6.3 OS Kernel Stack execution privilege:

    The most centralized approach to preventing some stack smashing vulnerabilities

    involves modifying the operating system kernel's code segment limit so that it does not

    cover the actual stack space. This effectively removes stack execution permission. It has a

    fundamental advantage over other countermeasures: as the most centralized method of

    limiting stack smashing vulnerabilities, it requires no recompilation of the C libraries or

    the compiler; only the operating system kernel need be recompiled. A practical

    implementation of this concept on the Linux operating system is described below. The

    description touches on the details of implementation as well as some of the problems.

    To remove stack execution privilege in UNIX, the stack that the operating system

    allocates for the program is marked as non-executable. Stack smashing exploits depend

    on an executable stack when returning into a memory address that holds code which

    executes an interactive shell. By removing this functionality from the system, some stack

    smashing vulnerabilities can be stopped. A patch removing stack execution permission

    was written for the Linux operating system. This patch changes the kernel's code segment

    limit using a new descriptor, so that it does not cover the actual stack space, effectively

    removing stack execution privilege. Although the patch is not difficult to compile into a

    kernel and test, one must be aware of the potential difficulties with this method. First,

    nested function calls or trampoline functions do not work properly with patched kernels.

    Furthermore, signal handler returns in the Linux operating system require an executable

    stack, and signal handlers are absolutely crucial in an operating system. A system with a

    non-executable stack also hinders Objective-C development efforts, and functional

    languages might be affected as well. Moreover, every program contains code that

    performs fundamental operations such as saving and restoring values from CPU registers

    and performing system calls. An attacker can overwrite the return address so that it points

    at such existing code rather than at shell code placed on the stack. In contrast to the

    stack smashing exploits currently available, an attack such as this would be impossible to

    prevent by changing the stack execution privilege. In other words, removing the stack

    execution permission only prevents today's stack smashing exploits from working

    properly. As exploits become more sophisticated, stack execution bits may have little or

    no relevance to the exploit. As an aside, this type of protection can also be implemented

    in CPU hardware. New system architectures could simply have multiple stacks: one for

    call frames, and one for automatic storage. In conclusion, by removing stack execution

    in the system kernel, one can attempt to stop the stack-smashing problem at the source.

    However, this approach suffers in implementation because the necessary code is

    non-portable, and because the behavior of standard compiler features and operating

    system signal handling is modified and may become unpredictable. In addition, this

    approach is not proven to stop more sophisticated stack smashing exploits.
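
    Whether a running Linux process actually has a non-executable stack can be inspected

    from user space. The sketch below assumes a Linux system that exposes /proc/self/maps,

    where each line has the form "start-end perms offset dev inode path" and the main

    stack is labelled [stack]. The function names are hypothetical and the code only reports

    the permission bits; it does not change them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Given one line of /proc/<pid>/maps, return 1 if it describes the stack
 * mapping with execute permission, 0 if it is the stack without execute
 * permission, and -1 if the line is not the stack mapping at all.
 * The function name is illustrative. */
int stack_line_executable(const char *line)
{
    assert(line != NULL);

    char perms[5] = "";
    if (strstr(line, "[stack]") == NULL) {
        return -1;
    }
    /* Skip the address range, then read the permission field, e.g. "rw-p". */
    if (sscanf(line, "%*s %4s", perms) != 1) {
        return -1;
    }
    return strchr(perms, 'x') != NULL ? 1 : 0;
}

/* Linux only: scan the mappings of the current process.
 * Returns 1 or 0 as above, or -1 if the information is unavailable. */
int stack_is_executable(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL) {
        return -1;
    }
    char line[512];
    int result = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        int r = stack_line_executable(line);
        if (r != -1) {
            result = r;
            break;
        }
    }
    fclose(f);
    return result;
}
```

    On a system where the protection is active, stack_is_executable() would be expected to

    return 0, but the result depends on the kernel and on how the program was built.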

    6.4 Safer C library support

    A much more robust alternative would be to provide safe versions of the C library

    functions on which the attack relies to overwrite the return address. This idea seems to

    have occurred independently to several people. Alexander Snarskii seems to have been

    the first one to think of it. He implemented it for the FreeBSD version of Unix and

    offered it to the development group of FreeBSD. His explanation of the method was

    unfortunately a little obscure, and either he may not have fully realized the true power of

    his method, or if he did, he certainly did not elaborate on it in his note. Thus Snarskii's

    idea had less impact than it should have had. Baratloo, Tsai, and Singh from Bell Labs

    independently rediscovered the idea, and wrote a much more substantial white paper

    about it. This author also rediscovered this defense independently. The Bell Labs group

    implemented safe versions of the vulnerable functions in a library called LibSafe, which

    can be freely downloaded from their site.

    Can we replace a vulnerable function in the C library by a safer version? We will discuss

    the idea in terms of strcpy(), but it will become readily apparent that the method

    generalizes to any of the other vulnerable string manipulation functions. At first sight a

    safer version of strcpy() appears impossible because strcpy() does not know the size of

    the buffer that it is copying into. So complete avoidance of overflowing the buffer is not

    possible. Nonetheless, strcpy() has access to the dynamic chain on the stack (the linked

    list of saved frame pointers), and successive dynamic links are like bright markers

    delimiting the activation records of all the currently active functions. The idea is to use

    this information to prevent strcpy() from corrupting the return address or the dynamic

    link fields.

    Table 1: Some of the Problematic C Functions

    Using these markers and the address of the buffer itself strcpy() can first determine

    which activation record contains the buffer, or else that the buffer is not on the stack at

    all. To do this strcpy() finds the interval [a,b] of consecutive dynamic links which

    contains the buffer. The cases in which the buffer is either below the first activation

    record on the stack, or above the last activation record can be handled as special cases

    with appropriate values of either a or b. Once the values of a and b are determined, we

    can compute an upper bound on the size of the buffer. For example, if the buffer grows

    towards the bottom of the stack then |buffer - a| is an upper bound on the size of the

    buffer. This can be used by strcpy() to limit the length of the copied string so that

    neither the dynamic link nor the return address is overwritten. Furthermore, strcpy()

    can detect an attempt to do so, report the problem to syslog, and safely terminate the

    application.
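
    The bound computation just described can be sketched in portable C if the dynamic

    links are supplied as a plain array of addresses. This is a simplified model and not the

    LibSafe source: the names max_safe_length and guarded_strcpy are hypothetical, the

    links are assumed to be sorted by increasing address, and the copy returns an error

    code where the real library would log the event and terminate the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* `links` holds the addresses of the dynamic links (saved frame pointers),
 * sorted by increasing address. Return the number of bytes that may be
 * written starting at `buf` before the next dynamic link is reached, or
 * SIZE_MAX if `buf` lies above every link (not inside a known frame). */
size_t max_safe_length(uintptr_t buf, const uintptr_t *links, size_t n_links)
{
    assert(links != NULL || n_links == 0);

    for (size_t i = 0; i < n_links; i++) {
        if (links[i] > buf) {
            return (size_t)(links[i] - buf);   /* |buffer - a| from the text */
        }
    }
    return SIZE_MAX;
}

/* Copy `src` to `dst` unless that would reach the dynamic link.
 * Returns 0 on success and -1 if the copy was refused. */
int guarded_strcpy(char *dst, const char *src,
                   const uintptr_t *links, size_t n_links)
{
    size_t limit = max_safe_length((uintptr_t)dst, links, n_links);
    size_t needed = strlen(src) + 1;           /* include the '\0' */
    if (needed > limit) {
        return -1;
    }
    memcpy(dst, src, needed);
    return 0;
}
```

    Note that the limit is only an upper bound: a copy that stays below it can still run past

    the end of the buffer into neighbouring local variables of the same activation record.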

    LibSafe does not replace the standard C library. The method relies instead on the loader

    searching LibSafe before the standard C library, so that the safe functions are used

    instead of the standard library functions. This scheme is more flexible than replacing the

    functions in the C library itself. For example, it is possible to have one program use the

    C library functions and another use the LibSafe versions. By setting appropriate

    environment variables LibSafe can be installed as the default library. But from a security

    perspective, there seems to be little reason to keep the vulnerable functions installed on

    the system, so the usefulness of this extra flexibility is somewhat questionable.

    This defense has several advantages. It is effective against all buffer overflow attacks that

    attempt to smash the stack in which the target program uses one of the vulnerable C

    library functions to copy into the buffer. The method does not totally prevent buffer

    overflows. It can't, because it does not know the true size of the buffer. It is still possible

    to overflow areas between the buffer and the dynamic link. But the critical return address

    and the dynamic link fields are protected from being overwritten.

    The method fails to provide any protection against heap based buffer overflow attacks

    (see below), or attacks which do not need to hijack control by overwriting the return

    address. Both of these kinds of attack, however, are much harder to pull off, and

    consequently much rarer. The method would also fail to protect a program that does not

    use the standard C library functions to copy into the buffer. For example, if the target

    program contains custom code to copy the string into the buffer it will not be protected.

    However, it seems clear that few programs will have such custom code. Generally

    speaking it is considered to be bad programming practice to "reinvent the wheel", so

    programmers are encouraged to use the standard libraries.

    Though programs that rely on custom code may contain buffer overflow vulnerabilities

    just as much as those that use the standard C library, they will be less likely to be

    detected. Because of this they will enjoy some immunity from attack. This is security

    through obscurity, which in general is not a good way to secure a system. Nonetheless it

    is of some security value.

    The overhead of the safe functions is negligible, and the cost of installing the library and

    configuring the system to use it is very low. Another advantage is that it works with the

    binaries of the target program, and does not require access to their source code. Finally, it

    can be deployed without having to wait for the vendor to react to security threats, which

    is a very desirable feature. It is a much more robust defense than disabling stack

    execution. Though we have discussed variants of attacks against which it will offer no

    protection, it is very effective against the class of attacks that it is designed for, and it

    cannot be easily circumvented. The attacker has no way of interfering with the detection

    of the buffer overflow attack, because this occurs before the attacker has a chance to

    hijack control. We conclude that overall, this defense offers a very significant

    improvement of the security of a system at very low cost. In our opinion it is a sure

    winner.

    We also mention Andrey Kolishak's BOWall protection. This is available for Windows

    NT systems, with full source. This solution has some similarities to both the safer library

    approach, and to the methods to be presented in the next Section.

    Kolishak's approach is similar to the others in this Section, because it works by replacing

    the DLLs that contain the vulnerable library functions with a safer library version.

    However, unlike LibSafe or Snarskii's method, it seems to be a buffer overflow detection

    system, which is more similar to the methods of the next Section. It works by saving the

    return address when the function enters, and checking it before actually returning. If

    corruption of the return address is detected it does not return, so hijacking of control is

    prevented. Kolishak also has a second component of BOWall which relies on some

    specific Windows NT security features.
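
    The save-and-check idea can be modelled in a few lines of C. The sketch below is a

    simulation only and makes no claim about how BOWall is implemented: return addresses

    are passed in as plain integers, and the names shadow_stack, shadow_enter and

    shadow_leave are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

/* Private copy of return addresses, kept away from the ordinary stack.
 * All names in this sketch are illustrative. */
typedef struct {
    uintptr_t saved[SHADOW_DEPTH];
    size_t depth;
} shadow_stack;

/* On function entry: remember the return address. Returns -1 if full. */
int shadow_enter(shadow_stack *s, uintptr_t return_address)
{
    assert(s != NULL);
    if (s->depth == SHADOW_DEPTH) {
        return -1;
    }
    s->saved[s->depth++] = return_address;
    return 0;
}

/* Before returning: compare the address about to be used with the saved
 * copy. Returns 1 if it is intact, 0 if it was changed or nothing was saved. */
int shadow_leave(shadow_stack *s, uintptr_t return_address)
{
    assert(s != NULL);
    if (s->depth == 0) {
        return 0;
    }
    return s->saved[--s->depth] == return_address;
}
```

    A protected function would call shadow_enter() first and refuse to return, for example

    by terminating the process, whenever shadow_leave() reports 0.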

    7. Future Scope

    None of the countermeasures is perfect.

    The earlier stack overruns are addressed in the design process, the better.

    Systematic work on removing security-relevant buffer overflows is a relatively recent effort.

    Further research on formal methods for software security is needed.

    8. Conclusion:

    Stack smashing attacks are among the most common ways to gain privileged access to a

    UNIX system. Prevention of these attacks is one of the primary concerns of the OS and

    networking community. The expertise of programmers who write privileged code, as well

    as that of the UNIX gurus, will be crucial in building operating systems and software that

    are resistant to buffer overflow attacks. With the combined efforts of these different

    groups, stack smashing and indeed all other buffer overflow vulnerabilities can be defeated.

    9. References:

    1. Stefan Axelsson, A Comparison of the Security of Windows NT and UNIX, 1998. http://www.securityfocus.com/data/library/nt-vs-unix.pdf

    2. Arash Baratloo, Timothy Tsai, and Navjot Singh, Libsafe: Protecting Critical Elements of Stacks. http://www.securityfocus.com/library/2267 http://www.bell-labs.com/org/11356/libsafe.html

    3. Bulba and Kil3r, Bypassing StackGuard and Stackshield, Phrack Magazine 56 No. 5, 1999. http://phrack.infonexus.com/search.phtml?view&article=p56-5

    4. Crispin Cowan, Perry Wagle, Calton Pu, Steve Beattie, and Jonathan Walpole, Buffer Overflows: Attacks and Defenses for the Vulnerability of the Decade, in DARPA Information Survivability Conference and Expo 2000. http://www.cse.ogi.edu/DISC/projects/immunix/publications.html http://www.securityfocus.com/library/1674

    5. David Curry, Improving the Security of your Unix System, 1990. http://www.securityfocus.com/library/1913

    6. Drew Dean, Edward W. Felten, and Dan S. Wallach, Java Security: From HotJava to Netscape and Beyond, in Proc. of the IEEE Symp. on Security and Privacy, 1996. http://www.cs.princeton.edu/sip/pub/secure96.html

    7. Casper Dik, Non-Executable Stack for Solaris, posted to comp.security.unix January 2, 1997. http://x10.dejanews.com/

    8. DilDog, The TAO of Windows Buffer Overflow, 1998. http://www.cultdeadcow.com/cDc_files/cDc-351/
