Chapter 2.2 data structures

DATA STRUCTURES AND ALGORITHM

Transcript of Chapter 2.2 data structures

Page 1: Chapter 2.2 data structures

DATA STRUCTURES AND ALGORITHM

Page 2: Chapter 2.2 data structures

In programming, the term data structure refers to a scheme for organizing related pieces of information. The basic types of data structures include:

FILES, LISTS, ARRAYS, RECORDS, TREES AND TABLES

DATA STRUCTURES

Page 3: Chapter 2.2 data structures

FILES

A file is a collection of data or information that has a name, called the filename. Almost all information stored in a computer must be in a file. There are many different types of files: data files, text files, program files, directory files, and so on. Different types of files store different types of information. For example, program files store programs, whereas text files store text.

Page 4: Chapter 2.2 data structures

ARRAYS

In programming, an array is a series of objects, all of which are the same size and type. Each object in an array is called an array element. For example, you could have an array of integers, an array of characters, or an array of anything that has a defined data type. The important characteristics of an array are:

• Each element has the same data type (although the elements may have different values).

• The entire array is stored contiguously in memory (that is, there are no gaps between elements).

Arrays can have more than one dimension. A one-dimensional array is called a vector; a two-dimensional array is called a matrix.
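As a minimal sketch of the array characteristics described on this slide, the Python snippet below keeps a 2 × 3 matrix in a single typed, contiguous buffer from the standard-library array module. The row-major layout and the helper name flat_index are illustrative assumptions; the slides do not prescribe either.

```python
from array import array

ROWS, COLS = 2, 3

# One contiguous buffer of signed integers ('i'): every element has the same type.
matrix = array("i", [1, 2, 3,
                     4, 5, 6])


def flat_index(row, col, cols=COLS):
    """Map (row, col) to a position in the flat buffer (row-major order, assumed)."""
    return row * cols + col


# The element at row 1, column 2 is the last value in the buffer.
print(matrix[flat_index(1, 2)])  # prints 6
print(matrix.itemsize)           # bytes per element; platform-dependent
```

A vector needs no index arithmetic at all; the matrix case only adds the row * cols + col mapping on top of the same flat storage.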

Page 5: Chapter 2.2 data structures

RECORDS

Page 6: Chapter 2.2 data structures

TREE STRUCTURES

Page 7: Chapter 2.2 data structures

TABLES

A table is data arranged in rows and columns. A spreadsheet, for example, is a table. In relational database management systems, all information is stored in the form of tables.

Page 8: Chapter 2.2 data structures

LISTS

In computer science, a list or sequence is an abstract data structure that implements an ordered collection of values, where the same value may occur more than once. An instance of a list is a computer representation of the mathematical concept of a finite sequence, that is, a tuple. Each instance of a value in the list is usually called an item, entry, or element of the list; if the same value occurs multiple times, each occurrence is considered a distinct item.

[Figure: a singly linked list structure implementing a list with 3 integer elements.]

The name list is also used for several concrete data structures that can be used to implement abstract lists, especially linked lists.
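A singly linked list holding three integers can be sketched in Python as follows. The names Node, from_values and to_values and the sample values are hypothetical, chosen only for illustration; the repeated value 12 shows that each occurrence is a distinct item.

```python
class Node:
    """One cell of a singly linked list: a value plus a link to the next cell."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_values(values):
    """Build a linked list from an iterable and return its head (None if empty)."""
    head = None
    for value in reversed(list(values)):
        head = Node(value, head)
    return head


def to_values(head):
    """Walk the links from head to the end and collect the values in order."""
    out = []
    node = head
    while node is not None:
        out.append(node.value)
        node = node.next
    return out


head = from_values([12, 99, 12])
print(to_values(head))  # prints [12, 99, 12]
```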

Page 9: Chapter 2.2 data structures

ALGORITHM

In mathematics, computing, and related subjects, an algorithm is an effective method for solving a problem using a finite sequence of instructions. Algorithms are used for calculation, data processing, and many other fields.

Each algorithm is a list of well-defined instructions for completing a task. Starting from an initial state, the instructions describe a computation that proceeds through a well-defined series of successive states, eventually terminating in a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate randomness.

A partial formalization of the concept began with attempts to solve the Entscheidungsproblem (the "decision problem") posed by David Hilbert in 1928. Subsequent formalizations were framed as attempts to define "effective calculability" or "effective method" [2]; those formalizations included the Gödel-Herbrand-Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's "Formulation 1" of 1936, and Alan Turing's Turing machines of 1936–7 and 1939.
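To make the idea of a finite sequence of instructions moving through successive states concrete, here is a minimal Python sketch of Euclid's algorithm for the greatest common divisor. Euclid's algorithm is chosen here only as a familiar illustration; it does not appear in the slides.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    # The state is the pair (a, b); each iteration moves to the next state.
    while b != 0:
        a, b = b, a % b
    # Final state: b == 0 and a holds the answer.
    return a


print(gcd(1071, 462))  # prints 21
```

The successive states for this input are (1071, 462), (462, 147), (147, 21) and (21, 0), after which the computation terminates.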

Page 10: Chapter 2.2 data structures
Page 11: Chapter 2.2 data structures

PSEUDOCODE

• Pseudocode is a compact and informal high-level description of a computer programming algorithm that uses the structural conventions of a programming language, but is intended for human reading rather than machine reading. Pseudocode typically omits details that are not essential for human understanding of the algorithm, such as variable declarations, system-specific code and subroutines.
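The contrast between pseudocode and real code can be sketched as below: the comment block is one possible pseudocode description, and the Python function underneath is one way to turn it into something a machine can run. The function name find_largest and the notation used in the pseudocode are illustrative assumptions.

```python
# Pseudocode (informal, for human readers):
#
#   largest <- first item of the list
#   for each remaining item:
#       if item > largest then largest <- item
#   return largest


def find_largest(items):
    """Return the largest element of a non-empty list."""
    if not items:
        # A detail the pseudocode leaves out: what to do with an empty input.
        raise ValueError("items must not be empty")
    largest = items[0]
    for item in items[1:]:
        if item > largest:
            largest = item
    return largest


print(find_largest([3, 9, 4]))  # prints 9
```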

Page 12: Chapter 2.2 data structures

ANALYSIS OF ALGORITHM

• To analyze an algorithm is to determine the amount of resources (such as time and storage) necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or complexity of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).
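As a rough illustration of relating input length to number of steps, the sketch below counts the comparisons made by a linear search. Treating one comparison as one step is an assumption made for this example, and the function name is hypothetical.

```python
def linear_search_steps(items, target):
    """Return (index, comparisons); index is -1 when target is absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == target:
            return index, comparisons
    return -1, comparisons


# Worst case (target absent): the number of comparisons equals the input length n.
for n in (10, 100, 1000):
    _, steps = linear_search_steps(list(range(n)), -1)
    print(n, steps)
```

Under this step definition the worst-case time complexity of linear search grows in direct proportion to n.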

Page 13: Chapter 2.2 data structures

ANALYSIS OF ALGORITHM

• Algorithm analysis is an important part of the broader field of computational complexity theory, which provides theoretical estimates for the resources needed by any algorithm that solves a given computational problem. These estimates provide insight into reasonable directions of search for efficient algorithms.

Page 14: Chapter 2.2 data structures

ANALYSIS OF ALGORITHM

• In theoretical analysis of algorithms it is common to estimate their complexity in the asymptotic sense, i.e., to estimate the complexity function for arbitrarily large input. Big O notation, omega notation and theta notation are used to this end. For instance, binary search is said to run in a number of steps proportional to the logarithm of the length of the list being searched, or in O(log n), colloquially "in logarithmic time". Usually asymptotic estimates are used because different implementations of the same algorithm may differ in efficiency. However, the efficiencies of any two "reasonable" implementations of a given algorithm are related by a constant multiplicative factor called a hidden constant.
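The logarithmic behaviour of binary search can be observed with a small sketch that counts how many elements are examined. This is one common iterative formulation, not necessarily the one the slides have in mind, and the probe counter is added purely for illustration.

```python
def binary_search(sorted_items, target):
    """Return (index, probes); index is -1 when target is absent."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] == target:
            return mid, probes
        if sorted_items[mid] < target:
            lo = mid + 1   # discard the left half
        else:
            hi = mid - 1   # discard the right half
    return -1, probes


data = list(range(1_000_000))
_, probes = binary_search(data, -1)  # search for a value that is not present
print(probes)  # no more than 20 probes for a million elements
```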

Page 15: Chapter 2.2 data structures

ANALYSIS OF ALGORITHM

• Exact (not asymptotic) measures of efficiency can sometimes be computed, but they usually require certain assumptions concerning the particular implementation of the algorithm, called a model of computation. A model of computation may be defined in terms of an abstract computer, e.g., a Turing machine, and/or by postulating that certain operations are executed in unit time. For example, if the sorted list to which we apply binary search has n elements, and we can guarantee that each lookup of an element in the list can be done in unit time, then at most log₂ n + 1 time units are needed to return an answer.
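The log₂ n + 1 bound can be checked with a short worked calculation. The sketch assumes the usual midpoint version of binary search and the unit-time lookup model described above: after one lookup of the middle element, the larger remaining part holds n // 2 elements, so the worst case satisfies W(n) = 1 + W(n // 2) with W(0) = 0. The function name is hypothetical.

```python
import math


def worst_case_probes(n):
    """Worst-case number of unit-time lookups for binary search on n sorted elements.

    Evaluates the recurrence W(n) = 1 + W(n // 2), W(0) = 0, iteratively.
    """
    probes = 0
    while n > 0:
        probes += 1
        n //= 2
    return probes


for n in (1, 2, 7, 8, 1000):
    bound = math.log2(n) + 1
    # The exact worst case never exceeds the bound quoted in the text.
    assert worst_case_probes(n) <= bound
    print(n, worst_case_probes(n), round(bound, 2))
```

The recurrence gives exactly floor(log₂ n) + 1 lookups, which is why the bound is met with equality when n is a power of two.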

Page 16: Chapter 2.2 data structures

ANALYSIS OF ALGORITHM

• Time efficiency estimates depend on what we define to be a step. For the analysis to correspond usefully to the actual execution time, the time required to perform a step must be guaranteed to be bounded above by a constant. One must be careful here; for instance, some analyses count an addition of two numbers as one step. This assumption may not be warranted in certain contexts. For example, if the numbers involved in a computation may be arbitrarily large, the time required by a single addition can no longer be assumed to be constant.
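Why addition stops being a single step for arbitrarily large numbers can be seen in a sketch of schoolbook addition on digit lists. The little-endian digit representation and the name add_digits are assumptions made for this example; the point is only that the work grows with the number of digits.

```python
def add_digits(a, b):
    """Add two non-negative numbers given as little-endian lists of decimal digits.

    Returns (sum_digits, steps), where steps counts single-digit additions.
    """
    result, carry, steps = [], 0, 0
    for i in range(max(len(a), len(b))):
        da = a[i] if i < len(a) else 0
        db = b[i] if i < len(b) else 0
        carry, digit = divmod(da + db + carry, 10)
        result.append(digit)
        steps += 1
    if carry:
        result.append(carry)
    return result, steps


# 999 + 1 = 1000, with digits stored least-significant first.
print(add_digits([9, 9, 9], [1]))  # prints ([0, 0, 0, 1], 3)
```

Each single-digit addition is bounded by a constant, so it is a reasonable step; the full addition takes a number of such steps proportional to the length of the longer operand.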

Page 17: Chapter 2.2 data structures

COMPUTATIONAL COMPLEXITY

• DEVELOPED BY JURIS HARTMANIS AND RICHARD STEARNS

• IT IS USED TO COMPARE THE EFFICIENCY OF ALGORITHMS; IT IS A MEASURE OF THE DEGREE OF DIFFICULTY OF AN ALGORITHM.

• TO EVALUATE ALGORITHM EFFICIENCY, REAL TIME UNITS SUCH AS MICROSECONDS AND NANOSECONDS SHOULD NOT BE USED. RATHER, LOGICAL UNITS THAT EXPRESS A RELATIONSHIP BETWEEN THE SIZE N OF A FILE OR AN ARRAY AND THE AMOUNT OF TIME T REQUIRED TO PROCESS THE DATA SHOULD BE USED.

Page 18: Chapter 2.2 data structures

ARRAYS

• In computer science, an array data structure or simply array is a data structure consisting of a collection of elements (values or variables), each identified by one or more integer indices, stored so that the address of each element can be computed from its index tuple by a simple mathematical formula. For example, an array of 10 integer variables, with indices 0 through 9, may be stored as 10 words at memory addresses 2000, 2004, 2008, … 2036; so that the element with index i has address 2000 + 4 × i.
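The address formula from this example can be written out directly. The constants below come from the text (base address 2000, 4 bytes per integer); the function name element_address is a hypothetical helper.

```python
BASE_ADDRESS = 2000  # address of the element with index 0, from the example above
WORD_SIZE = 4        # bytes per integer, from the example above


def element_address(i, base=BASE_ADDRESS, size=WORD_SIZE):
    """Address of the element with index i in a one-dimensional array."""
    return base + size * i


print([element_address(i) for i in range(10)])
# prints [2000, 2004, 2008, 2012, 2016, 2020, 2024, 2028, 2032, 2036]
```

Because the address is a simple function of the index, any element can be reached in the same amount of time regardless of its position.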

Page 19: Chapter 2.2 data structures

ARRAY

• Array structures are the computer analog of the mathematical concepts of vector, matrix, and tensor. Indeed, an array with one or two indices is often called a vector or matrix structure, respectively. Arrays are often used to implement tables, especially lookup tables, so the word table is sometimes used as a synonym of array.

Page 20: Chapter 2.2 data structures

INDEX

• In computer science, an index can be:

• an integer which identifies an array element

• a data structure that enables sublinear-time lookup

Page 21: Chapter 2.2 data structures

INDEX

• ARRAY ELEMENT IDENTIFIER

• When data objects are stored in an array, individual objects are selected by an index, which is usually a non-negative scalar integer. Indices are also called subscripts.

• There are three ways in which the elements of an array can be indexed:

• 0 (zero-based indexing): The first element of the array is indexed by a subscript of 0.

• 1 (one-based indexing): The first element of the array is indexed by a subscript of 1.

• n (n-based indexing): The base index of an array can be freely chosen. Programming languages that allow n-based indexing usually also allow negative index values, and other scalar data types such as enumerations or characters may be used as an array index.
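Python lists are zero-based, but n-based indexing can be imitated with a thin wrapper that subtracts a chosen base from every index. The class name BasedArray is hypothetical and the sketch is only meant to show the idea, not how any particular language implements it.

```python
class BasedArray:
    """Fixed-size array whose first element has a caller-chosen index (the base)."""

    def __init__(self, base, values):
        self.base = base
        self._items = list(values)

    def _offset(self, index):
        # Translate the external index into a zero-based position.
        offset = index - self.base
        if not 0 <= offset < len(self._items):
            raise IndexError(f"index {index} out of range")
        return offset

    def __getitem__(self, index):
        return self._items[self._offset(index)]

    def __setitem__(self, index, value):
        self._items[self._offset(index)] = value


temps = BasedArray(-2, [10, 20, 30, 40, 50])  # valid indices: -2 .. 2
print(temps[-2], temps[0], temps[2])  # prints 10 30 50
```

Choosing base 0 or base 1 gives zero-based and one-based indexing as special cases of the same wrapper.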

Page 22: Chapter 2.2 data structures

ARRAYS

• Arrays are among the oldest and most important data structures. They are used by almost every program and serve to implement many other data structures, such as lists and strings. They effectively exploit the addressing machinery of computers; indeed, in most modern computers (and many external storage devices), the memory is a one-dimensional array of words, whose indices are their addresses. Processors, especially vector processors, are often optimized for array operations.

Page 23: Chapter 2.2 data structures

ARRAYS

• The terms array and array structure are often used to mean array data type, a kind of data type provided by most high-level programming languages that consists of a collection of values or variables that can be selected by one or more indices computed at run-time. Array types are often implemented by array structures; however, in some languages they may be implemented by hash tables, linked lists, search trees, or other data structures.

Page 24: Chapter 2.2 data structures

ARRAYS

• The terms are also used, especially in the description of algorithms, to mean associative array or "abstract array", a theoretical computer science model (an abstract data type or ADT) intended to capture the essential properties of arrays.

Page 25: Chapter 2.2 data structures

ARRAYS

• In computer science, array programming languages (also known as vector or multidimensional languages) generalize operations on scalars to apply transparently to vectors, matrices, and higher dimensional arrays.

• Array programming primitives concisely express broad ideas about data manipulation. The level of conciseness can be dramatic in certain cases: it is not uncommon to find array programming language one-liners whose Java equivalent would require more than a couple of pages of code.

• APL, designed by Ken Iverson, was the first programming language to provide array programming capabilities. The mnemonic APL refers to the title of his seminal book "A Programming Language" and not to arrays per se. Iverson's contribution to rigor and clarity was probably more important than the simple extension of dimensions to functions.
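The idea of generalizing a scalar operation so that it applies transparently to vectors and matrices can be imitated in plain Python. The helper lift below is a hypothetical illustration using nested lists; it is not how APL or any array language is actually implemented.

```python
import operator


def lift(op):
    """Turn a two-argument scalar function into one that also accepts nested lists."""
    def lifted(x, y):
        x_is_list, y_is_list = isinstance(x, list), isinstance(y, list)
        if x_is_list and y_is_list:
            if len(x) != len(y):
                raise ValueError("shape mismatch")
            return [lifted(a, b) for a, b in zip(x, y)]
        if x_is_list:
            return [lifted(a, y) for a in x]   # scalar y applied to every element
        if y_is_list:
            return [lifted(x, b) for b in y]   # scalar x applied to every element
        return op(x, y)
    return lifted


add = lift(operator.add)

print(add(1, 2))                                    # prints 3
print(add([1, 2, 3], 10))                           # prints [11, 12, 13]
print(add([[1, 2], [3, 4]], [[10, 20], [30, 40]]))  # prints [[11, 22], [33, 44]]
```

The same add works on scalars, vectors and matrices without the caller writing any loops, which is the source of the conciseness described above.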