SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of...

15
SRI VENKATESWARA COLLLEGE OF ENGINEERING AND TECHNOLOGY , THIRUPACHUR DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK CS6402 DATABASE MANAGEMENT SYSTEMS UNIT I PART A 1. What is database? A database is a collection of related data with some inherent meaning,representing some aspect of real world and which is designed,built and populated with data for a specific purpose. 2. What is a Database Management system? A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS collectively known as database system. 3. What are the disadvantages of file processing system? Data redundancy and inconsistency Difficulty in Accessing Data Data Isolation Integrity problems Atomicity problems Concurrent Access Anomalies Security problems 4. What are the advantages of a a DBMS? Data Independence , Efficient Data Access , Data integrity and security , Data administration , concurrent access and crash recovery. 5. Define Database Administrator(DBA). The DBA is responsible for authorizing access to the database for coordinating and monitoring its use and for acquiring software and hardware resources as needed. The DBA is accountable for problems such as breach of security or poor system response time.In large organizations , the DBA is assisted by a staff that hat helps carry out these functions. 6. What is meant by end users? End users are the people whose jobs require access to the database for querying , updating and generating reports. 7. List out any five applications of DBMS. Banking , Airlines , Universities , credit card transactions , Telecommunications . 8. Define Data redundancy. Storing the same data multiple times is called data redundancy. 9. Give the levels of data abstraction Physical level , logical level , view level.

Transcript of SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of...

Page 1: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

SRI VENKATESWARA COLLLEGE OF ENGINEERING AND TECHNOLOGY , THIRUPACHUR

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

QUESTION BANK

CS6402 DATABASE MANAGEMENT SYSTEMS

UNIT I

PART A

1. What is database?

A database is a collection of related data with some inherent meaning,representing some

aspect of real world and which is designed,built and populated with data for a specific

purpose.

2. What is a Database Management system?

A Database Management system is a collection of programs that enables users to create

and maintain a database.

Database and DBMS collectively known as database system.

3. What are the disadvantages of file processing system?

Data redundancy and inconsistency

Difficulty in Accessing Data

Data Isolation

Integrity problems

Atomicity problems

Concurrent Access Anomalies

Security problems

4. What are the advantages of a a DBMS?

Data Independence , Efficient Data Access , Data integrity and security , Data

administration , concurrent access and crash recovery.

5. Define Database Administrator(DBA).

The DBA is responsible for authorizing access to the database for coordinating and

monitoring its use and for acquiring software and hardware resources as needed.

The DBA is accountable for problems such as breach of security or poor system response

time.In large organizations , the DBA is assisted by a staff that hat helps carry out these

functions.

6. What is meant by end users?

End users are the people whose jobs require access to the database for querying ,

updating and generating reports.

7. List out any five applications of DBMS.

Banking , Airlines , Universities , credit card transactions , Telecommunications .

8. Define Data redundancy.

Storing the same data multiple times is called data redundancy.

9. Give the levels of data abstraction

Physical level , logical level , view level.

Page 2: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

10. Define weak and strong entity set.

Weak entity set: Entity set that do not have key attribute of their own are called weak

entity sets.

Strong entity sets : Entity set that has a primary key is termed a strong entity set.

11. What is the need of normalization?

Normailzation of data is a process of analyzing the given relation schema based on their

FDs and primary keys to achieve the desirable properties of

(i) Minimizing redundancy

(ii) Minimizing the insertion , deletion , update anomalies

12. Write a note on functional dependencies.

A functional dependency is denoted by X Y , between two sets of attributes X and Y

that are subsets of R specifies a constraint on the possible tuples that can form a relation

state r of R.

13. What are the components of storage manager.

Authorization and integrity manager

Transaction manager

File manager

Buffer manager

14. Define Data Independence and what are the two levels of independence.

The ability to modify a schema definition in one level without affecting a schema

definition in a next higher level is called data independence.

(i) Physical data independence

(ii) Logical data independence

15. What is data model and give the categories of data model.

A data model is a collection of conceptual tools for describing data , data relationships ,

data semantics and consistency constraints.

Object based data model

Record based data model

Physical data model

16. Define Data Dictionary.

A data dictionary is a data structure which stores meta data about the structure of the

database I.e the schema of the database.

17. Define single valued and multivalued attributes.

Single valued attributes: attributes with a single value for a particular entity are called

single valued attributes.

Multivalued attributes : Attributes with a set of value for a particular entity are

called multivalued attributes.

18. Define null values.

In some cases a particular entity may not have an applicable value for an attribute or if

we do not know the value of an attribute for a particular entity. In these cases null

value is used.

19. Define the terms i) DDL ii) DML

DDL: Data base schema is specified by a set of definitions expressed by a special

language called a data definition language.

DML: A data manipulation language is a language that enables users to access or

manipulate data as organized by the appropriate data model.

Page 3: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

20. What is relationship? Give examples A relationship is an association among several entities.

Example: A depositor relationship associates a customer with each account that

he/she has. 21. Define normalization.

Normalization of data is a process during which unsatisfactory relation schemas are

decomposed by breaking up their attributes into smaller relation shemas that possess desirable

properties. 22. Define 1NF.

1NF states that the domains of attributes must include only atomic values and that the

value of any attribute in a tuple must be a single value from the domain of that attribute. It

disallows multivalued attributes, composite attributes and their combinations.

23. Define 2NF. A relation shema R is in 2NF if every nonprime attribute A in R is fully functionally

dependent on the primary key. A functional dependency X Y is a full functional dependency

ifremoval of any attribute A from X means that the dependency not hold any more; that

-{A}) X Y.

24. Define 3NF. A relation shema R is in 3NF if it is in 2NF and no nonprime attribute of R is

transitively dependent on the primary key. A funtional dependency X Y in arelation shema R is atransitive dependency if ther is a set of attributes Z that is not a subset of any key of R, and both X Y and Z Y hold.

25. What is need for normalization. a. To ensure that the update anomalies do not occur.

b. Normal forms provide a formal frame work for analyzing relation shemas based on

their keys and on the functional dependencies among their attributes.

c. A series of tests that can be carried out on individual relation schemas so that the

relation database can be normalized to any degree. When a test fails , the relation

violating that test must be decomposed into relations that individually meet the

normalization tests.

26. In what way BCNF is different from 3NF? A relation schema R is in BCNF if whenever a functional dependency X Y holds in

R,then X is a superkey of R.The only difference between BCNF and 3NF : the 3NF allows A to be a prime if X is not a superkey, is absent from BCNF.

PART B

1. Explain the purpose of database system.

2. Describe various file system organizations.

3. With a neat diagram, explain the structure of a DBMS.

4.. Explain 1NF, 2NF, 3NF and BCNF with suitable example.

5. What are the pitfalls in relational database design? With a suitable example, explain the

roleof functional dependency in the process of normalization.

6. Construct an E-R diagram for a car-insurance company whose customers own one or

morecars each. Each car has associated with it zero to any number of recorded accidents. State

any assumptions you make.

7. With relevant examples discuss the various operations in Relational Algebras

Page 4: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

8 . Define a functional dependency. List and discuss the six inference rules for functional

dependencies. Give relevant examples.

9. Explain the three different groups of data models with examples.

10. Discuss Join Dependencies and Fifth Normal Form, and explain why 5NF?

UNIT II

1. Define Query Language.

A query is a statement requesting the retrieval of information. The portion of DML that

involves information retrieval is called a query language.

2. What are the parts of SQL language?

The SQL language has several parts: data - definitition language

Data manipulation language

View definition

Transaction control

Embedded SQL

Integrity

Authorization

3. What are the categories of SQL command? SQL commands are divided in to the following categories:

1. Data - Definitition Language

2. Data Manipulation language

3. Data Query Language

4. Data Control Language

5. Data Administration Statements

6. Transaction Control Statements

4. What are the three classes of SQL expression?

SQL expression consists of three clauses: i. Select

ii. From iii. Where

5. List the string operations supported by SQL?

a. Pattern matching Operation

b. Concatenation

c. Extracting character strings

d. Converting between uppercase and lower case letters.

6. List the set operations of SQL?

a. Union

b. Intersect operation

c. The except operation

7. What are aggregate functions? And list the aggregate functions supported by

SQL? Aggregate functions are functions that take a collection of values as input and return a

single value.

Aggregate functions supported by SQL are Average: avg

Minimum: min

Maximum: max

Page 5: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

Total: sum

Count: count

8. What is the use of group by clause? Group by clause is used to apply aggregate functions to a set of tuples. The attributes

given in the group by clause are used to form groups. Tuples with the same value on all

attributes in the group by clause are placed in one group. 9. What is the use of sub queries?

A sub query is a select-from-where expression that is nested with in another query. A

common use of sub queries is to perform tests for set membership, make set comparisions, and

determine set cardinality.

10. We define view in SQL by using the create view command. The form of

the create view command is

Create view v as <query expression>

11. What is the use of with clause in SQL?

The with clause provides a way of defining a temporary view whose definition is

available only to the query in which the with clause occurs.

12. List out the statements associated with a database transaction?

i. Commit work

ii. Rollback work

13. What is transaction?

Transaction is a unit of program execution that accesses and possibly updated various

data items.

14. What is the use of integrity constraints?

Integrity constraints ensure that changes made to the database by authorized users do

not result in a loss of data consistency. Thus integrity constraints guard against accidental

damage to the database. 15. What is the need for triggers?

Triggers are useful mechanisms for alerting humans or for starting certain tasks

automatically when certain conditions are met.

16. List the requirements needed to design a trigger.

The requirements are

a. Specifying when a trigger is to be executed.

b. Specify the actions to be taken when the trigger executes. 17. Give the limitations of SQL authorization.

a. The code for checking authorization becomes intermixed with the rest of

the application code.

b. Implementing authorization through application code rather than

specifying it declaratively in SQL makes it hard to ensure the absence of

loopholes.

18. List the disadvantages of relational database system

a. Repetition of data

b. Inability to represent certain information.

19. What do you mean by “Query Optimization”?

Improving of the strategy for processing a query is called “Query Optimization”. It is

the responsibi

Page 6: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

lity of the system to transform the query as entered by the user into an equivalent query

which can be computed more efficiently. 20. What is called query processing?

Query processing refers to the range of activities involved in extracting data froma

database.

21. What are the steps involved in query processing?

The basic steps are:

a. parsing and translation

b. optimization

c. Evaluation

22. What is called an evaluation primitive?

A relational algebra operation annotated with instructions on how to evaluate is called

an evaluation primitive.

23. What is called a query evaluation plan? A sequence of primitive operations that can be used to evaluate a query is a query

evaluation plan or a query execution plan.

24. What is called a query –execution engine? The query execution engine takes a query evaluation plan, executes that plan, and

returns the answers to the query.

25. What are audit trails?

An audit trail is a log of all changes to the database along with information such as

which user performed the change and when the change was performed.

26. Mention the various levels in security measures.

a. Database system

b. Operating system

c. Network

d. Physical

e. Human

PART B

1. What is a view? How can it be created? Explain with an example.

2. Discuss in detail the operators SELECT, PROJECT, UNION with suitable examples.

3. Explain static and dynamic SQL in detail.

4. Diagrammatically illustrate and discuss the steps involved in processing a query.

5. Give briefly about Query evaluations cost & Selection operation

6. How does a DBMS represent a relational query evaluation plan? 7. Since indices speed query processing, why might they not be kept on several search keys?

List as many reasons as possible. 8. Consider the database given by the following schemes.

Customer (Cust_No, Sales_ Person_No ,City) Sales_ Person(Sales_ Person_No ,Sales_

Person_Name, Common_Prec,Year_of_Hire)

Give an expression in SQL for each of the following queries: Display the list of all customers by Cust_No with the city in which each is located. Select Cust No, city from Customer List the names of the sales persons who have accounts in Delhi.

Page 7: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

Select Sales_Person_Name from Sales_Person_Name where( select * from customer where city = delhi)

9. Write short notes on the following: Data Manipulation Language (DML), Data Definition Language (DDL) Transaction Control Statements (TCS),Data Control Language (DCL)

10. Consider the employee database , where the primary keys are Underlined. employee(empname,street,city) works(empname,companyname,salary) company(companyname,city) manages(empname,management)

Give an expression in the relational algebra for each request. 1) Find the names of all employees who work for First Bank Corporation. 2) Find the names, street addresses and cities of residence of all employees who work for First

Bank Corporation and earn more than 200000 per annum. 3) Find the names of all employees in this database who live in the same city as the company

for which they work. 4) Find the names of all employees who earn more than every employees of small Bank

Corporation.

UNIT III

1. What is transaction?

A transaction is an executing program that forms a logical unit of database processing.

A transaction includes one or more database access operations. These can include

insertion , deletion , modification or retrieval operations.

2. What are the basic database access operations that a transaction can include?

Read-item

Write-item

3. List out the transaction estates.

Active

Partially committed

Failed

Aborted

Committed 4. What are the properties of transaction?

The properties of transactions are: Atomicity

Consistency

Isolation

Durability

5. What is recovery management component?

Ensuring durability is the responsibility of a software component of the base system

called the recovery management component.

6. When is a transaction rolled back? Any changes that the aborted transaction made to the database must be undone. Once

the changes caused by an aborted transaction have been undone, then the transaction has been

rolled back. 7. What is average response time?

The average response time is that the average time for a transaction to be completed

Page 8: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

after it has been submitted.

8. What are the two types of serializability?

The two types of serializability is

Conflict serializability

View serializability

9. Define lock? Lock is the most common used to implement the requirement is to allow a transaction

to access a data item only if it is currently holding a lock on that item.

10. What are the different modes of lock?

The modes of lock are: Shared

Exclusive

11. Define deadlock? Neither of the transaction can ever proceed with its normal execution. This situation is

called deadlock.

12. Define the phases of two phase locking protocol Growing phase: a transaction may obtain locks but not release any lock.

Shrinking phase: a transaction may release locks but may not obtain any new

locks.

13. What are the two methods for dealing deadlock problem? The two methods for dealing deadlock problem is deadlock detection and deadlock

recovery. 14. What is meant by log-based recovery?

The most widely used structures for recording database modifications is the log.The

log is a sequence of log records, recording all the update activities in the database. There are

several types of log records.

15. Define page. The database is partitioned into some number of fixed-length blocks, which are

referred to as pages.

16. What are the drawbacks of shadow-paging technique? Commit Overhead

Data fragmentation

Garbage collection

17. Differentiate strict two phase locking protocol and rigorous two phase

locking protocol.

Strict two phase locking protocol all exclusive mode locks taken by a

transaction is held until that transaction commits.

Rigorous two phase locking protocol requires that all locks be held until

the transaction commits.

18. What are the time stamps associated with each data item?

W-timestamp (Q) denotes the largest time stamp if any transaction that executed

WRITE (Q) successfully.

R-timestamp (Q) denotes the largest time stamp if any transaction that executed

READ (Q) successfully.

19. What Is a shadow copy scheme?

It is based on making copies of the database called shadow copies that one transaction is

active at a time. The scheme also assumes that the database is simply a file on disk.

Page 9: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

20. List out the types of intention locks.

Intention shared locks

Intention exclusive locks

Shared intention exclusive locks

21. Define conflict equivalent.

If a schedule S can be transformed into a schedule S’ by a series of swaps of

nonconflicting instructions then S and S’ are conflict equivalent.

22. What are the types of indexes?

Primary index

Secondary index

Clustering index

23. Define upgrade and downgrade?

It provides a mechanism for conversion from shared lock to exclusive lock is known as

updgrade. It provides a mechanism for conversion from exclusive lock to shared lock is

known as downgrade.

24. What is the use of isolation level?

The isolation level is used to refer the degree of interference that a given transaction is

prepared to tolerate on the part of concurrent transactions.

25. Define recoverable schedule.

It is one where, for each pair of transactions Ti and Tj such that Tj reads a data item

previously written by Ti , the commit operation of Ti appears before the commit

operation of Tj.

PART B

1. What is the difference between conflict serializability and view serializability? Explain in

detail with example.

2. Explain about system recovery and media recovery. 3. Explain the deferred and immediate modification versions of the log based recovery scheme. 4. What are different types of schedules are acceptable for recoverability.

5. Discuss on strict, two-phase locking protocol and time stamp-based protocol. 6. Explain Time stamp-Based Concurrency Control protocol and the modifications

implemented in it. 7. Describe shadow paging recovery techniques.

8. How can you implement atomicity in transactions? Explain.

9. Describe the concept of serializability with suitable example. 10. How concurrency is performed? Explain the protocol that is used to maintain the

concurrency concept.

UNIT IV

1. What are the types of storage devices?

Primary storage , Secondary storage , tertiary storage , volatile storage , non volatile storage.

2. What is called remapping of bad sectors? If the controller detects that a sector is damaged when the disk is initially formatted , or

Page 10: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

when an attempt is made to write the sector , it can logically map the sector to a different physical location.

3. Define blocks? The database system resides permanently on nonvolatile storage, and is into fixed-

length storage units called blocks.

4. What is meant by disk buffer? The area of memory where blocks reside temporarily is called the disk buffer.

5. Define garbage collection. Garbage may be created also as a side effect of crashes. Periodically, it is necessary to

find all the garbage pages and to add them to the list of free pages. This process is called

garbage collection.

6. Define access time. Access time is the time from when a read or write request is issued to when data transfer

begins.

7. Define seek time. The time for repositioning the arm is called the seek time and it increases with the

distance that the arm is called the seek time. 8. Define rotational latency time.

The time spent waiting for the sector to be accessed to appear under the head is called

the rotational latency time.

9. Define average latency time. The average latency time of the disk is one-half the time for a full rotation of the disk.

10. What is meant by mean time to failure? The mean time to failure is the amount of time that the system could run continuously

without failure.

11. What is a block and a block number? A block is a contiguous sequence of sectors from a single track of one platter.Each

request specifies the address on the disk to be referenced. That address is in the form of a

block number. 12. What is the use of RAID?

A variety of disk-organization techniques, collectively called redundant arrays of

independent disks are used to improve the performance and reliability.

13. What is called mean time to repair? The mean time to failure is the time it takes to replace a failed disk and to restore the

data on it.

14. What are the two main goals of parallelism?

Load –balance multiple small accesses, so that the throughput of such accesses

increases.

Parallelize large accesses so that the response time of large accesses is reduced.

15. What are the factors to be taken into account when choosing a RAID level?

Monetary cost of extra disk storage requirements.

Performance requirements in terms of number of I/O operations

Performance when a disk has failed.

Performances during rebuild.

Page 11: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

16. Define hot swapping? Hot swapping permits the removal of faulty disks and replaces it by new ones without

turning power off. Hot swapping reduces the mean time to repair.

17. What are the ways in which the variable-length records arise in database systems?

Storage of multiple record types in a file.

Record types that allow variable lengths for one or more fields.

Record types that allow repeating fields.

18. What is known as heap file organization?

In the heap file organization, any record can be placed anywhere in the file where there

is space for the record. There is no ordering of records. There is a single file for each relation.

19. What is known as sequential file organization? In the sequential file organization, the records are stored in sequential order,according

to the value of a “search key” of each record.

20. What is hashing file organization?

In the hashing file organization, a hash function is computed on some attribute of each

record. The result of the hash function specifies in which block of the file the record should

be placed.

21. What is known as clustering file organization? In the clustering file organization, records of several different relations are stored in the

same file.

22. What are the types of indices?

Ordered indices

Hash indices

23. What is known as a search key?

An attribute or set of attributes used to look up records in a file is called a search key.

24. What is a primary index? A primary index is an index whose search key also defines the sequential order of the

file.

25. What are called index-sequential files? The files that are ordered sequentially with a primary index on the search key, are

called index-sequential files.

26. What are the two types of indices?

Dense index

Sparse index

27. What are called multilevel indices?

Indices with two or more levels are called multilevel indices.

28. What is B-Tree? A B-tree eliminates the redundant storage of search-key values .It allows search key

values to appear only once.

29. What is a B+-Tree index? A B+-Tree index takes the form of a balanced tree in which every path from the root of

Page 12: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

the root of the root of the tree to a leaf of the tree is of the same length.

30. What is called as recursive partitioning?

The system repeats the splitting of the input until each partition of the build input fits

in the memory. Such partitioning is called recursive partitioning.

PART B

1. Explain how the RAID system improves performance and reliability.

2. Describe the structure of B+ tree and list the characteristics of a B+ tree.

3. What are the different types of storage media?

4. Discuss briefly static and dynamic hashing.

Explain the use of index structure and explain the concept of ordered indices.

5. Describe a hash file organization.

6. Write a detailed note on hashing techniques. 7. Describe the different types of file organization? Explain using a sketch of each of them

with their advantages and disadvantages. Mention the purpose of indexing. How this can

be done by B+ tree? Explain.

8. Describe the different methods of implementing variable length record.

9. Explain in detail about Distributed Databases.

10. Explain in detail about Multidimensional and Parallel databases.

Describe in detail about spatial and multimedia databases

UNIT V

1. What are the types of security?

Legal and ethical issues , policy issues , system related issues.

2. What are the control measures used to provide security of data in database.

Access control , Inference control , Flow control , Data encryption

3. What are the classifications of threats?

a. Physical Threats

b. Accidental error

c. Unauthorized access

d. Malicious misuse

4. Define data mining

Data mining is a process of extracting or mining knowledge from huge amount of data.

5. What is meant by Data warehouse? A data warehouse is a repository (archive) of information gathered from multiple sources, stored

under a unified schema at a single site.

6. What is meant by audit trail?

A database log that is used mainly for security purposes is sometimes called an audit trail.

7. What are the steps in the data mining process?

i. Data cleaning

ii. Data integration

iii. Data selection

iv. Data transformation

Page 13: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

v. Data mining

vi. Pattern evaluation

vii. Knowledge representation

8. What are the uses of statistics in data mining?

Statistics is used to

a. To estimate the complexity of a data mining problem;

b. suggest which data mining techniques are most likely to be successful; and

c. identify data fields that contain the most “surface information”.

9. Define Data Classification. It is a two-step process. In the first step, a model is built describing a pre-determined

set of data classes or concepts. The model is constructed by analyzing database tuples

described by attributes. In the second step the model is used for classification.

10. What is Association rule? Association rule finds interesting association or correlation relationships among a large

set of data items, which is used for decision-making processes. Association rules analyzes

buying patterns that are frequently associated or purchased together. 11. How are association rules mined from large databases?

Association rule mining is a two-step process.

a. Find all frequent itemsets.

b. Generate strong association rules from the frequent itemsets.

12. Why do you need data warehouse life cycle process?

Data warehouse life cycle approach is essential because it ensures that the project

pieces are brought together in the right order and at the right time.

13. What are the merits of Data Warehouse?

a. Ability to make effective decisions from database

b. Better analysis of data and decision support

c. Discover trends and correlations that benefits business

d. Handle huge amount of data.

14. List out the functionalities of Data warehouse.

Data cleaning

Data transformation

Data integration

Data loading &

Periodic data refreshing

15. List the types of security mechanisms.

Discretionary security mechanisms

Mandatory security mechanisms

16. What are the database design issues?

Legal and ethical issues

Policy issues

System related issues

17. What are the steps for designing a warehouse?

Choose a business process to model

Choose the grain of the business process

Choose the dimensions that will apply to each fact table record

Page 14: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

Choose the measures that will populate each fact table record

18. What are the issues in data warehouse design?

When and how to gather data

What schema to use

Data cleansing

How to propagate updates

What data to summarize

19. What are the goals of data mining?

Prediction

Identification

Classification

Optimization

20. List out the types of Discovered knowledge.

Association rules

Classification Hierarchies

Sequential patterns

Patterns within time series

Clustering

21.What is meant by Confidence rule? Given a rule of the form A→B, rule confidence is the conditional probability that B is true when

A is known to be true.

22. Define Apriori algorithm. The Apriori algorithm was the first algorithm used to generate association rules. It uses the

general algorithm for creating association rules together with downward closure and anti-

monotonicity.

23. Define Sampling algorithm. The Sampling algorithm selects samples from the database of transactions that individually fit

into memory. Frequent itemsets are then formed for each sample. If the frequent itemsets form a

superset of frequent itemsets for the entire database, then the real frequent itemsets can be

obtained by scanning the remainder of the database.

24. List some of the Data Warehouse tools?

OLAP (Online Analytic Processing)

ROLAP (Relational OLAP)

End User Data Access tool

Ad Hoc Query tool

Data Transformation services

Replication

25. List out the operations done in data warehouse.

Roll up

Drill down

Pivot

Sorting

Selection

26. What is meant by frequent pattern tree algorithm?

Page 15: SRI VENKATESWARA COLLLEGE OF ENGINEERING AND … · A Database Management system is a collection of programs that enables users to create and maintain a database. Database and DBMS

The Frequent pattern tree algorithm reduces the total number of candidate itemsets by producing

a compressed version of the database in terms of an FP-tree. The FP-tree stores relevant

information and allows for the efficient description of frequent itemsets. The algorithm consists

of 2 steps:

1. Build FP-tree

2. Use the tree to find frequent itemsets.

27. What is meant by Classification? Classification is the process of learning a model that is able to describe different classes of data.

28. List the applications of data mining.

Marketing

Finance

Resource optimization

Image Analysis

Fraud detection

PART B

1. Describe the classification of data? Explain about Threats and Risks?

2. Explain in detail about database control and types of privileges?

3. Explain in detail about Cryptography?

4. Describe about Statistical Databases and Distributed Databases?

5. Describe the architecture Data Warehousing and Mining?

6. Explain in detail about the transaction processing?

7. Describe about the Association rule?

8. Explain about Clustering, Information Retrieval, Relevance ranking?

9. Describe about the Crawling and Indexing the Web?

10. Explain about Object Oriented Databases and XML Databases with suitable example?