
Teradata Interview Questions

1. What are the differences between the following?

- Vertical & Horizontal Partitioning

vs

- Join & Hash Indexes

vs

- PPI

2. What is OLAP?

Online Analytical Processing (OLAP) is a category of software tools that provides analysis of data stored in a database. OLAP tools enable users to analyze different dimensions of multidimensional data; for example, they provide time series and trend analysis views. The chief component of OLAP is the OLAP server, which sits between a client and a database management system (DBMS). The OLAP server understands how data is organized in the database and has special functions for analyzing that data.

3. What is OLAP, MOLAP, ROLAP, DOLAP, HOLAP? Examples?

OLAP - On-Line Analytical Processing. Designates a category of applications and technologies that allow the collection, storage, manipulation and reproduction of multidimensional data, with the goal of analysis.

MOLAP - Multidimensional OLAP. This term more specifically designates a Cartesian (cube-like) data structure. In effect, MOLAP contrasts with ROLAP: in the former, joins between tables are precomputed, which enhances performance; in the latter, joins are computed at query time. Targeted at groups of users because it is a shared environment. Data is stored in an exclusive server-based format. It performs more complex analysis of data.

DOLAP - Desktop OLAP. Small OLAP products for local multidimensional analysis. There can be a mini multidimensional database (using Personal Express), or extraction of a datacube (using Business Objects). Designed for a low-end, single, departmental user. Data is stored in cubes on the desktop; it's like having your own spreadsheet. Since the data is local, end users don't have to worry about performance hits against the server.

ROLAP - Relational OLAP. Designates one or several star schemas stored in relational databases. This technology permits multidimensional analysis with data stored in relational databases. Used for large departments or groups because it supports large amounts of data and users.

HOLAP - Hybrid OLAP, which can combine any of the above.

4. WHAT IS THE DIFFERENCE BETWEEN OLTP AND OLAP?

5. WHAT IS THE DIFFERENCE BETWEEN DELETE AND TRUNCATE?


• Delete table is a logged operation, so the deletion of each row gets logged in the

transaction log, which makes it slow.

• Truncate table also deletes all the rows in a table, but it won’t log the deletion of each

row, instead it logs the de-allocation of the data pages of the table, which makes it

faster. Of course, truncate table cannot be rolled back.

• Truncate table is functionally identical to delete statement with no “where clause”

both remove all rows in the table. But truncate table is faster and uses fewer system

and transaction log resources than delete.

• Truncate table removes all rows from a table, but the table structure and its columns, constraints, indexes, etc., remain as they are.

• In truncate table the counter used by an identity column for new rows is reset to the

seed for the column.

• If you want to retain the identity counter, use delete statement instead.

• If you want to remove table definition and its data, use the drop table statement.

• You cannot use truncate table on a table referenced by a foreign key constraint;

instead, use delete statement without a where clause. Because truncate table is not

logged, it cannot activate a trigger.

• Truncate table may not be used on tables participating in an indexed view.
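
A minimal T-SQL sketch of the difference, assuming a hypothetical Employees table with an identity column:

-- DELETE: row-by-row, fully logged, identity counter keeps its current value
DELETE FROM Employees;

-- DELETE can also remove just a subset of rows
DELETE FROM Employees WHERE DeptId = 10;

-- TRUNCATE: deallocates data pages, minimally logged, identity counter resets to its seed
TRUNCATE TABLE Employees;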

6. WHAT IS NORMALIZATION? EXPLAIN FIRST THREE NORMAL FORMS?

Normalization is the process of efficiently organizing data in a database. There are two

goals of the normalization process: eliminating redundant data (for example, storing the

same data in more than one table) and ensuring data dependencies make sense (only

storing related data in a table). Both of these are worthy goals as they reduce the

amount of space a database consumes and ensure that data is logically stored.

The Normal Forms

The database community has developed a series of guidelines for ensuring that

databases are normalized. These are referred to as normal forms and are numbered

from one (the lowest form of normalization, referred to as first normal form or 1NF)

through five (fifth normal form or 5NF). In practical applications, you'll often see 1NF,

2NF, and 3NF along with the occasional 4NF. Fifth normal form is very rarely seen and

won't be discussed in this article.

First Normal Form (1NF) sets the very basic rules for an organized database:

Eliminate duplicative columns from the same table.

Create separate tables for each group of related data and identify each row with a

unique column (the primary key).

What do these rules mean when contemplating the practical design of a database? It's

actually quite simple.

The first rule dictates that we must not duplicate data within the same row of a table.

Within the database community, this concept is referred to as the atomicity of a table.

Tables that comply with this rule are said to be atomic. Let's explore this principle with

a classic example - a table within a human resources database that stores the manager-

subordinate relationship. For the purposes of our example, we'll impose the business

rule that each manager may have one or more subordinates while each subordinate may

have only one manager.


Intuitively, when creating a list or spreadsheet to track this information, we might

create a table with the following fields:

Manager

Subordinate1

Subordinate2

Subordinate3

Subordinate4

However, recall the first rule imposed by 1NF: eliminate duplicative columns from the

same table. Clearly, the Subordinate1-Subordinate4 columns are duplicative. Take a

moment and ponder the problems raised by this scenario. If a manager only has one

subordinate - the Subordinate2-Subordinate4 columns are simply wasted storage space

(a precious database commodity). Furthermore, imagine the case where a manager

already has 4 subordinates - what happens if she takes on another employee? The

whole table structure would require modification.

At this point, a second bright idea usually occurs to database novices: We don't want to

have more than one column and we want to allow for a flexible amount of data storage.

Let's try something like this:

Manager

Subordinates

Where the Subordinates field contains multiple entries in the form "Mary, Bill, Joe"

This solution is closer, but it also falls short of the mark. The subordinates column is still

duplicative and non-atomic. What happens when we need to add or remove a

subordinate? We need to read and write the entire contents of the table. That's not a big

deal in this situation, but what if one manager had one hundred employees? Also, it

complicates the process of selecting data from the database in future queries.

Here's a table that satisfies the first rule of 1NF:

Manager

Subordinate

In this case, each subordinate has a single entry, but managers may have multiple

entries.

Now, what about the second rule: identify each row with a unique column or set of

columns (the primary key)? You might take a look at the table above and suggest the

use of the subordinate column as a primary key. In fact, the subordinate column is a

good candidate for a primary key due to the fact that our business rules specified that

each subordinate may have only one manager. However, the data that we've chosen to

store in our table makes this a less than ideal solution. What happens if we hire another

employee named Jim? How do we store his manager-subordinate relationship in the

database?


It's best to use a truly unique identifier (such as an employee ID) as a primary key. Our

final table would look like this:

Manager ID

Subordinate ID
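
As a rough sketch (the table and column names are illustrative, not from the original text), the 1NF design above could be declared like this:

CREATE TABLE ManagerSubordinate
(
    ManagerID     INT NOT NULL,   -- employee ID of the manager
    SubordinateID INT NOT NULL,   -- employee ID of the subordinate
    PRIMARY KEY (SubordinateID)   -- each subordinate has exactly one manager
);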

Second Normal Form (2NF)

Over the past month, we've looked at several aspects of normalizing a database table.

First, we discussed the basic principles of database normalization. Last time, we

explored the basic requirements laid down by the first normal form (1NF). Now, let's

continue our journey and cover the principles of second normal form (2NF).

Recall the general requirements of 2NF:

Remove subsets of data that apply to multiple rows of a table and place them in

separate tables.

Create relationships between these new tables and their predecessors through the use

of foreign keys.

These rules can be summarized in a simple statement: 2NF attempts to reduce the

amount of redundant data in a table by extracting it, placing it in new table(s) and

creating relationships between those tables.

Let's look at an example. Imagine an online store that maintains customer information

in a database. They might have a single table called Customers with the following

elements:

CustNum

FirstName

LastName

Address

City

State

ZIP

A brief look at this table reveals a small amount of redundant data. We're storing the

"Sea Cliff, NY 11579" and "Miami, FL 33157" entries twice each. Now, that might not

seem like too much added storage in our simple example, but imagine the wasted space

if we had thousands of rows in our table. Additionally, if the ZIP code for Sea Cliff were

to change, we'd need to make that change in many places throughout the database.

In a 2NF-compliant database structure, this redundant information is extracted and

stored in a separate table. Our new table (let's call it ZIPs) might have the following

fields:


ZIP

City

State

If we want to be super-efficient, we can even fill this table in advance -- the post office

provides a directory of all valid ZIP codes and their city/state relationships. Surely,

you've encountered a situation where this type of database was utilized. Someone

taking an order might have asked you for your ZIP code first and then knew the city and

state you were calling from. This type of arrangement reduces operator error and

increases efficiency.

Now that we've removed the duplicative data from the Customers table, we've satisfied

the first rule of second normal form. We still need to use a foreign key to tie the two

tables together. We'll use the ZIP code (the primary key from the ZIPs table) to create

that relationship. Here's our new Customers table:

CustNum

FirstName

LastName

Address

ZIP

We've now minimized the amount of redundant information stored within the database

and our structure is in second normal form!
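
A minimal sketch of the 2NF design described above, with assumed column data types:

CREATE TABLE ZIPs
(
    ZIP   CHAR(5)     NOT NULL PRIMARY KEY,
    City  VARCHAR(50) NOT NULL,
    State CHAR(2)     NOT NULL
);

CREATE TABLE Customers
(
    CustNum   INT          NOT NULL PRIMARY KEY,
    FirstName VARCHAR(50)  NOT NULL,
    LastName  VARCHAR(50)  NOT NULL,
    Address   VARCHAR(100) NOT NULL,
    ZIP       CHAR(5)      NOT NULL REFERENCES ZIPs (ZIP)  -- foreign key to the ZIPs table
);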

Third Normal Form (3NF)

There are two basic requirements for a database to be in third normal form:

Already meet the requirements of both 1NF and 2NF

Remove columns that are not fully dependent upon the primary key.

Imagine that we have a table of widget orders that contains the following attributes:

Order Number

Customer Number

Unit Price

Quantity

Total

Remember, our first requirement is that the table must satisfy the requirements of 1NF

and 2NF. Are there any duplicative columns? No. Do we have a primary key? Yes, the

order number. Therefore, we satisfy the requirements of 1NF. Are there any subsets of

data that apply to multiple rows? No, so we also satisfy the requirements of 2NF.


Now, are all of the columns fully dependent upon the primary key? The customer number

varies with the order number and it doesn't appear to depend upon any of the other

fields. What about the unit price? This field could be dependent upon the customer

number in a situation where we charged each customer a set price. However, looking at

the data above, it appears we sometimes charge the same customer different prices.

Therefore, the unit price is fully dependent upon the order number. The quantity of items

also varies from order to order, so we're OK there.

What about the total? It looks like we might be in trouble here. The total can be derived

by multiplying the unit price by the quantity, therefore it's not fully dependent upon the

primary key. We must remove it from the table to comply with the third normal form.

Perhaps we use the following attributes:

Order Number

Customer Number

Unit Price

Quantity

Now our table is in 3NF. But, you might ask, what about the total? This is a derived field

and it's best not to store it in the database at all. We can simply compute it "on the fly"

when performing database queries. For example, we might have previously used this

query to retrieve order numbers and totals:

SELECT OrderNumber, Total

FROM WidgetOrders

We can now use the following query:

SELECT OrderNumber, UnitPrice * Quantity AS Total

FROM WidgetOrders

to achieve the same results without violating normalization rules.

One final note on the normal forms: they are guidelines and guidelines only. Occasionally, it becomes necessary to stray from them to meet practical business requirements. However, when variations take place, it's extremely important to evaluate any possible ramifications they could have on your system and to account for possible inconsistencies.

DIFFERENCE BETWEEN CLUSTERED AND NON-CLUSTERED INDEXES?

There are clustered and nonclustered indexes. A clustered index is a special type of index that reorders the way records in the table are physically stored; therefore, a table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.

A nonclustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf nodes of a nonclustered index do not consist of the data pages; instead, the leaf nodes contain index rows.

Consider using a clustered index for:

o Columns that contain a large number of distinct values.

o Queries that return a range of values using operators such as BETWEEN, >, >=, <, and <=.


o Columns that are accessed sequentially.

o Queries that return large result sets.

Non-clustered indexes have the same B-tree structure as clustered indexes, with two

significant differences:

o The data rows are not sorted and stored in order based on their non-clustered keys.

o The leaf layer of a non-clustered index does not consist of the data pages. Instead, the

leaf nodes contain index rows. Each index row contains the non-clustered key value and

one or more row locators that point to the data row (or rows if the index is not unique)

having the key value.

o A table can have at most 249 nonclustered indexes (the SQL Server 2005 limit); see the index sketch below.
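
A minimal T-SQL sketch against a hypothetical Orders table, creating one clustered and one nonclustered index:

-- Only one clustered index per table: the rows themselves are sorted by OrderDate
CREATE CLUSTERED INDEX IX_Orders_OrderDate ON Orders (OrderDate);

-- A nonclustered index: its leaf level holds key values plus row locators, not the data pages
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID ON Orders (CustomerID);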

WHAT'S THE DIFFERENCE BETWEEN CONTROL FLOW AND DATA FLOW?

Control Flow:

1. Process Oriented

2. Doesn’t manage or pass data between components.

3. It functions as a task coordinator

4. In a control flow, tasks require a completion result (success, failure, or completion).

5. Synchronous in nature: a task must complete before the flow moves on to the next task. Even if tasks are not connected to each other, they are still synchronous in nature.

6. Tasks can be executed both in parallel and serially

7. Three types of control flow elements in SSIS 2005

· Containers: Provides structures in the packages

· Tasks: Provides functionality in the packages

· Precedence Constraints: Connects containers, executables and tasks into an

ordered control flow.

8. We can control the sequence of execution for tasks and also specify the conditions under which tasks and containers run.

9. It is possible to include nested containers as SSIS Architecture supports nesting of

the containers. Control flow can include multiple levels of nested containers.

Data Flow

Streaming in nature

Information oriented

Passes data between other components


Transformations work together to manage and process data. This means the first set of data from the source may already be at the final destination step while, at the same time, another set of data is still flowing. All the transformations are doing work at the same time.

Three types of Data Flow components

· Sources: Extract data from various sources (databases, text files, etc.)

· Transformations: Clean, modify, merge and summarize the data

· Destinations: Load data into destinations such as databases, files or in-memory datasets

WHAT IS THE MULTICAST SHAPE USED FOR?

The Multicast transformation distributes its input to one or more outputs. This

transformation is similar to the Conditional Split transformation. Both transformations

direct an input to multiple outputs. The difference between the two is that the Multicast

transformation directs every row to every output, and the Conditional Split directs a row

to a single output

WHAT SHAPE WOULD YOU USE TO CONCATENATE TWO INPUT FIELDS INTO A SINGLE

OUTPUT FIELD?

Derived Column shape\Task can be used to concatenate columns

----------------------------------------------------------------------------------------------------

Q1 Explain architecture of SSIS?

http://technet.microsoft.com/en-us/library/ms141709(SQL.90).aspx

Q2 Difference between Control Flow and Data Flow?

Very easy.

Q3 How would you do Logging in SSIS?

Log using the logging configuration inbuilt in SSIS or use Custom logging through Event handlers.

http://msdn.microsoft.com/en-us/library/ms141727.aspx

Q4 How would you do Error Handling?

This one is for you to answer.

Q5 How to pass property value at Run time? How do you implement Package Configuration?

http://msdn.microsoft.com/en-us/library/ms141682.aspx

Q6 How would you deploy a SSIS Package on production?

1. Create a deployment utility by setting its property to true.

2. It will be created in the bin folder of the solution as soon as the package is built.

3. Copy all the files in the utility and use the manifest file to deploy it on production.

Q7 Difference between DTS and SSIS?

Everything, except that both are products of Microsoft :-)


Q8 What are new features in SSIS 2008?

http://sqlserversolutions.blogspot.com/2009/01/new-improvementfeatures-in-ssis-2008.html

Q9 How would you pass a variable value to Child Package?

http://sqlserversolutions.blogspot.com/2009/02/passing-variable-to-child-package-from.html

http://technet.microsoft.com/en-us/library/ms345179(SQL.90).aspx

Q10 What is Execution Tree?

http://technet.microsoft.com/en-us/library/cc966529.aspx

Q11 What are the points to keep in mind for performance improvement of the package?

http://technet.microsoft.com/en-us/library/cc966529.aspx

Q12 You may get a question stating a scenario and then asking how you would create a package for it, e.g., how would you configure a data flow task so that it can transfer data to different tables based on the city name in a source table column?

Q13 Difference between Unionall and Merge Join?

http://sqlserversolutions.blogspot.com/2009/01/difference-between-merge-and-union-all.html

Q14 You may get a question about what a particular transformation does. Lookup, Fuzzy Lookup and Fuzzy Grouping transformations are my favorites.

This one is for you.

Q15 How would you restart a package from the previous failure point? What are checkpoints and how can we implement them in SSIS?

http://msdn.microsoft.com/en-us/library/ms140226.aspx

Q16 Where are SSIS packages stored in SQL Server?

MSDB.sysdtspackages90 stores the actual content, and sysdtscategories, sysdtslog90, sysdtspackagefolders90, sysdtspackagelog, sysdtssteplog, and sysdtstasklog play supporting roles.

Q17 How would you schedule an SSIS package?

Using SQL Server Agent. Read about scheduling a job with SQL Server Agent.

Q18 Difference between asynchronous and synchronous transformations?

Asynchronous transformations have different input and output buffers, and it is up to the component designer of an async component to provide a column structure for the output buffer and hook up the data from the input.

Q19 How do you achieve multithreading in SSIS?

_________________________________________________________________________________________

Question 1 - True or False - Using a checkpoint file in SSIS is just like issuing the

CHECKPOINT command against the relational engine. It commits all of the data

to the database.

Ans: False. SSIS provides a Checkpoint capability which allows a package to restart at the

point of failure.


Additional information: Checkpoints in SQL Server Integration Services (SSIS) Packages

to restart from the point of failure

Question 2 - Can you explain what the Import\Export tool does and the basic steps in the wizard?

The Import\Export tool is accessible via BIDS or executing the dtswizard command.

The tool identifies a data source and a destination to move data either within 1 database,

between instances or even from a database to a file (or vice versa).

Question 3 - What are the command line tools to execute SQL Server Integration

Services packages?

DTSEXECUI - When this command line tool is run a user interface is loaded in order to

configure each of the applicable parameters to execute an SSIS package.

DTEXEC - This is a pure command line tool where all of the needed switches must be

passed into the command for successful execution of the SSIS package.

Question 4 - Can you explain the SQL Server Integration Services functionality in

Management Studio?

You have the ability to do the following:

Login to the SQL Server Integration Services instance

View the SSIS log

View the packages that are currently running on that instance

Browse the packages stored in MSDB or the file system

Import or export packages

Delete packages

Run packages

Question 5 - Can you name some of the core SSIS components in the Business

Intelligence Development Studio you work with on a regular basis when building

an SSIS package?

Connection Managers

Control Flow

Data Flow

Event Handlers

Variables window


Toolbox window

Output window

Logging

Package Configurations

Question 6 - True or False: SSIS has a default means to log all records updated,

deleted or inserted on a per table basis.

False, but a custom solution can be built to meet these needs.

Additional information: Custom Logging in SQL Server Integration Services Packages

(SSIS)

Question 7 - What is a breakpoint in SSIS? How is it set up? How do you disable it?

A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA

an opportunity to review the status of the data, variables and the overall status of the

SSIS package.

10 unique conditions exist for each breakpoint.

Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click

on the object where you want to set the breakpoint and select the 'Edit Breakpoints...'

option.

Additional information:

Breakpoints in SQL Server 2005 Integration Services

Question 8 - Can you name 5 or more of the native SSIS connection managers?

OLEDB connection - Used to connect to any data source requiring an OLEDB connection

(i.e., SQL Server 2000)

Flat file connection - Used to make a connection to a single file in the File System.

Required for reading information from a File System flat file

ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or

other connection exposed through managed code (like C#) in a custom task

Analysis Services connection - Used to make a connection to an Analysis Services

database or project. Required for the Analysis Services DDL Task and Analysis Services

Processing Task

File connection - Used to reference a file or folder. The options are to either use or create

a file or folder


Excel

FTP

HTTP

MSMQ

SMO

SMTP

SQLMobile

WMI

Question 9 - How do you eliminate quotes from being uploaded from a flat file to

SQL Server?

In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the

Text qualifier field then preview the data to ensure the quotes are not included.

Additional information: How to strip out double quotes from an import file in SQL Server

Integration Services

Question 10 - Can you name 5 or more of the main SSIS tool box widgets and

their functionality?

For Loop Container

Foreach Loop Container

Sequence Container

ActiveX Script Task

Analysis Services Execute DDL Task

Analysis Services Processing Task

Bulk Insert Task

Data Flow Task

Data Mining Query Task

Execute DTS 2000 Package Task

Execute Package Task

Execute Process Task

Execute SQL Task


etc.

Question 11 - Can you explain one approach to deploy an SSIS package?

One option is to build a deployment manifest file in BIDS, then copy the directory to the

applicable SQL Server then work through the steps of the package installation wizard

A second option is using the dtutil utility to copy, paste, rename, delete an SSIS Package

A third option is to log in to SQL Server Integration Services via SQL Server Management Studio, then navigate to the 'Stored Packages' folder, then right click on one of the child folders or an SSIS package to access the 'Import Packages...' or 'Export Packages...' option.

A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the

interface.

Additional information:

Deploying a SQL Server 2000 DTS Package vs. a SQL Server 2005 Integration Services

Package (SSIS)

Import, Export, Copy and Delete Integration Services Packages in SQL Server 2005

Question 12 - Can you explain how to setup a checkpoint file in SSIS?

The following items need to be configured on the properties tab for SSIS package:

CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.

CheckpointUsage - Determines if/how checkpoints are used. Choose from these options:

Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints.

IfExists is the typical setting and implements the restart at the point of failure behavior. If

a Checkpoint file is found it is used to restore package variable values and restart at the

point of failure. If a Checkpoint file is not found the package starts execution with the first

task. The Always choice raises an error if the Checkpoint file does not exist.

SaveCheckpoints - Choose from these options: True or False (default). You must select

True to implement the Checkpoint behavior.

Additional information: Checkpoints in SQL Server Integration Services (SSIS) Packages

to restart from the point of failure

Question 13 - Can you explain different options for dynamic configurations in

SSIS?

Use an XML file


Use custom variables

Use a database per environment with the variables

Use a centralized database with all variables

Additional information: Using XML Package Configuration with SQL Server Integration

Services (SSIS) Packages

Question 14 - How do you upgrade an SSIS Package?

Depending on the complexity of the package, one or two techniques are typically used:

Recode the package based on the functionality in SQL Server DTS

Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package

that is not accurate

Additional information:

Upgrade SQL Server DTS Packages to Integration Services Packages

Question 15 - Can you name five of the Perfmon counters for SSIS and the value

they provide?

SQLServer:SSIS Service

SSIS Package Instances - Total number of simultaneous SSIS Packages running

SQLServer:SSIS Pipeline

BLOB bytes read - Total bytes read from binary large objects during the monitoring period.

BLOB bytes written - Total bytes written to binary large objects during the monitoring period.

BLOB files in use - Number of binary large objects files used during the data flow task during the

monitoring period.

Buffer memory - The amount of physical or virtual memory used by the data flow task during the

monitoring period.

Buffers in use - The number of buffers in use during the data flow task during the monitoring period.

Buffers spooled - The number of buffers written to disk during the data flow task during the

monitoring period.

Flat buffer memory - The total number of blocks of memory in use by the data flow task during the

monitoring period.

Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.

Private buffer memory - The total amount of physical or virtual memory used by data transformation

tasks in the data flow engine during the monitoring period.


Private buffers in use - The number of blocks of memory in use by the transformations in the data

flow task at a point in time.

Rows read - Total number of input rows in use by the data flow task at a point in time.

Rows written - Total number of output rows in use by the data flow task at a point in time.

_______________________________________________________________________

Database concepts Interview questions - Part 1


Define Fact tables and dimension tables.

Fact tables are central tables in data warehousing. They contain the aggregate values that are used in business

process..............


Explain the ETL process in Data warehousing.

Extraction, Transformation and loading are different stages in data warehousing................


What is Data mining?

Data mining is a process of analyzing current data and summarizing the information in more useful

manner..................


What are indexes?

Index can be thought as index of the book that is used for fast retrieval of information.

Index uses one or more column index keys and pointers to the record to locate record...............


Explain the types of indexes?

Clustered index

Non-clustered.....................


Define SQL.

SQL stands for Structured Query Language. It allows access, insert/update/delete records and retrieve data from the

database...................


What is RDBMS? Explain its features.

RDBMS stands for Relational Database Management System. It organizes data into related rows and

columns...................



What is an Entity-Relationship diagram?

It is a graphical representation of tables with the relationship between them................


Define referential integrity.

These are the rules that are applied when relationships are created. They ensure the integrity of the data and prevent inconsistent data from entering the tables...............

Define Primary key and Foreign key.

A column or combination of columns that uniquely identifies a row of data in a table is the primary key............

Define alternate key.

There can be a key apart from primary key in a table that can also be a key. This key may or may not be a unique

key..............



Delete vs. Truncate table.

Delete logs the deletion of each row, whereas Truncate doesn't log deleted rows in the transaction log. This makes the Truncate command a bit faster than the Delete command.

Define constraints.

Constraints enforce the integrity of the database. Constraints can be of the following types (a minimal sketch follows this list):

Not Null

Check

Unique

Primary key

Foreign key
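
A minimal sketch showing each constraint type on a hypothetical Customers/Orders pair (table names, column names and types are assumptions):

CREATE TABLE Customers
(
    CustId INT          NOT NULL PRIMARY KEY,   -- not null + primary key
    Email  VARCHAR(100) NOT NULL UNIQUE         -- not null + unique
);

CREATE TABLE Orders
(
    OrderId  INT NOT NULL PRIMARY KEY,
    CustId   INT NOT NULL REFERENCES Customers (CustId),  -- foreign key
    Quantity INT NOT NULL CHECK (Quantity > 0)            -- check constraint
);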


Define stored procedure.

Stored procedure is a set of pre-compiled SQL statements, executed when it is called in the program.

Define Trigger.

Triggers are similar to stored procedures, except that a trigger is executed automatically when an operation occurs on the table.
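
A minimal sketch of both, using an assumed Employees table:

-- Stored procedure: precompiled, runs when called
CREATE PROCEDURE GetEmployeesByDept
    @DeptId INT
AS
    SELECT EmpId, EmpName
    FROM Employees
    WHERE DeptId = @DeptId;
GO

-- Trigger: fires automatically when rows are inserted into Employees
CREATE TRIGGER trg_Employees_Insert
ON Employees
AFTER INSERT
AS
    PRINT 'Row(s) inserted into Employees';
GO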


Also read

What is Data warehousing?

Answer - A data warehouse can be considered as a storage area where interest specific or relevant data........

What is an OLTP system and OLAP system?

Answer - OLTP: Online Transaction Processing helps and manages applications based........

SQL Server 2005 Analysis Services Interview Questions

What is SQL Server 2005 Analysis Services (SSAS)?

What are the new features with SQL Server 2005 Analysis Services (SSAS)?

What are SQL Server Analysis Services cubes?

Explain the purpose of synchronization feature provided in Analysis Services 2005.

Explain the new features of SQL Server 2005 Analysis Services (SSAS). [Hint - Unified Dimensional Model, Data

Source View, new aggregation functions and querying tools]....................

OLAP interview questions

Explain the concepts and capabilities of OLAP.

Explain the functionality of OLAP.

What are MOLAP and ROLAP?

Explain the role of bitmap indexes to solve aggregation problems.

Explain the encoding technique used in bitmap indexes.

What is Binning?

What is candidate check?..................

Define Truncate and Delete commands.

Answer - Truncate command is used to remove all rows of a table. The removed records are not recorded in the transaction log......

Define Primary and Unique key.

Answer - The column or columns of the table whose value uniquely identifies each row in the table is called primary

key. You can define column as primary key using primary key constraint while you create table.....

What is index? Define its types.

Answer - Index can be thought as index of the book that is used for fast retrieval of information. Index uses one or

more column index keys and pointers to the record to locate record.........

Define Normalization and De- Normalization.

Answer - It is the process of organizing data into related tables. To normalize a database, we divide the database into tables.....

What is transact-SQL? Describe its types?


Answer - SQL Server Provides three types of Transact-SQL statements namely DDL, DCL, and DML....

Database concepts Interview questions - Part 2


Define SQL.

Structured query language, SQL is an ANSI standard language that provides commands to access and update

databases.............

Explain the difference between DBMS and RDBMS.

DBMS offers organized way of storing, managing and retrieving information..........

What are E-R diagrams?

E-R diagrams, i.e. Entity-Relationship diagram represent relationship between various tables in the

database..............

Explain the types of relationships in database.

One-to-one

One to one is implemented using single table by establishing relationship between same type of columns in a

table...............

What are the benefits of normalizing a database?

It helps to avoid duplicate entries.

It allows saving storage space....................

What is normalization?

It is the process of organizing data into related tables. To normalize a database, we divide the database into tables and establish relationships between the tables............

What is denormalization?

The process of adding redundant data to get rid of complex joins, in order to optimize database performance. This is done to speed up database access by moving from a higher to a lower form of normalization.................

Explain DML and DDL statements.

Data definition language is used to define and manage all attributes and properties of a database..................



Also read

Define database objects.

Answer - SQL Server databases store information in two-dimensional objects of rows and columns called tables......

Define data, entity, domain and referential integrity.

Answer - Data integrity validates the data before it gets stored in the columns of the table. SQL Server supports four types of data integrity.....

SQL Server Optimization Tips

Answer - Restricting query result means return of required rows instead of all rows of the table. This helps in

reducing network traffic......

What are the lock types?

Answer - Shared Lock allows simultaneous access of record by multiple Select statements. Shared Lock blocks

record from updating and will remain in queue waiting while record is accessed for reading......

XSLT in SQL Server 2005

Overview of XSLT and the components that make up an XSLT style sheet.

What is XSLCompiledTransform class of the .NET Framework?

What is XSLTSetting class of the .NET Framework?

Database concepts Interview questions - Part 3


What is Union and Union All operator?

Union is used to combine distinct records from two tables. Union all combines all records from two tables..............
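
A quick sketch with two assumed tables, OnlineOrders and StoreOrders:

-- UNION removes duplicate rows from the combined result
SELECT CustId FROM OnlineOrders
UNION
SELECT CustId FROM StoreOrders;

-- UNION ALL keeps every row, duplicates included (and is cheaper, since no duplicate check is done)
SELECT CustId FROM OnlineOrders
UNION ALL
SELECT CustId FROM StoreOrders;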

What is cursor?

A Cursor is a database object that represents a result set and is used to manipulate data row by row. When a cursor

is opened, it is positioned on a row and that row is available for processing.............

Explain the cursor types.

DYNAMIC: It reflects changes happened on the table while scrolling through the row.

STATIC: It works on snapshot of record set and disconnects from the server...............

Explain in brief the cursor optimization tips.

Close cursor when it is not required.

You shouldn’t forget to deallocate cursor after closing it................
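
A minimal cursor sketch over an assumed Customer table, following the tips above (close, then deallocate):

DECLARE cust_cursor CURSOR STATIC FOR
    SELECT Cust_Id, Cust_Name FROM Customer;

OPEN cust_cursor;
FETCH NEXT FROM cust_cursor;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- process the current row here
    FETCH NEXT FROM cust_cursor;
END;

CLOSE cust_cursor;        -- close the cursor as soon as it is no longer required
DEALLOCATE cust_cursor;   -- don't forget to deallocate it after closing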


What is sub-query?

Sub-query is a query within a query. Example of a sub-query:

Select Cust_Id, Cust_Name From Customer Where Cust_Id IN (Select Cust_Id From Doctor)...............

Explain the use of group by clause.

"Group By" is used to derive aggegate values by grouping similar data................

Difference between clustered and non-clustered index.

Both are stored as B-tree structures. The leaf level of a clustered index is the actual data, whereas the leaf level of a non-clustered index is a pointer to the data...............


Also read

Querying and modifying XML data in SQL Server 2005

What is XQuery language?

Explain the syntax rule of XQuery language.

XQuery expression contains two parts: the Prolog and the Body. Explain them

Explain PATH expression in XQuery with an example.

SQL Server 2005 XML support

Explain the concepts and capabilities of SOAP. Explain the purpose of Native XML mode in SQL Server 2005.

Native XML Access vs. SQLXML.

Benefits of Native XML Access in SQL Server 2005.

Limitation for Native XML Web Services.

Define Distributed Query and Linked Server?

Answer - Distributed Query is a query which can retrieve data from multiple data sources including distributed

data........

Describe in brief Databases and SQL Server Databases Architecture.

Answer - A database is a structured collection of data. Database can be thought as simple data file......

What security features are available for stored procedures?

Answer - Database users can have permission to execute a stored procedure without being......

Database concepts Interview questions - Part 4



Define aggregate and scalar functions.

Aggregate Functions return a single value by operating against a group of values. Scalar functions operate against a

single value................
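
A short sketch, assuming an Employees table:

-- Aggregate function: one value for a whole group of rows
SELECT AVG(Salary) AS AvgSalary FROM Employees;

-- Scalar function: operates on a single value, once per row
SELECT UPPER(EmpName) AS EmpNameUpper FROM Employees;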

What are the restrictions applicable while creating views?

Views can be created referencing tables and views only in the current database.

A view name must not be the same as any table owned by that user.

You can build views on other views and on procedures that reference views.............

What is "correlated subqueries"?

In "correlated subqueries", the result of outer query is passed to the subquery and the subquery runs for each

row...............
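
A minimal sketch, assuming Orders and Customers tables, where the inner query references the outer row:

-- For each customer, count only that customer's orders
SELECT c.CustId,
       (SELECT COUNT(*)
        FROM Orders o
        WHERE o.CustId = c.CustId) AS OrderCount
FROM Customers c;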

What is Data Warehousing?

Data Warehousing is a process of storing and accessing data from central location for some strategic

decision................

What is a join and explain different types of joins.

Joins are used in queries to explain how different tables are related.

Joins also let you select data from a table depending upon data from another table................


Also read

What are the capabilities of Cursors?

Answer - Cursors can support various functionalities that are listed here.....

What are the ways to controlling Cursor Behavior?

Answer - Cursors behavior can be controlled by dividing them into cursor types: forward-only, static,........

Define temporary and extended stored procedure.

Answer - Temporary Stored Procedure is stored in TempDB database. It is volatile and is deleted once connection

gets terminated or server is restarted......

Describe in brief authentication modes in SQL server.

Answer - This is the default and recommended security mode. In this mode, access to SQL server is controlled by

Windows NT.....

Define Identity and uniqueidentifier property of Column.


Answer - Column with identity property contains unique system generated value in the table. Column with identity

property is similar to AutoNumber field in MS Access....

Database index tuning interview questions


What is Index tuning?

Indexes can improve query performance as well as the overall speed of a database. The process of enhancing the selection of indexes is called index tuning..................

How is index tuning used to improve query performance?

The Index tuning wizard can be used to improve the performance of queries and databases. It uses the

following measures to do so:...............


Also read

SQL Server Optimization Tips

Restricting query result means return of required rows instead of all rows of the table. This helps in

reducing network traffic......

SQL Server 2005 Analysis Services Interview Questions

What is SQL Server 2005 Analysis Services (SSAS)?

What are the new features with SQL Server 2005 Analysis Services (SSAS)?

What are SQL Server Analysis Services cubes?

Explain the purpose of synchronization feature provided in Analysis Services 2005.

Explain the new features of SQL Server 2005 Analysis Services (SSAS). [Hint - Unified Dimensional

Model, Data Source View, new aggregation functions and querying tools]....................

What are the ways to code efficient transactions?

We shouldn't allow input from users during a transaction....

What are the ways to controlling Cursor Behavior?

Cursors behavior can be controlled by dividing them into cursor types: forward-only, static,........

What are cubes?

A data cube stores data in a summarized version which helps in a faster analysis of data..........

What is snow flake scheme design in database?

A snowflake Schema in its simplest form is an arrangement of fact tables.........


Database Optimization Interview questions


Reasons for poor performance of a query.

No indexes

Excess recompilations of stored procedures.

Procedures and triggers without SET NOCOUNT ON...............

What are the ways to code efficient transactions?

We shouldn't allow input from users during a transaction.

We shouldn't open transactions while browsing through data...................

Explain Execution Plan.

SQL Server caches the execution plan of a query or stored procedure and reuses it in subsequent calls...................

What are Indexes?

Index can be thought as index of the book that is used for fast retrieval of information.

Index uses one or more column index keys and pointers to the record to locate record............

Explain in brief the cursor optimization tips.

Close cursor when it is not required.

You shouldn’t forget to deallocate cursor after closing it. ...............

What are B-trees?

Explain Table Scan and Index Scan.

Describe FillFactor concept in indexes.

What are Index statistics?

Describe Fragmentation.

Explain Nested Join, Hash Join, and Merge Join in SQL Query Plan.


Also read

SQL Server Optimization Tips

Answer - Restricting query result means return of required rows instead of all rows of the table. This helps in

reducing network traffic......


What are the lock types?

Answer - Shared Lock allows simultaneous access of record by multiple Select statements. Shared Lock blocks

record from updating and will remain in queue waiting while record is accessed for reading......

CLR support for SQL Server 2005

Overview of integration of CLR with SQL Server.

Advantages of CLR integration.

Indexing XML data in SQL Server 2005

Explain the concepts of indexing XML data in SQL Server 2005.

Provide basic syntax for creating index on XML data type column.

What is content indexing/full text indexing?

Explain the reason to index XML data type column.

What are the guidelines to be adhered to when creating an XML index?

Database Partitioning interview questions


What is database partitioning?

Database partitioning involves dividing a logical database into distinct independent units to improve its performance, manageability and availability.

Explain how partitioning is an important part of database optimization.


Also read

Describe in brief Databases and SQL Server Databases Architecture.

Answer - A database is a structured collection of data. Database can be thought as simple data file......

Define Normalization and De- Normalization.

Answer - It is the process of organizing data into related tables. To normalize a database, we divide the database into tables.....

Define transaction and transaction isolation levels.

Answer - A transaction is a set of operations that works as a single unit. Transactions can be categorized into explicit, autocommit, and implicit....

Define Truncate and Delete commands.

Answer - Truncate command is used to remove all rows of a table. The removed records are not recorded in the transaction log......
