[IEEE 2011 Annual IEEE India Conference (INDICON) - Hyderabad, India (2011.12.16-2011.12.18)] 2011...



Page 1: [IEEE 2011 Annual IEEE India Conference (INDICON) - Hyderabad, India (2011.12.16-2011.12.18)] 2011 Annual IEEE India Conference - Agile way of BI implementation

Agile Way of BI Implementation

Bhawna Rehani (Author)

Business Intelligence - Technology Excellence Group

Tata Consultancy Services

The rapidly changing IT economy has pushed Business Intelligence (BI) systems to look for innovative ways to be equally fast and flexible. Implementations need to be more intuitive and quicker so that they can adapt to the changing environment. One of the ways by which organizations can achieve these goals is by using Agile-based BI development models.

A successful BI solution has many components, including data integration, analytics, data quality, metadata management, the enterprise data warehouse, dashboards and so on. Each of these components is critical for an organization, and stakeholders are ready to invest in them. The only issue is how quickly we can provide these solutions and how flexibly they respond to changing demands.

Traditionally, we have been using the waterfall SDLC model for BI implementations which encourages getting requirements clarity in the initial phases of the projects and having distinct deliverables for each phase. With time the approach has been customized and enhanced to ‘iterative waterfall approach’ where a chunk of requirements is implemented in one SDLC cycle. Though this approach has been successful in the past, the BI practitioners recognize that business requirements are not static and we must be able to effectively mould the deliverables based on changing requirements.

Hence, we cannot continue with the Waterfall (or Iterative waterfall) project management approach that is neither fast nor flexible. Applying the concepts of agile development to BI is the intuitive way forward.

The aim of this paper is to provide a background on agile project management & development techniques, and suggest some guidelines and best practices which can help in successful Agile BI implementations.

Keywords- Agile, Business Intelligence, Scrum, Development methodology

I. INTRODUCTION

The recession and economic turmoil in the past few years have forced the developers and sponsors to reconsider the development and delivery of BI solutions. Low cost and quicker deployment models are the need of the business hour. Hence, we need to think beyond the traditional software development approaches and look for solutions which bring in more capabilities and provide faster deployments. This can certainly be achieved using the Agile approach for BI implementations.

This paper attempts to address a few questions which will help in creating an Agile BI environment. The document gives insight on the following questions:

• What is Agile?

• Why Agile for BI?

• How is it implemented?

• When to move to Agile?

• How is Agile performance measured?

The paper gives a background on Agile and specifically how Agile helps in BI implementation. It explains various agile methodologies applicable to BI. It also suggests the best practices that can be followed in an Agile BI implementation.

II. BACKGROUND

A. Introduction to Agile Software Development

Agile is an iterative software development methodology which encourages incremental and continual development of the product. The requirements and design of the software evolve through close collaboration between the sponsors and the development teams.

In this methodology, the requirements are broken into small user stories which should be independent, negotiable, valuable, estimable, small and testable.

Iterations or Sprints are short time frames that typically last from one to four weeks. In each iteration, the team works through a full software development cycle including planning, requirements analysis, design, coding, unit testing, and acceptance testing. In acceptance testing, a working product is demonstrated to stakeholders. This minimizes the overall risk and allows the project to adapt to changes quickly. Documentation is produced as and when required. A Sprint should add some valuable features that are bug-free and ready to be deployed to Production. A Production release may be done once a group of features has been added to the Product and can be used by the Sponsors.

Agile development uses communication as an important tool. All Agile meetings are face-to-face, and the main goal is to collaborate and take decisions as a team. This face-to-face communication exposes issues as they arise and increases confidence in the sponsor community. Agile also encourages thought partnership amongst the customers and vendors.

Agile development emphasizes working software as the primary measure of progress. This, combined with the preference for face-to-face communication, produces less written documentation than other methods. The agile method encourages stakeholders to prioritize wants with other iteration outcomes based exclusively on business value perceived at the beginning of the iteration.

The most widely used methodologies based on the agile philosophy are Scrum and Extreme Programming (XP).

XP: XP is a highly collaborative methodology that concentrates on product development rather than on managerial aspects. A user or a representative is generally a part of the team, so that he or she can add details to requirements as the software is being built. This enables the users and developers to evolve the requirements to define the end product. When the product has enough features to satisfy users, the team ends the iteration and releases the software. If users decide that enough user stories have been delivered, the team can choose to terminate the project before all of the originally planned user stories have been implemented.

Scrum: Scrum for software development came out of the rapid prototyping community because prototypers wanted a methodology that would support an environment in which the requirements were not only incomplete at the start, but also could change rapidly during development. Unlike XP, Scrum methodology includes both managerial and development processes. Scrum projects revolve around a Product Backlog of the pending work. It consists of a prioritised list of stories which the team picks one by one. Each story is then designed, developed, tested and released to the users. When enough of the backlog has been implemented so that the end users believe the release is worth putting into production, management closes development.

B. Agile vs. Waterfall

The difference between Waterfall and Agile can be appreciated by looking at the underlying policy for each of these:

The manifesto for waterfall software development is [4]:

“Software development can be equated to any other engineering task. We believe software development projects can be effectively managed by:

• Understanding and writing specifications that define how the software will look and what it will do

• Performing in-depth analysis and design work before estimating development costs

• Ensuring software developers follow the specifications

• Testing the software after implementation to make sure it works as specified

• Delivering the finished result to the user.

That is, if the specification is of sufficient detail, then the software will be written such that it will satisfy the customer, will be within budget, and will be delivered on time.”

The same thought applies to the iterative waterfall approach, which has been used in BI engagements for many years instead of pure waterfall. Iterative waterfall also requires one phase to be completed before moving to the next. The terms Waterfall and Iterative waterfall are used interchangeably in this paper.

In comparison to Waterfall, below is the manifesto for Agile [3]:

“We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

• Individuals and Interactions over processes and tools

• Working software over comprehensive documentation

• Customer Collaboration over contract negotiation

• Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.”

A few key differences between the two approaches are listed below:

TABLE I. AGILE VS ITERATIVE WATERFALL – BENEFITS

DELIVERABLES
Iterative Waterfall: Distinct phases, with documented deliverables predefined for each phase.
Agile: Iterations rather than phases; typically the only deliverable after each iteration is code.

FLEXIBILITY
Iterative Waterfall: Requirements are frozen in the first phase (requirement gathering), and the remaining phases use these requirements as their base. Any later change has to go through a change request (CR) process.
Agile: Requirements can keep changing within an iteration based on user inputs.

VISIBILITY
Iterative Waterfall: Once the requirements are finalised, the users are not involved in the development cycle. They are not aware of what is being developed until it is exposed for UAT after a full iteration of 90 days.
Agile: Users and developers are part of the same team. Users are involved in each phase.

DESIGN
Iterative Waterfall: The data model and ETL jobs are designed keeping all the requirements in mind.
Agile: A framework is defined before starting iterations, and the same model is revised in each iteration. If required, refactoring is used (explained in Section IV.C).

RISK
Iterative Waterfall: The longer cycle and lower visibility to the customer may result in an end product which does not meet user requirements.
Agile: Users are engaged at an early stage, so the risk of project failure is lower.

III. VALUE ADDITION OF AGILE IN BI IMPLEMENTATIONS

Agile (or any kind of iterative style of development) works well in the BI area due to the following reasons:

• A lot of BI projects are about adding new data, or new reports, to an existing data warehouse. These require a few enhancements or changes to the existing data warehouse, plus new reports that derive their data from it. This fits in well with the concept of Agile.

• A typical BI implementation consists of various modules such as Data Modeling, Data Quality, Data Extraction and Reporting. Agile methodology can be used for developing each module separately and then integrating each module to make a complete BI solution.

• The reporting part of BI requires a lot of business interaction, and Agile brings business users and developers together as one team, which helps to accelerate the process. The users can see the reports, comment on them, and evolve the requirements. For example, in 2-week iterations, users can work with the development team and change the look and feel of reports in report demo sessions. So by the end of the sprint, the reports delivered are the ones that really work for them.

• Agile is a testing-driven approach in which the user is actively involved in the design and implementation process; hence many report and data bugs are found in earlier phases than in the traditional approach.

A typical Agile BI Cycle is illustrated in Figure 1 below.

Figure 1: Traditional Agile Development Approach.

As illustrated in Figure 1, Requirement Backlog is received from users. These requirements are divided into sprints based on their priority and complexity. One sprint is a full life cycle of understanding the requirement, analysis, design, build and user testing. A Sprint lasts for 1-2 weeks. User Demonstration (typically called Show and Tell) is conducted after every sprint to get the user feedback.

As per the requirements, after every few sprints, a production roll out is carried out to deploy changes to production. This is called a Release.

IV. BEST PRACTICES FOR AGILE BI IMPLEMENTATIONS

A. Testing Automation

A release generally consists of newly added reports along with a few changes in the ETL (Extract, Transform and Load) and database layers. As Agile sprints are of short duration, i.e. 1-2 weeks, testing should be automated to shorten the testing cycle.

It is recommended that a regression test suite is developed containing a set of test cases which cover the core functional areas. This test suite is automated for database and ETL testing. For example, SQL scripts can be used for record count match between source and destination of key ETL loads.
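The record count check mentioned above can be sketched as follows. This is a minimal illustration rather than the paper's actual test suite: it assumes both the staging (source) and warehouse (target) tables are reachable through one database connection, it uses an in-memory SQLite database as a stand-in for a real warehouse, and the table names stg_orders and dw_fact_orders are hypothetical.

```python
import sqlite3


def count_rows(conn, table):
    # Table names cannot be bound as SQL parameters, so pass trusted names only.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def check_row_counts(conn, table_pairs):
    """Return (source, target, source_count, target_count) for every mismatch."""
    mismatches = []
    for source, target in table_pairs:
        src, tgt = count_rows(conn, source), count_rows(conn, target)
        if src != tgt:
            mismatches.append((source, target, src, tgt))
    return mismatches


def build_demo_db():
    # Hypothetical demo data: the target table is missing one row on purpose.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stg_orders (order_id INTEGER, amount REAL);
        CREATE TABLE dw_fact_orders (order_id INTEGER, amount REAL);
        INSERT INTO stg_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        INSERT INTO dw_fact_orders VALUES (1, 10.0), (2, 20.0);
    """)
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for source, target, src, tgt in check_row_counts(demo, [("stg_orders", "dw_fact_orders")]):
        print(f"MISMATCH {source} ({src} rows) vs {target} ({tgt} rows)")
```

A list of such table pairs, one per key ETL load, could be run after every sprint build as part of the regression suite; an empty result means all counts match.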

B. Metrics

Story points for each user story are calculated based on the complexity of the requirement, the design and the current understanding of the solution. They give an estimate of how much time a user story is going to take to develop. Based on the user stories grouped together in a sprint, the total time required for the sprint is calculated as given below:

Expected time required for a sprint = Sum of story points for each user story in the sprint * Average time required to implement one point
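The estimate above can be sketched in a few lines. The story names, point values and the figure of 6 hours per point are hypothetical inputs chosen only to show the arithmetic.

```python
def expected_sprint_time(story_points, avg_time_per_point):
    """Expected sprint time = sum of story points * average time per point."""
    return sum(story_points.values()) * avg_time_per_point


# Hypothetical sprint backlog: user story -> story points.
sprint_backlog = {
    "sales dashboard": 5,
    "customer dimension load": 8,
    "data quality check": 3,
}

# Assumed average of 6 hours per story point, taken from earlier sprints.
hours = expected_sprint_time(sprint_backlog, avg_time_per_point=6.0)
print(f"Expected sprint effort: {hours} hours")
```

With 16 points in the sprint and 6 hours per point, the estimate comes to 96 hours.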

Page 4: [IEEE 2011 Annual IEEE India Conference (INDICON) - Hyderabad, India (2011.12.16-2011.12.18)] 2011 Annual IEEE India Conference - Agile way of BI implementation

This approach is different from effort estimation done for the Waterfall approach. The main advantages of using this agile estimation technique are:

• The measurement technique described above is independent of factors such as the team members' expertise or the number of members in the team. It only signifies the complexity of a user story compared to the other user stories in the same sprint. So while calculating the story points of a user story, other factors need not be considered.

• Steady state: After a few sprints, the team is able to judge easily how many story points can be delivered in a sprint. The estimation of story points and sprint effort then becomes easier.

• User satisfaction: After reaching the steady state, deliveries become more certain and the sprint velocity (the number of story points delivered per sprint) increases. This helps in gaining user confidence, as project deliverables are shown at short and regular intervals.

Following are some of the metrics which can be used to measure the performance of an Agile BI implementation:

Scrum velocity: Velocity is a measure of how much work is getting done on the project in one scrum. This is an important metric which drives release planning and schedule updates. In project management terms, velocity is the amount of work that a team can complete in a specified period of time. In Agile, it can be measured as the number of story points completed.
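As a rough sketch of how velocity can feed release planning, the snippet below sums the completed story points of a sprint and uses the average velocity of past sprints to project how many sprints the remaining backlog needs. The projection by simple averaging is an assumption made here for illustration, and all figures are hypothetical.

```python
import math


def sprint_velocity(stories):
    """Sum the story points of the stories completed in one sprint.

    `stories` is a list of (points, done) pairs.
    """
    return sum(points for points, done in stories if done)


def average_velocity(velocities):
    return sum(velocities) / len(velocities)


def sprints_remaining(backlog_points, velocities):
    """Project the sprints needed for the backlog (assumes a stable average velocity)."""
    return math.ceil(backlog_points / average_velocity(velocities))


# Hypothetical sprint: three stories finished, one carried over.
current = sprint_velocity([(5, True), (8, True), (3, False), (5, True)])

# Hypothetical history of three sprints and a 90-point remaining backlog.
history = [18, 22, 20]
print(current, average_velocity(history), sprints_remaining(90, history))
```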

Figure 2: Sample Scrum Velocity Chart

Sprint Burn Down: The performance metric in Figure 2 measures the total absolute amount of effort, in person hours or person days, left in the sprint. The burn down can be calculated daily to understand how much work is pending in the sprint and whether any change of action plan is required. The sprint burn down calculation can be part of the daily Scrum calls.

Burn down = New estimate (tasks backlog) + Estimate (new issues)

The sprint burn down chart shows the amount of work still to be done over the sprints. For example, if you have 15 sprints and 200 tasks, your chart will have 200 along the vertical axis and 15 along the horizontal axis. After each sprint, mark the number of tasks remaining along the vertical axis. During the first few sprints, progress may be slow, but the chart will show progress towards the goal.

Release Burn Down: The metric in Figure 3 indicates the burn down across the sprints. This helps to appreciate how the performance has improved or changed with every sprint.

Figure 3: Release Burn Down Chart
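The burn down calculations described above can be sketched as follows. The first function applies the remaining-effort formula (re-estimated backlog tasks plus newly raised issues); the second produces the points plotted on the vertical axis of a burn down chart. The function names and the sample numbers are illustrative assumptions; only the 200-task total comes from the example in the text.

```python
def remaining_effort(task_estimates, new_issue_estimates):
    """Remaining effort = new estimates of backlog tasks + estimates of new issues."""
    return sum(task_estimates) + sum(new_issue_estimates)


def burn_down(total_tasks, completed_per_sprint):
    """Return the tasks remaining at the start and after each sprint."""
    remaining = [total_tasks]
    for done in completed_per_sprint:
        # Never plot a negative remainder if more work was closed than planned.
        remaining.append(max(remaining[-1] - done, 0))
    return remaining


# Hypothetical daily figures in person hours: three open tasks and one new issue.
print(remaining_effort([8, 5, 3], [2]))

# 200 tasks in total, with a slow start over the first three sprints.
print(burn_down(200, [8, 10, 15]))
```

Plotting the returned list against the sprint number gives the chart described in the text, with 200 at the top of the vertical axis.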

C. Refactoring

In Agile BI implementations, the design of the ETL or data model is changed in every sprint to take care of the user stories of that sprint. Since all the requirements are not analysed at the beginning of the design, there will be scenarios where, at later stages, we realise that the data model should be modified and the corresponding ETL layer changed, either because of the requirements added in the current sprint or to make the data model more adaptable or scalable. That is, the design alternative chosen in a sprint may not be the best approach in the long term, but it meets the user needs. To address this issue, a ‘refactoring’ process is recommended to incorporate the change in the data model or ETL while keeping the reports unaffected.

Refactoring means making small changes to the internal structure of an application without changing its external behavior. It does not change any existing functionality from the user’s perspective, but it helps in improving the system behavior and performance in the long term. In terms of effort, refactoring items can be treated as user stories and implemented as part of sprints. The only difference is that they have no user impact associated with them.
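One way to keep reports unaffected during such a change is to have them read from a database view rather than from the physical tables. The paper does not prescribe this technique; the sketch below is an assumption for illustration, using an in-memory SQLite database and hypothetical object names (sales_flat, dim_region, fact_sales, rpt_sales). The flat table is split into a dimension and a fact table, the view is redefined, and the report query returns the same result before and after.

```python
import sqlite3

# The report only ever queries the view, never the physical tables.
REPORT_QUERY = (
    "SELECT region, SUM(amount) FROM rpt_sales GROUP BY region ORDER BY region"
)


def build_initial_model(conn):
    conn.executescript("""
        CREATE TABLE sales_flat (region TEXT, amount REAL);
        INSERT INTO sales_flat VALUES ('North', 100.0), ('South', 50.0), ('North', 25.0);
        CREATE VIEW rpt_sales AS SELECT region, amount FROM sales_flat;
    """)


def refactor_model(conn):
    # Split the flat table into a dimension and a fact table, then repoint the view.
    conn.executescript("""
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT UNIQUE);
        INSERT INTO dim_region (region) SELECT DISTINCT region FROM sales_flat;
        CREATE TABLE fact_sales (
            region_id INTEGER REFERENCES dim_region(region_id),
            amount REAL
        );
        INSERT INTO fact_sales
            SELECT d.region_id, s.amount
            FROM sales_flat s JOIN dim_region d ON d.region = s.region;
        DROP VIEW rpt_sales;
        CREATE VIEW rpt_sales AS
            SELECT d.region, f.amount
            FROM fact_sales f JOIN dim_region d ON d.region_id = f.region_id;
        DROP TABLE sales_flat;
    """)


def run_report(conn):
    return conn.execute(REPORT_QUERY).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_initial_model(conn)
    before = run_report(conn)
    refactor_model(conn)
    after = run_report(conn)
    print(before == after)
```

Comparing the report output before and after the change is also a natural candidate for the automated regression suite described under Testing Automation.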

D. Gradual Movement towards Agile

If an organization is currently using the Waterfall model of implementation and plans to shift to Agile, a gradual movement should be planned from one methodology to the other. To start with, Agile can be introduced for projects which meet the Agile requirements. For example, if 70% of the total requirements are very clear to the development team and 30% are vague, to the point that even the user is not sure of what is required, continue using the Waterfall/Spiral model for the 70% of stable requirements and use the sprint approach for the remaining 30%.

Another factor which governs the movement to Agile model is the maturity of data warehouse. If the enterprise data warehouse already exists and requirements are more towards adding reports rather than making changes in warehouse data model, then moving to Agile is a comparatively smoother process.

E. Team Structure

The nature of Agile itself promotes oral communication and discourages spending time on written communication and documentation. Typically, a team should consist of 6-8 people. Small teams help in better communication and team work.

If the number of user stories to be covered in a sprint is large and a larger team is required, it is suggested to split the team into multiple groups, each catering to a separate functional area.

Within a small Agile BI team, there should be a mix of technical skills and domain knowledge. The team should have primary owners for each of the areas below:

• User requirements

• Source data and applications

• Data warehouse data model

• Data transformations and ETL routines

• Reporting applications

In order to satisfy these needs, the team comprises: End user, Business analyst, ETL designer/developer, Report designer/developer, Database administrator, and BI architect.

F. Factory Model

Since the Agile approach typically works for a team of 5-8 people (see Section IV.E above), it is best to split the project into small teams, each working on one business area, so that one business user is part of one team. In this case, coordination and prioritization across teams is vital and can be handled using a factory model/competency centre.

The role of the competency centre is to govern the process, people and data in a BI implementation across the various Agile modules.

Agile Benefits

Agile implementation is beneficial to both IT and business users.

For IT

• Prompt feedback ensures minimum last minute changes or updates and involves minimum risk.

• Customer collaboration helps developers understand the end to end system and deliver what is really required by the customers.

• Continuous flow of value to customers, hence better satisfaction index.

• Daily scrum calls bring energy to the team.

• Less documentation is involved, which enables users to judge the end product directly.

• Iterative model helps to reduce complexity and improve productivity

• Better resource utilization

For Business/Customer

• Agile methodology gives users the flexibility to change requirements within a development cycle.

• Users are able to start using the end product sooner than with waterfall because of the shorter development cycles. This leads to better ROI.

• Agile projects are cheaper, as they reduce the cost of rework by ensuring customer satisfaction and avoiding last-minute changes.

• User acceptance testing happens earlier in the Agile methodology than in a non-Agile development process.

• Higher probability of on time and defect free deliveries

V. CHOOSING BETWEEN AGILE AND WATERFALL

Different business scenarios demand different methodologies. The following table compares Agile and Waterfall (or Iterative waterfall) against different criteria.

TABLE II. AGILE VS. WATERFALL- SELECTION CRITERIA

Type of Contract
Agile: Suitable for models based on T&M and a focus on reducing time to market.
Waterfall: Suitable for Fixed Bid projects, for example, developing canned reports or solutions within a fixed time frame.

Requirements
Agile: More than 30% change expected in requirements.
Waterfall: Less change expected in requirements.

Type of Solution
Agile: Solution that requires more changes at reporting.
Waterfall: Building a data warehouse from scratch; maintenance projects.

Team Size
Agile: Team size between 5-9 people.
Waterfall: Team size more than 10.

Developers
Agile: Multi-skilled, motivated team.
Waterfall: Not so skilled, junior developers.

Customer
Agile: Engages with developers and gets involved in day-to-day activities.
Waterfall: Remote customer who does not engage in development and is interested in the final product.

Here are two scenarios where the project methodology was decided based on the above criteria.

Scenario A: Firm PQR is a leading provider of railcars. The aim of the BI initiative is to move from the existing in-house CRM system to a new CRM system from a leading vendor. In the current set up, users have been using a reporting system based on a standalone OLAP database that takes data from the in-house CRM system. The aim of the project is to build a new reporting system which takes data from the new CRM system and also provides similar reporting functionality from the existing enterprise data warehouse platform.

Analysis: In this scenario, the requirements are very clear. Users are already using a reporting system and are aware of their requirements. Most of the analysis work is back-end related, where the reporting database needs mapping from the new source systems. Therefore the focus of the project is on the ETL part, which does not require user interaction. From a reporting perspective, the requirements are clear and will need only small modifications.

Considering these factors, an iterative waterfall approach was recommended. First, a base reporting layer was developed from the new source systems; then, for each department, aggregated data marts were developed in an iterative waterfall approach. After the development of the data marts, copies of the existing reports were developed using the new system and a demo was shown to users. Since the requirements were very clear, the comments received in the demo were minor and were incorporated easily.

Scenario B: Firm XYZ is a management consulting organization. The aim of the BI initiative is to develop a personnel analysis system which produces key people metrics. The users of the reports are spread across various departments and have been using manual or Excel-based reporting until now. They are keen to move to a new solution which can save the time they spend generating management reports. There is an existing data warehouse which provides some of the information needed for such reporting. To leverage the current DW for key people metrics reporting, the DW needs to be enhanced to fetch additional information, and reports need to be built based on user requirements.

Analysis: In this scenario, since users are not experienced in using extensive reporting, they are not very clear about the usage and features of reporting tools and therefore will not be able to share the requirements clearly in the first few meetings. But the users are enthusiastic about moving to the new solution and hence will be ready to spend time on its development. Also, the current data warehouse exists; its base layer and aggregate layer need to be enhanced for the new KPIs.

Keeping these facts in mind, since the solution framework was clear but the requirements were not, we chose the Agile methodology. The program started with a sprint duration of 4 weeks, which was reduced to 1 week after 2 years of continuous projects. Because of the instant user feedback, there were faster deployments and high scrum velocity.

REFERENCES

[1] Extreme Programming [Online]. Available: http://www.xprogramming.com/xpmag/whatisxp.

[2] Scrum Methodology [Online]. Available: www.scrumalliance.org.

[3] Beck, Kent (2001). "Manifesto for Agile Software Development". Agile Alliance.

[4] Agile development [Online]. Available: http://www.serena.com/docs/repository/solutions/intro-to-agile-devel.pdf.

Bhawna Rehani received an M.Tech degree in Information Technology from IIITM in 2004 and a Bachelor in Information Sciences degree from Delhi University in 2002. She is working as Business Intelligence Centre of Excellence Lead for Tata Consultancy Services Ltd. Her current areas of interest include Cloud-based BI solutions and Data Mining.