Designing High Performance BIRT Reports

Designing High Performance BIRT Reports Mica J. Block Director Actuate Corporate Engineers Actuate Corporation


Page 1: Designing High Performance BIRT Reports

Designing High Performance BIRT Reports

Mica J. Block

Director

Actuate Corporate Engineers

Actuate Corporation

Page 2: Designing High Performance BIRT Reports

Topics

• Understanding generation performance

• External factors

• Overhead

• Estimated Component Times

• Performance tips

• Generation

• Rendering

• NOTE: This presentation deals with the performance of the reports themselves regardless of the server technology being used.

Page 3: Designing High Performance BIRT Reports

Understanding Performance

Page 4: Designing High Performance BIRT Reports

“Pages-per-Second” Myth

• Assumes all reports are equal

• Ignores

• Number of report items per page

• Complexity of the query

• Pages are defined at render time

• Impact of aggregates, and more…

• Reality:

• Report items-per-second is a better metric

• Pages-per-second applies only to same report on different runs

Page 5: Designing High Performance BIRT Reports

External Factors

• System: CPU, load

• Raw CPU power

• RAM

• Overall load in server environment

• JVM

• User expectations

• DB performance

• Database design

• Query design & optimization

• Performance of vendor’s query features

• Network overhead

Page 6: Designing High Performance BIRT Reports

Actuate Architecture

Development Tier: IT builds reports, blueprints, metadata, & templates for different reporting styles

Storage & Data Tier: Dedicated, secure storage locations for accessing data, storing project & report content

Production Tier: Single, scalable cluster for generating content for different reporting styles

Presentation Tier: Dedicated tier for accessing & presenting report & dashboard content to users

Client Tier: Users consume content according to their analytic objective

[Figure: Actuate architecture diagram. Labels: Client Tier (Web Browsers: IE, Firefox); Presentation (Web/Portal) Tier (iPortal); Content Production Tier (iServer, Mgmt Console); Storage, Data Access & Integration Tier; Development Tier; Perf Mgmt]

Page 7: Designing High Performance BIRT Reports

Estimated Component Times

• Estimated from simple listing using single table (10,000 rows) in SQL Server

• Generation only; does not include rendering

• Not scientific methodology (done on a laptop)

• Your mileage will vary

• Use your own data

• Try in your own environment

• Focus on specific reports with problems

Page 8: Designing High Performance BIRT Reports

Estimated Component Times

• Pages: little to no effect

• Changed page break from 200 to 100 --> double pages

• Adds < 2% to report run

• Formatting: little to no effect

• Added numeric and date formatting

• Was slightly faster

• Groups: moderate to significant

• Add two group levels to simple listing

• Adds ~5-20% to report run per group

• Depends on the number of group breaks

• Depends on how the data is sorted

Page 9: Designing High Performance BIRT Reports

Estimated Component Times

• One-pass aggregates: moderate

• Added two aggregates

• Adds ~4% per aggregate to report run

• Depends on number of groups

• Look-ahead aggregates: significant

• Total for group as percent of overall total

• Adds ~2-8% per aggregate to report run

• Depends on number of groups and number of data items

• Charts: very significant

• One chart added ~33% to report run

• One chart per group adds ~30-150% to report run

• Depends on number of groups (i.e. charts)

Page 10: Designing High Performance BIRT Reports

Estimated Component Times

Report Name                                | Size    | Average (in milliseconds) | Difference | Compare Report
Single Table                               | 4.30 MB | 1,954.40                  |            |
Single Table Formatted                     | 4.30 MB | 1,931.80                  | -1.16%     | Single Table
Single Table Double Pages                  | 4.35 MB | 1,996.00                  | 2.13%      | Single Table
Group By City (4 instances)                | 4.51 MB | 2,201.20                  | 10.28%     | Single Table Double Pages
Group By Customer (400 instances)          | 4.69 MB | 2,353.20                  | 17.90%     | Single Table Double Pages
Group By Customer Sorted                   | 4.69 MB | 2,207.60                  | 10.60%     | Single Table Double Pages
Group By City Aggregates (2 per group)     | 4.71 MB | 2,377.60                  | 8.01%      | Group By City
Group By Customer Aggregates (2 per group) | 4.89 MB | 2,542.60                  | 8.05%      | Group By Customer
Group By City Two Pass                     | 4.81 MB | 2,366.40                  | 7.50%      | Group By City
Group By Customer Two Pass                 | 5.00 MB | 2,397.60                  | 1.89%      | Group By Customer
Single Chart                               | 5.92 MB | 2,607.40                  | 33.41%     | Single Table
Group By City Chart                        | 6.25 MB | 2,888.80                  | 31.24%     | Group By City
Group By Customer Chart                    | 21.6 MB | 5,704.80                  | 142.43%    | Group By Customer
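The Difference column is each report's average time relative to its Compare Report. A minimal Python sketch that recomputes a few of those percentages from the averages above; the dictionary and function names are illustrative only and are not part of BIRT:

```python
# Average generation times in milliseconds, copied from the table above.
avg_ms = {
    "Single Table": 1954.40,
    "Single Table Double Pages": 1996.00,
    "Group By Customer": 2353.20,
    "Group By Customer Chart": 5704.80,
}


def percent_difference(report, baseline):
    """Return how much slower (positive) or faster (negative) `report` is
    than `baseline`, as a percentage of the baseline's average time."""
    return (avg_ms[report] / avg_ms[baseline] - 1.0) * 100.0


comparisons = [
    ("Single Table Double Pages", "Single Table"),
    ("Group By Customer", "Single Table Double Pages"),
    ("Group By Customer Chart", "Group By Customer"),
]
for report, baseline in comparisons:
    diff = percent_difference(report, baseline)
    print(f"{report} vs {baseline}: {diff:+.2f}%")
```

The same calculation works for your own measurements: replace the averages with run times collected in your environment.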

Page 11: Designing High Performance BIRT Reports

Implications

• Report generation depends on:

• Number of report items

• Presence of aggregates

• Number of groups

• Sorting of data

• Presence of charts

• Time per page depends on output format

• Pages per second depends on layout

• Halving the page break interval “doubles” pages-per-second performance without making the report any faster!
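To see why pages-per-second misleads, here is a small Python sketch using the two single-table averages from the Estimated Component Times table. It assumes the page break interval is a number of rows per page (200, then 100) and that both runs cover the same 10,000 rows; the function name is hypothetical.

```python
from typing import Tuple

ROWS = 10_000  # rows in the single-table test report


def throughput(rows_per_page: int, run_ms: float) -> Tuple[float, float]:
    """Return (pages per second, rows per second) for one report run."""
    pages = ROWS / rows_per_page
    seconds = run_ms / 1000.0
    return pages / seconds, ROWS / seconds


# Assumed mapping: "Single Table" breaks every 200 rows,
# "Single Table Double Pages" breaks every 100 rows.
pps_200, rps_200 = throughput(200, 1954.40)
pps_100, rps_100 = throughput(100, 1996.00)

print(f"200 rows/page: {pps_200:.1f} pages/s, {rps_200:.0f} rows/s")
print(f"100 rows/page: {pps_100:.1f} pages/s, {rps_100:.0f} rows/s")
```

Pages per second roughly doubles while rows per second drops slightly, so the report did not get faster; only the page count changed.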

Page 12: Designing High Performance BIRT Reports

Performance Strategies

• Use report items-per-second as a guide

• Relatively fixed for a platform

• Determine a time budget

• How many report items can the report afford?

• Performance strategies

• Remove application-specific bottlenecks

• Make report items work harder

• Reduce impact of aggregates
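A time budget can be turned into an item budget with simple arithmetic. The sketch below is an assumption about how one might do this, not a BIRT feature; the function name and all figures are hypothetical, and the items-per-second rate should come from measurements on your own platform.

```python
def affordable_items(budget_seconds: float, overhead_seconds: float,
                     items_per_second: float) -> int:
    """Number of report items that fit in the time budget once fixed
    overhead (query, connection, start-up) has been subtracted."""
    usable = max(budget_seconds - overhead_seconds, 0.0)
    return int(usable * items_per_second)


# Hypothetical figures for illustration only.
print(affordable_items(budget_seconds=5.0, overhead_seconds=1.0,
                       items_per_second=20_000))
```

If the report needs more items than the budget allows, the strategies listed above apply: remove bottlenecks, make each item carry more, or cut the cost of aggregates.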

Page 13: Designing High Performance BIRT Reports

How to Analyze Performance

• Test functionality separately

• Add timers in key areas and write them to a log file

• Collect run times

• Remove all content from report

• Collect run times again

• Difference is cost of processing report items

• Remainder is per-row cost

• Example:
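A minimal Python sketch of the timing approach, assuming the report can be run once with its content and once with all content removed. The two `run_*` functions are stand-ins that only sleep; in practice they would invoke the real report. All names here are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

# Console logging for the sketch; pass filename="report_timing.log"
# to basicConfig to write the timers to a log file instead.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("report.timing")


@contextmanager
def timed(label, results):
    """Log and record the elapsed milliseconds of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results[label] = elapsed_ms
        log.info("%s took %.1f ms", label, elapsed_ms)


def split_cost(full_run_ms, empty_run_ms):
    """Return (cost of report items, remaining per-row cost) in ms."""
    return full_run_ms - empty_run_ms, empty_run_ms


def run_full_report():
    time.sleep(0.05)  # stand-in for the report with all content


def run_empty_report():
    time.sleep(0.02)  # stand-in for the report with content removed


results = {}
with timed("full report", results):
    run_full_report()
with timed("empty report", results):
    run_empty_report()

item_cost_ms, per_row_cost_ms = split_cost(
    results["full report"], results["empty report"]
)
log.info("report items: %.1f ms, per-row remainder: %.1f ms",
         item_cost_ms, per_row_cost_ms)
```

Averaging several runs of each variant gives steadier numbers than a single pair of measurements.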

Page 14: Designing High Performance BIRT Reports

Performance Tips

Page 15: Designing High Performance BIRT Reports

General Observations

• Report optimization is a trial and error effort

• Some of the report optimization techniques require additional development time

• Not necessary to use these techniques when the reports perform within the user requirements

• These techniques should only be used to optimize reports

Page 16: Designing High Performance BIRT Reports

Use Latest Version

• Use latest version of BIRT

• Has many performance improvements

• Do not use ‘Total’ functions

• These functions are deprecated in BIRT 2.2.2

• Have some performance issues

• Especially with filters

Page 17: Designing High Performance BIRT Reports

Optimize Database Access

• Extra time from queries, DB overhead, computation, etc.

• Minimize query time

• Make sure query is optimized

• Reduce the number of columns and rows returned

• Reduce number of queries needed

• Use stored procedures

• Use materialized views

Page 18: Designing High Performance BIRT Reports

Optimize XML Access

• XML is versatile and powerful: it can describe metadata and actual data in one file

• BIRT has a “generic” XML ODA which uses an extremely efficient XPath algorithm to parse the results

• A “generic” connector serves a multitude of needs, but does not serve any single need especially well

• If the XML Schema will not change, and high user loads are required, specialized connectors should be built to improve overall system performance

Page 19: Designing High Performance BIRT Reports

Optimize XML Access

• Java Architecture for XML Binding (JAXB) is a specialized Java API used to parse a fixed-schema XML data file quickly and efficiently

• Upside – may be 10x faster than the “generic” XML ODA

• Downside – if the XML Schema changes, JAXB classes will need to be re-compiled

• Downside – no UI exists to create data sets; JAXB classes must be used with a scripted data source

• The same applies to the Web Services ODA

Page 20: Designing High Performance BIRT Reports

Filtering

• BIRT enables filtering at different layers, such as the data set or the table

• Push filtering to the database (if possible)

• Reduces the size of the result set

• Extremely important with two pass aggregates
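As an illustration of pushing a filter to the database, the Python sketch below uses an in-memory SQLite table as a stand-in for the report's data source. The table, columns, and values are hypothetical; the point is only the difference in how many rows leave the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "Boston" if i % 4 == 0 else "Other", float(i))
     for i in range(10_000)],
)

# Filter applied after the fetch: every row is returned first.
all_rows = conn.execute("SELECT id, city, amount FROM orders").fetchall()
filtered_late = [row for row in all_rows if row[1] == "Boston"]

# Filter pushed into the query: only matching rows are returned.
filtered_in_db = conn.execute(
    "SELECT id, city, amount FROM orders WHERE city = ?", ("Boston",)
).fetchall()

print(len(all_rows), "rows fetched without a WHERE clause")
print(len(filtered_in_db), "rows fetched with the filter in the query")
```

Both approaches produce the same rows for the report, but the second returns a quarter of the result set in this example, which matters even more when a look-ahead aggregate must pass over the data twice.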

Page 21: Designing High Performance BIRT Reports

Sorting

• When you add a group section BIRT will automatically sort the dataset in memory.

• There is no setting to tell BIRT that the data is already sorted.

• Always better to push the sort to the database
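A sketch of pushing the sort to the database, again with an in-memory SQLite table standing in for the real data source (table and column names are hypothetical). Ordering by the group key in the query means the rows for each group arrive together; the Group By Customer Sorted measurement earlier in the deck suggests pre-sorted data reduces grouping cost.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Carol", 5.0), ("Alice", 1.0), ("Bob", 2.0),
     ("Alice", 3.0), ("Carol", 4.0)],
)

# ORDER BY on the group key: rows for each customer arrive together.
rows = conn.execute(
    "SELECT customer, amount FROM orders ORDER BY customer, amount"
).fetchall()

# groupby only merges adjacent rows, so it relies on the query's ordering.
groups = {
    customer: [amount for _, amount in grp]
    for customer, grp in groupby(rows, key=lambda r: r[0])
}
print(groups)
```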

Page 22: Designing High Performance BIRT Reports

Getting caught in a (Data) Bind

• The following applies as of BIRT 2.1.3; it will change in a future release with data set caching

• Each report item with a specified data binding will force that data set to re-execute for each binding

• Bindings will cascade down to contained report items (data bindings on a table cascade down to items inside the table)

• In nearly all reports data sets should only have 1 binding specified

• Only extremely complex reports with inter-woven data set requirements will require multiple bindings per data set

• Joint Data Sets can be used in some cases to avoid multiple bindings on a single data set

• Do not bind data sets on the Master Page

Page 23: Designing High Performance BIRT Reports

Aggregates

• Aggregates: Sum( ), Count( ), Min( ), etc.

• Two types

• Running – done while creating the table

• Look-ahead – requires two passes over data

• For performance, review look-ahead type

• Create a stored procedure to do calculation

• Use a separate query

• Use a data filter to merge totals into each row

• Compare to out-of-box solution
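One way to move a look-ahead aggregate out of the report is to let the query compute it. The Python sketch below uses an in-memory SQLite table with hypothetical names and data; it returns each group's total together with its percentage of the overall total, so the report no longer needs a second pass for that value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (city TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Boston", 30.0), ("Boston", 20.0), ("NYC", 40.0), ("Paris", 10.0)],
)

# The scalar subquery supplies the overall total to every group row.
query = """
SELECT city,
       SUM(amount) AS city_total,
       100.0 * SUM(amount) / (SELECT SUM(amount) FROM orders) AS pct_of_total
FROM orders
GROUP BY city
ORDER BY city
"""
rows = conn.execute(query).fetchall()
for city, city_total, pct_of_total in rows:
    print(f"{city}: {city_total:.2f} ({pct_of_total:.1f}% of total)")
```

As the slide advises, compare the timing of this kind of query against the out-of-box aggregate before committing to it.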

Page 24: Designing High Performance BIRT Reports

Charts

• Good news: most time is spent in rendering (using drawing primitives in Swing)

• Actual code is optimized

• Size and resolution will impact performance

• All points are loaded in memory.

• Avoid charts with many points

• You can discern little more in a chart with 10,000 points than in one with 500 points

• More points will also take longer to render as there is more to draw

• Make sure you use the table binding, not the data set binding
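One possible way to avoid charts with many points is to reduce the series before it reaches the chart. The Python sketch below averages consecutive buckets of points; this is an illustrative technique chosen for the example, not something the chart engine does for you, and the function name is hypothetical.

```python
def downsample(points, max_points):
    """Reduce `points` (a list of (x, y) pairs sorted by x) to at most
    `max_points` by averaging each consecutive bucket of points."""
    if len(points) <= max_points:
        return list(points)
    bucket = -(-len(points) // max_points)  # ceiling division
    reduced = []
    for start in range(0, len(points), bucket):
        chunk = points[start:start + bucket]
        avg_x = sum(x for x, _ in chunk) / len(chunk)
        avg_y = sum(y for _, y in chunk) / len(chunk)
        reduced.append((avg_x, avg_y))
    return reduced


series = [(i, (i % 100) * 0.5) for i in range(10_000)]
chart_points = downsample(series, 500)
print(len(series), "->", len(chart_points))
```

Averaging smooths out spikes; if peaks matter, keeping the minimum and maximum of each bucket is a reasonable alternative. The same reduction can often be done in the query with a GROUP BY.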

Page 25: Designing High Performance BIRT Reports

Charts

• 3D charts might take more time as they use a real 3D algorithm to sort surfaces

• 2D charts with depth have no significant performance impact

• Grouping inside charts will be the number one point that slows things down

• Chart engine uses a different grouping algorithm

• Group the data in the data set

• BIRT 2.3 will use the DTE grouping capabilities

• Avoid extra markers, labels, shadows, gradients, etc…

• These impact performance because they mean more shapes and fills to draw

Page 26: Designing High Performance BIRT Reports

General Tips

• Reduce number of report items

• Concatenate values where it makes sense

• First Name + Last Name

• Avoid table data bindings when not used

• Use new Crosstab report item when appropriate as it is tuned for such operations.

Page 27: Designing High Performance BIRT Reports

Rendering Tips

• PDF

• Set appropriate page size in the master page

• Will significantly decrease dynamic geometry

• HTML

• Avoid group sections with many items

• Will cause a long TOC list and will impact viewing performance

Page 28: Designing High Performance BIRT Reports

Q & A