Common Database Problems Common Database Solutions Mike Furgal Managed Database Service EMEA PUG...
-
Upload
polly-lawrence -
Category
Documents
-
view
216 -
download
1
Transcript of Common Database Problems Common Database Solutions Mike Furgal Managed Database Service EMEA PUG...
Common Database ProblemsCommon Database Solutions
Mike FurgalManaged Database Service
EMEA PUG Challenge 2015,Copenhagen, Denmark4 – 6 November, 2015
© 2015 Progress Software Corporation.2
Introduction
Mike Furgal
Progress Employee since 1989
Developer of the OpenEdge database
Joined Bravepoint in 2012
Heads up Database Services
• Including Managed Database Services
Bravepoint
Largest Progress/OpenEdge consulting firm
Founded in 1987
Purchased by Progress in October 2014
Specializes in all things OpenEdge
• Database Services
• Programming
Pro2SQL
• Real-time Replication to SQL Target
A series of case studies of issue that the PROGRESS BravePoint Managed
Database Services Team has encountered over the years.
The case of the Missing Files
A large distribution center had a power failure. When the power came back on the machine booted but the database did not start
© 2015 Progress Software Corporation.6
: (43) ** Cannot find or open file /agility/prod/prod_db/platte_11.d5, errno = 2.
: (451) prostrct list session begin for root on /dev/pts/0.
: (12475) Unable to get file status for extent /agility/prod/prod_db/platte_11.d5
: (334) prostrct list session end.
© 2015 Progress Software Corporation.7
Specifics
Database was 80 GB
Last good backup was 1 week old
Not running After Imaging
Platform was Linux
WHAT WOULD YOU DO?
© 2015 Progress Software Corporation.9
Approach
Made a copy of the existing database incase we made a mistake
Used PROSTRCT LIST to determine which files were missing• We were lucky that the missing file was part of a storage area that
only held indexes
Tools Available• PROSTRCT UNLOCK
• PROSTRCT BUILDDB
© 2015 Progress Software Corporation.10
Solution
Restored the missing extent from the week old backup and ran PROSTRCT UNLOCK
Rebuilt the indexes• # proutil db –C idxbuild all
BUT…….
Index rebuild failed due to finding back blocks in the storage area where the records were stored
NOW WHAT?
© 2015 Progress Software Corporation.12
Back to the Beginning
Copied the backed up database to start over.• Since Index Rebuild failed, we needed to start over
• Good thing we had copied all the files in the first place
Add the missing extent
Truncate the BI and do a DBRPR scan• Fix bad blocks
• Fix bad records
© 2015 Progress Software Corporation.13
Dump and Load
After all the corruption was removed it was time to dump and load
Need to do an ASCII Dump to dump around some bad records
© 2015 Progress Software Corporation.14
Lessons Learned
This Database was important to this customer, hence they wanted it back when it got corrupted.
They need to treat the Database better • Daily Backups
• After Imaging
A good DR plan saves a lot of heartache
© 2015 Progress Software Corporation.15
Next Steps
Implement a good Disaster Recover plan which includes• Frequent backups
• After Imaging implemented
Test the Disaster Recover Plan• Annually
Disaster Recover Plan needs to be on Paper• Can’t be just on the computer
• Need a backup plan incase the DR plan fails
The case of the Micro Manager
A brand name US bank had SAN corruption. This prevented Crash Recovery from completing. They had a Hot Standby machine and database using OE Replication.
© 2015 Progress Software Corporation.18
Specifics
Had a local backup and local AI files, but the backup would not restore
Previous backup was not available
Replica Database was up to date
Platform was Windows
Database size 200 GB
OpenEdge 10.1C04
© 2015 Progress Software Corporation.19
What’s the Problem
Customer refused to fail-over• They never tested running on the fail-over
machine. Had little confidence that the application would run in the fail-over environment.
• Customer worried about the time it takes to fail-back once failed over.
© 2015 Progress Software Corporation.20
Making Matters Worse
Copying the DR database to the production machine is measured in days
Options presented to Management included FORCED ACCESS to the Database
© 2015 Progress Software Corporation.21
© 2015 Progress Software Corporation.22
What Next
Forced into the database – This skips Crash Recovery
Index Rebuild DOES NOT fix the database
Dump and Load DOES NOT fix the database
© 2015 Progress Software Corporation.23
Lesson Learned
Have confidence in your Disaster Recovery Plan• There is no sense of having one if you are never going to
use it
Be Careful of the “QUICK FIX”• Non-technical people will ALWAYS choose the fastest
approach to the solution without understanding the consequences
© 2015 Progress Software Corporation.24
Next Steps
Worked with the customer to do a fail-over test.
Made the fail-over testing an annual event
Schools Out For Summer
A Large school district needs to get their reports cards out to 30,000+ students. They discovered they had corruption in the database because backups stopped working for about a week
© 2015 Progress Software Corporation.27
Specifics
10.2B05 Windows 64bit OpenEdge
Last good backup is 1 week old
All report card data for 30,000+ students entered since that last good backup
After Imaging is turned on, but AI file retention was less than 1 week
Database is about 300 GB
They have the 1 week old backup restored to a different location
WHAT WOULD YOU DO?
© 2015 Progress Software Corporation.29
Approach
We had 2 plans
Plan A – Get the corruption out of the live database• Use any and all tools to remove the corruption
Plan B – Revert back to the week old database• See if we can take all the report card data from the live
database and import it into the week old database.
© 2015 Progress Software Corporation.30
Plan A
The database .lg file showed the extents where the corruption was located.
Each storage area was a single variable length extent
Corruption was in an 80 GB extent (Ugh!)• Used DBRPR to scan and fix bad blocks
• This took hours to run on this large extent
• In the end this failed
© 2015 Progress Software Corporation.31
Plan B
Worked with the vendor to find all the tables that made up the report card processing• This was about 12 tables
Dumped these tables from the live database• There was no corruption in these tables
Had to figure out how to get the table data into the week old database
HMMMMM…..
© 2015 Progress Software Corporation.33
Plan B
Dumped the schema for the 12 tables
Went into the dictionary and renamed the tables• Added _old to the end of the table name
Loaded the schema for the 12 tables
Loaded the data for the 12 tables
This is a very useful trick• Didn’t need to recompile – the application worked
© 2015 Progress Software Corporation.34
Plan A (revisited)
Dumped and Loaded the plan A database
There were 5 tables where the dump and load failed.
Did a 4GL dump• FOR EACH … BY field. EXPORT…
• FOR EACH … BY field DESCENDING. EXPORT …
Didn’t trust the data, so we use the same table rename technique to get these tables from the week old backup.
© 2015 Progress Software Corporation.35
© 2015 Progress Software Corporation.36
But Wait – There’s More
A week later they found they also had corruption in a different database• That was solved by restore and roll forward
• Needed to upgrade to 10.2B08 for Roll Forward to work properly
– Windows 64bit 10.2B06 has a roll forward bug that prevented it from working.
© 2015 Progress Software Corporation.37
Next Steps
Implement a DR solution• OpenEdge Replication
• Rolling Forward AI
Restore the backup and roll forward on the same machine• This verifies the backup is functional
• DB block corruption does not get replicated from roll forward
A case of the spins
A large medical center patched their software over the weekend. On Monday the performance of the system was unacceptable. The vendor says the patch was minor and could not be the cause of the issue. The customer says nothing else changed.
© 2015 Progress Software Corporation.40
Specifics
OpenEdge 11.2.0 32bit
Windows 2012 64bit
Database is 321 GB
Number of users is 3,000
© 2015 Progress Software Corporation.41
Some Metrics – Month View
Date CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
05/16/15 (Sat) 86 580 7,948,358,747 149,508,775 53 1,072,918,966 425,542 1,922,412 374,270 175,980 172,163 116,490 14,798
05/15/15 (Fri) 261 2,792 12,987,557,626 114,936,649 113 2,465,462,910 1,520,296 3,227,568 1,626,692 1,886,768 449,449 295,333 115,003
05/14/15 (Thu) 263 3,011 11,000,344,090 56,940,273 193 2,090,165,002 1,639,564 3,475,092 871,808 2,023,720 454,097 298,088 73,017
05/13/15 (Wed) 323 3,126 10,371,051,213 55,142,493 188 1,879,551,662 2,250,168 3,423,760 1,099,930 2,378,306 525,070 374,006 885,294
05/12/15 (Tue) 279 3,089 10,567,333,668 140,530,655 75 1,901,654,803 1,797,520 3,397,238 1,043,849 2,068,487 496,165 328,450 943,510
05/11/15 (Mon) Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart
05/10/15 (Sun) 73 385 10,806,473,996 206,617,341 52 2,307,235,660 307,377 1,804,694 368,589 100,764 150,257 102,579 244,087
05/09/15 (Sat) 88 504 5,704,394,389 82,411,186 69 693,023,191 483,479 1,423,644 516,069 171,617 165,926 115,092 186,064
05/08/15 (Fri) Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart
05/07/15 (Thu) 271 2,940 10,046,740,997 145,481,723 69 1,596,756,358 1,705,661 3,503,669 924,082 2,153,671 455,228 306,003 128,058
05/06/15 (Wed) 338 2,989 9,830,327,570 153,056,406 64 1,442,212,561 2,247,914 3,525,546 1,225,826 2,453,942 557,639 374,309 129,160
05/05/15 (Tue) 293 2,967 10,392,149,949 154,806,221 67 1,593,242,356 2,000,392 3,366,955 1,126,177 2,324,533 488,949 329,102 171,067
05/04/15 (Mon) 488 2,971 10,483,718,093 162,479,487 65 1,547,975,267 2,311,179 3,733,307 1,363,409 2,678,057 712,951 528,518 212,059
05/03/15 (Sun) 125 484 11,161,696,099 217,504,812 51 1,884,717,953 331,006 1,783,868 1,243,395 270,902 222,981 156,128 23,917
05/02/15 (Sat) 1,433 540 8,114,391,833 164,345,414 49 735,325,461 444,568 1,853,376 24,483,171 3,078,655 1,889,151 1,496,027 132,360
05/01/15 (Fri) 374 2,735 11,611,724,202 126,877,164 92 1,815,943,455 2,450,987 3,063,577 1,458,166 2,046,184 590,195 411,705 3,268,221
© 2015 Progress Software Corporation.42
Some Metrics – Month View
Date CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
05/16/15 (Sat) 86 580 7,948,358,747 149,508,775 53 1,072,918,966 425,542 1,922,412 374,270 175,980 172,163 116,490 14,798
05/15/15 (Fri) 261 2,792 12,987,557,626 114,936,649 113 2,465,462,910 1,520,296 3,227,568 1,626,692 1,886,768 449,449 295,333 115,003
05/14/15 (Thu) 263 3,011 11,000,344,090 56,940,273 193 2,090,165,002 1,639,564 3,475,092 871,808 2,023,720 454,097 298,088 73,017
05/13/15 (Wed) 323 3,126 10,371,051,213 55,142,493 188 1,879,551,662 2,250,168 3,423,760 1,099,930 2,378,306 525,070 374,006 885,294
05/12/15 (Tue) 279 3,089 10,567,333,668 140,530,655 75 1,901,654,803 1,797,520 3,397,238 1,043,849 2,068,487 496,165 328,450 943,510
05/11/15 (Mon) Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart
05/10/15 (Sun) 73 385 10,806,473,996 206,617,341 52 2,307,235,660 307,377 1,804,694 368,589 100,764 150,257 102,579 244,087
05/09/15 (Sat) 88 504 5,704,394,389 82,411,186 69 693,023,191 483,479 1,423,644 516,069 171,617 165,926 115,092 186,064
05/08/15 (Fri) Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart
05/07/15 (Thu) 271 2,940 10,046,740,997 145,481,723 69 1,596,756,358 1,705,661 3,503,669 924,082 2,153,671 455,228 306,003 128,058
05/06/15 (Wed) 338 2,989 9,830,327,570 153,056,406 64 1,442,212,561 2,247,914 3,525,546 1,225,826 2,453,942 557,639 374,309 129,160
05/05/15 (Tue) 293 2,967 10,392,149,949 154,806,221 67 1,593,242,356 2,000,392 3,366,955 1,126,177 2,324,533 488,949 329,102 171,067
05/04/15 (Mon) 488 2,971 10,483,718,093 162,479,487 65 1,547,975,267 2,311,179 3,733,307 1,363,409 2,678,057 712,951 528,518 212,059
05/03/15 (Sun) 125 484 11,161,696,099 217,504,812 51 1,884,717,953 331,006 1,783,868 1,243,395 270,902 222,981 156,128 23,917
05/02/15 (Sat) 1,433 540 8,114,391,833 164,345,414 49 735,325,461 444,568 1,853,376 24,483,171 3,078,655 1,889,151 1,496,027 132,360
05/01/15 (Fri) 374 2,735 11,611,724,202 126,877,164 92 1,815,943,455 2,450,987 3,063,577 1,458,166 2,046,184 590,195 411,705 3,268,221
323279
1,879,551,6621,901,654,803
293488
1,593,242,3561,547,975,267
© 2015 Progress Software Corporation.43
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
41 10:05:01 2 3,033 106,609,219 36,595,922 3 11,840,580 9,664 31,257 2,338 19,189 4,858 4,499 88,92942 10:20:01 2 3,052 102,685,946 27,744,828 4 12,858,412 10,923 33,024 2,407 21,026 5,858 4,660 99,36543 10:35:02 2 3,082 97,655,645 1,303,656 75 17,250,593 13,814 38,947 3,318 27,611 7,075 5,046 107,79944 10:50:01 3 3,089 81,674,392 1,293,722 63 14,030,509 22,289 36,321 5,409 25,428 7,604 5,516 100,92045 11:05:01 7 3,086 214,447,121 1,716,183 125 61,973,396 40,122 58,595 29,276 59,987 13,919 8,509 30,33246 11:20:01 5 3,039 155,915,767 1,492,150 104 34,202,285 25,758 57,494 14,197 48,054 8,778 4,993 4,49847 11:35:01 4 3,040 156,151,501 1,434,288 109 34,103,824 27,304 60,045 7,791 48,323 8,285 4,571 3,95248 11:50:01 5 2,888 146,245,414 1,666,633 88 33,019,801 33,605 60,463 11,379 52,606 8,566 5,226 2,711
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
41 10:05:01 4 2,848 153,746,951 2,343,049 66 28,340,783 30,489 60,805 7,666 55,818 7,757 4,774 5,92842 10:20:01 5 2,812 145,441,871 1,755,123 83 27,387,070 26,490 59,676 13,279 53,195 6,642 4,962 3,63743 10:35:01 4 2,877 151,783,516 1,876,013 81 27,653,297 30,446 61,754 11,262 54,906 7,899 5,192 6,77744 10:50:01 7 2,894 143,780,080 1,877,651 77 29,215,543 46,234 63,980 19,392 66,429 11,820 6,774 7,18545 11:05:02 4 2,912 158,495,087 1,808,835 88 30,191,428 34,806 63,070 12,040 59,041 9,215 5,284 10,16546 11:20:01 6 2,897 155,845,110 2,259,787 69 27,841,346 34,727 60,416 12,770 59,998 8,929 5,497 7,23747 11:35:01 8 2,938 150,662,822 2,195,738 69 25,976,744 70,239 83,419 20,193 82,777 12,552 8,542 6,03548 11:50:01 4 2,914 138,147,804 1,774,289 78 23,570,981 31,236 59,286 12,957 57,756 7,696 4,971 2,731
Good Day – 15 minute samples
Bad Day – 15 minute samples
Digging Deeper
© 2015 Progress Software Corporation.44
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
41 10:05:01 2 3,033 106,609,219 36,595,922 3 11,840,580 9,664 31,257 2,338 19,189 4,858 4,499 88,92942 10:20:01 2 3,052 102,685,946 27,744,828 4 12,858,412 10,923 33,024 2,407 21,026 5,858 4,660 99,36543 10:35:02 2 3,082 97,655,645 1,303,656 75 17,250,593 13,814 38,947 3,318 27,611 7,075 5,046 107,79944 10:50:01 3 3,089 81,674,392 1,293,722 63 14,030,509 22,289 36,321 5,409 25,428 7,604 5,516 100,92045 11:05:01 7 3,086 214,447,121 1,716,183 125 61,973,396 40,122 58,595 29,276 59,987 13,919 8,509 30,33246 11:20:01 5 3,039 155,915,767 1,492,150 104 34,202,285 25,758 57,494 14,197 48,054 8,778 4,993 4,49847 11:35:01 4 3,040 156,151,501 1,434,288 109 34,103,824 27,304 60,045 7,791 48,323 8,285 4,571 3,95248 11:50:01 5 2,888 146,245,414 1,666,633 88 33,019,801 33,605 60,463 11,379 52,606 8,566 5,226 2,711
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
41 10:05:01 4 2,848 153,746,951 2,343,049 66 28,340,783 30,489 60,805 7,666 55,818 7,757 4,774 5,92842 10:20:01 5 2,812 145,441,871 1,755,123 83 27,387,070 26,490 59,676 13,279 53,195 6,642 4,962 3,63743 10:35:01 4 2,877 151,783,516 1,876,013 81 27,653,297 30,446 61,754 11,262 54,906 7,899 5,192 6,77744 10:50:01 7 2,894 143,780,080 1,877,651 77 29,215,543 46,234 63,980 19,392 66,429 11,820 6,774 7,18545 11:05:02 4 2,912 158,495,087 1,808,835 88 30,191,428 34,806 63,070 12,040 59,041 9,215 5,284 10,16546 11:20:01 6 2,897 155,845,110 2,259,787 69 27,841,346 34,727 60,416 12,770 59,998 8,929 5,497 7,23747 11:35:01 8 2,938 150,662,822 2,195,738 69 25,976,744 70,239 83,419 20,193 82,777 12,552 8,542 6,03548 11:50:01 4 2,914 138,147,804 1,774,289 78 23,570,981 31,236 59,286 12,957 57,756 7,696 4,971 2,731
Good Day – 15 minute samples
Bad Day – 15 minute samples
Digging Deeper
© 2015 Progress Software Corporation.45
Some Metrics
Date CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
05/16/15 (Sat) 86 580 7,948,358,747 149,508,775 53 1,072,918,966 425,542 1,922,412 374,270 175,980 172,163 116,490 14,798
05/15/15 (Fri) 261 2,792 12,987,557,626 114,936,649 113 2,465,462,910 1,520,296 3,227,568 1,626,692 1,886,768 449,449 295,333 115,003
05/14/15 (Thu) 263 3,011 11,000,344,090 56,940,273 193 2,090,165,002 1,639,564 3,475,092 871,808 2,023,720 454,097 298,088 73,017
05/13/15 (Wed) 323 3,126 10,371,051,213 55,142,493 188 1,879,551,662 2,250,168 3,423,760 1,099,930 2,378,306 525,070 374,006 885,294
05/12/15 (Tue) 279 3,089 10,567,333,668 140,530,655 75 1,901,654,803 1,797,520 3,397,238 1,043,849 2,068,487 496,165 328,450 943,510
05/11/15 (Mon) Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart
05/10/15 (Sun) 73 385 10,806,473,996 206,617,341 52 2,307,235,660 307,377 1,804,694 368,589 100,764 150,257 102,579 244,087
05/09/15 (Sat) 88 504 5,704,394,389 82,411,186 69 693,023,191 483,479 1,423,644 516,069 171,617 165,926 115,092 186,064
05/08/15 (Fri) Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart Restart
05/07/15 (Thu) 271 2,940 10,046,740,997 145,481,723 69 1,596,756,358 1,705,661 3,503,669 924,082 2,153,671 455,228 306,003 128,058
05/06/15 (Wed) 338 2,989 9,830,327,570 153,056,406 64 1,442,212,561 2,247,914 3,525,546 1,225,826 2,453,942 557,639 374,309 129,160
05/05/15 (Tue) 293 2,967 10,392,149,949 154,806,221 67 1,593,242,356 2,000,392 3,366,955 1,126,177 2,324,533 488,949 329,102 171,067
05/04/15 (Mon) 488 2,971 10,483,718,093 162,479,487 65 1,547,975,267 2,311,179 3,733,307 1,363,409 2,678,057 712,951 528,518 212,059
05/03/15 (Sun) 125 484 11,161,696,099 217,504,812 51 1,884,717,953 331,006 1,783,868 1,243,395 270,902 222,981 156,128 23,917
05/02/15 (Sat) 1,433 540 8,114,391,833 164,345,414 49 735,325,461 444,568 1,853,376 24,483,171 3,078,655 1,889,151 1,496,027 132,360
05/01/15 (Fri) 374 2,735 11,611,724,202 126,877,164 92 1,815,943,455 2,450,987 3,063,577 1,458,166 2,046,184 590,195 411,705 3,268,221
885,294943,510
128,058129,160
3,268,211
© 2015 Progress Software Corporation.46
Latch Timeouts increased. CRUD Operations Decreased. Why? Nothing had changed
© 2015 Progress Software Corporation.47
Further investigation revealed that the –spin setting was changed from 96,000 to 20,000.
• This change was a move to best practices where so called “industry experts”
have been saying to not have –spin higher than 20,000
© 2015 Progress Software Corporation.48
The change was made months back to the conmgr.properties file and was long forgotten.
When the patch was applied, the database was bounced and the change finally took affect
While no one remembers a configuration change, the change was there
Setting –spin back up to 96,000 got them the performance back
© 2015 Progress Software Corporation.49
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
41 10:05:01 2 3,033 106,609,219 36,595,922 3 11,840,580 9,664 31,257 2,338 19,189 4,858 4,499 88,92942 10:20:01 2 3,052 102,685,946 27,744,828 4 12,858,412 10,923 33,024 2,407 21,026 5,858 4,660 99,36543 10:35:02 2 3,082 97,655,645 1,303,656 75 17,250,593 13,814 38,947 3,318 27,611 7,075 5,046 107,79944 10:50:01 3 3,089 81,674,392 1,293,722 63 14,030,509 22,289 36,321 5,409 25,428 7,604 5,516 100,92045 11:05:01 7 3,086 214,447,121 1,716,183 125 61,973,396 40,122 58,595 29,276 59,987 13,919 8,509 30,33246 11:20:01 5 3,039 155,915,767 1,492,150 104 34,202,285 25,758 57,494 14,197 48,054 8,778 4,993 4,49847 11:35:01 4 3,040 156,151,501 1,434,288 109 34,103,824 27,304 60,045 7,791 48,323 8,285 4,571 3,95248 11:50:01 5 2,888 146,245,414 1,666,633 88 33,019,801 33,605 60,463 11,379 52,606 8,566 5,226 2,711
Bad Day – 15 minute samples
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
41 10:05:01 4 2,848 153,746,951 2,343,049 66 28,340,783 30,489 60,805 7,666 55,818 7,757 4,774 5,92842 10:20:01 5 2,812 145,441,871 1,755,123 83 27,387,070 26,490 59,676 13,279 53,195 6,642 4,962 3,63743 10:35:01 4 2,877 151,783,516 1,876,013 81 27,653,297 30,446 61,754 11,262 54,906 7,899 5,192 6,77744 10:50:01 7 2,894 143,780,080 1,877,651 77 29,215,543 46,234 63,980 19,392 66,429 11,820 6,774 7,18545 11:05:02 4 2,912 158,495,087 1,808,835 88 30,191,428 34,806 63,070 12,040 59,041 9,215 5,284 10,16546 11:20:01 6 2,897 155,845,110 2,259,787 69 27,841,346 34,727 60,416 12,770 59,998 8,929 5,497 7,23747 11:35:01 8 2,938 150,662,822 2,195,738 69 25,976,744 70,239 83,419 20,193 82,777 12,552 8,542 6,03548 11:50:01 4 2,914 138,147,804 1,774,289 78 23,570,981 31,236 59,286 12,957 57,756 7,696 4,971 2,731
Good Day – 15 minute samples
Changed –spin online
But WAIT! There’s more
© 2015 Progress Software Corporation.51
A different customer added a few CPUs to their environment.
When the users login, the CPUs peg to 100% utilized
Performance suffers
WebSpeed launches additional Agents
• Due to all agents are busy
Specifics
Customer database is > 1 TB
430 Webspeed agents
AIX
10.1C 64bit
© 2015 Progress Software Corporation.52
Sample CPs Users DB Requests DB Reads Ratio Rec Rds Rec Cr Rec Up Rec Del DB Writes BI Writes AI Writes Latch TO
171 14:10:00 0 433 226,177,323 38,138 5,930 92,799,768 30,190 6,316 0 4,999 3,311 1,859 16,058172 14:15:00 1 434 230,752,781 45,463 5,076 95,312,150 15,075 5,079 0 5,253 1,218 1,022 15,475173 14:20:00 1 432 227,272,169 47,269 4,808 90,336,327 21,173 5,439 81 4,904 2,366 1,409 21,450174 14:25:00 0 433 174,847,554 50,871 3,437 68,054,671 14,028 5,468 0 5,425 1,797 976 8,608175 14:30:00 0 433 201,167,309 53,661 3,749 72,196,316 15,100 5,109 0 6,304 1,808 1,032 20,395176 14:35:00 1 433 272,198,783 69,170 3,935 104,501,989 25,086 7,389 4 5,431 2,913 1,597 48,935177 14:40:00 1 434 261,870,509 100,191 2,614 97,340,387 23,871 5,737 42 7,930 1,784 1,504 58,078178 14:45:01 0 434 264,460,116 391,712 675 103,827,717 19,668 5,931 46 8,671 1,694 1,284 93,499179 14:50:00 1 434 249,536,969 779,568 320 88,726,157 23,008 5,444 0 7,872 2,543 1,453 123,551180 14:55:00 1 435 279,690,881 155,333 1,801 108,846,566 22,640 7,233 24 6,050 2,668 1,470 72,701181 15:00:00 0 433 268,670,791 539,202 498 104,316,852 24,230 7,554 0 6,147 2,677 1,557 64,974182 15:05:00 1 435 259,585,414 161,012 1,612 107,194,651 21,828 6,552 63 7,095 1,739 1,375 38,665183 15:10:00 1 435 245,056,072 316,736 774 101,424,285 25,343 5,862 0 5,973 2,401 1,522 28,853
© 2015 Progress Software Corporation.53
Unlike the previous example, we had no historical performance metrics to compare to when thing were
good.
Could only rely on instincts and experience.
© 2015 Progress Software Corporation.54
A Different View
In a 5 minute sample, the highest latch timeout should be no more than 3,000
© 2015 Progress Software Corporation.55
Changed –spin from 60,000 to 20,000 and the problem went away
© 2015 Progress Software Corporation.56
Lesson Learned
There is no one setting that will work for every situation• Changing –spin from 20,000 to 96,000 helped one customer
• Changing –spin from 60,000 to 20,000 helped another one
Having historical data is key
Don’t assume nothing has changed just because they said so• Configuration changes usually only take affect at next startup
© 2015 Progress Software Corporation.57
Summary
These are examples of some real world Database Problems
Don’t assume things can’t go wrong
Having a plan is not going enough • Testing the plan and having confidence is required
If all else fails, seek professional help
AnswersEmail: [email protected]