TDTWG Report to RMS


Page 1: TDTWG  Report to RMS


TDTWG Report to RMS

Recommended Solutions for SCR 745

ERCOT Unplanned System Outages and Failures

Wednesday, August 10th

Page 2: TDTWG  Report to RMS


SCR 745 Status

• The original SCR 745 requested that ERCOT perform an in-depth analysis to determine root causes of unplanned system outages.

• At the July 13 RMS meeting, RMS approved the revised SCR 745, which included a full evaluation of ERCOT systems as well as options for resolving ERCOT system failures and outages.

• On July 13th, an SCR 745 workshop was scheduled so that Market Participants and their technical experts could thoroughly evaluate the options and determine recommended solutions.

• This presentation includes the recommended solutions made at that workshop.

• They are consistent with the ERCOT recommended solutions that were presented to RMS on July 13.

Page 3: TDTWG  Report to RMS


Retail Systems

• NAESB

• PaperFree

• TCH-EAI (Transaction Clearing House)

• All Retail (Database Server)

[Diagram: retail system architecture. Labeled components: Market Participant, Internet, Firewall, DMZ, Switch, Proxies, NAESB (Inbound, Outbound, In/Out), PaperFree Process Servers, PaperFree File Server, PaperFree, TCH, EAI, Siebel, TCH Database, and a Single Retail Database Server (multiple Oracle databases: Siebel Database, NAESB Database, PaperFree Database). Platform labels: Solaris, W2K, HP. Key: Bi-Directional Data Flow, Outbound Data Flow, Inbound Data Flow, Single Point of Failure.]

Page 4: TDTWG  Report to RMS


System Options

All options presented to RMS at the July meeting were discussed in depth at the workshop. These included:

1 of 2 options for NAESB Proxy Server improvements

1 of 3 options for NAESB Application

1 of 2 options for PaperFree improvements

1 of 3 options for Database Server for All Retail

Page 5: TDTWG  Report to RMS


Review of NAESB Proxy Server Options

Option 1 – Fully Clustered* V880 Solution – 4 V880 NAESB Proxy Servers

Summary – Maximum reliability solution. This option will provide a fully clustered, fault-tolerant solution and an opportunity to consolidate the current 18 production proxy servers, including the servers identified in Option 2.

This option virtually eliminates the potential for NAESB proxy outages, unplanned or planned.

This option will provide 99.99% availability for the NAESB proxy servers.

*Cluster: A group of servers that are typically on different physical machines and have the same applications configured within them, but operate as a single logical server.
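
To put the 99.99% availability figure in concrete terms, the sketch below converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration; the slides do not say how ERCOT measures availability.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def allowed_downtime_minutes(availability_percent: float,
                             hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget, in minutes, implied by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * hours * 60.0


# 99.99% availability leaves about 52.56 minutes of downtime per year.
print(f"{allowed_downtime_minutes(99.99):.2f} minutes per year")
```

Under these assumptions, the target allows a little under an hour of total proxy downtime per year, planned or unplanned.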

Page 6: TDTWG  Report to RMS

Review of NAESB Proxy Server Options

Option 2 – 4 V120 NAESB Proxy Servers.

Summary – Minimum reliability solution.

This option will provide redundancy to address the single point of failure. Two servers will be located in Taylor and two servers will be located in Austin.

This will not be a clustered solution; it will be a load-balanced solution. V120 servers cannot cluster.

This solution will reduce the frequency and duration of proxy outages. It is less costly than Option 1 but also less robust.
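
As a rough illustration of what a load-balanced (non-clustered) arrangement means here, the sketch below rotates requests across four proxy servers at two sites and skips any server marked unhealthy. The class names, server names, and health flag are hypothetical; the slides do not describe the load balancer ERCOT would actually use.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class ProxyServer:
    name: str   # hypothetical server name
    site: str   # "Taylor" or "Austin", per the slide
    healthy: bool = True


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping unhealthy ones."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._ring = cycle(self._servers)

    def next_server(self) -> ProxyServer:
        # Try each server at most once per call.
        for _ in range(len(self._servers)):
            candidate = next(self._ring)
            if candidate.healthy:
                return candidate
        raise RuntimeError("no healthy proxy servers available")


servers = [
    ProxyServer("proxy-taylor-1", "Taylor"),
    ProxyServer("proxy-taylor-2", "Taylor"),
    ProxyServer("proxy-austin-1", "Austin"),
    ProxyServer("proxy-austin-2", "Austin"),
]
balancer = RoundRobinBalancer(servers)

servers[0].healthy = False  # simulate a single server failure
picks = [balancer.next_server().name for _ in range(4)]
print(picks)
```

With one server down, traffic keeps flowing to the remaining three, which is the redundancy this option describes. Unlike the clustered option, the servers do not operate as a single logical server.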

Page 7: TDTWG  Report to RMS


NAESB Proxy Server Option Recommended

Those attending the workshop determined that Option 1 – Fully Clustered* V880 Solution – best meets the needs of the Market.

Primary discussion points included:

This solution is the most robust for NAESB Proxy Servers and will provide the best reliability of the two options.

Those attending the workshop support this option since it will allow more flexibility for maintenance.

This option is more expensive, but the Market improvements that will result from implementing it more than make up for the cost difference.

Page 8: TDTWG  Report to RMS

Review of NAESB Application Options

Option 3 - Separate Application Server Cluster

This option moves peripheral NAESB processes (data encryption, decryption) to the PaperFree cluster and separates inbound and outbound transmissions to disconnected clusters.

Page 9: TDTWG  Report to RMS

Review of NAESB Application Options

Option 4 – Hybrid Application Cluster

This option creates an application cluster for inbound transactions and moves outbound transaction processing to the PaperFree system in order to utilize PaperFree’s load balancing and high availability capabilities.

Page 10: TDTWG  Report to RMS

Review of NAESB Application Options

Option 5 – Combined Application Cluster

This option combines inbound and outbound transaction processing into a single application cluster.

Page 11: TDTWG  Report to RMS


NAESB Proxy Server Application Recommendation

Those attending the workshop determined that Option 4 – Hybrid Application Cluster best meets the needs of the Market.

Primary discussion points included:

This solution was found to be the most likely to succeed in that it fits best with the PaperFree application.

This option utilizes PaperFree’s load balancing and high availability capabilities.

Those attending the workshop believe this option gives ERCOT the best ability to perform maintenance.

Page 12: TDTWG  Report to RMS


NAESB Proxy Server and Application Costs

Option 1 V880 Server Cluster $370,000

Option 4 Hybrid Application Cluster $165,000

An additional cost of $66,105 identified for Training, Business Process and Monitoring.

Page 13: TDTWG  Report to RMS


Review of PaperFree Options

Option 1 – Clustered File System Server solution

This option represents the maximum availability solution.

[Diagram: PaperFree (Option 1). Labeled components: PaperFree Process Servers, file server cluster virtual IP, PaperFree File Server Cluster, PaperFree, TCH Database, and the Retail Database Server (Oracle databases: Siebel Database, NAESB Database, PaperFree Database). Platform labels: W2K, HP. Key: Bi-Directional Data Flow, Outbound Data Flow, Inbound Data Flow, Single Point of Failure.]

Page 14: TDTWG  Report to RMS


Review of PaperFree Options

Option 2 – Local File System Solution

– This option supports the load balancing applications

– The system will still be active with a single server failure; however, server interruptions may result in delays in processing persistent data for the server experiencing an interruption.

[Diagram: PaperFree (Option 2). Labeled components: PaperFree Process Servers, PaperFree, TCH Database, and the Retail Database Server (Oracle databases: NAESB Database, PaperFree Database). Platform label: HP. Key: Bi-Directional Data Flow, Outbound Data Flow, Inbound Data Flow, Single Point of Failure.]

Page 15: TDTWG  Report to RMS

PaperFree Recommendation

Those attending the workshop determined that Option 1 – Clustered File System Server solution best meets the needs of the Market.

Primary discussion points included:

• Best solution to protect real-time data

• Least expensive to implement and more effective than the other option

An additional cost of $10,815 identified for Training, Business Process and Monitoring.

Page 16: TDTWG  Report to RMS


Review of All Retail System

[Diagram: retail system architecture, with the All Retail (Database Server) called out. Labeled components: Internet, Firewall, DMZ, Switch, Proxies, NAESB (Inbound, Outbound, In/Out), PaperFree Process Servers, PaperFree File Server, PaperFree, TCH, EAI, Siebel, TCH Database, and a Single Retail Database Server (multiple Oracle databases: Siebel Database, NAESB Database, PaperFree Database). Platform labels: Solaris, W2K, HP. Key: Bi-Directional Data Flow, Outbound Data Flow, Inbound Data Flow, Single Point of Failure.]

Page 17: TDTWG  Report to RMS


Review of Database Server High Availability Options

Option 1 - All HP-UX Oracle Real Application Cluster (RAC)

Option 2 - All Linux Oracle Real Application Cluster (RAC)

For options 1 and 2:

Provides active redundancy for database connectivity for all retail databases

Complex to implement

Removes single point of failure at the database server level

Page 18: TDTWG  Report to RMS


Database Server High Availability Options

Option 3:

– NAESB Linux Oracle RAC and a different standby/cluster solution for the rest of the Retail databases

• Provides active redundancy for database connectivity for the NAESB database

• Less complex to implement, as the NAESB database is small and easier to migrate

• Provides the option to migrate PaperFree and Siebel into this RAC

• Removes single point of failure at the database server level

– Veritas cluster, Oracle Standby, or Oracle RAC for other databases on HP-UX or Linux, as appropriate to availability requirements

• Phased implementation: NAESB first and other databases next

• Removes single point of failure at the database server level

Page 19: TDTWG  Report to RMS

Database Server Recommendation

Those attending the workshop determined Option 3, NAESB Linux Oracle RAC and a different standby/cluster solution for the rest of the Retail databases, to be the recommended solution.

Primary discussion points included:

This solution provides a high availability option for the NAESB database.

Will provide appropriate high availability solutions for the rest of the retail databases in subsequent phases.

Easier to implement in phased manner addressing acute availability needs first.

An additional cost of $79,685 identified for Training, Business Process and Monitoring.

Page 20: TDTWG  Report to RMS

Summary of Costs for PaperFree and All Retail Database Recommendations

PaperFree Clustered File System Server solution: $75,000

All Retail Database Solution

• Hardware – $400,000 - $600,000

• Cluster SW – $100,000 - $400,000

• Oracle RAC SW – $0 - $400,000

• Cluster Ext Service – $0 - $120,000

• Oracle RAC Ext Service – $120,000 - $180,000

• Internal project cost (FTE) – $120,000 - $180,000

Total: $1,650,000
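
As a quick arithmetic check on the All Retail Database line items, the sketch below sums the low and high ends of each range. The dictionary keys are labels chosen for illustration. The slide does not state how the $1,650,000 total was derived, so the code only checks that it falls within the summed range.

```python
# (low, high) cost estimates in dollars, taken from the slide's line items.
cost_ranges = {
    "Hardware": (400_000, 600_000),
    "Cluster software": (100_000, 400_000),
    "Oracle RAC software": (0, 400_000),
    "Cluster external service": (0, 120_000),
    "Oracle RAC external service": (120_000, 180_000),
    "Internal project cost (FTE)": (120_000, 180_000),
}

low_total = sum(low for low, _ in cost_ranges.values())
high_total = sum(high for _, high in cost_ranges.values())

STATED_TOTAL = 1_650_000  # the total given on the slide

print(f"Low estimate:  ${low_total:,}")
print(f"High estimate: ${high_total:,}")
print("Stated total within range:", low_total <= STATED_TOTAL <= high_total)
```

The ranges sum to $740,000 at the low end and $1,880,000 at the high end, so the stated total sits toward the upper part of the range.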

Page 21: TDTWG  Report to RMS

Motion for RMS

Approve the following recommendations to be implemented in order to resolve ERCOT system outages and system failures.

Recommendations include:

NAESB Proxy Server Option 1, V880 Server Cluster solution at a projected cost of $370,000

NAESB Application Option 4, Hybrid Application Cluster solution at a projected cost of $165,000

PaperFree Clustered File System Server solution at a projected cost of $75,000

Database Server Option 3, All Retail Database Solution, at a projected cost of $1,650,000
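
For reference, the projected costs in the motion can be totaled as in the sketch below. The additional Training, Business Process and Monitoring costs from the earlier slides are tallied separately, because the slides do not say whether they are included in the projected figures. The dictionary labels are shorthand introduced here.

```python
# Projected costs listed in the motion, in dollars.
motion_costs = {
    "NAESB Proxy Server Option 1 (V880 Server Cluster)": 370_000,
    "NAESB Application Option 4 (Hybrid Application Cluster)": 165_000,
    "PaperFree Clustered File System Server": 75_000,
    "Database Server Option 3 (All Retail Database)": 1_650_000,
}

# Additional Training, Business Process and Monitoring costs from earlier slides.
additional_costs = {
    "NAESB Proxy Server and Application": 66_105,
    "PaperFree": 10_815,
    "Database Server": 79_685,
}

motion_total = sum(motion_costs.values())
additional_total = sum(additional_costs.values())

print(f"Motion total:     ${motion_total:,}")
print(f"Additional costs: ${additional_total:,}")
```

The four recommendations sum to $2,260,000, and the separately identified additional costs sum to $156,605.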

Page 22: TDTWG  Report to RMS


Thank You