University of Oxford - WhatDoTheyKnow


Transcript of University of Oxford - WhatDoTheyKnow

UNIVERSITY OF OXFORD University Offices, Wellington Square, Oxford OX1 2JD

General Enquiries Tel: +44 (0)1865 270000 Fax: +44 (0)1865 270222 Email: [email protected] Web: www.ox.ac.uk


Ref. FOI/Aldover

25 July 2012

Reply to request for information under the Freedom of Information Act

Your Ref: E-mail dated 26 June 2012

Address: WhatDoTheyKnow.com

Request

I've heard that the University's OXAM service, which appears to be essentially a repository for old examination papers, has been unexpectedly unavailable for a period of time during the main ‘examination season’ in May/June 2012.

Could you please provide me with all the records you hold discussing this incident? That is, I refer to all records created DURING OR AFTER the incident, which for example includes emails, correspondence, file notes, memoranda, reports compiled for managers or committees, notes or minutes of committees discussing this etc.

To avoid unnecessarily broad searches, I am happy to exclude emails from individual students which go no further than noting the unavailability of OXAM and asking for help, as they may have been sent to any number of IT support helpdesks within the University. The focus of my request is really more on the University's handling of and internal discussion on the incident.

1. I attach:

Annex A: E-mail correspondence relating to the non-availability of OXAM from 4.00 pm on Friday 4 May to 9.30 pm on Sunday 6 May.

Annex B: Major incident report dated 10 May 2012.

Annex C: Further report on the same incident, headed ‘OXAM service unavailability 17:30 4/5/12 to 21:30 6/5/12’.

Annex D: Minutes of meetings.

2. We have redacted from these documents names and other personal data, the disclosure of which would breach any of the data protection principles in Schedule 1 to the Data Protection Act 1998. Specifically, we consider that the disclosure of this information under the Freedom of Information Act (FOIA) would breach the first data protection principle, which requires that personal data shall be processed fairly and lawfully. There are two reasons for this.


a) Disclosure would be unfair to the individuals concerned, and contrary to their reasonable and legitimate expectations. They would not reasonably expect that their personal data would be made public without their consent.

b) The first data protection principle requires that any disclosure must satisfy one of the conditions set out in Schedule 2 to the Data Protection Act. There are six conditions altogether: we do not consider that any of them would be satisfied in respect of the disclosure.

The redacted information is therefore exempt from disclosure under Section 40(2) of the FOIA. However, we have identified the most senior officers involved, in most cases, by their post title.

INTERNAL REVIEW

3. If you are dissatisfied with this reply, you may ask the University to review it, by writing to the Registrar at the following address:

University Offices, Wellington Square, Oxford OX1 2JD

Alternatively, you may request a review by e-mailing [email protected].

THE INFORMATION COMMISSIONER

4. If, after the internal review, you are still dissatisfied, you have the right under FOIA to apply to the Information Commissioner for a decision as to whether your request has been dealt with in accordance with the FOIA. The Information Commissioner’s address is:

Information Commissioner, Wycliffe House, Water Lane, Wilmslow SK9 5AF

Tel: 0303 123 1113

Further information for submitting complaints to the Information Commissioner is available at http://www.ico.gov.uk/complaints.aspx

FOI OXFORD


Annex A E-mails

E-mail dated Saturday 5 May 2012 from Oxford University Computing Services (OUCS) to Business Services and Projects (BSP)

Sorry, not entirely sure what the support address is for OxAM when it's unavailable but we are receiving reports (from anxious students) that the service is down (whether within (<http://missun29.offices.ox.ac.uk/pls/oxam/main>) or outwith (<http://ezproxy.bodleian.ox.ac.uk:2048/login?url=http://missun29.offices.ox.ac.uk>) the Oxford network). I realise it's a bank holiday but just emailing in case someone is able to kick the server or whatever the solution might be. If the service becomes available again, can you let me know so that I can respond to the tickets in our helpdesk?

E-mail dated 5 May from BSP to OUCS

The system is down due to electrical maintenance work in Wellington Square this weekend.

E-mail dated 5 May from OUCS to BSP

Thanks for letting me know. Is this announced anywhere (Oxam in particular)?

E-mail dated Sunday 6 May from BSP to OUCS

Just wanted to let you know that Oxam is now available.

E-mail dated 6 May from OUCS to BSP

Great -- I'll let those who contacted OUCS know.

Internal Student Administration E-mail dated 6 May

Not sure if you were aware that Oxam went down due to the electrical work in Welly Sq. XXX [OUCS] discovered that although BSP support the service, the servers are located within a Welly Sq cluster (was news to us), so one to be aware of in case of any future Welly Sq electrical works announcements.

You might therefore get some 'no prior warning' complaints from students/tutors next week....

E-mail dated 6 May from OUCS to Head, Student Administration

Just for noting: I am responding to a number of tickets raised by finalists about the current unavailability of Oxam (those who link through from Weblearn will assume it's an OUCS service). A reply from XXXX yesterday informed me it was unavailable this weekend due to electrical work. Do you know if this was announced anywhere? I am surprised (not sure why) that electrical work at University Offices should bring down a service like Oxam. I guess that's partly why it's being migrated to Weblearn...


Internal BSP E-mail dated 6 May

Given the impact of Oxam being down is there any chance you could try and bring it back up when you are in tomorrow? It seems OUCS have had quite a few calls about it.

Internal BSP E-mail dated 6 May

Have just read the e-mail from XXX [student] and will try and get the system back up today then. Will update you later.

Internal BSP E-mail dated 6 May

Thanks for responding to this XXX. Do you need any assistance in bringing OXAM up?

Internal BSP E-mail dated 6 May

OXAM is up and running, just managed to download past paper.

E-mail dated 6 May from Student Representative to BSP

In my capacity as XXXX [Student Rep], I have received several communications from XXXX students affected by the non-availability of the OXAM service. I have made enquiries with OUCS, where XXXXX has confirmed that OXAM was maintained by BSP and that XX was told the service was "unavailable this weekend due to electrical maintenance work at University Offices and is unlikely to be back again before Tuesday". As you will certainly appreciate, this extended downtime comes at a time in the academic year when a large number of students are preparing for Finals, first-year examinations or postgraduate examinations, all of whom rely on access to OXAM to download past examination papers.

In this context, I would like to ask you the following questions - please feel free to forward them to the person best placed to answer:

1) Is the electrical maintenance work referred to above planned maintenance work, or unplanned (emergency) maintenance work?

2) If it is planned maintenance work, has the impact on services such as OXAM been considered and has any thought been given to the possibility of contingency measures to keep OXAM up and running ("backup site")?

3) If it is planned maintenance work, why has there been no prior announcement to students, neither via email nor, as far as I can see, via any website?

4) If it was emergency maintenance work, I do of course understand that this is something outside of your control and responsibility. However, would it not have been possible to inform students via an email or via a website announcement e.g. on http://www.ox.ac.uk or even on http://www.oucs.ox.ac.uk (the OUCS website being the website probably most consulted by students in such cases, even if OUCS is not responsible)? XXX of OUCS has also informed me that the OUCS helpdesk has had to respond to many queries over the weekend about this issue, so they too would have probably been glad about some form of prominent announcement.


I note that http://www.ox.ac.uk/students features a message saying "Student Self Service will be unavailable between 8am-4pm on Sat 5 May, due to essential maintenance. We apologise for any inconvenience this may cause." This may or may not refer to the same maintenance work as mentioned by XXX of OUCS. However, the message does not seem to include OXAM nor does it extend beyond Saturday afternoon.

In view of the large number of students affected by this and the many representations and complaints I received, I very much look forward to your response. Again, I emphasize that certainly everyone understands that power outages and other electrical failures are not anyone's fault. However, if this turns out to be on account of planned electrical maintenance which was not announced to students, so that they could have made the necessary preparations (e.g. downloading all examination papers from OXAM which they need), then I believe this is a rather regrettable oversight.

E-mail dated 6 May from BSP to Student Rep

I will send a more formal response when I'm back in the office on Tuesday as I will need input from others. Oxam is however now available.

E-mail dated 6 May from Student Rep to BSP

Thanks for the notification - that is much appreciated and I'll pass it on to people who have contacted me!

Internal BSP E-mail dated 6 May

Below was the brief holding email I sent to XXX [Student Rep] on Sunday evening when service was restored.

E-mail dated 6 May from OUCS to Head, Student Administration

Just to keep you informed (and to reassure you I am not encouraging revolt) – XXXX [Student Rep] received a standard reply from me using the information I received from XXXX [BSP] yesterday.

E-mail dated 6 May from Head, Student Administration to OUCS

Thanks XXX - I'll see what I can find out from XXXX [BSP] etc (I'm in tomorrow for Exams Panel).

Internal Student Admin E-mail dated 6 May

XXX phoned BSP at 5 on Friday after we'd discovered OXAM down, and were told there was nothing to be done until Tues, that they couldn't put 'out of order' webpage up and hadn't realised we needed to be informed. I'm following up with XXX (BSP) tomorrow.


Internal BSP e-mail dated 6 May

I have powered back on all Sun servers and restored NFS links including those to the admin interface server.

I have brought up OXAM and checked as requested by XX.

All systems can be checked on Tuesday morning, I will be in for 8am.

Internal BSP e-mail dated 6 May

Many thanks for your help with this unexpected power outage and for responding to the need to restore OXAM sooner than expected.

E-mail dated 7 May from Head, Student Admin to Director, BSP

I'm not sure whether you are aware of the issues experienced with OXAM over the weekend (see correspondence)? It seems as though the electrical work that was being undertaken in the University offices over the weekend affected the OXAM service, which was unavailable until last night at around 9pm. We were not aware that the OXAM servers are located within a cluster at Wellington Square and therefore were not able to pre-warn students and staff of this outage (if we had been aware, we would have made every effort to request that the work be postponed given the time of year).

Given that I am at an Exams Panel meeting this afternoon (chaired by Sally Mapstone, with XXX (OUSU) and XXXX [Student Admin] both present, so I would expect this matter to be raised), I will mention this under AOB. I've not seen a response to XXXX [Student Rep] yet, XXX [OUCS] forwarded this correspondence onto me so that I was made aware - but I'd be grateful if I could be copied into the reply please.

E-mail dated 7 May from Head, Student Admin to OUCS

Thanks XXX - apparently the servers are located in Welly Sq, but we didn't know (otherwise we'd have requested that the work was undertaken at a different time, given it's exams season).

E-mail dated 7 May from OUSU to Head, Student Admin

Thanks for taking this up. I'll get in touch with XXX [Student Rep] and let him know that the university is looking into it and will be getting in touch soon if they haven't already.

E-mail dated 7 May from OUCS to Head, Student Admin

XXX [Student Rep] noted that XXX [BSP] emailed when service resumed last night though I suspect it was only to confirm availability rather than any detailed answer to questions.

E-mail dated 7 May from Head, Student Admin to OUCS

Thanks - XXX (OUSU) is going to ring XXXX [Student Rep] and report back at Exams Panel this afternoon...


E-mail dated 7 May from OUCS to Head, Student Admin

XXXX [Student Rep] mentioned the OUCS helpdesk "has had to respond to many queries over the weekend". I should clarify that this was his interpretation of "We have received a number of reports". In fact, I handled nine reports, though two represented a wider set of users (XXX [Student Rep] and a JCR President); others referred to their friends. Of course, Student Information may have received many more and my hunch is that students are less likely than staff to contact the OUCS helpdesk when there is a service outage.

Next week's student newspapers may provide an indication of actual impact...

E-mail dated 7 May from Head, Student Admin to OUCS

Thanks XXX - I'll check with our Student Info team before I go to Exams panel today and see what they received!

E-mail dated 7 May from BSP to Head, Student Admin

XXX [BSP] sent a holding response to XXXX [Student Rep] yesterday, which I will forward on to you when I have it. We will also be doing the analysis to develop a wider, fuller response.

E-mail dated 7 May from Director, BSP to Head, Student Admin

I'll pick this up first thing tomorrow. I gather that the service has been restored.

E-mail dated 7 May from BSP to Head, Student Admin

Below is XXX [BSP] direct update to XXXX [Student Rep] after the system was powered up again and available.

E-mail dated 7 May from Head, Student Admin to BSP

Yes – I was aware he’d had a reply; it’s more the detailed response to the points he makes that I’m concerned about. This issue has just been discussed in Exams panel and I’ve advised Sally that I will keep her updated and we will also ensure a communication goes out to students.

E-mail dated 7 May from Deputy Registrar to Head, Student Admin

Could you let Chief Information Officer know?

E-mail dated 7 May from Head, Student Admin to Deputy Registrar

Yes - will do. We did discuss at Exams panel today and I advised Sally that I would let her see the full explanation.

E-mail dated 7 May from Head, Student Admin to Chief Information Officer

I just wanted to make you aware of this issue (see correspondence) as it was raised at Exams Panel this afternoon and clearly the student feedback on this is not good. I advised Sally that I would let her see the full explanation once it's available.


E-mail dated 7 May from BSP to Head, Student Admin

As per my earlier email, the analysis allowing the fuller response will be done as soon as resources are available, enabling the points in XXXX email [from Student Rep] to be addressed appropriately. I expect to be able to update you tomorrow.

Internal Student Admin E-mail dated 7 May

XX [Student Admin] phoned BSP at 5 on Friday after we'd discovered OXAM down, and were told there was nothing to be done until Tues, that they couldn't put 'out of order' webpage up and hadn't realised we needed to be informed.

I'm following up with XXX (BSP) tomorrow.

Internal E-mail Academic Administration Division (AAD) dated 7 May

We had 7 messages from students between Friday and Sunday complaining that OXAM was down. Not sure how many Student Info received.

Could you please let me know how your conversation with XX [BSP] goes? I appreciate OXAM may not currently be set up for displaying messages (or was it that BSP isn’t set up to do this?), but I could have carried a message on the Student Gateway if informed, which may have helped.

Internal AAD E-mail dated 7 May

We talked about this in Exams Panel yesterday, and Emma [Head, Student Admin] is now liaising with Tom [Director, BSP] about it. Student Info received 3 queries, but OUCS received rather more.

No-one in Student Admin was warned about the OXAM outage, which appeared to be an unintended consequence of power work in Wellington Square. Had we known we would have asked for the servers to be moved given how critical this time of year is for OXAM usage, or at the very least would have emailed students to warn them so they could download papers in advance.

We agreed in Exams Panel yesterday (XXX from OUSU was there) that an apology and explanation will be drafted and sent to the student reps who had been in touch, and (with their permission) forwarded to all students. Emma - Do you think the next student news email might be a forum to do this or shall we send an additional message to all students?

Internal Student Admin E-mail dated 7 May

I’d think the next student newsletter would be appropriate if the timing works – I’m awaiting a response from Tom/XXXX today and will then email you both.


Internal Student Admin E-mail dated 7 May

I think the next Student News is due to go out on Monday, so items required by end of Thursday - according to my diary note.

Internal BSP E-mail dated 9 May

Can you have a look at my responses below in Blue? Feel free to amend or correct anything.

1) Is the electrical maintenance work referred to above planned maintenance work, or unplanned (emergency) maintenance work?

In the strictest sense of the word this was planned maintenance which was being undertaken by the local council on an electricity substation. Unfortunately BSP were not included on any communications until last Thursday (this is something we will be speaking with Facilities to ensure doesn’t happen again).

2) If it is planned maintenance work, has the impact on services such as OXAM been considered and has any thought been given to the possibility of contingency measures to keep OXAM up and running ("backup site")?

Consideration was given to keeping our services available, our technical teams met with an electrician on Thursday afternoon to discuss the possibility of bringing in a generator, however due to the short notice and lack of testing it was felt unwise to proceed with this option as it had the potential to damage the server hardware (due to power spikes) which would have potentially meant the system was down for much longer.

3) If it is planned maintenance work, why has there been no prior announcement to students, neither via email nor, as far as I can see, via any website?

Ordinarily BSP only communicate with students via third parties (such as via the AAD news letter or SIAS), however due to the short timescales involved this didn’t happen and it’s an area we will look to address. As an interim measure I’ve added Oxam to the BSP System Availability page which students, staff and the OUCS Helpdesk have access to.

4) If it was emergency maintenance work, I do of course understand that this is something outside of your control and responsibility. However, would it not have been possible to inform students via an email or via a website announcement e.g. on http://www.ox.ac.uk or even on http://www.oucs.ox.ac.uk (the OUCS website being the website probably most consulted by students in such cases, even if OUCS is not responsible)? XXXX of OUCS has also informed me that the OUCS helpdesk has had to respond to many queries over the weekend about this issue, so they too would have probably been glad about some form of prominent announcement.

As per my above comment this kind of emergency notification to students is something we need to address, it’s a rare occurrence but it’s one which we should (and will) have a procedure in place for.

Internal BSP e-mail dated 9 May

I think this is okay. If OXAM is on the BSP availability page (I can’t see it), could you add a link to it in the email?

Internal BSP e-mail dated 9 May

I can indeed


Internal BSP E-mail dated 9 May

I haven’t been able to speak to XXX in Estates but I have spoken again to XXXX [Facilities – Wellington Square] and XXX said that it was Southern Electric undertaking work on a power substation affecting a number of buildings in the Wellington Square/Little Clarendon Street area. So probably best to replace ‘Council’ with ‘Southern Electric’.

XXX is willing to add BSP Helpdesk to the notification list and I will email XX with details shortly. XX did say that their process is to notify administrators in affected departments – and in this case it included XXX, which he thinks should have been sufficient.

E-mail dated 9 May from BSP to Head, Student Admin

Contained in the email below is the fuller answer to XXXX specific points [from Student Rep].

You will shortly see a Major Incident report covering the whole power outage event which affected multiple systems, but essentially circumstances conspired such that BSP received inadequate notice of the Wellington Square power outage planned by Southern Electric, causing unavailability of several services (see below) which were brought down and restarted in a controlled manner. The highest technical risk was to Opendoor, due to the status of the hardware and its history, and the impact and communication around others including OXAM are subject to the MI review and subsequent actions which I will be happy to discuss.

Service                                           Server
Backup OpenDoor                                   aisusun25
NFS Server                                        aisusun27
University Card Test DB                           aisusun26
Oxford Examination Papers Online (oxam) web       aisusun29
Oxford Examination Papers Online (oxam) database  aisusun30
University Card System (test and live)            aisusun30
COS print / Prophecy / oxam dev / SCCS            aisusun7
OpenDoor, OpenRoad, HESA                          aisusun8
Midland Trent Citrix Access                       Cisco firewall x2

E-mail dated 9 May from BSP to Student Rep

Apologies for the delay in responding. I’ve endeavoured to answer your questions below (answers are in blue), however if you have any further questions or concerns please let me know.

1) Is the electrical maintenance work referred to above planned maintenance work, or unplanned (emergency) maintenance work?

In the strictest sense of the word this was planned maintenance which was being undertaken by Southern Electricity on an electricity substation. Unfortunately BSP were not included on any communications until last Thursday (this is something we will be speaking with Facilities to ensure doesn’t happen again).

2) If it is planned maintenance work, has the impact on services such as OXAM been considered and has any thought been given to the possibility of contingency measures to keep OXAM up and running ("backup site")?

Consideration was given to keeping our services available, our technical teams met with an electrician on Thursday afternoon to discuss the possibility of bringing in a generator, however due to the short notice and lack of testing it was felt unwise to proceed with this option as it had the potential to damage the server hardware (due to power spikes) which would have potentially meant the system was down for much longer.

3) If it is planned maintenance work, why has there been no prior announcement to students, neither via email nor, as far as I can see, via any website?

Ordinarily BSP only communicate with students via third parties (such as via the AAD news letter or SIAS), however due to the short timescales involved this didn’t happen and it’s an area we will look to address. As an interim measure I’ve added Oxam to the BSP System Availability page which students, staff and the OUCS Helpdesk have access to (http://www.admin.ox.ac.uk/bsp/sysavailability/).

4) If it was emergency maintenance work, I do of course understand that this is something outside of your control and responsibility. However, would it not have been possible to inform students via an email or via a website announcement e.g. on http://www.ox.ac.uk or even on http://www.oucs.ox.ac.uk (the OUCS website being the website probably most consulted by students in such cases, even if OUCS is not responsible)? XXXX of OUCS has also informed me that the OUCS helpdesk has had to respond to many queries over the weekend about this issue, so they too would have probably been glad about some form of prominent announcement.

As per my above comment this kind of emergency notification to students is something we need to address, it’s a rare occurrence but it’s one which we should (and will) have a procedure in place for.

E-mail dated 9 May from Head, Student Admin to BSP

Thanks for the information below – I’ll discuss with Sally [Pro-Vice-Chancellor (Education)], Mike [Deputy Registrar] and Keith [Director, Student Admin] tomorrow morning (copied here - as we have a scheduled meeting in the morning and we could perhaps take under ‘AOB’ as this follows on from the Exams Panel discussion on Monday). The concern I would raise is the fact that it seems, from the information below, as though this power outage was known about last Thursday, servers were then powered down in a controlled fashion, but there was no communication to my section that OXAM was affected. When we noticed that it was unavailable, we were advised that it would not be brought back until Tuesday morning (although subsequent escalation brought it back online on Sunday evening). Given the time of year, had I been advised of this in advance then I would have made every effort to ensure that the electrical work was rescheduled, as it’s just not acceptable for this service to be unavailable to students during the main examination period. I also didn’t realise that the OXAM servers were located there and I can’t see that any of the other systems identified below were as business critical to our operations during that time period.

I do think that a more detailed response will need to be sent to XXX [Student Rep] and there is an expectation that this information will be made available to students, to provide them with the appropriate reassurance that this will not happen again. There is a regular student newsletter sent out by the AAD Comms team, the next issue is due out on Monday (items to be received by close of play Thursday – i.e. tomorrow). I would have thought that this would be a useful forum to use for this. However, we also need to be prepared for an item on this to appear in one or both of the student newspapers (Oxford Student or Cherwell) – and so to have a full response prepared and communicated as soon as possible is essential, as it may also be wise for us to brief the Press Office.

With these comments in mind, would you be able to draft a reply to XXX [Student Rep] please that we can then publish in the student newsletter along with his original email, if Sally, Mike and Keith agree that this is appropriate?

E-mail dated 9 May from Deputy Registrar to Chief Information Officer

To see. Not too good. I will let you know if we gain any further understanding.

E-mail dated 9 May from Head, Student Admin to Chief Information Officer

We discussed this issue (see further correspondence) at Education Management Group this morning. I am sure you can imagine the tone of the discussion we had. Is there any possibility of you being able to oversee the response that goes to XXXX [Student Rep]? If it would help to discuss then do let me know.

E-mail dated 9 May from Chief Information Officer to Head, Student Admin

Already on it

E-mail dated 9 May from Student Rep to BSP

Many thanks for your email and answers! I understand that BSP has received very late notification of the planned Southern Electricity works from Facilities / OUED and that you will be addressing this with them to ensure that does not happen in future. It is also absolutely clear why it was not possible for you to organize contingency arrangements (generator) in the short time available.

Regarding the issue of notification - I believe this is the most important aspect to be followed up on now. Maintenance work, whether planned or unplanned, is of course unavoidable and I am sure all students will understand it when systems are unavailable at certain times. However, notifications both in advance and during emergency events are very helpful in mitigating the impact of any system outage, both by allowing students to plan their work (when the time and date of maintenance is known) and by giving them at least a rough idea when the systems will hopefully be up and running again (through estimated repair times in cases of emergency downtimes). From student communications, I understood that one main anxiety related to not knowing whether the OXAM system was only unavailable for a very short time (say 2 hours) or whether this might be a complex outage which needed several days or even a week for resolution.

Thus, I am glad to hear that you're looking into protocols for emergency notification now - including during weekends / bank holidays, when normal communication routes through AAD / SIAS might not be available. If I may, I would strongly suggest that such emergency communication, at least for students, be email-based. Students can safely be expected to frequently access their Nexus email, however they are much less likely to be aware of pages such as http://www.admin.ox.ac.uk/bsp/sysavailability/. Moreover, there is some potential for confusion because OUCS operates its own website at http://status.ox.ac.uk which students are more likely to know (due to wider publicity) and on which, of course, OXAM is not listed because it is not an OUCS service. I absolutely appreciate that the fact that IT services are provided by different entities within the University leads to such apparent duplications (e.g. 2 different system availability pages), however this can be quite confusing to students, which is why I think an email-based option would be much more efficient in communication.

E-mail dated 10 May from Chief Information Officer to BSP

Can you confirm for me that the OXAM downtime was due to the power outage that was planned for Sunday 8-2? Or was there a longer outage?

Is there a communication being prepared as Emma has suggested?

E-mail dated 9 May from BSP to ICT support Team

Following the power outage this weekend I have agreed with XXX [Facilities] to include the BSP Helpdesk in any notifications from Facilities relating to University Offices. I would be grateful if – for a belt-and-braces approach – you could also pass on any notifications you receive in future that might affect the servers at Wellington Square.

E-mail dated 10 May from BSP to Facilities

Following our conversation yesterday I would be very grateful if you could include the BSP Helpdesk to any notifications about the University Offices at Wellington Square that will have an impact on the provision of IT services from the computer rooms on level 4. I expect this will be mostly power disruptions but may also include building works, access to the room or network changes.

The Outlook name is: UAS BSP Helpdesk

The email address is: [email protected]

E-mail dated 10 May from Facilities to BSP

Will do

E-mail dated 10 May from BSP to Chief Information Officer

The power outage was planned for Sunday 8-2pm and could not be rescheduled at short notice. Due to the short notice, XXXX was unable to secure resource to bring the servers down on Sunday morning, so the shutdown commenced Friday late afternoon, following communication with the Card Office, Finance users, Opendoor users and Payroll, but not, it has to be said, to students about OXAM; we have never had a communications channel for OXAM and clearly that was a gap, now filled (BSP system status page, BSP Applications calendar, BSP support wiki, documentation).

XXX [Student Rep] has responded (as attached) in a sensible way given the disruption, and we will incorporate XX suggestions about comms into the Major Incident report actions. I will liaise with Emma and AAD comms about what else might be necessary for student newsletter etc.

You will be copied on the Major Incident report which should be available tomorrow.

E-mail dated 10 May from BSP to Head, Student Admin

XXX [Student Rep] has replied as attached, and we will incorporate his useful suggestions about student communications into the Major Incident report (on the general impact of the power outage) which should be published tomorrow, and copied to him. I will work with AAD Comms on a version of the text suitable for the student newsletter. Let me know if you think there is any other immediate comms action.

E-mail dated 10 May from Head of Student Admin to BSP

I would be grateful if you can ensure that any further communication on this issue is passed by me before publication.

E-mail dated 10 May from BSP to Head of Student Admin

Okay, we’ll do that

Internal AAD E-mail dated 11 May

I’m going to draft something today for your initial (and then any other usual suspects’) review. We can make the Student News on Monday if this is deemed the best channel.

If the text is going to be circulated via the e-news, then it won’t be verbose or in letter format. Once my initial draft has been reviewed, we may need to consider it circulated as an email to students if a more letter style or length fits better.

Internal AAD E-mail dated 11 May

I’d suggest something very short in the student news and then perhaps we can provide something fuller online and give OUSU a link in case they want to publicise more.

Internal AAD E-mail dated 11 May

Are you happy with the following as a student news item on Monday, before I circulate more widely:

Apology for recent OXAM unavailability

Students preparing for examinations over last weekend’s bank holiday weekend were regrettably inconvenienced due to the unavailability of the OXAM service.

The service outage resulted from scheduled maintenance work carried out by Southern Electricity. Due to an oversight in communication between services within the University, the contingency measures that we have in place to ensure that our systems remain available at critical periods failed on this occasion.

In event of scheduled maintenance it is standard procedure to communicate this to students so that they can make necessary preparations. In event of unplanned emergency maintenance we seek to display prominent website and system notices to advise students that we are aware of, and taking action on, the issue to reassure that the service will be restored as soon as possible.

We extend our sincere apologies for this disruption. Please be reassured that we are looking at the chain of events that led to the service unavailability and communication oversight to ensure that this does not happen again.

Internal AAD E-mail dated 11 May

Thanks for this – I’ve just added a bit about avoiding downtime for OXAM in your text below. In terms of review, I’d suggest XXX [Student Admin] (copied) plus XXX [OUSU] and Tom. If you could also copy to Anne [Chief Information Officer], Mike, Keith and Sally Mapstone.

We extend our sincere apologies for this disruption. Please be reassured that we recognise the significance of the OXAM service and would always seek to avoid downtime, particularly at critical periods of the year.

Internal BSP E-mail dated 12 May

Got your message re OXAM.

It now looks like there was a collective failure to pick up all aspects of the business support requirements when XXX left. I guess XXX must have been doing more than was handed over to sw dev or anyone else.

Internal BSP E-mail dated 12 May

I think you are right but it does boil down to the fact that we notified other users about services, why not AAD about OXAM?

I don't think we can spin it, we'll just have to say it was an oversight and apologise and make reassurances about why it won't happen again.

Internal BSP E-mail dated 12 May

One point to mention: although it does not affect the fact that we did not pick up on the impact of OXAM downtime or excuse BSP in any way, XXX in the Exam Schools knew about it on Friday afternoon (because it was down) and presumably he also went home without considering comms or contacting XXX.

Internal AAD E-mail dated 13 May

Further to last weekend's OXAM downtime, please find below an apology item to be published in tomorrow's (Monday 14th May) Student e-Newsletter and on the Student Gateway website. Please let me know if you would like to make any revisions to the text by 10am tomorrow morning.

Apology for recent OXAM unavailability

Students preparing for examinations over last weekend’s bank holiday weekend were regrettably inconvenienced due to the unavailability of the OXAM service. The service outage resulted from scheduled maintenance work carried out by Southern Electricity. Due to an oversight in communication between services within the University, the contingency measures that we have in place to ensure that our systems remain available at critical periods failed on this occasion. In event of scheduled maintenance it is standard procedure to communicate this to students so that they can make necessary preparations. In event of unplanned emergency maintenance we seek to display prominent website and system notices to advise students that we are aware of, and taking action on, the issue to reassure that the service will be restored as soon as possible. We extend our sincere apologies for last weekend's disruption. Please be reassured that we recognise the significance of the OXAM service and would always seek to avoid downtime, particularly at critical periods of the year. We are looking at the chain of events that led to the service unavailability and communication oversight to ensure that this does not happen again.

E-mail dated 14 May from Director, BSP to AAD (Communications) and others

No revisions from me but I was prepared for the apology to come from BSP – it’s your call however.

Internal BSP E-mail dated 14 May

I've reviewed and tweaked and attach v 0.4. I've copied to Emma as I believe she wanted to include with the SSMG papers going out at 8am this morning.

[Attachment: Major Incident Report]

Internal Student Admin E-mail dated 14 May

Haven’t read the report but can respond on the XXX point [Student Admin] - yes it’s our mess-up too in not escalating to you and XXX directly as soon as we spotted OXAM wasn’t there. XXX [Student Admin] spoke to BSP late on Friday pm, was told it was down due to power outage and kicked up about the bad timing and need for a message on the website rather than just ‘website not found’. He forwarded me an email later on from XXX [BSP (Customer services)] which I should have sent directly on to you, but I foolishly thought BSP were updating the web-pages, and was tied up in a physio appointment, then didn’t check over the weekend. Apologies for my (big) part in this mess.

E-mail dated 14 May from Head, Student Administration to Student Admin (Examinations)

That’s fine – how about if I say: ‘XXX reported the issue with OXAM not being available late on the Friday afternoon. XX raised this directly with BSP Support and was told it was down due to the power outage and that this couldn’t, at such a late stage, be rearranged. At that point XX made the support team aware of the criticality of the service and the need for a message to be put onto the website, so that students would not just see a ‘website not found’ message. The impression XXX was given was that this updating of the webpages was being done, therefore XX didn’t escalate internally, as XX believed it was too late for any further action to be taken and

XX assumed that the appropriate escalation within BSP was in progress, given that XX had made clear the criticality of the service’.

Please red pen….

E-mail dated 14 May from Student Admin (Examinations) to Head, Student Administration

That’s fine – how about if I say: ‘XXX reported the issue with OXAM not being available late on the Friday afternoon. XX raised this directly with BSP Support and was told it was down due to the power outage and that this couldn’t, at such a late stage, be rearranged. At that point XX made the support team aware of the criticality of the service and the need for a message to be put onto the website, so that students would not just see a ‘website not found’ message. XXX forwarded correspondence to XXX, who erroneously believed the website was being updated.

The impression XXX was given was that this updating of the webpages was being done, therefore XXX didn’t escalate internally, as XXX believed it was too late for any further action to be taken and XX assumed that the appropriate escalation within BSP was in progress, given that XXX had made clear the criticality of the service’.

E-mail dated 14 May from Head, Student Administration to Director, BSP

I’ve reviewed this document. The main points I would make (some of which we discussed on the phone on Friday evening) are as follows:-

a) In terms of the summary of actions, I think that this can be relatively brief – the key points for SSPB are:-

late notice received by BSP from Facilities (received on Thursday 3rd May at 9.30am)

communication to the Examinations team/Student Administration was overlooked

server was powered down on Friday afternoon

Examinations team reported the non-availability of OXAM on Friday late afternoon

Student enquiries made to OUCS, Student Information and the Exams team direct during Friday/Saturday

Escalation resulted in service being brought back online at 9.30pm on Sunday evening

b) On the reference below to XXXX [Student Admin] being aware of this issue – the situation is actually that XXX reported the issue with OXAM not being available late on the Friday afternoon. XX raised this directly with BSP Support and was told it was down due to the power outage and that this couldn’t, at such a late stage, be rearranged. At that point he made the support team aware of the criticality of the service and the need for a message to be put onto the website, so that students would not just see a ‘website not found’ message. He forwarded correspondence to XXXX [Student Admin], who understood from the discussions that BSP were updating the website, to make the non-availability of the service clear. As XX was given the impression that this updating of the webpages was being done, XXX did not escalate further within Student Administration, as XX believed it was too late for any further action to be taken and assumed that the appropriate escalation within BSP was in progress, given that XX had made clear the criticality of the service.

c) Even with the oversight in communication to the Exams team between Thursday 9.30am and Friday at 4pm, I don’t understand why the information and feedback from XXX wasn’t sufficient to make it clear that downtime for OXAM at such a critical stage in the academic year needed to be avoided, and for action to be taken to make the service available for Friday night/all day Saturday at least, with a follow-up communication. It seems as though it was only once XXX [OUCS] intervened (as XX had been made aware of the OUCS support tickets) that action was taken to bring it back online on Sunday evening.

d) I don’t understand why there wasn’t recognition that this was a major issue, of widespread impact, and that it was essential that further advice was sought from those with appropriate expertise in this area, to ensure an appropriate and timely response.

E-mail dated 15 May from Director, BSP to Head, Student Administration

I’ve responded to your points below:

Regards

Tom

*******************

Dear Tom,

I’ve reviewed this document. The main points I would make (some of which we discussed on the phone on Friday evening) are as follows:-

a) In terms of the summary of actions, I think that this can be relatively brief – the key points for SSPB are:-

late notice received by BSP from Facilities (received on Thursday 3rd May at 9.30am)

communication to the Examinations team/Student Administration was overlooked

server was powered down on Friday afternoon

Examinations team reported the non-availability of OXAM on Friday late afternoon

Student enquiries made to OUCS, Student Information and the Exams team direct during Friday/Saturday

Escalation resulted in service being brought back online at 9.30pm on Sunday evening

Understood. We will draft something for review by COB Thursday

b) On the reference below to XXXX [Student Admin (Examinations)] being aware of this issue – the situation is actually that XXX reported the issue with OXAM not being available late on the Friday afternoon. XX raised this directly with BSP Support and was told it was down due to the power outage and that this couldn’t, at such a late stage, be rearranged. At that point he made the support team aware of the criticality of the service and the need for a message to be put onto the website, so that students would not just see a ‘website not found’ message. He forwarded correspondence to XXXX [Student Admin (Examinations)], who understood from the discussions that BSP were updating the website, to make the non-availability of the service

clear. As XX was given the impression that this updating of the webpages was being done, XXX did not escalate further within Student Administration, as XX believed it was too late for any further action to be taken and assumed that the appropriate escalation within BSP was in progress, given that XX had made clear the criticality of the service.

There appears to be some confusion about the website and messages and it is possible (although I don’t know for a fact) that the helpdesk officer (a temp unfortunately) gave a misleading impression about what was possible. OXAM is set up in such a way that there is no capability for re-directing users to a different landing page if the service is unavailable. This is possible with other applications but not OXAM currently so there would have been no way of informing students other than via the student comms channel. This is another aspect of the previous lack of visibility within BSP of OXAM as a critical service at certain times of the year. OXAM had not been transitioned properly from a PC Apps service managed exclusively by XXXX to a critical student service. If it had been the comms would have been dealt with properly and the notification from Facilities on Thursday morning would have triggered a communication with you and an escalation with Estates. Suffice it to say that the profile has now been raised. With OXAM transitioning to Weblearn this summer, would you like to undertake a review of the current BSP service or focus attention on the new one?

c) Even with the oversight in communication to the Exams team between Thursday 9.30am and Friday at 4pm, I don’t understand why the information and feedback from XXX wasn’t sufficient to make it clear that downtime for OXAM at such a critical stage in the academic year needed to be avoided, and for action to be taken to make the service available for Friday night/all day Saturday at least, with a follow-up communication. It seems as though it was only once XXX [OUCS] intervened (as XX had been made aware of the OUCS support tickets) that action was taken to bring it back online on Sunday evening.

I’ve checked with those involved and, by way of explanation, I don’t think that the alarm bells about the impact on students were ringing loudly enough to warrant, in the view of first line support, escalation within BSP. XXX [Student Admin] was annoyed about not being informed prior to the downtime but the helpdesk person (the temp) was not aware of the serious nature of the situation. XXXX [BSP] followed up with an apologetic email to XXX late Friday but got no reply. If you or XXX had been made aware my phone would have been ringing and I would have talked to XXX [BSP] about bringing the server back up until Sunday morning. That said, the short notice from Facilities had made it difficult for XXX to find someone in his team to alter his or her weekend plans to attend site; XXX did ask the question – if XX could have got someone to do it he would have done. I think that you would accept that usually we manage to cover weekends when required and when we have a little notice. XXX emails on Saturday afternoon alerted XXXX to the seriousness of the situation and XXX took steps to get the service back as soon after the power outage as was possible.

d) I don’t understand why there wasn’t a recognition that this was a major issue, of widespread impact, and that it was essential that further advice was sought from those with appropriate expertise in this area, to ensure an appropriate and timely response.

XXX took his lead from your email to me (attached) asking to be copied into his reply to XXXX [Student Rep]

E-mail dated 16 May from Director, BSP to Head, Student Administration

For review. I think I’ve captured your points.

[Attachment: BSP Report SSPB]

Internal BSP E-mail dated 16 May

The OXAM issue rumbles on and I am having to write another paper for the SSPB. In it I’ve identified two principal failings:

1. Inadequate notice from Facilities. If BSP had been informed in time the issue could have been escalated and the sub-station work possibly deferred. Failing that there would have been time for BSP staff to make arrangements to minimise downtime.

Facilities are now aware of BSP’s vested interest in the Wellington Square server room and the BSP helpdesk will be informed as early as possible in future.

2. BSP did not inform the Examinations team or Student Administration about the problem when it became aware of Southern Electricity’s plan on Thursday morning. If it had, the impact on students would have become apparent and, if the work could not have been deferred, students could have been alerted to download papers before the service was taken down.

OXAM support had not been transitioned correctly from one team to another in BSP and there was a consequent lack of visibility of service ownership, communication protocol and business criticality at certain times of the year.

With regard to 2) – what steps have been taken to address the lack of visibility of service ownership, communication protocol and business criticality at certain times of the year?

Internal BSP E-mail dated 16 May

With regards to communication, I've agreed with XXX [AAD] that issues affecting OXAM will be publicised via the student gateway, which XXX owns and updates. This is for both planned and unplanned outages; apparently XXX and XXXX pick up emails 24/7, so an email to both will be sufficient to get the gateway updated asap. I'll add this to the BSP wiki and major incident processes tomorrow. We'll also be adding OXAM to our application calendar so this can't get overlooked again.

E-mail dated 17 May from Head, Student Administration to Director, BSP

We discussed this briefly yesterday and I note you’ve sent a summary through for SSPB next week for which many thanks; I’ll look at shortly.

As I’ve explained before, the most significant issue around this event was the fact that it was known about on Thursday morning and we weren’t informed.

E-mail dated 17 May from Director, BSP to Head, Student Administration

I understand your points and your concerns and I think that lessons have been learned here. Comms involving student systems have to be handled in a very sensitive way and I will ask XXX [BSP] to ensure that all XX staff are aware and, if in any doubt, they should consult either XXX or me about the way comms relating to a particular incident should be handled and whether you and XXX [AAD] need to be consulted.

Would you be content with that?

E-mail dated 17 May from Head, Student Administration to Director, BSP

This is fine

E-mail dated 30 May from Head, Student Administration to College Rep, Student Systems Programme Board

Apologies for the delayed feedback – time seems to be running away from me….

Main points from SSPB:-

c) Considerable concern expressed at the OXAM downtime and the effect this has had on students revising for exams.

ANNEX B

Major Incident Report

EXECUTIVE SUMMARY

BSP was informally advised of a power outage over a bank holiday weekend with only 2 days' notice, and this caused disruption to users of a number of systems including OXAM, OPENdoor and the University card system. Particularly impacted were students who were unable to access OXAM and hence past examination papers between 4pm on Friday the 4th May and 9.30pm on Sunday the 6th May.

Communications failures between Facilities and BSP resulted in systems being powered down at very short notice and restricting the opportunity to minimise the impact or reschedule the work.

BSP have agreed a formal process with Facilities for communicating any works which may affect BSP hosted services in Wellington Square; these will be sent to the BSP Helpdesk rather than an individual.

Communication failures between BSP and the student community meant that students were unable to take mitigating steps such as downloading examination papers prior to the downtime.

BSP are working with the AAD communications team to agree and document a formal process for communicating urgent system downtime messages to students; details of this will also be added to the BSP Customer Services Wiki, ensuring quick access to it in the event of future problems.

Major Incident which occurred on 04/05/12, affecting a number of services based at Wellington Square

Version: v0.4 Date: 10th May 2012

Author: Ian Teasdale

Owner: Jeffrey Thomas

Intended Audience: BSP and Key Stakeholders

OVERVIEW

BSP was informally advised of a power outage over a bank holiday weekend with only 2 days' notice, and this caused disruption to users of a number of systems including OXAM, OPENdoor and the University card system. Particularly impacted were students who were unable to access OXAM and hence past examination papers between 4pm on Friday the 4th May and 9pm on Sunday the 6th May.

TIMELINE AND TECHNICAL ACTIONS

At approximately 9.30am on Thursday 3rd May an email from ICTST to BSP Technical Services highlighted planned maintenance taking place in Wellington Square which would cause the systems there to lose power between 8am and 2pm on Sunday 6th May. It became clear at this point that the power outage had been announced by Facilities to the [email protected] mailing list and that no one from either BSP Customer Services or Technical Services was on the list.

BSP Technical Services immediately contacted the Facilities team to see if the work could be postponed, however this was not an option as Southern Electric were the initiators; they were carrying out maintenance on an electricity substation. Discussions then took place to determine if a generator could be sourced to provide a temporary source of power.

An electrician attended site late on Thursday afternoon and discussed the available options with both ICTST and BSP Technical Services. Whilst it may have been technically possible for a generator to be connected, this had never been tested, and it was felt that there was a high risk of power spikes, which have the potential to damage the server hardware, making it impossible to quickly restore services. Therefore it was decided that, in order to reduce risk, the servers should be brought down in a controlled manner, there being no other option available given the limited window of opportunity when suitably qualified staff would be available.

The BSP systems affected by the power outage were hosted on SUN servers, which are relatively old and therefore considered more prone to hardware issues. A decision was therefore made that the servers should be powered down towards the end of the business day on Friday and brought up again on Tuesday morning (Monday being a Bank Holiday), when specialist technical resources would be available to manage any issues. A timeline for this was circulated on Friday (a draft timeline had been circulated on Thursday afternoon prior to confirmation that the services would need to be brought down) and communications with end users were initiated. Details of affected systems and communications are below:

Service | Server | Communications - Comments
Backup OpenDoor | aisusun25 | No user impact
NFS Server | aisusun27 | Affects interfaces, but no end user impact
University Card Test DB | aisusun26 | No user impact
Oxford Examination Papers Online (oxam) web | aisusun29 | No formal communications channel defined for service; comms overlooked
Oxford Examination Papers Online (oxam) database | aisusun30 | No formal communications channel defined for service; comms overlooked
University Card System (test and live) | aisusun30 | Advised Card Office of Downtime
COS print / Prophecy / oxam dev / SCCS | aisusun7 | Printing unavailable, text added to Financials website
OpenDoor, OpenRoad, HESA | aisusun8 | Email communications sent out to all OpenDoor users
Midland Trent Citrix Access | Cisco firewall x2 | Payroll team were advised of downtime.

Upon further investigation a number of the systems were eventually powered down on Saturday morning to allow the overnight backups to take place to prevent any loss of data.

All the databases have backups which could be utilised in the event of longer term hardware issues following the power outage. Contingency plans were considered for systems at most risk of subsequent hardware failure, namely the Opendoor servers. These had not been fully

powered down cold for a period of years and there was a real risk of extended downtime following the power outage. It was established that absence process would have to be changed, and that extra reporting using Business Objects against the hosted Midland payroll system would go some way towards replacing the Opendoor reports if necessary, until payroll switches to Core_HR in September 2012.

Charlie Morgan of the Finance Payroll section was advised and consulted about the potential impact on the Opendoor system.

Maureeen McNaboe was contacted at the Card Office to consult and advise about downtime for the Card system. The card system is utilised alongside CMIS on exam weekends, but this was not an exam weekend.

Matthew Kirk contacted the OSS Support Centre and was advised that OXAM was unavailable due to electrical maintenance in Wellington Square. At this point the impact on students was clarified, and the downtime which at this point was unavoidable was later modified to make OXAM available again as soon as power was restored on Sunday at 9pm. No student communication was initiated however.

ISSUES IDENTIFIED

Two key issues were identified as a result of this major incident: how and when Facilities communicate maintenance works to BSP, and how BSP communicate urgent issues to all students.

Neither BSP Technical Services nor Customer Services were included in any formal communications regarding the electrical maintenance work, this meant planning and communication had to be compressed into a very short period (less than 2 days).

Until the time Jill James of the BSP PCApps group left the University in April 2011, the OXAM system and data was maintained by the PCApps group. The ultimate transition plan on Jill’s departure was to make the OXAM data available to students via WebLearn, with the maintenance moving to what is currently OUCS. In the interim, the technical part of data loading was moved to the BSP software development team. There appears to have been inadequate consideration given to the continuity of non-technical aspects of OXAM operations, such as the usage calendar and communications to sponsor and students. As a result in the condensed period available in this instance for planning and mitigating actions for the power outage, the communications were regrettably inadequate.

ACTIONS AND IMPROVEMENTS

Chris Cattermole has agreed with Facilities that they will communicate to BSP (via the BSP Helpdesk) any works which have the potential to disrupt services from the server room in Wellington Square; in addition Ian Teasdale (BSP Customer Services) has been added to the [email protected] mailing list, and an informal agreement is now in place

whereby ICTST will advise us if they become aware of any downtime, however ultimate responsibility will remain with Facilities.

Ian Teasdale is speaking with Tara Jewell to agree and document a formal process for communicating urgent system downtime messages to students; details of this will also be added to the Customer Services Wiki, ensuring quick access to it in the event of future problems. OXAM has been added to the BSP system availability page (http://www.admin.ox.ac.uk/bsp/sysavailability/); however, it is noted that students are unlikely to know to check this page if OXAM were unavailable. We will therefore contact OUCS to discuss the possibility of them including a link to the BSP status page on the OUCS status page and vice versa.

Action: Agree and document a communication plan for emergency activities affecting OXAM (IT) 16/05/12

Action: Discuss with OUCS the possibility of including a link to the BSP status page on the OUCS status page and vice versa. (IT) 18/05/12

Activities which are unrelated to the above but will improve the situation significantly are; OXAM will shortly be migrated into the WebLearn system which is far more resilient than the current OXAM system; over the next year many functions currently being provided via OPENdoor will be provided through the new CoreHR system meaning the impact of system downtime will be greatly reduced.

LESSONS LEARNED

Inadequate notice of the planned works prevented proper planning or adequate notification to users; we have agreed a process with our colleagues in Facilities to prevent this happening. In addition two informal routes have been put in place should the formal route fail for any reason.

When operational responsibility for any live IT services changes or moves, use of the Service Transition process will in the future be actively considered in every case, rather than only for new or replacement services.


ANNEX C

Business Services & Projects

OXAM service unavailability 17:30 4/5/12 to 21:30 6/5/12

OXAM is a legacy bespoke Oracle-based system that runs on servers based at the University Offices in Wellington Square, making past examination papers available to students via a web front-end. The examination papers are being transferred to WebLearn this summer.

At 09:30 on Thursday 3rd May, BSP was informed that Southern Electricity had scheduled work to take place on a sub-station which would result in a power outage at Wellington Square from 08:00 to 14:00 on Sunday 6th May. This work had been known about for some time by Facilities but a communications failure resulted in BSP receiving short notice. The possibility of using a generator was considered but dismissed because of technical difficulty and risk.

A number of services run on servers based at Wellington Square, and a list of these was passed from BSP Technical Services to BSP Customer Services for communication to system owners. BSP’s Technical Services Director then considered whether any of his staff might be available over the weekend to power down the servers late on Saturday or early on Sunday 6th May, but the notice period was too short for staff to make alternative plans, and no one was aware that any of those services were critical over the weekend period. Accordingly, plans were made to bring the servers down at 17:30 on Friday 4th May and restore them on the Bank Holiday Monday 7th May, and communications to service owners were made to that effect. Unfortunately, OXAM was overlooked for reasons that will be described later, and neither the Examinations team nor Student Administration were informed.

The BSP Helpdesk received a call from the Examinations team late on Friday alerting BSP to the fact that OXAM was unavailable. The helpdesk officer explained the reason for it and the BSP Line Support Manager followed up with an email to confirm. Student enquiries were made to OUCS, Student Information and the Examinations team direct during Friday and Saturday and OUCS alerted BSP’s First Line Support Manager on Saturday afternoon to the impact that the service unavailability was having on students. The issue was then escalated and a member of the Technical Services team offered to come in on Sunday evening to power the servers up. The OXAM service was restored at 21:30 on Sunday 6th May.

An email was received from Michael Bimmler, the Undergraduate Representative on the Humanities Divisional Board [STUDENT REP], by Ian Teasdale, BSP First Line Support Manager, at 18:52 on the Sunday, expressing his concern at the impact on students and asking a number of questions.

Ian Teasdale replied to Mr Bimmler at 21:23, announcing that the service was available again and saying that he would respond to his email when he was back in the office on the Tuesday following the Bank Holiday on 7th May. He actually responded on Wednesday 9th May at 16:51.

Their email exchange is included in Annexe A.

An apology was published on the Student Gateway and in the Student e-newsletter the following Monday 14th May:


Apology for recent OXAM unavailability

Students preparing for examinations over last weekend’s bank holiday were regrettably inconvenienced due to the unavailability of the OXAM service. The service outage resulted from scheduled maintenance work carried out by Southern Electricity. Due to an oversight in communication between services within the University, the contingency measures that we have in place to ensure that our systems remain available at critical periods failed on this occasion. In the event of scheduled maintenance it is standard procedure to communicate this to students so that they can make necessary preparations. In the event of unplanned emergency maintenance we seek to display prominent website and system notices to advise students that we are aware of, and taking action on, the issue and to reassure them that the service will be restored as soon as possible. We extend our sincere apologies for last weekend's disruption. Please be reassured that we recognise the significance of the OXAM service and would always seek to avoid downtime, particularly at critical periods of the year. We are looking at the chain of events that led to the service unavailability and communication oversight to ensure that this does not happen again.

What went wrong

There were two principal failures:

1. Inadequate notice from Facilities. If BSP had been informed in time the issue could have been escalated and the sub-station work possibly deferred. Failing that there would have been time for BSP staff to make arrangements to minimise downtime.

Facilities are now aware of BSP’s vested interest in the Wellington Square server room and the BSP Helpdesk will be informed as early as possible in future.

2. BSP did not inform the Examinations team or Student Administration about the problem when it became aware of Southern Electricity’s plan on Thursday morning. If it had, the impact on students would have become apparent and, if the work could not have been deferred, students could have been alerted to download papers before the service was taken down.

OXAM support had not been transitioned correctly from one team to another in BSP and there was a consequent lack of visibility of service ownership, communication protocol and business criticality at certain times of the year. This has now been addressed and AAD Communications will be informed of anything that affects the OXAM service in the future. OXAM has been added to the BSP applications calendar to reflect the business critical period.


ANNEXE A

[INCLUDED AT ANNEX A above]


ANNEX D Minutes of meetings

17 May 2012: Student Systems Management Group – Extract from Minutes

5.2 OXAM: unavailability of OXAM 04-06 May 2012

It was noted that OXAM had been unavailable over the May Bank Holiday weekend due to electrical work being carried out at Wellington Square resulting in OXAM being unavailable from 4pm on the Friday to 9pm on Sunday. The problem was exacerbated as an advance notification was not sent out and, given the time of year, the preferred action would have been to postpone this downtime. A paper will be presented on this issue to the SSPB. Apologies to students have gone out and on-going investigations and discussions are in progress to ensure this does not happen again.

24 May 2012: Student Systems Programme Board – Extract from Minutes

7.1 OXAM Service Unavailability 04-06 May

A paper outlining the unavailability of OXAM over the May bank holiday weekend (04-06 May) was presented to the Board. It was noted that the importance of the system at a key stage in the academic year was overlooked. The Board noted the reputational damage that this incident had caused and expressed concern that this type of issue during a critical period should not be allowed to happen again.