Tuesday, March 10, 2009

Communicating System Outages

In my last post I promised I'd tell you about the new mechanisms ASU is putting into place to improve the University's ability to notify the community of disruptions in service.

If our mail is any indicator, these kinds of notifications are important to you. When systems are going to be changed or we have outages planned, you tell us you like to know well in advance. When we experience unplanned outages, you tell us that you want to know what's wrong and when it'll be fixed.

We've done some things already -- our System Health page is well visited and has moved us forward in keeping you informed about planned outages. The emergency notices we put on My ASU have helped a lot of you know when systems are out and when you can expect them back.

But they don't always work. Sometimes when we have extreme outages, say when the Internet is unavailable or power is lost to the data center, we can't get to System Health and My ASU to update them. And even if we could, sometimes you can't get to them to find out what's going on.

So this month, right after St. Patrick's Day, we're releasing a set of new improvements that we hope will make you better informed in the event of an emergency.

First, System Health is being moved off-site. We're moving it to our Denver facility to improve its availability in those times when major portions of our infrastructure are unavailable.

Second, we're expanding our outage notifications. In addition to announcing outage information on System Health, all unplanned outages will also be announced through a notification group associated with our new ASU Alert service provided by e2Campus. This new service will allow members of the ASU community to receive a text message and/or email message whenever System Health turns red. We are pre-subscribing members of the Outages@asu.edu mailing list to this service, but if you are not already subscribed, you can self-subscribe to ASU Alert. Click here for complete instructions on signing up.

Third, all planned outages and system changes will be announced through UTO's new Change Management System. For authorized users, the Change Management System provides a complete history of proposed and implemented changes. Our system was designed by the Communications Subcommittee of the UTC. Again, if you are already a member of the Outages@asu.edu list, you will be pre-subscribed to the system. If you are not a member and wish to subscribe, please send a note to sub-outages@asu.edu.

We're continuing to work on reliability, and we hope not to have to use these notification systems as often as we have of late. But we know that when we do have a system disruption, you want as much information as possible about what's wrong and when we'll be back online. We hope these changes will help with that.

As always, your comments and suggestions are welcome, particularly the constructive ones.


Sansa March 14, 2009 at 12:40 AM  

Ahh, the dreaded "Change Management System" :-) I work in IT and our CMS system is so painful to use. I swear, I spend more time trying to Open/Close Change Management tickets than actually performing the job. I sure hope your system works better than ours!

A really cool tool that allows us to manage hundreds of websites is called SiteScope. You have probably already heard about it. If not, definitely check it out! The text message alert option you guys are using is very cool too.

Bruce March 24, 2009 at 3:59 AM

ASU has a Denver facility?

Todd April 22, 2009 at 3:37 AM

Please bring back the system outage calendar. It was nice to view the outages on an actual calendar to check for weekend availability and to plan things. Now we need to subscribe to the blog and search? I think the value of the visual calendar should be re-examined because of its ease of use and simplicity compared to the blog format.

Where it used to exist: http://systemstatus.asu.edu/status/calendar.asp

Adrian Sannier April 28, 2009 at 6:11 AM  

Yes, this is a great idea. Thanks for the suggestion; we are working on it and hope to have it available by the end of summer.