Tuesday, September 06, 2005

Last Week in Review...

Last week saw a dramatic increase in comment activity on the blog, which is great. So in case you're not following the comments (and even if you are), I thought I'd review some of the issues that folks have been raising -- adding my own comments of course. :)

There was a thread of conversation about the ASU Data Warehouse, and its role as a "reporting database". Some wonder why we need "reporting databases" other than the Data Warehouse; others assert that not all the data needed to generate reports is in the DW, or that the DW is too hard to use.

I have to agree that the proliferation of "shadow" databases is a response to not being able to do what the customers need quickly and easily enough doing it the "right" way. I figure if it were easier and cheaper to do it the "right" way, then people would do it that way, as long as they know what the "right" way is. So, given that people set these "shadow" databases up, I figure either:

  1. it's too hard to do it the "right" way

  2. it's too expensive to do it the "right" way, or...

  3. people don't know / don't believe there is a "right" way

Of course, while shadow databases may solve a local problem, they create a global one; local improvement at the price of global confusion. Effective coordination is the key to this challenge, but it's a tough organizational nut to crack.

So my question, and I think it's an important one, is: what do we have to do with the Data Warehouse to drastically reduce the number of "shadow" databases that are out there? So let's hear from the "Shadow Masters". Could you have used the data warehouse to accomplish the same things you do in shadow? Is the DW too hard to use? Is it too slow? Is the data not there? Is there a lack of conformed queries? Does everyone just hate John Rome? :)

Cat put forward a plan based on the idea that we need to hit the low-hanging fruit. That's right in tight with the CPI vision. My view of CPI is:

  • identify fixes hanging the business units up and do them nimbly to free them to help the people

  • identify simplifications that reduce redundancy and position us better for ERP

  • bring to online/self-service those things that are offline/expert service

So Giddy Up Cat.

Another comment asked how we were planning to separate policy decisions from technology decisions, or, to say it another way, how we will allow policy to drive technology. The commenter pointed out that all the systems are integrated (the strong ERP argument) and so care must be taken to ensure that the systems implement the policy decisions, and that those policy decisions take place globally.

Max echoed this:

“We need to apply continuous improvement to our business processes not just our technology. Technology must not only be continually improving but technology must also be flexible and nimble enough to support the continual improvement of business processes”.

Good advice, but a word of caution. Analyze the successful ERP implementations and you'll find that Vanilla is the most popular flavor, not Rocky Road. Customizations are expensive and risky. But this implies that software to a certain degree drives business process, not the other way around as our commenter suggests. Wonder how our community feels about that tradeoff?

Several comments focused on the need to improve coordination between activities in centralized IT and those taking place out in the departments. Since 2/3 of the personnel is outside of IT, this is a critical issue. What most folks point to is that they don't know the standards they should conform to, or that there are stumbling blocks that central IT can't clear on the customer's time frame. There are committees and user groups, but no way to enforce cooperation. The result is duplication of effort. True?

...we have a lot of disparate 3rd party products being acquired and used in the enterprise. At the scope of a shrink-wrapped desktop piece of software, not a problem. However, when we start talking products like Vignette, EMC/Documentum, Verity’s Teleform or LiquidOffice, I hope we can find better ways of collaborating and sharing to achieve an economy of scale and standardization for the university.

Sounds like a governance issue to me.

The major debate, though, was about the idea of CPI itself. And the major argument from several knowledgeable players was:

Aren't we already doing CPI? Isn't CPI the process that put us here in the first place? With CPI we've been falling behind as the New American University picks up speed. If we can't afford an ERP system today, let's figure out when we can afford it, or hold a bake sale or something...

I'll say it again. The issue is not money alone, but money and risk. To build the institutional will, the level of risk needs to be reduced. Risk is reduced by showing an unbroken stream of improvement. Rewiring the back end, using a vendor-based product, has to be part of that plan, but I claim we can't wait till the back end is fixed to make improvements that the customers (that's right, the customers, not the natives!) need to run the business NOW. Acknowledging that reality will help build the institutional will we need.

Even Max, the champion for closing the gap, had some kind words for a CPI approach:

There are some existing services we can portalize. We can develop new services with the portal as the end game. [As] we go the ERP route we add or swap modules as they are ready to come live.

There's a difference between the CPI I'm proposing and the strategy we have been following for the past several years. As Roger said:

While we have been doing continuous improvement for several years, it has not been as unified and systematic as it needs to be. We lack a set of common standards that need to be followed by ALL developers, regardless of whether they work for central or a distributed IT shop.

...we need to encapsulate the inner system and provide a set of common database and business rule objects that all developers need to use and adhere to:

  1. Ensure consistency between applications;

  2. Minimize the redundancy of data;

  3. Provide a set of tools that can enable a developer to ramp up and become productive faster;

  4. Lessen our dependence on the legacy SIS.

It's gotten a little quiet in the past few days comment-wise. Is everyone getting bored?


Nancy September 6, 2005 at 6:07 PM  


• The institution likely won't support us in our plea to replace the SIS back end, fearing risk and unwilling to bear the cost.
• The SIS back end definitely needs replacing.
• To get the institutional support needed to replace it, we must create a plan to:
o Reduce intra- and interdepartmental wasted effort by integrating redundant data and processing (visualized in part by our Data Stores map).
o Simplify departmental access to data by publishing and maintaining relevant standards. (Question: how to enforce use?)
o Enlarge end-user (student, staff, faculty) access to processes by making key application functions, if not complete applications, available through the Portal.
o Accommodate new requirements of the New American University, such as flexible scheduling.
o Accommodate replacement of major components such as the SIS and other aging systems.
• Then we must prove the strength of the plan by demonstrating progress at early benchmarks.
• Simultaneously, we should create a durable infrastructure for all administrative processing, so that later, when the need to replace components again arises, we don't have to endure a whole new round of analysis. (I'm adding this part but I think at least some agree?)

Work has begun to locate redundant effort and data. What sort of low-hanging fruit would help us build institutional support for further development? We need a big fat shiny fruit that's actually within reach.


waldo,  September 7, 2005 at 4:14 PM  

I agree with Nancy but I would rather have a fruit basket. You see, some fruit will spoil. Some fruit we will drop and ruin (it’s going to happen and I think that’s ok). Some fruit needs to go into the freezer to be thawed out later and used. But I’ll bet some fruit are almost ripe and ready to be picked and eaten (this will make us strong for the next crop). But I get the feeling that the growing season is coming to an end and we might lose the entire crop. So let’s clear the fields, get on with the feast, plant some seeds (remember to rotate the crops), and plan for the next crop. All we need is a little rain then some sunshine! I’ll bet there is enough to feed and satisfy everyone.

iacnld,  September 7, 2005 at 5:26 PM  

Shadow databases? I have heard this word often and maybe I'm the only one... but I'm not sure I know what that means.

Here is scenario one. An application developer in a college has an automatic SQL process that kicks off each morning at 8am to load a college SQL Server with Data Warehouse student data for that college .... is that a shadow database? The application developer builds an ASP or MS Access front end for college staff that use the student data from the SQL Server (application developer likes to control their response time... not have users wait on every Warehouse query) AND collects local data that supports a function in the college... like advisor name, appointment times etc. Is it a shadow database now? What is the global problem here? Maybe security... and we hope the SQL Server is patched.

Here is scenario two. A department stays in touch with their graduates via newsletters, events, and fund raising activities. They create and update a database with demographic data, job data, contact and participation data. Some of this data may also reside in the Alumni data store. Does that make the department database a shadow? Is the Alumni database a shadow? This situation makes for more of a global problem in that data may have different definitions between the two data stores. Is this more what we are worried about?

You know... there is one thing we have done very well in all of IT here at ASU. Our system response times are fast. We have been very protective of response times and that drives application developers to slurp and store Warehouse data creating their own local "replications" that they can tune and control.

So I ask again... what is a "shadow" database? Does it have something to do with whether it is within Central IT or outside of it?

Cat,  September 8, 2005 at 2:14 AM  

RE: Data warehouse and low-hanging fruit...

Business Technology Systems (BTS... formerly AFIT) is currently in development on four custom web service applications to be delivered through "the Portal" -- Travel Approval/Advance/Reimbursement, Hiring Process Report, Leave Request & Reporting, and an HR Employee Service Tracking system which will eventually become an online HR self-service center and knowledgebase.

Our most urgent need right now is access to legacy data which is currently only available on a 24- to 48-hour delay through "reporting tables" in the data warehouse.

An illustration of the problem lies in the online travel application we're developing -- under the current process, a traveler requesting authorization would not be able to see that request in their dashboard until it had been approved and passed through Advantage and updated in the appropriate tables in the data warehouse -- next day at best, 2 business days if their request missed the scheduled daily update.

Updating your personal information in the HR data systems suffers the same fate. And don't even talk to me about benefits info from HRIS...

Is there any reason why we can't have a read-only "reporting layer" of operational data, updated in real time, rather than once daily or even hourly? If we can't see directly into the production data because of performance issues, this would provide us a more realistic view of the data in HRMS, Advantage, SIS and other operational systems.

Some student data is updated in real time or every 30-60 minutes -- is there any reason we can't do the same with other business systems?

In the meantime, we're moving ahead, designing data tables to hold the online transaction data and display it to the customer in an online dashboard view. We've been told it will be 4-6 weeks to build those tables in the data warehouse -- longer than it will take us to develop the application. Not only is this a crippling delay in the development process, it seems a waste to duplicate data that could be maintained in one place.

A real-time, read-only data layer would seem to solve that problem. Any ideas on how we can accomplish this? Or is there a better solution?
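As a rough illustration of the idea only, and not a description of any ASU system, the sketch below shows a reporting connection that reads operational data directly instead of waiting for a copy to be loaded. SQLite stands in for the operational database purely so the example runs anywhere; the `travel_request` table, its columns, and the `dashboard` function are hypothetical names. A production setup would more likely point the read-only connection at a replica to protect performance.

```python
import os
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical stand-in for an operational system (SQLite, so the sketch is runnable).
db_path = Path(tempfile.mkdtemp()) / "operational.db"

# The operational application holds the only read-write connection.
writer = sqlite3.connect(str(db_path))
writer.execute(
    "CREATE TABLE travel_request (id INTEGER PRIMARY KEY, traveler TEXT, status TEXT)"
)
writer.commit()

# The "reporting layer" opens the same data read-only instead of copying it.
reader = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)


def dashboard(traveler):
    """Return (id, status) rows for one traveler, read from the operational data."""
    return reader.execute(
        "SELECT id, status FROM travel_request WHERE traveler = ? ORDER BY id",
        (traveler,),
    ).fetchall()


# A new request is visible to the reader as soon as the writer commits.
writer.execute(
    "INSERT INTO travel_request (traveler, status) VALUES (?, ?)",
    ("cat", "submitted"),
)
writer.commit()
rows = dashboard("cat")

# The reporting connection cannot modify operational data.
try:
    reader.execute("UPDATE travel_request SET status = 'approved'")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True
```

The point of the design is that the dashboard needs no duplicate table: the request shows up on the next read, and the read-only connection cannot change anything in the source.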

- Cat

kasdm,  September 8, 2005 at 5:06 AM  

I am not sure that the term "shadow systems" always means what the term implies. When I think of a shadow system I think of a copy of data made to query and manipulate as needed. While I agree that enterprise data from both legacy systems and the warehouse is somewhat difficult to access and use, what I see most often is the need to capture and store additional data that is not currently captured in the "enterprise systems". Thus, a new departmental database is created which is a mixture usually of enterprise data and data that meets the specific needs of the department. Of course, the challenging part is that each college, program, etc. has some data requirements that are unique to them. I don't think of these as shadow systems.


Jon September 8, 2005 at 1:16 PM  

Regarding duplication of efforts, resources, etc.

I’m a strong proponent of centralized resources. But let’s take it another step forward. I would propose that not only should we have MORE centralized resources (File servers, SMS servers, Project servers, etc.) but also a harder tie-in to Central IT, and more involvement from Central IT for the “disperse and duplicate IT bubbles” scattered around Campus.
I would picture this in terms of:
• Review and coordinate IT efforts.
o This would reduce duplication, and allow Central IT to help direct and be more aware of Campus needs.
• Establish a Liaison program out to the “IT bubbles”.
o Standardization of effort
o Centralized Training allowing IT personnel to keep skills up-to-date.
o Pooling of Campus Technical resources.
• More involvement in the Technical direction that ASU is moving in (or attempting to move in).

We’re doing SOME of that now, but not enough. Let’s take it up a notch.


Nancy September 8, 2005 at 1:53 PM  

Re: Cat's comments about data warehouse and low-hanging fruit

There's an almost-real-time replicant of the SIS database (15 minutes, give or take) called SISREP. I don't know who-all, or which-all types of applications, can have read access. Somebody chime in? I have heard that it's not an easy database to navigate because the relational tables are based on the IDMS set structure; I think this was because the product that does the data transfer from IDMS journal files had to work that way.

In terms of the other databases that you mention, I don't see a straight replicant of HRMS or Advantage on Nancy's diagram. I don't know what kind of overhead, storage, etc. it would take to create one. If your application needed to refer to all three databases, you'd of course need to be familiar with all their structures. And DB structures don't stand still. Every time a field gets added or changed, programs may have to change. Everyone that manages those programs has to be notified. Of course, if we had centralized definitions and control of those DBs, and knew who was using them and therefore who to notify about changes (and coordinate application testing, by the way, which often happens late at night or on weekends so as not to disrupt normal usage), duplicating those datastores for read-only might make sense. It would take some infrastructure. (Read: time/money/staff.)
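To make the "who to notify" part concrete, here is a minimal sketch of a usage registry, assuming each application declares which tables and fields it reads. Every name in it (`register`, `affected_apps`, the `employee` table, the application names) is hypothetical and only there for illustration.

```python
from collections import defaultdict

# Hypothetical registry: (table, field) -> set of applications that read it.
registry = defaultdict(set)


def register(app, table, fields):
    """Record that an application depends on the given fields of a table."""
    for field in fields:
        registry[(table, field)].add(app)


def affected_apps(table, changed_fields):
    """Return the applications to notify when the given fields change."""
    apps = set()
    for field in changed_fields:
        # .get avoids creating empty entries for fields nobody registered.
        apps |= registry.get((table, field), set())
    return sorted(apps)


register("travel_dashboard", "employee", ["emplid", "dept"])
register("leave_request", "employee", ["emplid", "hire_date"])

to_notify = affected_apps("employee", ["dept"])
```

With something like this in place, a planned change to a field produces the list of owners to contact and the set of applications that need a test window.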

Hey, IT guys, help me out here.... Any insights?


Nancy September 9, 2005 at 10:49 AM  

I've written on the topic of "Money" in my own blog "CoffeeTalk", so as not to take up too much room here. And it's really a separate topic.


Sandy,  September 13, 2005 at 6:43 AM  

At the urging of fellow TSAs here at the West campus, I tuned into Adrian’s blog for the first time today and I love the discussion. I spent years developing applications at another university where they could have written the book on data integrity and data management issues. High-level decisions *seemed* to have been made in a vacuum, separate from those affected most by those decisions. Whether this was actually the case or not, we would never know because we did not have access to a forum such as this in which to discuss those issues. I haven’t done much in the way of application development yet here at ASU, but I am very encouraged by the concern about data and the thought-provoking conversation in search of solutions.

I do not feel that I am up to speed on the discussion enough at this time to make much in the way of useful contributions to the conversation, but I would like to throw out something to keep in mind in the brainstorming process.

I have read many postings that make reference to things like “campus needs”, “Central IT”, “IT bubbles scattered around campus”, “Campus Technical resources”, etc. It sounds like most of the conversation may be coming from voices on the Tempe campus, and I would like to make the point (if it hasn’t already been made) that the other campuses of our single university need to remain part of the discussion and problem solving process.

This is not only to make sure that needs of all the campuses are taken into consideration, but also to make sure that the possibility of technical issues inherent in physical, geographic separation from the ‘mother ship’ are not a surprise at implementation of any solution. This separation may also, by itself, be creating some of the database shadowing, duplication of effort, and difficulty enforcing data standards, and may be worth further investigation. Just my thoughts! Thanks for the great reading so far.

Mike,  September 18, 2005 at 7:51 AM  

Data Warehouse discussion - from a business user standpoint there are two issues that I think are relevant to this discussion and may help explain why "shadow" systems are so abundant. To use the data warehouse, the user must first know some access software to build and run a query. That is difficult enough for the majority of people at the university because most non-technology people are technology challenged. But the second thing a user of the data warehouse must know is really the show stopper: they must also know the data itself. So people do not trust the results they may get because they are unsure of how the data is stored and whether they have built a proper query to get the data they want. So they build a shadow system because they can then trust the data. So if we are going to reduce the number of "shadow" systems out there, we need to put a front end on the data so that a relatively unsophisticated user can access it without knowing much about it. I don't pretend to know how to do that, but I do know that we will not have a robustly used system until we figure it out. The data warehouse as it is now has put a relatively few people in a position of power simply because they have figured out how to pull the data and trust what they pull enough to sell it to their supervisors. This will not change until we make getting the data in the form the user wants easy.
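One possible shape for such a front end, offered only as a sketch, is a small set of vetted query functions that ask a business question in plain terms and hide table names and code values from the user. The `stu_term` table, its columns and codes, and the `enrolled_headcount` function below are all invented for the example; SQLite is used only so it runs as-is.

```python
import sqlite3

# Hypothetical warehouse table with the kind of cryptic codes a user must otherwise learn.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stu_term (emplid TEXT, term TEXT, coll_cd TEXT, enrl_stat TEXT);
    INSERT INTO stu_term VALUES
        ('1001', '2005F', 'LA', 'E'),
        ('1002', '2005F', 'LA', 'W'),
        ('1003', '2005F', 'EN', 'E');
    """
)

# Plain-language names mapped to the stored codes (assumed values).
COLLEGE_CODES = {"Liberal Arts": "LA", "Engineering": "EN"}


def enrolled_headcount(college_name, term):
    """Count currently enrolled students for a college and term.

    The rule "enrolled means enrl_stat = 'E'" lives here once, so every
    caller gets the same answer without knowing how the data is stored.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM stu_term "
        "WHERE coll_cd = ? AND term = ? AND enrl_stat = 'E'",
        (COLLEGE_CODES[college_name], term),
    ).fetchone()
    return row[0]


liberal_arts_count = enrolled_headcount("Liberal Arts", "2005F")
```

Because the definition of "enrolled" is written down once and reviewed, two departments asking the same question get the same number, which is the trust problem described above.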

Mike,  September 18, 2005 at 7:57 AM  

Policy vs Technology - I do not agree that technology should drive policy, but I do believe that technology should cause a re-examination of current policy when appropriate. I also believe that technology causes policy makers to be more specific in their interpretation of what policy means. Policies are typically communicated in words that may have different meaning to different people. But when we configure an ERP we need to know specifics of those policies and that causes policy makers to get very nervous because now they have to explicitly state what the intent of the policy is so that the technologists can correctly configure the system. The policy maker sees this as an erosion of authority, when in actuality it better defines the organization and frees the policy maker to focus on higher and more important issues. Selling this concept to the policy maker is not always easy and in my experience is one of the most difficult items to tackle in implementing any system. We must build a strategy to enable us to get quick decisions from policy makers when these issues arise and the policy maker must be comfortable in making those quick decisions.