Saturday, August 27, 2005

This just in....

This is my third post today. And its Saturday. I think I might need to get a life. I did go to Cactus Jack's last night for a while, though I must admit that Shawna and I didn't connect as tightly with the band as we might have liked. Still, it was nice to get out and take a "thin slice" of Ahwatukee night life. There were some folks there deeply into CHG...

I'm also fired up about this diagram the data warehouse folks put together. I have it as a powerpoint file and as a jpg file.





Cool no? Its not complete yet, but what it attempts to do is show how data flows--and at what frequency it flows -- between the major data stores. It also shows how much of each of the data stores is in the data warehouse.

There are also some stores listed there that don't have connections. This is becasue either they don't have any connections, or because no one has drawn them in yet.

If your favorite data store is missing from the diagram, let me know. Please also try to indicate what other stores it transacts with and how often. If it's "standalone", let us know that too, together with your opinion of whether its in the data warehouse, and if its not, if it needs to be.

If the diagram has flaws, please help correct them. If the entire diagram is flawed, please suggest a replacement. If the diagram has utility for you, let us know why you like it.

7 comments:

iacnld,  August 30, 2005 at 6:14 AM  

Hey all... Adrian has convinced me to try blogging about data stores. I never considered blogging to be all that useful before - most blogs seem to go on about things that I don't have any interest in (what my son ate for breakfast for example). But I love to chat with folks (eh Ron?)so let me throw out for your consideration some data stores that didn't get included in the original diagram. And please all... these are just data stores we are trying to understand... feel free to chime in and EDUCATE us (Data Admin). Tell us about platforms, connections to other databases, who knows about the data, relation to the Warehouse or to SISREP (does it pull data or push data?). Tell us about data stores you have heard about but wonder what they are.

For starters.. I hear that Undergrad Admissions has a database on Sybase? Is that right? What does it talk to? How about the SAREO database on Sybase (what is SAREO anyway)? What is the Student Rec Center doing with data stores? How about the Library? Russ.. what is VP Research doing with data stores... you guys have SQL servers still? First pass above was missing Blackboard (oops brain fade).. what does it pull data from. And how about some of you folks out in the Colleges and Departments.. Randy.. lets get the Nursing data store on this diagram is it still SQL? Brian.. before you go :-( is the Law data store on a SQL Server or Sybase or both? Henry - I bet you have multiple important data stores in Liberal Arts.

Thanks Kathy and Brian for letting us know the Foundation and Alumni database are one and represent a new platform on the diagram. Who wants to vote for the color for an IBM Unidata database? Lets get more info like that thru the blog.

All you blog watchers.. what's your favorite database? (Your favorite ASU database... Ebay doesn't count!.. although I admit, I stay awake nights sometimes wishing to query all the data on Ebay! - True measure of a data junkie!) Lets get your favorite ASU database on this diagram. I know everyone is busy.. I know I am.. but if all of us at the University could have a better shared understanding of what all the data parts of this place are... wouldn't life be so much better? If we overwhelm Adrian's blog with database info... he can probably kick us off to some other blog... eh? And if you overwhelm me with information on the blog, maybe I will shut up? (That is NOT a promise!)
Nancy D. (as if you all didn't know by my ASURite ID!)
PS HAPPY BEGINNING OF THE SEMESTER ALL!

zola,  August 30, 2005 at 12:48 PM  

SAREO - Student Affairs Research and Evaluation Office, formerly directed by Holl[e?]y Belch, then Alice Mitchell. Present existence in the form of several cases of CD's, ZIP disks and 3.5" floppies given to Rick Batchelor by Sheldon Zola; and in the Freshman Year Experience data (still being collected) in the department1.SAREO_FRESHMAN_YEAR_EXPERIENCE database. Data extends back as far as 1998, maybe farther.

zola,  August 30, 2005 at 1:03 PM  

There's also a SAREO_LEADERSHIP_DEVELOPMENT database on department1 - so named because it was started during Holl[e?]y Belch's watch as SAREO Director. It was meant to capture, on a semester basis, student involvement in certain curricular (ASU courses), co-curricular (workshops, trainings), student employment (as in HR data) and special program (Passport, service programs) activities - and did until some of the more skilled trustees moved on and the aggressiveness of recruiting departments to participate and enter data languished. It has been all but replaced by the ASU Experience database - an SQL Server db on UITS/vpsa04 - but I notice data entered as recently as 5/30/05. The most recent trustee - Doug Ewing - resigned last week.

There are 60K+ rows of data, dating back as far as Fall 1997.

Cat,  August 31, 2005 at 7:38 PM  

This is great! I really wish we had something like this six months ago when we were starting our web process initiative in AFIT and trying to put together requirements for an RFP for enterprise content management and web services development tools.

Hopefully, someone is documenting the comments in this blog regarding definitions and decoding the acronyms.

Also, where do we see the "reporting" databases, the dbs everyone uses to easily access and display our data? Forgive me if I'm using the wrong terminology here. My understanding is that we can't read data in the mainframe -- Advantage financial data, for example -- for reporting or use in online applications, such as a web-based travel request system, so we have to hit some other tables that have been created for this purpose and are updated on a daily rather than real-time or hourly basis.

Would that structure need to be identified in another layer? Or am I just not seeing it here?

Thanks again for what you've done so far!

Nancy September 1, 2005 at 8:03 AM  

Cat wrote:

"Hopefully, someone is documenting the comments in this blog regarding definitions and decoding the acronyms."

That, and linked descriptions of objects (like a database) in image files could be easily accomplished in a wiki. For the most well-known example, see WikiPedia at http://en.wikipedia.org/wiki/Main_Page. For an explanation, see "wiki" in the WikiPedia! Just type "wiki" in the search field in the left-hand nav bar. :o)

I may be jumping the gun a bit since something like that may be in the works, but it applied to your question so there's an example you can check out. Blogs are great for linear, sorta, conversations, but wikis allow content to determine format, they interlink references automagically, and they're pretty easy to reformat if needed, unlike a set of Web pages.

Nancy "WikiBigot" Lee

iacnld,  September 1, 2005 at 10:26 AM  

Ah "reporting" databases... maybe we do need to put a BIG box on the diagram that is the Data Warehouse. The Warehouse is the largest "reporting" database. The Advantage web reports run from there, Property Control reports, Facility Reports, and of course our nearly 1,000 plus ad-hoc queriers (is that a word?) plus college and departmental application developers pull data for reports from the Warehouse. In the current diagram, the pie in the top right corner indicates how much of the original data store goes to the Warehouse and can be used for reporting. I think the Warehouse data is "fairly" well described at link if you are interested ("fairly" is a resource impacted adjective!). I'm now looking for a way to show graphically ... data that goes out of the Data Warehouse into other data stores that then collect original data (like Nursing professional program admissions). Any graphic wizards have an idea on a push/pull type visualization?

Rick September 1, 2005 at 11:15 AM  

Nancy said:

...Russ.. what is VP Research doing with data stores… you guys have SQL servers still?


Well, I'll pick up that question, and that challenge: The OVPREA office uses SQL Servers for their power, flexibility, capacity, and cost... and we get analysis services for free.

We don't store any ASU-hosted values; if an application is going to display a name or a department code, we pull it on the fly from a Sybase machine (configured as a linked server, for convenience).

We have many Microsoft Access databases here, each with a separate client ODBC connection to the campus data they require, each with slightly-different versions of queries and local tables, depending on who set it up. We have been trying to move these pockets of data, queries, and connections to SQL Server and share it back to the users... but that takes a lot of time. Permissions into the Oracle or Sybase databases is on a person-by-person basis, which means that any centralized application we might like to write would have to duplicate the authentication and authorization scheme of the ASU data store. We have done that in limited cases only.

Our databases support several Web sites, where content is dynamically generated. We also host several Web sites for offices that (for one reason or another) need ASP or ColdFusion application services, or database interactivity.

Rick