{A Guide to PUC’s UniData | Colleague | Database}
By David Regal |
|
TOC
Four Ways to Make or Run a Query
4.1.0 Logout of the text-based GUI
4.2.4 Email the test query results
4.2.5 Why all this hand checking?
It’s easy to get data out of the database, but getting correct data is difficult and time consuming.
4.3.2 Export to your HOLD file
4.3.4 Overview of exporting to Excel
4.4.1.5 Documentation saves money
4.4.2 Conventions and Standards
4.4.3 Convention for Excel filenames
4.4.4 Convention for printouts
How do I import the output from a query into Excel?
In a query, should I use ticks ‘x’ or quotes “x”?.
How do I exit the “>” prompt back to the “:” prompt?
How do a make a Saved Paragraph?
What does all this syntax (.L, .R) mean?
For screenshots, how do I copy only the active window?
For screenshots, how do I copy the whole screen (includes the active window)?
A Guide to PUC’s UniData
(Colleague) Database
PUC 2004
:: Created: 6/10/04
:: by: David Regal
:: Modified: 6/30/04
:: by : David Regal
:: Version 2.4
:: Based on:
:: Datatel Release 17
:: Unidata Release 6
:: AIX Version 5.1
(Figure 1) Data Access. Colleague offers many ways to save and retrieve data.
(Figure 2 ) Simple environment
(Figure 4 ) A sample from the spreadsheet “Frequently Used Tables” in Colleague
(Figure 7 ) Data Processing. A solid foundation is important for accurate analysis and reports.
(Figure 8) Mock-up of the Custom Menu Item.
(Figure 10) Query Builder is available through the Colleague GUI
(Figure 11 ) Process that happens when you execute a query
(Figure 12) Screen shown after a Telnet login
(Figure 14) Look in the dictionary file to find the fields you may use in your query
(Figure 15) Excel used for database mapping
(Figure 16) Email the test query. Highlight a block >> Right Click >> Copy
(Figure 17) Email the test query. Cut & Pasted into an email
(Figure 18) Exporting to Excel takes some time, but should be done to be consistent
(Figure 20) Email or upload the query results to the people requesting the data
(Figure 21) Importing into Excel.
(Figure 22) Active window screenshot.
(Figure 23) All windows screenshot.
Many volumes of thick manuals from Datatel exist and are kept in the IT department. Out of the numerous manuals I have looked through, few represent what really happens at PUC when dealing with the Colleague database. Since practical documentation is almost non-existent, to learn the database system you have to ask those that have come before you (thank you LB and RH for answering my many questions), and most importantly, have some practical experience from building your own queries (thank you Dr. GH for your patience while I was learning UniQuery).
This guide is a work-in-progress and will be updated. If a section is not complete and you want it to be, check back in a couple months to see if a new version has been uploaded (search through www.puc.edu or www.techborder.com ). If you would like to contribute, please do so. Upload your version on your website and it will eventually be found (web bots are very good at this). Information and knowledge are free; share what you know.
To create reports and queries, a good understanding of the database and the system is necessary. After this, we will be closer to achieving the goal of Institutional Research (IR) to “…create a process that can serve as a tool in making decisions for the betterment of the College.” [ad hoc meeting, 1996]
Since coming to work at PUC in November 2003, I have been taking notes and honing my query building skills. This is an accumulation of my note-taking work and would like to share it with those who are interested.
This is a proposed guide and is a suggestive guide, not an official one supported by the institution. Until it is agreed upon by the proper powers, this guide is will not be in effect as in the conventions and standards for the IT and IR departments of the institution. Consensus among the programmers should be taken. This guide can be changed so it suites the needs of the institution as best seen by those affected. However, after consensus is has occurred and standards are agreed upon, they should be enforced and supported.
1. Saved Paragraphs
2. Mnemonic
3. Query Builder
4. Command Line
Why are there so many ways to reach Colleague? As of this writing, Datatel is on the 17th release of its Colleague software. They started with the text-base GUI, mainframes, and have evolved from there. With each version, new features and tools become available. To be fair to the new technologies, we should try the tools they offer and see how we can maximize our work output and quality. Because of these new releases and tools, there are multiple ways of access.
(Figure 1) Data Access. Colleague offers many ways to save and retrieve data.
(Figure 2 ) Simple environment
(Figure 3) Detailed environment. If you’re a programmer or query builder, this is most likely your environment.
There are over 3,000 tables (files) in the Colleague database. Don’t hope to learn the whole thing; rather, learn the modules of the database that are practical to what you are working on at the time. For instance, if you are writing queries that collects how much an employee was paid in the last 5 years, take notes, diagram, and share what you learn. You will become an expert about the Financial System (FS) module and everyone will love you for the documentation.
Excel is helpful for this since it is affordable (bundling makes it almost free), almost everyone has it (standard), and allows hyperlinks (for website). Even though Excel can’t make an ER diagram, it’s a practical start of database mapping. When writing a new query, I refer to my “Frequently Used Tables” spreadsheet often. Don’t try to list all the hundreds of fields in the table; just list the ones that are useful.
(Figure 4 ) A sample from the spreadsheet “Frequently Used Tables” in Colleague
The VOC file has many Dictionaries.
(Figure 5 )VOC file. The field values for the VOC file are @ID = filename, F1 = File’s description and date it was copied.
(Figure 6) More info from a LIST DICT. Instead of just an ‘I’ or ‘D’, a detailed description results from listing out ‘TYPE’ when you query the dictionary.
(Figure 7 ) Data Processing. A solid foundation is important for accurate analysis and reports.
Building an infallible query without a solid platform and valid data is nearly impossible. You can only hope to come close to the actual.
Query builders and programmers should not be held accountable for the output of the queries when other factors are involved. Whether the results are good or bad, it is a shared responsibility.
To help alleviate the problem, a single department or person should be in charge of the data and the database, but they need to be empowered and supported (both financially and mentally).
For more about getting results from your database, please read Planning-DataProcessing.doc
Pros:
Cons:
Basically, after you make a query, love it, and use it over and over again. You should save it on the database server. Save it using the naming convention of your initials “:.S DCR.MY.FAVORITE.QUERY 4 1” saves the last 4 lines in the .L stack. Instead of typing a huge query with nested selects over and over again, you can just type the shortcut name that you saved your query under, (e.g. “:DCR.MY.FAVORITE.QUERY”).
Datatel calls them Saved Paragraphs because you can save a multi-lined query, a “paragraph,” for execution at a later time).
Pros:
Cons:
X X X X X X X X X X X
(Figure 8) Mock-up of the Custom Menu Item.
This is not the actual screen. My account didn’t have access to this screen, so a mock-up was made. However, the Custom Menu Item screen will look similar to this text-based GUI. Custom Menu Item screen will have ‘X’s in front of the names. From here, you type a number, press enter, and a query runs. After a few seconds, you walk over to the printer and collect the report.
How it works. Mnemonic queries are done by saving a paragraph in the Colleague database, writing a program, and then retrieving it through an ASCII-type GUI. These menus are made specifically for PUC, by PUC. For example, the mnemonic XS36 brings up a report on Student Credit Hours Generated for our school.
Pros:
Cons:
As the Colleague database develops, so do the ways that access the Colleague database. In 2001, Datatel released a new way of accessing the database – through the Datatel GUI. The exact name of the program is not known, but most people call it “the GUI” or “Colleague GUI.” In Windows, it’s the gray and red cylinder icon called “Datatel.”
In the Colleague GUI, many screens can be accessed and information can be looked up on a per instance basis. For example, “how many classes is this one student, ID.NO = 0123456, taking? But to collect larger amounts of information for statistics, research, and reports, looking students up one at a time is impractical. Thus, Datatel offers ‘Query Builder’ as an easy way to build queries from within the Colleague GUI.
(Figure 9) The Colleague GUI
(Figure 10) Query Builder is available through the Colleague GUI
Within the Colleague GUI, you can access query builder by (as of R17):
Pros:
Cons:
You have to have permission to access the Colleague command line and should have a good knowledge of the database, AIX Unix, and UniQuery. Access is not given to everyone because it is possible to be careless and damage the system. Please read the manuals available online and in the ITSS department to learn more about UniQuery and the database.
The purpose is to illustrate the basic steps in the most powerful and flexible way of querying PUC’s Colleague database… Command Line Querying.
The problem of building an accurate and valid query basically stems from the fact that there is not one authority on the database. So far, the authority for changing the database structure has been both PUC and the software vendor, Datatel. The authority for data entry has been shared among numerous departments on the PUC campus. Authorities tell people how things should be done. Authorities are not necessarily people; but also documents, maps of the database, or decent Colleague and Datatel help files (as of this writing, none of these exist and it is my hope this document can help fill in the void).
So without a single authority, it is hard to find out what data is contained in the database, what the database structure is, and how accurate the data is. Thus, query building becomes the time consuming detective work of research, testing, and checking your work. Like a trench-coat-wearing sleuth walking down a dark alley without a map, you have to take small steps, constantly watch out for anything out of the ordinary or you’ll be taken by surprise.
(Figure 11 ) Process that happens when you execute a query
To get to the command line, login through telnet, and press “1” for EX – Exit. If this option is not available, then you are most likely blocked from the command line and should talk to the system administrator about privileges.
(Figure 12) Screen shown after a Telnet login
We will be using the command line to write queries so be sure you have access. Type “1” to exit to the command line.
(Figure 13) Command Line
Note you are not at the Unix command prompt, but at a command prompt that resides within the UniData DBMS.
(Figure 14) Look in the dictionary file to find the fields you may use in your query
Find fields that look remotely close to being able to help you in what you are doing. For example, if you want GPA averages, and you knew to look in the table called STUDENT.ACAD.CRED, you should write down any field with a name related to GPA. “PUC.GPA, STC.GPA, HS.GPA, CUM.GPA” would be a good start. To find these fields, look at the file’s dictionary file.
: LIST DICT FILENAME
After the possible fields are found, take samples to ensure the fields contain field values (data). The
option SAMPLE is helpful for this. I would write something like this:
: LIST STUDENT.ACAD.CRED PUC.GPA STC.GPA SAMPLE 50
This would give me the fifty records, not in alphabetical nor in numerical order, but as they are stored on the disk. Look at the values in each field. Find the best field by checking that value with a known record (e.g. your own GPA or, after asking permission, someone else’s GPA that is known).
Lastly, write down the pseudo-SQL data type of the field. Mark it PK if it is a primary key or a foreign key. This is for your own reference so a month from now you don’t have to research this area of the database again. Knowing the data type of the field is useful for joining tables, and will give you an intuitive feeling about the data that is stored there.
SQL data types are not at all like Datatel’s data types that are found when you do a “LIST DICT …” (like ‘D’ or ‘I’). I suggest a pseudo-SQL data type because, as a programmer, you want to think of what data is “usually” there. If you usually see 6-digit numbers there, you can think about that field as a INT(6) (pseudo-SQL), or MEDIUMINT (SQL). I say “usually” because even if you see only integer values like ‘1340323,’ it still may be possible to store a Varchar value there, like ‘1340x32ps’ and thus check your work. Also, I use a pseudo-SQL that gives the number of characters, instead of the SQL standard. This way, you don’t have to think about the range of values a Datatel field could take on and translate that into INT, TINYINT, MEDIUMINT, etc.
Again, taking notes is important for documentation and for future reference (as in the following week when you’re asked to refine and run the query again). Write down notes about your fields in a notebook, spreadsheet, ER diagram program, or any another document. After trying all of the above, I have found an Excel spreadsheet with links to be the quickest, most flexible, and most organized.
(Figure 15) Excel used for database mapping
Think you have the fields you need? Then proceed to the next section.
From the beginning, building a good query is a process of taking small steps and refinements. Before spending hours exporting, uploading and emailing a statistically incorrect report, start with a sample query to see if you’re getting the output you want.
Watch out for missing or out-of-the-ordinary values or null values. For instance, in the field called STNC.NON.COURSE.ACT, you might find a few ACT scores in the 1000’s. Why? That’s most likely a SAT score that has been entered into the ACT field. This would not happen if data was automatically checked when it was entered in the database (i.e. data validation on the data entry level); however, invalid data get into the database and query builders have to build smarter queries to exclude the erroneous data.
Hand check. If you went to school here, look at your own record or make friends with a student or teacher and ask them to check the numbers. Time consuming as it may be, this is the only fool-proof way to get accurate data with the current system.
After one other person and you have checked the data, send a sample to the person requesting the report or have an expert in the appropriate department look at the data.
Is it enrollment data? Call the registry department. Are they MFAT biology scores? Call the biology department. Ask them to double check it before you send the final report, but also let them know you are doing all you can to check the data.
To give them the data, just cut and paste from the command line screen. Time consuming as it may be, hand checking is the only fool-proof way to get accurate data with the current system.
(Figure 16) Email the test query. Highlight a block >> Right Click >> Copy
(Figure 17) Email the test query. Cut & Pasted into an email
This is not an ideal situation, but without data validation, automatic checks, a mapping of the database, or a database authority, the arduous work of checking the data for accuracy is left to intelligent and diligent employees at PUC.
Think the query is accurate and gives the data that is needed? Check again. This is very time consuming and represents a majority of the time of building an accurate and valid query. You’ll soon learn it’s easy to get data out of the database, but making a query that gets correct data is difficult and time consuming. I’ll say it again.
If you think your numbers from the test query are close enough to the actual, or are right on, and you got an email from the person requesting the data saying the test query works, then good job detective. Move on to the next section.
Finally, you get to make a report. Use a banner with the standard format. Overlook the data and check if anything seems out of the ordinary (e.g. extra terms like 88/SP or 03/T1 showing up).
Most likely, the person requesting the data would like the report in Excel. You can achieve this by exporting to your HOLD file.
Example:
:SETPTR 0, 256, 0,0,0,3, NOHEAD, BANNER yourfilename.txt, BRIEF
Name the file with the *.txt extension for easy importing.
When you run the query, append “LPTR” to the end.
Example:
:LIST STUDENT.TERMS WITH PUC.TERM = ‘03/SP’ PUC.ID.NO PUC.GPA LPTR
With the same username and password you log into Colleague, use FTP to download your file.
FTP settings:
Blackhole = 10.50.xxx.xxx
Username = yourColleagueAccount
Password = yourColleaguepassword
(Figure 18) Exporting to Excel takes some time, but should be done to be consistent
Remember, more information and documentation is better than less. You or the person requesting the data can always cut away the unwanted information later. But when documentation is not given in the report, it is very hard to track down who wrote the query, check for accuracy, and find the query.
If you don’t have an Excel template, request it from the IT Department. If they cannot give you a template, make one that includes the information given below (Conventions and Standards). Overall, it should just be consistent and include the essential information.
At the very minimum you should have:
1. Title: This report gives the average GPA
2. For: Dr. Request D Data
3. Date Created (when the query was first written): 12/21/03
4. Date Modified (the time the query was processed): 12/25/03
5. Author of the query: David Regal
6. Query: LIST STUDENT.ACAD.CRED WITH PUC.GPA>3.8 PUC.GPA
7. Save Paragraph: DCR.STUDENTS.WITH.HIGHEST.HONORS.GPA
(Figure 19 ) Standard report in Excel. Include the query and the name of the Save Paragraph (if used)
In documentation, more information is better than less. We document our work since others may have to read or debug that code someday. Code reuse saves us time and money. Don’t think we have the time to document properly? You should ask, “Do we have to time to do it over?” Poor or undocumented work usually results in rewriting code or queries. This wastes hours of employee time and resources redoing work which has already been done. Documentation saves money.
Please place your work in the Excel template or make your own that includes all of the following information. A consistent format is just as important as a consistent data output. And even if accuracy cannot be achieved, at least be consistent in the report’s formatting and documentation.
Include the Query and name of the Saved Paragraph in the Excel report. Not in the email body, in your notebook, on a sticky pad, or anywhere else. Even if it’s more than one line and may take a few extra minutes to cut & paste, in the long run there are immense time savings in code reuse and debugging.
Why include the Query and Saved Paragraph? In the example above, both the Query and the Saved Paragraph are shown. Isn’t this redundant? No, because the Saved Paragraph can be changed or erased minutes after the report is made. However, you want to include the name of the Saved Paragraph so others can generate the report through the Query Builder GUI and take some of the workload off your back. If the Saved Paragraph does not exist, then just place the Query in the report.
Remember, more information and documentation is better than less. The extra information can always be deleted later by the consumer of the report.
Use your judgment. If it’s statistics for only one year, save the year of the report in the title, say its 2003…
2003AverageScoreSpring.xls
Excel or another agreed upon program should become the standard statistical format that all employees at the institution use. However, if you must use reams of paper for printouts, at least give the essential information in the header of each printed page.
HEADER “’C’Registered for Six Units or Less as of ‘TLC’Process – XSS.6.AND.LESS Page: ‘PL’”
Example:
:LIST STUDENTS WITH .XSTU.CUR.CREDS.RGSTD LE ‘6’ BY PUC.NAME PUC.NAME STU.ACTIVE PROGRAMS HEADER “’C’Title: Registered for Six Units or Less as of ‘TLC’Process – Query: XSS.6.AND.LESS Modified: 6/29/2004 By: LB Page: ‘PL’” LPTR
I think typing this lengthy header for every query that is printed takes more time than the steps for importing data into Excel and prefer the Excel method of documentation.
The Excel file should be emailed or uploaded to a server so that it can be accessed by people requesting the data. Think about privacy and whether the destination is sufficiently password protected.
(Figure 20) Email or upload the query results to the people requesting the data
There is no quick way. Go to the Colleague GUI, press the help button “More” and look around for a VAL Code with a name similar to what you are looking for or ask someone who has worked here longer than you have.
Most Datatel manuals do not show the proper location and names of their VAL codes and their dependent screens.
Part of the problem is from broken VAL codes, rules, and conversions tables. The problem is too complicated to be answered fully within this document. Please see the document IPEDS-SuggestedChangesV1-2.doc for some of the problems and their fixes.
To summarize, it has been stated that IPEDS Reports will work [out-of-the-box] if …
(a.) The list given in the folder is complete (b.) Code files and tables are set-up (c.) Field values are populated [correctly] (d.) Departments are filling in the values.
Starting at the Command Prompt “:” you can easily get output to your screen. But what if someone wants the raw data? As a standard, place it into a common program, such as MS Excel.
Prerequisites:
(Figure 21) Importing into Excel.
Not found in most manuals for UniQuery. After performing a SELECT or SSELECT, the prompt will change from “:” to a “>.” This indicates you’re working with a collection of records stored in a cache.
Type “R” at the “>” prompt.
> R
Instead of typing the whole query, you can save the query on the database as a shortcut
Type “.L” to see what lines you want to save.
Save my query began on line 1 and ended on line 4, then my command would be
: .S DCR.MY.FAVORITE.QUERY 1 4
This saves line 1 to 4 to a newly created VOC file on the server called DCR.MY.FAVORITE.QUERY. To retrieve it, use .R saveparagraph or execute it by typing the name of the Saved Paragraph. (:saveparagraph).
Don’t know what ‘lines,’ .X or .R mean? Here’s my quick reference.
HelpDoc-CommonlyUsedCommandsUniQuery.doc
To learn more about UniQuery, please read the manuals in the section “Documents you should read.”
Using screenshots in a manual show the reader different stages of the program or process and guide them through the program. Copy only the active window so you don’t have to crop the screenshot later.
To only copy the active window,
[Ctl] + [Alt] + [Print Screen]
Sometimes [Print Screen] is abbreviated [PrtScn]
(Figure 22) Active window screenshot.
To paste into Photoshop, MS Word, or any graphics program,
[Ctl] + V
[Ctl] + [Print Screen]
Sometimes [Print Screen] is abbreviated [PrtScn]
(Figure 23) All windows screenshot.
|
|
Can be found in CORE. [Datatel manual]
|
|
Can be found with the mnemonic “VAL”
|
|
Colleague |
A software product from Datatel used by mid-sized educational institutions.
|
CORE module |
Colleague software module containing demographic data, shared data, shared processes and setup and utilities.
|
Datatel |
The company that supplies the UniData DBMS and Colleague.
|
DBMS |
Database Management System
|
Files |
A collection of records. Called a “Table” in most databases. The term “Table” will be used in this document to avoid confusion with the “files” that are stored on your disk in folders or directories.
|
GUI |
Graphical User Interface
|
Null values |
Blank field values. Instead of having data, you have holes.
|
PC |
Personal Computer
|
Query |
Literally, a question posed to the database. A small program that collects data from the database. Purists will argue that a query is not a program but a script. However, they’re all computer languages.
|
Server |
A computer, PC usually, that runs a special program.
|
Tables |
A collection of records. Called a “File” in Datatel speak.
|
UniBasic |
Programming language that UniData is compatible with and can be used to generate HTML code.
|
UniData |
The software suite that includes UniQuery and UniBasic. Also refers to the Database Management System (DBMS) that powers Colleague.
For release 18 (R18), this DBMS can be changed and the Colleague front-end can still be used.
|
UniQuery |
The query language used by Datatel and is not at all like SQL.
|
[ad hoc meeting, 1996] “Propose Philosophy Statement,” ad hoc Institutional Effectiveness and Assessment Committee, 1996
Questions I asked Mr. Haylock, David Regal, 2003
HelpDoc-IntroToColleague.doc, Saint Michael’s College, 2003
HelpDoc-QueriesWithUniDataTulsa.pdf, University of Tulsa, 2003
HelpDoc-UniQeuryReferenceArdent.pdf, Ardent, 2003
Printout of Mr. Belleau’s standard header, Mr Belleau, 2004