IBM Tools: April 2008

View Setup and Configuration

A team normally has separate views with the same config spec. View profiles can automate view setup for a team. The view_server is a process which runs on the host where Clearcase storage resides. This host must have Clearcase installed. You can place the view_server on the same host which contains the VOB, which reduces traffic, but is not recommended so as not to degrade the VOB.

The viewtag is the name of the view. It must be a unique identifier. It also must take the form of a valid directory name. Here is some guidance on the config spec for a view:

Each view gets a default config spec
The default retrieves the most recent version on the main branch
Can be modified using the "cleartool edcs" command
Do not change while files are opened in the IDE
Is case sensitive

In most simple projects, the development takes place on the main branch. The default config spec is taken from the file located at $ATRIAHOME/default_config_spec. CHECKEDOUT in a config spec allows modification of elements. And \main\LATEST allows creation of new elements. Note that a config spec can include other files.

I have more reading to do in my Clearcase Manual book. When I have read and studied the book some more, I will post again and share what I have learned.

Visual C++ Integration and the CLI

Clearcase integrates with Microsoft Visual C++ (VC++). However is supports only one version of Visual C++ at a time. It accomplishes this integration by adhering to the Microsoft Common Source Code Control (SCC) specification. Icons in VC++ get customized with Clearcase. New context menus items are added for files. There is a new source control toolbar in the IDE (Integrated Development Environment). And there is a new tab in the output window called "Source Control".

A Clearcase view can be assigned to a drive letter like a network drive. There is a "share from Clearcase" option which allows links from one VOB to another VOB. You should not allow Clearcase to auto merge any "resource.h" file from VC++. Clearcase cannot correctly detect some conflicts with this file. When adding files to Clearcase, you must check out the parent directory first.

There are a number of files that you should not add to Clearcase. In general the files that should not be added are those that are generated by the IDE. These files can have extensions such as .aps, .bsc, .clw, .dll, .exe, .exp, .lib, .ncb, .obj, .opt, .pch, .pdb, .res, and .tlb. The full list is extensive. But following the rule of not checking in generated files is usually sufficient.

Clearcase also integrates with other Microsoft tools such as Visual Basic. However I work exclusively with Visual C++ so I will not go into this integration.

The cleartool command starts the command line interface tool. It has an extensive set of commands. It operates in two modes: single command mode and interactive prompt mode. With the single command mode, you type "clearcase " to run every command. With the interactive prompt mode, you issue the cleartool command which causes the "cleartool>" prompt to be displayed. You can then issue sub commands without typing cleartool each time. In this mode you can end a line with a caret (^) to act as a continuation character. You should also quota any arguments that include a space character.

Commands as well as arguments have abbreviations. Here are some additional properties of the single command mode of the CLI:

The command is first processed by the command shell
The command shell expands environment variables
The C run time library interprets quotes to delimit arguments
The Clearcase commands expand special characters such as the asterisk (*)

Clearcase commands return 0 for success, and a non-zero for errors.

Welcome Back Clearcase

I recently started a new job. They haven't given me a computer yet. So to pass the time I broke out my old Clearcase Manual. Now this book is for version 3.2 of Clearcase running on Windows NT. But many of the topics are relevant to any version of Clearcase. So I thought I would share some facts I have relearned while reading the book.

The Windows version of Clearcase is integrated with Windows Explorer. You can issue the "cleartool man" command to see the reference pages. Clearcase is both a version control system and a configuration management system. It is designed for use by teams. Clearcase has a command line interface (CLI) as well as a GUI one. The Clearcase Home Base allows access to most of the tools in Clearcase.

Clearcase objects have properties which are separate from Windows properties. A VOB (versioned object base) is a public storage area. A view is a private storage area for one user. There are two types of views: snapshot and dynamic. You need to use a view to access the VOB. Starting the view activates it so that data appears as a directory to Windows. You must mount at least one VOB and start at least one view to use Clearcase.

The config spec contains the rules that a view uses to select a version. A view can have only one checkout of a given element. Once checked out, only you and the administrator can undo your checkout. There are two types of checkouts: reserved and unreserved. Reserved checkouts are the default, and give an exclusive right to extend a branch to a new version. Unreserved checkouts allow multiple views to get a copy on the same branch, and requires a merge on checkin. The Merge Manager is a GUI tool to help resolve conflicts during the merge.

The Version Tree Browser is a tool which, among many other things, allows you to:

compare versions of a file
view meta data (such as labels)
view a history of events on an object

A View Profile is an optional feature that allows common Clearcase information to be shared by a development team.

Goodbye For Now

Next week I start my new job. My official title will be "associate". Does not sound too impressive. However I guess I am not important enough to be a partner yet. The project at my new job is primarily backed by a huge Oracle 9 database. They have plans to move to Oracle 10 in the latter part of this year. But this means I will not be working with IBM DB2 as I had previously planned.

So I expect my posts to this blog will be limited unless there is another drastic change in employment. It is sad but I think I shall uninstall my copy of DB2 for Windows. I have already shared the little bit I know about IBM tools in previous posts, mostly about the DB2 database. And who knows? Maybe the new project shall start using another hot IBM tool such as Websphere.

I have already tossed out my book on Understanding DB2. Luckily I did not pitch my library full of Oracle books. They shall come in handy during the next gig. If you want to keep up with my exploits when my next job takes me through a move to the Oracle 10g database, check out my Oracle Development blog. Cheerio for now.

Performance, Problems, and Misc

DB2 writes transactions to the log buffer. This buffer is flushed periodically. The flush happens when either the buffer pool is cleaned, or when it is full. DB2 has the ability to read pages from disk into the buffer pool before the application needs them. This is called "prefetching". It is highly recommended that the RUNSTATS program be run periodically to ensure optimal performance.

Frequently you will receive error messages when using the DB2 database. For example you might get the SQL0911N if there is a deadlock that is detected and resolved by DB2. The "SQL" in the SQL0911N error number means that this is an error that comes from the Database Manager. This is the most common error you will encounter as a database developer.

It the event that you suspect that an error is a DB2 bug, technical support may request that you run the db2support utility. It collects all kinds of information and stores it in a file named "db2support.zip". Guess what type of file this is?

The system catalog is contained in the SYSIBM schema. There are read only views for SYSIBM tables in the SYSCAT schema. For example, SYSCAT.TABLES is a view for the underlying SYSIBM.SYSTABLES table.

There is an entity called the Communications Database (CDB) in DB2. It is currently a set of system table with connection information. They are stored in the host database. The CDB used to be a separate database in prior versions of DB2. However it is not part of the catalog.

The DB2 Model

There are a number of listeners which monitor incoming network traffic to the DB2 host:

db2ipccm - listens for local connections
db2tcpcm - listens for TCP/IP connections
db2tcpdm - listens for TCP/IP discovery requests

The listener assigns a coordinator agent to work on behalf of a client. DB2 engine dispatch units (workers) are implemented as threads in the Windows version of DB2. The client runs in a different address space than the DB2 engine. This ensures that the database continues even when a client goes down.

The first process to be run when the instance starts is db2sync. It is the system controller. By default there is one coordinator agent per database connection. An application shall get an error if it needs an agent but no free ones are available. DB2 has a Connection Concentrator which allows more database connections than coordinator agents. This is a good option for applications with small amounts of transactions. It improves performance in this scenario.

There are two types of shared memory in DB2 - instance level and database level. Instance level shared memory is used for the following functions:

snapshots
event monitoring
audit logging
tracing

Database level shared memory is used for the following functions:

buffer pools
the lock list
the sort area
database heap (log buffer, catalog cache)
package caching
utility heap (backup and restore)

Locking and Backup

DB2 acquires a lock on every record being updated. DB2 has a call level interface (CLI) which is a C/C++ API for database usage. The LOCK TABLE command locks all records in a table until the transaction is COMMITed. You can set the LOCKTIMEOUT parameter. When a wait reaches the LOCKTIMEOUT value, a SQL0911 will be issued and the transaction rolled back.

DB2 has a deadlock detector which randomly chooses one of the transactions causing the deadlock, and it rolls that transaction back. DB2 maintains a lock list. When the list gets full, it shall replace all row level locks with one table level lock. This is called Lock Escalation.

Here are some guidelines to avoid deadlock problems:

COMMIT often
Issue COMMITs even when you are only doing reads
Specify FOR FETCH ONLY (also known as FOR READ ONLY)
Perform any database changes near the end of your transaction

DB2 has a load utility. It only has the ability to insert data into a table, but not a view. This technique is faster than an import of data. It writes directly to the database. Constraint checking is skipped. However the load will reject any rows that do not comply with the table's column definitions. The db2look utility will generate the DDL for any given database object. The RUNSTATS utility updates statistics for tables and indexes.

The database transaction log keeps track of changes to database objects and data. The ROLLFORWARD command will replay the log for the database to catch back up after a crash. DB2 has the ability to copy a database by writing a copy of all data to disk. This is called "splitting a mirror". There is an INSPECT utility which shall check for data integrity problems and inconsistencies in the database.

DB2 Storage, SQL, Security

The smallest storage unit in a DB2 database is a page. A row in a table cannnot span across multiple pages. Consecutive pages are grouped together into extents. Tables and indexes are stored in table spaces. These tables spaces are logically divided into containers. You can issue the LIST TABLESPACES command to see their names. Not every database action results in the database actually going to the disk. Data is cached in memory called the Database Buffer Pool.

Note that autocommit is on by default in the following tools: the Command Line Processor, the Command Windows, and the Command Editor. You can limit the number of rows returns in a SELECT statement by appending criteria like FETCH FIRST 3 ROWS ONLY to the WHERE clause.

Inner joins in DB2 work just like they do in Oracle. You add a WHERE clause criteria joining columns from the two tables, separated by an equals sign. However outer joins in DB2 have a syntax which differs from Oracle. You use the LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN keywords in the WHERE clause. Hopefully these are self explanatory.

DB2 has a function that can be used like this: COALESCE(colname, 0). It acts like the NVL function in Oracle. If the "colname" column has a NULL value, the value supplied (0 in this case) is substituted instead of the NULL. DB2 has ROWNUMBER() OVER() which acts like ROWNUM in Oracle. It is a pseudo column which is a number representing the row number in the results rowset.

You can perform set operations on 2 or more query results sets. UNION ALL will include all duplicate rows. And INTERSECT will only return unique values.

PUBLIC is a special group that everyone belongs to. When granting privileges, WITH GRANT OPTION will allow the grantee to also grant the privilege to other users as well. A user requires the table space USE privilege to create tables in that table space. However users only need EXECUTE privilege for packages they wish to run. They do not need privileges for each of the objects referenced by the package. To see the list of privileges you have, issue the GET AUTHORIZATIONS command.

Client/Server, DB Objects

If you have the DB2 Connect software installed, then you have access to the Database Connection Services (DCS) directory. It contains the information you need to connect to hosts such as OS/390, z/OS, or iSeries. Issuing the "list db directory" command will show you the databases that you can connect to. Then to connect to one of these databases you issue the command CONNECT TO dbname USER username USING password.

Alternatively, you can use a DB2 Connect Gateway which sits between the client and host. This gateway machine requires DB2 Connection Enterprise Edition. You can use Search Discovery to look for DB2 servers on the network. When setting up a new machine to connect to remote hosts, you can use the Configuration Assistant tool to export the configuration of a client which is already set up. The configuration info gets stored in an access profile file. You can then import this on the machine you are setting up to speed configuration.

Next I want to share a little bit about the objects in an IBM DB2 database. A buffer pool is memory which caches recently used database information. There are two types of table spaces: system-managed and database-managed. The system managed table spaces are managed by the operating system. While the database-managed ones are managed by DB2. SYSCATSPACE is the table space containing the system catalogs, which store the information on database objects. The system catalog tables use the following schemas: SYSIBM, SYSCAT, and SYSSTAT.

When specifying database objects, the database uses the authorization ID by default if you do not explicitly state the schema name. The authorization ID is the one used when connecting to the database. Regarding database column types, CHAR has a maximum length of 254 characters. And the TIMESTAMP type encompasses both date and time down to the nanosecond granularity.

You can clone a table by issuing a CREATE TABLE tablename LIKE sourcetable. This does not copy the constraints, indexes, or trigger of the source table. It also does not copy the table data. It only duplicates the table structure. Another way to clone a table is to issue a CREATE TABLE tablename AS (SELECT * FROM sourcetable) DEFINITION ONLY.

The identity column is one which automatically generates unique numbers for sequences. There can be a maximum of one identity column per table. The syntax for creating the column is columname INTEGER GENERATED ALWAYS AS IDENTITY. The values used by the identity column do not get affected by a rollback. Once such a sequence is generated, that number is used up regardless when whether you COMMIT or ROLLBACK.

You can implement unique constraints on a table by creating a unique index on it. Materialized Query Tables (MQTs) are tables created with data from a query. Prior to DB2 version 8, MQTs were known as automatic summary tables (ASTs). Temporary tables in DB2 exist for one connection.

Disks which hold DB2 data are divided up into extents. A table scan is when DB2 scans all data pages in all extents. A view is a virtual table. The view definition is stored in SYSCAT.VIEWS. The INSTEAD OF trigger performs operations on the actual tables which back a view. Stored procedures can be written in the C, Java, and SQL PL languages. A fenced stored procedure runs in its own address space separate from the DB2 engine. Fenced stored procedures run slower than unfenced ones, but provide more protection.

Instances and Databases

The key environment variables in DB2 are DB2INSTANCE and DB2NODE. As mentioned in previous post Sharing More Basics, the central repository is the DB2 profile registry. This has nothing to do with the Microsoft Windows registry. The command db2ilist will give you a list of all instances. You have the ability to set any of the profile registry values using environment variables. However it is strongly recommended that you use the db2set command.

The instance is the environment in which databases are contained in DB2. However you should note that dropping an instance does not automatically drop the databases contained in that instance. The default instance is aptly named "DB2" in the Windows version of IBM DB2. The instance level is also known as the Database Manager level.

There is a background process called the Database Administration Server (DAS). It provides the ability to manager the database with GUI tools. In versions of DB2 prior to 8.2, the DAS used to be a separate instance.

DB2 Tools

IBM calls fixed to its database product Authorized Program Analysis Reports (APARs). These fixes are distributed in FixPaks.

You can open a DB2 command window by issuing the db2cmd command. The DB2 command line processor (CLP) operates in interactive or non-interactive modes. In interactive mode, you can enter commands directly. In non-interactive mode, you must precede each command with "db2". You may also need to enclose commands in quotes when using the non-interactive mode of the CLP. The backslash character allows multiple line command, and these lines are terminated with the terminator character. The default terminator character is the exclamation point.

Developers can code functionality in User Defined Functions (UDFs). These functions can be written in C/C++, Java, or SQL PL languages. SQL PL is IBM's extension to the SQL standard.

Now for some miscellaneous information. The db2pd command invokes a statistics tool. A satellite is a DB2 server that is a member of a group of other servers with are similarly configured.

Sharing More Basics

Issuing the db2 command invokes the Command Line Processor (CLP) tool. In the world of DB2, the instance is the environment in which databases run. The DB2INSTANCE environment variable let's you know the current active instance. There is a Database Administration Server (DAS) which lets you administer your database with GUI tools.

DB2 settings are stored in a repository called the DB2 Profile Registry. Note that this has nothing to do with the Microsoft Windows registry. Global variables in the profile registry apply to all DB2 instances. You can issue the "db2set -all" command to enumerate all the DB2 registry variables.

There is a file called the Database Manager Configuration file that stores instance level variables. Database level variables are stored in the Database Configuration file. There are a number of default table spaces created in a DB2 database:

SYSCATSPACE - holds the catalog (data dictionary)
TEMPSPAE1 - system temporary space
USERSPACE1 - default user table space

As mentioned in my first post sharing what I learned, the Enterprise Server Edition of DB2 support partitioning. This partitioning applies to the database level (as opposed to the instance).

Sharing What I Learned

Previously I had thought my next assignment was going to be a in DB2 shop. So I did a lot of reading, and recently started playing around with a DB2 installation on my machine. I thought I would share some of the basic information on DB2 I learned. It is broad in scope. But it is not too deep as I am still very much a beginner.

The current version of DB2 that I am running is 9.5. However DB2 version 8.2 was code named "Stinger". I installed DB2 on Microsoft Windows XP Professional. This is a version that shares a code base with their Linux and UNIX versions. This code base is known as DB2 UDB LUW (Linux UNIX Windows).

The targeted platform for the project I was going to join was the z/OS. The version of DB2 for that platform is a different code base than the DB2 UDB LUW. The version I installed was DB2 Personal Edition. However there is also an Enterprise Server Edition (ESE) that supports database partition feature (DPF).

Back to Oracle

I started trying to learn IBM DB2 because my next project was going to be one backed by a DB2 database. Through some unfortunate set of events, it looks like that project will not be picking me up. Therefore I am going back to the project I have worked on for the last 8 years or so. And that project is all Oracle.

Therefore I imagine I will not be posting as much on this blog any more. Maybe in my last 2 weeks of employment at my current company I shall play around with the sample database I installed. And if I find anything interesting, I will share it here. However I will not be getting to dig into the details of how things work in the world of DB2. It is a shame. Finally after installing DB2 for Windows 3 times, I got to the point where everything seemed to be working. Oh well. C'est la vie.

Perhaps I will write a couple posts on everything I learned from the book Understanding DB2. It was chock full of information for newbies such as me.

Third Time is a Charm

This may be hard to believe. But I uninstalled and reinstalled IBM DB2 on my machine for the 3rd time. And guess what? This time it is looking pretty good.

I chose DB2 Personal Edition. For the DB2 Administration Server, I selected my domain account. It seemed strange that I needed to enter my password for this account though. Is DB2 storing my password somewhere? What if my password changes? I certainly hope this does not break the DB2 database.

When the install was complete, it reminded me to use port number 50000 to connect to the DB2 instance. I don't think I will need to remember this information. I am running a copy of the database locally. Why would I need to know what port the database is listening on? I am already on the machine when I run programs which need to connect to the database. Strange...

Now here is where the beauty comes in. I chose to install the sample database after the installation was complete. To my complete surprise, the sample database actually installed without error. This is a first. I guess by now it does not take much to impress me. The installation of the sample database took over 1 minute. So I was a bit worried. However I never got that far in the past, so I figured something must have been going right.

Currently I am using the Command Line Processor to issue SQL commands on the data in the sample database. It helps to have the Control Center open to spy on the structure of the tables in the sample database. My background is in the Oracle database. However I am finding that my basic SQL skills are pretty much still valid in DB2. I believe the next step is to review the important parts of the book "Understanding DB2" that I recently finished. It is my hope that I now have all the access to execute the commands that gave me errors before.

Return of Problems

Having reinstalled DB2 on my machine, I was hoping to get some practice with the database. These hopes came crashing down pretty quickly. I chose to install the sample database as part of the post-installation "First Steps". This choice led to an SQL 1092N. Apparently my domain user does not have the authority to create the sample database. Clicking on the SAMPLE database overview link was not encouraging. It just took me to a web page that stated, "The topic that you have requested is not available." LOL.

I thought maybe there was something wrong specific to the sample database. So I tried to use the DB2 Control Center to create my own database. This led back to the same SQL 1092 N. My next theory was that maybe my domain account did not have the correct rights. So I checked and found that I am a member of administrators on my machine. I am also a member of the groups DB2ADMNS and DB2USERS. These groups should be defunct since I chose to not use Windows groups for DB2 security. Now I was completely confused.

As a last ditch effort, I logged into my local machine with the db2admin account I specified when installing DB2 this last time. I created directory C:\DB2 and chose it as Default Path for the database I was creating. This resulted in a new error stating that the "path does not exist". Well I went back one last time and chose directory C:\ as the Default Path. Finally a database got created. The command completed successfully. But I don't want to log onto my workstation locally every time I create a database. And I bet I cannot log in with my normal ID and do anything with this database. Is it time to reinstall again, choosing my domain account as the User Information provided for the DB2 Administration Server? Who knows?

Uninstall Time

I decided it was time to unintall IBM DB2 from my computer. There were too many problems, based in part due to choices made during installation. I was a bit worried that maybe I would not be able to reinstall due to my trial license being over. There was one warning message during uninstall - "DB2 databases will not be removed". That was fine. I later used Windows Explorer to manually delete the database files. It was surprisingly quick to uninstall the whole DB2 software. It took a couple minutes total with no reboot of my PC needed.

Luckily the reinstall worked pretty well. At first I was back to the same issues when I first did an install. But I reviewed my post on Installing DB2. That helped me decide to choose Personal Edition of DB2. And this time around I chose a Typical installation, and accepted almost all the defaults. For example I chose a local user account of db2admin. However I once again ran into problems choosing a password for this account. It kept giving me errors like I mentioned in DB2 Install Continued. Note to self - always choose a short password that is complex like Pass2008 (upper case, lower case, numbers).

This time around I skipped installation of the DB2 Tool Catalog. Apparently this is for the "Task Center" and "Scheduler". Not sure what exactly these 2 items do. But I won't need any backups for my test database. So I hope this was the right decision. I also decided not to mess around with the notifications option. The most important install decision was to disable operating system security. This got me into so much trouble after the last install.

As a last step bonus, I decided to install the IBM Database Add-Ins for Visual Studio 2005. This step took a long time. But I have recently installed Visual Studio 2005 on my machine. So I hope to take advantage of this feature.

IBM Tools