I had a few problems with a customer’s deployment of Sametime 9 which probably come down to deployment plans and the order of the servers being installed.

During installation I had problems detailed in “System version is null” on new IBM Sametime Video Manager installation which forced me to uninstall the VMGR and install again with a new deployment plan. The outcome of this was that I could not administer the default policies nor create new Media Manager policies in the SSC, I saw the following error, “AIDSC#####: Could not connect to Sametime Video Manager. Either VMgr is not installed or server is not up. Please retry after installing VMgr or starting it.”

1

I saw in the deployment manager SystemOut.log “[17/10/14 17:02:04:101 BST] 00000220 SametimeVmgrU E   Forbidden” but nothing much else to write home about.

I raised a PMR with IBM and gathered some trace and sent it off. The PMR ended up with Ankit Vij in L3 who worked as a developer on the propagation of policies from the SSC to the VMGR.

After some to’ing and fro’ing it was identified that there were missing credentials in the DEPLOYMENT table of the SSC database. In the DEPCONF  column of the Conference Manager deployment plan lies XML data. In the data are two fields VMGRUSER and VMGRPASSWORD. In the customer’s data these values were empty, this is why the SSC couldn’t access the VMGR’s policies.

There are few ways in which to edit the data, Data Studio is nice and easy and can export the table, edit it and then import it again in no time at all but as I was accessing their environment using Citrix this wasn’t an option because I couldn’t install any software. Using the CLI was the only way to do it.

My first attempts of using the DB2 EXPORT command failed because the tables have LOBs which are truncated when you export the data to a csv file. The way around it is to export to a csv file but also export all the data to LOB files. This can be achieved using the following command.

C:\Windows\system32>db2 “export to d:\export\deployment.csv of del lobs to d:\export\lobs\ modified by lobsinsepfiles select * from ssc.deployment”
SQL3104N  The Export utility is beginning to export data to file
“d:\export\deployment.csv”.

This produces a csv. Where there was LOB data a .lob file is produced and the csv details which number .lob file holds the information for that particular entry.

Once I had found the .lob file referenced for the Conference Manager deployment plan in the DEPCONF  column I had to copy the contents of the .lob to a new text file.

The VMGRUSER and VMGRPASSWORD values were empty so I then updated them with wasadmin (could be admin/admin) and the password associated with it.

Next I had to add to the beginning of the xml data UPDATE SSC.DEPLOYMENT SET DEPCONF=’ and to the end ‘ WHERE DEPID=’14908a6aa1d-00000000000a-MediaDep’

The DEPID is easy to come about and is listed DEPID column for the Conference Manager deployment plan.

The end result is a single line containing 18000+ characters looking something like this.

UPDATE SSC.DEPLOYMENT SET DEPCONF='<?xml version=”1.0″ encoding=”UTF-8″?>………………………….</parameter></parameters></Config>’ WHERE DEPID=’14908a6aa1d-00000000000a-MediaDep’

As the command was too large to paste into the CLI I saved it to a .sql file.

I stopped STConsoleServer, the node agent and the deployment manager.

Before changing the database I needed to back it up.

C:\Windows\system32>db2 backup database stsc
SQL1035N  The database is currently in use.  SQLSTATE=57019

I then needed to force the application connections from the database.

C:\Windows\system32>db2 list applications

Auth Id  Application    Appl.      Application Id                                                 DB       # of
Name           Handle                                                                    Name    Agents
——– ————– ———- ————————————————————– ——– —–
DB2ADMIN db2jcc_applica 41961      192.168.x.x.49442.141124093130                              STSC     1
DB2ADMIN db2jcc_applica 45374      192.168.x.x.61230.141125142939                              STMS     1
DB2ADMIN db2jcc_applica 45483      192.168.x.x.61666.141125152718                              STMS     1
DB2ADMIN db2jcc_applica 41949      192.168.x.x.49385.141124093116                              STMS     1

C:\Windows\system32>db2 force application(41961)
DB20000I  The FORCE APPLICATION command completed successfully.
DB21024I  This command is asynchronous and may not be effective immediately.

After all applications are disconnected I could run the backup.

 C:\Windows\system32>db2 backup database stsc

Backup successful. The timestamp for this backup image is : 20141125154621

C:\Windows\system32>db2 connect to stsc

   Database Connection Information

 Database server        = DB2/NT64 10.1.0
 SQL authorization ID   = DB2ADMIN
 Local database alias   = STSC

At this point I am going to run the UPDATE command using the .sql file I created.

C:\Windows\system32>db2 -vf C:\DB2\ssc.sql

DB20000I  The SQL command completed successfully.

Normally I would run db2 -tvf but that didn’t work, I think because I didn’t use semicolons for delimiters in the .sql file. Anyway, it worked.

I started the deployment manager, node agent and STConsoleServer and I could now edit the Media Manager policies.

Many thanks to Imran and Ankit at IBM for helping me through this frustrating but interesting problem.

A customer wanted to use a series of nested groups to populate Profiles. The theory is that the parent group has a number of child groups which are controlled by various location specific administrators.

Initially I hoped to be able to achieve this by using a special query (LDAP_MATCHING_RULE_IN_CHAIN) which would walk to the root and thus include all members of the nested groups.

“(&(objectClass=user)(member:1.2.840.113556.1.4.1941:(=CN=IBM Connections Users,DC=acme,DC=com)))”

I couldn’t get this to work using ldapsearch so I had their AD admins investigate. They too could not get it to work so the work around was to add all the groups to the source_ldap_search_filter value in profiles_tdi.properties. The search filter consisted of over 26000 characters!!

On running sync_all_dns I saw failures (after enabling debug from Profiles MustGather) in the ibmdi.log. The errors matched Population or Synchronization fails trying to update the Peopledb with the error SQLCODE: -302, SQLSTATE: 22001

I was hesitant to go ahead and make the changes details and read else where that LONG VARCHAR is deprecated in 9.7 potentially meaning CLOB datatype was required. I raised a PMR with IBM to check and they came back and said that there is no change made to PROF_SOURCE_URL in the later versions of Connections. The one pain could be if a side by side migration to 4.5 (or later) is performed as the new database will have VARCHAR as the datatype. With some forward planning this migration failure can be removed.

My DB2 colleague and I decided to back up PEOPLEDB and then run the following command to dump the definitions of the database before altering the datatype.

db2look -d PEOPLEDB -f -l -e -x > D:\db2look_peopledb_prechange.txt

To alter the datatype we used the Control Center which gave us the below SQL.

CONNECT TO PEOPLEDB;
CALL SYSPROC.ALTOBJ ( ‘APPLY_CONTINUE_ON_ERROR’, ‘CREATE TABLE EMPINST.EMPLOYEE ( PROF_KEY VARCHAR (36)  NOT NULL , PROF_UID VARCHAR (256)  NOT NULL , PROF_UID_LOWER VARCHAR (256)  NOT NULL , PROF_LAST_UPDATE TIMESTAMP  NOT NULL , PROF_MAIL VARCHAR (256) , PROF_MAIL_LOWER VARCHAR (256) , PROF_GUID VARCHAR (256)  NOT NULL , PROF_SOURCE_UID VARCHAR (256)  NOT NULL , PROF_DISPLAY_NAME VARCHAR (256) , PROF_LOGIN VARCHAR (256) , PROF_LOGIN_LOWER VARCHAR (256) , PROF_GIVEN_NAME VARCHAR (128) , PROF_SURNAME VARCHAR (128) , PROF_ALTERNATE_LAST_NAME VARCHAR (64) , PROF_PREFERRED_FIRST_NAME VARCHAR (32) , PROF_PREFERRED_LAST_NAME VARCHAR (64) , PROF_TYPE VARCHAR (64) , PROF_MANAGER_UID VARCHAR (256) , PROF_MANAGER_UID_LOWER VARCHAR (256) , PROF_SECRETARY_UID VARCHAR (256) , PROF_IS_MANAGER CHARACTER (1) , PROF_GROUPWARE_EMAIL VARCHAR (128) , PROF_GW_EMAIL_LOWER VARCHAR (128) , PROF_JOB_RESPONSIBILITIES VARCHAR (128) , PROF_ORGANIZATION_IDENTIFIER VARCHAR (64) , PROF_ISO_COUNTRY_CODE VARCHAR (3) , PROF_FAX_TELEPHONE_NUMBER VARCHAR (32) , PROF_IP_TELEPHONE_NUMBER VARCHAR (32) , PROF_MOBILE VARCHAR (32) , PROF_PAGER VARCHAR (32) , PROF_TELEPHONE_NUMBER VARCHAR (32) , PROF_WORK_LOCATION VARCHAR (32) , PROF_BUILDING_IDENTIFIER VARCHAR (64) , PROF_DEPARTMENT_NUMBER VARCHAR (24) , PROF_EMPLOYEE_TYPE VARCHAR (256) , PROF_FLOOR VARCHAR (16) , PROF_EMPLOYEE_NUMBER VARCHAR (16) , PROF_PAGER_TYPE VARCHAR (16) , PROF_PAGER_ID VARCHAR (32) , PROF_PAGER_SERVICE_PROVIDER VARCHAR (50) , PROF_PHYSICAL_DELIVERY_OFFICE VARCHAR (32) , PROF_PREFERRED_LANGUAGE VARCHAR (100) , PROF_SHIFT VARCHAR (4) , PROF_TITLE VARCHAR (256) , PROF_COURTESY_TITLE VARCHAR (64) , PROF_TIMEZONE VARCHAR (64) , PROF_NATIVE_LAST_NAME VARCHAR (256) , PROF_NATIVE_FIRST_NAME VARCHAR (256) , PROF_BLOG_URL VARCHAR (256) , PROF_FREEBUSY_URL VARCHAR (256) , PROF_CALENDAR_URL VARCHAR (256) , PROF_DESCRIPTION CLOB  (1048576   )  LOGGED  NOT  COMPACT , PROF_EXPERIENCE CLOB  (1048576   )  LOGGED  NOT  COMPACT , PROF_SOURCE_URL “LONG VARCHAR” , PROF_SRC_UID_LOWER VARCHAR (256)  NOT NULL , TENANT_KEY VARCHAR (36)  NOT NULL  WITH DEFAULT ‘00000000-0000-0000-0000-040508202233′ , PROF_STATE INTEGER  NOT NULL  WITH DEFAULT 0   ) IN USERSPACE32K INDEX IN USERSPACE4K LONG IN USERSPACE32K ‘, -1, ? );
CONNECT RESET;

After running this the datatype of the column changed to LONG VARCHAR.

We run the definitions dump again and compared the contents with the pre-change information. The contents were the same albeit in a different order.

db2look -d PEOPLEDB -f -l -e -x > D:\db2look_peopledb_prechange.txt

This gave us confidence to continue and started Connections. At which point the sync_all_dns completed successfully and the users in the nested groups are populated to Profiles. Checking EMPLOYEE.PEOPLEDB shows the very long search filter in the PROF_SOURCE_URL for those that were added recently.

Thankfully I knew that I was going to come across one of these problems. When upgrading the policies need to also be upgraded and the wiki describes the process well. Prior to starting the upgrade I tested the approach of editing the LDAP document in the SSC but got the following error

??? nl.ErrorResourceMessages|CWWIM5044E The ******* repository cannot be deleted because it has at least one base entry that is referenced by a realm. [en] ???

To enable SSO to work across Portal, Connections, Domino and Sametime I had to configure a realm which was in line with the other environments as well as make some modifications to my wimconfig.xml. The error above is telling me that it doesn’t like the fact that the realm is different to defaultWIMFileBasedRealm, (more information about how to do this can be found in this Technote).

This had me concerned that this could jeopardize the upgrade so I read around and found the following IBM documents which are worth bearing in mind, LO70452 and LO56702. The former suggested that I change the realm back to defaultWIMFileBasedRealm as it breaks the LDAP guided activity which is what I need to step through. By changing the realm name I saw synchronisation errors with the nodes but that was quickly rectified after the policies were upgraded and the realm reverted to what it was. I ensured that I shut down all the nodes and their node agents and ensured synchronisation was working.

The other problem I came across was just after the clustered Meeting servers were upgraded to 8.5.2 is detailed in After upgrading to 8.5.2, users are no longer able to enter Sametime Meeting Rooms. The instruction in the Technote suggest that you run some DB2 commands to purge the contents of the POLICY.TEMPLATE and POLICY.ASSIGNMENT tables in STSC database. That is all good and well but you will lose ALL your policies so be very careful about doing that.

You can of course take a back up of the database as well as manually making a not of all your policies and their settings.
Below are a couple of commands that may make it easier for you.

db2 “select count(*) as rows from POLICY.ASSIGNMENT” and db2 “select count(*) as rows from POLICY.TEMPLATE”
The above will allow you to check that the number of rows and when run afterwards it checks that it has worked.

This allows you to dump the data out to a text file. This doesn’t give you the configuration of the policy just those who are assigned to them
db2 “select * from POLICY.TEMPLATE” > /tmp/policy.template.original.txt
db2 “select * from POLICY.ASSIGNMENT” > /tmp/policy.assignment.original.txt

This is the command to purge the contents
db2 “delete from POLICY.TEMPLATE”
db2 “delete from POLICY.ASSIGNMENT”

It worked for me BUT I had to recreate the policies.

As part of my plans to upgrade a customer’s environment for Sametime 8.5.1 to 8.5.2.1 I wanted to get ahead of myself by creating the database used for iOS devices. When trying to create the database using the createProxyDb.sh I got a failure.

“SQL1363W warning: One or more of the parameters submitted for immediate modification were not changed dynamically. For these configuration parameters, all applications must disconnect from this database before the changes become effective.”

I thought that maybe a stop and start of DB2 was required but before failing over the two HADR members I Googled and found Failure creating Sametime Proxy database on DB2 HADR system which describes what is (not) happening and a resolution….. for Windows

I’m not overly hot with writing shell scripts but I managed to get there in the end on a RHEL server and have pasted the amended shell script for those who come across this.

PS – I ran the shell script with the following c0mmand so that I could see all the output.

bash -xv ./createProxyDbHADR.sh

#
# US Government Users Restricted Rights – Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with
# IBM Corp.
#
# *****************************************************************

# This script will create the database and run the schema creation script

# exec 1>/dev/null

PATH=/usr/bin:/usr/ucb:$PATH; export PATH
+ PATH=/usr/bin:/usr/ucb:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/opt/IBM
/db2/V9.5/db2inst1/sqllib/bin:/opt/IBM/db2/V9.5/db2inst1/sqllib/adm:/opt/IBM/db2
/V9.5/db2inst1/sqllib/misc:/opt/IBM/db2/V9.5/db2inst1/sqllib/db2tss/bin:/opt/IBM
/db2/V9.5/db2inst1/bin
+ export PATH
PRG=`basename $0`
basename $0
++ basename ./createProxyDbHADR.sh
+ PRG=createProxyDbHADR.sh

STDDL=proxyServer.ddl
+ STDDL=proxyServer.ddl

if [ “$2″ = “” ]; then
        echo “Insufficient arguments”
        echo
        echo “Usage: $PRG DatabaseName DbUserName”
        echo ”  DatabaseName is the name of the database this script will”
        echo ”  create.  DbUserName is the name of the database user that”
        echo ”  will be used to connect to the DB2 server from the”
        echo ”  Sametime Proxy Server.”
        exit 1
fi
+ ‘[‘ db2inst1 = ” ‘]’

if [ ! -f “${STDDL}” ]; then
        echo “${STDDL} is missing from the current directory”
        echo “Aborting database creation”
        exit 2
fi
+ ‘[‘ ‘!’ -f proxyServer.ddl ‘]’

echo “Processing…”
+ echo Processing…
Processing…
echo
+ echo

db2 CREATE DATABASE $1 USING CODESET UTF-8 TERRITORY US COLLATE USING UCA400_NO
+ db2 CREATE DATABASE STPROXY USING CODESET UTF-8 TERRITORY US COLLATE USING UCA
400_NO
DB20000I  The CREATE DATABASE command completed successfully.
#       AUTOCONFIGURE USING MEM_PERCENT 50 TPM 1500 \
#       ADMIN_PRIORITY PERFORMANCE ISOLATION UR APPLY DB AND DBM
command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

# adding deactivate as needed for HADR installation
db2 deactivate database $1 user db2inst1 using ********
+ db2 deactivate database STPROXY user db2inst1 using ********
SQL1496W  Deactivate database is successful, but the database was not
activated.

db2 update db cfg for $1 using LOGSECOND 20
+ db2 update db cfg for STPROXY using LOGSECOND 20
DB20000I  The UPDATE DATABASE CONFIGURATION command completed successfully.
SQL1363W  One or more of the parameters submitted for immediate modification
were not changed dynamically. For these configuration parameters, all
applications must disconnect from this database before the changes become
effective.
#command_return_code=$?
#if [ ${command_return_code} -ne 0 ]; then
#       echo
#       echo “Database creation failed”
#       exit ${command_return_code}
#fi

db2 update db cfg for $1 using LOGFILSIZ 10000
+ db2 update db cfg for STPROXY using LOGFILSIZ 10000
DB20000I  The UPDATE DATABASE CONFIGURATION command completed successfully.
SQL1363W  One or more of the parameters submitted for immediate modification
were not changed dynamically. For these configuration parameters, all
applications must disconnect from this database before the changes become
effective.
#command_return_code=$?
#if [ ${command_return_code} -ne 0 ]; then
#       echo
#       echo “Database creation failed”
#       exit ${command_return_code}
#fi

# adding activate as needed for HADR installation
db2 activate database $1 user db2inst1 using ********
+ db2 activate database STPROXY user db2inst1 using ********
DB20000I  The ACTIVATE DATABASE command completed successfully.

db2 CONNECT to $1
+ db2 CONNECT to STPROXY

   Database Connection Information

 Database server        = DB2/LINUX 9.5.1
 SQL authorization ID   = DB2INST1
 Local database alias   = STPROXY

command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

if [ “$2″ = `whoami` ]; then
        echo “*** $PRG:  skipping granting privileges to self”
        echo
else
        db2 GRANT DBADM ON DATABASE TO USER $2
        command_return_code=$?
        if [ ${command_return_code} -ne 0 ]; then
                echo
                echo “Database creation failed”
        exit ${command_return_code}
        fi
fi
whoami
++ whoami
+ ‘[‘ db2inst1 = db2inst1 ‘]’
+ echo ‘*** createProxyDbHADR.sh:  skipping granting privileges to self’
*** createProxyDbHADR.sh:  skipping granting privileges to self
+ echo

db2 CREATE BUFFERPOOL STPBP32K ALL NODES \
        SIZE 750 AUTOMATIC PAGESIZE 32768
+ db2 CREATE BUFFERPOOL STPBP32K ALL NODES SIZE 750 AUTOMATIC PAGESIZE 32768
DB20000I  The SQL command completed successfully.
command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

db2 CREATE LARGE TABLESPACE STP32K \
        PAGESIZE 32768 \
        MANAGED BY DATABASE \
        USING “( FILE ‘STPDATA32K_1′ 200M, FILE ‘STPDATA32K_2′ 200M)” \
        AUTORESIZE YES \
        INCREASESIZE 500M \
        MAXSIZE NONE \
        BUFFERPOOL STPBP32K
+ db2 CREATE LARGE TABLESPACE STP32K PAGESIZE 32768 MANAGED BY DATABASE USING ‘(
 FILE ‘\”STPDATA32K_1’\” 200M, FILE ‘\”STPDATA32K_2’\” 200M)’ AUTORESIZE YES
 INCREASESIZE 500M MAXSIZE NONE BUFFERPOOL STPBP32K
DB20000I  The SQL command completed successfully.
command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

db2 CREATE TEMPORARY TABLESPACE STPTEMP32K \
        PAGESIZE 32768 \
        MANAGED BY DATABASE \
        USING “( FILE ‘STPTEMPDATA32K_1′ 10M, FILE ‘STPTEMPDATA32K_2′ 10M)” \
        AUTORESIZE YES \
        INCREASESIZE 5M \
        MAXSIZE NONE \
        BUFFERPOOL STPBP32K
+ db2 CREATE TEMPORARY TABLESPACE STPTEMP32K PAGESIZE 32768 MANAGED BY DATABASE
USING ‘( FILE ‘\”STPTEMPDATA32K_1’\” 10M, FILE ‘\”STPTEMPDATA32K_2’\” 10M)’
AUTORESIZE YES INCREASESIZE 5M MAXSIZE NONE BUFFERPOOL STPBP32K
DB20000I  The SQL command completed successfully.
command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

db2 -tvf ${STDDL}
+ db2 -tvf proxyServer.ddl
CREATE TABLE STPROXY.STP_NWAYINV  ( NWAY_USERID VARCHAR(128) NOT NULL, NWAY_SEND
ERID VARCHAR(128) NOT NULL, NWAY_SENDERNAME VARCHAR(128) NOT NULL, NWAY_TIMESTAM
P TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP, NWAY_TOPIC VARCHAR(256) NOT
 NULL, NWAY_MESSAGE VARCHAR(1024) NOT NULL, NWAY_PLACEID VARCHAR(256) NOT NULL,
CONSTRAINT NWAY_PK PRIMARY KEY (NWAY_USERID, NWAY_PLACEID, NWAY_TIMESTAMP) ) IN
STP32K
DB20000I  The SQL command completed successfully.

CREATE TABLE STPROXY.STP_IMMSG  ( IM_USERID VARCHAR(128) NOT NULL, IM_PARTNERID
VARCHAR(128) NOT NULL, IM_PARTNERUID VARCHAR(128) NOT NULL, IM_PARTNERNAME VARCH
AR(256) NOT NULL, IM_TEXT CLOB (1M) COMPACT NOT NULL, IM_PLACEID VARCHAR(256) WI
TH DEFAULT NULL, IM_TIMESTAMP TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP,
 CONSTRAINT IM_PK PRIMARY KEY (IM_USERID, IM_PARTNERID, IM_TIMESTAMP) ) IN STP32
K
DB20000I  The SQL command completed successfully.

CREATE TABLE STPROXY.STP_ANNC  ( ANNC_USERID VARCHAR(128) NOT NULL, ANNC_SENDERI
D VARCHAR(128) NOT NULL, ANNC_MESSAGE CLOB (1M) COMPACT NOT NULL, ANNC_RESPONSE_
ALLOWED SMALLINT NOT NULL WITH DEFAULT 0, ANNC_SENDERNAME VARCHAR(128) NOT NULL,
 ANNC_TIMESTAMP TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP, CONSTRAINT AN
NC_PK PRIMARY KEY (ANNC_USERID, ANNC_SENDERNAME, ANNC_TIMESTAMP) ) IN STP32K
DB20000I  The SQL command completed successfully.

CREATE TABLE STPROXY.STP_DEVICETOKENS  ( DT_DEVICETOKEN VARCHAR(128) NOT NULL, C
ONSTRAINT DEVICETOKEN_PK PRIMARY KEY (DT_DEVICETOKEN) ) IN STP32K
DB20000I  The SQL command completed successfully.

command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

db2 disconnect $1
+ db2 disconnect STPROXY
DB20000I  The SQL DISCONNECT command completed successfully.
command_return_code=$?
+ command_return_code=0
if [ ${command_return_code} -ne 0 ]; then
        echo
        echo “Database creation failed”
        exit ${command_return_code}
fi
+ ‘[‘ 0 -ne 0 ‘]’

 

I was asked to assist a colleague who was upgrading a customer’s Sametime environment which had failed when upgrading the SSC to 8.5.2.

Initially when logging into the SSC the version was showing as 8.5.2 with WAS at 7.0.0.15 which is good but I couldn’t access policies nor any of the servers which is of course bad. Also users were getting a blank screen when accessing the meeting center with an error referring to policies.

The database tables showed that the SSC was at the old version too.

I checked the various SystemOut.logs and saw the following written to the STConsoleServer log file.

[28/06/12 18:30:20:419 BST] 00000032 PolicyBeanDAO E   Exception Failed to get the Policy for: default, for product: im
com.ibm.sametime.console.admin.plugins.StConsolePluginException: java.sql.SQLException: [ibm][db2][jcc][t4][2013][11249] Connection authorization failure occurred.  Reason: User ID or Password invalid.DSRA0010E: SQL State = null, Error Code = -4,214
at com.ibm.sametime.console.admin.plugins.policy.PolicyBeanDAO.iGetPolicyIdsForUserDN(PolicyBeanDAO.java:1495)
at com.ibm.sametime.console.admin.plugins.policy.PolicyBeanDAO.getPolicyForUserIDByProduct(PolicyBeanDAO.java:857)
at com.ibm.sametime.console.admin.plugins.policy.PolicyBeanDAO.getPoliciesForUserID(PolicyBeanDAO.java:818)
at com.ibm.sametime.console.admin.plugins.policy.PolicyBeanDAO.getPoliciesForDefault(PolicyBeanDAO.java:802)

So I checked the password for the DB2 administrative account by trying to connect to itself forcing the use of the password which I knew (since I built the servers).

db2 connect to stsc user db2admin
Enter current password for db2admin:
SQL30082N  Security processing failed with reason “24” (“USERNAME AND/ORPASSWORD INVALID”).  SQLSTATE=08001

So the password is wrong which I confirmed by attempting to change the password using passwd.

I thought I’d go a little further (since I had root!) and checked the date in which the password was changed and also who changed it by looking at the /var/log/secure files for the date shown below.

# passwd db2admin -S
db2admin PS 2012-06-20 0 99999 7 -1 (Password set, MD5 crypt.)

The first step was to change the password for db2admin back to what it was previously and recycle the SSC. This allowed me to access policies and users to continue using the meeting servers.

At first it was thought running update_85_SCDb.sh STSC db2admin might fix it but this fails and looking at the wrapper it calls update.ddl which essentially runs three SQL queries. This fails because the tables and columns referenced have already been updated.

I decided to try and replicate this on a lab deployment and during the upgrade I came across a few unrelated problems which were resolved by running through Upgrading Sametime System Console to 8.5.2 IFR 1 fails during installation and trying again.

Once I got the lab server to the same state I then tried registering the SSC again because these tables are updated by the registerProduct.sh -upgrade script. Before doing this I checked the DBAppPassword= parameter in /opt/IBM/WebSphere/SSC/STSCServerCell/console/productConfig.properties to ensure that the password listed was correct and run it. Low and behold the tables updated (below).

I also checked https://<<SSC URL>>:9443/console/deployment/ListOfAllProductDeployments to ensure that the version was correct there. I can now access all the Sametime component servers.

The crux of the problem was that the password detailed in /opt/IBM/WebSphere/SSC/STSCServerCell/console/productConfig.properties was different to that of the OS at the time of the upgrade.

A customer had a particularly difficult and awkward problem. A number of user reported that they could not see the status updates of others users who were part of the same network.

This seemed to affect a handful of users in the following pattern. User A cannot see User B’s status updates whilst User C can see User B’s updates. The problem is associated with User B. To c0mplicate things further User A shares a network with User B but does not follow him whilst User C also shares a network but does follow User B. So if you follow User B you get to see their status updates in your Status Updates. You should not need to follow the user to see their status updates the fact that you’re sharing a network should be sufficient.

I logged a PMR which thankfully found it’s way into the ever helpful hands of David McCarthy. An ISSC colleague mentioned that this had been seen in IBM’s deployment of Connections (w3) by Luiz Benietz and others so getting to the bottom of this would kill two birds with one stone.

IBM had me extract data from PEOPLEDB and HOMEPAGE databases using the DB2 commands below to provide the PERSON_ID of the users within HOMEPAGE to make sure it matches with the PROF_GUID within PEOPLEDB which for User B is E833339D-AED2-425F-8600-64CEFD85A3A5.

Comparison of NR_NETWORK showed that the data in there was correct.

PEOPLEDB.EMPLOYEE
PROF_GUID = E833339D-AED2-425F-8600-64CEFD85A3A5

HOMEPAGE.PERSON
PERSON_ID = 0bdd7880-8d9a-4fe1-ab13-77c54bb429bc
EXID = E833339D-AED2-425F-8600-64CEFD85A3A5

Using the PERSON_ID I was then able to retrieve data from the HOMEPAGE.NR_NEWS_STATUS_NETWORK table which details all the status updates posted by that PERSON_ID which in the HOMEPAGE.NR_NEWS_STATUS_NETWORK table is identified by ACTOR_UUID. This table also shows the READER_ID (detailed below) as well as a BRIEF_DESC column which shows the first 200 (or so) characters of the status update.

db2 “select * from HOMEPAGE.NR_NEWS_STATUS_NETWORK where ACTOR_UUID = ‘0bdd7880-8d9a-4fe1-ab13-77c54bb429bc‘”

HOMEPAGE.NR_NEWS_STATUS_NETWORK
ACTOR_UUID = 0bdd7880-8d9a-4fe1-ab13-77c54bb429bc
READER_ID = PERSON_ID of the person that is consuming the status updates ie those listed in the READER_ID column should be able to see the status updates.

To find out the names of the people who can read the updates you will need to use the READER_ID from HOMEPAGE.NR_NEWS_STATUS_NETWORK and find out the name of the user from the HOMEPAGE.PERSON by using the READER_ID converted to PERSON_ID.

db2 “select DISPLAYNAME from HOMEPAGE.PERSON where PERSON_ID = ‘714fbcd9-1740-477b-972d-a6456a35e4ea'”

I contacted the user and they could indeed see User B’s status updates.

To see how many people User B has in their network the following command was run which produced a value of about 140.

db2 “select count(*) from HOMEPAGE.NR_NETWORK where PERSON_ID = ‘0bdd7880-8d9a-4fe1-ab13-77c54bb429bc‘”

IBM asked for the following trace to be enabled whilst User B posted a status update and in the trace.log an exception was seen during the processing of one of the threads.

*=info: java.sql.*=all: com.ibatis.sqlmap.*=all: com.ibm.lconn.homepage.*=all: com.ibm.lconn.news.*=all

It was suggested that 3.0.1.1 be installed but that would not be possible. The following fixes were applied which aimed to improve that code path to handle the case where the use records in the Homepage database are out of sync.

3.0.1.0     LO63965     VBUT8LNH89     Enhancement to News aggregation service for better handling of exceptional cases where user records become inconsistent with Profiles

3.0.1.0     LO66468     SCRD8QCDDK     Profiles Sync issue. Too many requests made on Event consumption to Profiles colleague feed

Unfortunately the problem was still apparent after the iFixes so more debugging was enabled, this time com.ibm.lconn.news.*=all: com.ibm.lconn.events.*=all

Thankfully this time the trace.log provided much more valuable information.

[30/05/12 16:14:59:301 BST] 00000067 StatusUpdateS 3 com.ibm.lconn.news.service.impl.StatusUpdateService insertStatusUpdate Going to insert into db with a batch process this number of records: 144 and BUCKET_SIZE_BATCH_INSERT: 100
[ … snip …]
[30/05/12 16:14:59:308 BST] 00000067 NewsStatusNet 3 com.ibm.lconn.news.data.dao.impl.ibatis.NewsStatusNetworkDao insertBatch – doInSqlMapClient rowsaffected 100
[30/05/12 16:14:59:308 BST] 00000067 NewsStatusNet < com.ibm.lconn.news.data.dao.impl.ibatis.NewsStatusNetworkDao insertBatch RETURN 100
[30/05/12 16:14:59:312 BST] 00000067 StatusUpdateS < com.ibm.lconn.news.service.impl.StatusUpdateService insertBatch RETURN 100
[ Here – it’s missing the insert for the remaining 44 records ]
[30/05/12 16:14:59:312 BST] 00000067 StatusUpdateS 3 com.ibm.lconn.news.service.impl.StatusUpdateService insertStatusUpdate Going to insert into db with a batch process this number of records: 3 and BUCKET_SIZE_BATCH_INSERT: 100″

The response from L3 was:

The root cause is a coding error in a routine performing batch insert. The issue should be reproducible for people in the following conditions:

  • The number of people in the intersection of the sets (colleagues, followers)  for the author of the status update is > 100
  • The number of colleagues – number of people in the intersection is > 100 (this is what happen for this very specific PMR)
  • The number of followers – number of people in the intersection is > 100.

So the problem we are seeing is because the user had more than 100 people in their network and/or followers and the code would not add the 44 others to the database. pressing further I asked for an iFix for this which is too complicated and the only option is to go to 3.0.1.1 CR1.

“Upon further review – the issue appears to be indirectly fixed in 3.0.1.1….We performed a code review in 3.0.1.1 – this code path was entirely re-written. The final section of the batch is inserted in the new code in 3.0.1.1.”

Up until this point the problem still hasn’t been resolved as updating Connections has not been planned yet but IBM support are fairly confident that 3.0.1.1 CR1 will iron out this. With regards to w3, David McCarthy was able to track down one user who reported the problem initially who now (after the upgrade) can see all status updates.

A customer had caused themselves a bit of bother by restarting the OS before shutting down DB2 just thinking that it will deal with it….

After doing this only Files wasn’t working so I checked the SystemOut.log and saw the following error.

[5/8/12 15:28:07:729 GMT] 00000016 SchedulerTask E   CLFWY0327E: Unable to sync scheduler task information with configuration.
com.ibm.websphere.scheduler.SchedulerNotAvailableException: com.ibm.db2.jcc.am.SqlException: Table space access is not allowed.. SQLCODE=-290, SQLSTATE=55039, DRIVER=3.59.81
at com.ibm.ws.scheduler.TaskStoreImpl.findByName(TaskStoreImpl.java:1579)
at com.ibm.ws.scheduler.SchedulerImpl$7.run(SchedulerImpl.java:1912)
at java.security.AccessController.doPrivileged(AccessController.java:251)
at com.ibm.ws.scheduler.SchedulerImpl.findByName(SchedulerImpl.java:1881)
at com.ibm.ws.scheduler.SchedulerImpl.findByName(SchedulerImpl.java:1865)
at com.ibm.ws.scheduler.SchedulerImpl.findTasksByName(SchedulerImpl.java:1828)
at com.ibm.lconn.scheduler.admintasks.SchedulerTaskManager.syncSchedulerTasksWithConfig(SchedulerTaskManager.java:153)
at com.ibm.lconn.scheduler.service.Scheduler.init(Scheduler.java:74)
at com.ibm.quickr.share.platform.config.Platform.<init>(Platform.java:450)
at com.ibm.quickr.share.platform.config.Platform.init(Platform.java:669)
at com.ibm.quickr.share.platform.config.PlatformInitializer.contextInitialized(PlatformInitializer.java:28)
at com.ibm.ws.webcontainer.webapp.WebApp.notifyServletContextCreated(WebApp.java:1707)
at com.ibm.ws.webcontainer.webapp.WebApp.commonInitializationFinish(WebApp.java:380)
at com.ibm.ws.webcontainer.webapp.WebAppImpl.initialize(WebAppImpl.java:299)
at com.ibm.ws.webcontainer.webapp.WebGroupImpl.addWebApplication(WebGroupImpl.java:100)
at com.ibm.ws.webcontainer.VirtualHostImpl.addWebApplication(VirtualHostImpl.java:166)
at com.ibm.ws.webcontainer.WSWebContainer.addWebApp(WSWebContainer.java:731)
at com.ibm.ws.webcontainer.WSWebContainer.addWebApplication(WSWebContainer.java:616)
at com.ibm.ws.webcontainer.component.WebContainerImpl.install(WebContainerImpl.java:376)
at com.ibm.ws.webcontainer.component.WebContainerImpl.start(WebContainerImpl.java:668)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.start(ApplicationMgrImpl.java:1123)
at com.ibm.ws.runtime.component.DeployedApplicationImpl.fireDeployedObjectStart(DeployedApplicationImpl.java:1319)
at com.ibm.ws.runtime.component.DeployedModuleImpl.start(DeployedModuleImpl.java:610)
at com.ibm.ws.runtime.component.DeployedApplicationImpl.fireDeployedObjectStart(DeployedApplicationImpl.java:1319)
at com.ibm.ws.runtime.component.DeployedModuleImpl.start(DeployedModuleImpl.java:610)
at com.ibm.ws.runtime.component.DeployedApplicationImpl.start(DeployedApplicationImpl.java:944)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.startApplication(ApplicationMgrImpl.java:726)
at com.ibm.ws.runtime.component.ApplicationMgrImpl.start(ApplicationMgrImpl.java:2048)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.start(CompositionUnitMgrImpl.java:441)
at com.ibm.ws.runtime.component.CompositionUnitImpl.start(CompositionUnitImpl.java:123)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.start(CompositionUnitMgrImpl.java:384)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl.access$300(CompositionUnitMgrImpl.java:112)
at com.ibm.ws.runtime.component.CompositionUnitMgrImpl$CUInitializer.run(CompositionUnitMgrImpl.java:951)
at com.ibm.wsspi.runtime.component.WsComponentImpl$_AsynchInitializer.run(WsComponentImpl.java:349)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1604)
Caused by: com.ibm.db2.jcc.am.SqlException: Table space access is not allowed.. SQLCODE=-290, SQLSTATE=55039, DRIVER=3.59.81
at com.ibm.db2.jcc.am.dd.a(dd.java:679)
at com.ibm.db2.jcc.am.dd.a(dd.java:60)
at com.ibm.db2.jcc.am.dd.a(dd.java:127)
at com.ibm.db2.jcc.am.qm.b(qm.java:3900)
at com.ibm.db2.jcc.am.qm.a(qm.java:3882)
at com.ibm.db2.jcc.t4.cb.a(cb.java:793)
at com.ibm.db2.jcc.t4.cb.m(cb.java:763)
at com.ibm.db2.jcc.t4.cb.i(cb.java:246)
at com.ibm.db2.jcc.t4.cb.c(cb.java:54)
at com.ibm.db2.jcc.t4.q.c(q.java:44)
at com.ibm.db2.jcc.t4.rb.j(rb.java:147)
at com.ibm.db2.jcc.am.bn.ib(bn.java:2048)
at com.ibm.db2.jcc.am.cn.b(cn.java:3845)
at com.ibm.db2.jcc.am.cn.bc(cn.java:678)
at com.ibm.db2.jcc.am.cn.executeQuery(cn.java:652)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecuteQuery(WSJdbcPreparedStatement.java:1099)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.executeQuery(WSJdbcPreparedStatement.java:720)
at com.ibm.ws.scheduler.TaskStoreImpl.executeQueryWithRetry(TaskStoreImpl.java:1775)
at com.ibm.ws.scheduler.TaskStoreImpl$ExecuteQueryPriviledged.run(TaskStoreImpl.java:1662)
at java.security.AccessController.doPrivileged(AccessController.java:251)
at com.ibm.ws.scheduler.TaskStoreImpl.executePrivilegedQueryWithRetry(TaskStoreImpl.java:1831)
at com.ibm.ws.scheduler.TaskStoreImpl.findByName(TaskStoreImpl.java:1490)
… 32 more

I restarted the application but the same appeared so I checked DB2. The db2diag.log found by running db2 get dbm cfg | grep -i diag was huge so I archived it using b2diag -A and shut everything (Connections and DB2) down and then started it back up again.

Picking through the db2diag.log I found the following warning.

2012-05-10-09.30.00.093460+000 E46586E440          LEVEL: Warning
PID     : 26475                TID  : 1088047424   PROC : db2acd 0
INSTANCE: db2inst1             NODE : 000
FUNCTION: DB2 UDB, Health Monitor, HealthIndicator::update, probe:500
MESSAGE : ADM10502W  Health indicator “Table Space Operational State”
(“ts.ts_op_status”) is in state “Roll forward pending” on “table
space” “db2inst1.FILES   .FILESTS32″.

A bit of Googling found TSM Server Fails to Start with ANR9999D_3831306406, sqlCode=-290 with a suggestion of how to deal with it. When the OS’ rug was pulled out from underneath it DB2 was probably performing an action on this table space. When DB2 came back up it was marked as “pending roll forward” effectively saying, “hey is it safe to continue what I was doing?”

Running db2 “rollforward db FILES to end of logs tablespace(FILESTS32) online” replayed the logs and changed the status and on a restart of Files the application was accessible.

                                 Rollforward Status

Input database alias                   = FILES
Number of nodes have returned status   = 1

Node number                            = 0
Rollforward status                     = not pending
Next log file to be read               =
Log files processed                    =  -
Last committed transaction             = 2012-05-07-13.41.18.000000 UTC

DB20000I  The ROLLFORWARD command completed successfully.

Follow

Get every new post delivered to your Inbox.

Join 60 other followers