End to Surveys problems in IBM Connections 5.0?

I wrote a blog post, “Ongoing issues with Surveys (FEB) and IBM Connections”, which detailed some problems a customer was having with Surveys. This dragged on and resulted in a couple of PMRs being raised with IBM, but I am hopefully at the end of it now.

Recently IBM provided me with a modified .jar that produces additional output when the problem occurs. I needed to add it to the EAR file, which I did as follows:

# cd /opt/IBM/WebSphere/AppServer/profiles/Dmgr01/bin/
# ./wsadmin.sh -lang jython
wsadmin>AdminApp.export('Forms Experience Builder', '/tmp/Forms Experience Builder.ear')
# cp /tmp/Forms\ Experience\ Builder.ear /tmp/Forms\ Experience\ Builder.ear.orig
# mkdir /tmp/feb_expanded
# mkdir /tmp/feb_collapsed
# /opt/IBM/WebSphere/AppServer/bin/EARExpander.sh -ear /tmp/Forms\ Experience\ Builder.ear -operationDir /tmp/feb_expanded/ -operation expand
ADMA4006I: Expanding enterprise archive (EAR) file /tmp/Forms Experience Builder.ear to directory /tmp/feb_expanded/.
# mkdir /tmp/feb_backup
# mv /tmp/feb_expanded/builder.war/WEB-INF/lib/ibm.fsp.core.service.startup-8.0.1.35.jar/ /tmp/feb_backup/
# cp -R /home/ldap/BenW/17891.033.866.ibm.fsp.core.service.startup-8.0.1.81/ /tmp/feb_expanded/builder.war/WEB-INF/lib/ibm.fsp.core.service.startup-8.0.1.35.jar
# /opt/IBM/WebSphere/AppServer/bin/EARExpander.sh -ear '/tmp/feb_collapsed/Forms Experience Builder.ear' -operationDir /tmp/feb_expanded/ -operation collapse
ADMA4007I: Collapsing the contents of directory /tmp/feb_expanded/ to enterprise archive (EAR) file /tmp/feb_collapsed/Forms Experience Builder.ear.
Finally, update the current application using the ISC, pointing it to /tmp/feb_collapsed/Forms Experience Builder.ear and accepting the default values.
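Before updating the application it can be worth checking that the collapsed EAR really contains the replacement library. The following is only a rough sketch in Python, assuming the collapsed EAR stores builder.war as a nested archive; the function name war_entries is my own and is not part of any IBM tooling. If the jar was copied in as a directory (as the cp -R above suggests), its contents will show up as entries under that name rather than as a single file.

```python
import io
import zipfile


def war_entries(ear_path, war_name="builder.war", prefix="WEB-INF/lib/"):
    """Return the sorted entries under `prefix` inside the WAR nested in the EAR.

    `ear_path` may be a filesystem path or a binary file-like object.
    """
    # An EAR is a zip archive; read the nested WAR into memory.
    with zipfile.ZipFile(ear_path) as ear:
        war_bytes = ear.read(war_name)
    # The WAR is itself a zip archive, so open it from the in-memory bytes.
    with zipfile.ZipFile(io.BytesIO(war_bytes)) as war:
        return sorted(n for n in war.namelist() if n.startswith(prefix))


# Example use (path taken from the steps above):
#   for name in war_entries("/tmp/feb_collapsed/Forms Experience Builder.ear"):
#       if "ibm.fsp.core.service.startup" in name:
#           print(name)
```

Listing the entries this way is read-only, so it is safe to run against the EAR before pointing the ISC at it.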

After a period of time a different error appeared in the SystemOut.log. In the UI I could not create new surveys but I could complete existing ones, which was slightly different from what I was seeing when I raised the PMR. The exception was:

[7/11/16 10:34:12:359 BST] 00001b1e StandardExcep E com.ibm.form.nitro.platform.StandardExceptionMapper toResponse ac7d3dec-57f7-482f-83e5-9eaf77c82cbb
java.lang.RuntimeException: Error reading from /tmp/ibm.fsp.temp.1466513524000/fspjars, isDirectory = false, exists = false, canRead = false
at com.ibm.form.platform.service.startup.IsolatingClassLoader.getFileList(IsolatingClassLoader.java:1577)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.access$100(IsolatingClassLoader.java:47)…………..

I created /tmp/ibm.fsp.temp.1466513524000/fspjars and some functionality returned but it wasn’t until I restarted the JVM that it started to work properly.

IBM told me that the /tmp/ directory was being cleared out, which removed the aforementioned directory and caused the problem for FEB.
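To spot this condition quickly, the missing path can be pulled straight out of the exception text. This is a minimal sketch in Python that assumes the message always has the shape shown in the exception above; the helper names are hypothetical.

```python
import re
from pathlib import Path

# Matches the RuntimeException text seen in SystemOut.log above.
PATTERN = re.compile(
    r"Error reading from (?P<path>\S+), "
    r"isDirectory = (?P<is_dir>true|false), "
    r"exists = (?P<exists>true|false), "
    r"canRead = (?P<can_read>true|false)"
)


def missing_fsp_paths(log_text):
    """Return paths reported with 'exists = false', in order, without duplicates."""
    seen = []
    for match in PATTERN.finditer(log_text):
        path = match.group("path")
        if match.group("exists") == "false" and path not in seen:
            seen.append(path)
    return seen


def still_missing(paths):
    """Return the subset of `paths` that is not a directory on this machine right now."""
    return [p for p in paths if not Path(p).is_dir()]
```

Feeding the contents of SystemOut.log to missing_fsp_paths and then passing the result to still_missing on the server shows whether the directory FEB expects has disappeared again.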

After a bit of Googling I found that tmpwatch was clearing out files and directories in /tmp that hadn’t been touched for 10 days. To stop this I added the -X '/tmp/ibm.fsp.*' exclusion to the script below.

# vi /etc/cron.daily/tmpwatch
#! /bin/sh
flags=-umc
/usr/sbin/tmpwatch "$flags" -x /tmp/.X11-unix -x /tmp/.XIM-unix \
        -x /tmp/.font-unix -x /tmp/.ICE-unix -x /tmp/.Test-unix \
        -X '/tmp/hsperfdata_*' -X '/tmp/.hdb*lock' -X '/tmp/.sapstartsrv*.log' \
        -X '/tmp/ibm.fsp.*' -X '/tmp/pymp-*' 10d /tmp
/usr/sbin/tmpwatch "$flags" 30d /var/tmp
for d in /var/{cache/man,catman}/{cat?,X11R6/cat?,local/cat?}; do
    if [ -d "$d" ]; then
        /usr/sbin/tmpwatch "$flags" -f 30d "$d"
    fi
done
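To see which entries in /tmp would be candidates for removal, the age test can be approximated in a few lines. This is a simplified sketch in Python, not a reimplementation of tmpwatch: it only looks at the direct children of a directory, it assumes that the -umc flags mean the newest of the access, modification and change times is compared against the cutoff, and the function name stale_entries is my own.

```python
import fnmatch
import os
import time


def stale_entries(root, days, exclude_globs=(), now=None):
    """List direct children of `root` untouched for more than `days` days.

    An entry counts as touched at the newest of its atime, mtime and ctime.
    Paths matching any glob in `exclude_globs` are skipped, like -X patterns.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    stale = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if any(fnmatch.fnmatch(path, glob) for glob in exclude_globs):
            continue
        st = os.lstat(path)
        if max(st.st_atime, st.st_mtime, st.st_ctime) < cutoff:
            stale.append(path)
    return stale


# Example use:
#   stale_entries("/tmp", 10)                       # everything older than 10 days
#   stale_entries("/tmp", 10, ["/tmp/ibm.fsp.*"])   # same, with the FEB exclusion
```

If the ibm.fsp.temp directory appears in the first list but not in the second, the exclusion pattern is doing what was intended.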

After a few weeks the problem hadn’t manifested again and IBM told me that the cause of the initial PMR was the /tmp directory being emptied. I was dubious at first but then found https://developer.ibm.com/answers/questions/219765/periodically-my-feb-server-stops-working-properly.html which describes problems due to the /tmp directory being cleaned out.

As other stuff gets written to the /tmp directory, which is what WAS uses by default, I decided to use the java.io.tmpdir custom property to instruct WAS to use a directory under /opt/, where it won’t be cleaned by tmpwatch.

Fingers crossed this is the end of it.

Forcing TLSv1.2 breaks IBM Connections Surveys and Textbox.io

I had to force TLSv1.2 across all of Connections to fix a problem with RTE, as I detailed in “Rich Content widget stops working due to mix matched SSL protocols”, but after testing I’ve found that this breaks Surveys and, in Chrome, Textbox.io.

The process is well documented in “How to Force IBM Connections 5.5 CR1 to Use TLSv1.2”, but after making the changes the following happens.

Textbox.io

In IE and FF Textbox.io works fine but in Chrome the spell check service fails.

In a Fiddler trace I saw: Spelling server error: Could not load url “https://connections.acme.com/ephox-spelling/1/correction”: 500 Internal Server Error

In the SystemOut.log I saw

[6/29/16 10:00:38:507 BST] 00000200 SystemOut     O ironbark-akka.actor.default-dispatcher-17, RECV TLSv1 ALERT:  fatal, handshake_failure
[6/29/16 10:00:38:507 BST] 00000200 SystemOut     O ironbark-akka.actor.default-dispatcher-17, fatal: engine already closed.  Rethrowing javax.net.ssl.SSLException: Received fatal alert: handshake_failure

spray.can.Http$ConnectionException: Aborted
    at spray.can.client.HttpHostConnectionSlot.reportDisconnection(HttpHostConnectionSlot.scala:228) ~[spray-can_2.11-1.3.3.jar:na]
    at spray.can.client.HttpHostConnectionSlot$$anonfun$connected$1.applyOrElse(HttpHostConnectionSlot.scala:161) ~[spray-can_2.11-1.3.3.jar:na]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465) ~[akka-actor_2.11-2.3.9.jar:na]

IBM asked me to enable SSL trace (*=info:SSL=all). It seems that the client is sending TLSv1.0, which of course is not allowed now that TLSv1.2 has been forced.

[7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   ironbark-akka.actor.default-dispatcher-7, READ: TLSv1.2 Alert, length = 2
[7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   ironbark-akka.actor.default-dispatcher-7, RECV TLSv1 ALERT:  fatal, handshake_failure
[7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   ironbark-akka.actor.default-dispatcher-7, fatal: engine already closed.  Rethrowing javax.net.ssl.SSLException: Received fatal alert: handshake_failure
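With SSL trace enabled the log gets noisy, so it helps to summarise the received alerts per thread. This is a minimal sketch in Python that assumes the alert lines always look like the RECV lines shown above; the regular expression and the function name alert_summary are my own.

```python
import re
from collections import Counter

# Matches lines such as:
# [7/11/16 9:31:17:286 BST] 00000115 SystemOut     O   <thread>, RECV TLSv1 ALERT:  fatal, handshake_failure
ALERT = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\S+\s+SystemOut\s+O\s+"
    r"(?P<thread>[^,]+), RECV (?P<proto>\S+) ALERT:\s+"
    r"(?P<level>\w+), (?P<desc>\w+)"
)


def alert_summary(lines):
    """Count received alerts by (thread, protocol label, description)."""
    counts = Counter()
    for line in lines:
        match = ALERT.match(line)
        if match:
            key = (match.group("thread").strip(), match.group("proto"), match.group("desc"))
            counts[key] += 1
    return counts


# Example use:
#   with open("SystemOut.log", errors="replace") as fh:
#       for key, count in alert_summary(fh).most_common():
#           print(count, key)
```

Grouping by thread name makes it easier to see which component (here the ironbark-akka dispatcher threads) is on the receiving end of the handshake failures.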

IBM have logged a ticket with Ephox as well as investigating it from their end.

Surveys

When in a community with previous surveys I could not see any of the historical surveys, nor could I create new ones.

In the SystemOut.log I saw the following

[6/29/16 10:01:56:542 BST] 0000033b StandardExcep E com.ibm.form.nitro.platform.StandardExceptionMapper toResponse eaa8e54e-7c38-4edb-a5ca-bcbd6d7f6c64
                                 com.ibm.form.platform.service.framework.exception.ServicesPlatformException: com.ibm.connections.directory.services.exception.DSException: com.ibm.connections.directory.services.exception.DSOutOfServiceException: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

Caused by: com.ibm.connections.directory.services.exception.DSException: com.ibm.connections.directory.services.exception.DSOutOfServiceException: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

Again, it seems like it’s sending TLSv1.0.
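One way to corroborate this from outside the JVM is to check which protocol versions the front end will actually negotiate. This is a hedged sketch using Python’s standard ssl module; the function names are hypothetical, the host would be your own Connections hostname, and a None result can also mean a certificate or network problem, or that the local OpenSSL build refuses the older protocol itself.

```python
import socket
import ssl


def pinned_context(version):
    """Build a client context that will only offer the given TLS version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def negotiated_version(host, port=443, version=ssl.TLSVersion.TLSv1_2, timeout=5.0):
    """Return the negotiated protocol name, or None if the connection failed."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version()
    except (ssl.SSLError, OSError):
        return None


# Example use (hostname as in the Fiddler trace above):
#   negotiated_version("connections.acme.com", version=ssl.TLSVersion.TLSv1_2)
#   negotiated_version("connections.acme.com", version=ssl.TLSVersion.TLSv1)
```

If the first call returns a version and the second returns None, the server side is behaving as configured and the fault lies with whichever client is still offering TLSv1.0.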

There’s at least one other person I know of who’s logged a PMR for these problems. It’s fairly urgent due to a problem with the RTE application which is only fixed when TLSv1.2 is enforced. I’m hoping that these problems can be resolved sharpish so I can resolve the RTE problem for a customer.

Ongoing issues with Surveys (FEB) and IBM Connections

For a number of IBM Connections v5.0 customers I have come across problems where Surveys occasionally and unpredictably stops working. Users would get a screen that wouldn’t load or a 500 error.

There seem to be two sets of errors, which I will describe below.

First I must say this was a frustrating journey with IBM. There seems to be a lack of joined-up thinking between the FEB and Connections teams. The FEB team kept telling me that certain configuration was required, whilst this was not present in any of the Connections documentation.

[10/13/15 18:45:59:092 BST] 000005a9 StandardExcep E com.ibm.form.nitro.platform.StandardExceptionMapper toResponse 205d4fed-6a97-4ba7-804b-dbf1ad97554d
                                 com.ibm.pdq.runtime.exception.DataRuntimeException: [pdq][10103][2.18.120] An error prevented the update operation from completing successfully.;  Caused by: com.ibm.db2.jcc.am.SqlIntegrityConstraintViolationException: The insert or update value of the FOREIGN KEY “FREEDOM.USER_GROUPS.U_UG_FK” is not equal to any value of the parent key of the parent table.. SQLCODE=-530, SQLSTATE=23503, DRIVER=3.65.110
    at com.ibm.pdq.runtime.internal.db.JdbcData.update_(JdbcData.java:388)
    at com.ibm.pdq.runtime.internal.db.DataImpl.update(DataImpl.java:695)
    at com.ibm.pdq.runtime.generator.BaseData.update(BaseData.java:906)

Caused by: com.ibm.db2.jcc.am.SqlIntegrityConstraintViolationException: The insert or update value of the FOREIGN KEY “FREEDOM.USER_GROUPS.U_UG_FK” is not equal to any value of the parent key of the parent table.. SQLCODE=-530, SQLSTATE=23503, DRIVER=3.65.110
    at com.ibm.db2.jcc.am.cd.a(cd.java:694)
    at com.ibm.db2.jcc.am.cd.a(cd.java:60)

IBM said that this was a duplicate data problem but did not provide me with any more information as to which data, so I blindly updated /opt/ibm/Forms/extensions/Builder_config.properties in line with their suggestions. I uncommented the following properties and set the values appropriately.

ibm.was.MemberManager.userProps.loginName = uid
#
ibm.was.MemberManager.userProps.id = uid
#
ibm.was.MemberManager.groupProps.id = cn
#
ibm.was.MemberManager.userProps.email = mail
#
ibm.was.MemberManager.userProps.displayName = displayName
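When comparing several environments it is handy to check which of these properties are active and which are still commented out. This is a small sketch in Python that assumes simple key = value lines with # comments, as in the excerpt above; it is not a full Java properties parser and the function name property_status is my own.

```python
def property_status(text, keys):
    """Map each key to ('set', value), ('commented', value) or ('missing', None)."""
    status = {key: ("missing", None) for key in keys}
    for raw in text.splitlines():
        line = raw.strip()
        commented = line.startswith("#")
        body = line.lstrip("#").strip()
        if "=" not in body:
            continue
        key, value = (part.strip() for part in body.split("=", 1))
        # An active line always wins; a commented line is only recorded
        # when no other definition of the key has been seen yet.
        if key in status and (not commented or status[key][0] == "missing"):
            status[key] = ("commented" if commented else "set", value)
    return status


# Example use:
#   with open("/opt/ibm/Forms/extensions/Builder_config.properties") as fh:
#       report = property_status(fh.read(), [
#           "ibm.was.MemberManager.userProps.loginName",
#           "ibm.was.MemberManager.userProps.id",
#           "ibm.was.MemberManager.groupProps.id",
#       ])
```

Running the same check on each server gives a quick side-by-side view of where the configuration differs.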

IBM wanted me to uncomment the following property and set it to https://server.com/forms, but this broke Surveys each and every time without fail.

#ibm.nitro.NitroConfig.serverURI = http://host:9080/forms

If I set ibm.nitro.NitroConfig.serverURI then Surveys does not load in a Community. When I amend Builder_config.properties and set serverURI I also see the following in the SystemOut.log

[2/5/16 16:51:23:023 GMT] 000011b2 PropertyUtils W com.ibm.form.platform.service.common.util.PropertyUtils updateProperties Unable to find accessible set method for the property called serverURI within class class com.ibm.form.nitro.service.config.NitroConfig.
[2/5/16 16:51:23:076 GMT] 000011b2 PropertyUtils W com.ibm.form.platform.service.common.util.PropertyUtils updateProperties Unable to find accessible set method for the property called serverURI within class class com.ibm.form.nitro.service.config.NitroConfig.

IBM finally stopped asking me to set ibm.nitro.NitroConfig.serverURI and for a short period of time Surveys worked.

The following error is what appeared after I made the above changes. IBM said that they had seen this before but, at the time, only two other customers had come across it so they had not had a chance to determine the cause.

[1/19/16 14:44:45:871 GMT] 00000486 webapp E com.ibm.ws.webcontainer.webapp.WebApp logServletError SRVE0293E: [Servlet Error]-[fspServlet]: java.lang.NullPointerException
at com.ibm.form.platform.service.startup.IsolatingClassLoader$3.run(IsolatingClassLoader.java:414)
at com.ibm.form.platform.service.startup.IsolatingClassLoader$3.run(IsolatingClassLoader.java:408)
at java.security.AccessController.doPrivileged(AccessController.java:284)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.findResourceAsStreamA(IsolatingClassLoader.java:406)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.findResourceAsStream(IsolatingClassLoader.java:227)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.getResourceAsStream(IsolatingClassLoader.java:987)

[1/19/16 14:48:59:100 GMT] 00000136 webapp I com.ibm.ws.webcontainer.webapp.WebApp log SRVE0296E: [Forms Experience Builder#builder.war][/forms][Servlet.LOG]:.fspServlet: SimplifiedPlatformAccessServlet.service():.java.lang.NullPointerException
at com.ibm.form.platform.service.startup.IsolatingClassLoader$7.run(IsolatingClassLoader.java:1189)
at com.ibm.form.platform.service.startup.IsolatingClassLoader$7.run(IsolatingClassLoader.java:1177)
at java.security.AccessController.doPrivileged(AccessController.java:284)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.loadClassFromJarA(IsolatingClassLoader.java:1174)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.findClass(IsolatingClassLoader.java:181)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.loadClass(IsolatingClassLoader.java:1049)
at com.ibm.form.platform.service.startup.IsolatingClassLoader.loadClass(IsolatingClassLoader.java:1140)

At the same time as the above errors, in the trace.log, I found the following. This appeared when I tried to add the widget to a community.

[1/19/16 14:44:45:873 GMT] 00000285 ServerToServe E com.ibm.connections.httpClient.ServerToServerHttpClient checkResultsForRetry SONATA: Internal Server Error @’https://server.com/forms/secure/org/lifecycle’;
[1/19/16 14:44:45:875 GMT] 00000285 EventPropagat E com.ibm.lconn.widgets.service.EventPropagater postRemoteEvent CLFWZ0004E: Event ‘widget.added’ sent to remote lifecycle handler at https://server.com/forms/secure/org/lifecyclereturned bad response: 500 – Internal Server Error
[1/19/16 14:44:45:886 GMT] 00000285 AddWidgetActi E com.ibm.lconn.widgets.actions.AddWidgetAction execute CLFWZ0004E: Event ‘widget.added’ sent to remote lifecycle handler at https://server.com/forms/secure/org/lifecyclereturned bad response: 500 – Internal Server Error
com.ibm.lconn.widgets.model.LifecycleStatusCodeException: CLFWZ0004E: Event ‘widget.added’ sent to remote lifecycle handler at https://server.com/forms/secure/org/lifecyclereturned bad response: 500 – Internal Server Error
at com.ibm.lconn.widgets.service.EventPropagater.postRemoteEvent(EventPropagater.java:569)
at com.ibm.lconn.widgets.service.EventPropagater.addWidget(EventPropagater.java:753)
at com.ibm.lconn.widgets.service.WidgetInfoService.addWidgetPropagateInternal(WidgetInfoService.java:285)
at com.ibm.lconn.widgets.service.WidgetInfoService.addWidget(WidgetInfoService.java:376)

IBM put the 500 error down to not using ibm.nitro.NitroConfig.serverURI!!! IBM did say that redoing the configuration for the resource bundle would resolve the 500 errors in case there was a corruption. If there was a corruption it would never work! Redoing the configuration requires a restart, which seems to be the only way to restore Surveys, albeit temporarily.

After making the changes to Builder_config.properties and monitoring the servers the errors appeared again. IBM set up a conference call and it was clear IBM did not really have anything to offer.

On the call I read through the instructions to deploy without the installer (http://www.ibm.com/support/knowledgecenter/SS6KJL_8.5.1/FEB/in_deploying_was.dita?lang=en). The only difference I could see between this documentation and the current configuration, which was set up by the installer, was the fullyMaterializeLobData setting.

In all the environments, under Data sources > IBM_FORMS_DATA_SOURCE > Custom properties, fullyMaterializeLobData was set to true, whilst the Knowledge Center says it should be false. IBM jumped on that since it was an action for me to do. Anyway, I changed this custom property to false and for a good few weeks I have not seen the errors appear and Surveys has continued to work.

I’m hoping this is the end to it and I have been configuring all new 5.0 and 5.5 servers with this in mind.