Orient Me and some things I’ve come across and wrestled with

Having gained some experience of Docker and CfC (IBM Spectrum Conductor for Containers) before Connections 6.0 was released, I thought this would be easy to set up, but I must admit I’m struggling.

My setup is 3 CentOS servers for Orient Me with another for DB2/SDI and another for Connections hosting the deployment manager.

Here are some things I have come across which I’d like to add to as I come across other problems.

DNS

Working on a beefy ESXi server running at home, I normally manage most things using a hosts file, which has worked really well up until now. I won’t steal from Roberto Boccadoro’s blog post, but suffice to say I couldn’t get it to work using a hosts file even after editing nsswitch.conf. I had to rely on spoofing DNS internally on my router by updating /jffs/configs/hosts.add to include all my Connections servers.

Even with this I found that the migration script in the people-migrate container would still fail, so in this case I also had to add my hosts to /etc/hosts, which got me past that step.
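A minimal sketch for checking which names a hosts-format file is still missing, assuming a POSIX shell with awk. The function name `missing_hosts` and the second hostname in the usage line are placeholders, not anything from the Orient Me tooling.

```shell
#!/bin/sh
# Print each given hostname that has no entry in a hosts-format file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            # Strip comments, then look for the name in any column after the IP.
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done
}
```

For example, `missing_hosts /etc/hosts connections.domain.com db2.domain.com` prints only the names that still need an entry.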

MongoDB

I had to uninstall and reinstall a couple of times. On reinstall I had problems with the migration application (people-migrate) connecting to MongoDB, even though I was able to connect to MongoDB myself and list the databases.

# kubectl exec -it mongo-0 bash

# mongo

rs0:PRIMARY> show dbs
admin  0.000GB
local  0.000GB

The migration script was failing to connect and I couldn’t fathom why. I uninstalled again and this time I removed the persistent volumes and recreated them and now the migration script gets further but fails with the following exception.

2017-04-12T12:01:42.751Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/relationshipdb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:42.757Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/datamigrationdb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:42.758Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/profiledb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:54.018Z - info: [migrator] total request number: 1
2017-04-12T12:01:54.021Z - info: [populator] Start to populate URL:
-"https://connections.domain.com/profiles/admin/atom/profiles.do?ps=100"

2017-04-12T12:01:59.417Z - error: [migrator] errors:[{"profileKey":"16ff2775-2ace-4db8-8e54-56adcc62a5fb","externalId":"382AB352-F9AE-D6E4-8025-7D2C004A7248","created":1491998514408,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"8af449b4-0357-4bed-a7c7-c0e5285ba826","externalId":"932ED7B3-988D-9EFC-8625-79E3005B2B62","created":1491998514409,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"a9294f18-ee72-49d0-8a44-cf02abe6d4d2","externalId":"0873E9A9-7E12-0609-8025-7D38003BFD71","created":1491998514410,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"b6994f86-7525-48b6-92da-900393382e11","externalId":"0F64A6F8-927B-483C-8625-79E3005AC781","created":1491998514410,"orgId":"a","id":"FAKE_ID","error":{}}]
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 4 to mongo-0:27017 timed out]
It will be retried for the next request.
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 5 to mongo-0:27017 timed out]
It will be retried for the next request.

/usr/src/app/node_modules/mongodb/lib/mongo_client.js:338
throw err
^
MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 5 to mongo-0:27017 timed out]
at Pool.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/topologies/server.js:327:35)
at emitOne (events.js:96:13)
at Pool.emit (events.js:188:7)
at Connection.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/pool.js:274:12)
at Connection.g (events.js:291:16)
at emitTwo (events.js:106:13)
at Connection.emit (events.js:191:7)
at Socket.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/connection.js:187:10)
at Socket.g (events.js:291:16)
at emitNone (events.js:86:13)
at Socket.emit (events.js:185:7)
at Socket._onTimeout (net.js:339:8)
at ontimeout (timers.js:365:14)
at tryOnTimeout (timers.js:237:5)
at Timer.listOnTimeout (timers.js:207:5)
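
One detail in the output above may be worth ruling out, although it is not confirmed to be the cause: the mongo shell prompt reads rs0, while the migrator's URL asks for replicaSet=rs. A small sketch, assuming a POSIX shell with sed, that pulls the replica set name out of a connection URL so it can be compared with what the server reports. The function name `replset_from_url` is made up for illustration.

```shell
#!/bin/sh
# Extract the replicaSet query parameter from a MongoDB connection URL.
# Prints nothing if the URL has no replicaSet parameter.
replset_from_url() {
    printf '%s\n' "$1" | sed -n 's/.*[?&]replicaSet=\([^&]*\).*/\1/p'
}

# To compare against the running replica set (not run here; assumes the
# mongo-0 pod and the legacy mongo shell used earlier in this post):
#   kubectl exec mongo-0 -- mongo --quiet --eval 'print(rs.status().set)'
```

If the two names differ, that mismatch would be the first thing to look at before digging further into the persistent volumes.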

Redis client

The knowledge center alludes to testing the connection to Redis from the Connections node. If you want to install the client and try for yourself, here are the instructions IBM deemed not necessary to write down for you.

# su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-9.noarch.rpm'
# yum install redis

# redis-cli -p 30379
127.0.0.1:30379> set foo bar
OK
127.0.0.1:30379> get foo
"bar"
127.0.0.1:30379>

Odd pod behaviour

I believe I have an underlying problem with the persistent volumes, and overnight this happened.

# kubectl get pods

zookeeper-controller-3-2528439515-xz702   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-xz79d   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-xzqc9   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-xzzbl   0/1       OutOfpods   0          16h
zookeeper-controller-3-2528439515-z0kwf   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z13kn   0/1       OutOfpods   0          17h
zookeeper-controller-3-2528439515-z2lsn   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z6mc5   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-z74nj   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z97jp   0/1       OutOfpods   0          17h
zookeeper-controller-3-2528439515-zd2js   0/1       OutOfpods   0          4h
zookeeper-controller-3-2528439515-zdc3t   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-zk5bw   0/1       OutOfpods   0          16h

# kubectl get pods | wc -l
2114

There were thousands of pods. I believe they were created faster than they could be garbage collected.
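
To see at a glance how those thousands of pods break down, a sketch that counts pods per status, assuming the default `kubectl get pods` column layout (NAME, READY, STATUS, ...) shown above. The function name `count_pods_by_status` is hypothetical.

```shell
#!/bin/sh
# Count pods per STATUS from `kubectl get pods` output read on stdin.
count_pods_by_status() {
    awk '$1 != "NAME" && NF >= 3 { count[$3]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Usage (not run here):
#   kubectl get pods | count_pods_by_status
```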

I deleted all the pods in the “OutOfpods” status using the following command.

# kubectl get pod | cut -d " " -f 1 | xargs -n1 -P 10 kubectl delete pod
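
Note that the pipeline above takes the first column of every row, so it also passes the header and pods in other states to `kubectl delete pod`. A sketch of a filter that keeps only pods in one status, again assuming the default column layout. The function name `pods_with_status` is hypothetical.

```shell
#!/bin/sh
# Print the names of pods whose STATUS column equals the given value.
# Reads `kubectl get pods` output on stdin.
pods_with_status() {
    awk -v want="$1" '$1 != "NAME" && $3 == want { print $1 }'
}

# Usage (not run here):
#   kubectl get pods | pods_with_status OutOfpods | xargs -n1 -P 10 kubectl delete pod
```

Running the first half of the pipeline on its own before adding the xargs stage gives a chance to review the list of names that would be deleted.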

Shutdown

To shut down my servers I have been running the following to stop all containers.

# docker stop $(docker ps -a -q)

I’m not sure whether I would be better off using a variation of the earlier command to delete all pods instead:

# kubectl get pod | cut -d " " -f 1 | xargs -n1 -P 10 kubectl delete pod

Is there a better prescribed way of doing this?

Enabling profiles events for Orient Me

I did what was asked of me in the knowledge center but there is little indication of it having worked. The documentation states that I should see “OrientMe configured properly – both properties are enabled.” Where should I see that: in SDI’s ibmdi.log or in one of the application servers’ SystemOut.log? I have looked at both and I do not see this written.

Anyway, I’ll hopefully add to this as I go. If anyone has come across these problems and found a resolution to them, please get in touch.

Old version of Notes Java breaks IBM Connections Files plugin when TLSv1.2 is enforced

I had to raise a PMR on a problem I and others in my company had with the Notes client. After enforcing TLSv1.2 in Connections 5.5 using the following configuration in httpd.conf the Files plugin would not work but the Activities and Status Updates plugins would.

SSLEnable
SSLProtocolDisable SSLv2 SSLv3 TLSv11 TLSv10
SSLCipherSpec TLSv12 +TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 +TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 +TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 +TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA +TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 +TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA +TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 +TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 +TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 +TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 +TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA

I kept seeing the following screen and clicking “try again using existing options” did nothing.

Whilst clicking on “try again using existing options” I would see the following in IHS.

[Wed Mar 29 14:30:41 2017] [warn] [client xxx.xxx.xxx.xx] [7f9a480ec800] [30453] SSL0222W: SSL Handshake Failed, No ciphers specified (no shared ciphers or no shared protocols).  [xxx.xxx.xxx.xx:49296 -> xxx.xxx.xxx.xx:443] [14:30:41.000571168] 0ms

The SSL certificate is at 4096 bits and I had previously replaced US_export_policy.jar and local_policy.jar with the unrestricted policy jars so that was not the problem.

I found, oddly, that if I swapped to the IBM Sametime Meetings plugin first and then changed to Files, my files would load…. Also, if I ran Fiddler and restarted my Notes client but went directly to Files it would load too. Weird.

I had a screen share with Elizabeth Hecht and Jacqueline Chewens to show them the odd behaviour and they too were baffled. Liz wondered whether the version of Java in use might be preventing connectivity to Files and asked whether I had applied the Java update for FP6. Not having had much focus on Notes and Domino of late, I told her I wasn’t even aware that you were supposed to update the version of Java used by the Notes client.

To test this I updated Notes to FP8, which bundles the Java update, and lo and behold the Files plugin started working. Also, there was no need to replace the jars with the unrestricted ones!

The version of Java now in play is as follows.

c:\Program Files (x86)\IBM\Notes\jvm>java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) Client VM (build 25.121-b13, mixed mode, sharing)

BTW – if Connections enforces SSL then you need to make sure that com.ibm.documents.connector.service/ENABLE_SSL=true is set in the plugin_customization.ini.

Connections Pink and container orchestration using CfC

A while ago I started dabbling with Docker after reading some great blogs about ELK by Klaus Bild and Christoph Stoettner thinking I could do with a tool like ELK to analyse log files and to give me something tangible to work with whilst learning about Docker.

After a lot of hard learning and some frustrating hours I got my head around containers and how they could be used to my advantage and got ELK running natively on Ubuntu and then on my work Windows 7 laptop.

A few months before Connect 2017 news was leaking about Connections Pink and its architecture and how the applications will run within containers. Recently Jason Gary Roy held a webinar (Open Mic Webcast: Think Pink – The Future of IBM Connections – 07 March 2017) replaying some of his slides from Connect 2017 and in the video he mentions (briefly) CfC in combination with Docker and containers.

I asked the question in the IBM Connections Community Skype chat and a few people told me that CfC was an IBM product called IBM Spectrum Conductor for Containers. I looked through the community for CfC and realised how important having an orchestration tool is for running multiple containers and scaling for high availability. This was a long way away from running three containers on my laptop.

Installing CfC was pretty easy and is well documented in the CfC community. Installation-wise, you need Ubuntu 16.04 or RHEL, although I am sure CentOS will work. I’ll get to that next week.

What you end up with is a rather nice UI which does many of the hard things for you such as networking, setting up persistent storage for your containers, moving applications to other nodes, automatic scaling when demand requires and many more.

What I also liked is that it acts as a private repository for your containers avoiding you needing to push to Docker Hub for storage.

In the latest version you can install on a single node which is great for testing purposes but it also allows you to add and remove worker nodes when you want to branch out.

I asked in the CfC Slack channel what the future looks like for CfC because if it requires a license then it is another hurdle to overcome when selling in Connections. The response I got was:

“We are intending to keep providing a free version that customer can use and deploy as it is a packaging of open-source. Business discussion on what to do beyond that are still ongoing so I can’t comment. Options include providing commercial support or additional add-ons around the open-source for a commercial product. Right now this is a community effort, and we are currently looking technical feedback and understanding of what use cases people would like to use CfC for. Looking forward to your participation.”

Since the product is built on the following open technologies I would hope that a free option remains available going forward.

Another benefit of using CfC is that IBM are using it for Pink. I assume that most of the documentation referring to orchestration of the containers will reference CfC in some form. Getting to know it now will, I hope, make deploying Pink containers easier.

Thanks to Michele Buccarello for answering my questions.

CfC has been built from the individual components below

Core component:

  • Kubernetes and Mesosphere API/CLI
  • GUI
  • Installer for HA
  • Authentication through LDAP
  • An App store
  • A Private image registry

Sample applications:

  • Frontend
  • Liberty
  • Nginx
  • Redis
  • Tomcat

Built in Network

  • Flannel
  • Calico

Built in persistent Storage

  • NFS
  • Hostpath
  • GlusterFs

Supported CPU Architecture

  • PowerPC LE
  • x86

Configure Connections to use SMTP MX records to multiple servers

Internally we originally used a DNS round robin alias for Connections to connect to when routing SMTP emails, but that was problematic when one of the servers in the alias was taken offline.

IBM has made this easier by allowing you to use MX records to list the SMTP servers to connect to as detailed in Sending mail from any available mail server.

It was fairly simple to set this up using the example in the knowledge center. First I had our network team create (internal only) MX records for three Domino servers for internal.acme.com with the required weightings. Then I checked out notifications-config.xml, edited the following lines and checked it back in.

<channelConfigs>
<emailChannelConfig>
<useJavaMailProvider>false</useJavaMailProvider>
<smtpJNDILookup>
<smtpJNDILookupURL>dns:///internal.acme.com</smtpJNDILookupURL>
<javamail>
<property name="mail.debug">false</property>
<property name="mail.smtp.connectiontimeout">120000</property>
<property name="mail.smtp.timeout">120000</property>
<property name="mail.smtp.port">25</property>
<property name="mail.smtp.socketFactory.port">25</property>
<property name="mail.smtp.socketFactory.fallback">false</property>
<property name="mail.smtp.sendpartial">true</property>
</javamail>
</smtpJNDILookup>
<maxRecipients>50</maxRecipients>
</emailChannelConfig>
</channelConfigs>

At first I left the below line in and it didn’t work.

<property name="mail.smtp.socketFactory.class">javax.net.ssl.SSLSocketFactory</property>

Setting <property name="mail.debug">true</property> wrote the following to the SystemOut.log.

[2/21/17 20:13:34:309 GMT] 0000023e SystemOut     O DEBUG: getProvider() returning javax.mail.Provider[TRANSPORT,smtp,com.sun.mail.smtp.SMTPTransport,Sun Microsystems, Inc]
[2/21/17 20:13:34:322 GMT] 0000023e SystemOut     O DEBUG SMTP: useEhlo true, useAuth false
[2/21/17 20:13:34:322 GMT] 0000023e SystemOut     O DEBUG SMTP: trying to connect to host "domino.internal.acme.com.", port 25, isSSL false
[2/21/17 20:13:34:347 GMT] 0000023e SystemOut     O DEBUG SMTP: exception reading response: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
[2/21/17 20:13:34:348 GMT] 0000023e SystemOut     O DEBUG SMTP: useEhlo true, useAuth false
[2/21/17 20:13:34:348 GMT] 0000023e SystemOut     O DEBUG SMTP: starting protocol to host "domino.internal.acme.com.", port 25
[2/21/17 20:13:34:349 GMT] 0000023e SystemOut     O DEBUG SMTP: exception reading response: javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

Commenting out the aforementioned mail.smtp.socketFactory.class line allowed me to connect over port 25.

To test this my colleague stopped the SMTP listener on the Domino server with the lowest weighting causing it to connect to the next server.
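
Before running a failover test like this, it can help to confirm what the MX lookup actually returns and in which order the hosts should be tried. A sketch, assuming `dig` is available on the Connections node and that servers are tried lowest preference value first, which is the standard MX ordering. The function name `mx_in_order` is hypothetical.

```shell
#!/bin/sh
# Read `dig +short MX <domain>` output ("<preference> <host>" per line)
# on stdin and print the hosts ordered by ascending preference value.
mx_in_order() {
    sort -n -k1,1 | awk 'NF >= 2 { print $2 }'
}

# Usage (not run here):
#   dig +short MX internal.acme.com | mx_in_order
```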

CCM/FileNet search index fails in IBM Connections 4.5 due to special character

The customer told me that his search index never completed correctly when Connections was initially deployed and now users are complaining that search results do not contain CCM documents.

The customer had tried recreating the index but to no avail and called me to take a look.

I first enabled trace on one of the infrastructure nodes (*=info: com.ibm.connections.search.index.indexing.*=all: com.ibm.connections.search.seedlist.*=all: com.ibm.connections.httpClient.*=all: com.ibm.connections.search.index.indexing.EcmFilesIndexer=all) as detailed in http://www-01.ibm.com/support/docview.wss?uid=swg21636559

I then created a background index as detailed in Creating a background index and tailed the trace.log and SystemOut.log. To create the background index I ran the following commands on the Windows server.

cd c:\IBM\WebSphere\AppServer\profiles\Dmgr01\bin

wsadmin.bat -lang jython -username wasadmin -password ********

execfile("searchAdmin.py")

SearchService.startBackgroundIndex("c:/IBM/Connections/background/crawl", "c:/IBM/Connections/background/extracted", "c:/IBM/Connections/background/index", "ecm_files")

I found that the indexing process finished abruptly about 3500 documents in (with another 6500 odd remaining).

[10/09/14 09:15:59:293 BST] 0000007a SeedlistPagin < com.ibm.connections.search.seedlist.parser.impl.SeedlistPaginationHandler resolve RETURN https://connections.acme.com/dm/atom/library/8DB6D184-AAF5-41F3-A28D-D1B7BEF17967%3BC11D230C-66A5-4CEB-8906-EAB19DFE0B8D/document/%7B5DEBC165-CDF6-4672-8300-A3345507867F%7D/media/%33%35%20%28%32%30%31%34%29%20%34%33%2d%38%35%20%54%68%65%20%53%79%73%74%65%6d%73%20%54%61%6e%74%6164%66?follow=true
[10/09/14 09:15:59:293 BST] 0000007a SystemErr     R   [Fatal Error] :23466:346: An invalid XML character (Unicode: 0x2) was found in the element content of the document.
[10/09/14 09:15:59:293 BST] 0000007a SeedlistEntry 2 com.ibm.connections.search.seedlist.crawler.impl.SeedlistEntryIterator hasNext CLFRW0063E: SAX parser error.
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x2) was found in the element content of the document.
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at com.ibm.connections.search.seedlist.crawler.impl.SeedlistPage.parse(SeedlistPage.java:86)
at com.ibm.connections.search.seedlist.crawler.impl.SeedlistEntryIterator.hasNext(SeedlistEntryIterator.java:102)
at com.ibm.connections.search.index.process.work.IndexingWork.run(IndexingWork.java:205)
at com.ibm.connections.search.index.process.initial.InitialProcess.index(InitialProcess.java:493)
at com.ibm.connections.search.index.process.initial.InitialProcess.index(InitialProcess.java:444)
at com.ibm.connections.search.index.process.initial.InitialProcess.run(InitialProcess.java:332)
at com.ibm.ws.asynchbeans.J2EEContext$RunProxy.run(J2EEContext.java:265)
at java.security.AccessController.doPrivileged(AccessController.java:229)
at com.ibm.ws.asynchbeans.J2EEContext.run(J2EEContext.java:1165)
at com.ibm.ws.asynchbeans.WorkWithExecutionContextImpl.go(WorkWithExecutionContextImpl.java:199)
at com.ibm.ws.asynchbeans.CJWorkItemImpl.run(CJWorkItemImpl.java:236)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1690)

I took the URL (which has been edited) and logged in using an administrative account and was provided with a pdf. I initially believed that it must have been the contents of the document that caused the problem so I uploaded the same document to a 4.5 CR4 server I run in the lab and couldn’t reproduce the problem.

I raised a PMR and they came back and said that the problem was likely to be due to a special character in the description and not in the document itself.

I looked at the trace.log and found reference to the seedlist xml that was being processed at the time.

[10/09/14 09:52:26:121 BST] 0000007a SeedlistPersi > com.ibm.connections.search.seedlist.crawler.impl.SeedlistPersistenceManager getSeedlistDirs ENTRY ecm_files
[10/09/14 09:52:26:121 BST] 0000007a SeedlistPersi < com.ibm.connections.search.seedlist.crawler.impl.SeedlistPersistenceManager getSeedlistDirs RETURN ecm_files, [c:\IBM\Connections\background\crawl\seedlists-ecm_files-initial-1410267828454]
[10/09/14 09:52:26:121 BST] 0000007a SeedlistPersi < com.ibm.connections.search.seedlist.crawler.impl.SeedlistPersistenceManager getSeedlistDir RETURN c:\IBM\Connections\background\crawl\seedlists-ecm_files-initial-1410267828454
[10/09/14 09:52:26:121 BST] 0000007a SeedlistFetch 3   seedlistFile = [c:\IBM\Connections\background\crawl\seedlists-ecm_files-initial-1410267828454\1410267828454-00007.xml]
[10/09/14 09:52:26:121 BST] 0000007a SeedlistFetch 2   Retrieving seedlist content: https://connections.acme.com/dm/atom/seedlist/myserver?useLocalFS=true&Start=3500&Action=GetDocuments&Format=xml&Range=500
[10/09/14 09:52:26:121 BST] 0000007a SeedlistFetch 3   Retrieving seedlist from file: 1410267828454-00007.xml

I opened the xml in Notepad++ and searched for the document name which I obtained from the URL previously and found a match. In one of the fields I see the following.

1
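
Rather than searching by document name, the offending lines can also be located directly. A sketch, assuming the seedlist XML has been copied somewhere with GNU grep (PCRE support via -P) available, for example a Linux box or Git Bash. It reports line numbers containing control characters that XML 1.0 does not allow, such as the 0x2 from the SAX error. Tab, LF and CR are legal and are ignored, and NUL is left out of the pattern. The function name `invalid_xml_char_lines` is hypothetical.

```shell
#!/bin/sh
# Print the line numbers in a file that contain control characters
# not permitted in XML 1.0 (0x01-0x08, 0x0B, 0x0C, 0x0E-0x1F).
invalid_xml_char_lines() {
    LC_ALL=C grep -naP '[\x01-\x08\x0B\x0C\x0E-\x1F]' "$1" | cut -d: -f1
}

# Usage (not run here):
#   invalid_xml_char_lines 1410267828454-00007.xml
```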

I provided the community and library that the document resided in and the customer couldn’t view the description data in the web browser. The customer made some changes to the field via the FileNet interface and once the special character was removed the data showed in the web browser.

To check whether the index was created correctly after this change, I ran the background index again but wrote the files to a new location. If you run the command again against the same location as the initial background index then it will fail, because the seedlist will not have been recreated and the original special character is retained.

To speed things up, copy the extracted files from the previous location to the new extracted files directory. This customer had over ten thousand CCM documents, so extracting them all again was time consuming.

I had to iterate this process four times until all the special characters were removed. Once I had an INDEX.READY file, I repeated the process for all the applications by copying over the extracted files and using SearchService.startBackgroundIndex("c:/IBM/Connections/background/crawl", "c:/IBM/Connections/background/extracted", "c:/IBM/Connections/background/index", "all_configured"), which built an index successfully.

I then used the steps in the IBM wiki to replace the current index with the new one.

It turns out that the customer used a scripted import facility to import all the documents into CCM and this process introduced these characters.