Limiting resources used by IBM Cloud private and Orient Me

IBM Conductor for Containers has been rebranded IBM Cloud private with version 1.2.0 (https://www.ibm.com/developerworks/community/blogs/fe25b4ef-ea6a-4d86-a629-6f87ccf4649e/entry/IBM_Cloud_private_formerly_IBM_Spectrum_Conductor_for_Containers_version_1_2_0_is_now_available?lang=en)

IBM released version 6.0.0.1 of Orient Me and with it added new applications, increasing the total number of pods in play. Each pod requires some resources to run. Recently there has been some frustration among those who work with Connections when trying to get Orient Me up and running on smaller servers for testing purposes or for deployment to SMB customers.

I spent some time looking at how to limit the resources consumed by decreasing the number of pods.

Kubernetes allows you to scale your pods up or down. This can be done on the command line or via the UI.

Since I prefer the command line, here is how you scale an application and its effect on the number of pods. There are two ways in which this is done: by Replica Sets and by Stateful Sets. I won’t go into the difference between the two because I’m not wholly sure myself, but suffice to say that most OM applications use Replica Sets.

Replica Sets

I’m using analysisservice as an example because it is at the top when commands are run.

# kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
analysisservice-1093785398-31ks2   1/1     Running   0          8m
analysisservice-1093785398-hf90j   1/1     Running   0          8m

# kubectl get rs
NAME                         DESIRED   CURRENT   READY   AGE
analysisservice-1093785398   2         2         2       9m

The following command tells K8s to reduce the number of pods that will accept load to 1.

# kubectl scale --replicas=1 rs/analysisservice-1093785398
replicaset "analysisservice-1093785398" scaled

The output below shows that just one pod is ready to accept load. Note that the desired number is still two, because the deployment that owns the ReplicaSet still asks for two replicas. This is the value that applies again if all the pods are deleted or the OS is restarted.

# kubectl get rs
NAME                         DESIRED   CURRENT   READY   AGE
analysisservice-1093785398   2         2         1       9m

The pod that will no longer accept load is terminated and a new one replaces it.

# kubectl get pods
NAME                               READY   STATUS              RESTARTS   AGE
analysisservice-1093785398-31ks2   1/1     Running             0          18m
analysisservice-1093785398-4njpn   1/1     Terminating         0          5m
analysisservice-1093785398-fmnrd   0/1     ContainerCreating   0          3s

You can see that the new pod is not “ready” and thus not accepting any load.

# kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
analysisservice-1093785398-31ks2   1/1     Running   0          19m
analysisservice-1093785398-fmnrd   0/1     Running   0          43s

The reverse is also true: you can scale the number of pods upwards. ICp can do this with policies based on CPU usage, creating more pods and then removing them when the load drops.

The above approach does not persist over OS restarts or deletion of all the pods. To persist these changes, follow the steps below.

# kubectl get deployment
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
analysisservice   2         2         2            2           34m

The following command amends the deployment configuration which was set in complete.6_0.yaml in the OM binaries.

# kubectl edit deployment analysisservice
apiVersion: extensions/v1beta1
kind: Deployment

This will open in vi, though you can change your editor if you prefer. Under the spec section, amend the number of replicas.

spec:
  replicas: 1
  selector:
    matchLabels:
      mService: analysisservice
      name: analysisservice
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1

Ignore the status section. Save and close (:wq)

# kubectl get deployment
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
analysisservice   1         1         1            1           44m

This time the second pod is not listed with a 0/1 ready value. The second pod has been deleted.

# kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
analysisservice-1093785398-kz76m   1/1     Running   0          17m
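
As a non-interactive alternative to editing in vi, kubectl scale can be pointed at the deployment rather than at its ReplicaSet. Since the deployment is what holds the replica count, this should persist in the same way as the edit above. A minimal sketch follows; the function name scale_deployment is my own invention and not part of kubectl or the OM scripts.

```shell
# Hypothetical helper: set the replica count on a deployment (not its ReplicaSet).
# Usage: scale_deployment <deployment-name> <replicas>
scale_deployment() {
    name=$1
    replicas=$2
    # Refuse anything that is not a non-negative integer.
    case $replicas in
        ''|*[!0-9]*) echo "replicas must be a number" >&2; return 1 ;;
    esac
    kubectl scale --replicas="$replicas" "deployment/$name"
}
```

For the example above this would be called as: scale_deployment analysisservice 1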

You can use the following command to open all the application deployments in vi and update them in one go.

# kubectl edit deployment

When you save and close, the applications are updated in line with the values you set for the replicas.

# kubectl get deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
analysisservice          1         1         1            1           55m
haproxy                  1         1         1            1           57m
indexingservice          1         1         1            1           55m
itm-services             1         1         1            1           55m
mail-service             1         1         1            1           55m
orient-webclient         1         1         1            1           55m
people-migrate           1         1         1            1           55m
people-relation          1         1         1            1           55m
people-scoring           1         1         1            1           55m
redis-sentinel           1         1         1            1           57m
retrievalservice         1         1         1            1           55m
solr1                    1         1         1            1           57m
solr2                    1         1         1            1           57m
solr3                    1         1         1            1           57m
zookeeper-controller-1   1         1         1            1           57m
zookeeper-controller-2   1         1         1            1           57m
zookeeper-controller-3   1         1         1            1           57m
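
If you would rather not step through every deployment in vi, the same result can probably be reached with a loop. This is only a sketch: scale_all_deployments is a made-up name, and it assumes every deployment in the current namespace should end up with the same replica count.

```shell
# Hypothetical helper: scale every deployment in the current namespace.
# Usage: scale_all_deployments <replicas>
scale_all_deployments() {
    replicas=$1
    # "-o name" prints one resource per line, e.g. deployment/analysisservice
    kubectl get deployment -o name | while read -r dep; do
        kubectl scale --replicas="$replicas" "$dep" < /dev/null
    done
}
```

Calling scale_all_deployments 1 would set every deployment to a single replica. It does not remove deployments such as solr2 or zookeeper-controller-3, which are separate deployments rather than extra replicas.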

To delete the additional solr and zookeeper-controller pods you need to run the following.

# kubectl delete deployment zookeeper-controller-2 zookeeper-controller-3
# kubectl delete deployment solr2 solr3

Running the following shows that the number of pods has decreased by quite a lot.

# kubectl get pods

Checking the ReplicaSets again shows the values have decreased.

# kubectl get rs

Mongo and redis-server do not use Replica Sets; they use StatefulSets.

StatefulSets

The following command shows that there are 3 pods for each application.

# kubectl get statefulsets
NAME           DESIRED   CURRENT   AGE
mongo          3         3         1h
redis-server   3         3         1h

In the same vein as before, you edit the replicas, decreasing or increasing them as you see fit.

# kubectl edit statefulsets
statefulset "mongo" edited
statefulset "redis-server" edited

The end result is that only one replica is configured for each StatefulSet.

# kubectl get statefulsets
NAME           DESIRED   CURRENT   AGE
mongo          1         1         1h
redis-server   1         1         1h

The effect is seen when you list the pods.

# kubectl get pods
NAME             READY   STATUS    RESTARTS   AGE
mongo-0          2/2     Running   0          1h
redis-server-0   1/1     Running   0          1h
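
The StatefulSets can likely be scaled without the editor too. The sketch below assumes the kubectl version in use supports kubectl scale for StatefulSets; scale_statefulsets is a hypothetical helper name.

```shell
# Hypothetical helper: scale the named StatefulSets to the given replica count.
# Usage: scale_statefulsets <replicas> <name>...
scale_statefulsets() {
    replicas=$1
    shift
    for name in "$@"; do
        kubectl scale --replicas="$replicas" "statefulset/$name"
    done
}
```

For the two applications here: scale_statefulsets 1 mongo redis-server. A StatefulSet removes its highest-numbered pods first when scaled down, which is why mongo-0 and redis-server-0 are the ones left running.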

At install time

These changes can be made at install time by updating the various .yml files in /microservices/hybridcloud/templates/* and /microservices/hybridcloud/templates/complete.6_0.yaml and then running install.sh.
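
A rough sketch of how the replica counts in one of those files could be changed before running install.sh. It assumes the file writes the count as a plain "replicas: N" line, which I have not checked for every template, and set_replicas is a made-up helper name. Take a copy of the file first.

```shell
# Hypothetical helper: rewrite every "replicas: N" line in a YAML file.
# Usage: set_replicas <file> <replicas>
set_replicas() {
    file=$1
    replicas=$2
    # Keep the original indentation; only the number changes.
    sed "s/^\([[:space:]]*replicas:\)[[:space:]]*[0-9][0-9]*/\1 $replicas/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

Note that this changes every replicas line in the file to the same value, so it only suits the case where everything should be cut down to the same number.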

Finally

I have only experimented on the default applications and have not touched those from the kube-system namespace which are the ICp applications and not OM specific.

I haven’t tried this on a working system yet, only on a detached single node running all roles with hostpath configuration.

Since there is no load on the server, my measurements of the resources consumed before and after the changes are far from scientific, but looking at the UI the amount of CPU and memory in use is certainly less than before.

I have no idea as yet whether this will break OM but I will persist and see whether it does or whether it works swimmingly. If anyone tries this out then please send me your feedback.

BTW – I restarted the OS and had a couple of problems with analysisservice and indexingservice pods not being ready and shown as unhealthy but after deleting haproxy, redis-server-0 and redis-sentinel all my pods are showing as healthy.

IBM, please please provide a relatively simple way (ideally at install time) for us to cut the deployment down to bare bones, perhaps with a small, medium or large deployment option as you do with traditional Connections.

Update 05/07/2017

Once I integrated the server with a working Connections 6.0 server with the latest fixes applied, the ITM bar did not work. Nico Meisenzahl has also been looking into this and we hope to have a working setup soon.

Update 07/07/2017

Nico created a great blog post on updating the yml files to decrease the number of pods/containers during installation of Orient Me.

IBM Connections Files plugin not working within Notes when TLSv1.2 is enforced

After enforcing TLSv1.2 on our internal Connections 5.5 servers the Files plugin would not work.

In the IHS logs I would see errors such as

[warn] [client 80.229.222.90] [7f9a700a7060] [21173] SSL0222W: SSL Handshake Failed, No ciphers specified (no shared ciphers or no shared protocols). [xx.xx.xx.xx:62899 -> xxx.xxx.xxx.xxx:443] [09:45:11.000102454] 0ms

Enabling trace on IHS showed that the protocol being used was TLSv1.0 which matched Wireshark output. Oddly Status Updates and Activities plugins use TLSv1.2.

"GET /files/basic/api/library/4a7a7240-8f68-44d8-9447-7410cc2bb467/feed?pageSize=300&acls=true&sI=601 HTTP/1.1" 200 168770 TLS_RSA_WITH_AES_128_CBC_SHA TLSV1

I then had to allow TLSv1.0 until I could get an explanation from IBM.

Finally IBM came back with the following two lines to be added to the notes.ini.

SSL_DISABLE_TLS_10
DISABLE_SSLV3=1

Now in access_log I see TLSv1.2 being used.

"GET /files/basic/api/library/4a7a7240-8f68-44d8-9447-7410cc2bb467/feed?pageSize=300&acls=true&sI=601 HTTP/1.1" 200 168770 TLS_RSA_WITH_AES_128_GCM_SHA256 TLSV1.2
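
To see at a glance whether any clients are still negotiating the old protocol, the access_log can be summarised by protocol. A small sketch, assuming the protocol is the last field on each line as in the entries shown here; protocol_counts is a hypothetical helper name.

```shell
# Hypothetical helper: count requests per SSL protocol.
# Reads access_log lines on stdin; assumes the protocol is the last field.
protocol_counts() {
    awk 'NF { n[$NF]++ } END { for (p in n) print n[p], p }' | LC_ALL=C sort -k2
}
```

It would be run as protocol_counts < access_log and prints one line per protocol with the number of requests in front.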

IBM also suggested that I check the following was set in plugin_customization.ini, which it was.

com.ibm.documents.connector.service/ENABLE_SSL=true

The notes.ini values have been pushed out to my colleagues via Domino policies.

Touchpoint problem due to no search index

A new Connections customer got in touch with a raft of problems after an upgrade to Connections 6. One of them was a problem with Touchpoint which stopped them from completing the onboarding process, so they were repeatedly redirected to Touchpoint. They were able to get two or three screens in, as far as “Add your interests”, and then could go no further: they either had to use “finish later” or were faced with “Error during prefetching for step profileTags.”

A quick Google of “profileTags” turned up references to search within Connections. I checked the index (which I hadn’t got around to doing just yet) and I didn’t find INDEX.READY. The search index had not been created due to LTPAToken exceptions which needed the scheduled tasks to be cleared and all clearScheduler.sql scripts run. Once the search index was created Touchpoint worked.

Orient Me and some things I’ve come across and wrestled with

Having gained some experience of Docker and CfC (IBM Spectrum Conductor for Containers) before Connections 6.0 was released I thought this would be easy to set up but I must admit I’m struggling.

My setup is 3 CentOS servers for Orient Me with another for DB2/SDI and another for Connections hosting the deployment manager.

Here are some things I have come across which I’d like to add to as I come across other problems.

DNS

Working on a beefy ESXi server running at home, I normally manage most things using a hosts file, which has worked really well up until now. I won’t steal from Roberto Boccadoro’s blog post, but suffice to say I couldn’t get it to work using a hosts file even after editing nsswitch.conf. I had to rely on spoofing DNS internally on my router by updating /jffs/configs/hosts.add to include all my Connections servers.

Even with this I found that the migration script in the people-migrate container would fail, so in this case I had to add my hosts to /etc/hosts, which got me past that step.
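
A quick way to check name resolution on each server is getent hosts, which resolves names through nsswitch.conf and so covers both the hosts file and DNS. This is a sketch only; unresolved_hosts is a hypothetical helper and the hostnames passed to it are whatever your own servers are called.

```shell
# Hypothetical helper: print each name that does not resolve through NSS.
# Usage: unresolved_hosts <hostname>...
unresolved_hosts() {
    for host in "$@"; do
        getent hosts "$host" > /dev/null || echo "$host"
    done
}
```

No output means every name resolved on the machine where it was run. It says nothing about resolution inside a container, which may use a different resolver.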

MongoDB

I had to uninstall and reinstall a couple of times. On reinstall I had problems with the migration application (people-migrate) connecting to mongoDB. I was able to check the databases and connect to them.

# kubectl exec -it mongo-0 bash

# mongo

rs0:PRIMARY> show dbs
admin  0.000GB
local  0.000GB

The migration script was failing to connect and I couldn’t fathom why. I uninstalled again and this time I removed the persistent volumes and recreated them and now the migration script gets further but fails with the following exception.

2017-04-12T12:01:42.751Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/relationshipdb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:42.757Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/datamigrationdb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:42.758Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/profiledb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:54.018Z - info: [migrator] total request number: 1
2017-04-12T12:01:54.021Z - info: [populator] Start to populate URL:
--"https://connections.domain.com/profiles/admin/atom/profiles.do?ps=100"

2017-04-12T12:01:59.417Z - error: [migrator] errors:[{"profileKey":"16ff2775-2ace-4db8-8e54-56adcc62a5fb","externalId":"382AB352-F9AE-D6E4-8025-7D2C004A7248","created":1491998514408,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"8af449b4-0357-4bed-a7c7-c0e5285ba826","externalId":"932ED7B3-988D-9EFC-8625-79E3005B2B62","created":1491998514409,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"a9294f18-ee72-49d0-8a44-cf02abe6d4d2","externalId":"0873E9A9-7E12-0609-8025-7D38003BFD71","created":1491998514410,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"b6994f86-7525-48b6-92da-900393382e11","externalId":"0F64A6F8-927B-483C-8625-79E3005AC781","created":1491998514410,"orgId":"a","id":"FAKE_ID","error":{}}]
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 4 to mongo-0:27017 timed out]
It will be retried for the next request.
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 5 to mongo-0:27017 timed out]
It will be retried for the next request.

/usr/src/app/node_modules/mongodb/lib/mongo_client.js:338
throw err
^
MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 5 to mongo-0:27017 timed out]
at Pool.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/topologies/server.js:327:35)
at emitOne (events.js:96:13)
at Pool.emit (events.js:188:7)
at Connection.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/pool.js:274:12)
at Connection.g (events.js:291:16)
at emitTwo (events.js:106:13)
at Connection.emit (events.js:191:7)
at Socket.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/connection.js:187:10)
at Socket.g (events.js:291:16)
at emitNone (events.js:86:13)
at Socket.emit (events.js:185:7)
at Socket._onTimeout (net.js:339:8)
at ontimeout (timers.js:365:14)
at tryOnTimeout (timers.js:237:5)
at Timer.listOnTimeout (timers.js:207:5)

Redis client

The knowledge center alludes to how to test connecting to Redis from the Connections node. If you want to install the client and try for yourself, here are the instructions IBM deemed not necessary to write down for you.

# su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-9.noarch.rpm'
# yum install redis

# redis-cli -p 30379
127.0.0.1:30379> set foo bar
OK
127.0.0.1:30379> get foo
"bar"
127.0.0.1:30379>

Odd pod behaviour

I believe I have an underlying problem with the persistent volumes, and overnight this happened.

# kubectl get pods

zookeeper-controller-3-2528439515-xz702   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-xz79d   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-xzqc9   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-xzzbl   0/1       OutOfpods   0          16h
zookeeper-controller-3-2528439515-z0kwf   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z13kn   0/1       OutOfpods   0          17h
zookeeper-controller-3-2528439515-z2lsn   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z6mc5   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-z74nj   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z97jp   0/1       OutOfpods   0          17h
zookeeper-controller-3-2528439515-zd2js   0/1       OutOfpods   0          4h
zookeeper-controller-3-2528439515-zdc3t   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-zk5bw   0/1       OutOfpods   0          16h

# kubectl get pods | wc -l
2114

There were thousands of pods. I believe they were created faster than they could be garbage collected.

I deleted all the pods in the “OutOfpods” status using the following command.

# kubectl get pod | cut -d " " -f 1 | xargs -n1 -P 10 kubectl delete pod
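
Note that the pipeline above hands every pod name (and the NAME header) to kubectl delete pod, not only those in the “OutOfpods” status. A minimal sketch of narrowing the list by the STATUS column first; pods_with_status is a hypothetical helper that only filters text and deletes nothing itself.

```shell
# Hypothetical helper: print names of pods whose STATUS column matches exactly.
# Reads "kubectl get pods" output on stdin. Usage: pods_with_status <status>
pods_with_status() {
    awk -v want="$1" '$3 == want { print $1 }'
}
```

It could then be combined with the original command like so: kubectl get pods | pods_with_status OutOfpods | xargs -n1 -P 10 kubectl delete pod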

Shutdown

To shut down my servers I have been running the following to stop all pods.

# docker stop $(docker ps -a -q)

I’m not sure whether I would be better off using a variation of the command above to stop all pods.

# kubectl get pod | cut -d " " -f 1 | xargs -n1 -P 10 kubectl delete pod

Is there a better prescribed way of doing this?

Enabling profiles events for Orient Me

I did what was asked of me in the knowledge center but there is little indication of it having worked. In the documentation it states that I should see “OrientMe configured properly – both properties are enabled.” Where should I see that, in SDI’s ibmdi.log or in one of the application servers SystemOut.log? I have looked at both and I do not see this written.

Anyway, I’ll hopefully add to this as I go. If anyone has come across these problems and found a resolution to them, please get in touch.

Old version of Notes Java breaks IBM Connections Files plugin when TLSv1.2 is enforced

I had to raise a PMR on a problem I and others in my company had with the Notes client. After enforcing TLSv1.2 in Connections 5.5 using the following configuration in httpd.conf the Files plugin would not work but the Activities and Status Updates plugins would.

SSLEnable
SSLProtocolDisable SSLv2 SSLv3 TLSv11 TLSv10
SSLCipherSpec TLSv12 +TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 +TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 +TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 +TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA +TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 +TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA +TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 +TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 +TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 +TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 +TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
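
One way to check from the command line which protocols the server will negotiate after a change like this is openssl s_client. The sketch below is an assumption-laden helper of my own (tls_handshake_ok is not an IBM or OpenSSL name). It relies on s_client exiting non-zero when the handshake fails, and on the local OpenSSL build still supporting the protocol flag being tested.

```shell
# Hypothetical helper: succeed if the server completes a handshake with the given protocol flag.
# Usage: tls_handshake_ok <host> <port> <flag>   (flag e.g. tls1 or tls1_2)
tls_handshake_ok() {
    # stdin from /dev/null so s_client exits as soon as the handshake is done.
    openssl s_client -connect "$1:$2" "-$3" < /dev/null > /dev/null 2>&1
}
```

With the configuration above I would expect the tls1_2 check to succeed and the tls1 check to fail against port 443 of the IHS server.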

I kept seeing the following screen and clicking “try again using existing options” did nothing.

Whilst clicking on “try again using existing options” I would see the following in IHS.

[Wed Mar 29 14:30:41 2017] [warn] [client xxx.xxx.xxx.xx] [7f9a480ec800] [30453] SSL0222W: SSL Handshake Failed, No ciphers specified (no shared ciphers or no shared protocols).  [xxx.xxx.xxx.xx:49296 -> xxx.xxx.xxx.xx:443] [14:30:41.000571168] 0ms

The SSL certificate is at 4096 bits and I had previously replaced US_export_policy.jar and local_policy.jar with the unrestricted policy jars so that was not the problem.

I found, oddly, that if I swapped to the IBM Sametime Meetings plugin first and then changed to Files, my files would load. Also, if I ran Fiddler and restarted my Notes client but went directly to Files it would load too. Weird.

I had a screen share with Elizabeth Hecht and Jacqueline Chewens to show them the odd behaviour and they too were baffled. Liz wondered whether the version of Java being used might be preventing connectivity to Files and asked whether I had applied the Java update for FP6. Not having had much focus on Notes and Domino of late, I told her I wasn’t even aware that you were supposed to update the version of Java used by the Notes client.

To test this I updated Notes to FP8, which bundles in the Java update, and lo and behold the Files plugin started working. Also, there was no need to replace the jars with the unrestricted ones!

The version of Java now in play is as follows.

c:\Program Files (x86)\IBM\Notes\jvm>java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) Client VM (build 25.121-b13, mixed mode, sharing)

BTW – if Connections enforces SSL then you need to make sure that com.ibm.documents.connector.service/ENABLE_SSL=true is set in the plugin_customization.ini.

Connections Pink and container orchestration using CfC

A while ago I started dabbling with Docker after reading some great blogs about ELK by Klaus Bild and Christoph Stoettner thinking I could do with a tool like ELK to analyse log files and to give me something tangible to work with whilst learning about Docker.

After a lot of hard learning and some frustrating hours I got my head around containers and how they could be used to my advantage and got ELK running natively on Ubuntu and then on my work Windows 7 laptop.

A few months before Connect 2017 news was leaking about Connections Pink and its architecture and how the applications will run within containers. Recently Jason Gary Roy held a webinar (Open Mic Webcast: Think Pink – The Future of IBM Connections – 07 March 2017) replaying some of his slides from Connect 2017 and in the video he mentions (briefly) CfC in combination with Docker and containers.

I asked the question in the IBM Connections Community Skype chat and a few people told me that CfC was an IBM product called IBM Spectrum Conductor for Containers. I looked through the community for CfC and realised how important having an orchestration tool is for running multiple containers and scaling for high availability. This was a long way away from running three containers on my laptop.

Installing CfC was pretty easy and well documented in the CfC community. Installation wise you need to install on Ubuntu 16.04 or RHEL although I am sure CentOS will work. I’ll get to that next week.

What you end up with is a rather nice UI which does many of the hard things for you such as networking, setting up persistent storage for your containers, moving applications to other nodes, automatic scaling when demand requires and many more.

What I also liked is that it acts as a private repository for your containers avoiding you needing to push to Docker Hub for storage.

In the latest version you can install on a single node which is great for testing purposes but it also allows you to add and remove worker nodes when you want to branch out.

I asked in the CfC Slack channel what the future looks like for CfC because if it requires a license then it is another hurdle to overcome when selling in Connections. The response I got was:

“We are intending to keep providing a free version that customer can use and deploy as it is a packaging of open-source. Business discussion on what to do beyond that are still ongoing so I can’t comment. Options include providing commercial support or additional add-ons around the open-source for a commercial product. Right now this is a community effort, and we are currently looking  technical feedback  and understanding of what use cases people would like to use CfC for.  Looking forward to  your participation.”

Since the product is built on the following open technologies I would hope that a free option remains available going forward.

Another benefit of using CfC is that IBM are using it for Pink. I assume that most of the documentation referring to orchestration of the containers will reference CfC in some form. Getting to know it now, I hope, will make deploying Pink containers easier.

Thanks to Michele Buccarello for answering my questions.

CfC has been built with the individual components below.

Core component:

  • Kubernetes and Mesosphere API/CLI
  • GUI
  • Installer for HA
  • Authentication through LDAP
  • An App store
  • A Private image registry

Sample applications:

  • Frontend
  • Liberty
  • Nginx
  • Redis
  • Tomcat

Built in Network

  • Flannel
  • Calico

Built in persistent Storage

  • NFS
  • Hostpath
  • GlusterFs

Supported CPU Architecture

  • PowerPC LE
  • x86

Configure Connections to use SMTP MX records to multiple servers

Internally we originally used a DNS round robin alias for Connections to connect to when routing SMTP emails, but that was problematic when one of the servers in the alias was taken offline.

IBM has made this easier by allowing you to use MX records to list the SMTP servers to connect to as detailed in Sending mail from any available mail server.

It was fairly simple to set this up using the example in the knowledge center. First I had our network team create (internal only) MX records for three Domino servers for internal.acme.com with the required weightings. Then I checked out notifications-config.xml, edited the following lines and checked it back in.

<channelConfigs>
  <emailChannelConfig>
    <useJavaMailProvider>false</useJavaMailProvider>
    <smtpJNDILookup>
      <smtpJNDILookupURL>dns:///internal.acme.com</smtpJNDILookupURL>
      <javamail>
        <property name="mail.debug">false</property>
        <property name="mail.smtp.connectiontimeout">120000</property>
        <property name="mail.smtp.timeout">120000</property>
        <property name="mail.smtp.port">25</property>
        <property name="mail.smtp.socketFactory.port">25</property>
        <property name="mail.smtp.socketFactory.fallback">false</property>
        <property name="mail.smtp.sendpartial">true</property>
      </javamail>
    </smtpJNDILookup>
    <maxRecipients>50</maxRecipients>
  </emailChannelConfig>
</channelConfigs>

At first I left the below line in and it didn’t work.

<property name="mail.smtp.socketFactory.class">javax.net.ssl.SSLSocketFactory</property>

Setting <property name="mail.debug">true</property> wrote the following to the SystemOut.log.

[2/21/17 20:13:34:309 GMT] 0000023e SystemOut     O DEBUG: getProvider() returning javax.mail.Provider[TRANSPORT,smtp,com.sun.mail.smtp.SMTPTransport,Sun Microsystems, Inc]
[2/21/17 20:13:34:322 GMT] 0000023e SystemOut     O DEBUG SMTP: useEhlo true, useAuth false
[2/21/17 20:13:34:322 GMT] 0000023e SystemOut     O DEBUG SMTP: trying to connect to host "domino.internal.acme.com.", port 25, isSSL false
[2/21/17 20:13:34:347 GMT] 0000023e SystemOut     O DEBUG SMTP: exception reading response: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
[2/21/17 20:13:34:348 GMT] 0000023e SystemOut     O DEBUG SMTP: useEhlo true, useAuth false
[2/21/17 20:13:34:348 GMT] 0000023e SystemOut     O DEBUG SMTP: starting protocol to host "domino.internal.acme.com.", port 25
[2/21/17 20:13:34:349 GMT] 0000023e SystemOut     O DEBUG SMTP: exception reading response: javax.net.ssl.SSLException: Connection has been shutdown: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

Commenting out the aforementioned socketFactory.class line allowed me to connect over port 25.

To test this my colleague stopped the SMTP listener on the Domino server with the lowest weighting causing it to connect to the next server.
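
Before touching notifications-config.xml it is worth confirming that the MX records resolve from the Connections server in the order you expect. A small sketch using dig, assuming it is installed (bind-utils on RHEL/CentOS); mx_hosts is a hypothetical helper name.

```shell
# Hypothetical helper: list MX hosts for a domain, lowest preference value first.
# Usage: mx_hosts <domain>
mx_hosts() {
    # "dig +short MX" prints "<preference> <host>" per line.
    dig +short MX "$1" | sort -n | awk '{ print $2 }'
}
```

Running mx_hosts internal.acme.com should list the Domino servers, with the one that is tried first at the top.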