Gone, but not forgotten

For over twelve years I have worked for an IBM Business Partner in the UK focusing on IBM Collaboration Solutions, and I have loved every minute of it, but it’s time to move on to a new challenge outside the ICS community.

Coming from Domino third level support I crossed over easily to Sametime 6.5.1, which at that time was an add-on to the underlying Domino server (still true to an extent now). Sametime was my first love. It should have been easy, right? An additional installer on top of Domino, and for many deployments it was and still is, although not so much now with WebSphere and DB2 in the mix.

What I loved were the problems Sametime caused or should I say, problems caused when you introduced Sametime to a large user base. I wrestled for many days tuning Sametime for a large deployment of over 40,000, tracing LDAP, debugging the text files and tweaking the sametime.ini. This was a baptism of fire and I loved Sametime more for the pain it caused me. I learnt so much, much of it I still remember and often come across when deploying Sametime for customers.

In 2007 I went to Collaboration University in London as Quickr had been recently released. It was my first introduction to the ICS community. Being in the same place as dozens of others all with the same approach of making Sametime, Quickr and Domino successful was intoxicating. I had already quite a bit of experience of Sametime but it helped to be in the same place as Chris Miller, Carl Tyler, Rob Novak and Warren Elsmore to bolster that knowledge and start learning about Quickr. Quickr took off incredibly quickly being easy to implement and manage which is why it’s still being used now long after it went end of life.

In recent years Connections has been the application that seems to be more in demand so I have seen my time split between the two applications. I remember being introduced to Connections, also in 2007, at a course in Hursley which described deploying and configuring Connections 2.0. At that point there were only six applications and Bookmarks was called Dogear!

Connections is a wonderfully complex set of applications which has come a long way from the days when they were a collection of disparate applications bundled together with WAS acting as the glue. The premise is to get people working together better and to let you find information quickly so you can focus on your job. For many people like me that resonates. I get paid to work with software that allows people to work together better, to formulate relationships with one another and most importantly to share. You might argue that the same is true of all software but that’s not true. Connections is unique to that extent.

I don’t know whether it was Connections that started my journey or whether it was already something inside of me but sharing is one of the most important aspects of my job. Connections is all about sharing. Information is put into Connections for others to consume. They have a subject of interest and Connections allows them to find a person with knowledge of that subject, to follow them, to communicate with them, to add their take on the subject.

This approach to sharing makes public all your knowledge. No more do you find that people are keeping information in their mail files or P drives, it’s all there to be found. The days when you hoarded your information to make you seem indispensable to your employer are gone. People who are actively sharing their information are now seen to be those who are indispensable.

This sharing concept is underlined by two excellent Skype chat groups for Sametime and Connections. Within these two chat rooms are people such as Gabriella Davies, Robert Farstad, Michele Buccarello, Sharon James, Christoph Stoettner, Keith Brooks, Marco Ensing, Matteo Bisi, Michael Urspringer, Nico Meisenzahl, Roberto Boccadoro, Wannes Rams, Chris Whisonant and many others I haven’t mentioned. They are busy people but they help with problems whenever they have a spare 10 minutes. They share their wisdom and experience with whoever asks, regardless of the complexity of the question. The underlying sharing ideology runs through all these people, through the software into the wider ICS community.

As I alluded to in the opening paragraph, I am set for a new challenge and searching for the right challenge has taken me outside of the ICS product portfolio but I am staying within the larger IBM sphere. I am joining IBM Resilient working on their security incident response platform which was bought by IBM last year. It looks like an exciting time to be joining what is a growing industry.

I am sad to leave such a wonderful community at such an exciting stage with Pink gaining traction. I strongly believe Pink and its underlying platform will be a success, especially with the aforementioned people driving the product forward.

Whilst I will soon be gone, the years working with this software will not be forgotten and neither will the friends and colleagues I have made along the way.

Cannot get past Context Roots page in Engagement Center

A few weeks ago I had some problems installing Engagement Center on my employer’s internal Connections 5.5 servers. I installed it as I did with a 6.0 Connections server but each time I went to https://connections.acme.com/xcc/main I was redirected with a 302 to https://connections.acme.com/xcc/admin#ContextRoots?redirectUrl=/xcc/main which is the context roots page.

I checked the context roots were correct and they were. I went back to the customization screen and ensured I had saved it.

It still wouldn’t let me go to /xcc/main to start creating pages. I logged a PMR and Charlie Price got involved and reproduced it. It was an embarrassingly easy fix. I needed to go into the context roots screen and click save even though the values were correct and didn’t need changing. After clicking save I could go to /xcc/main and create my pages. Simples.

Limiting resources used by IBM Cloud private and Orient Me

IBM Conductor for Containers has been rebranded IBM Cloud private with version 1.2.0 (https://www.ibm.com/developerworks/community/blogs/fe25b4ef-ea6a-4d86-a629-6f87ccf4649e/entry/IBM_Cloud_private_formerly_IBM_Spectrum_Conductor_for_Containers_version_1_2_0_is_now_available?lang=en)

IBM released version 6.0.0.1 of Orient Me and with it added new applications, increasing the total number of pods in play. Each pod requires some resources to run. Recently there has been some frustration for those who work with Connections trying to get Orient Me up and running on smaller servers for testing purposes or for deployment to SMB customers.

I spent some time looking at how to limit the resources consumed by decreasing the number of pods.

Kubernetes allows you to scale your pods up or down. This can be done on the command line or via the UI.

Since I prefer the command line, here is how you scale an application and its effect on the number of pods. There are two ways in which this is done: by Replica Sets and by Stateful Sets. I won’t go into the difference between the two because I’m not even wholly sure myself, but suffice to say that most of the OM applications use Replica Sets.

Replica Sets

I’m using analysisservice as an example because it is at the top when commands are run.

# kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
analysisservice-1093785398-31ks2   1/1     Running   0          8m
analysisservice-1093785398-hf90j   1/1     Running   0          8m

# kubectl get rs
NAME                         DESIRED   CURRENT   READY   AGE
analysisservice-1093785398   2         2         2       9m

The following command tells K8s to reduce the number of pods that will accept load to one.

# kubectl scale --replicas=1 rs/analysisservice-1093785398
replicaset "analysisservice-1093785398" scaled

Below shows that just the one pod is ready to accept load. Note that the desired number is two. This means that this will be the default value if all the pods are deleted or the OS restarted.

# kubectl get rs
NAME                         DESIRED   CURRENT   READY   AGE
analysisservice-1093785398   2         2         1       9m

The pod that will no longer accept load is destroyed and a new one replaces it.

# kubectl get pods
NAME                               READY   STATUS              RESTARTS   AGE
analysisservice-1093785398-31ks2   1/1     Running             0          18m
analysisservice-1093785398-4njpn   1/1     Terminating         0          5m
analysisservice-1093785398-fmnrd   0/1     ContainerCreating   0          3s

You can see that the new pod is not “ready” and thus not accepting any load.

# kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
analysisservice-1093785398-31ks2   1/1     Running   0          19m
analysisservice-1093785398-fmnrd   0/1     Running   0          43s

The reverse is true and you can scale the number of pods upwards. ICp can do this with policies based on CPU usage creating more pods and then decreasing them when the load drops.

The above approach does not persist over OS restarts or deletion of all the pods. To persist these changes the following steps need to be followed.

# kubectl get deployment
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
analysisservice   2         2         2            2           34m

This command amends the deployment configuration which was set in complete.6_0.yaml in the OM binaries.

# kubectl edit deployment analysisservice
apiVersion: extensions/v1beta1
kind: Deployment

This will open in vi, though you can change your editor if you prefer. Under the spec section you want to amend the number of replicas.

spec:
  replicas: 1
  selector:
    matchLabels:
      mService: analysisservice
      name: analysisservice
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1

Ignore the status section. Save and close (:wq).

# kubectl get deployment
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
analysisservice   1         1         1            1           44m

This time the second pod is not listed with a 0/1 ready value. The second pod has been deleted.

# kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
analysisservice-1093785398-kz76m   1/1     Running   0          17m

You can use the following command to open all application deployments and update using vi all the applications at one time.

# kubectl edit deployment

When you save and close, the applications will be updated in line with the values you set for the replicas.

# kubectl get deployment
NAME                     DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
analysisservice          1         1         1            1           55m
haproxy                  1         1         1            1           57m
indexingservice          1         1         1            1           55m
itm-services             1         1         1            1           55m
mail-service             1         1         1            1           55m
orient-webclient         1         1         1            1           55m
people-migrate           1         1         1            1           55m
people-relation          1         1         1            1           55m
people-scoring           1         1         1            1           55m
redis-sentinel           1         1         1            1           57m
retrievalservice         1         1         1            1           55m
solr1                    1         1         1            1           57m
solr2                    1         1         1            1           57m
solr3                    1         1         1            1           57m
zookeeper-controller-1   1         1         1            1           57m
zookeeper-controller-2   1         1         1            1           57m
zookeeper-controller-3   1         1         1            1           57m
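Editing every deployment by hand in vi can also be scripted. The following is a minimal sketch, not something taken from the OM documentation: it assumes that kubectl scale against the deployment itself has the same effect as amending replicas in the editor. The function name scale_deployments and the KUBECTL override variable are made up for illustration; setting KUBECTL=echo prints the commands instead of running them, which is handy for a dry run.

```shell
#!/bin/sh
# Hypothetical helper: scale each named deployment to the given replica count.
# Usage: scale_deployments REPLICAS NAME [NAME...]
# KUBECTL may be overridden (for example KUBECTL=echo) to preview the commands.
scale_deployments() {
    replicas="$1"
    shift
    for name in "$@"; do
        "${KUBECTL:-kubectl}" scale deployment "$name" --replicas="$replicas"
    done
}

# Example (uncomment to run against a cluster):
# scale_deployments 1 analysisservice indexingservice retrievalservice
```

Passing the names explicitly, rather than looping over everything kubectl returns, keeps the change limited to the applications you actually mean to shrink.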

To delete the additional solr and zookeeper-controller pods you need to run the following.

# kubectl delete deployment zookeeper-controller-2 zookeeper-controller-3
# kubectl delete deployment solr2 solr3

Running the following shows that the number of pods has decreased by quite a lot.

# kubectl get pods

Checking the ReplicaSets again shows the values have decreased.

# kubectl get rs

Mongo and redis-server do not use Replica Sets, they use StatefulSets.

StatefulSets

The following command shows that there are 3 pods for each application.

# kubectl get statefulsets
NAME           DESIRED   CURRENT   AGE
mongo          3         3         1h
redis-server   3         3         1h

In the same vein as before, you edit the replicas, decreasing or increasing them as you see fit.

# kubectl edit statefulsets
statefulset "mongo" edited
statefulset "redis-server" edited

The end result is that only one replica is configured for each StatefulSet.

# kubectl get statefulsets
NAME           DESIRED   CURRENT   AGE
mongo          1         1         1h
redis-server   1         1         1h

The effect is seen when you list the pods.

# kubectl get pods
NAME             READY   STATUS    RESTARTS   AGE
mongo-0          2/2     Running   0          1h
redis-server-0   1/1     Running   0          1h

At install time

These changes can be made at install time by updating the various .yml files in /microservices/hybridcloud/templates/* and /microservices/hybridcloud/templates/complete.6_0.yaml and then running install.sh.
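Before changing anything in the templates it helps to see where the replica counts are set. This is a small read-only sketch which lists every line containing replicas: in the .yml and .yaml files under a directory. The function name list_replicas is made up for illustration, and it assumes the templates spell the key replicas: in the same way as the deployment spec shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: list every "replicas:" line in the YAML files under a
# directory, printed as file:line-number:content. It does not modify anything.
list_replicas() {
    dir="$1"
    find "$dir" -type f \( -name '*.yml' -o -name '*.yaml' \) \
        -exec grep -Hn 'replicas:' {} +
}

# Example (uncomment to run, using the directory mentioned above):
# list_replicas /microservices/hybridcloud/templates
```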

Finally

I have only experimented on the default applications and have not touched those from the kube-system namespace which are the ICp applications and not OM specific.

I haven’t tried this on a working system yet, purely a detached single node running all roles with hostpath configuration.

Since there is no load on the server, my measurements of the resources consumed before and after the changes are far from scientific, but looking at the UI the amount of CPU and memory used is certainly less than before.

I have no idea as yet whether this will break OM but I will persist and see whether it does or whether it works swimmingly. If anyone tries this out then please feedback to me.

BTW – I restarted the OS and had a couple of problems with analysisservice and indexingservice pods not being ready and shown as unhealthy but after deleting haproxy, redis-server-0 and redis-sentinel all my pods are showing as healthy.

IBM, please please provide a relatively simple way (ideally at install time) for us to cut the deployment down to bare bones maybe a small, medium or large deployment as you do with traditional Connections?

Update 05/07/2017

Once I integrated the server with a working Connections 6.0 server with the latest fixes applied, the ITM bar did not work. Nico Meisenzahl has also been looking into this and we hope to have a working setup soon.

Update 07/07/2017

Nico created a great blog updating the yml files to decrease the amount of pods/containers during installation of Orient Me.

IBM Connections Files plugin not working within Notes when TLSv1.2 is enforced

After enforcing TLSv1.2 on our internal Connections 5.5 servers the Files plugin would not work.

In the IHS logs I would see errors such as

[warn] [client 80.229.222.90] [7f9a700a7060] [21173] SSL0222W: SSL Handshake Failed, No ciphers specified (no shared ciphers or no shared protocols). [xx.xx.xx.xx:62899 -> xxx.xxx.xxx.xxx:443] [09:45:11.000102454] 0ms

Enabling trace on IHS showed that the protocol being used was TLSv1.0 which matched Wireshark output. Oddly Status Updates and Activities plugins use TLSv1.2.

"GET /files/basic/api/library/4a7a7240-8f68-44d8-9447-7410cc2bb467/feed?pageSize=300&acls=true&sI=601 HTTP/1.1" 200 168770 TLS_RSA_WITH_AES_128_CBC_SHA TLSV1

I then had to allow TLSv1.0 until I could get an explanation from IBM.

Finally IBM came back with the following two lines to be added to the notes.ini.

SSL_DISABLE_TLS_10=1
DISABLE_SSLV3=1

Now in access_log I see TLSv1.2 being used.

"GET /files/basic/api/library/4a7a7240-8f68-44d8-9447-7410cc2bb467/feed?pageSize=300&acls=true&sI=601 HTTP/1.1" 200 168770 TLS_RSA_WITH_AES_128_GCM_SHA256 TLSV1.2
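To check that no client is still falling back to an older protocol, the access log can be tallied by protocol. This is a minimal sketch which assumes the protocol is the last whitespace-separated field on each line, as in the two log lines shown here; the function name tls_protocol_counts is made up for illustration and the log path depends on your IHS configuration.

```shell
#!/bin/sh
# Hypothetical helper: count access_log lines per TLS protocol.
# Assumes the protocol (for example TLSV1 or TLSV1.2) is the last field.
# Reads the named files, or standard input when no file is given.
tls_protocol_counts() {
    awk 'NF > 0 { count[$NF]++ } END { for (p in count) print p, count[p] }' "$@" |
        LC_ALL=C sort
}

# Example (uncomment and adjust the path to your IHS logs):
# tls_protocol_counts /path/to/access_log
```

Any line still counted under TLSV1 points at a client that has not yet picked up the notes.ini change.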

IBM also suggested that I check the following was set in plugin_customization.ini, which it was.

com.ibm.documents.connector.service/ENABLE_SSL=true

The notes.ini values have been pushed out to my colleagues via Domino policies.

Touchpoint problem due to no search index

A new Connections customer got in touch with a raft of problems after an upgrade to Connections 6. One of them was a problem with Touchpoint which stopped them from completing the onboarding process, so they were repeatedly directed back to Touchpoint. They were able to get two or three screens in, to “Add your interests”, and then they couldn’t go further and had to use “finish later”, or they were faced with “Error during prefetching for step profileTags.”

A quick Google of “profileTags” turned up references to search within Connections. I checked the index (which I hadn’t got around to doing just yet) and I didn’t find INDEX.READY. The search index had not been created due to LTPAToken exceptions which needed the scheduled tasks to be cleared and all clearScheduler.sql scripts run. Once the search index was created Touchpoint worked.

Orient Me and mongoDB connection failures

I have been banging against a mongoDB wall for a good few days as explained in another post but I’m slowly getting there. The problem I was facing was that the migration application in the people-migrate container wasn’t working.

# npm run start migrate
npm info it worked if it ends with ok
npm info using npm@3.10.8
npm info using node@v6.9.1
npm info lifecycle people-datamigration-service@0.0.1~prestart: people-datamigration-service@0.0.1
npm info lifecycle people-datamigration-service@0.0.1~start: people-datamigration-service@0.0.1

> people-datamigration-service@0.0.1 start /usr/src/app
> cross-env NODE_ENV=production node lib/server.js "migrate"

2017-04-20T13:19:56.761Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/relationshipdb?replicaSet=rs0&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-20T13:19:56.766Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/datamigrationdb?replicaSet=rs0&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-20T13:19:56.767Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/profiledb?replicaSet=rs0&readPreference=primaryPreferred&wtimeoutMS=2000
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: getaddrinfo ENOTFOUND mongo-0 mongo-0:27017]
It will be retried for the next request.

/usr/src/app/node_modules/mongodb/lib/mongo_client.js:338
          throw err
          ^
MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: getaddrinfo ENOTFOUND mongo-0 mongo-0:27017]
    at Pool.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/topologies/server.js:327:35)
    at emitOne (events.js:96:13)
    at Pool.emit (events.js:188:7)
    at Connection.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/pool.js:274:12)
    at Connection.g (events.js:291:16)
    at emitTwo (events.js:106:13)
    at Connection.emit (events.js:191:7)
    at Socket.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/connection.js:177:49)
    at Socket.g (events.js:291:16)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at connectErrorNT (net.js:1020:8)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)

If I specify the location of migrationConfig I get the same result.

# npm run start migrate config:/usr/src/app/migrationConfig

Oddly enough, if I run the above command outside of /usr/src/app/ directory it fails. It doesn’t actually read the file you specify, it always looks for migrationConfig in relation to the working directory where you are when you issue it. Of course I may have the syntax wrong but if I do not then it’s a bit sloppy.

On to the problem which seems to be name resolution. The error I was getting was

Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: getaddrinfo ENOTFOUND mongo-0 mongo-0:27017]

It seems to be trying to connect to mongo-0 over 27017.

# kubectl exec -it $(kubectl get pods | grep people-migrate | awk '{print $1}') bash

# ping mongo-0
ping: mongo-0: Name or service not known

# ping mongo
PING mongo.default.svc.cluster.local (10.1.67.163) 56(84) bytes of data.
64 bytes from 10.1.67.163 (10.1.67.163): icmp_seq=1 ttl=63 time=0.063 ms

# ping mongo-0.mongo
PING mongo-0.mongo.default.svc.cluster.local (10.1.67.163) 56(84) bytes of data.
64 bytes from 10.1.67.163 (10.1.67.163): icmp_seq=1 ttl=63 time=0.087 ms

This was the cause: “mongo-0” was not resolving for me, and someone else has confirmed that their container behaves the same way. To work around this I added an entry to the container’s hosts file.

# cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1       localhost
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.1.67.176     people-migrate-4029352936-n8fzl
10.1.67.163     mongo-0 mongo-0.mongo

Now the migration app works but I also have mongo-sidecar errors which I’m not clear on as to whether they are supposed to be there.

Update – 27/04/17

This only gets me so far. It allows me to get the data migrated from Connections Profiles into MongoDB, but when the container is torn down and replaced with another the hosts file entry is gone. Also, there are the following errors in the logs for the itm-services containers, which I cannot exec into to update the hosts file.

Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: getaddrinfo ENOTFOUND mongo-0 mongo-0:27017]
It will be retried for the next request.

/usr/src/app/node_modules/mongodb/lib/mongo_client.js:338
          throw err
          ^
MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: getaddrinfo ENOTFOUND mongo-0 mongo-0:27017]
    at Pool.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/topologies/server.js:327:35)
    at emitOne (events.js:96:13)
    at Pool.emit (events.js:188:7)
    at Connection.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/pool.js:274:12)
    at Connection.g (events.js:291:16)
    at emitTwo (events.js:106:13)
    at Connection.emit (events.js:191:7)
    at Socket.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/connection.js:177:49)
    at Socket.g (events.js:291:16)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at connectErrorNT (net.js:1020:8)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)

Update 28/04/17

During the (excellent) Connections Pink Developer Workshop hosted by IBM we were given access to a SoftLayer server running CentOS 7.3 where we installed CfC and Orient Me. The installer worked just fine with no signs of the mongoDB errors above. I have come across two others who have the same errors I have documented above.

I sparked up a CentOS 7.3 server on Bluemix for a few hours and the install with the same binaries worked just fine. I compared what yum had installed there and installed all of it on my local CentOS 7.3 server, and the same problem occurred. I changed my NIC device name from ens192 to eth0 to match Bluemix but the result was the same.

Update 05/05/17

This week I was lucky to visit the Dublin labs with a customer discussing Watson Workspace, Watson Work Services, XPages and Pink. I used a couple of hours of those two days to have a chat with David McDonagh and a colleague of his Bruno to look into the problems I was having with Mongo.

The crux of it was that the node I was using as the master, boot, worker and proxy was under a great deal of strain, mainly CPU strain, which seemed to be causing the problem. This would make sense since the differences between my ESXi server and Bluemix are the resources available to it.

I bumped up the resources available to the single node but, although the install went OK, the problems persisted. It wasn’t until today that I got it working, not with a single node but with two. Node 1 ran the boot, master and proxy roles whilst node 2 was the worker node. I gave a generous helping of resources to both and thankfully the installation went smoothly and, more importantly, the errors above are no more.

I have some further work to see how much I can scale the resources back because it does have an impact on my ESXi host and the other guests on it.

Orient Me and some things I’ve come across and wrestled with

Having gained some experience of Docker and CfC (IBM Spectrum Conductor for Containers) before Connections 6.0 was released I thought this would be easy to set up but I must admit I’m struggling.

My setup is 3 CentOS servers for Orient Me with another for DB2/SDI and another for Connections hosting the deployment manager.

Here are some things I have come across which I’d like to add to as I come across other problems.

DNS

Working on a beefy ESXi server running at home I normally manage most things using hosts file which has worked really well, up until now. I won’t steal from Roberto Boccadoro’s blog post but suffice to say I couldn’t get it to work using hosts file even after editing nsswitch.conf. I had to rely on spoofing DNS, internally, on my router by updating /jffs/configs/hosts.add to include all my Connections servers.

Even with this I found that the migration script in the people-migrate container would fail, so in this case I had to add my hosts entries to /etc/hosts, which got me past that step.

MongoDB

I had to uninstall and reinstall a couple of times. On reinstall I had problems with the migration application (people-migrate) connecting to mongoDB. I was able to check the databases and connect to them.

# kubectl exec -it mongo-0 bash

# mongo

rs0:PRIMARY> show dbs
admin  0.000GB
local  0.000GB

The migration script was failing to connect and I couldn’t fathom why. I uninstalled again and this time I removed the persistent volumes and recreated them and now the migration script gets further but fails with the following exception.

2017-04-12T12:01:42.751Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/relationshipdb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:42.757Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/datamigrationdb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:42.758Z - info: [migrator] Mongo DB URL: mongodb://mongo-0.mongo:27017/profiledb?replicaSet=rs&readPreference=primaryPreferred&wtimeoutMS=2000
2017-04-12T12:01:54.018Z - info: [migrator] total request number: 1
2017-04-12T12:01:54.021Z - info: [populator] Start to populate URL:
--"https://connections.domain.com/profiles/admin/atom/profiles.do?ps=100"

2017-04-12T12:01:59.417Z - error: [migrator] errors:[{"profileKey":"16ff2775-2ace-4db8-8e54-56adcc62a5fb","externalId":"382AB352-F9AE-D6E4-8025-7D2C004A7248","created":1491998514408,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"8af449b4-0357-4bed-a7c7-c0e5285ba826","externalId":"932ED7B3-988D-9EFC-8625-79E3005B2B62","created":1491998514409,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"a9294f18-ee72-49d0-8a44-cf02abe6d4d2","externalId":"0873E9A9-7E12-0609-8025-7D38003BFD71","created":1491998514410,"orgId":"a","id":"FAKE_ID","error":{}},{"profileKey":"b6994f86-7525-48b6-92da-900393382e11","externalId":"0F64A6F8-927B-483C-8625-79E3005AC781","created":1491998514410,"orgId":"a","id":"FAKE_ID","error":{}}]
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 4 to mongo-0:27017 timed out]
It will be retried for the next request.
Connection fails: MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 5 to mongo-0:27017 timed out]
It will be retried for the next request.

/usr/src/app/node_modules/mongodb/lib/mongo_client.js:338
          throw err
          ^
MongoError: failed to connect to server [mongo-0:27017] on first connect [MongoError: connection 5 to mongo-0:27017 timed out]
    at Pool.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/topologies/server.js:327:35)
    at emitOne (events.js:96:13)
    at Pool.emit (events.js:188:7)
    at Connection.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/pool.js:274:12)
    at Connection.g (events.js:291:16)
    at emitTwo (events.js:106:13)
    at Connection.emit (events.js:191:7)
    at Socket.<anonymous> (/usr/src/app/node_modules/mongodb-core/lib/connection/connection.js:187:10)
    at Socket.g (events.js:291:16)
    at emitNone (events.js:86:13)
    at Socket.emit (events.js:185:7)
    at Socket._onTimeout (net.js:339:8)
    at ontimeout (timers.js:365:14)
    at tryOnTimeout (timers.js:237:5)
    at Timer.listOnTimeout (timers.js:207:5)

Redis client

The knowledge center alludes to how to test connecting to Redis from the Connections node. If you want to install the client and try for yourself, here are the instructions IBM deemed not necessary to write down for you.

# su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-9.noarch.rpm'
# yum install redis

# redis-cli -p 30379
127.0.0.1:30379> set foo bar
OK
127.0.0.1:30379> get foo
"bar"
127.0.0.1:30379>

Odd pod behaviour

I believe I have an underlying problem with the persistent volumes and over night this happened.

# kubectl get pods

zookeeper-controller-3-2528439515-xz702   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-xz79d   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-xzqc9   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-xzzbl   0/1       OutOfpods   0          16h
zookeeper-controller-3-2528439515-z0kwf   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z13kn   0/1       OutOfpods   0          17h
zookeeper-controller-3-2528439515-z2lsn   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z6mc5   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-z74nj   0/1       OutOfpods   0          13h
zookeeper-controller-3-2528439515-z97jp   0/1       OutOfpods   0          17h
zookeeper-controller-3-2528439515-zd2js   0/1       OutOfpods   0          4h
zookeeper-controller-3-2528439515-zdc3t   0/1       OutOfpods   0          14h
zookeeper-controller-3-2528439515-zk5bw   0/1       OutOfpods   0          16h

# kubectl get pods | wc -l
2114

There were thousands of pods. I believe they were created faster than they could be garbage collected.

I deleted all the pods in the “OutOfpods” status using the following command.

# kubectl get pod | cut -d " " -f 1 | xargs -n1 -P 10 kubectl delete pod
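Note that the pipeline above hands every name in the listing, including the NAME header, to kubectl delete, so healthy pods are deleted and recreated as well. A narrower sketch, assuming the status is the third column as in the listing above, filters on OutOfpods first. The function name outofpods_names is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "kubectl get pods" output on standard input and
# print only the names of pods whose STATUS column (third field) is OutOfpods.
outofpods_names() {
    awk '$3 == "OutOfpods" { print $1 }'
}

# Example (uncomment to run against a cluster):
# kubectl get pods | outofpods_names | xargs -n1 -P 10 kubectl delete pod
```

Running kubectl get pods | outofpods_names | wc -l first gives a quick count of what would be removed before anything is deleted.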

Shutdown

To shutdown my servers I have been running the following to stop all pods.

# docker stop $(docker ps -a -q)

I’m not sure whether I would be better off using a variation of the above to stop all pods.

# kubectl get pod | cut -d " " -f 1 | xargs -n1 -P 10 kubectl delete pod

Is there a better prescribed way of doing this?

Enabling profiles events for Orient Me

I did what was asked of me in the knowledge center but there is little indication of it having worked. The documentation states that I should see “OrientMe configured properly – both properties are enabled.” Where should I see that: in SDI’s ibmdi.log or in one of the application servers’ SystemOut.log? I have looked at both and I do not see this written.

Anyway, I’ll hopefully add to this as I go. If anyone has come across these problems and found a resolution to them, please get in touch.