Issue with Connections media widget timing out

Whilst building a new Connections environment for a customer we noticed a strange issue when uploading large files to the media gallery.

Initially I thought it was related to the size of the file, but the same file uploads to the Connections Files application without issue. There were very few errors in the SystemOut.log for the Connections server, so I was baffled.

A PMR was opened, with the very helpful Dave McCarthy as the PMR owner, and we started our investigation. During the testing I noticed that the uploads appeared to time out after 20 mins – exactly 20 mins. After some experimenting on 4 different Connections systems, it was confirmed that it was a timeout, regardless of the file policy or file library size. Admittedly, not many people are on an internet connection where a video takes 20 mins to upload, but we know it is a real issue, as the customer I was building the system for confirmed it.

After much digging through existing PMRs Dave was stumped, so the PMR was passed up the chain to the development team, who confirmed very quickly that there is a setting in config.js, buried in the News EAR file, with a timeout set to 1200 sec (20 minutes)!! Change this setting and, as if by magic, the timeout issue is resolved.

To change the timeout setting, do the following:

/installedApps/< cell name >/News.ear/qkr.lw.war/WEB-INF/pages/js/config.js

Find the section (around line 450) that specifies some timeout values,
including one for upload that is set to 1200 sec (20 minutes):

timeout: {
request: 60,
update: 200,
upload: 1200,
retrieveFiles: 100,
userSearch: 200,
userTypeahead: 10
},

Raise the upload value from 1200 to whatever is needed to complete the large file upload at your connection speed, and save the file. Then restart the News application to make the change take effect. If you have more than one Connections node, change this file on the primary node and sync the change to the other nodes.
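
For example, to allow uploads of up to an hour, the edited block would look like this (3600 is purely an illustrative value – size it to your users' connection speeds):

timeout: {
request: 60,
update: 200,
upload: 3600, // raised from 1200 (20 mins) to 3600 (60 mins)
retrieveFiles: 100,
userSearch: 200,
userTypeahead: 10
},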

Weird issue with TDI Connections Wizards

I noticed a weird issue with the TDI Connections population wizard today.

Originally we had TDI 7.1 installed for some specific issues that were addressed when syncing different LDAPs together – that worked, and the Connections population wizard and all the scripts worked a treat (good news if you want to use TDI 7.1).

Now, due to one thing and another, we took TDI 7.1 off and put TDI 7.0.0.5 back on the machine – BUT if you do not replace the Connections Wizards directory, you will get issues.

The GUI DB population wizard runs, everything looks good, you can fill all the info in and it will appear to do things, but then it reports zero records added – build successful!! I was puzzled: I could connect to the LDAP OK – an LDAP search reported back OK, the DB connections were all OK – so what was going on?

The main issue is that once you have run the wizard over a TDI 7.1 install, the Derby DB inside the Wizards directory is upgraded to a newer version. If you then downgrade TDI and run the same population wizard, TDI throws an error as soon as it attempts to iterate at all, as it cannot read the internal Derby DB – reporting that it is at a newer version and is not compatible.

The only way I discovered this was to run the collect_dns job on its own and watch the TDI log. If you tail the TDI log while the wizard is running, you cannot catch the error, because the wizard runs multiple jobs and they whizz past so quickly!!
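
A rough sketch of that diagnostic run (the paths are illustrative – adjust for wherever you extracted the Wizards TDI solution directory, and for your TDI log location):

cd /opt/IBM/Connections/Wizards/TDIPopulation/TDISOL/linux   # hypothetical extract location
./collect_dns.sh                                             # run just the one job
tail -f logs/ibmdi.log                                       # watch for the Derby "newer version" error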

So it's an easy fix – delete the existing Wizards directory, re-extract it and attempt again – and of course it will work.

I am guessing the moral of the story is: use TDI 7.0.0.5 unless you HAVE to use 7.1, and try not to have to roll it back 🙂

Issues with TAM and Connections – SOLVED

For those of you who follow me on Twitter, you will know that I have had huge issues with Connections and TAM integration.
I am pleased to report that the issue is now resolved – instructions below:

Created the transparent junctions as per the info center
Created the ACL definitions as per the info center
Created the default ACL – connectionsdefaultacl – and attached it to the junctions as per the info center
Created an additional ACL – connectionsacl – as per the info center
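
A minimal pdadmin sketch of those junction/ACL steps (the WebSEAL instance name, hostnames and back-end port are illustrative – the info center has the authoritative commands):

pdadmin -a sec_master -p < password >
pdadmin> server task default-webseald-tam.your.domain.com create -t tcp -h < connections.http.server > -p 80 -x /activities
pdadmin> acl create connectionsdefaultacl
pdadmin> acl create connectionsacl
pdadmin> acl attach /WebSEAL/tam.your.domain.com-default/activities connectionsdefaultacl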

Resources that do not require authentication, which should have connectionsacl applied:

/activities/images – Information present in the Lotus Connections wiki but not the official IBM Infocenter documentation.
/files/basic/anonymous/atom – Information present in the Lotus Connections wiki but not the official IBM Infocenter documentation.
/files/form/anonymous/atom – Missing from ALL official IBM documentation

Resources that require basic authentication, which should have connectionsacl applied:

/blogs/blogsapi – Information present in the Lotus Connections wiki but not the official IBM Infocenter documentation.
/blogs/blogsfeed – Information present in the Lotus Connections wiki but not the official IBM Infocenter documentation.
/communities/dsx – Missing from ALL official IBM documentation
/profiles/dsx – Missing from ALL official IBM documentation
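
Attaching connectionsacl to one of those resources looks like this hedged example (the object space path depends on your WebSEAL host and instance names):

pdadmin> acl attach /WebSEAL/tam.your.domain.com-default/communities/dsx connectionsacl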

Applied forms authentication to the resources that require it, which should have connectionsdefaultacl applied, as per the info center
Created the dynurl file as per the info center and applied connectionsacl to /blogs/blogsfeed and /blogs/blogsapi
Edited the WebSEAL config, adding dynurl-allow-large-posts = yes, forms-auth = https (or both), and use-same-session = yes (see the stanza sketch after this list)
Added the filter types as per the info center
Added the FQDN of the load-balanced TAM server virtual host – web-host-name = tam.your.domain.com
Imported the connectionsAdmin user into TAM via the Web Portal Manager or pdadmin (see the pdadmin sketch after this list) – this step is missing from ALL official IBM documentation
Updated the LC config file:
set dynamic host enabled to "true" and the href/ssl_href to the FQDN of the load-balanced TAM server virtual host, e.g. my.city.ac.uk
ensured that the static href, static ssl_href and interService URLs for all services point at the WebSEAL cluster, e.g. my.city.ac.uk
set the custom authenticator to TAMAuthenticator and checked the timeout settings as per the info center
Configured the Lotus Connections directory service extensions to point to the Tivoli Access Manager server, i.e. set the extension hrefs to:
http://tam.your.domain.com/communities/dsx/ & http://tam.your.domain.com/profiles/dsx/
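
In the webseald-< instance >.conf itself, those keys live in separate stanzas – roughly as below (a sketch from memory, so verify the stanza names against your own file):

[server]
dynurl-allow-large-posts = yes

[forms]
forms-auth = https

[session]
use-same-session = yes

The connectionsAdmin import is equally short from pdadmin (the DN is illustrative – use the actual DN of connectionsAdmin in your LDAP):

pdadmin> user import connectionsAdmin cn=connectionsAdmin,ou=users,dc=your,dc=domain
pdadmin> user modify connectionsAdmin account-valid yes

And for reference, the dynamic host and custom authenticator edits in LotusConnections-config.xml look roughly like this (a sketch assuming the standard LC 2.5 schema, with placeholder hostnames):

<dynamicHosts enabled="true">
    <host href="http://tam.your.domain.com" ssl_href="https://tam.your.domain.com" />
</dynamicHosts>
...
<customAuthenticator name="TAMAuthenticator" />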

Lotus Connections applications will attempt to open server-to-server communications with other Lotus Connections applications via Tivoli Access Manager. If forms-auth has been set to https in the webseald-< instance >.conf file, then the signer certificate for WebSEAL client-side SSL communications should be added to the WebSphere trust stores – missing from ALL official IBM documentation
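
One hedged way to pull that signer in is wsadmin's retrieveSignerFromPort task (the key store name here is the common default and the certificate alias is made up – adjust both for your cell):

AdminTask.retrieveSignerFromPort('[-keyStoreName CellDefaultTrustStore -host tam.your.domain.com -port 443 -certificateAlias webseal_signer]')
AdminConfig.save()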

Added the log out button to the HTTP server rewrite config / HTTP config (depending on the setup)
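
If you go the rewrite route, the usual trick is to map the WebSphere form-logout URL onto WebSEAL's pkmslogout page – a hedged mod_rewrite sketch (pattern from memory, so test it against your own URLs):

RewriteEngine On
RewriteRule ^/(.*)/ibm_security_logout(.*)$ /pkmslogout [NE,R,L]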

Big thanks to Stephen Swann for the assist (@stephenjswann) – It is now deployed live and working as expected

Issues with Oracle with Connections 2.5 – RESOLUTION

As posted by me on the Connections Blog earlier today:

IBM have now released new trigger code to resolve this issue.

The steps are simple:

* Stop the application
* Backup the DB
* Run through the code to remove the trigger
* Recreate the trigger (a sketch of these two steps follows this list)
* Start the primary server and test
* Assuming all is well start the other App Servers in the clusters
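
A hedged sketch of the remove/recreate pair, using the trigger name from the error log in the original post below – the replacement trigger body itself comes from IBM, so do not improvise it:

-- as the FILES schema owner, after stopping the application and backing up the DB
DROP TRIGGER FILES.MED_DOWNLOAD_UPD_S;
-- then run the CREATE OR REPLACE TRIGGER script supplied by Lotus Connections Support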

As yet IBM are unsure how they are going to release the fix, as it falls outside the typical iFix scope.
If you are seeing this specific issue please contact the Lotus Connections Support team who will furnish you with the appropriate code to resolve the issue.
As soon as I have confirmation on how this will be distributed I will add what will hopefully be the final update to this on-going saga.

Big thanks to Kieran Reid at IBM and Andrew Frayling and his team at Cardiff Uni for assistance and support in resolving this issue. Great work all round.

Issues with Oracle on Solaris with Connections 2.5 – UPDATE

After some testing with the SPARC version of this fix – which actually did work – we were pleased to find out that Oracle had released a version for x86.

We applied it this morning, and I am sorry to say it doesn't work. If you try to delete a file from the DB directly or through the Connections interface, the DB still throws the mutating trigger error.

The plot thickens – time to go back to Oracle 🙂

Issues with Oracle on Solaris with Connections 2.5

There is an issue when running Connections with Oracle on Solaris.
Symptoms of the problem are that you cannot delete certain files and/or the Files widget from communities.

The error in the logs is – table FILES.MEDIA is mutating

[08/02/10 00:01:00:569 GMT] 0000005d Library E EJPVJ9166E: Unable to delete the library with id b855660b-d6bc-4b19-891f-2087aa3d9a0c. [UserImpl@26ce26ce id=64377ea3-e571-4323-922a-dc0723fead36 directoryId=2BE4B3FF-4AB4-48FF-9B83-73689537A16A]
java.sql.SQLException: ORA-04091: table FILES.MEDIA is mutating, trigger/function may not see it
ORA-06512: at "FILES.PKG_MED_DOWNLOAD_UPD", line 45
ORA-06512: at "FILES.MED_DOWNLOAD_UPD_S", line 2
ORA-04088: error during execution of trigger 'FILES.MED_DOWNLOAD_UPD_S'

at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)

We have since discovered (thanks to Kieran Reid in Connections Support for doing the leg work) that this is an issue with Oracle 10.2.0.4 on Solaris – the triggers have an issue which is fixed in 10.2.0.5, which is a big no-no as far as Connections goes. There is a fix that you can apply to 10.2.0.4 that will resolve the problem.

From support.oracle.com search the knowledge base for 4574851
You should get three results, select the third match
Click on the link for Patch.4574851
Select the 10.2.0.4 release for the Solaris platform
Download, install and test – a typical OPatch run is sketched below.

*NOTE* this fix is only available for SPARC, not x86
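
A hedged sketch of applying the patch with OPatch (paths and the unzip directory are illustrative – follow the README that ships with the patch, and shut the database and listener down first):

cd /tmp/patch_4574851                                    # wherever you unzipped the download
export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1   # hypothetical Oracle home
$ORACLE_HOME/OPatch/opatch apply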

So far this appears to have fixed the issue on the backup of the prod database (I have put a stand-alone LC25 in front of it to test, which involved all sorts of DB hacking to get it to work – not recommended unless you are desperate for a quick test). I am hoping to schedule moving our prod DB from x86 to SPARC, applying the patch, and then plugging my LC25 cluster into it.

Portal won’t start ??

I have come across a horrible little *feature* that occurs sometimes with WebSphere Portal, where the server fails to start and writes absolutely nothing to the log … just a tad annoying when you are trying to work out why it didn't start in the first place.

Sometimes this is due to a tranlog problem, which is pretty straightforward to resolve:
back up / delete the directory –
< profile root >/tranlog/cellname/nodename/servername

Normally this will do the trick and a restart works – if it doesn't, do the following:
rename the log folders (or delete them) under

< profile root >/logs

To fix the issue that I had seen, I had to rename the ffdc, nodeagent and Portal server log directories.
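
A hedged shell sketch of the whole sequence (the profile path, cell/node and server names are illustrative – match them to your own environment):

PROFILE=/opt/IBM/WebSphere/wp_profile                                # hypothetical profile root
mv $PROFILE/tranlog/myCell/myNode/WebSphere_Portal \
   $PROFILE/tranlog/myCell/myNode/WebSphere_Portal.bak               # back up / remove the tranlog
mv $PROFILE/logs/ffdc $PROFILE/logs/ffdc.bak                         # then, if still stuck, the log dirs
mv $PROFILE/logs/nodeagent $PROFILE/logs/nodeagent.bak
mv $PROFILE/logs/WebSphere_Portal $PROFILE/logs/WebSphere_Portal.bak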

Restart the servers in question and, as if by magic, it starts 🙂

I have experienced this at Portal 6.1.5, but I have had reports that it is also a problem on 6.1.0.2.

Connections 2.5 Clustering – how to avoid some pain

All was going exactly to plan when I installed my primary node – it federated correctly, worked as expected, and I even managed to change it fairly easily to point to a different DB and shared content store. I was a very happy bunny UNTIL I decided to add node 2 – then it all went “pear shaped”.

So here is a quick overview of the issue and how I have got around it – but I really want to know how this happened and whether I can do anything to prevent it in the future. I have a PMR open and IBM are trying to recreate the issue now.

I created node 1 using the Connections install wizard to create a primary node – I supplied the DB info (jdbc:oracle:thin:@< my Original DB server name >:1521:conn1) and file system info (//< my Original File server name >/LotusConnectionsData/< featureName >), and it clustered successfully and node 1 was fine.

I then moved the DB to another machine and also moved the file system. I edited the data source info at cluster and server level (jdbc:oracle:thin:@< my NEW DB server name >:1521:conn1) and also changed the file system (< my NEW File server name >/portal_collabdata$/< featureName >) in the WebSphere variables section of the ISC, as per the instructions in the info center. Node 1 has always worked as expected, even after moving these.

When I added any subsequent node, it configured the server with the original file store information (//< my Original File server name >/LotusConnectionsData/< featureName >) and defaulted back to the original DB data source (jdbc:oracle:thin:@< my Original DB server name >:1521:conn1).

If I change these manually, resync and restart the servers, they work as expected. The datasource, although it is set at cluster level, is also set at server level – I had to change the datasource EVERYWHERE to fix the issue (as I have 4 servers per machine and 4 machines, that is a lot of editing).

This has prompted me to put the following observation and questions to IBM:

The WebSphere variables for the file stores are also picking up the original path – it appears that when node 1 was federated and the config was created, some kind of *template* was made, from which further nodes/servers are created. As I have changed the config, the template is not getting updated (if this is how it is doing it).

Am I doing anything wrong?
If so, what?
And if not, how do I prevent this from happening in the future?

== IBM’s Response ==
I received an email back from IBM regarding the issues that I experienced after changing some settings in my cluster. The bad news is that it is a limitation; the good news is that they are going to fix it:

The customer is right, this is a limitation in the LC 2.5 install and is being addressed for the next release.

In LC 2.5, variables/datasources/providers/etc are created at the server level, then this is used as a template for additional servers…
The problem is that server level settings like this override higher (node, cluster, cell) level settings, causing the difficulty updating that the customer experienced.
Ideally, these settings would be at cluster level.

Since the customer has this working, they do not have to change anything, but, if they wish to simplify future changes they can do the following:

1. create cluster level variables, datasources, providers, etc
2. [optional… for testing] create a new node — this node will have all the server level settings by default
3. only if you did 2… delete the server level settings for the items you created at cluster level in step 1
note: if you don’t delete the server level settings for this new node, it would continue to use the server level settings
4. only if you did 2… test that the applications deployed on the new node behave correctly (basically you are verifying the cluster level settings)
5. after verifying (or reviewing) the cluster level settings (variables, datasources, etc), you can delete the server level items corresponding to the new cluster level items
note: if you don’t delete the server level settings for this new node, it would continue to use the server level settings
6. now, when you make changes to the cluster level variables thru the deployment manager, you just need to save changes and synchronize nodes
all the nodes and servers that don’t have node or server level instances of the same variables will get the cluster level values

Again, the order of precedence for finding variables, datasources, etc is….
first, is it defined for the Server? If yes, the server level item is used
second, is it defined for the Node?
third, is it defined for the Cluster?
fourth, is it defined for the Cell?
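
Given that precedence order, a quick way to see exactly where a variable is being picked up from is a wsadmin (Jython) audit along these lines – a sketch, with the 'FILES' filter purely illustrative (the config id printed for each entry shows its scope: cells/…, nodes/…, servers/…, clusters/…):

# run via: wsadmin -lang jython -f auditVariables.py
# list every scope at which each WebSphere variable is defined
for entry in AdminConfig.list('VariableSubstitutionEntry').splitlines():
    name = AdminConfig.showAttribute(entry, 'symbolicName')
    if name.find('FILES') != -1:    # e.g. the shared content store variables
        print name, '=', AdminConfig.showAttribute(entry, 'value')
        print '    defined at:', entry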