J2EE server or applications fail to start due to file system corruption
SAP NetWeaver 6.40 and higher versions.
On an SAP J2EE cluster, you notice that server nodes or applications fail to start after a deployment, or because of file corruption, manual deletions, and similar causes. This wiki describes the steps to synchronize the file system with the DB contents of the server and eliminate such issues.
Created on: 19/12/2010
Hemanth is a senior software engineer, working at SAP
On the J2EE server you encounter one of the following symptoms:
The contents of one or more server nodes differ from the rest, while all other nodes work fine.
Manual actions have led to files missing from the cluster folder.
The server node / J2EE server fails to start because of a corrupt .jar file, and the default traces contain errors like:
"java.io.EOFException: Unexpected end of ZLIB input stream"
"Caused by: java.lang.NoClassDefFoundError exceptions"
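Before starting a full resync, it can help to confirm which archives are actually damaged. The following is a minimal sketch, not an SAP tool: it walks a directory tree and uses Python's standard zipfile module to flag .jar files that cannot be opened or that fail a CRC check. The function name find_corrupt_jars and the example path are assumptions made for illustration.

```python
import zipfile
import zlib
from pathlib import Path


def find_corrupt_jars(cluster_dir):
    """Return (jar path, reason) pairs for .jar files that fail an integrity check."""
    corrupt = []
    for jar in sorted(Path(cluster_dir).rglob("*.jar")):
        if not jar.is_file():
            continue
        try:
            with zipfile.ZipFile(jar) as archive:
                # testzip() reads every member and returns the first one
                # with a bad CRC or header, or None if all are readable.
                bad_member = archive.testzip()
            if bad_member is not None:
                corrupt.append((str(jar), f"bad member: {bad_member}"))
        except (zipfile.BadZipFile, zlib.error, EOFError, OSError) as exc:
            # Truncated or otherwise unreadable archives end up here.
            corrupt.append((str(jar), f"unreadable archive: {exc}"))
    return corrupt


if __name__ == "__main__":
    # Hypothetical path; substitute the real <SID> and instance number.
    for path, reason in find_corrupt_jars("/usr/sap/<SID>/JC<nr>/cluster"):
        print(path, "->", reason)
```

An empty result does not prove the node is consistent with the DB; it only rules out archives that are unreadable at the ZIP level.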
In an SAP J2EE Engine, all data, including files and binaries, is stored in the database as well as on the file system. Whenever any of the binaries or files change, the DB recognizes the change and synchronizes it with the file system. The synchronization happens in the bootstrap phase of the engine startup. If files become corrupt on the file system, the server node is left in an inconsistent state.
The steps below should resolve the issue. Note, however, that this is a time-consuming process and should be done with the utmost care, especially on production servers.
1) Under the cluster directory (/usr/sap/<SID>/JC<nr>/cluster), rename the dispatcher folder to dispatcher.old.
2) Rename server0 and each subsequent server node folder by appending ".old" (for example, server0.old). The intention of steps 1 and 2 is to create a complete backup of the existing files. Do not rename the bootstrap folder, as it is needed for the sync process.
3) Edit the bootstrap.properties file so that it contains element.resynch=force, as per SAP Note 710663.
4) Restart the J2EE Engine. The new dispatcher/server folders are created during bootstrapping. This is a time-consuming process, because all the contents of the server nodes are synced with the DB content. All the file system content is now newly copied from the DB.
5) Remove the element.resynch=force property or change it to element.resynch=detect which is the default setting.
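The file-level parts of these steps (the renames in steps 1 and 2, and the property change in steps 3 and 5) can be scripted. The following is a minimal sketch under stated assumptions: the helper names backup_node_dirs and set_resynch are hypothetical, the location of bootstrap.properties is passed in by the caller because it is not given above, and the engine is assumed to be stopped before the folders are renamed.

```python
import re
from pathlib import Path

# Only dispatcher and serverN folders are backed up; bootstrap must stay in place.
NODE_DIR = re.compile(r"^(dispatcher|server\d+)$")


def backup_node_dirs(cluster_dir):
    """Rename dispatcher and serverN folders to <name>.old and return the new names."""
    renamed = []
    for entry in sorted(Path(cluster_dir).iterdir()):
        if entry.is_dir() and NODE_DIR.match(entry.name):
            target = entry.with_name(entry.name + ".old")
            if target.exists():
                raise FileExistsError(f"{target} already exists; move it away first")
            entry.rename(target)
            renamed.append(target.name)
    return renamed


def set_resynch(properties_file, value):
    """Set element.resynch=<value>, replacing any existing entry for that key."""
    path = Path(properties_file)
    # latin-1 round-trips every byte, which suits Java .properties files.
    lines = path.read_text(encoding="latin-1").splitlines()
    kept = [ln for ln in lines if not re.match(r"\s*element\.resynch\s*=", ln)]
    kept.append(f"element.resynch={value}")
    path.write_text("\n".join(kept) + "\n", encoding="latin-1")
```

A possible sequence would be to call backup_node_dirs on the cluster directory and set_resynch(..., "force") before the restart in step 4, then set_resynch(..., "detect") once the nodes are back up. The restart itself is left to the usual SAP tooling.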