
Product versions:
Business Objects Enterprise XI 3.1

Problem

Cause and Resolution

Reference Id

Replication Jobs remain in a Pending state

Symptom :
A Replication Job is in a Pending state and never gets to run even though nothing obvious is blocking it. When you look at the instance details, you notice the Next Run Time is 30 minutes after the time you originally set.
Cause 1 :
This may occur when several jobs are scheduled to run with the same origin site at the same time. Out of all these concurrent jobs, only one job will execute and lock the origin site (see replication lock in How Federation XI 3.1 works ?), while the others will fail and get postponed by the ThrottleRetryInterval time. By default, it is set to 30 minutes.
See "One replication job at a time" section in How Federation XI 3.1 works ?
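The throttling behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual scheduler logic: it assumes every job is scheduled at the same moment, every job runs for the same fixed duration, and a job that finds the origin site locked simply tries again one retry interval later. The function name and the job duration parameter are hypothetical.

```python
import heapq


def simulate_start_times(job_count, job_duration_s, retry_interval_s=1800):
    """Toy model: return each job's start time in seconds after the schedule time.

    Assumptions (not taken from the product): all jobs are scheduled at t=0,
    all jobs take job_duration_s seconds, and ties are resolved by job index.
    """
    if retry_interval_s <= 0:
        raise ValueError("retry interval must be a positive number of seconds")

    # Each entry is (time of next attempt, job index).
    attempts = [(0, job) for job in range(job_count)]
    heapq.heapify(attempts)

    lock_free_at = 0  # time at which the origin-site lock is released
    start_times = [None] * job_count

    while attempts:
        now, job = heapq.heappop(attempts)
        if now >= lock_free_at:
            # The lock is free: this job runs and holds the lock until it ends.
            start_times[job] = now
            lock_free_at = now + job_duration_s
        else:
            # The lock is held: the job is postponed by the retry interval.
            heapq.heappush(attempts, (now + retry_interval_s, job))

    return start_times
```

In this model, three 10-minute jobs with the default 1800-second interval start at 0, 1800 and 3600 seconds. With the interval reduced to 180 seconds they start at 0, 720 and 1440 seconds, which shows how a shorter interval reduces idle time without ever letting two jobs overlap.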
Solution :
It is important to understand that jobs against the same origin site will never execute simultaneously. However, you can shorten the wait time between two jobs by decreasing the retry interval. To do so, edit the Adaptive Job Server command line and add the following parameter to the existing javaArgs:
Dbobj.federation.ThrottleRetryInterval=<value in seconds> (1800 by default)
Example

-trace -javaArgs Dbobj.logging=true,Xmx512M,Dbobj.federation.ThrottleRetryInterval=180
-loggingPath "C:/Xi3_federation/BusinessObjects Enterprise 12.0/Logging/" ...
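If the command line has to be edited on several servers, the change can be scripted. The following is a minimal sketch, assuming the javaArgs value is a single comma-separated token without spaces, as in the example above. The helper name set_throttle_retry_interval is hypothetical and is not part of any BusinessObjects tool; it only manipulates the command line as a string.

```python
import re

THROTTLE_KEY = "Dbobj.federation.ThrottleRetryInterval"


def set_throttle_retry_interval(command_line: str, seconds: int) -> str:
    """Return command_line with the throttle retry interval set in javaArgs.

    Assumes the -javaArgs value is one comma-separated token with no spaces.
    """
    if seconds <= 0:
        raise ValueError("retry interval must be a positive number of seconds")

    match = re.search(r"(-javaArgs\s+)(\S+)", command_line)
    if match is None:
        # The article adds the parameter to existing javaArgs, so none is created.
        raise ValueError("no -javaArgs section found in the command line")

    # Drop any previous setting of the parameter, then append the new value.
    args = [
        arg for arg in match.group(2).split(",")
        if not arg.startswith(THROTTLE_KEY + "=")
    ]
    args.append(f"{THROTTLE_KEY}={seconds}")

    return (
        command_line[:match.start(2)]
        + ",".join(args)
        + command_line[match.end(2):]
    )
```

Calling the helper on a command line that already contains the parameter replaces the old value instead of adding a duplicate.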

Cause 2 :
This may occur when a replication lock was not released properly, for instance when a job server terminates unexpectedly during a job execution. If tracing is on in the Adaptive Job Server, you will see the following error:
The property SI_CUID of info object with id number X is not unique.
Solution :
To correct the problem:

  1. Navigate to the Temporary Storage folder
  2. Find an object called "Replication Lock for <OriginCMS>", where <OriginCMS> is the CMS name as entered in the Remote System Connection. The object will be located in one of the subfolders of Temporary Storage.
  3. Delete the object.
  4. Return to the Replication Job's History.
  5. Either wait for the retry interval to elapse (30 minutes by default) or delete the instance and reschedule the job.

N/A

Objects fail to replicate with the error message : "Not replicating object to destination: object was already replicated to origin from site "TESTAFIESTA3.product.businessobjects.com""

Symptom :
A replication job succeeds but some objects fail to replicate with the following error message :
"Not replicating object to destination: object was already replicated to origin from site "TESTAFIESTA3.product.businessobjects.com""
Cause :
You are trying to replicate an object that is itself replicated from a different origin. This falls under the multiple hops scenario that is not supported in XI 3.1. See How Federation XI 3.1 works ? for more details.
Solution :
You must replicate that object in a separate replication job from the origin site.

N/A

Replication Jobs fail with error message "Unexpected exception caught. Reason: [java.lang.OutOfMemoryError : Java heap space]"

See the tuning parameters in Tuning Federation XI 3.1 for performance and heavy load

N/A

Apparent signs of a memory leak on the Adaptive Job Server child process (java.exe)

Symptom: The memory consumption of the java process gradually increases as it processes replication jobs and never decreases over time. This might lead to jobs failing with an OutOfMemoryError.
Cause: The root cause is a leak in Apache Derby, a third-party library used for caching during the replication process: https://issues.apache.org/jira/browse/DERBY-1947.
Solution: See the tuning parameters in Tuning Federation XI 3.1 for performance and heavy load

 

Error: "Invalid Batch ID"

This can occur with WebLogic when committing a large number of objects. The Object Manager cannot stream the objects to the CMS quickly enough for the CMS to commit them within the CMS timeout window, which is 10 seconds by default. So far this problem has only been seen on WebLogic, not on any other Java application server.
To fix this:
Add -BatchExpirySeconds <seconds> to the CMS command line.
If you experience this error, try increasing the value to 30 seconds. If the error still occurs, increase it further as required.
Feature request for Aurora:
Searching the log is currently a problem. It should be possible to paginate through the logs.

N/A

CMS Query Timeout

"SDK error occurred at the origin site. CMS operation timed out after 9 minutes. The server supplied the following details: OCA_Abuse exception 7710 at [.\exceptionmapper.cpp : 79] 48022 {9} ...CMS operation timed out after 9 minutes. Thread timeout has expired."
When replicating a large number of objects (error occurred at approximately 300k objects, though this may vary depending on your environment), Federation will send queries to the CMS that require more than 9 minutes to be executed. The current default value for CMS thread timeout is 9 minutes.
To fix this, increase the CMS thread timeout registry value.
The registry key is called "ThreadTimeoutAfterNMinutes" and the value is interpreted in minutes.
It can be located at "HKEY_LOCAL_MACHINE\SOFTWARE\Business Objects\Suite 12\CMS\Instances\<CMS_NAME>.cms"
Try increasing it to "60" minutes, though further adjustment might be needed depending on your environment.
In addition, you should increase the connection timeout used by the SDK so that it is greater than the CMS thread timeout.
The registry key is called "ConnectionTimeout" and the value is interpreted in milliseconds.
It can be located at "HKLM\Software\Business Objects\Suite 12\CER"
Try increasing it to 61 minutes, or "3660000" milliseconds, though further adjustment might be needed depending on your environment.
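Because the two values use different units (minutes for ThreadTimeoutAfterNMinutes, milliseconds for ConnectionTimeout), it is easy to get the conversion wrong. The sketch below shows the arithmetic only; it does not read or write the registry. The function names and the one-minute margin are assumptions chosen to match the 60/61 minute example above.

```python
MS_PER_MINUTE = 60 * 1000


def connection_timeout_ms(thread_timeout_minutes: int, margin_minutes: int = 1) -> int:
    """Return a ConnectionTimeout value (ms) that exceeds the CMS thread timeout.

    The one-minute default margin is an assumption based on the example in
    this article (60 minutes thread timeout, 61 minutes connection timeout).
    """
    if thread_timeout_minutes <= 0 or margin_minutes <= 0:
        raise ValueError("timeouts and margin must be positive")
    return (thread_timeout_minutes + margin_minutes) * MS_PER_MINUTE


def is_consistent(thread_timeout_minutes: int, connection_timeout_millis: int) -> bool:
    """Check that the SDK connection timeout is greater than the thread timeout."""
    return connection_timeout_millis > thread_timeout_minutes * MS_PER_MINUTE
```

For a thread timeout of 60 minutes, connection_timeout_ms(60) gives 3660000, which matches the value suggested above.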

N/A
