Alert Propagation
To understand how EEM handles events and alerts, it is necessary to define a specific vocabulary so that everyone is talking about the same things.
If an EEM Script is executed on an EEM Robot, the Solution Manager receives a notification about a ScriptExecution.
A ScriptExecution consists of one or more (1-n) ScriptExecutionSteps.
Global Event Rule
In SAP Solution Manager 7.1, the event rule (propagation) between script executions and a script can only be set globally for all scripts. Starting with Solution Manager 7.2, this event rule can be set individually for each script.
ScriptExecutionStep to ScriptExecution
The ScriptExecutionStep is the only object that carries a hard rating in two categories, "Availability" and "Performance". Everything else is just the result of propagation.
For example:
- A LogonStep performed at 12:23:14 / 24th Dec 2012 was successful (200) and fast enough.
- A click in a workflow at 14:28:17 / 12.09.2011 failed with "No service".
Next, we have to take into consideration that a ScriptExecutionStep is, in most cases, not an isolated event. It belongs to a defined ScriptExecution.
For example, the LogonStep from 12:23:14 belongs to a ScriptExecution which started at 12:23:12 and performed several other ScriptExecutionSteps after the logon.
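The relationship between the two objects can be pictured with a small data model. This is only an illustrative sketch and not the Solution Manager's internal representation: the class layout, the numeric rating values, the script name `order_to_cash`, and the robot name `robot_a` are all assumptions, and the date 12.09.2011 is read here as 12 September 2011.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import IntEnum
from typing import List, Optional


class Rating(IntEnum):
    # Assumed encoding: a higher value means a worse status.
    GREEN = 1
    YELLOW = 2
    RED = 3


@dataclass
class ScriptExecutionStep:
    name: str
    executed_at: datetime
    availability: Rating            # hard rating, category "Availability"
    performance: Optional[Rating]   # hard rating, category "Performance"; None = not rated in this sketch
    message: str = ""


@dataclass
class ScriptExecution:
    script: str                     # hypothetical script name
    robot: str                      # hypothetical EEM Robot name
    started_at: datetime
    steps: List[ScriptExecutionStep] = field(default_factory=list)


# The two example steps from the text.
logon = ScriptExecutionStep(
    "Logon", datetime(2012, 12, 24, 12, 23, 14), Rating.GREEN, Rating.GREEN, "200"
)
click = ScriptExecutionStep(
    "Workflow click", datetime(2011, 9, 12, 14, 28, 17), Rating.RED, None, "No service"
)

# The logon step belongs to the execution that started two seconds earlier.
execution = ScriptExecution(
    "order_to_cash", "robot_a", datetime(2012, 12, 24, 12, 23, 12), [logon]
)
```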
Hopefully all ScriptExecutionSteps in the ScriptExecution are green, but what if they are not?
By default you get worst-case propagation, which means that the highest status code of all ScriptExecutionSteps is propagated to the ScriptExecution.
That means a single red-rated ScriptExecutionStep turns the whole ScriptExecution red.
Normally this makes sense, because a click sequence often only produces valid results if it is executed step by step in a fixed order, and every step depends on the one before.
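As a minimal sketch of worst-case propagation, assuming ratings are encoded so that a higher number is a worse status (the `Rating` values and the function name `worst_case` are illustrative choices, not part of the product):

```python
from enum import IntEnum


class Rating(IntEnum):
    # Assumed encoding: a higher value means a worse status.
    GREEN = 1
    YELLOW = 2
    RED = 3


def worst_case(step_ratings):
    """Propagate the highest (worst) step rating to the ScriptExecution."""
    return max(step_ratings)


# One red step is enough to turn the whole execution red.
steps = [Rating.GREEN, Rating.RED, Rating.GREEN]
print(worst_case(steps).name)  # RED
```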
If you want to influence this behavior, you can do so in the Guided Procedure for End User Experience Monitoring in transaction SOLMAN_SETUP. The procedure step "Alerting" allows you to modify the behavior in the global tab.
The best-case method means that a single green-rated ScriptExecutionStep is enough to keep a ScriptExecution green. SAP does not recommend using best-case propagation here, because steps like "logoff" can easily appear green even if the monitored business scenario is down.
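The logoff problem can be illustrated by comparing both methods on the same steps. This is a hedged sketch: the rating encoding, the function names, and the step names other than "logoff" are assumptions made for the example.

```python
from enum import IntEnum


class Rating(IntEnum):
    # Assumed encoding: a higher value means a worse status.
    GREEN = 1
    YELLOW = 2
    RED = 3


def worst_case(ratings):
    return max(ratings)


def best_case(ratings):
    return min(ratings)


# Hypothetical execution: the business steps fail, but the logoff still succeeds.
steps = {
    "logon": Rating.RED,
    "create_order": Rating.RED,
    "logoff": Rating.GREEN,
}

print(best_case(steps.values()).name)   # GREEN, the outage is hidden
print(worst_case(steps.values()).name)  # RED
```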
ScriptExecution to Script
If you look at the Unified Alert Inbox, you will see that we do not alert on ScriptExecutions. The monitored object is a Script, not a single, localized execution of it on one EEM Robot somewhere.
That means we have to look at the last ScriptExecution of the same Script on all Robots and define a rule for propagation of an overall status.
The default for availability is best-case propagation: if the script could be executed from at least one location (EEM Robot), the backend cannot be down.
The default for performance is worst-case propagation from ScriptExecution to Script.
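The two defaults can be sketched side by side. The snippet below assumes that the last ScriptExecution of each robot is already reduced to one availability and one performance rating; the robot names, the rating encoding, and the function `script_status` are hypothetical.

```python
from enum import IntEnum


class Rating(IntEnum):
    # Assumed encoding: a higher value means a worse status.
    GREEN = 1
    YELLOW = 2
    RED = 3


def script_status(latest_by_robot):
    """Combine the last execution of each robot into one Script status.

    latest_by_robot maps a robot name to (availability, performance).
    """
    availability = min(a for a, _ in latest_by_robot.values())  # best case
    performance = max(p for _, p in latest_by_robot.values())   # worst case
    return availability, performance


latest = {
    "robot_a": (Rating.GREEN, Rating.YELLOW),
    "robot_b": (Rating.RED, Rating.RED),
}

availability, performance = script_status(latest)
print(availability.name, performance.name)  # GREEN RED
```

In this example the script stays green for availability because robot_a could still execute it, while the red performance rating of robot_b is propagated to the Script.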
There is another method that makes the infrastructure more robust against temporary fluctuations by combining worst-case and best-case rules (available as of 7.1 SP8).
Depending on the release, the rule is named "Worst rating of last n executions" or "Worst rating of last N best events".
In a first step, it identifies the best of the last n (default n=3) ScriptExecutions on every single EEM Robot for a Script.
That means (with n=3) it takes three yellow or red ScriptExecutions in a row at a particular location/Robot to produce a "local" red/yellow.
Then the local statuses from all EEM Robots are propagated with a worst-case rule. That means that if the situation for a business scenario in one location has proven to be bad for a longer time, it will not be hidden by green statuses from a different continent.
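The two stages of this combined rule can be sketched as follows. This is an illustration under assumptions, not the product's implementation: the rating encoding, the function name, and the robot names are hypothetical, each history is assumed to be non-empty and ordered from oldest to newest, and a robot with fewer than n executions is simply evaluated on what it has.

```python
from enum import IntEnum


class Rating(IntEnum):
    # Assumed encoding: a higher value means a worse status.
    GREEN = 1
    YELLOW = 2
    RED = 3


def worst_of_last_n_best(history_by_robot, n=3):
    """Stage 1: best of the last n executions per robot (local status).
    Stage 2: worst local status across all robots."""
    local = {
        robot: min(history[-n:])  # best case over the last n executions
        for robot, history in history_by_robot.items()
    }
    return max(local.values())    # worst case across robots


history = {
    # A single spike on robot_a is absorbed by the surrounding green runs.
    "robot_a": [Rating.GREEN, Rating.RED, Rating.GREEN, Rating.RED],
    # robot_b has been yellow or red three times in a row.
    "robot_b": [Rating.YELLOW, Rating.RED, Rating.RED, Rating.YELLOW],
}

print(worst_of_last_n_best(history).name)       # YELLOW
print(worst_of_last_n_best(history, n=1).name)  # RED
```

With n=1 the rule degenerates to plain worst-case propagation of the latest executions, which is why the isolated red on robot_a would trigger in that setting.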
As a result, you will get fewer alerts caused by random spikes during monitoring. In the graph below, no alert is triggered because no critical performance issue was persistent enough to warrant deeper analysis or incident handling.