Introduction
Document Status
The Diagnostics Agent troubleshooting guide used to be available as a PDF document and has been replaced by this wiki page. If the content of this page does not address your problem, refer to the former PDF document, which is still available as the attachment DiagnosticsAgentTroubleshooting.pdf to SAP Note 1833501.
Diagnostics Agent Installer
For more general information on installation strategy, installer versions, etc., please refer to Diagnostics Agents.
Troubleshooting Resource Consumption and Booting Issues of the Diagnostics Agent
Verify the Prerequisites on the host
Verify that the "Hosts File" format has been defined correctly on the affected host, following the Mandatory Standard. See https://wiki.scn.sap.com/wiki/x/i4IjIQ
Check OS Settings on Linux Hosts (ulimit and umask)
After installing the Diagnostics Agent on UNIX platforms, double-check the following OS environment settings for the Diagnostics Agent OS user (often daaadm):
ulimit
- Log on with the agent OS user, e.g.
su - daaadm
- Open an sh shell with the command
sh
- Execute the command
ulimit -a
Compare the values with the recommendations in the Installation & Setup Guide for the most recent installer (see SAP Note 1833501).
If you change the limits, remember to stop the Diagnostics Agent (sapstop), kill the Diagnostics Agent processes (sapstartsrv) and start the Agent again (sapstart). In this way the Diagnostics Agent will take the new settings into account.
If the limits for the Diagnostics Agent OS user are not set correctly, high CPU consumption or out-of-memory errors can result.
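The comparison described above can be scripted. The following is a minimal sketch only: check_limit is a hypothetical helper, and the threshold of 1024 open files is a placeholder, not an SAP recommendation. Take the real minimum values from the Installation & Setup Guide.

```shell
#!/bin/sh
# Hypothetical helper: compares a limit value against a required minimum.
# "unlimited" always passes. Prints OK or LOW and returns 0 or 1.
check_limit() {
    name=$1; current=$2; required=$3
    if [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]; then
        echo "OK   $name: $current (required >= $required)"
    else
        echo "LOW  $name: $current (required >= $required)"
        return 1
    fi
}

# Run as the agent OS user (e.g. daaadm). The value 1024 is a placeholder
# chosen for illustration, not a value taken from SAP documentation.
check_limit "open files" "$(ulimit -n)" 1024 ||
    echo "Adjust the limit, then restart the Diagnostics Agent."
```

Further limits reported by ulimit -a can be checked the same way by calling check_limit once per value.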
umask
Please refer to SAP note 1163751.
Trigger Thread Dumps for Diagnostics Agents
It is important to take thread dumps of the Diagnostics Agent process in order to understand the problem context in case of a deadlock, an unresponsive Diagnostics Agent, or high CPU consumption.
Trigger a Thread Dump on Windows
The Diagnostics Agent is an SAP system and available in SAP MMC.
In SAP MMC, open the node “AS Java Process Table” of the Diagnostics Agent identified by the SID and instance number. Select the desired Agent Node and right click on it to display the context menu, then click the item “Dump Stack Trace”.
Trigger a Thread Dump on Unix
For Diagnostics Agents 7.1x, 7.2x, 7.3x, or those installed with SWPM 1.0 SP 3 and higher, log on to a UNIX shell as the OS user of the Diagnostics Agent, navigate to the folder /usr/sap/<SID>/SMDA<instance number>/exe and execute the command:
./sapcontrol -user <SID>adm <PWD> -nr <Instance number> -function J2EEControlProcess smdagent DumpStackTrace
Example:
./sapcontrol -user daaadm abc123 -nr 98 -function J2EEControlProcess smdagent DumpStackTrace
Alternatively, a thread dump can be triggered with "kill -3 <pid>", where <pid> is the process ID of the Diagnostics Agent's Java process.
Where Is the Thread Dump Stored?
The thread dump is written to the file /usr/sap/<SID>/SMDA<instance number>/work/std_SMDAgent.out, provided that the agent is running on an SAP JVM.
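Because std_SMDAgent.out also contains other output, it can help to isolate the dumps. The sketch below assumes that each dump starts with a HotSpot-style header line containing "Full thread dump"; verify this against your own file. The function names are hypothetical, and the path uses the example SID DAA and instance number 98.

```shell
#!/bin/sh
# Counts the thread dumps in a file (assumed marker: "Full thread dump").
count_thread_dumps() {
    grep -c 'Full thread dump' "$1" || true
}

# Prints everything from the last dump header to the end of the file.
# Output written after the dump is included as well.
last_thread_dump() {
    awk '/Full thread dump/ { buf = ""; seen = 1 }
         seen               { buf = buf $0 "\n" }
         END                { printf "%s", buf }' "$1"
}

out=/usr/sap/DAA/SMDA98/work/std_SMDAgent.out
if [ -f "$out" ]; then
    echo "dumps found: $(count_thread_dumps "$out")"
    last_thread_dump "$out" > /tmp/last_thread_dump.txt
fi
```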
Trigger Heap Dumps for Diagnostics Agents
See Using JVMMON to Trigger a Heap Dump
Verbose Class Loading
To enable verbose logging of class loading in the Diagnostics Agent, add -verbose:class to the property smdagent.javaParameters in /usr/sap/<SID>/SMDA<instance number>/SMDAgent/smdagent.properties. Restart the Diagnostics Agent and find the verbose class loading information in /usr/sap/<SID>/SMDA<instance number>/work/jvm_smdagent.out.
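The property change can be sketched as a small filter. The function add_java_param is a hypothetical helper, and the path uses the example SID DAA and instance number 98. The sketch writes the result to a separate file for review instead of editing in place.

```shell
#!/bin/sh
# Reads a properties file on stdin and appends the flag given in $1 to the
# smdagent.javaParameters line, unless the flag is already present.
add_java_param() {
    awk -v flag="$1" '
        /^smdagent\.javaParameters=/ {
            n = split(substr($0, index($0, "=") + 1), tok, " ")
            found = 0
            for (i = 1; i <= n; i++) if (tok[i] == flag) found = 1
            if (!found) $0 = $0 " " flag
        }
        { print }
    '
}

props=/usr/sap/DAA/SMDA98/SMDAgent/smdagent.properties
if [ -f "$props" ]; then
    # Review the .new file before replacing the original.
    add_java_param -verbose:class < "$props" > "$props.new"
fi
```

Removing the flag again after the analysis keeps jvm_smdagent.out from growing unnecessarily.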
Common Issues Related to Resource Consumption and Booting of the Diagnostics Agent
The Diagnostics Agent Log Contains Out-of-Memory Errors Caused by abapReadSyslog
Symptom:
The log contains an exception like this:
[MAIJobObserver] ERROR occurred for metric collection 00000000000000000003[com.sap.smd.mai.model.collector.SAPControlWSCollector].
[EXCEPTION]
com.sap.smd.agent.plugin.connectors.webservice.WebServiceInvocationException: Webservice invocation error occured on BindingProvider JAX-WS RI 2.1.6 in JDK 6: Stub for http://mySystem:myPort/SAPControl.cgi
at com.sap.smd.agent.plugin.connectors.webservice.JaxWebserviceInvocationHandler.invoke(JaxWebserviceInvocationHandler.java:114)
at com.sun.proxy.$Proxy36.abapReadSyslog(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.sap.smd.agent.facade.hostagent.HostAgentSyncProxy$SyncHandler.singleInvoke(HostAgentSyncProxy.java:127)
at com.sap.smd.agent.facade.hostagent.HostAgentSyncProxy$SyncHandler.invoke(HostAgentSyncProxy.java:85)
at com.sun.proxy.$Proxy43.abapReadSyslog(Unknown Source)
at com.sap.smd.mai.model.collector.SAPControlWSCollector.abapReadSysLog(SAPControlWSCollector.java:1112)
at com.sap.smd.mai.model.collector.SAPControlWSCollector.collect(SAPControlWSCollector.java:217)
at com.sap.smd.mai.job.MetricJobRunner.run(MetricJobRunner.java:32)
at com.sap.smd.server.exec.TaskRunner.run(TaskRunner.java:47)
at com.sap.smd.server.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:785)
at java.lang.Thread.run(Thread.java:763)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Cause:
The syslog is nearly as large as, or larger than, the Java maximum heap size.
Resolution:
- To resolve the issue, increase the Java maximum heap size (see below) so that it is sufficiently larger than the syslog.
- Alternatively, limit the size of the syslog or delete it.
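To judge how close the syslog is to the heap limit, the two sizes can be compared directly. This is a rough sketch: both function names are hypothetical, the syslog file path must be supplied by the caller because it is not stated here, and the fallback of 256m is the default maximum heap size mentioned in the next section.

```shell
#!/bin/sh
# Converts a JVM size value such as 256m, 1g or 524288k to bytes.
heap_to_bytes() {
    case $1 in
        *[kK]) echo $(( ${1%?} * 1024 )) ;;
        *[mM]) echo $(( ${1%?} * 1024 * 1024 )) ;;
        *[gG]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}

# $1 = syslog size in bytes, $2 = -Xmx value; prints the size as a
# whole-number percentage of the maximum heap.
syslog_pct_of_heap() {
    echo $(( $1 * 100 / $(heap_to_bytes "$2") ))
}

# Usage: sh script.sh <syslog file> [<Xmx value>]
if [ -n "${1:-}" ] && [ -f "$1" ]; then
    size=$(wc -c < "$1")
    echo "syslog uses $(syslog_pct_of_heap "$size" "${2:-256m}")% of the maximum heap"
fi
```

A result close to or above 100 matches the cause described above.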
The Diagnostics Agent Log Contains Other Out-of-Memory Errors
Symptom:
java.lang.OutOfMemoryError: Java heap space and java.lang.OutOfMemoryError: GC overhead limit exceeded can be found in the log file.
Cause:
By default, the SMD Agent's Java heap maximum size is set to 256 MB. In the event of a large data collection, a heap size of 512 MB or more is required.
Resolution:
To resolve the issue, do the following for the affected Diagnostics Agent:
1. On your managing host, browse to the following directory:
Windows: <DRIVE>:\usr\sap\DAA\SMDA98\SMDAgent
Unix: /usr/sap/DAA/SMDA98/SMDAgent
Note: In the above example, DAA is the SID and 98 is the instance number. These may differ on your system.
2. Open the file smdagent.properties in a text editor.
3. Locate the property smdagent.javaParameters, then locate the parameter -Xmx.
4. Change the -Xmx parameter to -Xmx512m, then save the changes. For example:
smdagent.javaParameters=-DP4ClassLoad=P4Connection -Xmx512m -Xms256m -XX:MaxPermSize=128m
5. Restart the managing SMD Agent.
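Steps 2 to 4 can also be done without an editor. The sketch below is one possible approach, not an SAP-provided tool: set_xmx is a hypothetical helper that rewrites only the -Xmx value on the smdagent.javaParameters line, and the result is written to a separate file so that it can be reviewed first.

```shell
#!/bin/sh
# Reads a properties file on stdin and replaces the -Xmx value on the
# smdagent.javaParameters line with the size given in $1 (e.g. 512m).
# All other lines and parameters are passed through unchanged.
set_xmx() {
    sed "/^smdagent\.javaParameters=/ s/-Xmx[0-9][0-9]*[kKmMgG]\{0,1\}/-Xmx$1/"
}

props=/usr/sap/DAA/SMDA98/SMDAgent/smdagent.properties
if [ -f "$props" ]; then
    # Review the .new file, then replace the original and restart the agent.
    set_xmx 512m < "$props" > "$props.new"
fi
```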
The Diagnostics Agent Does Not Start
For issues related to the start-up of the Diagnostics Agent, please refer to this Wiki page.
Extracting Configuration From Secure Store File Failed
If the SMDSystem.*.log contains errors similar to the entries listed below, the secure store of the Diagnostics Agent might be broken due to incompatible updates of the Diagnostics Agent or the JVM:
java.io.IOException: javax.crypto.BadPaddingException: Invalid PKCS#5 padding length: 33
at javax.crypto.CipherInputStream.a(DashoA13*..)
at javax.crypto.CipherInputStream.read(DashoA13*..)
at javax.crypto.CipherInputStream.read(DashoA13*..)
at java.util.Properties$LineReader.readLine(Properties.java:434)
at java.util.Properties.load0(Properties.java:353)
at java.util.Properties.load(Properties.java:341)
Extracting configuration from secstore file /usr/sap/DAA/SMDA98/SMDAgent/applications.config/com.sap.smd.agent.application.e2emai.std.collectors/_Default_Configuration.properties Failed.
The secret key could not be read in secure storage
com.sap.security.core.server.secstorefs.FileInvalidException: Getting Secure Store failed: File "/usr/sap/DAA/SMDA98/SMDAgent/configuration/secstore.properties" is invalid: software version 6.30.000.001 is incompatible with file version 7.00.000.001.
File "/usr/sap/DAA/SMDA98/SMDAgent/./temp/smdserver/secstore/TechnicalSecStorea8e9da9b2f1d7a8a9b079d5d1804a6c4.properties" is invalid: record with key "$internal/mode" is missing
Unable to create SSLContext because of KeyStore Exception java.security.UnrecoverableKeyException: Cannot recover key
To reset the secure store, proceed as follows:
- Stop the Diagnostics Agent
- Delete /usr/sap/<SID>/SMDA<instance number>/SMDAgent/configuration/secstore.properties.
- Delete /usr/sap/<SID>/SMDA<instance number>/SMDAgent/configuration/security/.CertificatesKeyStore
- Start the Diagnostics Agent.
- In Agent Administration → Non-Authenticated Agents, select the respective Diagnostics Agent and click Trust Agent.
If the issue persists after these steps, follow the steps in the following article:
- 2447919 - Diagnostic Agent does connect to Solution Manager with the error: 'record with key "$internal/mode" is missing.'
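The two file deletions in the reset procedure above can be prepared with a dry run. In this sketch, secstore_files is a hypothetical helper that only prints the paths for a given SID and instance number. Nothing is deleted, and stopping, starting and trusting the agent still have to be done as described.

```shell
#!/bin/sh
# Prints the secure store files of a Diagnostics Agent.
# $1 = SID, $2 = instance number
secstore_files() {
    base=/usr/sap/$1/SMDA$2/SMDAgent/configuration
    echo "$base/secstore.properties"
    echo "$base/security/.CertificatesKeyStore"
}

# Dry run for the example SID DAA and instance number 98: shows the
# commands to execute once the agent has been stopped.
secstore_files DAA 98 | while read -r f; do
    echo "rm -f $f"
done
```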
The Diagnostics Agent is Blocking HANA Client Ports
When the Diagnostics Agent is started, all connections from the Solution Manager to SAP HANA fail (e.g. Telnet on port 22 or on the HANA port that is used by the Solution Manager server).
Solution:
A possible cause is that enough ports exist but cannot be reused quickly enough. This can be configured using the parameters mentioned below. In SAP Note 2382421, check in particular the sections "net.ipv4.ip_local_port_range" and "SAP Host Agent Configuration". The latter refers to SAP Note 401162, which should also be taken into account. Furthermore, check the parameters "net.ipv4.tcp_tw_reuse" and "net.ipv4.tcp_tw_recycle" as described in SAP Note 2382421.
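Before changing anything, the current values of these kernel parameters can be read on the Linux host. This is a minimal sketch that reads them from /proc/sys. The function names are hypothetical, and the recommended values are not reproduced here; take them from the SAP Notes above. Note that net.ipv4.tcp_tw_recycle does not exist on newer Linux kernels.

```shell
#!/bin/sh
# Maps a sysctl name to its file below /proc/sys.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

# $1 = lowest local port, $2 = highest local port
port_range_size() {
    echo $(( $2 - $1 + 1 ))
}

for p in net.ipv4.ip_local_port_range net.ipv4.tcp_tw_reuse net.ipv4.tcp_tw_recycle; do
    f=$(sysctl_path "$p")
    if [ -r "$f" ]; then
        echo "$p = $(cat "$f")"
    else
        echo "$p is not available on this kernel"
    fi
done

range_file=$(sysctl_path net.ipv4.ip_local_port_range)
if [ -r "$range_file" ]; then
    # The file holds the low and high port separated by whitespace.
    echo "local ports available: $(port_range_size $(cat "$range_file"))"
fi
```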
SAP Solution Manager Java Stack Scalability Settings
Please refer to SAP Note 2417827
10 Comments
Former Member
Hi
When running an SMD agent behind a firewall, I am using the saprouter to connect it to the P4 port of the Solution Manager. This works successfully. However, once the agent is started, some SMD applications (like MAIMeasurementProvider) use web services to try to communicate directly with Solution Manager. For example, I am seeing the following error in the webservices.n.log:
Dec 10, 2013 8:40:16 AM [Thread[ExRun:dpc:job_6,5,dpc:job:ExecTG]] Error Connection parameters: host [<solman>], port [50200], proxy host [null], proxy port [-1]
Dec 10, 2013 8:40:16 AM [Thread[ExRun:dpc:job_6,5,dpc:job:ExecTG]] Error Connection parameters: endpointURL [http://<solman>:50200/sap/bc/srt/scs/sap/e2e_dpc_push?sap-client=100], proxy host [null], proxy port [-1]
and in smdagent_trace.n.trc
[MAIMeasurementProvider.pushFailed] MAI measurement push failed.
[EXCEPTION]
{0}#1#java.rmi.RemoteException: Service call exception; nested exception is:
java.net.ConnectException: Connection timed out
at com.sap.smd.agent.wsclients.dpc.BindingStub.e2eDpcPushMetrics(BindingStub.java:85)
at com.sap.smd.agent.wsclients.dpc.BindingStub.e2eDpcPushMetrics(BindingStub.java:96)
at sun.reflect.GeneratedMethodAccessor392.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.sap.smd.api.util.SynchronizedProxy$SyncHandler.invoke(SynchronizedProxy.java:32)
at com.sun.proxy.$Proxy11.e2eDpcPushMetrics(Unknown Source)
at com.sap.smd.dpc.job.DPCServicePushMetricJob.pushSimpleEvents(DPCServicePushMetricJob.java:122)
at com.sap.smd.dpc.job.DPCServicePushMetricJob.run(DPCServicePushMetricJob.java:66)
at com.sap.smd.server.exec.TaskRunner.run(TaskRunner.java:46)
at com.sap.smd.server.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:781)
at java.lang.Thread.run(Thread.java:743)
Caused by: java.net.ConnectException: Connection timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:531)
at java.net.Socket.connect(Socket.java:478)
at java.net.Socket.<init>(Socket.java:375)
at java.net.Socket.<init>(Socket.java:189)
at com.sap.engine.services.webservices.jaxm.soap.HTTPSocket.initStreamsFromSocket(HTTPSocket.java:695)
at com.sap.engine.services.webservices.jaxm.soap.HTTPSocket.initializeStreams(HTTPSocket.java:512)
at com.sap.engine.services.webservices.jaxm.soap.HTTPSocket.getOutputStream(HTTPSocket.java:469)
at com.sap.engine.services.webservices.jaxrpc.wsdl2java.soapbinding.HTTPTransport.getRequestStream(HTTPTransport.java:410)
at com.sap.engine.services.webservices.jaxrpc.wsdl2java.soapbinding.MimeHttpBinding.outputMessage(MimeHttpBinding.java:557)
at com.sap.engine.services.webservices.jaxrpc.wsdl2java.soapbinding.MimeHttpBinding.call(MimeHttpBinding.java:1447)
at com.sap.smd.agent.wsclients.dpc.BindingStub.e2eDpcPushMetrics(BindingStub.java:78)
... 11 more
Can anyone advise how to set the http.proxyHost and http.proxyPort in the JVM of the SMD agent to get the application to use a proxy server to proxy the URL from the SMD agent to Solution Manager? I have tried updating the following parameter in smdagent<group>.properties file
smdagent.javaParameters=-DP4ClassLoad=P4Connection -Dhttp.proxyEnable=true -DproxySet=true -Dhttp.proxyHost=proxyhost -Dhttp.proxyPort=8080 -Dhttp.nonProxyHosts="localhost|127.0.0.1" -Dhttps.proxyHost=proxyhost -Dhttps.proxyPort=8080 -Xmx256m -Xms256m -XX:MaxPermSize=128m
but it does not seem to take effect after restarting the agent.
I have seen in the wiki http://wiki.scn.sap.com/wiki/display/EEM/Troubleshooting+7.1?focusedCommentId=361464663#Troubleshooting7.1-Executor: where you can set these java parameters for the robot parameters in SOLMAN_SETUP. Is there a similar setting available for the SMD agents themselves or the applications running on the SMD agents like MAI Measurement Provider?
regards,
Richard
Former Member
Regarding scalability:
We found that another important setting to make on the Solution Manager Java stack is in Visual Admin --> server 0 --> Services --> tc~smd~server~service.
Change the property smd.mgr.pool.threads.max.count from default 50 to 100 (or higher). Restart the service.
Thomas Stone III
Do the java stack scalability calculations change if you have multiple dispatchers (multiple application servers)?
Olivier MONTABERT
Hi Thomas,
I now created an Excel document, which summarizes all currently known scalability settings in regards to the SAP Solution Manager 7.1 Java stack.
I hope this answers also your question.
Best regards,
Olivier
Thomas Stone III
Thanks Olivier, I followed your guide last week. A couple of notes: we have two server nodes per Java dispatcher. We have 4 app servers in total, giving us 4 dispatchers and 8 Java server nodes. Because of this I left our dispatcher heap size at double what the document recommended, but it wasn't clear if that was necessary.
Feedback: maxSoTimeOutConnections: Estimated agents: 2000; spreadsheet result: 200; with this setting we immediately ran into issues upon startup and only about 170 agents would come online. I reverted this to my previous settings of 2700 and MaxParallelUsers back to 0 (leaving all other recommendations in place) and everything started cleanly.
The new recommended 'Application Manager → Max Thread Count' suggested that I change from 400 to 1500, which alleviated a bottleneck in my system. (Thanks!) I also followed all of the other calculations that your sheet provided and all worked well. Please review the maxSoTimeOutConnections calculation. If you believe that leaving the heap at the higher setting may have been related, I can schedule time to change the heap to 1000 and try a maxSoTimeOutConnections setting of 200 again. Unfortunately, due to outage time limitations I wasn't able to test all of the combinations I wanted to during my previous task.
I really appreciate the spreadsheet. It makes a confusing topic straightforward.
Regards,
Thomas
Former Member
SAP note 1916217 v3 says:
"ConnectionsManipulator/maxParallelUsers = [Total number of Diagnostics Agents] + 100 "
and
"Alternatively, the maxParallelUsers value can be set to 0 (unlimited number of parallel users)."
What's the impact when changing this to a higher value?
Olivier MONTABERT
Hi Darryl,
I now created an Excel document, which summarizes all currently known scalability settings in regards to the SAP Solution Manager 7.1 Java stack.
I hope this answers also your question.
Best regards,
Olivier
Diego Hernandez
Hi colls,
another Java parameter.
(SAPSolutionManager7.1 SP10_SizingGuide)
Olivier MONTABERT
Hi Diego,
I now created an Excel document, which summarizes all currently known scalability settings in regards to the SAP Solution Manager 7.1 Java stack.
Kindly use now this Excel - SAP Note 2417827 - to define your parameter adjustments.
Best regards,
Olivier
Diego Hernandez
Hi Oliver,
thank you so much. This note will be useful.
Kind regards
Diego Hernández Rollizo