Introduction

Document Status

The Diagnostics Agent troubleshooting guide was previously available as a PDF document and has been replaced by this wiki page. If the content of this page does not address your problem, refer to the former PDF document, which is still available as the attachment DiagnosticsAgentTroubleshooting.pdf to SAP Note 1833501.

Diagnostics Agent Installer

For more general information on installation strategy, installer versions, etc., please refer to Diagnostics Agents.

 

Troubleshooting Resource Consumption and Booting Issues of the Diagnostics Agent

Verify the Prerequisites on the host

Verify that the hosts file on the affected host is correctly defined and follows the mandatory standard. See https://wiki.scn.sap.com/wiki/x/i4IjIQ

Check OS Settings on Linux Hosts (ulimit and umask)

After installing the Diagnostics Agent on UNIX platforms, remember to double-check the following OS environment settings for the Diagnostics Agent OS user (often daaadm):

ulimit

  • Log on with the agent OS user, e.g. su - daaadm
  • Open an sh shell with the command sh
  • Execute the command ulimit -a
  • Compare the values with the recommendations in the Installation & Setup Guide for the most recent installer (see SAP Note 1833501).

  • If you change the limits, remember to stop the Diagnostics Agent (sapstop), kill the Diagnostics Agent processes (sapstartsrv) and start the Agent again (sapstart). In this way the Diagnostics Agent will take the new settings into account.

If the limits of the Diagnostics Agent OS user are not set correctly, high CPU consumption or out-of-memory errors can result.
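
As a minimal sketch, the manual check can be wrapped in a small sh function that prints a few limits with labels, so they are easier to compare with the guide. The function name print_agent_limits and the selection of limits are illustrative assumptions; the sketch does not encode any recommended values.

```shell
#!/bin/sh
# Sketch: print selected limits of the current user, as seen by sh.
# Run as the Diagnostics Agent OS user (for example after: su - daaadm).
# print_agent_limits is a hypothetical helper name, not an SAP tool.
print_agent_limits() {
    echo "user:                       $(id -un)"
    echo "open files (ulimit -n):     $(ulimit -n)"
    echo "stack size (ulimit -s):     $(ulimit -s)"
    echo "data segment (ulimit -d):   $(ulimit -d)"
    echo "virtual memory (ulimit -v): $(ulimit -v)"
    echo "core file size (ulimit -c): $(ulimit -c)"
}

print_agent_limits
```

The full list is still available with ulimit -a; the function only highlights a subset for a quick comparison.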

umask

Please refer to SAP Note 1163751.

Trigger Thread Dumps for Diagnostics Agents

Thread dumps of the Diagnostics Agent process are important for understanding the problem context in case of a deadlock, an unresponsive Diagnostics Agent, or high CPU consumption.

Trigger a Thread Dump on Windows

The Diagnostics Agent is an SAP system and is available in SAP MMC.

In SAP MMC, open the node “AS Java Process Table” of the Diagnostics Agent identified by its SID and instance number. Select the desired agent node, right-click it to open the context menu, then click “Dump Stack Trace”.


Trigger a Thread Dump on Unix

For Diagnostics Agents 7.1x, 7.2x, 7.3x, or agents installed with SWPM 1.0 SP 3 and higher, log on to a UNIX shell as the OS user of the Diagnostics Agent, navigate to the folder /usr/sap/<SID>/SMDA<instance number>/exe and execute the command:

./sapcontrol -user <SID>adm <PWD> -nr <Instance number> -function J2EEControlProcess smdagent DumpStackTrace

Example:

./sapcontrol -user daaadm abc123 -nr 98 -function J2EEControlProcess smdagent DumpStackTrace

Alternatively, a thread dump can be triggered with "kill -3 <pid>", where <pid> is the process ID of the Diagnostics Agent's Java process.
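
The kill -3 variant can be sketched as a small helper that validates the PID before sending the signal. The helper name request_thread_dump is hypothetical, and the sketch assumes the caller has already identified the PID of the agent's Java process (for example by inspecting the output of ps for the agent OS user).

```shell
#!/bin/sh
# Sketch: send signal 3 (SIGQUIT) to a JVM process to request a thread dump.
# request_thread_dump is a hypothetical helper; the caller supplies the PID
# of the Diagnostics Agent's Java process.
request_thread_dump() {
    pid=$1
    # Reject empty or non-numeric input before touching any process.
    case $pid in
        ''|*[!0-9]*)
            echo "usage: request_thread_dump <pid>" >&2
            return 2
            ;;
    esac
    # kill -0 sends no signal; it only checks that the process exists
    # and that the current user is allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process or no permission: $pid" >&2
        return 1
    fi
    kill -3 "$pid"
}
```

A JVM handles signal 3 by writing a thread dump and keeps running; most other programs terminate on it, so the PID should be checked carefully before the helper is called.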

Where Is the Thread Dump Stored?

The thread dump is written to the file /usr/sap/<SID>/SMDA<instance number>/work/std_SMDAgent.out, provided that the agent is running on an SAP JVM.
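
To confirm that a dump actually arrived in the file, a sketch like the following lists the line numbers at which dumps start. It assumes that each dump begins with a line containing "Full thread dump", as HotSpot-based JVMs print it; this should be verified against your own output file. The helper name list_thread_dumps is hypothetical.

```shell
#!/bin/sh
# Sketch: list the line numbers at which thread dumps start in an output file.
# Assumption: each dump begins with a line containing "Full thread dump";
# check this against your own std_SMDAgent.out before relying on it.
list_thread_dumps() {
    grep -n "Full thread dump" "$1"
}

# Example call (path pattern taken from the text above):
# list_thread_dumps /usr/sap/<SID>/SMDA<instance number>/work/std_SMDAgent.out
```

If the function prints nothing, either no dump was written or the header line differs from the assumed one.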

Trigger Heap Dumps for Diagnostics Agents

See Using JVMMON to Trigger a Heap Dump

Verbose Class Loading

To enable verbose logging of class loading for the Diagnostics Agent, add -verbose:class to the property smdagent.javaParameters in /usr/sap/<SID>/SMDA<INSTANCE_NUMBER>/SMDAgent/smdagent.properties. Restart the Diagnostics Agent and find the verbose class loading information in /usr/sap/<SID>/SMDA<INSTANCE_NUMBER>/work/jvm_smdagent.out.

Common Issues Related to Resource Consumption and Booting of the Diagnostics Agent

The Diagnostics Agent Log Contains Out-of-Memory Errors Caused by abapReadSyslog

Symptom:

The log contains an exception like this:

[MAIJobObserver] ERROR occurred for metric collection 00000000000000000003[com.sap.smd.mai.model.collector.SAPControlWSCollector].
[EXCEPTION]
com.sap.smd.agent.plugin.connectors.webservice.WebServiceInvocationException: Webservice invocation error occured on BindingProvider JAX-WS RI 2.1.6 in JDK 6: Stub for http://mySystem:myPort/SAPControl.cgi
        at com.sap.smd.agent.plugin.connectors.webservice.JaxWebserviceInvocationHandler.invoke(JaxWebserviceInvocationHandler.java:114)
        at com.sun.proxy.$Proxy36.abapReadSyslog(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at com.sap.smd.agent.facade.hostagent.HostAgentSyncProxy$SyncHandler.singleInvoke(HostAgentSyncProxy.java:127)
        at com.sap.smd.agent.facade.hostagent.HostAgentSyncProxy$SyncHandler.invoke(HostAgentSyncProxy.java:85)
        at com.sun.proxy.$Proxy43.abapReadSyslog(Unknown Source)
        at com.sap.smd.mai.model.collector.SAPControlWSCollector.abapReadSysLog(SAPControlWSCollector.java:1112)
        at com.sap.smd.mai.model.collector.SAPControlWSCollector.collect(SAPControlWSCollector.java:217)
        at com.sap.smd.mai.job.MetricJobRunner.run(MetricJobRunner.java:32)
        at com.sap.smd.server.exec.TaskRunner.run(TaskRunner.java:47)
        at com.sap.smd.server.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:785)
        at java.lang.Thread.run(Thread.java:763)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded

Cause:

The syslog is nearly as large as, or larger than, the Java maximum heap size.

Resolution:

  • To resolve the issue, increase the Java maximum heap size (see below) so that it is sufficiently larger than the syslog size.
  • Alternatively, limit the size of the syslog or delete it.


The Diagnostics Agent Log Contains Other Out-of-Memory Errors


Symptom:

java.lang.OutOfMemoryError: Java heap space and java.lang.OutOfMemoryError: GC overhead limit exceeded can be found in the log file.

Cause:

By default, the SMD Agent's maximum Java heap size is set to 256 MB. For large data collections, a heap size of 512 MB or more is required.

Resolution:

To resolve the issue, do the following for the Diagnostics Agent on the affected host:

1.  On your managing host, browse to the following directory:

     Windows      <DRIVE>:\usr\sap\DAA\SMDA98\SMDAgent

     Unix      /usr/sap/DAA/SMDA98/SMDAgent

     Note: In the above example DAA is the name of the SID and the instance number is 98.  This may differ on your system.

2.  Open the file smdagent.properties using a text editor

3.  Locate the property smdagent.javaParameters then locate the parameter -Xmx

4.  Change the value of -Xmx to -Xmx512m then save the changes.  For example:

smdagent.javaParameters=-DP4ClassLoad=P4Connection -Xmx512m -Xms256m -XX:MaxPermSize=128m

5.  Restart the managing SMD Agent.
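
Steps 2 to 4 can be sketched as a small script for UNIX hosts. The helper name set_agent_xmx is hypothetical; the sketch assumes the property line already contains an -Xmx parameter and keeps a backup copy of the file before changing it.

```shell
#!/bin/sh
# Sketch: set the -Xmx value inside the smdagent.javaParameters line of a
# properties file. set_agent_xmx is a hypothetical helper, not an SAP tool.
# If the line contains no -Xmx parameter, the file content stays unchanged.
set_agent_xmx() {
    props_file=$1
    new_xmx=$2    # for example 512m
    # Keep the original file as <file>.bak before rewriting it.
    cp "$props_file" "$props_file.bak" || return 1
    # Only the line starting with smdagent.javaParameters= is modified;
    # -Xms and all other parameters are left untouched.
    sed "/^smdagent\.javaParameters=/ s/-Xmx[0-9][0-9]*[kKmMgG]\{0,1\}/-Xmx$new_xmx/" \
        "$props_file.bak" > "$props_file"
}

# Example call, using the example SID and instance number from above:
# set_agent_xmx /usr/sap/DAA/SMDA98/SMDAgent/smdagent.properties 512m
```

The agent still has to be restarted afterwards, as described in step 5.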

The Diagnostics Agent Does Not Start

For issues related to the start-up of the Diagnostics Agent, please refer to this Wiki page.

Extracting Configuration From Secure Store File Failed

If the SMDSystem.*.log contains errors similar to the entries listed below, the secure store of the Diagnostics Agent might be broken due to incompatible updates of the Diagnostics Agent or the JVM:

java.io.IOException: javax.crypto.BadPaddingException: Invalid PKCS#5 padding length: 33
at javax.crypto.CipherInputStream.a(DashoA13*..)
at javax.crypto.CipherInputStream.read(DashoA13*..)
at javax.crypto.CipherInputStream.read(DashoA13*..)
at java.util.Properties$LineReader.readLine(Properties.java:434)
at java.util.Properties.load0(Properties.java:353)
at java.util.Properties.load(Properties.java:341)
Extracting configuration from secstore file /usr/sap/DAA/SMDA98/SMDAgent/applications.config/com.sap.smd.agent.application.e2emai.std.collectors/_Default_Configuration.properties Failed. 
The secret key could not be read in secure storage 
com.sap.security.core.server.secstorefs.FileInvalidException: Getting Secure Store failed: File "/usr/sap/DAA/SMDA98/SMDAgent/configuration/secstore.properties" is invalid: software version 6.30.000.001 is incompatible with file version 7.00.000.001.
File "/usr/sap/DAA/SMDA98/SMDAgent/./temp/smdserver/secstore/TechnicalSecStorea8e9da9b2f1d7a8a9b079d5d1804a6c4.properties" is invalid: record with key "$internal/mode" is missing
Unable to create SSLContext because of KeyStore Exception java.security.UnrecoverableKeyException: Cannot recover key


To reset the secure store, proceed as follows:

  1. Stop the Diagnostics Agent
  2. Delete /usr/sap/<SID>/SMDA<instance number>/SMDAgent/configuration/secstore.properties.
  3. Delete /usr/sap/<SID>/SMDA<instance number>/SMDAgent/configuration/security/.CertificatesKeyStore.
  4. Start the Diagnostics Agent.
  5. In Agent Administration → Non-Authenticated Agents, select the respective Diagnostics Agent and click Trust Agent.
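
Steps 2 and 3 can be sketched as follows. Note that this sketch moves the two files to a backup directory instead of deleting them, which is an assumption made here to keep a fallback; the helper name reset_secstore_files and the backup location are hypothetical. It should only be run while the Diagnostics Agent is stopped.

```shell
#!/bin/sh
# Sketch for steps 2 and 3: move the secure store files out of the way.
# Moving to a backup directory (rather than deleting) is an assumption of
# this sketch; reset_secstore_files is a hypothetical helper name.
# Run only while the Diagnostics Agent is stopped.
reset_secstore_files() {
    agent_dir=$1     # e.g. /usr/sap/<SID>/SMDA<instance number>/SMDAgent
    backup_dir=$2    # any directory outside the agent's configuration folder
    mkdir -p "$backup_dir" || return 1
    for f in configuration/secstore.properties \
             configuration/security/.CertificatesKeyStore; do
        if [ -e "$agent_dir/$f" ]; then
            mv "$agent_dir/$f" "$backup_dir/" || return 1
            echo "moved: $f"
        else
            echo "not found: $f"
        fi
    done
}
```

After the agent has been started again, it still has to be trusted in Agent Administration as described in step 5.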

If the issue persists after these steps, follow the steps described in the following article:

  • 2447919 - Diagnostic Agent does connect to Solution Manager with the error: 'record with key "$internal/mode" is missing.'

The Diagnostics Agent is Blocking HANA Client Ports

When the Diagnostics Agent is started, all connections from the Solution Manager to SAP HANA fail (e.g. Telnet on port 22 or on the HANA port that is used by the Solution Manager server).

Solution:

A possible cause is that enough ports exist, but they cannot be reused fast enough. This can be configured with the parameters mentioned below. In SAP Note 2382421, check especially the sections "net.ipv4.ip_local_port_range" and "SAP Host Agent Configuration". The latter points to SAP Note 401162, which should also be taken into account. Furthermore, check the parameters "net.ipv4.tcp_tw_reuse" and "net.ipv4.tcp_tw_recycle" as described in SAP Note 2382421.
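
Before changing anything, the current values of these parameters can be inspected on a Linux host. The following is a minimal sketch that reads them from /proc; the helper name show_port_params is hypothetical, and the sketch only displays values without recommending any. On newer kernels net.ipv4.tcp_tw_recycle is not available (it was removed in Linux 4.12), which the sketch reports instead of failing.

```shell
#!/bin/sh
# Sketch: print the current values of the kernel parameters named above.
# show_port_params is a hypothetical helper; it reads /proc and changes nothing.
show_port_params() {
    for p in ip_local_port_range tcp_tw_reuse tcp_tw_recycle; do
        f="/proc/sys/net/ipv4/$p"
        if [ -r "$f" ]; then
            echo "net.ipv4.$p = $(cat "$f")"
        else
            echo "net.ipv4.$p is not available on this kernel"
        fi
    done
}

show_port_params
```

The target values themselves should be taken from the SAP Notes referenced above.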


SAP Solution Manager Java Stack Scalability Settings

Please refer to SAP Note 2417827.

10 Comments

  1. Former Member

    Hi

    When running an SMD agent behind a firewall, I use the saprouter to connect it to the P4 port of the Solution Manager. This works; however, once the agent is started, some SMD applications (like MAIMeasurementProvider) use web services to try to communicate directly with Solution Manager. For example, I am seeing the following error in the webservices.n.log:

    Dec 10, 2013 8:40:16 AM [Thread[ExRun:dpc:job_6,5,dpc:job:ExecTG]] Error      Connection parameters: host [<solman>], port [50200], proxy host [null], proxy port [-1]
    Dec 10, 2013 8:40:16 AM [Thread[ExRun:dpc:job_6,5,dpc:job:ExecTG]] Error      Connection parameters: endpointURL [http://<solman>:50200/sap/bc/srt/scs/sap/e2e_dpc_push?sap-client=100], proxy host [null], proxy port [-1]

    and in smdagent_trace.n.trc

    [MAIMeasurementProvider.pushFailed] MAI measurement push failed.
    [EXCEPTION]
     {0}#1#java.rmi.RemoteException: Service call exception; nested exception is:
            java.net.ConnectException: Connection timed out
            at com.sap.smd.agent.wsclients.dpc.BindingStub.e2eDpcPushMetrics(BindingStub.java:85)
            at com.sap.smd.agent.wsclients.dpc.BindingStub.e2eDpcPushMetrics(BindingStub.java:96)
            at sun.reflect.GeneratedMethodAccessor392.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at com.sap.smd.api.util.SynchronizedProxy$SyncHandler.invoke(SynchronizedProxy.java:32)
            at com.sun.proxy.$Proxy11.e2eDpcPushMetrics(Unknown Source)
            at com.sap.smd.dpc.job.DPCServicePushMetricJob.pushSimpleEvents(DPCServicePushMetricJob.java:122)
            at com.sap.smd.dpc.job.DPCServicePushMetricJob.run(DPCServicePushMetricJob.java:66)
            at com.sap.smd.server.exec.TaskRunner.run(TaskRunner.java:46)
            at com.sap.smd.server.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:781)
            at java.lang.Thread.run(Thread.java:743)
    Caused by: java.net.ConnectException: Connection timed out
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
            at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
            at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
            at java.net.Socket.connect(Socket.java:531)
            at java.net.Socket.connect(Socket.java:478)
            at java.net.Socket.<init>(Socket.java:375)
            at java.net.Socket.<init>(Socket.java:189)
            at com.sap.engine.services.webservices.jaxm.soap.HTTPSocket.initStreamsFromSocket(HTTPSocket.java:695)
            at com.sap.engine.services.webservices.jaxm.soap.HTTPSocket.initializeStreams(HTTPSocket.java:512)
            at com.sap.engine.services.webservices.jaxm.soap.HTTPSocket.getOutputStream(HTTPSocket.java:469)
            at com.sap.engine.services.webservices.jaxrpc.wsdl2java.soapbinding.HTTPTransport.getRequestStream(HTTPTransport.java:410)
            at com.sap.engine.services.webservices.jaxrpc.wsdl2java.soapbinding.MimeHttpBinding.outputMessage(MimeHttpBinding.java:557)
            at com.sap.engine.services.webservices.jaxrpc.wsdl2java.soapbinding.MimeHttpBinding.call(MimeHttpBinding.java:1447)
            at com.sap.smd.agent.wsclients.dpc.BindingStub.e2eDpcPushMetrics(BindingStub.java:78)
            ... 11 more

    Can anyone advise how to set the http.proxyHost and http.proxyPort in the JVM of the SMD agent to get the application to use a proxy server to proxy the URL from the SMD agent  to Solution Manager? I have tried updating the following parameter in smdagent<group>.properties file

    smdagent.javaParameters=-DP4ClassLoad=P4Connection -Dhttp.proxyEnable=true -DproxySet=true -Dhttp.proxyHost=proxyhost -Dhttp.proxyPort=8080 -Dhttp.nonProxyHosts="localhost|127.0.0.1" -Dhttps.proxyHost=proxyhost -Dhttps.proxyPort=8080 -Xmx256m -Xms256m -XX:MaxPermSize=128m

    but it does not seem to take effect after restarting the agent.

    I have seen in the wiki http://wiki.scn.sap.com/wiki/display/EEM/Troubleshooting+7.1?focusedCommentId=361464663#Troubleshooting7.1-Executor: where you can set these java parameters for the robot parameters in SOLMAN_SETUP. Is there a similar setting available for the SMD agents themselves or the applications running on the SMD agents like MAI Measurement Provider?

     

    regards,

    Richard

  2. Former Member

    Regarding scalability:

    We found that another important setting to make on the Solution Manager Java stack is in Visual Admin --> server 0 --> Services --> tc~smd~server~service.

    Change the property smd.mgr.pool.threads.max.count from default 50 to 100 (or higher). Restart the service.

  3. Do the java stack scalability calculations change if you have multiple dispatchers (multiple application servers)?

    1. Hi Thomas,

      I now created an Excel document,  which summarizes all currently known scalability settings in regards to the SAP Solution Manager 7.1 Java stack.

      I hope this answers also your question.

      Best regards,
      Olivier

      1. Thanks Olivier, I followed your guide last week.  A couple notes, we have two servers nodes per java dispatcher.  We have 4 app servers total giving us 4 dispatchers and 8 java server nodes total.  Because of this I left our dispatcher heap size at double what the document recommended, but it wasn't clear if that was necessary.

        Feedback: maxSoTimeOutConnections: Estimated agents: 2000; spreadsheet result: 200; with this setting we immediately ran into issues upon startup and only about 170 agents would come online.  I reverted this to my previous settings of 2700 and MaxParallelUsers back to 0 (leaving all other recommendations in place) and everything started cleanly.

        The new recommended 'Application Manager → Max Thread Count' suggested that I change from 400 to 1500 which alleviated a bottle-neck in my system. (Thanks!) I also followed all of the other calculations that your sheet provided and all worked well.  Please review the maxSoTimeOutConnections calculation.  If you believe that by leaving the heap at the higher setting may have been related, I can schedule time to change the heap to 1000 and try a maxSoTimeOutConnections setting of 200 again.  Unfortunately due to outage time limitations I wasn't able to test all of the combinations I wanted to during my previous task.

        I really appreciate the spreadsheet.  It makes a confusing topic straight forward.

        Regards,
        Thomas 

  4. Former Member

    SAP note 1916217 v3 says:

    "ConnectionsManipulator/maxParallelUsers = [Total number of Diagnostics Agents] + 100 "

     and

     "Alternatively, the maxParallelUsers value can be set to 0 (unlimited number of parallel users)."

    What's the impact when changing this to a higher value?

    1. Hi Darryl,

      I now created an Excel document,  which summarizes all currently known scalability settings in regards to the SAP Solution Manager 7.1 Java stack.

      I hope this answers also your question.

      Best regards,
      Olivier

  5. Hi colls,

    another Java parameter.

    (SAPSolutionManager7.1 SP10_SizingGuide)

    Maximum Heap Size for Dispatcher 
    The java heap size of the Dispatcher should be set to the following value in order to avoid timeout connection issues:
    Java Max Heap Size (-Xmx) = Number of SAP DAs * 1.28

    1. Hi Diego,

      I now created an Excel document,  which summarizes all currently known scalability settings in regards to the SAP Solution Manager 7.1 Java stack.

      Kindly use now this Excel - SAP Note 2417827 - to define your parameter adjustments.

      Best regards,
      Olivier

      1. Hi Oliver,

        thank you so much. This note will be useful.

        Kind regards

        Diego Hernández Rollizo