Purpose

This page explains how SAP ASE uses spinlocks and how they can affect overall CPU usage.

Overview

High CPU usage in SAP ASE can often be traced to spinlock usage. This page shows how to identify that condition and suggests ways to tune ASE.

What is a Spinlock?

  • In a multi-engine server synchronization mechanisms are needed to protect shared resources

    • ASE uses spinlocks as one of its synchronization mechanisms

    • A spinlock is a data structure with a field that can only be updated atomically (that is, only one engine at a time can make changes to it).

  • When a task modifies a data item which is shared it must first hold a spinlock

    • Shared items are such things as run queues, data cache page lists, lock structures, etc.

  • The spinlock prevents another task from modifying the value at the same time

    • This of course assumes that the other task also performs its access under the protection of a spinlock

  • A task needing a spinlock will “spin” (busy-wait) until the lock is granted

    • When multiple engines are spinning at the same time CPU usage can rise substantially.

  • Spinlocks must be as fast and efficient as possible

    • In order to reduce contention a process which loops typically acquires and releases its spinlock each time through the loop.

    • Therefore the spinlock code is written in platform-specific assembly language.

Comparison of Spinlocks to Other Synchronization Mechanisms

Type                           Complexity   CPU overhead   Wait time
-----------------------------  -----------  -------------  ---------------------
Spinlock                       Low          High           Very low
Latch                          Moderate     Low            Should be small
Table/page/row/address lock    High         Low            Can vary considerably

Spinlocks and CPU Usage

  • Spids trying to get a spinlock will never yield the engine until they have it.

    • So one spid, waiting on a spinlock, will cause 100% user busy on one engine until it gets the spinlock.

  • Spinlock contention percentage is measured as waits/grabs

    • Example: 10,000 grabs with 3,000 waits = 30% contention

  • For looking at performance issues, use total spins, not contention

    • Example: Assume two spinlocks

      • One had 100 grabs with 40 waits and 200 spins = 40% contention

      • Second had 100,000 grabs with 4,000 waits and 20,000 spins = 4% contention

    • The second used up many more CPU cycles spinning, even though its contention was lower.

    • We should then look at tuning for the second example, not the first.

  • As more engines spin on the same spinlock, the wait time and number of spins increase, sometimes geometrically
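
As a rough illustration of why total spins matter more than contention percentage, the two example spinlocks can be loaded into a scratch table and ranked by spins. This is only a sketch: the table name #spin_example and the row labels are made up for the illustration, and the second row uses 4,000 waits, which is what 4% of 100,000 grabs works out to.

```sql
-- Hypothetical scratch table holding the two example spinlocks from the text
create table #spin_example
(
    SpinlockName varchar(30) not null,
    Grabs        int         not null,
    Waits        int         not null,
    Spins        int         not null
)

insert into #spin_example values ('first',  100,    40,   200)
insert into #spin_example values ('second', 100000, 4000, 20000)

-- Contention % = 100 * waits / grabs; rank by Spins, not by contention
select SpinlockName, Grabs, Waits, Spins,
       convert(numeric(5,2), 100.0 * Waits / Grabs) as ContentionPct
from #spin_example
order by Spins desc

drop table #spin_example
```

Ordered by Spins, the second spinlock comes out on top even though its contention percentage is a tenth of the first one's.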

Troubleshooting Spinlocks

  • Spinlock contention/spinning is one of the major causes of high CPU

  • Step 1 is determining if, in fact, the high cpu is being caused by spinlock usage.

  • Step 2 is determining which spinlock or spinlocks are causing the condition.

  • Step 3 is determining what tuning to use to help reduce the problem.

*Note* You will never get to 0% spinlock contention unless you only run with one engine. That is, do not think that spinlock contention can be eliminated. It can only possibly be reduced.

Step 1 - Checking for spinlock contention/spinning

  • Using sp_sysmon to determine if high cpu is due to spinlocks

  • Check “CPU Busy” (or “User Busy” in 15.7 Threaded Mode).

  • If engines are not showing high busy% then spinlocks are not a big issue.

  • Check “Total Cache Hits” in the “Data Cache Management” section.

    • If the cache hits per second is high, and goes up with cpu busy %, then you likely are looking at table scanning/query plans and not spinlocks.

  • In general, if cpu usage increases but measurements of throughput such as committed xacts, cache hits, lock requests, scans, etc. go down then it is very possible that spinlock usage is an issue
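
The sp_sysmon checks above could be run along these lines. It is a sketch only: the one-minute interval is an arbitrary choice, and the section keywords "kernel" and "dcache" are used on the assumption that the installed sp_sysmon accepts them to limit the report to the engine-utilization and data-cache sections.

```sql
-- Sample for one minute and print the complete report
exec sp_sysmon "00:01:00"

-- Assumed shortcuts: restrict the report to one section at a time
exec sp_sysmon "00:01:00", "kernel"   -- engine busy figures
exec sp_sysmon "00:01:00", "dcache"   -- data cache management, including total cache hits
```

Each call samples its own interval, so figures from separate runs describe different minutes. Comparing CPU busy against throughput is easier from a single full report.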

Step 2 - which spinlock or spinlocks are causing the contention?

  • Using sp_sysmon

  • There are several spinlocks listed, but only contention % is shown

    • Object Manager Spinlock Contention

    • Object Spinlock Contention

    • Index Spinlock Contention

    • Index Hash Spinlock Contention

    • Partition Spinlock Contention

    • Partition Hash Spinlock Contention

    • Lock Hashtables Spinlock Contention

    • Data Caches Spinlock Contention

  • High contention on any of these may indicate a problem

    • But, you may have contention on other spinlocks not reported in sp_sysmon

  • Using MDA table monSpinlockActivity

    • This table was added in 15.7 ESD#2

  • Query using standard SQL.

One possible query showing the top 10 spinlocks by number of spins over a one-minute interval

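-- Snapshot the cumulative spinlock counters, wait one minute, then snapshot again
select * into #t1 from monSpinlockActivity
waitfor delay "00:01:00"
select * into #t2 from monSpinlockActivity

-- Report the 10 spinlocks with the most spins during the interval
select top 10 convert(char(30), a.SpinlockName) as SpinlockName,
    (b.Grabs - a.Grabs) as Grabs,
    (b.Spins - a.Spins) as Spins,
    (b.Waits - a.Waits) as Waits,
    -- contention % = waits / grabs; return 0 when there were no grabs in the interval
    case when a.Grabs = b.Grabs then 0.00
         else convert(numeric(5,2), (100.0 * (b.Waits - a.Waits)) / (b.Grabs - a.Grabs))
    end as Contention
from #t1 a, #t2 b
where a.SpinlockName = b.SpinlockName
order by 3 desc   -- column 3 is Spins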

Possible Issues with monSpinlockActivity

  • Spinlocks with multiple instances will get aggregated

    • For example, all default data cache partition spinlocks will show up as one line

    • This can make it impossible to see if just one cache partition is causing the problem

  • You must set the 'enable spinlock monitoring' configuration variable

    • Tests show that this adds about a 1 percent overhead to a busy server.

  • monSpinlockActivity does show the current and last owner KPIDs. This can be useful to check if certain processes are the ones heavily hitting certain spinlocks.
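
Turning the monitoring on might look like the following. This is a minimal sketch, assuming a login with the role needed to change configuration and assuming that the general 'enable monitoring' option also has to be set before the MDA tables return data.

```sql
-- MDA tables generally need monitoring switched on first (assumption noted above)
exec sp_configure "enable monitoring", 1

-- Turn on collection of the counters reported by monSpinlockActivity
exec sp_configure "enable spinlock monitoring", 1

-- Display the current setting to confirm the change
exec sp_configure "enable spinlock monitoring"
```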

 

Step 3 - what tuning can be done to help reduce the problem

  • This is going to depend a great deal on which spinlock(s) the high spins are on.

  • Note as well that it is quite possible to reduce contention on one spinlock only to have it increase on another

Some of the more common spinlocks and possible remedies

  • Object Manager Spinlock (Resource->rdesmgr_spin)

    • Make sure that sufficient ‘number of open objects’ have been configured.

    • Identify ‘hot’ objects by using monOpenObjectActivity.

    • Use dbcc tune (des_bind) to bind the hot objects to the DES cache.

    • The reason this works is that the spinlock is used to protect the DES keep count in order to make sure an in-use DES does not get scavenged. When the DES is bound that whole process gets skipped.

  • Data Cache spinlocks

    • The best single method to reduce data cache spinlock usage is to increase the number of partitions in the data cache.

    • Note that if a cache can be set to ‘relaxed LRU’ the spinlock usage may be decreased dramatically. This is because the relaxed LRU cache does not maintain the LRU->MRU chain, and so does not need to grab the spinlock to move pages to the MRU side.

    • There are definite requirements for this to help (a cache that has high turnover is a very poor candidate for relaxed LRU).

  • Procedure Cache Spinlock (Resource->rproccache_spin)

    • This spinlock is used when allocating or freeing pages from the global procedure cache memory pool (this includes statement cache).

    • Some possible causes include

      • Proc cache too small – procs and statements being frequently removed/replaced.

      • Procedure recompilations

      • Large scale allocations

    • To reduce pressure on the spinlock

      • Eliminate the cause(s) for procedure recompilations (trace flag 299 may help)

      • If you are running a version prior to ASE 15.7 ESD#4, upgrade. ASE 15.7 ESD#4 and ESD#4.2 include fixes that hold the spinlock for less time

      • Trace flags 753 and 757 can help reduce large-scale allocations

      • In ASE versions later than 15.7 SP100, use the configuration option "enable large chunk elc".

      • Use dbcc proc_cache(free_unused) as temporary help to reduce spinlock/cpu usage.

  • Procedure Cache Manager Spinlock (Resource->rprocmgr_spin)

    • This spinlock is used whenever moving procedures and dynamic SQL into or out of procedure cache.

    • This spinlock was also used prior to ASE 15.7 ESD#1 when updating the memory accounting structures (pmctrl).

      • Due to contention a separate spinlock was created.

    • Causes of high contention include:

      • Heavy use of dynamic SQL

      • Procedure cache sized too small

    • Possible remedies are the same as for rproccache_spin

  • Lock Manager spinlocks (fglockspins, addrlockspins, tablockspins)

    • These spinlocks are used to protect the lock manager hashtables.

    • If the lock HWMs are set too high, that means more locks and more contention

    • Configuration tunables are the primary way to address this

      • lock spinlock ratio

      • lock address spinlock ratio

      • lock table spinlock ratio

      • lock hashtable size
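
The remedies above map onto configuration commands roughly as follows. Treat this as a sketch rather than a recipe: every numeric value, the database ID 4, the table name hot_table and the cache name hot_cache are hypothetical placeholders, and the right settings depend on the ASE version and the workload. Some of these changes, cache partitioning for instance, may only take effect after a restart.

```sql
-- Object manager spinlock: check descriptor usage, then raise the limit if needed
exec sp_monitorconfig "open objects"
exec sp_configure "number of open objects", 20000      -- hypothetical value

-- Bind a hot object to the DES cache (dbid 4 and hot_table are hypothetical)
dbcc tune(des_bind, 4, "hot_table")

-- Data cache spinlocks: more partitions spread the load over more spinlocks
exec sp_cacheconfig "default data cache", "cache_partition=8"   -- hypothetical count

-- Relaxed LRU for a low-turnover named cache (hot_cache is hypothetical)
exec sp_cacheconfig "hot_cache", "relaxed"

-- Procedure cache spinlocks: temporary relief only
dbcc proc_cache(free_unused)

-- Lock manager spinlocks: a lower ratio means fewer hash buckets per spinlock
exec sp_configure "lock spinlock ratio", 50            -- hypothetical value
exec sp_configure "lock address spinlock ratio", 50    -- hypothetical value
exec sp_configure "lock table spinlock ratio", 10      -- hypothetical value
exec sp_configure "lock hashtable size", 8192          -- hypothetical value
```

Changing one setting at a time and re-running the one-minute monSpinlockActivity comparison after each change makes it easier to see whether contention actually dropped or simply moved to another spinlock.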

What not to do

    • Resist the urge to add more engines because cpu is high

      • Adding additional engines when the high cpu busy is caused by spinlock contention will only make matters worse

      • Adding more "spinners" will simply increase the amount of time it takes each spid to obtain the spinlock, slowing things down even more.


17 Comments

  1.  

    Nice article on spinlocks Dave, very intuitive.

  2. Former Member

    Excellent information. Really helped us identify contention on procedure cache.

    Is there a document listing all the locks, we've also seen high contention on 

    Kernel->kaspinlock 

    Resource->rpssmgr_spin

    SSQLCACHE_SPIN

    Thanks

    1. Hello,

      Kernel->kaspinlock 

      This spinlock is associated with kernel alarm. Spinlocks with this structure is rare in nature.

      Resource->rpssmgr_spin

      PSS(Process Stack Structure), this is the primary structure that is associated to a process when a SAP ASE receives a connection from client. Since this being a structure and handles memory, a spinlock associated with this structure is called pss_spin. Since there are multiple processes involved so a control mechanism is required called PSS Manager hence rpssmg and the associated locking mechanism(spinlock) is called rpssmgr_spin.

      SSQLCACHE_SPIN

      The name suggests this locking mechanism is associated with statement cache. This is required to avoid contention among similar profiled statement from various processes.

      1. Former Member

        Thanks - that helps a lot.

        Its a shame more of the information isn't better documented - but thanks to Dave for producing this.

  3. hi,

    Does anyone know what Resource->rpmctrl_spin is?
    I see this spinlock at 70-90% in my sysmon output,
    but I can't find any information about it.

    1. Hello,

      The rpmctrl is a structure used to track memory allocated from procedure cache. If spins on this structure show high values then there is contention in procedure cache. There may be a number of reasons for such high values, but at a high level the command dbcc proc_cache('free_unused') may help in reducing or eliminating the spins associated with this structure.

      1. Thanks a lot))
        This information strongly is useful to me.
        Maybe there is a document where all possible spin are described?

        1. I'm sorry to say that there is no standard document explaining the various spinlocks and their associated structures. Hence David (the original writer of this document) made a good effort to collect and catalog them. I also scoured the code and came up with that explanation.

          H.T.H.

  4. Former Member

    Another query I have is at what point should we be concerned about the Contention ?

    I appreciate that this will depend on the number of engines, speed of engines, etc.

     

    eg I have this 

    SpinlockName                SpinlockSlotID  Grabs     Spins      Waits    Contention
    --------------------------  --------------  --------  ---------  -------  ----------
    Resource->rproccache_spin   1401            13780565  330466267  3492329  25.34

     

    Is 25% contention with 3,492,329  Waits considered high over a 20second period.

    We have 20 engines so we're getting 8730 waits per engine per second.

    If each spinlock is 0.1ms then that equates to 0.8s per engine per second - which isn't too bad but high for a 20s period (4% of the time)

    But if a spinlock is 1ms then that equates to 8s per engine per second which is very high.

    I appreciate the time will vary on many factors but what sort of figure is the point where we should be concerned.

    How long is the  low"Wait Time" ? (I know we can't say for sure as it depends on the machine)

  5. Former Member

    If this really is a 20 second sample, then 330,466,267 spins is a very real problem. I suggest increasing procedure cache by, say, 25% and to open a case with support to check what options that are available in your version. (depending on version this may be a trace flag or may be a configuration parameter).


    Cheers,

    /Stefan

    1. Former Member

      Stefan,

      Yes, we logged a  case.

      We "fixed" it by setting traceflags 753, increased the procedure cache and set the ELC percent at 80%.

      The more useful option was trace flag 753.

      There's a lot of more detail here ...

      http://scn.sap.com/thread/3815278

      What I don't understand is that TF 753 forces 2k pages to be grabbed. Why isn't there an option to always grab 16k pages as well.

      2k is such a small amount of memory. 

      Surely we want to grab 8k, 16k or 32k at a time. We've upgraded our server to use 16k pages, why are we still using 2k pages for procedure cache ?

      This is 2015 after all not 1990. We have servers with 0.5Tb and 1Tb RAM.

       

       

  6. Former Member

    2kB has nothing to do with your data page size. It is the size of a memory page.

  7. Former Member

    Yes - I'm aware 2k of page size is different from 2k of memory page. 

    We've upgraded our server to have 2k page sizes, and without trace flag 753 the procedure cache is grabbed in blocks of 2k, 4k or 16k (is there an 8k ?)
    TF753 will then force the procedure cache to grab all its memory in 2k chunks. It would be useful to be able to force all the pages to say 4k or 16k. Why only 2k ?

    The point I was making is that both of these sizes are very small in terms of todays computers. 

  8. Mike,

    I understand where you are coming from. You mean you want a 16 KiB page instead of a 2 KiB page when you configure ASE with a 16 KiB page size.
    The design of ASE cache pages use only 2Kib. There are so many technical reasons behind this decision. One main reason is cache latency with larger page sizes.
    Even though the pages are 2Kib but scales with multiples of 2Kib.

    Hope this helps. 

     

  9. Former Member

    Yes I appreciate the increased latency of getting 16k cache page versus a 2k cache page.

    However, any process requiring more than 2k will therefore have to grab a 2k page multiple times and ASE has approached this issue by allowing a grab of more than 2k pages (which TF 753 turns off). This makes sense. However, grabbing different sizes seems to cause fragmentation (at least for us) so we have to turn this performance improvement off.

    If (and I mean if) most of our procs need a minimum of, say 32k, to run, and forcing ASE to only allow 2k page grabs then most procs needs to grab pages 16 times. If we force the grab size to be say 4k, then the smaller procs will have a small amount of wastage but the larger more intense procs will reduce the number of grabs.

    I have no idea how the performance would vary, but since SAP have implemented grabbing pages of different size, I'd guess its beneficial or have SAP got this wrong ?

    So in the same way, that ASE allows different size disk pages, then why not allow the setting of the page size. We could try 2k, 4k and 16k and choose whats best for our application.

    Although, the Linux memory page size is 4k so I would have expected the optimum ASE page size to match that in much the same way, we get optimum disk performance when the ASE page size matches the SAN page size (does any disk have 2k blocks these days ?) But obviously there's a lot of information I don't know about the ASE internals so perhaps there's a reason for this.

  10. Former Member

    In Sybase 16 - it seems trace flag 753 has no impact and you need to use the configuration setting "enable large chunk elc" - set this to zero to have the same impact as 753

     

  11. Great article!

    Thank you very much.