BW SYSTEM TUNING
In an end user perspective, performance is nothing but, is the next logical dialog screen appears on his/her GUI without any long delay. If there is a delay, it appears to be a bad performing system.
In a traditional way, performance tuning of an SAP application deals with buffer management, database tuning, work process tuning, fragmentation of the database, reorganization of the database, reducing I/O contention, operating system tuning, table stripping and the list goes on depending on the nature of the system.
This document deals with more of performance tuning in a BW perspective rather than general R/3 parameter tuning. Like, query performance, data load performance, aggregate tuning etc.
This document will focus the following key aspects in a detailed fashion. 1. What are the different ways to Tune an SAP system? ( General )
2. What are the general settings we need to adapt in a good performing BW system?
3. What are the factors which influence the performance on a BW system?
4. What are the factors to consider while extracting data from source system?
5. What are the factors to consider while loading the data?
6. How to tune your queries?
7. How to tune your aggregates?
8. What are the different options in Oracle for a good performing BW system?
9. What are the different tools available to tune a BW system? (With screenshots).
10. What are the best practices we can follow in a BW system?
1. What are the different ways to tune an SAP system?
Aim of tuning an SAP system should focus on one major aspect. Availability of the next logical screen to all users (end users/business users/super users) with equal or unequal (depending on the business requirement) allocation of technical resources in a timely manner. And also we need to keep in mind that we have spent just the optimal amount of money on the technical resources.
There are two major paths we need follow to tune an SAP system.
Tune it depending on the business requirement.
Tune it depending on the technical requirement. Business requirement.
Consider how many Lines of businesses we have in our company. Which Lines of business uses which IT infrastructure and how efficiently or inefficiently does that LOB uses the IT infrastructure? Who are all my critical users? Is it possible to assign a part of the technical resources just for them to use? How is the growth of my database? What are the key LOB's and who are the key users influencing the growth in the database? What is the data most frequently used? Is that data available always? Likewise, the list goes on... Understanding the business requirement and we can tune the system accordingly. Technical requirement:
How many CPU's? How many disks? Is there an additional server node required? How balanced is the load? How much is the network speed? Is table stripping required? What is the hit ratio? What is the I/O contention? Should we reorganize? What is the efficiency of the operating system? How is the performance of BEX? Likewise here also the list goes on...
By gauging, analyzing and balancing the two lists of technical requirements and business requirements we can end up in a good performing SAP system. 2. What are the general settings we need to adapt in a good performing BW system?
Following are the main parameters we need to monitor and maintain for a BW system. To start with performance tuning in a BW system, we have to focus on the following parameters. Rsdb/esm/buffersize_kb.
Depending on the size of the main memory, the program buffer should be between 200 and 400 MB. Unlike in R/3 Systems, a higher number of program buffer swaps is less important in BW Systems and is often unavoidable since the information stored in the program buffer is significantly less likely to be reused. While the response times of R/3 transactions is only around several hundred milliseconds, the response times of BW queries takes seconds. However, by tuning the program buffer, you can only improve the performance by milliseconds.
Therefore, if the available main memory is limited, you should increase the size of the extended memory. However, the program buffer should not be set lower than 200 MB. If the available main memory is sufficient, the program buffer in BW 2.X/3.X systems should be set to at least 300 MB.
BW users require significantly more extended memory than R/3 users. The size of the extended memory is related to the available main memory but should not be lower than 512 MB.
Set the Maximum work process runtime parameter to maximum and also set the timeout sessions to be high. Set the parameter dbs/ora/array_buf_size to a sufficiently large size to keep the number of array inserts, for example, when you upload data or during the rollup, as low as possible. This improves the performance during insert operations.
The main performance-related tables in the BW environment are:
- F-Fact tables: /BI0/F<cube>
- E-Fact tables: /BI0/E<cube>
- Dimension tables: /BI0/D<cube>
- SID tables: /BI0/S<characteristic>
- SID tables (navigation attribute, time-independent): /BI0/X<characteristic>
- SID tables (navigation attribute, time-dependent): /BI0/Y<characteristic>In addition to the /BI0 tables delivered by SAP, you also have customer-specific /BIC tables with a naming convention that is otherwise identical.
Since objects and partitions are frequently created and deleted in BW, and extents are thus allocated and reallocated, you should use Locally Managed Table spaces (LMTS) in the BW environment wherever possible.
Since numerous hashes, bitmap and sort operations are carried out in the BW environment especially; you must pay particular attention to the configuration of the PGA and PSAPTEMP table spaces. These components are crucial factors in the performance of processing the operations described. You must therefore ensure that PGA_AGGREGATE_TARGET is set to a reasonable size and that PSAPTEMP is in a high-speed disk area. It may be useful to add up to 40 % of the memories available for Oracle to the PGA.
If you work with large hierarchies, you have to increase the size of this buffer considerably. You should be able to store at least 5,000 objects in the buffer.
The BW basis parameters must be set optimally for the BW system to work without errors and the system to perform efficiently. The recommendations for BW systems are not always the same as those for R/3 systems.
3. What are the factors which influence the performance on a BW system?
There are three major factors that influence the performance of a BW system.
ü How we administer the BW system?
ü Technical resources available.
ü How the entire BW landscape is designed?
First step to resolve most of the problems in BW system is Archive. Archive the most amount of data you can.Archive data from Info Cubes and ODS objects and delete the archived data from the BW database. This reduces the data volume and, thus, improves upload and query performance.
An archiving plan can also affect the data model. For a yearly update, an Multiprovider partitioning per year
The archiving process in the BW system works slightly differently to that in an R/3 environment. In an R/3 system, the data is written to an archive file. Afterwards, this file is read and the deleted from the database, driven by the content of the file. In a BW system, the data from the archived file is not used in the deletion process (only verified to be accessible and complete). The values of the selection characteristics, which have been used for retrieving data in the 'Write' job, are passed to the selective deletion of the data target. This is the same functionality that is available within data target management in the Administrator Workbench ('Contents' tab strip). This functionality tries to apply an optimal deletion strategy, depending on the values selected, that is, it drops a partition when possible or copies and renames the data target when more than a certain percentage of the data has to be deleted.
Reloading archived data should be an exception rather than the general case, since data should be archived only if it is not needed in the database anymore. When the archived data target is serving also as a data mart to populate other data targets, we recommend that you load the data to a copy of the original (archived) data target, and combine the two resulting data targets with a MultiProvider.
In order to reload the data to a data target, you have to use the export Data Source of the archived data target. You then trigger the upload either by using 'Update ODS data in data target' or by replicating the Data Sources of the MYSELF source system and subsequently scheduling an Info Package for the respective Info Source. In this scenario we have used the first option. Load balancing:
Load balancing provides the capability to distribute processing across several servers in order to optimally utilize the server resources that are available. An effective load balancing strategy can help you to avoid inefficient situations where one server is overloaded (and thus performance suffers on that server), while other servers go underutilized. The following processes can be balanced:
? Logon load balancing (via group login): This allows you to distribute the workload of multiple query/administration users across several application servers.
? Distribution of web users across application servers can be configured in the BEx service in SICF.
And also, Process chains, Data loads and data extractions should be routed to perform in specific target servers.
In some cases, it is useful to restrict the extraction or data load to a specific server (in SBIW in an SAP source system, or SPRO in BW), i.e. not using load balancing. This can be used for special cases where a certain server has fast CPUs and therefore you may want to designate it as an extraction or data load server.
Reorganize the table:
Logs of several processes are collected in the application log tables. These tables tend to grow very big as they are not automatically deleted by the system and can impact the overall system performance.
Table EDI40 can also grow very big depending on the number of IDOC records.
Depending on the growth rate (i.e., number of processes running in the system), either schedule the reorganization process (transaction SLG2) regularly or delete log data as soon as you notice significant DB time spent in table BALDAT (e.g., in SQL trace).
Delete regularly old RSDDSTAT entries.If several traces and logs run in the background, this can lead to bad overall performance and sometimes it's difficult to discover all active logs. So be sure to switch off traces and logs as soon as they are not used any more.
Technical resources available:
The capacity of the hardware resources represents highly significant aspect of the overall performance of the BW system in general. Insufficient resources in any one area can constraint performance capabilities
? Number of CPUs
? Speed of CPUs
? Disk architecture (e.g. RAID)
A BW environment can contain a DB server and several application servers. These servers can be configured individually (e.g. number of dialog and batch processes), so that the execution of the different job types (such as queries, loading, DB processes) can be optimized. The general guideline here is to avoid hot spots and bottlenecks.
For optimizing the hardware resources, it is recommended to define at least two operation modes: one for batch processing (if there is a dedicated batch window) with several batch processes and one for the query processing with several dialog processes.
Different application servers have separate buffers and caches. E.g. the OLAP cache (BW 3.x) on one application server does not use the OLAP cache on other servers.
BW landscape design:
Info Cube modeling is the process by which business reporting requirements are structured into an object with the facts and characteristics that will meet the reporting needs.
Characteristics are structured together in related branches called dimensions.
The key figures form the facts.
The configuration of dimension tables in relation to the fact table is denoted as "star schema".
For a BW system to perform better we should not combine dynamic characteristics in the same dimension in order to keep dimensions rather small. Example: Don't combine customer and material in one dimension if the two characteristics are completely independent. As a general rule, it makes more sense to have many smaller dimensions vs. fewer larger dimensions. Dimension tables should be sized less than 10% of the fact table.
Use MultiProvider (or logical) partitioning to reduce the sizes of the Info Cubes.
Example: Define Info Cubes for one year and join them via a MultiProvider so we can have parallel access to underlying basis Info Cubes, load balancing, and resource utilization.
Define large dimensions as line item dimensions (e.g. document number or customer number) if (as a rule of thumb) the dimension table size exceeds 10 % of the fact table(s) size; B-tree is generally preferable for cases where there is high cardinality (high number of distinct values)
Info Cubes containing non-cumulative key figures should not be too granular. A high granularity will result in a huge amount of reference points which will impact aggregate build significantly. Reference points can only be deleted by deleting an object key not specifying the time period, i.e. all available records for this key are deleted.
The data model has tremendous impact on both query AND load performance. E.g. bad dimension model. Example: Customer and material in one dimension instead of separate dimensions can lead to huge dimension tables and thus slows down query performance, as it is expensive to join a huge dimension table to a huge fact table. Transaction RSRV can be used to check the fact to dimension table ratio.
As non-cumulative key figures are well defined for every possible point in time (according to the calculation algorithm), it could make sense to restrict the validity to a certain time period. Example: If a plant is closed, it should not show up any stock figures. These objects can be defined as validity objects. Note that for every entry in the validity table, a separate query is generated at query runtime.
4. What are the factors to consider while extracting data from source system?
Data load performance can be affected by following key aspects.
Customer exits. à Check with RSA3, SE30 and ST05
Resource utilization. à SM50 / SM51
Load balancing. à SM50 / SM51 (Configure ROIDOCPRMS)
Data package size.
Indices on tables. à ST05
Flat file format.
Content Vs generic extractor. The size of the packages depends on the application, the contents and structure of the documents. During data extraction, a dataset is collected in an array (internal table) in memory. The package size setting determines how large this internal table will grow before a data package is sent. Thus, it also defines the number of Commit's on DB level.
Use RSMO and RSA3 to monitor the load.
Indices can be built on Data Source tables to speed up the selection process.
If there is a poor performance in data load, refer the following note
Note 417307 - Extractor package size: Collective note for applications.
If you define selection criteria in your Info Package and the selection of the data is very slow, consider building indices on the Data Source tables in the source system.
5. What are the factors to consider while loading the data?
There are two major aspects to consider while loading data.
O/S Monitors. I/O contention.
High number of DB writes during large data loads.
Disk Layout and Striping. (What is located on the same disk or table space/DB space etc.?)At the time of data load we need to also check the transformation rules. à Use SE30 and ST05.The master data load creates all SIDs and populates the master data tables (attributes and/or texts). If the SIDs does not exist when transaction data is loaded, these tables have to be populated during the transaction data load, which slows down the overall process.
Another major function which could be performed at data load is buffering number ranges.SID number range can be buffered instead of accessing the DB for each SID.
If you encounter massive accesses to DB table NRIV via SQL trace (ST05), increase the number range buffer in transaction SNRO.
Always load master data before transaction data. The transaction data load will be improved, as all master data SIDs are created prior to the transaction data load, thus precluding the system from creating the SIDs at the time of load.
In transaction RSCUSTV6 the size of each PSA partition can be defined. This size defines the number of records that must be exceeded to create a new PSA partition. One request is contained in one partition, even if its size exceeds the user-defined PSA size; several packages can be stored within one partition.
The PSA is partitioned to enable fast deletion (DDL statement DROP PARTITION). Packages are not deleted physically until all packages in the same partition can be deleted.
Transformation rules are transfer rules and update rules. Start routines enable you to manipulate whole data packages (database array operations) instead of changing record-by-record. In general it is preferable to apply transformations as early as possible in order to reuse the data for several targets.
Flat files: Flat files can be uploaded either in CSV format or in fixed-length ASCII format. If you choose CSV format, the records are internally converted in fixed-length format, which generates overhead.
You can upload files either from the client or from the application server. Uploading files from the client workstation implies sending the file to the application server via the network - the speed of the server backbone will determine the level of performance impact, Gigabit backplanes make this a negligible impact.
The size (i.e., number of records) of the packages, the frequency of status IDocs can be defined in table RSADMINC (Transaction RSCUSTV6) for the flat file upload. If you load a large amount of flat file data, it is preferable to use fixed-length ASCII format, to store the files on the application server rather than on the client and to set the parameters according the recommendations in the referenced note.
If possible, split the files to achieve parallel upload. We recommend as many equally-sized files as CPUs are available. 6 / 7. How to tune your queries and aggregates?
The data in a Data Warehouse is largely very detailed. In SAP BW, the Info Cube is the primary unit of storage for data for reporting purposes. The results obtained by executing a report or query represent a summarized dataset.
An aggregate is a materialized, summarized view of the data in an Info Cube. It stores a subset of Info Cube data in a redundant form. When an appropriate aggregate for a query exists, summarized data can be read directly from the database during query execution, instead of having to perform this summarization during runtime. Aggregates reduce the volume of data to be read from the database, speed up query execution time, and reduce the overall load on the database.
A sound data model in BW should comprise of the following
The main purpose of aggregate is to accelerate the response time of the queries, by reducing the amount of data that must be read in the database for a navigation step. Grouping and filtering will enhance the value of an aggregate.
We can group according to the characteristic or attribute value, according to the nodes of the hierarchy level, and also filter according to a fixed value.
It is guaranteed that queries always deliver consistent data when you drilldown. This means that data provided when querying against an aggregate is always from the same set of data that is visible within an Info Cube.
New data packets / requests that are loaded into the InfoCube cannot be used at first for reporting if there are aggregates that are already filled. The new packets must first be written to the aggregates by a so-called "roll-up". Data that has been recently loaded into an InfoCube is not visible for reporting, from the InfoCube or aggregates, until an aggregate roll-up takes place. During this process you can continue to report using the data that existed prior to the recent data load. The new data is only displayed by queries that are executed after a successful roll-up. See the attachment for more details on the technical process of a roll-up.
The split of a query is rule-based.
Parts of the query on different aggregation level are split.
Parts with different selections on characteristic are combined.
Parts on different hierarchy levels or parts using different hierarchies are split.
After the split, OLAP processor searches for an optimal aggregate each part. Parts which use the same aggregate will be combined again (in some cases it is not possible to combine them)
Maintaining an aggregate: RSDDV.
After selecting a particular info cube, we could drill down to the options of the aggregate to tune each of them.
This is the same screen for BI Accelerator index.
This is another important T code where we could perform the following actions. Possible actions
Restart host: restarts the BI accelerator hardware
Restart BIA server: restarts all the BI accelerator servers and services. This includes the name server and index server
Restart BIA index server: restarts the index server. (The name servers are not restarted.) Rebuild BIA indexes: If a check discovers inconsistencies in the indexes, you can use this action to delete and rebuild all the BI accelerator indexes.
Reorganize BIA landscape: If the BI accelerator server landscape is unevenly distributed, this action redistributes the loaded indexes on the BI accelerator servers
In our system BIA monitor is not set up. So, we need to set up this. Here am not going to set up this, because it might affect few other RFC destinations.
Query design: Multi-dimensional Query.
Inclusion / Exclusion.
Multi provider query.
Query read mode.
Every Query should start with a relatively small result set; let the user drill down to more detailed information.
Do not use ODS objects for multi-dimensional reporting.
Queries on Multi Providers usually access all underlying Info Providers, even if some cannot be hit as no key figures within the query definition are contained in this Info Provider.
In ORACLE, fact tables can be indexed either by bitmap indices or by B-tree indices. A bitmap index stores a bitmap stream for every characteristic value. Bitmap indices are suitable for characteristics with few values. Binary operations (AND or OR) are very fast.
B-tree indices are stored in a (balanced) tree structured. If the system searches one entry, it starts at the root and follows a path down to the leaf where the row ID is linked. B-tree indices suit for characteristics with lots of values.
In some cases, ORACLE indices can degenerate. Degeneration is similar to fragmentation, and reduces the performance efficiency of the indexes. This happens when records are frequently added and deleted.
The OLAP Cache can help with most query performance issues. For frequently used queries, the first access fills the OLAP Cache and all subsequent calls will hit the OLAP Cache and do not have to read the database tables. In addition to this pure caching functionality, the Cache can also be used to optimize specific queries and drill-down paths by 'warming up' the Cache; with this you fill the Cache in batch to improve all accesses to this query data substantially.
8. What are the different options in Oracle for a good performing BW system?
The purpose of disk layout is to avoid I/O hot spots by distributing the data accesses across several physical disks. The goal is to optimize the overall throughput to the disks.
The basic rule is: stripe over everything, including RAID-subsystems.
Managing table spaces:Locally-Managed Table spaces manage their own extents by maintaining bitmaps in each data file. The bitmaps correspond to (groups of) blocks.
Be sure that all (bigger) table spaces are locally managed. Extent and partition maintenance is drastically improved, as DB dictionary accesses are minimized. Administration maintenance is also reduced.
Parallel query option:
ORACLE can read database table contents in parallel if this setting is active. BW uses this feature especially for staging processes and aggregate build. The Parallel Query Option is used by default. Be sure, that the init<SID>.ora-entries for PARALLEL_MAX_SERVERS are set appropriate to the recommendations in the ORACLE note.
Table partitions are physically separated tables, but logically they are linked to one table name. PSA tables and non-compressed F-fact table are partitioned by the system (by request ID). The (compressed) E-fact table can be partitioned by the user by certain time characteristics. For range-partitioned InfoCubes, the SID of the chosen time characteristic is added to both fact tables.
When using range partitioning, query response time is generally improved by partition pruning on the E fact table: all irrelevant partitions are discarded and the data volume to be read is reduced by the time restriction of the query.
In ORACLE, report SAP_DROP_EMPTY_FPARTITIONS can help you to remove unused or empty partitions of InfoCube or aggregate fact tables. Unused or empty partitions can emerge in case of selective deletion or aborted compression and may affect query performance as all F fact table partitions are accessed for queries on the InfoCube.
9. What are the different tools available to tune a BW system?
RSMO is used to monitor data flow to target system from source system. We can see data by request, source system, time request id etc. It provides all necessary information on times spent in different processes during the load (e.g., extraction time, transfer, posting to PSA, processing transformation rules, writing to fact tables). In the upload monitor you are also able to debug transfer and update rules.
If the extraction from an SAP source system consumes significant time, use the extractor checker (transaction RSA3) in the source system.
If the data transfer times are too high, check if too many work processes are busy (if so, avoid large data loads with "Update Data Targets in Parallel" method), and check swapping on one application server (set "rdisp/bufrefmode = "sendoff, exeauto" during load phase if you use several application servers).
The Query Monitor (transaction RSRT) allows you to execute queries and to trace queries in a debug mode with several parameters (e.g., do not use aggregates, do not use buffer, show SQL statement).
In the debug mode, you can investigate if the correct aggregate(s) are used and which statistics the query execution generates. For checking reasons, you can switch off the usage of aggregates, switch to no parallel processing (see for more details in the MultiProvider section) or display the SQL statement and the run schedule.
Select a particular query and then click on performance info.
Like this query we can generate detailed performance info for every query. Below is the screen shot containing the detailed information for this query.
RSRTRACE. The Query Trace Tool (transaction RSRTRACE) allows you to record some important function module calls and process and debug them at a later stage. Transaction RSRCATTTRACE takes the log of RSRTRACE as input and gives aggregates suggestions for the first execution AND all further navigations performed.
RSRV:BW objects can be checked for consistency in transaction RSRV and inconsistent objects can be repaired.
Apart from these BW tools, we have standard ABAP based tools like ST05, ST03n, SE30, SM50 and SM51 to check and measure the performance of the system.
In SE 30, we have special options like if cases, field conversions and monitoring the SQL interface.
ST05: The SQL trace (transaction ST05) records all activities on the database and enables you to check long runtimes on a DB table or several similar accesses to the same data.
If we find problems for an isolated process (upload or query) and we have analyzed for example the existence of aggregates, we can detail our analyses by using the SQL trace. Filter on a specific user (e.g. query user or extraction user ALEREMOTE) and make sure that no concurrent jobs run at the same time with this execution. We will find out which tables are accessed, what time is consumed and if some tables are accessed redundantly.
Another important tool to be used is ST10.
Here we can find out the statistics of the table and get more detailed info on a particular table. If we assume a general buffer problem, check ST10 and check the buffer settings of all tables; compare usage of buffer vs. invalidations.
ST04, DB02, SM50, SM51, ST02, ST06 are some the important tools which we normally use in R/3. These transaction codes should be extensively used here as well for gauging and optimizing the performance of the system. 10.What are the best practices we can follow in a BW system?
Best practices for a production BW system can be drafted only with a close interaction with the functional team and technical team and the nature of the production system.
Here, are couple of best practices we could implement to improve the performance.
Activate Transfer rule for info source:When you have maintained the transfer structure and the communication structure, you can use the transfer rules to determine how the transfer structure fields are to be assigned to the communication structure InfoObjects. You can arrange for a 1:1 assignment. But you can also fill InfoObjects using routines or constants.
The scheduler is the connecting link between the source systems and the SAP Business Information Warehouse. Using the scheduler you can determine when and from which InfoSource, DataSource, and source system, data (transaction data, master data, texts or hierarchies) is requested and updated.
The principle behind the scheduler relates to the functions of SAP background jobs. The data request can be scheduled either straight away or it can be scheduled with a background job and started automatically at a later point in time. We get to the data request via the Scheduler in the Administration Workbench Modeling, by choosing InfoSource Tree ® Your Application Component ® InfoSources ® Source System ® Context Menu ® Create InfoPackage
Assign several info sources: Assign several DataSources to one InfoSource, if you want to gather data from different sources into a single InfoSource. This is used, for example, if data from different IBUs that logically belongs together is grouped together in BW.
The fields for a DataSource are assigned to InfoObjects in BW. This assignment takes place in the same way in the transfer rules maintenance.