SAP Data Services: Strategies to execute jobs

Maximizing push-down operations to the database server

SAP BusinessObjects Data Services generates SQL SELECT statements to retrieve the data from source databases. The software automatically distributes the processing workload by pushing down as much as possible to the source database server.

Pushing down operations provides the following advantages:

Use the power of the database server to execute SELECT operations such as joins and GROUP BY, as well as common functions such as decode and string functions. Often the database is optimized for these operations.
Minimize the amount of data sent over the network. Fewer rows are retrieved when the SQL statements include filters or aggregations.

You can also do a full push down from the source to the target, which means the software sends SQL INSERT INTO ... SELECT statements to the target database. The following features enable a full push down:
Data_Transfer transform
Database links and linked datastores
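
The effect of pushing operations down can be sketched outside the product. The following Python example is only an illustration, not SQL generated by Data Services: it uses an in-memory SQLite database as a stand-in for the source and target, and the table names `orders` and `region_totals` are hypothetical. It compares retrieving every row and aggregating in the client, pushing the filter and aggregation into the SELECT, and a full push down with a single INSERT INTO ... SELECT.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, amount REAL);
    CREATE TABLE region_totals (region TEXT, total REAL);
    INSERT INTO orders VALUES
        ('EAST', 10.0), ('EAST', 15.0),
        ('WEST', 7.5), ('WEST', 2.5),
        ('NORTH', 4.0);
""")

# No push down: every source row is retrieved, and the filter and
# aggregation run in the client.
all_rows = conn.execute("SELECT region, amount FROM orders").fetchall()
client_totals = {}
for region, amount in all_rows:
    if region != "NORTH":
        client_totals[region] = client_totals.get(region, 0.0) + amount

# Partial push down: WHERE and GROUP BY run in the database, so only
# the aggregated rows are retrieved.
pushed_rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders "
    "WHERE region <> 'NORTH' GROUP BY region ORDER BY region"
).fetchall()

# Full push down: one INSERT INTO ... SELECT statement; the client
# retrieves no rows while loading the target table.
conn.execute(
    "INSERT INTO region_totals "
    "SELECT region, SUM(amount) FROM orders "
    "WHERE region <> 'NORTH' GROUP BY region"
)
conn.commit()

# Read the target back only to confirm the result of the load.
target_rows = conn.execute(
    "SELECT region, total FROM region_totals ORDER BY region"
).fetchall()

print("rows retrieved without push down:", len(all_rows))
print("rows retrieved with push down:", len(pushed_rows))
```

In this sketch the source and target tables live in the same database, which is why a single statement can load the target. When they are in different databases, the features listed above are what make a full push down possible.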

Use the following features to improve throughput:

You can improve the performance of data transformations by caching as much data as possible. By caching data in memory, you limit the number of times the system must access the database.

The software supports database bulk loading engines, including the Oracle bulk load API. You can have multiple bulk load processes running in parallel.
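
Why caching limits database access can be shown with a small sketch. The Python example below is an assumption-laden illustration rather than the product's cache implementation: the `customers` table, the helper functions, and the `counter` dictionary are all hypothetical, and SQLite stands in for the source database. It counts how many queries are issued when each incoming row triggers its own lookup, compared with loading the lookup table into memory once.

```python
import sqlite3

# Hypothetical lookup table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
""")

# Counts the queries sent to the database by the helpers below.
counter = {"queries": 0}


def lookup_uncached(customer_id):
    """Query the database once for every incoming row."""
    counter["queries"] += 1
    row = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


def load_cache():
    """Read the whole lookup table into memory with a single query."""
    counter["queries"] += 1
    return dict(conn.execute("SELECT id, name FROM customers"))


incoming_ids = [1, 2, 1, 3, 2, 1]

# Without a cache: one database access per incoming row.
uncached_names = [lookup_uncached(i) for i in incoming_ids]
uncached_queries = counter["queries"]

# With a cache: one database access in total, then in-memory lookups.
counter["queries"] = 0
cache = load_cache()
cached_names = [cache.get(i) for i in incoming_ids]
cached_queries = counter["queries"]

print("queries without cache:", uncached_queries)
print("queries with cache:", cached_queries)
```

The trade-off is memory: the cached table must fit in the memory available to the process, which is why the benefit depends on how much data can be cached.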

If your jobs have CPU-intensive and memory-intensive operations, you can use the following advanced tuning features to improve performance:

Parallel processes: Individual work flows and data flows can execute in parallel if you do not connect them in the Designer workspace.
Parallel threads: The software supports partitioned source tables, partitioned target tables, and degree of parallelism. These options allow you to control the number of instances of a source, target, and transform that can run in parallel within a data flow. Each instance runs as a separate thread and can run on a separate CPU.
Server groups and distribution levels: You can group Job Servers on different computers into a logical component called a server group. A server group automatically measures resource availability on each Job Server in the group and, at runtime, distributes scheduled batch jobs to the computer with the lightest load. This functionality also provides a hot backup method: if one Job Server in a server group is down, another Job Server in the group processes the job. You can also distribute the execution of data flows or sub data flows within a batch job across multiple Job Servers in a server group to better balance resource-intensive operations.
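
The server group behavior described above (send the job to the least-loaded Job Server, and fall back to another one when a server is down) can be sketched as a simple selection rule. This Python example is a hypothetical model, not the product's scheduling logic: the `JobServer` class, the `load` index, and the server names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class JobServer:
    """Hypothetical model of one Job Server in a server group."""
    name: str
    load: float             # assumed load index; lower means more idle
    available: bool = True  # False when the Job Server is down


def pick_job_server(group: Sequence[JobServer]) -> Optional[JobServer]:
    """Return the available server with the lightest load, or None."""
    candidates = [server for server in group if server.available]
    if not candidates:
        return None
    return min(candidates, key=lambda server: server.load)


group = [
    JobServer("js_a", load=0.72),
    JobServer("js_b", load=0.31),
    JobServer("js_c", load=0.55),
]

# Normal case: the job goes to the server with the lightest load.
first_choice = pick_job_server(group)

# Hot backup case: the preferred server is down, so the next
# lightest-loaded server in the group takes the job.
group[1].available = False
fallback = pick_job_server(group)

print("first choice:", first_choice.name)
print("fallback:", fallback.name)
```

Under these assumptions the first job is assigned to `js_b`, and once `js_b` is marked unavailable the next job is assigned to `js_c`.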