User manual: BUSINESS OBJECTS DATA INTEGRATOR 11.7.1 FOR WINDOWS AND UNIX PERFORMANCE OPTIMIZATION GUIDE


[. . . ] Data Integrator Performance Optimization Guide
Data Integrator 11.7.1 for Windows and UNIX

Patents: Business Objects owns the following U.S. patents, which may cover products that are offered and sold by Business Objects: 5,555,403; 6,247,008 B1; 6,578,027 B2; 6,490,593; and 6,289,352. [. . . ]

With this capability, Data Integrator can distribute CPU-intensive and memory-intensive operations (such as joins, grouping, table comparisons, and lookups). This distribution of data flow execution provides the following potential benefits:
· Better memory management, by taking advantage of more CPU power and physical memory
· Better job performance and scalability, by taking advantage of grid computing

You can create sub data flows so that Data Integrator does not need to process the entire data flow in memory at one time. You can also distribute the sub data flows to different Job Servers within a server group to use additional memory and CPU resources.

This section contains the following topics:
· Splitting a data flow into sub data flows
· Using grid computing to distribute data flow execution

Splitting a data flow into sub data flows

Use the following features to split a data flow into multiple sub data flows:
· Run as a separate process option
· Data_Transfer transform

Run as a separate process option

If your data flow contains multiple resource-intensive operations, you can run each operation as a separate process (sub data flow) that uses its own memory and computer resources, improving performance and throughput.
When you specify the Run as a separate process option on multiple objects in a data flow, Data Integrator splits the data flow into sub data flows that run in parallel. The Run as a separate process option is available on resource-intensive operations, including the following:
· Hierarchy_Flattening transform
· Query operations that are CPU-intensive and memory-intensive:
  · Join
  · GROUP BY
  · ORDER BY
  · DISTINCT
· Table_Comparison transform
· Lookup_ext function
· Count_distinct function

Examples of multiple processes for a data flow

A data flow can contain multiple resource-intensive operations that each require large amounts of memory or CPU utilization. You can run each resource-intensive operation as a separate process that can use more memory, either on a different computer or on the same computer if it has more than two gigabytes of memory.

For example, you might have a data flow that sums sales amounts from a lookup table and groups the sales by country and region to find which regions are generating the most revenue. The data flow contains one Query transform that calls the lookup_ext function to obtain sales subtotals and a second Query transform that groups the results by country and region.

To define separate processes in this sample data flow, take one of the following actions:
· When you define the lookup_ext function in the first Query transform, select the Run as a separate process option.
· When you define the Group By operation in the second Query transform, select the Run GROUP BY as a separate process option on the Advanced tab.
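Conceptually, each sub data flow is an operating-system process connected to its neighbors by a data pipe. The following minimal Python sketch mimics the lookup-then-group-by example above with hypothetical row data; Data Integrator manages the real processes internally, so this only illustrates the execution model, not the product's implementation:

```python
from multiprocessing import Process, Queue

def lookup_subtotals(rows, out_q):
    # Stand-in for sub data flow 1: the Query transform that calls
    # lookup_ext to produce (region, sales subtotal) rows.
    for region, amount in rows:
        out_q.put((region, amount))
    out_q.put(None)  # end-of-data marker

def group_by_region(in_q, result_q):
    # Stand-in for sub data flow 2: the Query transform with GROUP BY.
    totals = {}
    while (item := in_q.get()) is not None:
        region, amount = item
        totals[region] = totals.get(region, 0) + amount
    result_q.put(totals)

if __name__ == "__main__":
    rows = [("EMEA", 100), ("APAC", 50), ("EMEA", 25)]  # hypothetical data
    q, result = Queue(), Queue()
    p1 = Process(target=lookup_subtotals, args=(rows, q))
    p2 = Process(target=group_by_region, args=(q, result))
    p1.start(); p2.start()   # two separate Pids, running in parallel
    print(result.get())      # {'EMEA': 125, 'APAC': 50}
    p1.join(); p2.join()
```

Because each stage is its own process, each gets its own address space, which is the same reason a sub data flow can use memory and CPU beyond what a single process could.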
Scenario 1: Run multiple sub data flows with DOP set to 1

The following diagram shows how Data Integrator splits this data flow into two sub data flows when you specify the Run as a separate process option for either the lookup_ext function or the Group By operation. Data Integrator generates sub data flow names that follow this format:

DFName_executionGroupNumber_indexInExecutionGroup

· DFName is the name of the data flow.
· executionGroupNumber is the order in which Data Integrator executes a group of sub data flows.
· indexInExecutionGroup identifies the sub data flow within an execution group.

When you execute the job, the Trace Log shows that Data Integrator creates two sub data flows that execute in parallel and have different process IDs (Pids). For example, the trace log shows two sub data flows, GroupBy_DF_1_1 and GroupBy_DF_1_2, that start at the same time and each have a different Pid than the parent data flow GroupBy_DF.

Scenario 2: Run multiple sub data flows with DOP greater than 1

When Degree Of Parallelism (DOP) is set to a value greater than 1, each transform defined in the data flow replicates for use on a parallel subset of data. For more information, see "Degree of parallelism" on page 76. Set DOP to a value greater than 1 in the data flow Properties window. The following diagram shows the sub data flows that Data Integrator generates for GroupBy_DOP2_Job when Run GROUP BY as a separate process is selected and DOP is set to 2.
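The generated names in both scenarios follow the documented format, which can be sketched as a simple string composition (this helper is illustrative only, not part of the product):

```python
def sub_data_flow_name(df_name: str, execution_group: int, index_in_group: int) -> str:
    """Compose a sub data flow name per the documented format:
    DFName_executionGroupNumber_indexInExecutionGroup."""
    return f"{df_name}_{execution_group}_{index_in_group}"

# Scenario 1 (DOP = 1): one execution group with two sub data flows.
print([sub_data_flow_name("GroupBy_DF", 1, i) for i in (1, 2)])
# ['GroupBy_DF_1_1', 'GroupBy_DF_1_2']

# Scenario 2 (DOP = 2): the same group, now with four sub data flows.
print([sub_data_flow_name("GroupBy_DOP2_DF", 1, i) for i in range(1, 5)])
# ['GroupBy_DOP2_DF_1_1', 'GroupBy_DOP2_DF_1_2', 'GroupBy_DOP2_DF_1_3', 'GroupBy_DOP2_DF_1_4']
```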
When you execute the job, the Trace Log shows that Data Integrator creates sub data flows that execute in parallel with different process IDs (Pids). For example, the trace log shows the following four sub data flows that start concurrently and that each have a different Pid than the parent data flow GroupBy_DOP2_DF:
· GroupBy_DOP2_DF_1_1
· GroupBy_DOP2_DF_1_2
· GroupBy_DOP2_DF_1_3
· GroupBy_DOP2_DF_1_4

Tip: When your data flow has a DOP greater than one, select either job or data flow for the Distribution level option when you execute the job. If you execute the job with the value sub data flow for Distribution level, the RoundRobin Split or Hash Split sends data to replicated queries that might be executing on different Job Servers. Because the data travels over the network between different Job Servers, the entire data flow might be slower. For more information about job distribution levels, see "Using grid computing to distribute data flow execution" on page 104.

Data_Transfer transform

The Data_Transfer transform creates transfer tables in datastores to enable Data Integrator to push down operations to the database server. The Data_Transfer transform creates two sub data flows and uses the transfer table to pass the data from one sub data flow to the other. For information about the Data_Transfer editor, see "Data_Transfer" on page 275 of the Data Integrator Reference Guide.
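The effect of a transfer table can be sketched with SQLite (the table and column names here are hypothetical; Data Integrator generates the actual transfer table and SQL). Once the flat-file rows land in a table in the same datastore as the database source, the join can run as one SQL statement on the database server instead of row by row inside the engine:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Orders (order_id INTEGER, region TEXT)")
con.executemany("INSERT INTO Orders VALUES (?, ?)", [(1, "EMEA"), (2, "APAC")])

# Sub data flow 1: the Data_Transfer step bulk-loads the flat-file
# rows into a transfer table in the same datastore as Orders.
con.execute("CREATE TABLE Orders_FromFile (order_id INTEGER, amount REAL)")
file_rows = [(1, 100.0), (2, 50.0)]  # rows read from the Orders flat file
con.executemany("INSERT INTO Orders_FromFile VALUES (?, ?)", file_rows)

# Sub data flow 2: the join is now a single SQL statement that the
# database executes, i.e. the operation has been pushed down.
rows = con.execute(
    "SELECT o.region, f.amount FROM Orders o "
    "JOIN Orders_FromFile f ON o.order_id = f.order_id "
    "ORDER BY o.order_id"
).fetchall()
print(rows)  # [('EMEA', 100.0), ('APAC', 50.0)]
```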
Examples of multiple processes with Data_Transfer

The following are typical scenarios in which you might use the Data_Transfer transform to split a data flow into sub data flows so that operations can be pushed down to the database server.

Scenario 1: Sub data flow to push down a join of file and table sources

Your data flow might join an Orders flat file and an Orders table, use a lookup_ext function to obtain sales subtotals, and use another Query transform to group the results by country and region.

To define sub data flows that push down a join of a file and a table:
1. Add a Data_Transfer transform between the Orders file source and the Query transform.
2. Select the value Table from the drop-down list in the Transfer type option in the Data_Transfer editor.
3. For Table name in the Table options area, browse to the datastore that contains the source table that the Query joins to this file. Double-click the datastore name and enter a name for the transfer table in the Input table for Data_Transfer window. In this example, browse to the same datastore that contains the Orders table and enter Orders_FromFile in Table name.
4. After you save the data flow and click Validation > Display Optimized SQL, the Optimized SQL window shows that the join between the transfer table and the source Orders table is pushed down to the database. Data Integrator can push down many operations without using the Data_Transfer transform. For more information, see "Push-down operations" on page 38.
5. When you execute the job, the Trace Log shows messages indicating that Data Integrator created two sub data flows with different Pids to run the different operations serially.

Scenario 2: Sub data flow to push down memory-intensive operations

You can use the Data_Transfer transform to push down memory-intensive operations such as Group By or Order By. For the sample data flow in "Scenario 1: Sub data flow to push down a join of file and table sources" on page 98, you might want to push down the Group By operation. [. . . ]

You cannot combine bulk loading with the following options:
· Auto-correct load
· Enable Partitioning
· Number of Loaders
· Full push down to a database
· Overflow file
· Transactional loading

Data Integrator automatically selects the full push-down optimization when the following conditions are met:
· The source and target in a data flow are on the same database
· The database supports the operations in the data flow

If the optimizer pushes down source or target operations, it ignores the performance options set for sources (Array fetch size, Caching, and Join rank), because Data Integrator is not solely processing the data flow. For more information, see "Push-down operations" on page 38.

To improve performance for a regular load (parameterized SQL), you can select the following options in the target table editor. Note that if you use one, you cannot use the others for the same target.
· Enable Partitioning: Parallel loading option. The number of parallel loads is determined by the number of partitions in the target table. [. . . ]
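A rough sketch of the Enable Partitioning behavior, assuming a hypothetical three-partition target table (the real loaders and their writes are managed by Data Integrator; this only shows the one-loader-per-partition fan-out):

```python
from concurrent.futures import ThreadPoolExecutor

def load_partition(partition_id, rows):
    # Hypothetical loader: writes one partition's slice of the data
    # and reports how many rows it loaded.
    return partition_id, len(rows)

# One parallel loader per partition in the target table.
partitions = {0: ["r1", "r2"], 1: ["r3"], 2: ["r4", "r5", "r6"]}
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    loaded = list(pool.map(lambda item: load_partition(*item), partitions.items()))
print(loaded)  # [(0, 2), (1, 1), (2, 3)]
```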
