Increasing Performance and Tuning

This chapter provides an overview of SQR performance and SQL statements and discusses how to:

Simplify a complex select paragraph.
Use LOAD-LOOKUP to simplify joins.
Improve SQL performance with dynamic SQL.
Examine SQL cursor status.
Avoid temporary database tables.
Create multiple reports in one pass.
Tune SQR numerics.
Compile SQR programs and use SQR Execute.
Set processing limits.
Buffer fetched rows.
Run programs on the database server.

Understanding SQR Performance and SQL Statements

Whenever a program contains a BEGIN-SELECT, BEGIN-SQL, or EXECUTE command, it performs a SQL statement. Processing SQL statements typically consumes significant computing resources. Tuning SQL statements typically yields higher performance gains than tuning any other part of the program.

General tuning of SQL is outside the scope of this book. Tuning SQL is often specific to the type of database that you are using—tuning SQL statements for an Oracle database may be different from tuning SQL statements for DB2. This chapter focuses on SQR tools for simplifying SQL statements and reducing the number of times SQL is run.

Simplifying a Complex Select Paragraph

With relational database design, information is often normalized by storing data entities in separate tables. To display the normalized information, you must write a select paragraph that joins these tables together. With many database systems, performance suffers when you join more than three or four tables in one select paragraph.

With SQR, you can perform multiple select paragraphs and nest them. In this way, you can break a large join into several simpler selects. For example, you can break a select paragraph that joins the orders and the products tables into two selects. The first select retrieves the orders that you want. For each order that is retrieved, a second select retrieves the products that were ordered. The second select is correlated to the first select by having a condition such as:

where order_num = &order_num

This condition specifies that the second select retrieves only products for the current order.

Similarly, if the report is based on products that were ordered, you can make the first select retrieve the products and make the second select retrieve the orders for each product.

This method improves performance in many cases, but not all. To achieve the best performance, you may need to experiment with the different alternatives.

You can use master and detail reports to perform multiple select paragraphs and nest them.

See Using Dynamic SQL and Error Checking.

Using LOAD-LOOKUP to Simplify Joins

Database tables often contain key columns, such as a product code or customer number. To retrieve a certain piece of information, you join two or more tables that contain the same column. For example, to obtain a product description, you can join the orderlines table with the products table by using the product_code column as the key.

With LOAD-LOOKUP, you can reduce the number of tables that are joined in one select. Use this command with LOOKUP commands.

The LOAD-LOOKUP command defines an array containing a set of keys and values and loads it into memory. The LOOKUP command looks up a key in the array and returns the associated value. In some programs, this technique performs better than a conventional table join.

You can use LOAD-LOOKUP in the SETUP section or in a procedure. If used in the SETUP section, it is processed only once. If used in a procedure, it is processed each time that it is encountered.

LOAD-LOOKUP retrieves two fields from the database: the KEY field and the RETURN_VALUE field. Rows are ordered by KEY and stored in an array. The KEY field must be unique and contain no null values.

When the LOOKUP command is used, the array is searched (by using a binary search) to find the RETURN_VALUE field corresponding to the KEY that is referenced in the lookup.

The following code example illustrates LOAD-LOOKUP and LOOKUP:

begin-setup
  load-lookup   
     name=prods 
      table=products
      key=product_code
      return_value=description
end-setup
...
begin-select
order_num (+1,1)
product_code
  lookup prods &product_code $desc
  print $desc (,15)
from orderlines
end-select

In this code example, the LOAD-LOOKUP command loads an array with the product_code and description columns from the products table. The lookup array is named prods. The product_code column is the key and the description column is the return value. In the select paragraph, a LOOKUP on the prods array retrieves the description for each product_code. This technique eliminates the need to join the products table in the select.

If the orderlines and products tables were joined in the select (without LOAD-LOOKUP), the code would look like this:

begin-select
order_num (+1,1)
ordlines.product_code
description (,15)
from ordlines, products
where ordlines.product_code = products.product_code
end-select

Whether a database join or LOAD-LOOKUP is faster depends on the program. LOAD-LOOKUP improves performance when:

It is used with multiple select paragraphs.
It keeps the number of tables being joined from exceeding three or four.
The number of entries in the LOAD-LOOKUP table is small compared with the number of rows in the select, and they are used often.
Most entries in the LOAD-LOOKUP table are used.

Note. You can concatenate columns if you want RETURN_VALUE to return more than one column. The concatenation symbol is database specific.

Improving SQL Performance with Dynamic SQL

You can use dynamic SQL in some situations to simplify a SQL statement and gain performance:

begin-select
order_num
from orders, customers
where order.customer_num = customers.customer_num
and ($state = 'CA' and order_date > $start_date
     or $state != 'CA' and ship_date > $start_date)
end-select

In this example, a given value of $state, order_date, or ship_date is compared with $start_date.The OR operator in the condition makes such multiple comparisons possible. With most databases, an OR operator slows processing. It can cause the database to perform more work than necessary.

However, the same work can be done with a simpler select. For example, if $state is ‘CA,’ the following select works:

begin-select order_num from orders, customers where order.customer_num = customers.customer_num and order_date > $start_date end-select

Dynamic SQL enables you to check the value of $state and create the simpler condition:

if $state = 'CA'
   let $datecol = 'order_date'
else
   let $datecol = 'ship_date'
end-if
begin-select
order_num
from orders, customers
where order.customer_num = customers.customer_num
and [$datecol] > $start_date
end-select

The [$datecol] substitution variable substitutes the name of the column to be compared with $start_date. The select is simpler and no longer uses an OR operator. In most cases, this use of dynamic SQL improves performance.

See Using Dynamic SQL and Error Checking.

Examining SQL Cursor Status

Because SQR programs select and manipulate data from a SQL database, you should understand how SQR processes SQL statements and queries.

SQR programs can perform multiple SQL statements. Moreover, they can run the same SQL statement multiple times.

When a program runs, a pool of SQL statement handles, called cursors, is maintained. A cursor is a storage location for one SQL statement—for example, SELECT, INSERT, or UPDATE. Every SQL statement uses a cursor for processing. A cursor holds the context for the execution of a SQL statement.

The cursor pool contains 30 cursors, and you cannot change its size. When a SQL statement is rerun, its cursor can be immediately reused if it is still in the cursor pool. When an SQR program runs more than 30 different SQL statements, cursors in the pool are reassigned.

To examine how cursors are managed, use the -S command-line flag. This flag displays cursor status information at the end of a run.

The following information appears for each cursor:

Cursor #nn:
SQL = <SQL statement>
Compiles = nn
Executes = nn
Rows = nn

The listing also includes the number of compiles, which vary according to the database and the complexity of the query. With Oracle, for example, a simple query is compiled only once. With Sybase, a SQL statement is compiled before it is first run and recompiled for the purpose of validation during the SQR compile phase. Therefore, you may see two compiles for a SQL statement. Later, when the SQL is rerun, if its cursor is found in the cursor pool, it can proceed without recompiling.

Avoiding Temporary Database Tables

This section provides an overview of temporary database tables and discusses how to:

Use and sort arrays.
Use and sort flat files.

Understanding Temporary Database Tables

Programs often use temporary database tables to hold intermediate results. Creating, updating, and deleting temporary tables is a resource-consuming task, however, and can slow the program’s performance. SQR provides two alternatives to using temporary database tables:

Store intermediate results in an SQR array.
Store intermediate results in a local flat file.

Both techniques can yield a significant performance gain. Use the SQR language to manipulate the data that is stored in an array or a flat file.

Using and Sorting Arrays

An SQR array can hold as many records as can fit in memory. During the first pass, when records are retrieved from the database, you can store them in the array. Subsequent passes on the data can be made without additional database access.

The following code example retrieves records, prints them, and saves them into an array named customer_array:

create-array name=customer_array size=1000
   field=state:char   field=city:char
   field=name:char    field=phone:char
let #counter = 0
begin-select
state (,1)
city  (,7)
name  (,24)
phone (,55)
   position (+1)
   put &state &city &name &phone into customer_array(#counter)
   add 1 to #counter
from customers
end-select

The customer_array array has four fields that correspond to the four columns that are selected from the customers table, and it can hold up to 1,000 rows. If the customers table had more than 1,000 rows, you would need to create a larger array.

The select paragraph prints the data. The PUT command then stores the data in the array. You could use the LET command to assign values to array fields; however, the PUT command performs the same work, with fewer lines of code. With PUT, you can assign all four fields in one command.

The #counter variable serves as the array subscript. It starts with zero and maintains the subscript of the next available entry. At the end of the select paragraph, the value of #counter is the number of records in the array.

The next code example retrieves the data from customer_array and prints it:

let #i = 0
while #i < #counter
   get $state $city $name $phone from customer_array(#i)
   print $state (,1)
   print $city  (,7)
   print $name  (,24)
   print $phone (,55)
   position (+1)
   add 1 to #i
end-while

In this code example, #i goes from 0 to #counter– 1. The fields from each record are moved into the corresponding variables: $name, $city, $state, and $phone. These values are then printed.

Sorting Arrays

In many cases, intermediate results must be sorted by a different field. The following sample program indicates how to sort customer_array by name. The sample program uses a well-known sorting algorithm called QuickSort. You can copy this code into your program, make appropriate changes, and use it to sort your array:

Program ex24a.sqr
#define MAX_ROWS 1000
begin-setup
create-array name=customer_array size={MAX_ROWS}
   field=state:char     field=city:char
   field=name:char      field=phone:char
!
! Create a helper array that is used in the sort
!
create-array name=QSort size={MAX_ROWS}
      field=n:number    field=j:number
end-setup
begin-program
  do main
end-program
begin-procedure main
let #counter = 0
!
! Print customers sorted by state
!
begin-select
state (,1)
city  (,7)
name  (,24)
phone (,55)
   position (+1)
   ! Put data in the array
   put &state &city &name &phone into customer_array(#counter)
   add 1 to #counter
from customers
order by state
end-select
position (+2)
!
! Sort customer_array by name
!
let #last_row = #counter - 1
do QuickSort(0, 0, #last_row)
!
! Print customers (which are now sorted by name)
!
let #i = 0
while #i < #counter
   ! Get data from the array
   get $state $city $name $phone from customer_array(#i)
   print $state (,1)
   print $city  (,7)
   print $name  (,24)
   print $phone (,55)
   position (+1)
   add 1 to #i
end-while
end-procedure ! main
!
! QuickSort
!
! Purpose: Sort customer_array by name.
! This is a recursive function. Since SQR does not allocate
! local variables on a stack (they are all static), this
! procedure uses a helper array.
!
! #level - Recursion level (used as a subscript to the helper 
! array)
! #m     - The "m" argument of the classical QuickSort
! #n     - The "n" argument of the classical QuickSort
!
begin-procedure QuickSort(#level, #m, #n)
   if #m < #n
      let #i = #m
      let #j = #n + 1
      ! Sort key is "name"
      let $key = customer_array.name(#m)
      while 1
         add 1 to #i
         while #i <= #j and customer_array.name(#i) < $key
             add 1 to #i
         end-while
         subtract 1 from #j
         while #j >= 0 and customer_array.name(#j) > $key
            subtract 1 from #j
         end-while
         if #i < #j
            do QSortSwap(#i, #j)
         else
            break
         end-if
      end-while
      do QSortSwap(#m, #j)
      add 1 to #level
      ! Save #j and #n
      let QSort.j(#level - 1) = #j
      let QSort.n(#level - 1) = #n
      subtract 1 from #j
      do QuickSort(#level, #m, #j)
      ! restore #j and #n
      let #j = QSort.j(#level - 1)
      let #n = QSort.n(#level - 1)
      add 1 to #j
      do QuickSort(#level, #j, #n)
      subtract 1 from #level
   end-if
end-procedure ! QuickSort
!
!
! QSortSwap
!
! Purpose: Swaps records #i and #j of customer_array
 !
! #i     - Array subscript
! #j     - Array subscript
!
begin-procedure QSortSwap(#i, #j)
   get $state $city $name $phone from customer_array(#i)
   let customer_array.state(#i) = customer_array.state(#j)
   let customer_array.city(#i)  = customer_array.city(#j)
   let customer_array.name(#i)  = customer_array.name(#j)
   let customer_array.phone(#i) = customer_array.phone(#j)
   put $state $city $name $phone into customer_array(#j)
end-procedure ! QSortSwap

The QuickSort algorithm uses a recursive procedure, which means that it calls itself. SQR maintains only one copy of the procedure’s local variables. In QuickSort, the #j and #n variables are overwritten when QuickSort calls itself.

For the algorithm to work properly, the program must save the values of these two variables before making the recursive call, and then restore those values when the call finishes. QuickSort can call itself recursively many times, so the program may need to save many copies of #j and #n. To have the program do this, add a #level variable that maintains the depth of recursion. In this example, a helper array, Qsort, is used to hold multiple values of #j and #n.

The QuickSort procedure takes three arguments. The first is the recursion level (or depth), which is #level, as previously described. The second and third arguments are the beginning and end of the range of rows to be sorted. Each time QuickSort calls itself, the range gets smaller. The main procedure starts QuickSort by calling it with the full range of rows.

The QSortSwap procedure swaps two rows in customer_array. Typically, rows with a lower key value are moved up.

The QuickSort and QSortSwap procedures in ex24a.sqr refer to customer_array and its fields. If you plan to use these procedures to sort an array in your applications, you must change these references to the applicable array and fields. The QuickSort procedure sorts in ascending order.

SQR and Language Sensitive Sorting

SQR does not natively support National Language Sensitive sorting. SQR compares characters based on Unicode codepoint, and sorting based on Unicode codepoint does not correctly sort order language-sensitive data.

See Sorting in PeopleTools.

The QuickSort procedure does not support National Language Sensitive character string sort. The comparisons are simple string comparisons based on Unicode codepoint used internally in SQR to represent string data. For instance, the following code lines from the preceding code sample would sort data in Unicode codepoint order. Unicode codepoints are not ordered to make a correct sorting order of any language.

while #i <= #j and customer_array.name(#i) < $key
and
while #j >= 0 and customer_array.name(#j) > $key

If you want to sort string data in SQR, you may need to write a National Language Sensitive character string comparison and add that to SQR. The QuickSort procedure will then be modified in the following way:

while #i <= #j and NLS_STRING_COMPARE(customer_array.name(#i),$key)
while #j >= 0 and NLS_STRING_COMPARE($key,customer_array.name(#j))

Using and Sorting Flat Files

An alternative to an array is a flat file. You can use a flat file when the required array size exceeds the available memory.

The code example in the previous section can be rewritten to use a file instead of an array. The advantage of using a file is that the program is not constrained by the amount of memory that is available. The disadvantage of using a file is that the program performs more input and output (I/O). However, it may still be faster than performing another SQL statement to retrieve the same data.

This program uses the UNIX/Linux sort utility to sort the file by name. This example can be extended to include other operating systems.

The following code example is rewritten to use the cust.dat file instead of the array:

Program ex24b.sqr
begin-program
  do main
end-program
begin-procedure main
!
! Open cust.dat
!
open 'cust.dat' as 1 for-writing record=80:vary
begin-select
state (,1)
city  (,7)
name  (,24)
phone (,55)
   position (+1)
   ! Put data in the file
   write 1 from &name:30 &state:2 &city:16 &phone:10
from customers
order by state
end-select
position (+2)
!
! Close cust.dat
close 1
! Sort cust.dat by name
!
call system using 'sort cust.dat > cust2.dat' #status
if #status <> 0
    display 'Error in sort'
    stop
end-if
!
! Print customers (which are now sorted by name)
!
open 'cust2.dat' as 1 for-reading record=80:vary
while 1  ! loop until break
   ! Get data from the file
   read 1 into $name:30 $state:2 $city:16 $phone:10
   if #end-file
      break   ! End of file reached
   end-if
   print $state (,1)
   print $city  (,7)
   print $name  (,24)
   print $phone (,55)
   position (+1)
end-while
!
! close cust2.dat
close 1
end-procedure ! main

The program starts by opening a cust.dat file:

open 'cust.dat' as 1 for-writing record=80:vary

The OPEN command opens the file for writing and assigns it file number 1. You can open as many as 12 files in one SQR program. The file is set to support records of varying lengths with a maximum of 80 bytes (characters). For this example, you can also use fixed-length records.

As the program selects records from the database and prints them, it writes them to cust.dat:

write 1 from &name:30 &state:2 &city:16 &phone:10

The WRITE command writes the four columns into file number 1, the currently open cust.dat. It writes the name first, which simplifies sorting the file by name. The program writes fixed-length fields. For example, &name:30 specifies that the name column uses exactly 30 characters. If the actual name is shorter, it is padded with blanks. When the program has finished writing data to the file, it closes the file by using the CLOSE command.

The file is sorted with the UNIX sort utility:

call system using 'sort cust.dat > cust2.dat' #status

The sort cust.dat > cust2.dat command is sent to the UNIX system. It invokes the UNIX sort command to sort cust.dat and direct the output to cust2.dat. The completion status is saved in #status; a status of 0 indicates success. Because name is at the beginning of each record, the file is sorted by name.

Next,open cust2.dat for reading. The following command reads one record from the file and places the first 30 characters in $name:

read 1 into $name:30 $state:2 $city:16 $phone:10

The next two characters are placed in $state, and so on. When the end of the file is encountered, the #end-file reserved variable is automatically set to 1 (true). The program checks for #end-file and breaks out of the loop when the end of the file is reached. Finally, the program closes the file by using the CLOSE command.

Creating Multiple Reports in One Pass

Sometimes you must create multiple reports that are based on the same data. In many cases, these reports are similar, with only a difference in layout or summary. Typically, you can create multiple programs and even reuse code. However, if each program is run separately, the database has to repeat the query. Such repeated processing is often unnecessary.

With SQR, one program can create multiple reports simultaneously. In this method, a single program creates multiple reports, making just one pass on the data and reducing the amount of database processing.

See Creating Multiple Reports from One Program.

Tuning SQR Numerics

SQR for PeopleSoft provides three types of numeric values:

Machine floating point numbers
Decimal numbers
Integers

Machine floating point numbers are the default. They use the floating point arithmetic that is provided by the hardware. This method is very fast. It uses binary floating point and normally holds up to 15 digits of precision.

Some accuracy can be lost when you are converting decimal fractions to binary floating point numbers. To overcome this loss of accuracy, you can sometimes use the ROUND option of commands such as ADD, SUBTRACT, MULTIPLY, and DIVIDE. You can also use the round function of LET or numeric edit masks that round the results to the needed precision.

Decimal numbers provide exact math and precision of up to 38 digits. Math is performed in the software. This is the most accurate method, but also the slowest.

You can use integers for numbers that are known to be integers. Using integers is beneficial because they:

Enforce the integer type by not allowing fractions.
Adhere to integer rules when dividing numbers.

Integer math is also the fastest method, typically faster than floating point numbers.

If you use the DECLARE-VARIABLE command, the -DNT command-line flag, or the DEFAULT-NUMERIC entry in the Default-Settings section of the PSSQR.INI file, you can select the type of numbers that SQR uses. Moreover, you can select the type for individual variables in the program with the DECLARE-VARIABLE command. When you select decimal numbers, you can also specify the needed precision.

Selecting the numeric type for variables enables you to fine-tune the precision of numbers in your program. For most applications, however, this type of tuning does not yield a significant performance improvement, so it's best to select decimal. The default is machine floating point to provide compatibility with older releases of the product.

Compiling SQR Programs and Using SQR Execute

Compiling an SQR program can improve its performance. The compiled program is stored in a runtime (.SQT) file. You can then run it with SQR Execute. Your program runs faster because it bypasses the compile phase.

See Compiling Programs and Using SQR Execute.

Setting Processing Limits

Use a startup file and the Processing-Limits section of pssqr.ini to define the sizes and limitations of some of the internal structures that SQR uses. An -M command-line flag can specify a startup file whose entries override those in pssqr.ini. If you use the -Mb command-line flag, then corresponding sections of the file are not processed. Many of these settings have a direct affect on memory requirements.

Tuning of memory requirements used to be a factor with older, 16-bit operating systems, such as Windows 3.1. Today, most operating systems use virtual memory, and tuning memory requirements normally do not affect performance in any significant way. The only case in which you might need to be concerned with processing limit settings is with large SQR programs that exceed default processing limit settings. In such cases you must increase the corresponding settings.

Buffering Fetched Rows

When you run a BEGIN-SELECT command, SQR fetches records from the database server. For better performance, SQR fetches them in groups rather than one at a time—by default in groups of 10 records. SQR buffers the records, and a program processes these records one at a time. SQR therefore performs a database fetch operation after every 10 records, instead of after every single record—a substantial performance gain. If the database server is on another computer, network traffic is also significantly reduced.

Modify the number of records to fetch together by using the -B command-line flag or, for an individual BEGIN-SELECT command, by using its -B option. In both cases, specify the number of records to be fetched together. For example -B100 specifies that records be fetched in groups of 100. This means that the number of database fetch operations is further reduced.

This feature is currently available with SQR for Oracle or Sybase databases and SQR for ODBC.

Running Programs on the Database Server

To reduce network traffic and improve performance, run SQR programs directly on the database server machine. The SQR server is available on many server platforms including Microsoft Windows and UNIX/Linux.