This chapter provides an overview of SQR performance and SQL statements and discusses how to:
Simplify a complex select paragraph.
Use LOAD-LOOKUP to simplify joins.
Improve SQL performance with dynamic SQL.
Examine SQL cursor status.
Avoid temporary database tables.
Create multiple reports in one pass.
Tune SQR numerics.
Compile SQR programs and use SQR Execute.
Set processing limits.
Buffer fetched rows.
Run programs on the database server.
Whenever a program contains a BEGIN-SELECT, BEGIN-SQL, or EXECUTE command, it performs a SQL statement. Processing SQL statements typically consumes significant computing resources. Tuning SQL statements typically yields higher performance gains than tuning any other part of the program.
General tuning of SQL is outside the scope of this book. Tuning SQL is often specific to the type of database that you are using—tuning SQL statements for an Oracle database may be different from tuning SQL statements for DB2. This chapter focuses on SQR tools for simplifying SQL statements and reducing the number of times SQL is run.
With relational database design, information is often normalized by storing data entities in separate tables. To display the normalized information, you must write a select paragraph that joins these tables together. With many database systems, performance suffers when you join more than three or four tables in one select paragraph.
With SQR, you can perform multiple select paragraphs and nest them. In this way, you can break a large join into several simpler selects. For example, you can break a select paragraph that joins the orders and the products tables into two selects. The first select retrieves the orders that you want. For each order that is retrieved, a second select retrieves the products that were ordered. The second select is correlated to the first select by having a condition such as:
where order_num = &order_num
This condition specifies that the second select retrieves only products for the current order.
Similarly, if the report is based on products that were ordered, you can make the first select retrieve the products and make the second select retrieve the orders for each product.
This method improves performance in many cases, but not all. To achieve the best performance, you may need to experiment with the different alternatives.
You can use master and detail reports to perform multiple select paragraphs and nest them.
See Using Dynamic SQL and Error Checking.
Database tables often contain key columns, such as a product code or customer number. To retrieve a certain piece of information, you join two or more tables that contain the same column. For example, to obtain a product description, you can join the orderlines table with the products table by using the product_code column as the key.
With LOAD-LOOKUP, you can reduce the number of tables that are joined in one select. Use this command with LOOKUP commands.
The LOAD-LOOKUP command defines an array containing a set of keys and values and loads it into memory. The LOOKUP command looks up a key in the array and returns the associated value. In some programs, this technique performs better than a conventional table join.
You can use LOAD-LOOKUP in the SETUP section or in a procedure. If used in the SETUP section, it is processed only once. If used in a procedure, it is processed each time that it is encountered.
LOAD-LOOKUP retrieves two fields from the database: the KEY field and the RETURN_VALUE field. Rows are ordered by KEY and stored in an array. The KEY field must be unique and contain no null values.
When the LOOKUP command is used, the array is searched (by using a binary search) to find the RETURN_VALUE field corresponding to the KEY that is referenced in the lookup.
The following code example illustrates LOAD-LOOKUP and LOOKUP:
begin-setup load-lookup name=prods table=products key=product_code return_value=description end-setup ... begin-select order_num (+1,1) product_code lookup prods &product_code $desc print $desc (,15) from orderlines end-select
In this code example, the LOAD-LOOKUP command loads an array with the product_code and description columns from the products table. The lookup array is named prods. The product_code column is the key and the description column is the return value. In the select paragraph, a LOOKUP on the prods array retrieves the description for each product_code. This technique eliminates the need to join the products table in the select.
If the orderlines and products tables were joined in the select (without LOAD-LOOKUP), the code would look like this:
begin-select order_num (+1,1) ordlines.product_code description (,15) from ordlines, products where ordlines.product_code = products.product_code end-select
Whether a database join or LOAD-LOOKUP is faster depends on the program. LOAD-LOOKUP improves performance when:
It is used with multiple select paragraphs.
It keeps the number of tables being joined from exceeding three or four.
The number of entries in the LOAD-LOOKUP table is small compared with the number of rows in the select, and they are used often.
Most entries in the LOAD-LOOKUP table are used.
Note. You can concatenate columns if you want RETURN_VALUE to return more than one column. The concatenation symbol is database specific.
You can use dynamic SQL in some situations to simplify a SQL statement and gain performance:
begin-select order_num from orders, customers where order.customer_num = customers.customer_num and ($state = 'CA' and order_date > $start_date or $state != 'CA' and ship_date > $start_date) end-select
In this example, a given value of $state, order_date, or ship_date is compared with $start_date.The OR operator in the condition makes such multiple comparisons possible. With most databases, an OR operator slows processing. It can cause the database to perform more work than necessary.
However, the same work can be done with a simpler select. For example, if $state is ‘CA,’ the following select works:
begin-select order_num from orders, customers where order.customer_num = customers.customer_num and order_date > $start_date end-select
Dynamic SQL enables you to check the value of $state and create the simpler condition:
if $state = 'CA' let $datecol = 'order_date' else let $datecol = 'ship_date' end-if begin-select order_num from orders, customers where order.customer_num = customers.customer_num and [$datecol] > $start_date end-select
The [$datecol] substitution variable substitutes the name of the column to be compared with $start_date. The select is simpler and no longer uses an OR operator. In most cases, this use of dynamic SQL improves performance.
See Using Dynamic SQL and Error Checking.
Because SQR programs select and manipulate data from a SQL database, you should understand how SQR processes SQL statements and queries.
SQR programs can perform multiple SQL statements. Moreover, they can run the same SQL statement multiple times.
When a program runs, a pool of SQL statement handles, called cursors, is maintained. A cursor is a storage location for one SQL statement—for example, SELECT, INSERT, or UPDATE. Every SQL statement uses a cursor for processing. A cursor holds the context for the execution of a SQL statement.
The cursor pool contains 30 cursors, and you cannot change its size. When a SQL statement is rerun, its cursor can be immediately reused if it is still in the cursor pool. When an SQR program runs more than 30 different SQL statements, cursors in the pool are reassigned.
To examine how cursors are managed, use the -S command-line flag. This flag displays cursor status information at the end of a run.
The following information appears for each cursor:
Cursor #nn: SQL = <SQL statement> Compiles = nn Executes = nn Rows = nn
The listing also includes the number of compiles, which vary according to the database and the complexity of the query. With Oracle, for example, a simple query is compiled only once. With Sybase, a SQL statement is compiled before it is first run and recompiled for the purpose of validation during the SQR compile phase. Therefore, you may see two compiles for a SQL statement. Later, when the SQL is rerun, if its cursor is found in the cursor pool, it can proceed without recompiling.
This section provides an overview of temporary database tables and discusses how to:
Use and sort arrays.
Use and sort flat files.
Programs often use temporary database tables to hold intermediate results. Creating, updating, and deleting temporary tables is a resource-consuming task, however, and can slow the program’s performance. SQR provides two alternatives to using temporary database tables:
Store intermediate results in an SQR array.
Store intermediate results in a local flat file.
Both techniques can yield a significant performance gain. Use the SQR language to manipulate the data that is stored in an array or a flat file.
An SQR array can hold as many records as can fit in memory. During the first pass, when records are retrieved from the database, you can store them in the array. Subsequent passes on the data can be made without additional database access.
The following code example retrieves records, prints them, and saves them into an array named customer_array:
create-array name=customer_array size=1000 field=state:char field=city:char field=name:char field=phone:char let #counter = 0 begin-select state (,1) city (,7) name (,24) phone (,55) position (+1) put &state &city &name &phone into customer_array(#counter) add 1 to #counter from customers end-select
The customer_array array has four fields that correspond to the four columns that are selected from the customers table, and it can hold up to 1,000 rows. If the customers table had more than 1,000 rows, you would need to create a larger array.
The select paragraph prints the data. The PUT command then stores the data in the array. You could use the LET command to assign values to array fields; however, the PUT command performs the same work, with fewer lines of code. With PUT, you can assign all four fields in one command.
The #counter variable serves as the array subscript. It starts with zero and maintains the subscript of the next available entry. At the end of the select paragraph, the value of #counter is the number of records in the array.
The next code example retrieves the data from customer_array and prints it:
let #i = 0 while #i < #counter get $state $city $name $phone from customer_array(#i) print $state (,1) print $city (,7) print $name (,24) print $phone (,55) position (+1) add 1 to #i end-while
In this code example, #i goes from 0 to #counter– 1. The fields from each record are moved into the corresponding variables: $name, $city, $state, and $phone. These values are then printed.
In many cases, intermediate results must be sorted by a different field. The following sample program indicates how to sort customer_array by name. The sample program uses a well-known sorting algorithm called QuickSort. You can copy this code into your program, make appropriate changes, and use it to sort your array:
Program ex24a.sqr #define MAX_ROWS 1000 begin-setup create-array name=customer_array size={MAX_ROWS} field=state:char field=city:char field=name:char field=phone:char ! ! Create a helper array that is used in the sort ! create-array name=QSort size={MAX_ROWS} field=n:number field=j:number end-setup begin-program do main end-program begin-procedure main let #counter = 0 ! ! Print customers sorted by state ! begin-select state (,1) city (,7) name (,24) phone (,55) position (+1) ! Put data in the array put &state &city &name &phone into customer_array(#counter) add 1 to #counter from customers order by state end-select position (+2) ! ! Sort customer_array by name ! let #last_row = #counter - 1 do QuickSort(0, 0, #last_row) ! ! Print customers (which are now sorted by name) ! let #i = 0 while #i < #counter ! Get data from the array get $state $city $name $phone from customer_array(#i) print $state (,1) print $city (,7) print $name (,24) print $phone (,55) position (+1) add 1 to #i end-while end-procedure ! main ! ! QuickSort ! ! Purpose: Sort customer_array by name. ! This is a recursive function. Since SQR does not allocate ! local variables on a stack (they are all static), this ! procedure uses a helper array. ! ! #level - Recursion level (used as a subscript to the helper ! array) ! #m - The "m" argument of the classical QuickSort ! #n - The "n" argument of the classical QuickSort ! begin-procedure QuickSort(#level, #m, #n) if #m < #n let #i = #m let #j = #n + 1 ! Sort key is "name" let $key = customer_array.name(#m) while 1 add 1 to #i while #i <= #j and customer_array.name(#i) < $key add 1 to #i end-while subtract 1 from #j while #j >= 0 and customer_array.name(#j) > $key subtract 1 from #j end-while if #i < #j do QSortSwap(#i, #j) else break end-if end-while do QSortSwap(#m, #j) add 1 to #level ! Save #j and #n let QSort.j(#level - 1) = #j let QSort.n(#level - 1) = #n subtract 1 from #j do QuickSort(#level, #m, #j) ! restore #j and #n let #j = QSort.j(#level - 1) let #n = QSort.n(#level - 1) add 1 to #j do QuickSort(#level, #j, #n) subtract 1 from #level end-if end-procedure ! QuickSort ! ! ! QSortSwap ! ! Purpose: Swaps records #i and #j of customer_array ! ! #i - Array subscript ! #j - Array subscript ! begin-procedure QSortSwap(#i, #j) get $state $city $name $phone from customer_array(#i) let customer_array.state(#i) = customer_array.state(#j) let customer_array.city(#i) = customer_array.city(#j) let customer_array.name(#i) = customer_array.name(#j) let customer_array.phone(#i) = customer_array.phone(#j) put $state $city $name $phone into customer_array(#j) end-procedure ! QSortSwap
The QuickSort algorithm uses a recursive procedure, which means that it calls itself. SQR maintains only one copy of the procedure’s local variables. In QuickSort, the #j and #n variables are overwritten when QuickSort calls itself.
For the algorithm to work properly, the program must save the values of these two variables before making the recursive call, and then restore those values when the call finishes. QuickSort can call itself recursively many times, so the program may need to save many copies of #j and #n. To have the program do this, add a #level variable that maintains the depth of recursion. In this example, a helper array, Qsort, is used to hold multiple values of #j and #n.
The QuickSort procedure takes three arguments. The first is the recursion level (or depth), which is #level, as previously described. The second and third arguments are the beginning and end of the range of rows to be sorted. Each time QuickSort calls itself, the range gets smaller. The main procedure starts QuickSort by calling it with the full range of rows.
The QSortSwap procedure swaps two rows in customer_array. Typically, rows with a lower key value are moved up.
The QuickSort and QSortSwap procedures in ex24a.sqr refer to customer_array and its fields. If you plan to use these procedures to sort an array in your applications, you must change these references to the applicable array and fields. The QuickSort procedure sorts in ascending order.
SQR and Language Sensitive Sorting
SQR does not natively support National Language Sensitive sorting. SQR compares characters based on Unicode codepoint, and sorting based on Unicode codepoint does not correctly sort order language-sensitive data.
The QuickSort procedure does not support National Language Sensitive character string sort. The comparisons are simple string comparisons based on Unicode codepoint used internally in SQR to represent string data. For instance, the following code lines from the preceding code sample would sort data in Unicode codepoint order. Unicode codepoints are not ordered to make a correct sorting order of any language.
while #i <= #j and customer_array.name(#i) < $key and while #j >= 0 and customer_array.name(#j) > $key
If you want to sort string data in SQR, you may need to write a National Language Sensitive character string comparison and add that to SQR. The QuickSort procedure will then be modified in the following way:
while #i <= #j and NLS_STRING_COMPARE(customer_array.name(#i),$key) while #j >= 0 and NLS_STRING_COMPARE($key,customer_array.name(#j))
An alternative to an array is a flat file. You can use a flat file when the required array size exceeds the available memory.
The code example in the previous section can be rewritten to use a file instead of an array. The advantage of using a file is that the program is not constrained by the amount of memory that is available. The disadvantage of using a file is that the program performs more input and output (I/O). However, it may still be faster than performing another SQL statement to retrieve the same data.
This program uses the UNIX/Linux sort utility to sort the file by name. This example can be extended to include other operating systems.
The following code example is rewritten to use the cust.dat file instead of the array:
Program ex24b.sqr begin-program do main end-program begin-procedure main ! ! Open cust.dat ! open 'cust.dat' as 1 for-writing record=80:vary begin-select state (,1) city (,7) name (,24) phone (,55) position (+1) ! Put data in the file write 1 from &name:30 &state:2 &city:16 &phone:10 from customers order by state end-select position (+2) ! ! Close cust.dat close 1 ! Sort cust.dat by name ! call system using 'sort cust.dat > cust2.dat' #status if #status <> 0 display 'Error in sort' stop end-if ! ! Print customers (which are now sorted by name) ! open 'cust2.dat' as 1 for-reading record=80:vary while 1 ! loop until break ! Get data from the file read 1 into $name:30 $state:2 $city:16 $phone:10 if #end-file break ! End of file reached end-if print $state (,1) print $city (,7) print $name (,24) print $phone (,55) position (+1) end-while ! ! close cust2.dat close 1 end-procedure ! main
The program starts by opening a cust.dat file:
open 'cust.dat' as 1 for-writing record=80:vary
The OPEN command opens the file for writing and assigns it file number 1. You can open as many as 12 files in one SQR program. The file is set to support records of varying lengths with a maximum of 80 bytes (characters). For this example, you can also use fixed-length records.
As the program selects records from the database and prints them, it writes them to cust.dat:
write 1 from &name:30 &state:2 &city:16 &phone:10
The WRITE command writes the four columns into file number 1, the currently open cust.dat. It writes the name first, which simplifies sorting the file by name. The program writes fixed-length fields. For example, &name:30 specifies that the name column uses exactly 30 characters. If the actual name is shorter, it is padded with blanks. When the program has finished writing data to the file, it closes the file by using the CLOSE command.
The file is sorted with the UNIX sort utility:
call system using 'sort cust.dat > cust2.dat' #status
The sort cust.dat > cust2.dat command is sent to the UNIX system. It invokes the UNIX sort command to sort cust.dat and direct the output to cust2.dat. The completion status is saved in #status; a status of 0 indicates success. Because name is at the beginning of each record, the file is sorted by name.
Next,open cust2.dat for reading. The following command reads one record from the file and places the first 30 characters in $name:
read 1 into $name:30 $state:2 $city:16 $phone:10
The next two characters are placed in $state, and so on. When the end of the file is encountered, the #end-file reserved variable is automatically set to 1 (true). The program checks for #end-file and breaks out of the loop when the end of the file is reached. Finally, the program closes the file by using the CLOSE command.
Sometimes you must create multiple reports that are based on the same data. In many cases, these reports are similar, with only a difference in layout or summary. Typically, you can create multiple programs and even reuse code. However, if each program is run separately, the database has to repeat the query. Such repeated processing is often unnecessary.
With SQR, one program can create multiple reports simultaneously. In this method, a single program creates multiple reports, making just one pass on the data and reducing the amount of database processing.
See Creating Multiple Reports from One Program.
SQR for PeopleSoft provides three types of numeric values:
Machine floating point numbers
Decimal numbers
Integers
Machine floating point numbers are the default. They use the floating point arithmetic that is provided by the hardware. This method is very fast. It uses binary floating point and normally holds up to 15 digits of precision.
Some accuracy can be lost when you are converting decimal fractions to binary floating point numbers. To overcome this loss of accuracy, you can sometimes use the ROUND option of commands such as ADD, SUBTRACT, MULTIPLY, and DIVIDE. You can also use the round function of LET or numeric edit masks that round the results to the needed precision.
Decimal numbers provide exact math and precision of up to 38 digits. Math is performed in the software. This is the most accurate method, but also the slowest.
You can use integers for numbers that are known to be integers. Using integers is beneficial because they:
Enforce the integer type by not allowing fractions.
Adhere to integer rules when dividing numbers.
Integer math is also the fastest method, typically faster than floating point numbers.
If you use the DECLARE-VARIABLE command, the -DNT command-line flag, or the DEFAULT-NUMERIC entry in the Default-Settings section of the PSSQR.INI file, you can select the type of numbers that SQR uses. Moreover, you can select the type for individual variables in the program with the DECLARE-VARIABLE command. When you select decimal numbers, you can also specify the needed precision.
Selecting the numeric type for variables enables you to fine-tune the precision of numbers in your program. For most applications, however, this type of tuning does not yield a significant performance improvement, so it's best to select decimal. The default is machine floating point to provide compatibility with older releases of the product.
Compiling an SQR program can improve its performance. The compiled program is stored in a runtime (.SQT) file. You can then run it with SQR Execute. Your program runs faster because it bypasses the compile phase.
See Compiling Programs and Using SQR Execute.
Use a startup file and the Processing-Limits section of pssqr.ini to define the sizes and limitations of some of the internal structures that SQR uses. An -M command-line flag can specify a startup file whose entries override those in pssqr.ini. If you use the -Mb command-line flag, then corresponding sections of the file are not processed. Many of these settings have a direct affect on memory requirements.
Tuning of memory requirements used to be a factor with older, 16-bit operating systems, such as Windows 3.1. Today, most operating systems use virtual memory, and tuning memory requirements normally do not affect performance in any significant way. The only case in which you might need to be concerned with processing limit settings is with large SQR programs that exceed default processing limit settings. In such cases you must increase the corresponding settings.
When you run a BEGIN-SELECT command, SQR fetches records from the database server. For better performance, SQR fetches them in groups rather than one at a time—by default in groups of 10 records. SQR buffers the records, and a program processes these records one at a time. SQR therefore performs a database fetch operation after every 10 records, instead of after every single record—a substantial performance gain. If the database server is on another computer, network traffic is also significantly reduced.
Modify the number of records to fetch together by using the -B command-line flag or, for an individual BEGIN-SELECT command, by using its -B option. In both cases, specify the number of records to be fetched together. For example -B100 specifies that records be fetched in groups of 100. This means that the number of database fetch operations is further reduced.
This feature is currently available with SQR for Oracle or Sybase databases and SQR for ODBC.
To reduce network traffic and improve performance, run SQR programs directly on the database server machine. The SQR server is available on many server platforms including Microsoft Windows and UNIX/Linux.