Running the Data Transformation Process and Correcting Errors

This chapter discusses how to:

Define map groups and map group chunking criteria.
Extract, transform, and load source data.
Handle data transformation process errors.

Defining Map Groups and Map Group Chunking Criteria

This section discusses how to:

Define map groups.
Define chunking criteria.

You can configure the data transformation process to run on maps as a group. A map group can contain other map groups, as well as one or more individual maps. Map groups are submitted as a job unit and can run in either serial or parallel mode. To improve performance by processing data more efficiently, you can define chunking criteria on the Map Group Filter page.

Pages Used to Define Map Groups and Chunking Criteria

Page Name	Object Name	Navigation	Usage
Map Groups	EOEW_GRP_DFN	Enterprise Components, Data Transformation, Data Transformation Home, Define Map Groups, Map Groups	Create map groups.
Map Group Filter	EOEW_GRP_FLTR	Click the Chunking link on the Map Groups page.	Define chunking criteria for maps.

Defining Map Groups

Access the Map Groups page.

Subject Area

Select a subject area.

For new map groups, the subject area defaults to the value that is defined on the Subject Area page.

Parallel Processing For Group

Select to run the process in parallel mode, which runs the processes simultaneously. Clear to run it in serial mode, which runs each process in the map group sequentially.

Group Type and Map Object

Select the maps and/or map groups that you want in the order in which you want them to run.

Note. Map object prompts are restricted by subject area. Only objects that are in the map group's current subject area and those in the default subject area appear.

(Optional) Chunking

Click to access the Map Group Filter page to define chunking criteria for the associated map.

See Defining Chunking Criteria.

Note. The Chunking link is only available for Group Types of Map. If you want to chunk a group, you need to go to that group's definition to define the criteria.

Defining Chunking Criteria

Access the Map Group Filter page.

Chunking is an optional mechanism that improves performance by splitting a large amount of processing into multiple smaller processes. When chunking is enabled, multiple jobs are spawned from one job stream, and these jobs run in parallel or serial mode. You are responsible for defining chunks that together include all of the source data without duplicating any rows; the system does not verify this. The map group is still the unit of work: the group job is not complete until all of its chunks are complete.

Parallel Processing	Select to run the process in parallel mode, which runs the processes simultaneously. Clear to run it in serial mode, which runs each process in the chunk sequentially.
Column Alias	Select a column alias. Available values are derived from the source data object for the map you are currently chunking.
Operator	Select an operator to define the chunking condition.
Field Value	Enter the field value that completes criteria for the chunk number.
And/Or Switch	Select And or Or to compound multiple sets of criteria.

Note. The chunks you define must be configured to capture all of the source data without duplicating rows.
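Because the system does not verify chunk definitions, it can help to test them against a sample of the source data before running the map. The following is a minimal sketch of such a check, not part of the product. The function name check_chunks, the ITEM_ID column, and the sample predicates are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Chunk = Callable[[Row], bool]  # a chunk is modeled as a predicate over one source row


def check_chunks(rows: List[Row], chunks: List[Chunk]) -> Dict[str, List[Row]]:
    """Report rows matched by no chunk (missed) or by more than one chunk (duplicated)."""
    missed: List[Row] = []
    duplicated: List[Row] = []
    for row in rows:
        hits = sum(1 for chunk in chunks if chunk(row))
        if hits == 0:
            missed.append(row)
        elif hits > 1:
            duplicated.append(row)
    return {"missed": missed, "duplicated": duplicated}


# Hypothetical sample of source rows.
sample_rows = [{"ITEM_ID": i} for i in range(1, 11)]

# Two chunks that split the sample cleanly at ITEM_ID 5.
good_chunks = [
    lambda r: r["ITEM_ID"] <= 5,
    lambda r: r["ITEM_ID"] > 5,
]

# Two chunks that overlap at ITEM_ID 5 and leave out ITEM_ID 10.
bad_chunks = [
    lambda r: r["ITEM_ID"] <= 5,
    lambda r: 5 <= r["ITEM_ID"] < 10,
]
```

A chunk set is safe to use only when both lists in the result are empty for representative data.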

Extracting, Transforming, and Loading Source Data

In this section, we discuss how to:

Extract, transform, and load source data.
View summaries of a data transformation process run control.

Understanding Extracting, Transforming, and Loading Source Data

You can run the data transformation process to extract, transform, and load source data for a single map or for a map group. When a map or group is executed, it is compiled at runtime. Only metadata is stored; no SQL or code is stored. This reduces the risk of encountering problems late in a multi-map process, guarantees that each parallel process is executing the same version, and insulates the currently running job from any changes to the actual map definition.

The data transformation process is run using the Data Transformation Application Engine process (EOEW_ETL_EXE).

Pages Used to Extract, Transform, and Load Source Data

Page Name	Object Name	Navigation	Usage
Run Data Transformations	EOEW_RUN_ETL	Enterprise Components, Data Transformation, Data Transformation Home, Run Data Transformations, Run Data Transformations	Define run control criteria for and run the data transformation process.
Run Data Transformations - Run Summary	EOEW_RUN_ETL_SUM	Click the Run Summary link on the Run Data Transformations page.	View information about only the jobs related to a particular data transformation process run control.

Running the Data Transformation Process

Access the Run Data Transformations page.

Note. The data transformation process uses the Data Transformation Application Engine process (EOEW_ETL_EXE).

Data Transformer Object Type	Select a Data Transformer object type. You can run either a Map or Map Group. Note. PeopleSoft Catalog Management uses the run control ID RUN_MAP to load partner source data. When using this run control ID, the Data Transformer object type must be Map.
Map Object	Select a map object. The prompt for available field values is based on the Data Transformer object type.
Target Load Option	Select a target load option. Full Load: Extracts all data from the source as defined by the source data object and inserts it into the target. Incremental Update: Copies all rows from the source table that have been updated or modified since the last load, based on the date/time the row was updated or modified. To use Incremental Update, a date/timestamp field must be defined on the Map Options page.
Destructive Load	Select to delete all rows from the target table before the new rows are inserted. Warning! Use this option with caution, as this will delete all rows in the target table.
Parallel Processing	Select to run the process in parallel mode, which runs the processes simultaneously. Clear to run it in serial mode, which runs each process in the chunk sequentially. Note. This option is only available if the Object Type is Map.
Chunking Criteria	Click to access the Map Group Filter page to define chunking criteria for the associated map. See Defining Chunking Criteria. Note. This link is only available for Group Types of Map. If you want to chunk a group, you need to go to that group's definition to define the criteria.
Run	Click to run the data transformation process. A process request is submitted. Click the Process Monitor link to monitor the status of the request.
Run Summary	Click to view information that is related to the status of the data transformation process.
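The difference between the load options can be illustrated with a small sketch. This is an illustration of the behavior described above, not the logic of EOEW_ETL_EXE. The function names, the LAST_UPDATED field, and the way the last load time is supplied are assumptions.

```python
from datetime import datetime
from typing import Dict, List, Optional

Row = Dict[str, object]


def select_source_rows(source: List[Row], load_option: str,
                       last_load: Optional[datetime] = None,
                       stamp_field: str = "LAST_UPDATED") -> List[Row]:
    """Pick the source rows to load for the given (hypothetical) option code."""
    if load_option == "FULL":
        # Full Load: every source row is extracted.
        return list(source)
    if load_option == "INCREMENTAL":
        # Incremental Update: only rows stamped after the last load.
        if last_load is None:
            raise ValueError("incremental update needs the time of the last load")
        return [row for row in source if row[stamp_field] > last_load]
    raise ValueError(f"unknown load option: {load_option}")


def load_target(target: List[Row], rows: List[Row],
                destructive: bool = False) -> List[Row]:
    """Insert rows into the target; a destructive load empties the target first."""
    if destructive:
        target.clear()
    target.extend(rows)
    return target
```

The sketch also shows why Destructive Load calls for caution: every existing target row is removed before the new rows arrive, whichever load option selected them.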

Viewing the Run Summary

Access the Run Data Transformations - Run Summary page.

After running a data transformation process by clicking Run on the Run Data Transformations page, you can access the Run Summary page just as you would access Report Manager or Process Monitor.

Although the Process Monitor provides information regarding a process run, the Run Summary feature offers a more granular view of the individual subprocesses, such as chunks, that are not exposed in the Process Monitor. For example, a single map containing chunks or a group can spawn numerous jobs. If you use the Process Monitor to view these jobs, you find that the numerous jobs that are associated with a single map are mixed in with all of the other jobs that are currently running. Depending on the number of jobs that are running, this can make it difficult to view only those jobs that are associated with a particular Data Transformer process run control.

However, by using the Run Summary feature, you can view all of the jobs that were spawned for the run control that is associated with a particular run of the Data Transformer process. The Run Summary feature is especially useful when running parallel processes that are associated with multiple maps. By using the Run Summary feature, you can associate a process instance with each chunk as it runs.

The Run Summary feature enables you to see:

Which subprocesses are involved within a particular data transformation process run control.
When a particular subprocess (chunk or map) begins.
When a particular subprocess (chunk or map) completes.
Which subprocesses didn't complete successfully.
Which process instance is associated with a particular chunk or map.
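Conceptually, the Run Summary narrows a mixed list of jobs down to those spawned for one run control. The following sketch models that idea only. The Job fields, the status strings, and the function names are hypothetical and do not reflect the underlying PeopleSoft tables.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    process_instance: int
    run_control_id: str
    subprocess: str   # for example, a chunk or a map
    run_status: str   # hypothetical status text


def run_summary(jobs: List[Job], run_control_id: str) -> List[Job]:
    """Keep only the jobs that belong to the given run control."""
    return [job for job in jobs if job.run_control_id == run_control_id]


def unsuccessful(jobs: List[Job]) -> List[Job]:
    """Return the jobs whose status is anything other than 'Success'."""
    return [job for job in jobs if job.run_status != "Success"]
```

Filtering by run control first and by status second mirrors how you would use the page: find your run, then look for the subprocesses that need attention.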

Main Information

Select the Main Information tab.

Process Instance

Displays the PeopleSoft Process Scheduler process instance that is assigned to the individual process. This value also appears on the Times and Chunking Criteria tabs for consistent identification.

Note. For parallel processes, you see different process instances; for serial processes, you see the same process instance.

Run Status

Reflects the same status that appears in the Process Monitor. If the run status displays an error, go to the Process Monitor to troubleshoot and restart the process.

Times

Select the Times tab.

Use these times to track the performance of the processes.

Chunking Criteria

Select the Chunking Criteria tab.

Chunking Where clause

Displays information about the chunking criteria that are specified for a particular map, including:

Relational operators (=, <, >, and so on).
Boolean operators (AND and OR).

Note. This page also displays information that is relevant only to the internal aspects of the PeopleSoft mapping functionality.

The field names that are used for chunking are converted to an internal format; therefore, the Chunking Where clause may not exactly match the criteria as you defined them. For example, it may contain an extra "AND (". However, knowing which chunking criteria were used can be very useful when you are troubleshooting.

Also, values that are similar to EOEW_FP_CHAR30_0 are used internally by the PeopleSoft system to store data in a temporary table while the data is being transformed and loaded.
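To relate the clause on this tab to the criteria entered on the Map Group Filter page, it can help to picture how rows of criteria combine into a single condition. The sketch below is a simplified model, not the internal format that the system generates. The Criterion class, the operator list, and the quoting of values are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Criterion:
    column_alias: str
    operator: str
    field_value: str
    and_or: str = "AND"  # how this row joins the previous row; ignored for the first row


# Assumed operator set; the page itself determines what is actually available.
ALLOWED_OPERATORS = {"=", "<", ">", "<=", ">=", "<>"}


def build_where(criteria: List[Criterion]) -> str:
    """Join criteria rows into one readable condition string."""
    parts: List[str] = []
    for index, c in enumerate(criteria):
        if c.operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {c.operator}")
        if c.and_or not in ("AND", "OR"):
            raise ValueError(f"unsupported switch: {c.and_or}")
        value = c.field_value.replace("'", "''")  # escape embedded single quotes
        condition = f"{c.column_alias} {c.operator} '{value}'"
        parts.append(condition if index == 0 else f"{c.and_or} {condition}")
    return " ".join(parts)
```

The clause that the page displays uses internal field names in place of the column aliases, so expect the same logical shape rather than identical text.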

Handling Data Transformation Process Errors

In this section, we discuss how to view and correct data transformation errors that arise from a run of the data transformation process.

Understanding Data Transformation Process Errors

After you run the data transformation process for a map with the Correct data error & reprocess option selected, use the Error Correction page to check for any errors that were logged during runtime. You can correct these errors online and rerun the data transformation process with the Include Errors option selected.

The Error Correction page takes its information from a PeopleSoft table that is specified on the Map Field Detail page. When the data transformation process runs and finds an error (such as a look-up or edit for which no matching entry is found), it writes an entry into this error table.

The error table comprises:

Two key fields, PROCESS_INSTANCE and EOEW_ETL_SEQNUM.
Error message fields from the EOEW_ERRMSG_SBR subrecord (EOEW_ERR_MSG01..EOEW_ERR_MSG10).
All the fields being sourced from the Source Data Object.

Whenever an error is encountered for a look-up or edit transformation, the data transformation process stores the associated error message set number and message number in the error fields (EOEW_ERRMSG_XX) that are found in the error table so that the user can then troubleshoot the rows with errors.

By default, PeopleSoft allocates 10 error messages on the EOEW_ERRMSG_SBR subrecord (each error message field includes the message set number and message number), but users can delete or add more error message fields on the subrecord, as needed.

If more errors are encountered during the data transformation process run than are allocated on the error table, those errors that are encountered after the limit was reached are not written to the error table. Every error record must include one or more EOEW_ERRMSG_XX fields. The errors that are encountered during the data transformation process appear on the Error Correction page. The Error Correction page displays error messages that are associated with each specific row of data that is found to be in error by the data transformation process.
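The effect of the fixed number of error message fields can be sketched as follows. This is an illustration under stated assumptions, not the process's actual code. The function name record_errors is hypothetical, the field names follow the EOEW_ERR_MSG01 through EOEW_ERR_MSG10 pattern listed for the error table, and each error is modeled as a (message set number, message number) pair.

```python
from typing import Dict, List, Tuple

ErrorRef = Tuple[int, int]  # (message set number, message number)


def record_errors(errors: List[ErrorRef],
                  slots: int = 10) -> Tuple[Dict[str, ErrorRef], List[ErrorRef]]:
    """Fill the available error fields in order; return the filled fields and the overflow."""
    kept = errors[:slots]
    overflow = errors[slots:]  # errors past the limit are not written
    fields = {f"EOEW_ERR_MSG{i:02d}": err for i, err in enumerate(kept, start=1)}
    return fields, overflow
```

With the default of 10 fields, a row that raises 12 errors keeps the first 10 and loses the last two, which is why adding fields to the subrecord may be worthwhile for maps with many look-ups or edits.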

Page Used to Handle Data Transformation Process Errors

Page Name	Object Name	Navigation	Usage
Error Correction	EOEW_CORRECTION	Enterprise Components, Data Transformation, Data Transformation Home, Correct Errors, Error Correction	View and correct data transformation process errors.

Viewing and Correcting Data Transformer Process Errors

Access the Error Correction page.

After you use this page to correct all errors, rerun the map.

Edit	Select for the row that you want to correct. The Field list appears at the top of the page with the associated error message for that row.
Delete	Select to remove the current row. This will physically delete the error row from the error table.
Delete All Rows	Select to remove all current rows. This will physically delete all error rows from the error table.
Field	Select the field you want to edit. Enter the new, corrected value for the field.
Update	Click to save the correction to the field.
