Setting Up Error Handling

Objective

After completing this lesson, you will be able to explain the levels of data recovery strategies.

Error Handling Strategies

The first strategy for managing error situations is to avoid them. Some error situations can easily be sidestepped by constructing jobs so that they take into account the issues that frequently cause failures.

Some situations are unavoidable, however, such as server failures. In those cases, you must design your data flows to recover from the failure.

Avoiding Errors

You can avoid errors in your data flows by:

  • Using conditions so that the data flow only executes when the context is correct.
  • Using a loop to start over until the context is correct.

    Don't forget to add a way out in case the condition is never met!

One example is when an external file is required to run a job. In this situation, use the wait_for_file function or a while loop and the file_exists function to check that the file exists in a specified location before executing the job.
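For example, an initialization script can block until the file arrives. The following is a minimal sketch: the path is a made-up example, and it assumes that wait_for_file takes a file name pattern, a timeout, and a polling interval in milliseconds and returns 0 when the timeout expires without a match. Check the function reference for the exact signature in your version:

    # Wait up to 30 minutes for the file, polling every 10 seconds.
    # $rc is a local variable declared on the work flow.
    $rc = wait_for_file('/data/inbound/orders.csv', 1800000, 10000);
    if ($rc = 0)
    begin
       raise_exception('Timed out waiting for /data/inbound/orders.csv');
    end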

The while loop is a single-use object that is used in a work flow. The while loop repeats a sequence of steps as long as a condition is true.

Typically, the steps done during the while loop result in a change in the condition so that the condition is eventually no longer satisfied and the work flow exits from the while loop. If the condition does not change, the while loop does not end.

For example, you might want a work flow to wait until the system writes a particular file. Use a while loop to check for the existence of the file using the file_exists function. As long as the file does not exist, the work flow can go into sleep mode for a particular length of time before checking again.

As the system might never write the file, add another check to the loop, such as a counter, to ensure that the while loop eventually exits. In other words, change the while loop to check for the existence of the file and the value of the counter. As long as the file does not exist and the counter is less than a particular value, repeat the while loop. In each iteration of the loop, put the work flow in sleep mode and then increment the counter.
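In Data Services script syntax, the logic of such a loop looks roughly like the following sketch. It is written here as a single script for compactness; in the Designer you would normally model it as a while loop object whose condition and body match these lines. $G_FilePath is a hypothetical global variable holding the file location, and $counter is a local variable:

    # Try at most 10 times, sleeping 60 seconds between checks.
    $counter = 0;
    while (file_exists($G_FilePath) = 0 and $counter < 10)
    begin
       sleep(60000);              # sleep() takes milliseconds
       $counter = $counter + 1;
    end
    if (file_exists($G_FilePath) = 0)
    begin
       # The way out: fail explicitly instead of looping forever.
       raise_exception('File ' || $G_FilePath || ' not found after 10 attempts');
    end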

Levels of Data Recovery Strategies

Some errors cannot be anticipated and will cause the job to fail.

Executing a failed job again may result in duplication of rows that were loaded successfully during the first job run.

You must then find a way to recover your jobs so that the database is left in a consistent state and the schedule can continue.

Here are some recovery strategies:
  • Recover entire database

    Use the standard RDBMS services to restore a crashed database. This option is outside the scope of this course.

  • Use transactions:

    Set up your data flow to be encased in a transaction so that tables stay in a consistent state.

  • Recover from partially loaded tables:

    Avoid duplicate loading of rows.

  • Recover missing values or rows:

    Identify missing values and manage rows that could not be inserted.

  • Manage exceptions:

    Ensure that all exceptions are managed in a work flow.

Depending on the relationships between data flows in your application, you may use a combination of these techniques to recover from errors.

Setting Up Transactions

If your data flow loads a large amount of data, or if it loads dependent tables, for example sales order and sales order item tables, you don't want some rows to be loaded while others are not.

To improve performance and use of resources, SAP Data Services by default loads tables in several transactions. You can set the "Rows per commit" option to determine how many rows will be processed in each transaction.

If your data flow fails, however, some rows might already have been committed to the table when the job stops.

To avoid this situation, you can configure Data Services to include the whole data flow in a single transaction.

Partially Loaded Data

Several methods can be used to ensure that duplicate rows are not inserted:

  • Include the Table Comparison transform in the data flow to only insert missing rows when the table has more rows and fewer fields, such as fact tables.
  • Select the Delete data from table before loading or Drop and re-create table option in the target table to replace data during each execution. This technique can be optimal when the changes to the target table are numerous compared to the size of the table.
  • Change the target table options to use the auto-correct load feature when a table has fewer rows and more fields, such as dimension tables.
  • Include an SQL command to remove previous partial updates before the table loads.

The auto-correct load checks the target table for existing rows before adding new rows to the table. Using the auto-correct load option, however, can slow jobs. Consider this technique when the target table is large and the changes to the table are relatively few.

Preload SQL commands can remove partial database updates that occur during incomplete execution of a step in a job. Typically, the preload SQL command deletes rows based on a variable that is set before the partial insertion step began.
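For example, the following could be entered as a preload SQL command on the target table. The table and variable names are hypothetical: $G_LoadDate would be a global variable set in a script before the load starts, and the curly braces ask Data Services to substitute the variable's value, quoted, into the SQL text (assuming your version supports variable substitution in preload commands):

    delete from SALES_FACT
    where LOAD_DATE = {$G_LoadDate}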

Missing Values or Rows

  • Handle missing values with the Validation or Query transform.
  • Handle missing rows with the Use overflow file option.

Missing rows are rows that cannot be inserted into the target table. For example, rows may be missing in instances where a primary key constraint is violated. Overflow files help to process this type of data problem.

When you specify an overflow file and Data Services cannot load a row into a table, Data Services writes the row to the overflow file instead. The trace log indicates the data flow in which the load failed and the location of the file. Use the overflow information to identify invalid data in the source or problems introduced in the data movement. Every new run overwrites the existing overflow file.

When using overflow files, you need to enter a file name and a file format.

For the file format, you have two options:

  • If you select Write data, you can use Data Services to read the data from the overflow file, cleanse it, and load it into the target table.
  • If you select Write sql, you can use the generated SQL commands to load the target manually when the target is accessible.

Caution

Give a full path name for your overflow file to ensure that Data Services creates a unique file when more than one file is created in the same job.

Alternative Work Flows

You can set up jobs to use alternative work flows that cover all possible exceptions and have recovery mechanisms built in. This technique allows you to automate the process of recovering results.

In this example, the main data flow is using default settings for loading a target table.

In case of an error, the job records the failure. After correcting the issue, you execute the job again.

The next execution then uses another data flow with the auto-correct load option enabled, allowing the data flow to complete successfully despite the rows that might already have been committed the first time.

The successful execution must then be recorded so that the next run of the job uses the default data flow again.

Alternative Work Flow Components

Alternative work flows consist of several components, as shown in the previous figure and in the script sketch that follows this list:

  1. A script to determine when a recovery is required.

    This script reads the value in a status table and populates a global variable with the same value. The initial value is set to indicate that a recovery is not required.

  2. A conditional that calls the appropriate work flow based on whether recovery is required.

    The conditional contains an If/Then/Else statement specifying that work flows that do not require recovery are processed one way, and those that do require recovery are processed another way.

  3. A work flow with a try/catch block to execute a data flow without recovery.

    The data flow where recovery is not required is set up without the auto-correct load option. This ensures that, wherever possible, the data flow is executed in a less resource-intensive mode.

  4. A script in the catch object to update the status table.

    The script specifies that recovery is required if any exceptions are generated.

  5. A work flow to execute a data flow with recovery and a script to update the status table.

    The data flow is set up for more resource-intensive processing that will resolve the exceptions. The script updates the status table to indicate that recovery is not required.
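The following sketch shows what the scripted pieces (components 1, 4, and 5) might look like, assuming a one-row status table JOB_STATUS in a datastore named DS_Admin and a global variable $G_Recovery_Needed. All of these names are hypothetical:

    # 1. Initialization script: read the recovery flag into the global
    #    variable tested by the conditional. sql() returns the value in
    #    the first column of the first row.
    $G_Recovery_Needed = sql('DS_Admin', 'SELECT recovery_flag FROM JOB_STATUS');

    # 4. Script in the catch object: flag that the next run must recover.
    sql('DS_Admin', 'UPDATE JOB_STATUS SET recovery_flag = 1');

    # 5. Script after the recovery data flow completes: reset the flag so
    #    that the next run uses the default data flow again.
    sql('DS_Admin', 'UPDATE JOB_STATUS SET recovery_flag = 0');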

Conditionals

Conditionals are single-use objects used to implement conditional logic in a work flow.

When you define a conditional, you specify a condition and two logical branches:

Statement   Description
If          A Boolean expression that evaluates to TRUE or FALSE. Use functions, variables, and standard operators to construct the expression.
Then        Work flow element to execute if the If expression evaluates to TRUE.
Else        Work flow element to execute if the If expression evaluates to FALSE.

Both the Then and Else branches of the conditional can contain any object that you can have in a work flow, including other work flows, data flows, nested conditionals, try/catch blocks, scripts, and so on.
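For the alternative work flow described earlier, the conditional could be defined as follows. The work flow names are hypothetical, and $G_Recovery_Needed is the global variable set by the initialization script:

    If:    $G_Recovery_Needed = 0
    Then:  WF_load_no_recovery    (data flow without auto-correct load)
    Else:  WF_load_recovery       (data flow with auto-correct load, plus the status-reset script)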

Try/Catch Blocks

A try/catch block allows you to specify alternative work flows if errors occur during job execution.

Try/catch blocks catch specified classes of errors, apply provided solutions, and continue execution.

You can specify several catch operations in the same block, for different classes of errors.

For Each Catch

  • Specify one exception or group of exceptions handled by the catch.

    To handle more than one exception or group of exceptions, add more catches to the try/catch block.

  • Specify the work flow to execute if the indicated exception occurs.

    Use an existing work flow or define a work flow in the catch editor.

If an exception is thrown during the execution of a try/catch block, and if no catch is looking for that exception, then the exception is handled by normal error logic.
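For example, the script in a catch object might look like the following sketch. It uses error_message(), which is available only inside a catch, and the hypothetical JOB_STATUS table from the earlier sketch:

    # Log what went wrong, then mark the job for recovery on the next run.
    print('Load failed: ' || error_message());
    sql('DS_Admin', 'UPDATE JOB_STATUS SET recovery_flag = 1');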
