Recovering from file system errors

The following topics describe the actions you should take to recover from errors that you receive. For a description of the return codes themselves, see Major and minor return codes in files by the system.

Normal completion of errors by the system

A major and minor return code of 0000 indicates that the operation requested by your program completed successfully. Most of the time, the system issues no message. In some cases, the system might use a diagnostic message to inform the user of some unusual condition that it could not handle, but which might be considered an error under some conditions. For example, it might ignore a parameter that is not valid, or it might take some default action.

For communications devices, a major return code of 00, indicating successful completion with data received, is accompanied by a minor return code that indicates what operation the application program is expected to perform next. In this case, a nonzero minor return code does not indicate an error. No message is issued.
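
As a minimal sketch of how a program might separate the two halves of a return code, the Python helper below splits a four-character code into its major and minor parts. The names `ReturnCode`, `split_return_code`, and `is_normal_completion` are illustrative assumptions and are not part of any system interface.

```python
from typing import NamedTuple


class ReturnCode(NamedTuple):
    """Hypothetical container for a four-character return code."""
    major: str
    minor: str


def split_return_code(code: str) -> ReturnCode:
    """Split a code such as '0310' into major '03' and minor '10'."""
    if len(code) != 4:
        raise ValueError(f"expected a four-character return code, got {code!r}")
    code = code.upper()  # minor codes may contain letters, for example 80EB
    return ReturnCode(major=code[:2], minor=code[2:])


def is_normal_completion(code: str) -> bool:
    """A 00 major code means success, even when the minor code is nonzero."""
    return split_return_code(code).major == "00"
```

Testing only the major part keeps a nonzero minor code on a 00 major code from being mistaken for an error.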

Completion with exceptions of errors by the system

The system assigns several rather specific major return codes to conditions for which a specific response from the application program is appropriate.

A major return code of 02 indicates that the requested input operation completed successfully, but the system is ending the job in a controlled manner. The application program should complete its processing as quickly as possible. The controlled cancel is intended to allow programs time to end in an orderly manner. If your program does not end within the time specified on the ENDJOB command, the system will end the job without further notice.

A major return code of 03 indicates that an input operation completed successfully without transferring any data. For some applications, this might be an error condition, or it might be expected when the user presses a function key instead of entering data. It might also indicate that all the data has been processed, and the application program should proceed with its completion processing. In any case, the contents of the input buffer in the program should be ignored.

A major and minor code of 0309 indicates that the system received no data and is ending the job in a controlled manner. A major and minor code of 0310 indicates that there is no data because the specified wait time has ended. Other minor return codes accompanying the 02 or 03 major code are the same as for a 00 major code, indicating communications status and the operation to be performed next.

A major return code of 04 indicates that an output exception occurred. Specifically, your program attempted to send data when data should have been received. This is probably the result of not handling the minor return code properly on the previous successful completion. Your program can recover by simply receiving the incoming data and then repeating the write operation.
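
The recovery described for the 04 major code (receive the incoming data, then repeat the write) might look like the following sketch. The `write` and `read` callables are hypothetical stand-ins for whatever I/O interface your program uses. The sketch assumes that `write` returns the four-character return code and that `read` returns the incoming data.

```python
from typing import Callable, Optional, Tuple


def write_with_receive_recovery(
    write: Callable[[bytes], str],
    read: Callable[[], bytes],
    data: bytes,
) -> Tuple[str, Optional[bytes]]:
    """Write data; on a 04 major code, receive pending data and write once more.

    Returns the final return code and any data received during recovery.
    """
    code = write(data)
    if code[:2] != "04":
        return code, None      # no output exception, nothing to recover
    incoming = read()          # receive the data the other side was sending
    code = write(data)         # repeat the write that could not be done
    return code, incoming
```

The sketch repeats the write only once. A program that keeps receiving a 04 major code most likely needs to act on the minor return code from the previous operation instead.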

A major return code of 34 indicates that an input exception occurred. The received data was either too long or incompatible with the record format. The minor return code indicates what was wrong with the received data, and whether the data was truncated or rejected. Your program can probably handle the exception and continue. If the data was rejected, you may be able to read it by specifying a different record format.

Two other return codes in this group, 0800 and 1100, are both usually the result of application programming errors, but are still recoverable. 0800 indicates that an acquire operation failed because the device has already been acquired or the session has already been established. 1100 indicates that the program attempted to read from invited devices with no devices invited. In both cases, the request that is not valid is ignored, and the program may continue.

No message is issued with a 02 major code or most minor codes with the 03 major code, but the other exceptions in this group are usually accompanied by a message in the CPF4701-CPF47FF or CPF5001-CPF50FF range.
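
To summarize this group, the sketch below maps each return code discussed above to a short description of the response. The function name `suggested_action` and the wording of the strings are illustrative paraphrases of the text, not system-defined values.

```python
def suggested_action(code: str) -> str:
    """Return an illustrative action for a completion-with-exceptions code."""
    # Specific major and minor combinations take precedence over the major code.
    specific = {
        "0309": "no data received and job is ending: finish processing quickly",
        "0310": "no data because the specified wait time ended",
        "0800": "acquire ignored, device or session already acquired: continue",
        "1100": "read ignored, no devices were invited: continue",
    }
    by_major = {
        "02": "input completed and job is ending: finish processing quickly",
        "03": "no data transferred: ignore the input buffer",
        "04": "output exception: receive the incoming data, then repeat the write",
        "34": "input exception: check the minor code for truncated or rejected data",
    }
    if code in specific:
        return specific[code]
    return by_major.get(code[:2], "not in this group: see the accompanying message")
```

Checking the full four-character code first ensures that 0309 and 0310 are not handled as an ordinary 03 major code.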

Permanent system or file error

A major return code of 80 indicates a serious error that affects the file. The application program must close the file and reopen it before attempting to use it again, but recovery is unlikely until the problem causing the error is found and corrected. To reset an error condition in a shared file by closing it and opening it again, all programs sharing the open data path must close the file. This may require returning to previous programs in the call stack and closing the shared file in each of those programs. The operator or programmer should refer to the text of the accompanying message to determine what action is appropriate for the particular error.

Within this group, several minor return codes are of particular interest. A major and minor code of 8081 indicates a serious system error that probably requires an APAR. The message sent with the major and minor return code may direct you to run the Analyze Problem (ANZPRB) command to obtain more information.

A major and minor code of 80EB indicates that incorrect or incompatible options were specified in the device file or as parameters on the open operation. In most cases you can close the file, end the program, correct the parameter that is not valid with an override command, and run the program again. The override command affects only the job in which it is issued. It allows you to test the change easily, but you may eventually want to change or re-create the device file as appropriate to make the change permanent.

Permanent device or session error on I/O operation

A major return code of 81 indicates a serious error that affects the device or session. This includes hardware failures that affect the device, communications line, or communications controller. It also includes errors due to a device being disconnected or powered off unexpectedly and abnormal conditions that were discovered by the device and reported back to the system. Both the minor return code and the accompanying message provide more specific information regarding the cause of the problem.

Depending on the file type, the program must either close the file and open it again, release the device and acquire it again, or acquire the session again. To reset an error condition in a shared file by closing it and opening it again, all programs sharing the open data path must close the file. In some cases, the message may instruct you to reset the device by varying it off and on again. It is unlikely that the program will be able to use the failing device until the problem causing the error is found and corrected, but recovery within the program may be possible if an alternate device is available.

Some of the minor return codes in this group are the same as those for the 82 major return code. Device failures or line failures may occur at any time, but an 81 major code occurs on an I/O operation. This means that your program had already established a link with the device or session. Therefore, some data may already have been transferred, but when the program is started again, it starts from the beginning. A possible duplication of data could result.

Message numbers accompanying an 81 major code may be in the range that indicates either an I/O or a close operation. A device failure on a close operation may simply be the result of a failure in sending the final block of data, rather than of an action specific to closing the file. An error on a close operation can cause a file to not close completely. Your error recovery program should respond to close failures with a second close operation. The second close will always complete, regardless of errors.
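
The second-close rule can be sketched as follows. `CloseError` and the `close_file` callable are hypothetical. They stand in for whatever exception or status check your language uses to report a failed close.

```python
from typing import Callable


class CloseError(Exception):
    """Hypothetical exception raised when a close operation fails."""


def close_twice_if_needed(close_file: Callable[[], None]) -> bool:
    """Close a file, issuing a second close if the first one fails.

    Returns True when the first close succeeded and False when a second
    close was needed.
    """
    try:
        close_file()
        return True
    except CloseError:
        # The first close may have left the file partially closed.
        # Per the text above, the second close completes regardless of errors.
        close_file()
        return False
```

The return value lets the caller record that the final block of data might not have been sent, which matters when deciding whether to send the data again.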

Device or session error on open or acquire operation

A major return code of 82 indicates that a device error or a session error occurred during an open or acquire operation. Both the minor return code and the accompanying message will provide more specific information regarding the cause of the problem.

Some of the minor return codes in this group are the same as those for the 81 major return code. Device or line failures may occur at any time, but an 82 major code indicates that the device or session was unusable when your program first attempted to use it. Thus no data was transferred. The problem may be the result of a configuration or installation error.

Depending on the minor return code, it may be appropriate for your program to recover from the error and try the failing operation again after some waiting period. You should specify the number of times you try in your program. It may also be possible to use an alternate or backup device or session instead.

Message numbers accompanying an 82 major code may be in the range indicating either an open or an acquire operation. If the operation was an open, it is necessary to close the partially opened file and reopen it to recover from the error. If the operation was an acquire, it may be necessary to do a release operation before trying the acquire again. In either case, you should specify a wait time for the file that is long enough to allow the system to recover from the error.
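
A bounded retry for an open operation, as described above, might be sketched like this. `OpenError`, `open_file`, and `close_file` are hypothetical names, and the default limits are arbitrary placeholders. The pause between attempts is separate from the wait time specified for the file itself.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class OpenError(Exception):
    """Hypothetical exception carrying a four-character return code."""

    def __init__(self, return_code: str) -> None:
        super().__init__(return_code)
        self.return_code = return_code


def open_with_retry(
    open_file: Callable[[], T],
    close_file: Callable[[], None],
    max_attempts: int = 3,
    wait_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try an open up to max_attempts times, pausing between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return open_file()
        except OpenError as err:
            if err.return_code[:2] != "82":
                raise              # only 82 major codes are retried here
            close_file()           # close the partially opened file first
            if attempt == max_attempts:
                raise              # retry limit reached; caller decides next
            sleep(wait_seconds)
    raise AssertionError("unreachable")
```

Because the number of attempts is fixed, the program cannot loop indefinitely on a configuration or installation error that no amount of waiting will fix.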

Recoverable device or session errors on I/O operation

A major return code of 83 indicates that an error occurred in sending data to a device or receiving data from the device. Recovery by the application program is possible. Both the minor return code and the accompanying message provide more specific information regarding the cause of the problem.

Most of the errors in this group are the result of sending commands or data that are not valid to the device, or sending valid data at the wrong time or to a device that is not able to handle it. The application program may recover by skipping the failing operation or data item and going on to the next one, or by substituting an appropriate default. There may be a logic error in the application.
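
One way to skip a failing data item and go on to the next one is sketched below. `DeviceIOError` and the `send` callable are hypothetical. The sketch assumes that a failed operation reports its four-character return code through the exception.

```python
from typing import Callable, Iterable, List, Tuple


class DeviceIOError(Exception):
    """Hypothetical exception carrying a four-character return code."""

    def __init__(self, return_code: str) -> None:
        super().__init__(return_code)
        self.return_code = return_code


def send_skipping_failures(
    items: Iterable[str],
    send: Callable[[str], None],
) -> List[Tuple[str, str]]:
    """Send each item; skip items that fail with an 83 major code.

    Returns the skipped items together with their return codes.
    """
    skipped: List[Tuple[str, str]] = []
    for item in items:
        try:
            send(item)
        except DeviceIOError as err:
            if err.return_code[:2] != "83":
                raise  # other major codes are not recoverable this way
            skipped.append((item, err.return_code))
    return skipped
```

Returning the skipped items, rather than discarding them silently, makes it easier to find a logic error in the application afterward.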

