3.12.10

SAP Note 20071 - Permanent lock after connection termination.

Symptom

R/3 work processes serialize when accessing a particular table.

In V$SESSION_WAIT, the same wait event is displayed for all sessions concerned. The wait event refers to an enqueue, lock or latch, for example:

  • enqueue
  • library cache lock
  • latch free
  • row cache lock

DDL statements like CREATE INDEX or VALIDATE STRUCTURE (without ONLINE option) fails with ORA-00054.

The situation is static - even after a long time nothing has changed.

Reason and Prerequisites

When an Oracle session holds an enqueue, lock or latch, the associated client process (SAP work process, for example) loses the connection to this session. The cause may be a network problem, for example. The Oracle session may not realize this under certain conditions. Instead, it is still waiting for the next Net8 packages from the (no longer existent) client process. Due to this, other sessions requiring the same lock cannot continue working either. As a consequence, the sessions and client processes serialize for this lock.

Solution

To solve the problem, the hanging Oracle session needs to be cleaned up. This can be done as follows:

    1. Restart the Oracle database

The simplest, but often not acceptable solution to this problem is to restart the Oracle database. This stops the hanging Oracle session. If you cannot restart the database for business reasons, you can do one of the following.

    2. Set SQLNET.EXPIRE_TIME in sqlnet.ora.

A preventive measure for avoiding the problem is the setting of SQLNET.EXPIRE_TIME in sqlnet.ora, and this parameter is set to 0 (deactivated) by default in the SAP environment. If it is set to a value greater than 0, interrupted connections are cleaned up automatically after minutes. This also corrects the enqueue, lock or latch so that the subsequent sessions can continue working properly. If you set too low, an unnecessary network overhead may occur. If you set too high, an unnecessarily long time passes until the session is cleaned up. A useful compromise may be SQLNET.EXPIRE_TIME = 5, which ensures cleaning up after five minutes.

Note that in the past, with Oracle 8. x or lower, this mechanism also cancelled long-running SQL statements in some cases. Therefore check during the days after activating SQLNET.EXPIRE_TIME whether there are inexplicable terminations with ORA-01013 (see Note 618861).

SQLNET. EXPIRE_TIME only affects connections whose Oracle session is INACTIVE. If the Oracle session has the status ACTIVE, the status is not changed if the connection is broken.

If the problem has already occurred, SQLNET. EXPIRE_TIME can no longer solve it.. This parameter can only prevent similar problems in the future.

    3. WINDOWS: TCP/IP configuration:

In some cases, a dead connection from SQLNET.EXPIRE_TIME is not cleaned up (for example, when the session is still active on the Oracle side). In this case, the configuration of a TCP/IP timeout helps on WINDOWS. To do this, define the parameter KeepAliveTime in the registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, and set this to a value such as 300,000 (->5 minutes). By doing this, the system cleans up dead connections after five minutes.

    4. Manual termination of the hanging session

In a first step, determine the session ID (SID) and serial number (SERIAL#) of the session holding the particular enqueue, latch or lock.

As of Oracle 10g, sessions that are currently blocking can in most cases be determined as follows:

SELECT
S2.SID BLOCKING_SESSION,
S2.SERIAL# BLOCKING_SERIAL#,
DECODE(S2.WAIT_TIME, 0, S2.EVENT, 'CPU') BLOCKING_SESSION_EVENT,
S1.SID WAITING_SESSION,
DECODE(S1.WAIT_TIME, 0, S1.EVENT, 'CPU') WAITING_SESSION_EVENT
FROM
V$SESSION S1, V$SESSION S2
WHERE
S2.SID = S1.BLOCKING_SESSION;

Until Oracle 9i, the required steps depend on the wait event:

  • enqueue
    • Determine the session IDs , , ..., of the sessions that are accessing the affected table :

      SELECT SID FROM V$ACCESS WHERE OBJECT = "
      ";
        • Determine the session ID of the session actually holding the enqueue:

          SELECT SID FROM V$LOCK
          WHERE SID IN (, , ..., ) AND LMODE = 6;
        • Determine the serial number of session :

          SELECT SERIAL# FROM V$SESSION WHERE SID = ;
      • library cache lock

      Note 607944 describes how to determine and <> of the locking session.

      • latch free
        • Determine the name of the latch that (most of) the sessions are waiting for:

          SELECT N.NAME FROM V$LATCHNAME N, V$SESSION_WAIT W
          WHERE W.EVENT = 'latch free' AND W.P2 = N.LATCH#;
        • Determine the SID of the session that is holding the latch:

          SELECT H.SID FROM V$LATCHHOLDER H, V$SESSION_WAIT W
          WHERE W.P1RAW = H.LADDR AND H.NAME = "";
        • Determine the serial number of session :

          SELECT SERIAL# FROM V$SESSION WHERE SID = ;
      • row cache lock
        • Determine the sessions that currently have row cache locks as described in Note 619188. Perform this step several times in a row, and determine the SID of each session that has a permanent lock.

      Make sure that the session holding the lock is no longer connected to any client process. Use the following command to determine the OS-PID and the host of the client process:

      SELECT PROCESS, MACHINE FROM V$SESSION WHERE SID = ;

      Make sure that no process with the specified OS-PID is running on the returned host.

      If the client process actually no longer exists, you can terminate the hanging Oracle session as follows:

      ALTER SYSTEM KILL SESSION ',';

      If this command returns after some time with the message

      ORA-00031: session marked for kill

      , this means that the session cannot yet be terminated. In this case, you can try to hard terminate the corresponding shadow process (UNIX: "kill -9 ", WINDOWS: "orakill ").

      No comments:

      Post a Comment