
This section is an overview of PostgreSQL locks, translated from the README file.



Locking Overview
Postgres uses four types of interprocess locks:
* Spinlocks.  These are intended for *very* short-term locks.  If a lock
is to be held more than a few dozen instructions, or across any sort of
kernel call (or even a call to a nontrivial subroutine), don't use a
spinlock. Spinlocks are primarily used as infrastructure for lightweight
locks. They are implemented using a hardware atomic-test-and-set
instruction, if available.  Waiting processes busy-loop until they can
get the lock. There is no provision for deadlock detection, automatic
release on error, or any other nicety.  There is a timeout if the lock
cannot be gotten after a minute or so (which is approximately forever in
comparison to the intended lock hold time, so this is certainly an error
condition).
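In short: a spinlock is a very short-term lock. Do not use one if the lock
would be held for more than a few dozen instructions or across a kernel call.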
* Lightweight locks (LWLocks).  These locks are typically used to
interlock access to data structures in shared memory.  LWLocks support
both exclusive and shared lock modes (for read/write and read-only
access to a shared object). There is no provision for deadlock
detection, but the LWLock manager will automatically release held
LWLocks during elog() recovery, so it is safe to raise an error while
holding LWLocks.  Obtaining or releasing an LWLock is quite fast (a few
dozen instructions) when there is no contention for the lock.  When a
process has to wait for an LWLock, it blocks on a SysV semaphore so as
to not consume CPU time.  Waiting processes will be granted the lock in
arrival order.  There is no timeout.
In short: LWLocks are typically used to protect data structures in shared memory.
* Regular locks (a/k/a heavyweight locks).  The regular lock manager
supports a variety of lock modes with table-driven semantics, and it has
full deadlock detection and automatic release at transaction end.
Regular locks should be used for all user-driven lock requests.
In short: regular locks are the heavyweight locks.
* SIReadLock predicate locks.  See separate README-SSI file for details.
In short: SIReadLock predicate locks are covered in README-SSI.
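
The spinlock behaviour described in the list above (test-and-set, busy-looping, a give-up limit, no deadlock detection) can be sketched with C11 atomics. This is a minimal illustration and not PostgreSQL's own spinlock code: the names toy_spinlock, toy_spin_init, toy_spin_acquire and toy_spin_release are hypothetical, and the max_spins argument merely stands in for the timeout mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy spinlock; not PostgreSQL's slock_t. */
typedef struct toy_spinlock
{
    atomic_flag held;           /* set while some process holds the lock */
} toy_spinlock;

static void
toy_spin_init(toy_spinlock *lock)
{
    atomic_flag_clear(&lock->held);
}

/*
 * Busy-loop on an atomic test-and-set.  Returns false if the lock could
 * not be obtained within max_spins attempts, which stands in for the
 * "stuck spinlock" timeout.  There is no deadlock detection and no
 * automatic release on error.
 */
static bool
toy_spin_acquire(toy_spinlock *lock, unsigned long max_spins)
{
    assert(max_spins > 0);
    for (unsigned long i = 0; i < max_spins; i++)
    {
        /* test_and_set returns the previous value: false means we got it */
        if (!atomic_flag_test_and_set_explicit(&lock->held,
                                               memory_order_acquire))
            return true;
    }
    return false;
}

static void
toy_spin_release(toy_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

A caller would keep the protected section to a handful of instructions between toy_spin_acquire and toy_spin_release, which is why the text above rules out kernel calls while a spinlock is held.
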
Acquisition of either a spinlock or a lightweight lock causes query
cancel and die() interrupts to be held off until all such locks are
released. No such restriction exists for regular locks, however.  Also
note that we can accept query cancel and die() interrupts while waiting
for a regular lock, but we will not accept them while waiting for
spinlocks or LW locks. It is therefore not a good idea to use LW locks
when the wait time might exceed a few seconds.
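
The hold-off rule in the paragraph above can be pictured as a counter that is raised while any spinlock or LWLock is held. The sketch below is an assumption-laden toy, not PostgreSQL's interrupt handling: all names (toy_holdoff_count, toy_request_cancel, toy_check_interrupts, and so on) are hypothetical, and "servicing" a cancel is reduced to bumping a counter.

```c
#include <assert.h>
#include <stdbool.h>

static int  toy_holdoff_count = 0;      /* short-term locks currently held */
static bool toy_cancel_pending = false; /* a cancel request has arrived */
static int  toy_cancels_serviced = 0;   /* cancels actually acted upon */

/* Taking a spinlock or LWLock raises the hold-off count. */
static void
toy_short_lock_acquire(void)
{
    toy_holdoff_count++;
}

/* Releasing one lowers it again. */
static void
toy_short_lock_release(void)
{
    assert(toy_holdoff_count > 0);
    toy_holdoff_count--;
}

/* A cancel request only sets a flag; it never acts immediately. */
static void
toy_request_cancel(void)
{
    toy_cancel_pending = true;
}

/* The request is honoured only once no short-term lock is held. */
static void
toy_check_interrupts(void)
{
    if (toy_holdoff_count == 0 && toy_cancel_pending)
    {
        toy_cancel_pending = false;
        toy_cancels_serviced++;
    }
}
```

Because a pending cancel stays unserviced for as long as the count is non-zero, a long wait on an LWLock leaves the backend unresponsive to cancellation, which is the reason for the advice above.
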
The rest of this README file discusses the regular lock manager in detail.

Lock Data Structures
Lock methods describe the overall locking behavior.  Currently there are
two lock methods: DEFAULT and USER.
Lock modes describe the type of the lock (read/write or shared/exclusive).
In principle, each lock method can have its own set of lock modes with
different conflict rules, but currently DEFAULT and USER methods use
identical lock mode sets. See src/include/storage/lock.h for more details.
(Lock modes are also called lock types in some places in the code and
documentation.)
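
To make "each lock method can have its own set of lock modes with different conflict rules" concrete, here is a minimal sketch of a table-driven conflict check. It uses a deliberately tiny, made-up mode set (TOY_SHARE and TOY_EXCLUSIVE) rather than PostgreSQL's real modes, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef int toy_mask;

/* A made-up two-mode set; the real mode list lives in the PostgreSQL headers. */
enum { TOY_NO_LOCK = 0, TOY_SHARE = 1, TOY_EXCLUSIVE = 2 };

#define TOY_BIT(mode) (1 << (mode))

/* A lock method is just a mode count plus a conflict table. */
typedef struct toy_lock_method
{
    int             num_modes;
    const toy_mask *conflict_tab;   /* indexed by mode */
} toy_lock_method;

/* conflict_tab[m] has a bit set for every mode that conflicts with m. */
static const toy_mask toy_conflicts[] = {
    0,                                              /* TOY_NO_LOCK */
    TOY_BIT(TOY_EXCLUSIVE),                         /* TOY_SHARE */
    TOY_BIT(TOY_SHARE) | TOY_BIT(TOY_EXCLUSIVE)     /* TOY_EXCLUSIVE */
};

static const toy_lock_method toy_default_method = {2, toy_conflicts};

/* Does a request for 'requested' conflict with the modes already granted? */
static bool
toy_modes_conflict(const toy_lock_method *method, int requested,
                   toy_mask granted_mask)
{
    assert(requested >= 1 && requested <= method->num_modes);
    return (method->conflict_tab[requested] & granted_mask) != 0;
}
```

A second lock method would supply its own table; the checking code stays the same, which is the point of keeping the semantics in data.
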
There are two main methods for recording locks in shared memory.  The primary
mechanism uses two main structures: the per-lockable-object LOCK struct, and
the per-lock-and-requestor PROCLOCK struct.  A LOCK object exists for each
lockable object that currently has locks held or requested on it.  A PROCLOCK
struct exists for each backend that is holding or requesting lock(s) on each
LOCK object.
There is also a special "fast path" mechanism which backends may use to
record a limited number of locks with very specific characteristics: they must
use the DEFAULT lockmethod; they must represent a lock on a database relation
(not a shared relation); they must be a "weak" lock which is unlikely to
conflict (AccessShareLock, RowShareLock, or RowExclusiveLock); and the system
must be able to quickly verify that no conflicting locks could possibly be
present.  See "Fast Path Locking", below, for more details.
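In short, the "fast path" lets a backend record a limited number of locks with
very specific characteristics: each must be a "weak" lock that is unlikely to
conflict (AccessShareLock, RowShareLock, or RowExclusiveLock), and the system
must be able to verify quickly that no conflicting lock can be present. See
"Fast Path Locking" below for details.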
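
The fast-path conditions listed above amount to a simple predicate. The sketch below restates them in C under stated assumptions: the struct, the enum values and the function names are hypothetical, the two boolean/count parameters stand in for checks the text only describes in words, and this is not the test PostgreSQL itself performs.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_method { TOY_DEFAULT_METHOD = 1, TOY_USER_METHOD = 2 };

/* Illustrative subset of lock modes; the numbering here is an assumption. */
enum toy_mode
{
    TOY_ACCESS_SHARE = 1,
    TOY_ROW_SHARE = 2,
    TOY_ROW_EXCLUSIVE = 3,
    TOY_ACCESS_EXCLUSIVE = 8
};

typedef struct toy_lock_request
{
    enum toy_method method;
    bool            is_relation;        /* lock is on a relation */
    bool            is_shared_relation; /* relation is shared across databases */
    enum toy_mode   mode;
} toy_lock_request;

/* The three "weak" modes named in the text. */
static bool
toy_is_weak_mode(enum toy_mode mode)
{
    return mode == TOY_ACCESS_SHARE ||
           mode == TOY_ROW_SHARE ||
           mode == TOY_ROW_EXCLUSIVE;
}

/*
 * strong_lock_may_exist: true if a conflicting lock cannot be ruled out.
 * free_slots: how many fast-path entries this backend still has available
 * (the "limited number" in the text).
 */
static bool
toy_fast_path_eligible(const toy_lock_request *req,
                       bool strong_lock_may_exist, int free_slots)
{
    assert(free_slots >= 0);
    return req->method == TOY_DEFAULT_METHOD &&
           req->is_relation &&
           !req->is_shared_relation &&
           toy_is_weak_mode(req->mode) &&
           !strong_lock_may_exist &&
           free_slots > 0;
}
```

Any request that fails one of these tests would fall back to the primary mechanism based on the LOCK and PROCLOCK structures.
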
Each backend also maintains an unshared LOCALLOCK structure for each lockable
object and lock mode that it is currently holding or requesting.  The shared
lock structures only allow a single lock grant to be made per lockable
object/lock mode/backend.  Internally to a backend, however, the same lock may
be requested and perhaps released multiple times in a transaction, and it can
also be held both transactionally and session-wide.  The internal request
counts are held in LOCALLOCK so that the shared data structures need not be
accessed to alter them.
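
The division of labour between the shared structures and LOCALLOCK can be sketched as a local reference count: only the first acquisition and the last release touch shared state. The code below is a toy under that assumption; toy_locallock, toy_local_acquire, toy_local_release and the toy_shared_ops counter are all hypothetical names.

```c
#include <assert.h>

/* Counts how often the (imaginary) shared lock table had to be touched. */
static int toy_shared_ops = 0;

/* Backend-private bookkeeping for one lockable object and lock mode. */
typedef struct toy_locallock
{
    long n_locks;               /* times this backend currently holds it */
} toy_locallock;

static void
toy_local_acquire(toy_locallock *ll)
{
    if (ll->n_locks == 0)
        toy_shared_ops++;       /* first grab: record the single shared grant */
    ll->n_locks++;              /* repeat grabs only bump the local count */
}

static void
toy_local_release(toy_locallock *ll)
{
    assert(ll->n_locks > 0);
    ll->n_locks--;
    if (ll->n_locks == 0)
        toy_shared_ops++;       /* last release: give up the shared grant */
}
```

Acquiring the same lock three times and releasing it three times costs two shared-memory operations in this model instead of six.
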


/*
 * Per-locked-object lock information:
 *
 * tag -- uniquely identifies the object being locked
 * grantMask -- bitmask for all lock types currently granted on this object.
 * waitMask -- bitmask for all lock types currently awaited on this object.
 * procLocks -- list of PROCLOCK objects for this lock.
 * waitProcs -- queue of processes waiting for this lock.
 * requested -- count of each lock type currently requested on the lock
 *      (includes requests already granted!!).
 * nRequested -- total requested locks of all types.
 * granted -- count of each lock type currently granted on the lock.
 * nGranted -- total granted locks of all types.
 *
 * Note: these counts count 1 for each backend.  Internally to a backend,
 * there may be multiple grabs on a particular lock, but this is not reflected
 * into shared memory.
 */
typedef struct LOCK
{
    /* hash key */
    LOCKTAG     tag;            /* unique identifier of lockable object */

    /* data */
    LOCKMASK    grantMask;      /* bitmask for lock types already granted */
    LOCKMASK    waitMask;       /* bitmask for lock types awaited */
    SHM_QUEUE   procLocks;      /* list of PROCLOCK objects assoc. with lock */
    PROC_QUEUE  waitProcs;      /* list of PGPROC objects waiting on lock */
    int         requested[MAX_LOCKMODES];   /* counts of requested locks */
    int         nRequested;     /* total of requested[] array */
    int         granted[MAX_LOCKMODES]; /* counts of granted locks */
    int         nGranted;       /* total of granted[] array */
} LOCK;

#define LOCK_LOCKMETHOD(lock) ((LOCKMETHODID) (lock).tag.locktag_lockmethodid)
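
How the counters and grantMask in LOCK relate to one another can be shown with a small sketch. The struct below mirrors only the counting fields, with hypothetical names (toy_lock, toy_request, toy_grant, toy_release), and the update rules are an assumption drawn from the field descriptions above rather than a copy of the lock manager's code.

```c
#include <assert.h>

#define TOY_MAX_MODES 10
#define TOY_BIT(mode) (1 << (mode))

/* Counting fields only; the queues and the tag are left out of this toy. */
typedef struct toy_lock
{
    int grant_mask;                 /* bit set for each mode with granted > 0 */
    int requested[TOY_MAX_MODES];   /* per-mode requests, granted ones included */
    int n_requested;                /* sum of requested[] */
    int granted[TOY_MAX_MODES];     /* per-mode grants */
    int n_granted;                  /* sum of granted[] */
} toy_lock;

/* A backend asks for the lock: counted once per backend. */
static void
toy_request(toy_lock *lock, int mode)
{
    lock->requested[mode]++;
    lock->n_requested++;
}

/* The request is granted: bump the grant counts and set the mask bit. */
static void
toy_grant(toy_lock *lock, int mode)
{
    lock->granted[mode]++;
    lock->n_granted++;
    lock->grant_mask |= TOY_BIT(mode);
    assert(lock->granted[mode] <= lock->requested[mode]);
}

/* A holder lets go: the mask bit clears only when the last holder leaves. */
static void
toy_release(toy_lock *lock, int mode)
{
    assert(lock->granted[mode] > 0);
    lock->requested[mode]--;
    lock->n_requested--;
    lock->granted[mode]--;
    lock->n_granted--;
    if (lock->granted[mode] == 0)
        lock->grant_mask &= ~TOY_BIT(mode);
}
```

Keeping grant_mask in step with granted[] is what lets a conflict check be a single bitwise AND instead of a scan over the per-mode counts.
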


/*
 * We may have several different backends holding or awaiting locks
 * on the same lockable object.  We need to store some per-holder/waiter
 * information for each such holder (or would-be holder).  This is kept in
 * a PROCLOCK struct.
 *
 * PROCLOCKTAG is the key information needed to look up a PROCLOCK item in the
 * proclock hashtable.  A PROCLOCKTAG value uniquely identifies the combination
 * of a lockable object and a holder/waiter for that object.  (We can use
 * pointers here because the PROCLOCKTAG need only be unique for the lifespan
 * of the PROCLOCK, and it will never outlive the lock or the proc.)
 *
 * Internally to a backend, it is possible for the same lock to be held
 * for different purposes: the backend tracks transaction locks separately
 * from session locks.  However, this is not reflected in the shared-memory
 * state: we only track which backend(s) hold the lock.  This is OK since a
 * backend can never block itself.
 *
 * The holdMask field shows the already-granted locks represented by this
 * proclock.  Note that there will be a proclock object, possibly with
 * zero holdMask, for any lock that the process is currently waiting on.
 * Otherwise, proclock objects whose holdMasks are zero are recycled
 * as soon as convenient.
 *
 * releaseMask is workspace for LockReleaseAll(): it shows the locks due
 * to be released during the current call.  This must only be examined or
 * set by the backend owning the PROCLOCK.
 *
 * Each PROCLOCK object is linked into lists for both the associated LOCK
 * object and the owning PGPROC object.  Note that the PROCLOCK is entered
 * into these lists as soon as it is created, even if no lock has yet been
 * granted.  A PGPROC that is waiting for a lock to be granted will also be
 * linked into the lock's waitProcs queue.
 */
typedef struct PROCLOCKTAG
{
    /* NB: we assume this struct contains no padding! */
    LOCK       *myLock;         /* link to per-lockable-object information */
    PGPROC     *myProc;         /* link to PGPROC of owning backend */
} PROCLOCKTAG;

typedef struct PROCLOCK
{
    /* tag */
    PROCLOCKTAG tag;            /* unique identifier of proclock object */

    /* data */
    PGPROC     *groupLeader;    /* proc's lock group leader, or proc itself */
    LOCKMASK    holdMask;       /* bitmask for lock types currently held */
    LOCKMASK    releaseMask;    /* bitmask for lock types to be released */
    SHM_QUEUE   lockLink;       /* list link in LOCK's list of proclocks */
    SHM_QUEUE   procLink;       /* list link in PGPROC's list of proclocks */
} PROCLOCK;

#define PROCLOCK_LOCKMETHOD(proclock) \
    LOCK_LOCKMETHOD(*((proclock)->tag.myLock))
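
The "no padding" remark on PROCLOCKTAG matters because a pointer-pair key is typically hashed and compared as raw bytes, and padding bytes would hold unpredictable values. The sketch below illustrates that idea only: the toy_* names are hypothetical, and the FNV-1a hash is a generic stand-in, not the hash function PostgreSQL uses for its proclock table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the lock and process objects the tag points at. */
struct toy_lock { int dummy; };
struct toy_proc { int dummy; };

typedef struct toy_proclocktag
{
    struct toy_lock *my_lock;   /* which lockable object */
    struct toy_proc *my_proc;   /* which holder or waiter */
} toy_proclocktag;

/* Fail the build if the compiler inserted padding into the key. */
_Static_assert(sizeof(toy_proclocktag) == 2 * sizeof(void *),
               "toy_proclocktag must contain no padding");

/* Bytewise equality is only safe because every byte is meaningful. */
static bool
toy_tag_equal(const toy_proclocktag *a, const toy_proclocktag *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* Generic FNV-1a over the key bytes (an illustrative choice of hash). */
static uint32_t
toy_tag_hash(const toy_proclocktag *tag)
{
    const unsigned char *p = (const unsigned char *) tag;
    uint32_t    h = 2166136261u;

    assert(tag != NULL);
    for (size_t i = 0; i < sizeof(*tag); i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```

Two tags built from the same lock and process pointers compare equal and hash identically, which is all a hash table needs; the pointers stay valid as keys because the entry never outlives either object.
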


/*
 * Each backend also maintains a local hash table with information about each
 * lock it is currently interested in.  In particular the local table counts
 * the number of times that lock has been acquired.  This allows multiple
 * requests for the same lock to be executed without additional accesses to
 * shared memory.  We also track the number of lock acquisitions per
 * ResourceOwner, so that we can release just those locks belonging to a
 * particular ResourceOwner.
 *
 * When holding a lock taken "normally", the lock and proclock fields always
 * point to the associated objects in shared memory.  However, if we acquired
 * the lock via the fast-path mechanism, the lock and proclock fields are set
 * to NULL, since there probably aren't any such objects in shared memory.
 * (If the lock later gets promoted to normal representation, we may eventually
 * update our locallock's lock/proclock fields after finding the shared
 * objects.)
 *
 * Caution: a locallock object can be left over from a failed lock acquisition
 * attempt.  In this case its lock/proclock fields are untrustworthy, since
 * the shared lock object is neither held nor awaited, and hence is available
 * to be reclaimed.  If nLocks > 0 then these pointers must either be valid or
 * NULL, but when nLocks == 0 they should be considered garbage.
 */
typedef struct LOCALLOCKTAG
{
    LOCKTAG     lock;           /* identifies the lockable object */
    LOCKMODE    mode;           /* lock mode for this table entry */
} LOCALLOCKTAG;

typedef struct LOCALLOCKOWNER
{
    /*
     * Note: if owner is NULL then the lock is held on behalf of the session;
     * otherwise it is held on behalf of my current transaction.
     *
     * Must use a forward struct reference to avoid circularity.
     */
    struct ResourceOwnerData *owner;
    int64       nLocks;         /* # of times held by this owner */
} LOCALLOCKOWNER;

typedef struct LOCALLOCK
{
    /* tag */
    LOCALLOCKTAG tag;           /* unique identifier of locallock entry */

    /* data */
    uint32      hashcode;       /* copy of LOCKTAG's hash value */
    LOCK       *lock;           /* associated LOCK object, if any */
    PROCLOCK   *proclock;       /* associated PROCLOCK object, if any */
    int64       nLocks;         /* total number of times lock is held */
    int         numLockOwners;  /* # of relevant ResourceOwners */
    int         maxLockOwners;  /* allocated size of array */
    LOCALLOCKOWNER *lockOwners; /* dynamically resizable array */
    bool        holdsStrongLockCount;   /* bumped FastPathStrongRelationLocks */
    bool        lockCleared;    /* we read all sinval msgs for lock */
} LOCALLOCK;

#define LOCALLOCK_LOCKMETHOD(llock) ((llock).tag.lock.locktag_lockmethodid)
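
The per-owner counts in LOCALLOCK (nLocks, numLockOwners, maxLockOwners, lockOwners) suggest a small growable array keyed by owner. The following is a minimal sketch of that bookkeeping under stated assumptions: the toy_* names are hypothetical, owners are plain opaque pointers (NULL meaning "the session", as in the LOCALLOCKOWNER comment), and the doubling growth policy is an illustrative choice rather than something taken from the source.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_owner_count
{
    const void *owner;          /* NULL means held on behalf of the session */
    long        n_locks;        /* times held by this owner */
} toy_owner_count;

typedef struct toy_locallock
{
    long             n_locks;       /* total across all owners */
    int              num_owners;    /* entries in use */
    int              max_owners;    /* allocated entries */
    toy_owner_count *owners;        /* growable array */
} toy_locallock;

/* Record one more acquisition for 'owner'.  Returns 0, or -1 if out of memory. */
static int
toy_remember_owner(toy_locallock *ll, const void *owner)
{
    for (int i = 0; i < ll->num_owners; i++)
    {
        if (ll->owners[i].owner == owner)
        {
            ll->owners[i].n_locks++;
            ll->n_locks++;
            return 0;
        }
    }
    if (ll->num_owners == ll->max_owners)
    {
        int              new_max = (ll->max_owners == 0) ? 2 : ll->max_owners * 2;
        toy_owner_count *grown;

        grown = realloc(ll->owners, (size_t) new_max * sizeof(*grown));
        if (grown == NULL)
            return -1;
        ll->owners = grown;
        ll->max_owners = new_max;
    }
    ll->owners[ll->num_owners].owner = owner;
    ll->owners[ll->num_owners].n_locks = 1;
    ll->num_owners++;
    ll->n_locks++;
    return 0;
}

/* Drop everything held by 'owner'; returns how many holds were released. */
static long
toy_release_owner(toy_locallock *ll, const void *owner)
{
    for (int i = 0; i < ll->num_owners; i++)
    {
        if (ll->owners[i].owner == owner)
        {
            long released = ll->owners[i].n_locks;

            ll->n_locks -= released;
            assert(ll->n_locks >= 0);
            /* compact by moving the last entry into the freed slot */
            ll->owners[i] = ll->owners[ll->num_owners - 1];
            ll->num_owners--;
            return released;
        }
    }
    return 0;
}
```

Releasing one owner's holds leaves the other owners' counts untouched, which is what makes it possible to release just the locks belonging to a particular ResourceOwner.
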


