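II. Synchronization models between threads

Thread safety in multithreaded programs

Thread safety concerns what happens when several threads use a shared resource at the same time: threads running in parallel or concurrently may modify the shared resource in unpredictable ways, as the following code shows.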

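from threading import Thread

globalVar = 0

def thd():
  global globalVar
  while True:
    # Each update is a separate read and write; a thread switch can happen between any two steps.
    a = globalVar + 1
    globalVar = a
    a = globalVar - 1
    globalVar = a
    print(globalVar)

# Start 100 threads that modify globalVar without synchronization (runs until interrupted).
for i in range(100):
  Thread(target=thd).start()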

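First, the notion of an atomic operation. An atomic operation either executes in full or not at all; once started, it runs to completion without being interrupted by the scheduler. Many steps of the synchronization primitives provided by the operating system are atomic (acquiring a mutex, for example), and without atomic operations it is very hard to synchronize threads using the programming language alone.

Now consider the program above. If a single thread runs thd repeatedly, it always prints 0, but when thd runs concurrently in many threads the printed values cannot be predicted. The threads all reference the same global variable and the moment of a thread switch is unknown, so between one thread's read of globalVar and its write back, another thread can run and modify the variable. Because CPython has the GIL, a plain assignment like the ones here is effectively atomic (+= is not, since it is a read, an add, and a store). The concurrent run can therefore be viewed as the atomic operations of the individual threads "shuffled together in an arbitrary order", and that is the essence of its thread unsafety. One possible interleaving of two threads is shown below.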

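globalVar = 0

a1 = globalVar + 1   # thread 1 reads 0
globalVar = a1       # globalVar == 1
a2 = globalVar + 1   # thread 2 reads 1
globalVar = a2       # globalVar == 2
a2 = globalVar - 1   # thread 2 computes 1
a1 = globalVar - 1   # thread 1 also computes 1 from the same value
globalVar = a2       # globalVar == 1
globalVar = a1       # globalVar == 1; one decrement is lost

print(globalVar)     # prints 1 instead of 0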
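
The interleaving can also be replayed deterministically, without real threads. The following is only a sketch under stated assumptions: it models each thread as a generator whose steps stand in for the atomic operations, and the names `steps` and `run` are hypothetical helpers introduced for illustration, not part of the original program.

```python
def steps(shared):
    """One pass of thd, split into four steps; each yield marks a possible thread switch."""
    a = shared["globalVar"] + 1
    yield
    shared["globalVar"] = a
    yield
    a = shared["globalVar"] - 1
    yield
    shared["globalVar"] = a
    yield


def run(schedule):
    """Execute the steps of two simulated threads in the order given by schedule."""
    shared = {"globalVar": 0}
    threads = {1: steps(shared), 2: steps(shared)}
    for thread_id in schedule:
        next(threads[thread_id])  # advance that thread by exactly one step
    return shared["globalVar"]


serial = run([1, 1, 1, 1, 2, 2, 2, 2])       # thread 1 finishes before thread 2 starts
interleaved = run([1, 1, 2, 2, 2, 1, 2, 1])  # the order used in the listing above
print(serial, interleaved)
```

The serial schedule ends at 0, while the interleaved schedule ends at 1, matching the hand-written listing.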

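Synchronization models for threads

From the discussion above, when several threads cooperate on a task, a synchronization mechanism is needed to coordinate and constrain them. This is obviously the case when threads run in parallel. But because a thread cannot predict from the inside when it will be scheduled, concurrent execution needs synchronization as well. In other words, whether execution is parallel or concurrent, code outside the kernel should treat the threads as running "at the same time". This article only considers synchronization over "shared resources" between threads of the same process. Since the resources of different processes are logically isolated from each other, synchronization between processes (or inter-process communication) differs somewhat from thread synchronization. Some synchronization models follow.

1) Spin lock

Define a spin lock; only the thread holding it may access the shared resource, and the thread must release it when done. A thread that fails to get the spin lock does not block but polls until it succeeds. This lets it take the lock as soon as it becomes free, at the cost of burning CPU time slices while it waits.

Python's threading module provides no spin lock, but one can be thought of as a while loop that keeps retrying.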
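
The while-loop analogy can be made concrete. The sketch below is an assumption-laden illustration, not a standard-library facility: the class `SpinLock` and the helper `count_with_spinlock` are hypothetical names, and the polling is built on the non-blocking form of `Lock.acquire`.

```python
import threading


class SpinLock:
    """Illustrative spin lock: polls instead of blocking."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        # Non-blocking attempt; returns False at once if the lock is taken.
        while not self._lock.acquire(blocking=False):
            pass  # busy-wait: keeps the CPU occupied while polling

    def release(self):
        self._lock.release()


def count_with_spinlock(n_threads=4, n_iter=1000):
    """Increment a shared counter from several threads under the spin lock."""
    spin = SpinLock()
    counter = [0]

    def work():
        for _ in range(n_iter):
            spin.acquire()
            counter[0] += 1
            spin.release()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


print(count_with_spinlock())
```

In CPython the busy loop competes for the GIL with the thread that holds the lock, so this is rarely a good trade in practice; it is shown only to illustrate the polling behaviour.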

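2) Mutex

Define a mutex; only the thread holding it may access the shared resource, releasing it afterwards. A thread that cannot get the mutex is put into the blocked state until it acquires it. Compared with a spin lock, no CPU time is wasted while waiting, but the operating system has to wake the blocked thread.

Python's basic lock object, Lock, is a mutex. It offers acquire and release: acquire takes the mutex and release gives it back. If the mutex is unavailable, the thread blocks at the acquire call and continues with the following code once it holds the mutex.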

from threading import Thread, Lock

globalVar = 0
lock = Lock()

def thd():
  global globalVar
  while True:
    lock.acquire()     # block until this thread holds the mutex
    a = globalVar + 1
    globalVar = a
    a = globalVar - 1
    globalVar = a
    print(globalVar)   # always 0: the four steps can no longer interleave
    lock.release()

for i in range(100):
  Thread(target=thd).start()
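
To check the effect of the lock, a variant that terminates is easier to work with than an endless loop. The sketch below assumes a hypothetical helper named `locked_counter`; it uses `with lock:`, which calls acquire on entry and release on exit, even if the body raises an exception.

```python
import threading


def locked_counter(n_threads=8, n_iter=10000):
    """Return the final count after n_threads threads each add 1 n_iter times."""
    lock = threading.Lock()
    total = [0]

    def work():
        for _ in range(n_iter):
            with lock:            # acquire on entry, release on exit
                a = total[0] + 1  # read
                total[0] = a      # write back; no other thread can run in between

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


print(locked_counter())
```

With the lock in place the result equals n_threads * n_iter on every run.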

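Mutexes provided by operating systems often support reentrancy, meaning the same thread may acquire the lock more than once. RLock is Python's reentrant mutex: a thread may call acquire and release several times, and the lock is only really released once that thread's release calls match its acquire calls.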

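from threading import Thread, RLock, current_thread
from time import sleep

rlock = RLock()

def thd():
  while True:
    rlock.acquire()
    print("{} get RLock".format(current_thread()))
    rlock.acquire()   # same thread again: does not block
    print("{} get RLock again!".format(current_thread()))
    rlock.release()   # still held, one acquire is outstanding
    sleep(1)
    rlock.release()   # now the other thread can take it

# Start two competing threads.
for i in range(2):
  Thread(target=thd).start()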
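
Reentrancy matters most when one locked method calls another locked method on the same object. The sketch below is an illustration with a hypothetical `Account` class; it also contrasts the second, non-blocking acquire on a plain Lock with the same call on an RLock.

```python
import threading


class Account:
    """Hypothetical example class guarded by a reentrant lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self.balance = 0

    def deposit(self, amount):
        with self._lock:
            self.balance += amount

    def deposit_twice(self, amount):
        with self._lock:          # first acquire by this thread
            self.deposit(amount)  # acquires again inside; allowed with RLock
            self.deposit(amount)


account = Account()
account.deposit_twice(5)
print(account.balance)

# A plain Lock refuses a second acquire from the same thread ...
plain = threading.Lock()
plain.acquire()
plain_second = plain.acquire(blocking=False)
plain.release()

# ... while an RLock grants it and must then be released twice.
reentrant = threading.RLock()
reentrant.acquire()
reentrant_second = reentrant.acquire(blocking=False)
reentrant.release()
reentrant.release()

print(plain_second, reentrant_second)
```

If `Account` used a plain Lock, `deposit_twice` would block forever at the nested acquire.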

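3) Semaphore

A semaphore is operated through the P and V operations. It may be held by several threads at once, up to a fixed maximum, and a thread that is done must increment the count again. A thread that cannot get the semaphore stays blocked until it can.

Python offers the Semaphore object, which takes the maximum number of threads as its constructor argument and otherwise has much the same interface as Lock. One detail: release may be called any number of times from any thread, and when release is called more often than acquire, the effective upper limit grows. For that reason there is also BoundedSemaphore, which raises ValueError if a release would push the counter above its initial value, so the limit cannot change.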

from threading import Thread, BoundedSemaphore, current_thread
from time import sleep

s = BoundedSemaphore(2)   # at most two threads inside at once

def thd():
  while True:
    s.acquire()
    print(current_thread())
    sleep(1)
    s.release()

for i in range(4):
  Thread(target=thd).start()
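
Both properties of BoundedSemaphore described above can be checked with a small terminating sketch. The helper names `max_concurrency` and `over_release_raises` are hypothetical; the first records how many threads were inside the guarded region at once, and the second shows the error raised by an extra release.

```python
import threading
import time


def max_concurrency(limit=2, n_threads=6):
    """Return the largest number of threads observed inside the semaphore at once."""
    sem = threading.BoundedSemaphore(limit)
    guard = threading.Lock()  # protects the two counters below
    active = 0
    peak = 0

    def work():
        nonlocal active, peak
        with sem:
            with guard:
                active += 1
                peak = max(peak, active)
            time.sleep(0.05)  # stay inside long enough for other threads to queue up
            with guard:
                active -= 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


def over_release_raises():
    """Return True if releasing a never-acquired BoundedSemaphore raises ValueError."""
    sem = threading.BoundedSemaphore(1)
    try:
        sem.release()
    except ValueError:
        return True
    return False


print(max_concurrency(), over_release_raises())
```

The observed peak never exceeds the limit passed to the constructor.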

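4) Condition variable

A condition variable synchronizes threads on the state of the shared resource itself. Say there is a global variable a; when a is greater than 1 it should be handled by thread X, when a is less than 1 by thread Y, and either thread may change a while running. A mutex is enough for this: thread X takes the mutex, and if a is less than 1 it skips its logic and releases the mutex, while if a is greater than 1 it runs its logic and then releases; thread Y works the same way. But then a thread may grab the mutex many times only to find its condition unmet and release at once, which wastes performance. The condition variable provided by the operating system comes with a mutex but offers finer-grained operations, giving a synchronization mechanism based on resource state.

The example is rather abstract, so it helps to compare Python's Lock with its Condition object by implementing the same logic with both. Besides the mutex interface acquire and release, Condition offers three finer-grained methods. wait blocks the caller at that point and releases the mutex, re-acquiring it before returning. notify wakes one thread blocked in wait on this object but does not release the mutex. notify_all (notifyAll is the older, deprecated spelling) wakes all waiting threads. First, an example that synchronizes on a shared resource using Condition.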

from threading import Thread, Condition, current_thread

cond = Condition()
condVar = 2

def t1():
  global condVar
  with cond:   # take the lock once; wait() releases it and re-acquires it on wake-up
    while True:
      if condVar == 1:
        condVar = 2
        print("{} changed the condVar".format(current_thread()))
        cond.notify()   # wake the other thread
      else:
        print("{} get the lock,but cannot change the condVar".format(current_thread()))
      cond.wait()       # sleep until notified

def t2():
  global condVar
  with cond:
    while True:
      if condVar == 2:
        condVar = 1
        print("{} changed the condVar".format(current_thread()))
        cond.notify()
      else:
        print("{} get the lock,but cannot change the condVar".format(current_thread()))
      cond.wait()

for target in [t1, t2]:
  Thread(target=target).start()

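The code above shows why a condition variable needs a built-in mutex: here the global condVar has to be protected, and without a mutex inside the condition variable some scenarios simply could not be handled. Condition variables have other, more powerful uses as well. Next, the same logic is implemented with only a Lock. Running it shows that threads frequently grab the mutex and release it straight away.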

from threading import Thread, Lock, current_thread

lock = Lock()
condVar = 2

def t1():
  global condVar
  while True:
    lock.acquire()
    if condVar == 1:
      condVar = 2
      print("{} changed the condVar".format(current_thread()))
    else:
      # The lock was taken for nothing; it is released again right away.
      print("{} get the lock,but cannot change the condVar".format(current_thread()))
    lock.release()

def t2():
  global condVar
  while True:
    lock.acquire()
    if condVar == 2:
      condVar = 1
      print("{} changed the condVar".format(current_thread()))
    else:
      print("{} get the lock,but cannot change the condVar".format(current_thread()))
    lock.release()

for target in [t1, t2]:
  Thread(target=target).start()
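
Returning to the condition variable: a common usage pattern is to re-check the shared state in a loop around wait, because a woken thread has no guarantee that the state is still what it needs. The sketch below shows this with a small producer and consumer; the function name `produce_and_consume` and the queue-based setup are assumptions made for illustration.

```python
import threading
from collections import deque


def produce_and_consume(items):
    """Pass items from a producer thread to a consumer thread through a shared queue."""
    cond = threading.Condition()
    queue = deque()
    received = []

    def consumer():
        for _ in range(len(items)):
            with cond:
                while not queue:   # re-check the state after every wake-up
                    cond.wait()    # releases the lock while blocked
                received.append(queue.popleft())

    def producer():
        for item in items:
            with cond:
                queue.append(item)
                cond.notify()      # wake the consumer; the lock is released on exit

    c = threading.Thread(target=consumer)
    p = threading.Thread(target=producer)
    c.start()
    p.start()
    p.join()
    c.join()
    return received


print(produce_and_consume(["a", "b", "c"]))
```

Unlike the Lock-only version, the consumer here does not spin through acquire and release while the queue is empty; it sleeps inside wait until the producer notifies it.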

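5) Event

The event mechanism keeps threads that are "waiting for the event" blocked until the event "occurs".

Python provides the Event object. wait(timeout) blocks the thread at that point until the event is set or the timeout expires. set() marks the Event as "set", which unblocks every thread waiting on it. clear() marks the Event as "not set", so any thread that reaches wait blocks. is_set() (isSet() is the older, deprecated spelling) returns whether the Event is currently set.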

from threading import Thread, Event, current_thread
from time import sleep

event = Event()

def t1():
  # Toggle the event: set for one second, cleared for one second.
  while True:
    event.set()
    sleep(1)
    event.clear()
    sleep(1)

def t():
  while True:
    event.wait()   # returns immediately while the event is set
    print("{} get the event".format(current_thread()))

for target in [t1, t, t, t]:
  Thread(target=target).start()
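
The return value of wait distinguishes a timeout from a real event, which the endless example does not show. The sketch below, with a hypothetical helper named `event_demo`, walks through the four methods once and collects what they return.

```python
import threading


def event_demo():
    """Collect the results of wait/is_set before and after set() and clear()."""
    event = threading.Event()
    results = []

    # Nobody has set the event, so wait gives up after the timeout and returns False.
    results.append(event.wait(timeout=0.05))

    def waiter():
        # Returns True as soon as set() is called (or at once if already set).
        results.append(event.wait(timeout=5))

    t = threading.Thread(target=waiter)
    t.start()
    event.set()
    t.join()

    results.append(event.is_set())  # still set until clear() is called
    event.clear()
    results.append(event.is_set())
    return results


print(event_demo())
```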

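6) Timer

The thread waits for a period of time and stays blocked until it has elapsed. On non-real-time operating systems such as Linux or Windows, the actual waiting time of a timer usually deviates somewhat from the requested one.

Python's Timer is a subclass of Thread. The only method it adds is cancel(), which stops the timer if its function has not started yet. Its constructor takes an additional interval argument, the delay in seconds between the call to start() and the execution of the function.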

from threading import Timer, current_thread

def t():
  print(current_thread())

Timer(1.0, t).start()   # runs t after about 1 second
Timer(1.5, t).start()   # runs t after about 1.5 seconds
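
A Timer that has been started can still be stopped with cancel() as long as its function has not begun to run. The sketch below uses a hypothetical helper named `timer_demo` with arbitrarily chosen intervals; one timer is left to fire and the other is cancelled.

```python
import threading


def timer_demo():
    """Start two timers, cancel one, and return the labels of those that fired."""
    fired = []

    kept = threading.Timer(0.05, fired.append, args=["kept"])
    cancelled = threading.Timer(5.0, fired.append, args=["cancelled"])
    kept.start()
    cancelled.start()
    cancelled.cancel()   # stops the timer before its function runs

    kept.join()          # wait for the first timer to fire
    cancelled.join()     # returns promptly because the timer was cancelled
    return fired


print(timer_demo())
```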
