LRUCache的实现原理及利用python实现的方法

2023-09-20 18:54:06 69

简介

LRU(LeastRecentlyUsed)最近最少使用，最近有时间和空间最近的歧义，所以我更喜欢叫它近期最少使用算法。它的核心思想是，如果一个数据被访问过，我们有理由相信它在将来被访问的概率就越高。于是当LRU缓存达到设定的最大值时将缓存中近期最少使用的对象移除。LRUCache内部使用LinkedHashMap来存储key-value键值对，并将LinkedHashMap设置为访问顺序来体现LRU算法。

无论是对某个key的get，还是set都算做是对该key的一次使用。当set一个不存在的key，并且LRUCache中key的数量超过cachesize的时候，需要将使用时间距离现在最长的那个key从LRUCache中清除。

LRUCache实现

在Java中，LRUCache是通过LinkedHashMap实现的。鄙人照猫画虎，实现一个Python版的LRUCache（可能和其他大神的实现有所区别）。

首先，需要说明的是：

LRUCache对象内部会维护一个双端循环链表的头节点

LRUCache对象内部会维护一个dict

内部dict的value都是Entry对象，每个Entry对象包含：

key的hash_code（hash_code=hash(key)，在本实现中，hash_code相同的不同key，会被当作一个key来处理。因此，对于自定义类，应该实现魔术方法：__hash__）
v-(key,value)对中的value
prev-前一个对象
next-后一个对象

具体实现是：

当从LRUCache中get一个key的时候：

计算该key的hash_code
从内部dict中获取到entry
将该entry移动到双端循环链表的第一个位置
返回entry.value

当向LRUCache中set一个(key,value)对的时候：

计算该key的hash_code，

从LRUCache的内部dict中，取出该hash_code对应的old_entry（可能不存在），然后根据(key,value)对生成一个new_entry，之后执行：

dict[hash_code]=new_entry
将new_entry提到双端循环链表的第一个位置
如果old_entry存在，则从链表中删除old_entry
如果是新增了一个(key,value)对，并且cache中key的数量超过了cachesize，那么将双端链表的最后一个元素删除（该元素就是那个最近最少被使用的元素），并且从内部dict中删除该元素

HashMap的实现原理

（面试过程中也经常会被问到）：数组和链表组合成的链表散列结构，通过hash算法，尽量将数组中的数据分布均匀，如果hashcode相同再比较equals方法，如果equals方法返回false，那么就将数据以链表的形式存储在数组的对应位置，并将之前在该位置的数据往链表的后面移动，并记录一个next属性，来指示后移的那个数据。

注意：数组中保存的是entry（其中保存的是键值）

Python实现

classEntry:
def__init__(self,hash_code,v,prev=None,next=None):
self.hash_code=hash_code
self.v=v
self.prev=prev
self.next=next

def__str__(self):
return"Entry{hash_code=%d,v=%s}"%(
self.hash_code,self.v)
__repr__=__str__

classLRUCache:
def__init__(self,max_size):
self._max_size=max_size
self._dict=dict()
self._head=Entry(None,None)
self._head.prev=self._head
self._head.next=self._head

def__setitem__(self,k,v):
try:
hash_code=hash(k)
exceptTypeError:
raise

old_entry=self._dict.get(hash_code)
new_entry=Entry(hash_code,v)
self._dict[hash_code]=new_entry

ifold_entry:
prev=old_entry.prev
next=old_entry.next
prev.next=next
next.prev=prev

head=self._head
head_prev=self._head.prev
head_next=self._head.next

head.next=new_entry
ifhead_previshead:
head.prev=new_entry
head_next.prev=new_entry
new_entry.prev=head
new_entry.next=head_next

ifnotold_entryandlen(self._dict)>self._max_size:
last_one=head.prev
last_one.prev.next=head
head.prev=last_one.prev
self._dict.pop(last_one.hash_code)

def__getitem__(self,k):
entry=self._dict[hash(k)]
head=self._head
head_next=head.next
prev=entry.prev
next=entry.next

ifentry.previsnothead:
ifhead.previsentry:
head.prev=prev
head.next=entry

head_next.prev=entry
entry.prev=head
entry.next=head_next

prev.next=next
next.prev=prev

returnentry.v

defget_dict(self):
returnself._dict

if__name__=="__main__":
cache=LRUCache(2)
inner_dict=cache.get_dict()

cache[1]=1
assertinner_dict.keys()==[1],"test1"
cache[2]=2
assertsorted(inner_dict.keys())==[1,2],"test2"
cache[3]=3
assertsorted(inner_dict.keys())==[2,3],"test3"
cache[2]
assertsorted(inner_dict.keys())==[2,3],"test4"
assertinner_dict[hash(2)].next.v==3
cache[4]=4
assertsorted(inner_dict.keys())==[2,4],"test5"
assertinner_dict[hash(4)].v==4,"test6"

总结

以上就是这篇文章的全部内容了，希望本文的内容对大家的学习或者工作具有一定的参考学习价值，如果有疑问大家可以留言交流，谢谢大家对毛票票的支持。

LRUCache的实现原理及利用python实现的方法

热门推荐

随机推荐