https://cis.temple.edu/~jiewu/research/publications/Publication_files/Distributed_Deep_Multi-Agent_Reinforcement_Learning_for_Cooperative_Edge_Caching_in_Internet-of-Vehicles.pdf
This situation may occur because LFU and LRU learn only from the one-step past and operate on simple rules, whereas RL-based edge caching methods learn from the observed history of content demands and concentrate more on the reward that agents can earn than on users' requests.
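To make the "simple rules" concrete, the following is a minimal sketch of the two baseline eviction policies, assuming a fixed-capacity cache of content identifiers with capacity of at least 1. The names `LRUCache`, `LFUCache`, and `hit_ratio` are illustrative and do not come from the paper; the paper's baselines may differ in detail, for example in tie-breaking or in how request counts are maintained.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Evicts the content that was requested least recently."""

    def __init__(self, capacity):
        self.capacity = capacity        # assumed >= 1
        self.items = OrderedDict()      # content id -> None, oldest first

    def request(self, content):
        hit = content in self.items
        if hit:
            self.items.move_to_end(content)       # mark as most recently used
        else:
            if len(self.items) >= self.capacity:
                self.items.popitem(last=False)    # drop least recently used
            self.items[content] = None
        return hit


class LFUCache:
    """Evicts the cached content with the fewest requests seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity        # assumed >= 1
        self.items = {}                 # dict keeps insertion order for tie-breaking
        self.counts = Counter()         # request count per content id

    def request(self, content):
        self.counts[content] += 1
        hit = content in self.items
        if not hit:
            if len(self.items) >= self.capacity:
                # Least frequently requested; ties go to the earliest inserted.
                victim = min(self.items, key=lambda c: self.counts[c])
                del self.items[victim]
            self.items[content] = None
        return hit


def hit_ratio(cache, requests):
    """Fraction of requests served from the cache."""
    hits = sum(cache.request(r) for r in requests)
    return hits / len(requests)


if __name__ == "__main__":
    trace = ["a", "b", "a", "c", "b", "a"]
    print("LRU hit ratio:", hit_ratio(LRUCache(2), trace))
    print("LFU hit ratio:", hit_ratio(LFUCache(2), trace))
```

Neither policy in this sketch models a reward or anticipates future demand: each applies a fixed rule to recency or to request counts, which is the contrast the sentence above draws with RL-based caching.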