Redis 缓存穿透

世界以痛吻我，我仍报之以歌。

什么是缓存穿透

缓存穿透是指查询一个根本不存在的数据，缓存层和存储层都不会命中。

例如现有 1w 个商品，以商品 id 作为 key 进行缓存。当查询一个完全不存在的商品时就会发生缓存穿透。如果遭遇攻击会导致大量请求直接访问存储层，经常运用的两种解决方案如下：

缓存空值
布隆过滤

缓存空值

如果缓存没有命中，查询存储层，如果存储没有命中，缓存空值。通常空值会设置较短的缓存时间，防止缓存过多无用数据。

示例：

public Object getObjectById(Integer id) {
    // 从缓存中获取数据
    Object cacheValue = cache.get(id);
    // 缓存为空
    if (cacheValue == null) {
        // 从数据库中获取
        Object dbValue = db.get(id);
        // 缓存空对象
        cache.set(id, dbValue);
        // 如果存储数据为空，需要设置一个较短的过期时间
        if (dbValue == null) {
            cache.expire(id, 30000);
        }
        return dbValue;
    }
    // 缓存不为空则直接返回
    return cacheValue;
}

布隆过滤

布隆过滤实际上是一个很长的二进制向量和一系列随机映射函数。布隆过滤器可以用于检索一个元素是否在一个集合中，但是有一定的误判。

使用 google 开箱即用的 BloomFilter 示例：

布隆过滤器 maven 依赖

<dependency>
   <groupId>com.google.guava</groupId>
   <artifactId>guava</artifactId>
   <version>29.0-jre</version>
</dependency>

创建布隆过滤器，容量为10000，容错率为0.01

private static final BloomFilter<Integer> bloomFilter = BloomFilter.create(Funnels.integerFunnel(), 10000, 0.01);

添加元素

bloomFilter.put(id);

判断元素是否存在

bloomFilter.mightContain(id);

BloomFilter 不能删除问题

布隆过滤器判断一个元素存在就是判断对应位置是否为1来确定的，但是如果要删除掉一个元素是不能直接把1改成0的，因为这个位置可能存在其它元素。所以常规布隆过滤器是不支持删除的，但 BloomFilter 的变种 CountingBloomFilter 是支持删除的。原理就是在 BloomFilter 的基础上引入了计数器。

可删除布隆过滤器 maven 依赖

<dependency>
    <groupId>com.baqend</groupId>
    <artifactId>bloom-filter</artifactId>
    <version>2.2.4</version>
</dependency>

创建布隆过滤器，其中 countingBits(8) 表示计数器空间大小为8，即最多允许255次重复。如果不传默认是16位大小，即允许65535次重复。

CountingBloomFilter<Integer> countingBloomFilter = new FilterBuilder( 10000, 0.01).countingBits(8).buildCountingBloomFilter();

添加元素

countingBloomFilter.add(1);

删除元素

countingBloomFilter.remove(1);

判断元素是否存在

 countingBloomFilter.contains(1);

redisson 的布隆过滤器

maven 依赖

<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-boot-starter</artifactId>
    <version>3.10.6</version>
</dependency>

创建并初始化

RBloomFilter<String> bf = client.getBloomFilter("bf");
bf.tryInit(10000L,0.01);

添加元素

bf.add("martin");

判断元素是否存在

bf.contains("tom");

测试示例

public static void main(String[] args) {
    // 建立连接
    Config config = new Config();
    config.useSingleServer().setAddress("redis://127.0.0.1:6379").setPassword("password");
    RedissonClient client = Redisson.create(config);
    // 创建并初始化布隆过滤器
    RBloomFilter<String> bf = client.getBloomFilter("bf");
    bf.tryInit(10000L,0.01);
    bf.add("martin");
    // 验证
    System.out.println("martin是否存在："+bf.contains("martin"));
    System.out.println("tom是否存在："+bf.contains("tom"));
}

对比总结

缓存空值

优点	缺点
准确度高，不会出现布隆过滤那样误判的情况	容易导致缓存过多无用数据
无需预设 key，适用场景更广	无法防御短时间超高并发攻击

布隆过滤

优点	缺点
空间效率和查询效率比较高	有一定的误判概率