EF Core中的拆分查询策略

概要

从EF Core 5.0中,引入了拆分查询策略,该策略可以显著的提升多表查询的效率。本文主要介绍该策略的使用场景和基本使用方法。

代码和实现

使用场景

该策略主要使用在涉及多表连接查询的场景。本例的场景是这样,一个银行分行拥有多个设备,例如ATM机,麦当劳优惠劵ATM机或支票读取机。按照设备的不同,每种设备对应一个数据表。

如果查询分行包含的全部设备,需要多个数据表的联接,基本代码如下:

public async Task<List<Branch>> GetBranches() {
     List<Branch> branches = await _context.Set<Branch>()
         .Where(b => b.IsDeleted == false)
         .Include(b => b.Atms)
         .Include(b => b.Cdms)
         .Include(b => b.MCAtms).ToListAsync();
     return branches;
 }

我们先看一下,如果不加拆分策略,生成的单一SQL如下:

SELECT [t].[Id], [t].[Address], [t].[IsDeleted], [t].[Name], [t].[Rowversi
on], [t].[hasChequeService], [t].[hasCreditCardService], [t0].[Id], [t0].[Branch
Id], [t0].[DeviceStatus], [t0].[IsDeleted], [t0].[Name], [t0].[Rowversion], [t0]
.[SupportForeignCurrency], [t1].[Id], [t1].[BranchId], [t1].[CurrencyType], [t1]
.[DeviceStatus], [t1].[IsDeleted], [t1].[Name], [t1].[Rowversion], [t2].[Id], [t
2].[BranchId], [t2].[Campaign], [t2].[Coupon], [t2].[DeviceStatus], [t2].[IsDele
ted], [t2].[Name], [t2].[Rowversion], [t2].[SupportForeignCurrency]
      FROM [tt_branch] AS [t]
      LEFT JOIN [tt_atm] AS [t0] ON [t].[Id] = [t0].[BranchId]
      LEFT JOIN [tt_check_device] AS [t1] ON [t].[Id] = [t1].[BranchId]
      LEFT JOIN [tt_mcatm] AS [t2] ON [t].[Id] = [t2].[BranchId]
      WHERE [t].[IsDeleted] = CAST(0 AS bit)
      ORDER BY [t].[Id], [t0].[Id], [t1].[Id]

我们可以看到,EF Core是使用的左联方式加载相关的设备数据表。

现在我们尝试增加拆分策略,代码如下:

public async Task<List<Branch>> GetBranches() {
   List<Branch> branches = await _context.Set<Branch>()
        .Where(b => b.IsDeleted == false)
        .AsSplitQuery()
        .Include(b => b.Atms)
        .Include(b => b.Cdms)
        .Include(b => b.MCAtms).ToListAsync();
    return branches;
}
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (119ms) [Parameters=[], CommandType='Text', CommandTime
out='30']
      SELECT [t].[Id], [t].[Address], [t].[IsDeleted], [t].[Name], [t].[Rowversi
on], [t].[hasChequeService], [t].[hasCreditCardService]
      FROM [tt_branch] AS [t]
      WHERE [t].[IsDeleted] = CAST(0 AS bit)
      ORDER BY [t].[Id]
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (9ms) [Parameters=[], CommandType='Text', CommandTimeou
t='30']
      SELECT [t0].[Id], [t0].[BranchId], [t0].[DeviceStatus], [t0].[IsDeleted],
[t0].[Name], [t0].[Rowversion], [t0].[SupportForeignCurrency], [t].[Id]
      FROM [tt_branch] AS [t]
      INNER JOIN [tt_atm] AS [t0] ON [t].[Id] = [t0].[BranchId]
      WHERE [t].[IsDeleted] = CAST(0 AS bit)
      ORDER BY [t].[Id]
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeou
t='30']
      SELECT [t0].[Id], [t0].[BranchId], [t0].[CurrencyType], [t0].[DeviceStatus
], [t0].[IsDeleted], [t0].[Name], [t0].[Rowversion], [t].[Id]
      FROM [tt_branch] AS [t]
      INNER JOIN [tt_check_device] AS [t0] ON [t].[Id] = [t0].[BranchId]
      WHERE [t].[IsDeleted] = CAST(0 AS bit)
      ORDER BY [t].[Id]
info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (5ms) [Parameters=[], CommandType='Text', CommandTimeou
t='30']
      SELECT [t0].[Id], [t0].[BranchId], [t0].[Campaign], [t0].[Coupon], [t0].[D
eviceStatus], [t0].[IsDeleted], [t0].[Name], [t0].[Rowversion], [t0].[SupportFor
eignCurrency], [t].[Id]
      FROM [tt_branch] AS [t]
      INNER JOIN [tt_mcatm] AS [t0] ON [t].[Id] = [t0].[BranchId]
      WHERE [t].[IsDeleted] = CAST(0 AS bit)
      ORDER BY [t].[Id]

我们可以看到,生成的查询语句进行了拆分,在查询到现有的分行数据后, 分别对不同的设备表,进行了内联查询。

拆分查询的好处就是每次以内联的方式,只联接一张表,避免同时左联多个可能很大的表,从而引发的性能问题。

另一个好处如下:

public async Task<List<Branch>> GetBranches() {
     List<Branch> branches = await _context.Set<Branch>()
         .Where(b => b.IsDeleted == true)
         .AsSplitQuery()
         .Include(b => b.Atms)
         .Include(b => b.Cdms)
         .Include(b => b.MCAtms).ToListAsync();
     return branches;
 }

如果在第一个表中,没有查询到数据,后面的联表操作也就不会进行,这样如果后面有很大的字典表,根本就不会再去查询,从而提高的查询的性能。

该查询生成的SQL如下:

     info: Microsoft.EntityFrameworkCore.Database.Command[20101]
      Executed DbCommand (256ms) [Parameters=[], CommandType='Text', CommandTime
out='30']
      SELECT [t].[Id], [t].[Address], [t].[IsDeleted], [t].[Name], [t].[Rowversi
on], [t].[hasChequeService], [t].[hasCreditCardService]
      FROM [tt_branch] AS [t]
      WHERE [t].[IsDeleted] = CAST(1 AS bit)
      ORDER BY [t].[Id]

上面的代码中,只有分行的查询,因为分行查询结果为空,所以就直接返回,不需要再进行后面的查询。

拆分查询的副作用

由于拆分策略将原有的单次查询,分割成多次数据库交互查询,每次的查询结果将被放到缓存中,这样如果查询过的数据表,在结果汇总返回之前,又被修改,可能会导致数据一致性的问题。

在分页和排序方面,如果涉及分页,必须保证排序方式的唯一性,如果排序的内容相同,则无法保证每次的查询结果都是一样的,即使数据没有被修改过,也无法保证。所以如果涉及分页,请慎用该策略。