网上很多文章,都说MySQL驱动并没有实现"真正的"batchUpdate,执行的时候还是一条一条按顺序将SQL发送到MySQL服务器,其实这是错误的。
先贴一段源码(基于MySQL 5.1.40驱动),执行batchUpdate的时候最终执行如下方法:executeBatchInternal
protected long[] executeBatchInternal() throws SQLException { synchronized (checkClosed().getConnectionMutex()) { if (this.connection.isReadOnly()) { throw new SQLException(Messages.getString("PreparedStatement.25") + Messages.getString("PreparedStatement.26"), SQLError.SQL_STATE_ILLEGAL_ARGUMENT); } if (this.batchedArgs == null || this.batchedArgs.size() == 0) { return new long[0]; } // we timeout the entire batch, not individual statements int batchTimeout = this.timeoutInMillis; this.timeoutInMillis = 0; resetCancelledState(); try { statementBegins(); clearWarnings(); if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) { if (canRewriteAsMultiValueInsertAtSqlLevel()) { return executeBatchedInserts(batchTimeout); } if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null && this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) { return executePreparedBatchAsMultiStatement(batchTimeout); } } return executeBatchSerially(batchTimeout); } finally { this.statementExecuting.set(false); clearBatch(); } } }
为了测试网上文章说法的正误,本地测试写了一个batch批量执行,跟踪源码的时候发现最后进入到
return executeBatchSerially(batchTimeout);
该方法的javadoc 如是说:
Executes the current batch of statements by executing them one-by-one.
executeBatchSerially核心源码如下:
for (this.batchCommandIndex = 0; this.batchCommandIndex < nbrCommands; this.batchCommandIndex++) { Object arg = this.batchedArgs.get(this.batchCommandIndex); try { if (arg instanceof String) { updateCounts[this.batchCommandIndex] = executeUpdateInternal((String) arg, true, this.retrieveGeneratedKeys); // limit one generated key per OnDuplicateKey statement getBatchedGeneratedKeys(this.results.getFirstCharOfQuery() == 'I' && containsOnDuplicateKeyInString((String) arg) ? 1 : 0); } else { BatchParams paramArg = (BatchParams) arg; //核心代码,for循环执行每一条SQL updateCounts[this.batchCommandIndex] = executeUpdateInternal(paramArg.parameterStrings, paramArg.parameterStreams, paramArg.isStream, paramArg.streamLengths, paramArg.isNull, true); // limit one generated key per OnDuplicateKey statement getBatchedGeneratedKeys(containsOnDuplicateKeyUpdateInSQL() ? 1 : 0); } } catch (SQLException ex) { updateCounts[this.batchCommandIndex] = EXECUTE_FAILED; if (this.continueBatchOnError && !(ex instanceof MySQLTimeoutException) && !(ex instanceof MySQLStatementCancelledException) && !hasDeadlockOrTimeoutRolledBackTx(ex)) { sqlEx = ex; } else { long[] newUpdateCounts = new long[this.batchCommandIndex]; System.arraycopy(updateCounts, 0, newUpdateCounts, 0, this.batchCommandIndex); throw SQLError.createBatchUpdateException(ex, newUpdateCounts, getExceptionInterceptor()); } } }
通过代码分析,也确实是一条一条SQL执行,而不是把batch的SQL发送到服务器
但是
重点来了,执行executeBatchSerially是有条件的,再次贴一下源码:
if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) { ... } return executeBatchSerially(batchTimeout);
也就是说,如果没做任何配置,默认情况下if条件是进不去的,会直接执行if块后边的
return executeBatchSerially(batchTimeout);
那,我们来看一下if条件是什么:
!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()
batchHasPlainStatements默认初始化就是false,可以不用管,重点是connection.getRewriteBatchedStatements()。这个是Connection的一个参数rewriteBatchedStatements,会在读取jdbcUrl的时候读取进来:jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true
我们将jdbcUrl添加一个rewriteBatchedStatements
试试,即变成:jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&rewriteBatchedStatements=true
这时候就会进if块了。进入if块之后,再根据执行的是insert 还是update、 delete,会走不同方法
- 如果是insert语句,满成条件情况下,会整合成形如:"insert into xxx_table values (xx),(yy),(zz)..."这样的语句
- 如果是update\delete语句,满成条件情况下,会整合成形如:"update t set … where id = 1; update t set … where id = 2; update t set … where id = 3 ..."这样的语句
然后分批次发送给MySQL(会有一次发送的package大小限制,所以需要拆分批次)
int maxAllowedPacket = this.connection.getMaxAllowedPacket();
if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) { //insert会在这里进行再次判断 if (canRewriteAsMultiValueInsertAtSqlLevel()) { return executeBatchedInserts(batchTimeout); } //update、delete会在这里进行再次判断 //1. mysql版本>=4.1.0 //2. batchHasPlainStatements为false //3. batch的数量>3 if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null && this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) { return executePreparedBatchAsMultiStatement(batchTimeout); } } return executeBatchSerially(batchTimeout);
在这里总结一下,如果想要达到MySQL真正batchUpdate效果,需要有以下几个条件:
- 需要在jdbcUrl后添加参数rewriteBatchedStatements=true
- this.batchHasPlainStatements 为false
- 如果是update \ delete 语句,还需要mysql版本>=4.1.0,并且batch的数量>3
因此,如果可能的情况下,请在jdbcUrl后添加参数rewriteBatchedStatements=true,尽可能利用上MySQL给我们提供的便利,提高性能。