Better At Oracle

A collection of tutorials, code and tools to help you get better using Oracle


28 July 2016

JDBC Bind By Name is slow for wide tables

With Oracle JDBC statements, you can bind parameters to a query using the traditional bind by position method calls, eg:

stmt.setString(1, "abcdefgh");

Or you can use an Oracle extention called bindAtName, eg:

stmt.setStringAtName("c1", "abcdefgh");

The advantage of the second method, is that it lets you give the bind variables names, and then you don't have to know the position of the bind which makes some code easier.

I was looking at some flame charts recently comparing the performance of two different JDBC insert processes and I noticed a lot of time seems to be spent in the bindAtName calls, so I got to wondering - how big is the overhead with this code saving method?

The summary is that bindByName is much slower than bind by position, at least using ojdbc6. Not only that, but the larger the number of bind variables in the query, the longer it takes to bind each parameter. It seems there is some sort of O(n) operation inside the bindByName calls that is dependent on the bind count.

The Test Program

To eliminate network and actual database calls, I simply prepare a SQL statement and then bind one value to it over and over. I have 3 different tables that are used only to prepare statements against:

  • One with 1 column (ie one bind parameter)
  • One with 400 columns (ie 400 bind parameters)
  • One with 800 columns (ie 800 bind parameters)
import oracle.jdbc.*;
import java.sql.*;

public class BindOnly {

    public static void main(String[] args)
    throws ClassNotFoundException, SQLException
    {
       DriverManager.registerDriver
           (new oracle.jdbc.OracleDriver());

        String url = "jdbc:oracle:thin:@//192.168.33.8:1521/ora12cr1";

        Connection conn =
        DriverManager.getConnection(url,"user","pass");

        conn.setAutoCommit(false);


        // Ugly code to generate the SQL statement  
        String sql = "insert into template_table_800 (";

        int numColumns = 800;
        int numRuns    = 10000000;

        for (int i=1; i<numColumns+1; i++) {
            sql = sql + "c"+i;
            if (i < numColumns) {
                sql = sql + ",";
            }
        }
        sql = sql + ") values (";
        for (int i=1; i<numColumns+1; i++) {
            sql = sql + ":c"+i;
            if (i < numColumns) {
                sql = sql + ",";
            }
        }
        sql = sql + ")";

        System.out.println(sql);

        OraclePreparedStatement stmt = 
          (OraclePreparedStatement)conn.prepareStatement(sql);

        System.out.println("Starting bind by index");

        long start = java.lang.System.currentTimeMillis();

        for (int j=1; j<numRuns; j++) {
          stmt.setString(1, "abcdefgh");  // Bind by position
        }

        long end = java.lang.System.currentTimeMillis();
        System.out.println("Ran for: "+ (end - start));

        stmt.close();

        stmt = (OraclePreparedStatement)conn.prepareStatement(sql);

        System.out.println("Starting bind by name");

        start = java.lang.System.currentTimeMillis();

        for (int j=1; j<numRuns; j++) {
            stmt.setStringAtName("c1", "abcdefgh"); // Bind by Name
        }

        end = java.lang.System.currentTimeMillis();
        System.out.println("Ran for: "+ (end - start));

        stmt.close();
    }
}

Test Results - One Bind

The query used here was one like:

insert into t (c1) values (:c1);

Running the test:

Bind by index Ran for: 472, 206, 455, 471; AVG=401ms

Bind by name Ran for: 1254, 1134, 1258, 1309; AVG=1238ms

So, bindByName is about 3x slower for a query with 1 bind variable. If you are inserting 10M rows like this, you can probably live with the extra second.

Test Results - 400 Binds

The query used here was one like:

insert into t (c1, c2, ... c400) values (:c1, :c2, ... :c400);

Running the test:

Bind by Index ran for: 481, 212, 487, 480; AVG=415ms

Bind by Name ran for: 4216, 4260, 4320, 4252; AVG=4262ms

With 400 binds in the query, we are at a 10x difference in runtime. This test executed 10M binds - if you have a query with 400 binds per query, 10M binds is 25K rows. That means this overhead costs 3.8 seconds per 25K rows, or 3.8 * 400 for 10M, which is 1520 seconds.

Test Results - 800 Binds

The query used here was one like:

insert into t (c1, c2, ... c800) values (:c1, :c2, ... :c800);

Running the test:

Bind by Index ran for: 485, 408, 462, 478; AVG=458ms

Bind By Name ran for: 7035, 7022, 6934, 7005; AVG=6999ms

With 800 binds in the query, we are at a difference in 15x runtime. Using the same logic as above, this costs about 6.5 seconds per 10M binds, so for 10M rows that is 5200 extra seconds. Not good.

Conclusion

If you have a very high performance application, that needs to bind a lot of variables per query, use bind by position. Also, make sure you use JDBC batching.

blog comments powered by Disqus