Friday, September 17, 2021

MERGE Statement for Better Performance

I came across a scenario where I had to insert or update data in a table from a source table, depending on whether the row already exists. The first thing that comes to mind most of the time is something like this.

-- try to update the row first
UPDATE some_table SET
  col_1 = value_1,
  col_2 = value_2,
  col_3 = value_3
WHERE pk_col = some_value;

-- SQL%ROWCOUNT tells us whether the UPDATE touched a row;
-- if not, simply insert the data instead
IF SQL%ROWCOUNT = 0 THEN
  INSERT INTO some_table (col_1, col_2, col_3)
  VALUES (value_1, value_2, value_3);
END IF;
 
I believe most of you have written code like this at least once in your career. It looks alright when only a small number of rows needs to be processed, but what if the number of rows runs into the thousands or millions? When processing that many rows, the first thing that comes to mind is to use a collection, so the above logic becomes something like this:


DECLARE
  TYPE t_tab IS TABLE OF some_table%ROWTYPE;
  l_tab t_tab;
BEGIN
  SELECT *
  BULK COLLECT INTO l_tab
  FROM some_table;

  FOR i IN l_tab.FIRST .. l_tab.LAST
  LOOP
    BEGIN
      -- considering col_1 as the primary key
      INSERT INTO some_other_table (col_1, col_2, col_3)
      VALUES (l_tab(i).col_1, l_tab(i).col_2, l_tab(i).col_3);
    EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
        -- the row already exists, so update it instead
        UPDATE some_other_table SET
          col_2 = l_tab(i).col_2,
          col_3 = l_tab(i).col_3
        WHERE col_1 = l_tab(i).col_1;
    END;
  END LOOP;
END;
/

Now let's rebuild this using the MERGE statement, and then I will show you how to compare and measure the performance of the two approaches.

BEGIN
  MERGE INTO some_other_table a
  USING some_table b
  ON (a.col_1 = b.col_1) -- col_1 is the reference ID in both tables
  WHEN MATCHED THEN
    UPDATE SET
      col_2 = b.col_2,
      col_3 = b.col_3
  WHEN NOT MATCHED THEN
    INSERT (col_1, col_2, col_3)
    VALUES (b.col_1, b.col_2, b.col_3);
END;
/

This looks short and simple: there is no need for BULK COLLECT or a collection, and rather than going row by row, the MERGE operation works on sets of data, which performs better in many scenarios.
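As a bonus, SQL%ROWCOUNT still works after a MERGE: it reports the combined number of rows inserted and updated. A minimal sketch, reusing the same example tables as above:

```sql
BEGIN
  MERGE INTO some_other_table a
  USING some_table b
  ON (a.col_1 = b.col_1)
  WHEN MATCHED THEN
    UPDATE SET
      col_2 = b.col_2,
      col_3 = b.col_3
  WHEN NOT MATCHED THEN
    INSERT (col_1, col_2, col_3)
    VALUES (b.col_1, b.col_2, b.col_3);

  -- SQL%ROWCOUNT here is the total of inserted plus updated rows
  DBMS_OUTPUT.put_line('Rows merged: ' || SQL%ROWCOUNT);
END;
/
```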

Use the script below, edited to fit your own tables, to measure and compare the performance of the two approaches:

DECLARE
  TYPE t_tab IS TABLE OF some_table%ROWTYPE;
  l_tab   t_tab;
  v_start NUMBER;
BEGIN

  -- timing the MERGE operation
  -- (DBMS_UTILITY.get_time returns hundredths of a second)
  v_start := DBMS_UTILITY.get_time;

  MERGE INTO some_other_table a
  USING some_table b
  ON (a.col_1 = b.col_1) -- col_1 is the reference ID in both tables
  WHEN MATCHED THEN
    UPDATE SET
      col_2 = b.col_2,
      col_3 = b.col_3
  WHEN NOT MATCHED THEN
    INSERT (col_1, col_2, col_3)
    VALUES (b.col_1, b.col_2, b.col_3);

  DBMS_OUTPUT.put_line('MERGE operation: ' || ((DBMS_UTILITY.get_time - v_start) / 100) || ' sec');

  ROLLBACK;

  -- timing the BULK COLLECT/collection approach
  v_start := DBMS_UTILITY.get_time;

  SELECT *
  BULK COLLECT INTO l_tab
  FROM some_table;

  FOR i IN l_tab.FIRST .. l_tab.LAST
  LOOP
    BEGIN
      -- considering col_1 as the primary key
      INSERT INTO some_other_table (col_1, col_2, col_3)
      VALUES (l_tab(i).col_1, l_tab(i).col_2, l_tab(i).col_3);
    EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
        UPDATE some_other_table SET
          col_2 = l_tab(i).col_2,
          col_3 = l_tab(i).col_3
        WHERE col_1 = l_tab(i).col_1;
    END;
  END LOOP;

  DBMS_OUTPUT.put_line('Using BULK COLLECT and collection: ' || ((DBMS_UTILITY.get_time - v_start) / 100) || ' sec');

  ROLLBACK;
END;
/
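One practical note: the timings above are printed with DBMS_OUTPUT, so remember to enable server output in your client (SQL*Plus or SQL Developer) before running the script, otherwise you will see nothing:

```sql
SET SERVEROUTPUT ON
```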
