Stop Processing One Row at a Time
If you have been writing Oracle PL/SQL for any length of time, you have probably encountered the classic pattern of opening a cursor, looping through it row by row, and performing some operation on each record. While this approach is perfectly valid and will produce correct results, it carries a significant performance penalty that many developers overlook until their batch jobs start taking hours instead of minutes.
The core issue is context switching. Every time your PL/SQL block fetches a single row from a SQL cursor, control passes from the PL/SQL engine to the SQL engine and back again. On a small data set this overhead is negligible, but when you are processing hundreds of thousands or millions of rows, those tiny delays add up to a massive drag on throughput.
Enter BULK COLLECT
Oracle introduced the BULK COLLECT clause to address exactly this problem. Instead of fetching one row at a time, BULK COLLECT retrieves multiple rows in a single context switch and deposits them into a PL/SQL collection (typically an associative array, nested table, or VARRAY). The reduction in context switches can easily produce a 5x to 20x speed improvement on large data sets.
Here is a simple before and after comparison:
The Slow Way (Row by Row)
DECLARE
   CURSOR c_employees IS
      SELECT employee_id, salary FROM employees;
   l_emp c_employees%ROWTYPE;
BEGIN
   OPEN c_employees;
   LOOP
      FETCH c_employees INTO l_emp;
      EXIT WHEN c_employees%NOTFOUND;
      -- process each row individually
      UPDATE employees
         SET salary = l_emp.salary * 1.05
       WHERE employee_id = l_emp.employee_id;
   END LOOP;
   CLOSE c_employees;
END;
/
The Fast Way (BULK COLLECT with LIMIT)
DECLARE
   CURSOR c_employees IS
      SELECT employee_id, salary FROM employees;
   TYPE t_emp_tab IS TABLE OF c_employees%ROWTYPE;
   l_employees  t_emp_tab;
   c_batch_size CONSTANT PLS_INTEGER := 1000;
BEGIN
   OPEN c_employees;
   LOOP
      -- one context switch fetches up to c_batch_size rows
      FETCH c_employees BULK COLLECT INTO l_employees LIMIT c_batch_size;
      EXIT WHEN l_employees.COUNT = 0;
      -- one context switch updates the whole batch
      FORALL i IN 1 .. l_employees.COUNT
         UPDATE employees
            SET salary = l_employees(i).salary * 1.05
          WHERE employee_id = l_employees(i).employee_id;
      COMMIT;  -- commit per batch; fetching across commits can raise ORA-01555 on busy systems
   END LOOP;
   CLOSE c_employees;
END;
/
Why the LIMIT Clause Matters
You might be tempted to use BULK COLLECT without a LIMIT, which would load every row from the cursor into memory at once. For a table with 50 rows, that is fine. For a table with 50 million rows, you risk exhausting your PGA memory allocation and having the session fail with an ORA-04030 error.
The LIMIT clause tells Oracle to fetch only a specified number of rows per iteration. A value of 1000 is a common starting point. The exact optimal number depends on your row size and available PGA memory. Larger batches reduce context switches further, but consume more memory. In practice, values between 500 and 5000 cover most use cases well. Beyond 5000, the incremental improvement in speed usually tapers off while memory consumption continues to climb.
Pairing BULK COLLECT with FORALL
BULK COLLECT handles the read side of bulk processing. On the write side, the FORALL statement performs DML operations on entire collections in a single context switch. When you combine the two, you create a pipeline that minimizes context switching on both the read and write paths. This combination is the gold standard for high volume data processing in PL/SQL.
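One refinement worth knowing on the write side: FORALL supports a SAVE EXCEPTIONS clause so that a single bad row does not abort the entire batch. A minimal sketch of the pattern (the table and the 5% raise mirror the examples above; the error handling follows the documented SQL%BULK_EXCEPTIONS attribute):

DECLARE
   TYPE t_id_tab IS TABLE OF employees.employee_id%TYPE;
   l_ids        t_id_tab;
   e_bulk_error EXCEPTION;
   PRAGMA EXCEPTION_INIT(e_bulk_error, -24381);  -- ORA-24381: errors in array DML
BEGIN
   SELECT employee_id BULK COLLECT INTO l_ids FROM employees;

   BEGIN
      FORALL i IN 1 .. l_ids.COUNT SAVE EXCEPTIONS
         UPDATE employees
            SET salary = salary * 1.05
          WHERE employee_id = l_ids(i);
   EXCEPTION
      WHEN e_bulk_error THEN
         -- SQL%BULK_EXCEPTIONS records the index and error code of each failed row
         FOR j IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
            DBMS_OUTPUT.PUT_LINE(
               'Row '        || SQL%BULK_EXCEPTIONS(j).ERROR_INDEX ||
               ' failed: ORA-' || SQL%BULK_EXCEPTIONS(j).ERROR_CODE);
         END LOOP;
   END;
END;
/

The surviving rows stay updated; only the failures are reported, which is usually what you want in a long-running batch job.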
Version Notes
BULK COLLECT has been available since Oracle 8i, but subsequent releases have improved bulk processing further. Since Oracle 10g, the PL/SQL compiler (at the default PLSQL_OPTIMIZE_LEVEL of 2) automatically rewrites cursor FOR loops to bulk-fetch 100 rows at a time behind the scenes. This means that simple FOR rec IN (SELECT ...) loops are faster than they were in 8i and 9i, but explicit BULK COLLECT with a tuned LIMIT value still outperforms the implicit optimization on large volumes because you control the batch size and can pair it with FORALL for DML.
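For comparison, the implicitly optimized loop looks like ordinary row-by-row code:

BEGIN
   -- Reads like row-by-row processing, but since 10g the compiler
   -- fetches 100 rows per round trip at PLSQL_OPTIMIZE_LEVEL 2.
   FOR rec IN (SELECT employee_id, salary FROM employees) LOOP
      DBMS_OUTPUT.PUT_LINE(rec.employee_id || ': ' || rec.salary);
   END LOOP;
END;
/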
In Oracle 19c through 23ai, PGA memory management has been refined, so you may find you can use slightly larger LIMIT values without running into memory pressure. Always test with your actual data volumes and monitor PGA usage via V$PROCESS_MEMORY or V$PGASTAT when tuning the batch size.
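For a quick instance-wide view of PGA consumption during a test run, a query along these lines works (the statistic names come from the documented V$PGASTAT view):

-- Summary of PGA usage for the instance, in megabytes
SELECT name, ROUND(value / 1024 / 1024, 1) AS mb
  FROM v$pgastat
 WHERE name IN ('total PGA allocated',
                'maximum PGA allocated',
                'total PGA inuse');

Watch how "total PGA allocated" moves as you raise the LIMIT value between benchmark runs.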
A Practical Guideline
Start with a LIMIT of 1000. Benchmark your process, then double the limit and benchmark again. Keep increasing until you see diminishing returns on elapsed time or until PGA memory consumption becomes a concern. Document the chosen value with a comment explaining why it was selected. Future developers maintaining your code will thank you.
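The benchmark itself can be as simple as timing each candidate batch size with DBMS_UTILITY.GET_TIME, which returns elapsed time in hundredths of a second. A sketch, assuming a hypothetical process_employees procedure that accepts the batch size as a parameter:

DECLARE
   l_start PLS_INTEGER;
   l_limit PLS_INTEGER := 1000;
BEGIN
   FOR i IN 1 .. 4 LOOP  -- try LIMIT 1000, 2000, 4000, 8000
      l_start := DBMS_UTILITY.GET_TIME;
      process_employees(p_batch_size => l_limit);  -- hypothetical procedure
      DBMS_OUTPUT.PUT_LINE(
         'LIMIT ' || l_limit || ': ' ||
         (DBMS_UTILITY.GET_TIME - l_start) / 100 || ' seconds');
      l_limit := l_limit * 2;
   END LOOP;
END;
/

Run each size a few times and compare averages; one-off timings on a shared database can be noisy.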
Key Takeaways
Row by row processing is the slowest approach for large data sets because of context switching overhead. BULK COLLECT reduces read side context switches by fetching many rows at once. The LIMIT clause prevents memory exhaustion on large tables. FORALL reduces write side context switches. Together, BULK COLLECT with LIMIT and FORALL form the most efficient pattern for high volume PL/SQL data processing.