Beyond Regular Expressions: Row Pattern Matching
Most developers are familiar with regular expressions for matching patterns in text. Oracle 12c introduced MATCH_RECOGNIZE, which brings regular expression style pattern matching to rows of data. Instead of matching characters in a string, you match sequences of rows based on conditions you define. This is incredibly powerful for identifying trends, detecting anomalies, and recognizing complex event patterns in time series data.
A Practical Example: Stock Price Patterns
Suppose you want to find “V shaped” recoveries in stock prices, meaning a sequence where the price drops for several consecutive days and then rises for several consecutive days:
SELECT *
FROM stock_prices
MATCH_RECOGNIZE (
PARTITION BY ticker
ORDER BY trade_date
MEASURES
FIRST(DOWN.trade_date) AS decline_start,
LAST(DOWN.trade_date) AS decline_end,
LAST(UP.trade_date) AS recovery_end,
FIRST(DOWN.close_price) AS start_price,
LAST(DOWN.close_price) AS bottom_price,
LAST(UP.close_price) AS recovery_price
ONE ROW PER MATCH
PATTERN (DOWN{2,} UP{2,})
DEFINE
DOWN AS close_price < PREV(close_price),
UP AS close_price > PREV(close_price)
) mr;
Breaking Down the Syntax
The PARTITION BY clause divides the data into groups (similar to analytic functions). ORDER BY establishes the sequence within each partition. The PATTERN clause uses regular expression notation: DOWN{2,} means “two or more consecutive DOWN rows” and UP{2,} means “two or more consecutive UP rows.” The DEFINE clause specifies what makes a row qualify as DOWN or UP.
The MEASURES clause extracts data from the matched pattern, using functions like FIRST(), LAST(), and COUNT() within pattern variables. ONE ROW PER MATCH tells Oracle to return a single summary row per match rather than all the individual rows.
Real World Applications
MATCH_RECOGNIZE excels at detecting fraud patterns in transaction data (for example, a rapid sequence of small transactions followed by a large withdrawal), identifying equipment failure patterns in sensor data (a series of readings that exceed normal thresholds), tracking user session behavior in web analytics (pages visited in a specific sequence), and recognizing trading patterns for regulatory compliance monitoring.
Session Detection Example
-- Identify user sessions: sequences of page views with gaps under 30 minutes
SELECT *
FROM page_views
MATCH_RECOGNIZE (
PARTITION BY user_id
ORDER BY view_time
MEASURES
FIRST(view_time) AS session_start,
LAST(view_time) AS session_end,
COUNT(*) AS pages_viewed
ONE ROW PER MATCH
PATTERN (START CONTINUED*)
DEFINE
CONTINUED AS view_time - PREV(view_time) < INTERVAL '30' MINUTE
) sessions;
Performance and Version Notes
MATCH_RECOGNIZE is processed by the SQL engine, not PL/SQL, so it benefits from all the SQL optimizer's capabilities including parallel execution. It is generally far more efficient than attempting to implement the same logic with PL/SQL cursor loops or self joins. Available from Oracle 12c Release 1 onward, its optimizer support has been refined in each subsequent release. In 19c and later, the optimizer is better at estimating cardinality for MATCH_RECOGNIZE operations, which leads to better execution plans when the MATCH_RECOGNIZE output feeds into joins or aggregations.