PTF’S – Row Sets Deep Dive Part 2

The previous post demonstrated multiple row sets being processed by a single PTF execution. This one expands on that by showing how each row set can have its own replication factor. Here is the new FETCH_ROWS procedure I’ll use:

PROCEDURE fetch_rows IS
  v_rowset DBMS_TF.ROW_SET_T;
  v_added DBMS_TF.COLUMN_DATA_T;
  v_env DBMS_TF.ENV_T;
BEGIN
  v_env := DBMS_TF.GET_ENV;
  DBMS_OUTPUT.PUT_LINE('ROWCOUNT = ' || v_env.row_count);
  v_added.tab_varchar2(1) := 'Added Col';
  v_rowset(1) := v_added;
  --
  -- Set the replication factor equal to the number of
  -- rows in the rowset
  --
  DBMS_TF.ROW_REPLICATION(replication_factor => 
                               v_env.row_count );
  DBMS_TF.PUT_ROW_SET( rowset => v_rowset);
END;

And here is the query and results:

SELECT added_col,
       COUNT(*)
  FROM poly_func(table1)
GROUP BY added_col;

ADDED_COL         COUNT(*)
--------------- ----------
Added Col                2
                   1048575

ROWCOUNT = 1024
ROWCOUNT = 1

The first rowset

  • was 1,024 rows
  • had a replication factor of 1,024
  • became 1,048,576 rows after applying the replication factor of 1.024
  • the ADDED_COL for the first row was “Added Col” while the rest remained null
  • contributed 1,048,575 rows to the NULL count
  • contributed 1 row to the “Added Col” count

The second rowset

  • was 1 row
  • had a replication factor of 1
  • became 1 row after applying the replication factor of 1
  • the ADDED_COL for the first, and only,  row was “Added Col”
  • contributed 1 row to the “Added Col” count

Thanks for reading

PTF’S – Row Sets Deep Dive Part 1

The previous post used a PTF to populate row number one from a one row table. It also did this in row set number one from the first, and only execution of the FETCH_ROWS procedure. This post shows the simplest way to increase executions of FETCH_ROWS — increase the number of rows to process.

I’ll use this query to vaidate the changes. It shows the count of rows with each value in the ADDED_COL. Here’s the results using the code from the previous post.

SELECT added_col,
      COUNT(*)
 FROM poly_func(table1)
GROUP BY added_col;

ADDED_COL         COUNT(*)
--------------- ----------
Added Col                1

There is only 1 row so it shows with a count of 1.

Next I’ll add another row.

INSERT INTO table1
VALUES(2018);

And query the counts again.

SELECT added_col,
       COUNT(*)
  FROM poly_func(table1)
GROUP BY added_col;

ADDED_COL         COUNT(*)
--------------- ----------
Added Col                1
                         1

It shows 1 row (the very first one in the rowset) with an ADDED_COL value of “Added Col” and 1 row with a NULL value.

Next I’ll set the table up with 1025 rows.

DELETE table1;
BEGIN
  FOR counter IN 1..1025 LOOP
    INSERT INTO table1
    VALUES(counter);
  END LOOP;
END;

And run the count query again.

SELECT added_col,
       COUNT(*)
  FROM poly_func(table1)
GROUP BY added_col;

ADDED_COL         COUNT(*)
--------------- ----------
Added Col                2
                      1023

The results show 2 rows with an ADDED_COL value of “Added Col” – that must mean two row sets from two executions of the FETCH_ROWS procedure. I’ll augment the FETCH_ROWS procedure to verify it.

PROCEDURE fetch_rows IS
  v_rowset DBMS_TF.ROW_SET_T;
  v_added DBMS_TF.COLUMN_DATA_T;
  v_env DBMS_TF.ENV_T;
BEGIN
  v_env := DBMS_TF.GET_ENV;
  DBMS_OUTPUT.PUT_LINE('ROWCOUNT = ' || v_env.row_count);
  v_added.tab_varchar2(1) := 'Added Col';
  v_rowset(1) := v_added;
  DBMS_TF.PUT_ROW_SET( rowset => v_rowset);
END;

And run the query again.

SELECT added_col,
       COUNT(*)
  FROM poly_func(table1)
GROUP BY added_col

ADDED_COL         COUNT(*)
--------------- ----------
Added Col                2
                      1023
ROWCOUNT = 1024
ROWCOUNT = 1

Sure enough FETCH_ROWS was run twice. Why was 1025 the number of rows that caused this? At this point I don’t know. But stay tuned for more details of row sets.

Thanks for reading.

Are PTF Added Columns Visible – Part 2

In a previous post I showed how columns added by a PTF can be used by the outside world (or a simple Oracle view masquerading as the outside world). This post discusses how added columns are seen, or not seen, by the inside world of the Oracle optimizer.

-- a demo table
CREATE TABLE table1
( col1 NUMBER NOT NULL PRIMARY KEY );

-- with demo data
INSERT INTO table1
VALUES(2019);

-- and a PTF package and body
CREATE OR REPLACE PACKAGE poly_pkg AS
  FUNCTION describe (p_tbl IN OUT DBMS_TF.TABLE_T )
  RETURN DBMS_TF.DESCRIBE_T;
  PROCEDURE fetch_rows;
END;

CREATE OR REPLACE PACKAGE BODY poly_pkg AS
  FUNCTION describe (p_tbl IN OUT DBMS_TF.TABLE_T )
  RETURN DBMS_TF.DESCRIBE_T AS
    v_new_col1 DBMS_TF.COLUMN_METADATA_T;
    v_new_cols DBMS_TF.COLUMNS_NEW_T;
  BEGIN
    -- add a varchars(15) column named ADDED_COL
    -- to the result set of this PTF
    v_new_col1 := DBMS_TF.COLUMN_METADATA_T( 
                   type => DBMS_TF.TYPE_VARCHAR2,
                   name => 'ADDED_COL',
                   max_len => 15 );
     v_new_cols := DBMS_TF.COLUMNS_NEW_T( 1 => v_new_col1 );
     RETURN DBMS_TF.DESCRIBE_T ( new_columns => v_new_cols );
  END;

  PROCEDURE fetch_rows IS
    v_rowset DBMS_TF.ROW_SET_T;
    v_added DBMS_TF.COLUMN_DATA_T;
  BEGIN
    -- populate the new column with hard
    -- coded text
    v_added.tab_varchar2(1) := 'Added Col';
    v_rowset(1) := v_added;
    DBMS_TF.PUT_ROW_SET( rowset => v_rowset);
  END;

END;

-- last but not least the PTF
CREATE OR REPLACE FUNCTION poly_func(p_tbl IN TABLE )
  RETURN TABLE PIPELINED
  ROW POLYMORPHIC USING poly_pkg;

Now a query using the added column as criteria.

SELECT *
  FROM poly_func(table1)
 WHERE added_col = 'Added Col';

COL1 ADDED_COL
---------- ---------------
2019 Added Col

What did the optimizer do with that query? Lets look at the explain plan:

Plan hash value: 1059882907
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 22 | 2 (0)| 00:00:01 |
|* 1 | POLYMORPHIC TABLE FUNCTION | POLY_FUNC | 1 | 13 | | |
| 2 | INDEX FAST FULL SCAN | SYS_C008053 | 1 | 13 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ADDED_COL"='Added Col')
Note
-----
- dynamic statistics used: dynamic sampling (level=2)

In simplest terms it shows a select was executed against the PTF and an index was used to find the required rows. A filter was then applied using the value from the WHERE clause.

Wait – was the index on the added column? Lets check…

SELECT table_name,
       column_name
  FROM user_ind_columns
 WHERE index_name = 'SYS_C008053;

TABLE_NAME           COLUMN_NAME
-------------------- --------------------
TABLE1               COL1

The index is on the tables primary key – not the added column. This makes sense because the overhead to maintain an index for every single dynamically added column would be onerous.

In future articles I’ll investigate way that may help out the optimizer when working with PTF’s.

Thans for reading!