JSON Index Return Lengths

This is the fourth post investigating nuances of Oracle indexes on JSON data. The previous posts demonstrated how changing the length of returned data in an SQL JSON function call influences Oracle’s choice of index. This post begins to expose why that occurs by creating 3 indexes for 3 different return value lengths and showing how Oracle uses them.

The data set used in these examples was created in the first post here.

--
-- Create an index returning 4 characters
--
SQL> CREATE INDEX json_4 ON json_data jd
 2     ( JSON_VALUE(jd.json_col,'$.col'
 3       RETURNING VARCHAR2(4) ) );
Index created.

--
-- Create an index returning 5 characters
--
SQL> CREATE INDEX json_5 ON json_data jd
 2     ( JSON_VALUE(jd.json_col,'$.col'
 3       RETURNING VARCHAR2(5) ) );
Index created.

--
-- Create an index returning 6 characters
--
SQL> CREATE INDEX json_6 ON json_data jd
 2     ( JSON_VALUE(jd.json_col,'$.col'
 3       RETURNING VARCHAR2(6) ) );
Index created.

--
-- Verify that each index has a different number of rows
-- in it even though its the same table and JSON element.
--
SQL> SELECT index_name,
 2          num_rows
 3     FROM user_indexes
 4    WHERE index_name like 'JSON%';
INDEX_NAME                       NUM_ROWS
------------------------------ ----------
JSON_4                                  9
JSON_5                                 99
JSON_6                                100

--
-- Query the length of the JSON elements to correlate to
-- the number of rows in the index. Note that:
--    JSON_4 includes  9 length 4 rows
--    JSON_5 includes 90 length 5 rows AND
--                     9 length 4 rows
--    JSON_6 includes  1 length 6 rows AND
                      90 length 5 rows AND
                       9 length 4 rows
--
SQL> SELECT LENGTH(JSON_VALUE(jd.json_col,'$.col')) lngth,
 2          COUNT(*)
 3     FROM json_data jd
 4 GROUP BY LENGTH(JSON_VALUE(jd.json_col,'$.col' ))
 5 ORDER BY LENGTH(JSON_VALUE(jd.json_col,'$.col' ));
     LNGTH   COUNT(*)
---------- ----------
         4          9
         5         90
         6          1

--
-- Verify the JSON_4 index is used for VARCHAR2(4)
--
SQL> EXPLAIN PLAN FOR
 2   SELECT JSON_VALUE(jd.json_col,'$.col'
 3          RETURNING VARCHAR2(4) )
 4     FROM json_data jd
 5    WHERE JSON_VALUE(jd.json_col,'$.col'
 6    RETURNING VARCHAR2(4) ) = 'val9';
Explained.

SQL> SELECT *
 2    FROM TABLE(DBMS_XPLAN.DISPLAY(
 3               format => 'BASIC ROWS COST'))
 4   WHERE TRIM(plan_table_output) IS NOT NULL;
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2835480143
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)|
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DATA | 1 | 2 (0)|
| 2 | INDEX RANGE SCAN | JSON_4 | 1 | 1 (0)|
------------------------------------------------------------------------------
8 rows selected.

--
-- Verify the JSON_5 index is used for VARCHAR2(5)
--
SQL> EXPLAIN PLAN FOR
 2    SELECT JSON_VALUE(jd.json_col,'$.col'
 3           RETURNING VARCHAR2(5) )
 4      FROM json_data jd
 5     WHERE JSON_VALUE(jd.json_col,'$.col'
 6     RETURNING VARCHAR2(5) ) = 'val99';

SQL> SELECT *
 2     FROM TABLE(DBMS_XPLAN.DISPLAY(
 3                format => 'BASIC ROWS COST'))
 4   WHERE TRIM(plan_table_output) IS NOT NULL;
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 322595427
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)|
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DATA | 1 | 2 (0)|
| 2 | INDEX RANGE SCAN | JSON_5 | 1 | 1 (0)|
------------------------------------------------------------------------------
8 rows selected.

--
-- Verify the JSON_6 index is used for VARCHAR2(6)
--
SQL> EXPLAIN PLAN FOR
 2   SELECT JSON_VALUE(jd.json_col,'$.col'
 3                     RETURNING VARCHAR2(6) )
 4     FROM json_data jd
 5    WHERE JSON_VALUE(jd.json_col,'$.col'
 6   RETURNING VARCHAR2(6) ) = 'val100';
Explained.

SQL> SELECT *
 2     FROM TABLE(DBMS_XPLAN.DISPLAY(
 3                format => 'BASIC ROWS COST'))
 4    WHERE TRIM(plan_table_output) IS NOT NULL;
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1579104487
--------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)|
--------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1 (0)|
| 1 | INDEX RANGE SCAN| JSON_6 | 1 | 1 (0)|
--------------------------------------------------------

This proves how precise one must be when querying JSON data to ensure the index gets used. In the next post I’ll discuss more nuances to be aware of.

Thanks for reading.

JSON Index Non Reuse

Indexes on JSON data elements  can be used by all SQL JSON functions (JSON_VALUE, JSON_QUERY, etc) using that element as criteria.  But the match must be exact including the RETURNING and ERROR clause. Here’s an example of and index not being re-used.

--
-- Verify the virtual column used for the index
--
SQL> SELECT index_name,
 2          column_name
 3     FROM user_ind_columns
 4    WHERE index_name = 'JSON_COL_IX';
INDEX_NAME COLUMN_NAME
-------------------- --------------------
JSON_COL_IX          SYS_NC00002$

--
-- Verify the JSON index. It uses JSON_QUERY on the
-- COL element in the JSON_COL column. It uses the default
-- return type of VARCHAR2(30) and the default error
-- behavior of NULL ON ERROR
--
SQL> SELECT data_default
 2     FROM user_tab_cols
 3    WHERE column_name = ( SELECT column_name
 4                            FROM user_ind_columns
 5                           WHERE index_name = 'JSON_COL_IX' );
DATA_DEFAULT
-------------------------------------------
JSON_QUERY("JSON_COL" FORMAT JSON , '$.col'
RETURNING VARCHAR2(4000) ASIS WITHOUT ARRAY
WRAPPER NULL ON ERROR)

--
-- Run a JSON_VALUE query that could use the index
-- because it is the same column after all. But it
-- won't because the RETURNING clause differs from the
-- index one (10 characters versus 4000)
--
SQL> SELECT JSON_VALUE(jd.json_col,'$.col') col
 2     FROM json_data jd
 3    WHERE JSON_VALUE(jd.json_col,'$.col'
 4                     RETURNING VARCHAR2(10) ) = 'val33';
COL
----------
val33

--
-- Create an explain plan for the JSON_VALUE query
--
SQL> EXPLAIN PLAN FOR
 2   SELECT JSON_VALUE(jd.json_col,'$.col') col
 3     FROM json_data jd
 4    WHERE JSON_VALUE(jd.json_col,'$.col'
 5                     RETURNING VARCHAR2(10) ) = 'val33';
Explained.

--
-- Check the explain plan. Sure enough the index was
-- not used.
--
SQL> SELECT *
 2     FROM TABLE(DBMS_XPLAN.DISPLAY)
 3    WHERE TRIM(plan_table_output) IS NOT NULL;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3213740116
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2002 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| JSON_DATA | 1 | 2002 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(JSON_VALUE("JD"."JSON_COL" FORMAT JSON , '$.col'
 RETURNING VARCHAR2(10) NULL ON ERROR)='val33')
Note
-----
 - dynamic statistics used: dynamic sampling (level=2)
14 rows selected.

The index did not get used because changing the length of the return type is enough to make Oracle unsure the index will contain all of the possible values. In the next post I’ll delve into this in more detail.

Thanks for reading!

JSON Index Reuse

The first post in this series created an index using Dot Notation to specify the JSON element to index. Behind the scenes Oracle translated the Dot Notation to a JSON_QUERY call to build the index. Subsequent queries using Dot Notation undergo the same translation to use the index.

Can other functions such as JSON_VALUE use the index too? Let’s dive into an example to find out. We’ll use index created in the first post here.

-- Query details of the index column
SQL> SELECT index_name,
 2          column_name
 3     FROM user_ind_columns
 4    WHERE index_name = 'JSON_COL_IX';
INDEX_NAME COLUMN_NAME
-------------------- --------------------
JSON_COL_IX SYS_NC00002$

-- View the JSON_QUERY used for the index
SQL> SELECT data_type,
 2          data_length,
 3          data_default
 4     FROM user_tab_cols
 5    WHERE column_name = ( SELECT column_name
 6                            FROM user_ind_columns
 7                           WHERE index_name = 'JSON_COL_IX' );
DATA_TYPE LNGTH DATA_DEFAULT
--------------- ----- -------------------------
VARCHAR2 4000 JSON_QUERY("JSON_COL" FORMAT JSON , '$.col'
              RETURNING VARCHAR2(4000) ASIS
              WITHOUT ARRAY WRAPPER NULL ON ERROR
1 row selected.

-- Get the query plan for a dot notation query
SQL> EXPLAIN PLAN FOR
 2   SELECT *
 3     FROM json_data jd
 4    WHERE jd.json_col.col = 'val33';
Explained.

--
-- Review the explain plan to verify the index was used
-- Note the "access" closely matches the index column
-- "DATA_DEAFULT" shown above. That's why the index was used
--
SQL> SELECT *
 2     FROM TABLE(DBMS_XPLAN.DISPLAY)
 3    WHERE TRIM(plan_table_output) IS NOT NULL;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 716575500
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4004 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DATA | 1 | 4004 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | JSON_COL_IX | 1 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 2 - access(JSON_QUERY("JSON_COL" FORMAT JSON , '$.col' 
            RETURNING VARCHAR2(4000) ASIS
            WITHOUT ARRAY WRAPPER NULL ON ERROR)='val33')
Note
-----
 - dynamic statistics used: dynamic sampling (level=2)

15 rows selected.

-- Now use JSON_VALUE to find the same record with
-- essentially the same query as above
SQL> SELECT JSON_VALUE(jd.json_col,'$.col' )
 2     FROM json_data jd
 3    WHERE JSON_VALUE(jd.json_col,'$.col') = 'val33';
JSON_VALUE(JD.JSON_COL,'$.COL')
------------------------------------------------------------------------------------------------------------------------------------
val33

-- Get the explain plan for the JSON_VALUE query
SQL> EXPLAIN PLAN FOR
 2   SELECT JSON_VALUE(jd.json_col,'$.col' )
 3     FROM json_data jd
 4    WHERE JSON_VALUE(jd.json_col,'$.col') = 'val33';
Explained.

--
-- Check the query plan to see if the index was used.
-- It was! Note the "filter" shown is directly from the
-- query text while the "access" is what Oracle used to
-- find the row.
--
SQL> SELECT *
 2     FROM TABLE(DBMS_XPLAN.DISPLAY)
 3    WHERE TRIM(plan_table_output) IS NOT NULL;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 716575500
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4004 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| JSON_DATA | 1 | 4004 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | JSON_COL_IX | 1 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(JSON_VALUE("JD"."JSON_COL" FORMAT JSON , '$.col'
            RETURNING VARCHAR2(4000)
            NULL ON ERROR)='val33')
 2 - access(JSON_QUERY("JSON_COL" FORMAT JSON , '$.col'
            RETURNING VARCHAR2(4000) ASIS
            WITHOUT ARRAY WRAPPER NULL ON ERROR)='val33')
Note
-----
 - dynamic statistics used: dynamic sampling (level=2)
17 rows selected.

This proves that JSON indexes usage is not always restricted to the specific JSON syntax used to create it. It will reuse it if possible. In the next post I’ll discuss cases where it may not be used.

Thanks for reading.