2021 – What About Indexes

This post compares index creation and usage for RDBMS data and JSON data. I’ll focus on three specific aspects.

  • Creating Indexes
    • Including JSON Search Indexes
  • Verifying Index Usage
    • Oracle’s EXPLAIN PLAN utility shows if an index was used or not in a query
  • Resolving Index Usage
    • Oracle’s EXPLAIN PLAN utility shows the actual actual criteria used for the query
    • The actual criteria may not match what you typed in
    • Use of SQL Functions such as UPPER and TO_NUMBER may negate index usage

Current Indexes

The patient management systems used for this series of posts have the following indexes:

  • The RDBMS table PATIENT has a unique index named PATIENT_PK on the patient_name column
  • The JSON table JSON_PATIENT has a unique index named J on the $.pname element

Query Using Indexed Columns/Elements

This query joins the RDBMS PATIENT and PATIENT_MRN tables using patient_name as criteria.

SELECT *
  FROM patient p,
       patient_mrn mrn
 WHERE patient_name = 'Patient 33'
   AND p.patient_id = mrn.patient_id;

The query uses the PATIENT_PK to find rows because that is how the tables are joined.. Here is a snippet from the query plan.

TABLE ACCESS BY INDEX ROWID| PATIENT 
  INDEX UNIQUE SCAN | PATIENT_PK

This query accesses JSON data for a particular $.pname element.

SELECT *
  FROM json_patient
 WHERE JSON_VALUE(patient_data,'$.pname') = 'Patient 33';

That query uses the index as shown in this plan snippet.

TABLE ACCESS BY INDEX ROWID| JSON_PATIENT
  INDEX UNIQUE SCAN | J

Queries That Don’t Use Indexed Columns/Elements

This query does not use an index because the UPPER function negates the use of the index on patient_name.

SELECT *
  FROM patient
 WHERE UPPER(patient_name) = 'PATIENT 345';

Here is the plan snippet showing a FULL TABLE SCAN instead of index usage.

TABLE ACCESS FULL| PATIENT

Using JSON functions also negates index usage. For example this query will not use the index on $.pname.

SELECT *
  FROM json_patient
 WHERE UPPER(JSON_VALUE(patient_data,'$.pname')) = 'PATIENT 345';

And the plan snippet.

TABLE ACCESS FULL| JSON_PATIENT

Function Based Indexes

To ensure an index is used for functions just create an index with the function. For example these SQL statements create indexes on uppercase versions of the patient names in our application.

CREATE INDEX uppity_patient ON patient
          ( UPPER(patient_name) );
CREATE INDEX juppity_patient ON json_patient
          ( UPPER(JSON_VALUE(patient_data,'$.pname')));

Here is the plan for the JSON query with the UPPER function.

TABLE ACCESS BY INDEX ROWID BATCHED| JSON_PATIENT
INDEX RANGE SCAN | JUPPITY_PATIENT

And the RDBMS plan.

TABLE ACCESS BY INDEX ROWID BATCHED| PATIENT
INDEX RANGE SCAN | UPPITY_PATIENT

JSON Functions are Just Functions After All

An index created on a JSON_VALUE call wont be used on other calls such as JSON_EXISTS as shown here.

SELECT *
  FROM json_patient
 WHERE JSON_EXISTS(patient_data,'$.pname');

Here’s the plan with the dreaded full table scan.

TABLE ACCESS FULL| JSON_PATIENT

Even adding criteria to the JSON_EXISTS call will not use the index.

SELECT *
  FROM json_patient
 WHERE JSON_EXISTS(patient_data,'$.pname?(@ == "Patient 224")');

Still a full table scan.

TABLE ACCESS FULL| JSON_PATIENT
…Begin Tangent…

It is possible to create an index on a JSON_EXISTS with criteria.

CREATE INDEX favourite_patient
  ON json_patient ( JSON_EXISTS(patient_data,'$.pname?(@ == "Patient 224")') );

The plan.

TABLE ACCESS BY INDEX ROWID BATCHED| JSON_PATIENT
INDEX RANGE SCAN | FAVOURITE_PATIENT

That index is not used for any other patients though.

…End Tangent…

JSON Search Indexes

To better support queries using all JSON functions Oracle provides JSON Search Indexes. Think of them as indexes with in-depth JSON knowledge baked in. Here is how they are created,

CREATE SEARCH INDEX patient_search_idx ON json_patient (patient_data) FOR JSON;

The first point to make is the JSON Search Index does not take over for all queries. For example this query uses index on $.pname because the database believes it will be faster.

SELECT *
  FROM json_patient
 WHERE JSON_VALUE(patient_data,'$.pname') = 'Patient 33';

This query does uses the Search Index.

SELECT 1
  FROM json_patient
 WHERE JSON_VALUE(patient_data,'$.notthere') = 'Patient 54';

The query plan.

TABLE ACCESS BY INDEX ROWID| JSON_PATIENT
DOMAIN INDEX | PATIENT_SEARCH_IDX

This query also uses the search index.

SELECT 1
  FROM json_patient
 WHERE JSON_EXISTS(patient_data,'$.pname');

The plan.

DOMAIN INDEX | PATIENT_SEARCH_IDX

This query uses the Search Index too.

SELECT *
  FROM json_patient
  WHERE JSON_EXISTS(patient_data,'$.pname?(@ == "Patient 224")');

The plan.

TABLE ACCESS BY INDEX ROWID| JSON_PATIENT
DOMAIN INDEX | PATIENT_SEARCH_IDX

Lets query the MRN’s array for patients whose first MRN is from Ordering System (OS) 1.

SELECT COUNT(*)
  FROM json_patient
 WHERE JSON_VALUE(patient_data,'$.mrns[0].OS') = 1;

Will this use the Search Index? No! Why? Let’s check the query filter.

filter(TO_NUMBER(JSON_VALUE("PATIENT_DATA" /*+ LOB_BY_VALUE */
FORMAT JSON , '$.mrns[0].OS' RETURNING VARCHAR2(4000) NULL ON ERROR))=1)

Looks like the filter wound up with an added TO_NUMBER function which negates use of the index. Lets try putting quotes around the 1 on the right side of the criteria to avoid the conversion.

That worked! Here is the plan.

|* 2 | TABLE ACCESS BY INDEX ROWID| JSON_PATIENT | 1 | 18399 | 4 (0)| 00:00:01 |
|* 3 | DOMAIN INDEX | PATIENT_SEARCH_IDX | | | 4 (0)| 00:00:01 |

And the filter without the TO_NUMBER function call.

2 - filter(JSON_VALUE("PATIENT_DATA" FORMAT JSON , '$.mrns[0].OS' RETURNING
VARCHAR2(4000) NULL ON ERROR)='1')

That still does not look right. The right hand side of the comparison should safely be a number. So I just need to force the left side to be a number by adding the RETURNING NUMBER clause like this.

SELECT COUNT(*)
  FROM json_patient
 WHERE JSON_VALUE("PATIENT_DATA" FORMAT JSON , '$.mrns[0].OS' RETURNING NUMBER ) = 1;

That worked! Here’s the plan.

TABLE ACCESS BY INDEX ROWID| JSON_PATIENT
DOMAIN INDEX | PATIENT_SEARCH_IDX

And the filter.

2 - filter(JSON_VALUE("PATIENT_DATA" FORMAT JSON , '$.mrns[0].OS' RETURNING NUMBER NULL
ON ERROR)=1)

Few things are more frustrating than watching an application slow down because an index is not being used. Especially when the criteria applied does not match the code due to translations in the middle. Luckily the EXPLAIN plan utility provides information to fix the problem.

Thanks for reading!

Leave a Comment