2021 – Virtual Columns For JSON Access Part 1

The previous post in this series showed SQL and PLSQL code to find bad JSON data and then fix that bad data. It required many more lines of code (and complexity) than the RDBMS Version for the same functionality. This post begins introducing ways to simplify the “finding bad JSON data” using an Oracle feature called Virtual Columns.

First a reminder of the application data.

SELECT JSON_VALUE(patient_data,'$.pname') pname,
       JSON_QUERY(patient_data,'$.mrns[*]' WITH WRAPPER ) mrns
  FROM json_patient

PNAME      MRNS
--------- ------------------------------------------------------------
Patient 1 [{"OS" : 1,"MRN" : "JUSTONE"},{"OS" : 2,"MRN" : "ONEONLY"}]
Patient 2 [{"OS" : 1,"MRN" : "ONEISOK"},{"OS" : 1,"MRN" : "TWOISNOT"}]

The bad data we are looking for is duplicate Ordering System (OS) values for a single patient. Patient 2 is an example of this. The first step in the effort to find these is to isolate the OS values in the array using JSON_QUERY like this.

SELECT JSON_VALUE(patient_data,'$.pname') pname,
       JSON_QUERY(patient_data,'$.mrns[*].OS' WITH WRAPPER) just_os_please
  FROM json_patient

PNAME     JUST_OS_PLEASE
--------- --------------
Patient 1 [1,2]
Patient 2 [1,1]

Now we just need a way make these values easily queried and parsed for duplicates. This post demonstrates the first part (easy query) using a special sort of column that does not need to be a real column – it should be…oh I don’t know what word to use…maybe…virtual?

Yes that is exactly it – a virtual column. Lets add one now for the list of Ordering Systems (OS) values for each patient. We’ll even wrap it with square brackets because this is JSON after all!

Here is the SQL to do it.

ALTER TABLE json_patient
ADD ( os_list VARCHAR2(4000) GENERATED ALWAYS 
           AS (JSON_QUERY(patient_data,'$.mrns[*].OS' WITH WRAPPER)) VIRTUAL);

Now lets see what we get from the table.

SELECT JSON_VALUE(patient_data,'$.pname') pname,
        os_list
   FROM json_patient;

PNAME     OS_LIST
--------- -------
Patient 1 [1,2]
Patient 2 [1,1]

We get the list with a whole lot less typing!

I wonder if we can use that as query criteria (aka the WHERE clause)

SELECT JSON_VALUE(patient_data,'$.pname') pname,
        os_list
  FROM json_patient
 WHERE os_list = '[1,1]'

PNAME     OS_LIST
--------- -------
Patient 2 [1,1]

Sweet – we can use it as criteria!

Order By as well?

SELECT JSON_VALUE(patient_data,'$.pname') pname,
       os_list
  FROM json_patient
ORDER BY os_list;

PNAME     OS_LIST
--------- -------
Patient 2 [1,1]
Patient 1 [1,2]

This makes it really easy to find all OS values for a particular patient. But there is still work to be done to isolate the ones with duplicates. The next post will demonstrate how that could be done.

Thanks for reading!

 

dnklasndlas

Leave a Comment