2021 – Virtual Columns For JSON Access Part 2

This post continues our quest to find Bad JSON data while reducing code complexity. In particular we hid many lines of complex PL/SQL behind a simple to understand and use Virtual Column (VC). The previous post explained the theory and syntax for VC’s – this post continues our implementation of them to find bad data.

Speaking of data – here’s our test bed of two patients with two Ordering System (OS)/Medical Record Number (MRN) combinations each.

SELECT patient_name,
       os,
       mrn
  FROM json_patient,
       JSON_TABLE(patient_data, '$'
          COLUMNS ( patient_name VARCHAR2 PATH '$.pname',
            NESTED PATH '$.mrns[*]'
              COLUMNS ( os  VARCHAR2(30) PATH '$.OS',
                        mrn VARCHAR2(30) PATH '$.MRN' )))

PATIENT_NAME OS MRN
------------ -- ------
Patient 1     1 MRN123
Patient 1     2 MRN456
Patient 2     1 MRN99
Patient 2     1 MRN333

Patient 2 has the bad data – two MRN’s in OS 1.

The previous post in this series added a Virtual Column (VC) named OS_LIST to the JSON_PATIENT table. It lists all of the OS entries in which the patient has one or more MRN. Here’s what is returned when that column is queried.

SELECT JSON_VALUE(patient_data,'$.pname') pname,
        os_list
  FROM json_patient;
PNAME     OS_LIST
--------- -------
Patient 1 [1,2]
Patient 2 [1,1]

Our objective now is to assemble a list of unique OS values per patient. Patient 1’s unique list would be [1,2] while Patient 2’s would be [1]. Let’s write a function to do that.

CREATE OR REPLACE FUNCTION unique_os_list ( p_name VARCHAR2 )
                                            RETURN VARCHAR2 AS

  v_ret_val VARCHAR2(1000);
  v_item_count BINARY_INTEGER;
  v_array DBMS_UTILITY.LNAME_ARRAY;
  TYPE v_tab_t IS TABLE OF NUMBER INDEX BY VARCHAR2(100);
  v _tab v_tab_t;
  v_lmnt VARCHAR2(100);

BEGIN

  -- remove the square brackets from the JSON object passed in
  -- eg: - [1,1] becomes 1,1
  v_ret_val := REPLACE(p_name,'[','');
  v_ret_val := REPLACE(v_ret_val,']','');

  -- add x's so the table can load correctly using the
  -- COMMA_TO_TABLE built in function
  -- eg: 1 becomes 1x
  v_ret_val := regexp_replace(v_ret_val,'(^|,)','\1x');
  DBMS_UTILITY.COMMA_TO_TABLE(v_ret_val,v_item_count,v_array);

  -- loop through recording the unique values only
  -- eg: 1,1,1,1 would only load 1 total entry (4 times)
  FOR counter IN 1..v_item_count LOOP
    v_tab(v_array(counter)) := 1;
  END LOOP;

  -- now loop through the list of unique values and add
  -- them to the functions return value
  v_lmnt := v_tab.FIRST;
  v_ret_val := '';
  LOOP
    EXIT WHEN v_lmnt IS NULL;
    DBMS_OUTPUT.PUT_LINE(v_lmnt);
    v_ret_val := v_ret_val || v_lmnt || ',';
    v_lmnt := v_tab.NEXT(v_lmnt);
  END LOOP;

  -- remove the last comma and the x's
  v_ret_val := RTRIM(v_ret_val,',');
  v_ret_val := REPLACE(v_ret_val,'x','');
  v_ret_val := '[' || v_ret_val || ']';

  RETURN(v_ret_val);

END;

Now test it out.

SELECT JSON_VALUE(patient_data,'$.pname') pname,
        os_list,
        unique_os_list(os_list) unique_os
  FROM json_patient;
PNAME     OS_LIST UNIQUE_OS
--------- ------- ---------
Patient 1 [1,2]   [1,2]
Patient 2 [1,1]   [1]

And now isolate the bad ones where the unique list is not equal to the full list.

SELECT JSON_VALUE(patient_data,'$.pname') pname,
        os_list,
        unique_os_list(os_list) unique_os_list
  FROM json_patient
  WHERE os_list <> unique_os_list(os_list);
PNAME     OS_LIST UNIQUE_OS_LIST
--------- ------- --------------
Patient 2 [1,1]   [1]

That’s exactly what we are after! Let’s make that VC as well.

ALTER TABLE json_patient
 ADD ( unique_os_list VARCHAR2(4000) GENERATED ALWAYS AS
          (unique_os_list(os_list)) VIRTUAL);

ORA-54012: virtual column is referenced in a column expression

Unfortunately we can’t add a VC that references another VC. In the next post in this series I’ll demonstrate how to shrink everything down to a single VC.

Thanks for reading!

Leave a Comment