Страницы

четверг, 18 июня 2015 г.

Function-Based index changed its definition after a rebuild

One of our developers have discovered a problem when index column data type changed after index had been rebuild.
They asked me for help with that issue. Here's a demo script:
SQL> create table t(
  2    x int,
  3    y int);

Table created.

SQL> create index t_i on t(
  2    decode(x, 1, to_number(null), y));

Index created.

SQL> exec dbms_stats.gather_table_stats( '', 't')

PL/SQL procedure successfully completed.
I create a function-based index with one column.
The index column is defined to show "Y" column (with type int) if the "X" column isn't equal to 1 (or null).
According to a documentation:
http://docs.oracle.com/database/121/SQLRF/functions056.htm#SQLRF00631
Oracle automatically converts the return value to the same data type as the first result.
If the first result has the data type CHAR or if the first result is null, 
then Oracle converts the return value to the data type VARCHAR2.
Notice that index column data type is a NUMBER:
SQL> select column_name
  2    from user_ind_columns
  3   where index_name = 'T_I';

COLUMN_NAME
------------------------------
SYS_NC00003$

SQL> select data_type
  2    from user_tab_cols
  3   where table_name = 'T'
  4     and column_name = 'SYS_NC00003$';

DATA_TYPE
------------------------------
NUMBER
The developer have found that his query doesn't use the index:
SQL> explain plan for
  2  select *
  3    from t
  4   where decode(x, 1, to_number(null), y) = to_number(:1);

Explained.

SQL> 
SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    26 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    26 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(DECODE("X",1,TO_NUMBER(NULL),"Y")=TO_NUMBER(:1))
Index will be used when he changed his query to following:
SQL> explain plan for
  2  select *
  3    from t
  4   where decode(x, 1, null, y) = to_number(:1);

Explained.

SQL> 
SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 2858887366

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |     1 |    39 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |     1 |    39 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_I  |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(DECODE("X",1,NULL,TO_CHAR("Y"))=TO_NUMBER(:1))
Why is this?
The index expression is not what I passed in the CREATE INDEX statement:
SQL> select column_expression
  2    from user_ind_expressions
  3   where index_name = 'T_I';

COLUMN_EXPRESSION
------------------------------
DECODE("X",1,NULL,"Y")
It looks like Oracle is "clever enough" to change our index expression from:
DECODE(X, 1, TO_NUMBER(NULL), Y)
to:
DECODE("X", 1, NULL, "Y")
This leads to unexpected results when index was rebuild:
alter index t_i rebuild;
The index column data type is VARCHAR2 after that!
SQL> select data_type
  2    from user_tab_cols
  3   where table_name = 'T'
  4     and column_name = 'SYS_NC00003$';

DATA_TYPE
------------------------------
VARCHAR2
Our query now suddenly has started using FTS where previously it was using an index:
SQL> explain plan for
  2  select *
  3    from t
  4   where decode(x, 1, null, y) = to_number(:1);

Explained.

SQL> 
SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    48 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    48 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER(DECODE("X",1,NULL,TO_CHAR("Y")))=TO_NUMBER(:1))

Index hint did not help either:
SQL> explain plan for
  2  select /*+ index(t t_i)*/*
  3    from t
  4   where decode(x, 1, null, y) = to_number(:1);

Explained.

SQL> 
SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 2153619298

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    48 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    48 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_NUMBER(DECODE("X",1,NULL,TO_CHAR("Y")))=TO_NUMBER(:1))
Notice TO_NUMBER conversion at line 1:
   1 - filter(TO_NUMBER(DECODE("X",1,NULL,TO_CHAR("Y")))=TO_NUMBER(:1))
We should rewrite our query as follows:
SQL> explain plan for
  2  select *
  3    from t
  4   where decode(x, 1, null, y) = to_char(:1);

Explained.

SQL> 
SQL> @?/rdbms/admin/utlxpls

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 2858887366

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |     1 |    48 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |     1 |    48 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_I  |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(DECODE("X",1,NULL,TO_CHAR("Y"))=:1)
To workaround this issue, we could use an index on virtual columns.
We could prevent TO_NUMBER(NULL) to NULL conversion by changing TO_NUMBER(NULL) to CAST(NULL as NUMBER), or other such expression.
I have opened the SR with Oracle and they told me that this's due to:
Bug 17871767: FUNCTION BASE INDEX DEFINITION CHANGED ON 11.2, ADDED A TO_CHAR FUNCTION This bug is still under work and hasn't been resolved yet.
Actually, this issue with function-based indexes has been present for a long time.
I have reproduced it on 9.2.0.6 at least.