在Oracle 数据库中去重统计是一个常规的操作, 在Oracle 12c 之前的版本中,一般使用COUNT (DISTINCT expr) 来实现。 但当表数据量巨大时,该语法性能较差。 因此,在Oracle 12c中,引入了APPROX_COUNT_DISTINCT 函数。
官网对其描述如下:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/APPROX_COUNT_DISTINCT.html
APPROX_COUNT_DISTINCT returns the approximate number of rows that contain a distinct value for expr.
This function provides an alternative to the COUNT (DISTINCT expr) function, which returns the exact number of rows that contain distinct values of expr. APPROX_COUNT_DISTINCT processes large amounts of data significantly faster than COUNT, with negligible deviation from the exact result.
For expr, you can specify a column of any scalar data type other than BFILE, BLOB, CLOB, LONG, LONG RAW, or NCLOB.
APPROX_COUNT_DISTINCT ignores rows that contain a null value for expr. This function returns a NUMBER.
根据说明,就是在12c 以后的版本,建议使用APPROX_COUNT_DISTINCT 代替COUNT (DISTINCT expr) 函数。实际上,在Oracle 19c中,对COUNT DISTINCT做了进一步的优化,为特定SQL带来极大性能提升。
但是在12c 的版本中,该函数有几个bug。 如果执行遇到问题,建议升级到19c版本。
Bug 18121220 - ORA-600 [kghfrempty:ds] on Query using APPROX_COUNT_DISTINCT (Doc ID 18121220.8)
Bug 22631327 - ORA-600 [15851] on group by rollup using approx_count_distinct (Doc ID 22631327.8)
Bug 29331066 - vector_transform does not support approx_count_distinct (Doc ID 29331066.8)
官网给到我们的2个SQL 示例, 如下:
SELECT APPROX_COUNT_DISTINCT(manager_id) AS "Active Managers"
FROM employees;
Active Managers
---------------
18
SELECT prod_id, APPROX_COUNT_DISTINCT(cust_id) AS "Number of Customers"
FROM sales
GROUP BY prod_id
ORDER BY prod_id;
PROD_ID Number of Customers
---------- -------------------
13 2516
14 2030
15 2105
16 2367
17 2093
18 2975
19 2630
20 3791
. . .
对比下2个函数的执行时间,APPROX_COUNT_DISTINCT 确实也是要快很多:
正常查看不同object_name的总数
15:39:31 SQL> select count(distinct(object_name)) from cndba_t;
COUNT(DISTINCT(OBJECT_NAME))
---------------------------
67455
Elapsed: 00:00:00.64
使用APPROX_COUNT_DISTINCT()
15:39:36 SQL> select APPROX_COUNT_DISTINCT(object_name) from cndba_t;
APPROX_COUNT_DISTINCT(OBJECT_NAME)
----------------------------------
67063
Elapsed: 00:00:00.14
版权声明:本文为博主原创文章,未经博主允许不得转载。