Formatting Output of 11.2 Compression Advisor (DBMS_COMPRESSION)
Posted by Tyler Muth on October 21, 2010
Let me start with a few facts about compression that I feel are a bit confusing or could use better documentation. The Compression Advisor is a package named DBMS_COMPRESSION that takes a table as input, then estimates the compression ratio and resulting size in blocks if you compressed the table. You can download a version of it that works in any 9.2+ database from the Advanced Compression page. However, that version will only estimate Advanced Compression / OLTP Compression results. It will NOT estimate the 4 forms of Hybrid Columnar Compression (HCC), which are only available on Exadata. To estimate HCC results, you have to use an 11.2 database, as the version of DBMS_COMPRESSION in 11.2 has been enhanced. You don’t need an Exadata machine, just an 11.2 database. For more information on HCC, check out this presentation (PDF, 2.7 MB). When computing HCC estimates in the 11.2 Compression Advisor, it’s important to note that for each type of compression you wish to test, it creates a complete copy of your table or partition. This is very accurate but PAINFULLY slow, so be warned. If you want to know the results of each type of compression for every table in a schema, it will literally recreate a temporary copy of each table 4 times!
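For reference, a direct call to the 11.2 advisor looks roughly like this. This is a minimal sketch: the SCOTT schema, BIG_TABLE table, and USERS scratch tablespace are placeholders, so substitute your own; verify the parameter list against the DBMS_COMPRESSION spec in your database.

```sql
declare
  l_blkcnt_cmp    pls_integer;
  l_blkcnt_uncmp  pls_integer;
  l_row_cmp       pls_integer;
  l_row_uncmp     pls_integer;
  l_cmp_ratio     number;
  l_comptype_str  varchar2(100);
begin
  -- Estimate HCC QUERY HIGH for SCOTT.BIG_TABLE; the temporary scratch
  -- copy is created (and dropped) in the USERS tablespace.
  dbms_compression.get_compression_ratio(
    scratchtbsname => 'USERS',
    ownname        => 'SCOTT',
    tabname        => 'BIG_TABLE',
    partname       => null,
    comptype       => dbms_compression.comp_for_query_high,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str);
  dbms_output.put_line(l_comptype_str || ': ' || round(l_cmp_ratio, 2));
end;
/
```

Note that all of the interesting numbers come back through OUT parameters, which is exactly why the output formatting is left entirely up to you.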
Now that we have that out of the way, let’s get to the real reason for this post. The 11.2 Compression Advisor is functional, but it leaves formatting the output entirely up to you. Typically you would use DBMS_OUTPUT for this, but I wanted a format that I could easily import into a spreadsheet to do some sorting and compute some aggregates. So, I wrote a pipelined function that wraps DBMS_COMPRESSION, which allows you to run it in a SQL statement. If you run it in SQL Developer, you can easily export the results as XLS.
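If you haven’t used pipelined functions before, the basic pattern is a SQL collection type plus PIPE ROW. This is a generic sketch to show the mechanism; the type and function names here are illustrative, not the ones in my download.

```sql
-- A SQL object type and collection type describing each output row.
create type compression_stat_t as object (
  segment_name varchar2(30),
  comp_type    varchar2(30),
  blocks       number,
  cmp_ratio    number
);
/
create type compression_stat_tab as table of compression_stat_t;
/

create or replace function demo_stats return compression_stat_tab
  pipelined
is
begin
  -- PIPE ROW streams each row back to the calling SQL statement
  -- as soon as it is produced.
  pipe row (compression_stat_t('EMP', 'OLTP', 120, 2.8));
  return;
end;
/

-- Query it like a table:
select * from table(demo_stats);
```

Because the function is queryable in plain SQL, SQL Developer treats the results like any other result set, which is what makes the spreadsheet export trivial.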
You can download the code here. A few notes which are in the head of the function as well:
- Tables and partitions must have at least 1,000,000 rows based on the dbms_compression.comp_ratio_minrows global.
- Computing HCC compression recreates the entire table or partition, which is very accurate but VERY slow.
Update: Thanks to Kam Shergill for pointing out the “subset_numrows” in dbms_compression. I’ve added a new parameter called p_subset_numrows. You can pass in a number of rows such as 1500000 or a percentage such as 50%. However, the number of rows per segment still has to be above 1000000 for the compression advisor to work.
- This is an “Invoker’s Rights” function, so it queries the “USER_” dictionary views of the schema that calls the function, not the schema it’s installed in.
- The p_compression_level parameter is a bitwise parameter based on the table below (which comes from the package spec of DBMS_COMPRESSION). 2+4 will compute OLTP and HCC QUERY HIGH. 2+4+8+16+32 (or 62) will compute ALL levels of compression. Warning!!! Each level of HCC is very slow, so 62 will take forever.
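For reference, these are the compression-type constants as declared in the 11.2 DBMS_COMPRESSION package spec. Double-check them against the spec in your own database, since the constant names changed in later releases.

```sql
-- From the 11.2 DBMS_COMPRESSION package spec (verify on your release):
comp_nocompress       constant number := 1;   -- no compression
comp_for_oltp         constant number := 2;   -- Advanced / OLTP compression
comp_for_query_high   constant number := 4;   -- HCC QUERY HIGH
comp_for_query_low    constant number := 8;   -- HCC QUERY LOW
comp_for_archive_high constant number := 16;  -- HCC ARCHIVE HIGH
comp_for_archive_low  constant number := 32;  -- HCC ARCHIVE LOW
```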
create or replace function get_compression_stats(
    p_table_name         in varchar2,
    p_compression_level  in number,
    p_subset_num_rows    in varchar2 default '100%',
    p_scratch_tablespace in varchar2 default 'USERS')
  return table_compression_stats
  authid current_user
  pipelined
is
  pragma autonomous_transaction;
-- For the EMPLOYEES_LARGE table, compute all compression estimates (2+4+8+16+32),
-- creating the temporary copies of EMPLOYEES_LARGE in the USERS tablespace
select *
  from table(get_compression_stats('EMPLOYEES_LARGE', 2+4+8+16+32,
                                   p_scratch_tablespace => 'USERS'));
-- For the ALL_OBJS table, compute compression estimates for HCC COMP_FOR_QUERY_HIGH
-- and COMP_FOR_QUERY_LOW (4+8), creating the temporary copy of ALL_OBJS in the
-- USERS tablespace
select *
  from table(get_compression_stats('ALL_OBJS', 4+8,
                                   p_scratch_tablespace => 'USERS'));
Output of Example 2 (HTML “unload” from SQL Developer 3.x. Empty columns removed for width):