
Exadata Part III: Compression

With this posting, we look at the new feature Hybrid Columnar Compression, which is only available with the Exadata Database Machine and offers stunning compression ratios together with high query speed. By now we have no fewer than six different Table Data Compression methods:

1. BASIC compression, introduced as early as Oracle 8 and recommended only for Data Warehouses

2. OLTP compression, introduced in Oracle 11 and recommended for OLTP Databases as well

3. QUERY LOW compression (Exadata only), recommended for Data Warehouse with Load Time as a critical factor

4. QUERY HIGH compression (Exadata only), recommended for Data Warehouse with focus on Space Saving

5. ARCHIVE LOW compression (Exadata only), recommended for Archival Data with Load Time as a critical factor

6. ARCHIVE HIGH compression (Exadata only), recommended for Archival Data with maximum Space Saving

Methods 1 and 2 work by compressing the rows inside Oracle blocks – the more redundant values inside a block, the better the compression ratio. OLTP compression may well be used with the Exadata Database Machine, because Exadata V2 is designed to support OLTP as well as Data Warehouse workloads. Methods 3 to 6 use Hybrid Columnar Compression (HCC), which achieves better compression ratios by organizing sets of rows into compression units and then storing the values column by column within each unit. Since a single column tends to hold similar (redundant) data, the compression ratio improves with HCC.
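
By the way, you can always verify which method a given table uses from the data dictionary; a minimal sketch (the COMP% tables are the ones created further down in this posting):

SQL> select table_name, compression, compress_for
from user_tables
where table_name like 'COMP%'
order by 1;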

I will now give examples for the 6 compression methods. Keep in mind that I have only limited access to our Oracle University Exadata Database Machine, restricting me to just one Database Server and two Storage Servers – speed on a Full Rack will probably be faster. Also, the demo table is somewhat artificial and may therefore deliver better compression ratios compared to Real World tables.

In short: although you should get the picture from the examples, it will be different at your site, and it requires thorough testing before you take any kind of compression into production.

The (uncompressed) demonstration table is the same as in the previous posting and was created with this statement:

SQL> create table sales as
select
 rownum as id,
 rownum + 1 as flag,
 'Oracle Enterprise Edition' as product,
 mod(rownum,5) as channel_id,
 mod(rownum,1000) as cust_id,
 5000 as amount_sold,
 trunc(sysdate - 10000 + mod(rownum,10000)) as order_date,
 trunc(sysdate - 9999 + mod(rownum,10000)) as ship_date
from dual connect by level<=2e7;

It is 1.5 GB in size. Let’s see how small we can compress it and how fast we can load!
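
Incidentally, if you want an estimate before loading a table six times, the Compression Advisor in DBMS_COMPRESSION can sample it first. A minimal sketch, assuming the 11.2 parameter and constant names and a scratch tablespace called USERS with enough free space:

SQL> set serveroutput on
SQL> declare
  l_blkcnt_cmp    pls_integer;
  l_blkcnt_uncmp  pls_integer;
  l_row_cmp       pls_integer;
  l_row_uncmp     pls_integer;
  l_cmp_ratio     number;
  l_comptype_str  varchar2(100);
begin
  dbms_compression.get_compression_ratio(
    scratchtbsname => 'USERS',   -- assumption: any tablespace with sufficient free space
    ownname        => user,
    tabname        => 'SALES',
    partname       => null,
    comptype       => dbms_compression.comp_for_query_high,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_cmp_ratio,
    comptype_str   => l_comptype_str);
  dbms_output.put_line(l_comptype_str || ' estimated ratio: ' || round(l_cmp_ratio,1));
end;
/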

BASIC method

SQL> create table comp_classic compress as select * from sales where 1=2;

Table created.

SQL> alter table comp_classic nologging;

Table altered.

SQL> insert /*+ append */ into comp_classic select * from sales;

20000000 rows created.

Elapsed: 00:00:39.80

SQL> commit;

Commit complete.

OLTP method. Notice that we cannot assume direct loads in an OLTP environment, therefore we use a conventional insert:

SQL> create table comp_oltp compress for oltp as select * from sales where 1=2;

Table created.

SQL> insert into comp_oltp select * from sales;

20000000 rows created.

Elapsed: 00:02:16.56

SQL> commit;

Commit complete.

QUERY LOW method. We expect this method to have the fastest Load Time:

SQL> create table comp_query_low compress for query low as select * from sales where 1=2;

Table created.

SQL> alter table comp_query_low nologging;

Table altered.

SQL> insert /*+ append */ into comp_query_low select * from sales;

20000000 rows created.

Elapsed: 00:00:26.96

SQL> commit;

Commit complete.

QUERY HIGH method

SQL> create table comp_query_high compress for query high as select * from sales where 1=2;

Table created.

SQL> alter table comp_query_high nologging;

Table altered.

SQL> insert /*+ append */ into comp_query_high select * from sales;

20000000 rows created.

Elapsed: 00:00:33.99

SQL> commit;

Commit complete.

ARCHIVE LOW method

SQL> create table comp_archive_low compress for archive low as select * from sales where 1=2;

Table created.

SQL> alter table comp_archive_low nologging;

Table altered.

SQL> insert /*+ append */ into comp_archive_low select * from sales;

20000000 rows created.

Elapsed: 00:00:35.14

SQL> commit;

Commit complete.

ARCHIVE HIGH method. We expect this method to have the slowest Load Time and the best Compression Ratio:

SQL> create table comp_archive_high compress for archive high as select * from sales where 1=2;

Table created.

SQL> alter table comp_archive_high nologging;

Table altered.

SQL> insert /*+ append */ into comp_archive_high select * from sales;

20000000 rows created.

Elapsed: 00:04:13.13

SQL> commit;

Commit complete.

That was indeed slow! Now let’s look at the outcome:

SQL> select segment_name,bytes/1024/1024 as mb from user_segments order by 1;

SEGMENT_NAME                           MB
------------------------------ ----------
COMP_ARCHIVE_HIGH                       5
COMP_ARCHIVE_LOW                        8
COMP_CLASSIC                          736
COMP_OLTP                             864
COMP_QUERY_HIGH                         8
COMP_QUERY_LOW                        304
SALES                              1535.5

7 rows selected.

We have condensed 1.5 GB into 5 MB with the ARCHIVE HIGH method!
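
To express those sizes as compression ratios against the uncompressed SALES table, a quick ad-hoc query like this sketch will do (just one of several ways to write it):

SQL> select c.segment_name,
       round(s.bytes/c.bytes,1) as compression_ratio
from user_segments c, user_segments s
where s.segment_name = 'SALES'
and c.segment_name like 'COMP%'
order by 1;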

Summary: Except for ARCHIVE HIGH, we have faster Load Times with HCC than with Block Compression. Every HCC method delivers a better compression ratio than the Block Compression methods. Now how about Query Times?

SQL> set timing on

SQL> connect adam/adam

Connected.

SQL> select count(*) from sales where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:02.07

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan        46.807457

We take the FTS (with Smart Scan) on the uncompressed table as the Baseline. I reconnect before each run to reset v$mystat. Now Query Performance of BASIC compression:

SQL> connect adam/adam

Connected.

SQL> select count(*) from comp_classic where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:01.79

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan       123.650536

Runtime was faster than uncompressed, and Smart Scan was possible. Now OLTP compression:

SQL> connect adam/adam

Connected.

SQL> select count(*) from comp_oltp  where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:01.80

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan        130.22644

Ditto. Now QUERY LOW (HCC) compression:

SQL> connect adam/adam

Connected.

SQL>  select count(*) from comp_query_low where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:00.86

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan       2.68922424

Fastest Runtime yet. Smart Scan was possible. Now QUERY HIGH (HCC):

SQL> connect adam/adam

Connected.

SQL> select count(*) from comp_query_high where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:01.27

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan                0

Still faster than uncompressed, although not using Smart Scan. Now ARCHIVE LOW (HCC):

SQL>  connect adam/adam

Connected.

SQL>  select count(*) from comp_archive_low where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:01.13

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan                0

Faster than uncompressed (and surprisingly faster than QUERY HIGH) without Smart Scan. Finally ARCHIVE HIGH (HCC):

SQL> connect adam/adam

Connected.

SQL> select count(*) from comp_archive_high where channel_id=1;

COUNT(*)

----------

4000000

Elapsed: 00:00:02.93

SQL> select name,value/1024/1024 as mb from v$statname natural join v$mystat
  2  where name = 'cell physical IO interconnect bytes returned by smart scan';

NAME                                                                     MB

---------------------------------------------------------------- ----------

cell physical IO interconnect bytes returned by smart scan                0

Slowest Query Performance (the only one slower than uncompressed in this experiment) without Smart Scan.
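
By the way, to investigate cases like the last three, where no bytes were returned by Smart Scan, it can help to widen the v$mystat query to a few related cell statistics; a sketch, with the statistic names as they appear in 11.2:

SQL> select name, round(value/1024/1024,2) as mb
from v$statname natural join v$mystat
where name in ('cell physical IO bytes eligible for predicate offload',
               'cell physical IO interconnect bytes',
               'cell physical IO interconnect bytes returned by smart scan');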

Conclusion: HCC delivers (far) better Compression Ratios than the Block Compression methods. Load Time increases from QUERY LOW (best) through QUERY HIGH and ARCHIVE LOW (both moderate) to ARCHIVE HIGH (longest Load Time). Query Performance decreases similarly from QUERY LOW (best) to ARCHIVE HIGH (longest Query Time). All HCC methods except ARCHIVE HIGH delivered better Query Performance than uncompressed, though. In short: HCC is most likely able to save large amounts of space in your Data Warehouse without decreasing Query Performance much, if at all. It will probably compress your Archival Data significantly.
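
One practical note to close with: an existing table can be converted to HCC without recreating it by hand – an ALTER TABLE ... MOVE rewrites and compresses it (rows inserted conventionally afterwards are not HCC compressed, and existing indexes become unusable and need a rebuild). A sketch, with a hypothetical index name:

SQL> alter table sales move compress for query high;
SQL> alter index sales_idx rebuild;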



This entry was posted on January 21, 2011 and is filed under TOI.

Comments (26)

Trackbacks (4)

#1 by Ignacio Ruiz on March 7, 2011 - 09:06

“Also, the demo table is somewhat artificial and may therefore deliver better compression ratios compared to Real World tables”… you’re very right!!! I had one customer whose database had a huge table with 310+ columns, and the compression improvement from QUERY LOW to QUERY HIGH didn’t change the 10x factor; moving to the next level, ARCHIVE, only jumped to a 15x compression factor… so compression results are very dependent on the data set.

#2 by Uwe Hesse on March 14, 2011 - 16:48

Ignacio, thank you for pointing this out – all I can say about it is: “I told you so”

#3 by Bhavik Desai on April 11, 2011 - 08:34

Hi Uwe,

Thanks for such a wonderful series on Exadata. I appreciate your efforts.

I have a question on Exadata and Active Data Guard standby databases. I am using a quarter rack Exadata machine with EHCC on all my old partitions. I wanted to confirm whether I can create an active physical standby on a non-Exadata machine. If yes, how would Oracle store the data on the physical standby? Compressed or non-compressed?

Can you clear up one more doubt? I cannot have Smart Scan for reports running on the active standby since they are on non-Exadata… is that right?

#4 by Uwe Hesse on April 11, 2011 - 19:15

Hi Bhavik, you’re welcome. To your questions:

1) It is possible to have a non-Exadata Standby Database for an Exadata Primary, even with HCC compression on the Primary. The HCC compressed columns get updated on the Standby by Redo Apply – they cannot be accessed by select statements, though. This will compromise the Availability of these tables in case of a failover accordingly – you probably should not go with that strategy.

2) Since Smart Scan is a feature of the Exadata Cells, you cannot have it on the non-Exadata Standby. You will have just Dumb Scans

#5 by Bhavik Desai on April 12, 2011 - 13:35

Thanks Uwe for clarifying this.

So, for my first question, there could be two scenarios:

1) Primary running on Exadata and Standby on a traditional storage system. In this case, all reports running on the primary will benefit from SMART SCAN, and at the storage level EHCC would be in effect. However, on the standby, MRP will apply NON-COMPRESSED data and thus there will not be any compression. If the standby is accessed via ACTIVE DATA GUARD, the reports will not get any advantage from SMART SCAN either, as the active standby is running on a non-Exadata platform.

2) Primary running on a non-Exadata platform and standby on Exadata. In this case, there will not be any EHCC on the primary nor on the standby. (?) If the Exadata standby is accessed as an ACTIVE standby, then reports can take advantage of SMART SCAN…

Am I right in understanding your point?

Regards, Bhavik Desai

#6 by Uwe Hesse on April 12, 2011 - 20:35

Bhavik, in 1), Redo Apply is done WITHOUT DECOMPRESSING. After a failover/switchover to the non Exadata site, you cannot access the tables with HCC compressed columns. Instead, YOU need to decompress them now. That is why I said it will compromise the availability of these tables.

2) yes, I think so. Haven’t seen this configuration yet (non Exadata Primary and Exadata Physical Standby with Real-Time Query), though 

#7 by John Clarke on May 17, 2011 - 06:15

In your second query against comp_oltp, it looks as if more bytes are returned over the interconnect although size and block counts are less than the “normal” table. In fact, I’d bet that “cell IO uncompressed bytes” is less for the query against comp_oltp and “cell physical IO interconnect bytes” statistic is higher proportional to “something”, because my tests have shown the same thing. My question is this – what is the “something”? Is the cell un-compressing the OLTP blocks before sending over the interconnect, and if so, why would it do this?

– John

#8 by Uwe Hesse on May 18, 2011 - 13:06

John, it is indeed so that the Cells do the decompression for HCC compressed tables, as well as the decryption for TDE encrypted data. So “offloading” does not only cover I/O but also CPU-intensive operations. HCC decompression is in fact very CPU intensive, so it makes sense to offload that to the Cells. That is the technical reason why HCC was made exclusively available for Exadata – after a short period in 11gR1 when it was available on conventional Oracle as well.

#9 by Ballough on August 23, 2011 - 15:37

Very informative posting. I’m disappointed though that you didn’t include a load time comparison with nologging OLTP compression, rather than subjecting only this type of compression to redo log overhead.

#10 by Uwe Hesse on August 24, 2011 - 10:54

Thank you, Ballough, for your comment! To your question: the population of the OLTP table in a real-world scenario would probably be done by multiple sessions, inserting single rows over a very long time. In that respect, the load time is somewhat meaningless and maybe I should have avoided mentioning it altogether. The important point to emphasize here was (in my opinion) that although we can’t assume direct load on an OLTP table, the compression ratio is almost as good as with the BASIC method.

#11 by Michael on September 27, 2011 - 10:20

Hi Uwe,

can HCC be combined with Secure Files Compression or will we get better results if we are using HCC instead of Secure Files compression for our LOBs?

Thanks

Michael

#12 by JSM on October 10, 2011 - 19:03

Secure Files Compression can do LOB compression/deduplication. HCC doesn’t.

#13 by Anthony on November 20, 2011 - 08:05

Hi.

Can you please explain why the selects on the basic compressed and OLTP compressed tables returned approximately 3 times as much data as the select on the uncompressed original table?

Thanks

Anthony

#14 by Uwe Hesse on November 20, 2011 - 21:54

Anthony, I think that is because the Cells did not do the decompression for the BASIC and OLTP compressed tables.

They DID do the filtering according to the where condition (the data volume delivered by Smart Scan is the compressed blocks containing channel_id 1), but NOT the column projection for the count(*), like they did for the uncompressed table.

#15 by Marco on December 7, 2011 - 17:51

Very good.

#16 by vnmaster on December 22, 2011 - 14:25

Hi, EHCC is used for compressing tables. How do we compress the indexes? Is there an option for compressing indexes?

#17 by Mark on August 18, 2012 - 01:34

All documentation and examples are for compressing tables. Can you address compressing indexes on Exadata?
1. Is HCC available for indexes? How would you specify it? All I see is a “compress” parameter with no additional detail. What type of compression does this enable?
2. Can you modify or rebuild a non-compressed index to compressed without dropping it?

#18 by Uwe Hesse on August 25, 2012 - 08:41

1) No. That is the ordinary key compression that has been available since Oracle 8.
2) Yes. See http://docs.oracle.com/cd/E11882_01/server.112/e26088/statements_1010.htm#BABHFFHI
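
For illustration, a minimal sketch of that ordinary index key compression against the demo table (the index name is hypothetical):

SQL> create index sales_cust_idx on sales (cust_id, channel_id) compress 1;
SQL> alter index sales_cust_idx rebuild compress 2;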

#19 by Kumar Ramalingam on November 1, 2012 - 02:59

Hi, I was testing with EHCC compression and storage indexes. In my testing, I first compressed my subpartitions for QUERY HIGH and ended up with 74K blocks for about 9M rows. I created another test table with the same structure as the previous one (same storage attributes) and loaded the data from the partition mentioned above (the one with 9M rows), sorted on two columns. Now, after firing the query with the predicate on the sorted columns, I could get the storage indexes working perfectly. But when I checked the number of blocks occupied by the partition with the sorted data, it was 20K blocks more than the unsorted partition. Do you know from your testing whether it shows such behavior? Would it be a bug? Thanks in advance.

–Kumar Ramalingam

#20 by Mike Ault on December 2, 2014 - 18:58

I realize this is an old blog post, but I wanted to ask a question: using AWR data only, how can you determine the bandwidth used when HCC is in play?

#21 by sshdba on January 9, 2015 - 22:32

Reblogged this on Easy Oracle DBA.

#22 by Nitin Kashiv on June 22, 2015 - 13:24

Hello dear, I like this post very much and would like to be a member/follower – please add me.

Regards, Nitin

#23 by Uwe Hesse on June 30, 2015 - 07:14

Nitin, you can subscribe yourself easily: just click the subscribe button in the upper left which says ‘Sign me up!’

#24 by Mike Kim on August 15, 2015 - 20:26

Uwe,

I am trying to find a reference for the fact that decompression for Basic and OLTP does occur in the exadata cells and not in the database servers. I have read conflicting information and need some guidance.

#25 by Uwe Hesse on August 17, 2015 - 13:35

Mike, to my knowledge, decompression can only be offloaded for hybrid columnar compression.

#26 by Mike Kim on August 17, 2015 - 18:18

Uwe, thanks for the prompt reply. I thought the same way and wanted an opinion from the master.

Mike

1. Exadata « Oracle Scratchpad

2. DBMS_COMPRESSION Example « The Oracle Instructor

3. DML and HCC – Exadata at Amardeep Sidhu's Oracle blog

4. Top 10 postings in 2012 « The Oracle Instructor
