This is the third post in a series on Exadata HCC (Hybrid Columnar Compression) and the storage savings it brings to Oracle customers. In Part 1 I showed the real-world compression ratios of Oracle’s best DW references, in Part 2 I investigated why that is so, and in this part I’ll question the whole savings accounting.
So, we saw in Part 1 that most Exadata DW references don’t mention the storage savings of HCC, but those that do show an average 3.4x “storage savings”. Now let’s see what savings, if any, this brings. It all comes down to the compromises involved in giving up modern storage capabilities, and the price to pay when fulfilling the same requirements with Exadata.
Let me start with a somewhat weaker but valid point. Modern storage software allows online upgrades: a mission-critical database (or any database) shouldn’t be down or at risk when storage firmware or software is upgraded. To achieve similar results with Exadata, the storage layer has to be configured as a three-way mirror (ASM high redundancy). This is actually Oracle’s best practice; see for example the bottom of page 5 of the Exadata MAA HA paper. This configuration uses significantly more storage than any other solution on the market. So while the total size of all the data files might be smaller on Exadata thanks to HCC, you still need a surprisingly large raw volume of storage to support it – or you’ll have to compromise and always apply storage software upgrades offline, most likely the critical quarterly patch bundle, which from what I read on the blogosphere can take at least an hour of downtime.
To make it a bit more confusing, the Exadata X3 datasheet only mentions (on page 6) the usable data capacity with a 2-way mirror (ASM normal redundancy), even though the recommended configuration is a 3-way mirror. I wonder if that has anything to do with the latter providing less net storage? So, to have rolling Exadata storage upgrades (without shutting down the databases that run on top of it), Exadata storage needs to be configured to deliver significantly less capacity than the datasheet suggests! Let’s hope that most customers “get it” before they finalize their sizing or ROI analysis, and take the extra risk or extra cost into account.
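To put rough numbers on this, here is a back-of-the-envelope sketch of usable capacity under the two ASM redundancy levels. The 100 TB raw figure is a made-up illustration, not a datasheet value:

```python
# Back-of-the-envelope: usable capacity under ASM mirroring.
# raw_tb is a hypothetical raw capacity, chosen only for illustration.
raw_tb = 100.0

normal_redundancy = raw_tb / 2   # 2-way mirror (the figure quoted in the datasheet)
high_redundancy = raw_tb / 3     # 3-way mirror (required for rolling storage upgrades)

print(f"Normal redundancy (2-way): {normal_redundancy:.1f} TB usable")
print(f"High redundancy (3-way):   {high_redundancy:.1f} TB usable")
print(f"Capacity given up by following the MAA best practice: "
      f"{normal_redundancy - high_redundancy:.1f} TB "
      f"({(1 - high_redundancy / normal_redundancy) * 100:.0f}%)")
```

In other words, whatever raw capacity you buy, following the best practice hands back a third of the datasheet’s “usable” number before a single datafile is written.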
Let’s move on. One of the most popular storage features is snapshots. Almost every modern storage product provides them nowadays, for great ease of use and great storage savings. Since Exadata storage does not provide snapshot capabilities, a lot of storage (and DB CPU cycles) has to be spent emulating similar functionality. Here are a couple of examples:
- Super fast restores – storage snapshots allow immediate restore to a previous point in time, both in regular operations (nightly pre-ETL / post-ETL snapshots) and on special occasions (pre-upgrade). Since restoring from an external backup system is likely to take hours, customers who want fast restore will have to back up to their Exadata system. One option is to turn on flashback logs to allow faster rollback of the database. These consume additional disk space and are typically configured for up to a day or two of retention. To be able to go back several days (and recover from more scenarios), a customer could keep a full copy of their database datafiles in the Fast Recovery Area on Exadata (called “local disk backups” in the MAA paper) and have it updated by an “incremental forever” RMAN script with some delay. That would allow faster recovery by switching to that (single point in time) copy and rolling it forward or backward. But this comes with a heavy price – keeping an extra full copy of the database on Exadata storage (plus flashback logs and archived logs).
- Provisioning a copy of the production database – there are many, many reasons why customers do that: providing a frozen nightly / end-of-month / end-of-quarter reporting environment, delivering test environments, pre-production environments, and so on. The magic of snapshots means that most of these operations have relatively small storage overhead and can be done very fast. With Exadata storage, each new environment means a full copy of the database – which eats up your space. In addition, the copy is made by the production database servers – consuming their precious DB CPU cycles and I/O bandwidth.
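The storage cost of the “local disk backup” approach described above can be roughly sized as follows. All the inputs are hypothetical assumptions for illustration (not Oracle sizing guidance), and the rule of thumb that flashback logs roughly track redo volume is itself an approximation:

```python
# Rough FRA footprint sketch for the "local disk backup" approach.
# Every figure below is an assumed, illustrative input.
db_size_tb = 10.0          # size of the database datafiles (assumed)
daily_redo_tb = 0.5        # archived redo generated per day (assumed)
flashback_days = 2         # flashback log retention (assumed)
archive_days = 3           # archived-log retention needed to roll the copy forward

image_copy_tb = db_size_tb                      # the extra full image copy in the FRA
flashback_tb = daily_redo_tb * flashback_days   # flashback logs ~ redo volume (rough rule)
archive_tb = daily_redo_tb * archive_days       # archived logs kept for recovery

fra_tb = image_copy_tb + flashback_tb + archive_tb
print(f"FRA footprint: {fra_tb:.1f} TB on top of the {db_size_tb:.1f} TB database, "
      f"before ASM mirroring multiplies it again")
```

Note that all of this sits on the same triple-mirrored Exadata storage, so the real raw cost is three times the figure above.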
Even with one copy for backup and one copy for testing or reporting, you already consume 3x the storage. So, where are the “3x storage savings” now?
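A quick, purely illustrative calculation shows how fast the compression savings evaporate. The database size and copy count are assumptions; the 3.4x ratio is the average from the references surveyed in Part 1:

```python
# Does the 3.4x "storage savings" survive copies and mirroring?
# The 100 TB database and the copy count are illustrative assumptions.
uncompressed_tb = 100.0
hcc_ratio = 3.4                  # average ratio from the Part 1 references
compressed_tb = uncompressed_tb / hcc_ratio

copies = 3      # production + local backup copy + one test/reporting copy
mirror_ways = 3 # ASM high redundancy, for rolling storage upgrades

raw_needed_tb = compressed_tb * copies * mirror_ways
print(f"Compressed database:  {compressed_tb:.1f} TB")
print(f"Raw storage required: {raw_needed_tb:.1f} TB "
      f"to hold {uncompressed_tb:.0f} TB of uncompressed data")
```

Under these assumptions the raw storage required ends up well above the size of the original uncompressed data – the headline compression ratio has been spent several times over on copies and mirrors.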
Of course, each customer has different requirements and usage, for good or bad, so your mileage may vary. For example, I’ve seen EBS and SAP customers with a dozen copies of their production environment… Also, some other pieces of technology may come into play, like Data Guard on another Exadata (especially if you buy the Active Data Guard option), or even an Oracle ZFS Appliance if you want to be special. Still, I believe the storage and functional price of Exadata storage should be carefully analyzed by any potential customer.
To sum it up, HCC is a great feature. But when locked to Exadata storage, the savings it provides are typically negated by the minimal feature set of the Exadata storage layer (ASM and the Exadata storage software). Specifically, the lack of snapshots and the requirement of triple mirroring for online storage upgrades lead to significant amounts of storage being wasted.