I have been very happy with the benefits of InnoDB compression, and with the hard work my peers did locally to make it work in production for us. When a table is compressed with InnoDB, you must declare a compression factor for that table (key_block_size=...), so the table may be compressed by up to 2X, 4X, 8X, or 16X. We have been using key_block_size=8 for many tables to get 2X compression.
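The relationship between key_block_size and the best-case compression factor can be sketched as below; a minimal illustration assuming the default 16KB uncompressed InnoDB page size:

```python
PAGE_SIZE_KB = 16  # default uncompressed InnoDB page size

def compression_factor(key_block_size_kb):
    """Best-case on-disk compression for a given key_block_size.

    Each compressed page must fit in a key_block_size block, so the
    factor is bounded by page size / block size."""
    return PAGE_SIZE_KB / key_block_size_kb

for kbs in (8, 4, 2, 1):
    print(f"key_block_size={kbs} -> up to {compression_factor(kbs):.0f}X")
```

This also shows the cost of the fixed factor: a page that compresses to less than the declared block size still occupies a full block, so key_block_size=8 caps the benefit at 2X even when the data would compress further.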
I think it is possible to do much better (2X or more) than InnoDB. InnoDB uses an update-in-place approach to modify database pages on disk. There are solutions described as write-optimized (shadow pages, log-structured merge tree, fractal tree) that have a few significant advantages and one potential disadvantage with respect to compression. The advantages of the write-optimized family include:
Of the reasons above, the first two are a big deal and explain why it is possible to get much better compression rates from the write-optimized alternatives. An additional benefit of the write-optimized solutions is that they don't suffer from fragmentation, so a database won't grow over time when it is updated without adding new data, and there is no need to reorg tables to get space back. While they don't suffer from fragmentation, they do introduce a new source of inefficiency. Write-optimized solutions usually have some amount of database space occupied by dead rows (rows that cannot be read by any current or future transaction). Compaction and garbage collection are the methods used to reclaim this space, and you might be able to estimate the amount of space required for it by considering the row change rate and the frequency at which compaction can be done.
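The back-of-the-envelope estimate suggested above can be written out as a sketch; the formula and its inputs (change rate, row size, compaction interval) are my simplification, assuming each changed row leaves one dead version behind until the next compaction pass:

```python
def dead_space_estimate(rows_changed_per_sec, avg_row_size_bytes,
                        compaction_interval_sec):
    """Rough upper bound on dead-row space accumulated between compactions.

    Assumes every row change leaves exactly one dead version on disk
    until compaction reclaims it; real engines may keep more versions
    (MVCC) or reclaim some space earlier."""
    return rows_changed_per_sec * avg_row_size_bytes * compaction_interval_sec

# e.g. 1000 row changes/sec, 200-byte rows, compaction every 10 minutes
print(dead_space_estimate(1000, 200, 600))  # bytes of dead-row space
```

The point of the exercise is that dead-row overhead is bounded and tunable (run compaction more often to shrink it, at the cost of more background I/O), whereas fragmentation in an update-in-place engine accumulates until an explicit table reorg.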