Nowadays people are not concerned about how large their database is in terms of MB. Storage is cheap. Even getting cheap SSD storage is not a big deal.
However, this is true if we are talking about hundreds of MB or even several GB, but sometimes we get into a situation where we have massive amounts of data (i.e Several tables with lots of longtext columns). At this point it becomes a concern because we need to increase the hard disk size, and find ourselves checking to see if the hard disk is full several times per day or week, etc.
Now, if you have faced a situation like this before, it's time to talk about database compression.
Compression is a technique, developed theoretically back in the 1940s but actually implemented in the 1970s. For this post we will focus on MySQL compression, which is performed using the open-source ZLib library. This library implements the LZ77 dictionary-based compression algorithm.
Before going into MySQL compression details, lets name some of the main DBMS and their compression techniques:
-
MySQL: ZLib (LZ77) [1]
-
Oracle: Oracle Advanced Compression (Proprietary)[2]
-
Postgres: PGLZ or LZ4 (if added this option at compilation level) [3]
-
DB2: Fixed-length compression or Huffman in some systems [4]
So, now that we know this useless information, lets learn how to implement this in MySQL.
Firstly, you need to know that you CAN'T enable compression if:
-
Your table lives into `system` tablespace, or
-
Your tablespace was created with the option `innodb_file_per_table` disabled.
It is important to test if the compression is the best solution for you. If you have a table with a lot of small columns, you will probably end up with a larger-size table after "compressing" because of the headers and compression information.
Compression is always great when you have longtext columns which can be heavily compressed.
Then, to enable compression for a table, you just need to include the following option when your table is created, or execute it as part of an alter statement:
ROW_FORMAT=COMPRESSED
These are the basics but you may find more useful information in MySQL manual.
You can also take a look at Percona which implements a Column level compression. This is interesting if you have a table with a lot of small fields and one large column, or if you have to optimize your database as much as possible. [6]
Finally, just say that even that storage is cheaper than ever, the amount of information has increased as well and we are now using and processing an incredible amount of data... so it looks like compression will always be a requirement.
I hope you find this information useful and please let me know if you have any questions or suggestions below in the comments section.
[1]:https://dev.mysql.com/doc/internals/en/zlib-directory.html
[3]:https://www.postgresql.org/docs/devel/runtime-config-client.html
[4]:https://www.ibm.com/docs/en/db2-for-zos/12?topic=performance-compressing-your-data
[6]:https://www.percona.com/doc/percona-server/8.0/flexibility/compressed_columns.html