Data Compression in the Column Store
Types of data compression in the SAP HANA database
Data in column tables can be compressed in two ways:
- Dictionary compression
This default compression method is applied to all columns. Distinct column values are mapped to consecutive numbers (value IDs), so that instead of the actual value, the typically much smaller number is stored in the column.
- Advanced compression
Each column can be further compressed using one of several methods: prefix encoding, run length encoding (RLE), cluster encoding, sparse encoding, and indirect encoding. The SAP HANA database automatically determines which compression method is most appropriate for each column. Columns with the PAGE LOADABLE attribute are compressed with the NBit algorithm only. Dictionary compression and run length encoding are illustrated in the sketch after this list.
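To make the two stages concrete, here is a minimal Python sketch. It is a conceptual illustration only, not SAP HANA's actual implementation or data layout: it shows how dictionary compression replaces values with small value IDs and how run length encoding (one of the advanced methods) can shrink the resulting ID array when the same value repeats in consecutive rows. The column data and function names are invented for the example.

```python
from itertools import groupby

def dictionary_encode(column):
    """Map each distinct value to a consecutive integer value ID."""
    dictionary = sorted(set(column))                  # distinct values, ordered
    value_ids = {value: vid for vid, value in enumerate(dictionary)}
    encoded = [value_ids[value] for value in column]  # store small IDs, not values
    return dictionary, encoded

def run_length_encode(value_id_array):
    """Collapse consecutive repeats of a value ID into (value ID, run length) pairs."""
    return [(vid, len(list(run))) for vid, run in groupby(value_id_array)]

column = ["DE", "DE", "DE", "US", "US", "DE", "FR", "FR", "FR", "FR"]
dictionary, encoded = dictionary_encode(column)
print(dictionary)                  # ['DE', 'FR', 'US']
print(encoded)                     # [0, 0, 0, 2, 2, 0, 1, 1, 1, 1]
print(run_length_encode(encoded))  # [(0, 3), (2, 2), (0, 1), (1, 4)]
```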
Compression is automatically calculated and optimized as part of the delta merge operation. If you create an empty column table, no compression is applied initially because the database cannot yet know which method is most appropriate. As you insert data into the table and delta merges are executed at regular intervals, data compression is automatically (re)evaluated and optimized.
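The following toy model (an assumed simplification, not HANA internals) illustrates the idea: inserts land in an uncompressed delta structure, and the delta merge rebuilds the dictionary-compressed main part and re-evaluates whether run length encoding pays off for the merged column.

```python
from itertools import groupby

class ToyColumn:
    def __init__(self):
        self.dictionary = []      # sorted distinct values of the main part
        self.value_ids = []       # dictionary-compressed main part
        self.delta = []           # recent, uncompressed inserts
        self.encoding = "none"    # no compression until data has been merged

    def insert(self, value):
        self.delta.append(value)  # inserts only touch the delta part

    def delta_merge(self):
        # Rebuild the main part from the old main part plus the delta.
        values = [self.dictionary[vid] for vid in self.value_ids] + self.delta
        self.dictionary = sorted(set(values))
        lookup = {value: vid for vid, value in enumerate(self.dictionary)}
        self.value_ids = [lookup[value] for value in values]
        self.delta = []
        # Re-evaluate compression on the merged data: keep RLE only if the
        # (value ID, run length) pairs are smaller than the plain ID array.
        runs = [(vid, len(list(run))) for vid, run in groupby(self.value_ids)]
        self.encoding = "RLE" if 2 * len(runs) < len(self.value_ids) else "dictionary only"

col = ToyColumn()
for status in ["OPEN"] * 500 + ["CLOSED"] * 500:
    col.insert(status)
col.delta_merge()
print(col.encoding)               # RLE -- long runs make it worthwhile
```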
Automatic compression optimization is controlled by the active parameter in the optimize_compression section of the indexserver.ini configuration file. This parameter must have the value yes.
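Based on the section and parameter named above, the relevant entry in indexserver.ini would look roughly like this (the surrounding file layout is assumed for illustration):

```
[optimize_compression]
active = yes
```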
Thanks,
Rupesh Chavan