I have enabled compression (pageSize=16384, diskPageCompression=ZSTD, diskPageCompressionLevel=18) but the partition files don't appear to be very compressed. I tested by adding approx 16000 data items to my cache and looking at the partition files on disk.
Example: part-96.bin is 339M in size. If I compress that file with zstd (default settings) it goes down to 106M.
Is it possible to do better than this with Ignite? I need to be able to store a lot of data.
Relevant parts of my ignite config:
<bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="consistentId" value=""/>
<property name="pageSize" value="16384"/>
<property name="name" value="activity-stream-data"/>
<property name="atomicityMode" value="ATOMIC"/>
<property name="diskPageCompression" value="ZSTD"/>
<property name="diskPageCompressionLevel" value="18"/>
<property name="backups" value="1"/>
Ignite compresses each page individually, so compressing the whole file will always give a better result than compressing page by page. Moreover, Ignite stores a page in compressed form only if compression shrinks it by at least one filesystem block. For example, with a 4 KB filesystem block size and a 16 KB page size, a page that compresses to 13 KB still occupies four blocks, so it is stored without compression.
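To make the block-rounding rule concrete, here is a minimal sketch of the arithmetic. It only illustrates the rule as described in this reply and is not Ignite's actual code; the function name `stored_size` is hypothetical.

```python
def stored_size(page_size, compressed_size, fs_block_size):
    """Bytes a page would occupy on disk under the rule described above.

    Hypothetical helper for illustration only, not part of Ignite.
    """
    # Round the compressed size up to a whole number of filesystem blocks.
    blocks = (compressed_size + fs_block_size - 1) // fs_block_size
    rounded = blocks * fs_block_size
    # Compression only pays off if at least one whole block is freed.
    return rounded if rounded < page_size else page_size


KB = 1024
# 13 KB rounds up to 16 KB: no block is freed, so the page stays uncompressed.
print(stored_size(16 * KB, 13 * KB, 4 * KB))
# 11 KB rounds up to 12 KB: one 4 KB block is freed.
print(stored_size(16 * KB, 11 * KB, 4 * KB))
```

Under this rule, disk savings always come in whole filesystem blocks.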
BTW, how do you check the file size? Ignite compression uses sparse files. "ls -l" reports the apparent (logical) file size and does not account for the "holes" in a sparse file. To see the real amount of disk space occupied by the file, use "du" or "ls -s".
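The same comparison can be scripted. The sketch below assumes a Unix-like system where `os.stat` exposes `st_blocks` in 512-byte units; the helper name `file_sizes` is hypothetical, and the partition file paths are passed in as arguments.

```python
import os
import sys


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    Assumes a Unix-like OS where st_blocks counts 512-byte units.
    """
    st = os.stat(path)
    apparent = st.st_size            # what "ls -l" shows
    allocated = st.st_blocks * 512   # what "du" and "ls -s" are based on
    return apparent, allocated


if __name__ == "__main__":
    # Usage: python sizes.py part-96.bin [more files...]
    for path in sys.argv[1:]:
        apparent, allocated = file_sizes(path)
        print(f"{path}: apparent={apparent} allocated={allocated}")
```

For a sparse partition file the allocated figure should be the smaller of the two.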
Aha! I didn't know about the sparse file thing. Thanks!
# ll -hs
159M -rw-r--r-- 1 ignite ignite 339M Nov 16 21:32 part-96.bin
So the real space used is only 159M. That's great. I currently have all of this data stored on the filesystem in csv.gz files, using 177M of space for the 16000 items I tested with.
Any other tips on how to reduce disk usage? Is there any point in using a compression level higher than 18 for ZSTD? Most of this data will only be written once, so I am not so concerned about write speed.
On Tue, Nov 17, 2020 at 9:34 AM Alex Plehanov <[hidden email]> wrote:
If you have a write-heavy workload, to reduce disk usage you can also compress WAL (see "WAL compaction" and "WAL page snapshots compression" features).
I'm not sure about ZSTD compression levels; you can try higher ones. But there is a warning in the ZSTD manual: "Levels >= 20 should be used with caution, as they require more memory". Perhaps someone who is more familiar with ZSTD can explain how higher compression levels affect resource consumption during decompression.
On Tue, Nov 17, 2020 at 11:00, David Tinker <[hidden email]> wrote: