How to confirm that disk compression is in effect?

38797715

How to confirm that disk compression is in effect?

Hi,

We turned on disk page compression to see how it affects execution time and disk space.

We expected that with compression enabled, CPU usage would go up but less disk space would be used. Because each write then carries more data, we also expected the overall execution time to drop when memory is insufficient.

However, neither execution time nor disk consumption changed significantly. We tested diskPageCompressionLevel values of 0, 10 and 17.

Our test setup is as follows:
The ignite-compress module has been added.

The Ignite configuration is:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="consistentId" value="b"/>
        <property name="igniteInstanceName" value="ClusterName1"/>
        <property name="workDirectory" value="/home/ignite"/>
        <property name="gridLogger">
            <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
                <constructor-arg type="java.lang.String" value="config/ignite-log4j2.xml"/>
            </bean>
        </property>
        <property name="cacheConfiguration">
            <list>
                <!-- Cache template with disk page compression enabled. -->
                <bean id="partitioned-cache-template" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration">
                    <property name="name" value="cache-partitioned*"/>
                    <property name="cacheMode" value="PARTITIONED"/>
                    <property name="queryParallelism" value="2"/>
                    <property name="diskPageCompression" value="LZ4"/>
                    <property name="diskPageCompressionLevel" value="17"/>
                </bean>
            </list>
        </property>
        <!-- Enabling Apache Ignite Persistent Store. -->
        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <property name="pageSize" value="#{4096 * 2}"/>
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="persistenceEnabled" value="true"/>
                        <property name="maxSize" value="#{1L * 1024 * 1024 * 1024}"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>
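
A note on measuring the effect: Ignite's disk page compression is documented as relying on sparse files, where space saved inside a page is released by punching holes in the partition file. If that applies to this setup, the file length shown by `ls -l` would not shrink and only the allocated block count (what `du` reports) would. Below is a minimal sketch for comparing the two numbers, assuming a POSIX system where `st_blocks` is counted in 512-byte units. The function name `disk_usage` is made up for this example, and the directory to scan is whatever storage directory the node actually uses.

```python
import os


def disk_usage(root):
    """Return (apparent_bytes, allocated_bytes) summed over all files under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            # Logical file length, the number shown by `ls -l`.
            apparent += st.st_size
            # Blocks actually allocated; POSIX counts st_blocks in 512-byte units.
            allocated += st.st_blocks * 512
    return apparent, allocated
```

Running it against the same storage directory after an identical import with compression off and on, and comparing the allocated totals rather than the apparent ones, should show whether any space is really being released.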
Mikhail

Re: How to confirm that disk compression is in effect?

Could you please share your benchmark code? I believe compression depends on the data you write: if it is fully random, it is difficult to compress.
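
To illustrate the point about data dependence, here is a small sketch that compares how well one page-sized buffer of random bytes and one of repetitive bytes compress. It uses `zlib` from the Python standard library as a stand-in, not the LZ4 codec Ignite is configured with here, so the absolute numbers will differ; only the contrast matters. The sample payloads are invented for the example.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher means more compressible)."""
    return len(data) / len(zlib.compress(data))


rng = random.Random(0)  # fixed seed so the run is repeatable
page = 8192             # one page worth of bytes, matching pageSize=8192

# Uniformly random bytes: essentially incompressible.
random_page = bytes(rng.getrandbits(8) for _ in range(page))
# A short pattern repeated many times: highly compressible.
repetitive_page = b"0000123.450000;" * (page // 15)

print(f"random:     {compression_ratio(random_page):.2f}x")
print(f"repetitive: {compression_ratio(repetitive_page):.2f}x")
```

Feeding a sample of the real rows through the same function would give a rough idea of where the production data sits between these two extremes.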

38797715

Re: How to confirm that disk compression is in effect?

Hi,

The CREATE TABLE statement is as follows:

CREATE TABLE PI_COM_DAY (
    COM_ID VARCHAR(30) NOT NULL,
    ITEM_ID VARCHAR(30) NOT NULL,
    DATE1 VARCHAR(8) NOT NULL,
    KIND VARCHAR(1),
    QTY_IOD DECIMAL(18, 6),
    AMT_IOD DECIMAL(18, 6),
    QTY_PURCH DECIMAL(18, 6),
    AMT_PURCH DECIMAL(18, 6),
    QTY_SOLD DECIMAL(18, 6),
    AMT_SOLD DECIMAL(18, 6),
    AMT_SOLD_NO_TAX DECIMAL(18, 6),
    QTY_PROFIT DECIMAL(18, 6),
    AMT_PROFIT DECIMAL(18, 6),
    QTY_LOSS DECIMAL(18, 6),
    AMT_LOSS DECIMAL(18, 6),
    QTY_EOD DECIMAL(18, 6),
    AMT_EOD DECIMAL(18, 6),
    UNIT_COST DECIMAL(18, 8),
    SUMCOST_SOLD DECIMAL(18, 6),
    GROSS_PROFIT DECIMAL(18, 6),
    QTY_ALLOCATION DECIMAL(18, 6),
    AMT_ALLOCATION DECIMAL(18, 2),
    AMT_ALLOCATION_NO_TAX DECIMAL(18, 2),
    GROSS_PROFIT_ALLOCATION DECIMAL(18, 6),
    SUMCOST_SOLD_ALLOCATION DECIMAL(18, 6),
    PRIMARY KEY (COM_ID, ITEM_ID, DATE1)
) WITH "template=cache-partitioned,CACHE_NAME=PI_COM_DAY";
CREATE INDEX IDX_PI_COM_DAY_ITEM_DATE ON PI_COM_DAY (ITEM_ID, DATE1);

I don't think there's anything special about it.
Then we imported 10 million rows using the COPY command. The data is essentially real production data, so I think its variability is realistic; it is not artificial data with high similarity.
Are there any published test results for disk compression? Most other in-memory databases also offer data compression, but it does not seem to have any effect here. Or am I doing something wrong?

ilya.kasnacheev

Re: How to confirm that disk compression is in effect?

Hello!

Did you add the `ignite-compress` module to your classpath?


Regards,
--
Ilya Kasnacheev


38797715

Re: How to confirm that disk compression is in effect?

Hi Ilya,

This module has already been imported.

We re-tested three scenarios:

1. pageSize=4096

2. pageSize=8192

3. pageSize=8192, with disk compression and WAL compression both enabled.

In the test results, the pageSize=4096 scenario writes slightly faster and uses slightly less disk space, but the difference is less than 10%.

The two pageSize=8192 scenarios show no big difference in write speed or disk space usage. However, each WAL file is always 64 MB, so it is not clear whether a file holds more data when compression is on.

My test environment is:

A notebook computer (8 GB RAM, 256 GB SSD) running Apache Ignite 2.8.1; the COPY command is used to import 6M rows.
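
On the 64M observation: if WAL segments are preallocated at a fixed size (the `walSegmentSize` setting of `DataStorageConfiguration`, whose default should be 64 MB), the size of a single file says nothing about how much it holds. One rough way to compare runs is to count how many segments the same import leaves behind in the WAL archive. This is a minimal sketch, assuming archived segments carry a `.wal` suffix and are not cleaned up while the import runs; the function name and the directory argument are hypothetical.

```python
import os


def count_segments(wal_dir: str, suffix: str = ".wal"):
    """Return (segment_count, total_bytes) for regular files in wal_dir ending with suffix."""
    count = 0
    total = 0
    for entry in os.scandir(wal_dir):
        if entry.is_file() and entry.name.endswith(suffix):
            count += 1
            total += entry.stat().st_size
    return count, total
```

Under those assumptions, fewer segments for the same import with WAL compression enabled would suggest that each segment does hold more records.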

ilya.kasnacheev

Re: How to confirm that disk compression is in effect?

Hello!

If your data does not compress at least 2x, then pageSize=8192 is useless. Frankly speaking, I've never seen a deployment that benefited from page compression. I recommend turning it off and keeping WAL compression only.

Regards,
-- 
Ilya Kasnacheev
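
The 2x threshold can be worked through with a little arithmetic. The sketch below assumes a file-system block size of 4096 bytes and that space saved inside a page can only be released in whole blocks; both are assumptions about the environment, not values taken from this thread. The function name `stored_bytes` is made up for the example.

```python
import math


def stored_bytes(compressed_bytes: int, fs_block: int = 4096) -> int:
    """Bytes a page occupies on disk when space is released only in whole blocks."""
    return max(1, math.ceil(compressed_bytes / fs_block)) * fs_block


page_size = 8192
for ratio in (1.2, 1.9, 2.0, 4.0):
    # Size of the page after compression at the given ratio.
    compressed = math.ceil(page_size / ratio)
    on_disk = stored_bytes(compressed)
    print(f"ratio {ratio:.1f}x: {compressed} compressed bytes -> {on_disk} bytes on disk")
```

With these assumptions an 8192-byte page still takes two blocks until it compresses to 4096 bytes or less, so anything short of 2x saves nothing, and better ratios cannot save more than half unless the page size is larger.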


38797715

Re: How to confirm that disk compression is in effect?

Hi,

I tested the following scenario, but it did not seem to improve anything:

pageSize=4096, WAL compression enabled, COPY command import of 6M rows

I have looked at the following discussion and performance test results, which suggest a 2x-4x throughput improvement.

https://issues.apache.org/jira/browse/IGNITE-11336

http://apache-ignite-developers.2346864.n4.nabble.com/Disk-page-compression-for-Ignite-persistent-store-td38009.html

As I understand it, the execution time of the COPY command should drop substantially, but it does not. Why?

Alex Plehanov

Re: How to confirm that disk compression is in effect?

Hello.

Actually, the performance test results attached to IGNITE-11336 are not correct; I forgot to delete them from the ticket. The environment was not tuned correctly, so the runs without WAL compression hit checkpoints too often, which led to bad results for the "compdisabled" runs. I benchmarked it again later, and there is still a performance boost of about 30% in our synthetic tests in our environment. Also, the benchmarks were based on an early 2.8 Ignite version; later in 2.8, optimizations were introduced that reduced the number of page snapshots in the WAL. So you can currently still get a performance boost of several percent (depending on your data and your environment) from WAL page snapshot compression, but don't expect 2x or more.
 
