数据库 PostgreSQL

postgresql无序uuid性能测试及对数据库的影响

Posted in PostgreSQL onJune 11, 2021

无序uuid对数据库的影响

由于最近在做超大表的性能测试，在该过程中发现了无序uuid做主键对表插入性能有一定影响。结合实际情况发现当表的数据量越大，对表插入性能的影响也就越大。

测试环境

PostgreSQL创建插入脚本，测试各种情况的tps。

数据库版本：PostgreSQL 10.4 (ArteryBase 5.0.0, Thunisoft)

操作系统配置：CentOS Linux release 7 ，32GB内存，8 cpu

测试参数：pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb

空表，1000w数据，5000w数据，一亿数据的各种主键测试。

测试无序的uuid，有序的uuid，序列，有普通btree，有唯一索引和没有主键的情况

测试

1.创建表

--无序的uuid
pgbenchdb=# create table test_uuid_v4(id char(32) primary key);
CREATE TABLE
--有序的uuid
pgbenchdb=# create table test_time_nextval(id char(32) primary key);
CREATE TABLE
--递增序列
pgbenchdb=# create table test_seq_bigint(id int8 primary key);
CREATE TABLE
--创建序列
 create sequence test_seq start with 1 ;

2.测试脚本

--测试无序uuid脚本
vi pgbench_uuid_v4.sql
insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
--测试有序uuid脚本
vi pgbench_time_nextval.sql
insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-',''));
--测试序列脚本
vi pgbench_seq_bigint.sql
insert into test_seq_bigint (id) values (nextval('test_seq'::regclass));

无序uuid,无数据情况

磁盘使用情况
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.76    0.00    0.38    4.67    0.00   94.19

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00   96.00     0.00  2048.00    42.67     1.02   10.67    0.00   10.67  10.33  99.20
dm-0              0.00     0.00    0.00   96.00     0.00  2048.00    42.67     1.02   10.66    0.00   10.66  10.32  99.10
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

tps：
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 53494
latency average = 8.974 ms
tps = 891.495404 (including connections establishing)
tps = 891.588967 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         9.006  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));

无数据情况下，tps

类别     |  第一次  | 第二次  | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 无序uuid		  | 919  	| 907     |  891  |   906     | 99.2% | 10.66   
 有序uuid    	  | 985  	| 882     |  932  |   933     | 98.7% | 4.4
 序列    	      | 1311     | 1277    |  1280 |  1289     | 97.5% | 3.4

向表里面初始化100w数据

pgbenchdb=# insert into test_uuid_v4 (id) select  replace(uuid_generate_v4()::text,'-','') from generate_series(1,1000000);
INSERT 0 1000000
Time: 43389.817 ms (00:43.390)
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,1000000);
INSERT 0 1000000
Time: 30585.134 ms (00:30.585)
pgbenchdb=#  insert into test_seq_bigint select generate_series (1,1000000);
INSERT 0 1000000
Time: 9818.639 ms (00:09.819)
无序uuid插入100w需要43s，有序需要30s，序列需要10s。

插入一百万数据后的tps

类别     |  第一次  | 第二次  | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 无序uuid		  | 355  	| 440     |  302  |   365     | 98.8% | 13   
 有序uuid    	  | 948  	| 964     |  870  |   927     | 97.2% | 4.0
 序列    	      | 1159     | 1234    |  1115 |  1169     | 96.6% | 3.5

插入一千万数据后的tps

类别     |  第一次  | 第二次  | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 无序uuid		  | 260  	| 292     |  227  |   260     | 99.2% | 16.8   
 有序uuid    	  | 817  	| 960     |  883  |   870     | 97.7% | 3.9
 序列       	   | 1305     | 1261    |  1270 |  1278     | 96.8% | 3.0

插入五千万数据后

向表中插入5kw数据，并且添加主键
pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,50000000);
INSERT 0 50000000
Time: 453985.318 ms (07:33.985)
pgbenchdb=# insert into test_seq_bigint select generate_series (1,50000000);
INSERT 0 50000000
Time: 352206.160 ms (05:52.206)
pgbenchdb=# insert into test_uuid_v4 (id) select  replace(uuid_generate_v4()::text,'-','') from generate_series(1,50000000);
INSERT 0 50000000
Time: 1159689.338 ms (00:19:19.689)

在无主键情况下，插入五千万数据，有序uuid耗时7分钟，序列耗时6分钟，而无序uuid耗时接近20分钟。

pgbenchdb=# alter table test_uuid_v4 add primary key ("id");
ALTER TABLE
Time: 845199.296 ms (14:05.199)
pgbenchdb=# alter table test_time_nextval add primary key ("id");
ALTER TABLE
Time: 932151.103 ms (15:32.151)
pgbenchdb=# alter table test_seq_bigint add primary key ("id");
ALTER TABLE
Time: 148138.871 ms (02:28.139)

pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_uuid_v4'));
 pg_size_pretty 
----------------
 6072 MB
(1 row)

Time: 0.861 ms
pgbenchdb=#  select pg_size_pretty(pg_total_relation_size('test_time_nextval'));
 pg_size_pretty 
----------------
 6072 MB
(1 row)

Time: 0.942 ms
pgbenchdb=#  select pg_size_pretty(pg_total_relation_size('test_seq_bigint'));
 pg_size_pretty 
----------------
 2800 MB
(1 row)

Time: 0.699 ms

插入5kw后

类别     |  第一次  | 第二次  | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 无序uuid		  | 162  	| 163     |  163  |   163     | 99.6% | 18.4   
 有序uuid    	  | 738  	| 933     |  979  |   883     | 97.7% | 3.9
 序列         	 | 1132     | 1264    |  1265 |  1220     | 96.8% | 3.5

插入1亿条数据后

类别     |  第一次  | 第二次  | 第三次 | 平均值(tps) |%util |await
---------------+---------+---------+---------+---------+-------+-------
 无序uuid		  | 121  	| 131     |  143  |   131     | 99.6% | 28.2   
 有序uuid    	  | 819  	| 795     |  888  |   834     | 99.2% | 28.7
 序列      	    | 1193     | 1115    |  1109 |  1139     | 96.8% | 11.3

普通btree索引

上面测了无序uuid，1kw情况下，有主键的tps是260，无主键的tps是1234。尝试测试普通的索引，和唯一索引tps

--创建普通索引
pgbenchdb=# create index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 316367.010 ms (05:16.367)
--创建普通索引后
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13308
latency average = 36.080 ms
tps = 221.727391 (including connections establishing)
tps = 221.749660 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
        38.512  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));
--创建唯一索引
pgbenchdb=# drop index i_test_uuid_v4_id;
DROP INDEX
Time: 267.451 ms
pgbenchdb=# create unique index i_test_uuid_v4_id on test_uuid_v4(id);
CREATE INDEX
Time: 153372.622 ms (02:33.373)
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
^[[3~transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 13847
latency average = 34.693 ms
tps = 230.593988 (including connections establishing)
tps = 230.620469 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
        36.410  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));

无论是普通btree索引和唯一索引，都会影响插入的效率。

删除所有的主键索引

--删除所有主键
alter table test_uuid_v4 drop constraint "test_uuid_v4_pkey";
alter table test_time_nextval drop constraint "test_time_nextval_pkey" ;
alter table test_seq_bigint drop constraint "test_seq_bigint_pkey";

1,--无序uuid：测试pgbench_uuid_v4.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_uuid_v4.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74109
latency average = 6.479 ms
tps = 1234.842229 (including connections establishing)
tps = 1235.042674 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         6.112  insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-',''));

2、--有序uuid，测试pgbench_time_nextval.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_time_nextval.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_time_nextval.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 74027
latency average = 6.486 ms
tps = 1233.364360 (including connections establishing)
tps = 1233.482292 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         6.186  insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-',''));
3、--序列，测试pgbench_seq_bigint.sql
[thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_seq_bigint.sql -U sa pgbenchdb 
transaction type: /opt/thunisoft/pgbench_seq_bigint.sql
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 60 s
number of transactions actually processed: 76312
latency average = 6.290 ms
tps = 1271.832907 (including connections establishing)
tps = 1272.124397 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         5.916  insert into test_seq_bigint (id) values (nextval('test_seq'::regclass));

删除主键约束后，三种情况下tps非常接近，都达到了1200+。

Btree索引，插入操作的平均tps对比

类别/平均tps    |  无数据  | 一千万  | 五千万 | 一亿 		|
---------------+---------+---------+---------+---------+
 无序uuid		  | 960  	| 260     |  163  |   131     |
 有序uuid    	  | 933  	| 870     |  883  |   834     |
 序列        	  | 1289     | 1278    |  1220 |  1139     |

根据测试数据可以看出无序的uuid在数据到达1kw后插入数据的tps下降的非常厉害，而有序的uuid和递增序列下降的比较少。到一亿数据的tps有序uuid是无序的6倍，序列是无序uuid的9倍。

创建单独的表空间用来存储索引信息

如果有多快磁盘那么可以将索引和数据分开存储，以此来加快写入的速度。

创建单独的索引空间：

create tablespace indx_test owner sa location '/home/tablespace/index_test';

指定索引存储目录：

create index i_test_uuid_v4_id on test_uuid_v4 using btree(id) tablespace indx_test;

关于有序uuid

测试使用的sequential-uuids插件，生成的有序uuid。

有序uuid的结构为(block ID; random data)，实际上就是把数据拆成两部分，一部分自增，一部分随机。

sequential-uuids

sequential-uuids-git

提供了两种算法：

1.uuid_sequence_nextval(sequence regclass, block_size int default 65536, block_count int default 65536)

前缀为自增序列，如果块ID使用2字节存储，一个索引BLOCK里面可以存储256条记录（假设8K的BLOCK，一条记录包括uuid VALUE（16字节）以及ctid（6字节），所以一个索引页约存储363条记录（8000 /（16 + 6）））

2.uuid_time_nextval(interval_length int default 60, interval_count int default 65536) RETURNS uuid

默认每60秒内的数据的前缀是一样的，前缀递增1，到65535后循环。

使用uuid_time_nextval生成的有序uuid
pgbenchdb=# select id from test_time_nextval;
                id                
----------------------------------
 a18b7dd0ca92b0b5c1844a402f9c6999
 a18b540b8bbe0ddb2b6d0189b2e393c6
 a18b83eb7320b0a90e625185421e065e
 a18bade4ff15e05dab81ecd3f4c2dee4
 a18b79e41c3bc8d2d4ba4b70447e6b29
 a18bdad18d9e0d2fa1d9d675bc7129f0
 a18b13723ec7be9a2f1a3aec5345a88b
 a18bd9d866047aec69a064d30e9493d2
 a18bd76e8c787c7464479502f381e6d7
 a18ba5c0c966f81cfdbeff866618da8d
......

有序uuid前四位有序，后面的随机生成。

结语

1.关于有序的uuid，前4位是有序的，后面都是随机生成的。

2.在该环境中发现，无序uuid随着数据量的不断增大，tps下滑比较厉害。

3.由于btree索引的存在，无序的uuid会导致大量的离散io。导致磁盘使用率高。进而影响插入效率。随着表数据量的增大更加明显。

4.该测试是在普通的磁盘上面测试，并未在ssd上面测试。

5.如果要使用有序uuid，有多种实现方式，还需要考虑分布式情况下生成全局有序uuid。

以上就是postgresql无序uuid性能测试的详细内容，更多关于postgresql无序uuid性能测试的资料请关注三水点靠木其它相关文章！

postgresql无序uuid性能测试及对数据库的影响

- Author -

月图灵

声明：登载此文出于传递更多信息之目的，并不意味着赞同其观点或证实其描述。

PostgreSQL 相关文章推荐

Centos环境下Postgresql 安装配置及环境变量配置技巧

May 18 PostgreSQL

如何使用PostgreSQL进行中文全文检索

May 27 PostgreSQL

postgres之jsonb属性的使用操作

Jun 23 PostgreSQL

浅谈PostgreSQL表分区的三种方式

Jun 29 PostgreSQL

postgresql使用filter进行多维度聚合的解决方法

Jul 16 PostgreSQL

基于PostgreSQL/openGauss 的分布式数据库解决方案

Dec 06 PostgreSQL

PostgreSQL数据库创建并使用视图以及子查询

Apr 11 PostgreSQL

PostgreSQL 插入INSERT、删除DELETE、更新UPDATE、事务transaction

Apr 12 PostgreSQL

PostgreSQL数据库去除重复数据和运算符的基本查询操作

Apr 12 PostgreSQL

postgreSQL数据库基础知识介绍

Apr 12 PostgreSQL

PostgreSQL怎么创建分区表详解

Jun 25 PostgreSQL

PostgreSQL存储过程实用脚本(二)：创建函数入门

PostgreSQL将数据加载到buffer cache中操作方法

Apr 16 #PostgreSQL

自定义函数实现单词排序并运用于PostgreSQL(实现代码)

Centos环境下Postgresql 安装配置及环境变量配置技巧

PostgreSQL通过oracle_fdw访问Oracle数据的实现步骤

如何使用PostgreSQL进行中文全文检索

May 27 #PostgreSQL

postgres之jsonb属性的使用操作

Jun 23 #PostgreSQL