Installation reference:
https://my.oschina.net/hackenhu/blog/4987911
I. Installing and deploying the components
1. Download the package from the official site: https://pingcap.com/download-cn/community/
After copying the offline package to the control machine of the target cluster, install the TiUP component with:
tar xzvf tidb-community-server-${version}-linux-amd64.tar.gz && \
sh tidb-community-server-${version}-linux-amd64/local_install.sh && \
source /home/tidb/.bash_profile
2. Install tiup online
1) curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
2) source ~/.bashrc
which tiup
3) Log in again as the tidb user
3. Set up passwordless SSH login
3.1 Create an ordinary user
# create the user
adduser tidb
# set the password (e.g. tidb2021)
passwd tidb
Generate a key pair and authorize it locally (the key file name below assumes a custom -f name; ssh-keygen produces id_rsa.pub by default):
ssh-keygen -t rsa
cat /home/tidb/.ssh/admin_rsa.pub >> /home/tidb/.ssh/authorized_keys
chmod 600 /home/tidb/.ssh/authorized_keys
Note: each host must also be able to SSH to itself without a password.
For remote hosts, copy the public key over and append it there:
scp -r tidb01_rsa.pub tidb@cnhuam0dp02:/home/tidb/.ssh/
cat /home/tidb/.ssh/tidb01_rsa.pub >> /home/tidb/.ssh/authorized_keys
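The per-host key copying above can be scripted. A minimal sketch (the host list is a hypothetical example for illustration) that only prints the ssh-copy-id commands so they can be reviewed first; pipe the output to `sh` to apply:

```shell
#!/bin/bash
# Hypothetical host list; replace with the real cluster machines.
HOSTS="cnhuam0dp02 cnhuam0dp03"

# Print (rather than run) one ssh-copy-id command per host.
# ssh-copy-id appends the key to the remote authorized_keys and fixes permissions.
push_key_cmds() {
  local h
  for h in $HOSTS; do
    printf 'ssh-copy-id -i %s tidb@%s\n' "$HOME/.ssh/id_rsa.pub" "$h"
  done
}
push_key_cmds
```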
4. Create the deployment and data directories
mkdir -p /home/tidb/tidb-deploy /home/tidb/tidb-data
5. Unpack and run the installer
tar -zxvf tidb-community-server-v5.0.2-linux-amd64.tar.gz
sh tidb-community-server-v5.0.2-linux-amd64/local_install.sh
source /home/tidb/.bash_profile
6. Edit the deployment topology file
vi topology.yaml
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/home/tidb/tidb-deploy"
  data_dir: "/home/tidb/tidb-data"
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
server_configs:
  tidb:
    log.slow-threshold: 300
  tikv:
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    replication.enable-placement-rules: true
    replication.location-labels: ["host"]
  tiflash:
    logger.level: "info"
pd_servers:
  - host: admin.dt.com
tidb_servers:
  - host: admin.dt.com
tikv_servers:
  - host: master.dt.com
    port: 20160
    status_port: 20180
    config:
      server.labels: { host: "logic-host-1" }
  - host: master.dt.com
    port: 20161
    status_port: 20181
    config:
      server.labels: { host: "logic-host-2" }
  - host: master.dt.com
    port: 20162
    status_port: 20182
    config:
      server.labels: { host: "logic-host-3" }
tiflash_servers:
  - host: admin.dt.com
monitoring_servers:
  - host: admin.dt.com
grafana_servers:
  - host: master.dt.com
7. Start the deployment
Download and install tiup:
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
curl --proto '=https' --tlsv1.2 -sSf http://172.26.160.70:53/install.sh | sh
source .bash_profile
tiup install tikv:v6.1.0
Grant sudo privileges to the tidb user:
vi /etc/sudoers
Add: tidb ALL = NOPASSWD: ALL
Save with :wq!
Check: tiup cluster check ./topology.yaml --user tidb [-p] [-i /home/tidb/.ssh/id_rsa]
Deploy: tiup cluster deploy tidb-jabil v6.0.0 ./topology.yaml --user tidb
First start (initializes the root password): tiup cluster start tidb-jabil --init
Start: tiup cluster start tidb-jabil
8. Start/stop commands (run from /home/tidb)
tiup cluster start tidb-jabil
tiup cluster restart tidb-jabil
tiup cluster stop tidb-jabil
9. Verify the installation
TiDB's SQL port is 4000 and PD's admin port is 2379. Connect with a tool such as Navicat on port 4000; the default password is empty.
Dashboard: http://admin.dt.com:2379/dashboard
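A small smoke test after startup can confirm that the SQL port answers. The host and the empty root password mirror the note above and may differ on your cluster:

```shell
#!/bin/bash
# Probe the TiDB SQL port; prints "tidb: OK" or "tidb: FAIL".
check_tidb() {
  if mysql -h admin.dt.com -P 4000 -u root -e 'select tidb_version();' >/dev/null 2>&1; then
    echo "tidb: OK"
  else
    echo "tidb: FAIL"
  fi
}
check_tidb
```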
10. Web-interface mode for data migration
/home/tidb/tidb-community-toolkit-v5.0.1-linux-amd64/bin/tidb-lightning --server-mode --status-addr :8289
11. Add TiCDC
tiup cluster scale-out tidb-test scale-out.yaml
List the clusters:
tiup cluster list
Show the components:
tiup cluster display tidb-jabil
12. Uninstall TiDB
1) tiup cluster list               # list the clusters
2) tiup cluster destroy tidb-jabil # destroy the cluster
13. Configure the user password
set password for 'root'@'%' = password('jabil@2022');
flush privileges;
14. CDC task operations (run from /home/tidb)
Replicate from TiDB to MySQL:
tiup ctl:v5.0.2 cdc changefeed create --pd=http://admin.dt.com:2379 --sink-uri="mysql://root:root@admin.dt.com:3306/test?max-txn-row=500" --config /home/tidb/moveDataConf/test_to_mysql.toml --changefeed-id="test-to-202-mysql"
tiup ctl:v5.0.2 cdc changefeed query -s --pd=http://admin.dt.com:2379 --changefeed-id="test-to-201-mysql"
List the replication tasks:
tiup ctl:v5.0.2 cdc changefeed list --pd=http://admin.dt.com:2379
List the captures:
tiup ctl:v5.0.2 cdc capture list --pd=http://admin.dt.com:2379
Pause a task:
tiup ctl:v5.0.2 cdc changefeed pause --pd=http://admin.dt.com:2379 --changefeed-id test-to-201-mysql
Remove a task:
tiup ctl:v5.0.2 cdc changefeed remove --pd=http://admin.dt.com:2379 --changefeed-id test-to-201-mysql
15. Run a cross-database join script against TiDB
export MYSQL_PWD=root && mysql -uroot -h admin.dt.com -P4000 < 1.sql
Contents of 1.sql: select * from bdweb_db.bi_users b,test.bi_users t where t.u_id=b.u_id
16. Use DM for data migration; install DM
tiup install dm dmctl
List DM clusters: tiup dm list
tiup uninstall dm-test
Edit the init config file topology-dm.yaml; get the version number with: tiup list dm-master
tiup dm deploy dm-test v2.0.3 /home/tidb/topology-dm.yaml --user tidb [-p] [-i /home/tidb/.ssh/id_rsa]
Start: tiup dm start dm-test
Show DM roles: tiup dm display dm-test
Destroy the DM deployment: tiup dm destroy dm-test
tiup cluster reload
Run the migration:
1) Load the data source:
tiup dmctl --master-addr 192.168.0.20:8261 operate-source create source-mysql.yaml
2) Check the task:
tiup dmctl --master-addr 192.168.0.20:8261 check-task test-task.yaml
3) Start the task:
tiup dmctl --master-addr 192.168.0.20:8261 start-task test-task.yaml
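The three dmctl steps above can be chained. This sketch only prints the commands in order (master address and yaml file names are the ones used above and may differ in your environment):

```shell
#!/bin/bash
MASTER_ADDR="192.168.0.20:8261"

# Emit the three dmctl invocations in order: create source, check task, start task.
dm_migration_cmds() {
  local step
  for step in \
    "operate-source create source-mysql.yaml" \
    "check-task test-task.yaml" \
    "start-task test-task.yaml"
  do
    echo "tiup dmctl --master-addr $MASTER_ADDR $step"
  done
}
dm_migration_cmds
```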
17. Monitoring
http://slave.test.com:3001/
admin/123456
18. Run TiDB SQL scripts
mysql -h master.test.com -P 4000 -uroot -proot test < /opt/mysql/sql/del_resu.sql >> /opt/log/sql.out
mysql -h master.test.com -P 4000 -uroot -proot test -e " select dt,max(pv) from pvuv_sink GROUP BY dt order by dt desc limit 10 " >> /opt/log/sql.out
19. Data-node (store) operations
tiup ctl:v6.0.0 pd -u admin.test.com:2379 -i
Delete a store:
store delete 1
20. Start or stop a single role
tiup cluster start tidb-jabil -R tikv
tiup cluster start tidb-jabil -R pd
tiup cluster stop tidb-jabil -R cdc
21. Reload nodes
tiup dm reload
22. Search the logs
cat /home/tidb/tidb-bin/pd-2379/log/pd.log | grep "init cluster id"
cat /home/tidb/tidb-deploy/pd-2379/log/pd.log | grep "init cluster id"
23. Disable start on boot
tiup cluster disable tidb-jabil
24. Offline deployment of v6.1.0
Mind the directory permissions:
chmod -R 777 /df_bigdata_cluster/software/tidb
chmod -R 777 /df_bigdata_cluster/data/tidb
Mirror one (server package):
tar xzvf tidb-community-server-v6.1.0-linux-amd64.tar.gz && \
sh tidb-community-server-v6.1.0-linux-amd64/local_install.sh && \
source /home/tidb/.bash_profile
Mirror two (toolkit package):
tar xf tidb-community-toolkit-v6.1.0-linux-amd64.tar.gz
ls -ld tidb-community-server-v6.1.0-linux-amd64 tidb-community-toolkit-v6.1.0-linux-amd64
cd tidb-community-server-v6.1.0-linux-amd64/
cp -rp keys ~/.tiup/
tiup mirror merge ../tidb-community-toolkit-v6.1.0-linux-amd64
Verify:
tiup cluster check ./topology.yaml --user tidb
Deploy (takes about 5 minutes):
tiup cluster deploy tidb-jabil v6.1.0 ./topology.yaml --user tidb
tiup cluster deploy tidb-JBDP v6.1.0 ./topology.yaml --user tidb
Start:
First-time initialization: tiup cluster start tidb-JBDP --init
Normal start: tiup cluster start tidb-JBDP
Production-cluster commands:
tiup cluster restart tidb-JBDP
tiup cluster display tidb-JBDP
tiup cluster stop tidb-JBDP
tiup cluster destroy tidb-JBDP   # destroy the cluster
Prepare the data directories:
mkdir -p /data6/tidb/data
chown -R tidb:tidb /data6/tidb
mkdir -p /data8/tidb/data
chown -R tidb:tidb /data8/tidb
Sample output of the first start:
Started cluster `tidb-JBDP` successfully
The root password of TiDB database has been changed.
The new password is: 'r8=%UT1#2D0W-G7YJ4'.
Copy and record it to somewhere safe, it is only displayed once, and will not be stored.
The generated password can NOT be get and shown again.
25. Firewall design
Ports that need to be open:
grafana     3001
pd          2379/2380
prometheus  9090/12020
tidb        4000/10080
tiflash     8889/8123/3930/20170/20292/8234
tikv        20160/20180
#pd
firewall-cmd --permanent --add-port=2379/tcp
firewall-cmd --permanent --add-port=2380/tcp
#tidb
firewall-cmd --permanent --add-port=4000/tcp
firewall-cmd --permanent --add-port=10080/tcp
#tikv
firewall-cmd --permanent --add-port=20160/tcp
firewall-cmd --permanent --add-port=20180/tcp
#tiflash
firewall-cmd --permanent --add-port=8889/tcp
firewall-cmd --permanent --add-port=8123/tcp
firewall-cmd --permanent --add-port=3930/tcp
firewall-cmd --permanent --add-port=20170/tcp
firewall-cmd --permanent --add-port=20292/tcp
firewall-cmd --permanent --add-port=8234/tcp
#grafana
firewall-cmd --permanent --add-port=3001/tcp
#prometheus
firewall-cmd --permanent --add-port=9090/tcp
firewall-cmd --permanent --add-port=12020/tcp
firewall-cmd --reload
firewall-cmd --list-port
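The individual firewall-cmd calls above can be generated from the port table. A sketch that prints them for review (pipe the output to `sh` to apply):

```shell
#!/bin/bash
# Ports from the table above: pd, tidb, tikv, tiflash, grafana, prometheus.
PORTS="2379 2380 4000 10080 20160 20180 8889 8123 3930 20170 20292 8234 3001 9090 12020"

# Print one firewall-cmd line per port, plus the final reload.
open_port_cmds() {
  local p
  for p in $PORTS; do
    echo "firewall-cmd --permanent --add-port=${p}/tcp"
  done
  echo "firewall-cmd --reload"
}
open_port_cmds
```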
Or stop the firewall entirely first:
service firewalld stop
service firewalld start
26. Dumpling, the data export/backup tool
Reference: https://www.modb.pro/db/331066
Install:
tiup install dumpling
Run:
tiup dumpling ...
Export a query result as CSV:
tiup dumpling \
  -u root \
  -p jabil@2022 \
  -P 4000 \
  -h cnhuam0testg84 \
  --filetype csv \
  --sql 'select * from test.aps_rt_sink' \
  -t 8 \
  -o /df_bigdata_cluster/data/tidb/test \
  -r 200000 \
  -F 256MiB
Export a whole database as SQL:
tiup dumpling \
  -u root \
  -p jabil@2022 \
  -P 4000 \
  -h cnhuam0testg84 \
  --filetype sql \
  -t 8 \
  -o /df_bigdata_cluster/data/tidb/test \
  -r 200000 \
  --filter "test.*" \
  -F 256MiB
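A wrapper can assemble the per-database export command with a date-stamped output directory. Everything here (host, user, base path) mirrors the examples above but is a placeholder, and the command is printed rather than executed:

```shell
#!/bin/bash
# Build a dumpling command for one database; echo it so it can be reviewed
# (or eval'd) before running.
dump_cmd() {
  local db="$1"
  local outdir="/df_bigdata_cluster/data/tidb/bak/$(date +%Y%m%d)/${db}"
  echo "tiup dumpling -u root -P 4000 -h cnhuam0testg84 --filetype sql -t 8" \
       "-o ${outdir} -r 200000 --filter \"${db}.*\" -F 256MiB"
}
dump_cmd test
```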
27. Installing tidb-lightning and dumpling
tiup install tidb-lightning
chmod +x tidb-lightning
tiup tidb-lightning -h
Import data: nohup tiup tidb-lightning -config eam-lightning.toml > nohup.out 2>&1 &
nohup /data13/tidb/deploy/tidb-community-toolkit-v6.1.0-linux-amd64/tidb-lightning -config /data13/tidb/lighting/import-conf/whirlpool-lightning.toml > /data13/tidb/tmp/nohup.out &
scp -r /data13/tidb/deploy/duming-data/esss tidb@10.114.19.66:/df_bigdata_cluster/software/tidb/tmp-data
scp -r /data13/tidb/duming-data/sbs tidb@10.114.19.66:/df_bigdata_cluster/software/tidb/tmp-data
tiup install dumpling
Export data:
/home/tidb/.tiup/bin/tiup dumpling -u root -p jabil@2022 -P 4000 -h cnhuam0testg84 --filetype sql -t 8 -o /df_bigdata_cluster/data/tidb/bakData/whirlpool -r 200000 --filter "whirlpool_db.*" -F 256MiB >>/df_bigdata_cluster/data/tidb/log/test.out
/home/tidb/.tiup/bin/tiup dumpling -u esss -p esss#20230817 -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/deploy/duming-data/esss -r 200000 --filter "esss_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
/home/tidb/.tiup/bin/tiup dumpling -u eam_prod -p eam#20240125 -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/deploy/duming-data/eam -r 200000 --filter "eam_prod_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
/home/tidb/.tiup/bin/tiup dumpling -u sbs -p sbs@20230303 -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/duming-data/sbs -r 200000 --filter "sbs_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
/home/tidb/.tiup/bin/tiup dumpling -u valeo -p vdb#20230608 -P 4001 -h cnhuam0dp06 -o /data13/tidb/duming-data/valeo/ods_aoi_image -r 200000 --filetype csv --sql 'select * from valeo_db.ods_spi_detail_camx_history where create_time>date_sub(curdate(),interval 47 day) and create_time< date_sub(curdate(),interval 18 day)' -F 100MiB --output-filename-template 'ods_spi_detail_camx.{{.Index}}'
/home/tidb/.tiup/bin/tiup dumpling -u valeo -p vdb#20230608 -P 4001 -h cnhuam0dp06 -o /data11/apps/spark/duming-data/valeo/ods_aoi_image2 -r 200000 --filetype csv --sql 'select * from valeo_db.ods_aoi_image where create_time>date_sub(curdate(),interval 55 day) and create_time< date_sub(curdate(),interval 50 day)' -F 100MiB --output-filename-template 'ods_aoi_image.{{.Index}}'
scp -r tidb@10.114.19.66:/df_bigdata_cluster/data/tidb/bakData/whirlpool /data13/tidb/lighting/dumper-data/whirlpool
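A thin wrapper around the nohup invocation above makes the PID available for monitoring. The paths in the usage note are examples:

```shell
#!/bin/bash
# Start tidb-lightning in the background and print its PID so the import
# can be watched or killed later.
run_lightning() {
  local conf="$1" log="$2"
  nohup tiup tidb-lightning -config "$conf" > "$log" 2>&1 &
  echo $!
}
# Usage:
#   pid=$(run_lightning eam-lightning.toml /tmp/lightning.out)
#   tail -f /tmp/lightning.out   # watch the log for the completion message
```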
28. Online scale-out
tiup cluster scale-out tidb-JBDP scale-out-tidb.yaml -uroot -p
Contents of scale-out-tidb.yaml:
tidb_servers:
  - host: 10.0.1.5
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: /data/deploy/install/deploy/tidb-4000
    log_dir: /data/deploy/install/log/tidb-4000
Check the cluster status:
tiup cluster display tidb-JBDP
Reference: https://blog.csdn.net/weixin_42241611/article/details/126785954
29. Raise the system maximum open-file limits (no reboot required)
Edit /etc/security/limits.conf. The TiDB limits must not exceed the system limits, so set the system limits high enough, then add:
# process/thread count
* soft nproc 131072
* hard nproc 131072
# file handles
* soft nofile 1000000
* hard nofile 1000000
# locked memory
* soft memlock unlimited
* hard memlock unlimited
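After re-logging in, the effective limits for the current session can be checked quickly:

```shell
#!/bin/bash
# Show the limits that matter to TiDB for the current shell session.
check_limits() {
  echo "nofile=$(ulimit -n) nproc=$(ulimit -u) memlock=$(ulimit -l)"
}
check_limits
```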
II. Operations issues
1. Error: Failed to initialize TiDB environment on remote host 'admin.dt.com' (task.env_init.failed)
caused by: Failed to create '~/.ssh' directory for user 'tidb'
caused by: Failed to execute command over SSH for 'tidb@admin.dt.com:22'
caused by: Process exited with status 1
Fix: on every machine add:
chmod 470 /etc/sudoers
vi /etc/sudoers
Add: tidb ALL = NOPASSWD: ALL
chmod 440 /etc/sudoers
2. Starting component `ctl`: /home/tidb/.tiup/components/ctl/v5.0.1/ctl v5.0.1
Error: ctl need an explicit version, please run with `tiup ctl:<cluster-version>`
Error: run `/home/tidb/.tiup/components/ctl/v5.0.1/ctl` (wd:/home/tidb/.tiup/data/SZzAnSw) failed: exit status 1
Fix: invoke ctl with an explicit version, e.g. tiup ctl:v5.0.1
3. [CDC:ErrMySQLConnectionError]Error 1298: Unknown or incorrect time zone: 'Asia/Shanghai'
Fix: load the time-zone tables into the downstream MySQL:
mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql -p
For MySQL on Windows, set the time zone in my.ini (in the MySQL install root) under the [mysqld] section.
4. Error: failed to start cdc: failed to start: master.dt.com cdc-8300.service, please check the instance's log(/home/tidb/tidb-deploy/cdc-8300/log) for more detail.: timed out waiting for port 8300 to be started after 2m0s
Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2021-06-13-17-01-07.log.
Error: run `/home/tidb/.tiup/components/cluster/v1.5.0/tiup-cluster` (wd:/home/tidb/.tiup/data/SaCg628) failed: exit status 1
Fix: reboot the machine.
Regions without a leader:
tiup pd-ctl -u http://admin.test.com:2379 -d region --jq '.regions[]|select(has("leader")|not)|{id: .id,peer_stores: [.peers[].store_id]}'
Regions with missing replicas:
pd-ctl -u http://admin.test.com:2379 region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length != 2)}" // max-replicas is 2 in this environment
5. A wrong path reports: is not a valid SemVer string
6. Initialization reports: Failed to create new system user 'tidb' on remote host
Cause: the passwordless remote account is wrong.
7. DM startup fails with: fail to validate embed etcd config, RawCause: expected IP in URL for binding
Fix: the config file must use IP addresses, not machine names.
8. Cluster start fails with: Error: failed to start tidb: failed to start: admin.test.com tidb-4000.service, please check the instance's log
Fix: start it again; it usually succeeds.
9. Cluster start reports: failed to start tikv: failed to start:
Fix: restart the host machine's shared network.
10. TiKV keeps reporting that the PD node is unreachable: retrying of unary invoker failed
Fix: reload the nodes with tiup dm reload; as a last resort, destroy and redeploy the cluster, but back up the data first.
11. failed to start grafana: failed to start: slave.test.com grafana-3001.service, please check the instance's
Fix: usually a network problem; wait and retry.
12. PD error: ["PD failed to respond"] [err="Grpc(RpcFailure(RpcStatus { code: 4-DEADLINE_EXCEEDED, message: \"Deadline Exceeded\", details: [] }))"] [endpoints=admin.test.com:2379]
[2022/06/28 14:26:57.683 +08:00] [INFO] [util.rs:547] ["connecting to PD endpoint"] [endpoints=master.test.com:2379]
tikv error: [ERROR] [etcdutil.go:71] ["failed to get cluster from remote"] [error="[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs"]
Fix: do not use domain names when installing; use machine names only.
13. Mirror error: -bash: tsh: command not found
Fix: copy the mirror statement from the official site and run it again.
ssh_stderr: bash: /tmp/tiup/bin/insight: Permission denied
Fix (likely /tmp mounted without exec permission; remount it):
cat /etc/fstab
umount /tmp
mount /dev/mapper/root_vg-tmp /tmp
14. Pre-install check fails on cpu-governor: CPU frequency governor is powersave, should use performance
Fix: run cpupower frequency-set --governor performance
15. Pre-install check fails: numactl not usable, bash: numactl: command not found
Fix: run yum -y install numactl.x86_64
16. 10.114.26.111 network Fail: network speed of eno6 is 100MB too low, needs 1GB or more
17. The component `dumpling` not found (may be deleted from repository); skipped
Fix: tidb-community-toolkit-v6 was not added to tiup; follow the "mirror two" steps: tiup mirror merge ../tidb-community-toolkit-v6.1.0-linux-amd64
18. Error 1055 on CREATE TABLE / GROUP BY errors
Check: show variables like '%sql_mode%'
set global sql_mode ='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION';
The original value was: 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'
TiDB must be restarted afterwards.
19. TiDB Error 1105: Out Of Memory Quota
server_configs:
  tidb:
    mem-quota-query: 10737418240   # raise this; the default is 1GB
20. Error: Transaction is too large
Fix: add this parameter under tidb, e.g. change 10737418240 --> 34359738368:
performance.txn-total-size-limit: 10737418240
txn-total-size-limit
Size limit on a single TiDB transaction.
Default: 104857600
Unit: bytes
The total size of all key-value records in a single transaction must not exceed this limit. The maximum allowed value is 10737418240 (10GB).
Note: if binlog is used with Kafka as the downstream consumer (e.g. an arbiter cluster), this value must not exceed 1073741824 (1GB), Kafka's maximum single-message size; beyond that Kafka errors out.
21. SSH restart error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
Fix: the target machine's authorized_keys is missing the key pair. When the cluster is first deployed, tiup generates its own key pair, different from the normal SSH one; it must be copied into the target machine's authorized_keys.
22. Deleting a large amount of data errors with: ERROR 1105 (HY000) at line 1: Out Of Memory Quota![conn_id=676956115087820587]
Fix: adjust the parameter:
server_configs:
  tidb:
    mem-quota-query: 4294967296 # raise this
23. TiDB backend connection error: MySQLNonTransientConnectionException: Could not create connection to database server
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.]
2023-12-25 17:48:00.209 INFO 6897 --- [o-8089-exec-221] com.jabil.datax.utils.DBUtil : failing connection URL: jdbc:mysql://cnhuam0dp06:4001?serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=UTF-8, user: apiWS, password: api@2022
2023-12-25 17:48:02.209 ERROR 6897 --- [o-8089-exec-221] com.jabil.datax.utils.RetryUtil : Exception when calling callable, about to attempt retry #2; planned wait [2000]ms, actual wait [2000]ms, exception msg: [Code:[MYSQLErrCode-02],
Description:[the database service IP address or port is wrong; check them or contact the DBA to confirm them. Sync-center users should confirm with the DBA that the IP/PORT recorded in idb matches the database's actual values].
Fix:
the query code did not close the connection and release resources
// close resources after the query completes: DBUtil.closeDBResources(null, null, conn);
24. ERROR 8242 (HY000): 'Drop Table' is unsupported on cache tables (cache tables cannot be dropped)
Fix: convert the cache table back to a normal table first, then drop it: ALTER TABLE dwd_indicator NOCACHE
25. tidb lightning encountered error: [Lightning:PreCheck:ErrSystemRequirementNotMet]system requirement not met: the maximum number of open file descriptors is too small, got 1024, expect greater or equal to 12590: operation not permitted
Fix: run on every machine: ulimit -HSn 1000000
26. Import error: Cluster doesn't have too many empty regions
Fix:
27. tiup cannot issue commands; reports: cluster/module.(*WaitFor).Execute\n\tgithub.com/pingcap/tiup/pkg/cluster
Fix: check whether the corresponding directories exist in the control machine's config cache /home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/config-cache and on the target machines, e.g. /data13/tidb/deploy/tidb-4000/scripts
28. Connections to database port 4000 fail with: rpc error: code = Canceled desc = grpc: the client connection is closing
Fix: TiDB port 4000 is occupied; kill the offending process and tidb restarts automatically.
29. After scaling in TiKV, a node is no longer in the cluster but its tikv process is still up; the node must be deleted. Find its store id and delete it:
Fix: /home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete 1
30. [FATAL] [server.rs:950] ["failed to start node: StoreTombstone(\"store is tombstone\")"]
Fix: delete the corresponding store id.
31. When taking a store offline fails, set its weight to 0 first, then delete it:
Fix:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store weight 1 0 0
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store weight 183060412 0 0
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store weight 183062372 0 0
Delete:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store remove-tombstone 1
curl -X POST http://10.114.26.111:2379/pd/api/v1/store/1/state?state=Up
32. A store stays Down forever and cannot be removed with store delete; run:
Fix: /home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores store_id
33. Instances already tombstoned get started again on every tidb restart; they need to be removed:
Fix: tiup cluster prune tidb-JBDP
34. If a tidb start times out and, having been interrupted, other services cannot start normally, e.g.: timed out waiting for port 4000 to be started after 2m0s
Fix: tiup cluster start tidb-JBDP
35. The system health check fails: the required component `NgMonitoring` is not started in the cluster; some features will be unavailable.
Fix: tiup cluster start tidb-JBDP
III. TiDB operations commands
1. 2021/6/7
https://asktug.com/t/topic/92780
2. Problem search
https://search.asktug.com
3. Import/export statements
When importing, create the table schema first:
tar -zxvf tidb-lightning-${version}-linux-amd64.tar.gz
tidb-lightning -d "mysql://user:password@IP:port/DBName" -t "tidb://user:password@IP:port/DBName"
mysql -h 10.114.17.165 -P 4002 -uroot -pjabil@2022
use sbs_db;
source /data13/tidb/duming-data/sbs_db-20240528.sql;
Copy the exported data:
scp -r tidb@cnhuam0dp01:/data13/tidb/duming-data/sbs_db /df_bigdata_cluster/data/tidb/bakData/lightData
4. Import into the test environment
nohup /df_bigdata_cluster/software/tidb/tidb-community-server-v6.1.0-linux-amd64/tidb-lightning -config /df_bigdata_cluster/data/tidb/bakData/lightData/sbs_db/sbs-lightning.toml > /df_bigdata_cluster/data/tidb/bakData/lightData/logs/nohup.out &
nohup /df_bigdata_cluster/software/tidb/tidb-community-server-v6.1.0-linux-amd64/tidb-lightning -config /df_bigdata_cluster/data/tidb/bakData/lightData/conf/sbs-lightning.toml > /df_bigdata_cluster/data/tidb/bakData/lightData/logs/nohup.out &
5. Export table schemas
mysqldump -h cnhuam0dp09 -P 4001 -d sbs_db -uroot -pjabil#202210 >/data13/tidb/duming-data/sbs_db-20240528.sql
mysqldump -h cnhuam0dp09 -P 4001 -d valeo_db -uroot -pjabil#202210 >/data13/tidb/duming-data/valeo_db-20240528.sql
mysqldump -h cnhuam0dp09 -P 4001 -d bmwcss_db -uroot -pjabil#202210 >/data13/tidb/duming-data/bmwcss_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d dfengsys_prod_db -uroot -pjabil#202210 >/data13/tidb/duming-data/dfengsys_prod_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d eam_db -uroot -pjabil#202210 >/data13/tidb/duming-data/eam_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d eam_prod_db -uroot -pjabil#202210 >/data13/tidb/duming-data/eam_prod_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ehs_hcas_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ehs_hcas_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d esss_db -uroot -pjabil#202210 >/data13/tidb/duming-data/esss_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d etask_db -uroot -pjabil#202210 >/data13/tidb/duming-data/etask_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d hcss_db -uroot -pjabil#202210 >/data13/tidb/duming-data/hcss_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ic_sap_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ic_pro_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ic_sap_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ic_sap_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ie_ucs_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ie_ucs_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d iot_db -uroot -pjabil#202210 >/data13/tidb/duming-data/iot_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d mat_fuc_db -uroot -pjabil#202210 >/data13/tidb/duming-data/mat_fuc_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d pub_db -uroot -pjabil#202210 >/data13/tidb/duming-data/pub_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d wef_db -uroot -pjabil#202210 >/data13/tidb/duming-data/wef_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d wic_db -uroot -pjabil#202210 >/data13/tidb/duming-data/wic_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d soea_db -uroot -pjabil#202210 >/data13/tidb/duming-data/soea_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d screw_db -uroot -pjabil#202210 >/data13/tidb/duming-data/screw_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d scada_db -uroot -pjabil#202210 >/data13/tidb/duming-data/scada_db-20240531.sql
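The mysqldump lines above differ only in the database name, so they can be driven from a list. This sketch prints the commands for review (the password is elided on purpose, and DBS shows only a subset of the databases):

```shell
#!/bin/bash
# Subset of the databases dumped above; extend as needed.
DBS="sbs_db valeo_db bmwcss_db eam_db esss_db iot_db"
DATE=20240531

# Print one schema-only (-d) mysqldump command per database.
schema_dump_cmds() {
  local db
  for db in $DBS; do
    echo "mysqldump -h cnhuam0dp09 -P 4001 -d $db -uroot -p<password> >/data13/tidb/duming-data/${db}-${DATE}.sql"
  done
}
schema_dump_cmds
```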
Upload the schema dumps to HDFS for backup:
hdfs dfs -put *.sql /user/tidb/dbCreateSQL
Export table data (parameterized for scheduling):
/home/tidb/.tiup/bin/tiup dumpling -u $userName -p $passwd -P 4001 -h cnhuam0dp09 -o /data13/tidb/duming-data/$dbName/$exportTabName/$partYearMonth -r 200000 --filetype csv --sql "select * from $tidb_tab where $fiterKeyName>=date_sub(curdate(),interval $fromDays day) and $fiterKeyName<= date_sub(curdate(),interval $toDays day)" -F 200MiB --output-filename-template "$4.{{.Index}}" >>/data13/tidb/logs/hue_dumpling.out
Export table data (whole database):
/home/tidb/.tiup/bin/tiup dumpling -u root -p jabil#202210 -P 4001 -h cnhuam0dp09 --filetype sql -t 8 -o /data13/tidb/duming-data/sbs_db -r 200000 --filter "sbs_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
6. Back up a database
Export the data:
/home/tidb/.tiup/bin/tiup dumpling -u hcss -p hcss -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/dbBak/hcss_20231128 -r 200000 --filter "hcss_db.*" -F 256MiB >>/data13/tidb/logs/hcss.out
zip -r /data13/tidb/dbBak/hcss_20231128.zip /data13/tidb/dbBak
7. Force-start services by role
tiup cluster start tidb-JBDP -R tidb
tiup cluster restart tidb-JBDP -R tiflash
tiup cluster restart tidb-JBDP -R node_exporter
tiup cluster restart tidb-JBDP -R prometheus
tiup cluster start tidb-JBDP -R tikv
tiup cluster stop tidb-JBDP -R tikv
tiup cluster start tidb-JBDP -R prometheus
tiup cluster start tidb-JBDP -R tiflash
tiup cluster start tidb-JBDP -R grafana
tiup cluster start tidb-JBDP -R NgMonitoring
tiup cluster display tidb-JBDP
8. Force-start a specific node
tiup cluster start tidb-JBDP
tiup cluster start tidb-JBDP --node 10.114.26.104:20163
tiup cluster start tidb-JBDP --node 10.114.26.113:9100
tiup cluster restart tidb-JBDP -R node_exporter
IV. Optimization and tuning
1. Table level: enable TiFlash columnar replicas; column storage speeds up statistical/aggregate queries.
Enable it only for large tables, for example:
ALTER TABLE soea_db.ods_product_bay_data SET TIFLASH REPLICA 1;
ALTER TABLE qmp_db.InternalAuditPlanReport SET TIFLASH REPLICA 1;
ALTER TABLE tb_equ_moniter_record_2022 SET TIFLASH REPLICA 1;
ALTER TABLE screw_db.dwd_tighten_basic SET TIFLASH REPLICA 1;
Disable:
ALTER TABLE screw_db.dwd_tighten_basic SET TIFLASH REPLICA 0;
Check:
SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'soea_db' and TABLE_NAME = 'ods_product_bay_data';
Enable TiFlash for a query session:
ALTER TABLE tab SET TIFLASH REPLICA 1;   -- enable replication to TiFlash
set @@session.tidb_allow_mpp=1;          -- allow MPP execution
set @@session.tidb_enforce_mpp=1;
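To watch replica sync progress from the shell, a small helper can build the information_schema query (the host and credentials in the usage line are placeholders):

```shell
#!/bin/bash
# Build the SQL that reports TiFlash replica availability/progress for one table.
tiflash_progress_sql() {
  printf "SELECT table_schema, table_name, available, progress FROM information_schema.tiflash_replica WHERE table_schema='%s' AND table_name='%s';\n" "$1" "$2"
}
# Usage: mysql -h admin.dt.com -P 4000 -uroot -p -e "$(tiflash_progress_sql soea_db ods_product_bay_data)"
tiflash_progress_sql soea_db ods_product_bay_data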
2. When a bulk delete fails, delete in batches:
Delete from t where xx limit 5000   -- xx is the filter condition
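The delete-with-limit pattern is normally repeated until no rows remain. A sketch parameterized over a runner command, so the loop logic is separate from the connection details (`run_sql` in the usage note is a hypothetical helper wrapping a real mysql call):

```shell
#!/bin/bash
# Repeat a LIMIT-ed DELETE until the runner reports 0 affected rows.
# "$1" is a command that executes SQL and prints the affected-row count.
batch_delete() {
  local runner="$1" sql="$2" affected=1
  while [ "${affected:-0}" -gt 0 ]; do
    affected=$("$runner" "${sql}; SELECT ROW_COUNT();")
  done
}
# Real usage would look something like:
#   run_sql() { mysql -h admin.dt.com -P 4000 -uroot -N -e "$1" | tail -1; }
#   batch_delete run_sql "DELETE FROM t WHERE dt < '2022-01-01' LIMIT 5000"
```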
3. Tuning configuration parameters
1) Replication error: Transaction is too large, size: 104857819 With statement:
tiup cluster edit-config tidb-JBDP
tiup cluster show-config tidb-JBDP
# change this parameter:
performance.txn-total-size-limit: 10737418240   # single-transaction size limit (e.g. for consistency-checked bulk loads)
Reload to take effect:
tiup cluster reload tidb-JBDP -R tidb
tiup cluster stop tidb-JBDP -R tidb
2) storage.block-cache.capacity: 32GB   # no more than 45% of the machine's memory
Reload to take effect:
tiup cluster reload tidb-JBDP -R tikv
3) Configure the temporary compute storage location, under tidb:
tmp-storage-path: /data13/tidb/tmpStorage
performance.txn-total-size-limit: 10737418240
4) Log retention in days:
log.file.max-days: 30
4. Function result length limits
Raise the maximum string length GROUP_CONCAT can return.
By default MySQL limits the GROUP_CONCAT result to 1024 bytes.
SET GLOBAL group_concat_max_len = 102400;
select * from mysql.GLOBAL_VARIABLES where variable_name = 'group_concat_max_len';
5. Re-allow ssh-rsa keys on newer OpenSSH:
echo PubkeyAcceptedKeyTypes=+ssh-rsa >> /etc/ssh/sshd_config
systemctl restart sshd
Reload SSH:
tiup cluster reload --ssh system tidb-JBDP
6. Clean up logs
tiup cluster clean tidb-jabil --log
tiup cluster stop tidb-jabil
tiup cluster start tidb-jabil
tiup cluster clean tidb-JBDP --log
7. If a start fails once, run tiup cluster start tidb-JBDP again repeatedly until it succeeds.
8. Failures creating or dropping tables are often caused by accumulated logs; clean the logs regularly:
tiup cluster clean tidb-JBDP --log
Then restart:
tiup cluster start tidb-JBDP
tiup cluster reload tidb-JBDP --role tikv --nodes 10.114.26.112
systemctl start tikv-server
tiup cluster reload tidb-JBDP
9. Check process status
SHOW PROCESSLIST
SELECT * FROM information_schema.processlist WHERE COMMAND='Sleep' ORDER BY db
Under 50 connections is healthy; beyond that errors such as "communications link failure" start to appear.
10. A server shows very high IO and the tikv log reports: Error starting HTTP server" err="accept tcp [::]:9115: accept4: too many open files in system
Fix: edit /etc/security/limits.conf (the TiDB limits must not exceed the system limits, so set the system limits high enough) and add:
# process/thread count
* soft nproc 131072
* hard nproc 131072
# file handles
* soft nofile 1000000
* hard nofile 1000000
# locked memory
* soft memlock unlimited
* hard memlock unlimited
11. Check system IO usage
yum install -y sysstat
Check IO: iostat
Real-time:
iostat -d -m 2
Config file:
vi /home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/meta.yaml
Then run: tiup cluster reload tidb-JBDP
Manually start the tikv instances:
/data13/tidb/deploy/tikv-20161/bin/tikv-server > /data11/tidb/deploy/tikv-20161/tikv.log 2>&1 &
/data13/tidb/deploy/tikv-20162/bin/tikv-server > /data12/tidb/deploy/tikv-20162/tikv.log 2>&1 &
/data13/tidb/deploy/tikv-20163/bin/tikv-server > /data13/tidb/deploy/tikv-20163/tikv.log 2>&1 &
12. Manually start the tidb service
On host 112:
/data13/tidb/deploy/tidb-4000/bin/tidb-server >/data13/tidb/deploy/tidb-4000/log/tidb.log 2>&1 &
On host 105:
/data13/tidb/deploy/tidb-4000/bin/tidb-server >/data13/tidb/deploy/tidb-4000/log/tidb.log 2>&1 &
/data13/tidb/deploy/tikv-server-20162 start
/data13/tidb/deploy/tikv-20162/bin/tikv-server start
/data13/tidb/deploy/tikv-20163/bin/tikv-server stop
systemctl list-unit-files --type=service --state=disabled
systemctl is-enabled tikv-20163
systemctl disable tikv-20163
13. Cluster data loss
https://zhuanlan.zhihu.com/p/616686144
tiup ctl:v6.1.0 pd -u https://10.114.26.111:2379 --cacert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/ca.crt --key=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.pem --cert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.crt -i
pd -u https://10.114.26.111:2379 --cacert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/ca.crt --key=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.pem --cert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.crt -i » unsafe
https://www.cnblogs.com/vansky/p/9415551.html
14. Detailed tikv store operations
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store 5
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 member leader show
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete 2
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store remove-tombstone 183060412
Add an evict-leader scheduler to drive leaders off a store pending decommission:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 scheduler add evict-leader-scheduler 183060412
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete 183060412 183062372
Cancel:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store cancel-delete 1
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete
17. Remove a region
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 operator remove-region 185889957
18. View TiDB configuration parameters
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config show
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores show
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores 1
Detailed:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config show all
19. Store and region operations
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores -s 183060412 -r 185889957
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 operator add remove-peer 185889957 183060412
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 operator add remove-peer 4743 183060412
20. View the regions stored on a store
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq='.regions[]|select(has("leader")|not)|{id:,peer_stores: [.peers[].store_id]}'
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store state_name 1 tombstone
View region details:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region 159869
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config show|grep limit > limit.config
21. Set region-schedule-limit
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config set region-schedule-limit 0
pd-ctl config set region-schedule-limit 0
22. Queries using jq scripts
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq='.regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(1) then . else empty end) | length>=$total-length)}' >>tt3.json
Regions with too few replicas:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq='.regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(183062372,183060412,1,14,8,5,2,183,184,16,9,15,17,179394555,179397408,179393803,179395662) then . else empty end) | length>=$total-length)}' >> tt5.json
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(1,14,8,5,2,183,184,16,9,15,17,179394555,179397408,179393803,179395662) then . else empty end) | length>=$total-length)}"
Regions without a leader:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq '.regions[] | select(.leader.store_id == null) | {id: .id, peer_stores: [.peers[].store_id], leader_store: .leader.store_id}' > noleader.json
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region store 1
23. Remove regions that live on decommissioned stores (run on every TiKV host). Use only when TiDB cannot start; data will be lost.
Local host:
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /data11/tidb/data/tikv-20161 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /data12/tidb/data/tikv-20162 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /data13/tidb/data/tikv-20163 unsafe-recover remove-fail-stores -s 1 --all-regions
Copy the tool to the remote hosts:
scp -r /home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl tidb@cnhuam0dp03:/home/tidb/tikv-ctl
/home/tidb/tikv-ctl --data-dir /data11/tidb/data/tikv-20161 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/tikv-ctl --data-dir /data12/tidb/data/tikv-20162 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/tikv-ctl --data-dir /data13/tidb/data/tikv-20163 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
Stop TiKV first:
curl http://10.114.26.111:10080/regions/23209844
tiup cluster prune tidb-JBDP
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /df_bigdata_cluster/data/tidb/tidb-data/tikv-20160/ bad-regions
set region-schedule-limit 0
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config set max-replicas 1
tiup cluster start tidb-JBDP -N 10.114.26.112:20161
ps -ef | grep tikv-server | grep -v grep | awk '{print $2}' | xargs kill -9
ps -ef | grep pd-server | grep -v grep | awk '{print $2}' | xargs kill -9
tiup mirror clone tiup-custom-mirror-6.1.0 --tiup mirror-6.1.0
tiup mirror grant $USER
cp -rp /home/tidb/tidb-community-server-6.1.0-linux-amd64/keys /home/tidb/.tiup/
tiup mirror merge /home/tidb/tidb-community-server-6.1.0-linux-amd64
tiup mirror clone /home/tidb/tidb-community-server-6.1.0-linux-amd64 --full
tiup mirror clone tiup-custom-mirror-v6.1.0 --PCC 1.0.1 --os=linux --arch=amd64
tiup mirror merge /home/tidb/tiup-custom-mirror-v6.1.0
tiup install PCC
17. Pre-install qualification checks and tuning
Node Check Result Message
---- ----- ------ -------
10.114.26.113 os-version Pass OS is CentOS Linux 7 (Core) 7.9.2009
10.114.26.113 cpu-governor Fail CPU frequency governor is powersave, should use performance
10.114.26.113 sysctl Fail net.ipv4.tcp_syncookies = 1, should be 0
10.114.26.113 sysctl Fail vm.swappiness = 1, should be 0
10.114.26.113 selinux Pass SELinux is disabled
10.114.26.113 thp Fail THP is enabled, please disable it for best performance
10.114.26.113 cpu-cores Pass number of CPU cores / threads: 64
10.114.26.113 memory Pass memory size is 262144MB
10.114.26.113 network Pass network speed of bond0 is 2000MB
10.114.26.113 network Pass network speed of eno5 is 1000MB
10.114.26.113 network Pass network speed of eno6 is 1000MB
10.114.26.113 network Pass network speed of eno7 is 1000MB
10.114.26.113 network Pass network speed of eno8 is 1000MB
10.114.26.113 command Pass numactl: policy: default
10.114.26.113 timezone Pass time zone is the same as the first PD machine: Asia/Shanghai
10.114.26.113 disk Warn mount point /data11 does not have 'noatime' option set
10.114.26.113 service Fail service firewalld is running but should be stopped
10.114.26.113 swap Warn swap is enabled, please disable it for best performance
Check: cat /proc/sys/fs/file-nr
vi /etc/sysctl.conf
fs.file-max=1000000
vm.swappiness=0
net.ipv4.tcp_syncookies=0
sysctl -p
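The sysctl changes above can be staged and checked before being applied. A minimal sketch, writing to a temp file so it runs without root; on a real host the target would be /etc/sysctl.conf (or a /etc/sysctl.d drop-in), applied with sysctl -p as root:

```shell
# Stage the three kernel settings into a file and count them before applying.
conf=$(mktemp)
cat > "$conf" <<'EOF'
fs.file-max=1000000
vm.swappiness=0
net.ipv4.tcp_syncookies=0
EOF
n=$(grep -c '=' "$conf")
echo "staged $n settings in $conf"
# as root on the real host: cp "$conf" /etc/sysctl.d/99-tidb.conf && sysctl --system
```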
18. TiDB scale-out and scale-in
Check first:
tiup cluster check tidb-JBDP scale-out.yml --cluster --user tidb
Scale out
Note: verify that the paths in the scale-out file are owned by tidb and that the ports are free.
tiup cluster scale-out tidb-JBDP scale-out-20161.yml
tiup cluster scale-out tidb-JBDP scale-out-20162.yml
Scale in
tiup cluster start tidb-JBDP --node 10.114.26.104:20163
tiup cluster stop tidb-JBDP --node 10.114.26.104:20163
tiup cluster scale-in tidb-JBDP --node 10.114.26.103:20163
tiup cluster scale-in tidb-JBDP --node 10.114.26.112:20162 --force
tiup cluster scale-in tidb-JBDP --node 10.114.26.112:20161 --force
tiup cluster scale-in --force -N
tiup cluster scale-in prod-cluster --node 192.168.4.18:20160 --force
Note: decommissioning can take more than 12 hours.
tiup cluster restart tidb-JBDP --node 10.114.26.112:20161
Repair TiKV:
tiup cluster start tidb-JBDP -N 10.114.26.112:20162
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
tiup ctl:v1.14.1 pd -u http://10.114.26.111:2379 store
tiup cluster force-leader tidb-JBDP 10.114.26.112:20161
tiup cluster force-leader tidb-JBDP --node
Restart a node:
tiup cluster restart tidb-JBDP --node 10.114.26.112:20161
19. Check the TiKV process limits: cat /proc/49897/limits
TiDB optimization measures
1. For large tables (over 1,000,000 rows), build a time index and always filter on time when querying, to avoid full-table scans;
2. When joining tables, apply filter conditions before the join;
3. Replace UNION with UNION ALL; avoid IN and LIKE where possible;
4. Avoid JOINs where possible;
5. For heavy batch computation and analysis, import the data into Spark SQL and run the batch jobs in Spark.
20. Detect TiDB hangs and queued jobs
View current DDL jobs:
admin show ddl jobs;
Cancel a job:
ADMIN CANCEL DDL JOBS 1887486;
21. Set the maximum number of connections
SET GLOBAL max_connections = 350;
show variables like '%max_connections%'
Note: HAProxy's maxconn (e.g. maxconn 1024) should correspond to this setting; 1024 is roughly 3 * max_connections.
Note: the HAProxy maxconn values (there may be several) must not exceed 2048 in total, or TiDB cannot keep up.
22. Set the default per-query memory quota
#SET GLOBAL tidb_mem_quota_query ='10737418240';
show variables like '%tidb_mem_quota_query%';
7. Start deployment
Download and install TiUP
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
curl --proto '=https' --tlsv1.2 -sSf http://172.26.160.70:53/install.sh | sh
source .bash_profile
tiup install tikv:v6.1.0
Give the tidb user sudo rights:
vi /etc/sudoers
Add: tidb ALL = NOPASSWD: ALL
Save with :wq!
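A slightly safer variant of the sudoers edit is a drop-in file validated with visudo. A minimal sketch, staged in a temp file so it can run without root (the /etc/sudoers.d path and the `install`/`visudo` steps are the usual convention, shown here only as comments):

```shell
# Stage the sudoers rule in a temp file; on the real host it would become
# /etc/sudoers.d/tidb with mode 440, checked by `visudo -cf` before use.
rule='tidb ALL = NOPASSWD: ALL'
f=$(mktemp)
printf '%s\n' "$rule" > "$f"
chmod 440 "$f"
echo "staged sudoers rule in $f"
# as root: install -m 440 "$f" /etc/sudoers.d/tidb && visudo -cf /etc/sudoers.d/tidb
```

A drop-in avoids editing /etc/sudoers in place, so a syntax error cannot lock every admin out.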
Check: tiup cluster check ./topology.yaml --user tidb [-p] [-i /home/tidb/.ssh/id_rsa]
Deploy: tiup cluster deploy tidb-jabil v6.0.0 ./topology.yaml --user tidb
First start (initializes the cluster): tiup cluster start tidb-jabil --init
Start: tiup cluster start tidb-jabil
8. Start/stop commands (run from /home/tidb)
tiup cluster start tidb-jabil
tiup cluster restart tidb-jabil
tiup cluster stop tidb-jabil
9. Verify the installation
TiDB serves SQL on port 4000 and PD's admin port is 2379. Connect with a client such as Navicat on port 4000; the default root password is empty.
Dashboard: http://admin.dt.com:2379/dashboard
10. Data migration web UI mode
/home/tidb/tidb-community-toolkit-v5.0.1-linux-amd64/bin/tidb-lightning --server-mode --status-addr :8289
11. Add TiCDC
tiup cluster scale-out tidb-test scale-out.yaml
List clusters:
tiup cluster list
List components:
tiup cluster display tidb-jabil
12. Uninstall TiDB
1. tiup cluster list (list clusters)
2. tiup cluster destroy tidb-jabil (destroy the cluster)
13. Set the user password
set password for 'root'@'%' = password('jabil@2022');
flush privileges;
14. CDC task operations
cd /home/tidb
Replicate TiDB to MySQL:
tiup ctl:v5.0.2 cdc changefeed create --pd=http://admin.dt.com:2379 --sink-uri="mysql://root:root@admin.dt.com:3306/test?max-txn-row=500" --config /home/tidb/moveDataConf/test_to_mysql.toml --changefeed-id="test-to-202-mysql"
tiup ctl:v5.0.2 cdc changefeed query -s --pd=http://admin.dt.com:2379 --changefeed-id="test-to-201-mysql"
List replication tasks:
tiup ctl:v5.0.2 cdc changefeed list --pd=http://admin.dt.com:2379
List captures:
tiup ctl:v5.0.2 cdc capture list --pd=http://admin.dt.com:2379
Pause / remove a task:
tiup ctl:v5.0.2 cdc changefeed pause --pd=http://admin.dt.com:2379 --changefeed-id "test-to-201-mysql"
tiup ctl:v5.0.2 cdc changefeed pause --pd=http://admin.dt.com:2379 --changefeed-id test-to-201-mysql
tiup ctl:v5.0.2 cdc cli changefeed remove --pd=http://admin.dt.com:2379 --changefeed-id test-to-201-mysql
15. Run a cross-database join on TiDB
export MYSQL_PWD=root && mysql -uroot -h admin.dt.com -P4000 < 1.sql
Contents of 1.sql: select * from bdweb_db.bi_users b,test.bi_users t where t.u_id=b.u_id
16. Data migration needs DM; install DM
tiup install dm dmctl
List DM clusters: tiup dm list
tiup uninstall dm-test
Edit the init config topology-dm.yaml; get the version with: tiup list dm-master
tiup dm deploy dm-test v2.0.3 /home/tidb/topology-dm.yaml --user tidb [-p] [-i /home/tidb/.ssh/id_rsa]
Start: tiup dm start dm-test
Show DM roles: tiup dm display dm-test
Destroy the DM deployment: tiup dm destroy dm-test
tiup cluster reload
Run the data migration:
1. Create the data source:
tiup dmctl --master-addr 192.168.0.20:8261 operate-source create source-mysql.yaml
2. Check the task:
tiup dmctl --master-addr 192.168.0.20:8261 check-task test-task.yaml
3. Start the task:
tiup dmctl --master-addr 192.168.0.20:8261 start-task test-task.yaml
17. Monitoring
http://slave.test.com:3001/
admin/123456
18. Run TiDB SQL scripts from the command line
mysql -h master.test.com -P 4000 -uroot -proot test < /opt/mysql/sql/del_resu.sql >> /opt/log/sql.out
mysql -h master.test.com -P 4000 -uroot -proot test -e " select dt,max(pv) from pvuv_sink GROUP BY dt order by dt desc limit 10 " >> /opt/log/sql.out
19. Store (data node) operations
tiup ctl:v6.0.0 pd -u admin.test.com:2379 -i
Delete a store:
store delete 1
20. Start/stop a single role
tiup cluster start tidb-jabil -R tikv
tiup cluster start tidb-jabil -R pd
tiup cluster stop tidb-jabil -R cdc
21. Reload nodes
tiup dm reload
22. Find logs
cat /home/tidb/tidb-bin/pd-2379/log/pd.log | grep "init cluster id"
cat /home/tidb/tidb-deploy/pd-2379/log/pd.log | grep "init cluster id"
23. Disable start on boot
tiup cluster disable tidb-jabil
24. Offline deployment of v6.1.0
Mind the directory permissions:
chmod -R 777 /df_bigdata_cluster/software/tidb
chmod -R 777 /df_bigdata_cluster/data/tidb
Mirror 1:
tar xzvf tidb-community-server-v6.1.0-linux-amd64.tar.gz && \
sh tidb-community-server-v6.1.0-linux-amd64/local_install.sh && \
source /home/tidb/.bash_profile
Mirror 2:
tar xf tidb-community-toolkit-v6.1.0-linux-amd64.tar.gz
ls -ld tidb-community-server-v6.1.0-linux-amd64 tidb-community-toolkit-v6.1.0-linux-amd64
cd tidb-community-server-v6.1.0-linux-amd64/
cp -rp keys ~/.tiup/
tiup mirror merge ../tidb-community-toolkit-v6.1.0-linux-amd64
Verify:
tiup cluster check ./topology.yaml --user tidb
Deploy (takes about 5 minutes):
tiup cluster deploy tidb-jabil v6.1.0 ./topology.yaml --user tidb
tiup cluster deploy tidb-JBDP v6.1.0 ./topology.yaml --user tidb
Start
First start (initializes the cluster):
tiup cluster start tidb-JBDP --init
tiup cluster start tidb-JBDP
Production cluster commands:
tiup cluster restart tidb-JBDP
tiup cluster display tidb-JBDP
tiup cluster stop tidb-JBDP
tiup cluster destroy tidb-JBDP (destroy the cluster)
mkdir /data6/tidb
mkdir /data6/tidb/data
chown -R tidb:tidb /data6/tidb
mkdir /data8/tidb
mkdir /data8/tidb/data
chown -R tidb:tidb /data8/tidb
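The per-disk directory prep above generalizes to a loop. A sketch with ROOT standing in for / so it is safe to run anywhere; on a real host drop ROOT, run as root, and keep the chown:

```shell
# Create a tidb data directory per data disk.
ROOT=$(mktemp -d)            # stand-in for /; remove this on a real host
for d in data6 data8; do
  mkdir -p "$ROOT/$d/tidb/data"
  # on the real host: chown -R tidb:tidb "/$d/tidb"
done
ls "$ROOT"
```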
Started cluster `tidb-JBDP` successfully
The root password of TiDB database has been changed.
The new password is: 'r8=%UT1#2D0W-G7YJ4'.
Copy and record it to somewhere safe, it is only displayed once, and will not be stored.
The generated password can NOT be get and shown again.
25. Firewall design
Ports that need to be open:
grafana 3001
pd 2379/2380
prometheus 9090/12020
tidb 4000/10080
tiflash 8889/8123/3930/20170/20292/8234
tikv 20160/20180
#pd
firewall-cmd --permanent --add-port=2379/tcp
firewall-cmd --permanent --add-port=2380/tcp
#tidb
firewall-cmd --permanent --add-port=4000/tcp
firewall-cmd --permanent --add-port=10080/tcp
#tikv
firewall-cmd --permanent --add-port=20160/tcp
firewall-cmd --permanent --add-port=20180/tcp
#tiflash
firewall-cmd --permanent --add-port=8889/tcp
firewall-cmd --permanent --add-port=8123/tcp
firewall-cmd --permanent --add-port=3930/tcp
firewall-cmd --permanent --add-port=20170/tcp
firewall-cmd --permanent --add-port=20292/tcp
firewall-cmd --permanent --add-port=8234/tcp
#grafana
firewall-cmd --permanent --add-port=3001/tcp
#prometheus
firewall-cmd --permanent --add-port=9090/tcp
firewall-cmd --permanent --add-port=12020/tcp
firewall-cmd --reload
firewall-cmd --list-port
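The repetitive firewall-cmd calls above can be driven from a single port list. A sketch that only prints the commands (run them as root, then `firewall-cmd --reload`); the port list is copied from the table above:

```shell
# Generate one firewall-cmd call per TiDB-related port.
PORTS="2379 2380 4000 10080 20160 20180 8889 8123 3930 20170 20292 8234 3001 9090 12020"
CMDS=$(for p in $PORTS; do
  echo "firewall-cmd --permanent --add-port=${p}/tcp"
done)
echo "$CMDS"
```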
Or stop the firewall entirely first:
service firewalld stop
service firewalld start
26. Dumpling, the data export/backup tool
Reference: https://www.modb.pro/db/331066
Install:
tiup install dumpling
Run:
tiup dumpling ...
tiup dumpling \
-u root \
-p jabil@2022 \
-P 4000 \
-h cnhuam0testg84 \
--filetype csv \
--sql 'select * from test.aps_rt_sink' \
-t 8 \
-o /df_bigdata_cluster/data/tidb/test \
-r 200000 \
-F 256MiB
tiup dumpling \
-u root \
-p jabil@2022 \
-P 4000 \
-h cnhuam0testg84 \
--filetype sql \
-t 8 \
-o /df_bigdata_cluster/data/tidb/test \
-r 200000 \
--filter "test.*" \
-F 256MiB
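Since the Dumpling invocations repeat the same flags, the command can be assembled from variables. A sketch that just prints the command rather than executing it (host, user, and database names are placeholders taken from the examples above):

```shell
# Build a dumpling export command for one database; echo instead of executing.
DB=test
HOST=cnhuam0testg84
OUT=/tmp/dump-$DB
CMD="tiup dumpling -u root -P 4000 -h $HOST --filetype sql -t 8 -o $OUT -r 200000 --filter '$DB.*' -F 256MiB"
echo "$CMD"
```

Keeping the flags in one place prevents per-database exports from drifting apart (different -F or -r values, for example).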
27. Install tidb-lightning and dumpling
tiup install tidb-lightning
chmod +x tidb-lightning
tiup tidb-lightning -h
Import data: nohup tiup tidb-lightning -config eam-lightning.toml > nohup.out 2>&1 &
nohup /data13/tidb/deploy/tidb-community-toolkit-v6.1.0-linux-amd64/tidb-lightning -config /data13/tidb/lighting/import-conf/whirlpool-lightning.toml > /data13/tidb/tmp/nohup.out &
scp -r /data13/tidb/deploy/duming-data/esss tidb@10.114.19.66:/df_bigdata_cluster/software/tidb/tmp-data
scp -r /data13/tidb/duming-data/sbs tidb@10.114.19.66:/df_bigdata_cluster/software/tidb/tmp-data
#!/bin/bash
tiup install dumpling
Export data:
/home/tidb/.tiup/bin/tiup dumpling -u root -p jabil@2022 -P 4000 -h cnhuam0testg84 --filetype sql -t 8 -o /df_bigdata_cluster/data/tidb/bakData/whirlpool -r 200000 --filter "whirlpool_db.*" -F 256MiB >>/df_bigdata_cluster/data/tidb/log/test.out
/home/tidb/.tiup/bin/tiup dumpling -u esss -p esss#20230817 -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/deploy/duming-data/esss -r 200000 --filter "esss_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
/home/tidb/.tiup/bin/tiup dumpling -u eam_prod -p eam#20240125 -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/deploy/duming-data/eam -r 200000 --filter "eam_prod_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
/home/tidb/.tiup/bin/tiup dumpling -u sbs -p sbs@20230303 -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/duming-data/sbs -r 200000 --filter "sbs_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
/home/tidb/.tiup/bin/tiup dumpling -u valeo -p vdb#20230608 -P 4001 -h cnhuam0dp06 -o /data13/tidb/duming-data/valeo/ods_aoi_image -r 200000 --filetype csv --sql 'select * from valeo_db.ods_spi_detail_camx_history where create_time>date_sub(curdate(),interval 47 day) and create_time< date_sub(curdate(),interval 18 day)' -F 100MiB --output-filename-template 'ods_spi_detail_camx.{{.Index}}'
/home/tidb/.tiup/bin/tiup dumpling -u valeo -p vdb#20230608 -P 4001 -h cnhuam0dp06 -o /data11/apps/spark/duming-data/valeo/ods_aoi_image2 -r 200000 --filetype csv --sql 'select * from valeo_db.ods_aoi_image where create_time>date_sub(curdate(),interval 55 day) and create_time< date_sub(curdate(),interval 50 day)' -F 100MiB --output-filename-template 'ods_aoi_image.{{.Index}}'
scp -r tidb@10.114.19.66:/df_bigdata_cluster/data/tidb/bakData/whirlpool /data13/tidb/lighting/dumper-data/whirlpool
28. Online scale-out
tiup cluster scale-out tidb-JBDP scale-out-tidb.yaml -uroot -p
Contents of scale-out-tidb.yaml:
tidb_servers:
  - host: 10.0.1.5
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: /data/deploy/install/deploy/tidb-4000
    log_dir: /data/deploy/install/log/tidb-4000
Check the cluster status:
tiup cluster display tidb-JBDP
Reference: https://blog.csdn.net/weixin_42241611/article/details/126785954
29. Raise the system max open files (no reboot needed)
vi /etc/security/limits.conf. The TiDB limits must be less than or equal to the system limits, so the system limits have to be raised first or the setting has no effect. Add:
# processes/threads
* soft nproc 131072
* hard nproc 131072
# file handles
* soft nofile 1000000
* hard nofile 1000000
# memory locking
* soft memlock unlimited
* hard memlock unlimited
II. Operations issues
1. Error: Failed to initialize TiDB environment on remote host 'admin.dt.com' (task.env_init.failed)
caused by: Failed to create '~/.ssh' directory for user 'tidb'
caused by: Failed to execute command over SSH for 'tidb@admin.dt.com:22'
caused by: Process exited with status 1
Fix: on every machine:
chmod 470 /etc/sudoers
vi /etc/sudoers
Add: tidb ALL = NOPASSWD: ALL
chmod 440 /etc/sudoers
2.Starting component `ctl`: /home/tidb/.tiup/components/ctl/v5.0.1/ctl v5.0.1
Error: ctl need an explicit version, please run with `tiup ctl:<cluster-version>`
Error: run `/home/tidb/.tiup/components/ctl/v5.0.1/ctl` (wd:/home/tidb/.tiup/data/SZzAnSw) failed: exit status 1
Fix: run
mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql -p
3. [CDC:ErrMySQLConnectionError]Error 1298: Unknown or incorrect time zone: 'Asia/Shanghai'
For MySQL on Windows, set the time zone in my.ini (in the MySQL install root), under the [mysqld] section.
4.Error: failed to start cdc: failed to start: master.dt.com cdc-8300.service, please check the instance's log(/home/tidb/tidb-deploy/cdc-8300/log) for more detail.: timed out waiting for port 8300 to be started after 2m0s
Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2021-06-13-17-01-07.log.
Error: run `/home/tidb/.tiup/components/cluster/v1.5.0/tiup-cluster` (wd:/home/tidb/.tiup/data/SaCg628) failed: exit status 1
Fix: reboot the machine
Regions without a leader:
tiup pd-ctl -u http://admin.test.com:2379 -d region --jq '.regions[]|select(has("leader")|not)|{id: .id,peer_stores: [.peers[].store_id]}'
Regions with lost replicas:
pd-ctl -u http://admin.test.com:2379 region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length != 2)}" // max-replicas is 2 in this environment
5. A wrong path causes: is not a valid SemVer string
6. Initialization error: Failed to create new system user 'tidb' on remote host
The passwordless remote account is wrong.
7. DM init fails with: fail to validate embed etcd config, RawCause: expected IP in URL for binding
Fix: the config file must use IP addresses, not host names.
8. Cluster start error: Error: failed to start tidb: failed to start: admin.test.com tidb-4000.service, please check the instance's log
Fix: start again and it usually succeeds.
9. Cluster start error: failed to start tikv: failed to start:
Fix: restart the host's shared network.
10. TiKV keeps reporting PD unreachable: retrying of unary invoker failed
Fix: reload with tiup dm reload; as a last resort, destroy and redeploy the cluster, after backing up the data.
11. failed to start grafana: failed to start: slave.test.com grafana-3001.service, please check the instance's
Fix: usually a network problem; wait and retry.
12.PD报错:["PD failed to respond"] [err="Grpc(RpcFailure(RpcStatus { code: 4-DEADLINE_EXCEEDED, message: \"Deadline Exceeded\", details: [] }))"] [endpoints=admin.test.com:2379]
[2022/06/28 14:26:57.683 +08:00] [INFO] [util.rs:547] ["connecting to PD endpoint"] [endpoints=master.test.com:2379]
tikv报错:[ERROR] [etcdutil.go:71] ["failed to get cluster from remote"] [error="[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs"]
Fix: do not use domain names at install time; use machine names only.
13. Mirror error: -bash: tsh: command not found
Fix: copy the mirror command from the official site and run it again.
ssh_stderr: bash: /tmp/tiup/bin/insight: Permission denied (likely /tmp mounted noexec; check and remount)
cat /etc/fstab
umount /tmp
mount /dev/mapper/root_vg-tmp /tmp
14. Pre-install check fails with cpu-governor Fail: CPU frequency governor is powersave, should use performance
Fix: run cpupower frequency-set --governor performance
15. Pre-install check fails with numactl not usable, bash: numactl: command not found
Fix: run yum -y install numactl.x86_64
16.10.114.26.111 network Fail network speed of eno6 is 100MB too low, needs 1GB or more
17.The component `dumpling` not found (may be deleted from repository); skipped
Fix: the tidb-community-toolkit-v6 package was never merged into tiup; follow the Mirror 2 steps: tiup mirror merge ../tidb-community-toolkit-v6.1.0-linux-amd64
18. Creating a table fails with error 1055 (GROUP BY)
Check: show variables like '%sql_mode%'
set global sql_mode ='STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION';
The original was: sql_mode ='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION';
TiDB must be restarted afterwards.
19. TiDB Error 1105: Out Of Memory Quota
server_configs:
  tidb:
    mem-quota-query: 10737418240  # increase as needed
The default for this setting is 1GB.
20. Error: Transaction is too large
Fix:
Under the tidb section, raise performance.txn-total-size-limit from 10737418240 to 34359738368.
txn-total-size-limit
Size limit for a single TiDB transaction.
Default: 104857600
Unit: bytes
The total size of all key-value records in a single transaction cannot exceed this limit; the maximum allowed value is 10737418240 (10GB).
Note: if binlog is consumed downstream by Kafka (e.g. an arbiter cluster), this value must not exceed 1073741824 (1GB), Kafka's maximum single-message size, or Kafka will report errors.
21. SSH restart error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
Fix: the target machine's authorized_keys is missing the key pair. When the cluster was first deployed, TiUP generated its own key pair (distinct from normal SSH keys); copy it into the target machines' authorized_keys.
22. Deleting a large amount of data fails with: ERROR 1105 (HY000) at line 1: Out Of Memory Quota![conn_id=676956115087820587]
Fix: adjust the parameter:
server_configs:
  tidb:
    mem-quota-query: 4294967296  # increase as needed
23. Backend connection error: MySQLNonTransientConnectionException: Could not create connection to database server
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.]
2023-12-25 17:48:00.209 INFO 6897 --- [o-8089-exec-221] com.jabil.datax.utils.DBUtil : failed connection URL: jdbc:mysql://cnhuam0dp06:4001?serverTimezone=GMT%2B8&useUnicode=true&characterEncoding=UTF-8, user: apiWS, password: api@2022
2023-12-25 17:48:02.209 ERROR 6897 --- [o-8089-exec-221] com.jabil.datax.utils.RetryUtil : Exception when calling callable, about to attempt retry #2, planned wait [2000]ms, actual wait [2000]ms, exception msg: [Code:[MYSQLErrCode-02],
Description:[wrong database IP or port; check them, or contact the DBA to confirm the recorded IP/port matches the database's actual values].
Fix:
The query code never closed its connections or released resources.
// close resources after each query: DBUtil.closeDBResources(null, null, conn);
24. ERROR 8242 (HY000): 'Drop Table' is unsupported on cache tables (cached tables cannot be dropped)
Fix: convert back to a normal table first: ALTER TABLE dwd_indicator NOCACHE
25.tidb lightning encountered error: [Lightning:PreCheck:ErrSystemRequirementNotMet]system requirement not met: the maximum number of open file descriptors is too small, got 1024, expect greater or equal to 12590: operation not permitted
Fix: run ulimit -HSn 1000000 on every machine
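Before rerunning Lightning it is worth confirming which limits the current shell actually has, since ulimit -HSn only affects the current shell (persist the values via /etc/security/limits.conf as in section 29). A quick check:

```shell
# Print the soft and hard open-file limits of this shell.
SOFT=$(ulimit -Sn)
HARD=$(ulimit -Hn)
echo "soft=$SOFT hard=$HARD"
```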
26. Import error: Cluster doesn't have too many empty regions
Fix:
27. tiup cannot issue commands; error: cluster/module.(*WaitFor).Execute\n\tgithub.com/pingcap/tiup/pkg/cluster
Fix: check that the corresponding directories exist, e.g. /home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/config-cache on the control machine and /data13/tidb/deploy/tidb-4000/scripts on the target machines.
28. Connecting to database port 4000 fails with: rpc error: code = Canceled desc = grpc: the client connection is closing
Fix: TiDB's port 4000 was occupied; kill the occupying process and TiDB restarts automatically.
29. After scaling in TiKV, the node is no longer in the cluster but the tikv process is still running. Delete the node: find its store id and delete it:
Fix: /home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete 1
30.[FATAL] [server.rs:950] ["failed to start node: StoreTombstone(\"store is tombstone\")"]
Fix: delete the corresponding store id
31. If decommissioning fails: set the store weight to 0 first, then delete it.
Fix:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store weight 1 0 0
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store weight 183060412 0 0
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store weight 183062372 0 0
Delete:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store remove-tombstone 1
curl -X POST http://10.114.26.111:2379/pd/api/v1/store/1/state?state=Up
32. A store stuck forever in Down that store delete cannot remove; run:
Fix: /home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores store_id
33. On every TiDB restart, already-tombstoned instances are started again and need pruning.
Fix: tiup cluster prune tidb-JBDP
34. If the TiDB start times out and the interruption leaves other services unable to start normally, e.g.: timed out waiting for port 4000 to be started after 2m0s
Fix: tiup cluster start tidb-JBDP (run it again)
35. The health check fails: the required component `NgMonitoring` is not started, so some features are unavailable.
Fix: tiup cluster start tidb-JBDP
III. TiDB operations commands
1. 2021/6/7
https://asktug.com/t/topic/92780
2. Problem search
https://search.asktug.com
3. Import/export statements
When importing a schema, create the tables first
tar -zxvf tidb-lightning-${version}-linux-amd64.tar.gz
tidb-lightning -d "mysql://user:password@IP:port/DBName" -t "tidb://user:password@IP:port/DBName"
mysql -h 10.114.17.165 -P 4002 -uroot -pjabil@2022
use sbs_db;
source /data13/tidb/duming-data/sbs_db-20240528.sql;
Copy the exported data:
scp -r tidb@cnhuam0dp01:/data13/tidb/duming-data/sbs_db /df_bigdata_cluster/data/tidb/bakData/lightData
4. Import into the test environment
nohup /df_bigdata_cluster/software/tidb/tidb-community-server-v6.1.0-linux-amd64/tidb-lightning -config /df_bigdata_cluster/data/tidb/bakData/lightData/sbs_db/sbs-lightning.toml > /df_bigdata_cluster/data/tidb/bakData/lightData/logs/nohup.out &
nohup /df_bigdata_cluster/software/tidb/tidb-community-server-v6.1.0-linux-amd64/tidb-lightning -config /df_bigdata_cluster/data/tidb/bakData/lightData/conf/sbs-lightning.toml > /df_bigdata_cluster/data/tidb/bakData/lightData/logs/nohup.out &
5. Export table schemas
mysqldump -h cnhuam0dp09 -P 4001 -d sbs_db -uroot -pjabil#202210 >/data13/tidb/duming-data/sbs_db-20240528.sql
mysqldump -h cnhuam0dp09 -P 4001 -d valeo_db -uroot -pjabil#202210 >/data13/tidb/duming-data/valeo_db-20240528.sql
mysqldump -h cnhuam0dp09 -P 4001 -d bmwcss_db -uroot -pjabil#202210 >/data13/tidb/duming-data/bmwcss_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d dfengsys_prod_db -uroot -pjabil#202210 >/data13/tidb/duming-data/dfengsys_prod_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d eam_db -uroot -pjabil#202210 >/data13/tidb/duming-data/eam_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d eam_prod_db -uroot -pjabil#202210 >/data13/tidb/duming-data/eam_prod_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ehs_hcas_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ehs_hcas_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d esss_db -uroot -pjabil#202210 >/data13/tidb/duming-data/esss_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d etask_db -uroot -pjabil#202210 >/data13/tidb/duming-data/etask_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d hcss_db -uroot -pjabil#202210 >/data13/tidb/duming-data/hcss_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ic_sap_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ic_pro_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ic_sap_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ic_sap_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d ie_ucs_db -uroot -pjabil#202210 >/data13/tidb/duming-data/ie_ucs_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d iot_db -uroot -pjabil#202210 >/data13/tidb/duming-data/iot_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d mat_fuc_db -uroot -pjabil#202210 >/data13/tidb/duming-data/mat_fuc_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d pub_db -uroot -pjabil#202210 >/data13/tidb/duming-data/pub_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d wef_db -uroot -pjabil#202210 >/data13/tidb/duming-data/wef_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d wic_db -uroot -pjabil#202210 >/data13/tidb/duming-data/wic_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d soea_db -uroot -pjabil#202210 >/data13/tidb/duming-data/soea_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d screw_db -uroot -pjabil#202210 >/data13/tidb/duming-data/screw_db-20240531.sql
mysqldump -h cnhuam0dp09 -P 4001 -d scada_db -uroot -pjabil#202210 >/data13/tidb/duming-data/scada_db-20240531.sql
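The schema dumps above differ only in database name, so a loop can generate them. A sketch that prints the commands instead of running them (credentials masked; only a few of the databases from above are listed, as an illustration):

```shell
# Emit one mysqldump schema-export command per database.
DATE=20240531
CMDS=$(for db in sbs_db valeo_db eam_db iot_db; do
  echo "mysqldump -h cnhuam0dp09 -P 4001 -d $db -uroot -p*** > /data13/tidb/duming-data/${db}-${DATE}.sql"
done)
echo "$CMDS"
```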
Upload the backups to HDFS:
hdfs dfs -put *.sql /user/tidb/dbCreateSQL
Export table data (parameterized):
/home/tidb/.tiup/bin/tiup dumpling -u $userName -p $passwd -P 4001 -h cnhuam0dp09 -o /data13/tidb/duming-data/$dbName/$exportTabName/$partYearMonth -r 200000 --filetype csv --sql "select * from $tidb_tab where $fiterKeyName>=date_sub(curdate(),interval $fromDays day) and $fiterKeyName<= date_sub(curdate(),interval $toDays day)" -F 200MiB --output-filename-template "$4.{{.Index}}" >>/data13/tidb/logs/hue_dumpling.out
Export table data:
/home/tidb/.tiup/bin/tiup dumpling -u root -p jabil#202210 -P 4001 -h cnhuam0dp09 --filetype sql -t 8 -o /data13/tidb/duming-data/sbs_db -r 200000 --filter "sbs_db.*" -F 256MiB >>/data13/tidb/logs/dump_data.out
6. Back up a database
Export the data:
/home/tidb/.tiup/bin/tiup dumpling -u hcss -p hcss -P 4001 -h cnhuam0dp06 --filetype sql -t 8 -o /data13/tidb/dbBak/hcss_20231128 -r 200000 --filter "hcss_db.*" -F 256MiB >>/data13/tidb/logs/hcss.out
zip -r /data13/tidb/dbBak/hcss_20231128.zip /data13/tidb/dbBak
7. Force-start services by role
tiup cluster start tidb-JBDP -R tidb
tiup cluster restart tidb-JBDP -R tiflash
tiup cluster restart tidb-JBDP -R node_exporter
tiup cluster restart tidb-JBDP -R prometheus
tiup cluster start tidb-JBDP -R tikv
tiup cluster stop tidb-JBDP -R tikv
tiup cluster start tidb-JBDP -R prometheus
tiup cluster start tidb-JBDP -R tiflash
tiup cluster start tidb-JBDP -R grafana
tiup cluster start tidb-JBDP -R NgMonitoring
tiup cluster display tidb-JBDP
Force-start a specific node:
tiup cluster start tidb-JBDP
tiup cluster start tidb-JBDP --node 10.114.26.104:20163
tiup cluster start tidb-JBDP --node 10.114.26.113:9100
tiup cluster restart tidb-JBDP -R node_exporter
IV. Tuning
1. Table level: enable TiFlash columnar replicas; columnar storage makes aggregate queries fast.
Enable only for large tables, e.g.:
ALTER TABLE soea_db.ods_product_bay_data SET TIFLASH REPLICA 1;
ALTER TABLE qmp_db.InternalAuditPlanReport SET TIFLASH REPLICA 1;
ALTER TABLE tb_equ_moniter_record_2022 SET TIFLASH REPLICA 1;
ALTER TABLE screw_db.dwd_tighten_basic SET TIFLASH REPLICA 1;
Disable:
ALTER TABLE screw_db.dwd_tighten_basic SET TIFLASH REPLICA 0;
Check:
SELECT * FROM information_schema.tiflash_replica WHERE TABLE_SCHEMA = 'soea_db' and TABLE_NAME = 'ods_product_bay_data';
Enable TiFlash:
ALTER TABLE tab SET TIFLASH REPLICA 1; # enable replica sync
set @@session.tidb_allow_mpp=1; # allow MPP execution
set @@session.tidb_enforce_mpp=1;
2. When a bulk delete fails, delete in batches:
DELETE FROM t WHERE xx LIMIT 5000 (xx = the filter condition)
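The batched DELETE can be driven by a small loop that stops once a batch removes nothing. A sketch: the command passed in is a stand-in for a mysql one-liner such as `mysql ... -N -e "DELETE FROM t WHERE xx LIMIT 5000; SELECT ROW_COUNT();"`, i.e. anything that prints the number of rows it just removed:

```shell
# Repeat a batch command until it reports 0 rows affected.
batch_until_empty() {
  while :; do
    n=$("$@") || return 1        # the command prints the affected-row count
    [ "${n:-0}" -gt 0 ] || break
  done
}
# usage: batch_until_empty mysql -h <host> -P 4000 -uroot -N -e \
#          "DELETE FROM t WHERE xx LIMIT 5000; SELECT ROW_COUNT();"
```

Small batches keep each transaction well under txn-total-size-limit and avoid the OOM-quota errors described in section II.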
3. Configuration parameter tuning
1. Replication error: Transaction is too large, size: 104857819 With statement:
tiup cluster edit-config tidb-JBDP
tiup cluster show-config tidb-JBDP
tiup cluster show-config tidb-jabil
tiup cluster edit-config tidb-jabil
# change the parameter:
performance.txn-total-size-limit: 10737418240  ## single-transaction size limit (e.g. bounds consistent data imports)
Reload to take effect:
tiup cluster reload tidb-JBDP -R tidb
tiup cluster stop tidb-JBDP -R tidb
2. # no more than 45% of the machine's memory
storage.block-cache.capacity: 32GB
# reload to take effect:
tiup cluster reload tidb-JBDP -R tikv
3. Configure the temporary spill directory, under tidb:
tmp-storage-path: /data13/tidb/tmpStorage
performance.txn-total-size-limit: 10737418240
4. Log retention days
log.file.max-days: 30
4. Raise the function output length cap:
Increase the maximum string length GROUP_CONCAT can return;
by default the GROUP_CONCAT result is limited to 1024 bytes.
SET GLOBAL group_concat_max_len = 102400;
select * from mysql.GLOBAL_VARIABLES where variable_name = 'group_concat_max_len';
5. Re-enable ssh-rsa keys (needed on newer OpenSSH):
echo PubkeyAcceptedKeyTypes=+ssh-rsa >> /etc/ssh/sshd_config
systemctl restart sshd
Reload the cluster over system SSH:
tiup cluster reload --ssh system tidb-JBDP
6. Clean logs
tiup cluster clean tidb-jabil --log
tiup cluster stop tidb-jabil
tiup cluster start tidb-jabil
tiup cluster clean tidb-JBDP --log
7. If a start fails once, run tiup cluster start tidb-JBDP again, repeatedly if needed, until it succeeds.
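The "run start again until it succeeds" advice can be wrapped in a small retry helper. A sketch; the attempt count and the cluster name in the usage comment are illustrative:

```shell
# retry N CMD...: run CMD up to N times, stopping at the first success.
retry() {
  n=$1; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    [ "$i" -ge "$n" ] && return 1
    sleep 1
  done
}
# e.g. retry 5 tiup cluster start tidb-JBDP
```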
8. Create/drop table failures are often caused by accumulated logs; clean the logs, and do so regularly:
tiup cluster clean tidb-JBDP --log
Then restart:
tiup cluster start tidb-JBDP
tiup cluster reload tidb-JBDP --role tikv --nodes 10.114.26.112
systemctl start tikv-server
tiup cluster reload tidb-JBDP
9. Check session status
SHOW PROCESSLIST
SELECT * FROM information_schema.processlist WHERE COMMAND='Sleep' ORDER BY db
Fewer than 50 sleeping connections is healthy; beyond that you will see: communications link failure
10. One server shows very high IO and the tikv log reports: Error starting HTTP server" err="accept tcp [::]:9115: accept4: too many open files in system
Fix: vi /etc/security/limits.conf (the TiDB limits must be less than or equal to the system limits, so raise the system limits first or it has no effect); add:
# processes/threads
* soft nproc 131072
* hard nproc 131072
# 文件句柄数
* soft nofile 1000000
* hard nofile 1000000
# 内存锁定交换
* soft memlock unlimited
* hard memlock unlimited
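To confirm a fresh session actually picked up the new limits, read them back from /proc (Linux only; compare the numbers against the limits.conf values above):

```shell
# Show the max-open-files and max-processes limits of the current shell.
limits=$(grep -E 'Max (open files|processes)' /proc/self/limits)
echo "$limits"
```

The same check against a running tikv-server pid (`cat /proc/<pid>/limits`) shows what the service itself was granted, which is what matters for the 9115 error.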
11. Check system IO usage
yum install -y sysstat
One-shot view: iostat
Real-time (refresh every 2 seconds, in MB):
iostat -d -m 2
Config file:
vi /home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/meta.yaml
Then run: tiup cluster reload tidb-JBDP
/data13/tidb/deploy/tikv-20161/bin/tikv-server > /data11/tidb/deploy/tikv-20161/tikv.log 2>&1 &
/data13/tidb/deploy/tikv-20162/bin/tikv-server > /data12/tidb/deploy/tikv-20162/tikv.log 2>&1 &
/data13/tidb/deploy/tikv-20163/bin/tikv-server > /data13/tidb/deploy/tikv-20163/tikv.log 2>&1 &
12. Start tidb services manually (as user tidb)
On node 112:
/data13/tidb/deploy/tidb-4000/bin/tidb-server >/data13/tidb/deploy/tidb-4000/log/tidb.log 2>&1 &
On node 105:
/data13/tidb/deploy/tidb-4000/bin/tidb-server >/data13/tidb/deploy/tidb-4000/log/tidb.log 2>&1 &
/data13/tidb/deploy/tikv-server-20162 start
/data13/tidb/deploy/tikv-20162/bin/tikv-server start
/data13/tidb/deploy/tikv-20163/bin/tikv-server stop
systemctl list-unit-files --type=service --state=disabled
systemctl is-enabled tikv-20163
systemctl disable tikv-20163
13. Cluster data loss
https://zhuanlan.zhihu.com/p/616686144
tiup ctl:v6.1.0 pd -u https://10.114.26.111:2379 --cacert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/ca.crt --key=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.pem --cert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.crt -i
pd -u https://10.114.26.111:2379 --cacert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/ca.crt --key=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.pem --cert=/home/tidb/.tiup/storage/cluster/clusters/tidb-JBDP/tls/client.crt -i » unsafe
https://www.cnblogs.com/vansky/p/9415551.html
14. Detailed tikv store operations
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store 5
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 member leader show
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete 2
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store remove-tombstone 183060412
Add an evict-leader scheduler to move leaders off the store being decommissioned:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 scheduler add evict-leader-scheduler 183060412
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete 183060412 183062372
Cancel a deletion:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store cancel-delete 1
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store delete
17. Remove a region
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 operator remove-region 185889957
18. View tidb configuration parameters
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config show
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores show
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores 1
Full output:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config show all
19. Store and region operations
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 unsafe remove-failed-stores -s 183060412 -r 185889957
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 operator add remove-peer 185889957 183060412
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 operator add remove-peer 4743 183060412
20. View the regions stored on an instance (store)
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq='.regions[]|select(has("leader")|not)|{id: .id, peer_stores: [.peers[].store_id]}'
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 store state_name 1 tombstone
View region details:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region 159869
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config show|grep limit > limit.config
21. Set region-schedule-limit
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config set region-schedule-limit 0
pd-ctl config set region-schedule-limit 0
22. Queries with jq filters
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq='.regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(1) then . else empty end) | length>=$total-length)}' >>tt3.json
Check regions short of replicas:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq='.regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(183062372,183060412,1,14,8,5,2,183,184,16,9,15,17,179394555,179397408,179393803,179395662) then . else empty end) | length>=$total-length)}' >> tt5.json
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq=".regions[] | {id: .id, peer_stores: [.peers[].store_id] | select(length as $total | map(if .==(1,14,8,5,2,183,184,16,9,15,17,179394555,179397408,179393803,179395662) then . else empty end) | length>=$total-length)}"
Check regions with no leader:
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region --jq '.regions[] | select(.leader.store_id == null) | {id: .id, peer_stores: [.peers[].store_id], leader_store: .leader.store_id}' > noleader.json
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 region store 1
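The jq filters above can be tried offline against canned pd-ctl output. A tiny demo of the "regions without a leader" filter; the region and store ids are fabricated, and jq must be installed:

```shell
# Sample of the JSON shape pd-ctl's `region` command returns.
cat > /tmp/regions.json <<'EOF'
{"regions":[
  {"id":101,"peers":[{"store_id":1},{"store_id":2}],"leader":{"store_id":1}},
  {"id":102,"peers":[{"store_id":3}]}
]}
EOF

# Same filter as the noleader.json query above, applied to the sample:
noleader=$(jq -c '.regions[] | select(has("leader") | not) | {id: .id, peer_stores: [.peers[].store_id]}' /tmp/regions.json)
echo "$noleader"
```

Region 102 has no "leader" key, so only it survives the filter.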
23. Remove regions living on the removed stores (run on every tikv node); use this only when tidb cannot start, and expect data loss
On the primary host:
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /data11/tidb/data/tikv-20161 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /data12/tidb/data/tikv-20162 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /data13/tidb/data/tikv-20163 unsafe-recover remove-fail-stores -s 1 --all-regions
Copy the tool to the follower nodes:
scp -r /home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl tidb@cnhuam0dp03:/home/tidb/tikv-ctl
/home/tidb/tikv-ctl --data-dir /data11/tidb/data/tikv-20161 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/tikv-ctl --data-dir /data12/tidb/data/tikv-20162 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
/home/tidb/tikv-ctl --data-dir /data13/tidb/data/tikv-20163 unsafe-recover remove-fail-stores -s 183060412,183062372 --all-regions
Stop tikv, then verify and prune:
curl http://10.114.26.111:10080/regions/23209844
tiup cluster prune tidb-JBDP
/home/tidb/.tiup/components/ctl/v6.1.0/tikv-ctl --data-dir /df_bigdata_cluster/data/tidb/tidb-data/tikv-20160/ bad-regions
set region-schedule-limit 0
/home/tidb/.tiup/components/ctl/v6.1.0/pd-ctl -u http://10.114.26.111:2379 config set max-replicas 1
tiup cluster start tidb-JBDP -N 10.114.26.112:20161
ps -ef | grep tikv-server | grep -v grep | awk '{print $2}' | xargs kill -9
ps -ef | grep pd-server | grep -v grep | awk '{print $2}' | xargs kill -9
tiup mirror clone tiup-custom-mirror-6.1.0 --tiup mirror-6.1.0
tiup mirror grant $USER
cp -rp /home/tidb/tidb-community-server-6.1.0-linux-amd64/keys /home/tidb/.tiup/
tiup mirror merge /home/tidb/tidb-community-server-6.1.0-linux-amd64
tiup mirror clone /home/tidb/tidb-community-server-6.1.0-linux-amd64 --full
tiup mirror clone tiup-custom-mirror-v6.1.0 --PCC 1.0.1 --os=linux --arch=amd64
tiup mirror merge /home/tidb/tiup-custom-mirror-v6.1.0
tiup install PCC
17. Pre-install check results for tuning (tiup cluster check)
Node Check Result Message
---- ----- ------ -------
10.114.26.113 os-version Pass OS is CentOS Linux 7 (Core) 7.9.2009
10.114.26.113 cpu-governor Fail CPU frequency governor is powersave, should use performance
10.114.26.113 sysctl Fail net.ipv4.tcp_syncookies = 1, should be 0
10.114.26.113 sysctl Fail vm.swappiness = 1, should be 0
10.114.26.113 selinux Pass SELinux is disabled
10.114.26.113 thp Fail THP is enabled, please disable it for best performance
10.114.26.113 cpu-cores Pass number of CPU cores / threads: 64
10.114.26.113 memory Pass memory size is 262144MB
10.114.26.113 network Pass network speed of bond0 is 2000MB
10.114.26.113 network Pass network speed of eno5 is 1000MB
10.114.26.113 network Pass network speed of eno6 is 1000MB
10.114.26.113 network Pass network speed of eno7 is 1000MB
10.114.26.113 network Pass network speed of eno8 is 1000MB
10.114.26.113 command Pass numactl: policy: default
10.114.26.113 timezone Pass time zone is the same as the first PD machine: Asia/Shanghai
10.114.26.113 disk Warn mount point /data11 does not have 'noatime' option set
10.114.26.113 service Fail service firewalld is running but should be stopped
10.114.26.113 swap Warn swap is enabled, please disable it for best performance
Check: cat /proc/sys/fs/file-nr
vi /etc/sysctl.conf
fs.file-max=1000000
vm.swappiness=0
net.ipv4.tcp_syncookies=0
sysctl -p
18. tidb scale-out and scale-in
Pre-check:
tiup cluster check tidb-JBDP scale-out.yml --cluster --user tidb
Scale out
Note: check that the paths in the scale-out file are owned by tidb and that the ports are not already in use
tiup cluster scale-out tidb-JBDP scale-out-20161.yml
tiup cluster scale-out tidb-JBDP scale-out-20162.yml
Scale in
tiup cluster start tidb-JBDP --node 10.114.26.104:20163
tiup cluster stop tidb-JBDP --node 10.114.26.104:20163
tiup cluster scale-in tidb-JBDP --node 10.114.26.103:20163
tiup cluster scale-in tidb-JBDP --node 10.114.26.112:20162 --force
tiup cluster scale-in tidb-JBDP --node 10.114.26.112:20161 --force
tiup cluster scale-in --force -N
tiup cluster scale-in prod-cluster --node 192.168.4.18:20160 --force
Note: taking a node offline can take more than 12 hours.
tiup cluster restart tidb-JBDP --node 10.114.26.112:20161
Repair tikv
tiup cluster start tidb-JBDP -N 10.114.26.112:20162
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
tiup ctl:v1.14.1 pd -u http://10.114.26.111:2379 store
tiup cluster force-leader tidb-JBDP 10.114.26.112:20161
tiup cluster force-leader tidb-JBDP --node
Restart a node:
tiup cluster restart tidb-JBDP --node 10.114.26.112:20161
19. Check a tikv process's thread/file limits: cat /proc/49897/limits (49897 is the tikv-server pid)
tidb optimization measures
1. For large tables (over 1 million rows), build a time-based index and always include a time filter in queries to avoid full table scans;
2. When joining tables, apply filter conditions before the join;
3. Replace UNION with UNION ALL; avoid IN and LIKE where possible;
4. Avoid JOINs where possible;
5. For heavy batch computation and analytics, load the data into SparkSQL and run the batch jobs in Spark;
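As an illustration of tips 2 and 3 (the table and column names are invented), the rewrite pushes the time filter below the join and swaps UNION for UNION ALL:

```shell
# Print a before/after SQL rewrite; purely illustrative, not run against a cluster.
sql=$(cat <<'EOF'
-- before: join first, filter later; UNION deduplicates needlessly
SELECT a.id FROM big_a a JOIN big_b b ON a.id = b.id
WHERE a.dt >= '2021-06-01'
UNION
SELECT id FROM big_c WHERE dt >= '2021-06-01';

-- after: filter inside a subquery before the join, and UNION ALL
SELECT a.id
FROM (SELECT id FROM big_a WHERE dt >= '2021-06-01') a
JOIN big_b b ON a.id = b.id
UNION ALL
SELECT id FROM big_c WHERE dt >= '2021-06-01';
EOF
)
echo "$sql"
```

With a time index on dt, the subquery scans only the index range instead of the whole of big_a, and UNION ALL skips the sort/dedup step UNION implies.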
20. Detect tidb hangs and queued jobs
List current DDL jobs:
admin show ddl jobs;
Cancel a job:
ADMIN CANCEL DDL JOBS 1887486;
21. Set the maximum number of connections
SET GLOBAL max_connections = 350;
show variables like '%max_connections%'
Note: this pairs with haproxy's maxconn 1024 (maximum connections); 1024 is roughly 3 * max_connections
Note: the haproxy maxconn values (there may be several) must not exceed 2048 in total, or tidb cannot keep up
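A minimal haproxy.cfg sketch following the sizing note above (the listener name, bind port, and backend addresses are placeholders):

```
global
    maxconn 1024                      # ~3 x tidb max_connections (350)

listen tidb-cluster
    bind 0.0.0.0:3390
    mode tcp
    balance roundrobin
    server tidb1 10.114.26.111:4000 check
    server tidb2 10.114.26.112:4000 check
```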
22. Set the default per-SQL memory limit
#SET GLOBAL tidb_mem_quota_query ='10737418240';
show variables like '%tidb_mem_quota_query%';