Database startup failure caused by ulimits not taking effect, and notes on the related settings

1. Problem description

During an emergency restart of a GreatSQL database in a customer's production environment, the server failed to start.

-- Normal startup in progress
2022-07-16T09:30:27.428609+08:00 0 [Note] [MY-010252] [Server] Server hostname (bind-address): '127.0.0.1'; port: 33062
2022-07-16T09:30:27.429134+08:00 0 [Note] [MY-010264] [Server] - '127.0.0.1' resolves to '127.0.0.1';
2022-07-16T09:30:27.429792+08:00 0 [Note] [MY-010251] [Server] Server socket created on IP: '127.0.0.1'.
2022-07-16T09:30:27.430296+08:00 0 [Note] [MY-010252] [Server] Server hostname (bind-address): '*'; port: 3306
2022-07-16T09:30:27.430816+08:00 0 [Note] [MY-010254] [Server] IPv6 is not available.
2022-07-16T09:30:27.431308+08:00 0 [Note] [MY-010264] [Server] - '0.0.0.0' resolves to '0.0.0.0';
2022-07-16T09:30:27.431991+08:00 0 [ERROR] [MY-010250] [Server] Failed to create a socket for IPv4 '0.0.0.0': errno: 24.
2022-07-16T09:30:27.432466+08:00 0 [ERROR] [MY-010255] [Server] Can't create IP socket: Too many open files
-- Error: Can't create IP socket: Too many open files
2022-07-16T09:30:27.433711+08:00 0 [ERROR] [MY-010119] [Server] Aborting
2022-07-16T09:30:27.435690+08:00 0 [Note] [MY-012330] [InnoDB] FTS optimize thread exiting.
2022-07-16T09:30:28.164281+08:00 0 [Note] [MY-010120] [Server] Binlog end
2022-07-16T09:30:28.165714+08:00 0 [Note] [MY-000000] [Server] Plugin GreatSQL reported: 'Gdb_job_thread stopped!'
2022-07-16T09:30:28.165960+08:00 0 [Note] [MY-000000] [Server] Plugin GreatSQL reported: 'Job manager local thread stopped!'
-- The shutdown sequence starts from here

The error log above points clearly at the open files setting, so we checked the ulimit output.

[GreatSQL@GDB02-DB01 ~]$ ulimit -a
...
open files (-n) 1024
...
[GreatSQL@GDB02-DB01 ~]$

However, the operations staff confirmed that /etc/security/limits.conf sets the per-user maximum number of open files to the expected 65535.

[GreatSQL@GDB02-DB01 ~]$ tail -10 /etc/security/limits.conf 
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#@student - maxlogins 4

* soft nofile 65535
* hard nofile 65535


# End of file
[GreatSQL@GDB02-DB01 ~]$

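Logging in as the GreatSQL user directly through the bastion host showed the wrong limit, but after switching to root and then back to the ordinary GreatSQL user, open files returned to the expected 65535.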

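-- Logging in as the GreatSQL user directly through the bastion host prints a message that open files could not be modified
-bash: ulimit: open files: cannot modify limit: Operation not permitted
-- At this point the open files setting has indeed not taken effect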
[GreatSQL@GDB02-DB01 ~]$ ulimit -a |grep 'open files'
open files (-n) 1024
-- After switching to root, root's open files value is correct
[GreatSQL@GDB02-DB01 ~]$ sudo su - root
Last login: Tue Jul 19 11:40:45 CST 2022 on pts/5
[root@GDB02-DB01 ~]# ulimit -a |grep 'open files'
open files (-n) 65535
-- After su back to GreatSQL, open files is correct as well
[root@GDB02-DB01 ~]# su - GreatSQL
Last login: Tue Jul 26 14:23:56 CST 2022 from XXXXXX on pts/8
[GreatSQL@GDB02-DB01 ~]$ ulimit -a |grep 'open files'
open files (-n) 65535
[GreatSQL@GDB02-DB01 ~]$

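To restore service as quickly as possible, we first advised the operations staff to switch from root back to the ordinary GreatSQL user and start the database from that session. Startup then succeeded, and both the business and the related monitoring (which requires the database to be started by the GreatSQL user) returned to normal.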
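ulimit -a only reports the limits of the shell it runs in. To confirm what a running server process actually inherited, Linux exposes per-process limits in /proc/<pid>/limits. The following is a minimal sketch; the function name proc_nofile is made up for illustration, and the example call inspects the current shell rather than the database.

```shell
#!/usr/bin/env bash
# Print the soft and hard "Max open files" limits of a process.
# Usage: proc_nofile <pid>   (proc_nofile is a hypothetical helper name)
proc_nofile() {
    local pid=$1
    # Line format: Max open files  <soft>  <hard>  files
    awk '/^Max open files/ { print $4, $5 }' "/proc/${pid}/limits"
}

# Example: limits of the current shell; pass the database server's PID instead.
proc_nofile "$$"
```

Run against the database PID, a first value of 1024 would indicate that the process was started from a session whose limits had not been raised.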

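2. Analysis of why ulimits did not take effect

While reproducing and analyzing the problem on a standby machine from the same batch, the operations staff found more information.

(1) Running ulimit as the ordinary GreatSQL user, logged in directly through the bastion host, fails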

[GreatSQL@GDB02-DB02 ~]$ ulimit -n 1026
-bash: ulimit: open files: cannot modify limit: Operation not permitted
[GreatSQL@GDB02-DB02 ~]$ ulimit -Hn
1024 -- the hard limit in effect here is 1024

(2) Logging in as the GreatSQL user directly through the bastion host also prints a related error (previously overlooked)

Connecting to XXXXXX...
Connection established.
To escape to local shell, press 'Ctrl+Alt+]'.

Last login: Tue Jul 26 14:32:32 2022 from XXXXXX

Prepare to login to the target device, Please wait a second.

Last login: Tue Jul 26 14:31:21 2022 from XXXXXX
-bash: ulimit: open files: cannot modify limit: Operation not permitted
[GreatSQL@GDB02-DB01 ~]$ ulimit -a

The ulimits were wrong for the ssh login through the bastion host but correct after su to the same user, so we checked the ssh configuration file and found that UsePAM was left at its default of no.

cat /etc/ssh/sshd_config
.......
# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
# UsePAM no
......

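The cause was now fairly clear: /etc/security/limits.conf is actually the configuration file of pam_limits.so, a module of Linux PAM (Pluggable Authentication Modules). When PAM is not used, the contents of /etc/security/limits.conf are never read.

su, by contrast, is handled as a terminal (TTY) login that goes through PAM by default, which is why the limits were correct after switching from GreatSQL to root on the bastion session and then running su back to GreatSQL.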
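A quick way to check the setting without reading the whole file is sketched below. It assumes the plain "Keyword value" form and ignores Include and Match handling; the helper name effective_usepam is hypothetical. sshd keeps the first value it obtains for a keyword, so the sketch stops at the first uncommented UsePAM line.

```shell
# Print the UsePAM value an sshd_config-style file would yield.
# Usage: effective_usepam <file>   (hypothetical helper name)
effective_usepam() {
    awk '
        # Keywords are case-insensitive; commented-out lines do not count.
        tolower($1) == "usepam" { print tolower($2); found = 1; exit }
        # No active line: the compiled-in default applies.
        END { if (!found) print "unset" }
    ' "$1"
}
```

For example, effective_usepam /etc/ssh/sshd_config run as root on the affected host would be expected to print unset, since the only UsePAM line is commented out. As root, sshd -T prints the full effective configuration and is the more authoritative check.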

3. Solution

(1) Edit the ssh configuration file and set UsePAM yes

vi /etc/ssh/sshd_config
.......
# Set this to 'yes' to enable PAM authentication, account processing,
# and session processing. If this is enabled, PAM authentication will
# be allowed through the ChallengeResponseAuthentication and
# PasswordAuthentication. Depending on your PAM configuration,
# PAM authentication via ChallengeResponseAuthentication may bypass
# the setting of "PermitRootLogin without-password".
# If you just want the PAM account and session checks to run without
# PAM authentication, then enable this but set PasswordAuthentication
# and ChallengeResponseAuthentication to 'no'.
UsePAM yes
......

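PS: After checking with the customer, their machine standard also recommends UsePAM yes, so the likely cause of this incident is that this batch of machines went into production without that setting being checked.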

(2) Restart the sshd service

[root@GDB02-DB01 ~]# systemctl restart sshd
[root@GDB02-DB01 ~]# systemctl status sshd
● sshd.service - SYSV: OpenSSH server daemon
Loaded: loaded (/etc/rc.d/init.d/sshd; bad; vendor preset: enabled)
Active: active (running) since Tue 2022-07-26 10:28:30 CST; 2s ago
Docs: man:systemd-sysv-generator(8)
Process: 46808 ExecStop=/etc/rc.d/init.d/sshd stop (code=exited, status=0/SUCCESS)
Process: 46815 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)
Main PID: 46823 (sshd)
Tasks: 14
Memory: 85.8M
......
[root@GDB02-DB01 ~]#

(3) Verify: after connecting through the bastion host as the GreatSQL application user, the error no longer appears and open files is the configured 65535

Connection established.
To escape to local shell, press 'Ctrl+Alt+]'.

Last login: Tue Jul 26 10:28:11 2022 from XXXXXX

Prepare to login to the target device, Please wait a second.

Last login: Tue Jul 26 10:28:17 2022 from XXXXXX
[GreatSQL@GDB02-DB01 ~]$ ulimit -a
...
open files (-n) 65535
...
[GreatSQL@GDB02-DB01 ~]$
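The same check can be turned into a guard that runs before the database is started, so that a session with the wrong limit fails early instead of failing during socket creation. This is only a sketch: require_nofile is a hypothetical helper, and the threshold is whatever the site configures (65535 in this case).

```shell
# Succeed only if the current soft nofile limit is at least <minimum>.
# Usage: require_nofile <minimum>   (hypothetical helper name)
require_nofile() {
    local min=$1 cur
    cur=$(ulimit -Sn)
    if [ "$cur" = "unlimited" ] || [ "$cur" -ge "$min" ]; then
        return 0
    fi
    echo "open files soft limit is $cur, expected at least $min" >&2
    return 1
}
```

A start script could call require_nofile 65535 and abort when it returns non-zero.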

4. Notes on the limits.conf configuration file

limits.conf limits the resources each user may consume, such as the maximum number of open files, threads, and memory. Typical settings look like this:

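* soft nofile 655350  # maximum number of file descriptors any user can open; the default is 1024, and this value also caps the number of TCP connections
* hard nofile 655350
* soft nproc 655350 # maximum number of processes any user can create
* hard nproc 650000

@student hard nofile 65535
@student soft nofile 4096
@student hard nproc 50 # members of the student group cannot exceed 50 processes (hard limit); the soft limit of 30 applies by default and a user may raise it up to 50
@student soft nproc 30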
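pam_limits also reads the drop-in files under /etc/security/limits.d/, so an entry there can differ from what limits.conf says. The sketch below lists every active entry for one item across a set of files; limits_entries is a hypothetical helper, and it only assumes the four-column "domain type item value" layout shown above.

```shell
# List active entries for one item (e.g. nofile) in limits.conf-style files.
# Usage: limits_entries <item> <file>...   (hypothetical helper name)
limits_entries() {
    local item=$1
    shift
    awk -v item="$item" '
        /^[[:space:]]*(#|$)/ { next }    # skip comments and blank lines
        $3 == item { print FILENAME ": " $1, $2, $3, $4 }
    ' "$@"
}

# Example: show every nofile entry; unreadable or missing files are ignored.
limits_entries nofile /etc/security/limits.conf /etc/security/limits.d/*.conf 2>/dev/null || true
```

Each output line names the file the entry came from, which helps when a drop-in file sets a different value than the main file.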

(1) View the number of processes created by each user (with -L, ps counts every thread)

$ ps h -Led -o user | sort | uniq -c | sort -n
1 chrony
1 dbus
2 postfix
7 polkitd
129 root
1326 GreatSQL

(2) System-wide maximum number of open file descriptors

-- View
$ cat /proc/sys/fs/file-max
6553600
-- Set
$ vim /etc/sysctl.conf
fs.file-max = 6553600

(3) Per-process maximum number of open file handles

-- View the soft limit; ulimit -n shows the soft limit by default
$ ulimit -n
65535
-- View the hard limit
$ ulimit -Hn
65535

-- Temporary setting
-- Use ulimit -Sn to set the soft limit on open file descriptors; note that the soft limit must not exceed the hard limit
$ ulimit -Sn 65535
-- Set the soft and hard limits together. A non-root user can only lower the hard limit, never raise it.
$ ulimit -n 65535

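Permanent setting
# As root, add the following two lines to /etc/security/limits.conf to set both the soft and the hard limit on open file descriptors to 655350 for all users. The change takes effect for sessions opened afterwards (a reboot guarantees this).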
* soft nofile 655350
* hard nofile 655350
Note: the nofile hard limit must not exceed /proc/sys/fs/nr_open. If it does, you will be unable to log in normally after logging out.
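Because a value above nr_open can lock users out, it is worth checking a proposed hard limit before writing it to limits.conf. A minimal sketch follows; fits_nr_open is a hypothetical helper, and the optional second argument exists only so the check can be pointed at a different file when testing.

```shell
# Succeed if <value> does not exceed the kernel's per-process ceiling.
# Usage: fits_nr_open <value> [nr_open_file]   (hypothetical helper name)
fits_nr_open() {
    local value=$1 file=${2:-/proc/sys/fs/nr_open} ceiling
    ceiling=$(cat "$file") || return 2    # 2 = ceiling could not be read
    [ "$value" -le "$ceiling" ]
}
```

For example, fits_nr_open 655350 succeeds on a system where nr_open is 1048576, while fits_nr_open 1048577 does not.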

(4) View the number of file handles currently open system-wide

$ cat /proc/sys/fs/file-nr
5664 0 186405
The first number is the number of file handles currently allocated,
the second is the number allocated but not in use,
and the third equals file-max.
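These three numbers can be reduced to a single utilisation figure for monitoring. The sketch below treats "allocated minus unused" as the handles in use and truncates to a whole percentage; file_nr_usage is a hypothetical helper name.

```shell
# Read "allocated unused max" (the file-nr format) from stdin and print
# the share of the system-wide limit in use, as a whole percentage.
file_nr_usage() {
    awk '{ printf "%d\n", ($1 - $2) * 100 / $3 }'
}

# Example: evaluate the live counters when they are available.
if [ -r /proc/sys/fs/file-nr ]; then
    file_nr_usage < /proc/sys/fs/file-nr
fi
```

With the sample values 5664 0 186405, the function prints 3.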

(5) Finding the maximum value of nofile

Test with ulimit -n: a value within the maximum the system allows is accepted, while a larger value is rejected with an error.

$ ulimit -n 1100000
-bash: ulimit: open files: cannot modify limit: Operation not permitted
$ ulimit -n 1048576
$ ulimit -n 1048577
-bash: ulimit: open files: cannot modify limit: Operation not permitted
$ ulimit -n 1048575
$ ulimit -n 1048576

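(6) Meaning of ulimit -a/n/H/S/u

ulimit -a    show all current resource limits
ulimit -n    set the per-process maximum number of open file descriptors
ulimit -H    operate on the hard limit
ulimit -S    operate on the soft limit
ulimit -u    maximum number of processes available to a user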
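The difference between -S and -H is easiest to see by changing the soft limit and raising it again. The sketch below does this inside a subshell so the calling shell is unaffected; soft_limit_demo is a hypothetical name, and the example assumes the hard limit is at least 256.

```shell
# Lower the soft nofile limit inside a subshell, then raise it back.
# The subshell keeps the change from leaking into the calling shell.
soft_limit_demo() {
    (
        hard=$(ulimit -Hn)
        ulimit -Sn "$1"
        echo "lowered soft: $(ulimit -Sn)"
        ulimit -Sn "$hard"    # allowed: soft may rise up to the hard limit
        echo "restored soft: $(ulimit -Sn)"
    )
}

soft_limit_demo 256
```

Doing the same with -Hn would not be reversible for a non-root user, because a lowered hard limit cannot be raised again.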
