<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Applenice</title>
  
  <subtitle>With a core, with substance, and with dreams</subtitle>
  <link href="https://www.applenice.net/atom.xml" rel="self"/>
  
  <link href="https://www.applenice.net/"/>
  <updated>2026-05-04T17:31:36.134Z</updated>
  <id>https://www.applenice.net/</id>
  
  <author>
    <name>Applenice</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Analyzing an OpenSSH PermitRootLogin Configuration Issue</title>
    <link href="https://www.applenice.net/2026/05/04/OpenSSH-PermitRootLogin-Configuration/"/>
    <id>https://www.applenice.net/2026/05/04/OpenSSH-PermitRootLogin-Configuration/</id>
    <published>2026-05-04T15:10:31.000Z</published>
    <updated>2026-05-04T17:31:36.134Z</updated>
    
    <content type="html"><![CDATA[<p>To fix vulnerabilities on CentOS 7.6, I built my own OpenSSH RPM. It passed verification on product A, but after product B upgraded with the same RPM following the documented steps, new ssh sessions could no longer log in to the backend as root, and we had to keep an old session alive to troubleshoot. So what caused this?</p><span id="more"></span><h3 id="问题介绍"><a href="#问题介绍" class="headerlink" title="问题介绍"></a>Problem Description</h3><p>OS: CentOS 7.6.1810<br>OpenSSH version: OpenSSH_7.4p1, OpenSSL 1.0.2k-fips  26 Jan 2017</p><p>A vulnerability scan flagged issues that had to be fixed, so I chose to build the latest OpenSSH as an RPM, upgrade with it, and verify it against product A's workloads. Product A passed, but product B, which runs the same OS, hit a serious problem: after the upgrade, new ssh sessions could not log in as root, and only an existing session kept us connected for troubleshooting.</p><p>The old session showed the following:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash">systemctl status sshd</span></span><br><span class="line">● sshd.service - SYSV: OpenSSH server daemon</span><br><span class="line">   Loaded: loaded (/etc/rc.d/init.d/sshd; bad; vendor preset: enabled)</span><br><span class="line">   Active: active (running) since Mon 2026-05-04 22:34:55 CST; 49min ago</span><br><span class="line">     Docs: man:systemd-sysv-generator(8)</span><br><span class="line">  Process: 9262 ExecStart=/etc/rc.d/init.d/sshd start (code=exited, status=0/SUCCESS)</span><br><span class="line"> Main PID: 9284 (sshd)</span><br><span class="line">    
Tasks: 1</span><br><span class="line">   Memory: 11.0M</span><br><span class="line">   CGroup: /system.slice/sshd.service</span><br><span class="line">           └─9284 sshd: /usr/sbin/sshd [listener] 0 of 10-100 startups</span><br><span class="line"></span><br><span class="line">May 04 22:34:55 localhost systemd[1]: Starting SYSV: OpenSSH server daemon...</span><br><span class="line">May 04 22:34:55 localhost sshd[9284]: Server listening on 0.0.0.0 port 22.</span><br><span class="line">May 04 22:34:55 localhost sshd[9284]: Server listening on :: port 22.</span><br><span class="line">May 04 22:34:55 localhost sshd[9262]: Starting sshd:[  OK  ]</span><br><span class="line">May 04 22:34:55 localhost systemd[1]: Started SYSV: OpenSSH server daemon.</span><br><span class="line">May 04 22:36:56 localhost sshd[9284]: Timeout before authentication for connection from 192.168.0.133 to 192.168.0.105, pid = 9590</span><br><span class="line">May 04 22:46:21 localhost sshd-session[10584]: Accepted keyboard-interactive/pam for develop from 192.168.0.105 port 54452 ssh2</span><br><span class="line">May 04 23:24:31 localhost sshd-session[11113]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=127.0.0.1  user=root</span><br><span class="line">May 04 23:24:31 localhost sshd-session[11113]: pam_succeed_if(sshd:auth): requirement &quot;uid &gt;= 1000&quot; not met by user &quot;root&quot;</span><br><span class="line">May 04 23:24:32 localhost sshd-session[11111]: error: PAM: Authentication failure for root from 127.0.0.1</span><br></pre></td></tr></table></figure><p>According to this output, PAM reports that only regular users with uid &gt;= 1000 may log in. At first glance this points to a PAM configuration problem, and we did spend some time investigating in that direction.</p><h3 id="分析PAM相关配置"><a href="#分析PAM相关配置" class="headerlink" title="分析PAM相关配置"></a>Analyzing the PAM Configuration</h3><p>Following that hint, check the PAM configuration:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash">grep -ni <span class="string">&quot;pam_succeed_if&quot;</span> /etc/pam.d/*ac</span></span><br><span class="line">/etc/pam.d/fingerprint-auth-ac:10:account     sufficient    pam_succeed_if.so uid &lt; 1000 quiet</span><br><span class="line">/etc/pam.d/fingerprint-auth-ac:18:session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid</span><br><span class="line">/etc/pam.d/password-auth-ac:7:auth        requisite     pam_succeed_if.so uid &gt;= 1000 quiet_success</span><br><span class="line">/etc/pam.d/password-auth-ac:12:account     sufficient    pam_succeed_if.so uid &lt; 1000 quiet</span><br><span class="line">/etc/pam.d/password-auth-ac:24:session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid</span><br><span class="line">/etc/pam.d/postlogin-ac:6:session     [success=1 default=ignore] pam_succeed_if.so service !~ gdm* service !~ su* quiet</span><br><span class="line">/etc/pam.d/smartcard-auth-ac:10:account     sufficient    pam_succeed_if.so uid &lt; 1000 quiet</span><br><span class="line">/etc/pam.d/smartcard-auth-ac:18:session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid</span><br><span class="line">/etc/pam.d/system-auth-ac:8:auth        requisite     pam_succeed_if.so uid &gt;= 1000 quiet_success</span><br><span class="line">/etc/pam.d/system-auth-ac:13:account     sufficient    pam_succeed_if.so uid &lt; 1000 quiet</span><br><span 
class="line">/etc/pam.d/system-auth-ac:23:session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash"><span class="built_in">ls</span> -lrt /etc/pam.d/password-auth</span></span><br><span class="line">lrwxrwxrwx. 1 root root 16 Apr 16  2023 /etc/pam.d/password-auth -&gt; password-auth-ac</span><br></pre></td></tr></table></figure><p>The one relevant to password authentication is password-auth-ac:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash"><span class="built_in">cat</span> /etc/pam.d/password-auth-ac</span> </span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">%PAM-1.0</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">This file is auto-generated.</span></span><br><span class="line"><span class="meta prompt_"># </span><span 
class="language-bash">User changes will be destroyed the next time authconfig is run.</span></span><br><span class="line">auth        required      pam_env.so</span><br><span class="line">auth        required      pam_faildelay.so delay=2000000</span><br><span class="line">auth        sufficient    pam_unix.so nullok try_first_pass</span><br><span class="line">auth        requisite     pam_succeed_if.so uid &gt;= 1000 quiet_success</span><br><span class="line">auth        required      pam_deny.so</span><br><span class="line"></span><br><span class="line">account     required      pam_unix.so</span><br><span class="line">account     sufficient    pam_localuser.so</span><br><span class="line">account     sufficient    pam_succeed_if.so uid &lt; 1000 quiet</span><br><span class="line">account     required      pam_permit.so</span><br><span class="line"></span><br><span class="line">password    requisite     pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=</span><br><span class="line">password    sufficient    pam_unix.so sha512 shadow nullok try_first_pass use_authtok</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">password    required      pam_deny.so</span><br><span class="line"></span><br><span class="line">session     optional      pam_keyinit.so revoke</span><br><span class="line">session     required      pam_limits.so</span><br><span class="line">-session     optional      pam_systemd.so</span><br><span class="line">session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid</span><br><span class="line">session     required      pam_unix.so</span><br></pre></td></tr></table></figure><p>A normal root login therefore goes through roughly this chain:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td 
class="code"><pre><span class="line">sshd</span><br><span class="line"><span class="meta prompt_">  -&gt; </span><span class="language-bash">/etc/pam.d/sshd</span></span><br><span class="line">      -&gt; auth substack password-auth</span><br><span class="line">          -&gt; /etc/pam.d/password-auth</span><br><span class="line">              -&gt; /etc/pam.d/password-auth-ac</span><br></pre></td></tr></table></figure><p>PAM then evaluates password-auth-ac line by line, in order, until it reaches:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">auth        sufficient    pam_unix.so nullok try_first_pass</span><br></pre></td></tr></table></figure><p>Here pam_unix.so checks the root password against the local &#x2F;etc&#x2F;shadow. If the password is correct, pam_unix.so returns success. Because its control flag is sufficient, a success here, provided no earlier required module has failed, makes the whole auth stack succeed immediately, and the remaining modules are skipped.  </p><p>So with a correct root password, the uid check on the next line is never reached.  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">auth        requisite     pam_succeed_if.so uid &gt;= 1000 quiet_success</span><br></pre></td></tr></table></figure><p>However, neither product A nor product B had modified password-auth-ac, and the file matched a fresh OS install. So what was actually different?  </p><h3 id="故障解除"><a href="#故障解除" class="headerlink" title="故障解除"></a>Resolving the Failure</h3><p>Stepping back, there were two facts:</p><ul><li>Product A worked normally both before and after the upgrade</li><li>On product B, root login worked before the OpenSSH upgrade and broke only afterwards</li></ul><p>This called for a different hypothesis: most likely the two products differed in their ssh configuration, which produced the different behavior.</p><p>Product A ships a customized sshd_config that overwrites the one from the initial installation:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash"><span class="built_in">cat</span> /etc/ssh/sshd_config | grep Root</span></span><br><span class="line">PermitRootLogin yes</span><br><span 
class="line"><span class="meta prompt_"># </span><span class="language-bash">the setting of <span class="string">&quot;PermitRootLogin without-password&quot;</span>.</span></span><br></pre></td></tr></table></figure><p>Product B uses the sshd_config exactly as left by a fresh CentOS 7.6 installation:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash"><span class="built_in">cat</span> /etc/ssh/sshd_config | grep Root</span></span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">PermitRootLogin <span class="built_in">yes</span></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">the setting of <span class="string">&quot;PermitRootLogin without-password&quot;</span>.</span></span><br></pre></td></tr></table></figure><p>After explicitly setting PermitRootLogin to yes in product B's configuration and restarting sshd, the failure was resolved, and new ssh sessions could log in as root again.</p><h3 id="问题根因"><a href="#问题根因" class="headerlink" title="问题根因"></a>Root Cause</h3><p>One question remains: product B never modified sshd_config or password-auth-ac, so what actually caused the problem after the upgrade? And why did enabling PermitRootLogin yes fix it?</p><p>To reproduce the failure I used a virtual machine. On a freshly installed CentOS 7.6 machine, the state is:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash"><span class="built_in">cat</span> /etc/ssh/sshd_config | grep 
Root</span></span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">PermitRootLogin <span class="built_in">yes</span></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">the setting of <span class="string">&quot;PermitRootLogin without-password&quot;</span>.</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash">sshd -T | grep permit</span></span><br><span class="line">permitrootlogin yes</span><br><span class="line">permittty yes</span><br><span class="line">permituserrc yes</span><br><span class="line">permitemptypasswords no</span><br><span class="line">permituserenvironment no</span><br><span class="line">permittunnel no</span><br><span class="line">permitopen any</span><br></pre></td></tr></table></figure><p>sshd -T parses the full configuration and prints the effective values, which is useful for configuration validation, troubleshooting, and security audits. It shows that permitrootlogin defaults to yes here.</p><p>After upgrading OpenSSH, the same checks show:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash"><span class="built_in">cat</span> /etc/ssh/sshd_config | grep Root</span></span><br><span class="line"><span class="meta prompt_">#</span><span class="language-bash">PermitRootLogin <span class="built_in">yes</span></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">the setting of <span 
class="string">&quot;PermitRootLogin without-password&quot;</span>.</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">[root@localhost]# </span><span class="language-bash">sshd -T | grep permit</span></span><br><span class="line">permitrootlogin without-password</span><br><span class="line">permittty yes</span><br><span class="line">permituserrc yes</span><br><span class="line">permitemptypasswords no</span><br><span class="line">permittunnel no</span><br><span class="line">permitopen any</span><br><span class="line">permitlisten any</span><br><span class="line">permituserenvironment no</span><br></pre></td></tr></table></figure><p>PermitRootLogin is now without-password. This value is equivalent to prohibit-password: root may log in with a public key, but not with a password and not through keyboard-interactive&#x2F;PAM password prompts.</p><p>This gives the first conclusion:  </p><p><strong>After the OpenSSH upgrade, the default for PermitRootLogin changed from yes to without-password, so new ssh sessions could no longer log in as root with a password.</strong></p><h3 id="谁改了PermitRootLogin默认行为"><a href="#谁改了PermitRootLogin默认行为" class="headerlink" title="谁改了PermitRootLogin默认行为"></a>Who Changed the PermitRootLogin Default</h3><p>The next question is when the PermitRootLogin default changed from yes to without-password.</p><p>The OpenSSH 7.0 release notes list the following under potentially incompatible changes:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">OpenSSH 7.0 was released on 2015-08-11</span><br><span class="line"></span><br><span class="line">Potentially-incompatible 
Changes</span><br><span class="line">--------------------------------</span><br><span class="line"> .....</span><br><span class="line"></span><br><span class="line"> * The default for the sshd_config(5) PermitRootLogin option has</span><br><span class="line">   changed from &quot;yes&quot; to &quot;prohibit-password&quot;.</span><br><span class="line"></span><br><span class="line"> * PermitRootLogin=without-password/prohibit-password now bans all</span><br><span class="line">   interactive authentication methods, allowing only public-key,</span><br><span class="line">   hostbased and GSSAPI authentication (previously it permitted</span><br><span class="line">   keyboard-interactive and password-less authentication if those</span><br><span class="line">   were enabled).</span><br></pre></td></tr></table></figure><p>But OpenSSH 7.0 was released on 2015-08-11, while CentOS 7.6.1810 ships OpenSSH 7.4p1, and OpenSSH 7.4 was released on 2016-12-19, well after the change.</p><p>The 7.4 source confirms that when PermitRootLogin is not set, it is assigned PERMIT_NO_PASSWD, i.e. without-password&#x2F;prohibit-password.<br>Source: <a href="https://github.com/openssh/openssh-portable/blob/V_7_4_P1/servconf.c">https://github.com/openssh/openssh-portable/blob/V_7_4_P1/servconf.c</a>  </p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">    ...</span><br><span class="line">    options-&gt;permit_root_login = PERMIT_NOT_SET;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> (options-&gt;permit_root_login 
== PERMIT_NOT_SET)</span><br><span class="line">options-&gt;permit_root_login = PERMIT_NO_PASSWD;</span><br><span class="line">    ...</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="type">const</span> <span class="class"><span class="keyword">struct</span> <span class="title">multistate</span> <span class="title">multistate_permitrootlogin</span>[] =</span> &#123;</span><br><span class="line">&#123; <span class="string">&quot;without-password&quot;</span>,PERMIT_NO_PASSWD &#125;,</span><br><span class="line">&#123; <span class="string">&quot;prohibit-password&quot;</span>,PERMIT_NO_PASSWD &#125;,</span><br><span class="line">&#123; <span class="string">&quot;forced-commands-only&quot;</span>,PERMIT_FORCED_ONLY &#125;,</span><br><span class="line">&#123; <span class="string">&quot;yes&quot;</span>,PERMIT_YES &#125;,</span><br><span class="line">&#123; <span class="string">&quot;no&quot;</span>,PERMIT_NO &#125;,</span><br><span class="line">&#123; <span class="literal">NULL</span>, <span class="number">-1</span> &#125;</span><br><span class="line">&#125;;</span><br></pre></td></tr></table></figure><p>So if the change was already made in 7.0, why is the PermitRootLogin default still yes on CentOS 7.6.1810?  </p><p>Further digging in the archived CentOS 7 repository turned up the distribution's openssh patches. One of them, openssh-7.4p1-permit-root-login.patch, reverts the upstream change shown above and sets the default back to PERMIT_YES.</p><p>Source: <a href="https://gitlab.com/CentOS/archives/git.centos.org/rpms/openssh/-/blob/c7/SOURCES/openssh-7.4p1-permit-root-login.patch">https://gitlab.com/CentOS/archives/git.centos.org/rpms/openssh/-/blob/c7/SOURCES/openssh-7.4p1-permit-root-login.patch</a></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span 
class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br></pre></td><td class="code"><pre><span class="line">diff -up openssh-7.4p1/servconf.c.permit-root openssh-7.4p1/servconf.c</span><br><span class="line">--- openssh-7.4p1/servconf.c.permit-root2017-02-10 10:27:18.109487568 +0100</span><br><span class="line">+++ openssh-7.4p1/servconf.c2017-02-10 10:28:12.385776132 +0100</span><br><span class="line">@@ -231,7 +231,7 @@ fill_default_server_options(ServerOption</span><br><span class="line"> if (options-&gt;login_grace_time == -1)</span><br><span class="line"> options-&gt;login_grace_time = 120;</span><br><span class="line"> if (options-&gt;permit_root_login == PERMIT_NOT_SET)</span><br><span class="line">-options-&gt;permit_root_login = PERMIT_NO_PASSWD;</span><br><span class="line">+options-&gt;permit_root_login = PERMIT_YES;</span><br><span class="line"> if (options-&gt;ignore_rhosts == -1)</span><br><span class="line"> options-&gt;ignore_rhosts = 1;</span><br><span class="line"> if (options-&gt;ignore_user_known_hosts == -1)</span><br><span class="line">diff -up openssh-7.4p1/sshd_config.5.permit-root openssh-7.4p1/sshd_config.5</span><br><span 
class="line">--- openssh-7.4p1/sshd_config.5.permit-root2017-02-10 10:28:24.174605582 +0100</span><br><span class="line">+++ openssh-7.4p1/sshd_config.52017-02-10 10:28:42.254344023 +0100</span><br><span class="line">@@ -1227,7 +1227,7 @@ The argument must be</span><br><span class="line"> or</span><br><span class="line"> .Cm no .</span><br><span class="line"> The default is</span><br><span class="line">-.Cm prohibit-password .</span><br><span class="line">+.Cm yes .</span><br><span class="line"> .Pp</span><br><span class="line"> If this option is set to</span><br><span class="line"> .Cm prohibit-password</span><br><span class="line">diff -up openssh-7.4p1/sshd_config.permit-root openssh-7.4p1/sshd_config</span><br><span class="line">--- openssh-7.4p1/sshd_config.permit-root2017-02-10 10:26:52.256797645 +0100</span><br><span class="line">+++ openssh-7.4p1/sshd_config2017-02-10 10:26:52.276797405 +0100</span><br><span class="line">@@ -35,7 +35,7 @@ SyslogFacility AUTHPRIV</span><br><span class="line"> # Authentication:</span><br><span class="line"> </span><br><span class="line"> #LoginGraceTime 2m</span><br><span class="line">-#PermitRootLogin prohibit-password</span><br><span class="line">+#PermitRootLogin yes</span><br><span class="line"> #StrictModes yes</span><br><span class="line"> #MaxAuthTries 6</span><br><span class="line"> #MaxSessions 10</span><br></pre></td></tr></table></figure><p>This gives the second conclusion:  </p><p><strong>When packaging OpenSSH 7.4p1, CentOS 7 (or its upstream, Red Hat) applied a patch to the source, probably for compatibility, that changed the default back to yes. So on a freshly installed CentOS 7 system, even though the configuration file only contains the commented-out #PermitRootLogin yes, root password login is still allowed by default.</strong></p><p>In summary, once we manually upgraded to an RPM built from the upstream source, the default that upstream has used since OpenSSH 7.0, PERMIT_NO_PASSWD (that is, prohibit-password &#x2F; without-password), took effect on the machine. As a result, without a single line of PAM configuration changing, root could no longer log in with a password in new ssh sessions after the upgrade. The pam_succeed_if message in the log is then most likely a symptom rather than the cause: the log shows pam_unix failing for root first, after which the stack falls through to the uid &gt;= 1000 check.</p><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>Summary</h3><p>1. Starting with OpenSSH 7.0, the default for PermitRootLogin changed from yes to prohibit-password &#x2F; without-password.<br>2. A distribution may change OpenSSH defaults for compatibility or other reasons, and the effect on configuration can be significant, so each distribution has to be analyzed on its own terms.<br>3. When upgrading a component, verify and analyze more thoroughly, and check the effective values of key configuration options before and after the change, so that configuration differences do not cause outages.</p><h3 id="参考"><a href="#参考" class="headerlink" title="参考"></a>References</h3><p>1. <a href="https://www.openssh.org/txt/release-7.0">https://www.openssh.org/txt/release-7.0</a><br>2. <a href="https://www.openssh.org/txt/release-7.4">https://www.openssh.org/txt/release-7.4</a><br>3. <a href="https://gitlab.com/CentOS/archives/git.centos.org/rpms/openssh/-/blob/c7/SOURCES/openssh-7.4p1-permit-root-login.patch">https://gitlab.com/CentOS/archives/git.centos.org/rpms/openssh/-/blob/c7/SOURCES/openssh-7.4p1-permit-root-login.patch</a><br>4. <a href="https://github.com/openssh/openssh-portable/blob/V_7_4_P1/servconf.c">https://github.com/openssh/openssh-portable/blob/V_7_4_P1/servconf.c</a>  </p>]]></content>
    
    
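    <summary type="html">&lt;p&gt;To fix vulnerabilities on CentOS 7.6, I built my own OpenSSH RPM. It passed verification on product A, but after product B upgraded with the same RPM following the documented steps, new ssh sessions could no longer log in to the backend as root, and we had to keep an old session alive to troubleshoot. So what caused this?&lt;/p&gt;</summary>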
    
    
    
    <category term="Skill" scheme="https://www.applenice.net/categories/Skill/"/>
    
    
    <category term="Linux" scheme="https://www.applenice.net/tags/Linux/"/>
    
    <category term="OpenSSH" scheme="https://www.applenice.net/tags/OpenSSH/"/>
    
    <category term="CentOS" scheme="https://www.applenice.net/tags/CentOS/"/>
    
  </entry>
  
  <entry>
    <title>Resolving Service .so Library Conflicts with LD_LIBRARY_PATH</title>
    <link href="https://www.applenice.net/2025/12/08/Resolving-Service-SO-Library-Conflicts-Using-LD-LIBRARY-PATH/"/>
    <id>https://www.applenice.net/2025/12/08/Resolving-Service-SO-Library-Conflicts-Using-LD-LIBRARY-PATH/</id>
    <published>2025-12-08T14:15:47.000Z</published>
    <updated>2025-12-08T15:06:53.927Z</updated>
    
    <content type="html"><![CDATA[<p>I recently ran into a case where, while deploying our service on a customer-provided machine, the required .so libraries and their symlinks already existed, yet the service reported missing symbols and failed to start. The cause turned out to be another service on the machine, PubKit, which had already claimed symlinks of the same names and pointed them at an older version of the libraries. How should this be handled so that the two services do not affect each other?</p><span id="more"></span><h3 id="环境情况"><a href="#环境情况" class="headerlink" title="环境情况"></a>Environment</h3><p>To reproduce the problem and document the fix, I set up a similar environment at home:<br>OS: Kylin Linux Advanced Server release V10 SP3 2403&#x2F;(Halberd)-x86_64-Build20&#x2F;20240426<br>Software: MongoDB 3.4.10 and its dependencies libcrypto.so.1.0.2k and libssl.so.1.0.2k</p><p>Only the failure point caused by PubKit is simulated. PubKit is known to use these library symlinks:  </p><ul><li>libcrypto.so.10, pointing to libcrypto.so.1.0.0</li><li>libssl.so.10, pointing to libssl.so.1.0.0</li></ul><p>Prebuilt libraries of that version are hard to find: OpenSSL 1.0.0t dates back to 2015, and releases from CentOS 6.6 onward mostly ship OpenSSL 1.0.1. Fortunately it can be built from source, available at: <a href="https://openssl-library.org/source/old/1.0.0/">https://openssl-library.org/source/old/1.0.0/</a>  </p><p>I built OpenSSL 1.0.0t on CentOS 7.6, mainly out of concern about GCC compatibility; CentOS 7.6 ships GCC 4.8.5, which is within the compatible range.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">cd /home/misaka</span><br><span class="line">tar -xvf openssl-1.0.0t.tar.gz</span><br><span class="line">cd openssl-1.0.0t</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Build shared libraries, hence the shared option; no install step is planned, so prefix and openssldir are omitted</span></span><br><span class="line">./config shared </span><br><span class="line">make</span><br></pre></td></tr></table></figure><p>Once the build finishes, libcrypto.so.1.0.0 and libssl.so.1.0.0 are in the current directory; copy them out for use on the reproduction machine.</p><h3 id="故障复现"><a href="#故障复现" class="headerlink" title="故障复现"></a>Reproducing the Failure</h3><p>1. Upload the libcrypto.so.1.0.0 and libssl.so.1.0.0 files prepared above to the Kylin V10 virtual machine (which uses 1.1.1f by default, so there is no conflict with the 1.0.0 series), and create the symlinks to simulate PubKit having claimed them first on the customer's machine.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost 100]# cp libcrypto.so.1.0.0 /usr/lib64/</span><br><span class="line">[root@localhost 100]# cp libssl.so.1.0.0 /usr/lib64/</span><br><span class="line">[root@localhost 100]# cd /usr/lib64</span><br><span class="line"></span><br><span class="line">[root@localhost lib64]# ln -s libcrypto.so.1.0.0 libcrypto.so.10</span><br><span class="line">[root@localhost lib64]# ls -lrt libcrypto*</span><br><span class="line">lrwxrwxrwx 1 root root      19 Feb 28  2024 libcrypto.so.1.1 -&gt; libcrypto.so.1.1.1f</span><br><span class="line">lrwxrwxrwx 1 root root      19 Feb 28  2024 libcrypto.so -&gt; libcrypto.so.1.1.1f</span><br><span class="line">-rwxr-xr-x 1 root root 3078240 Feb 28  2024 libcrypto.so.1.1.1f</span><br><span class="line">-rw-r--r-- 1 root root 5806670 Feb 28  2024 libcrypto.a</span><br><span class="line">-rw-r--r-- 1 root root 2059728 Dec  7 00:33 libcrypto.so.1.0.0</span><br><span class="line">lrwxrwxrwx 1 root root      18 Dec  7 00:34 libcrypto.so.10 -&gt; libcrypto.so.1.0.0</span><br><span class="line"></span><br><span 
class="line">[root@localhost lib64]# ln -s libssl.so.1.0.0 libssl.so.10</span><br><span class="line">[root@localhost lib64]# ls -lrt libssl*</span><br><span class="line">lrwxrwxrwx 1 root root      16 Feb 28  2024 libssl.so.1.1 -&gt; libssl.so.1.1.1f</span><br><span class="line">lrwxrwxrwx 1 root root      16 Feb 28  2024 libssl.so -&gt; libssl.so.1.1.1f</span><br><span class="line">-rwxr-xr-x 1 root root  624664 Feb 28  2024 libssl.so.1.1.1f</span><br><span class="line">-rw-r--r-- 1 root root 1073826 Feb 28  2024 libssl.a</span><br><span class="line">-rwxr-xr-x 1 root root  386368 Mar 15  2024 libssl3.so</span><br><span class="line">-rw-r--r-- 1 root root  441560 Dec  7 01:20 libssl.so.1.0.0</span><br><span class="line">lrwxrwxrwx 1 root root      15 Dec  7 01:21 libssl.so.10 -&gt; libssl.so.1.0.0</span><br></pre></td></tr></table></figure><p>2. Install the prepared MongoDB RPM packages:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongo3.4.10]# ls -lrt</span><br><span class="line">total 93532</span><br><span class="line">-rw-rw-r-- 1 misaka misaka     5988 Dec  7 00:32 mongodb-org-3.4.10-1.el7.x86_64.rpm</span><br><span class="line">-rw-rw-r-- 1 misaka misaka 12193778 Dec  7 00:32 mongodb-org-mongos-3.4.10-1.el7.x86_64.rpm</span><br><span class="line">-rw-rw-r-- 1 misaka misaka 20625955 Dec  7 00:32 
mongodb-org-server-3.4.10-1.el7.x86_64.rpm</span><br><span class="line">-rw-rw-r-- 1 misaka misaka 11777295 Dec  7 00:32 mongodb-org-shell-3.4.10-1.el7.x86_64.rpm</span><br><span class="line">-rw-rw-r-- 1 misaka misaka 51164902 Dec  7 00:32 mongodb-org-tools-3.4.10-1.el7.x86_64.rpm</span><br><span class="line"></span><br><span class="line">[root@localhost mongo3.4.10]# rpm -ivh *.rpm --force --nodeps</span><br><span class="line">warning: mongodb-org-3.4.10-1.el7.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID a15703c6: NOKEY</span><br><span class="line">Verifying...                          ################################# [100%]</span><br><span class="line">Preparing...                          ################################# [100%]</span><br><span class="line">Updating / installing...</span><br><span class="line">   1:mongodb-org-tools-3.4.10-1.el7   ################################# [ 20%]</span><br><span class="line">   2:mongodb-org-shell-3.4.10-1.el7   ################################# [ 40%]</span><br><span class="line">   3:mongodb-org-server-3.4.10-1.el7  ################################# [ 60%]</span><br><span class="line">Created symlink /etc/systemd/system/multi-user.target.wants/mongod.service → /usr/lib/systemd/system/mongod.service.</span><br><span class="line">   4:mongodb-org-mongos-3.4.10-1.el7  ################################# [ 80%]</span><br><span class="line">   5:mongodb-org-3.4.10-1.el7         ################################# [100%]</span><br></pre></td></tr></table></figure><p>3. Start the MongoDB service and check its status: the service fails to start, reporting <code>/usr/bin/mongod: symbol FIPS_mode_set version libcrypto.so.10 not defined</code>.  
</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongo3.4.10]# systemctl start mongod</span><br><span class="line">[root@localhost mongo3.4.10]# systemctl status mongod</span><br><span class="line">● mongod.service - High-performance, schema-free document-oriented database</span><br><span class="line">   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)</span><br><span class="line">   Active: failed (Result: exit-code) since Sun 2025-12-07 01:28:24 CST; 4s ago</span><br><span class="line">     Docs: https://docs.mongodb.org/manual</span><br><span class="line">  Process: 7414 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)</span><br><span class="line">  Process: 7416 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)</span><br><span class="line">  Process: 7418 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)</span><br><span class="line">  Process: 7420 ExecStart=/usr/bin/mongod $OPTIONS (code=exited, status=127)</span><br><span class="line"> Main PID: 7420 (code=exited, status=127)</span><br><span class="line"></span><br><span class="line">Dec 07 01:28:24 localhost.localdomain systemd[1]: Starting High-performance, 
schema-free document-oriented database...</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain systemd[1]: Started High-performance, schema-free document-oriented database.</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain mongod[7420]: /usr/bin/mongod: /lib64/libcrypto.so.10: no version information available (required by /usr/bin/mongod)</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain mongod[7420]: /usr/bin/mongod: /lib64/libcrypto.so.10: no version information available (required by /usr/bin/mongod)</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain mongod[7420]: /usr/bin/mongod: /lib64/libssl.so.10: no version information available (required by /usr/bin/mongod)</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain mongod[7420]: /usr/bin/mongod: relocation error: /usr/bin/mongod: symbol FIPS_mode_set version libcrypto.so.10 not defined in file libcrypto.so.1&gt;</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain systemd[1]: mongod.service: Main process exited, code=exited, status=127/n/a</span><br><span class="line">Dec 07 01:28:24 localhost.localdomain systemd[1]: mongod.service: Failed with result &#x27;exit-code&#x27;.</span><br></pre></td></tr></table></figure><p>4. Inspect the dynamic library dependencies of the binary</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongo3.4.10]# ldd 
/usr/bin/mongod</span><br><span class="line">/usr/bin/mongod: /lib64/libcrypto.so.10: no version information available (required by /usr/bin/mongod)</span><br><span class="line">/usr/bin/mongod: /lib64/libcrypto.so.10: no version information available (required by /usr/bin/mongod)</span><br><span class="line">/usr/bin/mongod: /lib64/libssl.so.10: no version information available (required by /usr/bin/mongod)</span><br><span class="line">        linux-vdso.so.1 (0x00007ffc0ed99000)</span><br><span class="line">        libssl.so.10 =&gt; /lib64/libssl.so.10 (0x00007fa28fa5b000)</span><br><span class="line">        libcrypto.so.10 =&gt; /lib64/libcrypto.so.10 (0x00007fa28f69d000)</span><br><span class="line">        librt.so.1 =&gt; /usr/lib64/librt.so.1 (0x00007fa28f694000)</span><br><span class="line">        libdl.so.2 =&gt; /usr/lib64/libdl.so.2 (0x00007fa28f68f000)</span><br><span class="line">        libm.so.6 =&gt; /usr/lib64/libm.so.6 (0x00007fa28f50e000)</span><br><span class="line">        libgcc_s.so.1 =&gt; /usr/lib64/libgcc_s.so.1 (0x00007fa28f4f5000)</span><br><span class="line">        libpthread.so.0 =&gt; /usr/lib64/libpthread.so.0 (0x00007fa28f4d3000)</span><br><span class="line">        libc.so.6 =&gt; /usr/lib64/libc.so.6 (0x00007fa28f326000)</span><br><span class="line">        /lib64/ld-linux-x86-64.so.2 (0x00007fa292a26000)</span><br><span class="line"></span><br><span class="line">[root@localhost mongo3.4.10]# ls -lrt /lib64/libcrypto.so.10</span><br><span class="line">lrwxrwxrwx 1 root root 18 Dec  7 00:34 /lib64/libcrypto.so.10 -&gt; libcrypto.so.1.0.0</span><br></pre></td></tr></table></figure><p>5. Check whether libcrypto.so.1.0.0 contains the symbol FIPS_mode_set, using tools such as nm or objdump; if the symbol is absent, the commands print nothing</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongo3.4.10]# nm -D /usr/lib64/libcrypto.so.10 | grep 
FIPS_mode_set</span><br><span class="line">[root@localhost mongo3.4.10]# objdump -T /usr/lib64/libcrypto.so.10 | grep FIPS_mode_set</span><br></pre></td></tr></table></figure><p>That reproduces the problem seen on site. So how can it be resolved?</p><h3 id="解决冲突"><a href="#解决冲突" class="headerlink" title="解决冲突"></a>Resolving the Conflict</h3><h4 id="确认FIPS-mode-set"><a href="#确认FIPS-mode-set" class="headerlink" title="确认FIPS_mode_set"></a>Confirming FIPS_mode_set</h4><p>Confirm that the known-good library we prepared does contain the symbol FIPS_mode_set:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongo3.4.10]# nm -D ./102/libcrypto.so.1.0.2k | grep FIPS_mode_set</span><br><span class="line">0000000000070b60 T FIPS_mode_set</span><br><span class="line">[root@localhost mongo3.4.10]# </span><br><span class="line">[root@localhost mongo3.4.10]# objdump -T ./102/libcrypto.so.1.0.2k | grep FIPS_mode_set</span><br><span class="line">0000000000070b60 g    DF .text  000000000000007a  libcrypto.so.10 FIPS_mode_set</span><br></pre></td></tr></table></figure><h4 id="配置LD-LIBRARY-PATH"><a href="#配置LD-LIBRARY-PATH" class="headerlink" title="配置LD_LIBRARY_PATH"></a>Configuring LD_LIBRARY_PATH</h4><p>LD_LIBRARY_PATH is a key environment variable read by the dynamic linker (ld-linux.so) on Linux&#x2F;Unix systems. It temporarily specifies additional search paths for shared libraries (.so files), supplementing the default search rules, and helps with problems such as &quot;library not found&quot; or &quot;multiple library versions coexisting&quot;. When a program that depends on shared libraries is executed, the dynamic linker looks for the required .so files in priority order, and LD_LIBRARY_PATH is one of the higher-priority, user-defined sources of search paths.</p><p>Given the above, the dependencies of PubKit must not be touched, yet our own service still has to run. An approach based on LD_LIBRARY_PATH is therefore a good fit. The steps are as follows:  </p><p>1. Copy the prepared libcrypto.so.1.0.2k and libssl.so.1.0.2k to &#x2F;usr&#x2F;lib64&#x2F;mongodb and create the symlinks:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span 
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost 102]# mkdir -p /usr/lib64/mongodb</span><br><span class="line">[root@localhost 102]# cp libcrypto.so.1.0.2k /usr/lib64/mongodb/</span><br><span class="line">[root@localhost 102]# cp libssl.so.1.0.2k /usr/lib64/mongodb/</span><br><span class="line">[root@localhost 102]# cd /usr/lib64/mongodb/</span><br><span class="line">[root@localhost mongodb]# ln -s libcrypto.so.1.0.2k libcrypto.so.10</span><br><span class="line">[root@localhost mongodb]# ln -s libssl.so.1.0.2k libssl.so.10</span><br><span class="line">[root@localhost mongodb]# ls -lrt</span><br><span class="line">total 2916</span><br><span class="line">-rw-r--r-- 1 root root 2513000 Dec  7 01:45 libcrypto.so.1.0.2k</span><br><span class="line">-rw-r--r-- 1 root root  470360 Dec  7 01:45 libssl.so.1.0.2k</span><br><span class="line">lrwxrwxrwx 1 root root      19 Dec  7 01:45 libcrypto.so.10 -&gt; libcrypto.so.1.0.2k</span><br><span class="line">lrwxrwxrwx 1 root root      16 Dec  7 01:45 libssl.so.10 -&gt; libssl.so.1.0.2k</span><br></pre></td></tr></table></figure><p>2. Edit &#x2F;usr&#x2F;lib&#x2F;systemd&#x2F;system&#x2F;mongod.service and set LD_LIBRARY_PATH through an Environment entry:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">[Service]</span><br><span class="line">User=mongod</span><br><span class="line">Group=mongod</span><br><span class="line"># Add this line (systemd does not expand $VARS inside Environment=, so only the literal path is given)</span><br><span class="line">Environment=&quot;LD_LIBRARY_PATH=/usr/lib64/mongodb&quot;</span><br><span 
class="line">Environment=&quot;OPTIONS=-f /etc/mongod.conf&quot;</span><br><span class="line">ExecStart=/usr/bin/mongod $OPTIONS</span><br></pre></td></tr></table></figure><p>3. Run daemon-reload and start mongod; the service now starts normally:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongodb]# systemctl daemon-reload </span><br><span class="line">[root@localhost mongodb]# systemctl start mongod</span><br><span class="line">[root@localhost mongodb]# systemctl status mongod</span><br><span class="line">● mongod.service - High-performance, schema-free document-oriented database</span><br><span class="line">   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)</span><br><span class="line">   Active: active (running) since Sun 2025-12-07 01:49:52 CST; 1min 11s ago</span><br><span class="line">     Docs: https://docs.mongodb.org/manual</span><br><span class="line">  Process: 8032 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)</span><br><span class="line">  Process: 8034 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)</span><br><span class="line">  Process: 8036 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)</span><br><span 
class="line"> Main PID: 8041 (mongod)</span><br><span class="line">   Memory: 41.0M</span><br><span class="line">   CGroup: /system.slice/mongod.service</span><br><span class="line">           └─8041 /usr/bin/mongod -f /etc/mongod.conf</span><br><span class="line"></span><br><span class="line">Dec 07 01:49:52 localhost.localdomain systemd[1]: Starting High-performance, schema-free document-oriented database...</span><br><span class="line">Dec 07 01:49:52 localhost.localdomain systemd[1]: Started High-performance, schema-free document-oriented database.</span><br><span class="line">Dec 07 01:49:52 localhost.localdomain mongod[8039]: about to fork child process, waiting until server is ready for connections.</span><br><span class="line">Dec 07 01:49:52 localhost.localdomain mongod[8040]: forked process: 8041</span><br><span class="line">Dec 07 01:49:53 localhost.localdomain mongod[8039]: child process started successfully, parent exiting</span><br></pre></td></tr></table></figure><p>Unlike the ldd check above, the running process can be verified with lsof combined with pgrep, which confirms that it has indeed loaded the libraries supplied via LD_LIBRARY_PATH:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongodb]# lsof -p $(pgrep mongod) | grep mongodb</span><br><span class="line">mongod  8041 mongod  mem    REG              253,0   2513000    583497 /usr/lib64/mongodb/libcrypto.so.1.0.2k</span><br><span class="line">mongod  8041 mongod  mem    REG              253,0    470360    583498 /usr/lib64/mongodb/libssl.so.1.0.2k</span><br><span class="line">mongod  8041 mongod    4w   REG              253,0      3702 105520618 /var/log/mongodb/mongod.log</span><br><span class="line">mongod  8041 mongod    8u  unix 0x00000000665de206       0t0     87499 /tmp/mongodb-27017.sock 
type=STREAM</span><br></pre></td></tr></table></figure><p>4. Running the mongo command, however, still produces the following error:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mongo: relocation error: mongo: symbol FIPS_mode_set version libcrypto.so.10 not defined in file libcrypto.so.10 with link time reference</span><br></pre></td></tr></table></figure><p>So what about the client? The same approach works: prefix the mongo command with LD_LIBRARY_PATH, for example:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost mongodb]# LD_LIBRARY_PATH=/usr/lib64/mongodb:$LD_LIBRARY_PATH mongo</span><br><span class="line">MongoDB shell version v3.4.10</span><br><span class="line">connecting to: mongodb://127.0.0.1:27017</span><br><span class="line">MongoDB server version: 3.4.10</span><br><span class="line">Welcome to the MongoDB shell.</span><br><span class="line">For interactive help, type &quot;help&quot;.</span><br><span class="line">For more comprehensive documentation, see</span><br><span class="line">        http://docs.mongodb.org/</span><br><span class="line">Questions? 
Try the support group</span><br><span class="line">        http://groups.google.com/group/mongodb-user</span><br><span class="line">Server has startup warnings: </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is &#x27;always&#x27;.</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] **        We suggest setting it to &#x27;never&#x27;</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line"><span class="meta prompt_">&gt; </span><span class="language-bash"><span class="built_in">exit</span></span></span><br><span class="line">bye</span><br></pre></td></tr></table></figure><p>Typing this every time is tedious, though, and inconvenient for applications that invoke the client. My approach is to put a wrapper script in place of the binary to act as a crutch.<br>1) Move the original mongo client into a new directory  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cd /usr/bin</span><br><span class="line">mkdir mongodb-client-tools</span><br><span class="line">mv mongo mongodb-client-tools/</span><br></pre></td></tr></table></figure><p>2) Create a mongo.sh script with the following content:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="meta prompt_">#</span><span class="language-bash">!/bin/bash</span></span><br><span class="line">LD_LIBRARY_PATH=/usr/lib64/mongodb:$LD_LIBRARY_PATH /usr/bin/mongodb-client-tools/mongo &quot;$@&quot;</span><br></pre></td></tr></table></figure><p>3) Make it executable and create the symlink:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost bin]# chmod +x mongo.sh </span><br><span class="line">[root@localhost bin]# ln -s mongo.sh  mongo</span><br></pre></td></tr></table></figure><p>4) Verify that it works:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost bin]# mongo</span><br><span class="line">MongoDB shell version v3.4.10</span><br><span class="line">connecting to: mongodb://127.0.0.1:27017</span><br><span class="line">MongoDB server version: 3.4.10</span><br><span class="line">Server has startup warnings: </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span 
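class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is &#x27;always&#x27;.</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] **        We suggest setting it to &#x27;never&#x27;</span><br><span class="line">2025-12-07T01:49:52.704+0800 I CONTROL  [initandlisten] </span><br><span class="line"><span class="meta prompt_">&gt; </span></span><br></pre></td></tr></table></figure><p>5) Apply the same steps to mongoimport, mongoexport and the other tools.  </p><h3 id="问题思考"><a href="#问题思考" class="headerlink" title="问题思考"></a>Reflections</h3><p>With the steps above the conflict between the services is resolved and the follow-up work can proceed, but a few points are still worth thinking about:<br>1) When a service depends on a specific version of a shared library and needs to create symlinks for it, consider whether that library is a foundational dependency such as OpenSSL and whether the service has the machine to itself. If the machine is shared, consider whether other services will be affected. In this case we deliberately worked around the conflict so that neither side was affected. Had someone taken the brute-force route of removing the PubKit symlinks before deploying, the result would have been disastrous for the business: the two teams could end up repeatedly removing each other&#x27;s symlinks while troubleshooting.  </p><p>2) For any deployment, checking the environment in advance is a must. Beyond basic items such as the operating system and disk partitions, the checks should also cover service dependencies, for example shared library versions, so that problems are found early, a solution is prepared in advance, and less time is spent firefighting on site.  </p>]]></content>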
    
    
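    <summary type="html">&lt;p&gt;I recently ran into a case where, while deploying our service on a customer-provided machine, the required shared libraries (.so files) and their symlinks already existed, yet the service failed to start with an error reporting a missing symbol. Troubleshooting showed that a service named PubKit had been deployed on the machine earlier; it had already claimed the symlinks of the same name, and they pointed to an older version of the libraries. So, on the principle that services should not interfere with each other, how should this be handled?&lt;/p&gt;</summary>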
    
    
    
    <category term="Skill" scheme="https://www.applenice.net/categories/Skill/"/>
    
    
    <category term="Linux" scheme="https://www.applenice.net/tags/Linux/"/>
    
    <category term="MongoDB" scheme="https://www.applenice.net/tags/MongoDB/"/>
    
    <category term="OpenSSL" scheme="https://www.applenice.net/tags/OpenSSL/"/>
    
    <category term="Kylin" scheme="https://www.applenice.net/tags/Kylin/"/>
    
  </entry>
  
  <entry>
    <title>Managing DNS Configuration with NetworkManager</title>
    <link href="https://www.applenice.net/2025/07/13/NetworkManager-manages-DNS/"/>
    <id>https://www.applenice.net/2025/07/13/NetworkManager-manages-DNS/</id>
    <published>2025-07-13T02:39:09.000Z</published>
    <updated>2025-07-13T04:18:05.599Z</updated>
    
    <content type="html"><![CDATA[<p>Here is a scenario I ran into: after the operating system was installed, the network was configured through the GUI, the business platform was installed, and &#x2F;etc&#x2F;resolv.conf was edited directly to change the DNS servers. A few days later the configuration was found to have changed, which affected the platform, yet investigation confirmed that nobody had touched &#x2F;etc&#x2F;resolv.conf. The cause was eventually traced to NetworkManager having been restarted: every time NetworkManager restarts, DNS reverts to the servers written in the initial network configuration at install time. What does NetworkManager actually do here? Follow along with the logs and the source code to find out ☺️</p><span id="more"></span><h3 id="NetworkManager日志"><a href="#NetworkManager日志" class="headerlink" title="NetworkManager日志"></a>NetworkManager Logs</h3><p>The problem is reproduced here on Rocky 9.6; the initial network configuration is as follows</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">[root@localhost misaka]# nmcli device show ens33</span><br><span class="line">GENERAL.DEVICE:                         ens33</span><br><span class="line">GENERAL.TYPE:                           ethernet</span><br><span class="line">GENERAL.HWADDR:                         </span><br><span class="line">GENERAL.MTU:                            1500</span><br><span class="line">GENERAL.STATE:                          100 (connected)</span><br><span class="line">GENERAL.CONNECTION:                     ens33</span><br><span class="line">GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/2</span><br><span class="line">WIRED-PROPERTIES.CARRIER:               on</span><br><span class="line">IP4.ADDRESS[1]:                         192.168.0.239/24</span><br><span class="line">IP4.GATEWAY:                            
192.168.0.1</span><br><span class="line">IP4.ROUTE[1]:                           dst = 192.168.0.0/24, nh = 0.0.0.0, mt = 100</span><br><span class="line">IP4.ROUTE[2]:                           dst = 0.0.0.0/0, nh = 192.168.0.1, mt = 100</span><br><span class="line">IP4.DNS[1]:                             8.8.8.8</span><br><span class="line">IP6.ADDRESS[1]:                         fe80::20c:29ff:feec:d07a/64</span><br><span class="line">IP6.GATEWAY:                            --</span><br><span class="line">IP6.ROUTE[1]:                           dst = fe80::/64, nh = ::, mt = 1024</span><br></pre></td></tr></table></figure><p>On RHEL-family distributions the system log is usually &#x2F;var&#x2F;log&#x2F;messages. My first thought was to see whether the existing logs held anything useful. Open two terminals and run one of the following commands in each</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">tail -f  /var/log/messages | grep NetworkManager</span><br><span class="line"></span><br><span class="line">systemctl restart NetworkManager</span><br></pre></td></tr></table></figure><p>The following output appears:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span 
class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br></pre></td><td class="code"><pre><span class="line">Jul 12 23:33:45 localhost systemd[1]: NetworkManager-wait-online.service: Deactivated successfully.</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[171123]: &lt;info&gt;  [1752334425.0882] caught SIGTERM, shutting down normally.</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[171123]: &lt;info&gt;  [1752334425.0891] manager: NetworkManager state is now CONNECTED_SITE</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[171123]: &lt;info&gt;  [1752334425.0935] exiting (success)</span><br><span class="line">Jul 12 23:33:45 localhost systemd[1]: NetworkManager.service: Deactivated successfully.</span><br><span class="line">Jul 12 23:33:45 localhost 
NetworkManager[178235]: &lt;info&gt;  [1752334425.1471] NetworkManager (version 1.52.0-4.el9_6) is starting... (after a restart, boot:682b7f4f-9040-4361-9bbe-3dd582d2db4a)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1472] Read config: /etc/NetworkManager/NetworkManager.conf, /usr/lib/NetworkManager/conf.d/&#123;00-server.conf,99-nvme-nbft-no-ignore-carrier.conf&#125;, /run/NetworkManager/conf.d/15-carrier-timeout.conf</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1508] manager[0x55d4e0a3a050]: monitoring kernel firmware directory &#x27;/lib/firmware&#x27;.</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1553] hostname: hostname: using hostnamed</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1556] dns-mgr: init: dns=default,systemd-resolved rc-manager=symlink (auto)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1557] policy: set-hostname: set hostname to &#x27;localhost.localdomain&#x27; (no hostname found)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1560] manager[0x55d4e0a3a050]: rfkill: Wi-Fi hardware radio set enabled</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1561] manager[0x55d4e0a3a050]: rfkill: WWAN hardware radio set enabled</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1576] Loaded device plugin: NMAtmManager (/usr/lib64/NetworkManager/1.52.0-4.el9_6/libnm-device-plugin-adsl.so)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1578] Loaded device plugin: NMWifiFactory 
(/usr/lib64/NetworkManager/1.52.0-4.el9_6/libnm-device-plugin-wifi.so)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1586] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.52.0-4.el9_6/libnm-device-plugin-team.so)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1599] Loaded device plugin: NMBluezManager (/usr/lib64/NetworkManager/1.52.0-4.el9_6/libnm-device-plugin-bluetooth.so)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1601] Loaded device plugin: NMWwanFactory (/usr/lib64/NetworkManager/1.52.0-4.el9_6/libnm-device-plugin-wwan.so)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1603] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1604] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1605] manager: Networking is enabled by state file</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1607] settings: Loaded settings plugin: keyfile (internal)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1610] settings: Loaded settings plugin: ifcfg-rh (&quot;/usr/lib64/NetworkManager/1.52.0-4.el9_6/libnm-settings-plugin-ifcfg-rh.so&quot;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1626] dhcp: init: Using DHCP client &#x27;internal&#x27;</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1629] manager: (lo): new Loopback device 
(/org/freedesktop/NetworkManager/Devices/1)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1633] device (lo): state change: unmanaged -&gt; unavailable (reason &#x27;connection-assumed&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1640] device (lo): state change: unavailable -&gt; disconnected (reason &#x27;connection-assumed&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1647] device (lo): Activation: starting connection &#x27;lo&#x27; (ba530e38-0cc8-44a7-9dea-c1126d22d767)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1651] device (ens33): carrier: link connected</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1655] manager: (ens33): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1659] manager: (ens33): assume: will attempt to assume matching connection &#x27;ens33&#x27; (2752299c-a8b2-362e-af75-d3b722cce23b) (indicated)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1659] device (ens33): state change: unmanaged -&gt; unavailable (reason &#x27;connection-assumed&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1664] device (ens33): state change: unavailable -&gt; disconnected (reason &#x27;connection-assumed&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1669] device (ens33): Activation: starting connection &#x27;ens33&#x27; 
(2752299c-a8b2-362e-af75-d3b722cce23b)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1677] bus-manager: acquired D-Bus service &quot;org.freedesktop.NetworkManager&quot;</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1687] device (lo): state change: disconnected -&gt; prepare (reason &#x27;none&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1690] device (lo): state change: prepare -&gt; config (reason &#x27;none&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1693] device (lo): state change: config -&gt; ip-config (reason &#x27;none&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1695] device (ens33): state change: disconnected -&gt; prepare (reason &#x27;none&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1697] device (ens33): state change: prepare -&gt; config (reason &#x27;none&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1771] modem-manager: ModemManager available</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1775] device (ens33): state change: config -&gt; ip-config (reason &#x27;none&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1779] device (lo): state change: ip-config -&gt; ip-check (reason &#x27;none&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 
localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1787] policy: set &#x27;ens33&#x27; (ens33) as default for IPv4 routing and DNS</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1835] device (lo): state change: ip-check -&gt; secondaries (reason &#x27;none&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1836] device (lo): state change: secondaries -&gt; activated (reason &#x27;none&#x27;, managed-type: &#x27;external&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1839] device (lo): Activation: successful, device activated.</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1857] device (ens33): state change: ip-config -&gt; ip-check (reason &#x27;none&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1891] device (ens33): state change: ip-check -&gt; secondaries (reason &#x27;none&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1894] device (ens33): state change: secondaries -&gt; activated (reason &#x27;none&#x27;, managed-type: &#x27;assume&#x27;)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1898] manager: NetworkManager state is now CONNECTED_SITE</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1901] device (ens33): Activation: successful, device activated.</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1907] manager: NetworkManager state is now CONNECTED_GLOBAL</span><br><span class="line">Jul 12 23:33:45 localhost 
NetworkManager[178235]: &lt;info&gt;  [1752334425.1909] manager: startup complete</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.2826] policy: set-hostname: set hostname to &#x27;localhost.localdomain&#x27; (no hostname found)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.6020] agent-manager: agent[f8948c5639bb9824,:1.25/org.gnome.Shell.NetworkAgent/42]: agent registered</span><br><span class="line">Jul 12 23:33:55 localhost systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.</span><br><span class="line">Jul 12 23:34:15 localhost NetworkManager[178235]: &lt;info&gt;  [1752334455.0274] policy: set-hostname: set hostname to &#x27;localhost.localdomain&#x27; (no hostname found)</span><br><span class="line">Jul 12 23:34:25 localhost systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.</span><br><span class="line">Jul 12 23:34:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334485.0224] policy: set-hostname: set hostname to &#x27;localhost.localdomain&#x27; (no hostname found)</span><br><span class="line">Jul 12 23:34:55 localhost systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.</span><br><span class="line">Jul 12 23:38:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334725.0229] policy: set-hostname: set hostname to &#x27;localhost.localdomain&#x27; (no hostname found)</span><br><span class="line">Jul 12 23:38:55 localhost systemd[1]: NetworkManager-dispatcher.service: Deactivated successfully.</span><br></pre></td></tr></table></figure><p>I gave this log excerpt to Doubao for analysis. Its summary: the log records NetworkManager going from "old process shut down" to "new process started and finished initializing", with these key points:</p><ul><li>Old process shut down cleanly: after receiving SIGTERM, the old NetworkManager process (PID 171123) exited normally with status success.</li><li>New process started: the new process (PID 178235) started as version 1.52.0-4.el9_6, read its configuration files, and initialized its core modules.</li><li>Device detection and activation: the loopback device (lo) and the Ethernet device (ens33) were detected and activated, and the network state finally reached CONNECTED_GLOBAL.</li><li>Service initialization completed: startup produced no errors, and all core functions (DNS
management, DHCP client, device plugins, and so on) loaded normally.</li></ul><p>The entries that show DNS-related activity are:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1556] dns-mgr: init: dns=default,systemd-resolved rc-manager=symlink (auto)</span><br><span class="line">Jul 12 23:33:45 localhost NetworkManager[178235]: &lt;info&gt;  [1752334425.1787] policy: set &#x27;ens33&#x27; (ens33) as default for IPv4 routing and DNS</span><br></pre></td></tr></table></figure><p>When the systemd-resolved service is not present, dns&#x3D;default makes NetworkManager manage DNS itself and generate &#x2F;etc&#x2F;resolv.conf; with rc-manager&#x3D;symlink it writes the file directly if &#x2F;etc&#x2F;resolv.conf is a regular file or does not exist, and leaves it alone if it is a symlink pointing elsewhere.</p><h3 id="NetworkManager源码"><a href="#NetworkManager源码" class="headerlink" title="NetworkManager源码"></a>NetworkManager source code</h3><p>At this point the logs only show that NetworkManager did change &#x2F;etc&#x2F;resolv.conf; they do not show exactly how it did so, so a different approach is needed.</p><p>When NetworkManager manages the DNS configuration, &#x2F;etc&#x2F;resolv.conf contains the following comment:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Generated by NetworkManager</span></span><br><span class="line">nameserver 8.8.8.8</span><br></pre></td></tr></table></figure><p>Searching the code for the string Generated by NetworkManager gives an entry point for following the processing flow.</p><p>In the NetworkManager source file src&#x2F;core&#x2F;dns&#x2F;nm-dns-manager.c, the function create_resolv_conf contains this string and also generates the nameserver entries.</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span 
class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span 
class="line">73</span><br><span class="line">74</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span class="type">char</span> *</span><br><span class="line"><span class="title function_">create_resolv_conf</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *searches,</span></span><br><span class="line"><span class="params">                   <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *nameservers,</span></span><br><span class="line"><span class="params">                   <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *options)</span></span><br><span class="line">&#123;</span><br><span class="line">    GString *str;</span><br><span class="line">    gsize    i;</span><br><span class="line"></span><br><span class="line">    str = g_string_new_len(<span class="literal">NULL</span>, <span class="number">245</span>);</span><br><span class="line"></span><br><span class="line">    g_string_append(str, <span class="string">&quot;# Generated by NetworkManager\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (searches &amp;&amp; searches[<span class="number">0</span>]) &#123;</span><br><span class="line">        gsize search_base_idx;</span><br><span class="line"></span><br><span class="line">        g_string_append(str, <span class="string">&quot;search&quot;</span>);</span><br><span class="line">        search_base_idx = str-&gt;len;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">for</span> (i = <span class="number">0</span>; searches[i]; i++) &#123;</span><br><span class="line">            <span class="type">const</span> <span class="type">char</span> *s = searches[i];</span><br><span class="line">            gsize       l = <span 
class="built_in">strlen</span>(s);</span><br><span class="line"></span><br><span class="line">            <span class="keyword">if</span> (l == <span class="number">0</span> || NM_STRCHAR_ANY(s, ch, NM_IN_SET(ch, <span class="string">&#x27; &#x27;</span>, <span class="string">&#x27;\t&#x27;</span>, <span class="string">&#x27;\n&#x27;</span>))) &#123;</span><br><span class="line">                <span class="comment">/* there should be no such characters in the search entry. Also,</span></span><br><span class="line"><span class="comment">                 * because glibc parser would treat them as line/word separator.</span></span><br><span class="line"><span class="comment">                 *</span></span><br><span class="line"><span class="comment">                 * Skip the value silently. */</span></span><br><span class="line">                <span class="keyword">continue</span>;</span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            <span class="keyword">if</span> (search_base_idx &gt; <span class="number">0</span>) &#123;</span><br><span class="line">                <span class="keyword">if</span> (str-&gt;len - search_base_idx + <span class="number">1</span> + l &gt; <span class="number">254</span>) &#123;</span><br><span class="line">                    <span class="comment">/* this entry crosses the 256 character boundary. Older glibc versions</span></span><br><span class="line"><span class="comment">                     * would truncate the entry at this point.</span></span><br><span class="line"><span class="comment">                     *</span></span><br><span class="line"><span class="comment">                     * Fill the line with spaces to cross the 256 char boundary and continue</span></span><br><span class="line"><span class="comment">                     * afterwards. This way, the truncation happens between two search entries. 
*/</span></span><br><span class="line">                    <span class="keyword">while</span> (str-&gt;len - search_base_idx &lt; <span class="number">257</span>)</span><br><span class="line">                        g_string_append_c(str, <span class="string">&#x27; &#x27;</span>);</span><br><span class="line">                    search_base_idx = <span class="number">0</span>;</span><br><span class="line">                &#125;</span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            g_string_append_c(str, <span class="string">&#x27; &#x27;</span>);</span><br><span class="line">            g_string_append_len(str, s, l);</span><br><span class="line">        &#125;</span><br><span class="line">        g_string_append_c(str, <span class="string">&#x27;\n&#x27;</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (nameservers &amp;&amp; nameservers[<span class="number">0</span>]) &#123;</span><br><span class="line">        <span class="keyword">for</span> (i = <span class="number">0</span>; nameservers[i]; i++) &#123;</span><br><span class="line">            <span class="keyword">if</span> (i == <span class="number">3</span>) &#123;</span><br><span class="line">                g_string_append(</span><br><span class="line">                    str,</span><br><span class="line">                    <span class="string">&quot;# NOTE: the libc resolver may not support more than 3 nameservers.\n&quot;</span>);</span><br><span class="line">                g_string_append(str, <span class="string">&quot;# The nameservers listed below may not be recognized.\n&quot;</span>);</span><br><span class="line">            &#125;</span><br><span class="line">            g_string_append(str, <span class="string">&quot;nameserver &quot;</span>);</span><br><span class="line">            g_string_append(str, 
nameservers[i]);</span><br><span class="line">            g_string_append_c(str, <span class="string">&#x27;\n&#x27;</span>);</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (options &amp;&amp; options[<span class="number">0</span>]) &#123;</span><br><span class="line">        g_string_append(str, <span class="string">&quot;options&quot;</span>);</span><br><span class="line">        <span class="keyword">for</span> (i = <span class="number">0</span>; options[i]; i++) &#123;</span><br><span class="line">            g_string_append_c(str, <span class="string">&#x27; &#x27;</span>);</span><br><span class="line">            g_string_append(str, options[i]);</span><br><span class="line">        &#125;</span><br><span class="line">        g_string_append_c(str, <span class="string">&#x27;\n&#x27;</span>);</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> g_string_free(str, FALSE);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Searching further in the IDE shows that this function is called in several places:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span 
class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br><span class="line">102</span><br><span class="line">103</span><br><span class="line">104</span><br><span class="line">105</span><br><span class="line">106</span><br><span class="line">107</span><br><span class="line">108</span><br><span class="line">109</span><br><span class="line">110</span><br><span class="line">111</span><br><span class="line">112</span><br><span class="line">113</span><br><span class="line">114</span><br><span class="line">115</span><br><span class="line">116</span><br><span class="line">117</span><br><span class="line">118</span><br><span class="line">119</span><br><span class="line">120</span><br><span class="line">121</span><br><span class="line">122</span><br><span class="line">123</span><br><span class="line">124</span><br><span class="line">125</span><br><span class="line">126</span><br><span class="line">127</span><br><span class="line">128</span><br><span class="line">129</span><br><span class="line">130</span><br><span class="line">131</span><br><span class="line">132</span><br><span class="line">133</span><br><span class="line">134</span><br><span class="line">135</span><br><span class="line">136</span><br><span class="line">137</span><br><span class="line">138</span><br><span class="line">139</span><br><span class="line">140</span><br><span class="line">141</span><br><span class="line">142</span><br><span class="line">143</span><br><span class="line">144</span><br><span class="line">145</span><br><span 
class="line">146</span><br><span class="line">147</span><br><span class="line">148</span><br><span class="line">149</span><br><span class="line">150</span><br><span class="line">151</span><br><span class="line">152</span><br><span class="line">153</span><br><span class="line">154</span><br><span class="line">155</span><br><span class="line">156</span><br><span class="line">157</span><br><span class="line">158</span><br><span class="line">159</span><br><span class="line">160</span><br><span class="line">161</span><br><span class="line">162</span><br><span class="line">163</span><br><span class="line">164</span><br><span class="line">165</span><br><span class="line">166</span><br><span class="line">167</span><br><span class="line">168</span><br><span class="line">169</span><br><span class="line">170</span><br><span class="line">171</span><br><span class="line">172</span><br><span class="line">173</span><br><span class="line">174</span><br><span class="line">175</span><br><span class="line">176</span><br><span class="line">177</span><br><span class="line">178</span><br><span class="line">179</span><br><span class="line">180</span><br><span class="line">181</span><br><span class="line">182</span><br><span class="line">183</span><br><span class="line">184</span><br><span class="line">185</span><br><span class="line">186</span><br><span class="line">187</span><br><span class="line">188</span><br><span class="line">189</span><br><span class="line">190</span><br><span class="line">191</span><br><span class="line">192</span><br><span class="line">193</span><br><span class="line">194</span><br><span class="line">195</span><br><span class="line">196</span><br><span class="line">197</span><br><span class="line">198</span><br><span class="line">199</span><br><span class="line">200</span><br><span class="line">201</span><br><span class="line">202</span><br><span class="line">203</span><br><span class="line">204</span><br><span class="line">205</span><br><span 
class="line">206</span><br><span class="line">207</span><br><span class="line">208</span><br><span class="line">209</span><br><span class="line">210</span><br><span class="line">211</span><br><span class="line">212</span><br><span class="line">213</span><br><span class="line">214</span><br><span class="line">215</span><br><span class="line">216</span><br><span class="line">217</span><br><span class="line">218</span><br><span class="line">219</span><br><span class="line">220</span><br><span class="line">221</span><br><span class="line">222</span><br><span class="line">223</span><br><span class="line">224</span><br><span class="line">225</span><br><span class="line">226</span><br><span class="line">227</span><br><span class="line">228</span><br><span class="line">229</span><br><span class="line">230</span><br><span class="line">231</span><br><span class="line">232</span><br><span class="line">233</span><br><span class="line">234</span><br><span class="line">235</span><br><span class="line">236</span><br><span class="line">237</span><br><span class="line">238</span><br><span class="line">239</span><br><span class="line">240</span><br><span class="line">241</span><br><span class="line">242</span><br><span class="line">243</span><br><span class="line">244</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">char</span> *</span><br><span class="line"><span class="title function_">nmtst_dns_create_resolv_conf</span><span class="params">(<span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *searches,</span></span><br><span class="line"><span class="params">                             <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *nameservers,</span></span><br><span class="line"><span class="params">                             <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *options)</span></span><br><span 
class="line">&#123;</span><br><span class="line">    <span class="keyword">return</span> create_resolv_conf(searches, nameservers, options);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> gboolean</span><br><span class="line"><span class="title function_">write_resolv_conf</span><span class="params">(FILE              *f,</span></span><br><span class="line"><span class="params">                  <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *searches,</span></span><br><span class="line"><span class="params">                  <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *nameservers,</span></span><br><span class="line"><span class="params">                  <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *options,</span></span><br><span class="line"><span class="params">                  GError           **error)</span></span><br><span class="line">&#123;</span><br><span class="line">    gs_free <span class="type">char</span> *content = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">    content = create_resolv_conf(searches, nameservers, options);</span><br><span class="line">    <span class="keyword">return</span> write_resolv_conf_contents(f, content, error);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> <span class="type">void</span></span><br><span class="line"><span class="title function_">update_resolv_conf_no_stub</span><span class="params">(NMDnsManager      *self,</span></span><br><span class="line"><span class="params">                           <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *searches,</span></span><br><span class="line"><span 
class="params">                           <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *nameservers,</span></span><br><span class="line"><span class="params">                           <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span> *options)</span></span><br><span class="line">&#123;</span><br><span class="line">    gs_free <span class="type">char</span> *content = <span class="literal">NULL</span>;</span><br><span class="line">    GError       *local   = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">    content = create_resolv_conf(searches, nameservers, options);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!g_file_set_contents(NO_STUB_RESOLV_CONF, content, <span class="number">-1</span>, &amp;local)) &#123;</span><br><span class="line">        _LOGD(<span class="string">&quot;update-resolv-no-stub: failure to write file: %s&quot;</span>, local-&gt;message);</span><br><span class="line">        g_error_free(local);</span><br><span class="line">        <span class="keyword">return</span>;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    _LOGT(<span class="string">&quot;update-resolv-no-stub: &#x27;%s&#x27; successfully written&quot;</span>, NO_STUB_RESOLV_CONF);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="type">static</span> SpawnResult</span><br><span class="line"><span class="title function_">update_resolv_conf</span><span class="params">(NMDnsManager                 *self,</span></span><br><span class="line"><span class="params">                   <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span>            *searches,</span></span><br><span class="line"><span class="params">                   <span 
class="type">const</span> <span class="type">char</span> *<span class="type">const</span>            *nameservers,</span></span><br><span class="line"><span class="params">                   <span class="type">const</span> <span class="type">char</span> *<span class="type">const</span>            *options,</span></span><br><span class="line"><span class="params">                   GError                      **error,</span></span><br><span class="line"><span class="params">                   NMDnsManagerResolvConfManager rc_manager)</span></span><br><span class="line">&#123;</span><br><span class="line">    FILE         *f;</span><br><span class="line">    gboolean      success;</span><br><span class="line">    gs_free <span class="type">char</span> *content           = <span class="literal">NULL</span>;</span><br><span class="line">    SpawnResult   write_file_result = SR_SUCCESS;</span><br><span class="line">    <span class="type">int</span>           errsv;</span><br><span class="line">    gboolean      resconf_link_cached = FALSE;</span><br><span class="line">    gs_free <span class="type">char</span> *resconf_link        = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">    content = create_resolv_conf(searches, nameservers, options);</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (rc_manager == NM_DNS_MANAGER_RESOLV_CONF_MAN_FILE</span><br><span class="line">        || (rc_manager == NM_DNS_MANAGER_RESOLV_CONF_MAN_SYMLINK</span><br><span class="line">            &amp;&amp; !_read_link_cached(_PATH_RESCONF, &amp;resconf_link_cached, &amp;resconf_link))) &#123;</span><br><span class="line">        gs_free <span class="type">char</span>      *rc_path_syml = <span class="literal">NULL</span>;</span><br><span class="line">        nm_auto_free <span class="type">char</span> *rc_path_real = <span class="literal">NULL</span>;</span><br><span class="line">        <span 
class="type">const</span> <span class="type">char</span>        *rc_path      = _PATH_RESCONF;</span><br><span class="line">        GError            *local        = <span class="literal">NULL</span>;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> (rc_manager == NM_DNS_MANAGER_RESOLV_CONF_MAN_FILE) &#123;</span><br><span class="line">            rc_path_real = realpath(_PATH_RESCONF, <span class="literal">NULL</span>);</span><br><span class="line">            <span class="keyword">if</span> (rc_path_real)</span><br><span class="line">                rc_path = rc_path_real;</span><br><span class="line">            <span class="keyword">else</span> &#123;</span><br><span class="line">                <span class="comment">/* realpath did not resolve a path-name. That either means,</span></span><br><span class="line"><span class="comment">                 * _PATH_RESCONF:</span></span><br><span class="line"><span class="comment">                 *   - does not exist</span></span><br><span class="line"><span class="comment">                 *   - is a plain file</span></span><br><span class="line"><span class="comment">                 *   - is a dangling symlink</span></span><br><span class="line"><span class="comment">                 *</span></span><br><span class="line"><span class="comment">                 * Handle the case, where it is a dangling symlink... */</span></span><br><span class="line">                rc_path_syml = nm_utils_read_link_absolute(_PATH_RESCONF, <span class="literal">NULL</span>);</span><br><span class="line">                <span class="keyword">if</span> (rc_path_syml)</span><br><span class="line">                    rc_path = rc_path_syml;</span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="comment">/* we first write to /etc/resolv.conf directly. 
If that fails,</span></span><br><span class="line"><span class="comment">         * we still continue to write to runstatedir but remember the</span></span><br><span class="line"><span class="comment">         * error. */</span></span><br><span class="line">        <span class="keyword">if</span> (!g_file_set_contents(rc_path, content, <span class="number">-1</span>, &amp;local)) &#123;</span><br><span class="line">            _LOGT(<span class="string">&quot;update-resolv-conf: write to %s failed (rc-manager=%s, %s)&quot;</span>,</span><br><span class="line">                  rc_path,</span><br><span class="line">                  _rc_manager_to_string(rc_manager),</span><br><span class="line">                  local-&gt;message);</span><br><span class="line">            g_propagate_error(error, local);</span><br><span class="line">            <span class="comment">/* clear @error, so that we don&#x27;t try reset it. This is the error</span></span><br><span class="line"><span class="comment">             * we want to propagate to the caller. 
*/</span></span><br><span class="line">            error             = <span class="literal">NULL</span>;</span><br><span class="line">            write_file_result = SR_ERROR;</span><br><span class="line">        &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">            _LOGT(<span class="string">&quot;update-resolv-conf: write to %s succeeded (rc-manager=%s)&quot;</span>,</span><br><span class="line">                  rc_path,</span><br><span class="line">                  _rc_manager_to_string(rc_manager));</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> ((f = fopen(MY_RESOLV_CONF_TMP, <span class="string">&quot;we&quot;</span>)) == <span class="literal">NULL</span>) &#123;</span><br><span class="line">        errsv = errno;</span><br><span class="line">        g_set_error(error,</span><br><span class="line">                    NM_MANAGER_ERROR,</span><br><span class="line">                    NM_MANAGER_ERROR_FAILED,</span><br><span class="line">                    <span class="string">&quot;Could not open %s: %s&quot;</span>,</span><br><span class="line">                    MY_RESOLV_CONF_TMP,</span><br><span class="line">                    nm_strerror_native(errsv));</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: open temporary file %s failed (%s)&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF_TMP,</span><br><span class="line">              nm_strerror_native(errsv));</span><br><span class="line">        <span class="keyword">return</span> SR_ERROR;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    success = write_resolv_conf_contents(f, content, error);</span><br><span class="line">    <span class="keyword">if</span> (!success) &#123;</span><br><span class="line">        errsv = 
errno;</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write temporary file %s failed (%s)&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF_TMP,</span><br><span class="line">              nm_strerror_native(errsv));</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (fclose(f) &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        <span class="keyword">if</span> (success) &#123;</span><br><span class="line">            errsv = errno;</span><br><span class="line">            <span class="comment">/* only set an error here if write_resolv_conf() was successful,</span></span><br><span class="line"><span class="comment">             * since its error is more important.</span></span><br><span class="line"><span class="comment">             */</span></span><br><span class="line">            g_set_error(error,</span><br><span class="line">                        NM_MANAGER_ERROR,</span><br><span class="line">                        NM_MANAGER_ERROR_FAILED,</span><br><span class="line">                        <span class="string">&quot;Could not close %s: %s&quot;</span>,</span><br><span class="line">                        MY_RESOLV_CONF_TMP,</span><br><span class="line">                        nm_strerror_native(errsv));</span><br><span class="line">            _LOGT(<span class="string">&quot;update-resolv-conf: close temporary file %s failed (%s)&quot;</span>,</span><br><span class="line">                  MY_RESOLV_CONF_TMP,</span><br><span class="line">                  nm_strerror_native(errsv));</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">return</span> SR_ERROR;</span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> (!success)</span><br><span class="line">        <span 
class="keyword">return</span> SR_ERROR;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (rename(MY_RESOLV_CONF_TMP, MY_RESOLV_CONF) &lt; <span class="number">0</span>) &#123;</span><br><span class="line">        errsv = errno;</span><br><span class="line">        g_set_error(error,</span><br><span class="line">                    NM_MANAGER_ERROR,</span><br><span class="line">                    NM_MANAGER_ERROR_FAILED,</span><br><span class="line">                    <span class="string">&quot;Could not replace %s: %s&quot;</span>,</span><br><span class="line">                    MY_RESOLV_CONF,</span><br><span class="line">                    nm_strerror_native(errsv));</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: failed to rename temporary file %s to %s (%s)&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF_TMP,</span><br><span class="line">              MY_RESOLV_CONF,</span><br><span class="line">              nm_strerror_native(errsv));</span><br><span class="line">        <span class="keyword">return</span> SR_ERROR;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (rc_manager == NM_DNS_MANAGER_RESOLV_CONF_MAN_FILE) &#123;</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded (rc-manager=%s)&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF,</span><br><span class="line">              _rc_manager_to_string(rc_manager));</span><br><span class="line">        <span class="keyword">return</span> write_file_result;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (rc_manager != NM_DNS_MANAGER_RESOLV_CONF_MAN_SYMLINK</span><br><span class="line">        || !_read_link_cached(_PATH_RESCONF, 
&amp;resconf_link_cached, &amp;resconf_link)) &#123;</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded&quot;</span>, MY_RESOLV_CONF);</span><br><span class="line">        <span class="keyword">return</span> write_file_result;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (!nm_streq0(_read_link_cached(_PATH_RESCONF, &amp;resconf_link_cached, &amp;resconf_link),</span><br><span class="line">                   MY_RESOLV_CONF)) &#123;</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded (don&#x27;t touch symlink %s &quot;</span></span><br><span class="line">              <span class="string">&quot;linking to %s)&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF,</span><br><span class="line">              _PATH_RESCONF,</span><br><span class="line">              _read_link_cached(_PATH_RESCONF, &amp;resconf_link_cached, &amp;resconf_link));</span><br><span class="line">        <span class="keyword">return</span> write_file_result;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">/* By this point, /etc/resolv.conf exists and is a symlink to our internal</span></span><br><span class="line"><span class="comment">     * resolv.conf. 
We update the symlink so that applications get an inotify</span></span><br><span class="line"><span class="comment">     * notification.</span></span><br><span class="line"><span class="comment">     */</span></span><br><span class="line">    <span class="keyword">if</span> (unlink(RESOLV_CONF_TMP) != <span class="number">0</span> &amp;&amp; ((errsv = errno) != ENOENT)) &#123;</span><br><span class="line">        g_set_error(error,</span><br><span class="line">                    NM_MANAGER_ERROR,</span><br><span class="line">                    NM_MANAGER_ERROR_FAILED,</span><br><span class="line">                    <span class="string">&quot;Could not unlink %s: %s&quot;</span>,</span><br><span class="line">                    RESOLV_CONF_TMP,</span><br><span class="line">                    nm_strerror_native(errsv));</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded &quot;</span></span><br><span class="line">              <span class="string">&quot;but cannot delete temporary file %s: %s&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF,</span><br><span class="line">              RESOLV_CONF_TMP,</span><br><span class="line">              nm_strerror_native(errsv));</span><br><span class="line">        <span class="keyword">return</span> SR_ERROR;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (symlink(MY_RESOLV_CONF, RESOLV_CONF_TMP) == <span class="number">-1</span>) &#123;</span><br><span class="line">        errsv = errno;</span><br><span class="line">        g_set_error(error,</span><br><span class="line">                    NM_MANAGER_ERROR,</span><br><span class="line">                    NM_MANAGER_ERROR_FAILED,</span><br><span class="line">                    <span class="string">&quot;Could not create symlink %s pointing to %s: %s&quot;</span>,</span><br><span 
class="line">                    RESOLV_CONF_TMP,</span><br><span class="line">                    MY_RESOLV_CONF,</span><br><span class="line">                    nm_strerror_native(errsv));</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded &quot;</span></span><br><span class="line">              <span class="string">&quot;but failed to symlink %s: %s&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF,</span><br><span class="line">              RESOLV_CONF_TMP,</span><br><span class="line">              nm_strerror_native(errsv));</span><br><span class="line">        <span class="keyword">return</span> SR_ERROR;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> (rename(RESOLV_CONF_TMP, _PATH_RESCONF) == <span class="number">-1</span>) &#123;</span><br><span class="line">        errsv = errno;</span><br><span class="line">        g_set_error(error,</span><br><span class="line">                    NM_MANAGER_ERROR,</span><br><span class="line">                    NM_MANAGER_ERROR_FAILED,</span><br><span class="line">                    <span class="string">&quot;Could not rename %s to %s: %s&quot;</span>,</span><br><span class="line">                    RESOLV_CONF_TMP,</span><br><span class="line">                    _PATH_RESCONF,</span><br><span class="line">                    nm_strerror_native(errsv));</span><br><span class="line">        _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded &quot;</span></span><br><span class="line">              <span class="string">&quot;but failed to rename temporary symlink %s to %s: %s&quot;</span>,</span><br><span class="line">              MY_RESOLV_CONF,</span><br><span class="line">              RESOLV_CONF_TMP,</span><br><span class="line">              _PATH_RESCONF,</span><br><span class="line">             
 nm_strerror_native(errsv));</span><br><span class="line">        <span class="keyword">return</span> SR_ERROR;</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    _LOGT(<span class="string">&quot;update-resolv-conf: write internal file %s succeeded and update symlink %s&quot;</span>,</span><br><span class="line">          MY_RESOLV_CONF,</span><br><span class="line">          _PATH_RESCONF);</span><br><span class="line">    <span class="keyword">return</span> write_file_result;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Because several functions are involved, the exact call flow still cannot be determined from the source alone. The code above does contain calls to the _LOGT logging macro, though, so can NetworkManager's detailed behavior be observed through its logging configuration?</p><p>Following the macro in the IDE leads to nm-logging-fwd.h, which contains the log level definitions:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="type">static</span> <span 
class="keyword">inline</span> <span class="type">int</span></span><br><span class="line"><span class="title function_">nm_log_level_to_syslog</span><span class="params">(NMLogLevel nm_level)</span></span><br><span class="line">&#123;</span><br><span class="line">    <span class="keyword">switch</span> (nm_level) &#123;</span><br><span class="line">    <span class="keyword">case</span> LOGL_ERR:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">3</span>; <span class="comment">/* LOG_ERR */</span></span><br><span class="line">    <span class="keyword">case</span> LOGL_WARN:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">4</span>; <span class="comment">/* LOG_WARN */</span></span><br><span class="line">    <span class="keyword">case</span> LOGL_INFO:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">5</span>; <span class="comment">/* LOG_NOTICE */</span></span><br><span class="line">    <span class="keyword">case</span> LOGL_DEBUG:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">6</span>; <span class="comment">/* LOG_INFO */</span></span><br><span class="line">    <span class="keyword">case</span> LOGL_TRACE:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">7</span>; <span class="comment">/* LOG_DEBUG */</span></span><br><span class="line">    <span class="keyword">default</span>:</span><br><span class="line">        <span class="keyword">return</span> <span class="number">0</span>; <span class="comment">/* LOG_EMERG */</span></span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGL_TRACE LOGL_TRACE</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGL_DEBUG 
LOGL_DEBUG</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGL_INFO  LOGL_INFO</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGL_WARN  LOGL_WARN</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGL_ERR   LOGL_ERR</span></span><br><span class="line"></span><br><span class="line"><span class="comment">/* This is the default definition of _NMLOG_ENABLED(). Special implementations</span></span><br><span class="line"><span class="comment"> * might want to undef this and redefine it. */</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _NMLOG_ENABLED(level) (nm_logging_enabled((level), (_NMLOG_DOMAIN)))</span></span><br><span class="line"></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGT(...) _NMLOG(_LOGL_TRACE, __VA_ARGS__)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGD(...) _NMLOG(_LOGL_DEBUG, __VA_ARGS__)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGI(...) _NMLOG(_LOGL_INFO, __VA_ARGS__)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGW(...) _NMLOG(_LOGL_WARN, __VA_ARGS__)</span></span><br><span class="line"><span class="meta">#<span class="keyword">define</span> _LOGE(...) 
_NMLOG(_LOGL_ERR, __VA_ARGS__)</span></span><br></pre></td></tr></table></figure><p>The most verbose level is TRACE. To observe NetworkManager's detailed output, add the following to &#x2F;etc&#x2F;NetworkManager&#x2F;conf.d&#x2F;99-logging.conf. The file does not exist by default, so just create it; it is loaded when NetworkManager restarts. Valid levels are ERR, WARN, INFO, DEBUG and TRACE. domains=ALL logs every domain; specific domains can be listed instead (for example &quot;DHCP4,DHCP6&quot;). Comments in this keyfile format belong on their own lines, not after a value.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">[logging]</span><br><span class="line">level=TRACE</span><br><span class="line">domains=ALL</span><br></pre></td></tr></table></figure><p>Open two shell sessions and run one of the following commands in each. journalctl then shows the detailed log output, which can be captured and used to work through the execution steps:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">journalctl -fu NetworkManager</span><br><span class="line"></span><br><span class="line">systemctl restart NetworkManager</span><br></pre></td></tr></table></figure><p>The TRACE log is very large, so it is not reproduced in full here. Searching it for the keyword &#x2F;etc&#x2F;resolv.conf turns up the following:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span 
class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;debug&gt; [1752340261.9116] dns-mgr: (device_l3cd_changed): queueing DNS updates (1)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;info&gt;  [1752340261.9116] policy: set &#x27;ens33&#x27; (ens33) as default for IPv4 routing and DNS</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;debug&gt; [1752340261.9116] manager: PrimaryConnection now ens33</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9117] policy: set-hostname: updating hostname (ip conf)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9117] policy: get-hostname: &quot;localhost&quot; (from dbus)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9117] policy: device hostname info:</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9117] policy:   - prio:  100 ipv4 (def) dhcp  dns  dev:ens33</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9117] policy:   - prio:  100 ipv6       dhcp  dns  dev:ens33</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9117] policy: get-hostname: &quot;localhost&quot; (from dbus)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;info&gt;  [1752340261.9117] policy: set-hostname: set hostname to &#x27;localhost.localdomain&#x27; (no hostname found)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;debug&gt; [1752340261.9118] dns-mgr: (device_l3cd_changed): DNS configuration changed</span><br><span class="line">Jul 13 
01:11:01 localhost.localdomain NetworkManager[275984]: &lt;debug&gt; [1752340261.9118] dns-mgr: (device_l3cd_changed): committing DNS changes (0)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;debug&gt; [1752340261.9118] dns-mgr: update-dns: updating resolv.conf</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9118] dns-mgr: config:      100 best    v4 2     : 8.8.8.8</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9118] dns-mgr: config:      100 default v6 2     : </span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9118] dns-mgr: plugin: add domain &lt;auto-default&gt; (i=2, p=100)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9118] dns-mgr: plugin: settings: ifindex=2, priority=100, default-route=1, search=, reverse=0.168.192.in-addr.arpa</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9119] dns-mgr: update-resolv-no-stub: &#x27;/run/NetworkManager/no-stub-resolv.conf&#x27; successfully written</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9148] dns-mgr: update-resolv-conf: write to /etc/resolv.conf succeeded (rc-manager=symlink)</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9150] dns-mgr: update-resolv-conf: write internal file /run/NetworkManager/resolv.conf succeeded</span><br><span class="line">Jul 13 01:11:01 localhost.localdomain NetworkManager[275984]: &lt;trace&gt; [1752340261.9151] dns-mgr: current configuration: [&#123;&#x27;nameservers&#x27;: &lt;[&#x27;8.8.8.8&#x27;]&gt;, &#x27;interface&#x27;: &lt;&#x27;ens33&#x27;&gt;, 
&#x27;priority&#x27;: &lt;100&gt;, &#x27;vpn&#x27;: &lt;false&gt;&#125;]</span><br></pre></td></tr></table></figure><p>This shows that the write is indeed performed by update_resolv_conf. Combining the log with the source code, the call chain can be traced in the IDE as:</p><blockquote><p>nm_policy_class_init() -&gt; constructed() -&gt; device_added() -&gt; devices_list_register() -&gt; device_l3cd_changed() -&gt; nm_dns_manager_set_ip_config() -&gt; update_dns() -&gt; update_resolv_conf_no_stub(), update_resolv_conf() -&gt; create_resolv_conf() -&gt; write_resolv_conf_contents()</p></blockquote><p>Here nm_policy_class_init() is the class initialization function of the GObject framework, and it is invoked through GObject's type registration mechanism. Specifically, GObject calls it automatically when the NMPolicy class is first initialized, rather than through an explicit function call. Following it in the IDE shows this line in nm-policy.c:</p><figure class="highlight c"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">G_DEFINE_TYPE(NMPolicy, nm_policy, G_TYPE_OBJECT)</span><br></pre></td></tr></table></figure><p>This macro expands to the type registration code, which registers nm_policy_class_init() as the class initializer so that GObject ends up calling it.</p><p>NetworkManager is built with Meson. The top-level meson.build contains subdir('src'), and src&#x2F;meson.build in turn contains subdir('core'). After running meson build, the build.ninja file generated in the build directory describes the build steps.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">build src/core/libNetworkManager.a.p/nm-policy.c.o: c_COMPILER ../src/core/nm-policy.c || src/libnm-core-public/nm-core-enum-types.h</span><br><span class="line"> DEPFILE = src/core/libNetworkManager.a.p/nm-policy.c.o.d</span><br><span class="line"> DEPFILE_UNQUOTED = src/core/libNetworkManager.a.p/nm-policy.c.o.d</span><br><span class="line"> ARGS = -Isrc/core/libNetworkManager.a.p -Isrc/core -I../src/core -Isrc/libnm-core-public -I../src/libnm-core-public -Isrc -I../src -I. -I.. 
-I/usr/include/gio-unix-2.0 -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/sysprof-4 -I/usr/include/libmount -I/usr/include/blkid -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -Wcast-align=strict -Wdeclaration-after-statement -Wfloat-equal -Wformat-nonliteral -Wformat-security -Wimplicit-function-declaration -Wimplicit-int -Winit-self -Wint-conversion -Wlogical-op -Wmissing-declarations -Wmissing-include-dirs -Wmissing-prototypes -Wold-style-definition -Wpointer-arith -Wshadow -Wshift-negative-value -Wstrict-prototypes -Wundef -Wvla -Wno-duplicate-decl-specifier -Wno-format-truncation -Wno-format-y2k -Wno-missing-field-initializers -Wno-pragmas -Wno-sign-compare -Wno-unknown-pragmas -Wno-unused-parameter -fno-strict-aliasing -Wimplicit-fallthrough -fPIC -pthread -DGLIB_VERSION_MIN_REQUIRED=GLIB_VERSION_2_42 -DGLIB_VERSION_MAX_ALLOWED=GLIB_VERSION_2_42</span><br></pre></td></tr></table></figure><p>How NetworkManager manages DNS depends on its own configuration. The modes include default, systemd-resolved, and dnsmasq. In my experience, CentOS 7 and Rocky 9 usually use default, while Ubuntu 20.04 uses systemd-resolved.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Log output on Ubuntu 20.04</span></span><br><span class="line">Jul 13 02:51:24 misaka NetworkManager[815]: &lt;info&gt;  [1752346284.4650] dns-mgr[0x55eb10a9f290]: init: dns=systemd-resolved rc-manager=symlink, plugin=systemd-resolved</span><br></pre></td></tr></table></figure><p>systemd-resolved is part of the systemd suite and handles DNS resolution and other network name resolution tasks. It provides DNS caching, resolution across multiple DNS servers, DNS-over-TLS, and more. On Ubuntu systems that use systemd-resolved, &#x2F;etc&#x2F;resolv.conf is usually a symlink to &#x2F;run&#x2F;systemd&#x2F;resolve&#x2F;stub-resolv.conf. All DNS 
queries are sent to systemd-resolved's local stub resolver (127.0.0.53), and systemd-resolved then performs the actual resolution using the servers configured in &#x2F;run&#x2F;systemd&#x2F;resolve&#x2F;resolv.conf.</p><p>That covers the basic flow NetworkManager follows to update the DNS configuration on every restart.</p><p>The fix for the problem described at the start of this article, where every NetworkManager restart reverts DNS to the value in the initial network configuration, is therefore simple. Either of the following works:</p><ul><li><p>In &#x2F;etc&#x2F;NetworkManager&#x2F;system-connections&#x2F;ens33.nmconnection, comment out the dns setting, then add the target nameserver to &#x2F;etc&#x2F;resolv.conf.</p></li><li><p>In &#x2F;etc&#x2F;NetworkManager&#x2F;system-connections&#x2F;ens33.nmconnection, change the dns setting to the target servers. If there are several, remember to separate them with semicolons (for example: dns&#x3D;8.8.8.8;114.114.114.114;).</p></li></ul><h3 id="NetworkManager编译"><a href="#NetworkManager编译" class="headerlink" title="NetworkManager编译"></a>Building NetworkManager</h3><p>Before starting the investigation I expected to need a debugger or to add my own log statements, but the project's TRACE logging turned out to be very detailed, which is worth learning from. To let the IDE index the source properly, I also learned how to build NetworkManager. I tried building on Rocky 9 and Ubuntu 20.04; the steps are recorded here.</p><h4 id="Meson构建"><a href="#Meson构建" class="headerlink" title="Meson构建"></a>Configuring with Meson</h4><p>NetworkManager is built with Meson, an open source build system designed to be extremely fast and, even more importantly, as user friendly as possible. Many projects use it, such as GNOME and KDE.</p><p>Meson website: <a href="https://mesonbuild.com/index.html">https://mesonbuild.com/index.html</a><br>GitHub: <a href="https://github.com/mesonbuild/meson">https://github.com/mesonbuild/meson</a>  </p><h4 id="ninja编译"><a href="#ninja编译" class="headerlink" title="ninja编译"></a>Compiling with Ninja</h4><p>Meson and Ninja are usually used together: Meson resolves the project's dependencies and generates the build definition, and Ninja compiles the code. Ninja is a small build system focused on build speed. Its build rules live in .ninja files with a simple syntax, normally generated by tools such as Meson.</p><p>Ninja website: <a href="https://ninja-build.org/">https://ninja-build.org/</a><br>GitHub: <a href="https://github.com/ninja-build/ninja">https://github.com/ninja-build/ninja</a>  </p><h4 id="Rocky-9-6"><a href="#Rocky-9-6" class="headerlink" title="Rocky 9.6"></a>Rocky 9.6</h4><p>1. Clone the project</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">yum install vim git curl wget 
git</span><br><span class="line"></span><br><span class="line">git clone -b nm-1-52 https://gitlab.freedesktop.org/NetworkManager/NetworkManager.git</span><br></pre></td></tr></table></figure><p>2. Enable the Rocky devel repo by setting enabled to 1; otherwise some devel packages cannot be found</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">vim /etc/yum.repos.d/rocky-devel.repo</span><br><span class="line"></span><br><span class="line">[rocky-devel]</span><br><span class="line">name=Rocky Linux $releasever - Devel</span><br><span class="line">baseurl=https://mirrors.rockylinux.org/$contentdir/$releasever/devel/$basearch/os/</span><br><span class="line"># set enabled to 1 to turn the repo on (keep the comment on its own line)</span><br><span class="line">enabled=1</span><br><span class="line">gpgcheck=1</span><br><span class="line">gpgkey=/etc/pki/rpm-gpg/RPM-GPG-KEY-Rocky-9</span><br><span class="line"></span><br><span class="line">yum makecache</span><br></pre></td></tr></table></figure><p>3. Install Meson and Ninja</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">yum install python3-pip</span><br><span class="line"></span><br><span class="line">pip3 config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple</span><br><span class="line"></span><br><span class="line">pip3 install ninja meson</span><br><span class="line"></span><br><span class="line">ln 
-s /usr/local/bin/meson /usr/bin/meson</span><br><span class="line"></span><br><span class="line">ln -s /usr/local/bin/ninja /usr/bin/ninja</span><br></pre></td></tr></table></figure><p>4. Install the dependencies NetworkManager needs. How do you work out what they are? Search the project's meson.build files for dependency entries.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">yum install -y cmake uuid libuuid-devel libudev-devel dbus-devel glib2-devel libndp-devel gobject-introspection-devel libaudit-devel audit-libs-devel polkit-devel gnutls-devel nss-devel nspr-devel ppp-devel ModemManager-glib-devel mobile-broadband-provider-info-devel jansson-devel libpsl-devel libcurl-devel readline-devel libedit-devel newt-devel</span><br></pre></td></tr></table></figure><p>5. Run the Meson configuration and the Ninja build</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cd NetworkManager &amp;&amp; meson build</span><br><span class="line"></span><br><span class="line">cd build &amp;&amp; ninja</span><br></pre></td></tr></table></figure><p>ninja shows the compile progress as it goes, and it runs quickly:  </p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">[649/988] Compiling C object src/core/libNetworkManager.a.p/devices_nm-device-ip-tunnel.c.o</span><br></pre></td></tr></table></figure><h4 id="Ubuntu-20-04"><a href="#Ubuntu-20-04" class="headerlink" title="Ubuntu 20.04"></a>Ubuntu 20.04</h4><p>1. Clone the project</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">apt install vim tree curl wget git</span><br><span class="line"></span><br><span class="line">git clone -b nm-1-52 
https://gitlab.freedesktop.org/NetworkManager/NetworkManager.git</span><br></pre></td></tr></table></figure><p>2. Install Meson and Ninja</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">python3 -m pip install --upgrade pip</span><br><span class="line"></span><br><span class="line">ln -s /usr/local/bin/pip3 /usr/bin/pip</span><br><span class="line"></span><br><span class="line">pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">pip3 install ninja meson</span><br></pre></td></tr></table></figure><p>3. Install the dependencies NetworkManager needs. The default CMake version on 20.04 is a little old for the latest Meson, so first set up the Kitware APT repository for CMake. The manually downloaded key is removed before the kitware-archive-keyring package is installed, so that the package can own the keyring file.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2&gt;/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg &gt;/dev/null</span><br><span class="line">echo &#x27;deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ focal main&#x27; | sudo tee /etc/apt/sources.list.d/kitware.list &gt;/dev/null</span><br><span class="line">apt-get update</span><br><span class="line">rm /usr/share/keyrings/kitware-archive-keyring.gpg</span><br><span class="line">apt-get install kitware-archive-keyring</span><br><span 
class="line">apt-get install cmake</span><br><span class="line"></span><br><span class="line">apt install -y build-essential libdbus-glib-1-dev libglib2.0-dev libnm-dev libssl-dev libxml2-dev libreadline-dev gettext autogen autoconf automake libtool libevdev-dev libsystemd-dev libjson-glib-dev libunistring-dev check valgrind swig libndp-dev libgirepository1.0-dev gobject-introspection gir1.2-glib-2.0 libaudit-dev libpolkit-gobject-1-dev libgnutls28-dev libnss3-dev libnspr4-dev gnutls-bin ppp-dev libmm-glib-dev dhcpcd5 libjansson-dev libpsl-dev libcurl4-openssl-dev libnewt-dev xsltproc uuid</span><br></pre></td></tr></table></figure><p>4. Run the Meson configuration and the Ninja build</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cd NetworkManager &amp;&amp; meson build</span><br><span class="line"></span><br><span class="line">cd build &amp;&amp; ninja</span><br></pre></td></tr></table></figure><h4 id="构建、编译完成"><a href="#构建、编译完成" class="headerlink" title="构建、编译完成"></a>After configuring and compiling</h4><p>1. Once the meson build step above succeeds, it prints a summary like the following, which means you can move on to the ninja stage. If it reports an error, resolve it as the message indicates. The WARNING on the last line only notes that newer Meson versions prefer the meson setup build spelling.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span 
class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span 
class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line">Message: </span><br><span class="line">System paths:</span><br><span class="line">  prefix: /usr/local</span><br><span class="line">  exec_prefix: /usr/local</span><br><span class="line">  systemdunitdir: /usr/local/lib/systemd/system</span><br><span class="line">  udev_dir: /usr/local/lib/udev</span><br><span class="line">  nmbinary: /usr/local/sbin/NetworkManager</span><br><span class="line">  nmconfdir: /usr/local/etc/NetworkManager</span><br><span class="line">  nmlibdir: /usr/local/lib/NetworkManager</span><br><span class="line">  nmdatadir: /usr/local/share/NetworkManager</span><br><span class="line">  nmstatedir: /var/local/lib/NetworkManager</span><br><span class="line">  nmrundir: /var/local/run/NetworkManager</span><br><span class="line">  nmvpndir: /usr/local/lib/x86_64-linux-gnu/NetworkManager</span><br><span class="line">  nmplugindir: /usr/local/lib/x86_64-linux-gnu/NetworkManager/1.52.1</span><br><span class="line">  system_ca_path: /etc/ssl/certs</span><br><span class="line">  dbus_conf_dir: /usr/local/share/dbus-1/system.d</span><br><span class="line"></span><br><span class="line">Platform:</span><br><span class="line">  session tracking: systemd-logind,consolekit</span><br><span class="line">  suspend/resume: systemd</span><br><span class="line">  policykit: true (default: true) (restrictive modify.system)</span><br><span class="line">  polkit-agent-helper-1: /usr/lib/policykit-1/polkit-agent-helper-1</span><br><span class="line">  selinux: true</span><br><span class="line">  systemd-journald: true (default: logging.backend=journal)</span><br><span class="line">  hostname persist: default</span><br><span class="line">  libaudit: true (default: logging.audit=true)</span><br><span class="line"></span><br><span class="line">Features:</span><br><span class="line">  wext: true</span><br><span class="line">  wifi: true</span><br><span 
class="line">  iwd:  false</span><br><span class="line">  pppd: true /usr/sbin/pppd plugins:/usr/local/lib/x86_64-linux-gnu/pppd/2.4.9</span><br><span class="line">  jansson: yes (soname: libjansson.so.4)</span><br><span class="line">  iptables: &quot;/usr/sbin/iptables&quot;</span><br><span class="line">  ip6tables: &quot;/usr/sbin/ip6tables&quot;</span><br><span class="line">  nft: &quot;/usr/sbin/nft&quot;</span><br><span class="line">  modprobe: &quot;/usr/sbin/modprobe&quot;</span><br><span class="line">  modemmanager-1: true</span><br><span class="line">  mobile-broadband-provider-info-database: /usr/share/mobile-broadband-provider-info/serviceproviders.xml</span><br><span class="line">  ofono: false</span><br><span class="line">  concheck: true</span><br><span class="line">  libteamdctl: false</span><br><span class="line">  ovs: true</span><br><span class="line">  nmcli: true</span><br><span class="line">  nmtui: true</span><br><span class="line">  nm-cloud-setup: true</span><br><span class="line"></span><br><span class="line">Configuration_plugins (main.plugins=)</span><br><span class="line">  ifcfg-rh: false (deprecated)</span><br><span class="line">    default value of main.migrate-ifcfg-rh: false</span><br><span class="line">  ifupdown: true</span><br><span class="line"></span><br><span class="line">Handlers for /etc/resolv.conf:</span><br><span class="line">  resolvconf: true /usr/sbin/resolvconf</span><br><span class="line">  netconfig: false</span><br><span class="line"></span><br><span class="line">  config-dns-rc-manager-default: auto</span><br><span class="line"></span><br><span class="line">DHCP clients (default internal):</span><br><span class="line">  dhcpcd: true /usr/sbin/dhcpcd</span><br><span class="line">  dhclient: false (deprecated)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">Miscellaneous:</span><br><span class="line">  have introspection: true</span><br><span class="line">  build 
documentation and manpages: false</span><br><span class="line">  firewalld zone for shared mode: true</span><br><span class="line">  tests: yes</span><br><span class="line">  more-asserts: 0</span><br><span class="line">  more-logging: true</span><br><span class="line">  warning-level: 2</span><br><span class="line">  valgrind: false</span><br><span class="line">  code coverage: false</span><br><span class="line">  LTO: false</span><br><span class="line">  Linker garbage collection: true</span><br><span class="line">  crypto: nss (have-gnutls: true, have-nss: true)</span><br><span class="line">  sanitizers: none</span><br><span class="line">  Mozilla Public Suffix List: true</span><br><span class="line">  vapi: false</span><br><span class="line">  ebpf: false</span><br><span class="line">  readline: libreadline</span><br><span class="line"></span><br><span class="line">Build targets in project: 369</span><br><span class="line"></span><br><span class="line">Found ninja-1.11.1.git.kitware.jobserver-1 at /usr/local/bin/ninja</span><br><span class="line">WARNING: Running the setup command as `meson [options]` instead of `meson setup [options]` is ambiguous and deprecated.</span><br></pre></td></tr></table></figure><p>2. When ninja finishes, the progress line disappears, so just go and look at the files. The build directory contents and sizes after compiling NetworkManager:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">du -sh ./* | sort -h -r</span><br><span class="line">1.2G    ./src</span><br><span class="line">18M     ./introspection</span><br><span 
class="line">8.9M    ./po</span><br><span class="line">2.9M    ./meson-private</span><br><span class="line">1.8M    ./build.ninja</span><br><span class="line">1.2M    ./meson-info</span><br><span class="line">1012K   ./compile_commands.json</span><br><span class="line">664K    ./examples</span><br><span class="line">200K    ./data</span><br><span class="line">124K    ./meson-logs</span><br><span class="line">8.0K    ./meson-uninstalled</span><br><span class="line">8.0K    ./config.h</span><br><span class="line">4.0K    ./config-extra.h</span><br></pre></td></tr></table></figure><p>3. The NetworkManager build artifacts are spread across the subdirectories under src. If you need to install them, run ninja install.</p><p>That wraps up this study session ☺️</p><h3 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>References</h3><p>1. <a href="https://apt.kitware.com/">https://apt.kitware.com/</a><br>2. <a href="https://people.freedesktop.org/~lkundrak/nm-docs/NetworkManager.conf.html">https://people.freedesktop.org/~lkundrak/nm-docs/NetworkManager.conf.html</a><br>3. <a href="https://www.alibabacloud.com/help/zh/alinux/user-guide/networkmanager-configuration-files-and-common-configurations">https://www.alibabacloud.com/help/zh/alinux/user-guide/networkmanager-configuration-files-and-common-configurations</a></p>]]></content>
    
    
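    <summary type="html">&lt;p&gt;I ran into the following scenario: after the OS was installed, the network was configured through the GUI, the business platform was installed, and &amp;#x2F;etc&amp;#x2F;resolv.conf was edited directly to change the DNS servers. A few days later the setting had changed, which affected the platform, yet the investigation confirmed that nobody had touched &amp;#x2F;etc&amp;#x2F;resolv.conf. It was eventually traced to NetworkManager having been restarted: every NetworkManager restart restores DNS to the values written in the initial network configuration at install time. So what does NetworkManager actually do here? Let's follow the logs and the source code to find out ☺️&lt;/p&gt;</summary>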
    
    
    
    <category term="Skill" scheme="https://www.applenice.net/categories/Skill/"/>
    
    
    <category term="Linux" scheme="https://www.applenice.net/tags/Linux/"/>
    
    <category term="Network" scheme="https://www.applenice.net/tags/Network/"/>
    
  </entry>
  
  <entry>
    <title>Deploying K3s on a Single Node and Connecting It to Kuboard</title>
    <link href="https://www.applenice.net/2025/03/26/Standalone-deployment-of-k3s/"/>
    <id>https://www.applenice.net/2025/03/26/Standalone-deployment-of-k3s/</id>
    <published>2025-03-26T15:29:09.000Z</published>
    <updated>2025-03-27T15:55:15.588Z</updated>
    
    <content type="html"><![CDATA[<p>When experimenting with Kubernetes or writing small tools of your own, you usually need an environment to learn in. Running K3s on a single server is a low-resource way to get one, and adding Kuboard for management makes it easier to operate. This post records the setup steps.</p><span id="more"></span><h3 id="环境准备"><a href="#环境准备" class="headerlink" title="环境准备"></a>Environment preparation</h3><p>This walkthrough uses a Tencent Cloud host with 2 vCPUs and 2 GB of RAM running Ubuntu 22.04.5 LTS (Jammy Jellyfish). The system is freshly initialized, with no services installed.</p><p>1. Update the system first so that the installed packages are current.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">apt update &amp;&amp; apt upgrade -y &amp;&amp; apt autoremove -y &amp;&amp; reboot</span><br></pre></td></tr></table></figure><p>2. Install Docker 25.0.5<br>I chose Docker 25.0.5 as the runtime for the K3s installation.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">apt install apt-transport-https ca-certificates curl gnupg lsb-release -y</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl -fsSL https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">echo</span> \</span></span><br><span class="line"><span class="language-bash">  <span class="string">&quot;deb [arch=<span class="subst">$(dpkg --print-architecture)</span> 
signed-by=/etc/apt/keyrings/docker.asc] https://mirrors.ustc.edu.cn/docker-ce/linux/ubuntu/ \</span></span></span><br><span class="line"><span class="string"><span class="language-bash">  <span class="subst">$(. /etc/os-release &amp;&amp; echo <span class="string">&quot;<span class="variable">$VERSION_CODENAME</span>&quot;</span>)</span> stable&quot;</span> | \</span></span><br><span class="line"><span class="language-bash">  <span class="built_in">sudo</span> <span class="built_in">tee</span> /etc/apt/sources.list.d/docker.list &gt; /dev/null</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">apt update</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">List the available Docker versions</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">apt-cache madison docker-ce</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Install Docker</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">apt install docker-ce=5:25.0.5-1~ubuntu.22.04~jammy docker-ce-cli=5:25.0.5-1~ubuntu.22.04~jammy containerd.io -y</span></span><br></pre></td></tr></table></figure><p>3. Confirm that Docker is installed</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl status docker</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker 
info</span></span><br></pre></td></tr></table></figure><p>4. Configure a stable Docker registry mirror</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">sudo</span> <span class="built_in">tee</span> /etc/docker/daemon.json &lt;&lt;-<span class="string">&#x27;EOF&#x27;</span></span></span><br><span class="line">&#123;</span><br><span class="line">    &quot;registry-mirrors&quot;: [</span><br><span class="line">        &quot;https://docker.m.daocloud.io&quot;</span><br><span class="line">    ]</span><br><span class="line">&#125;</span><br><span class="line">EOF</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl daemon-reload</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl restart docker</span></span><br></pre></td></tr></table></figure><h3 id="K3S部署"><a href="#K3S部署" class="headerlink" title="K3S部署"></a>Deploying K3s</h3><p>The K3s community has synced the required K3s resources to servers in mainland China, so K3s can be installed from them there, which makes the installation both faster and more reliable. K3s uses containerd as its container runtime by default; here I change that and use Docker as the container runtime instead.</p><p>1. Run the install script with Docker as the runtime, and set the default registry to registry.cn-hangzhou.aliyuncs.com.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Use this command to run K3s with Docker as the runtime</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -s - --docker --system-default-registry <span class="string">&quot;registry.cn-hangzhou.aliyuncs.com&quot;</span></span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Use this command to keep the default containerd runtime</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn sh -s - --system-default-registry <span class="string">&quot;registry.cn-hangzhou.aliyuncs.com&quot;</span></span></span><br><span class="line"></span><br><span class="line">[INFO]  Finding release for channel stable</span><br><span class="line">[INFO]  Using v1.31.6+k3s1 as release</span><br><span class="line">[INFO]  Downloading hash rancher-mirror.rancher.cn/k3s/v1.31.6-k3s1/sha256sum-amd64.txt</span><br><span class="line">[INFO]  Downloading binary rancher-mirror.rancher.cn/k3s/v1.31.6-k3s1/k3s</span><br><span class="line">[INFO]  Verifying binary download</span><br><span class="line">[INFO]  Installing k3s to /usr/local/bin/k3s</span><br><span class="line">[INFO]  Skipping installation of SELinux RPM</span><br><span class="line">[INFO]  Creating /usr/local/bin/kubectl symlink to 
k3s</span><br><span class="line">[INFO]  Creating /usr/local/bin/crictl symlink to k3s</span><br><span class="line">[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr</span><br><span class="line">[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh</span><br><span class="line">[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh</span><br><span class="line">[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env</span><br><span class="line">[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service</span><br><span class="line">[INFO]  systemd: Enabling k3s unit</span><br><span class="line">Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.</span><br><span class="line">[INFO]  systemd: Starting k3s</span><br></pre></td></tr></table></figure><p>2. Wait a few minutes, then check the state of K3s:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">systemctl status k3s</span></span><br><span class="line">● k3s.service - Lightweight 
Kubernetes</span><br><span class="line">     Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)</span><br><span class="line">     Active: active (running) since Thu 2025-03-27 00:21:18 CST; 3min ago</span><br><span class="line">       Docs: https://k3s.io</span><br><span class="line">    Process: 26559 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2&gt;/dev/null (code=exited, status=0/SUCCESS)</span><br><span class="line">    Process: 26561 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)</span><br><span class="line">    Process: 26562 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)</span><br><span class="line">   Main PID: 26563 (k3s-server)</span><br><span class="line">      Tasks: 16</span><br><span class="line">     Memory: 483.3M</span><br><span class="line">        CPU: 21.133s</span><br><span class="line">     CGroup: /system.slice/k3s.service</span><br><span class="line">             ├─26563 &quot;/usr/local/bin/k3s server&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot;&gt;</span><br><span class="line">             └─28640 &quot;/usr/local/bin/k3s server&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; 
&quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot; &quot;&quot;&gt;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl get pods --all-namespaces</span></span><br><span class="line">NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE</span><br><span class="line">kube-system   coredns-7f9dc8d998-jl8q8                  1/1     Running     0          93s</span><br><span class="line">kube-system   helm-install-traefik-crd-4v5w2            0/1     Completed   0          93s</span><br><span class="line">kube-system   helm-install-traefik-tjs4j                0/1     Completed   2          93s</span><br><span class="line">kube-system   local-path-provisioner-864d7dff5d-5rph4   1/1     Running     0          93s</span><br><span class="line">kube-system   metrics-server-69969b57cb-k7g2q           1/1     Running     0          93s</span><br><span class="line">kube-system   svclb-traefik-98e2a50a-h5n6s              2/2     Running     0          34s</span><br><span class="line">kube-system   traefik-57d9d494d7-x52jb                  1/1     Running     0          34s</span><br></pre></td></tr></table></figure><p>3、同时可以在Docker中查看image和container变化：</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker image <span class="built_in">ls</span></span></span><br><span class="line">REPOSITORY                                                           TAG                    IMAGE ID       CREATED        SIZE</span><br><span class="line">registry.cn-hangzhou.aliyuncs.com/rancher/mirrored-library-traefik   2.11.20                d7d7095a482f   7 weeks ago    178MB</span><br><span class="line">registry.cn-hangzhou.aliyuncs.com/rancher/local-path-provisioner     v0.0.31                8309ed19e06b   2 months ago   60.4MB</span><br><span class="line">registry.cn-hangzhou.aliyuncs.com/rancher/klipper-helm               v0.9.4-build20250113   cf1d4e2d0dbd   2 months ago   190MB</span><br><span class="line">....</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps</span></span><br><span class="line">CONTAINER ID   IMAGE                                                                COMMAND                  CREATED         STATUS         PORTS     NAMES</span><br><span class="line">624e32c551a8   registry.cn-hangzhou.aliyuncs.com/rancher/mirrored-library-traefik   &quot;/entrypoint.sh --gl…&quot;   5 minutes ago   Up 5 minutes             k8s_traefik_traefik-57d9d494d7-x52jb_kube-system_1562282c-dbdb-490b-bf7e-85a6e4ebc251_0</span><br><span class="line">f1a36596c826   b82360cf0b97                                                         &quot;entry&quot;                  5 minutes ago   Up 5 minutes             k8s_lb-tcp-443_svclb-traefik-98e2a50a-h5n6s_kube-system_2e314526-1424-4989-9dea-09023c830c05_0</span><br><span class="line">ecdd527845ce   registry.cn-hangzhou.aliyuncs.com/rancher/klipper-lb                 &quot;entry&quot;                  5 minutes ago   Up 5 minutes             
k8s_lb-tcp-80_svclb-traefik-98e2a50a-h5n6s_kube-system_2e314526-1424-4989-9dea-09023c830c05_0</span><br><span class="line">......</span><br></pre></td></tr></table></figure><h4 id="containerd"><a href="#containerd" class="headerlink" title="containerd"></a>containerd</h4><p>1. If containerd is chosen as the container runtime, you also need to configure a registry mirror for the K3s containerd and restart K3s. On startup, K3s checks whether a registries.yaml file exists in &#x2F;etc&#x2F;rancher&#x2F;k3s&#x2F; and instructs containerd to use the registries defined in that file.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> &gt;&gt; /etc/rancher/k3s/registries.yaml &lt;&lt;<span class="string">EOF</span></span></span><br><span class="line">mirrors:</span><br><span class="line">  &quot;docker.io&quot;:</span><br><span class="line">    endpoint:</span><br><span class="line">      - &quot;https://docker.m.daocloud.io&quot;</span><br><span class="line">EOF</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="string">systemctl restart k3s</span></span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash"><span 
class="string">Confirm the mirror configuration</span></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="string">cat /var/lib/rancher/k3s/agent/etc/containerd/certs.d/docker.io/hosts.toml</span></span></span><br><span class="line">mirrors:</span><br><span class="line">  &quot;docker.io&quot;:</span><br><span class="line">    endpoint:</span><br><span class="line">      - &quot;https://docker.m.daocloud.io&quot;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash"><span class="string">Check the socket of the K3s containerd</span></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="string">ls -lrt /run/k3s/containerd/containerd.sock</span></span></span><br><span class="line">srw-rw---- 1 root root 0 Mar 27 01:29 /run/k3s/containerd/containerd.sock</span><br></pre></td></tr></table></figure><p>2. The ctr and crictl tools bundled with K3s can list the images pulled by the K3s containerd:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">k3s ctr image <span class="built_in">ls</span></span></span><br><span class="line">REF                                                                                                                                        TYPE                                    DIGEST                                                                  SIZE      PLATFORMS                                                                             
                  LABELS                                                          </span><br><span class="line">registry.cn-hangzhou.aliyuncs.com/rancher/klipper-helm:v0.9.4-build20250113                                                                application/vnd.oci.image.index.v1+json sha256:6b33b9efa5b89c2606e777b137e9959fbcd4364501d19c0c746d0cc8f32026d9 67.2 MiB  linux/amd64,linux/arm,linux/arm64/v8                                                                    io.cri-containerd.image=managed    </span><br><span class="line">......</span><br><span class="line"></span><br><span class="line">or</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">crictl image <span class="built_in">ls</span></span></span><br><span class="line">root@VM-8-16-ubuntu:/home/ubuntu# crictl image ls</span><br><span class="line">IMAGE                                                                TAG                    IMAGE ID            SIZE</span><br><span class="line">registry.cn-hangzhou.aliyuncs.com/rancher/klipper-helm               v0.9.4-build20250113   cf1d4e2d0dbd1       70.4MB</span><br><span class="line">......</span><br></pre></td></tr></table></figure><p>3. For anyone used to Docker, ctr and crictl still work somewhat differently from the Docker CLI. nerdctl can be used with K3s instead: it is a containerd client whose command style is compatible with the Docker CLI, and it also supports Docker Compose syntax directly.<br>1) Download and extract nerdctl</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">wget https://github.com/containerd/nerdctl/releases/download/v2.0.4/nerdctl-full-2.0.4-linux-amd64.tar.gz</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">tar -xvf 
nerdctl-full-2.0.4-linux-amd64.tar.gz</span></span><br></pre></td></tr></table></figure><p>2) Use nerdctl against the containerd bundled with K3s. Note that <code>--namespace=k8s.io</code> must be specified explicitly here</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">./bin/nerdctl -H /run/k3s/containerd/containerd.sock --namespace=k8s.io images</span></span><br><span class="line">REPOSITORY                                                            TAG                     IMAGE ID        CREATED           PLATFORM       SIZE       BLOB SIZE</span><br><span class="line">registry.cn-hangzhou.aliyuncs.com/rancher/mirrored-library-traefik    &lt;none&gt;                  21f5c16b2215    22 hours ago    linux/amd64    180.4MB    49.45MB</span><br><span class="line">&lt;none&gt;                                                                &lt;none&gt;                  21f5c16b2215    22 hours ago    linux/amd64    180.4MB    49.45MB</span><br><span class="line">........</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">./bin/nerdctl -H /run/k3s/containerd/containerd.sock --namespace=k8s.io ps</span></span><br><span class="line">CONTAINER ID    IMAGE                                                                         COMMAND                   CREATED           STATUS    PORTS    NAMES</span><br><span class="line">7db864ef1a2c    registry.cn-hangzhou.aliyuncs.com/rancher/mirrored-library-traefik:2.11.20    &quot;/entrypoint.sh --gl…&quot;    22 hours ago    Up                 
k8s://kube-system/traefik-57d9d494d7-wn6nx/traefik</span><br></pre></td></tr></table></figure><h3 id="Pod测试"><a href="#Pod测试" class="headerlink" title="Pod测试"></a>Pod Test</h3><p>Once the service is deployed, start an nginx pod to verify that it works properly.<br>1. Write the deployment manifest:</p><figure class="highlight yml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">$</span> <span class="string">cat</span> <span class="string">nginx-deployment.yml</span> </span><br><span class="line"><span class="attr">apiVersion:</span> <span class="string">apps/v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Deployment</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">nginx-deployment</span></span><br><span class="line">  <span class="attr">labels:</span></span><br><span class="line">    <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">replicas:</span> <span class="number">1</span></span><br><span class="line">  <span class="attr">selector:</span></span><br><span class="line">    <span class="attr">matchLabels:</span></span><br><span class="line">      <span class="attr">app:</span> <span 
class="string">nginx</span></span><br><span class="line">  <span class="attr">template:</span></span><br><span class="line">    <span class="attr">metadata:</span></span><br><span class="line">      <span class="attr">labels:</span></span><br><span class="line">        <span class="attr">app:</span> <span class="string">nginx</span></span><br><span class="line">    <span class="attr">spec:</span></span><br><span class="line">      <span class="attr">containers:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">nginx</span></span><br><span class="line">        <span class="attr">image:</span> <span class="string">nginx:1.26.3</span></span><br></pre></td></tr></table></figure><p>2. Create the deployment and check the pod</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl apply -f nginx-deployment.yml</span> </span><br><span class="line">deployment.apps/nginx-deployment created</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">kubectl get pod</span></span><br><span class="line">NAME                                READY   STATUS    RESTARTS   AGE</span><br><span class="line">nginx-deployment-74bd454fc9-kqjnj   1/1     Running   0          41s</span><br></pre></td></tr></table></figure><p>At this point the service is confirmed to be working, but operating it from the command line is still relatively tedious, so the next step is to install Kuboard to manage the newly deployed K3s service.</p><h3 id="Kuboard接入"><a href="#Kuboard接入" class="headerlink" title="Kuboard接入"></a>Kuboard Integration</h3><p>1. Create a script and run it to deploy Kuboard:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span 
class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> kuboard-install.sh</span> </span><br><span class="line">docker run -itd \</span><br><span class="line">  --restart=unless-stopped \</span><br><span class="line">  --name=kuboard \</span><br><span class="line">  -p 18080:80/tcp \</span><br><span class="line">  -p 10081:10081/tcp \</span><br><span class="line">  -e KUBOARD_ENDPOINT=&quot;http://&#123;internal-IP&#125;:80&quot; \</span><br><span class="line">  -e KUBOARD_AGENT_SERVER_TCP_PORT=&quot;10081&quot; \</span><br><span class="line">  -v /root/kuboard-data:/data \</span><br><span class="line">  eipwork/kuboard:v3</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">sh kuboard-install.sh</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker ps -a | grep kuboard</span></span><br><span class="line">ae14cf2e2ae5   eipwork/kuboard:v3                                                   &quot;/entrypoint.sh&quot;         13 seconds ago   Up 12 seconds               443/tcp, 0.0.0.0:10081-&gt;10081/tcp, :::10081-&gt;10081/tcp, 0.0.0.0:18080-&gt;80/tcp, :::18080-&gt;80/tcp   kuboard</span><br><span class="line"><span class="meta 
prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Check the logs to confirm that Kuboard started normally</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker logs kuboard</span></span><br></pre></td></tr></table></figure><p>After installation, the default Kuboard credentials are: admin&#x2F;Kuboard123</p><p>This is a weak password, so change it in the personal settings immediately after installation.</p><p>2. Create an allow rule in the Tencent Cloud host firewall to permit access to port 18080 on the cloud host, then log in with the default credentials.</p><p>3. Click Add Cluster. Kubernetes clusters can be added in three ways: Token, KubeConfig, or Kuboard Agent. Here I chose the Token method and followed the on-screen instructions.  </p><img src="/2025/03/26/Standalone-deployment-of-k3s/Kuboard-001.png" class=""><img src="/2025/03/26/Standalone-deployment-of-k3s/Kuboard-002.png" class=""><p>4. Once the cluster has been added, it can be managed from Kuboard.</p><h3 id="参考"><a href="#参考" class="headerlink" title="参考"></a>References</h3><p>1. <a href="https://docs.rancher.cn/docs/k3s/quick-start/_index">https://docs.rancher.cn/docs/k3s/quick-start/_index</a><br>2. <a href="https://forums.rancher.cn/t/k3s/1416">https://forums.rancher.cn/t/k3s/1416</a><br>3. <a href="https://kuboard.cn/install/v3/install-built-in.html">https://kuboard.cn/install/v3/install-built-in.html</a><br>4. <a href="https://docs.rancher.cn/docs/k3s/installation/private-registry/_index/">https://docs.rancher.cn/docs/k3s/installation/private-registry/_index/</a><br>5. <a href="https://docs.k3s.io/installation/private-registry">https://docs.k3s.io/installation/private-registry</a><br>6. <a href="https://github.com/containerd/nerdctl/issues/128#issuecomment-803544231">https://github.com/containerd/nerdctl/issues/128#issuecomment-803544231</a>  </p>]]></content>
    
    
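    <summary type="html">&lt;p&gt;When experimenting with Kubernetes or writing small tools of your own, you usually need an environment to learn in. Setting one up with K3s on a single server uses relatively few resources, and adding Kuboard for management makes it easier to operate. This post records the steps for setting up that environment.&lt;/p&gt;</summary>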
    
    
    
    <category term="Kubernetes" scheme="https://www.applenice.net/categories/Kubernetes/"/>
    
    
    <category term="Linux" scheme="https://www.applenice.net/tags/Linux/"/>
    
    <category term="K3s" scheme="https://www.applenice.net/tags/K3s/"/>
    
    <category term="Kubernetes" scheme="https://www.applenice.net/tags/Kubernetes/"/>
    
    <category term="Kuboard" scheme="https://www.applenice.net/tags/Kuboard/"/>
    
  </entry>
  
  <entry>
    <title>http.Client Transport Configuration and the TIME_WAIT Phenomenon in Golang</title>
    <link href="https://www.applenice.net/2025/02/01/Go-http-Client-Transport-TIME-WAIT/"/>
    <id>https://www.applenice.net/2025/02/01/Go-http-Client-Transport-TIME-WAIT/</id>
    <published>2025-02-01T04:54:26.000Z</published>
    <updated>2025-02-01T07:38:42.000Z</updated>
    
    <content type="html"><![CDATA[<p>The first post of 2025. Happy New Year, everyone (#^.^#)</p><p>This post records a TIME_WAIT problem caused by a misconfigured Transport when using the Client from Golang's net&#x2F;http package under a large number of concurrent requests. The problem is reproduced below with demo programs.</p><span id="more"></span><h3 id="环境准备"><a href="#环境准备" class="headerlink" title="环境准备"></a>Environment Setup</h3><p>The failure was originally diagnosed on a physical machine running CentOS 7.6. As is well known, the range of local ports available to an IP address is limited, and it can be checked with the following commands:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">sysctl -a | grep net.ipv4.ip_local_port_range</span></span><br><span class="line">net.ipv4.ip_local_port_range = 32768    60999</span><br><span class="line">or</span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">cat</span> /proc/sys/net/ipv4/ip_local_port_range</span></span><br><span class="line">32768   60999</span><br></pre></td></tr></table></figure><p>So the number of locally available ports works out to 28232 (60999 - 32768 + 1).</p><p>To make the scenario easier to reproduce, a Docker container is used here, and the <code>net.ipv4.ip_local_port_range</code> parameter inside the container is restricted when the container is started.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">docker run -itd --sysctl net.ipv4.ip_local_port_range=<span class="string">&quot;32768    36768&quot;</span> golang:1.20.14-bullseye</span></span><br></pre></td></tr></table></figure><p>This leaves roughly 4000 local ports available (4001, counting both ends of the range), a limit that the wrk tool can exhaust very easily.</p><h3 id="故障复现"><a href="#故障复现" class="headerlink" title="故障复现"></a>Reproducing the Failure</h3><p>To reproduce the failure, two demo programs are needed, both serving HTTP with the Gin framework.<br>1. The backends program listens on port 8088 and provides a simple ping and pong response:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> <span class="string">&quot;github.com/gin-gonic/gin&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    gin.SetMode(gin.ReleaseMode)</span><br><span class="line">    r := gin.Default()</span><br><span class="line">    r.GET(<span class="string">&quot;/ping&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">(c *gin.Context)</span></span> &#123;</span><br><span class="line">        c.JSON(<span class="number">200</span>, gin.H&#123;</span><br><span class="line">            <span class="string">&quot;message&quot;</span>: <span class="string">&quot;pong&quot;</span>,</span><br><span class="line">        &#125;)</span><br><span class="line">    &#125;)</span><br><span class="line">    r.Run(<span class="string">&quot;:8088&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Build it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">go build -o backends</span></span><br></pre></td></tr></table></figure><p>2. The transponder program listens on port 8080 and acts as a request proxy: each request to the tping route triggers a further HTTP request to the ping route on port 8088.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span 
class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span 
class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;fmt&quot;</span></span><br><span class="line">    <span class="string">&quot;io&quot;</span></span><br><span class="line">    <span class="string">&quot;net/http&quot;</span></span><br><span class="line">    <span class="string">&quot;encoding/json&quot;</span></span><br><span class="line">    <span class="string">&quot;github.com/gin-gonic/gin&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> ping <span class="keyword">struct</span> &#123;</span><br><span class="line">    Message <span class="type">string</span> <span class="string">`json:&quot;message&quot;`</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    gin.SetMode(gin.ReleaseMode)</span><br><span class="line">    gin.DefaultWriter = io.Discard</span><br><span class="line">    r := gin.Default()</span><br><span class="line"></span><br><span class="line">    url := <span class="string">&quot;http://127.0.0.1:8088/ping&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Use the default http.Client directly</span></span><br><span class="line">    client := &amp;http.Client&#123;&#125;</span><br><span class="line"></span><br><span 
class="line">    r.GET(<span class="string">&quot;/tping&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">(c *gin.Context)</span></span> &#123;</span><br><span class="line">        req, err := http.NewRequest(<span class="string">&quot;GET&quot;</span>, url, <span class="literal">nil</span>)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line">                <span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;http.NewRequest failed: %v&quot;</span>, err),</span><br><span class="line">            &#125;)</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        resp, err := client.Do(req)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line">                <span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;get failed: %v&quot;</span>, err),</span><br><span class="line">            &#125;)</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        &#125;</span><br><span class="line">        <span class="keyword">defer</span> resp.Body.Close()</span><br><span class="line"></span><br><span class="line">        <span class="keyword">if</span> resp.StatusCode != http.StatusOK &#123;</span><br><span class="line">            c.JSON(resp.StatusCode, gin.H&#123;</span><br><span class="line">                <span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;request failed with status code: %d&quot;</span>, resp.StatusCode),</span><br><span class="line">            &#125;)</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        &#125;</span><br><span 
class="line"></span><br><span class="line">        body, err := io.ReadAll(resp.Body)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line">                <span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;read body failed: %v&quot;</span>, err),</span><br><span class="line">            &#125;)</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">var</span> tping ping</span><br><span class="line">        err = json.Unmarshal(body, &amp;tping)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line">                <span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;unmarshal failed: %v&quot;</span>, err),</span><br><span class="line">            &#125;)</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        c.JSON(<span class="number">200</span>, gin.H&#123;</span><br><span class="line">            <span class="string">&quot;data&quot;</span>: tping.Message,</span><br><span class="line">        &#125;)</span><br><span class="line">    &#125;)</span><br><span class="line"></span><br><span class="line">    r.Run(<span class="string">&quot;:8080&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Build it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">go build -o 
transponder_v1</span></span><br></pre></td></tr></table></figure><p>3. Start the backends and transponder_v1 programs <strong>inside the container prepared above</strong>, then verify with curl requests that both services work properly:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/opt# curl http://127.0.0.1:8088/ping</span><br><span class="line">&#123;&quot;message&quot;:&quot;pong&quot;&#125; </span><br><span class="line"></span><br><span class="line">root@d76a1f6b4069:/opt# curl http://127.0.0.1:8080/tping</span><br><span class="line">&#123;&quot;data&quot;:&quot;pong&quot;&#125;</span><br></pre></td></tr></table></figure><p>The network state in the environment is also fairly simple, with only the two services that were just started.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/go# netstat -anlp</span><br><span class="line">Active Internet connections (servers and established)</span><br><span class="line">Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    </span><br><span class="line">tcp6       0      0 :::8080                 :::*                    LISTEN      953/./transponder_v </span><br><span class="line">tcp6       0      0 :::8088                 :::*                    LISTEN      
579/./backends      </span><br><span class="line">Active UNIX domain sockets (servers and established)</span><br><span class="line">Proto RefCnt Flags       Type       State         I-Node   PID/Program name     Path</span><br><span class="line"></span><br><span class="line">root@d76a1f6b4069:/go# ss -s</span><br><span class="line">Total: 652</span><br><span class="line">TCP:   29 (estab 0, closed 27, orphaned 0, timewait 1)</span><br><span class="line"></span><br><span class="line">Transport Total     IP        IPv6</span><br><span class="line">RAW       0         0         0        </span><br><span class="line">UDP       0         0         0        </span><br><span class="line">TCP       2         0         2        </span><br><span class="line">INET      2         0         2        </span><br><span class="line">FRAG      0         0         0        </span><br></pre></td></tr></table></figure><p>4. Prepare wrk and start a load test, here with 8 threads and 600 connections for 180 seconds:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/opt# ./wrk -t 8 -c 600 -d 180s http://127.0.0.1:8080/tping</span><br><span class="line">Running 3m test @ http://127.0.0.1:8080/tping</span><br><span class="line">  8 threads and 600 connections</span><br><span class="line">  Thread Stats   Avg      Stdev     Max   +/- Stdev</span><br><span class="line">    Latency   574.59ms  611.03ms   2.00s    75.02%</span><br><span class="line">    Req/Sec    67.71     58.33   721.00     86.88%</span><br><span class="line">  96018 requests in 3.00m, 19.79MB read</span><br><span class="line">  
Socket errors: connect 0, read 0, write 0, timeout 21152</span><br><span class="line">  Non-2xx or 3xx responses: 57687</span><br><span class="line">Requests/sec:    533.16</span><br><span class="line">Transfer/sec:    112.52KB</span><br></pre></td></tr></table></figure><p>The wrk output is shown here first; the observations in step 5 below were made while the test was still running.</p><p>5. After waiting ten-odd seconds, the TIME_WAIT symptom can be observed with the following commands:<br>1) Repeated curl requests to the /tping route on port 8080 all return an error response</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/opt# curl http://127.0.0.1:8080/tping</span><br><span class="line">&#123;&quot;error&quot;:&quot;get failed: Get \&quot;http://127.0.0.1:8088/ping\&quot;: dial tcp 127.0.0.1:8088: connect: cannot assign requested address&quot;&#125;</span><br></pre></td></tr></table></figure><p>Although transponder_v1 has not crashed, it can no longer return the pong result to curl and reports an error instead.</p><p>2) The ss command</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/go# ss -s</span><br><span class="line">Total: 2637</span><br><span class="line">TCP:   5730 (estab 1780, closed 3948, orphaned 0, timewait 3712)</span><br><span class="line"></span><br><span class="line">Transport Total     IP        IPv6</span><br><span class="line">RAW       0         0         0        </span><br><span class="line">UDP       0         0         0        </span><br><span class="line">TCP       1782      890       892      </span><br><span class="line">INET      1782      890       892      </span><br><span class="line">FRAG      0         0         0        
</span><br></pre></td></tr></table></figure><p>The ss output shows that the connections generated by the wrk load (5730 TCP sockets in total, 3712 of them in TIME_WAIT) already far exceed the limit of 4000 local ports that we configured.</p><p>3) Filtering with netstat<br>The ss output above shows that many connections have been established; filtering the netstat output shows them in detail.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/go# netstat -anlp | grep ESTABLISHED </span><br><span class="line">tcp6      45      0 127.0.0.1:8080          127.0.0.1:34111         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6       0      0 127.0.0.1:8080          127.0.0.1:34461         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6       0      0 127.0.0.1:8088          127.0.0.1:34845         ESTABLISHED 579/./backends      </span><br><span class="line">tcp6      45      0 127.0.0.1:8080        
  127.0.0.1:34635         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6       0      0 127.0.0.1:8080          127.0.0.1:34877         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6      45      0 127.0.0.1:8080          127.0.0.1:33851         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6      45      0 127.0.0.1:8080          127.0.0.1:34745         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6      45      0 127.0.0.1:8080          127.0.0.1:34001         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6       0      0 127.0.0.1:8080          127.0.0.1:33727         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6      45      0 127.0.0.1:8080          127.0.0.1:34981         ESTABLISHED 598/./transponder_v </span><br><span class="line">tcp6       0      0 127.0.0.1:8088          127.0.0.1:36216         ESTABLISHED 579/./backends      </span><br><span class="line">tcp6       0      0 127.0.0.1:8088          127.0.0.1:36647         ESTABLISHED 579/./backends      </span><br><span class="line">.......</span><br><span class="line"></span><br><span class="line">root@d76a1f6b4069:/go# netstat -anlp | grep TIME_WAIT</span><br><span class="line">tcp        0      0 127.0.0.1:35690         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33959         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:34872         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33673         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:35497         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:35692         127.0.0.1:8088          
TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:32972         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:34505         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33402         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:36374         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33676         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33601         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33902         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:33624         127.0.0.1:8088          TIME_WAIT   -                   </span><br><span class="line">tcp        0      0 127.0.0.1:36397         127.0.0.1:8088          TIME_WAIT   -   </span><br><span class="line">........</span><br><span class="line"></span><br><span class="line">root@d76a1f6b4069:/go# netstat -anlp | grep TIME_WAIT | wc -l</span><br><span class="line">3768</span><br><span class="line"></span><br><span class="line">root@d76a1f6b4069:/go# netstat -anlp | grep ESTABLISHED | wc -l</span><br><span class="line">1666</span><br></pre></td></tr></table></figure><p>The filtered netstat output shows a large number of TIME_WAIT connections towards port 8088, along with many ESTABLISHED connections between backends and transponder_v1, which matches the ss output.</p><h3 id="分析现象"><a href="#分析现象" class="headerlink" title="分析现象"></a>Analyzing the symptom</h3><p>The steps above reproduce the TIME_WAIT symptom. Each connection in TIME_WAIT keeps its local port occupied, so when there are too many of them the local port range can be exhausted, which prevents new connections from being established. A large number of TIME_WAIT connections also adds bookkeeping work for the kernel and affects overall system performance.</p><p>The demo is very simple, and the error returned to curl is <code>connect: cannot assign requested address</code>. After consulting references and reading the source code, the problem turns out to lie in the transponder_v1 program.</p><p>Look at this line of code:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Use the default http.Client directly</span></span><br><span class="line">client := &amp;http.Client&#123;&#125;</span><br></pre></td></tr></table></figure><p>Jumping to the definition in the IDE shows the Client struct and its transport method:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Client <span class="keyword">struct</span> &#123;</span><br><span class="line"><span class="comment">// Transport specifies the mechanism by which individual</span></span><br><span class="line"><span class="comment">// HTTP requests are made.</span></span><br><span class="line"><span class="comment">// If nil, DefaultTransport is used.</span></span><br><span class="line">Transport RoundTripper</span><br><span class="line">    ......</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Client)</span></span> transport() RoundTripper &#123;</span><br><span class="line"><span class="keyword">if</span> c.Transport != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> c.Transport</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> 
DefaultTransport</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>The comment states that Transport specifies the mechanism by which individual HTTP requests are made, and that DefaultTransport is used if it is nil. Reading further:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// DefaultTransport is the default implementation of Transport and is</span></span><br><span class="line"><span class="comment">// used by DefaultClient. It establishes network connections as needed</span></span><br><span class="line"><span class="comment">// and caches them for reuse by subsequent calls. 
It uses HTTP proxies</span></span><br><span class="line"><span class="comment">// as directed by the environment variables HTTP_PROXY, HTTPS_PROXY</span></span><br><span class="line"><span class="comment">// and NO_PROXY (or the lowercase versions thereof).</span></span><br><span class="line"><span class="keyword">var</span> DefaultTransport RoundTripper = &amp;Transport&#123;</span><br><span class="line">Proxy: ProxyFromEnvironment,</span><br><span class="line">DialContext: defaultTransportDialContext(&amp;net.Dialer&#123;</span><br><span class="line">Timeout:   <span class="number">30</span> * time.Second,</span><br><span class="line">KeepAlive: <span class="number">30</span> * time.Second,</span><br><span class="line">&#125;),</span><br><span class="line">ForceAttemptHTTP2:     <span class="literal">true</span>,</span><br><span class="line">MaxIdleConns:          <span class="number">100</span>,</span><br><span class="line">IdleConnTimeout:       <span class="number">90</span> * time.Second,</span><br><span class="line">TLSHandshakeTimeout:   <span class="number">10</span> * time.Second,</span><br><span class="line">ExpectContinueTimeout: <span class="number">1</span> * time.Second,</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// DefaultMaxIdleConnsPerHost is the default value of Transport&#x27;s</span></span><br><span class="line"><span class="comment">// MaxIdleConnsPerHost.</span></span><br><span class="line"><span class="keyword">const</span> DefaultMaxIdleConnsPerHost = <span class="number">2</span></span><br></pre></td></tr></table></figure><p>DefaultTransport is the default implementation of Transport and is used by DefaultClient. It establishes network connections as needed and caches them for reuse by subsequent calls. Two settings matter here, though: <code>MaxIdleConnsPerHost</code> and <code>MaxIdleConns</code>.</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Transport <span class="keyword">struct</span> &#123;</span><br><span class="line">    ......</span><br><span class="line"><span class="comment">// MaxIdleConns controls the maximum number of idle (keep-alive)</span></span><br><span class="line"><span class="comment">// connections across all hosts. Zero means no limit.</span></span><br><span class="line">MaxIdleConns <span class="type">int</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// MaxIdleConnsPerHost, if non-zero, controls the maximum idle</span></span><br><span class="line"><span class="comment">// (keep-alive) connections to keep per-host. If zero,</span></span><br><span class="line"><span class="comment">// DefaultMaxIdleConnsPerHost is used.</span></span><br><span class="line">MaxIdleConnsPerHost <span class="type">int</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// MaxConnsPerHost optionally limits the total number of</span></span><br><span class="line"><span class="comment">// connections per host, including connections in the dialing,</span></span><br><span class="line"><span class="comment">// active, and idle states. 
On limit violation, dials will block.</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// Zero means no limit.</span></span><br><span class="line">MaxConnsPerHost <span class="type">int</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// IdleConnTimeout is the maximum amount of time an idle</span></span><br><span class="line"><span class="comment">// (keep-alive) connection will remain idle before closing</span></span><br><span class="line"><span class="comment">// itself.</span></span><br><span class="line"><span class="comment">// Zero means no limit.</span></span><br><span class="line">IdleConnTimeout time.Duration</span><br><span class="line">    ......</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>By default MaxIdleConns caps the idle connection pool at 100 connections across all hosts, while MaxIdleConnsPerHost defaults to 2. As a result only 2 idle connections to the backend are kept, and the client actively closes the rest, which leaves them in TIME_WAIT on the client side. Under the wrk load this produces a large number of TIME_WAIT connections, which eventually uses up all available local ports so that no new connection can be opened, and the request fails with <code>connect: cannot assign requested address</code>.</p><p>So <code>MaxIdleConnsPerHost</code> and <code>MaxIdleConns</code> should be tuned step by step for the actual workload, based on performance testing, known rules of thumb, and production monitoring.</p><h3 id="程序优化"><a href="#程序优化" class="headerlink" title="程序优化"></a>Optimizing the program</h3><p>With the cause known, and after some research, the Transport can be configured as follows:</p><figure class="highlight golang"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span 
class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span 
class="line">83</span><br><span class="line">84</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line"><span class="string">&quot;io&quot;</span></span><br><span class="line"><span class="string">&quot;net/http&quot;</span></span><br><span class="line"><span class="string">&quot;encoding/json&quot;</span></span><br><span class="line"><span class="string">&quot;github.com/gin-gonic/gin&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> ping <span class="keyword">struct</span> &#123;</span><br><span class="line">Message <span class="type">string</span> <span class="string">`json:&quot;message&quot;`</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">gin.SetMode(gin.ReleaseMode)</span><br><span class="line">gin.DefaultWriter = io.Discard</span><br><span class="line">r := gin.Default()</span><br><span class="line"></span><br><span class="line">url := <span class="string">&quot;http://127.0.0.1:8088/ping&quot;</span></span><br><span class="line"></span><br><span class="line">defaultRoundTripper := http.DefaultTransport</span><br><span class="line">defaultTransportPointer, ok := defaultRoundTripper.(*http.Transport)</span><br><span class="line"><span class="keyword">if</span> !ok &#123;</span><br><span class="line">fmt.Println(<span class="string">&quot;defaultRoundTripper not an *http.Transport&quot;</span>)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span 
class="line">defaultTransport := defaultTransportPointer.Clone()</span><br><span class="line">defaultTransport.MaxIdleConns = <span class="number">1000</span></span><br><span class="line">defaultTransport.MaxIdleConnsPerHost = <span class="number">1000</span></span><br><span class="line">client := &amp;http.Client&#123;Transport: defaultTransport&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Use the default http.Client directly</span></span><br><span class="line"><span class="comment">// client := &amp;http.Client&#123;&#125;</span></span><br><span class="line"></span><br><span class="line">r.GET(<span class="string">&quot;/tping&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">(c *gin.Context)</span></span> &#123;</span><br><span class="line">req, err := http.NewRequest(<span class="string">&quot;GET&quot;</span>, url, <span class="literal">nil</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line"><span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;http.NewRequest failed: %v&quot;</span>, err),</span><br><span class="line">&#125;)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">resp, err := client.Do(req)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line"><span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;get failed: %v&quot;</span>, err),</span><br><span class="line">&#125;)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span 
class="line"><span class="keyword">defer</span> resp.Body.Close()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> resp.StatusCode != http.StatusOK &#123;</span><br><span class="line">c.JSON(resp.StatusCode, gin.H&#123;</span><br><span class="line"><span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;request failed with status code: %d&quot;</span>, resp.StatusCode),</span><br><span class="line">&#125;)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">body, err := io.ReadAll(resp.Body)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line"><span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;read body failed: %v&quot;</span>, err),</span><br><span class="line">&#125;)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> tping ping</span><br><span class="line">err = json.Unmarshal(body, &amp;tping)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">c.JSON(http.StatusInternalServerError, gin.H&#123;</span><br><span class="line"><span class="string">&quot;error&quot;</span>: fmt.Sprintf(<span class="string">&quot;unmarshal failed: %v&quot;</span>, err),</span><br><span class="line">&#125;)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">c.JSON(<span class="number">200</span>, gin.H&#123;</span><br><span class="line"><span class="string">&quot;data&quot;</span>: 
tping.Message,</span><br><span class="line">&#125;)</span><br><span class="line">&#125;)</span><br><span class="line"></span><br><span class="line">r.Run(<span class="string">&quot;:8080&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>Build it:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">go build -o transponder_v2</span></span><br></pre></td></tr></table></figure><p>Stop transponder_v1, run transponder_v2, and repeat the wrk load test while watching curl and ss during the run. The TIME_WAIT build-up no longer appears.<br>1) wrk request:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/opt# ./wrk -t 8 -c 600 -d 180s http://127.0.0.1:8080/tping</span><br><span class="line">Running 3m test @ http://127.0.0.1:8080/tping</span><br><span class="line">  8 threads and 600 connections</span><br><span class="line">  Thread Stats   Avg      Stdev     Max   +/- Stdev</span><br><span class="line">    Latency   195.39ms  239.25ms   1.84s    90.57%</span><br><span class="line">    Req/Sec   606.10    214.25     1.30k    72.97%</span><br><span class="line">  531584 requests in 3.00m, 69.96MB read</span><br><span class="line">  Socket errors: connect 0, read 0, write 0, timeout 600</span><br><span class="line">Requests/sec:   2951.69</span><br><span class="line">Transfer/sec:    397.79KB</span><br></pre></td></tr></table></figure><p>Compared with the earlier wrk run, the optimized run no longer reports any <code>Non-2xx or 3xx responses</code>, and the number of timeouts drops markedly (from 21152 to 600), while throughput rises from 533.16 to 2951.69 requests per second.</p><p>2) Repeated curl requests all return a normal response.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/opt# curl http://127.0.0.1:8080/tping</span><br><span class="line">&#123;&quot;data&quot;:&quot;pong&quot;&#125;</span><br></pre></td></tr></table></figure><p>3) In the ss output the large number of TIME_WAIT connections is gone as well.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">root@d76a1f6b4069:/go# ss -s</span><br><span class="line">Total: 3044</span><br><span class="line">TCP:   2437 (estab 2400, closed 35, orphaned 0, timewait 9)</span><br><span class="line"></span><br><span class="line">Transport Total     IP        IPv6</span><br><span class="line">RAW       0         0         0        </span><br><span class="line">UDP       0         0         0        </span><br><span class="line">TCP       2402      1200      1202     </span><br><span class="line">INET      2402      1200      1202     </span><br><span class="line">FRAG      0         0         0        </span><br></pre></td></tr></table></figure><p>That completes the reproduction of the fault, and a fix has been found.</p><p>To sum up: netstat and ss are very useful tools when troubleshooting; observing the network state with them makes it possible to pin down the concrete problem quickly. During development, also think through the call relationships between services and do proper self-testing and load testing.</p><h3 id="参考"><a href="#参考" class="headerlink" title="参考"></a>References</h3><p>1. <a href="https://studygolang.com/articles/28263">https://studygolang.com/articles/28263</a><br>2. <a href="https://pkg.go.dev/net/http#Transport">https://pkg.go.dev/net/http#Transport</a>  </p>]]></content>
    
    
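    <summary type="html">&lt;p&gt;The first post of 2025. Happy New Year, everyone (#^.^#)&lt;/p&gt;
&lt;p&gt;This post records a TIME_WAIT problem in Golang caused by a misconfigured Transport on the net&amp;#x2F;http Client under a large number of concurrent requests, and reproduces it with a demo program.&lt;/p&gt;</summary>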
    
    
    
    <category term="Skill" scheme="https://www.applenice.net/categories/Skill/"/>
    
    
    <category term="Linux" scheme="https://www.applenice.net/tags/Linux/"/>
    
    <category term="Golang" scheme="https://www.applenice.net/tags/Golang/"/>
    
  </entry>
  
</feed>
