Server fails to start after CentOS 7 update
The Problem
HTTPD failed after yum updated CentOS 7.
SSH also failed.
There were over 400 updates or installations in over 700 steps during the update. A few that caught my eye just now in reviewing the log:
Updated: centos-release-7-4.1708.el7.centos.x86_64
Updated: firewalld-filesystem-0.4.4.4-6.el7.noarch
Updated: iptables-1.4.21-18.0.1.el7.centos.x86_64
Updated: 1:NetworkManager-libnm-1.8.0-9.el7.x86_64
Updated: device-mapper-persistent-data-0.7.0-0.1.rc6.el7.x86_64
Updated: initscripts-9.49.39-1.el7.x86_64
Updated: cronie-anacron-1.4.11-17.el7.x86_64
Installed: 1:NetworkManager-1.8.0-9.el7.x86_64
Installed: 1:NetworkManager-ppp-1.8.0-9.el7.x86_64
Updated: cloud-init-0.7.9-9.el7.centos.2.x86_64
Installed: 1:grub2-2.02-0.64.el7.centos.x86_64
Updated: 1:NetworkManager-tui-1.8.0-9.el7.x86_64
Updated: 1:NetworkManager-team-1.8.0-9.el7.x86_64
Installed: kernel-3.10.0-693.2.2.el7.x86_64
Fortunately, I was able to access the server through an emergency console via the Rackspace cloud server interface. The emergency console never let me down all through this process. I'm so grateful for that small mercy!
The boot.log showed the heart of the matter:
- Starting LSB: Bring up/down networking
- Failed to start LSB: Bring up/down networking
It said to run systemctl status network.service for more details. The screen capture below shows the output from that command.
network.service failed to bring up networking because /etc/sysconfig/network-scripts/ifcfg-eth0 was changed from "BOOTPROTO=static" to "BOOTPROTO=dhcp". The crucial piece of information in the screen capture is that "Determining IP information for eth0... failed." "RTNETLINK answers: File exists" is just another symptom from the loss of the static definition of eth0.
Sometime during the update, the configuration for eth0 was changed from static to dhcp. Below is an old copy of ifcfg-eth0 from a server that I crashed in May of 2016. It tells eth0 that my IP address is "166.78.150.236". There is no way dhcp could figure that out by itself. The IP addresses assigned to my server, the gateway, and the DNS servers are precious bits of information that must be fed into the system on boot.
# Automatically generated, do not edit
# Label public
DEVICE=eth0
BOOTPROTO=static
HWADDR=bc:76:4e:05:75:f4
IPADDR=166.78.150.236
NETMASK=255.255.255.0
DEFROUTE=yes
GATEWAY=166.78.150.1
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6ADDR=2001:4800:7812:0514:7cbc:4d9b:ff05:75f4/64
IPV6_DEFAULTGW=fe80::def%eth0
DNS1=72.3.128.241
DNS2=72.3.128.240
ONBOOT=yes
NM_CONTROLLED=no
I didn't save a copy of the mangled file. The crucial change was BOOTPROTO=dhcp. No IPADDR or GATEWAY or DNS servers were configured in the system-generated file, either.
SSH and HTTPD failed because eth0 came up without an IP address, gateway, or DNS servers. My server was running OK in all other respects, but it was cut off from the internet.
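Since the mangled file was never saved, here is a minimal sketch of two helpers that could be run from the emergency console before touching anything: one prints a setting such as BOOTPROTO, the other keeps a timestamped copy. The function names and the default backup directory /root/ifcfg-backups are my own assumptions, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting and preserving an ifcfg file.

# Print the value assigned to KEY in FILE (the last assignment wins,
# surrounding double quotes are stripped).
ifcfg_get() {
    key="$1"
    file="$2"
    sed -n "s/^${key}=//p" "$file" | tail -n 1 | tr -d '"'
}

# Copy FILE into a backup directory with a timestamp suffix and print
# the path of the copy. The directory is an assumption; it sits outside
# network-scripts so the copy cannot be mistaken for a live config.
backup_ifcfg() {
    file="$1"
    dest="${2:-/root/ifcfg-backups}"
    mkdir -p "$dest" || return 1
    stamp="$(date +%Y%m%d-%H%M%S)"
    copy="$dest/$(basename "$file").$stamp"
    cp -p "$file" "$copy" || return 1
    echo "$copy"
}

# Example use from the emergency console:
#   ifcfg_get BOOTPROTO /etc/sysconfig/network-scripts/ifcfg-eth0
#   backup_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0
```

If the first call prints dhcp on a server that is supposed to have a fixed address, that matches the failure described above.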
The Solutions
Choose between network.service and NetworkManager.service
My system was trying to use network.service. Many advice pages recommend not trying to run both network.service and NetworkManager. See this one, for example.
Stopping NetworkManager and disabling it did not solve my problem as it did for that fellow.
I eventually decided to disable network.service and do my best to learn how NetworkManager works. I thought it was the direction that Red Hat is taking. I'm not so sure anymore.
use nmtui to examine eth1 and configure eth0
- "Network configuration using sysconfig files."
- nmcli connection reload
- nmcli dev disconnect interface-name
- nmcli con up interface-name
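The three nmcli steps above can be wrapped in one small function. This is only a sketch: the function name and the NMCLI override variable are hypothetical, and I use the `con up ifname` form so that the interface name works even when the connection itself is named something else, such as "System eth0".

```shell
#!/bin/sh
# Hypothetical wrapper around the reload / disconnect / up sequence.
# NMCLI can be overridden (for example NMCLI="echo nmcli") for a dry run.
NMCLI="${NMCLI:-nmcli}"

reapply_connection() {
    iface="$1"
    # Re-read the ifcfg files from disk.
    $NMCLI connection reload || return 1
    # Drop the current connection; ignore failure if it was already down.
    $NMCLI dev disconnect "$iface" || true
    # Activate the best available connection for this interface.
    $NMCLI con up ifname "$iface"
}

# Example: reapply_connection eth0
```

Run this from the console rather than over SSH, since the disconnect step takes the interface down.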
configure /etc/sysconfig/network-scripts/ifcfg-eth0 to be static
After a lot of tweaking and testing and rebooting, this is what I finally came up with for /etc/sysconfig/network-scripts/ifcfg-eth0:
#Label public
DEVICE=eth0
BOOTPROTO=static
HWADDR=BC:76:4E:05:85:D3
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
IPADDR=162.242.144.228
NETMASK=255.255.255.0
PREFIX=24
GATEWAY=162.242.144.1
DNS1=72.3.128.241
DNS2=72.3.128.240
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV4_ROUTE_METRIC=0
IPV4_DNS_PRIORITY=100
IPV6INIT=yes
IPV6_AUTOCONF=no
IPV6_ADDR=2001:4800:7817:101:be76:4eff:fe05:85d3/64
IPV6_DEFAULTGW=fe80::def%eth0
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
# ....
IPV6_ADDR_GEN_MODE=stable-privacy
IPV6_DNS_PRIORITY=100
NAME="System eth0"
ONBOOT=yes
UUID=5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03
NM_CONTROLLED=yes
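With all the tweaking and rebooting, a quick sanity check before each reboot might have saved some time. The sketch below reports the settings whose absence caused the original outage. The function name and the choice of required keys (IPADDR, GATEWAY, DNS1) are my assumptions, taken from the file above; it also accepts BOOTPROTO=none, which as far as I know the network scripts treat the same as static.

```shell
#!/bin/sh
# Hypothetical pre-reboot check for a static ifcfg file.
# Prints one line per problem and returns non-zero if any were found.
check_static_ifcfg() {
    file="$1"
    status=0
    if ! grep -Eq '^BOOTPROTO="?(static|none)"?$' "$file"; then
        echo "BOOTPROTO is not static"
        status=1
    fi
    # Each of these keys must be present with a non-empty value.
    for key in IPADDR GATEWAY DNS1; do
        if ! grep -q "^${key}=." "$file"; then
            echo "missing ${key}"
            status=1
        fi
    done
    return "$status"
}

# Example: check_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0
```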
stop cloud-init from rewriting ifcfg-eth0
No matter how many times I edited ifcfg-eth0 to salt it with the IP, gateway, and DNS server addresses, every time I rebooted, the BOOTPROTO=dhcp file came back, along with a line at the top that said "Automatically generated, do not edit." I blame cloud-init for the rewriting, but I never could figure out where it was getting the bad configuration from. Similarly, /etc/resolv.conf was also being rewritten--quite stupidly, in fact, with the warning line being added multiple times.
To stop cloud-init from mangling eth0, I created /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with this single line in it:
network: {config: disabled}
That stopped the system from trashing my static configuration. Unfortunately, with network.service also stopped, it stopped my system from activating eth0 and eth1.
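For anyone who wants to script that step, a minimal sketch follows. The function name is hypothetical, and the optional directory argument exists only so the function can be tried against a scratch directory first; the file name and its single line are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: write the drop-in that tells cloud-init to leave
# the network configuration alone. Needs root for the default directory.
disable_cloud_init_network() {
    dir="${1:-/etc/cloud/cloud.cfg.d}"
    printf 'network: {config: disabled}\n' > "$dir/99-disable-network-config.cfg"
}

# Example: disable_cloud_init_network
```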
let NetworkManager manage the connections
In /etc/sysconfig/network-scripts/ifcfg-eth0 and /etc/sysconfig/network-scripts/ifcfg-eth1, I set "NM_CONTROLLED=yes". This pretty much ran contrary to all the advice I saw, but with network.service and cloud-init disabled, that is what it took to wake up the connections.
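I made that change by hand in an editor, but the same edit can be sketched as a small function that replaces any existing assignment of a key and leaves the rest of the file alone. The function name is hypothetical. It writes back through the original file rather than moving a new file into place, on the assumption that this keeps the file's existing permissions and ownership.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in an ifcfg-style file, removing
# any earlier assignment of the same key.
set_ifcfg_key() {
    file="$1"
    key="$2"
    value="$3"
    tmp="$(mktemp)" || return 1
    # Keep every line except existing assignments of KEY.
    grep -v "^${key}=" "$file" > "$tmp"
    # Append the new assignment.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Overwrite the original in place, then clean up.
    cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example:
#   set_ifcfg_key /etc/sysconfig/network-scripts/ifcfg-eth0 NM_CONTROLLED yes
#   set_ifcfg_key /etc/sysconfig/network-scripts/ifcfg-eth1 NM_CONTROLLED yes
```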
start httpd after NetworkManager configures network
The final problem was that the Apache daemon was called before the network was configured and online. This may not be the smartest or most elegant solution, but it's working at present for me. I created /etc/NetworkManager/dispatcher.d/22-httpd as shown below. The system calls it when NetworkManager thinks it is open for business.
#!/bin/sh
# This is a NetworkManager dispatcher script to turn http on
# when the network is ready. MXM
# /etc/NetworkManager/dispatcher.d/22-httpd
# https://wiki.archlinux.org/index.php/NetworkManager#Network_services_with_NetworkManager_dispatcher

if [ "$2" = "up" ]; then
    systemctl start httpd
fi

if [ "$2" = "down" ]; then
    systemctl stop httpd
fi

exit 0
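One detail worth noting: as I understand it, NetworkManager only runs dispatcher scripts that are executable, owned by root, and not writable by group or others. A minimal sketch of installing the script with suitable permissions follows. The function name is hypothetical, and the optional directory argument is only there so it can be tried in a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: copy a dispatcher script into place with mode 0755.
# When run as root, the installed copy is owned by root.
install_dispatcher() {
    src="$1"
    dir="${2:-/etc/NetworkManager/dispatcher.d}"
    install -m 0755 "$src" "$dir/$(basename "$src")"
}

# Example (as root): install_dispatcher ./22-httpd
```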