Linux is a powerful and popular operating system used widely in personal and professional settings. One of its standout features is its strong networking capabilities, making it an excellent choice for various networking tasks.
Over the years, the Linux network stack has evolved significantly, supporting not only basic functions but also advanced features such as network namespaces, which allow the creation of separate, isolated network instances. This makes Linux well suited as the operating system for network devices.
Initially, networking may seem simple since its main role is to provide connectivity. However, ensuring that data packets reach the correct destination on the internet or exposing a server application to external traffic can become complex.
When networking doesn't work as expected, troubleshooting can be challenging. It requires a good understanding of networking concepts and configuration options. This is true for Linux as well—knowing the network stack and networking tools is essential for identifying and solving problems.
This blog post covers the basics of Linux network troubleshooting. It starts with a list of commonly used tools and then describes some example issues and ways to find and resolve them.
Here are some tools that are helpful for troubleshooting networking problems in Linux. Most Linux distributions support these tools natively, meaning they come pre-installed or can be easily installed as packages. You can use them to configure, modify, or check network settings and statuses. This list isn't exhaustive; many other tools exist and new ones are constantly being developed. The focus here is on the tools most commonly used for diagnosing and resolving issues, summarized in the table below.
Tool | Description |
---|---|
ip | As part of the iproute2 package, this tool lets you control and monitor various aspects of networking in the Linux kernel, such as routing, network interfaces, and tunnels. It replaces older tools like ifconfig, arp, and route, offering a single command with a wide range of options. |
iptables/nft | iptables is a command line tool for setting up IP packet filter rules in the Linux kernel firewall. It covers different protocols with specific utilities: iptables for IPv4, ip6tables for IPv6, arptables for ARP, and ebtables for Ethernet frames (Layer 2 of the OSI model). nftables is the successor to iptables, replacing {ip,ip6,arp,eb}_tables, and nft is the command line utility to configure nftables. |
ping | The ping command checks if a host can be reached on an IP network using the ICMP protocol. It also measures how long it takes for packets to travel to the destination and back. |
traceroute/tracepath/mtr | The traceroute command checks and displays the path that packets take from the source to the destination on an IP network, as well as the delay for each hop. The tracepath command provides similar information but with fewer advanced features and options. The mtr command combines the functionality of traceroute and ping, checking the path, round-trip time, and packet loss to each router along the way. It can display the output in both the console and a graphical user interface (GUI). |
tcpdump/wireshark | tcpdump is a command line tool for sniffing and analyzing network packets. It can capture packets from the network connected to your computer or server and supports advanced filters to display only specific packets. Wireshark is another network protocol analyzer, but with a graphical user interface (GUI). It has powerful display filters and supports many protocols, making it ideal for detailed packet analysis. |
nmap | nmap (Network MAPper) is a tool used for exploring networks and checking security. It works by sending packets and analyzing the responses. With nmap, you can find out which devices are on the network, their operating systems, and the services they offer. |
lsof | lsof (LiSt Open Files) is a command line tool that shows all open files and the processes that opened them. It also lets you see current network connections and the related files. For example, the lsof -i command lists all open files linked to Internet connections. |
ethtool | A command line tool that lets you view and change the settings of network interfaces and their drivers. It helps you check and control network drivers and hardware settings for network devices. |
dig | dig (Domain Information Groper) is a tool used to look up DNS (Domain Name System) records, like host addresses and mail exchange addresses, for a given domain name. |
ss | ss (Socket Statistics) is a tool in the iproute2 package, replacing the older netstat tool. It reports statistics for sockets (like TCP, UNIX, and UDP). It also has filters so you can specify which information to gather and display. |
socat | Socat (SOcket CAT) is a versatile relay tool. It acts as a proxy for two-way data transfer between two separate data streams (channels). It supports various types of streams, like files, pipes, devices, and sockets. For example, you can forward a TCP port or redirect standard input to a remote host using TCP or UDP on the chosen remote port. |
sysctl | sysctl is a tool for changing kernel parameters while the system is running. You can find the available parameters in the /proc/sys/ directory or display them with the sysctl -a command. Network-related settings are in the /proc/sys/net directory. |
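For example, a quick way to see how sysctl is used is checking and changing the IP forwarding setting (a standard kernel parameter; the values shown below are illustrative). Changes made with sysctl -w are lost on reboot unless they are also written to a file under /etc/sysctl.d/:
[user@term:~]$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 0
[user@term:~]$ sudo sysctl -w net.ipv4.ip_forward=1
net.ipv4.ip_forward = 1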
This section provides examples of networking issues and troubleshooting steps to diagnose, correct, and verify results.
Use case 1. Description: A virtual machine (VM) is running on a Linux-based server and requires Internet access.
There are two machines:
Server: Recognized by the terminal prompt [user@term:~]$
VM: Has a network interface with an IP address 192.168.122.44 and is recognized by the terminal prompt ubuntu@vm:~$
First, log into the VM using SSH and check the basic network configuration to see the assigned IP address:
[user@term:~]$ ssh ubuntu@192.168.122.44
ubuntu@vm:~$
ubuntu@vm:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:36:cd:8c brd ff:ff:ff:ff:ff:ff
altname enp0s3
inet 192.168.122.44/24 metric 100 brd 192.168.122.255 scope global dynamic ens3
valid_lft 3284sec preferred_lft 3284sec
inet6 fe80::5054:ff:fe36:cd8c/64 scope link
valid_lft forever preferred_lft forever
SSH access is working fine, and the VM has an assigned IP address. The next step is to check if it can connect to external networks:
ubuntu@vm:~$ ping 1.1.1.1
ping: connect: Network is unreachable
Pinging an external address (1.1.1.1 in this case) doesn't work, and you see a "Network is unreachable" message. This indicates a routing issue, meaning there is no route entry to the destination IP. It can also happen if the interfaces haven't been assigned IP addresses. Another scenario is a firewall rejecting the packets with an ICMP "administratively prohibited" message, which the sender may report as a similar unreachable error. For our case, let's start by checking the routing entries:
ubuntu@vm:~$ ip route
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.44 metric 100
192.168.122.1 dev ens3 proto dhcp scope link src 192.168.122.44 metric 100
There is only a route for the directly connected prefix (192.168.122.0/24) and no default route. As a result, the network stack can't send packets to IP addresses from other prefixes. In this setup, the VM uses libvirt's default network, which provides DHCP and DNS services for connected VMs and assigns IP addresses from 192.168.122.2 to 192.168.122.254 by default. On the host system, there is a virbr0 bridge that typically acts as the default gateway with an IP of 192.168.122.1. A ping from the VM shows it can reach the IP address of the host's virbr0 bridge. Let's add the default route via that bridge.
ubuntu@vm:~$ sudo ip route add default via 192.168.122.1
ubuntu@vm:~$ ip route
default via 192.168.122.1 dev ens3
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.44 metric 100
192.168.122.1 dev ens3 proto dhcp scope link src 192.168.122.44 metric 100
ubuntu@vm:~$
ubuntu@vm:~$ ping -c3 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=63 time=3.16 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=63 time=2.04 ms
64 bytes from 1.1.1.1: icmp_seq=3 ttl=63 time=4.23 ms
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 2.041/3.142/4.225/0.891 ms
We can now successfully ping the IP address 1.1.1.1, confirming that external connectivity is working.
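A quick way to double-check the routing decision is ip route get, which prints the route the kernel would select for a given destination (output is illustrative). Note that a route added with ip route add is not persistent; to survive a reboot it should be configured through the distribution's network configuration (for example netplan on Ubuntu).
ubuntu@vm:~$ ip route get 1.1.1.1
1.1.1.1 via 192.168.122.1 dev ens3 src 192.168.122.44 uid 1000
    cache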
Use case 2. Description: In this setup, like in use case 1, a Linux-based virtual machine (VM) is running on a Linux-based server. We want to use DNS to access external hosts from the VM.
There are two machines:
Server: Recognized by the terminal prompt [user@term:~]$
VM: Has a network interface with an IP address 192.168.122.44, recognized by the terminal prompt ubuntu@vm:~$
In the first use case, we specified the destination using an IP address (for ping). Now, we want to use a fully qualified domain name (FQDN), which relies on DNS resolution:
ubuntu@vm:~$ ping google.com
ping: google.com: Temporary failure in name resolution
DNS name resolution isn't working, and we get a "Temporary failure in name resolution" message. To fix this, we need to check the DNS settings on our VM. In most Linux systems, these settings are defined in the /etc/resolv.conf file:
ubuntu@vm:~$ cat /etc/resolv.conf
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.
nameserver 127.0.0.53
options edns0 trust-ad
search .
Looking at the content of /etc/resolv.conf (with the first line saying: “This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8)”), we see that the systemd-resolved service handles domain name resolution. To check which DNS servers are configured on our machine, we can use the resolvectl tool.
ubuntu@vm:~$ resolvectl status
Global
Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Link 2 (ens3)
Current Scopes: DNS
Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.122.1
DNS Servers: 192.168.122.1
The VM runs on a Linux server managed by the libvirt toolkit, which provides a default DNS server (IP: 192.168.122.1) for the VM. However, there's an issue with DNS resolution. Let's check if the configured DNS server is responding to queries using the dig tool with the +short option for a concise answer:
ubuntu@vm:~$ dig +short google.com @192.168.122.1
;; communications error to 192.168.122.1#53: connection refused
ubuntu@vm:~$ dig +short google.com @1.1.1.1
216.58.208.206
The configured DNS server isn't available, but using an external DNS server like 1.1.1.1 (Cloudflare's public DNS resolver) works. It seems there's an issue with the DNS resolver service for our VM (it might be stopped). If we can't fix it on the server, we can still configure DNS on our VM.
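Before applying a VM-side workaround, it is worth a quick look at the server itself: with libvirt's default network, DNS and DHCP for the VMs are normally served by a dnsmasq process bound to 192.168.122.1. A minimal server-side check (commands only, output omitted) could look like this:
[user@term:~]$ sudo virsh net-list --all          # is the libvirt "default" network active?
[user@term:~]$ ps -ef | grep [d]nsmasq            # is its dnsmasq instance running?
[user@term:~]$ sudo ss -nlup | grep ':53 '        # is anything listening on port 53?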
As we saw earlier, the systemd-resolved service handles domain name resolution. We can modify its configuration file /etc/systemd/resolved.conf and add the DNS servers in its [Resolve] section with the following line:
DNS=1.1.1.1 8.8.8.8
And later restart the service:
ubuntu@vm:~$ sudo systemctl restart systemd-resolved.service
ubuntu@vm:~$ resolvectl status
Global
Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
DNS Servers: 1.1.1.1 8.8.8.8
Link 2 (ens3)
Current Scopes: none
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Running the second command (resolvectl status) confirms that the new DNS server settings have been successfully applied.
ubuntu@vm:~$ ping -c3 google.com
PING google.com (142.250.203.142) 56(84) bytes of data.
64 bytes from waw07s06-in-f14.1e100.net (142.250.203.142): icmp_seq=1 ttl=119 time=5.38 ms
64 bytes from waw07s06-in-f14.1e100.net (142.250.203.142): icmp_seq=2 ttl=119 time=7.19 ms
64 bytes from waw07s06-in-f14.1e100.net (142.250.203.142): icmp_seq=3 ttl=119 time=6.83 ms
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 5.379/6.466/7.186/0.782 ms
Now the DNS resolution on our VM is working, and we can use domain names instead of IP addresses.
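As an extra check that does not rely on ICMP, resolvectl can resolve a name directly and report which DNS server and protocol were used (output omitted):
ubuntu@vm:~$ resolvectl query google.com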
Use case 3. Description: A Redis database is running inside a virtual machine (VM). You want to access this database from the host server (the machine running the VM) using the VM's IP address and the Redis application port. The tool used for access is redis-cli.
There are two machines:
Server: Recognized by the terminal prompt [user@term:~]$
VM: Has a network interface with an IP address 192.168.122.44, recognized by the terminal prompt ubuntu@vm:~$
From the server, we try to connect to the Redis application running on the VM. The VM has an IP address of 192.168.122.44, and Redis is running on the default port 6379.
[user@term:~]$ redis-cli -h 192.168.122.44 -p 6379
Could not connect to Redis at 192.168.122.44:6379: Connection timed out
not connected>
A common reason for the “Connection timed out” error is a firewall blocking packets. Firewalls can run on the target host's operating system (like iptables) or be a device on the path from the source to the destination host. In this case, the connection is from the server host to a VM running on the same server (so there’s no intermediate device between them). First, check the firewall settings on the VM.
On Linux systems, firewalls are usually managed by iptables or nftables. The VM has Ubuntu installed, which uses ufw (Uncomplicated FireWall) as a higher-level management tool to simplify packet filtering (underneath, it configures iptables or nftables).
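If you want to see the filtering rules that ufw generates underneath, you can inspect them with the lower-level tools listed earlier; which view applies depends on whether the system uses the legacy iptables backend or nftables:
ubuntu@vm:~$ sudo iptables -L -n -v      # rules as seen through the iptables front end
ubuntu@vm:~$ sudo nft list ruleset       # rules as seen by nftables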
Let’s check if ufw is active and if there are any rules applied.
ubuntu@vm:~$ ssh ubuntu@192.168.122.44
Welcome to Ubuntu 22.10 (GNU/Linux 5.19.0-26-generic x86_64)
<truncated for brevity>
ubuntu@vm:~$ sudo ufw status numbered
Status: active
     To                         Action      From
     --                         ------      ----
[ 1] 22/tcp                     ALLOW IN    Anywhere
[ 2] 22/tcp (v6)                ALLOW IN    Anywhere (v6)
ufw is enabled, and there are rules for SSH, which is why we could log into the VM. By default, ufw blocks incoming packets, so if there's no specific rule allowing certain traffic, it gets dropped. That's why we're seeing connection errors. Let's add a rule to allow incoming connections on TCP port 6379:
ubuntu@vm:~$ sudo ufw allow to 192.168.122.44 proto tcp port 6379
Rule added
ubuntu@vm:~$ sudo ufw status numbered
Status: active
     To                         Action      From
     --                         ------      ----
[ 1] 22/tcp                     ALLOW IN    Anywhere
[ 2] 192.168.122.44 6379/tcp    ALLOW IN    Anywhere
[ 3] 22/tcp (v6)                ALLOW IN    Anywhere (v6)
Now go back to the terminal window where you are logged into the server and try to connect to the Redis application (database) again using redis-cli:
[user@term:~]$ redis-cli -h 192.168.122.44 -p 6379
Could not connect to Redis at 192.168.122.44:6379: Connection refused
not connected>
We still can't connect, but now we see a different error message: "Connection refused". This means either no process is listening on the target host's port, or a firewall is blocking traffic and rejecting packets. To check if the Redis application is running and listening on the VM, we can use the ss tool to list all listening TCP sockets (options: n for numeric, l for listening, t for TCP, p for showing the process using the socket):
ubuntu@vm:~$ sudo ss -nltp
State    Recv-Q   Send-Q   Local Address:Port     Peer Address:Port   Process
LISTEN   0        4096     127.0.0.54:53          0.0.0.0:*           users:(("systemd-resolve",pid=494,fd=16))
LISTEN   0        511      127.0.0.1:6379         0.0.0.0:*           users:(("redis-server",pid=12665,fd=6))
LISTEN   0        4096     127.0.0.53%lo:53       0.0.0.0:*           users:(("systemd-resolve",pid=494,fd=14))
LISTEN   0        4096     *:22                   *:*                 users:(("sshd",pid=783,fd=3),("systemd",pid=1,fd=134))
Indeed, there's a Redis process running and listening on TCP port 6379, but it's bound to the Linux loopback interface (lo) with the address 127.0.0.1. This address is only accessible for local connections from the host where the lo interface is configured.
As a result, you can't connect to the Redis application from other machines (like the server), but you can connect from the VM itself.
ubuntu@vm:~$ redis-cli
127.0.0.1:6379> hgetall *
(empty array)
127.0.0.1:6379>
We use the redis-cli command hgetall to confirm that we can interact with the database. hgetall returns all fields and values of the hash stored at the given key; since our database is currently empty, the result is an empty array.
Before changing the Redis configuration to listen on an externally accessible interface, let's use socat to relay the external connection to the loopback address and see if this setup allows external access to Redis.
ubuntu@vm:~$ socat TCP4-LISTEN:6379,bind=192.168.122.44 TCP:127.0.0.1:6379
The above command redirects TCP port 6379 on the IP address 192.168.122.44 to TCP port 6379 on the address 127.0.0.1, which Redis uses. This setup is temporary and allows only one connection (socat will exit when the connection ends). For multiple connections, you can use the fork and reuseaddr options of socat.
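For reference, a variant of the relay that keeps running and accepts multiple connections (using the fork and reuseaddr options mentioned above) would look like this:
ubuntu@vm:~$ socat TCP4-LISTEN:6379,bind=192.168.122.44,fork,reuseaddr TCP:127.0.0.1:6379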
In another terminal, after logging into the VM, let's check the listening TCP sockets:
ubuntu@vm:~$ sudo ss -nltp
State    Recv-Q   Send-Q   Local Address:Port     Peer Address:Port   Process
LISTEN   0        4096     127.0.0.54:53          0.0.0.0:*           users:(("systemd-resolve",pid=494,fd=16))
LISTEN   0        5        192.168.122.44:6379    0.0.0.0:*           users:(("socat",pid=17896,fd=5))
LISTEN   0        511      127.0.0.1:6379         0.0.0.0:*           users:(("redis-server",pid=12665,fd=6))
LISTEN   0        4096     127.0.0.53%lo:53       0.0.0.0:*           users:(("systemd-resolve",pid=494,fd=14))
LISTEN   0        4096     *:22                   *:*                 users:(("sshd",pid=783,fd=3),("systemd",pid=1,fd=134))
We can see that there is a socat process listening on the IP address 192.168.122.44 and port 6379. Now, we should be able to connect to the Redis application running on the VM. Let's return to the terminal window where we're logged into the server and try to connect.
[user@term:~]$ redis-cli -h 192.168.122.44 -p 6379
192.168.122.44:6379> hgetall *
(empty array)
192.168.122.44:6379>
It works! The redis-cli tool on the server successfully connects to Redis running on the VM. To make this permanent (so we don't have to use socat to relay connections), we need to change the Redis configuration to listen on the external interface. However, before doing that, we should secure Redis to accept only authorized connections (secure the database before exposing it). This step is not covered in this blog post.
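For reference only, and assuming the default configuration file /etc/redis/redis.conf used by Ubuntu's redis-server package, the permanent change would involve directives along these lines (a sketch, not a hardened setup), followed by a restart of the redis-server service:
# /etc/redis/redis.conf (sketch)
bind 127.0.0.1 192.168.122.44            # listen on loopback and on the VM's external address
requirepass <choose-a-strong-password>   # require authentication before exposing the port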
Use case 4. Description: We need to set up a connection between the server's interface ens3f1 and the switch port Ethernet36. Both the server and the switch have 25Gb interfaces and are connected with a DAC (Direct Attach Copper) cable. Both should be configured as Layer 3 devices to ensure IP connectivity between them.
There are two devices:
Server: Recognized by the terminal prompt [user@term:~]$
Switch: Running SONiC (open-source network operating system), recognized by the terminal prompt admin@sonic:~$
The server interface ens3f1 is connected to the switch interface Ethernet36. First, let's verify that the connection is recognized by both ends.
On the server:
[user@term:~]$ ip link show ens3f1
16: ens3f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:ed:6a:05 brd ff:ff:ff:ff:ff:ff
altname enp179s0f1
The interface is enabled, but its state is reported as down. Let's now check the status on the switch.
Although this is generally outside the scope of Linux network troubleshooting, connectivity issues often require verifying the configuration and logs on both the Linux server and the connected switch.
admin@sonic:~$ show interface status Ethernet36
  Interface    Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin            Type    Asym PFC
-----------  -------  -------  -----  -----  -------  ------  ------  -------  --------------  ----------
 Ethernet36       36      25G   9100     rs    etp10  routed    down       up  SFP/SFP+/SFP28         N/A
The switch also reports that the interface is down.
Now, let's go back to the terminal on the server and check the interface settings using ethtool:
[user@term:~]$ ethtool ens3f1
Settings for ens3f1:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
25000baseCR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: None BaseR RS
Advertised link modes: 10000baseT/Full
25000baseCR/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: None BaseR RS
Speed: Unknown!
Duplex: Unknown! (255)
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: no
The ens3f1 interface supports 10Gb and 25Gb link speeds and advertises different FEC (Forward Error Correction) modes: None, BaseR, and RS. The switch port Ethernet36 uses FEC with RS encoding, which is also supported by the server's network card. However, we need to check the FEC configuration, as a different mode might be set or it could be turned off. To do this, we use ethtool with the --show-fec option.
[user@term:~]$ ethtool --show-fec ens3f1
FEC parameters for ens3f1:
Configured FEC encodings: Off
Active FEC encoding: Off
FEC is turned off on the interface, and the switch requires FEC with RS encoding. This mismatch prevents a connection. Let's change the FEC configuration on the server's interface to "auto" so it can negotiate the mode with the switch:
[user@term:~]$ sudo ethtool --set-fec ens3f1 encoding auto
[user@term:~]$ ip link show ens3f1
16: ens3f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether ac:1f:6b:ed:6a:05 brd ff:ff:ff:ff:ff:ff
altname enp179s0f1
After applying the change, we see that the ens3f1 interface on the server is up. The switch also shows the interface as up.
admin@sonic:~$ show interfaces status Ethernet36
  Interface    Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin            Type    Asym PFC
-----------  -------  -------  -----  -----  -------  ------  ------  -------  --------------  ----------
 Ethernet36       36      25G   9100     rs    etp10  routed      up       up  SFP/SFP+/SFP28         N/A
By checking the ens3f1 driver’s logs on the server, we can see that the FEC mode has been successfully negotiated. We use the dmesg tool, which prints the kernel's message buffer, including messages from device drivers.
[user@term:~]$ dmesg | grep ens3f1
[16946973.105255] i40e 0000:b3:00.1 ens3f1: renamed from eth0
[17109033.632111] i40e 0000:b3:00.1 ens3f1: NIC Link is Down
[17109048.596726] i40e 0000:b3:00.1 ens3f1: NIC Link is Up, 25 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: CL108 RS-FEC, Autoneg: False, Flow Control: None
Now we can assign IP addresses and establish communication. On the switch:
admin@sonic:~$ sudo config interface ip add Ethernet36 192.168.100.1/24
and on the server:
[user@term:~]$ sudo ip address add 192.168.100.2/24 dev ens3f1
[user@term:~]$ ping -c 3 192.168.100.1
PING 192.168.100.1 (192.168.100.1) 56(84) bytes of data.
64 bytes from 192.168.100.1: icmp_seq=1 ttl=64 time=0.442 ms
64 bytes from 192.168.100.1: icmp_seq=2 ttl=64 time=0.157 ms
64 bytes from 192.168.100.1: icmp_seq=3 ttl=64 time=0.158 ms
--- 192.168.100.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2040ms
rtt min/avg/max/mdev = 0.157/0.252/0.442/0.134 ms
We now have IP connectivity between the server's and switch's interfaces.
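As a final server-side check, ethtool can confirm the FEC mode now in use (output is illustrative). Keep in mind that settings changed with ethtool are typically not persistent, so they need to be reapplied by the system's network configuration after a reboot.
[user@term:~]$ ethtool --show-fec ens3f1
FEC parameters for ens3f1:
Configured FEC encodings: Auto
Active FEC encoding: RS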
Use case 5. Description: We need to connect the server's interface eno2 to the switch's interface Ethernet0. The switch will be configured as a Layer 3 device to ensure IP connectivity between the server and the switch.
The server's network port is 10Gb, while the switch port is 25Gb. However, the switch port also supports 1Gb, 10Gb, and 25Gb speeds. A physical connection (cable) has been established between the server and switch ports.
There are two devices:
Server: Recognized by the terminal prompt [user@term:~]$
Switch: Running SONiC (open-source network operating system), recognized by the terminal prompt admin@sonic:~$
The server and switch ports are connected with a physical cable. Let's check the interface status on the server side.
[user@term:~]$ ip link show eno2
18: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether ec:f4:bb:da:4a:7a brd ff:ff:ff:ff:ff:ff
altname enp1s0f1
The interface is currently disabled, so the first step is to enable it:
[user@term:~]$ sudo ip link set up eno2
[user@term:~]$ ip link show eno2
18: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether ec:f4:bb:da:4a:7a brd ff:ff:ff:ff:ff:ff
altname enp1s0f1
After enabling the interface, it is still down. Next, let's see what ethtool reports.
[user@term:~]$ ethtool eno2
Settings for eno2:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Port: FIBRE
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Cannot get wake-on-lan settings: Operation not permitted
Current message level: 0x00000007 (7)
drv probe link
Link detected: no
We see "Link detected: no," which isn't surprising given the previous outputs. Let's check the kernel logs related to the eno2 interface:
[user@term:~]$ dmesg | grep eno2
[ 286.193572] ixgbe 0000:01:00.1 eno2: renamed from eth0
[ 405.125867] ixgbe 0000:01:00.1: registered PHC device on eno2
[ 405.237725] 8021q: adding VLAN 0 to HW filter on device eno2
[ 405.306018] ixgbe 0000:01:00.1 eno2: detected SFP+: 5
There's nothing in the logs to explain why the link isn't detected. We need to check the other side of the connection—the switch. Like in use case 4, the switch runs the SONiC network operating system, so the following commands are specific to that system. Let's check the status of the Ethernet0 interface on the switch:
admin@sonic:~$ show interfaces status Ethernet0
  Interface    Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin            Type    Asym PFC
-----------  -------  -------  -----  -----  -------  ------  ------  -------  --------------  ----------
  Ethernet0        0      25G   9100    N/A    etp10  routed    down       up  SFP/SFP+/SFP28         N/A
The switch reports the link as down ("Oper down"). The Ethernet0 port is set as a 25Gb interface, while the server's side is a 10Gb interface. Since the switch port also supports 1Gb and 10Gb speeds, we need to check the configuration. First, let's verify if the Ethernet0 port has auto-negotiation enabled:
admin@sonic:~$ show interfaces autoneg status Ethernet0
  Interface    Auto-Neg Mode    Speed    Adv Speeds    Rmt Adv Speeds    Type    Adv Types    Oper    Admin
-----------  ---------------  -------  ------------  ----------------  ------  -----------  ------  -------
  Ethernet0         disabled      25G           N/A               N/A     N/A          N/A    down       up
Auto-negotiation is currently disabled. To enable it, we run the following command:
admin@sonic:~$ sudo config interface autoneg Ethernet0 enabled
admin@sonic:~$ show interfaces autoneg status Ethernet0
  Interface    Auto-Neg Mode    Speed    Adv Speeds    Rmt Adv Speeds    Type    Adv Types    Oper    Admin
-----------  ---------------  -------  ------------  ----------------  ------  -----------  ------  -------
  Ethernet0          enabled      10G           N/A               N/A     N/A          N/A      up       up
admin@sonic:~$ show interfaces status Ethernet0
  Interface    Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin            Type    Asym PFC
-----------  -------  -------  -----  -----  -------  ------  ------  -------  --------------  ----------
  Ethernet0        0      10G   9100    N/A    etp10  routed      up       up  SFP/SFP+/SFP28         N/A
After enabling auto-negotiation, the switch shows the link as operational ("Oper up") and the link speed is 10Gb, matching the server's interface.
Now, let's return to the terminal window on the server and check the link status.
[user@term:~]$ ip link show eno2
18: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether ec:f4:bb:da:4a:7a brd ff:ff:ff:ff:ff:ff
altname enp1s0f1
Finally, we need to set IP addresses on both the server (IP: 192.168.200.2/24) and the switch (IP: 192.168.200.1/24) and verify if they can reach each other.
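Mirroring use case 4, the addresses can be assigned with the following commands, on the switch and on the server respectively:
admin@sonic:~$ sudo config interface ip add Ethernet0 192.168.200.1/24
[user@term:~]$ sudo ip address add 192.168.200.2/24 dev eno2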
[user@term:~]$ ip address show eno2
18: eno2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether ec:f4:bb:da:4a:7a brd ff:ff:ff:ff:ff:ff
altname enp1s0f1
inet 192.168.200.2/24 scope global eno2
valid_lft forever preferred_lft forever
inet6 fe80::eef4:bbff:feda:4a7a/64 scope link
valid_lft forever preferred_lft forever
Let's check if the server can reach the switch's Ethernet0 interface:
[user@term:~]$ ping -c 3 192.168.200.1
PING 192.168.200.1 (192.168.200.1) 56(84) bytes of data.
64 bytes from 192.168.200.1: icmp_seq=1 ttl=64 time=0.366 ms
64 bytes from 192.168.200.1: icmp_seq=2 ttl=64 time=0.186 ms
64 bytes from 192.168.200.1: icmp_seq=3 ttl=64 time=0.193 ms
--- 192.168.200.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2055ms
rtt min/avg/max/mdev = 0.186/0.248/0.366/0.083 ms
The ping between the server and the switch is successful, meaning the connection is established.
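The negotiated speed can also be confirmed on the server side with ethtool (output omitted; look at the Speed and Link detected fields):
[user@term:~]$ ethtool eno2 | grep -E 'Speed|Link detected'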
Troubleshooting involved both the Linux server and the switch. Changes were made on both devices (enabling the interface on the server and enabling auto-negotiation on the switch). This shows that proper configuration on both ends is necessary for the link to work.
Network troubleshooting in Linux involves using various tools and techniques to identify and solve problems. This blog post lists and describes the most commonly used tools. It also provides examples of issues and processes for identifying and solving those problems, which can be applied to other situations as well.
In real life, network troubleshooting can be complex, often involving not just Linux but also interconnected network devices. However, understanding these tools and networking concepts allows you to diagnose and resolve issues, ensuring your Linux systems run smoothly.