Basic vSphere host network troubleshooting for iSCSI connectivity

This is me playing with formatting on instruction sets. It’s mostly a writing exercise. There are probably 1000 better blog posts out there on this subject. I’m trying to see how I’d want this written if I were desperately googling at 7 p.m. on a client site.

vMotion, iSCSI, Management, VM Network: we’ve all had issues connecting. Here are the steps I’m writing down for myself to reference when I’m onsite and my brain is fuzzy from deployment lag. Let’s assume we’re troubleshooting an iSCSI connection back to a storage array. Connectivity from the host is via 2x 10GbE ports on physical NICs 2 and 5. Assume vSwitch settings and port bindings are set correctly. Assume the storage target IP is 192.168.1.100, host 1 has an iSCSI VMkernel adapter IP of 192.168.1.101 and host 2 has 192.168.1.102. For this exercise, I just want ping connectivity.

  1. Is it plugged in? So often we skip this or just assume. Physically touch the point-to-point connections and look for link lights. Note the physical ports on the host and the switch, and verify them against your workbook.
  2. Check the network. It’s always the network unless it’s DNS. But it’s probably the network. The easy step is to ping IPs down the pipe: gateway, DNS servers, destination, neighbor VMs, neighbor hosts. Verify ICMP echo is allowed along the way, otherwise no pings anyway!
  3. You probably can’t ping, otherwise you wouldn’t be here. If you can ping, it’s permissions. Check VLANs across the pipe. PuTTY into the relevant switch stacks and look at the ports. Are they in a no shutdown state? (This means the port is administratively enabled; also confirm the link itself shows as up.) Are the port ranges tagged or untagged? If tagged, you have to specify the VLAN on your port groups. If untagged, remove the VLAN from your port group by setting the VLAN to 0. On the switch (generally), SSH in, authenticate and type: show run to see the running config.
  4. Test across specific interfaces. Let’s start by identifying our interfaces. SSH into the host. In vCenter: Host > Configure > Services > SSH > Start. Open PuTTY, enter your host 1 management IP and log in as root. Now that we’re SSH’d in to our first host, type: esxcli network ip interface list. Now we have a list of our VMkernel interfaces (vmk0, vmk1, and so on) along with the identifier we’ll use. Note that these are VMkernel adapters, not physical NICs; the physical NICs show up as vmnics under esxcli network nic list. Find the VMkernel adapter on your iSCSI vSwitch and ping from that interface to the storage array. In the host SSH session, type: vmkping -I vmk2 192.168.1.100 (replace vmk2 with your iSCSI VMkernel adapter and the IP with your storage array target). We’re telling our host to send a ping out that specific interface to that storage target IP. If you don’t specify the interface, the host picks one for you, which may be the management interface, and that’s probably not on the iSCSI VLAN.
  5. Assuming that fails, let’s try pinging another host along the same path. Type: vmkping -I vmk2 192.168.1.102 (replace that IP with the iSCSI VMkernel adapter IP on your second host)
  6. If that ping succeeds, you’ve eliminated your networking between hosts and have moved the issue down to the storage path. If not, you’re more than likely back on the switch stack. On the storage side, double check your access groups. Does the storage know it’s allowed to talk to the hosts? If your storage array doesn’t have the correct host initiators in its allowed list, it’ll drop traffic and your hosts will never connect. After you get connected to the array, ensure your volumes are mapped to the hosts. If you’ve made changes, don’t forget to rescan your iSCSI software adapters. Go to the host in vCenter > Configure > Storage Adapters > Rescan Storage.
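
For the fuzzy-brain version of steps 4 through 6, here is a minimal sketch of the same two pings wrapped in shell functions you could paste into the host SSH session. The function names (probe, next_step, check_iscsi_path) are made up for this post, and the sketch assumes vmkping exits non-zero when no replies come back, so sanity check that on your build before trusting the verdict.

```shell
# Sketch only: helper names are hypothetical, not part of ESXi.

# Ping an address out of one specific VMkernel interface.
# $1 = VMkernel interface (e.g. vmk2), $2 = IP to ping.
# Prints "ok" or "fail".
# Assumption: vmkping returns non-zero when nothing answers.
probe() {
    if vmkping -I "$1" -c 3 "$2" >/dev/null 2>&1; then
        echo ok
    else
        echo fail
    fi
}

# Turn the two ping results into the next place to look.
# $1 = result for the storage target, $2 = result for the other host.
next_step() {
    case "$1/$2" in
        ok/*)    echo "target answers: check access groups and volume mappings, then rescan" ;;
        fail/ok) echo "host-to-host works: look at the storage path and the array" ;;
        *)       echo "nothing answers: back to the switch stack and VLANs" ;;
    esac
}

# Run both pings and print the verdict.
# $1 = VMkernel interface, $2 = storage target IP, $3 = iSCSI IP of host 2.
check_iscsi_path() {
    t=$(probe "$1" "$2")
    p=$(probe "$1" "$3")
    echo "target=$t peer=$p"
    next_step "$t" "$p"
}
```

With the example addresses from this post you would run: check_iscsi_path vmk2 192.168.1.100 192.168.1.102. It only automates the pings; the cable check, the VLAN check and the array access groups are still on you.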

Good luck, it’s probably the network.

Trust, but verify

For my first trick!

On a customer site trying to wrap up a new 3-2-1 stack. Easy peasy, console into the new 10Gb Dell 4048T-ON switches and get some ports turned up with our network guys. Except we can’t console into the switches for anything. We’re trying different cables, adapters, baud rates, you name it. Forum searches, Dell ProSupport, everything, until we took a seriously critical look at what we were doing.

Every single document said to plug the console cable into the console port like the picture shows. Well, look closely and see what we saw after too many hours.

The ports were flipped from the docs.

Womp womp