For Windward Software products to function correctly in a network environment, adequate network infrastructure is required: a professionally cabled network with reliable network cards and switches, plus a properly configured server, DNS, and TCP/IP for our applications and other network services. If things are not working as they should, this document may give you some ideas on what to check and consider.
- Purchase quality network switches and hardware, and use professional cabling and patch panels.
- Buy patch cables instead of building your own. The small savings are not worth it: a bad patch cable can cause many hours of troubleshooting.
Things to consider when things go wrong
- Check the server event log for any message or hints towards what may be failing.
- How old is the hardware? Has it run successfully before?
- What is new? Has anyone brought a device or switch in from home or plugged something in?
- Could the server drive or controller be failing? Is it a single drive or in a RAID configuration?
- Are there any patches available for the server hardware? Firmware such as RAID card, network card, BIOS?
- Could line voltage be an issue? Any large motors, compressors or welding devices nearby? You may want to consider some power conditioning.
- Check for pending Windows updates on all workstations/servers involved. Strange things can occur when updates have not been applied or when a server is left in a “reboot pending” state.
- Check settings for power saving options and disable them. We have seen OS updates reset power-saving settings.
- Check for bad or failing ports on switches, including ports that fail only intermittently (even new ones can fail)
- Ports auto-negotiating incorrectly (e.g. 10 Mb detected on the switch when the server is transmitting at 100 Mb)
- Incorrectly crimped or damaged cables (even store-bought cables can fail, and mice can chew on working ones, which may still operate some of the time)
- Poorly routed cables (Don’t run network cables parallel to power wires, as electromagnetic interference can corrupt data)
- A switching loop or bridge loop in the network. (e.g. multiple connections between two network switches or two ports on the same switch connected to each other).
- Network cards that randomly fail but work most of the time (Poorly crafted operating system drivers can cause this type of issue as well)
- Other environmental considerations: newly installed welders, air conditioners, or other power-hungry devices in the same building (even outside the company) can cause brownouts when they start up, which can cause pieces of equipment to fail. Install a UPS; if it is beeping every few minutes, you may have a power issue.
- Is there a duplicate IP address that conflicts with the server or a workstation? You should get a message about this, but the message may appear only on a user workstation while the conflict knocks communication with the server off the network. To troubleshoot, review your ARP table: running arp -a at a command prompt lists the physical (MAC) address recorded for each IP address, which you can compare against the actual NIC address of the server.
- Is there another DHCP server conflicting with yours? You would have to look for DHCP traffic on your switch (server replies are sent to UDP port 68) to find out where it is plugged in. We have seen some network technicians configure the switch to discard DHCP packets from any source other than your DHCP server, as a rogue DHCP server is a common gotcha and is really hard to find.
- Are you using Voice over IP phones on the same network as your data? The best networking practice is to configure a separate VLAN for phone traffic so that data traffic does not compete with prioritized voice packets.
- Is there any third-party software installed such as State or Provincial level tools that could impact VPN or network access?
- Have you swapped internet providers and left old DNS information hardcoded on workstations? This can be overlooked and can cause 30-second timeout delays while DNS fails over to a working DNS server. Due to the round-robin and caching nature of DNS, System Five will work sometimes and have a long delay at other times.
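The ARP table check in the list above can be partly automated. The following is a minimal sketch in Python, not a Windward tool: it parses the text printed by arp -a and compares the MAC address cached for the server's IP against the one you expect. The sample addresses are made up for illustration, and the function names are hypothetical.

```python
import re
import subprocess

# An IPv4 address, and a MAC address written with "-" (Windows) or ":" (Linux/macOS).
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
MAC_RE = re.compile(r"\b(?:[0-9a-fA-F]{1,2}[:-]){5}[0-9a-fA-F]{1,2}\b")


def normalize_mac(mac):
    """Return a MAC as lowercase, colon-separated, zero-padded octets."""
    return ":".join(part.zfill(2) for part in re.split(r"[:-]", mac.lower()))


def parse_arp_table(text):
    """Map each IPv4 address in `arp -a` output to its normalized MAC."""
    table = {}
    for line in text.splitlines():
        ip = IP_RE.search(line)
        mac = MAC_RE.search(line)
        if ip and mac:  # header and blank lines have no MAC and are skipped
            table[ip.group(0)] = normalize_mac(mac.group(0))
    return table


def check_server_mac(arp_text, server_ip, expected_mac):
    """Return True on match, False on mismatch, None if the IP is not cached."""
    seen = parse_arp_table(arp_text).get(server_ip)
    if seen is None:
        return None
    return seen == normalize_mac(expected_mac)


def read_arp_table():
    """Run `arp -a` and return its text output (not called in this sketch)."""
    return subprocess.run(["arp", "-a"], capture_output=True, text=True).stdout


# Made-up sample in the Windows output layout, used instead of a live table.
SAMPLE = """\
Interface: 192.168.1.50 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.10          00-1a-2b-3c-4d-5e     dynamic
  192.168.1.1           a4-5e-60-11-22-33     dynamic
"""

print(check_server_mac(SAMPLE, "192.168.1.10", "00:1A:2B:3C:4D:5E"))
```

On a workstation, pass read_arp_table() in place of SAMPLE. A False result means some other device is currently answering for the server's IP address, which is what a duplicate IP looks like from that workstation.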
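To find a stale DNS entry like the one described above, it can help to query each configured DNS server directly and see which ones never answer. The following is a rough sketch using only the Python standard library; it sends one hand-built A-record query over UDP and times the reply. The function names, the default timeout, and the addresses in the usage comment are assumptions for illustration only.

```python
import socket
import struct
import time


def build_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet for an A record (RFC 1035 layout)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN)
    return header + qname + struct.pack(">HH", 1, 1)


def probe_dns_server(server_ip, hostname, timeout=2.0, port=53):
    """Return the reply time in seconds, or None if the server did not answer."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server_ip, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and "port unreachable" errors
            return None
        elapsed = time.monotonic() - start
    # Accept only a reply that echoes our query ID
    if len(reply) < 2 or reply[:2] != query[:2]:
        return None
    return elapsed


# Hypothetical usage: replace the addresses with the DNS servers that are
# actually configured on the workstation.
# for server in ["192.0.2.53", "192.0.2.54"]:
#     print(server, probe_dns_server(server, "example.com"))
```

A server that returns None here is a candidate for the old, hardcoded entry: lookups that land on it will stall until the workstation fails over to a working server.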
Have a Plan B
When a networking event takes place, IT resources often get placed solely into troubleshooting and healing the network. This is a great primary action, but consider starting on a Plan B in case healing the network does not work. This can decrease downtime and will often show areas that can be improved for Business Continuity Planning.
Example: Restoring from backup to a slower server may not seem like a great plan at the beginning. But when it can take days to get parts for a failed network component, operating on a slower second option, if one is available, will eliminate that downtime.