Network Health Assessment


Oct 26, 2013 (3 years and 11 months ago)





LEA Name: Brunswick County Schools








Primary POC

Leonard Jenkins
Director, Technology Department
Brunswick County Schools
35 Referendum Drive
Bolivia, NC 28422
910-253-2931
ljenkins@bcswan.net



Technical POC

Mike Crawford
Wide Area Network (WAN) Manager
910-253-2996
mcrawford@bcswan.net




Date Data Collected: 06/17/2008 - 06/19/2008








Network Description


Wide Area Network

The Brunswick County Schools (BCS) wide area network (WAN) utilizes fiber infrastructure constructed and maintained in partnership with Atlantic Telecom Membership Corporation (ATMC). The network topology is a distributed star network with the central hubs of the stars owned and maintained by ATMC. ATMC has network hubs in four locations across the county. These locations are connected to each other via gigabit Ethernet. The BCS schools are connected to switches owned by ATMC at these locations, or to other BCS schools utilizing dark fiber provided by ATMC. The furthest school (JMM) from the BCS core is six “switch hops” away. The network core at the BCS central office at 35 Referendum Drive is connected to ATMC’s network at the Bolivia-CO.



All school facilities are connected via gigabit Ethernet. Single-mode 1000BASE-LX and 1000BASE-ZX modules are used to light the fiber between the schools. Every connection between switches is set up as an 802.1Q trunk, which allows several VLANs to communicate. There is a core Cisco 6509 series switch acting as the VLAN controller and main router for the district. All traffic between VLANs is routed via this device.


An updated network diagram of the BCS wide area network follows:







This diagram can be found in Visio format in Appendix A.



Note: The above diagram differs slightly from the network diagram provided by BCS at the start of the network assessment. The network layout detailed in the above diagram was validated with both the SolarWinds LANsurveyor auto-discovery tool and a review of data obtained through the use of the Cisco Discovery Protocol (CDP).


The core 6509 switch is split into a Layer 2 switch and a Layer 3 router. The 6509L3 is the main router for the district, and is routing IP and IPX. The Layer 2 switch is the main switch for the Central Office as well as the VLAN controller for the district. 802.1Q VLAN trunk encapsulation is used on all WAN connections. Cisco VTP is used to control and configure VLAN configurations across the network. The Layer 3 router is running IOS and the Layer 2 switch is running CatOS. Subnet masks and IP address schemes are consistent with best practices.


Every school has a single VLAN set up for traffic. The logical network is fairly flat, as the 6509 routes all internet traffic between the multiple VLANs.


Local Area Network

The local area networks utilize Cisco and Dell brand switches and routers. These include full gigabit switches and 10/100 Mbps switches with and without gigabit uplink capability.

Cisco model numbers include: 2950, 2950G, 2924M, 3512, 3508, 3550, 3500XL, 4006, and 6509. Many of the Cisco switches have been designated end-of-life (EoL) by Cisco. All the Dell switches are 3024s.

Unmanaged switches are also widely deployed throughout the district.




Internet Access

BCS contracts with ATMC for 22 Mbps of Internet access capacity.

BCS’s public IP address space is provided by ATMC:

216.99.112.128/27
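For readers who want to see what a /27 allocation means in practice, the short Python sketch below expands the block listed above using only the standard library. It is an illustration added for clarity and was not part of the assessment tooling.

```python
import ipaddress

# The /27 block listed above as the BCS public address space.
block = ipaddress.ip_network("216.99.112.128/27")

print(block.num_addresses)        # total addresses in the block
print(block.network_address)      # first address (network address)
print(block.broadcast_address)    # last address (broadcast address)
print(len(list(block.hosts())))   # usable host addresses
```

A /27 leaves five host bits, so the block holds 32 addresses, of which 30 are usable once the network and broadcast addresses are set aside.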


Security and WAN Optimization

The BCS security and WAN optimization infrastructure comprises the following:

Cisco ASA 5520 Firewall

SmoothWall School Guardian Content/URL Filter

Packeteer PacketShaper 6500














Data Collection and Testing Process Summary


Data collection and testing focused on three key areas as follows:

Physical-layer analysis:

1. Physical inventory of all network infrastructure and LAN cabling

2. Automated network discovery and mapping using SolarWinds LANsurveyor

3. Packet capture and analysis at the core switch at each school using Wireshark - the physical layer was inspected by capturing packets at each location and analyzing them for problems and anomalies.

4. Analysis of switch logs and port errors.

Configuration analysis:

1. Analysis of core switch configurations - configurations compared against best practices published by major switch manufacturers.

Network performance analysis:

1. Network utilization analysis - examine network utilization for WAN and Internet access connections using Cacti for WAN connections.

2. Network latency analysis - examine network latency across the WAN using SmokePing.

3. Throughput analysis - examine link and end-to-end throughput using the Network Diagnostic Tool (NDT) developed by Internet2 (I2).




Results & Observations



Physical Layer Analysis:


1. Physical Inventory and Cabling:


Varying levels of craftsmanship are evident throughout the district’s network infrastructure. In assessing the craftsmanship, we try to distinguish between best practices related directly to the reliability and performance of the network, and those related to network maintainability. Our comments are focused on issues that are or may be contributing to degradation in network reliability and performance.


Of particular significance and concern are issues related to copper and fiber patch cables. BCS uses a mixture of manufactured and field-terminated copper Ethernet patch cables. Care must be taken to ensure field-terminated cables meet industry standards.

When creating Ethernet patch cables by hand, the jacket of the cable needs to extend all the way into the crimp jack. Also, the TIA/EIA 568-A or TIA/EIA 568-B standard needs to be followed to keep crosstalk to a minimum.





Virginia Williamson - New Wing


Sometimes, being too neat and orderly can cause problems as well. This rack is very well organized and maintained; however, the two most important cables in the rack are bent to the point of creasing. This can cause packet loss and CRC errors.




Bolivia Elementary IDF - 200 Wing

Bolivia Elementary IDF - 200 Wing


BCS also uses both manufactured and field-terminated fiber optic patch cables. In many cases the fiber connectors do not have strain reliefs and/or the protective layers of shielding have been stripped away from the fiber itself. In addition, adequate protection is often not provided for the fiber optic cables.

The following pictures were taken at the BCS central office main server room.




Server Room - Central Office

Server Room - Central Office


The quality of the fiber installation is poor, with fiber jumper cables routed in, around, and between OSP cables, and field-terminated fiber cables fully exposed under the floor.



Shown below are examples of problems with the fiber optic infrastructure that need to be addressed. Many of the fiber optic cables have bends and angles that exceed recommended standards. These bends cause signal loss and refraction, resulting in an unstable network. Unused terminated fiber cables should have covers to protect the ends.






Bolivia Elementary - Room 405

Jessie Mae Monroe - Server Room

Union Elementary - Room 501


Pictures of all the wiring closets and switch locations throughout the Brunswick County Schools system are included in the assessment packet.



2. SolarWinds LANsurveyor Network Discovery


As noted earlier, results from the SolarWinds LANsurveyor auto-discovery tool were used to develop the network diagram included in Appendix A. The raw data generated by the network discovery tool is included in Appendix B. All network switches are included in the map, and it can be used to see the hierarchical design if needed. Some parts of the auto-discovery did not complete correctly because ATMC equipment was not accessible to the discovery agent.


3. Packet Capture using Wireshark


Packet captures of the broadcast traffic show normal network broadcast traffic protocols, which include ARP, DHCP, IPX, Spanning-Tree, and NetBIOS. EIGRP is leaking out into the LANs of the schools from the core 6509. No single type of traffic appears to be overloading the network, and the amount of broadcast traffic is well within normal operating levels.

Packet captures of all traffic using port mirroring show much the same. HTTP and TCP requests to and from the proxy server at the district make up the majority of the traffic. There are some NCP file access requests to 10.1.10.5. Packet captures at NBHS show a possible switch loop.



4. Analysis of Logs and Port Errors


Below are the problems we identified while analyzing the logs and port errors of all the switches. The majority of the errors are CRC or collision errors. CRC errors are generally cable related, while collision errors generally indicate a duplex mismatch.

Although we listed all the port error statistics for every school, many of the switches were restarted less than 4 days prior to our arrival. In addition, network port usage was low at the time of our assessment since school was not in session. Consequently, it is likely there are additional ports with errors.
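The rule of thumb above (CRC errors point to cabling, collisions point to duplex settings) can be expressed as a small triage helper. The Python sketch below is illustrative only: the function name `triage_port` and its counter arguments are hypothetical and do not correspond to any tool used in this assessment.

```python
def triage_port(crc_errors, collisions):
    """Map raw port error counters to the follow-up actions used in this report.

    Hypothetical helper: it encodes the rule of thumb only, not a
    definitive diagnosis for any given port.
    """
    actions = []
    if crc_errors > 0:
        # CRC errors are generally cable related.
        actions.append("test or replace cable")
    if collisions > 0:
        # Collisions generally indicate a duplex mismatch.
        actions.append("check speed/duplex on both ends of the link")
    return actions


print(triage_port(crc_errors=120, collisions=0))
print(triage_port(crc_errors=15, collisions=300))
```

A port that reports both counters, such as those listed below with "CRC errors & collisions", gets both follow-up actions.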



Central Office 6509

4/1     duplex mismatch, 100-half, collisions (Packeteer)
4/5     duplex mismatch, 100-half, collisions (NovaNet)
6/3     CRC / TCP runts - possible bad handmade cable
6/36    Native VLAN mismatch
2/4     Native VLAN mismatch


Bolivia - 10.4.0.0

10.4.1.4     Fa0/16     CRC errors
-                       CRC errors
10.4.1.5     Fa0/2      CRC errors
10.4.1.11    VLAN 1     input errors - CRC on virtual interface - this should not happen - possible issue with the switch
10.4.1.13               fan fault - replace fan or switch


BCA - 10.8.0.0

10.0.0.8     fa0/24     CRC error
10.8.1.3     fa0/23     CRC error
-            fa0/27     CRC error
10.8.1.4     fa0/48     collisions - duplex mismatch
10.8.1.5     fa0/18     CRC error
10.8.1.6     fa0/18     CRC & VLAN mismatch


Jessie Mae Monroe - 10.10.0.0

** Entire school needs to have the 1000BaseCX modules replaced due to collisions
** Switch response is slower due to dropped packets and retransmits


Leland Middle - 10.16.0.0

10.16.1.38   fa0/21     CRC error
-            fa0/22     CRC error
10.16.1.39   fa0/2      CRC error & interface flapping up/down
-            fa0/3      CRC error
-            fa0/19     CRC error
10.16.1.40   fa0/2      CRC error
-            fa0/2      CRC error & interface flapping up/down
-            fa0/9      CRC error & interface flapping up/down
-            fa0/19     CRC error
-            fa0/23     CRC error
10.16.1.41   fa0/10     CRC error & interface flapping up/down
-            fa0/23     CRC error
10.16.1.42   fa0/5      CRC error & interface flapping up/down
-            fa0/13     CRC error & interface flapping up/down
-            fa0/22     CRC error
10.16.1.43   fa0/12     CRC error & interface flapping up/down
-            fa0/22     interface flapping up/down
-            fa0/24     CRC error
10.16.1.46   fa0/22     collisions - duplex mismatch (Access Point)


Leland Elementary - 10.20.0.0

10.0.0.20    gi0/1      high number of ignored packets (might not be a problem)
10.20.1.3    fa0/8      CRC errors & collisions - duplex mismatch
10.20.1.6    fa0/18     collisions - duplex mismatch
10.20.1.10              switch not accessible
10.20.1.13   fa0/13     CRC error
10.20.1.14   fa0/1      CRC error
-            fa0/3      CRC error


NBHS - 10.26.0.0

10.0.0.26    3/1                xmit errors, 100/full
-            3/22               xmit errors, 100/full
-            3/46               xmit errors, 100/full
-            2/2, 2/3, & 2/4    Native VLAN mismatch
10.26.1.20   fa0/28             duplex mismatch, 100-half
10.26.1.50   fa0/7              duplex mismatch - collisions
10.26.1.247  Gi0/3              duplex mismatch - collisions, 100-half
-            Gi0/4              CRC errors - check cable / environment
-            Gi0/7              CRC errors - check cable / environment
-            Gi0/8              CRC errors - check cable / environment
10.26.1.253                     IOS 11.2 needs to be upgraded
-            Fa0/12             CRC errors - check cable / environment
10.26.1.254                     IOS 11.2 needs to be upgraded



Shallotte Middle - 10.32.0.0

10.0.0.32    fa0/1      CRC error
10.32.1.10   ALL ports except (fa0/1, fa0/16, fa0/21, and fa0/24) CRC errors
-            fa0/12 has the most CRC errors
10.32.1.11   fa0/3      CRC error
-            fa0/4      CRC error
-            fa0/6      CRC error
-            fa0/7      CRC error
-            fa0/8      CRC error
-            fa0/10     CRC error
-            fa0/12     CRC error
10.32.1.13   fa0/4      CRC error
-            fa0/5      CRC error
-            fa0/15     CRC error
10.32.1.17   fa0/13     CRC error
-            fa0/18     CRC error
10.32.1.21   fa0/7      CRC error
-            fa0/10     CRC error


SBHS - 10.34.0.0

10.0.0.34    gi3/2      CRC error
-            gi3/4      CRC error


SBMS - 10.35.0.0

10.35.1.24   fa0/3      CRC error
10.25.1.26   fa0/9      CRC error



Configuration Analysis:

1. Switch Configuration Analysis:


Switch configurations at the school system follow standard best practices. In a few rare cases, VLAN uniformity is not maintained. These are listed under the Port Error section of the assessment as Native VLAN mismatches.


BCS should consider hard-coding the speed and duplex on all ports that connect a switch to another switch. After a power outage there is a possibility that one of the ports will not auto-negotiate correctly, causing latency and packet loss.



Network Performance Analysis

1. Network Utilization Analysis (Appendix C)


Bandwidth usage was very low during our test period as school was out of session. All traffic from the schools flows through the ATMC Bolivia-CO. Traffic at the time of the assessment was well within limits. There were some large spikes of traffic during the morning hours destined to NBHS. These spikes were to the local servers at the Central Office.

ATMC Bolivia-CO






NBHS




2. Network Latency Analysis (Appendix D)


Network latency is very low across the entire wide area network. Average ping times range between 1 ms and 3 ms. As the number of “switch hops” between a school and the district core increases, so does the average ping time to the school.

There are a few schools that have packet loss issues. These schools include Jessie Mae Monroe, SBMS and SBHS. SBHS gets its connectivity from SBMS, so we assume that if the packet loss to SBMS were fixed, the loss to SBHS would be resolved as well. The rest of the schools show low loss and low jitter.


Jessie Mae Monroe

Packet Loss Average    0.12%
Packet Loss Max        35.00%




Test points for Jessie Mae Monroe are showing packet loss spikes in the 35% to 50% range.
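A low average can coexist with severe spikes, which is why both the average and the maximum are reported. The Python sketch below illustrates this with made-up samples: the 300-sample series is hypothetical and is not the SmokePing data collected for this assessment.

```python
def summarize_loss(samples):
    """Return (average, maximum) packet loss in percent, average rounded to 2 places."""
    average = sum(samples) / len(samples)
    return round(average, 2), max(samples)


# Hypothetical series: 299 clean measurement intervals and one 35% loss spike.
samples = [0.0] * 299 + [35.0]

average, maximum = summarize_loss(samples)
print(f"average loss: {average}%  max loss: {maximum}%")
```

A single bad interval barely moves the average, so the maximum (or the raw graph) is the better indicator of intermittent problems such as a marginal module or cable.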





South Brunswick Middle

Packet Loss Average    2.75%
Packet Loss Max        100.00%





There is a
steady 2% to 3% packet loss between the core 6509 and SBMS.


Sbms_3508#sh interfaces status

Port    Name               Status       Vlan     Duplex Speed   Type
------- ------------------ ------------ -------- ------ ------- ----
Gi0/1                      notconnect   35       Auto   1000    Missing
Gi0/2                      connected    35       A-Full 1000    1000BaseSX
Gi0/3                      connected    35       A-Full 1000    1000BaseSX
Gi0/4                      connected    35       A-Full 1000    1000BaseSX
Gi0/5                      connected    35       A-Full 1000    1000BaseSX
Gi0/6                      connected    35       A-Full 1000    1000BaseSX
Gi0/7   Connection LH to S connected    trunk    A-Full 1000    1000BaseLX
Gi0/8   Connection ZX to B connected    trunk    A-Full 1000    1000BaseLX


If the distance between SBMS and ATMC’s Bolivia-CO is greater than 7 km, then there is a very high possibility that the packet loss is due to the wrong module being installed at SBMS. Gigabit 0/8 in the SBMS_3508 switch should be a 1000BaseZX module. A review of the OTDR traces taken during the fiber optic cable installation should be done to determine whether the installed fiber optic modules are sufficient to support the distance between schools.
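The mismatch on Gi0/8 (a port named for a ZX link but reporting a 1000BaseLX module) is the kind of discrepancy that can be checked automatically. The Python sketch below is a minimal illustration: the sample rows are modeled on the SBMS output above with whitespace collapsed, and the function name `mismatched_optics` and the parsing pattern are assumptions of ours, not a Cisco-provided tool or a general-purpose parser.

```python
import re

# Rows modeled on the SBMS 3508 "sh interfaces status" output in this report.
SAMPLE = """\
Gi0/6 connected 35 A-Full 1000 1000BaseSX
Gi0/7 Connection LH to S connected trunk A-Full 1000 1000BaseLX
Gi0/8 Connection ZX to B connected trunk A-Full 1000 1000BaseLX
"""

# Port, optional free-text name, status, VLAN, duplex, speed, module type.
ROW = re.compile(
    r"^(?P<port>\S+)\s+(?P<name>.*?)\s*(?P<status>connected|notconnect)\s+"
    r"(?P<vlan>\S+)\s+(?P<duplex>\S+)\s+(?P<speed>\S+)\s+(?P<type>\S+)$"
)


def mismatched_optics(output):
    """Flag ports whose name mentions SX/LX/ZX but whose module type differs."""
    flagged = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        name = match["name"].upper()
        for optic in ("SX", "LX", "ZX"):
            named = re.search(rf"\b{optic}\b", name)
            if named and not match["type"].upper().endswith(optic):
                flagged.append((match["port"], match["name"], match["type"]))
    return flagged


print(mismatched_optics(SAMPLE))
```

The check depends on port descriptions being kept accurate, so it complements rather than replaces the OTDR review recommended above.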



3. Throughput Analysis



The data collected using the Network Diagnostic Tool (NDT) was incomplete and
consequently inconclusive. We are working to modify the tool and associated test
procedures to improve future data collection.





Recommendations

High Priority Recommendations:




SBMS - The fiber module installed to deliver the signal to ATMC Bolivia-CO is of the LX variety and is only rated for 7 km of distance. Suggest replacing it with a ZX module.



Jessie Mae Monroe - The fiber module installed at JMM is showing as unknown to the switch. Suggest replacing this module or the switch.



Replace all 1000Base-CX (Cisco GigaStack) modules at Jessie Mae. Switch-to-switch connections inside the school are using Cisco 1000_CX_Gigastack modules, which run at half duplex. The industry has abandoned these modules due to the issues inherent with half-duplex connections.


Other Recommendations:




Acquire a network cabling certifier. Given the high number of field-terminated cables, a network cable certifier will ensure that each cable meets specifications and is not introducing instability into the network. The tester should be able to certify both copper and fiber cables.



Test and, if necessary, replace cables (closet-to-computer) that are plugged into the ports listed with CRC errors.



Hard-code speed and duplex settings for all servers and switches. Any device that is a part of the network infrastructure should not be using auto-negotiation.



Investigate and check port speed and duplex settings for all ports listed as having collisions.



Review OTDR traces taken during the installation of the fiber optic cable plant. If they are not available, hire a contractor to test all long-haul fiber connections with an OTDR to verify that the connections between the schools meet standards and loss thresholds.



Move from a proxy-based filter system to a pass-by or in-line appliance. Proxy-based filters add latency to network requests, and proxy-based systems do not scale well when speeds of 100 Mbps are being utilized.



Review the switch configurations for ATMC-owned 4000 series switches.



Develop and implement a plan for improving the level of craftsmanship at schools where appropriate. The plan should include network rack cable management, installing patch panels where appropriate, protecting and securing fiber optic cables, cleaning and covering unused fiber optic cables, and re-routing the network patch cables.



Develop and implement a multi-year plan/schedule to replace end-of-life Cisco switches. The plan should ensure product standardization across the district. (Minimize/eliminate the use of non-standard products, e.g. Dell switches.) Work through one school at a time: replace every switch in the school, and use the ones taken out as spares for the other schools. Appendix E includes the Cisco schedule for switches designated EoL.




Upgrade the IOS on switches where appropriate to a newer version with bug fixes.



Remove IPX routing on the WAN. Move to an IP only environment.



Consider redesigning the WAN infrastructure to have fewer switch hops between the Central Office and the school edge.