Kaleb Wallace
2011-12-21 17:04:25 UTC
Kaleb Wallace [http://community.zenoss.org/people/Fortecor] created the discussion
"Bottleneck discovery"
To view the discussion, visit: http://community.zenoss.org/message/63377#63377
--------------------------------------------------------------
Hello Everyone,
 I am a new Zenoss user and am loving it so far. I have been in IT for the past 12 years and a net admin for the past 7. Never once have I ventured into the world of software troubleshooting. I never needed to, since all of the networks I managed were small. My customer base has been mom-and-pop shops with a max of 30 PCs. Easy to troubleshoot. In November I landed this job with a company that has over 500 nodes on its network, with a fiber backbone. A Cisco Catalyst 6509 is the main network switch, alongside various other Cisco switches, all connected to each other with fiber. This is a distribution center and you can't walk around one corner without seeing another piece of the network. We have 20 different VLANs and roughly 30 servers, some virtualized with VMware and others physical. Sorry about the lengthy explanation; I'm just trying to describe the underlying network, which should help alleviate questions later.
    So here is my issue. I was wondering if I can use Zenoss to pinpoint my trouble, or if I've just gotta go old-fashioned and go to one node at a time to figure this out. I have really slow network speeds across a single fiber connection. It was brought to our attention by users complaining that an Excel spreadsheet shared by about 20 different people was really slow. Our original thought was that 20 people should not be accessing a single Excel spreadsheet simultaneously. However, this is not the problem. I did copy tests, copying a single 17MB file from one PC/server to another, some over this single fiber connection and others not. I received these results...
*Test 1:*
    Over a 10Gb fiber connection, transferring this 17MB file averaged 2 minutes. All of these transfers originated from the same computer, which we will call PC1. I tried copying to 3 different servers: 1 virtual Windows 2k8 (server1), 1 physical 2k8 (server2) and my Linux server (linny), which sits on my desk running my personal Samba server and wiki. From PC1 to all three servers the transfer averaged 2 minutes. Each server is across our warehouse (1/2 mile), reached through a single fiber connection (1 fiber for transmit and 1 for receive) that has roughly 3 junctions in it. I did 4 different copies of this same file and each had similar results.
    This test showed me that the transfer of a single file from location 2 (East warehouse, 1/2 mile from location 1) is extremely slow going to location 1 (West side of warehouse, 1/2 mile from location 2).
*Test 2:*
    I then ran this same "copy test" from the servers (all 3) to PC1. The file transfer went from an average of 2 minutes to an average of 2 seconds. Again, each test was performed 4 times.
    This test showed me that the same fiber connection is slow from location 2 to location 1, but fast and normal from location 1 to location 2.
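As a rough sanity check on those two numbers, the averaged times from Tests 1 and 2 can be turned into effective throughput. This is only a back-of-the-envelope sketch: it assumes 17MB means 17 x 10^6 bytes, and the function name is made up for illustration.

```python
def throughput_mbps(size_mb: float, seconds: float) -> float:
    """Return effective throughput in megabits per second.

    Assumes 1 MB = 1,000,000 bytes and 8 bits per byte.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_mb * 8 / seconds


# Averaged copy times reported in Test 1 and Test 2.
slow = throughput_mbps(17, 120)  # PC1 -> servers, about 2 minutes
fast = throughput_mbps(17, 2)    # servers -> PC1, about 2 seconds

print(f"slow direction: {slow:.2f} Mbit/s")
print(f"fast direction: {fast:.2f} Mbit/s")
print(f"ratio: {fast / slow:.0f}x")
```

That works out to roughly 1.1 Mbit/s in the slow direction and 68 Mbit/s in the fast one, a 60x difference over the same link.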
*Test 3:*
    Now, let me throw this into the mix. When I initiated a file transfer from server1 to server2 from PC1, the transfer averaged 2 minutes. I saw the same average times transferring files between all 3 servers whenever the transfer was initiated from PC1.
*Test 4:*
    Next I used another PC (PC2) on the same network segment as PC1 (on a Cisco switch, not taking the path through the fiber), thus taking the fiber out of the equation, and transferred the same file from PC1 to PC2. BAM, this 17MB file took an average of 2 seconds, and I got the same times transferring back from PC2 to PC1.
    So this test rules out the PCs and this particular network segment as the cause of the slowness.
*Test 5:*
    I then went across the warehouse to location 1 and tested from my workstation (PC3), again taking the fiber out of the equation, but this time using the network segment the servers are on rather than the one the PCs are on. The same file transferred from PC3 to all 3 servers in an average of 2 seconds, and from all 3 servers to PC3 in an average of 2 seconds. I also got the same results transferring the file from server to server, using all 3 servers and initiating the transfer from PC3 this time; all of these averaged 2 seconds.
    This test rules out the servers and this network segment as the problem and pinpoints the problem to the fiber.
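The elimination logic across Tests 1 through 5 can be laid out as a small table. The sketch below just transcribes the averaged times from above. The 10-second "slow" cutoff is an arbitrary choice for illustration, and marking Test 3 as crossing the fiber is an assumption (that a copy initiated from PC1 sends its traffic by way of PC1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyTest:
    name: str
    crosses_fiber: bool  # did the data path include the fiber run?
    avg_seconds: float   # averaged copy time reported in the post


TESTS = [
    CopyTest("Test 1: PC1 -> servers", True, 120),
    CopyTest("Test 2: servers -> PC1", True, 2),
    # Assumption: traffic for a copy initiated from PC1 passes through PC1.
    CopyTest("Test 3: server -> server, initiated from PC1", True, 120),
    CopyTest("Test 4: PC1 <-> PC2, same segment", False, 2),
    CopyTest("Test 5: PC3 <-> servers, same segment", False, 2),
]

SLOW_CUTOFF = 10  # seconds; arbitrary threshold, not from the post

slow = [t for t in TESTS if t.avg_seconds > SLOW_CUTOFF]
all_slow_cross_fiber = all(t.crosses_fiber for t in slow)
any_local_slow = any(
    t.avg_seconds > SLOW_CUTOFF for t in TESTS if not t.crosses_fiber
)

for t in TESTS:
    path = "fiber" if t.crosses_fiber else "local"
    print(f"{t.name:<48} {path:<6} {t.avg_seconds:>5.0f} s")
print("every slow test crossed the fiber:", all_slow_cross_fiber)
print("any fiber-free test was slow:", any_local_slow)
```

Every slow result involves the fiber and no fiber-free copy was slow, but Test 2 shows that crossing the fiber is not slow in every direction.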
*Test 6:*
    This test just confirms the other tests. I set up a Linux box over at location 2 and started ping floods over the fiber, which of course gave me exactly what I suspected: dots all over the screen. Then I ping flooded a PC in the same segment and got 1 dot. I then went to my Linux server at my desk (linny) at location 1 and ran the same ping flood through the fiber, and I was surprised to find dots all over the screen there too. So I have a problem somewhere in that fiber, whether it is the fiber cable itself or the switches.
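For context on the dots: in flood mode, ping prints a period for every echo request sent and a backspace for every reply received, so the dots left on screen are roughly the packets that went unanswered. To put a number on the loss rather than eyeballing dots, the summary line that ping prints at the end can be parsed. The sketch below assumes the common "N packets transmitted, M received" summary format, and the sample line is made up for illustration. It is not output from these tests.

```python
import re

# Matches the summary line printed at the end of a ping run, e.g.
# "1000 packets transmitted, 950 received, 5% packet loss, time 1234ms".
# The optional "packets " also covers the "M packets received" variant.
SUMMARY_RE = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) (?:packets )?received"
)


def packet_loss_percent(ping_output: str) -> float:
    """Compute packet loss (0-100) from the text of a ping run."""
    match = SUMMARY_RE.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent = int(match.group("sent"))
    received = int(match.group("received"))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent


# Made-up sample line, only to show the calculation.
sample = "1000 packets transmitted, 950 received, 5% packet loss, time 1234ms"
print(packet_loss_percent(sample))
```

Running the same count of packets in each direction and over a fiber-free path would give loss percentages that can be compared directly.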
*What I need:*
    As I have said, the fiber runs a distance of 1/2 a mile and spans the warehouse with junctions in multiple spots. Our warehouse is 100ft tall and the fiber is snaked through all the beams, twists and turns of the warehouse, passing through aisles and pathways used by the forklifts and motorized jacks. That makes it a difficult spot to just go troubleshoot the cable, so I first need to determine whether the problem is a switch or a configuration on a switch, and which switch it is. I also can't just replace equipment, because our network is so critical in getting our product out. This is why I am hoping Zenoss can help me pinpoint the problem; however, I have no idea where to start with the software. I have been researching this along with Wireshark and all sorts of other programs, and it seems that Zenoss could be the closest to what I need. Does anyone have any suggestions as to how I should start monitoring for this?
Sorry for the long spiel, but I had to lay the groundwork so this thread wouldn't get really long with questions. Thank you all for your time in reading this and for your assistance.
Kaleb Wallace
--------------------------------------------------------------
Reply to this message by replying to this email -or- go to the discussion on Zenoss Community
[http://community.zenoss.org/message/63377#63377]
Start a new discussion in zenoss-users by email
[discussions-community-forums-zenoss--***@community.zenoss.org] -or- at Zenoss Community
[http://community.zenoss.org/choose-container!input.jspa?contentType=1&containerType=14&container=2003]