Hi experts,
Please see the screenshot here (http://www.4shared.com/file/jTRUk_Gf/All_SAP_connections_are_droppe.html).
Application Reset:
This situation also generates a lot of calls and, unfortunately, is
determined typically by process of elimination. In other words, there is
no other reason for the reset so it must have come from the application.
I hate saying that, but that really is the answer.
http://blogs.technet.com/b/networking/archive/2009/08/12/where-do-resets-come-from-no-the-stork-does-not-bring-them.aspx
All SAP connections were dropped from 00:01:09 to 00:01:26 on 11/23.
Interestingly, perfmon data is missing for that period (no connection resets, no bytes transferred).
No suspicious entries in the system and application logs.
The cluster is fine.
Many connections were reset at 00:01:26.
A huge number of connections were reset at 00:01:16.
After double-checking the SAP runtime errors, I found no active jobs or connections running between 00:01:09 and 00:01:26 on tccap30, so no connection resets could be found there during this time. I should install Wireshark on each application server.
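Once captures from the application servers are available, the resets can be counted per second and compared with the drop window. Below is a minimal sketch, assuming the packets have already been exported as (timestamp, TCP flags) pairs. The function name, the input format and the sample values are my own assumptions for illustration, not output from any tool.

```python
from collections import Counter

RST = 0x04  # RST bit in the TCP flags byte


def resets_per_second(packets):
    """Count RST packets per whole second.

    packets: iterable of (timestamp, flags) where timestamp is a
    'HH:MM:SS.ffffff' string and flags is the TCP flags byte as an int.
    This input format is an assumption made for this sketch.
    """
    counts = Counter()
    for timestamp, flags in packets:
        if flags & RST:
            counts[timestamp[:8]] += 1  # keep only HH:MM:SS
    return counts


# Hypothetical sample data, only to show the shape of the input.
sample = [
    ("00:01:16.100000", 0x14),  # RST+ACK
    ("00:01:16.900000", 0x04),  # RST
    ("00:01:20.000000", 0x10),  # plain ACK, ignored
    ("00:01:26.000000", 0x14),  # RST+ACK
]
for second, count in sorted(resets_per_second(sample).items()):
    print(second, count)
```

If the per-second counts from every application server peak at the same seconds, that points away from a single application and toward something shared.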
Let's check the niping run from dl980-2 to dl980-1. The SAP niping connection was dropped at 00:01:06 on dl980-2.
Let's look at the packets on the receiving side (the niping server), dl980-1:
You can see that before 00:00:47 the transmission is normal: 192.168.28.12 sends several packets to 192.168.28.11, and the receiving server dl980-1 replies with 2 packets.
However, after 00:00:49, 192.168.28.11 requests TCP retransmission many times, which suggests that something is wrong on dl980-1 and it cannot receive any packets.
Let's see whether dl980-2 actually sent the packets:
Hmmm, I don't know how to interpret the sequence numbers in the TCP retransmission packets. However, it seems that the transmission from dl980-1 to dl980-2 is OK, which means the network is OK. So, is something wrong with the NC550SFP/NCU/dl980-1?
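On the sequence number question: a retransmitted segment carries the same sequence number as data that was already sent, because the sender did not receive an ACK for it in time. Below is a minimal sketch of that idea, assuming the capture has been reduced to one record per segment. The Segment type, the port numbers and the sample values are hypothetical. This is a simplification, not Wireshark's actual analysis logic, and it ignores sequence number wraparound.

```python
from collections import namedtuple

# One record per captured TCP segment (hypothetical layout for this sketch).
Segment = namedtuple("Segment", "src dst sport dport seq length")


def find_retransmissions(segments):
    """Return data segments whose sequence number was already covered
    by earlier data in the same flow direction."""
    next_expected = {}  # flow direction -> highest (seq + length) seen so far
    retransmitted = []
    for seg in segments:
        if seg.length == 0:
            continue  # pure ACKs carry no data and are skipped
        key = (seg.src, seg.dst, seg.sport, seg.dport)
        if key in next_expected and seg.seq < next_expected[key]:
            retransmitted.append(seg)
        next_expected[key] = max(next_expected.get(key, 0), seg.seq + seg.length)
    return retransmitted


# Hypothetical segments; only the two IP addresses come from the captures above.
capture = [
    Segment("192.168.28.12", "192.168.28.11", 50000, 3298, 1000, 100),
    Segment("192.168.28.12", "192.168.28.11", 50000, 3298, 1100, 100),
    Segment("192.168.28.12", "192.168.28.11", 50000, 3298, 1100, 100),  # same seq again
    Segment("192.168.28.11", "192.168.28.12", 3298, 50000, 5000, 0),    # pure ACK
]
for seg in find_retransmissions(capture):
    print(seg.src, "->", seg.dst, "seq", seg.seq)
```

Running the same check on the captures from both hosts would show which direction repeats its sequence numbers. If a segment shows up as sent (and resent) on one side but never appears on the other, the loss is somewhere between the two capture points.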