|Ticket Number:||20171115_1||Ticket State:||CLOSED|
|Ticket Opened:||2017-11-15 16:19||Ticket Closed:||2017-11-23 10:08|
|Ticket Description:||P2P/OPN (Layer-2) problem|
Problem Description: We received complaints from several customers that their P2P/OPN links within SWITCHlan are broken. The problem has not been located yet. We suspect an issue with MPLS on one of our main routers.
Affected: From 2017-11-15 15:45 until 2017-11-15 16:20
Impact: Partial loss of connectivity
Sites: CERN-CNAF-LHCOPN-002, EHB-LG, EHB-LS, EHB-ZO, EmpaEawag-KAS, EmpaEawag-SGE, EmpaEawag-THU, ENSI-NAZ, EPFL-CBT-el2-ge2, EPFL-CBT-ls2-ce2, EPFL-CSCS-ce2-lug1, EPFL-IMT-el2-ne2, EPFL-IMT-ls2-ne1, EPFL-LABA-el2-ba1, EPFL-Lausanne, EPFL-Sion, ETHZ-BSSE-ez3-ba2, ETHZ-BSSE-zh1-ba1, ETHZ-CSCS-ez3-lug1, ETHZ-CSCS-zh1-lug2, ETHZ-ETHBOARD-ez2-be1, FFHSRD-FFHSBR, HEARCNE-HEARCDEL-ne1-del2, HESSO-AVP, HESSO-DEL, HESSO-ECAL, HESSO-EESP, HESSO-EIC, HESSO-HEFR, HESSO-HEIGVD, HESSO-HESAV, METEOCH, METEOCH-CSCS, NTB-HTWCHUR-bu2-cr2, NTB-Waldau-bu2-fhsg2, RERO-UNIL-my2-ls2, Sommet-BLU, Sommet-BUL, Sommet-CL, Sommet-GL, UNIBE-HFSJGJFJ-be2-jfj2, UNIBE-UNIFR-be2-fr1, UNIL-IUKB-ls2-si2, VSnet-vi2-my2, WSL-CAD-wsl2-caw2, WSL-SLF-wsl1-slf2, WSL-SLF-wsl2-slf1
The root cause of this outage turned out to be a memory leak on our backbone router swiCE1 at CERN. During announced maintenance work at EPFL, the physical link between swiCE1 at CERN and swiEL1 at EPFL had to be interrupted briefly. While the physical link was down, all traffic was routed via another path. Once the link came up again, swiCE1 started the OSPF process.
However, the LDP process was not started due to an out-of-memory condition. As a result, OSPF signalled the link between swiCE1 at CERN and swiEL1 at EPFL as active, while MPLS traffic was no longer forwarded because no labels were distributed via LDP. This blackholed all OPN traffic passing over this link.
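The failure condition described above (OSPF adjacency up, LDP session missing on the same link) can be expressed as a simple consistency check. The following is a minimal sketch only: the `LinkState` model, the `find_blackhole_candidates` function, the router names `routerA` and `routerB`, and the sample data are assumptions made for illustration and do not reflect the actual SWITCHlan monitoring or the router's own data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkState:
    """Control-plane state of one backbone link (hypothetical model)."""
    local: str
    remote: str
    ospf_adjacent: bool  # OSPF adjacency is up, so the IGP sends traffic over the link
    ldp_session: bool    # LDP session is up, so MPLS labels are exchanged on the link


def find_blackhole_candidates(links):
    """Return links that OSPF treats as usable but that have no LDP session.

    MPLS traffic routed onto such a link has no label binding and is dropped.
    """
    return [link for link in links if link.ospf_adjacent and not link.ldp_session]


# Illustrative sample data: the first entry mirrors the state described in this
# ticket, the second is a healthy link between two made-up routers.
links = [
    LinkState("swiCE1", "swiEL1", ospf_adjacent=True, ldp_session=False),
    LinkState("routerA", "routerB", ospf_adjacent=True, ldp_session=True),
]

for link in find_blackhole_candidates(links):
    print(f"WARNING: {link.local} <-> {link.remote}: OSPF up, LDP down")
```

A link with neither OSPF nor LDP is not reported, because the IGP does not route traffic over it in the first place. Only the mismatch between the two protocols produces the blackhole.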
The firmware of swiCE1 has been upgraded. All services are restored. Closing ticket.
The problem appears to have been resolved since approx. 16:25.
All links are back to normal operation. Please contact us (firstname.lastname@example.org) if you observe any anomalies.
For all questions about this ticket, please send mail to email@example.com
or call +41 44 268 15 30.