PlasmaNodes Hosting - Main Server Failure – Incident details

Main Server Failure

Resolved
Major outage
Started 8 days ago
Lasted 1 day

Affected

Core

Major outage from 4:34 PM to 3:10 PM, Partial outage from 3:10 PM to 6:58 PM, Degraded performance from 6:58 PM to 7:39 PM, Operational from 7:39 PM to 4:25 AM

Main Website

Major outage from 4:34 PM to 3:10 PM, Partial outage from 3:10 PM to 6:58 PM, Degraded performance from 6:58 PM to 7:39 PM, Operational from 7:39 PM to 4:25 AM

Client Area

Major outage from 4:34 PM to 3:10 PM, Partial outage from 3:10 PM to 6:58 PM, Degraded performance from 6:58 PM to 7:39 PM, Operational from 7:39 PM to 4:25 AM

VPS Control Panel

Major outage from 4:34 PM to 3:10 PM, Partial outage from 3:10 PM to 6:58 PM, Degraded performance from 6:58 PM to 7:39 PM, Operational from 7:39 PM to 4:25 AM

Game Server Control Panel

Major outage from 4:34 PM to 3:10 PM, Partial outage from 3:10 PM to 6:58 PM, Degraded performance from 6:58 PM to 7:39 PM, Operational from 7:39 PM to 4:25 AM

VPS Nodes

Major outage from 4:34 PM to 3:10 PM, Partial outage from 3:10 PM to 6:58 PM, Degraded performance from 6:58 PM to 4:25 AM

Updates
  • Resolved
    This incident has been resolved.
  • Monitoring

    All services are back online. To improve network performance, we upgraded our network to 40 Gb/s using a Mellanox NIC, and we will monitor this change over the next few hours. The network will be marked as "Degraded Performance" while we monitor. We apologize for the inconvenience.

  • Identified

    The incident was caused by a PXE boot loop from which the server could not recover on its own. Because IPMI was not working properly, our on-site technician had to physically unplug and reconnect the server's power supply. We have since been able to replicate and mitigate the issue, and IPMI has been fixed to prevent further incidents.

  • Update

    Our technician has completed the task, and our main server is reachable again. We will now investigate what caused the incident. Services may experience multiple outages during this process.

  • Update

    Our technician is now on-site. We will post further updates on this incident as we learn more about what caused it.

  • Investigating

    Our technician will be on-site in approximately 40 minutes.

  • Identified

    The datacenter has scheduled its technician to be on-site tomorrow at 09:00 EST. This window is not what we expected, and we are trying to secure one for today if possible.

  • Investigating

    We've detected that our main server failed after the last maintenance. We've sent our technician to the datacenter to fix the issue as soon as possible. We apologize for the delay in posting this status announcement and for any inconvenience caused.