Matrix Server Instability

@amandahla @gschiano Heya! I’m currently unable to send or receive messages in any room on the Ubuntu homeserver. Element seems to be able to connect to the server and view previously sent messages though. Could you take a look when you get a chance?

Hi,

We have been facing infrastructure instability on our Matrix deployment: when a node goes down, all users on the shard deployed on that node experience intermittent issues. Our infrastructure teams are looking into it, and we should have a fix soon. Meanwhile, we are also exploring other options, such as redeploying or migrating to another Kubernetes cluster.

Thanks for your patience!

(Can this be pinned? This is currently the only place to get assistance with these issues.)

Our homeserver again has a unit down (since yesterday, 5:23 UTC) that can’t mount its storage. The issue has been raised with IS and our Managed Solution. It is still not fixed, but I’ve put temporary measures in place that should restore normal behavior.

Homeserver performance seems very degraded at the moment; it’s taking multiple minutes to do things like open Element Desktop or load room and space icons.

I’m using https://app.element.io/, and since yesterday the server has not accepted my credentials.
Unable to log into Matrix :frowning:

Matrix is back up and functional.

Logged in :smiley:

Hello! Matrix is down at the moment due to underlying K8s infra issues. We will restore service as soon as the underlying issue is fixed.
I apologize for the inconvenience.

Matrix should be back up.

@amandahla @gschiano The Matrix server is virtually unusable; I’m unable to load messages and am stuck on a “Connectivity to the server has been lost” screen. This is Questing Beta release day, one of the worst possible times for this to occur, since flavor leads and the Ubuntu Release Team cannot talk to each other normally. This needs to be resolved as soon as possible. Please help.

The Kubernetes cluster and Juju controller on which Synapse is running have been very unstable since the beginning of the week.
We’re restoring the service as soon as we can, but it usually only stays stable for a few hours.

I’ll check with IS to see what the status of the underlying cloud services is.

Is there any chance we can get an update? It’s a beta release day and we really cannot be having outages at all on a day like today.

Unfortunately, the underlying instability Greg mentioned earlier is continuing to affect availability. We’re closely monitoring the deployment of Matrix and doing our best to ensure it stays up during this period and that it comes back quickly as things subside. Our team is keeping a close watch so we can react as fast as possible throughout the day.

Matrix is currently available again while the investigation continues. We’ll share updates if anything changes.

Hello,

We are currently facing some instability. Your experience with the Matrix server may be impacted. We’re closely monitoring things and doing our best to ensure it stays up during this period and that it comes back quickly as the underlying instability subsides.
