Matrix delegation and how it may bite you
540 words, estimated reading time: 3 minute(s)
Originally published on January 14, 2024
Last modified on August 21, 2024
For those who don’t know, delegation in Matrix is used in server-to-server
communication to figure out which server serves a given domain. As an example,
if my own Matrix homeserver were running on matrix.dujemihanovic.xyz instead of
dujemihanovic.xyz, I could delegate the latter to the former to save anyone
wanting to contact me from having to type out the "matrix." prefix.
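To make the example concrete, here is a minimal sketch of the JSON document that dujemihanovic.xyz would serve at /.well-known/matrix/server in that hypothetical setup. The m.server key comes from the Matrix specification; the function name and its parameters are made up for illustration.

```python
import json


def well_known_server_body(delegated_host, port=None):
    """Build the body of /.well-known/matrix/server (hypothetical helper).

    The port is optional: without it, the target is just the hostname.
    """
    target = delegated_host if port is None else f"{delegated_host}:{port}"
    return json.dumps({"m.server": target})


# Delegate dujemihanovic.xyz to matrix.dujemihanovic.xyz
print(well_known_server_body("matrix.dujemihanovic.xyz"))
```

With this in place, other people could keep using addresses ending in dujemihanovic.xyz while their homeservers talk to matrix.dujemihanovic.xyz behind the scenes.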
Besides the domain name, delegation can also be used to specify which port to
use for server-to-server communication. The default is 8448, and if it’s blocked
you can use delegation to use 443 for server-to-server, as client-to-server does
by default. However, if you can, I’d strongly suggest using 8448! I had been
delegating S2S to 443 for almost the whole time I have had this server, for no
reason, and it seems that this caused an extremely weird issue with a certain
room:
What happened?
Message fetching kept breaking. When I joined the room, everything would work fine for the first few messages, but at some point I would start getting notifications without any new message showing up in that room. I noticed that logging out and back in would fetch the missing messages in my client, but then the aforementioned cycle would repeat no matter how many times I logged out and back in (this also happened on clients other than Element desktop). To confirm that my homeserver was the issue, I joined the room with my old matrix.org account and, sure enough, that worked just fine.
I tried the usual things such as restarting Dendrite and the whole VPS, but to no avail. I was pretty insistent that the issue was not with my homeserver but with the main server hosting the room (which, unsurprisingly, turned out to be false), and so I gave up on that. The eye-opening moment came when I was reading the Conduit documentation (I had considered migrating to it), specifically this:
If Conduit runs behind Cloudflare reverse proxy, which doesn’t support port 8448 on free plans,
This implies that routing server-to-server traffic to 443 should only be done
if it’s absolutely impossible to use 8448 for this, and the Synapse
documentation said something similar:
However, if your homeserver’s APIs aren’t accessible on port 8448 and on the domain server_name points to, you will need to let other servers know how to find it using delegation.
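To illustrate what "letting other servers know" boils down to, here is a rough sketch of how the m.server value from /.well-known/matrix/server could be turned into a host and port, with 8448 as the fallback. It is a simplification, not what any real homeserver does verbatim: real resolution also involves DNS SRV lookups when no port is given, which this sketch skips. The function name is hypothetical.

```python
from __future__ import annotations

DEFAULT_FEDERATION_PORT = 8448


def parse_m_server(value: str) -> tuple[str, int]:
    """Split an m.server value into (host, port), falling back to 8448.

    Simplified: ignores the SRV lookup a real server performs when the
    port is missing.
    """
    if value.startswith("["):
        # Bracketed IPv6 literal, e.g. "[::1]:8448"
        host, _, rest = value.partition("]")
        host += "]"
        port = rest.lstrip(":")
    else:
        host, _, port = value.partition(":")
    return host, int(port) if port else DEFAULT_FEDERATION_PORT


print(parse_m_server("dujemihanovic.xyz:443"))  # explicit delegation to 443
print(parse_m_server("dujemihanovic.xyz"))      # no port given, so 8448
```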
Fixing the issue
Encouraged by this, I fixed up my server:
- allow port 8448 in ufw
- add something like this to Caddyfile:
    dujemihanovic.xyz:8448 {
        reverse_proxy /_matrix/* localhost:8008
    }
- change /.well-known/matrix/server to point to dujemihanovic.xyz:8448 (in theory, I could have gotten rid of that return directive altogether as 8448 is default anyway, but I still chose to specify it just to be safe)
- reload caddy and restart dendrite (the latter is, again, just to be safe)
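A quick way to sanity-check a setup like the one above is to read the delegation file and then ask the federation endpoint for its version. The sketch below is one possible way to do that, assuming the well-known file contains an m.server key with a plain hostname (no IPv6 literal) and ignoring SRV records. The function names and the injectable fetch parameter are my own; /_matrix/federation/v1/version is the version endpoint from the Matrix server-server API.

```python
import json
import urllib.request

DEFAULT_FEDERATION_PORT = "8448"


def fetch_json(url, timeout=10):
    """Fetch a URL over HTTPS and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def check_federation(server_name, fetch=fetch_json):
    """Follow the well-known delegation and query the federation version.

    Returns the URL that was queried and the decoded response.
    """
    well_known = fetch(f"https://{server_name}/.well-known/matrix/server")
    host, _, port = well_known["m.server"].partition(":")
    port = port or DEFAULT_FEDERATION_PORT
    version_url = f"https://{host}:{port}/_matrix/federation/v1/version"
    return version_url, fetch(version_url)
```

Calling check_federation("dujemihanovic.xyz") should then hit port 8448 and return whatever the homeserver reports about itself; an exception here would suggest the firewall rule, the Caddyfile block, or the well-known file needs another look.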
Once all this was done, the room finally started acting normally.
Small sidenote
I must note that delegating federation to 443 should not cause breakage like
this. It still did in my case, though, which is why I wrote about it anyway.
It’s very unlikely that you will be affected by this issue, but I still believe
it should be pointed out in case you are.