Back off signing key requests to dead servers
computer.surgery has a ton of logs that look like this:
2024-06-26T07:35:39.095620Z WARN grapevine::api::server_server: Could not send request, error: error sending request for url (https://mannuk.rocks:8448/_matrix/key/v2/server)
at src/api/server_server.rs:263
in grapevine::api::server_server::send_request with destination: "mannuk.rocks", url: https://mannuk.rocks:8448/_matrix/key/v2/server
in grapevine::service::sending::send_federation_request with destination: "mannuk.rocks"
in grapevine::service::rooms::event_handler::fetch_signing_keys with origin: "mannuk.rocks", query_via_trusted_servers: false, signature_ids: ["ed25519:cZ3Y1Rqh"]
in grapevine::api::ruma_wrapper::axum::ar_from_request
in grapevine::http_request with otel.name: "PUT /_matrix/federation/v1/send/:transaction_id", method: PUT, endpoint: /_matrix/federation/v1/send/:transaction_id
2024-06-26T07:35:39.095735Z INFO grapevine::service::rooms::event_handler: Returning stale keys, origin: mannuk.rocks
at src/service/rooms/event_handler.rs:2169
in grapevine::service::rooms::event_handler::fetch_signing_keys with origin: "mannuk.rocks", query_via_trusted_servers: false, signature_ids: ["ed25519:cZ3Y1Rqh"]
in grapevine::api::ruma_wrapper::axum::ar_from_request
in grapevine::http_request with otel.name: "PUT /_matrix/federation/v1/send/:transaction_id", method: PUT, endpoint: /_matrix/federation/v1/send/:transaction_id
These show up when we try to do something that needs signing keys from a dead server: we see that our cached keys are expired, try to fetch fresh keys from the origin, fail, and fall back to the stale keys. This happens every time we try to validate a signature from a dead server.
We should probably back off these requests after an error, and we should probably also have a global backoff mechanism for federation requests in general.
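As a starting point for discussion, here is a minimal sketch of what per-server exponential backoff could look like. Everything in it is hypothetical: `BackoffTracker`, its methods, and the base/max delays are made up for illustration and are not existing grapevine code. The idea is that `fetch_signing_keys` (or `send_federation_request`, for the global version) would check `should_attempt` before sending, and go straight to the stale keys if the server is still backing off.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical tracker of consecutive request failures per server name.
/// Not part of grapevine; names and structure are assumptions.
pub struct BackoffTracker {
    /// Delay after the first failure.
    base: Duration,
    /// Upper bound on the delay, no matter how many failures.
    max: Duration,
    /// Server name -> (consecutive failure count, time of last failure).
    failures: HashMap<String, (u32, Instant)>,
}

impl BackoffTracker {
    pub fn new(base: Duration, max: Duration) -> Self {
        Self { base, max, failures: HashMap::new() }
    }

    /// Delay for `failures` consecutive failures:
    /// base * 2^(failures - 1), capped at `max`.
    fn delay_for(&self, failures: u32) -> Duration {
        // Clamp the exponent so the shift below cannot overflow a u32.
        let exp = failures.saturating_sub(1).min(31);
        self.base
            .checked_mul(1u32 << exp)
            .map_or(self.max, |delay| delay.min(self.max))
    }

    /// Returns true if enough time has passed since the last failure
    /// (or if the server has no recorded failures).
    pub fn should_attempt(&self, server: &str, now: Instant) -> bool {
        match self.failures.get(server) {
            None => true,
            Some((count, last_failure)) => {
                now.saturating_duration_since(*last_failure) >= self.delay_for(*count)
            }
        }
    }

    /// Call when a request to `server` fails.
    pub fn record_failure(&mut self, server: &str, now: Instant) {
        let entry = self.failures.entry(server.to_owned()).or_insert((0, now));
        entry.0 = entry.0.saturating_add(1);
        entry.1 = now;
    }

    /// Call when a request to `server` succeeds; clears the backoff state.
    pub fn record_success(&mut self, server: &str) {
        self.failures.remove(server);
    }
}
```

Passing `now` in explicitly keeps the logic easy to test without sleeping. With a base of 60 seconds and a max of one hour, a dead server like mannuk.rocks would be retried after 1 minute, then 2, then 4, and so on up to once an hour, instead of on every incoming signature check. A real version would need to be shared across tasks (for example behind a lock) and might want jitter; both are left out here.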