spla
@jeff seems related to what's happening to mine:
My Mastodon instance was running great on CentOS 8, but I migrated it to a new server running Ubuntu 20.04 LTS.
I don't know why, but after several hours of running fine on the new server, Sidekiq/Redis starts throwing lines like this:
'WARN: Your Redis network connection is performing extremely poorly. Last RTT readings were [100632, 99845, 100015, 99969, 100037], ideally these should be < 1000.'
And the federated timeline freezes.
I have to restart Sidekiq and Redis to get back to normal.
It has happened three times in two days. For some unknown reason, Redis is performing really badly.

@Gargron
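
The readings in that warning can be checked mechanically rather than by eye. A minimal sketch, assuming the bracketed values are round-trip times in microseconds (my understanding of Sidekiq's health check, not something stated in this thread). The helper name `check_rtt` and the `SLOW`/`ok` labels are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print every RTT reading found in a Sidekiq warning
# line, labelled against a threshold (default 1000, the figure the warning
# itself names as the ideal upper bound).
check_rtt() (
  line="$1"
  threshold="${2:-1000}"
  # Keep only the contents of the [...] list, then turn commas into spaces.
  for reading in $(printf '%s\n' "$line" \
      | sed -n 's/.*\[\([0-9, ]*\)].*/\1/p' | tr ',' ' '); do
    if [ "$reading" -ge "$threshold" ]; then
      echo "$reading SLOW"
    else
      echo "$reading ok"
    fi
  done
)
```

Calling `check_rtt` with the quoted warning as its argument labels all five readings as slow. If Redis itself is suspected, `redis-cli --latency` on the same machine gives an independent measurement to compare against.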
Eugen Rochko

@spla @jeff It looks like Redis stops responding. Does the Redis process log anything about the matter? Any clues in RAM or CPU usage? I'm assuming it's on the same machine, so no actual network is involved.

jeff

@Gargron @spla I don't recall any warnings like that - but how do I check?

I'm running Ubuntu 18 with 6 vCPUs, 6 GB RAM, and 160 GB storage, for a single-user instance.

jeff

@Gargron @spla happened again, still can't find any errors. But apparently this Eugen guy is where it says Retry? And all 25 workers are busy? 🤔

Eugen Rochko

@jeff @spla Watch the dashboard and see if there's more failures than processed jobs, or if no jobs are being processed (e.g. all stuck)

jeff

@Gargron @spla failures graph/line is just flatlining at the bottom. But I can't click on the word to see failure specifics if they exist.

jeff

@Gargron @spla Not when everything stops; it's just all at the bottom.

Eugen Rochko

@jeff @spla Go on the Busy tab and look at the Started column, is it basically 25 jobs that have been going for a long time? What are they?

jeff

@Gargron @spla this is what's started. It's showing only 5 this time after a sidekiq restart 🤔

Eugen Rochko

@jeff @spla What do you mean after restart? Did you restart while investigating? This will make finding the cause more difficult. That being said, none of those jobs are supposed to take an hour, or more than like a minute, realistically.

You will want to send the Sidekiq process a TTIN signal to make it print backtraces of what it's currently executing

github.com/mperham/sidekiq/wik

jeff replied to Eugen

@Gargron @spla had to restart the service in order to see the responses here 😬 sorry I'm so awful at this. Do I use the kill command for the backtrace? How do I retrieve the pid? I only see jid & tid 😬😬

Eugen Rochko replied to jeff

@jeff @spla See the pid through `systemctl status mastodon-sidekiq`, execute `kill -TTIN pid`, view the backtrace dump with `journalctl -u mastodon-sidekiq --since="1 minute ago"`
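
The three steps above can be strung together so the PID does not have to be copied by hand. This is only a sketch: the unit name `mastodon-sidekiq` comes from the message above, while the function names `is_live_pid` and `dump_backtraces` and the two-second pause are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: ask a running Sidekiq service to log backtraces for its threads.
# UNIT is the service name used in the thread; override it if yours differs.
UNIT="${UNIT:-mastodon-sidekiq}"

# Hypothetical helper: succeeds only for a positive integer PID.
# systemd reports MainPID=0 when the service has no running main process.
is_live_pid() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -gt 0 ]
}

dump_backtraces() {
  # Read the main PID straight from systemd instead of parsing `status`.
  pid=$(systemctl show --property MainPID --value "$UNIT")
  if ! is_live_pid "$pid"; then
    echo "no running process found for $UNIT" >&2
    return 1
  fi
  kill -TTIN "$pid"   # TTIN makes Sidekiq log what each thread is executing
  sleep 2             # assumed pause so the dump reaches the journal
  journalctl -u "$UNIT" --since "1 minute ago" --no-pager
}
```

Run `dump_backtraces` while the jobs are still stuck, before any restart, and with enough privileges to signal the Sidekiq process (typically the service user or root).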

jeff replied to Eugen

@Gargron @spla ok this is what it says. but tbh I don't really understand anything. are you able to decipher any issues?

pastebin.com/jbQNFbt0

Eugen Rochko replied to jeff

@jeff @spla What are the TIDs of the jobs that are stuck for an hour? Search for those in the log output.

From a cursory glance I see some jobs are in the middle of executing ffmpeg conversions. But I don't know if those are the ones that are stuck since you restarted Sidekiq.

jeff replied to Eugen

@Gargron @spla the first row didn't even return any results. The 2nd row down tid search just came up with 1 line:

Apr 28 21:09:08 ubuntu bundle[13509]: 2021-04-29T01:09:08.212Z pid=13509 tid=58x WARN: Thread TID-2hth processor

Just doesn't make sense why something can be stuck without much of a description (now showing 6 stuck)
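
One possible reason a TID search returns a single line: in the sample above, the stuck thread appears as `TID-2hth` in the message text, while the `tid=58x` field belongs to the thread that wrote the dump, so the backtrace would sit on the lines that follow rather than on lines carrying the searched TID. That is an assumption about the dump layout, not something confirmed in this thread. A minimal sketch of a search that matches either form and can print trailing context (the helper name `lines_for_tid` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: filter log text on stdin for one Sidekiq thread ID.
#   $1 = TID as shown in the dashboard, e.g. 2hth
#   $2 = number of lines to print after each match (default 0)
lines_for_tid() (
  tid="$1"
  context="${2:-0}"
  # Match "tid=<TID>" or "TID-<TID>" as a whole token, so that a search
  # for "58" does not also hit "58x".
  pattern="(tid=|TID-)${tid}([^[:alnum:]]|\$)"
  if [ "$context" -gt 0 ]; then
    grep -A "$context" -E "$pattern"
  else
    grep -E "$pattern"
  fi
)
```

For example, `journalctl -u mastodon-sidekiq --since "5 minutes ago" --no-pager | lines_for_tid 2hth 30` would show the matching header line plus the 30 lines after it.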
