I’ve noticed for a week or so now that Sopuli has fairly frequent (almost daily) gateway timeouts. I guess the “web-facing” process crashes for some reason, but this got me wondering whether it’s a known problem with Lemmy or something else and, more importantly, whether there’s something we could help with?
Actually, I have noticed during those timeouts that some database process is using 100% of one CPU core. A database deadlock of some kind? RAM usage is then also nearly maxed out (quite a feat at 16 GB), so perhaps I should enable swap. I have to dig into this issue later this week. Luckily I installed an app that sends me a notification every time Sopuli is down.
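For anyone who wants a similar "is it down?" check without an app, here is a minimal Python sketch. It is not how the app mentioned above works (I don't know that). The names `classify` and `probe` are made up for illustration, and treating 502/503/504 as "gateway error" is my own assumption about what the reverse proxy returns when the backend is stuck.

```python
import urllib.error
import urllib.request

# Assumption: status codes a reverse proxy typically returns
# when the backend behind it is unhealthy or too slow.
GATEWAY_ERRORS = {502, 503, 504}


def classify(status):
    """Map an HTTP status code (or None for no response) to a coarse state."""
    if status is None:
        return "unreachable"
    if status in GATEWAY_ERRORS:
        return "gateway-error"
    if 200 <= status < 400:
        return "up"
    return "other-error"


def probe(url, timeout=10.0):
    """Fetch the URL once and return its coarse state."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is still available.
        return classify(err.code)
    except OSError:
        # Covers URLError, connection failures and timeouts.
        return "unreachable"
```

Running `probe(...)` against the instance's front page from a cron job every minute or so, and logging anything other than `"up"`, would at least give timestamps to line up against the database CPU spikes.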
While browsing All/New I stumbled upon some threads about instances getting hit by DDoS attacks, and about Lemmy having some really database-intensive operations which can make the whole instance run out of memory with a relatively small amount of traffic, if all of it hits those expensive queries. I hope these will get fixed in future releases, but they might be at least a partial cause of the problem.
Some database tuning might help, even considerably if the current settings are somehow incompatible with the load, but there’s still a limit to what that can achieve. I’m not an expert, but Lemmy’s own documentation refers to this configuration tuner and there are plenty of others around, so that might be worth looking into. Also, enabling the slow query log might give some info on what the root cause of these is.
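To give an idea of what such a tuner produces, here is a rough Python sketch. It assumes the database is PostgreSQL and uses common rules of thumb (about 25% of RAM for `shared_buffers`, about 75% for `effective_cache_size`). The `work_mem` formula and the 1000 ms slow-query threshold are my own conservative guesses, and `suggest_settings` and `render` are made-up names, so treat the output as a starting point and not as what any particular tuner would print.

```python
def suggest_settings(ram_gb, max_connections=100):
    """Rule-of-thumb PostgreSQL settings for a host with ram_gb of RAM."""
    ram_mb = int(ram_gb * 1024)
    shared_buffers = ram_mb // 4            # ~25% of RAM
    effective_cache_size = ram_mb * 3 // 4  # ~75% of RAM
    # Assumption: split the remaining memory across connections, with
    # headroom for queries that use several sort/hash steps at once.
    work_mem = max(4, (ram_mb - shared_buffers) // (max_connections * 3))
    return {
        "max_connections": str(max_connections),
        "shared_buffers": f"{shared_buffers}MB",
        "effective_cache_size": f"{effective_cache_size}MB",
        "work_mem": f"{work_mem}MB",
        # Log every statement that runs longer than this many milliseconds.
        "log_min_duration_statement": "1000",
    }


def render(settings):
    """Format the settings as postgresql.conf style 'name = value' lines."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


if __name__ == "__main__":
    print(render(suggest_settings(16)))
```

For a 16 GB box this suggests `shared_buffers = 4096MB` and `effective_cache_size = 12288MB`. The `log_min_duration_statement` line is the "slow query log" part: with it set, the slow statements show up in the regular PostgreSQL log.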
If the issue is with the database, I don’t think swapping will help: if the memory is full, “expanding” it with slow storage only makes the issue worse. In general at least some swap space is recommended tho, but for this kind of load I doubt it’ll help.
Now I remember! I used that tuner several months ago. I’ve now edited the DB config file a bit to better match the upgraded VPS specs. Let’s see how things go…