[rt-users] possibly OT: RT's FCGI server randomly fails, no log

Kenneth Marshall ktm at rice.edu
Tue Oct 11 13:59:19 EDT 2016


On Tue, Oct 11, 2016 at 12:55:13PM -0400, Alex Hall wrote:
> Hello list,
> This may be off-topic, but I'm serving RT with Nginx and FCGI. Randomly, it
> seems, the FCGI server is failing. Nginx works, but users see "error 502:
> bad gateway". I see the same in the logs, with connect() failing. All I
> have to do is run the spawn-fcgi command to get things back.
> 
> Why this is happening, with some frequency, is the question. My Nginx, RT,
> and system logs all show nothing, and to my knowledge, there are no FCGI
> logs at all. The first error for today in Nginx is when a client failed to
> connect after the server went down; there's nothing that says what the
> actual problem was. This happened Saturday, then again today.
> 
> The server has the latest updates for Debian 8.6, and has 4GB of ram. It's
> serving a few dozen users at most, so the load can't be the problem. I'm
> using Nginx 1.6.2 with four workers and 768 threads per worker. Users see
> nothing unusual before this happens, just a 502 instead of the page they
> expected.
> 
> If anyone else is using Nginx and has ever seen this, I'd love some input.
> As this could be considered off topic, feel free to respond directly to
> ahall at autodist.com. If I need to provide more details, please let me know.
> Thank you.
> 
> -- 
> Alex Hall
> Automatic Distributors, IT department
> ahall at autodist.com


Hi Alex,

You will get the 502 error when there are no more RT backends running. I
tracked down various errors in the RT logs that resulted in backend
exits. Most were of the 'cannot believe I did that' type, made by people
setting up the system, i.e. not really fixable with a distributed
management environment. We ended up using 'multiwatch' on RHEL6
and systemd on RHEL7 to keep an appropriate number of backends always
available.
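
To illustrate the idea of keeping a fixed number of backends alive, here
is a minimal Python sketch of a supervisor loop. It is not what
multiwatch or systemd do internally; the function name
keep_backends_alive and its parameters are made up for this example, and
the command you pass in stands in for whatever starts one RT backend. It
logs each exit status (useful when nothing else is logged) and stops
after max_restarts restarts only so that the example terminates. It does
not create or share the FastCGI listening socket, which a real setup
would still need.

```python
import subprocess
import sys
import time


def keep_backends_alive(cmd, workers=2, max_restarts=5, poll_interval=0.1):
    """Keep `workers` copies of `cmd` running and restart any that exit.

    Hypothetical helper for illustration. Returns the total number of
    processes spawned. A real supervisor would loop forever; this one
    stops after `max_restarts` restarts.
    """
    procs = [subprocess.Popen(cmd) for _ in range(workers)]
    spawned = workers
    restarts = 0

    while restarts < max_restarts:
        for i, proc in enumerate(procs):
            status = proc.poll()
            if status is None:
                continue  # this backend is still running

            # Record why the backend went away before replacing it.
            print(
                f"backend pid {proc.pid} exited with status {status}; restarting",
                file=sys.stderr,
            )
            procs[i] = subprocess.Popen(cmd)
            spawned += 1
            restarts += 1
            if restarts >= max_restarts:
                break
        time.sleep(poll_interval)

    # Clean up whatever is still running so no children are left behind.
    for proc in procs:
        if proc.poll() is None:
            proc.terminate()
        proc.wait()
    return spawned
```

For a quick trial, passing a command that exits immediately, such as
[sys.executable, "-c", "pass"], shows the restart and logging behaviour
without touching RT at all.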

Regards,
Ken


