[rt-users] Tables, database size, backups

Alex Howells alex at bytemark.co.uk
Wed Jan 30 07:04:31 EST 2008


Further to my previous email...

> Several issues for us here:
> 
>      *  noticing the problem has happened. Usually this is when a
>         customer complains, or we happen to be *looking* at whether
>         or not RT is sending mail for some other reason.

We are actually seeing a lot of error messages on user-crit via syslog,
generated by RT when it cannot send mail:

RT: <rt-3.6.1-5511-1201691639-297.49825-5-0 at bytemark.co.uk>Could not
send mail: Couldn\'t run /usr/lib/sendmail: Cannot allocate memory at
/home/rt/rt-3.6.1/lib/RT/Action/SendEmail.pm line 274. Stack:
[/home/rt/rt-3.6.1/lib/RT/Action/SendEmail.pm:274]
[/home/rt/rt-3.6.1/lib/RT/Action/SendEmail.pm:103]
[/home/rt/rt-3.6.1/lib/RT/ScripAction_Overlay.pm:240]
[/home/rt/rt-3.6.1/lib/RT/Scrip_Overlay.pm:506]
[/home/rt/rt-3.6.1/lib/RT/Scrips_Overlay.pm:193]
[/home/rt/rt-3.6.1/lib/RT/Transaction_Overlay.pm:179]
[/home/rt/rt-3.6.1/lib/RT/Record.pm:1446]
[/home/rt/rt-3.6.1/lib/RT/Ticket_Overlay.pm:2442]
[/home/rt/rt-3.6.1/lib/RT/Ticket_Overlay.pm:2356]
[/home/rt/rt-3.6.1/lib/RT/Interface/Web.pm:570]
[/home/rt/rt-3.6.1/share/html/Ticket/Display.html:140]
[/home/rt/rt-3.6.1/share/html/Ticket/Update.html:216]
[/home/rt/rt-3.6.1/share/html/autohandler:279]
(/home/rt/rt-3.6.1/lib/RT/Action/SendEmail.pm:289)

Now I can accept that's a valid response for OOM situations; the
"problem" is that we were out of memory at ~2am, and it takes a full
restart of Apache + FastCGI processes for mail to start going again.
From 01:52 through 11:47 no mail got sent by our RT :(
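
For the "noticing" part, a minimal sketch of a check that could be run
from cron over recent syslog lines. It only looks for the "Could not send
mail" text quoted above; the function names and the threshold are
hypothetical, and how the lines are collected and how the alert is
delivered are left out.

```python
# Sketch only: flags a batch of log lines if RT reported failed sends.
# FAILURE_MARKER is taken from the syslog message quoted above; the
# function names and default threshold are made up for illustration.
FAILURE_MARKER = "Could not send mail"


def count_send_failures(lines):
    """Return how many of the given log lines report a failed send."""
    return sum(1 for line in lines if FAILURE_MARKER in line)


def needs_alert(lines, threshold=1):
    """True when the number of failed sends reaches the threshold."""
    return count_send_failures(lines) >= threshold


recent = [
    "RT: <rt-3.6.1-5511-1201691639-297.49825-5-0 at bytemark.co.uk>"
    "Could not send mail: Couldn't run /usr/lib/sendmail: "
    "Cannot allocate memory",
    "RT: some unrelated message",
]
print(count_send_failures(recent), needs_alert(recent))
```

Run over the last few minutes of the loghost's output, something like
this would have flagged the outage shortly after 01:52 instead of when a
customer complained.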

>      *  once we have a vague timeframe when RT didn't send replies
>         it's not trivial to take 'existing' responses to tickets within
>         that window and dump them back into the "Send mail" queue.

According to our loghost, that stack trace was thrown ~177 times.
Granted, a lot of those will be comments to CCs and AdminCCs on a
queue, but it still means finding potentially 30-40 tickets which we've
responded to since 9am, may have resolved, and re-sending the message!
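
One way to narrow down the affected tickets is to pull them out of the
failed Message-IDs in the log. The sketch below assumes the ID is laid
out as rt-VERSION-PID-EPOCH-RANDOM.TICKET-SCRIP-SEQ, which is a guess
from the one example above (it would make 49825 the ticket); check that
against SendEmail.pm before relying on it. The second sample line is
invented for illustration.

```python
import re
from collections import Counter

# Assumed layout (not confirmed against the RT source):
#   <rt-VERSION-PID-EPOCH-RANDOM.TICKET-SCRIP-SEQ@ORGANIZATION>
# The list archive rewrites "@" as " at ", so both forms are accepted.
MSGID = re.compile(
    r"<rt-[\d.]+-\d+-(?P<epoch>\d+)-\d+\.(?P<ticket>\d+)-\d+-\d+(?:@| at )"
)


def failed_tickets(lines):
    """Count failed sends per (assumed) ticket id found in the log lines."""
    counts = Counter()
    for line in lines:
        if "Could not send mail" not in line:
            continue
        match = MSGID.search(line)
        if match:
            counts[int(match.group("ticket"))] += 1
    return counts


sample = [
    # First line is from the syslog output above; the second is made up.
    "RT: <rt-3.6.1-5511-1201691639-297.49825-5-0 at bytemark.co.uk>"
    "Could not send mail: Cannot allocate memory",
    "RT: <rt-3.6.1-5511-1201691700-12.49825-6-0 at bytemark.co.uk>"
    "Could not send mail: Cannot allocate memory",
    "RT: some unrelated message",
]
print(sorted(failed_tickets(sample).items()))
```

That collapses the ~177 log entries into a list of distinct tickets with
a count each, which is the list someone then has to work through by hand.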

We'll try to solve this with more regular restarts of RT, and throwing
more RAM into that server, but do you have any ideas?

Alex


