[rt-users] ExtractCustomFields template and dropping errant HTML tags

Lundberg, Emory emory-lundberg at uiowa.edu
Mon May 20 17:53:36 EDT 2013


Some additional information: I'm getting big blobs of text with markup in several custom fields from emails and forms that are submitted to my RT 4.0.10 instance:

> 2013-05-20 21:37:32 The RT System itself - FQDN you've completed this form!</h3></td></tr></tbody></table></div><div align="center"> </div></body></html> added 


I'm not sure what the most appropriate way to proceed is. I don't know whether I should adjust my CustomField template to ignore HTML tags (anything starting with '<'), or strip HTML at the MTA and force plain text on everybody.

I'm not using `Set($PreferRichText, 1);` because I don't know whether it would help with body parsing at all.

Anyone have an experience like this?

My Template for that field is written as:

> 	FQDN|Body|FQDN:*([^<].*+)||
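
To illustrate why that pattern still picks up markup, here is a minimal Python sketch, not RT code (RT is Perl, and the template syntax is only mimicked here). `[^<]` excludes '<' for a single character only, after which `.*+` consumes the rest of the line, tags included. The sample body, the `extract` helper and the `STRICTER` pattern are assumptions made up for the example; the greedy `.*` stands in for the possessive `.*+`, which captures the same text here because nothing follows it in the pattern.

```python
import re

# Greedy stand-in for the template's `[^<].*+`: one non-'<' character,
# then everything up to the end of the line.
ORIGINAL = re.compile(r"FQDN:*([^<].*)")

# Hypothetical alternative: every captured character must be neither '<'
# nor a line break, so the capture stops at the first tag.
STRICTER = re.compile(r"FQDN:\s*([^<\r\n]*)")


def extract(pattern, body):
    """Return the first captured value with surrounding whitespace removed, or None."""
    match = pattern.search(body)
    return match.group(1).strip() if match else None


# Made-up sample line resembling an HTML-formatted form submission.
sample = "<td>FQDN: host.example.edu</td></tr></table>"

print(extract(ORIGINAL, sample))
print(extract(STRICTER, sample))
```

With the first pattern the trailing `</td></tr></table>` ends up in the captured value; with the second, the capture ends just before the first '<'.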


//emory

Original message below:

On May 15, 2013, at 10:39 AM, "Lundberg, Emory" <emory-lundberg at uiowa.edu> wrote:

> I have a scrip to assign CustomFields based on a template and it often ends up collecting junk like HTML tags trailing after the data I want to match.
> 
> I think I have made my regex as specific as I can, but now I'm concerned that I went about this the wrong way.  I would love an opinion.
> 
> 
> 
> Emails that aren't human-generated typically have a block of data in them that includes data like:
> 
> Room:Y10A
> Building:ddd
> IP:172.16.2.2,fe80::250:43ff:fe00:ed31
> MAC:DE:CA:FB:AD:11:97
> Port:ddd-1 at 4/40
> 
> And sometimes they're handled by applications that generate them with HTML formatting, or are copy/pasted with HTML formatting, etc.
> 
> I have a CustomField called 'Building' and in my Template I have:
> 
> Building|Body|Building:*([^<]*+)\n||  
> 
> a) Is this ([^<]) necessary – or is there a preferred/better way to simply ignore all HTML on incoming mail before it gets handed off to rt-mailgate?
> b) Is there something about my Template use that is obviously wrong?
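
Option (a), removing the markup before any pattern runs, can be sketched in Python as follows. This is only an illustration under stated assumptions, not how RT or rt-mailgate processes mail: the `TextOnly` class, the `strip_html` and `extract_field` helpers, and the sample body are all hypothetical. It uses the standard library's `html.parser` to keep text nodes and turn a few block-level tags into line breaks, then matches `Name:value` at the start of a line.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text content only; block-level tags become line breaks."""

    BLOCK_TAGS = {"br", "p", "div", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(body):
    """Return the body with tags removed; plain text passes through unchanged."""
    parser = TextOnly()
    parser.feed(body)
    parser.close()
    return "".join(parser.parts)


def extract_field(name, body):
    """Find 'name:value' at the start of a line in the tag-free body."""
    pattern = rf"^{re.escape(name)}:[ \t]*(.*)$"
    match = re.search(pattern, strip_html(body), re.MULTILINE)
    return match.group(1).strip() if match else None


# Made-up sample: the same kind of data block, wrapped in HTML.
html_body = "<html><body><div>Room:Y10A</div><div>Building:ddd</div></body></html>"

print(extract_field("Building", html_body))
print(extract_field("Room", html_body))
```

Once the tags are gone, the field pattern no longer needs a `[^<]` guard, and values that themselves contain colons (such as the IP and MAC lines) are still captured whole because only the first colon after the field name is treated as the separator.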


