[nycphp-talk] Filtering input to be appended inside email

Mikko Rantalainen mikko.rantalainen at peda.net
Thu Sep 15 10:10:22 EDT 2005


Daniel Convissor wrote:
> On Thu, Sep 15, 2005 at 12:04:16PM +0300, Mikko Rantalainen wrote:
> 
>>Daniel Convissor wrote:
>>
>>>    $value = preg_replace("/[\r\n]+/", "\r\n ", trim($value));
>>
>>Yeah, that can be done in one call, but let's include the 'g' so 
>>that we are safe even if the input includes multiple lines of text. 
> 
> "g" isn't an official pattern modifier (aka "Internal option letter") 
> (http://www.php.net/manual/en/reference.pcre.pattern.syntax.php).  
> Perhaps you mean for it to be greedy, but PHP's preg is greedy by default.  
> The "U" modifier makes things un-greedy.

Yes, you're right, of course. I hate it when they make it look like 
Perl but don't actually copy the behavior. In Perl, the 'g' option 
makes a substitution replace *all* matches - by default, a Perl 
substitution replaces only the first occurrence.
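
To make the difference concrete, here is a minimal sketch of how 
preg_replace() behaves with and without its optional fourth argument 
(the limit). The sample strings and variable names are made up for 
illustration only.

```php
<?php
// preg_replace() replaces every match by default (limit = -1),
// which corresponds to Perl's s///g.
$all = preg_replace('/a/', 'b', 'aaa');

// Passing a limit of 1 replaces only the first match,
// which corresponds to Perl's s/// without the 'g' option.
$first = preg_replace('/a/', 'b', 'aaa', 1);

var_dump($all);   // string(3) "bbb"
var_dump($first); // string(3) "baa"
```

So no extra modifier is needed in PHP to get Perl's 'g' behavior.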

> The pattern I presented replaces any \r, \n or combination thereof in any 
> order and of any length.  So, since those ARE the characters that define 
> line breaks, there's no need for the multi-line modifier, "m".

As I wrote:
>> if input has "\r\n\r\n\r\n" then output should have "\r\n \r\n \r\n"

Your pattern replaces the above input with exactly one CRLF pair. If 
you just want to discard all line breaks, then it's fine to use that 
pattern with a space " " as the replacement. However, as I want to 
keep as much information as possible, I'm trying to keep all three 
line breaks in the output, so I cannot just match a sequence of 
"\r" and "\n" characters.
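
One way to keep every line break is to match a single line break 
at a time (CRLF, bare CR or bare LF) instead of a whole run of them. 
The following is only a sketch: the function name 
fold_each_line_break() and the sample input are hypothetical, and it 
assumes that a lone "\r" or "\n" should be normalized to CRLF.

```php
<?php
// Hypothetical helper: turn every individual line break into
// CRLF followed by one space, so no line break is lost.
function fold_each_line_break(string $value): string
{
    // Try "\r\n" first so a CRLF pair counts as one line break.
    return preg_replace("/\r\n|\r|\n/", "\r\n ", trim($value));
}

$input = "a\r\n\r\n\r\nb";

// The character-class pattern collapses the whole run into one break.
$collapsed = preg_replace("/[\r\n]+/", "\r\n ", trim($input));

// The alternation keeps all three breaks.
$kept = fold_each_line_break($input);

var_dump($collapsed === "a\r\n b");             // bool(true)
var_dump($kept === "a\r\n \r\n \r\n b");        // bool(true)
```

Note that trim() still strips line breaks at the very start and end 
of the value, in both variants.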

A correct implementation following the RFC would also make sure that 
no line exceeds 1000 characters (998 plus the CRLF), and it *should* 
wrap lines at a maximum of 78 characters.
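
As a rough sketch of such a check (it only reports problems, it does 
not do the wrapping), assuming the RFC meant here is RFC 2822 and 
that lengths are counted in bytes without the trailing CRLF. The 
function name and the returned status strings are hypothetical.

```php
<?php
// Hypothetical checker: classify text by its longest CRLF-separated line.
// Returns 'too_long'    if any line is over 998 bytes (hard limit),
//         'should_wrap' if any line is over 78 bytes (recommended limit),
//         'ok'          otherwise.
function line_length_status(string $text): string
{
    $status = 'ok';
    foreach (explode("\r\n", $text) as $line) {
        $length = strlen($line); // bytes, excluding the CRLF itself
        if ($length > 998) {
            return 'too_long';
        }
        if ($length > 78) {
            $status = 'should_wrap';
        }
    }
    return $status;
}

var_dump(line_length_status(str_repeat('a', 78)));  // string(2) "ok"
var_dump(line_length_status(str_repeat('a', 79)));  // string(11) "should_wrap"
var_dump(line_length_status(str_repeat('a', 999))); // string(8) "too_long"
```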

-- 
Mikko


