[nycphp-talk] A good PCRE expression for matching URLs
Michael B Allen
ioplex at gmail.com
Fri Jul 25 02:27:20 EDT 2008
On Thu, Jul 24, 2008 at 7:50 PM, Michael B Allen <ioplex at gmail.com> wrote:
> But it would be nice to exclude end-of-sentence punctuation from the
> capture output. I tried the following minimalistic expression just to
> get the trailing condition right, but I'm not able to distinguish
> between a dot that is part of the URL and a period at the end.
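Got it. I just needed to follow the URL body with a negated end-of-sentence
punctuation character class, so the match has to end on a character that is
neither punctuation nor whitespace.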
This seems to be handling all cases properly:
$expr = '([a-zA-Z0-9]{1,10}://[a-zA-Z0-9.-]+[\p{L}0-9"!#$%&\\()+,\\./:;=?\\@\\\\^_{}~-]*)[^,\\.?!:;"\'\\s]';
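A minimal sketch of how the expression above might be applied, not necessarily
how it is used in the original code. Two things here are assumptions. The
message does not say which delimiter or modifiers wrap $expr, so the sketch
picks "|" (which does not occur in the expression) and adds the "u" modifier so
that \p{L} is applied to UTF-8 text. The sample text is also made up. Note that
the negated class sits outside the parentheses, so group 1 stops one character
short of the URL; the sketch therefore reads the whole match in $matches[0].

```php
<?php
// Expression exactly as posted in the message.
$expr = '([a-zA-Z0-9]{1,10}://[a-zA-Z0-9.-]+[\p{L}0-9"!#$%&\\()+,\\./:;=?\\@\\\\^_{}~-]*)[^,\\.?!:;"\'\\s]';

// Assumption: "|" as the delimiter and the "u" (UTF-8) modifier.
// Neither is specified in the original message.
$pattern = '|' . $expr . '|u';

// Hypothetical sample text with URLs followed by sentence punctuation.
$text = 'See http://www.ioplex.com/. Docs: http://example.com/a.b?x=1, '
      . 'or ftp://example.org/pub!';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the whole matches, without the trailing ".", "," or "!":
//   http://www.ioplex.com/
//   http://example.com/a.b?x=1
//   ftp://example.org/pub
print_r($matches[0]);

// $matches[1] holds group 1, which ends one character before the whole
// match because the final negated class is outside the parentheses.
print_r($matches[1]);
```

If the URL is needed as a capture group rather than as the whole match, moving
the closing parenthesis to the very end of the expression should make group 1
equal to $matches[0].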
Thanks,
Mike
--
Michael B Allen
PHP Active Directory SPNEGO SSO
http://www.ioplex.com/