REGEX in PHP

PHP uses PCRE (Perl Compatible Regular Expressions). Perhaps the best way to learn regular expressions is to experiment with them in trial code. For this class exercise, the following web site will be used. The site provides tutorials, but unfortunately they are in French. Luckily, the tester has an English version.

REGEX Video Series

http://code.tutsplus.com/tutorials/regular-expressions-for-dummies-screencast-series--net-7887

RegExr

Additional Resources

Try It Out


How to find "the" and "these" followed by a word (to capture) in the following text:

The syntax for patterns used in these functions closely resembles Perl. The expression should be enclosed in the delimiters, a forward slash (/), for example. Any character can be used for delimiter as long as it's not alphanumeric or backslash (\). If the delimiter character has to be used in the expression itself, it needs to be escaped by backslash. Since PHP 4.0.4, you can also use Perl-style (), {}, [], and <> matching delimiters. See Pattern Syntax for detailed explanation.

http://www.occc.edu
anita.m.philipp@occc.edu
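
One possible way to approach the question above in PHP is sketched here. It assumes a shortened excerpt of the sample text, and the pattern is only one of several that would work; the variable names are illustrative.

```php
<?php
// Shortened excerpt of the sample text (an assumption, to keep the example small).
$text = 'The syntax for patterns used in these functions closely resembles Perl. '
      . 'The expression should be enclosed in the delimiters.';

// \b          word boundary, so "the" inside a longer word is skipped
// the(?:se)?  "the", optionally followed by "se" (non-capturing group)
// \s+         one or more white space characters
// (\w+)       the following word, captured as group 1
// i           case insensitive, so "The" matches as well
$pattern = '/\bthe(?:se)?\s+(\w+)/i';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches, $matches[1] the captured words.
print_r($matches[1]);
```

With this excerpt, the captured words should be syntax, functions, expression, and delimiters.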


Matching Characters

/capture/ the literal text capture
/./ any single character (except a newline)
/..../ any four characters
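
As a rough illustration, the patterns in this group could be tried with preg_match(), which returns 1 on a match and 0 otherwise. The test string and variable names are made up for this sketch.

```php
<?php
$text = 'How to capture a word';

// Literal text: matches if "capture" appears anywhere in the string.
$hasCapture = preg_match('/capture/', $text);

// Each dot matches exactly one character, so /..../ matches the
// first four characters found. $m[0] receives the matched text.
preg_match('/..../', $text, $m);
$firstFour = $m[0];

var_dump($hasCapture, $firstFour);
```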

Matching at the Beginning or End of a String

/^How/ How at the beginning of a string
/^The/i The at the beginning of a string, case Insensitive
/edu/ edu anywhere in the string
/edu$/ edu at the end of a string
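
A small sketch of how the anchors behave, using the URL from the sample text plus a few strings invented for the comparison:

```php
<?php
$url = 'http://www.occc.edu';

$startsWithHow = preg_match('/^How/', 'How to find the address'); // 1
$startsWithThe = preg_match('/^The/i', 'the end');                // 1, i ignores case
$endsWithEdu   = preg_match('/edu$/', $url);                      // 1
$eduAnywhere   = preg_match('/edu/', 'education');                // 1, no anchor
$eduAtEnd      = preg_match('/edu$/', 'education');               // 0, edu is not at the end

var_dump($startsWithHow, $startsWithThe, $endsWithEdu, $eduAnywhere, $eduAtEnd);
```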

Matching Special Characters

/\./ a literal period (.), not the "any single character" wildcard
/\{\}/ {}
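
The difference between an escaped and an unescaped period can be seen in a short sketch like the one below. The host name comes from the sample text; the deliberately wrong string is invented. preg_quote() is the built-in PHP function for escaping special characters automatically.

```php
<?php
$host = 'www.occc.edu';

// Unescaped dots match any character, so this wrong string still passes.
$loose = preg_match('/^www.occc.edu$/', 'wwwXocccYedu');

// Escaped dots accept only literal periods.
$strict = preg_match('/^www\.occc\.edu$/', 'wwwXocccYedu');

// preg_quote() escapes special characters; the second argument
// names the delimiter so that it is escaped too.
$pattern = '/^' . preg_quote($host, '/') . '$/';
$quoted  = preg_match($pattern, $host);

// Escaped braces match a literal {}.
$braces = preg_match('/\{\}/', 'Perl-style (), {}, []');

var_dump($loose, $strict, $pattern, $quoted, $braces);
```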

Specifying Quantity

/any?/i an or any - y is optional, case insensitive
/l+/i l (letter L) one or more times, case insensitive
/c{3}/ c repeated exactly 3 times
/s{2,3}/ s at least 2 but no more than 3 times
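
To see what each quantifier actually grabs, a sketch along these lines may help. firstMatch() is a hypothetical helper written for this example, not a PHP built-in, and the test strings are invented.

```php
<?php
// Hypothetical helper: returns the first matched text, or null if no match.
function firstMatch(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

$r1 = firstMatch('/any?/i', 'Any character can be used'); // y present
$r2 = firstMatch('/any?/i', 'an apple');                  // y absent
$r3 = firstMatch('/l+/i', 'Hello');                       // both l characters
$r4 = firstMatch('/c{3}/', 'www.occc.edu');               // exactly three c
$r5 = firstMatch('/s{2,3}/', 'expression');               // two s
$r6 = firstMatch('/s{2,3}/', 'hissss');                   // stops at three s
$r7 = firstMatch('/c{3}/', 'occ');                        // only two c, no match

var_dump($r1, $r2, $r3, $r4, $r5, $r6, $r7);
```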

Specifying Subexpressions

/the(se)*/ the followed by se 0+ times
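/@(.+)(.+)*(\.edu)/ @, any characters 1+ times, a second subexpression of any characters 1+ times that may itself repeat 0+ times, then a literal .edu (the period must be escaped, otherwise it matches any character)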
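
Subexpressions are also what preg_match() reports back as captured groups. The sketch below uses the e-mail address from the sample text and, as an assumption made to keep the output easy to read, a simplified two-group version of the e-mail pattern.

```php
<?php
// Group 1 captures "se" when it is present.
preg_match('/the(se)*/', 'in these functions', $m1);
$full1  = $m1[0];
$group1 = $m1[1];

// When the group does not take part in the match, PHP leaves it out
// of the result array, so only the full match is present.
preg_match('/the(se)*/', 'the end', $m2);

// Simplified e-mail pattern (an assumption for this sketch):
// group 1 is the text after @, group 2 is the literal ".edu".
preg_match('/@(.+)(\.edu)/', 'anita.m.philipp@occc.edu', $m3);

print_r($m1);
print_r($m2);
print_r($m3);
```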

Regular Expression Patterns

/[0-9]/ any single digit
/P[a-z]/ P followed by any lowercase letter
/use[^d]/ use followed by any character other than d (so not used)
/[L-Pa-b0-1]/ any one of L through P (L, M, N, O, P), a, b, 0, or 1
/\s/ any white space character
/\d/ any digit
/\w/ any letter, number or underscore
/[A-Z][A-Za-z0-9].*\./ capital letter, then one letter or digit, then any characters, ending with a .
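
A few of the character classes above can be checked with a sketch like this one. firstMatch() is a hypothetical helper defined for the example, and the test strings are invented.

```php
<?php
// Hypothetical helper: returns the first matched text, or null if no match.
function firstMatch(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

$digit    = firstMatch('/[0-9]/', 'PHP 4.0.4');          // first digit
$pLower   = firstMatch('/P[a-z]/', 'PHP uses Perl');     // skips "PH" and "P "
$notUsed  = firstMatch('/use[^d]/', 'used, uses');       // skips "used"
$range    = firstMatch('/[L-Pa-b0-1]/', 'XYZ N');        // N lies inside L-P
$sentence = firstMatch('/[A-Z][A-Za-z0-9].*\./', 'see Pattern Syntax for details. ok');

// preg_match_all() returns how many times the pattern matched.
$digitCount = preg_match_all('/\d/', 'PHP 4.0.4');

var_dump($digit, $pLower, $notUsed, $range, $sentence, $digitCount);
```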

Multiple Pattern Choices

/www|edu/ www or edu
/(w{3})|(c{3})/ www or ccc
/(\{\})|(\[\])/ {} or []
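
The alternatives can be tried against pieces of the sample text with preg_match_all(), which collects every match rather than just the first. The variable names are illustrative.

```php
<?php
$url = 'http://www.occc.edu';

// Either alternative may match; the matches come back in order of appearance.
preg_match_all('/www|edu/', $url, $a);

// Grouped alternatives: three w characters or three c characters.
preg_match_all('/(w{3})|(c{3})/', $url, $b);

// Escaped braces or escaped brackets.
preg_match_all('/(\{\})|(\[\])/', 'Perl-style (), {}, [], and <>', $c);

print_r($a[0]);
print_r($b[0]);
print_r($c[0]);
```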

Pattern Modifiers

/the/i the, case insensitive
/^h/im h at the start of any line, case insensitive (m makes ^ match after every newline, not only at the start of the string)
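
The effect of the m modifier is easiest to see by counting matches with and without it. The three-line string below is invented for this sketch.

```php
<?php
// Double quotes so that \n becomes a real newline.
$lines = "How\nhere\nThe";

// Without m, ^ matches only at the very start of the string.
$withoutM = preg_match_all('/^h/i', $lines);

// With m, ^ also matches right after each newline.
$withM = preg_match_all('/^h/im', $lines);

// i alone: "The" and the "the" inside "these" both count.
$theCount = preg_match_all('/the/i', 'The syntax used in these functions');

var_dump($withoutM, $withM, $theCount);
```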

Now it's your turn to write a few regular expressions. Answers are at the bottom of the page.

  1. String that begins with The followed by any number of any characters (consider the newline character)
  2. Any instance of double vowels upper or lowercase
  3. Any case of double consonants upper or lowercase
  4. Secure URL beginning with www (http://www or https://www)
  5. Any white space character followed by an uppercase letter
  6. Expression to match the following: Perl-style (), {}, [], and <>
  7. Email address that starts with any number of any characters but at least one, followed by @, followed by any number of characters but at least one, followed by a period, followed by either 3 or 4 letters
  8. Five or nine digit zip code in the format ##### or #####-####
  9. Course numbers such as ENGL2113, MATH1513, CS2623, CAT1513

Answers (There is probably more than one correct answer, but this should help.):

  1. /^The.*/s (^ anchors the start of the string; s lets . match newline characters)
  2. /[aeiou]{2}/i
  3. /[b-df-hj-np-tv-z]{2}/i
  4. /https?:\/\/www/
  5. /\s[A-Z]/
  6. /Perl\-style \(\)\, \{\}\, \[\]\, and \<\>/
  7. /.+\@.+\.[A-Za-z]{3,4}/
  8. /\d{5}(\-\d{4})?/
  9. /[A-Z]{2,4}[0-9]{4}/ -- be sure you know how to do variations of this.
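
One such variation is to anchor a pattern with ^ and $ so that it must match the whole string, which is useful for validating input. The sketch below applies that idea to a zip code pattern and a course number pattern along the lines of the answers above. The function names and the invalid test values are made up for this example.

```php
<?php
// Hypothetical validator: five digits, optionally followed by -####.
function isZip(string $s): bool
{
    return preg_match('/^\d{5}(-\d{4})?$/', $s) === 1;
}

// Hypothetical validator: 2 to 4 uppercase letters, then exactly 4 digits.
function isCourse(string $s): bool
{
    return preg_match('/^[A-Z]{2,4}[0-9]{4}$/', $s) === 1;
}

foreach (['73159', '73159-1234', '7315', '73159-12'] as $zip) {
    echo $zip, ': ', isZip($zip) ? 'valid' : 'invalid', "\n";
}

foreach (['ENGL2113', 'MATH1513', 'CS2623', 'CAT1513', 'C1234'] as $course) {
    echo $course, ': ', isCourse($course) ? 'valid' : 'invalid', "\n";
}
```

Without the anchors, a string such as 7315912345 would still match the zip code pattern, because the first five digits alone satisfy it.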