Famous Perl One-Liners Explained, Part VII: Handy Regular Expressions - good coders code, great reuse
Posted: 10 Nov 2011 03:43 AM PST
Famous Perl one-liners is my attempt to create "perl1line.txt", a file similar to the "awk1line.txt" and "sed1line.txt" files that have been so popular among Awk and Sed programmers and Unix sysadmins. I will release perl1line.txt in the next part of the series. The article on famous Perl one-liners consists of nine parts.
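After I am done with the next part of the article, I will release the whole article series as a pdf e-book! Please subscribe to my blog to be the first to get it!

And here are today's one-liners:

109. Check if the string looks like an email.

    /.+@.+\..+/

This regex makes sure that the string looks like an email address. Notice that I say "looks like". It doesn't guarantee that it is an email address. Here is how it works - first it matches something up to the @ symbol, then something up to a dot, and then something after the dot. For example, user@example.com matches, but user@example doesn't, because there is no dot after the @.

110. Check if the string is a number.

    /^\d+$/

This regex matches one or more digits (\d+), starting at the beginning of the string (^) and ending at the end of the string ($). For example, it matches 123 but not 12.3.

How about hexadecimal numbers? Here is how:

    /^0x[0-9a-f]+$/i

This matches the hex prefix 0x followed by one or more hex digits. The /i flag makes the match case insensitive, so 0x5af and 0X5AF both match.

Now how about octal? Here is how:

    /^0[0-7]+$/

Octal numbers are prefixed by 0, which is followed by one or more octal digits 0-7.

Finally binary:

    /^[01]+$/

Binary base consists of just 0s and 1s.

111. Check if a word appears twice in the string.

    /(word).*\1/

This regex matches the word, followed by something or nothing at all, followed by the same word again. Here (word) captures the word in group 1, and \1 refers back to what group 1 captured.

112. Increase all numbers by one in the string.

    $str =~ s/(\d+)/$1+1/ge

Here we use the substitution operator s///. It matches all integers (\d+), puts them in capture group 1, and replaces each of them with its value incremented by one ($1+1). The g flag makes it find all the numbers in the string, and the e flag evaluates $1+1 as a Perl expression. For example, "this 1234 is awesome 444" gets turned into "this 1235 is awesome 445".

113. Extract HTTP User-Agent string from the HTTP headers.

    /^User-Agent: (.+)$/

HTTP headers are formatted as "Key: Value" pairs, so this regex captures everything that follows "User-Agent: " on the line in group 1. For example, if the HTTP headers contain,

    Host: localhost:8000
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_0_0; en-US)
    Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

Then the regular expression, applied line by line (or to the whole header block with the /m flag), will extract the "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_0_0; en-US)" string.

114. Match something that looks like an IP address.

    /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/

This regex doesn't guarantee that the thing that got matched is in fact a valid IP. All it does is match something that looks like an IP: four numbers of one to three digits each, separated by dots. For example, it matches a valid IP such as 81.198.240.140, and it also matches an invalid one such as 923.844.1.999.

Here is how it works. The ^ at the beginning anchors the match at the start of the string, \d{1,3} matches one to three consecutive digits, \. matches a literal dot, and the $ at the end anchors the match at the end of the string.

This regex can be simplified by grouping the first three repeated \d{1,3}\. expressions:

    /^(\d{1,3}\.){3}\d{1,3}$/

115. Match printable ASCII characters.

    /[ -~]/

This is really tricky and smart. To understand it, take a look at the ASCII table. The space is the first printable character and the tilde ~ is the last one, so the range from space to ~ covers all the printable ASCII characters.

116. Match unprintable ASCII characters.

    /[^ -~]/

Here we invert the previous regular expression. Placing ^ as the first character in the character class negates it, so the class matches every character that is not in the range from space to ~.

117. Match text between two HTML tags.

    m|<strong>([^<]*)</strong>|

This regex matches everything between <strong> and </strong>. The ([^<]*) part matches as many characters as possible up to the next <, which has to start the closing tag.

Alternatively you can write:

    m|<strong>(.*?)</strong>|

But this is a little different. For example, if the HTML is <strong><em>hello</em></strong>, then the first regex doesn't match at all, because <strong> is immediately followed by <em> rather than by </strong>. The second regex matches and captures <em>hello</em>, because (.*?) matches as little as possible of anything, including tags, until it finds </strong>.

However don't use regular expressions for matching and parsing HTML. Use modules like HTML::TreeBuilder to accomplish the task more cleanly.

118. Extract all matches from a regular expression.

    my @matches = $text =~ /regex/g;

Here the regular expression gets evaluated in list context, which makes it return all the matches. The matches get put in the @matches variable.

For example, the following regex extracts all numbers from a string:

    my $t = "10 hello 25 moo 31 foo";
    my @nums = $t =~ /\d+/g;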
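Several of the regexes above are easy to try out in a small script. The following is a minimal sketch, not part of the original one-liners: the sample strings and variable names such as %is_number, $no_nested and $non_greedy are made up for illustration, and the hex pattern is written with a + so that it accepts more than one hex digit.

```perl
use strict;
use warnings;

# 110. Numbers in different bases, kept as precompiled regexes.
my %is_number = (
    decimal => qr/^\d+$/,
    hex     => qr/^0x[0-9a-f]+$/i,
    octal   => qr/^0[0-7]+$/,
    binary  => qr/^[01]+$/,
);

# 112. Increase all numbers by one. /e evaluates $1+1 as Perl code,
# /g repeats the substitution for every number in the string.
my $str = "this 1234 is awesome 444";
$str =~ s/(\d+)/$1+1/ge;

# 117. The two <strong> regexes differ when another tag is nested inside.
# A failed match in list context returns an empty list, leaving undef.
my $html = "<strong><em>hello</em></strong>";
my ($no_nested)  = $html =~ m|<strong>([^<]*)</strong>|;
my ($non_greedy) = $html =~ m|<strong>(.*?)</strong>|;

# 118. A /g match in list context returns all the matches.
my $t = "10 hello 25 moo 31 foo";
my @nums = $t =~ /\d+/g;

print "0x5AF looks like hex\n" if "0x5AF" =~ $is_number{hex};
print "$str\n";
print "no nested tags: ", (defined($no_nested) ? $no_nested : "(no match)"), "\n";
print "non-greedy: $non_greedy\n";
print "@nums\n";
```

Assigning a match to a parenthesized variable, as in my ($no_nested) = ..., puts the match in list context, so the variable receives capture group 1 or stays undefined when nothing matches.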
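119. Test if a number is in range 0-255.

    /^([0-9]|[0-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/

Here is how it works. A number can either be one digit, two digit or three digit. If it's a one digit number then we allow it to be anything [0-9]. If it's a two digit number, we also allow any combination [0-9][0-9]. If it's a three digit number, it has to be either 100-199, which is 1[0-9][0-9], or 200-249, which is 2[0-4][0-9], or 250-255, which is 25[0-5].

120. Replace all <b> tags with <strong>

    $html =~ s|<(/)?b>|<$1strong>|g

Here I assume that the HTML is in variable $html. The <(/)?b> part matches both the opening <b> and the closing </b> tags and captures the optional slash in group 1. The replacement then puts the slash, if there was one, back in front of strong.

Have Fun! Thanks for reading the article! In the next part I am releasing the perl1line.txt that will contain all the one-liners in a single file.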
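As a postscript to one-liners 119 and 120: a range check written as a regex is easy to get subtly wrong, so it is worth testing it against plain numeric comparison. The following is a minimal sketch, assuming the intended range is 0 through 255 inclusive. The names $in_range and @mismatches and the sample HTML are made up for illustration, and the tag replacement uses (/?) as a small variation on the one-liner.

```perl
use strict;
use warnings;

# One way to write the 0-255 test: one digit, two digits, 100-199,
# 200-249, or 250-255.
my $in_range = qr/^([0-9]|[0-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/;

# Collect every number in 0..999 where the regex and a numeric
# comparison disagree. An empty list means they agree everywhere.
my @mismatches = grep {
    my $by_regex = ($_ =~ $in_range) ? 1 : 0;
    my $by_value = ($_ <= 255)       ? 1 : 0;
    $by_regex != $by_value;
} 0 .. 999;

my $verdict = @mismatches
    ? "mismatches: @mismatches"
    : "regex agrees with the numeric check for 0..999";
print "$verdict\n";

# 120, written with (/?) instead of (/)? so that $1 is an empty string
# rather than undef for opening tags, which should keep "use warnings" quiet.
my $html = "<b>bold</b> and <b>more</b>";
$html =~ s|<(/?)b>|<$1strong>|g;
print "$html\n";
```

The exhaustive loop is cheap here because the input space is tiny. The same idea works for any regex that is meant to encode a numeric range.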