How to Match IPv4 Addresses with Regular Expressions

Posted by Jan Goyvaerts on Sep 30 2009 01:11 PM

If you want to check whether a certain string represents a valid IPv4 address in 255.255.255.255 notation, try one of these examples from Regular Expressions Cookbook:

Simple regex to check for an IP address:

^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$
Regex options: None
Regex flavors: .NET, Java, Javascript, PCRE, Perl, Python, Ruby

Accurate regex to check for an IP address:

^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
Regex options: None
Regex flavors: .NET, Java, Javascript, PCRE, Perl, Python, Ruby

Simple regex to extract IP addresses from longer text:

\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b
Regex options: None
Regex flavors: .NET, Java, Javascript, PCRE, Perl, Python, Ruby

Accurate regex to extract IP addresses from longer text:

\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
Regex options: None
Regex flavors: .NET, Java, Javascript, PCRE, Perl, Python, Ruby

Simple regex that captures the four parts of the IP address:

^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$
Regex options: None
Regex flavors: .NET, Java, Javascript, PCRE, Perl, Python, Ruby

Accurate regex that captures the four parts of the IP address:

^(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
Regex options: None
Regex flavors: .NET, Java, Javascript, PCRE, Perl, Python, Ruby

Perl

if ($subject =~ m/^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$/)
{
    # Combine the four captured numbers into a single 32-bit integer.
    $ip = $1 << 24 | $2 << 16 | $3 << 8 | $4;
}

A version 4 IP address is usually written in the form 255.255.255.255, where each of the four numbers must be between 0 and 255. Matching such IP addresses with a regular expression is very straightforward.

In the solution we present six regular expressions. Three of them are billed as “simple,” while the other three are marked “accurate.”

The simple regexes use [0-9]{1,3} to match each of the four blocks of digits in the IP address. These actually allow numbers from 0 to 999 rather than 0 to 255. The simple regexes are more efficient when you already know your input will contain only valid IP addresses, and you only need to separate the IP addresses from the surrounding text.
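To make the trade-off concrete, here is a minimal Python sketch of the simple validation regex. The name SIMPLE is only an illustrative assumption; the pattern itself is the one listed in the solution.

```python
import re

# The simple validation regex from the solution (SIMPLE is a hypothetical name).
SIMPLE = re.compile(r"^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$")

# A well-formed address matches, as expected.
accepts_valid = SIMPLE.match("192.168.0.1") is not None

# Each block may be anything from 0 to 999, so this matches as well.
accepts_out_of_range = SIMPLE.match("999.999.999.999") is not None

# Only three blocks: the shape is wrong, so there is no match.
rejects_malformed = SIMPLE.match("1.2.3") is None

print(accepts_valid, accepts_out_of_range, rejects_malformed)
```

The pattern checks only the shape of the string, which is why it is best reserved for input that is already known to hold valid addresses.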

The accurate regexes use 25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? to match each of the four numbers in the IP address. This regex accurately matches a number in the range 0 to 255, with one optional leading zero for numbers between 10 and 99, and two optional leading zeros for numbers between 0 and 9. 25[0-5] matches 250 through 255, 2[0-4][0-9] matches 200 to 249, and [01]?[0-9][0-9]? takes care of 0 to 199, including the optional leading zeros.
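One way to convince yourself of the range is to test the alternation against every candidate number. The sketch below does that in Python; OCTET is an illustrative name, and fullmatch is used so that the whole string has to be consumed.

```python
import re

# The per-number alternation from the accurate regexes (OCTET is a hypothetical name).
OCTET = re.compile(r"25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?")

# Every number from 0 to 255, written without leading zeros, should match.
all_in_range = all(OCTET.fullmatch(str(n)) for n in range(256))

# Nothing from 256 to 999 should match.
none_above_range = not any(OCTET.fullmatch(str(n)) for n in range(256, 1000))

# Optional leading zeros are allowed, up to three digits in total.
leading_zeros_ok = all(OCTET.fullmatch(s) for s in ("7", "07", "007", "042"))
four_digits_rejected = OCTET.fullmatch("0007") is None

print(all_in_range, none_above_range, leading_zeros_ok, four_digits_rejected)
```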

If you want to check whether a string is a valid IP address in its entirety, use one of the regexes that begin with a caret and end with a dollar. These are the start-of-string and end-of-string anchors. If you want to find IP addresses within longer text, use one of the regexes that begin and end with the word boundaries \b.
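The difference between the two kinds of anchors shows up as soon as the address is embedded in other text. The following Python sketch builds both accurate regexes from a shared piece; the names OCTET, VALIDATE, and EXTRACT and the sample log line are assumptions made up for illustration.

```python
import re

# Shared building block: one number in the range 0 to 255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Anchored with ^ and $: the whole string must be an IP address.
VALIDATE = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")

# Delimited with \b: finds IP addresses inside longer text.
EXTRACT = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")

log_line = "Accepted connection from 10.0.0.5, forwarded to 192.168.1.20:8080"

whole_line_is_ip = VALIDATE.match(log_line) is not None
found = EXTRACT.findall(log_line)

print(whole_line_is_ip)  # False
print(found)             # ['10.0.0.5', '192.168.1.20']
```

Because every group in these patterns is noncapturing, findall returns the full text of each match.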

The first four regular expressions use the form (?:number\.){3}number. The first three numbers in the IP address are matched by a noncapturing group that is repeated three times. The group matches a number and a literal dot, of which there are three in an IP address. The last part of the regex matches the final number in the IP address. Using the noncapturing group and repeating it three times makes our regular expression shorter and more efficient.

To convert the textual representation of the IP address into an integer, we need to capture the four numbers separately. The last two regexes in the solution do this. Instead of using the trick of repeating a group three times, they have four capturing groups, one for each number. Spelling things out this way is the only way we can separately capture all four numbers in the IP address.
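The reason the repeated group cannot be reused for capturing is that a capturing group inside a repetition keeps only the text from its last iteration. The Python sketch below illustrates this with the simple patterns; the variable names are assumptions for the example.

```python
import re

address = "192.168.0.1"

# A capturing group placed inside the repeated group is overwritten on each pass.
repeated = re.match(r"^(?:([0-9]{1,3})\.){3}([0-9]{1,3})$", address)
repeated_parts = repeated.groups()

# Four separate capturing groups keep all four numbers.
spelled_out = re.match(
    r"^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$", address
)
spelled_out_parts = spelled_out.groups()

print(repeated_parts)     # ('0', '1')
print(spelled_out_parts)  # ('192', '168', '0', '1')
```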

Once we’ve captured the numbers, combining them into a 32-bit number is easy. In Perl, the special variables $1, $2, $3, and $4 hold the text matched by the four capturing groups in the regular expression, and these strings are automatically coerced into numbers when we apply the bitwise left shift operator (<<) to them. In other languages, you may have to call String.toInteger() or something similar before you can shift the numbers and combine them with a bitwise or.
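As an illustration of the explicit conversion step in another language, here is a minimal Python sketch of the same idea using the accurate capturing regex. The helper name ip_to_int and the constants are hypothetical; Python's built-in int() does the string-to-number conversion.

```python
import re

# Capturing version of the 0-to-255 alternation from the accurate regexes.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Four capturing groups joined by literal dots, anchored at both ends.
IP_PARTS = re.compile(r"^" + r"\.".join([OCTET] * 4) + r"$")


def ip_to_int(text):
    """Hypothetical helper: return the address as a 32-bit integer, or None."""
    match = IP_PARTS.match(text)
    if match is None:
        return None
    # Python does not coerce strings when shifting, so convert explicitly.
    a, b, c, d = (int(part) for part in match.groups())
    return a << 24 | b << 16 | c << 8 | d


print(ip_to_int("192.168.0.1"))  # 3232235521
print(ip_to_int("256.1.1.1"))    # None
```

Using the accurate regex here keeps each number within 8 bits, so the shifted values cannot overlap when they are combined.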
