preg_match_all php function & how to filter out div, a links, img or other elements

This article was written on August 16, 2014, and may not be posted on other sites!

preg_match_all php function + RegEx & how to filter out div, a link, img or any other elements.
The PHP function preg_match_all() combined with a regular expression (RegEx) is very useful when you need to filter out (that is, extract) elements from HTML content: for example all links <a>, all images <img />, all divs <div> or any other element, or even elements with a specific class or ID.

 


How does it work?

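By using RegEx (regular expressions) and the PHP function preg_match_all() we can filter out whatever we like. The RegEx is the pattern describing what we want to filter out, and looks something like this:
/<h1\b[^>]*>(.*)<\/h1>/isU
Here i makes the match case-insensitive, s lets the dot also match line breaks, and U makes the quantifiers ungreedy so that each match stops at the first closing </h1>. preg_match_all() is the function that collects all the matches. Let’s go!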

 

 

Filter out specific elements using preg_match_all php function & RegEx

To filter out specific elements from content, such as all links, images or divs, see the code examples below. Before starting, put the HTML code to be filtered in the variable $incoming_data. The outcome is gathered in the $result variable as an array: $result[0] holds the complete matches, and $result[1] holds whatever the first pair of parentheses in the pattern captured.
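
To make the shape of $result concrete, here is a minimal sketch. The sample HTML is made up for illustration, and the pattern is a heading pattern whose opening tag is closed with [^>]*>, so that the capture group contains only the heading text.

```php
<?php
// Hypothetical sample input; in the article this is the HTML to be filtered.
$incoming_data = '<h1 class="title">First</h1><p>Text</p><h1>Second</h1>';

// i = case-insensitive, s = dot matches newlines, U = ungreedy quantifiers.
$pattern = '/<h1\b[^>]*>(.*)<\/h1>/isU';
$count = preg_match_all($pattern, $incoming_data, $result);

// $result[0] lists the complete matches, $result[1] the first capture group.
print_r($result[0]); // <h1 class="title">First</h1> and <h1>Second</h1>
print_r($result[1]); // First and Second
```

The return value $count is the number of complete matches, which is 2 for this input.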

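Filter out all h1 headings
// [^>]*> skips the attributes of the opening tag, so $result[1] holds only the heading content
$headings = '/<h1\b[^>]*>(.*)<\/h1>/isU';
preg_match_all($headings, $incoming_data, $result);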

Filter out all links (<a> elements)
// \b stops <a from also matching other tags such as <abbr> or <article>
$links = '/<a\b[^>]*>(.*)<\/a>/isU';
preg_match_all($links, $incoming_data, $result);
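
Often you want the link target rather than the whole tag. The sketch below is one possible way to do that, assuming every href value is wrapped in double quotes. The sample HTML and the group names href and text are made up for illustration.

```php
<?php
// Hypothetical sample input.
$incoming_data = '<p><a href="/home">Home</a> and '
    . '<a class="ext" href="https://example.com/">Example</a></p>';

// Named groups: "href" is the attribute value, "text" is the link content.
$links = '/<a\b[^>]*\bhref="(?<href>[^"]*)"[^>]*>(?<text>.*?)<\/a>/is';

// PREG_SET_ORDER groups the result per match instead of per capture group.
preg_match_all($links, $incoming_data, $result, PREG_SET_ORDER);

foreach ($result as $match) {
    // strip_tags() removes any markup nested inside the link text.
    echo $match['href'] . ' => ' . strip_tags($match['text']) . "\n";
}
```

With PREG_SET_ORDER each entry of $result is one complete link, which tends to be easier to loop over than the default layout.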

Filter out all images
// There are no parentheses in this pattern, so the complete <img> tags end up in $result[0]
$images = '/<img[^>]+>/i';
preg_match_all($images, $incoming_data, $result);
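
If you only need the image addresses, a capture group around the src value can help. This is a rough sketch with made-up sample HTML. It assumes the src value is quoted with either single or double quotes.

```php
<?php
// Hypothetical sample input (nowdoc, so no quotes need escaping).
$incoming_data = <<<'HTML'
<img src="a.png" alt="A">
<img class='x' src='b.jpg' />
HTML;

// Group 1 remembers which quote character opened the value,
// \1 requires the same character to close it, group 2 is the value itself.
// Note: \bsrc would also match attributes such as data-src.
$images = '/<img\b[^>]*\bsrc=(["\'])(.*?)\1[^>]*>/i';
preg_match_all($images, $incoming_data, $result);

print_r($result[2]); // a.png and b.jpg
```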

Filter out divs
Note that this only works if you don’t have divs nested inside each other, because each match ends at the first closing </div>.
$divs = '/<div\b[^>]*>(.*)<\/div>/isU';
preg_match_all($divs, $incoming_data, $result);
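
The nesting limitation is easy to see with a tiny made-up input. In this sketch the match is cut off at the first closing </div>, so the end of the outer div is lost.

```php
<?php
// Hypothetical input with one div nested inside another.
$incoming_data = '<div>outer <div>inner</div> tail</div>';

$divs = '/<div\b[^>]*>(.*)<\/div>/isU';
preg_match_all($divs, $incoming_data, $result);

// Only one match is found, and it stops at the inner closing tag.
echo $result[0][0], "\n"; // <div>outer <div>inner</div>
echo $result[1][0], "\n"; // outer <div>inner
```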

Filter out divs by class, for example “myclass”
Just change the word “myclass” to whatever class you are filtering out. This will not work properly on nested divs: if the div with class “myclass” has more divs nested inside it, the match is cut off at the first closing </div> the pattern can find. The pattern also expects class to be the only attribute of the div, written exactly as shown.
$divs = '/<div class="myclass">(.*?)<\/div>/s';
preg_match_all($divs, $incoming_data, $result);

Filter out divs by class containing a - character, for example “my-class”
Just change “my-class” to whatever class you are filtering out. A hyphen has no special meaning outside a character class, so it does not need to be escaped. This will not work properly on nested divs: if the div with class “my-class” has more divs nested inside it, the match is cut off at the first closing </div> found.
$divs = '/<div class="my-class">(.*?)<\/div>/s';
preg_match_all($divs, $incoming_data, $result);

Filter out divs by class when the div has multiple classes
Just change the word “myclass” to the div’s first class. This will not work properly on nested divs: if the div with class “myclass” has more divs nested inside it, the match is cut off at the first closing </div> found. With this code we can filter out all divs whose class attribute starts with myclass, no matter how many classes follow after it.
// (?:\s[^"]*)? allows further classes after myclass; [^>]*> skips any other attributes
$divs = '/<div class="myclass(?:\s[^"]*)?"[^>]*>(.*?)<\/div>/s';
preg_match_all($divs, $incoming_data, $result);

Filter out nested divs with multiple divs inside it
Filtering out a div by class when it holds multiple nested divs, and still getting all of its content, can be tricky. Try this code, which uses a recursive pattern, and change “my-class” to whatever class your div has:
$divs = '{<div\s+class="my-class"\s*>((?:(?:(?!<div[^>]*>|</div>).)++|<div[^>]*>(?1)</div>)*)</div>}si';
preg_match_all($divs, $incoming_data, $result);
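
Here is a small sketch of that recursive pattern at work on made-up input with two levels of nesting. The (?1) part re-runs the first capture group for every inner div, which is how the matching closing tag is found.

```php
<?php
// Hypothetical input: the target div contains two levels of nested divs.
$incoming_data = '<div class="my-class">A<div>B<div>C</div></div>D</div>'
    . '<div>other</div>';

$divs = '{<div\s+class="my-class"\s*>((?:(?:(?!<div[^>]*>|</div>).)++|<div[^>]*>(?1)</div>)*)</div>}si';
preg_match_all($divs, $incoming_data, $result);

// Group 1 holds the complete inner content, nested divs included.
echo $result[1][0], "\n"; // A<div>B<div>C</div></div>D
```

On very large or deeply nested documents a recursive pattern like this can become slow or hit PCRE limits, so it is worth testing with realistic input.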

 

 

How to echo out / print your preg_match_all() content

To print (echo) your filtered content, follow this complete code example. Change the RegEx to match whatever pattern you want to filter out from the content. If the output is hard to read, wrap everything inside a <pre> </pre> tag to preserve line breaks and spacing.
$incoming_data = '<h1>Example heading</h1>'; // placeholder: put all the HTML code to be filtered here
$h1 = '/<h1\b[^>]*>(.*)<\/h1>/isU';
$count = preg_match_all($h1, $incoming_data, $result);
if ($count > 0) {
    for ($i = 0; $i < $count; $i++) {
        echo $result[1][$i]; // content of the i-th <h1>
    }
} else {
    // $count is 0 when nothing matched and false when the pattern failed
    echo 'No matching elements were found';
}
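
When the matched content itself contains HTML, the browser will render it instead of showing the tags. One possible variation, sketched below with made-up input, escapes the output with htmlspecialchars() and separates the "no match" case from the "pattern failed" case.

```php
<?php
// Hypothetical sample input.
$incoming_data = '<h1>Hello <em>world</em></h1><h1>Second</h1>';

$h1 = '/<h1\b[^>]*>(.*)<\/h1>/isU';
$count = preg_match_all($h1, $incoming_data, $result);

if ($count === false) {
    // preg_last_error() returns one of the PREG_*_ERROR constants.
    echo 'Regex error code: ' . preg_last_error();
} elseif ($count === 0) {
    echo 'No matches found';
} else {
    foreach ($result[1] as $content) {
        // htmlspecialchars() makes the browser display tags as text.
        echo htmlspecialchars($content), "<br>\n";
    }
}
```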

 

 

Other useful PHP functions

preg_match()
The preg_match() function searches a string for a pattern and stops at the first match. It returns 1 if the pattern matches, 0 if it does not, and false if an error occurred.

preg_match_all()
The preg_match_all() function matches all occurrences of pattern in string.

preg_replace()
The preg_replace() function searches a string for matches of a regular expression and replaces them with the replacement you supply. It takes over the job of the old ereg_replace(), which was removed in PHP 7.

preg_split()
The preg_split() function splits a string into an array, using a regular expression as the delimiter. It takes over the job of the old split(), which was removed in PHP 7.

preg_grep()
The preg_grep() function searches all elements of input_array, returning all elements matching the regexp pattern.

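preg_quote()
The preg_quote() function puts a backslash in front of every character that has a special meaning in a regular expression, so that a piece of text can be used literally inside a pattern.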
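
The following sketch shows the functions listed above in action. The helper build_div_class_pattern() is a hypothetical name invented for this example. It uses preg_quote() to build the "div by class" pattern from any class name, and the sample strings are made up.

```php
<?php
// Hypothetical helper: builds the "div by class" pattern for any class name.
function build_div_class_pattern(string $class): string
{
    // The second argument makes preg_quote() escape the "/" delimiter too.
    $quoted = preg_quote($class, '/');
    return '/<div class="' . $quoted . '">(.*?)<\/div>/s';
}

$html = '<div class="price.box">10 USD</div><div class="priceXbox">no</div>';
preg_match_all(build_div_class_pattern('price.box'), $html, $result);
print_r($result[1]); // only "10 USD", because the "." is escaped

// Short demos of the other functions.
$found = preg_match('/\d+/', 'abc 123');             // 1
$clean = preg_replace('/\s+/', ' ', "a   b \n c");   // "a b c"
$parts = preg_split('/[\s,]+/', 'a, b  c');          // a, b and c
$numbers = preg_grep('/^\d+$/', ['1', 'x', '22']);   // keeps keys 0 and 2
```

Without preg_quote() the dot in price.box would match any character, so the second div would be picked up as well.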

 

 
