HTML Parser « g.cliquet[@]lecolededesign.com

HTML Parser

The recent del.icio.us update (1 August 2008) forced me to rework my code for parsing various source pages, in particular the “popular” page, which lets me see each day the URLs most favored by the del.icio.us community.

It is always painful to dive back into code, even carefully commented code, especially when it is riddled with regular expressions and other recursive marvels. Still, I must say that a daily visit to this treemap has become indispensable to the way I observe the evolution of the Web. It is a pleasure, in fact, to see how much satisfaction a “small” application can bring, even though I still have not managed to fix the bug affecting the display of Asian characters…

So, motivated by the idea of getting the system running again, and somewhat less by the idea of diving back into the source code ;(… I took the opportunity to search the Web again, telling myself that plenty of solutions must exist. Indeed, SourceForge returned a large number of results, and I downloaded several solutions that seemed suited to my needs. After many unsuccessful attempts (the most popular apps are not necessarily the most effective…), I tested PHP Simple HTML DOM Parser, developed by S.C. Chen and based on a parser for PHP4 designed by J. Solorzano. The package consists of a file, simple_html_dom.php, and (a relatively rare thing…) interactive documentation with many highly relevant examples. The solution matches my expectations perfectly: extracting content and retrieving tags or their attributes is handled by functions that let you simply define delimiters based on the recurring fragments of code found in the source.
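
To make the idea concrete, here is a minimal sketch of what "selecting tags and reading their attributes" looks like compared with hand-written regular expressions. It is not the API of PHP Simple HTML DOM Parser: it uses Python's standard `html.parser` module as a stand-in, and the sample markup, the `bookmark` class name, and the `LinkExtractor` and `extract_links` names are all assumptions made up for the illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> tag carrying a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.links = []        # finished (href, text) pairs
        self._in_link = False  # True while inside a matching <a>
        self._href = None      # href of the <a> currently open
        self._text = []        # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.css_class in classes:
            self._in_link = True
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is collected as well.
        if self._in_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.links.append((self._href, "".join(self._text).strip()))
            self._in_link = False


def extract_links(html, css_class):
    """Return the (href, text) pairs of all <a class="css_class"> elements."""
    parser = LinkExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup standing in for a "popular" listing page.
SAMPLE = """
<ul>
  <li><a class="bookmark" href="http://example.com/a">First <b>site</b></a></li>
  <li><a class="other" href="http://example.com/x">Ignored</a></li>
  <li><a class="bookmark" href="http://example.com/b">Second site</a></li>
</ul>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE, "bookmark"):
        print(href, text)
```

The selection rule lives in one place (the tag name and class tested in `handle_starttag`), so when the source page changes its markup, only that test needs updating.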

For all your parsing needs, I therefore recommend this solution. It does require some technical skill, but it is extremely simple to use, provided you stay vigilant and keep an eye on the structure of the source code being parsed. What is more, the very complete documentation supplied with it will make future updates much less painful!
