simplehtmldom Doc api帮助文档


Posted in PHP onMarch 26, 2012

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

stringplaintext Returns the contents extracted from HTML.
voidclear () Clean up memory.
voidload ( string $content ) Load contents from a string.
stringsave ( [string $filename] ) Dumps the internal DOM tree back into a string. If the $filename is set, result string will save to file.
voidload_file ( string $filename ) Load contents from a from a file or a URL.
voidset_callback ( string $function_name ) Set a callback function.
mixedfind ( string $selector [, int $index] ) Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise return an array of object.

Element methods & properties

string[attribute] Read or write element's attribure value.
stringtag Read or write the tag name of element.
stringoutertext Read or write the outer HTML text of element.
stringinnertext Read or write the inner HTML text of element.
stringplaintext Read or write the plain text of element.
mixedfind ( string $selector [, int $index] ) Find children by the CSS selector. Returns the Nth element object if index is set, otherwise, return an array of object.

DOM traversing

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.
Camel naming convertions You can also call methods with W3C STANDARD camel naming convertions.

string$e->getAttribute ( $name ) string$e->attribute
void$e->setAttribute ( $name, $value ) void$value = $e->attribute
bool$e->hasAttribute ( $name ) boolisset($e->attribute)
void$e->removeAttribute ( $name ) void$e->attribute = null
element$e->getElementById ( $id ) mixed$e->find ( "#$id", 0 )
mixed$e->getElementsById ( $id [,$index] ) mixed$e->find ( "#$id" [, int $index] )
element$e->getElementByTagName ($name ) mixed$e->find ( $name, 0 )
mixed$e->getElementsByTagName ( $name [, $index] ) mixed$e->find ( $name [, int $index] )
element$e->parentNode () element$e->parent ()
mixed$e->childNodes ( [$index] ) mixed$e->children ( [int $index] )
element$e->firstChild () element$e->first_child ()
element$e->lastChild () element$e->last_child ()
element$e->nextSibling () element$e->next_sibling ()
element$e->previousSibling () element$e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from a HTML file
$html->load_file('test.htm');

// Find all anchors, returns a array of element objects
$ret = $html->find('a');

// Find (N)thanchor, returns element object or null if not found(zero based)
$ret = $html->find('a', 0);

// Find all <div> which attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all element has attribute id
$ret = $html->find('[id]');

// Find all element which id=foo
$ret = $html->find('#foo');

// Find all element which class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find Nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> which class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribite align=center in table tags
$es = $html->find(''table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
foreach($ul->find('li') as $li)
{
// do something...
}
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get a attribute ( If the attribute is non-value attribute (eg. checked, selected...), it will returns true or false)
$value = $e->href;

// Set a attribute(If the attribute is non-value attribute (eg. checked, selected...), set it's value as true or false)
$e->href = 'my link';

// Remove a attribute, set it's value as null!
$e->href = null;

// Determine whether a attribute exist?
if(isset($e->href))
echo 'href exist!';

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: " div"
echo $e->outertext; // Returns: " <div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: " foo <b>bar</b>"
echo $e->plaintext; // Returns: " foo bar"

$e->tag Read or write the tag name of element.
$e->outertext Read or write the outer HTML text of element.
$e->innertext Read or write the inner HTML text of element.
$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap a element
$e->outertext = '<div class="wrap">' . $e->outertext . '<div>';

// Remove a element, set it's outertext as an empty string
$e->outertext = '';

// Append a element
$e->outertext = $e->outertext . '<div>foo<div>';

// Insert a element
$e->outertext = '<div>foo<div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods with Camel naming convertions.

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

// Write a function with parameter "$element"
function my_callback($element) {
// Hide all <b> tags
if ($element->tag=='b')
$element->outertext = '';
}

// Register the callback function with it's function name
$html->set_callback('my_callback');

// Callback function will be invoked while dumping
echo $html;

PHP 相关文章推荐
PHP 和 MySQL 开发的 8 个技巧
Oct 09 PHP
PHP一些有意思的小区别
Dec 06 PHP
ob_start(),ob_start('ob_gzhandler')使用
Dec 25 PHP
php小偷相关截取函数备忘
Nov 28 PHP
php之CodeIgniter学习笔记
Jun 17 PHP
解析PHP实现下载文件的两种方法
Jul 05 PHP
显示程序执行时间php函数代码
Aug 29 PHP
php验证是否是md5编码的简单代码
Apr 01 PHP
PHP实现利用MySQL保存session的方法
Aug 23 PHP
PHP7.1新功能之Nullable Type用法分析
Sep 26 PHP
php封装db类连接sqlite3数据库的方法实例
Dec 19 PHP
PHP设计模式(三)建造者模式Builder实例详解【创建型】
May 02 PHP
php中一个有意思的日期逻辑处理
Mar 25 #PHP
php中http_build_query 的一个问题
Mar 25 #PHP
php正则表达匹配中文问题分析小结
Mar 25 #PHP
二招解决php乱码问题
Mar 25 #PHP
php引用地址改变变量值的问题
Mar 23 #PHP
奇怪的PHP引用效率问题分析
Mar 23 #PHP
php地址引用(php地址引用的效率问题)
Mar 23 #PHP
You might like
php 静态化实现代码
2009/03/20 PHP
PHP随机生成信用卡卡号的方法
2015/03/23 PHP
php数组函数array_key_exists()小结
2015/12/10 PHP
ThinkPHP项目分组配置方法分析
2016/03/23 PHP
PHP数据的提交与过滤基本操作实例详解
2016/11/11 PHP
浅谈laravel 5.6 安装 windows上使用composer的安装过程
2019/10/18 PHP
js 利用className得到对象的实现代码
2011/11/15 Javascript
关于jQuery $.isNumeric vs. $.isNaN vs. isNaN
2013/04/15 Javascript
得到form下的所有的input的js代码
2013/11/07 Javascript
返回上一页并自动刷新的JavaScript代码
2014/02/19 Javascript
jquery插件推荐 jquery.cookie
2014/11/09 Javascript
mpvue小程序仿qq左滑置顶删除组件
2018/08/03 Javascript
jQuery实现网页拼图游戏
2020/04/22 jQuery
postman自定义函数实现 时间函数的思路详解
2019/04/17 Javascript
vue-router结合vuex实现用户权限控制功能
2019/11/14 Javascript
javascript数组元素删除方法delete和splice解析
2019/12/09 Javascript
[01:15:56]2018DOTA2亚洲邀请赛3月30日 小组赛A组 TNC VS Newbee
2018/03/31 DOTA
使用python提取html文件中的特定数据的实现代码
2013/03/24 Python
编写Python脚本抓取网络小说来制作自己的阅读器
2015/08/20 Python
Python 递归函数详解及实例
2016/12/27 Python
python+matplotlib实现动态绘制图片实例代码(交互式绘图)
2018/01/20 Python
Python在groupby分组后提取指定位置记录方法
2018/04/20 Python
使用Python向C语言的链接库传递数组、结构体、指针类型的数据
2019/01/29 Python
基于Python的图像数据增强Data Augmentation解析
2019/08/13 Python
利用Python复制文件的9种方法总结
2019/09/02 Python
安装python及pycharm的教程图解
2019/10/10 Python
python如何导入依赖包
2020/07/13 Python
英国知名的护肤彩妆与时尚配饰大型综合零售电商:Unineed
2016/11/21 全球购物
个人实习生的自我评价
2014/02/16 职场文书
大课间活动实施方案
2014/03/06 职场文书
2014年乡镇个人工作总结
2014/12/03 职场文书
师德先进个人材料
2014/12/20 职场文书
2015年小学一年级班主任工作总结
2015/05/21 职场文书
初婚未育证明样本
2015/06/18 职场文书
PHP新手指南
2021/04/01 PHP
Mysql 如何实现多张无关联表查询数据并分页
2021/06/05 MySQL