simplehtmldom Doc api帮助文档


Posted in PHP onMarch 26, 2012

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

stringplaintext Returns the contents extracted from HTML.
voidclear () Clean up memory.
voidload ( string $content ) Load contents from a string.
stringsave ( [string $filename] ) Dumps the internal DOM tree back into a string. If the $filename is set, result string will save to file.
voidload_file ( string $filename ) Load contents from a from a file or a URL.
voidset_callback ( string $function_name ) Set a callback function.
mixedfind ( string $selector [, int $index] ) Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise return an array of object.

Element methods & properties

string[attribute] Read or write element's attribure value.
stringtag Read or write the tag name of element.
stringoutertext Read or write the outer HTML text of element.
stringinnertext Read or write the inner HTML text of element.
stringplaintext Read or write the plain text of element.
mixedfind ( string $selector [, int $index] ) Find children by the CSS selector. Returns the Nth element object if index is set, otherwise, return an array of object.

DOM traversing

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.
Camel naming convertions You can also call methods with W3C STANDARD camel naming convertions.

string$e->getAttribute ( $name ) string$e->attribute
void$e->setAttribute ( $name, $value ) void$value = $e->attribute
bool$e->hasAttribute ( $name ) boolisset($e->attribute)
void$e->removeAttribute ( $name ) void$e->attribute = null
element$e->getElementById ( $id ) mixed$e->find ( "#$id", 0 )
mixed$e->getElementsById ( $id [,$index] ) mixed$e->find ( "#$id" [, int $index] )
element$e->getElementByTagName ($name ) mixed$e->find ( $name, 0 )
mixed$e->getElementsByTagName ( $name [, $index] ) mixed$e->find ( $name [, int $index] )
element$e->parentNode () element$e->parent ()
mixed$e->childNodes ( [$index] ) mixed$e->children ( [int $index] )
element$e->firstChild () element$e->first_child ()
element$e->lastChild () element$e->last_child ()
element$e->nextSibling () element$e->next_sibling ()
element$e->previousSibling () element$e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from a HTML file
$html->load_file('test.htm');

// Find all anchors, returns a array of element objects
$ret = $html->find('a');

// Find (N)thanchor, returns element object or null if not found(zero based)
$ret = $html->find('a', 0);

// Find all <div> which attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all element has attribute id
$ret = $html->find('[id]');

// Find all element which id=foo
$ret = $html->find('#foo');

// Find all element which class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find Nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> which class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribite align=center in table tags
$es = $html->find(''table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
foreach($ul->find('li') as $li)
{
// do something...
}
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get a attribute ( If the attribute is non-value attribute (eg. checked, selected...), it will returns true or false)
$value = $e->href;

// Set a attribute(If the attribute is non-value attribute (eg. checked, selected...), set it's value as true or false)
$e->href = 'my link';

// Remove a attribute, set it's value as null!
$e->href = null;

// Determine whether a attribute exist?
if(isset($e->href))
echo 'href exist!';

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: " div"
echo $e->outertext; // Returns: " <div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: " foo <b>bar</b>"
echo $e->plaintext; // Returns: " foo bar"

$e->tag Read or write the tag name of element.
$e->outertext Read or write the outer HTML text of element.
$e->innertext Read or write the inner HTML text of element.
$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap a element
$e->outertext = '<div class="wrap">' . $e->outertext . '<div>';

// Remove a element, set it's outertext as an empty string
$e->outertext = '';

// Append a element
$e->outertext = $e->outertext . '<div>foo<div>';

// Insert a element
$e->outertext = '<div>foo<div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods with Camel naming convertions.

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

// Write a function with parameter "$element"
function my_callback($element) {
// Hide all <b> tags
if ($element->tag=='b')
$element->outertext = '';
}

// Register the callback function with it's function name
$html->set_callback('my_callback');

// Callback function will be invoked while dumping
echo $html;

PHP 相关文章推荐
PHP 高手之路(一)
Oct 09 PHP
PHP 函数学习简单小结
Jul 08 PHP
让PHP COOKIE立即生效,不用刷新就可以使用
Mar 09 PHP
深入file_get_contents函数抓取内容失败的原因分析
Jun 25 PHP
php curl模拟post请求小实例
Nov 13 PHP
php读取3389的脚本
May 06 PHP
php上传图片之时间戳命名(保存路径)
Aug 15 PHP
php5.3提示Function ereg() is deprecated Error问题解决方法
Nov 12 PHP
php通过curl模拟登陆DZ论坛
May 11 PHP
php无限级分类实现方法分析
Oct 19 PHP
Laravel框架实现利用监听器进行sql语句记录功能
Jun 06 PHP
一次项目中Thinkphp绕过禁用函数的实战记录
Nov 17 PHP
php中一个有意思的日期逻辑处理
Mar 25 #PHP
php中http_build_query 的一个问题
Mar 25 #PHP
php正则表达匹配中文问题分析小结
Mar 25 #PHP
二招解决php乱码问题
Mar 25 #PHP
php引用地址改变变量值的问题
Mar 23 #PHP
奇怪的PHP引用效率问题分析
Mar 23 #PHP
php地址引用(php地址引用的效率问题)
Mar 23 #PHP
You might like
PHP-redis中文文档介绍
2013/02/07 PHP
再Docker中架设完整的WordPress站点全攻略
2015/07/29 PHP
PHP mysql事务问题实例分析
2016/01/18 PHP
PHP上传Excel文件导入数据到MySQL数据库示例
2016/10/25 PHP
Zend Framework入门应用实例详解
2016/12/11 PHP
PHP实现 APP端微信支付功能
2018/06/22 PHP
关于javascript 回调函数中变量作用域的讨论
2009/09/11 Javascript
javascript+iframe 实现无刷新载入整页的代码
2010/03/17 Javascript
Extjs中常用表单介绍与应用
2010/06/07 Javascript
jquery获取iframe中的dom对象(两种方法)
2013/07/02 Javascript
javascript ready和load事件的区别示例介绍
2013/08/30 Javascript
Array 重排序方法和操作方法的简单实例
2014/01/24 Javascript
jQuery.position()方法获取不到值的安全替换方法
2015/03/13 Javascript
javascript实现3D变换的立体圆圈实例
2015/08/06 Javascript
轻松实现js弹框显示选项
2016/09/13 Javascript
Bootstrap CSS组件之输入框组
2016/12/17 Javascript
jQuery实现CheckBox全选、全不选功能
2017/01/11 Javascript
详解如何提高 webpack 构建 Vue 项目的速度
2017/07/03 Javascript
浅谈webpack编译vue项目生成的代码探索
2017/12/11 Javascript
详解NODEJS的http实现
2018/01/04 NodeJs
vue-video-player 解决微信自动全屏播放问题(横竖屏导致样式错乱问题)
2020/02/25 Javascript
[09:13]2014DOTA2国际邀请赛 中国区预选赛coser表演
2014/05/23 DOTA
Django 如何获取前端发送的头文件详解(推荐)
2017/08/15 Python
Python简单读取json文件功能示例
2017/11/30 Python
python实现简易内存监控
2018/06/21 Python
python实现机器人卡牌
2019/10/06 Python
python字典排序的方法
2019/10/12 Python
python实现连续变量最优分箱详解--CART算法
2019/11/22 Python
CSS3 文字动画效果
2020/11/12 HTML / CSS
Brasty罗马尼亚:购买手表、香水、化妆品、珠宝
2020/04/21 全球购物
华为python面试题
2016/05/03 面试题
文秘档案管理岗位职责
2014/03/06 职场文书
社区学习党的群众路线教育实践活动心得体会
2014/11/03 职场文书
2014城乡环境综合治理工作总结
2014/12/19 职场文书
2015毕业生自我评价范文
2015/03/02 职场文书
2015年生产部工作总结范文
2015/05/25 职场文书