simplehtmldom Doc api帮助文档


Posted in PHP onMarch 26, 2012

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

stringplaintext Returns the contents extracted from HTML.
voidclear () Clean up memory.
voidload ( string $content ) Load contents from a string.
stringsave ( [string $filename] ) Dumps the internal DOM tree back into a string. If the $filename is set, result string will save to file.
voidload_file ( string $filename ) Load contents from a from a file or a URL.
voidset_callback ( string $function_name ) Set a callback function.
mixedfind ( string $selector [, int $index] ) Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise return an array of object.

Element methods & properties

string[attribute] Read or write element's attribure value.
stringtag Read or write the tag name of element.
stringoutertext Read or write the outer HTML text of element.
stringinnertext Read or write the inner HTML text of element.
stringplaintext Read or write the plain text of element.
mixedfind ( string $selector [, int $index] ) Find children by the CSS selector. Returns the Nth element object if index is set, otherwise, return an array of object.

DOM traversing

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.
Camel naming convertions You can also call methods with W3C STANDARD camel naming convertions.

string$e->getAttribute ( $name ) string$e->attribute
void$e->setAttribute ( $name, $value ) void$value = $e->attribute
bool$e->hasAttribute ( $name ) boolisset($e->attribute)
void$e->removeAttribute ( $name ) void$e->attribute = null
element$e->getElementById ( $id ) mixed$e->find ( "#$id", 0 )
mixed$e->getElementsById ( $id [,$index] ) mixed$e->find ( "#$id" [, int $index] )
element$e->getElementByTagName ($name ) mixed$e->find ( $name, 0 )
mixed$e->getElementsByTagName ( $name [, $index] ) mixed$e->find ( $name [, int $index] )
element$e->parentNode () element$e->parent ()
mixed$e->childNodes ( [$index] ) mixed$e->children ( [int $index] )
element$e->firstChild () element$e->first_child ()
element$e->lastChild () element$e->last_child ()
element$e->nextSibling () element$e->next_sibling ()
element$e->previousSibling () element$e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from a HTML file
$html->load_file('test.htm');

// Find all anchors, returns a array of element objects
$ret = $html->find('a');

// Find (N)thanchor, returns element object or null if not found(zero based)
$ret = $html->find('a', 0);

// Find all <div> which attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all element has attribute id
$ret = $html->find('[id]');

// Find all element which id=foo
$ret = $html->find('#foo');

// Find all element which class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find Nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> which class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribite align=center in table tags
$es = $html->find(''table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
foreach($ul->find('li') as $li)
{
// do something...
}
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get a attribute ( If the attribute is non-value attribute (eg. checked, selected...), it will returns true or false)
$value = $e->href;

// Set a attribute(If the attribute is non-value attribute (eg. checked, selected...), set it's value as true or false)
$e->href = 'my link';

// Remove a attribute, set it's value as null!
$e->href = null;

// Determine whether a attribute exist?
if(isset($e->href))
echo 'href exist!';

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: " div"
echo $e->outertext; // Returns: " <div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: " foo <b>bar</b>"
echo $e->plaintext; // Returns: " foo bar"

$e->tag Read or write the tag name of element.
$e->outertext Read or write the outer HTML text of element.
$e->innertext Read or write the inner HTML text of element.
$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap a element
$e->outertext = '<div class="wrap">' . $e->outertext . '<div>';

// Remove a element, set it's outertext as an empty string
$e->outertext = '';

// Append a element
$e->outertext = $e->outertext . '<div>foo<div>';

// Insert a element
$e->outertext = '<div>foo<div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods with Camel naming convertions.

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

// Write a function with parameter "$element"
function my_callback($element) {
// Hide all <b> tags
if ($element->tag=='b')
$element->outertext = '';
}

// Register the callback function with it's function name
$html->set_callback('my_callback');

// Callback function will be invoked while dumping
echo $html;

PHP 相关文章推荐
新手学PHP之数据库操作详解及乱码解决!
Jan 02 PHP
删除数组元素实用的PHP数组函数
Aug 18 PHP
PHP去除数组中重复的元素并按键名排序函数
Aug 18 PHP
PHP 长文章分页函数 带使用方法,不会分割段落,翻页在底部
Oct 22 PHP
PHP substr 截取字符串出现乱码问题解决方法[utf8与gb2312]
Dec 16 PHP
PHP json_decode函数详细解析
Feb 17 PHP
thinkphp文件处理类Dir.class.php的用法分析
Dec 08 PHP
PHP实现采集抓取淘宝网单个商品信息
Jan 08 PHP
学习php设计模式 php实现装饰器模式(decorator)
Dec 07 PHP
在WordPress中使用wp_count_posts函数来统计文章数量
Jan 05 PHP
laravel学习笔记之模型事件的几种用法示例
Aug 15 PHP
Laravel中日期时间处理包Carbon的简单使用
Sep 21 PHP
php中一个有意思的日期逻辑处理
Mar 25 #PHP
php中http_build_query 的一个问题
Mar 25 #PHP
php正则表达匹配中文问题分析小结
Mar 25 #PHP
二招解决php乱码问题
Mar 25 #PHP
php引用地址改变变量值的问题
Mar 23 #PHP
奇怪的PHP引用效率问题分析
Mar 23 #PHP
php地址引用(php地址引用的效率问题)
Mar 23 #PHP
You might like
PHP类的使用 实例代码讲解
2009/12/28 PHP
PHP5常用函数列表(分享)
2013/06/07 PHP
深入解析PHP中逗号与点号的区别
2013/08/05 PHP
PHP 实现类似js中alert() 提示框
2015/03/18 PHP
PHP curl批处理及多请求并发实现方法分析
2018/08/15 PHP
JavaScript入门教程(12) js对象化编程
2009/01/31 Javascript
将函数的实际参数转换成数组的方法
2010/01/25 Javascript
按给定几率进行随机抽取的js代码
2010/12/28 Javascript
js三种排序算法分享
2012/08/16 Javascript
不用锚点也可以平滑滚动到页面的指定位置实现代码
2013/05/08 Javascript
Javascript设置对象的ReadOnly属性(示例代码)
2013/12/25 Javascript
快速掌握Node.js事件驱动模型
2016/03/21 Javascript
jquery checkbox无法用attr()二次勾选问题的解决方法
2016/07/22 Javascript
将json转换成struts参数的方法
2016/11/08 Javascript
AngularJS实现给动态生成的元素绑定事件的方法
2016/12/14 Javascript
js读取json文件片段中的数据实例
2017/03/09 Javascript
详解webpack-dev-server 设置反向代理解决跨域问题
2018/04/18 Javascript
Vue常见面试题整理【值得收藏】
2018/09/20 Javascript
简述pm2常用命令集合及配置文件说明
2019/05/30 Javascript
JavaScript中的连续赋值问题实例分析
2019/07/12 Javascript
微信小程序 select 下拉框组件功能
2019/09/09 Javascript
vue 设置 input 为不可以编辑的实现方法
2019/09/19 Javascript
Vue项目中使用mock.js的完整步骤
2021/01/12 Vue.js
[01:57]2018年度DOTA2最具潜力解说-完美盛典
2018/12/16 DOTA
python实现的一个火车票转让信息采集器
2014/07/09 Python
Python3标准库总结
2019/02/19 Python
python对指定字符串逆序的6种方法(小结)
2020/04/02 Python
Python基于yaml文件配置logging日志过程解析
2020/06/23 Python
药学专业大学生自荐信
2013/09/28 职场文书
五年后的职业生涯规划
2014/03/04 职场文书
作文批改评语
2014/12/25 职场文书
《金色的草地》教学反思
2016/02/17 职场文书
世界文化遗产导游词
2019/08/07 职场文书
详解如何在Canvas中添加事件的方法
2021/04/17 Javascript
python游戏开发Pygame框架
2022/04/22 Python
Vscode中SSH插件如何远程连接Linux
2022/05/02 Servers