simplehtmldom Doc api帮助文档


Posted in PHP onMarch 26, 2012

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

stringplaintext Returns the contents extracted from HTML.
voidclear () Clean up memory.
voidload ( string $content ) Load contents from a string.
stringsave ( [string $filename] ) Dumps the internal DOM tree back into a string. If the $filename is set, result string will save to file.
voidload_file ( string $filename ) Load contents from a from a file or a URL.
voidset_callback ( string $function_name ) Set a callback function.
mixedfind ( string $selector [, int $index] ) Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise return an array of object.

Element methods & properties

string[attribute] Read or write element's attribure value.
stringtag Read or write the tag name of element.
stringoutertext Read or write the outer HTML text of element.
stringinnertext Read or write the inner HTML text of element.
stringplaintext Read or write the plain text of element.
mixedfind ( string $selector [, int $index] ) Find children by the CSS selector. Returns the Nth element object if index is set, otherwise, return an array of object.

DOM traversing

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.
Camel naming convertions You can also call methods with W3C STANDARD camel naming convertions.

string$e->getAttribute ( $name ) string$e->attribute
void$e->setAttribute ( $name, $value ) void$value = $e->attribute
bool$e->hasAttribute ( $name ) boolisset($e->attribute)
void$e->removeAttribute ( $name ) void$e->attribute = null
element$e->getElementById ( $id ) mixed$e->find ( "#$id", 0 )
mixed$e->getElementsById ( $id [,$index] ) mixed$e->find ( "#$id" [, int $index] )
element$e->getElementByTagName ($name ) mixed$e->find ( $name, 0 )
mixed$e->getElementsByTagName ( $name [, $index] ) mixed$e->find ( $name [, int $index] )
element$e->parentNode () element$e->parent ()
mixed$e->childNodes ( [$index] ) mixed$e->children ( [int $index] )
element$e->firstChild () element$e->first_child ()
element$e->lastChild () element$e->last_child ()
element$e->nextSibling () element$e->next_sibling ()
element$e->previousSibling () element$e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from a HTML file
$html->load_file('test.htm');

// Find all anchors, returns a array of element objects
$ret = $html->find('a');

// Find (N)thanchor, returns element object or null if not found(zero based)
$ret = $html->find('a', 0);

// Find all <div> which attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all element has attribute id
$ret = $html->find('[id]');

// Find all element which id=foo
$ret = $html->find('#foo');

// Find all element which class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find Nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> which class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribite align=center in table tags
$es = $html->find(''table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
foreach($ul->find('li') as $li)
{
// do something...
}
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get a attribute ( If the attribute is non-value attribute (eg. checked, selected...), it will returns true or false)
$value = $e->href;

// Set a attribute(If the attribute is non-value attribute (eg. checked, selected...), set it's value as true or false)
$e->href = 'my link';

// Remove a attribute, set it's value as null!
$e->href = null;

// Determine whether a attribute exist?
if(isset($e->href))
echo 'href exist!';

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: " div"
echo $e->outertext; // Returns: " <div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: " foo <b>bar</b>"
echo $e->plaintext; // Returns: " foo bar"

$e->tag Read or write the tag name of element.
$e->outertext Read or write the outer HTML text of element.
$e->innertext Read or write the inner HTML text of element.
$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap a element
$e->outertext = '<div class="wrap">' . $e->outertext . '<div>';

// Remove a element, set it's outertext as an empty string
$e->outertext = '';

// Append a element
$e->outertext = $e->outertext . '<div>foo<div>';

// Insert a element
$e->outertext = '<div>foo<div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods with Camel naming convertions.

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

// Write a function with parameter "$element"
function my_callback($element) {
// Hide all <b> tags
if ($element->tag=='b')
$element->outertext = '';
}

// Register the callback function with it's function name
$html->set_callback('my_callback');

// Callback function will be invoked while dumping
echo $html;

PHP 相关文章推荐
Views rows style模板重写代码
May 16 PHP
php实现信用卡校验位算法THE LUHN MOD-10示例
May 07 PHP
Ubuntu中启用php的mail()函数并解决发送邮件速度慢问题
Mar 27 PHP
深入理解PHP原理之执行周期分析
Jun 01 PHP
PHP实现文件上传功能实例代码
May 18 PHP
PHP高效获取远程图片尺寸和大小的实现方法
Oct 20 PHP
PHP两个n位的二进制整数相加问题的解决
Aug 26 PHP
PHP的curl函数的用法总结
Feb 14 PHP
php使用filter_var函数判断邮箱,url,ip格式示例
Jul 06 PHP
Laravel 5.4前后台分离,通过不同的二级域名访问方法
Oct 13 PHP
Laravel 微信小程序后端实现用户登录的示例代码
Nov 26 PHP
PHP中的异常处理机制深入讲解
Nov 10 PHP
php中一个有意思的日期逻辑处理
Mar 25 #PHP
php中http_build_query 的一个问题
Mar 25 #PHP
php正则表达匹配中文问题分析小结
Mar 25 #PHP
二招解决php乱码问题
Mar 25 #PHP
php引用地址改变变量值的问题
Mar 23 #PHP
奇怪的PHP引用效率问题分析
Mar 23 #PHP
php地址引用(php地址引用的效率问题)
Mar 23 #PHP
You might like
dedecms 批量提取第一张图片最为缩略图的代码(文章+软件)
2009/10/29 PHP
基于php iconv函数的使用详解
2013/06/09 PHP
关于PHP语言构造器介绍
2013/07/08 PHP
PHP将两个关联数组合并函数提高函数效率
2014/03/18 PHP
CodeIgniter框架提示Disallowed Key Characters的解决办法
2014/04/21 PHP
php和nginx交互实例讲解
2019/09/24 PHP
laravel 框架结合关联查询 when()用法分析
2019/11/22 PHP
Javascript入门学习资料收集整理篇
2008/07/06 Javascript
将CKfinder整合进CKEditor3.0的新方法
2010/01/10 Javascript
从零开始学习jQuery (六) jquery中的AJAX使用
2011/02/23 Javascript
js文件Cookie存取值示例代码
2014/02/20 Javascript
简介JavaScript中的sub()方法的使用
2015/06/08 Javascript
JS实现设置ff与ie元素绝对位置的方法
2016/03/08 Javascript
AngularJS过滤器filter用法总结
2016/12/13 Javascript
jQuery Validate验证表单时多个name相同的元素只验证第一个的解决方法
2016/12/24 Javascript
js禁止浏览器的回退事件
2017/04/20 Javascript
详解基于vue的移动web app页面缓存解决方案
2017/08/03 Javascript
vue-router 路由基础的详解
2017/10/17 Javascript
可能被忽略的一些JavaScript数组方法细节
2019/02/28 Javascript
[05:08]顺网杯ISS-DOTA2赛歌 少女偶像Lunar青春演绎
2013/12/05 DOTA
[44:40]KG vs LGD 2019国际邀请赛小组赛 BO2 第一场 8.15
2019/08/16 DOTA
Tensorflow卷积神经网络实例
2018/05/24 Python
Python 给某个文件名添加时间戳的方法
2018/10/16 Python
pygame编写音乐播放器的实现代码示例
2019/11/19 Python
CSS3使用border-radius属性制作圆角
2014/12/22 HTML / CSS
美国浴缸、水槽和水龙头购物网站:Vintage Tub & Bath
2019/11/05 全球购物
文明学生标兵事迹
2014/01/21 职场文书
《要下雨了》教学反思
2014/02/17 职场文书
质量管理标语
2014/06/12 职场文书
社区综治宣传月活动总结
2014/07/02 职场文书
三月雷锋月活动总结
2014/07/03 职场文书
护理目标管理责任书
2014/07/25 职场文书
营销与策划实训报告
2014/11/05 职场文书
英文慰问信
2015/02/14 职场文书
团支部组织委员竞选稿
2015/11/21 职场文书
2016先进集体事迹材料范文
2016/02/25 职场文书