simplehtmldom Doc api帮助文档


Posted in PHP onMarch 26, 2012

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

stringplaintext Returns the contents extracted from HTML.
voidclear () Clean up memory.
voidload ( string $content ) Load contents from a string.
stringsave ( [string $filename] ) Dumps the internal DOM tree back into a string. If the $filename is set, result string will save to file.
voidload_file ( string $filename ) Load contents from a from a file or a URL.
voidset_callback ( string $function_name ) Set a callback function.
mixedfind ( string $selector [, int $index] ) Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise return an array of object.

Element methods & properties

string[attribute] Read or write element's attribure value.
stringtag Read or write the tag name of element.
stringoutertext Read or write the outer HTML text of element.
stringinnertext Read or write the inner HTML text of element.
stringplaintext Read or write the plain text of element.
mixedfind ( string $selector [, int $index] ) Find children by the CSS selector. Returns the Nth element object if index is set, otherwise, return an array of object.

DOM traversing

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.
Camel naming convertions You can also call methods with W3C STANDARD camel naming convertions.

string$e->getAttribute ( $name ) string$e->attribute
void$e->setAttribute ( $name, $value ) void$value = $e->attribute
bool$e->hasAttribute ( $name ) boolisset($e->attribute)
void$e->removeAttribute ( $name ) void$e->attribute = null
element$e->getElementById ( $id ) mixed$e->find ( "#$id", 0 )
mixed$e->getElementsById ( $id [,$index] ) mixed$e->find ( "#$id" [, int $index] )
element$e->getElementByTagName ($name ) mixed$e->find ( $name, 0 )
mixed$e->getElementsByTagName ( $name [, $index] ) mixed$e->find ( $name [, int $index] )
element$e->parentNode () element$e->parent ()
mixed$e->childNodes ( [$index] ) mixed$e->children ( [int $index] )
element$e->firstChild () element$e->first_child ()
element$e->lastChild () element$e->last_child ()
element$e->nextSibling () element$e->next_sibling ()
element$e->previousSibling () element$e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from a HTML file
$html->load_file('test.htm');

// Find all anchors, returns a array of element objects
$ret = $html->find('a');

// Find (N)thanchor, returns element object or null if not found(zero based)
$ret = $html->find('a', 0);

// Find all <div> which attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all element has attribute id
$ret = $html->find('[id]');

// Find all element which id=foo
$ret = $html->find('#foo');

// Find all element which class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find Nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> which class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribite align=center in table tags
$es = $html->find(''table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
foreach($ul->find('li') as $li)
{
// do something...
}
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get a attribute ( If the attribute is non-value attribute (eg. checked, selected...), it will returns true or false)
$value = $e->href;

// Set a attribute(If the attribute is non-value attribute (eg. checked, selected...), set it's value as true or false)
$e->href = 'my link';

// Remove a attribute, set it's value as null!
$e->href = null;

// Determine whether a attribute exist?
if(isset($e->href))
echo 'href exist!';

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: " div"
echo $e->outertext; // Returns: " <div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: " foo <b>bar</b>"
echo $e->plaintext; // Returns: " foo bar"

$e->tag Read or write the tag name of element.
$e->outertext Read or write the outer HTML text of element.
$e->innertext Read or write the inner HTML text of element.
$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap a element
$e->outertext = '<div class="wrap">' . $e->outertext . '<div>';

// Remove a element, set it's outertext as an empty string
$e->outertext = '';

// Append a element
$e->outertext = $e->outertext . '<div>foo<div>';

// Insert a element
$e->outertext = '<div>foo<div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods with Camel naming convertions.

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

// Write a function with parameter "$element"
function my_callback($element) {
// Hide all <b> tags
if ($element->tag=='b')
$element->outertext = '';
}

// Register the callback function with it's function name
$html->set_callback('my_callback');

// Callback function will be invoked while dumping
echo $html;

PHP 相关文章推荐
关于在php.ini中添加extension=php_mysqli.dll指令的说明
Jun 14 PHP
How do I change MySQL timezone?
Mar 26 PHP
PHP base64+gzinflate压缩编码和解码代码
Oct 03 PHP
discuz程序的PHP加密函数原理分析
Aug 05 PHP
php模板原理讲解
Nov 13 PHP
php实现的替换敏感字符串类实例
Sep 22 PHP
php实现的支持断点续传的文件下载类
Sep 23 PHP
PHP实现动态创建XML文档的方法
Mar 30 PHP
php微信开发之谷歌测距
Jun 14 PHP
thinkphp5实现无限级分类
Feb 18 PHP
基于Laravel 多个中间件的执行顺序详解
Oct 21 PHP
PHP中迭代器的简单实现及Yii框架中的迭代器实现方法示例
Apr 26 PHP
php中一个有意思的日期逻辑处理
Mar 25 #PHP
php中http_build_query 的一个问题
Mar 25 #PHP
php正则表达匹配中文问题分析小结
Mar 25 #PHP
二招解决php乱码问题
Mar 25 #PHP
php引用地址改变变量值的问题
Mar 23 #PHP
奇怪的PHP引用效率问题分析
Mar 23 #PHP
php地址引用(php地址引用的效率问题)
Mar 23 #PHP
You might like
php木马攻击防御之道
2008/03/24 PHP
用PHP的ob_start() 控制您的浏览器cache
2009/08/03 PHP
php使用memcoder将视频转成mp4格式的方法
2015/03/12 PHP
PHP判断JSON对象是否存在的方法(推荐)
2016/07/06 PHP
使用PHP下载CSS文件中的所有图片【几行代码即可实现】
2016/12/14 PHP
PHP中的访问修饰符简单比较
2019/02/02 PHP
解决PHP curl或file_get_contents下载图片损坏或无法打开的问题
2019/10/11 PHP
ThinkPHP6.0如何利用自定义验证规则规范的实现登陆
2020/12/16 PHP
jQuery 通过事件委派一次绑定多种事件,以减少事件冗余
2010/06/30 Javascript
jQuery 数据缓存模块进化史详细介绍
2012/11/19 Javascript
js移除事件 js绑定事件实例应用
2012/11/28 Javascript
Javascript设置对象的ReadOnly属性(示例代码)
2013/12/25 Javascript
jquery如何根据值设置默认的选中项
2014/03/17 Javascript
Javascript中的apply()方法浅析
2015/03/15 Javascript
jQuery实现的导航下拉菜单效果示例
2016/09/05 Javascript
Angular.js中$resource高大上的数据交互详解
2017/07/30 Javascript
JavaScript中错误正确处理方式小结你用对了吗
2017/10/10 Javascript
JavaScript模块详解
2017/12/18 Javascript
vuejs实现递归树型菜单组件
2018/01/13 Javascript
微信小程序授权登录及解密unionId出错的方法
2018/09/26 Javascript
详细讲解如何创建, 发布自己的 Vue UI 组件库
2019/05/29 Javascript
vue中typescript装饰器的使用方法超实用教程
2019/06/17 Javascript
[01:33]真香警告!DOTA2勇士令状不朽珍藏Ⅱ饰品欣赏
2018/06/26 DOTA
python中的__init__ 、__new__、__call__小结
2014/04/25 Python
Python中列表(list)操作方法汇总
2014/08/18 Python
python实现向ppt文件里插入新幻灯片页面的方法
2015/04/28 Python
浅析Python的web.py框架中url的设定方法
2016/07/11 Python
python得到qq句柄,并显示在前台的方法
2018/10/14 Python
Python使用字典的嵌套功能详解
2019/02/27 Python
Python爬虫之urllib基础用法教程
2019/10/12 Python
Python实现元素等待代码实例
2019/11/11 Python
浅析canvas元素的html尺寸和css尺寸对元素视觉的影响
2019/07/22 HTML / CSS
课前三分钟演讲稿
2014/04/24 职场文书
祝福语集锦:给百岁老人祝寿贺词
2019/11/19 职场文书
Python中相见恨晚的技巧
2021/04/13 Python
Windows server 2012 R2 安装IIS服务器
2022/04/29 Servers