simplehtmldom Doc api帮助文档


Posted in PHP onMarch 26, 2012

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

stringplaintext Returns the contents extracted from HTML.
voidclear () Clean up memory.
voidload ( string $content ) Load contents from a string.
stringsave ( [string $filename] ) Dumps the internal DOM tree back into a string. If the $filename is set, result string will save to file.
voidload_file ( string $filename ) Load contents from a from a file or a URL.
voidset_callback ( string $function_name ) Set a callback function.
mixedfind ( string $selector [, int $index] ) Find elements by the CSS selector. Returns the Nth element object if index is set, otherwise return an array of object.

Element methods & properties

string[attribute] Read or write element's attribure value.
stringtag Read or write the tag name of element.
stringoutertext Read or write the outer HTML text of element.
stringinnertext Read or write the inner HTML text of element.
stringplaintext Read or write the plain text of element.
mixedfind ( string $selector [, int $index] ) Find children by the CSS selector. Returns the Nth element object if index is set, otherwise, return an array of object.

DOM traversing

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.
Camel naming convertions You can also call methods with W3C STANDARD camel naming convertions.

string$e->getAttribute ( $name ) string$e->attribute
void$e->setAttribute ( $name, $value ) void$value = $e->attribute
bool$e->hasAttribute ( $name ) boolisset($e->attribute)
void$e->removeAttribute ( $name ) void$e->attribute = null
element$e->getElementById ( $id ) mixed$e->find ( "#$id", 0 )
mixed$e->getElementsById ( $id [,$index] ) mixed$e->find ( "#$id" [, int $index] )
element$e->getElementByTagName ($name ) mixed$e->find ( $name, 0 )
mixed$e->getElementsByTagName ( $name [, $index] ) mixed$e->find ( $name [, int $index] )
element$e->parentNode () element$e->parent ()
mixed$e->childNodes ( [$index] ) mixed$e->children ( [int $index] )
element$e->firstChild () element$e->first_child ()
element$e->lastChild () element$e->last_child ()
element$e->nextSibling () element$e->next_sibling ()
element$e->previousSibling () element$e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from a HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from a HTML file
$html->load_file('test.htm');

// Find all anchors, returns a array of element objects
$ret = $html->find('a');

// Find (N)thanchor, returns element object or null if not found(zero based)
$ret = $html->find('a', 0);

// Find all <div> which attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all element has attribute id
$ret = $html->find('[id]');

// Find all element which id=foo
$ret = $html->find('#foo');

// Find all element which class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find Nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> which class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribite align=center in table tags
$es = $html->find(''table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
foreach($ul->find('li') as $li)
{
// do something...
}
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get a attribute ( If the attribute is non-value attribute (eg. checked, selected...), it will returns true or false)
$value = $e->href;

// Set a attribute(If the attribute is non-value attribute (eg. checked, selected...), set it's value as true or false)
$e->href = 'my link';

// Remove a attribute, set it's value as null!
$e->href = null;

// Determine whether a attribute exist?
if(isset($e->href))
echo 'href exist!';

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: " div"
echo $e->outertext; // Returns: " <div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: " foo <b>bar</b>"
echo $e->plaintext; // Returns: " foo bar"

$e->tag Read or write the tag name of element.
$e->outertext Read or write the outer HTML text of element.
$e->innertext Read or write the inner HTML text of element.
$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap a element
$e->outertext = '<div class="wrap">' . $e->outertext . '<div>';

// Remove a element, set it's outertext as an empty string
$e->outertext = '';

// Append a element
$e->outertext = $e->outertext . '<div>foo<div>';

// Insert a element
$e->outertext = '<div>foo<div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
You can also call methods with Camel naming convertions.

mixed$e->children ( [int $index] ) Returns the Nth child object if index is set, otherwise return an array of children.
element$e->parent () Returns the parent of element.
element$e->first_child () Returns the first child of element, or null if not found.
element$e->last_child () Returns the last child of element, or null if not found.
element$e->next_sibling () Returns the next sibling of element, or null if not found.
element$e->prev_sibling () Returns the previous sibling of element, or null if not found.

// Dumps the internal DOM tree back into string
$str = $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');

// Write a function with parameter "$element"
function my_callback($element) {
// Hide all <b> tags
if ($element->tag=='b')
$element->outertext = '';
}

// Register the callback function with it's function name
$html->set_callback('my_callback');

// Callback function will be invoked while dumping
echo $html;

PHP 相关文章推荐
杏林同学录(四)
Oct 09 PHP
兼容PHP5的PHP目录管理函数库
Jul 10 PHP
PHP开发的一些注意点总结
Oct 12 PHP
php之CodeIgniter学习笔记
Jun 17 PHP
常见的四种POST 提交数据方式(小总结)
Oct 08 PHP
YII CLinkPager分页类扩展增加显示共多少页
Jan 29 PHP
PHP函数checkdnsrr用法详解(Windows平台用法)
Mar 21 PHP
PHP生成可点击刷新的验证码简单示例
May 13 PHP
PHP设计模式之迭代器模式
Jun 17 PHP
thinkphp5.0自定义验证规则使用方法
Nov 16 PHP
PHP使用PhpSpreadsheet操作Excel实例详解
Mar 26 PHP
php引用传递
Apr 01 PHP
php中一个有意思的日期逻辑处理
Mar 25 #PHP
php中http_build_query 的一个问题
Mar 25 #PHP
php正则表达匹配中文问题分析小结
Mar 25 #PHP
二招解决php乱码问题
Mar 25 #PHP
php引用地址改变变量值的问题
Mar 23 #PHP
奇怪的PHP引用效率问题分析
Mar 23 #PHP
php地址引用(php地址引用的效率问题)
Mar 23 #PHP
You might like
php中设置index.php文件为只读的方法
2013/02/06 PHP
在WordPress中使用PHP脚本来判断访客来自什么国家
2015/12/10 PHP
thinkphp5 migrate数据库迁移工具
2018/02/20 PHP
基于jQuery实现下拉收缩(展开与折叠)特效
2012/12/25 Javascript
js实现的点击div区域外隐藏div区域
2014/06/30 Javascript
ie8模式下click无反应点击option无反应的解决方法
2014/10/11 Javascript
JS实现的网页背景闪电闪烁效果代码
2015/10/17 Javascript
js实现获取div坐标的方法
2015/11/16 Javascript
Javascript中常用类型的格式化方法小结
2016/12/26 Javascript
原生js实现瀑布流布局
2017/03/08 Javascript
AngularJS 购物车全选/取消全选功能的实现方法
2017/08/14 Javascript
原生JS实现循环Nodelist Dom列表的4种方式示例
2018/02/11 Javascript
详解Vue基于 Nuxt.js 实现服务端渲染(SSR)
2018/04/05 Javascript
React styled-components设置组件属性的方法
2018/08/07 Javascript
vue动态绑定class的几种常用方式小结
2019/05/21 Javascript
Vue中函数防抖节流的理解及应用实现
2020/04/24 Javascript
微信小程序自定义底部弹出框功能
2020/11/18 Javascript
Python实现处理管道的方法
2015/06/04 Python
Djang中静态文件配置方法
2015/07/30 Python
使用Python进行AES加密和解密的示例代码
2018/02/02 Python
使用Python的OpenCV模块识别滑动验证码的缺口(推荐)
2019/05/10 Python
python的pytest框架之命令行参数详解(上)
2019/06/27 Python
Python 给定的经纬度标注在地图上的实现方法
2019/07/05 Python
Django 模型类(models.py)的定义详解
2019/07/19 Python
使用Python的datetime库处理时间(RPA流程)
2019/11/24 Python
用Python实现童年贪吃蛇小游戏功能的实例代码
2020/12/07 Python
应聘收银员个人的求职信
2013/11/30 职场文书
上班离岗检讨书
2014/01/27 职场文书
自强之星事迹材料
2014/05/12 职场文书
2014年保安个人工作总结
2014/11/13 职场文书
2014年度考核工作总结
2014/12/24 职场文书
2016年秋季新学期致辞
2015/07/30 职场文书
大学宣传委员竞选稿
2015/11/19 职场文书
大学组织委员竞选稿
2015/11/21 职场文书
又涨知识了,自律到底多重要?
2019/06/27 职场文书
MySQL数据库查询进阶之多表查询详解
2022/04/08 MySQL