Traversing the DOM
Traversing a document with Cheerio lets you select and manipulate specific elements within it. Whether you want to move up and down the DOM tree, move sideways within the tree, or filter elements based on certain criteria, Cheerio provides a range of methods to help you do so.
In this guide, we will go through the methods available in Cheerio for traversing and filtering elements: moving down the DOM tree, moving up the DOM tree, moving sideways within the tree, and filtering elements. By the end of this guide, you will have a good understanding of how to use these methods to select and manipulate elements within a document.
This guide is intended to give you an overview of the various methods available in Cheerio for traversing and filtering elements. For a more detailed reference of these methods, see the API documentation.
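Moving Down the DOM Tree
Cheerio provides several methods for moving down the DOM tree and selecting elements that are children or descendants of the current selection.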
find
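The find method allows you to find specific elements within a selection. It takes a CSS selector as an argument and returns a new selection containing all elements that match the selector within the current selection.
Here's an example of using find to select all <li> elements within a <ul> element: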
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>Item 2</li>
      </ul>`,
    );
    const listItems = $('ul').find('li');
    render(<>List item count: {listItems.length}</>);
children
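The children method allows you to select the direct children of an element. It returns a new selection containing all direct children of the current selection.
Here's an example of using children to select all <li> elements within a <ul> element: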
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>Item 2</li>
      </ul>`,
    );
    const listItems = $('ul').children('li');
    render(<>List item count: {listItems.length}</>);
contents
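The contents method allows you to select all children of an element, including text and comment nodes. It returns a new selection containing all children of the current selection.
Here's an example of using contents to select all children of a <div> element: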
    const $ = cheerio.load(
      `<div>
        Text <p>Paragraph</p>
      </div>`,
    );
    const contents = $('div').contents();
    render(<>Contents count: {contents.length}</>);
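Moving Up the DOM Tree
Cheerio provides several methods for moving up the DOM tree and selecting elements that are ancestors of the current selection.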
parent
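The parent method allows you to select the parent element of a selection. It returns a new selection containing the parent element of each element in the current selection.
Here's an example of using parent to select the parent <ul> element of a <li> element: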
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
      </ul>`,
    );
    const list = $('li').parent();
    render(<>{list.prop('tagName')}</>);
parents and parentsUntil
The parents method allows you to select all ancestor elements of a selection, up to the root element. It returns a new selection containing all ancestor elements of the current selection.
The parentsUntil method is similar to parents, but allows you to specify an ancestor element as a stop point. It returns a new selection containing all ancestor elements of the current selection up to (but not including) the specified ancestor.
Here's an example of using parents and parentsUntil to select ancestor elements of a <li> element:
    const $ = cheerio.load(
      `<div>
        <ul>
          <li>Item 1</li>
        </ul>
      </div>`,
    );
    const ancestors = $('li').parents();
    const ancestorsUntil = $('li').parentsUntil('div');
    render(
      <>
        <p>
          Ancestor count (also includes "body" and "html" tags): {ancestors.length}
        </p>
        <p>Ancestor count (until "div"): {ancestorsUntil.length}</p>
      </>,
    );
closest
The closest method allows you to select the closest ancestor matching a given selector. It returns a new selection containing the closest ancestor element that matches the selector. If no matching ancestor is found, the method returns an empty selection.
Here's an example of using closest to select the closest ancestor <ul> element of a <li> element:
    const $ = cheerio.load(
      `<div>
        <ul>
          <li>Item 1</li>
        </ul>
      </div>`,
    );
    const list = $('li').closest('ul');
    render(<>{list.prop('tagName')}</>);
Moving Sideways Within the DOM Tree
Cheerio provides several methods for moving sideways within the DOM tree and selecting elements that are siblings of the current selection.
next and prev
The next method allows you to select the next sibling element of a selection. It returns a new selection containing the next sibling element (if there is one). If the given selection contains multiple elements, next includes the next sibling for each one.
The prev method is similar to next, but allows you to select the previous sibling element. It returns a new selection containing the previous sibling element for each element in the given selection.
Here's an example of using next and prev to select sibling elements of a <li> element:
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>Item 2</li>
      </ul>`,
    );
    const nextItem = $('li:first').next();
    const prevItem = $('li:eq(1)').prev();
    render(
      <>
        <p>Next: {nextItem.text()}</p>
        <p>Prev: {prevItem.text()}</p>
      </>,
    );
nextAll, prevAll, and siblings
The nextAll method allows you to select all siblings after the current element. It returns a new selection containing all sibling elements after each element in the current selection.
The prevAll method is similar to nextAll, but allows you to select all siblings before the current element. It returns a new selection containing all sibling elements before each element in the current selection.
The siblings method allows you to select all siblings of a selection. It returns a new selection containing all sibling elements of each element in the current selection.
Here's an example of using nextAll, prevAll, and siblings to select sibling elements of a <li> element:
    const $ = cheerio.load(
      `<ul>
        <li>[1]</li>
        <li>[2]</li>
        <li>[3]</li>
      </ul>`,
    );
    const nextAll = $('li:first').nextAll();
    const prevAll = $('li:last').prevAll();
    const siblings = $('li:eq(1)').siblings();
    render(
      <>
        <p>Next All: {nextAll.text()}</p>
        <p>Prev All: {prevAll.text()}</p>
        <p>Siblings: {siblings.text()}</p>
      </>,
    );
nextUntil and prevUntil
The nextUntil method allows you to select all siblings after the current element up to a specified sibling. It takes a selector or a sibling element as an argument and returns a new selection containing all sibling elements after the current element up to (but not including) the specified element.
The prevUntil method is similar to nextUntil, but allows you to select all siblings before the current element up to a specified sibling. It takes a selector or a sibling element as an argument and returns a new selection containing all sibling elements before the current element up to (but not including) the specified element.
Here's an example of using nextUntil and prevUntil to select sibling elements of a <li> element:
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>Item 2</li>
        <li>Item 3</li>
      </ul>`,
    );
    const nextUntil = $('li:first').nextUntil('li:last-child');
    const prevUntil = $('li:last').prevUntil('li:first-child');
    render(
      <>
        <p>Next: {nextUntil.text()}</p>
        <p>Prev: {prevUntil.text()}</p>
      </>,
    );
Filtering elements
Cheerio provides several methods for filtering elements within a selection.
Most of these filters also exist as selectors. For example, the first method is available as the :first selector. Users are encouraged to use the selector syntax when possible, as it is more performant.
eq
The eq method allows you to select an element at a specified index within a selection. It takes an index as an argument and returns a new selection containing the element at the specified index.
Here's an example of using eq to select the second <li> element within a <ul> element:
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>Item 2</li>
      </ul>`,
    );
    const secondItem = $('li').eq(1);
    render(<>{secondItem.text()}</>);
filter and not
The filter method allows you to select elements that match a given selector. It takes a selector as an argument and returns a new selection containing only those elements that match the selector.
The not method is similar to filter, but allows you to select elements that do not match a given selector. It takes a selector as an argument and returns a new selection containing only those elements that do not match the selector.
Here's an example of using filter and not to select <li> elements within a <ul> element:
    const $ = cheerio.load(
      `<ul>
        <li class="item">Item 1</li>
        <li>Item 2</li>
      </ul>`,
    );
    const matchingItems = $('li').filter('.item');
    const nonMatchingItems = $('li').not('.item');
    render(
      <>
        <p>Matching: {matchingItems.text()}</p>
        <p>Non-matching: {nonMatchingItems.text()}</p>
      </>,
    );
has
The has method allows you to select elements that contain an element matching a given selector. It takes a selector as an argument and returns a new selection containing only those elements that contain an element matching the selector.
Here's an example of using has to select <li> elements within a <ul> element that contain a <strong> element:
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>
          <strong>Item 2</strong>
        </li>
      </ul>`,
    );
    const matchingItems = $('li').has('strong');
    render(<>{matchingItems.length}</>);
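first and last
The first method allows you to select the first element within a selection. It returns a new selection containing the first element.
The last method is similar to first, but allows you to select the last element within a selection. It returns a new selection containing the last element.
Here's an example of using first and last to select <li> elements within a <ul> element: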
    const $ = cheerio.load(
      `<ul>
        <li>Item 1</li>
        <li>Item 2</li>
      </ul>`,
    );
    const firstItem = $('li').first();
    const lastItem = $('li').last();
    render(
      <>
        <p>First: {firstItem.text()}</p>
        <p>Last: {lastItem.text()}</p>
      </>,
    );
Conclusion
Cheerio provides a range of methods for traversing and filtering elements within a document. These methods allow you to move up and down the DOM tree, move sideways within the tree, and filter elements based on various criteria. By using these methods, you can easily select and manipulate elements within a document.