
Traversing the DOM

Traversing a document with Cheerio lets you select and manipulate specific elements within it. Whether you want to move up and down the DOM tree, move sideways within it, or filter elements by certain criteria, Cheerio provides a range of methods to help you do so.

In this guide, we will go through the methods available in Cheerio for traversing and filtering elements. We will cover methods for moving down the DOM tree, moving up the DOM tree, moving sideways within the tree, and filtering elements. By the end of this guide, you will have a good understanding of how to use these methods to select and manipulate elements within a document.

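Tip

This guide is intended to give you an overview of the various methods available in Cheerio for traversing and filtering elements. For a more detailed reference of these methods, see the API documentation.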

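Moving Down the DOM Tree

Cheerio provides several methods for moving down the DOM tree and selecting elements that are children or descendants of the current selection.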

find

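The find method allows you to find specific elements within a selection. It takes a CSS selector as an argument and returns a new selection containing all descendants of the current selection that match the selector.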

Here's an example of using find to select all <li> elements within a <ul> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>`,
);

const listItems = $('ul').find('li');
render(<>List item count: {listItems.length}</>);

children

The children method allows you to select the direct children of an element. It returns a new selection containing all direct children of the current selection.

Here's an example of using children to select all <li> elements within a <ul> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>`,
);

const listItems = $('ul').children('li');
render(<>List item count: {listItems.length}</>);

contents

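The contents method allows you to select all children of an element, including text and comment nodes. It returns a new selection containing all child nodes of the current selection.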

Here's an example of using contents to select all children of a <div> element:

const $ = cheerio.load(
  `<div>
    Text <p>Paragraph</p>
  </div>`,
);

const contents = $('div').contents();
render(<>Contents count: {contents.length}</>);

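Moving Up the DOM Tree

Cheerio provides several methods for moving up the DOM tree and selecting elements that are ancestors of the current selection.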

parent

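The parent method allows you to select the parent element of a selection. It returns a new selection containing the parent element of each element in the current selection.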

Here's an example of using parent to select the parent <ul> element of a <li> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
  </ul>`,
);

const list = $('li').parent();
render(<>{list.prop('tagName')}</>);

parents and parentsUntil

The parents method allows you to select all ancestor elements of a selection, up to the root element. It returns a new selection containing all ancestor elements of the current selection.

The parentsUntil method is similar to parents, but allows you to specify an ancestor element as a stop point. It returns a new selection containing all ancestor elements of the current selection up to (but not including) the specified ancestor.

Here's an example of using parents and parentsUntil to select ancestor elements of a <li> element:

const $ = cheerio.load(
  `<div>
    <ul>
      <li>Item 1</li>
    </ul>
  </div>`,
);

const ancestors = $('li').parents();
const ancestorsUntil = $('li').parentsUntil('div');

render(
  <>
    <p>
      Ancestor count (also includes "body" and "html" tags): {ancestors.length}
    </p>
    <p>Ancestor count (until "div"): {ancestorsUntil.length}</p>
  </>,
);

closest

The closest method allows you to select the closest ancestor matching a given selector. It returns a new selection containing the closest ancestor element that matches the selector. If no matching ancestor is found, the method returns an empty selection.

Here's an example of using closest to select the closest ancestor <ul> element of a <li> element:

const $ = cheerio.load(
  `<div>
    <ul>
      <li>Item 1</li>
    </ul>
  </div>`,
);

const list = $('li').closest('ul');
render(<>{list.prop('tagName')}</>);

Moving Sideways Within the DOM Tree

Cheerio provides several methods for moving sideways within the DOM tree and selecting elements that are siblings of the current selection.

next and prev

The next method allows you to select the next sibling element of a selection. It returns a new selection containing the next sibling element (if there is one). If the given selection contains multiple elements, next includes the next sibling for each one.

The prev method is similar to next, but allows you to select the previous sibling element. It returns a new selection containing the previous sibling element for each element in the given selection.

Here's an example of using next and prev to select sibling elements of a <li> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>`,
);

const nextItem = $('li:first').next();
const prevItem = $('li:eq(1)').prev();

render(
  <>
    <p>Next: {nextItem.text()}</p>
    <p>Prev: {prevItem.text()}</p>
  </>,
);

nextAll, prevAll, and siblings

The nextAll method allows you to select all siblings after the current element. It returns a new selection containing all sibling elements after each element in the current selection.

The prevAll method is similar to nextAll, but allows you to select all siblings before the current element. It returns a new selection containing all sibling elements before each element in the current selection.

The siblings method allows you to select all siblings of a selection. It returns a new selection containing all sibling elements of each element in the current selection.

Here's an example of using nextAll, prevAll, and siblings to select sibling elements of a <li> element:

const $ = cheerio.load(
  `<ul>
    <li>[1]</li>
    <li>[2]</li>
    <li>[3]</li>
  </ul>`,
);

const nextAll = $('li:first').nextAll();
const prevAll = $('li:last').prevAll();
const siblings = $('li:eq(1)').siblings();

render(
  <>
    <p>Next All: {nextAll.text()}</p>
    <p>Prev All: {prevAll.text()}</p>
    <p>Siblings: {siblings.text()}</p>
  </>,
);

nextUntil and prevUntil

The nextUntil method allows you to select all siblings after the current element up to a specified sibling. It takes a selector or a sibling element as an argument and returns a new selection containing all sibling elements after the current element up to (but not including) the specified element.

The prevUntil method is similar to nextUntil, but allows you to select all siblings before the current element up to a specified sibling. It takes a selector or a sibling element as an argument and returns a new selection containing all sibling elements before the current element up to (but not including) the specified element.

Here's an example of using nextUntil and prevUntil to select sibling elements of a <li> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
  </ul>`,
);

const nextUntil = $('li:first').nextUntil('li:last-child');
const prevUntil = $('li:last').prevUntil('li:first-child');

render(
  <>
    <p>Next: {nextUntil.text()}</p>
    <p>Prev: {prevUntil.text()}</p>
  </>,
);

Filtering elements

Cheerio provides several methods for filtering elements within a selection.

Tip

Most of these filters also exist as selectors. For example, the first method is available as the :first selector. Users are encouraged to use the selector syntax when possible, as it is more performant.

eq

The eq method allows you to select an element at a specified index within a selection. It takes an index as an argument and returns a new selection containing the element at the specified index.

Here's an example of using eq to select the second <li> element within a <ul> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>`,
);

const secondItem = $('li').eq(1);
render(<>{secondItem.text()}</>);

filter and not

The filter method allows you to select elements that match a given selector. It takes a selector as an argument and returns a new selection containing only those elements that match the selector.

The not method is similar to filter, but allows you to select elements that do not match a given selector. It takes a selector as an argument and returns a new selection containing only those elements that do not match the selector.

Here's an example of using filter and not to select <li> elements within a <ul> element:

const $ = cheerio.load(
  `<ul>
    <li class="item">Item 1</li>
    <li>Item 2</li>
  </ul>`,
);

const matchingItems = $('li').filter('.item');
const nonMatchingItems = $('li').not('.item');

render(
  <>
    <p>Matching: {matchingItems.text()}</p>
    <p>Non-matching: {nonMatchingItems.text()}</p>
  </>,
);

has

The has method allows you to select elements that contain an element matching a given selector. It takes a selector as an argument and returns a new selection containing only those elements that contain an element matching the selector.

Here's an example of using has to select <li> elements within a <ul> element that contain a <strong> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>
      <strong>Item 2</strong>
    </li>
  </ul>`,
);

const matchingItems = $('li').has('strong');
render(<>{matchingItems.length}</>);

first and last

The first method allows you to select the first element within a selection. It returns a new selection containing the first element.

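The last method is similar to first, but allows you to select the last element within a selection. It returns a new selection containing the last element.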

Here's an example of using first and last to select <li> elements within a <ul> element:

const $ = cheerio.load(
  `<ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>`,
);

const firstItem = $('li').first();
const lastItem = $('li').last();

render(
  <>
    <p>First: {firstItem.text()}</p>
    <p>Last: {lastItem.text()}</p>
  </>,
);

Conclusion

Cheerio provides a range of methods for traversing and filtering elements within a document. These methods allow you to move up and down the DOM tree, move sideways within the tree, and filter elements based on various criteria. By using these methods, you can easily select and manipulate elements within a document.