Understanding Pagination with the NewsDataHub API

This guide explains how to paginate results when using the NewsDataHub API.

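NewsDataHub API is a service that provides news data through a RESTful API. It implements cursor-based pagination to handle large datasets efficiently, letting developers retrieve news articles in manageable batches. Each response contains a set of articles, and each article object includes details such as the title, description, publication date, source, content, keywords, topics, and sentiment analysis. The API uses a `cursor` parameter to move through results, and its documentation covers advanced features such as search parameters and filtering options.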

For documentation, visit: https://newsdatahub.com/docs

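APIs typically return a limited amount of data per response, because returning every result in a single request is usually impractical. Instead, they use pagination, a technique that splits data into separate pages or batches. This lets clients retrieve one page at a time and work with a manageable subset of the results.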

When you make an initial request to the `/news` endpoint and receive the first batch of results, the response has the following shape:

{
    "next_cursor": "VW93MzoqpzM0MzgzMQpqwDAwMDQ5LjA6MzA0NTM0Mjk1T0xHag==",
        "total_results": 910310,
        "per_page": 10,
        "data": [
            {
                "id": "4927167e-93f3-45d2-9c53-f1b8cdf2888f",
                "title": "Jail time for wage theft: New laws start January",
                "source_title": "Dynamic Business",
                "source_link": "https://dynamicbusiness.com",
                "article_link": "https://dynamicbusiness.com/topics/news/jail-time-for-wage-theft-new-laws-start-january.html",
                "keywords": [
                    "wage theft",
                    "criminalisation of wage theft",
                    "Australian businesses",
                    "payroll errors",
                    "underpayment laws"
                ],
                "topics": [
                    "law",
                    "employment",
                    "economy"
                ],
                "description": "Starting January 2025, deliberate wage theft will come with serious consequences for employers in Australia.",
                "pub_date": "2024-12-17T07:15:00",
                "creator": null,
                "content": "The criminalisation of wage theft from January 2025 will be a wake-up call for all Australian businesses. While deliberate underpayment has rightly drawn scrutiny, our research reveals that accidental payroll errors are alarmingly common, affecting nearly 60% of companies in the past two years. Matt Loop, VP and Head of Asia at Rippling Starting January 1, 2025, Australias workplace compliance landscape will change dramatically. Employers who deliberately underpay employees could face fines as high as AU$8. 25 million or up to 10 years in prison under new amendments to the Fair Work Act 2009 likely. Employers must act decisively to ensure compliance, as ignorance or unintentional errors wont shield them from civil or criminal consequences. Matt Loop, VP and Head of Asia at Rippling, says: The criminalisation of wage theft from January 2025 will be a wake-up call for all Australian businesses. While deliberate underpayment has rightly drawn scrutiny, our research reveals that accidental payroll errors are alarmingly common, affecting nearly 60% of companies in the past two years. Adding to the challenge, many SMEs still rely on fragmented, siloed systems to manage payroll. This not only complicates operations but significantly increases the risk of errors heightening the potential for non-compliance under the new laws. The urgency for businesses to modernise their approach cannot be overstated. Technology offers a practical solution, helping to streamline and automate processes, reduce human error, and ensure compliance. But this is about more than just avoiding penalties. Accurate and timely pay builds trust with employees, strengthens workplace morale, and fosters accountability. The message is clear: wage theft isnt just a financial risk anymoreits a criminal offense. Now is the time to ensure your business complies with Australias new workplace laws. Keep up to date with our stories on LinkedIn, Twitter, Facebook and Instagram.",
                "media_url": "https://backend.dynamicbusiness.com/wp-content/uploads/2024/12/db-3-4.jpg",
                "media_type": "image/jpeg",
                "media_description": null,
                "media_credit": null,
                "media_thumbnail": null,
                "language": "en",
                "sentiment": {
                    "pos": 0.083,
                    "neg": 0.12,
                    "neu": 0.796
                }
            },
            // more article objects
        ]
}

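Note the first property in the JSON response, `next_cursor`. Its value points to the start of the next page of results. When making the next request, pass that value as the `cursor` query parameter, like this: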

`https://api.newsdatahub.com/v1/news?cursor=VW93MzoqpzM0MzgzMQpqwDAwMDQ5LjA6MzA0NTM0Mjk1T0xHag==`
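
Cursor values should be treated as opaque strings. The one above ends in `==`, so when building the URL in code it is safer to let a library URL-encode the value than to concatenate strings by hand. Here is a minimal sketch using only the Python standard library; the helper name `build_next_url` is hypothetical and not part of any NewsDataHub SDK:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "https://api.newsdatahub.com/v1/news"

def build_next_url(cursor=None):
    """Return the request URL for the page identified by `cursor`.

    Illustrative helper only; the name is an assumption.
    """
    if cursor is None:
        return BASE_URL  # first page: no cursor parameter needed
    return f"{BASE_URL}?{urlencode({'cursor': cursor})}"

cursor = "VW93MzoqpzM0MzgzMQpqwDAwMDQ5LjA6MzA0NTM0Mjk1T0xHag=="
url = build_next_url(cursor)
print(url)  # the trailing "==" is percent-encoded as "%3D%3D"

# Decoding the query string gives back the original cursor unchanged.
assert parse_qs(urlsplit(url).query)["cursor"] == [cursor]
```

HTTP libraries such as `requests` apply the same encoding automatically when the cursor is passed through their `params` argument.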

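The easiest way to try paginating results is with Postman or a similar tool. Here is a short video that demonstrates how to paginate results in Postman using the cursor value.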

https://youtu.be/G7kkTwCPtCE

When the `next_cursor` value is `null`, you have reached the end of the available results for your selected criteria.
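
In Python, a JSON `null` is decoded to `None`, so the end-of-results check becomes an identity comparison. The sketch below uses two hand-written, heavily abbreviated response bodies (not real API output) to show the check; `has_more_pages` is a hypothetical helper name:

```python
import json

def has_more_pages(page):
    """Return True if the decoded response carries a cursor for another page."""
    # .get() also treats a missing key as "no more pages". That is an
    # assumption; the guide only describes an explicit null value.
    return page.get("next_cursor") is not None

middle_page = json.loads('{"next_cursor": "abc123", "data": [{"title": "A"}]}')
last_page = json.loads('{"next_cursor": null, "data": [{"title": "B"}]}')

print(has_more_pages(middle_page))  # True
print(has_more_pages(last_page))    # False
```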

Paginating Results with Python

Here is how to set up basic pagination through NewsDataHub API results using Python.

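import os

import requests

# Keep your API key out of source control: read it from an environment
# variable. The variable name NEWSDATAHUB_API_KEY is only a suggestion.
API_KEY = os.environ.get('NEWSDATAHUB_API_KEY', 'your_api_key')
BASE_URL = 'https://api.newsdatahub.com/v1/news'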

headers = {
    'X-Api-Key': API_KEY,
    'Accept': 'application/json',
    'User-Agent': 'Mozilla/5.0 Chrome/83.0.4103.97 Safari/537.36'
}

params = {}
cursor = None

# Limit to 5 pages to avoid rate limiting while demonstrating pagination

for _ in range(5):
    params['cursor'] = cursor

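    try:
        # A timeout keeps the script from hanging on a stalled connection
        response = requests.get(BASE_URL, headers=headers, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()
    except (requests.RequestException, ValueError) as e:
        # Covers connection errors, HTTP error statuses, and invalid JSON
        print(f"There was an error when making the request: {e}")
        # Stop instead of re-requesting the same page on the next iteration
        break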

    cursor = data.get('next_cursor')

    for article in data.get('data', []):
        print(article['title'])

    if cursor is None:
        print("No more results")
        break
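
The same loop can also be wrapped in a generator, so that calling code iterates over articles without handling cursors directly. This is a sketch under stated assumptions: `iter_articles` and `fetch_page` are hypothetical names, and `fetch_page` stands for any callable that takes a cursor (or `None` for the first page) and returns the decoded JSON page. A stub with made-up data replaces the network call so the example runs offline:

```python
def iter_articles(fetch_page, max_pages=5):
    """Yield articles page by page until next_cursor is None or max_pages is reached."""
    cursor = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        yield from page.get("data", [])
        cursor = page.get("next_cursor")
        if cursor is None:
            break  # no more results

# Made-up three-page dataset keyed by cursor; stands in for the real API.
FAKE_PAGES = {
    None: {"next_cursor": "c1", "data": [{"title": "one"}, {"title": "two"}]},
    "c1": {"next_cursor": "c2", "data": [{"title": "three"}]},
    "c2": {"next_cursor": None, "data": [{"title": "four"}]},
}

def fake_fetch(cursor):
    """Stub for a real HTTP call; returns the fake page for this cursor."""
    return FAKE_PAGES[cursor]

titles = [article["title"] for article in iter_articles(fake_fetch)]
print(titles)  # ['one', 'two', 'three', 'four']
```

Against the real API, `fetch_page` would wrap the `requests.get(...)` call shown earlier and return `response.json()`. The `max_pages` cap plays the same role as the five-page limit in the loop.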

Index-Based Pagination

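Some APIs use index-based pagination to split results into discrete chunks. With this approach, the API returns the data for a specific page, much like a book's table of contents, where each page number points to a specific section.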

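While index-based pagination is easier to implement, it has several drawbacks. It copes poorly with data that changes in real time, can return inconsistent results, and puts more load on the database, because retrieving each new page requires scanning past all the preceding records.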
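
The inconsistency problem can be seen in a small in-memory simulation. It is illustrative only and says nothing about how NewsDataHub stores or queries data. Articles are modeled as integer ids sorted newest first, index-based pagination is a list slice, and cursor-based pagination is modeled as "items older than the last id seen" (a keyset-style cursor, which is an assumption):

```python
def page_by_offset(items, page, per_page):
    """Index-based: skip (page - 1) * per_page items, then take per_page."""
    start = (page - 1) * per_page
    return items[start:start + per_page]

def page_by_cursor(items, after_id, per_page):
    """Cursor-based: take per_page items older than the last id seen."""
    older = [i for i in items if after_id is None or i < after_id]
    return older[:per_page]

feed = [5, 4, 3, 2, 1]                        # article ids, newest first
first_offset = page_by_offset(feed, 1, 2)     # [5, 4]
first_cursor = page_by_cursor(feed, None, 2)  # [5, 4]

feed.insert(0, 6)                             # a new article arrives between requests

second_offset = page_by_offset(feed, 2, 2)
second_cursor = page_by_cursor(feed, first_cursor[-1], 2)

print(second_offset)  # [4, 3] -> article 4 is returned a second time
print(second_cursor)  # [3, 2] -> continues exactly where page one ended
```

With offsets, the new article shifts every position by one, so page two repeats an item already seen. The cursor is tied to a position in the data rather than to a count, so it is unaffected by the insertion.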

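We have covered the basics of cursor-based pagination in the NewsDataHub API. For advanced features such as search parameters and filtering options, see the full API documentation at https://newsdatahub.com/docs.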