
CAPTCHA Solving

Using Hyperbrowser's CAPTCHA Solving


Last updated 1 month ago

Hyperbrowser's CAPTCHA solving feature requires a paid plan.

In this guide, we will see how to use Hyperbrowser and its integrated CAPTCHA solver to scrape Today's Top Deals from Amazon without being blocked.

Setup

First, let's create a new Node.js project.

mkdir amazon-deals-scraper && cd amazon-deals-scraper
npm init -y

Installation

Next, let's install the necessary dependencies to run our script.

npm install @hyperbrowser/sdk puppeteer-core dotenv

Setup your Environment

To use Hyperbrowser with your code, you will need an API key. You can get one from the dashboard. Once you have your API key, add it to your .env file as HYPERBROWSER_API_KEY.
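
A missing or empty key tends to surface later as a less obvious failure, so it can help to check for it at startup. The sketch below is one way to do that. requireEnv is a hypothetical helper, not part of the Hyperbrowser SDK or dotenv, and it assumes dotenv's config() has already populated process.env.

```javascript
// Hypothetical helper: read a required variable or throw a clear error.
// `env` defaults to process.env and is injectable so the function is easy to test.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value.trim();
}

// Usage (after dotenv's config() has run):
// const apiKey = requireEnv("HYPERBROWSER_API_KEY");
```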

Code

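Next, create a new file index.js and add the following code. The script uses ES module import syntax, so if your Node.js version does not load it as a module automatically, add "type": "module" to package.json.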

import { Hyperbrowser } from "@hyperbrowser/sdk";
import { config } from "dotenv";
import { connect } from "puppeteer-core";

config();

const client = new Hyperbrowser({
  apiKey: process.env.HYPERBROWSER_API_KEY,
});

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const main = async () => {
  console.log("Starting session");
  const session = await client.sessions.create({
    solveCaptchas: true,
    adblock: true,
    annoyances: true,
    trackers: true,
  });
  console.log("Session created:", session.id);

  try {
    const browser = await connect({
      browserWSEndpoint: session.wsEndpoint,
      defaultViewport: null,
    });

    const [page] = await browser.pages();

    await page.goto("https://amazon.com/deals", {
      waitUntil: "load",
      timeout: 20_000,
    });

    const pageTitle = await page.title();
    console.log("Navigated to Page:", pageTitle);

    await sleep(10_000);

    const products = await page.evaluate(() => {
      const items = document.querySelectorAll(".dcl-carousel-element");
      return Array.from(items)
        .map((item) => {
          const nameElement = item.querySelector(".dcl-product-label");
          const dealPriceElement = item.querySelector(
            ".dcl-product-price-new .a-offscreen"
          );
          const originalPriceElement = item.querySelector(
            ".dcl-product-price-old .a-offscreen"
          );
          const percentOffElement = item.querySelector(
            ".dcl-badge .a-size-mini"
          );

          return {
            name: nameElement ? nameElement.textContent?.trim() : null,
            dealPrice: dealPriceElement
              ? dealPriceElement.textContent?.trim()
              : null,
            originalPrice: originalPriceElement
              ? originalPriceElement.textContent?.trim()
              : null,
            percentOff: percentOffElement
              ? percentOffElement.textContent?.trim()
              : null,
          };
        })
        .filter((product) => product.name && product.dealPrice);
    });

    console.log("Found products:", JSON.stringify(products, null, 2));
  } catch (error) {
    console.error(`Encountered an error: ${error}`);
  } finally {
    await client.sessions.stop(session.id);
    console.log("Session stopped:", session.id);
  }
};

main().catch((error) => {
  console.error(`Encountered an error: ${error}`);
});
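
The four near-identical ternaries in the page.evaluate callback can be folded into one helper. The following is a sketch of that refactor, not a change to what is scraped. textOf and toProduct are hypothetical names, and the selectors are copied from the script. Because page.evaluate runs its callback inside the page, both helpers would need to be defined inside that callback to be used there.

```javascript
// Hypothetical helper: trimmed text of the first match inside `root`, or null.
const textOf = (root, selector) =>
  root.querySelector(selector)?.textContent?.trim() ?? null;

// Hypothetical helper: map one deal card element to a plain product object.
const toProduct = (item) => ({
  name: textOf(item, ".dcl-product-label"),
  dealPrice: textOf(item, ".dcl-product-price-new .a-offscreen"),
  originalPrice: textOf(item, ".dcl-product-price-old .a-offscreen"),
  percentOff: textOf(item, ".dcl-badge .a-size-mini"),
});

// Inside page.evaluate, the mapping step would then read:
// Array.from(items).map(toProduct).filter((p) => p.name && p.dealPrice);
```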

Run the Scraper

To run the Amazon deals scraper:

  1. In your terminal, navigate to the project directory

  2. Run the script with Node.js:

node index.js

The script will:

  1. Create a new Hyperbrowser session with CAPTCHA solving, ad blocking, and anti-tracking enabled

  2. Connect Puppeteer to the session's browser through its WebSocket endpoint

  3. Navigate to the Amazon deals page, solving any CAPTCHAs that are encountered

  4. Wait 10 seconds for the page to load its content

  5. Scrape the deal data using Puppeteer's page.evaluate method

  6. Print the scraped products to the console

  7. Stop the Hyperbrowser session

You should see the scraped products printed in the console, like:

[
  {
    "name": "Apple AirPods Pro",
    "dealPrice": "$197.00",
    "originalPrice": "$249.99",
    "percentOff": "21% off"
  },
  {
    "name": "Echo Dot (4th Gen)",
    "dealPrice": "$27.99",
    "originalPrice": "$49.99",
    "percentOff": "44% off"
  }
]
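
The prices come back as display strings, so sorting or comparing them numerically needs a parsing step. As a minimal sketch, assuming US-style formatting such as "$1,299.00", the hypothetical helpers below convert a price string to a number and recompute the discount from the two prices.

```javascript
// Hypothetical helper: "$1,299.00" -> 1299. Returns null if no number is found.
// Assumes US-style formatting (comma thousands separator, dot decimal point).
function parsePrice(text) {
  if (typeof text !== "string") return null;
  const digits = text.replace(/[^0-9.]/g, "");
  if (digits === "") return null;
  const value = Number(digits);
  return Number.isFinite(value) ? value : null;
}

// Hypothetical helper: whole-number percent saved, or null if a price is unusable.
function discountPercent(dealPrice, originalPrice) {
  const deal = parsePrice(dealPrice);
  const original = parsePrice(originalPrice);
  if (deal === null || original === null || original <= 0) return null;
  return Math.round(((original - deal) / original) * 100);
}
```

For the first sample product, discountPercent("$197.00", "$249.99") evaluates to 21, which agrees with its "21% off" badge.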

How it Works

Let's break down the key parts:

  1. We create a new Hyperbrowser session with solveCaptchas, adblock, annoyances, and trackers set to true. This enables the CAPTCHA solver along with ad, annoyance, and tracker blocking.

  2. We connect Puppeteer to the Hyperbrowser session's browser using the session's wsEndpoint.

  3. We navigate to the Amazon deals page and wait for any CAPTCHAs to be solved automatically by Hyperbrowser.

  4. We pause execution for 10 seconds with sleep to allow all content to be loaded.

  5. We use page.evaluate to run JavaScript on the page to scrape the deal data.

  6. In the evaluator function, we select the deal elements, extract the relevant data, and return an array of product objects.

  7. We print the scraped data and stop the Hyperbrowser session.
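
The create, try, and finally pattern from the steps above can be factored into a small wrapper so that a session is always stopped, even when the scraping code throws. This is a sketch rather than part of the SDK: withSession is a hypothetical helper, and it relies only on the client.sessions.create and client.sessions.stop calls already used in the script.

```javascript
// Hypothetical helper: create a session, run `work(session)`, always stop the session.
async function withSession(client, options, work) {
  const session = await client.sessions.create(options);
  try {
    return await work(session);
  } finally {
    // Runs on success and on error, so sessions are not left running.
    await client.sessions.stop(session.id);
  }
}

// Usage sketch:
// await withSession(client, { solveCaptchas: true }, async (session) => {
//   const browser = await connect({ browserWSEndpoint: session.wsEndpoint });
//   // scrape with the connected browser here
// });
```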

Without solveCaptchas enabled, we could be shown a CAPTCHA challenge screen instead of the deals page when navigating to it.

The CAPTCHA solver runs automatically in the background, so we don't need to handle CAPTCHAs explicitly in our script. If a CAPTCHA appears, Hyperbrowser solves it and the page continues loading. In this case, that means continuing on to the deals page.
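
Since the time spent on a CAPTCHA can vary, a fixed 10-second sleep may be too short on some runs and needlessly long on others. One alternative, sketched below, is to wait for the first deal card to appear using Puppeteer's page.waitForSelector. The helper name waitForDeals and the 30-second default are assumptions; the selector is the one the script already queries.

```javascript
// Hypothetical helper: resolve true once `selector` appears, false on timeout.
// `page` is a Puppeteer Page (anything with a compatible waitForSelector works).
async function waitForDeals(
  page,
  selector = ".dcl-carousel-element",
  timeoutMs = 30_000
) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (error) {
    // Puppeteer signals a timeout with an error named "TimeoutError".
    if (error && error.name === "TimeoutError") return false;
    throw error; // anything else (for example a closed page) is a real failure
  }
}

// Usage sketch, in place of `await sleep(10_000)`:
// if (!(await waitForDeals(page))) console.warn("No deals appeared in time");
```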

To solve simple image-based CAPTCHAs (the kind whose solution is typed into a text box for verification), you also have to add the imageCaptchaParams field. It takes an array of objects, each with an image selector and an input selector. Together, these specify where the CAPTCHA image comes from and which input box the solution should be filled into. The selectors follow the standard HTML query-selector format, as specified on MDN.
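
As an illustration of that shape, the sketch below builds session options for a page with one image CAPTCHA. The property names imageSelector and inputSelector are assumptions inferred from the description, and the two CSS selectors are placeholders for a hypothetical page. Check the Session Parameters reference for the exact names before relying on them.

```javascript
// Hypothetical helper: build session options that include image CAPTCHA targets.
// NOTE: `imageSelector` and `inputSelector` are assumed property names.
function buildCaptchaSessionOptions(targets) {
  return {
    solveCaptchas: true,
    imageCaptchaParams: targets.map(({ image, input }) => ({
      imageSelector: image, // CSS selector of the image showing the CAPTCHA
      inputSelector: input, // CSS selector of the text box that receives the answer
    })),
  };
}

// Placeholder selectors for a hypothetical page.
const options = buildCaptchaSessionOptions([
  { image: "#captcha-image", input: "#captcha-answer" },
]);
// `options` could then be passed to client.sessions.create(options).
```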
