Updated on
October 29, 2024

Proxy in Playwright: 3 Effective Setup Methods Explained

Playwright is a powerful automation tool designed for web scraping, testing, and interaction with web applications across multiple browsers. Its flexibility and cross-browser support make it a favorite for developers seeking reliable, scalable solutions. However, like any web interaction tool, Playwright can face issues with access restrictions, rate limits, and region-based blocks when working with certain websites. This is where proxies come in.

In this article, we’ll dive into three effective methods to set up proxies in Playwright to help you bypass such restrictions: using a static IP, a proxy list, and rotating IPs (both datacenter and residential). Each method offers different benefits depending on your use case, whether it's for authentication, handling multiple requests, or avoiding bans. We’ll also address common proxy-related issues and how to resolve them, ensuring your setup runs smoothly without interruptions.

3 Methods of setting up a proxy in Playwright

Whether you're aiming to maintain consistent sessions or handle complex tasks like user authentication, choosing the right proxy method is the key. Here are three effective methods for setting up proxies in Playwright:

1) Static IP Proxy: This method uses a single, consistent IP address for all your Playwright sessions, which is useful for maintaining session persistence and authentication.

2) Proxy List: With this approach, you can rotate between multiple proxies from a predefined list, which helps distribute requests and avoid hitting rate limits, while maintaining authentication across different IPs.

3) Rotating IP (Datacenter and Residential Proxy): This method automatically switches between IPs, offering a dynamic solution for large-scale tasks, especially those that require bypassing region-specific restrictions and handling authentications seamlessly across multiple IPs.

Let’s go through each of these methods with code examples.

Method 1: Static IP proxy setup

Using a Static IP proxy ensures session persistence, making it ideal for tasks that require authentication and continuous access from the same IP address. Here's how you can set it up in Playwright:

Import Playwright: First, you need to import Playwright. This module gives you access to the browsers and functionalities required to run automated browser tasks.

const { chromium } = require('playwright');

Launch the Browser with a Static IP Proxy: When launching the browser, configure it to use a static IP by providing the proxy details. The proxy object includes the proxy server's IP, port, and optional authentication credentials.

(async () => {
    const browser = await chromium.launch({
        proxy: {
            server: 'http://your-static-ip:port',
            username: 'proxy-username',
            password: 'proxy-password'
        }
    });
  • server: The static IP and port of your proxy.
  • username/password: If the proxy requires authentication, you can pass the credentials here. This is crucial for proxies that require login before allowing access to resources. Webshare users can find the details in their account.

Create a New Browser Context: A browser context is an isolated session inside the browser, with its own cookies and storage. Create a new context to open one or more pages that share session data while you interact with different parts of the website.

const context = await browser.newContext();

Open a New Page: With the browser context set up, you can open a new page within this context. This page will be used for navigating and interacting with websites.

const page = await context.newPage();

Navigate to the Target URL: Use page.goto() to navigate to the website where you’ll perform tasks like scraping or interacting with authenticated resources. Since the proxy is already configured, all requests will be routed through the static IP.

Once your tasks are complete, close the browser to free up resources.

await page.goto('https://example.com'); // the URL you're targeting
await browser.close();

})();
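
Hard-coding credentials in the launch call is fine for a quick test, but a script that gets shared or committed is usually better off reading them from the environment. The sketch below shows one way to build the proxy object that way. The helper name proxyFromEnv and the variable names PROXY_SERVER, PROXY_USERNAME, and PROXY_PASSWORD are assumptions made for this example; neither Playwright nor any proxy provider defines them.

```javascript
// Build the `proxy` option for chromium.launch() from environment variables.
// PROXY_SERVER, PROXY_USERNAME and PROXY_PASSWORD are hypothetical names
// chosen for this sketch.
function proxyFromEnv(env = process.env) {
    if (!env.PROXY_SERVER) {
        throw new Error('PROXY_SERVER is not set');
    }
    const proxy = { server: env.PROXY_SERVER };
    // Only add credentials when both are present; open proxies need neither.
    if (env.PROXY_USERNAME && env.PROXY_PASSWORD) {
        proxy.username = env.PROXY_USERNAME;
        proxy.password = env.PROXY_PASSWORD;
    }
    return proxy;
}

// Usage inside an async function (requires the playwright package):
// const { chromium } = require('playwright');
// const browser = await chromium.launch({ proxy: proxyFromEnv() });
```

The rest of the script stays the same as in the steps above; only the source of the proxy details changes.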

Method 2: Using a proxy list

Using a proxy list in Playwright allows you to assign different proxies for various browser sessions. This is useful for handling multiple sessions from different IP addresses or bypassing access restrictions. You can easily configure Playwright to select and apply proxies from a predefined list. Here are the steps you need to follow:

Import Playwright: As always, start by importing the Playwright module.

const { chromium } = require('playwright');

Prepare Your Proxy List: Create an array that holds your list of proxies, with each proxy containing the IP, port, and, if required, authentication credentials.

const proxyList = [
    { server: 'http://proxy1-ip:port', username: 'user1', password: 'pass1' },
    { server: 'http://proxy2-ip:port', username: 'user2', password: 'pass2' },
    { server: 'http://proxy3-ip:port', username: 'user3', password: 'pass3' },
];

In this list, replace the placeholder values with your actual proxy IPs, ports, usernames, and passwords.

Select a Random Proxy: To vary the proxy between sessions, you can randomly choose one from the proxyList array. The following function uses Math.random() to generate a random index based on the number of proxies in your list.

function getRandomProxy() {
    const randomIndex = Math.floor(Math.random() * proxyList.length);
    return proxyList[randomIndex];
}

Launch the Browser with a Random Proxy: For each Playwright session, launch the browser with a randomly selected proxy from your list. This spreads your sessions across the proxies in the list.

(async () => {
    const proxy = getRandomProxy(); // Get a random proxy from the list

    const browser = await chromium.launch({
        proxy: {
            server: proxy.server,
            username: proxy.username,
            password: proxy.password
        }
    });

    const context = await browser.newContext();
    const page = await context.newPage();

    await page.goto('https://example.com'); // Replace with your target URL

    // Perform your actions...

    await browser.close();
})();

If you're running multiple requests or sessions, you can invoke the getRandomProxy() function each time a new browser session starts. Each session then picks its proxy independently, which spreads traffic across the list and helps you avoid triggering blocks or rate limits.
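
If consecutive sessions must never share a proxy, cycling through the list in order is a simple alternative to random selection, which can return the same entry twice in a row. A minimal sketch, assuming a list shaped like the proxyList above; createProxyRotator is a hypothetical helper name, not part of Playwright, and the entries are placeholders.

```javascript
// Returns a function that hands out proxies in order, wrapping at the end.
function createProxyRotator(proxies) {
    if (proxies.length === 0) {
        throw new Error('proxy list is empty');
    }
    let index = 0;
    return function nextProxy() {
        const proxy = proxies[index];
        index = (index + 1) % proxies.length;
        return proxy;
    };
}

// Placeholder entries; replace with real proxy details.
const exampleProxies = [
    { server: 'http://proxy1-ip:port', username: 'user1', password: 'pass1' },
    { server: 'http://proxy2-ip:port', username: 'user2', password: 'pass2' },
];
const nextProxy = createProxyRotator(exampleProxies);

// Usage inside an async function (requires the playwright package):
// const { chromium } = require('playwright');
// const browser = await chromium.launch({ proxy: nextProxy() });
```

With at least two entries in the list, two sessions started one after the other always get different proxies.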

Method 3: Rotating proxies

Rotating proxies are ideal when you need to distribute requests across multiple IP addresses to avoid rate limits or blocks on websites. This method is especially useful for tasks like web scraping, testing, or automating tasks that involve multiple sessions. There are two types of rotating proxies: Datacenter and Residential.

  • Datacenter proxies are hosted in data centers and tend to be faster but more detectable by websites.
  • Residential proxies come from real devices connected to residential ISPs, making them harder to detect but usually slower.

For Playwright, setting up rotating proxies can help you manage authentication challenges, distribute traffic, and gain access to geo-restricted content without risking IP bans. Below are the steps to set up rotating proxies:

Subscribe to a Webshare Plan: To use rotating proxies, you’ll typically need to subscribe to a rotating proxy service. One such service is Webshare, which offers rotating proxy endpoints. If you have an account, you can find the rotating proxy endpoint in your account. This endpoint is provided for free to all users.

Import Playwright: As usual, start by importing Playwright in your script.

const { chromium } = require('playwright');

Define the Rotating Proxy Server: Configure the rotating proxy by specifying the server details provided by your proxy provider. Below is the code setup for Webshare’s rotating proxy endpoint:

const browser = await chromium.launch({
    proxy: {
        server: 'http://p.webshare.io:port',
        username: 'username', // credentials go in their own fields,
        password: 'password'  // not inside the server URL
    }
});

Make sure to do the following:

  • Replace username with your Proxy Username.
  • Replace password with your Proxy Password.
  • Replace p.webshare.io with the hostname of your proxy server (Domain Name field for Webshare users).
  • Replace port with the port number provided by your proxy provider (Proxy Port field for Webshare users).

Create a New Browser Context: A browser context is essential for each session, allowing you to manage cookies, local storage, and session states. Playwright ensures that each context can run independently, even if different proxies are used.

const context = await browser.newContext();
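
Playwright also accepts a proxy option on browser.newContext(), so a single browser process can host several contexts that each use a different proxy. The sketch below assumes a recent Playwright release; older Chromium builds on some platforms also needed a placeholder proxy at launch for this to work. openContextsWithProxies is a hypothetical helper name, and the proxy entries are placeholders.

```javascript
// Open one isolated context per proxy inside a single browser.
// `browser` is a Playwright Browser, e.g. the result of chromium.launch().
async function openContextsWithProxies(browser, proxies) {
    const contexts = [];
    for (const proxy of proxies) {
        // Each context keeps its own cookies and storage and routes
        // its traffic through its own proxy.
        contexts.push(await browser.newContext({ proxy }));
    }
    return contexts;
}

// Usage inside an async function (requires the playwright package):
// const { chromium } = require('playwright');
// const browser = await chromium.launch();
// const [first, second] = await openContextsWithProxies(browser, [
//     { server: 'http://proxy1-ip:port', username: 'user1', password: 'pass1' },
//     { server: 'http://proxy2-ip:port', username: 'user2', password: 'pass2' },
// ]);
```

For the rotating endpoint in this method, a single context with the launch-level proxy is enough; per-context proxies are mainly useful when you manage the list of IPs yourself.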

Open a New Page: Once the browser context is ready, create a new page where you can interact with your target website.

const page = await context.newPage();

Navigate to Your Target URL: Using the rotating IP proxy, navigate to your desired website. The proxy will automatically change the IP after a specified period or with each new session, depending on your provider's settings.

Afterwards, you can execute any actions on the webpage, such as logging in, scraping data, or interacting with elements. Since the proxy is rotating, you’ll benefit from using a fresh IP each time, avoiding IP blocks and managing multiple authenticated sessions more efficiently.

await page.goto('https://example.com');

// Perform scraping, login, or other tasks here...
await browser.close();
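
The snippets in this method use await, so they need to run inside an async function. Put together, the steps might look like the sketch below. The browser type (chromium, firefox, or webkit) is passed in as a parameter so the helper is not tied to one browser; visitThroughProxy is a hypothetical name, and the host, port, and credentials are placeholders.

```javascript
// Launch a browser through the given proxy, load a URL, return the page title.
async function visitThroughProxy(browserType, proxy, url) {
    const browser = await browserType.launch({ proxy });
    try {
        const context = await browser.newContext();
        const page = await context.newPage();
        await page.goto(url);
        // Perform scraping, login, or other tasks here...
        return await page.title();
    } finally {
        // Always release the browser, even if navigation throws.
        await browser.close();
    }
}

// Usage (requires the playwright package):
// const { chromium } = require('playwright');
// visitThroughProxy(chromium, {
//     server: 'http://p.webshare.io:port',
//     username: 'username',
//     password: 'password',
// }, 'https://example.com').then(console.log).catch(console.error);
```

The try/finally block makes sure the browser is closed even when the proxy refuses the connection, which otherwise tends to leave stray browser processes behind.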

Fixing common issues

While setting up proxies in Playwright, it’s common to encounter issues that could hinder seamless operation. Below are some typical problems and their solutions:

Authentication errors

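When using authenticated proxies, authentication errors are a frequent issue. They occur when the credentials are wrong or are not passed to the proxy server correctly. Check the following:

  • Correct Credentials: Ensure you have the correct username and password set in the proxy configuration.
  • Proper Format: Set server to the proxy address only (http://proxyserver:port) and pass the credentials in the separate username and password fields, as in Method 1. Credentials embedded in the server URL (http://username:password@proxyserver:port) may be ignored by the browser.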

If you still face authentication issues, verify that your proxy provider supports HTTP Basic Authentication and that your subscription includes authenticated proxies.
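
Providers often hand out proxies as a single URL with the credentials embedded. Since Playwright's proxy option takes the username and password as separate fields, one option is to split the URL before launching. A sketch using Node's built-in URL class; toPlaywrightProxy is a hypothetical helper name and the address is a placeholder.

```javascript
// Split 'scheme://user:pass@host:port' into Playwright's proxy fields.
function toPlaywrightProxy(proxyUrl) {
    const url = new URL(proxyUrl);
    const proxy = { server: `${url.protocol}//${url.host}` };
    if (url.username) {
        // URL keeps credentials percent-encoded; decode them for Playwright.
        proxy.username = decodeURIComponent(url.username);
        proxy.password = decodeURIComponent(url.password);
    }
    return proxy;
}

console.log(toPlaywrightProxy('http://user:p%40ss@proxy.example.com:8080'));

// Usage inside an async function (requires the playwright package):
// const { chromium } = require('playwright');
// const browser = await chromium.launch({
//     proxy: toPlaywrightProxy('http://user:p%40ss@proxy.example.com:8080'),
// });
```

Special characters in a password (such as @ or :) must be percent-encoded in the URL form, which is a common source of authentication failures; the decoded value is what ends up in the password field.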

Connection timeouts

Connection timeouts can occur if the proxy server is unresponsive or overloaded. To address this:

  • Check Proxy Status: Confirm with your proxy provider that the proxy server is operational and not undergoing maintenance.
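  • Increase Timeout: The timeout option of chromium.launch() only limits how long Playwright waits for the browser to start. To give a slow proxy more time, raise the navigation timeout on the page instead:
const browser = await chromium.launch({
    proxy: {
        server: 'http://proxyserver:port',
        username: 'username',
        password: 'password'
    }
});
const page = await browser.newPage();
page.setDefaultNavigationTimeout(60000); // Timeout in milliseconds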
  • Switch Proxies: If timeouts persist, try switching to a different proxy server or rotating proxies more frequently.
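
Switching proxies after a timeout can be automated. The sketch below tries each proxy in order and moves on when the launch or the navigation fails. gotoWithFallback is a hypothetical helper, the 60-second default is an arbitrary choice, and the browser type is passed in as a parameter.

```javascript
// Try each proxy in turn until the page loads; rethrow the last error if none work.
async function gotoWithFallback(browserType, proxies, url, timeout = 60000) {
    let lastError = new Error('proxy list is empty');
    for (const proxy of proxies) {
        let browser;
        try {
            browser = await browserType.launch({ proxy });
            const page = await browser.newPage();
            await page.goto(url, { timeout }); // navigation timeout in ms
            return { browser, page, proxy };   // caller must close the browser
        } catch (error) {
            lastError = error;
            if (browser) await browser.close(); // discard the failed session
        }
    }
    throw lastError;
}

// Usage inside an async function (requires the playwright package):
// const { chromium } = require('playwright');
// const { browser, page } = await gotoWithFallback(chromium, proxyList, 'https://example.com');
// ...work with page...
// await browser.close();
```

Each failed attempt closes its own browser before the next proxy is tried, so only the successful session stays open.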

Blocked IPs or CAPTCHA challenges

Websites may block your proxy IP or present CAPTCHA challenges when they detect suspicious activity. To resolve this:

  • Use Residential Proxies: Residential proxies are less likely to be flagged or blocked compared to datacenter proxies, as they appear to originate from real users.
  • Reduce Request Volume: Slow down your requests to avoid being detected as a bot. Use delays between actions to mimic real user behavior.
await page.waitForTimeout(2000);
  • Rotate IPs: If using rotating proxies, ensure that each request is routed through a different IP to avoid rate limiting or blocking.
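
A fixed two-second pause is itself a regular pattern. One variation is to randomize the delay within a range. A minimal sketch using a plain timer rather than a Playwright API; randomDelayMs and pause are hypothetical helper names, and the one-to-three-second range in the usage line is an arbitrary choice.

```javascript
// Pick a whole number of milliseconds between minMs and maxMs, inclusive.
function randomDelayMs(minMs, maxMs) {
    return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

// Resolve after the given number of milliseconds.
function pause(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Usage inside an async function, between page actions:
// await pause(randomDelayMs(1000, 3000));
```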

SSL handshake errors

If you encounter SSL errors when using proxies, it may be due to invalid SSL certificates or proxy misconfiguration. To fix this:

  • Ignore SSL Errors: In testing environments, you can bypass SSL certificate errors by disabling strict certificate checks.
const context = await browser.newContext({
    ignoreHTTPSErrors: true
});
  • Check Proxy Configuration: Ensure that your proxy settings support SSL traffic and that the proxy server itself is not blocking HTTPS connections.

Proxy not working in headless mode

Sometimes proxies work in non-headless mode but fail when the browser is launched in headless mode. To address this:

  • Disable Headless Mode: If your automation tasks don't strictly require headless mode, you can disable it to avoid this issue.
const browser = await chromium.launch({
    headless: false, // Disable headless mode
    proxy: {
        server: 'http://proxyserver:port',
        username: 'username',
        password: 'password'
    }
});
  • Check for Compatibility: Ensure that your proxy provider supports headless browsing. Some proxies may behave differently in headless mode due to restrictions or additional security checks.

Wrapping up: proxy setup in Playwright

Setting up proxies in Playwright enhances your ability to manage IPs, bypass geo-restrictions and maintain anonymity. Whether using static IPs, proxy lists, or rotating proxies, ensuring proper authentication is key to a seamless connection. By troubleshooting common issues, you can leverage proxies to optimize your Playwright sessions without disruptions.
