Puppeteer is a Node.js library that provides a high-level API for controlling Chrome or Chromium, headless by default, over the DevTools Protocol. Because its scripts drive a real browser, they interact with the same DOM elements a user would. This tutorial covers how to fill out and submit forms with Puppeteer, including automated batch submissions and other common scenarios.
Let's get started:
Prerequisites
Before we start coding, make sure the prerequisites are in place. Install Node.js and npm from the official Node.js website, then install Puppeteer by running the following command in your terminal.
npm install puppeteer
Filling and submitting a basic form
With everything installed, let's move on to the main part of the tutorial: writing Puppeteer code that fills in and submits a search form, a contact form, and a login form.
Search action example
Let's consider a basic search form on a website. Its HTML looks like this.
<form id="search-form" action="#">
  <input type="text" name="query" />
  <button type="submit">Search</button>
</form>
Let's fill in and submit this form with Puppeteer.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.waitForSelector('#search-form');
  // Type the search term into the query field.
  await page.type('input[name=query]', 'Puppeteer');
  // Start waiting for the navigation before clicking so the page load is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click('button[type=submit]'),
  ]);
  await browser.close();
})();
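When a click triggers a page load, it is generally safer to register the navigation wait before the click is dispatched; if the wait starts only after the click has finished, a fast page load can complete first and the wait may time out. The helper below is a minimal sketch of that pattern. The name submitAndWait is hypothetical, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: clicks `selector` and resolves once the navigation
// triggered by the click has finished. `page` is assumed to be a Puppeteer Page.
async function submitAndWait(page, selector) {
  // waitForNavigation() is called first, so the wait is already registered
  // by the time the click is sent to the browser.
  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.click(selector),
  ]);
  return response;
}
```

Inside a script it could be called as await submitAndWait(page, 'button[type=submit]') after the form fields have been filled in.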
Contact form example
Consider a contact form with fields for name, email, and message. The Puppeteer code to fill and submit the form using form inputs looks like this.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://example.com/contact');
  await page.waitForSelector('#contact-form');
  // Fill in each field of the contact form.
  await page.type('input[name=name]', 'John Doe');
  await page.type('input[name=email]', 'john@example.com');
  await page.type('textarea[name=message]', 'Hello, Puppeteer!');
  // Start waiting for the navigation before clicking so the page load is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click('button[type=submit]'),
  ]);
  await browser.close();
})();
Once the fields are filled in, the script clicks the submit button and waits for the resulting page to load.
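As forms grow, repeating one page.type call per field gets noisy. One possible way to tidy this up is to describe the form as a map from selector to value and loop over it. The sketch below assumes page is a Puppeteer Page; the helper name fillForm and the shape of the fields object are hypothetical.

```javascript
// Hypothetical helper: types each value into the element matched by its selector.
// `fields` maps CSS selectors to the text that should be typed into them.
async function fillForm(page, fields) {
  for (const [selector, value] of Object.entries(fields)) {
    // Wait until the field exists before typing into it.
    await page.waitForSelector(selector);
    await page.type(selector, value);
  }
}
```

For the contact form above, the call might look like await fillForm(page, { 'input[name=name]': 'John Doe', 'input[name=email]': 'john@example.com', 'textarea[name=message]': 'Hello, Puppeteer!' }).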
Login example
For a login form, where you have username and password fields, the Puppeteer script would resemble the following.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://example.com/login');
  await page.waitForSelector('#login-form');
  // Enter the credentials.
  await page.type('input[name=username]', 'user123');
  await page.type('input[name=password]', 'securepassword');
  // Start waiting for the navigation before clicking so the page load is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click('button[type=submit]'),
  ]);
  await browser.close();
})();
As before, the script fills in the fields and then submits the form automatically.
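Submitting a login form does not by itself tell you whether the login worked. A simple heuristic, assuming the site redirects away from the login page on success and stays on it on failure, is to inspect the URL after the navigation. The function name leftLoginPage and the default path '/login' are assumptions for illustration.

```javascript
// Hypothetical check: returns true when `currentUrl` is no longer the login page.
// Assumes a successful login redirects to a different path.
function leftLoginPage(currentUrl, loginPath = '/login') {
  // Compare only the path so query strings such as ?error=1 are ignored.
  return new URL(currentUrl).pathname !== loginPath;
}
```

In a script this could be used as leftLoginPage(page.url()) once the navigation has completed. Sites that report errors without changing the URL path would need a different check, such as waiting for an element that only appears after login.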
Automated batch form submission
In some scenarios, you might need to automate the submission of multiple forms, for example for testing or data collection. Let's extend the previous examples to submit forms in batches.
Search action example (batch submission)
Suppose you have a list of search queries to perform. You can automate the process as follows.
const puppeteer = require('puppeteer');
const searchQueries = ['Puppeteer', 'Web Scraping', 'Automation'];
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  for (const query of searchQueries) {
    // Reload the search page for every query so each run starts from a clean form.
    await page.goto('https://example.com');
    await page.waitForSelector('#search-form');
    await page.type('input[name=query]', query);
    // Start waiting for the navigation before clicking so the page load is not missed.
    await Promise.all([
      page.waitForNavigation(),
      page.click('button[type=submit]'),
    ]);
  }
  await browser.close();
})();
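In a loop like the one above, a single timeout or missing selector throws and aborts every remaining submission. One way to make a batch more robust is to run each item inside its own try/catch and record the outcome. The sketch below is plain JavaScript; runBatch and the shape of its result objects are hypothetical.

```javascript
// Hypothetical helper: runs `task` for every item in sequence and records
// whether each run succeeded, instead of stopping at the first failure.
async function runBatch(items, task) {
  const results = [];
  for (const item of items) {
    try {
      const value = await task(item);
      results.push({ item, ok: true, value });
    } catch (error) {
      // Keep going; store the message so failed items can be retried later.
      results.push({ item, ok: false, error: error.message });
    }
  }
  return results;
}
```

The body of the search loop would become the task, for example runBatch(searchQueries, async (query) => { /* goto, type, submit */ }), and the returned array shows which queries failed.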
Contact form example (batch submission)
The contact form example can be extended in the same way to submit multiple entries.
const puppeteer = require('puppeteer');
const contacts = [
  { name: 'John Doe', email: 'john@example.com', message: 'Hello, Puppeteer!' },
  // Add more entries as needed
];
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  for (const contact of contacts) {
    // Reload the contact page for every entry so each run starts from an empty form.
    await page.goto('https://example.com/contact');
    await page.waitForSelector('#contact-form');
    await page.type('input[name=name]', contact.name);
    await page.type('input[name=email]', contact.email);
    await page.type('textarea[name=message]', contact.message);
    // Start waiting for the navigation before clicking so the page load is not missed.
    await Promise.all([
      page.waitForNavigation(),
      page.click('button[type=submit]'),
    ]);
  }
  await browser.close();
})();
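Submitting many entries back to back can put noticeable load on the target site, so it may be worth pausing between iterations. The following sketch shows one way to add a randomized pause using only standard JavaScript timers; the names sleep and randomDelay and the example bounds are assumptions.

```javascript
// Resolves after roughly `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: picks a whole number of milliseconds in [minMs, maxMs).
// `random` can be swapped out to make the result predictable in tests.
function randomDelay(minMs, maxMs, random = Math.random) {
  return Math.floor(minMs + random() * (maxMs - minMs));
}
```

At the end of each loop iteration, a line such as await sleep(randomDelay(1000, 3000)) would pause for one to three seconds before the next submission.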
Login example (batch submission)
Extend the login example to handle multiple user credentials.
const puppeteer = require('puppeteer');
const credentials = [
  { username: 'user123', password: 'securepassword' },
  // Add more credentials as needed
];
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  for (const credential of credentials) {
    await page.goto('https://example.com/login');
    await page.waitForSelector('#login-form');
    await page.type('input[name=username]', credential.username);
    await page.type('input[name=password]', credential.password);
    // Start waiting for the navigation before clicking so the page load is not missed.
    await Promise.all([
      page.waitForNavigation(),
      page.click('button[type=submit]'),
    ]);
  }
  await browser.close();
})();
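One caveat when several users log in through the same page is that session cookies from one login can carry over to the next iteration, so the site may skip the login form entirely. A possible remedy is to give every credential its own isolated browser context. The sketch below assumes browser is a Puppeteer Browser; the helper name withFreshContext is hypothetical, and because the method that creates a context has been named differently across Puppeteer versions, the code checks for both names rather than assuming one.

```javascript
// Hypothetical helper: runs `task` with a page from a new, isolated browser
// context (separate cookies and storage), then closes that context.
async function withFreshContext(browser, task) {
  // Newer Puppeteer versions expose createBrowserContext(); older ones
  // expose createIncognitoBrowserContext(). Use whichever is present.
  const create =
    browser.createBrowserContext || browser.createIncognitoBrowserContext;
  const context = await create.call(browser);
  try {
    const page = await context.newPage();
    return await task(page);
  } finally {
    // Always discard the context so no session leaks into the next run.
    await context.close();
  }
}
```

Inside the loop, the login steps would then run as await withFreshContext(browser, async (page) => { /* goto, type, submit */ }).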
Automating the submission of multiple forms brings several benefits:
- Time savings: Automation removes the need for manual intervention, which is especially useful for data collection, testing, and other repetitive tasks.
- Consistency: Automated submissions follow the same steps every time, reducing the mistakes that creep into manual interaction.
- Scalability: The same script can process a large number of forms simply by extending the input list.
- Data gathering and testing: Automated form submissions are valuable for gathering large amounts of data and for automated tests that require repeated form interactions.
Handling advanced cases
While basic form filling and submission can be achieved with the examples above, you might encounter more complex scenarios on certain websites. Let's explore how to handle advanced cases using Puppeteer.
Form with CAPTCHA confirmation
CAPTCHA challenges exist to stop automated bots from submitting forms. To handle a CAPTCHA in a Puppeteer script, you can either have a person solve it manually or hand it to a third-party solving service, such as 2Captcha, through its API.
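For the human-interaction route, a script running with headless: false can pause until a person has solved the challenge in the visible browser window. The sketch below assumes a reCAPTCHA-style widget that is embedded in an iframe and writes its token into a textarea named g-recaptcha-response; both selectors, the helper name waitForManualCaptcha, and the two-minute default are assumptions that would need adjusting for other CAPTCHA providers.

```javascript
// Hypothetical helper: if a CAPTCHA iframe is present, waits until a person
// has solved it in the open browser window. Returns false when no CAPTCHA
// was found and true once a response token has appeared.
async function waitForManualCaptcha(page, timeoutMs = 120000) {
  // Assumed selector for the CAPTCHA widget's iframe.
  const frame = await page.$('iframe[src*="recaptcha"]');
  if (!frame) return false;
  // Runs in the page: resolves once the hidden response field has a value.
  await page.waitForFunction(
    () => {
      const field = document.querySelector(
        'textarea[name="g-recaptcha-response"]'
      );
      return Boolean(field && field.value);
    },
    { timeout: timeoutMs }
  );
  return true;
}
```

Calling await waitForManualCaptcha(page) just before the submit step gives the operator time to complete the challenge; if the timeout elapses first, waitForFunction rejects and the script can report the failure.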
Handling anti-scrape measures
Websites may implement anti-scraping measures to deter automated bots. To overcome this, you can use various strategies. Some of them are given below.
Cookie management
Puppeteer can set cookies on a page, which lets a script reuse an existing session the way a returning visitor's browser would. Consider the example given below.
await page.setCookie({ name: 'session', value: 'your_session_value', domain: 'example.com' });
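Rather than hard-coding a session value, a script can save the cookies from one run and restore them in the next. The sketch below uses page.cookies() and page.setCookie() together with Node's fs/promises module. The helper names saveCookies and loadCookies and the idea of a JSON file on disk are assumptions; recent Puppeteer releases may favour cookie methods on the browser or browser context, so check the documentation for your version.

```javascript
const fs = require('fs/promises');

// Hypothetical helper: writes the page's current cookies to a JSON file.
async function saveCookies(page, filePath) {
  const cookies = await page.cookies();
  await fs.writeFile(filePath, JSON.stringify(cookies, null, 2));
}

// Hypothetical helper: restores cookies saved by saveCookies() and
// returns how many were loaded.
async function loadCookies(page, filePath) {
  const cookies = JSON.parse(await fs.readFile(filePath, 'utf8'));
  if (cookies.length > 0) {
    // setCookie accepts one or more cookie objects as separate arguments.
    await page.setCookie(...cookies);
  }
  return cookies.length;
}
```

A typical flow would call saveCookies(page, 'cookies.json') after a successful login and loadCookies(page, 'cookies.json') before page.goto on later runs.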
User agent modification
By changing the user agent string, you can mimic different browsers, operating systems, or devices, providing a way to simulate varied client environments during web scraping or automation. An example is given below.
await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36');
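In a batch run, the same idea can be extended by cycling through a small list of user agent strings, one per iteration. The sketch below is illustrative only: the USER_AGENTS list, its second entry, and the helper name userAgentFor are assumptions, and real lists should be kept up to date with current browser versions.

```javascript
// Illustrative list; the first entry is the string used above.
const USER_AGENTS = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
];

// Hypothetical helper: returns the user agent for the given iteration,
// wrapping around when the index exceeds the length of the list.
function userAgentFor(index, agents = USER_AGENTS) {
  return agents[index % agents.length];
}
```

Inside a loop with a counter i, the call await page.setUserAgent(userAgentFor(i)) before page.goto would apply a different string on each pass.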
Using Puppeteer Extra
Puppeteer Extra is a modular plugin system that extends Puppeteer. One notable plugin is the stealth plugin, which makes automated browsing harder to detect during scraping or other automated tasks. The following example enables it.
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  // Perform actions with the page here
  await browser.close();
})();
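One quick sanity check on a stealth setup is to read navigator.webdriver inside the page, a flag that automated browsers commonly expose and that stealth tooling typically tries to mask. The helper below is a rough sketch: the name looksAutomated is hypothetical, page is assumed to be a Puppeteer Page, and a false result only covers this one signal, not every detection technique a site might use.

```javascript
// Hypothetical check: returns true when the page reports navigator.webdriver
// as true, which is one common sign of an automated browser.
async function looksAutomated(page) {
  // The arrow function runs inside the browser page, not in Node.
  return page.evaluate(() => navigator.webdriver === true);
}
```

It could be called as console.log(await looksAutomated(page)) in place of the placeholder comment, once with plain Puppeteer and once with the plugin enabled, to compare the two results.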
Conclusion
Puppeteer is a reliable tool for web form automation. It handles basic tasks such as search actions, contact forms, and logins, and the same scripts scale to batch submissions with a simple loop. For harder cases, cookie management, user agent changes, and Puppeteer Extra's stealth plugin help with anti-scraping measures, while CAPTCHAs still call for human input or a solving service. Together, these techniques make Puppeteer a flexible choice for automating even complex online forms.