taskFunction Function like the one given to Cluster.task. If a function is provided, it will be called (only for this job) instead of the function provided to Cluster.task. It is called with an object having the same fields as the Cluster.task function described below.

The data you queue might be your URL (a string) or a more complex object containing data. It will be provided to your task function(s). See the examples for a more complex usage of this argument.

A task is called for each job you queue via Cluster.queue. Alternatively, you can directly queue the function that you want to be executed.

taskFunction Sets the function that will be called for each job. The function is called with an object having the following fields:
page The page given by puppeteer, which provides methods to interact with a single tab in Chromium.
data The data of the job you provided to Cluster.queue.
worker An object containing information about the worker executing the current job.

Options passed to Cluster.launch:
puppeteer In case you want to use a different puppeteer library (like puppeteer-core or puppeteer-extra), pass the object here. If not set, will default to using puppeteer. When using puppeteer-core, make sure to also provide puppeteerOptions.executablePath.
workerCreationDelay Time between creation of two workers. Set this to a value like 100 (0.1 seconds) in case you want some time to pass before another worker is created. You can use this to prevent a network peak right at the start.
monitor If set to true, will provide a small command line output with information about the crawling process.
timeout Specify a timeout for all tasks.
skipDuplicateUrls If set to true, will skip URLs which were already crawled by the cluster. If you use this field, the queued data must be your URL or an object containing a field called url.
sameDomainDelay How much time should pass at minimum between two requests to the same domain.
retryDelay How much time should pass at minimum between the job execution and its retry. Ignored by tasks queued via Cluster.execute.
retryLimit How often you want to retry a job before marking it as failed.
perBrowserOptions Object passed to puppeteer.launch for each individual browser. If set, puppeteerOptions will be ignored. Defaults to undefined (meaning that puppeteerOptions will be used).
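
To make the options and the two ways of supplying a task function more concrete, here is a minimal sketch. It assumes the package is installed as puppeteer-cluster and that the delay and timeout values are in milliseconds (as the workerCreationDelay description suggests). The URLs, the option values, the helper names defaultTask and contentLengthTask, and the RUN_CLUSTER_EXAMPLE environment variable are all illustrative assumptions, not part of the library.

```javascript
// Options for Cluster.launch, limited to fields described in this section.
// All values are illustrative assumptions, not recommended settings.
const clusterOptions = {
  retryLimit: 2,            // retry a failed job up to two times
  retryDelay: 1000,         // wait at least 1 s before a retry (assumed ms)
  sameDomainDelay: 500,     // at least 0.5 s between requests to one domain (assumed ms)
  skipDuplicateUrls: true,  // queued data must be a URL or an object with `url`
  timeout: 30000,           // timeout for every task (assumed ms)
  monitor: false,           // no command line status output
  workerCreationDelay: 100, // 0.1 s between worker creations
};

// Default task: called for each queued job with an object holding page, data, worker.
async function defaultTask({ page, data }) {
  await page.goto(data.url);
  console.log(`${data.url}: ${await page.title()}`);
}

// Per-job task: passed directly to cluster.queue, replaces defaultTask for that job only.
async function contentLengthTask({ page, data }) {
  await page.goto(data.url);
  const html = await page.content();
  console.log(`${data.url}: ${html.length} characters of HTML`);
}

async function main() {
  // Loaded lazily so the definitions above can be read and tested
  // without the package (or a browser) being available.
  const { Cluster } = require('puppeteer-cluster');
  const cluster = await Cluster.launch(clusterOptions);

  await cluster.task(defaultTask);

  cluster.queue({ url: 'https://example.com' });                    // runs defaultTask
  cluster.queue({ url: 'https://example.org' }, contentLengthTask); // runs the given function

  await cluster.idle();  // wait until the queue is empty
  await cluster.close(); // shut down the browser(s)
}

// Hypothetical switch: only start the crawl when explicitly requested.
if (process.env.RUN_CLUSTER_EXAMPLE) {
  main().catch(console.error);
}
```

Because skipDuplicateUrls and sameDomainDelay are set, each job here is queued as an object with a url field. Queuing a plain URL string would also satisfy that requirement, in which case the task functions would read data directly instead of data.url.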