Download PDF file from URL?

At a glance

Community members are looking for a simple npm library to download files from a URL in JavaScript/TypeScript. They have tried Axios, http.get, and clicking the download button, but none of these worked. One member suggests the Crawlee library, while another provides a snippet using the Fetch API and the Node.js file system module, which does not compile. Finally, a member provides a working solution that uses the Node.js HTTPS module to download a file from a URL and save it to disk.

Casper

Does someone know of a simple npm library to download files from a URL in JavaScript/TypeScript?

8 comments

grimrippa

I actually want to know this as well. It seems to be giving me an error saying that only other formats are supported and application/pdf is not.

Casper

I tried using axios, http.get, and clicking the download button. Nothing works.

Casper

Maybe this will work https://crawlee.dev/docs/examples/basic-crawler

grimrippa

router.addDefaultHandler(async ({ request }) => {
    const file = fs.createWriteStream('filename.pdf');
    const response = await fetch(request.url);
    response.body.pipe(file);
});

Casper

Thanks, I tried your suggestion, but unfortunately it won't compile.
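
The thread does not quote the compiler error, so the cause is a guess. One likely culprit is that the global fetch in Node.js 18+ returns a web ReadableStream, which has no .pipe() method, and that fs is never imported. A minimal sketch of the same fetch-based idea under those assumptions follows; downloadWithFetch is a hypothetical helper name, not something from the thread.

```typescript
import { createWriteStream } from "node:fs";
import { Readable } from "node:stream";
import { pipeline } from "node:stream/promises";
import type { ReadableStream as NodeReadableStream } from "node:stream/web";

// Hypothetical helper (not from the thread); assumes Node.js 18+ for global fetch.
async function downloadWithFetch(url: string, targetFile: string): Promise<void> {
  const response = await fetch(url); // fetch follows redirects by default
  if (!response.ok || response.body === null) {
    throw new Error(`Download failed with HTTP status ${response.status}`);
  }
  // The cast bridges the DOM and Node.js typings of the same web stream.
  const body = Readable.fromWeb(response.body as unknown as NodeReadableStream);
  // pipeline() rejects on read or write errors and resolves once the file is flushed.
  await pipeline(body, createWriteStream(targetFile));
}
```

Inside the router handler this could be called as await downloadWithFetch(request.url, 'filename.pdf').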

Casper

I fixed it with this code:

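import * as Fs from "node:fs";
import * as Https from "node:https";

async function downloadFile(url: string, targetFile: string): Promise<void> {
  return await new Promise<void>((resolve, reject) => {
    Https.get(url, (response) => {
      const code = response.statusCode ?? 0;

      if (code >= 400) {
        response.resume(); // discard the body so the socket is released
        return reject(new Error(response.statusMessage));
      }

      // handle redirects: settle this promise with the result of the follow-up request,
      // otherwise the caller would wait forever
      if (code > 300 && code < 400 && !!response.headers.location) {
        response.resume();
        // the Location header may be relative, so resolve it against the current URL
        const redirectUrl = new URL(response.headers.location, url).toString();
        return resolve(downloadFile(redirectUrl, targetFile));
      }

      // save the file to disk
      const fileWriter = Fs.createWriteStream(targetFile)
        .on("finish", () => resolve())
        .on("error", reject);

      response.pipe(fileWriter);
    }).on("error", (error: Error) => {
      reject(error);
    });
  });
}

await downloadFile(link, "file.pdf");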

Alexey Udovydchenko

Don't forget to add additionalMimeTypes to the crawler options; then you can handle files with the Cheerio crawler.
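
As a rough illustration of this tip, the sketch below shows an options fragment that adds application/pdf to the accepted MIME types, plus a request handler that writes the response body to disk. It assumes the Crawlee handler context exposes body and contentType as described in the Crawlee documentation. makePdfSaver and crawlerOptions are hypothetical names, and the context is typed structurally so the snippet runs without the crawlee package installed.

```typescript
import { writeFile } from "node:fs/promises";

// Options fragment: lets the crawler accept PDF responses in addition to HTML.
const crawlerOptions = {
  additionalMimeTypes: ["application/pdf"],
};

// Only the context fields used here are typed (an assumption about their shape).
type PdfContext = {
  request: { url: string };
  body: string | Buffer;
  contentType: { type: string };
};

// Hypothetical factory: returns a handler that saves PDF responses to targetFile.
function makePdfSaver(targetFile: string) {
  return async (context: PdfContext): Promise<void> => {
    // Skip anything that is not a PDF, such as regular HTML pages.
    if (context.contentType.type !== "application/pdf") return;
    await writeFile(targetFile, context.body);
  };
}

// Wiring sketch (requires the crawlee package, which is not imported here):
// const crawler = new CheerioCrawler({
//   ...crawlerOptions,
//   requestHandler: makePdfSaver("file.pdf"),
// });
// await crawler.run([link]);
```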

Casper

Thanks

Apify Discord Mirror