conscious-sapphire•3y ago

Request method DELETE

I am trying to make a DELETE request to an API, but it doesn't work (the script hangs at the request, then the request times out).
router.addDefaultHandler(async ({ request, json, enqueueLinks, log }) => {
    const requests = [
        {
            url: 'http://httpbin.org/delete',
            method: 'DELETE',
            label: 'delete_check',
            useExtendedUniqueKey: true,
            headers: { accept: 'application/json' },
        },
    ];
    await crawler.addRequests(requests);
});
curl equivalent
curl -X DELETE "http://httpbin.org/delete" -H "accept: application/json"
8 Replies
Alexey Udovydchenko•3y ago
Try removing label — it should go in userData, not in the request root. If that doesn't resolve the issue, open a GitHub issue.
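A minimal sketch of the suggested change, reusing the fields from the snippet above (the request shape is otherwise unchanged):

```javascript
// Same request as above, but with `label` moved from the request root
// into `userData`, as suggested.
const requests = [
    {
        url: 'http://httpbin.org/delete',
        method: 'DELETE',
        useExtendedUniqueKey: true,
        headers: { accept: 'application/json' },
        userData: { label: 'delete_check' },
    },
];
```

The array can then be passed to `crawler.addRequests(requests)` exactly as before.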
conscious-sapphireOP•3y ago
Hi, I removed label but it still doesn't work. I don't have a GitHub account. I also tried preNavigationHooks with gotOptions; only sendRequest works with the DELETE request method.
preNavigationHooks: [
    async ({ request }, gotOptions) => {
        gotOptions.method = 'DELETE';
    },
],
adverse-sapphire•3y ago
I got no error when doing it with BasicCrawler:
import { BasicCrawler } from 'crawlee';

const basic_crawler = new BasicCrawler({
    async requestHandler(ctx) {
        const res = await ctx.sendRequest();
        ctx.log.info(`Response body: ${res.body}`);
        ctx.log.info(`Response headers: ${JSON.stringify(res.headers)}`);
    },
});

const delete_request = [
    {
        url: 'http://httpbin.org/delete',
        method: 'DELETE',
        headers: { accept: 'application/json' },
    },
];
await basic_crawler.run(delete_request);
From the HttpCrawler documentation (https://crawlee.dev/api/http-crawler/class/HttpCrawler):
By default, this crawler only processes web pages with the text/html and application/xhtml+xml MIME content types (as reported by the Content-Type HTTP header), and skips pages with other content types. If you want the crawler to process other content types, use the HttpCrawlerOptions.additionalMimeTypes constructor option.
conscious-sapphireOP•3y ago
Thank you, I checked it and this works great. I tried the same with HttpCrawler and CheerioCrawler; it didn't work there.
adverse-sapphire•3y ago
@Alexey Udovydchenko: I confirm that with this script, there is a timeout error.
import { HttpCrawler } from 'crawlee';

const http_crawler = new HttpCrawler({
    maxRequestRetries: 1,
    navigationTimeoutSecs: 5,
    additionalMimeTypes: ['application/json'],

    requestHandler: async (ctx) => {
        ctx.log.info(`Processing ${ctx.request.url}...`);
        ctx.log.info(`Response body: ${ctx.body}`);
        // console.dir(ctx.json);
    },
    failedRequestHandler: async (ctx) => {
        ctx.log.error(`Request ${ctx.request.url} failed too many times. JSON response body not scraped!`);
    },
});

const delete_request = [
    {
        url: 'http://httpbin.org/delete',
        method: 'DELETE',
        headers: { accept: 'application/json' },
    },
];
await http_crawler.run(delete_request);
I filed this GitHub issue: https://github.com/apify/crawlee/issues/1658#issue-1437885334
conscious-sapphireOP•3y ago
Thank you!
Alexey Udovydchenko•3y ago
@LeMoussel great, thanks!