raw-harlequin (14mo ago)

Issue with Extracting <table> Data Using Apify

Hi, I'm using Apify to create a chatbot with the following code:
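# Import paths assume a recent LangChain release; older versions expose these under langchain.*
from langchain_community.utilities import ApifyWrapper
from langchain_core.documents import Document

# Requires the APIFY_API_TOKEN environment variable to be set
apify = ApifyWrapper()

# Run the crawler Actor and map each dataset item to a LangChain Document
loader = apify.call_actor(
    actor_id="apify/website-content-crawler",
    run_input={"startUrls": [{"url": "https://docs.apify.com/"}]},
    dataset_mapping_function=lambda item: Document(
        page_content=item["text"] or "", metadata={"source": item["url"]}
    ),
)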
However, the data from <table> tags is not being extracted. The tables seem to be ignored. I've tested this on multiple pages with the same result. Can you help me understand why this is happening and how to fix it? Thanks!
2 Replies
Mantisus (14mo ago)
Hi @Mykola, you should raise this with the Actor's developers by opening an issue for the Actor. It doesn't look like a problem with your Python code.
raw-harlequin (OP, 14mo ago)
Hi @Mantisus, got it. Thank you for the advice!
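
One way to check the point that the Python side is not at fault is to inspect a raw dataset item before it reaches the mapping function. This is only a minimal sketch, assuming you know a piece of text that appears inside one of the missing tables. `fields_containing` and `sample_item` are hypothetical names for illustration, not part of Apify or LangChain, and the sample item is made up rather than real crawler output:

```python
from typing import Any, Mapping


def fields_containing(item: Mapping[str, Any], needle: str) -> list[str]:
    """Return the names of top-level string fields in `item` that contain `needle`."""
    needle = needle.lower()
    return [
        key
        for key, value in item.items()
        if isinstance(value, str) and needle in value.lower()
    ]


# Hypothetical dataset item; only "url" and "text" are taken from the question above.
sample_item = {
    "url": "https://docs.apify.com/",
    "text": "Intro paragraph only.",
}

# Search for text you expect to find in a table cell on the page (made-up value here).
print(fields_containing(sample_item, "Price per GB"))
```

If the cell text is missing from every field of the raw item, the mapping lambda has nothing to recover, which supports reporting this to the Actor's developers.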
