deep-jade•2y ago

Web Site Content Monitor Tool to scrape new pages

Hello everyone, I mostly use Website Content Crawler from the Apify Store to add content to Pinecone, my vector database, and then pass those vectors to a LangChain LLM to build a chatbot in Python. Is there a tool that monitors or checks a website on a schedule and scrapes only the new content into my database, so that I can convert it to embeddings and add it to my vector database?
3 Replies
Lukas Krivka•2y ago
Hello, so you are basically looking for Website Content Crawler, but with an added feature to scrape only new pages you haven't scraped yet?
deep-jadeOP•2y ago
Yes, correct.
Lukas Krivka•2y ago
Currently, the actor doesn't support that, but it is technically possible, so feel free to create an issue there. Other than that, you would need to ask someone to build that version for you from scratch; see #💻hire-freelancers
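
In the meantime, one possible workaround outside the actor is to keep your own record of URLs you have already ingested and filter each crawl's results against it before embedding. This is only a minimal sketch, not a feature of Website Content Crawler: it assumes each crawl result is a dict with a `url` key, and the file name `seen_urls.json` and the helper names below are hypothetical.

```python
import json
from pathlib import Path


def load_seen(path):
    """Return the set of URLs recorded by earlier runs (empty if no file yet)."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text(encoding="utf-8")))


def save_seen(path, seen):
    """Persist the set of already-ingested URLs as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen)), encoding="utf-8")


def filter_new_pages(items, seen):
    """Return the items whose URL is not in `seen`, adding those URLs to `seen`.

    Assumption: every item is a dict with a "url" key.
    """
    new_items = []
    for item in items:
        url = item["url"]
        if url not in seen:
            seen.add(url)
            new_items.append(item)
    return new_items


if __name__ == "__main__":
    # Hypothetical crawl output; in practice this would come from your crawl run.
    crawl_results = [
        {"url": "https://example.com/a", "text": "page a"},
        {"url": "https://example.com/b", "text": "page b"},
    ]
    seen = load_seen("seen_urls.json")
    new_pages = filter_new_pages(crawl_results, seen)
    # Only `new_pages` would need to be embedded and added to the vector database.
    save_seen("seen_urls.json", seen)
    print(len(new_pages), "new page(s)")
```

You could run something like this on a schedule after each crawl and embed only the returned items. Note that it detects new URLs only; it will not notice when the content of an already-seen page changes.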
