Question 1

How is the crawler different from the Web Scraper?

Accepted Answer

Web Scraper handles a single URL — fast, focused, one page at a time. Web Crawler walks an entire site by following internal links from a starting page.

Question 2

What's the maximum crawl size?

Accepted Answer

Free tier crawls up to a few dozen pages per run. Pro extends to hundreds or thousands depending on tier. Set the page limit in the form.
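As a rough illustration of how a page limit bounds a crawl, the sketch below walks a small in-memory link graph and stops once the limit is reached. The breadth-first order, the `crawl_order` function, and the `max_pages` parameter are assumptions made for this example; the tool's actual traversal order is not documented here.

```python
from collections import deque

def crawl_order(start, links_by_page, max_pages):
    """Breadth-first walk that stops once max_pages pages have been visited."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        for target in links_by_page.get(page, []):
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append(target)
    return visited

# Hypothetical four-page site, expressed as page -> outgoing internal links.
site = {
    "/": ["/docs", "/pricing"],
    "/docs": ["/docs/install", "/"],
    "/pricing": ["/"],
    "/docs/install": [],
}
order = crawl_order("/", site, max_pages=3)
print(order)  # ['/', '/docs', '/pricing']
```

With a limit of 3, the fourth page (`/docs/install`) is discovered but never visited, which is the behavior to expect when a run hits its page cap.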

Question 3

Does the crawler stay on one domain?

Accepted Answer

Yes. The crawler only follows internal links — same domain as the starting URL. External links are not followed.
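A minimal sketch of what "same domain" filtering can look like, using only Python's standard library. The function name `internal_links` is hypothetical, and the exact-host comparison is an assumption: whether the crawler treats subdomains such as `blog.example.com` as internal is not stated here.

```python
from urllib.parse import urljoin, urlparse

def internal_links(start_url, hrefs):
    """Return absolute http(s) URLs from hrefs that share the host of start_url."""
    start_host = urlparse(start_url).netloc.lower()
    kept = []
    for href in hrefs:
        absolute = urljoin(start_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        # ASSUMPTION: "internal" means an exact host match (subdomains excluded).
        if parsed.scheme in ("http", "https") and parsed.netloc.lower() == start_host:
            kept.append(absolute)
    return kept

links = internal_links(
    "https://example.com/docs/",
    ["/pricing", "guide.html", "https://other.org/page", "mailto:hi@example.com"],
)
print(links)
# ['https://example.com/pricing', 'https://example.com/docs/guide.html']
```

The external link and the `mailto:` address are dropped; the two relative links are resolved against the starting URL and kept.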

Question 4

Will this crawl JavaScript-heavy sites?

Accepted Answer

Yes. Each page is rendered before extraction, so dynamic content shows up in the Markdown output.

Question 5

How does it handle paginated content?

Accepted Answer

If pagination uses real links (?page=2 or /page/2), the crawler will follow them. Infinite-scroll sites are only partially supported: the content visible when the page is rendered is extracted, so items that load only on further scrolling may be missed.

Question 6

Does this respect robots.txt and rate limits?

Accepted Answer

We crawl only publicly accessible content and keep request rates polite. If a site's robots.txt explicitly disallows crawling, we honor it.
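To check in advance how a robots.txt file would apply to a given URL, Python's standard `urllib.robotparser` module can evaluate the rules locally. This is a sketch, not the crawler's own implementation: the robots.txt content and the user-agent string `example-crawler` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from a string so no network is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)
allowed = parser.can_fetch("example-crawler", "https://example.com/docs/")
blocked = parser.can_fetch("example-crawler", "https://example.com/private/x")
delay = parser.crawl_delay("example-crawler")
print(allowed, blocked, delay)  # True False 2
```

Paths under `/private/` are reported as disallowed, and the `Crawl-delay` value gives a hint about the request spacing the site asks for.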

Question 7

What's the output format?

Accepted Answer

One bundled file with all pages as Markdown, separated by URL headers. Easy to split into individual files or feed to a vector store.
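Splitting the bundle back into per-page documents can be sketched as below. The exact header format is not specified here, so this example assumes each page starts with a line of the form `# https://...`; adjust the `HEADER` pattern to match the real output. The `split_bundle` function is a hypothetical helper.

```python
import re

# ASSUMPTION: each page begins with a header line like "# https://example.com/page".
HEADER = re.compile(r"^# (https?://\S+)[ \t]*$", re.MULTILINE)

def split_bundle(bundle):
    """Return a dict mapping each page URL to its Markdown body."""
    pages = {}
    matches = list(HEADER.finditer(bundle))
    for i, match in enumerate(matches):
        # A page body runs from its header to the next header (or end of file).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(bundle)
        pages[match.group(1)] = bundle[match.end():end].strip()
    return pages

sample = (
    "# https://example.com/\n\nWelcome.\n\n"
    "# https://example.com/docs\n\n## Install\n\nRun the installer.\n"
)
pages = split_bundle(sample)
for url, body in pages.items():
    print(url, "->", len(body), "characters")
```

Each entry in `pages` can then be written to its own file or passed to a chunker before loading into a vector store. Headings inside a page (such as `## Install`) are left alone because they do not match the URL header pattern.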
