.github docs deployment examples guides code http_clients_curl_impersonate.py http_clients_httpx.py proxy_management_inspecting_bs.py proxy_management_inspecting_pw.py proxy_management_integration_bs.py proxy_management_integration_pw.py proxy_management_quick_start.py proxy_management_session_bs.py ...
.github docs deployment examples guides code http_clients.mdx proxy_management.mdx request_storage.mdx scaling_crawlers.mdx introduction quick-start upgrading src templates tests website .editorconfig .gitignore .markdownlint.yaml .pre-commit-config.yaml CHANGELOG.md CONTRIBUTING.md LICENSE Makefile ...
Add always_enqueue option to bypass URL deduplication (#621) (4e59fa4) by @Rutam21 Split and add extra configuration to export_data method (#580) (6751635) by @deshansh 🐛 Bug Fixes Use strip in headers normalization (#614) (a15b21e) by @vdusek Merge payload and data fields of Request (#...
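The changelog entry above mentions an always_enqueue option that bypasses URL deduplication. The following is a minimal sketch of that idea in plain Python, not Crawlee's actual implementation: the `DedupQueue` class, its `add` and `pending` methods, and the choice to key deduplication on the stripped URL string are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DedupQueue:
    """Toy request queue that skips URLs it has already seen (hypothetical)."""

    _seen: set[str] = field(default_factory=set)
    _pending: list[str] = field(default_factory=list)

    def add(self, url: str, *, always_enqueue: bool = False) -> bool:
        """Queue `url`; return True if queued, False if dropped as a duplicate."""
        key = url.strip()  # assumption: dedup key is the whitespace-stripped URL
        if key in self._seen and not always_enqueue:
            return False  # duplicate, and the caller did not ask to bypass dedup
        self._seen.add(key)
        self._pending.append(key)
        return True

    def pending(self) -> list[str]:
        """Return a copy of the queued URLs in insertion order."""
        return list(self._pending)
```

In this sketch a second `add` of the same URL is dropped unless the caller passes `always_enqueue=True`, in which case the URL is queued again.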
.github docs scripts src/crawlee _utils autoscaling base_storage_client basic_crawler beautifulsoup_crawler browsers events http_clients http_crawler memory_storage_client playwright_crawler sessions statistics storages __init__.py cli.py configuration.py consts.py enqueue_strategy.py log_config.py mo...
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both h
node_modules/crawlee/node_modules/@crawlee/playwright npm ERR! @crawlee/playwright@"^3.1.4" from crawlee@3.1.4 npm ERR! node_modules/crawlee npm ERR! crawlee@"3.1.4" from the root project npm ERR! npm ERR! Could not resolve dependency: npm ERR! peerOptional playwright@">= 1.21.x ...
crawlee/README.md: 2 changes (1 addition & 1 deletion) @@ -5,7 +5,7 @@ crawlee scraping and browser automation library. ```bash $ docker run --rm -it -v $PWD:/tmp apify/actor-node:16 sh $ docker run --rm -it -v...
Latest commit: renovate[bot], chore(deps): lock file maintenance (6b30072, Mar 23, 2025). History: 4,857 commits. .github: feat: remove old docker CI (#2831), Feb 6, 2025. .husky: chore: run biome as a pre-commit hook (#2493) ...
master .github docs deployment examples guides code http_clients.mdx proxy_management.mdx request_storage.mdx result_storage.mdx scaling_crawlers.mdx introduction quick-start upgrading src templates tests website .editorconfig .gitignore .markdownlint.yaml ...