Added note about CloudFront protections - thank you Andy J!
@@ -264,6 +264,22 @@
 "display_summary(\"https://edwarddonner.com\")"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "b3bcf6f4-adce-45e9-97ad-d9a5d7a3a624",
+"metadata": {},
+"source": [
+"# Let's try more websites\n",
+"\n",
+"Note that this will only work on websites that can be scraped using this simplistic approach.\n",
+"\n",
+"Websites that are rendered with JavaScript, like React apps, won't show up. See the community-contributions folder for a Selenium implementation that gets around this.\n",
+"\n",
+"Also, websites protected with CloudFront (and similar) may give 403 errors - many thanks Andy J for pointing this out.\n",
+"\n",
+"But many websites will work just fine!"
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
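The 403 behaviour mentioned in the added note can be checked before summarizing a page. The sketch below is only an illustration and is not part of the notebook in this commit: the helper names `explain_status` and `try_fetch`, the `BROWSER_HEADERS` value, and the choice of the standard-library `urllib` are all assumptions made for the example.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple

# Hypothetical header set for illustration; the notebook may send different headers.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def explain_status(status: int) -> str:
    """Map an HTTP status code to a short hint for the reader."""
    if status == 403:
        return "403 Forbidden: the site may sit behind CloudFront or a similar protection layer"
    if 200 <= status < 300:
        return f"{status}: fetched successfully"
    return f"{status}: request failed"


def try_fetch(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Return (status, html); html is None when the server answers with an HTTP error."""
    request = urllib.request.Request(url, headers=BROWSER_HEADERS)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses, including 403.
        return err.code, None
```

For example, `status, html = try_fetch("https://edwarddonner.com")` followed by `print(explain_status(status))` would report whether the page was fetched or refused. A 403 here does not prove CloudFront is involved; it only signals that this simple approach was rejected by the server.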