-
Hi, I'm trying to fetch this specific webpage programmatically: https://health.usnews.com/best-nursing-homes/search?sort=name-asc&page=1

The web server is forcing an SSL/TLS connection, so its response never finishes. Here's my code:

    import aiohttp
    import asyncio

    MAX_PAGE = 1541
    US_HEALTH_URL = 'https://health.usnews.com/best-nursing-homes/search?sort=name-asc&page='
    HEADERS = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36',
        'cookie': '',
        'connection': 'close'
    }

    async def handleUSHealthPage(session: aiohttp.ClientSession, url: str):
        print(url)
        async with session.get(url, headers = HEADERS) as response:
            print(await response.text())

    async def main():
        async with aiohttp.ClientSession() as session:
            await handleUSHealthPage(session, US_HEALTH_URL + '1')

    asyncio.get_event_loop().run_until_complete(main())

The issue is that the program halts indefinitely at this line:
Replies: 2 comments 1 reply
-
I'm not sure why you think this has something to do with TLS; I get the same result with both http and https. The server appears not to be sending anything or closing the connection. If I use curl, I get a 403 Forbidden page, whereas it works fine in my browser. So this server is obviously doing something dodgy. I don't know what exactly, but maybe you need to change a number of headers to match what the browser (or requests) is sending before the server will respond correctly?
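
Since the server seems to neither answer nor close the connection, it may help to bound the wait so the script fails fast instead of hanging. The following is only a minimal sketch: `stalled_response` and `fetch_with_deadline` are hypothetical names, and the stalled server is simulated with a long sleep rather than a real request.

```python
import asyncio


async def stalled_response() -> str:
    """Stand-in for a server that accepts the connection but never replies."""
    await asyncio.sleep(3600)
    return "never reached"


async def fetch_with_deadline(seconds: float) -> str:
    """Return the body, or a short marker if nothing arrives within `seconds`."""
    try:
        return await asyncio.wait_for(stalled_response(), timeout=seconds)
    except asyncio.TimeoutError:
        # Raised by wait_for once the deadline passes; the inner task is cancelled.
        return "timed out"


if __name__ == "__main__":
    print(asyncio.run(fetch_with_deadline(0.1)))
```

With aiohttp itself, a similar bound can be set by passing `timeout=aiohttp.ClientTimeout(total=10)` when creating the `ClientSession`; the value 10 is just an illustrative choice.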
-
It turns out that SSL/TLS was just the protocol for HTTPS, and requests.get was sending some other headers by default:

    {
        'Accept-Encoding': 'gzip, deflate',
        'Accept': '*/*',
        'Connection': 'keep-alive'
    }

I can fetch data from the website now. Sorry for wasting your time.
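
For anyone landing here later, this is roughly how those headers could be passed to aiohttp. It is a sketch under assumptions, not the exact code used: `REQUESTS_LIKE_HEADERS`, `build_headers` and `fetch_page` are hypothetical names, and dropping the empty `cookie` and `connection: close` entries from the question is a guess, since the thread does not establish which header made the difference.

```python
# Headers reported above as the defaults sent by requests.get.
REQUESTS_LIKE_HEADERS = {
    "Accept-Encoding": "gzip, deflate",
    "Accept": "*/*",
    "Connection": "keep-alive",
}

# User-Agent string taken from the original question.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36"
)


def build_headers(user_agent: str) -> dict:
    """Combine the reported defaults with a browser-like User-Agent."""
    headers = dict(REQUESTS_LIKE_HEADERS)  # copy so the constant stays untouched
    headers["User-Agent"] = user_agent
    return headers


async def fetch_page(url: str) -> str:
    """Fetch `url` with the combined headers and return the response body."""
    # Imported lazily so build_headers() can be tried without aiohttp installed.
    import aiohttp

    async with aiohttp.ClientSession(headers=build_headers(USER_AGENT)) as session:
        async with session.get(url) as response:
            return await response.text()
```

Running `asyncio.run(fetch_page(url))` with the URL from the question would perform the request. Setting the headers on the session rather than on each `get` call keeps them consistent across all pages.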