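While working through a list of websites, one particular site partway through the list raised an error, so I handled it by catching the exception. Leaving this here as a note for the record.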
[error msg]
requests.exceptions.ConnectionError: HTTPConnectionPool(host='hostname', port=80): Max retries exceeded with url: hostname ...
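Checking for the response == None case was usually enough for things to run without trouble, but as happened this time, the script can die if the ConnectionError from the requests package is not handled: the exception is raised before any response is returned, so the None check is never reached.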
[Code]
import requests as rq
import bs4

header = {'User-Agent': 'Mozilla/5.0'}

def get_html_title(url):
    # Build and prepare the request by hand so it can be sent through a Session.
    request = rq.Request('GET', url, headers=header)
    r = request.prepare()
    s = rq.Session()
    try:
        response = s.send(r)
    except rq.exceptions.ConnectionError:
        # Network problem (DNS failure, refused connection, etc.): skip this site.
        return ''
    if response is None:
        return ''
    html_content = response.text
    navigator = bs4.BeautifulSoup(html_content, "html.parser")
    title = navigator.find('title')
    # Pages without a <title> tag yield an empty title.
    if title is None:
        return ''
    return title.get_text()
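As a variation on the function above, here is a minimal sketch that catches the base class requests.exceptions.RequestException instead of only ConnectionError, and also sets a timeout so a hung site cannot stall the whole run. The name fetch_text, the optional session parameter, and the 5-second timeout are my own assumptions for illustration, not part of the original script.

```python
import requests

HEADERS = {'User-Agent': 'Mozilla/5.0'}

def fetch_text(url, session=None, timeout=5):
    """Return the response body as text, or '' if Requests raises an error.

    fetch_text, session, and the 5-second default are illustrative choices.
    """
    if session is None:
        session = requests.Session()
    try:
        response = session.get(url, headers=HEADERS, timeout=timeout)
    except requests.exceptions.RequestException:
        # Base class: covers ConnectionError, Timeout, TooManyRedirects, etc.
        return ''
    return response.text
```

The returned text can then be handed to bs4.BeautifulSoup exactly as in get_html_title. Catching the base class trades detail for robustness: every failing site is skipped the same way, whatever the cause.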
[Exception cases in the Requests package]
http://docs.python-requests.org/en/latest/user/quickstart/#errors-and-exceptions
Errors and Exceptions
In the event of a network problem (e.g. DNS failure, refused connection, etc), Requests will raise a ConnectionError exception.
In the rare event of an invalid HTTP response, Requests will raise an HTTPError exception.
If a request times out, a Timeout exception is raised.
If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised.
All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.
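To make the hierarchy quoted above concrete, here is a small sketch that sorts a caught exception into one of those cases. The function name classify_request_error and the label strings are hypothetical, chosen only for this note. The exception classes themselves are the ones exposed in requests.exceptions.

```python
import requests

def classify_request_error(exc):
    """Map an exception to a short label (labels are illustrative only)."""
    # Timeout is checked first: ConnectTimeout inherits from both
    # ConnectionError and Timeout, so the order decides its label.
    if isinstance(exc, requests.exceptions.Timeout):
        return "timeout"
    if isinstance(exc, requests.exceptions.TooManyRedirects):
        return "too many redirects"
    if isinstance(exc, requests.exceptions.HTTPError):
        return "http error"
    if isinstance(exc, requests.exceptions.ConnectionError):
        return "connection error"
    # Anything else that Requests raises explicitly ends up here.
    if isinstance(exc, requests.exceptions.RequestException):
        return "other request error"
    return "not a requests error"
```

In a crawler loop this could be called inside an except requests.exceptions.RequestException as e: block to log why a given site was skipped.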