I already checked the existing questions. None of the answers works for me.
I wrote some code to scrape information from multiple pages of a website.
When I run the code, it returns this error:
'ascii' codec can't encode character '\xfc' in position 18: ordinal not in range(128)
When I test the code on a limited number of links it works.
The problem is probably this link:
because it contains the character ü.
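The failure can be isolated without any network access. The sketch below assumes the error comes from the URL being encoded as ASCII before the request is sent, which is what the traceback message suggests. The URL is a made-up placeholder, not the actual link.

```python
# Hypothetical URL containing a non-ASCII character (not the real link).
url = "https://www.example.com/investment/müller"

error_name = None
bad_char = None
try:
    # An HTTP request line must be ASCII, so this mimics what happens
    # when the URL is turned into bytes before being sent.
    url.encode("ascii")
except UnicodeEncodeError as exc:
    error_name = type(exc).__name__
    bad_char = url[exc.start]  # the first character that could not be encoded
    print(f"{error_name}: cannot encode {bad_char!r} at position {exc.start}")
```

If this raises the same kind of error as the scraper, the non-ASCII character in the link is very likely the cause.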
In this specific case I can simply drop that link and everything works. However, I would like to know how to handle this problem in general.
Here is the code:
import re
import urllib.request
from bs4 import BeautifulSoup
from time import sleep

def make_soup(url):
    html = urllib.request.urlopen(url)
    return BeautifulSoup(html, "lxml")
get_link = make_soup(section_url)
links_page = [a.attrs.get('href') for a in get_link.select('a[href]')]
links_page = list(set(links_page))
links = [l for l in links_page if 'https://www.crowdcube.com/investment/' in l]
title = tree.find('h2').get_text()  # find_all() returns a list-like ResultSet, which has no get_text()
description = re.sub(r'[^\w.]', ' ', description)
while l < len(loc):
while r < len(rais):
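Regarding the general case, one approach that may work is to percent-encode the non-ASCII characters in each link before passing it to `urllib.request.urlopen`. The following is only a sketch under that assumption: `to_ascii_url` is a hypothetical helper name, the URL is a placeholder, and non-ASCII host names are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path and query of a URL."""
    parts = urlsplit(url)
    # Keep "/" as the path separator and "%" so that links which are
    # already percent-encoded are not encoded a second time.
    path = quote(parts.path, safe="/%")
    # Keep the usual query delimiters intact.
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

# Hypothetical link, not the real one from the site.
print(to_ascii_url("https://www.example.com/investment/müller"))
```

With a helper like this, `make_soup` could call `urllib.request.urlopen(to_ascii_url(url))`, so plain ASCII links pass through unchanged and links such as the one with ü are encoded first.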