Getting an error while scraping Amazon using Selenium and bs4
I'm working on a class project using BeautifulSoup and Selenium's webdriver to scrape Amazon's disposable-diaper listings for each item's name, price, number of reviews, and rating.
My goal is to capture text like the following, which I will then split into separate columns:
Diapers Size 4, 150 Count - Pampers Swaddlers Disposable Baby Diapers, One
Month Supply
4.0 out of 5 stars
1,982
$43.98
($0.29/Count)
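(To illustrate the splitting step: once the raw strings are collected, I expect to post-process them roughly like this. The regex and the new column names are just my own sketch, not part of the scraper yet.)
import pandas as pd

# Sketch of the intended split, using one sample row of scraped strings.
df = pd.DataFrame({"Rating": ["4.0 out of 5 stars"], "Price": ["$43.98"]})
df["Rating Value"] = df["Rating"].str.extract(r"([\d.]+)", expand=False).astype(float)
df["Price Value"] = df["Price"].str.replace("[$,]", "", regex=True).astype(float)
print(df)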
Unfortunately, after roughly 50 records have been printed, I get this error:
Message: no such element: Unable to locate element: {"method":"css selector","selector":".a-last"}
Here is my code:
URL = "https://www.amazon.com/s?
k=baby+disposablerh=n%3A166772011ref=nb_sb_noss"
driver = ('C:/Users/Desktop/chromedriver_win32/chromedriver.exe')
driver.get(URL) html = driver.page_source soup = BeautifulSoup(html, "html.parser")
df = pd.DataFrame(columns = ["Product Name","Rating","Number of
Reviews","Price","Price Count"])
while True:
for i in soup.find_all(class_= "sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36
s-result-item sg-col-
4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"):
ProductName = i.find(class_= "a-size-base-plus a-color-base a-text- normal").text#.span.get_text
print(ProductName)
try:
Rating = i.find(class_= "a-icon-alt").text#.span.get_text()
except:
Rating = "Null"
print(Rating)
try:
NumberOfReviews = i.find(class_= "a-size-base").text#.span.get_text()
except:
NumberOfReviews = "Null"
print(NumberOfReviews)
try:
Price = i.find(class_= "a-offscreen").text#.span.get_text()
except:
Price = "Null"
print(Price)
try:
PriceCount = i.find(class_= "a-size-base a-color-secondary").text#.span.get_text()
except:
PriceCount = "Null"
print(PriceCount)
df = df.append({"Product Name":ProductName, "Rating":Rating, "Number of
Reviews":NumberOfReviews,
"Price":Price, "Price Count":PriceCount}, ignore_index = True)
nextlink = soup.find(class_= "a-disabled a-last")
if nextlink:
print ("This is the last page. ")
break
else:
progress = driver.find_element_by_class_name('a-last').click()
subhtml = driver.page_source
soup = BeautifulSoup(subhtml, "html.parser")
Unfortunately, I've hit a roadblock trying to figure out why it can't find a-last.
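In case it helps, this is the kind of explicit wait I was thinking of trying in place of the direct find_element_by_class_name call. It is only a sketch: the "li.a-last a" selector and the 10-second timeout are my own guesses, not anything I've confirmed against Amazon's markup.
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

try:
    # Wait up to 10 seconds for the "Next" link to become clickable,
    # instead of assuming it is already in the DOM.
    next_button = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, "li.a-last a"))
    )
    next_button.click()
except TimeoutException:
    # No clickable "Next" link appeared, so treat this as the last page.
    print("This is the last page.")
Would a wait like this address the error, or is something else going on?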
Topics: web-scraping, scraping, python
Category: Data Science