Scrolling a Webpage of Infinite Length
Ever since I started automating browsers for web scraping, scrolling the page has been a challenge for me. I searched the internet and found many solutions. One of them was
browser.execute_script("window.scrollTo(x, y)")
where ‘x’ is the position along the x-axis and ‘y’ is the position along the y-axis, both in pixels.
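As a small sketch of that call, the coordinates can also be passed as script arguments instead of being typed into the string. The helper name scroll_to is hypothetical and browser is assumed to be a Selenium WebDriver; values given after the script are visible to the JavaScript as arguments[0], arguments[1], and so on.

```python
def scroll_to(browser, x, y):
    """Scroll the window to the pixel position (x, y).

    `scroll_to` is a hypothetical helper name; `browser` is assumed to be
    a Selenium WebDriver instance.
    """
    # Extra positional values are exposed to the script as arguments[0], arguments[1].
    browser.execute_script("window.scrollTo(arguments[0], arguments[1]);", x, y)
```

Calling scroll_to(browser, 0, 500) would scroll 500 px down from the top of the page.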
But, how would I know these coordinates?
1st Approach
To solve the problem, I found another answer:
browser.execute_script("window.scrollTo(0, document.body.scrollHeight)")
where ‘x’ = 0 and ‘y’ = the full height of the document.
2nd Approach
I decided to set a range for the page size and run the scrolling code inside a loop, i.e.
for i in range(0, 7000, 200):
    browser.execute_script(f"window.scrollTo(0, {i})")
Here, I defined a range for the page size, i.e. 0 to 7000 px. When I ran the code, it scrolled in steps of 200 px and stopped at 6800 px, the last step below 7000.
Problem: I manually defined the page size. How would I know the exact page size?
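One way to answer that question, offered as a sketch and not a definitive fix, is to ask the page for its own height and build the range from it. The helper name scroll_in_steps and the step parameter are hypothetical, and browser is assumed to be a Selenium WebDriver. The height is read only once, so content that loads later while scrolling is not covered.

```python
def scroll_in_steps(browser, step=200):
    """Scroll down in fixed steps, using the height the page itself reports.

    Hypothetical helper; assumes `browser` is a Selenium WebDriver.
    Returns the list of y positions that were scrolled to.
    """
    # A script that starts with `return` hands its value back to Python.
    page_height = int(browser.execute_script("return document.body.scrollHeight"))
    positions = list(range(0, page_height, step))
    for y in positions:
        browser.execute_script("window.scrollTo(0, arguments[0]);", y)
    return positions
```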
3rd Approach
browser.execute_script("window.scrollTo(0, document.body.offsetHeight)")
4th Approach
while True:
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight)")
Final Approach (The Success)
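The solution I found was the following (it needs import time at the top of the script, and the WebDriver object is named driver here instead of browser):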
time.sleep(5)
while True:
    height1 = driver.execute_script('return window.pageYOffset')
    time.sleep(2)
    scroll = driver.execute_script('window.scrollTo(0, document.body.scrollHeight)')
    height2 = driver.execute_script('return window.pageYOffset')
    if height1 == height2:
        break
    else:
        continue
Let me explain each line.
1- wait 5 seconds so the page can finish its first load
2- start an infinite while loop
3- read the current vertical scroll position (without scrolling yet)
4- wait 2 seconds
5/6- scroll to the bottom of the page
7- read the scroll position again
8- check whether the new scroll position is equal to the old one
(it will be different if the page loaded more content, because the scroll then moves further down)
9- if the positions are equal, nothing new was loaded, so break the loop
10- else
11- continue the loop
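The same idea can be wrapped in a function with an upper limit on the number of rounds, so that a page that never stops loading cannot hang the script. This is only a sketch under some assumptions: the name scroll_until_stable and the parameters pause and max_rounds are hypothetical, driver is assumed to be a Selenium WebDriver, and the pause is placed after the scroll, not before it.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=100):
    """Scroll to the bottom repeatedly until the scroll position stops changing.

    Hypothetical helper; assumes `driver` is a Selenium WebDriver.
    Returns the number of scrolls that actually moved the page.
    """
    moved = 0
    for _ in range(max_rounds):
        before = driver.execute_script("return window.pageYOffset")
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give newly requested content time to load
        after = driver.execute_script("return window.pageYOffset")
        if after == before:
            break  # the scroll did not move, so nothing new was loaded
        moved += 1
    return moved
```

A longer pause is safer on slow connections, and max_rounds is only a safety net, so it can be set generously.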
Happy Automation