Utils
- ScraperFC.utils.botasaurus_getters.botasaurus_browser_get_json(url: str, headless: bool = True, block_images_and_css: bool = True, wait_for_complete_page_load: bool = True, delay: int = 0) → dict
Use the Botasaurus BROWSER module to get JSON from a page.
- Parameters:
url (str) – The URL to scrape
headless (bool) – Whether to run the browser in headless mode
block_images_and_css (bool) – Whether to block images and CSS
wait_for_complete_page_load (bool) – Whether to wait for the page to load completely
delay (int) – Seconds to wait after the request (default: 0)
- Raises:
TypeError – If any of the parameters are the wrong type
ValueError – If delay is negative
- Returns:
JSON data
- Return type:
dict
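The Raises contract above can be illustrated with a minimal sketch. `check_getter_args` is a hypothetical helper written for this example, not ScraperFC's actual code; it only mirrors the documented behavior (TypeError for a wrong type, ValueError for a negative delay). Rejecting a bool passed as delay is an assumption of the sketch.

```python
def check_getter_args(
    url,
    headless=True,
    block_images_and_css=True,
    wait_for_complete_page_load=True,
    delay=0,
):
    """Hypothetical validator mirroring the documented Raises section."""
    if not isinstance(url, str):
        raise TypeError("url must be a str")

    # The three browser options are documented as bool.
    for name, value in (
        ("headless", headless),
        ("block_images_and_css", block_images_and_css),
        ("wait_for_complete_page_load", wait_for_complete_page_load),
    ):
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a bool")

    # Assumption of this sketch: bool is rejected for delay,
    # even though bool is a subclass of int in Python.
    if isinstance(delay, bool) or not isinstance(delay, int):
        raise TypeError("delay must be an int")
    if delay < 0:
        raise ValueError("delay must not be negative")
```

Under these assumptions, `check_getter_args("https://example.com", delay=-1)` raises ValueError, and `check_getter_args(123)` raises TypeError.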
- ScraperFC.utils.botasaurus_getters.botasaurus_browser_get_soup(url: str, headless: bool = False, block_images_and_css: bool = False, wait_for_complete_page_load: bool = True, delay: int = 0) → BeautifulSoup
Use the Botasaurus BROWSER module to get Soup from a page.
- Parameters:
url (str) – The URL to scrape
headless (bool) – Whether to run the browser in headless mode
block_images_and_css (bool) – Whether to block images and CSS
wait_for_complete_page_load (bool) – Whether to wait for the page to load completely
delay (int) – Seconds to wait after the request (default: 0)
- Raises:
TypeError – If any of the parameters are the wrong type
ValueError – If delay is negative
- Returns:
BeautifulSoup object
- Return type:
BeautifulSoup
- ScraperFC.utils.botasaurus_getters.botasaurus_request_get_json(url: str, delay: int = 0) → dict
Use the Botasaurus REQUESTS module to get JSON from a page.
- Parameters:
url (str) – The URL to request
delay (int) – Seconds to wait after the request (default: 0)
- Raises:
TypeError – If any of the parameters are the wrong type
ValueError – If delay is negative
- Returns:
JSON data
- Return type:
dict
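Because delay is the number of seconds to wait after each request, it can be used to pace a batch of requests. The sketch below is one possible pattern, not part of ScraperFC: `fetch_all_json` is a hypothetical helper, and the getter is passed in as an argument so the loop can be exercised without network access.

```python
from typing import Callable, Dict, Iterable


def fetch_all_json(
    getter: Callable[..., dict],
    urls: Iterable[str],
    delay: int = 1,
) -> Dict[str, dict]:
    """Call getter(url, delay=delay) for each URL and collect the results.

    getter is expected to have the (url, delay) signature documented above.
    """
    results: Dict[str, dict] = {}
    for url in urls:
        # The getter itself waits `delay` seconds after each request,
        # so no extra sleep is added here.
        results[url] = getter(url, delay=delay)
    return results
```

In real use, `getter` would be `botasaurus_request_get_json`, imported from `ScraperFC.utils.botasaurus_getters`.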
- ScraperFC.utils.botasaurus_getters.botasaurus_request_get_soup(url: str, delay: int = 0) → BeautifulSoup
Use the Botasaurus REQUESTS module to get Soup from a page.
- Parameters:
url (str) – The URL to request
delay (int) – Seconds to wait after the request (default: 0)
- Raises:
TypeError – If any of the parameters are the wrong type
ValueError – If delay is negative
- Returns:
BeautifulSoup object
- Return type:
BeautifulSoup
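The REQUESTS and BROWSER getters share the url and delay parameters, so one possible way to combine them is to try the lighter request getter first and fall back to the browser getter. This is a sketch under assumptions, not a documented ScraperFC pattern: `get_with_fallback` is hypothetical, and the source does not say which exceptions the request getters raise when a request fails, so the sketch treats any exception other than the documented TypeError and ValueError as a reason to fall back.

```python
from typing import Any, Callable


def get_with_fallback(
    url: str,
    request_getter: Callable[..., Any],
    browser_getter: Callable[..., Any],
    delay: int = 0,
) -> Any:
    """Try request_getter first; use browser_getter if it fails."""
    try:
        return request_getter(url, delay=delay)
    except (TypeError, ValueError):
        # Documented argument errors: a browser retry would not help.
        raise
    except Exception:
        # Assumption: any other failure means the plain request did not work.
        return browser_getter(url, delay=delay)
```

For JSON pages, the two getters would be `botasaurus_request_get_json` and `botasaurus_browser_get_json`; for HTML pages, the corresponding `_get_soup` functions.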