We're using Protobuf strings to generate the `tfs` query parameter, which stores all the information for a lookup request. We then parse the HTML content and extract the info we need.
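As a rough illustration of the first step, the sketch below packs an already-serialized Protobuf message into a `tfs` query parameter. The Base64 encoding, the base URL, and the helper names `build_tfs_url` and `read_tfs` are assumptions made for illustration and are not confirmed by this document; the placeholder bytes stand in for a real serialized message.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint, for illustration only.
BASE_URL = "https://www.google.com/travel/flights"


def build_tfs_url(serialized: bytes, base_url: str = BASE_URL) -> str:
    """Return a lookup URL carrying the Protobuf payload in `tfs` (hypothetical helper)."""
    # Assumption: the payload is Base64-encoded before being placed in the URL.
    tfs = base64.b64encode(serialized).decode("ascii")
    # urlencode percent-escapes '+', '/' and '=' so the value survives transport.
    return f"{base_url}?{urlencode({'tfs': tfs})}"


def read_tfs(url: str) -> bytes:
    """Recover the raw payload bytes from a URL built by build_tfs_url."""
    query = parse_qs(urlsplit(url).query)
    return base64.b64decode(query["tfs"][0])


# Placeholder bytes; a real payload would come from a Protobuf message's
# SerializeToString().
payload = b"\x08\x01\x12\x04demo"
url = build_tfs_url(payload)
```

Keeping the encoding in one small helper makes it easy to swap the transport (for example, a different HTTP client) without touching how the query parameter is produced.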
Generally speaking, using the `requests` module with naively inserted `User-Agent` headers to scrape Google websites is a horrible idea, since it's too easy to detect on the server side. I've been blocked once, and the block lasted almost 3 months. If you want something more stable, I recommend using proxies or replacing the `requests` module in the source code with `primp`, a scraper that is highly optimized for browser impersonation. Since `primp` doesn't come with type annotations, you may create a file named `primp.py` that imports the necessary items (`Client`) and defines a blank class for `Response`, which is not directly importable from `primp`. Type definitions (`.pyi`) for `primp`:
<details>
<summary>Expand <code>primp.pyi</code></summary>
```python
from typing import Dict, Optional, Tuple


class Client:
    """Initializes an HTTP client that can impersonate web browsers.

    Args:
        auth (tuple, optional): A tuple containing the username and password for basic authentication. Default is None.
        auth_bearer (str, optional): Bearer token for authentication. Default is None.
        params (dict, optional): Default query parameters to include in all requests. Default is None.
        headers (dict, optional): Default headers to send with requests. If `impersonate` is set, this will be ignored.
        cookies (dict, optional): An optional map of cookies to send with requests as the `Cookie` header.
        timeout (float, optional): HTTP request timeout in seconds. Default is 30.
        cookie_store (bool, optional): Enable a persistent cookie store. Received cookies will be preserved and included
            in additional requests. Default is True.
        referer (bool, optional): Enable or disable automatic setting of the `Referer` header. Default is True.
        proxy (str, optional): Proxy URL for HTTP requests. Example: "socks5://127.0.0.1:9150". Default is None.
        impersonate (str, optional): Entity to impersonate. Example: "chrome_124". Default is None.
**Preflights**: We may send the request to the server twice, since the initial request sometimes returns no results. When this happens, the first request acts as a preflight, and we send another request while the server builds the data. You can think of this as a "cold start."
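
The retry described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: `fetch_with_preflight`, `fetch`, and `has_results` are hypothetical names, and the fake fetcher stands in for a real HTTP call.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_preflight(
    fetch: Callable[[], T],
    has_results: Callable[[T], bool],
    max_attempts: int = 2,
) -> T:
    """Call `fetch` again when the first response comes back empty."""
    response = fetch()
    for _ in range(max_attempts - 1):
        if has_results(response):
            break
        # Treat the empty response as a preflight and ask again.
        response = fetch()
    return response


# Stand-in for a real HTTP call: empty on the first try, populated afterwards.
calls = []


def fake_fetch() -> list:
    calls.append(1)
    return [] if len(calls) == 1 else ["flight A", "flight B"]


result = fetch_with_preflight(fake_fetch, has_results=bool)
```

Capping the attempts (two by default here) keeps a genuinely empty result from turning into an endless loop of requests.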
## Cookies & consent
The EU region is a bit tricky to handle for now, but the fallback support should be able to cover it.
## What's new
- `v2.0` – New (much more succinct) API, fallback support for Playwright serverless functions, and [documentation](https://aweirddev.github.io/flights)!