How to Use Proxies with Python requests
Python's requests library is the industry standard for synchronous HTTP communication, built on the battle-tested urllib3 engine. It manages proxy routing through a local proxies dictionary or system-level environment variables, using the HTTP CONNECT method to establish secure tunnels for HTTPS traffic. This architecture ensures that your data remains encrypted end-to-end between your script and the target server, as the proxy gateway only sees the destination hostname and not the actual request payload. While requests is exceptionally user-friendly, it has a distinct TLS fingerprint that differs from modern browsers, which can be a detection signal for advanced anti-bot systems. For most scraping tasks, however, its thread-safe session objects and robust connection pooling make it the most reliable choice for managing high-volume residential proxy traffic with minimal overhead.
Focus: working config first, then the mistakes that usually cause traffic to bypass the proxy or break under concurrency.
Using Proxies with Python requests: What to Know
Python's requests library routes all proxied traffic through the underlying urllib3 library, which handles the low-level socket management and connection pooling. When you pass a proxies dictionary to a request, the library inspects the target URL's scheme and selects the corresponding proxy URL. For HTTPS targets, it initiates an HTTP CONNECT request to the proxy gateway. This establishes a transparent tunnel where the gateway simply passes encrypted bytes between your script and the target, meaning the proxy never sees your sensitive data or headers.
A frequent pitfall is the assumption that setting a proxy in a Session object will force a new IP for every request. In reality, the Session's connection pool reuses the same TCP connection to the gateway to improve performance. Since ProxyLabs identifies sessions based on the connection's lifecycle or the session ID in the username, reusing a connection might lead to IP persistence. If your workflow requires a guaranteed fresh IP for every single call, you should use the top-level requests.get() method or manually clear the session's connection pool after each request.
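Both options can be sketched as small helpers. This is a minimal illustration using the gateway address and placeholder credentials from the examples later in this guide; the function names are ours, not part of any API:

```python
import requests

PROXIES = {
    "http": "http://your-username:[email protected]:8080",
    "https": "http://your-username:[email protected]:8080",
}

def fetch_fresh_ip(url: str) -> dict:
    """Top-level call: builds a new connection per request, so the
    gateway treats each call as a fresh session and rotates the exit IP."""
    resp = requests.get(url, proxies=PROXIES, timeout=30)
    resp.raise_for_status()
    return resp.json()

def fetch_with_pool_reset(session: requests.Session, url: str) -> int:
    """Session variant: drop the pooled connections after each call to
    force a new tunnel on the next one. The Session object stays usable."""
    resp = session.get(url, proxies=PROXIES, timeout=30)
    session.close()  # clears the underlying urllib3 connection pools
    return resp.status_code
```

The first approach is simpler; the second keeps Session-level state such as headers and cookies while still forcing a new tunnel per call.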
HTTPS tunneling in requests is strictly end-to-end. Unlike some corporate proxies that perform SSL stripping or man-in-the-middle inspection, the ProxyLabs gateway only manages the transport layer. This means you don't need to install custom CA certificates or worry about certificate validation errors caused by the proxy itself. If you do encounter SSL errors, they are almost certainly originating from the target server's certificate or a local mismatch in your Python environment's certifi package.
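A quick way to rule out a local certificate-store mismatch is to check which CA bundle requests is actually using. certifi ships as a dependency of requests, so this runs in any environment where requests is installed:

```python
import certifi
import requests

# requests validates target certificates against certifi's CA bundle by
# default; print the bundle path to confirm which file is in use.
print(certifi.where())

# Passing verify= explicitly pins validation to that bundle and rules out
# a stale system store (uncomment to test against a live target):
# resp = requests.get("https://httpbin.org/ip", verify=certifi.where(), timeout=30)
```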
Connection pooling is one of the most powerful features of requests, but it requires careful management. By default, a Session will maintain a pool of 10 connections. If you're running multi-threaded scraping with a higher worker count, you must increase the pool size via an HTTPAdapter. Without this adjustment, threads will block while waiting for an available connection from the pool, creating a bottleneck that looks like proxy latency but is actually a local configuration issue.
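A minimal sketch of that adjustment, assuming a hypothetical worker count of 32:

```python
import requests
from requests.adapters import HTTPAdapter

WORKERS = 32  # size the pool to match your thread count

session = requests.Session()
adapter = HTTPAdapter(pool_connections=WORKERS, pool_maxsize=WORKERS)
# Mounting replaces the default adapters (which pool only 10 connections).
session.mount("http://", adapter)
session.mount("https://", adapter)

# Each of the 32 worker threads can now hold its own connection to the
# gateway instead of queueing behind the default pool.
```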
For high-concurrency environments, memory management becomes critical. When requests through residential proxies take several seconds to complete, a large number of concurrent threads can quickly consume available RAM. Using the 'stream=True' parameter allows you to process the response body iteratively, which is essential when downloading large datasets or scraping thousands of pages simultaneously. This prevents the entire response from being loaded into memory at once, keeping your script stable even under heavy load.
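As a sketch, a download helper that streams the body to disk in fixed-size chunks (the function name and chunk size are our choices, not part of requests):

```python
import requests

def download_to_disk(url: str, path: str, proxies: dict = None) -> int:
    """Stream a response to disk in 64 KB chunks so the full body
    is never held in memory. Returns the number of bytes written."""
    total = 0
    with requests.get(url, proxies=proxies, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        with open(path, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=65536):
                fh.write(chunk)
                total += len(chunk)
    return total
```

The `with` block ensures the connection is released back to the pool even if iteration fails partway through, which matters when many threads share one Session.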
Verification is an essential step in any production scraping pipeline. Before starting your main task, you should always route a test request through the proxy to a service like httpbin.org or our own IP lookup tool. This allows you to programmatically verify that the exit IP is residential, matches the intended geo-location, and that the proxy authentication is correctly configured. Handling these checks early prevents wasting bandwidth and processing time on requests that would eventually fail or leak your real identity.
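A pre-flight check along these lines can gate the main pipeline; the helper below is a sketch that returns the exit IP reported by httpbin.org:

```python
import requests

def verify_exit_ip(proxies: dict) -> str:
    """Pre-flight check: confirm proxy auth works and return the exit IP
    the target server will see. Raises on auth failure or timeout."""
    resp = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=(10, 30))
    resp.raise_for_status()
    return resp.json()["origin"]

# Usage (placeholder credentials):
# proxies = {
#     "http": "http://your-username:[email protected]:8080",
#     "https": "http://your-username:[email protected]:8080",
# }
# print(verify_exit_ip(proxies))
```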
When comparing requests to alternatives like aiohttp or Playwright, the primary trade-off is between simplicity and performance. Requests is synchronous and blocking, which makes the code easier to write and debug but limits throughput on a single thread. For most scraping tasks, however, its reliability and the maturity of its ecosystem outweigh the raw performance gains of async libraries, especially when paired with a robust threading model or distributed task queues like Celery.
Installation
pip install requests

Working Examples
import requests
proxies = {
    "http": "http://your-username:[email protected]:8080",
    "https": "http://your-username:[email protected]:8080",
}
try:
    response = requests.get(
        "https://httpbin.org/ip",
        proxies=proxies,
        timeout=30,
    )
    response.raise_for_status()
    print(response.json())
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")

import requests
# Session ID keeps the same IP for up to 30 minutes
proxies = {
    "http": "http://your-username-session-abc123:[email protected]:8080",
    "https": "http://your-username-session-abc123:[email protected]:8080",
}
session = requests.Session()
session.proxies.update(proxies)
# All requests through this session use the same exit IP
for page in range(1, 6):
    resp = session.get(f"https://example.com/page/{page}", timeout=30)
    print(f"Page {page}: {resp.status_code}")

import requests
# Country-level targeting
proxies_country = {
    "http": "http://your-username-country-US:[email protected]:8080",
    "https": "http://your-username-country-US:[email protected]:8080",
}
# City-level targeting
proxies_city = {
    "http": "http://your-username-country-US-city-NewYork:[email protected]:8080",
    "https": "http://your-username-country-US-city-NewYork:[email protected]:8080",
}
resp = requests.get("https://httpbin.org/ip", proxies=proxies_city, timeout=30)
print(resp.json())

What matters in practice
- Thread-safe session objects with internal connection pooling for efficient resource reuse across multiple requests.
- Automatic handling of Proxy-Authorization headers when credentials are embedded in the proxy URL or provided via HTTPProxyAuth.
- Support for streaming large responses to disk or processing them in chunks to minimize memory footprint in high-concurrency environments.
- Extensive configuration of retry logic and backoff strategies through custom urllib3 HTTP adapters mounted to specific URL prefixes.
- Native support for picking up proxy settings from .netrc files and standard environment variables like HTTP_PROXY and HTTPS_PROXY.
- Transparent handling of keep-alive connections through the proxy gateway to reduce handshake latency on repeated requests.
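The environment-variable path from the list above can be exercised without touching any call sites. A sketch with placeholder credentials:

```python
import os
import requests

# Standard proxy environment variables; requests honours them unless a
# Session is created with trust_env=False or an explicit proxies=
# argument overrides them per request.
os.environ["HTTP_PROXY"] = "http://your-username:[email protected]:8080"
os.environ["HTTPS_PROXY"] = "http://your-username:[email protected]:8080"

# requests.utils.getproxies() shows what will actually be picked up:
print(requests.utils.getproxies())
```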
Operational Notes
Always define both 'http' and 'https' keys in your proxies dictionary. If the 'https' key is missing, requests will silently bypass the proxy for all secure URLs, potentially leaking your real IP address to the target server.
Use requests.Session() for multiple requests to the same target host. This allows requests to reuse the same underlying TCP connection, which significantly reduces the overhead of the residential gateway's initial handshake.
Mount an HTTPAdapter with a custom Retry strategy to handle intermittent 429 and 503 errors. Each retry attempt through the ProxyLabs gateway automatically triggers an internal rotation to a fresh residential IP address.
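A minimal version of that retry setup; the specific counts and status codes here are illustrative choices, not required values:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retry = Retry(
    total=3,
    backoff_factor=1.0,               # exponential backoff between attempts
    status_forcelist=[429, 503],      # retry only on these status codes
    allowed_methods=["GET", "HEAD"],  # never replay non-idempotent requests
)
session = requests.Session()
session.mount("http://", HTTPAdapter(max_retries=retry))
session.mount("https://", HTTPAdapter(max_retries=retry))
```

Restricting `allowed_methods` matters in scraping pipelines that also POST data: an automatic replay of a non-idempotent request can create duplicate side effects on the target.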
Set a connect timeout explicitly in addition to a read timeout. Residential proxy gateways can occasionally hang during the dial phase, and requests can block your execution indefinitely unless a strict timeout is enforced.
When using sticky sessions, verify your IP with a call to httpbin.org/ip at the start of the session. This ensures the residential peer is online and active before you begin a multi-step scraping flow that requires IP persistence.
If your proxy password contains special characters like '@' or ':', use the requests.auth.HTTPProxyAuth class instead of URL embedding to avoid malformed URL errors that are difficult to debug.
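Both workarounds can be shown side by side. The password below is hypothetical; percent-encoding via `urllib.parse.quote` is an alternative to HTTPProxyAuth when you prefer to keep credentials in the URL:

```python
import requests
from requests.auth import HTTPProxyAuth
from urllib.parse import quote

password = "p@ss:word"  # hypothetical password with reserved characters

# Option 1: percent-encode before embedding in the proxy URL
encoded = quote(password, safe="")
proxy_url = f"http://your-username:{encoded}@gate.proxylabs.app:8080"
print(proxy_url)  # '@' becomes %40, ':' becomes %3A

# Option 2: keep the URL credential-free and pass HTTPProxyAuth explicitly
proxies = {
    "http": "http://gate.proxylabs.app:8080",
    "https": "http://gate.proxylabs.app:8080",
}
auth = HTTPProxyAuth("your-username", password)
# requests.get(url, proxies=proxies, auth=auth)
```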
Frequently Asked Questions
Why is my IP not changing between requests when using a Session?
When you use a requests.Session() object, the underlying urllib3 connection pool maintains open TCP sockets to the proxy gateway. Because the gateway uses these open connections to route traffic, you might occasionally exit from the same residential IP even with rotating credentials. To force a rotation on every single request, you should either avoid using a Session object entirely or call session.close() between requests, though this will increase your overall latency due to repeated handshakes.
How do I handle proxy authentication errors in Python?
An HTTP 407 Proxy Authentication Required error indicates that your credentials weren't accepted by the gateway. This is most often caused by special characters in your password that break the URL parsing logic in requests. The most reliable fix is to use HTTPProxyAuth for credential passing, or to ensure your username and password are correctly URL-encoded. You should also check the ProxyLabs dashboard to verify that your plan is active and that you haven't exceeded your bandwidth limit.
Does the requests library support SOCKS5 proxies with ProxyLabs?
Requests can support SOCKS5 proxies if you install the 'requests[socks]' extra, which brings in the PySocks library. However, ProxyLabs residential gateways are heavily optimized for HTTP and HTTPS tunneling via the CONNECT method. Using SOCKS5 often results in higher latency and lower stability compared to the standard HTTP gateway. We strongly recommend sticking to the standard http://gate.proxylabs.app:8080 configuration for the most reliable performance during large-scale scraping operations.
Need residential IPs for Python requests?
Get access to 30M+ residential IPs in 195+ countries. Pay-as-you-go from £2.50/GB. No subscriptions, no commitments.