proxy.py
Lightweight, Programmable, TLS interceptor Proxy for HTTP(S), HTTP2, WebSockets protocols in a single Python file
About
Table of Contents
- Features
- Install
- Start proxy.py
- Plugin Examples
- End-to-End Encryption
- TLS Interception
- Plugin Developer and Contributor Guide
- Flags
Features
- Lightweight
  - Distributed as a single file module, ~50KB
  - Uses only ~5-20MB RAM
  - No external dependency other than standard Python library
- Programmable
  - Optionally enable builtin Web Server
  - Customize proxy and http routing via plugins
  - Enable plugin using command line option e.g. --plugins plugin_examples.CacheResponsesPlugin
  - Plugin API is currently in development state, expect breaking changes.
- Secure
  - Enable end-to-end encryption between clients and proxy.py using TLS
  - See End-to-End Encryption
- Man-In-The-Middle
  - Can decrypt TLS traffic between clients and upstream servers
  - See TLS Interception
- Supported proxy protocols
  - http
  - https
  - http2
  - websockets
- Optimized for large file uploads and downloads
- IPv4 and IPv6 support
- Basic authentication support
- Can serve a PAC (Proxy Auto-configuration) file
  - See --pac-file and --pac-file-url-path flags
Install
Stable version
$ pip install --upgrade proxy.py
Development version
$ pip install git+https://github.com/abhinavsingh/proxy.py.git@develop
For Docker usage see Docker image.
Start proxy.py
Command line
Simply type proxy.py on command line to start it with default configuration.
$ proxy.py
...[redacted]... - Loaded plugin
...[redacted]... - Starting 8 workers
...[redacted]... - Started server on ::1:8899
Things to notice from above logs:
- Loaded plugin - proxy.py will load HttpProxyPlugin by default. It adds http(s) proxy server capabilities to proxy.py.
- Started N workers - Use --num-workers flag to customize number of Worker processes. By default, proxy.py will start as many workers as there are CPU cores on the machine.
- Started server on ::1:8899 - By default, proxy.py listens on IPv6 ::1, which is equivalent of IPv4 127.0.0.1. If you want to access proxy.py externally, use --hostname :: or --hostname 0.0.0.0 or bind to any other interface available on your machine.
- Port 8899 - Use --port flag to customize default TCP port.
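The difference between the default ::1 binding and the externally reachable :: or 0.0.0.0 can be checked with Python's standard ipaddress module. This is only an illustrative sketch and not part of proxy.py; the helper name describe_bind_address is made up for this example.

```python
import ipaddress
import multiprocessing


def describe_bind_address(hostname: str) -> str:
    """Classify an address that could be passed to --hostname (hypothetical helper)."""
    address = ipaddress.ip_address(hostname)
    if address.is_loopback:
        reach = "loopback, reachable from this machine only"
    elif address.is_unspecified:
        reach = "all interfaces, reachable externally"
    else:
        reach = "one specific interface"
    return f"IPv{address.version}: {reach}"


if __name__ == "__main__":
    for host in ("::1", "127.0.0.1", "::", "0.0.0.0"):
        print(host, "->", describe_bind_address(host))
    # Per the notes above, the default worker count follows the CPU core count.
    print("CPU cores on this machine:", multiprocessing.cpu_count())
```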
All the logs above are INFO level logs, the default --log-level for proxy.py.
Let's start proxy.py with DEBUG level logging:
$ proxy.py --log-level d
...[redacted]... - Open file descriptor soft limit set to 1024
...[redacted]... - Loaded plugin
...[redacted]... - Started 8 workers
...[redacted]... - Started server on ::1:8899
As we can see, before starting up:
- proxy.py also tried to set open file limit ulimit on the system.
- Default value for --open-file-limit used is 1024.
- --open-file-limit flag is a no-op on Windows operating systems.
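For readers curious how a process can raise its own open file limit, here is a minimal sketch using the standard resource module. It is an assumption about the general technique, not proxy.py's actual code, and the function name raise_open_file_limit is hypothetical. The resource module does not exist on Windows, which is one plausible reason such a flag would be a no-op there.

```python
from typing import Optional


def raise_open_file_limit(target: int) -> Optional[int]:
    """Raise the soft open-file limit to at least `target` where possible.

    Returns the resulting soft limit, or None on platforms without the
    `resource` module (for example Windows). Hypothetical helper.
    """
    try:
        import resource  # POSIX-only standard library module
    except ImportError:
        return None

    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # Already high enough, leave it alone.

    # The soft limit can never exceed the hard limit.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    print("soft open file limit:", raise_open_file_limit(1024))
```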
See flags for full list of available configuration options.
Docker image
$ docker run -it -p 8899:8899 --rm abhinavsingh/proxy.py:v1.0.0
By default docker binary is started with IPv4 networking flags:
--hostname 0.0.0.0 --port 8899
To override input flags, start docker image as follows.
For example, to check proxy.py --version:
$ docker run -it \
-p 8899:8899 \
--rm abhinavsingh/proxy.py:v1.0.0 \
--version
docker image is currently broken on macOS due to incompatibility with vpnkit.
Plugin Examples
See plugin_examples.py for full code.
All the examples below also work with https traffic but require additional flags and certificate generation.
See TLS Interception.
RedirectToCustomServerPlugin
Redirects all incoming http requests to custom web server.
By default, it redirects client requests to inbuilt web server, also running on 8899 port.
Start proxy.py and enable inbuilt web server:
$ proxy.py \
--enable-web-server \
--plugins plugin_examples.RedirectToCustomServerPlugin
Verify using curl -v -x localhost:8899 http://google.com:
... [redacted] ...
< HTTP/1.1 404 NOT FOUND
< Server: proxy.py v1.0.0
< Connection: Close
<
* Closing connection 0
Above 404 response was returned from proxy.py web server.
Verify the same by inspecting the logs for proxy.py.
Along with the proxy request log, you should also see an http web server request log.
2019-09-24 19:09:33,602 - INFO - pid:49996 - access_log:1241 - ::1:49525 - GET /
2019-09-24 19:09:33,603 - INFO - pid:49995 - access_log:1157 - ::1:49524 - GET localhost:8899/ - 404 NOT FOUND - 70 bytes
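The curl checks in this section can also be scripted. The sketch below uses only Python's standard urllib and assumes proxy.py is listening on localhost:8899 as in the examples; the helper names are hypothetical and nothing here is part of proxy.py itself.

```python
import urllib.error
import urllib.request


def build_proxy_opener(proxy_url: str = "http://localhost:8899") -> urllib.request.OpenerDirector:
    """Return an opener that routes http and https requests through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_status(url: str, proxy_url: str = "http://localhost:8899") -> int:
    """Fetch `url` through the proxy and return the HTTP status code."""
    opener = build_proxy_opener(proxy_url)
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        return error.code


if __name__ == "__main__":
    # Requires a running proxy.py instance; raises URLError otherwise.
    print(fetch_status("http://google.com"))
```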
FilterByUpstreamHostPlugin
Drops traffic by inspecting upstream host.
By default, plugin drops traffic for google.com and www.google.com.
Start proxy.py as:
$ proxy.py \
--plugins plugin_examples.FilterByUpstreamHostPlugin
Verify using curl -v -x localhost:8899 http://google.com:
... [redacted] ...
< HTTP/1.1 418 I'm a tea pot
< Proxy-agent: proxy.py v1.0.0
* no chunk, no close, no size. Assume close to signal end
<
* Closing connection 0
Above 418 I'm a tea pot is sent by our plugin.
Verify the same by inspecting logs for proxy.py:
2019-09-24 19:21:37,893 - ERROR - pid:50074 - handle_readables:1347 - ProtocolException type raised
Traceback (most recent call last):
... [redacted] ...
2019-09-24 19:21:37,897 - INFO - pid:50074 - access_log:1157 - ::1:49911 - GET None:None/ - None None - 0 bytes
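The decision this plugin makes can be sketched in plain Python. The code below does not use the real HttpProxyBasePlugin hooks, whose signatures are not shown in this document; the names FILTERED_HOSTS and rejection_for are hypothetical and exist only to illustrate the host check and the 418 reply seen in the curl output.

```python
from typing import Optional

# Hosts dropped by default, as stated above.
FILTERED_HOSTS = {"google.com", "www.google.com"}


def normalize_host(host: str) -> str:
    """Lower-case the host and strip whitespace and any trailing dot."""
    return host.strip().lower().rstrip(".")


def rejection_for(host: str) -> Optional[bytes]:
    """Return a raw 418 response for filtered hosts, or None to let traffic pass."""
    if normalize_host(host) not in FILTERED_HOSTS:
        return None
    return (
        b"HTTP/1.1 418 I'm a tea pot\r\n"
        b"Proxy-agent: proxy.py v1.0.0\r\n"
        b"\r\n"
    )


if __name__ == "__main__":
    for candidate in ("google.com", "www.google.com", "httpbin.org"):
        print(candidate, "->", "dropped" if rejection_for(candidate) else "allowed")
```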
CacheResponsesPlugin
Caches upstream server responses.
Start proxy.py as:
$ proxy.py \
--plugins plugin_examples.CacheResponsesPlugin
Verify using curl -v -x localhost:8899 http://httpbin.org/get:
... [redacted] ...
< HTTP/1.1 200 OK
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: *
< Content-Type: application/json
< Date: Wed, 25 Sep 2019 02:24:25 GMT
< Referrer-Policy: no-referrer-when-downgrade
< Server: nginx
< X-Content-Type-Options: nosniff
< X-Frame-Options: DENY
< X-XSS-Protection: 1; mode=block
< Content-Length: 202
< Connection: keep-alive
<
{
"args": {},
"headers": {
"Accept": "*/*",
"Host": "httpbin.org",
"User-Agent": "curl/7.54.0"
},
"origin": "1.2.3.4, 5.6.7.8",
"url": "https://httpbin.org/get"
}
* Connection #0 to host localhost left intact
Get path to the cache file from proxy.py logs:
... [redacted] ... - GET httpbin.org:80/get - 200 OK - 556 bytes
... [redacted] ... - Cached response at /var/folders/k9/x93q0_xn1ls9zy76m2mf2k_00000gn/T/httpbin.org-1569378301.407512.txt
Verify contents of the cache file using cat /path/to/your/cache/httpbin.org.txt:
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Wed, 25 Sep 2019 02:24:25 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Length: 202
Connection: keep-alive
{
"args": {},
"headers": {
"Accept": "*/*",
"Host": "httpbin.org",
"User-Agent": "curl/7.54.0"
},
"origin": "1.2.3.4, 5.6.7.8",
"url": "https://httpbin.org/get"
}
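The cache file name in the log above looks like the upstream host followed by a timestamp, stored in the system temporary directory. A minimal sketch of that idea follows, assuming this naming scheme; cache_response is a hypothetical helper and not the plugin's real implementation.

```python
import os
import tempfile
import time
from typing import Optional


def cache_response(host: str, raw_response: bytes, directory: Optional[str] = None) -> str:
    """Write a raw HTTP response to <directory>/<host>-<timestamp>.txt and return the path."""
    directory = directory or tempfile.gettempdir()
    path = os.path.join(directory, f"{host}-{time.time()}.txt")
    with open(path, "wb") as cache_file:
        cache_file.write(raw_response)
    return path


if __name__ == "__main__":
    sample = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\n{}"
    print("Cached response at", cache_response("httpbin.org", sample))
```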
ManInTheMiddlePlugin
Modifies upstream server responses.
Start proxy.py as:
$ proxy.py \
--plugins plugin_examples.ManInTheMiddlePlugin
Verify using curl -v -x localhost:8899 http://google.com:
... [redacted] ...
< HTTP/1.1 200 OK
< Content-Length: 28
<
* Connection #0 to host localhost left intact
Hello from man in the middle
Response body Hello from man in the middle is sent by our plugin.
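The Content-Length: 28 header in the output above is simply the byte length of the replacement body. The sketch below builds such a response by hand; build_response is a hypothetical helper written for this illustration, not a proxy.py function.

```python
def build_response(body: bytes, status_line: bytes = b"HTTP/1.1 200 OK") -> bytes:
    """Assemble a minimal raw HTTP response with a matching Content-Length header."""
    content_length = b"Content-Length: " + str(len(body)).encode("ascii")
    return status_line + b"\r\n" + content_length + b"\r\n\r\n" + body


if __name__ == "__main__":
    print(build_response(b"Hello from man in the middle").decode("ascii"))
```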
Plugin Ordering
When using multiple plugins, depending upon plugin functionality, it might be worth considering the order in which plugins are passed on the command line.
Plugins are called in the same order as they are passed. For example, say we are using both FilterByUpstreamHostPlugin and RedirectToCustomServerPlugin. The idea is to drop all incoming http requests for google.com and www.google.com and redirect other http requests to our inbuilt web server.
Hence, in this scenario it is important to use FilterByUpstreamHostPlugin before RedirectToCustomServerPlugin.
If we enable RedirectToCustomServerPlugin before FilterByUpstreamHostPlugin, google requests will also get redirected to inbuilt web server, instead of being dropped.
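The ordering effect can be reproduced with a toy plugin chain. This sketch models plugins as plain functions that receive a request dictionary and return either a possibly modified request or None to drop it. That model is an assumption made for illustration; it is not the real plugin interface, and all names below are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Request = Dict[str, object]
Plugin = Callable[[Request], Optional[Request]]

FILTERED_HOSTS = {"google.com", "www.google.com"}


def filter_by_upstream_host(request: Request) -> Optional[Request]:
    """Drop the request (return None) when the upstream host is filtered."""
    return None if request["host"] in FILTERED_HOSTS else request


def redirect_to_custom_server(request: Request) -> Optional[Request]:
    """Rewrite the upstream to the inbuilt web server on port 8899."""
    return {**request, "host": "localhost", "port": 8899}


def run_chain(plugins: List[Plugin], request: Request) -> str:
    """Call plugins in the order given, stopping as soon as one drops the request."""
    current: Optional[Request] = request
    for plugin in plugins:
        current = plugin(current)
        if current is None:
            return "dropped"
    return f"forwarded to {current['host']}:{current['port']}"


if __name__ == "__main__":
    request = {"host": "google.com", "port": 80}
    print(run_chain([filter_by_upstream_host, redirect_to_custom_server], request))
    print(run_chain([redirect_to_custom_server, filter_by_upstream_host], request))
```

With the filter first, the google.com request is dropped. With the redirect first, the filter only ever sees localhost, so the request is forwarded to the inbuilt web server.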
End-to-End Encryption
By default, proxy.py uses http protocol for communication with clients e.g. curl, browser.
To enable end-to-end encryption using tls / https, first generate certificates:
make https-certificates
Start proxy.py as:
$ proxy.py \
--cert-file https-cert.pem \
--key-file https-key.pem
Verify using curl -x https://localhost:8899 --proxy-cacert https-cert.pem https://httpbin.org/get:
{
"args": {},
"headers": {
"Accept": "*/*",
"Host": "httpbin.org",
"User-Agent": "curl/7.54.0"
},
"origin": "1.2.3.4, 5.6.7.8",
"url": "https://httpbin.org/get"
}
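For context, this is roughly how a Python server can turn a certificate and key pair into a TLS context using the standard ssl module. It is a generic sketch and an assumption about the technique, not a copy of what proxy.py does internally; make_server_tls_context is a hypothetical name.

```python
import ssl


def make_server_tls_context(cert_file: str, key_file: str) -> ssl.SSLContext:
    """Build a server-side TLS context from a PEM certificate and private key."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Raises OSError (or ssl.SSLError) if the files are missing or invalid.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


if __name__ == "__main__":
    # File names match the flags used above; they must exist for this to succeed.
    context = make_server_tls_context("https-cert.pem", "https-key.pem")
    print("TLS context ready:", context.protocol)
```

An accepted client socket could then be wrapped with context.wrap_socket(client_socket, server_side=True) before any HTTP bytes are read.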
TLS Interception
By default, proxy.py doesn't decrypt https traffic between client and server.
To enable TLS interception first generate CA certificates:
make ca-certificates
Let's also enable CacheResponsesPlugin so that we can verify decrypted response from the server. Start proxy.py as:
$ proxy.py \
--plugins plugin_examples.CacheResponsesPlugin \
--ca-key-file ca-key.pem \
--ca-cert-file ca-cert.pem \
--ca-signing-key-file ca-signing-key.pem
Verify using curl -v -x localhost:8899 --cacert ca-cert.pem https://httpbin.org/get
* issuer: C=US; ST=CA; L=SanFrancisco; O=proxy.py; OU=CA; CN=Proxy PY CA; [email protected]
* SSL certificate verify ok.
> GET /get HTTP/1.1
... [redacted] ...
< Connection: keep-alive
<
{
"args": {},
"headers": {
"Accept": "*/*",
"Host": "httpbin.org",
"User-Agent": "curl/7.54.0"
},
"origin": "1.2.3.4, 5.6.7.8",
"url": "https://httpbin.org/get"
}
The issuer line confirms that response was intercepted.
Also verify the contents of cached response file. Get path to the cache file from proxy.py logs.
$ cat /path/to/your/tmp/directory/httpbin.org-1569452863.924174.txt
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Content-Type: application/json
Date: Wed, 25 Sep 2019 23:07:05 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Length: 202
Connection: keep-alive
{
"args": {},
"headers": {
"Accept": "*/*",
"Host": "httpbin.org",
"User-Agent": "curl/7.54.0"
},
"origin": "1.2.3.4, 5.6.7.8",
"url": "https://httpbin.org/get"
}
Voila! If you remove CA flags, encrypted data will be found in the cached file instead of plain text.
Now use CA flags with other plugin examples to make them work for https traffic.
Plugin Developer and Contributor Guide
Everything is a plugin
As you might have guessed by now, in proxy.py everything is a plugin.
- We enabled proxy server plugins using --plugins flag. All the plugin examples were implementing HttpProxyBasePlugin. See documentation of HttpProxyBasePlugin for available lifecycle hooks. Use HttpProxyBasePlugin to modify behavior of http(s) proxy protocol between client and upstream server. Example, FilterByUpstreamHostPlugin.
- We also enabled inbuilt web server using --enable-web-server. Inbuilt web server implements ProtocolHandlerPlugin plugin. See documentation of ProtocolHandlerPlugin for available lifecycle hooks. Use ProtocolHandlerPlugin to add new features for http(s) clients. Example, HttpWebServerPlugin.
- There also is a --disable-http-proxy flag. It disables inbuilt proxy server. Use this flag with --enable-web-server flag to run proxy.py as a programmable http(s) server. HttpProxyPlugin also implements ProtocolHandlerPlugin.
proxy.py Internals
- ProtocolHandler thread is started with the accepted TcpClientConnection. ProtocolHandler is responsible for parsing incoming client request and invoking ProtocolHandlerPlugin lifecycle hooks.
- HttpProxyPlugin which implements ProtocolHandlerPlugin also has its own plugin mechanism. Its responsibility is to establish connection between client and upstream TcpServerConnection and invoke HttpProxyBasePlugin lifecycle hooks.
- ProtocolHandler threads are started by Worker processes.
- --num-workers Worker processes are started by WorkerPool on start-up.
- Worker processes receive TcpClientConnection over a pipe from WorkerPool.
- WorkerPool implements TcpServer abstract class. TcpServer accepts TcpClientConnection.
- WorkerPool ensures full utilization of available CPU cores, for which it dispatches accepted TcpClientConnection to Worker processes in a round-robin fashion.
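Round-robin dispatch as described above can be sketched in a few lines. Here in-process queues stand in for the pipes between WorkerPool and Worker processes, and integers stand in for accepted connections. The class name RoundRobinDispatcher and everything else in the block is hypothetical, not proxy.py code.

```python
import itertools
import queue
from typing import Any, List


class RoundRobinDispatcher:
    """Hand each accepted connection to the next worker inbox in turn."""

    def __init__(self, num_workers: int) -> None:
        if num_workers < 1:
            raise ValueError("at least one worker is required")
        # One inbox per worker; a real pool would use one pipe per process.
        self.inboxes: List["queue.Queue[Any]"] = [queue.Queue() for _ in range(num_workers)]
        self._next_inbox = itertools.cycle(self.inboxes)

    def dispatch(self, connection: Any) -> None:
        next(self._next_inbox).put(connection)


def drain(inbox: "queue.Queue[Any]") -> List[Any]:
    """Collect everything currently waiting in a worker inbox."""
    items = []
    while not inbox.empty():
        items.append(inbox.get())
    return items


if __name__ == "__main__":
    dispatcher = RoundRobinDispatcher(num_workers=2)
    for connection_id in range(5):
        dispatcher.dispatch(connection_id)
    for index, inbox in enumerate(dispatcher.inboxes):
        print(f"worker {index} received", drain(inbox))
```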
Pull Request
Every pull request goes through a set of tests which must pass:
- mypy: Run make lint locally for compliance check. Fix all warnings and errors before sending out a PR.
- coverage: Run make coverage for coverage report. It's ideal to add tests for any critical change. Depending upon the change, it's ok if test coverage falls by `
Our team
Abhinav Singh