Understand the weird Socks 5 protocol and HTTP proxy

Preface

Let's work through the Socks 5 proxy protocol together. To make it easier to follow, we will explain the HTTP proxy protocol alongside it for comparison. We first explain each protocol and answer some common questions, then discuss how these two protocols were used in the past and how they are used today.

Proxied content generally falls into two kinds: TCP-based content and UDP-based content.

HTTP proxy

Let's start with HTTP proxies. There are actually two types. One is the reverse proxy, such as nico and nginx, which we will not cover here. The other is the tunnel proxy, which is the kind we focus on. It can proxy any TCP-based content. There is a common misunderstanding here: many people think an HTTP proxy can only proxy http:// content, but it can also proxy https:// and any other TCP-based content. It cannot proxy UDP content, and we will see why later.

Suppose there is an HTTP Proxy Server

1.2.3.4:8010

The address that a client wants to proxy

google.com:443

Let's assign roles. Here, the client may be a Chrome browser or a curl command. The HTTP Proxy Server is the server.

The interaction between client and server is as follows

Step 1: Client -> Server, send the address to be proxied

CONNECT google.com:443 HTTP/1.1

Step 2: Server -> Client, respond with the result

HTTP/1.1 200 Connection established

Then the tunnel is established, and the client and server can transmit content to each other through this tunnel.
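
The two steps above can be sketched in Python. This is a minimal sketch, not a complete client: the function names are illustrative, the Host header is added because HTTP/1.1 requests carry one even though the simplified exchange above omits it, and IPv6 literals (which would need brackets) are not handled.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Step 1: the request a client sends to an HTTP tunnel proxy."""
    target = f"{host}:{port}"
    lines = [
        f"CONNECT {target} HTTP/1.1",
        f"Host: {target}",  # assumption: not shown in the simplified exchange
        "",                 # the empty line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(response_head: bytes) -> bool:
    """Step 2: True if the proxy's status line carries a 2xx status code."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code.startswith("2")
```

After a 2xx response, every byte the client writes to the same TCP connection is forwarded to google.com:443 unchanged, and vice versa.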

Question 1: Where is the domain name resolved?

The target address is passed from the client to the server. If the client passes a domain name, the domain name is resolved on the server. The client may also resolve the domain name locally and pass the resulting IP address to the server. So where the domain name is resolved depends entirely on the client that initiates the request.

Question 2: Cannot proxy UDP content

As the interaction steps show, nothing in the exchange indicates whether the target is TCP or UDP, so the protocol can only proxy TCP content. Some readers will say that the server could be forced to treat the content as UDP, but then it would effectively become a different proxy protocol.

Question 3: The transmission content is not encrypted

This is from the point of view of the HTTP Proxy Server, which forwards the content it receives without any additional processing. Of course, if the client is requesting HTTPS content, that content has already been encrypted with TLS on the client side. But, once again, from the point of view of the HTTP Proxy Server, the HTTP proxy protocol is an unencrypted proxy protocol.

Socks 5 proxy

Now let's turn to the Socks 5 proxy. It can proxy any TCP-based content as well as UDP content. In practice, however, many Socks 5 proxy servers do not implement UDP support. You can use the brook testsocks5 command to test whether a Socks 5 Server supports UDP.

Suppose there is a Socks 5 Server

1.2.3.4:1080

The TCP address that a client wants to proxy

google.com:443

Let's assign roles. Here, a client may be a Chrome browser, a curl command, or a game client. The Socks 5 Server is the server.

The interaction between the client and the server is as follows. For the convenience of understanding, the original protocol is abstracted into human-readable content.

Step 1: Client -> Server (TCP), ask for the protocol version and authentication method

Ask for Version and Auth method

Step 2: Server -> Client (TCP), respond with the protocol version and authentication method

Response Version and Auth method

Step 3: Client -> Server (TCP), if the server needs authentication, send the authentication information

Auth Info

Step 4: Server -> Client (TCP), if the server needs authentication, respond with the authentication result

Auth Success

Step 5: Client -> Server (TCP), send the address to be proxied

TCP google.com:443

Step 6: Server -> Client (TCP), respond with the result

Response OK

Then the tunnel is established, and the client and server can transmit content to each other through this tunnel. Note that the entire exchange between the client and the server runs over TCP.
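
For readers who want to see the bytes behind the human-readable steps, here is a minimal sketch of the messages as laid out in RFC 1928 (Socks 5) and RFC 1929 (username/password authentication). The function names are illustrative, only the domain-name address type is shown, and no sockets are opened.

```python
import struct

VERSION = 0x05            # Socks protocol version
NO_AUTH = 0x00            # method: no authentication required
USERNAME_PASSWORD = 0x02  # method: username/password (RFC 1929)
CMD_CONNECT = 0x01        # command: open a TCP tunnel
ATYP_DOMAIN = 0x03        # address type: domain name


def greeting(methods=(NO_AUTH, USERNAME_PASSWORD)) -> bytes:
    """Step 1: offer the authentication methods the client supports."""
    return bytes([VERSION, len(methods), *methods])


def chosen_method(reply: bytes) -> int:
    """Step 2: the server picks one method (0xFF means none is acceptable)."""
    if len(reply) != 2 or reply[0] != VERSION:
        raise ValueError("not a Socks 5 method reply")
    return reply[1]


def auth_request(username: str, password: str) -> bytes:
    """Step 3: username/password sub-negotiation, which has its own version 0x01."""
    user, pwd = username.encode(), password.encode()
    return bytes([0x01, len(user)]) + user + bytes([len(pwd)]) + pwd


def auth_ok(reply: bytes) -> bool:
    """Step 4: a status byte of 0x00 means the credentials were accepted."""
    return len(reply) == 2 and reply[1] == 0x00


def connect_request(domain: str, port: int) -> bytes:
    """Step 5: ask the server to open a TCP tunnel to domain:port."""
    name = domain.encode("ascii")
    head = bytes([VERSION, CMD_CONNECT, 0x00, ATYP_DOMAIN, len(name)])
    return head + name + struct.pack("!H", port)


def request_succeeded(reply: bytes) -> bool:
    """Step 6: the second byte (REP) is 0x00 on success."""
    return len(reply) >= 2 and reply[0] == VERSION and reply[1] == 0x00
```

A client would write each request to the TCP connection to 1.2.3.4:1080 and read the matching reply before moving to the next step.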

The UDP address a client wants to proxy

google.com:443

The interaction between the client and the server is as follows, and for the convenience of understanding, the original protocol is abstracted into human-readable content.

Step 1: Client -> Server (TCP), ask for the protocol version and authentication method

Ask for Version and Auth method

Step 2: Server -> Client (TCP), respond with the protocol version and authentication method

Response Version and Auth method

Step 3: Client -> Server (TCP), if the server needs authentication, send the authentication information

Auth Info

Step 4: Server -> Client (TCP), if the server needs authentication, respond with the authentication result

Auth Success

Step 5: The client prepares a src address, which will be used to interact with the server over UDP

Prepare a src UDP address

Step 6: Client -> Server (TCP), tell the server that it wants to use src for UDP communication

I want to send UDP data from src

Step 7: The server prepares a dst address, which will receive the UDP data sent by the client

Prepare a UDP dst server address

Note: Our Socks 5 Server address is 1.2.3.4:1080, but the UDP Server prepared by the server does not have to be 1.2.3.4:1080. It can be any address, such as 1.2.3.4:1081, or even a server on another machine, such as 5.6.7.8:6789.

Step 8: Server -> Client (TCP), tell the client that it can send UDP data to the dst address

Please send UDP data to dst

Step 9: Client src -> server dst (UDP), start proxying UDP data

Please send this UDP data to google.com:443

Then the UDP tunnel is established, and the client and server can transmit UDP content to each other through this tunnel.
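
Steps 6, 8, and 9 can also be sketched at the byte level, following the UDP ASSOCIATE request and the UDP request header described in RFC 1928. This is a minimal sketch: the function names are illustrative, only IPv4 is handled for the src and dst addresses, and fragmentation (the FRAG byte) is left at zero.

```python
import ipaddress
import struct


def udp_associate_request(src_ip: str, src_port: int) -> bytes:
    """Step 6 (over TCP): announce the UDP src address the client will send from."""
    head = bytes([0x05, 0x03, 0x00, 0x01])  # VER, CMD=UDP ASSOCIATE, RSV, ATYP=IPv4
    return head + ipaddress.IPv4Address(src_ip).packed + struct.pack("!H", src_port)


def parse_associate_reply(reply: bytes) -> tuple:
    """Step 8 (over TCP): extract the dst address the client must send UDP data to."""
    version, rep, _rsv, atyp = reply[:4]
    if version != 0x05 or rep != 0x00:
        raise ValueError("UDP ASSOCIATE was refused")
    if atyp != 0x01:
        raise ValueError("this sketch only parses IPv4 replies")
    ip = str(ipaddress.IPv4Address(reply[4:8]))
    (port,) = struct.unpack("!H", reply[8:10])
    return ip, port


def wrap_datagram(domain: str, port: int, payload: bytes) -> bytes:
    """Step 9 (over UDP): prefix the payload with its final destination."""
    name = domain.encode("ascii")
    header = b"\x00\x00"                     # RSV, two reserved bytes
    header += b"\x00"                        # FRAG, 0 means not fragmented
    header += bytes([0x03, len(name)]) + name  # ATYP=domain, length, name
    header += struct.pack("!H", port)
    return header + payload
```

The client sends the result of wrap_datagram from src to dst with an ordinary UDP socket; the server strips the header and forwards the payload to google.com:443.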

Question 1: Where is the domain name resolved?

The target address is passed from the client to the server. If the client passes a domain name, the domain name is resolved on the server. The client may also resolve the domain name locally and pass the resulting IP address to the server. So where the domain name is resolved depends entirely on the client that initiates the request.
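
In Socks 5 this choice is visible in the address type byte of the request. A minimal sketch, with illustrative function names, of how a client might encode the destination: an IP literal is sent as an IP address, and anything else is sent as a domain name that the server has to resolve.

```python
import ipaddress
import struct

ATYP_IPV4 = 0x01
ATYP_DOMAIN = 0x03
ATYP_IPV6 = 0x04


def encode_destination(host: str, port: int) -> bytes:
    """Encode ATYP + DST.ADDR + DST.PORT for a Socks 5 request."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: send the name itself, length-prefixed.
        name = host.encode("ascii")
        return bytes([ATYP_DOMAIN, len(name)]) + name + struct.pack("!H", port)
    atyp = ATYP_IPV4 if ip.version == 4 else ATYP_IPV6
    return bytes([atyp]) + ip.packed + struct.pack("!H", port)


def server_must_resolve(encoded: bytes) -> bool:
    """True when the destination was sent as a domain name."""
    return encoded[0] == ATYP_DOMAIN
```

curl exposes the same choice in its proxy scheme: with socks5:// the hostname is resolved locally, and with socks5h:// it is passed to the proxy for resolution.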

Question 2: UDP can be proxied, but TCP is still used for the preliminary communication

In other words, Socks 5 can proxy UDP, but before it starts proxying UDP, it still needs a series of exchanges over TCP.

Question 3: When proxying UDP, it may be necessary to keep the TCP connection uninterrupted

As we saw earlier, proxying UDP requires a TCP connection for the preliminary exchange. According to the standard protocol description, there are two possibilities for this TCP connection while UDP is being proxied: it is closed after the exchange, or it stays open after the exchange. The server may decide whether to allow or refuse UDP proxying based on whether the TCP connection is still open. So unless the client and server have agreed otherwise in advance, the client should keep this TCP connection open.

Question 4: Can I use brook relay to relay a Socks 5 Server?

We know that brook relay can relay any TCP-based or UDP-based service; you only need to specify a from address and a to address. But recall Step 7 of the Socks 5 UDP flow above: the Socks 5 Server only needs to announce the relay server's address to the client as the UDP Server address. If you are using brook socks5, you can use the --socks5ServerIP parameter to specify the IP of the UDP Server.

Question 5: The transmission content is not encrypted

This is from the point of view of the Socks 5 Server, which forwards the content it receives without any additional processing. Of course, if the client is requesting HTTPS content, that content has already been encrypted with TLS on the client side. But, once again, from the point of view of the Socks 5 Server, the Socks 5 proxy protocol is an unencrypted proxy protocol.

Question 6: How should we view protocol standards?

It can take a long time for a protocol to go from draft to standard, and adhering to standards helps reduce discrepancies and fragmentation.

Question 7: NAT type

The NAT type usually comes up when discussing UDP proxying. There are four NAT types: Full cone NAT, Address-Restricted cone NAT, Port-Restricted cone NAT, and Symmetric NAT. Symmetric NAT is the most secure type, so brook currently uses it. To learn more about NAT, you can search for it yourself.
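
To make the difference between the four types concrete, here is a toy model. It is not brook's implementation, and the class name, the kind strings, and the starting port 40000 are all made up for illustration. It captures two things: which key the NAT uses to pick an external port for outbound packets, and which inbound packets it lets through.

```python
class Nat:
    """Toy NAT that maps internal (ip, port) sockets to external ports."""

    def __init__(self, kind: str):
        self.kind = kind        # "full", "address", "port", or "symmetric"
        self.mappings = {}      # mapping key -> external port
        self.contacted = {}     # external port -> set of remote (ip, port)
        self.next_port = 40000  # arbitrary first external port

    def send(self, internal, remote) -> int:
        """Outbound packet: return the external port it leaves from."""
        # Symmetric NAT allocates a port per (internal, remote) pair;
        # the cone types reuse one port per internal socket.
        key = (internal, remote) if self.kind == "symmetric" else internal
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        port = self.mappings[key]
        self.contacted.setdefault(port, set()).add(remote)
        return port

    def accepts(self, external_port: int, remote) -> bool:
        """Inbound packet from remote to external_port: is it let through?"""
        peers = self.contacted.get(external_port)
        if peers is None:
            return False        # no mapping exists for this port
        if self.kind == "full":
            return True         # anyone may use an existing mapping
        if self.kind == "address":
            return any(ip == remote[0] for ip, _ in peers)
        return remote in peers  # "port" and "symmetric": exact ip and port
```

With a symmetric NAT, a packet sent to a second destination leaves from a different external port, and only the destination that a port was opened for can send packets back through it.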

HTTP proxy and Socks 5 proxy usage scenarios

A long time ago, the Internet did not pay much attention to encryption. Back then, people set up an HTTP proxy or Socks 5 proxy on a remote machine, configured the remote proxy address in their local applications, and used it happily.

Then one day people realized that the data they sent to the remote end through these two proxy protocols was not encrypted, so anyone along the path could see what they transmitted. That was not a pleasant thought. So protocols such as the brook protocol, which is strongly encrypted and undetectable, were born.

For historical reasons, many applications such as the Chrome browser support configuring a Socks 5 Server, so a proxy mode like the one in Brook client creates a local Socks 5 Server. This also means that the usage scenarios of HTTP proxy and Socks 5 proxy have largely shifted from the earlier remote mode to a local mode.

Scope of HTTP proxy and Socks 5 proxy

Earlier we mentioned that these two protocols have moved from remote mode to local mode. Wherever they run, they play the server role, and whether a given client actually uses the proxy depends entirely on that client. For example, if a system proxy is configured, Chrome will go through the proxy, but the terminal will not.

The same holds on mobile phones. You may have configured a system proxy on the phone, but each app, as the client, can choose whether to use it. Many packet capture tools rely on the system proxy mode, which apps can easily bypass.
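
The same opt-in behavior is easy to see in code. A minimal sketch using Python's urllib.request, where make_opener is an illustrative name: a client that honors the configured proxy asks getproxies() for the settings, while a client that passes an empty dictionary to ProxyHandler bypasses them entirely.

```python
import urllib.request


def make_opener(honor_proxy_settings: bool):
    """Return (proxy map, opener) for a client that honors or ignores proxy settings."""
    if honor_proxy_settings:
        # Picks up http_proxy / https_proxy and similar settings.
        proxies = urllib.request.getproxies()
    else:
        # An empty dictionary disables proxy detection for this client.
        proxies = {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    return proxies, opener
```

For example, with the environment variable http_proxy set to http://127.0.0.1:8010 (a hypothetical local proxy), make_opener(True) returns a proxy map containing that address, while make_opener(False) returns an empty map and its opener connects directly.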

If you need to capture packets, you can try my mitmproxy helper and Wireshark Helper packet capture tools. Both assemble protocols layer by layer from the IP packets on a virtual network card rather than relying on the system proxy mode.
