Defeating TLS Fingerprinting: Bypassing Firewall Protection for HTTPS Requests
2024-06-11 | Source: hackernoon.com

I was utterly frustrated with scraping HTTP data from a firewall-protected website. Despite using residential proxies from multiple providers, my requests kept getting blocked without any clear reason. Sometimes, the script worked on my local machine, but it would fail when running on a cloud server.

After extensive research, I stumbled upon the concept of TLS fingerprinting. Let me break it down for you:

Understanding TLS Fingerprinting

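When a client sends an HTTPS request, the connection starts with a TLS “Client Hello” message. This message lists the supported TLS versions, the cipher suites (encryption algorithms) in the client’s order of preference, the TLS extensions, and several other parameters. It does not carry the User-Agent, which is an HTTP header sent only after the handshake completes. Taken together, these values form a fingerprint that is characteristic of the client software (a browser, curl, or an HTTP library), which makes it easy to distinguish bots and scripts from legitimate browsers, especially when the fingerprint does not match the browser claimed in the User-Agent header.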
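To make the idea concrete, here is a minimal sketch of one widely used fingerprinting scheme, JA3, which the original write-up does not name. JA3 joins five Client Hello fields into a string and takes its MD5 hash. The `ClientHelloFields` interface, the function names, and the sample values are illustrative assumptions, and the sketch skips details such as filtering GREASE values.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical, simplified view of the Client Hello fields that JA3 hashes.
interface ClientHelloFields {
  tlsVersion: number;        // e.g. 771 = 0x0303 (TLS 1.2)
  cipherSuites: number[];    // IANA cipher suite IDs, in the order offered
  extensions: number[];      // extension type IDs, in the order sent
  supportedGroups: number[]; // elliptic curves
  ecPointFormats: number[];
}

// JA3 input: five comma-separated fields, values within a field joined by '-'.
function ja3String(hello: ClientHelloFields): string {
  return [
    String(hello.tlsVersion),
    hello.cipherSuites.join('-'),
    hello.extensions.join('-'),
    hello.supportedGroups.join('-'),
    hello.ecPointFormats.join('-'),
  ].join(',');
}

// The fingerprint itself is the MD5 hex digest of that string.
function ja3Hash(hello: ClientHelloFields): string {
  return createHash('md5').update(ja3String(hello)).digest('hex');
}

// Illustrative values only, not captured from a real client.
const original: ClientHelloFields = {
  tlsVersion: 771,
  cipherSuites: [49195, 49199, 49196, 49200],
  extensions: [0, 10, 11, 13],
  supportedGroups: [29, 23, 24],
  ecPointFormats: [0],
};

// Same suites, but the last two are swapped.
const reordered: ClientHelloFields = {
  ...original,
  cipherSuites: [49195, 49199, 49200, 49196],
};

console.log(ja3String(original)); // 771,49195-49199-49196-49200,0-10-11-13,29-23-24,0
console.log(ja3Hash(original));
console.log(ja3Hash(reordered));
```

Because the order of the cipher suites is part of the hashed string, swapping two suites yields a different hash. That is the property the cipher shuffling later in this article relies on.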

Check your TLS Fingerprint: https://tls.peet.ws/

Sources: https://www.zenrows.com/blog/what-is-tls-fingerprint

The Search for a Solution

Armed with this knowledge, I dug deeper into bypassing TLS fingerprinting. I couldn’t find a one-stop solution, so I’m sharing my findings.

We’ll write the solution in TypeScript, but the actual request will be made with the curl command, so the approach is easy to port to other programming languages.

Initial Curl Command

Here is the sample curl command I initially executed:

curl --location 'url' \
--header 'accept: application/json, text/javascript, */*; q=0.01' \
--header 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
--header 'cookie: authCookie' \
--header 'sec-ch-ua: "Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"' \
--header 'sec-ch-ua-bitness: "64"' \
--header 'sec-ch-ua-full-version: "120.0.6099.234"' \
--header 'sec-ch-ua-full-version-list: "Not_A Brand";v="8.0.0.0", "Chromium";v="120.0.6099.234", "Google Chrome";v="120.0.6099.234"' \
--header 'sec-fetch-dest: empty' \
--header 'sec-fetch-mode: cors' \
--header 'sec-fetch-site: same-origin' \
--header 'user-agent: userAgent' \
--header 'x-requested-with: XMLHttpRequest'

I was already sending this request through a proxy (the --proxy option is omitted above), but that alone wasn’t good enough. So I added two more parameters, based on the User-Agent my script was generating:

  • TLS Version
  • Cipher Suites

Implementing the Solution

Here is the TypeScript snippet that picks a TLS version flag for the request and, based on that version, selects and shuffles the cipher suites to vary the TLS fingerprint:

const tls12CipherSuites = [
  'ECDHE-ECDSA-AES256-GCM-SHA384',
  'ECDHE-RSA-AES256-GCM-SHA384',
  'ECDHE-ECDSA-AES128-GCM-SHA256',
  'ECDHE-RSA-AES128-GCM-SHA256',
  'ECDHE-ECDSA-AES256-SHA384',
  'ECDHE-RSA-AES256-SHA384',
  'ECDHE-ECDSA-AES128-SHA256',
  'ECDHE-RSA-AES128-SHA256',
  'DHE-RSA-AES256-GCM-SHA384',
  'DHE-RSA-AES128-GCM-SHA256',
  'DHE-RSA-AES256-SHA256',
  'DHE-RSA-AES128-SHA256',
];

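// OpenSSL-style names for suites that can be negotiated with TLS 1.1.
// Plain RSA key exchange has no 'RSA-' prefix, and 3DES is spelled 'DES-CBC3'.
const tls11CipherSuites = [
  'AES128-SHA',
  'AES256-SHA',
  'DES-CBC3-SHA',
  'ECDHE-RSA-AES128-SHA',
  'ECDHE-RSA-AES256-SHA',
  'ECDHE-ECDSA-AES128-SHA',
  'ECDHE-ECDSA-AES256-SHA',
  'DHE-RSA-AES128-SHA',
  'DHE-RSA-AES256-SHA',
  'DHE-RSA-DES-CBC3-SHA',
];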

type SuiteMap = {
  [key: string]: string[];
};

const versionCipherMap: SuiteMap = {
  'v1.1': tls11CipherSuites,
  'v1.2': tls12CipherSuites,
};

// Keep the first three suites in place, shuffle the rest (Fisher-Yates),
// and return the result as a colon-separated list for curl's --cipher option.
function shuffledSuites(inputArray: string[]) {
  const firstThree = inputArray.slice(0, 3);
  const rest = inputArray.slice(3);
  for (let i = rest.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [rest[i], rest[j]] = [rest[j], rest[i]];
  }
  return firstThree.concat(rest).join(':');
}

// Pick a TLS version at random and return [curl version flag, cipher list].
function provideTLSAndSuites() {
  const tlsVersions = Object.keys(versionCipherMap);
  const selectedVersion = tlsVersions[Math.floor(Math.random() * tlsVersions.length)];
  // The keys are 'v1.1' and 'v1.2', so this yields '--tlsv1.1' or '--tlsv1.2'.
  const tlsVersionString = `--tls${selectedVersion}`;
  const shuffledSuitesString = shuffledSuites(versionCipherMap[selectedVersion]);
  return [tlsVersionString, shuffledSuitesString];
}

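The function returns a pair: a curl version flag such as --tlsv1.2 and a colon-separated cipher list. The first three suites keep their positions and only the remaining ones are shuffled, so the list differs from one request to the next.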
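One caveat: curl treats --tlsv1.1 and --tlsv1.2 as a minimum version, so the flag alone can still negotiate a newer protocol with the server. If the goal is to pin the version, the flag can be paired with --tls-max (available since curl 7.54.0). The helper below is a small sketch of that pairing. The name `pinnedTlsFlags` is hypothetical and not part of the original script.

```typescript
// Keys match the ones used in versionCipherMap above.
type TlsVersionKey = 'v1.1' | 'v1.2';

// Hypothetical helper: build curl flags that set both the minimum
// (--tlsv1.x) and the maximum (--tls-max 1.x) TLS version.
function pinnedTlsFlags(version: TlsVersionKey): string[] {
  const numeric = version.slice(1); // 'v1.2' -> '1.2'
  return [`--tls${version}`, '--tls-max', numeric];
}

console.log(pinnedTlsFlags('v1.2').join(' ')); // --tlsv1.2 --tls-max 1.2
```

Also note that curl's --cipher option only covers TLS 1.2 and earlier. TLS 1.3 suites are configured separately with --tls13-ciphers.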

Final Curl Command

We’ll fetch the TLS version and cipher suites for the request and embed them in the curl command:

const [tlsVersionString, cipherSuitesString] = provideTLSAndSuites();

Here is the resulting curl command, where tlsVersionString and cipherSuitesString stand for the values returned above:

curl --connect-timeout 30 --max-time 50 'url' \
 --proxy 'proxy_url' tlsVersionString --cipher cipherSuitesString \
 -H 'accept: application/json, text/javascript, */*; q=0.01' \
 -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
 -H 'cookie: authCookie' \
 -H 'sec-ch-ua: "Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"' \
 -H 'sec-ch-ua-bitness: "64"' \
 -H 'sec-ch-ua-full-version: "120.0.6099.234"' \
 -H 'sec-ch-ua-full-version-list: "Not_A Brand";v="8.0.0.0", "Chromium";v="120.0.6099.234", "Google Chrome";v="120.0.6099.234"' \
 -H 'sec-fetch-dest: empty' \
 -H 'sec-fetch-mode: cors' \
 -H 'sec-fetch-site: same-origin' \
 -H 'user-agent: userAgent' \
 -H 'x-requested-with: XMLHttpRequest'
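The command above can also be assembled programmatically. The following is a rough sketch, assuming Node.js and a curl binary on the PATH. The `CurlRequest` interface and the functions `buildCurlArgs` and `runCurl` are hypothetical names introduced for illustration, not the author's code.

```typescript
import { execFile } from 'node:child_process';

// Hypothetical option bag; none of these names come from the original script.
interface CurlRequest {
  url: string;
  proxy: string;
  tlsVersionFlag: string; // e.g. '--tlsv1.2'
  cipherSuites: string;   // colon-separated list for --cipher
  headers: Record<string, string>;
}

// Build curl's argument vector. Passing an array instead of one shell string
// avoids quoting problems with header values that contain quotes.
function buildCurlArgs(req: CurlRequest): string[] {
  const args = [
    '--connect-timeout', '30',
    '--max-time', '50',
    req.url,
    '--proxy', req.proxy,
    req.tlsVersionFlag,
    '--cipher', req.cipherSuites,
  ];
  for (const [name, value] of Object.entries(req.headers)) {
    args.push('-H', `${name}: ${value}`);
  }
  return args;
}

// Run curl without a shell and resolve with its standard output.
function runCurl(req: CurlRequest): Promise<string> {
  return new Promise((resolve, reject) => {
    execFile('curl', buildCurlArgs(req), (error, stdout) => {
      if (error) reject(error);
      else resolve(stdout);
    });
  });
}
```

A call would pass the two values from provideTLSAndSuites() as tlsVersionFlag and cipherSuites, along with the same headers shown in the command above.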

Conclusion

Hopefully, this will allow your request to make it through the firewall.

If this helped you or you enjoyed the content, don’t forget to clap and follow for more such content. Happy scraping!


Article source: https://hackernoon.com/defeating-tls-fingerprinting-bypassing-firewall-protection-for-https-requests?source=rss