Clickstream tracking of users of the Tor browser – A research paper

The growing significance of web analytics over the past few years has been accompanied by an enormous growth in the number of web users concerned about preserving their online anonymity. The Tor browser is often considered the best anonymous browsing tool available, as evidenced by the more than 2.5 million people who use it daily. Most of Tor’s terms and options are rather difficult to understand, yet the vast majority of Tor users believe that the browser offers them more anonymity protection than it is actually capable of providing.

A recently published paper argues that the Tor browser provides very little privacy protection when used with its default settings. To approach total anonymity, users of the Tor browser must therefore exercise extra care. Let’s take a look at some of the ways presented in the paper for tracking the clickstream of Tor users.

Clickstream tracking via timing and traffic correlation:

Tor users can be vulnerable to deanonymization through end-to-end timing attacks. An adversary who monitors the traffic sent to the initial relay node, as well as the traffic leaving the final relay node, can use statistical analysis to identify which streams belong to the same circuit. Consequently, Tor does not technically provide total anonymity for its users. The adversary can sniff both the user’s IP address and the destination IP of the observed traffic, and can then track the user’s clickstream via correlation attacks. Interestingly, the adversary need not control the entry and exit nodes of a Tor circuit to correlate the traffic streams crossing them; being able to observe the traffic is enough.
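To make the idea of correlating traffic at both ends of a circuit concrete, here is a minimal sketch. It is not the method used in the paper: it assumes the adversary has already recorded packet timestamps (in seconds) for one flow near the entry node and for several candidate flows near exit nodes, and it simply compares per-second packet counts using Pearson correlation. The function names and the flow labels are hypothetical.

```python
from statistics import mean


def bin_counts(timestamps, window, duration):
    """Count packets per fixed-size time window over [0, duration)."""
    n_bins = int(duration // window)
    counts = [0] * n_bins
    for t in timestamps:
        if t < 0:
            continue
        idx = int(t // window)
        if idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_match(entry_flow, exit_flows, window=1.0, duration=10.0):
    """Return the exit-side flow whose timing pattern best matches the entry-side flow."""
    entry_bins = bin_counts(entry_flow, window, duration)
    scores = {
        name: pearson(entry_bins, bin_counts(flow, window, duration))
        for name, flow in exit_flows.items()
    }
    return max(scores, key=scores.get), scores


# Illustrative, made-up timestamps: exit_A is the entry flow delayed by 0.3 s.
entry = [0.1, 0.2, 0.3, 3.1, 3.2, 7.5]
candidates = {
    "exit_A": [0.4, 0.5, 0.6, 3.4, 3.5, 7.8],
    "exit_B": [1.1, 2.2, 4.5, 5.5, 6.1, 9.0],
}
match, scores = best_match(entry, candidates)
```

Real attacks have to cope with jitter, padding and many concurrent flows, so they use far more robust statistics than this; the sketch only shows why observing both ends is sufficient in principle.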

Sometimes, tracking the clickstream of a user does not require any complex statistical analysis. For example, a Harvard University student was caught sending fake bomb threats to get out of an exam. The student sent the emails through Guerrilla Mail, a disposable email address provider, via the Tor browser. Guerrilla Mail adds the sender’s IP address to every message it sends, which in this case identified the Tor exit node the user was connecting through. Knowing that Tor was involved, investigators only had to check who had been using Tor on the university network at the time.

Clickstream tracking via traffic correlation is relatively easy to conduct, especially when the anonymity set (the number of users running the Tor client) is small. In other words, when only a few clients on a given local network are using Tor, deanonymizing them is a relatively simple task. More sophisticated attacks require more complex statistical analysis of traffic and timing. Recent experimental studies have shown that such techniques can help track the clickstream of a large percentage of Tor browser users and visitors of Tor hidden services.

Deanonymization and tracking clickstream via practical side channel attacks (Torben):

This is a unique form of deanonymization attack, named Torben. The technique takes an approach that is more reliable than timing and traffic correlation attacks and is much less intrusive. The attack relies on the interplay of two properties. First, web pages loaded via the Tor browser can be easily manipulated to load scripts from untrusted origins. Second, even though Tor encrypts the loaded content, a low latency anonymization circuit is ineffective at hiding the size of request-response pairs. The attack was first described by a group of researchers from the University of Göttingen, Germany, who exploited this interplay to create a side channel in the Tor communication circuit. The side channel transmits short markers of web pages, which expose the pages a client visited using the Tor browser. In an experimental evaluation involving 60,000 web pages, the attack detected web page markers, and thus tracked the clickstream of Tor users, with 91% accuracy.
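The size-based side channel can be illustrated with a toy encoder and decoder. This is only a sketch of the general idea, not the Torben implementation: the cell size, the four size levels and the two-bits-per-response encoding are all assumptions chosen for illustration. A script on the page requests responses of chosen sizes, and an observer who sees only encrypted traffic recovers the marker from the sizes.

```python
import math

CELL = 512  # assumed fixed transport cell size in bytes (illustrative)
# Hypothetical response sizes for the 2-bit symbols 00, 01, 10, 11.
LEVELS = [2 * CELL, 6 * CELL, 10 * CELL, 14 * CELL]


def encode_marker(marker_bits):
    """Map a bit string of even length to response sizes, two bits per response."""
    if len(marker_bits) % 2 != 0:
        raise ValueError("marker must have an even number of bits")
    return [
        LEVELS[int(marker_bits[i:i + 2], 2)]
        for i in range(0, len(marker_bits), 2)
    ]


def observe(sizes, overhead=0):
    """Sizes as seen on the wire: payload plus overhead, rounded up to whole cells."""
    return [math.ceil((s + overhead) / CELL) * CELL for s in sizes]


def decode_marker(observed_sizes):
    """Recover the bit string by snapping each observed size to the nearest level."""
    bits = []
    for size in observed_sizes:
        symbol = min(range(len(LEVELS)), key=lambda k: abs(LEVELS[k] - size))
        bits.append(format(symbol, "02b"))
    return "".join(bits)


marker = "0110110001"
recovered = decode_marker(observe(encode_marker(marker), overhead=300))
```

The levels are spaced several cells apart so that moderate overhead or padding does not push an observed size closer to a neighbouring level, which is why encryption alone does not close the channel.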

Failure of operational security:

It is easy to track users by monitoring their patterns of behavior. This is relatively simple to accomplish for users who neglect to use a bridge to connect to the Tor network. The method involves following the browsing behavior of users linked to the same aliases across multiple forums, social networks, etc. This is how the identity of the mastermind behind Silk Road, Ross Ulbricht, was revealed. Ulbricht made the big mistake of reusing aliases, such as “Dread Pirate Roberts” (DPR) and “frosty”, on multiple forums and on the Silk Road marketplace itself.

Recent experiments have shown that 10 web addresses might be all that is needed to identify whom a Tor clickstream belongs to. The clickstream is identified by matching account aliases and other online data found in it against publicly available data. The resulting stream can be accurate to the point that it reflects everything a user has been doing, minute by minute.
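The matching step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure from the experiments: it assumes the aliases seen in a clickstream and the aliases on public profiles are already available as plain strings, and it links them by exact match after trimming whitespace and lowercasing. All names in the example are placeholders.

```python
def normalise(alias):
    """Trim whitespace and lowercase so trivially different spellings compare equal."""
    return alias.strip().lower()


def link_clickstream(observed_aliases, public_profiles):
    """Return the profiles that share at least one alias with the clickstream."""
    observed = {normalise(a) for a in observed_aliases}
    matches = {}
    for profile, aliases in public_profiles.items():
        shared = observed & {normalise(a) for a in aliases}
        if shared:
            matches[profile] = sorted(shared)
    return matches


# Placeholder data: one alias reused between the clickstream and profile_1.
clickstream_aliases = ["Alias_One ", "market_handle"]
profiles = {
    "profile_1": ["alias_one", "forum_name"],
    "profile_2": ["unrelated_user"],
}
linked = link_clickstream(clickstream_aliases, profiles)
```

A single reused alias is enough to produce a link here, which is the point of the Ulbricht example: the weakness lies in the user’s habits, not in the network.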

Clickstream tracking via modified exit/DoS node:

This form of deanonymization attack uses five components: a modified exit node, a modified DoS node, a lightweight DoS web server, client-side JS for measuring latency, and an instrumentation client to receive data. The attack proceeds as follows:

– The JS ping code is injected by the exit node into the HTML response.

– As the user browses as per usual, the JS will continue to “phone home.”

– As the attacker continues measuring, a DoS attack strains the possible initial hop(s).

– If no significant variance is detected, another node is selected from the candidate nodes and the attack sequence restarts.

– Once a sufficient change is detected in the measurements, the entry node is identified, which deanonymizes the user and aids in tracking their clickstream.

This attack method helps identify the whole path of a connection through the Tor network. The attack uses bandwidth multiplication, which makes it possible for low bandwidth connections to DoS high bandwidth connections.
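The decision logic in the steps above can be illustrated offline. The sketch below only analyses latency samples that are assumed to have been collected already, one list per candidate node while that node was under load; it generates no traffic. The three-standard-deviation threshold, the function names and the relay labels are assumptions for illustration, not values from the paper.

```python
from statistics import mean, stdev


def latency_shift(baseline, under_load):
    """Change in mean latency, expressed in baseline standard deviations."""
    spread = stdev(baseline)
    difference = mean(under_load) - mean(baseline)
    if spread == 0:
        return float("inf") if difference > 0 else 0.0
    return difference / spread


def find_entry_node(baseline, samples_by_candidate, threshold=3.0):
    """Return the first candidate whose load coincides with a clear latency rise."""
    for node, samples in samples_by_candidate.items():
        if latency_shift(baseline, samples) >= threshold:
            return node
    return None  # no significant change: keep trying other candidates


# Made-up latency samples in milliseconds.
baseline_ms = [100, 102, 98, 101, 99]
measurements = {
    "relay_A": [101, 99, 100, 102, 98],    # unchanged: not on the user's path
    "relay_B": [140, 150, 145, 138, 152],  # clear rise: likely the entry node
}
suspect = find_entry_node(baseline_ms, measurements)
```

Returning `None` corresponds to the step where no significant variance is detected and the attacker moves on to the next candidate.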

Clickstream tracking via BGP:

Experimental studies have shown that Tor is vulnerable to Autonomous Systems (ASes) that carry Tor traffic, thanks to their effective eavesdropping capabilities. A malicious AS, or a group of colluding ASes, that sits both between a Tor user and the entry relay node and between the exit relay node and the destination can conduct timing analysis to deanonymize Tor users. AS level adversaries are very powerful for several reasons. Firstly, routine BGP routing changes can alter the number of ASes that can effectively track the clickstream of Tor users. Secondly, ASes can manipulate BGP announcements to place themselves on the paths entering and exiting relay nodes. Thirdly, an AS can perform timing analysis even if it can monitor traffic in only a single direction between the entry node and the exit node. It was shown that asymmetric routing increases the effectiveness of ASes in tracking the clickstream of a Tor user.
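A small sketch shows why asymmetric routing helps an AS level adversary. It assumes the AS paths for each segment are already known (for example from routing data) and simply intersects the sets of ASes seen on the entry side and on the exit side. The function names are hypothetical, and the AS numbers come from the private-use range, so they do not refer to real networks.

```python
def segment_ases(forward_path, reverse_path=()):
    """ASes that see a segment in at least one direction."""
    return set(forward_path) | set(reverse_path)


def ases_able_to_correlate(entry_fwd, exit_fwd, entry_rev=(), exit_rev=()):
    """ASes that appear on both the client-to-entry and exit-to-destination segments."""
    entry_side = segment_ases(entry_fwd, entry_rev)
    exit_side = segment_ases(exit_fwd, exit_rev)
    return entry_side & exit_side


# Forward and reverse paths differ because routing is asymmetric.
entry_forward = [64512, 64513, 64514]  # client -> entry node
entry_reverse = [64514, 64520, 64512]  # entry node -> client
exit_forward = [64530, 64531, 64532]   # exit node -> destination
exit_reverse = [64532, 64520, 64530]   # destination -> exit node

forward_only = ases_able_to_correlate(entry_forward, exit_forward)
with_reverse = ases_able_to_correlate(
    entry_forward, exit_forward, entry_reverse, exit_reverse
)
```

With forward paths alone no AS sees both ends in this example, but once reverse paths are included AS 64520 does, which matches the observation that watching a single direction at each end can be enough.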

Final thoughts:

The paper presented multiple means of tracking the clickstream of Tor users. It is worth mentioning that the biggest weakness boosting the success of deanonymization attacks is the user. Users should be aware of techniques that can increase their privacy on Tor, such as using a bridge, disabling JS, and avoiding Windows, among others. The Tor Project continuously offers users detailed guidelines and tutorials to help them maximize their privacy and protect their online anonymity.

3 comments

  1. Using a bridge is not anon, the creator of Bridgeobfs4 Yawning Angel thinks it is more a simple joke, only good for simple anon.

    Tor should be Configured as

    #UseEntryGuards 0

    Tor default Guard Node can keep history of long period of time for ASes AI.

    The bigger problem is every Academic does research to break Tor instead of developing Tor.

    So no new bridge types etc.

  2. where in the article does it explain why microsoft os is shit? I agree but still

    • security reseacher

      I’ve read the research paper.

      This research paper doesn’t have real content, it looks like a blogpost, not something you would like to read in a research journal. The fonts are large and colorful, with magazine-like typesetting. And the whole paper has 18 pages, yet most pages are just some background introduction to various web tracking techniques, only 5 pages are talking about how the tracking applies to Tor Browser.

      What does it say?

      A. They don’t even have an idea of how frequently the IP address of the exit node is changed. Instead of reading the source code properly to find out, or citing the relevant discussion on the Tor mailing list, they wrote a Python script to check, seriously? It makes the paper look like it was written by someone completely new to Tor without a clue.

      B. It says Tor/Browser will always use the same IP address to connect a website, and it only changes the circuit every ~8 minutes, so the website is able to track the webpages the user has clicked.

      C. Since the Tor Browser accepts first-party cookies, it’s possible to track user’s activities by a website they are visiting. It’s also possible to track Tor user’s activities by using links with tracking identifiers.

      D. It says since Tor Browser caches images, it’s possible to use the E-tags as a tracking identifier.

      So what is the proposed solution?

      First, a Tor user should often click the “new circuit” button (with a big screenshot telling you how to do it). Also, Tor Browser should disable memory caching.

      Then I finished reading the paper.

      All the paper says, was if you use Tor Browser to visit a website, the website is able to identify the webpages you clicked are from the same person. And if there is an ads network, it’s possible to crosslink the sites, too.

      Paper finished. WTF?! Seriously? Groundbreaking research, written by 2 PhDs specialized “in network and application layer security, IoT security, machine learning, and user privacy and anonymity” who are surprised to discover 2+2=4?

      I can write this paper too, and my paper will be better than this, as I will point out that there is also a “new identity” button in Tor Browser…

      TLDR no actual research is done in the paper. They should make their paper as a section of “Online Anonymity 101” instead.

      To clarify, I’m not opposed to someone writing a paper on this subject, even a very elementary one, but then you should not use the media to spread all the unwarranted FUD!

      The DOI of the paper is 10.1080/23742917.2018.1518060, this is why you should start using Sci-Hub, too. Paying something like 40 US dollars to a paper like this is literally burning money, better to buy a lottery ticket with the money,

      or even better, donating the money to Tor Project! The end-of-the-year donation is here! If you’ve never given to the Tor Project, we have some exciting news for Giving Tuesday. A donor has offered to match all first-time gifts, up to $20,000. So if you’ve never given to the Tor Project before, your gift gets matched twice. A $25 donation becomes $75.
