Tor is an anonymity tool used by people seeking privacy and fighting censorship on the Internet. Over time, Tor has become very, very good at its job. That is why the security, stability, and speed of the network are critical for the people who count on it.
But how does Tor work "under the hood"? In this article, we'll dive into the structure and protocols used by the network to get closely acquainted with how Tor works.
A Brief History of Tor
The concept of onion routing (we'll explain the name later) was first proposed in 1995. At first, this research was funded by the Office of Naval Research, and then in 1997 DARPA joined the project. Since then, the Tor Project has been funded by various sponsors, and not so long ago the project won a donation drive on Reddit.
The code of the modern version of the Tor software was opened in October 2003, and it was already the third generation of onion routing software. The idea behind it is that we wrap traffic in encrypted layers (like an onion) to protect the data and the anonymity of the sender and recipient.
Tor Basics
With the history out of the way, let's get to the principles of operation. At the highest level, Tor works by bouncing your computer's connection to a target (e.g., google.com) through several intermediary computers, or relays.
Relays are located all over the world and run thanks to volunteers who agree to donate a little bandwidth for a good cause. It is important that most nodes have no special hardware or additional software: they all run the Tor software configured to work as a node.
The speed and anonymity of the Tor network depend on the number of nodes: the more, the better! This makes sense, because the bandwidth of a single node is limited. The more nodes there are to choose from, the harder it is to track down a user.
Types of nodes
By default, Tor passes traffic through 3 nodes. Each of them has its own role to play (we'll examine them in detail later).
Entry, or guard, node: the entry point to the network. Guard nodes are selected from those that have been running for a long time and have proven to be stable and high-bandwidth.
Middle node: passes traffic from the guard to the exit. As a result, the guard knows nothing about the exit, and vice versa.
Exit node: the exit point from the network; it sends traffic to the destination the client wants.
Typically, a safe way to run a guard or middle node is a virtual server (DigitalOcean, EC2); in this case, the server operators will see only encrypted traffic.
But operators of exit nodes bear a special responsibility. Because they send traffic to the destination, any illegal activity carried out through Tor will be tied to the exit node. And that can lead to police raids, notices of illegal activity, and other trouble.
So where do onions come in?
Having looked at the route a connection takes through the nodes, let's ask ourselves: how can we trust them? Can we be sure that they won't hack the connection and extract all the data from it? In short, we don't need to trust them!
The Tor network is designed so that the nodes can be treated with a minimum of trust. This is achieved through encryption.
So what about onions? Let's look at how encryption works when a client connection is established through the Tor network.
The client encrypts the data so that only the exit node can decrypt it.
This data is then encrypted again so that only the middle node can decrypt it.
And then the data is encrypted once more so that only the guard node can decrypt it.
It turns out that we have wrapped the original data in layers of encryption, like an onion. As a result, each node has only the information it needs: where the encrypted data came from and where it should be sent. This encryption benefits everyone: the client's traffic is not exposed, and the nodes are not responsible for the content of the data they pass along.
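To make the layering concrete, here is a minimal sketch in Python. It is a toy, not Tor's real cryptography: the XOR "cipher" built from SHA-256 is only a stand-in for a real cipher, the keys are hard-coded instead of being negotiated with each node, and all function names (toy_crypt, add_layer, peel_layer, build_onion) are made up for this illustration. What it does show is the order of wrapping and the fact that each node learns only its next hop.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, not secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ s for d, s in zip(data, _keystream(key, len(data))))


def add_layer(key: bytes, next_hop: str, inner: bytes) -> bytes:
    """Encrypt `inner` together with the name of the next hop under one node's key."""
    hop = next_hop.encode()
    return toy_crypt(key, bytes([len(hop)]) + hop + inner)


def peel_layer(key: bytes, blob: bytes) -> tuple[str, bytes]:
    """What a single node does: remove its own layer and learn only the next hop."""
    plain = toy_crypt(key, blob)
    hop_len = plain[0]
    return plain[1:1 + hop_len].decode(), plain[1 + hop_len:]


def build_onion(message: bytes, destination: str,
                path: list[tuple[str, bytes]]) -> bytes:
    """`path` is [(node_name, key), ...] in travel order: guard, middle, exit."""
    blob, next_hop = message, destination
    for name, key in reversed(path):  # the innermost (exit) layer is added first
        blob = add_layer(key, next_hop, blob)
        next_hop = name
    return blob


# Hypothetical keys; a real client would agree on a key with each node.
path = [("guard", b"key-guard"), ("middle", b"key-middle"), ("exit", b"key-exit")]
onion = build_onion(b"GET / HTTP/1.1", "google.com", path)

for name, key in path:  # each node peels exactly one layer
    next_hop, onion = peel_layer(key, onion)
    print(f"{name} forwards to {next_hop}")
```

After the last layer is peeled, only the exit node holds the original request, and only the guard ever saw who sent it.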
Nodes and Bridges: The problem with the nodes
After the Tor client starts, it needs to get a list of all the guard, middle, and exit nodes. This list is not a secret; I'll tell you later how it is distributed (you can search the documentation for the word "consensus" yourself). The list has to be public, but therein lies a problem.
To understand it, let's pretend to be the attacker and ask ourselves: what would an authoritarian government (AP for short) do? Thinking this way, we can understand why Tor is designed the way it is.
So what would the AP do? Censorship is a serious matter, and Tor lets people bypass it, so the AP would want to block users' access to Tor. There are two ways to do this:
block users coming out of Tor;
block users going into Tor.
The first is possible, and it is the free choice of the owner of a router or website. All they need to do is download the list of Tor exit nodes and block all traffic from them. That would be bad, but Tor can't do anything about it.
The second option is seriously worse. Blocking users coming out of Tor can keep them from visiting a particular service, but blocking everyone going in keeps them from reaching any site at all: Tor becomes useless to those users who already suffer from censorship, which is why they turned to it in the first place. And if Tor had only nodes, this would be possible, since the AP can download the list of guard nodes and block traffic to them.
Fortunately, the Tor developers thought about this and came up with a clever solution. Meet the bridges.
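The check a site owner would perform can be sketched in a few lines of Python. This is only an illustration under assumptions: the list is assumed to be plain text with one exit address per line, the addresses below are made-up documentation addresses, and the function names are hypothetical.

```python
import ipaddress


def parse_exit_list(text: str) -> set:
    """Parse a text list with one exit-node IP address per line (assumed format)."""
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comments
            continue
        exits.add(ipaddress.ip_address(line))
    return exits


def is_from_tor_exit(client_ip: str, exits: set) -> bool:
    """True if the connecting address is one of the known exit nodes."""
    return ipaddress.ip_address(client_ip) in exits


# Made-up sample data (documentation address ranges, not real exit nodes).
SAMPLE_LIST = """\
# sample exit list
192.0.2.10
198.51.100.7
"""

exits = parse_exit_list(SAMPLE_LIST)
print(is_from_tor_exit("192.0.2.10", exits))   # an address on the list
print(is_from_tor_exit("203.0.113.5", exits))  # an address not on the list
```

Because the exit list is public, nothing in Tor's design can stop a site from running a check like this.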
Bridges
In essence, bridges are nodes that are not published in the public list. Users who find themselves behind a wall of censorship can use them to access the Tor network. But if they are not published, how do users know where to find them? Is some special list needed? We'll talk about it later, but in short, yes: there is a list of bridges, maintained by the project developers.
It's just not public. Instead, users can get a small list of bridges to connect to the rest of the network. This list, BridgeDB, gives users only a few bridges at a time. That is reasonable, since they don't need many bridges at once.
By handing out only a few bridges at a time, you can keep an authoritarian government from blocking the network. Of course, once it learns about new nodes, it can block those too, but can anyone discover all the bridges?
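One way to hand out "only a few bridges at a time" is to make the answer depend on who is asking, so that repeating the request does not reveal more of the list. The sketch below is my own illustration of that idea, not BridgeDB's actual algorithm: the function name, the secret, the requester identifier, and the bridge addresses are all hypothetical.

```python
import hashlib
import hmac


def bridges_for(requester_id: str, bridges: list[str],
                secret: bytes, count: int = 3) -> list[str]:
    """Pick a small, stable subset of bridges for one requester.

    Each bridge gets a keyed score that depends on the requester, and the
    `count` lowest-scoring bridges are returned. The same requester always
    gets the same subset, so asking again does not enumerate the list.
    """
    def score(bridge: str) -> bytes:
        msg = f"{requester_id}|{bridge}".encode()
        return hmac.new(secret, msg, hashlib.sha256).digest()

    return sorted(bridges, key=score)[:count]


# Made-up bridge addresses (documentation range) and a made-up secret.
ALL_BRIDGES = [f"192.0.2.{n}:443" for n in range(1, 21)]
SECRET = b"server-side secret"

print(bridges_for("alice@example.org", ALL_BRIDGES, SECRET))
print(bridges_for("bob@example.org", ALL_BRIDGES, SECRET))
```

A censor who controls many requester identities can still collect more bridges this way, which is exactly the concern behind the question that closes this section.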
Can someone find all the bridges?
The list of bridges is strictly secret. If the AP gets hold of this list, it can block Tor completely. That is why the network developers have researched the ways in which a list of all the bridges could be obtained.
I will describe in detail two items from that list, the 2nd and the 6th, because these are the methods that succeeded in getting access to the bridges. In the 6th item, researchers looking for Tor bridges scanned the entire IPv4 space with the ZMap port scanner and found between 79% and 86% of all bridges.
The 2nd item involves running a Tor middle node that can monitor the requests coming to it. Only guard nodes and bridges connect to a middle node, so if the connecting node is not in the public list of nodes, it is obviously a bridge. This is a serious challenge for Tor, or for any other network. Since users cannot be trusted, the network has to be made as anonymous and closed as possible, and that is exactly how the network is built.
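The reasoning behind the 2nd item boils down to a set difference. Here is a small sketch of it, following the article's logic only; the function name and all addresses are made up for the example.

```python
def likely_bridges(observed_previous_hops, public_relays):
    """Addresses that connected to our middle node but are not publicly listed.

    Following the reasoning above: whatever talks to a middle node is either
    a guard (listed publicly) or a bridge (not listed).
    """
    return sorted(set(observed_previous_hops) - set(public_relays))


# Made-up data (documentation address ranges).
public_relays = {"192.0.2.10", "203.0.113.5"}
observed = ["192.0.2.10", "198.51.100.7", "192.0.2.10", "203.0.113.5"]

print(likely_bridges(observed, public_relays))  # ['198.51.100.7']
```

The attacker needs nothing beyond the public node list and a log of incoming connections, which is what makes this method so cheap.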
Consensus
Let's consider how the network operates at a lower level: how it is organized, and how it finds out which nodes in the network are active. We have already mentioned that the network has a list of nodes and a list of bridges. Let's talk about who makes these lists.
Each Tor client contains fixed information about 10 powerful nodes maintained by trusted volunteers. They have a special task: to monitor the state of the entire network. They are called directory authorities (DAs).
They are distributed around the world and are responsible for distributing a constantly updated list of all known Tor nodes. They choose which nodes to work with, and when.
Why 10? Usually it is not a good idea to form a committee with an even number of members, lest a vote end in a tie. The point is that 9 DAs handle the lists of nodes, and one DA (Tonga) handles the list of bridges.
So how do the DAs keep the network functioning?
The status of all nodes is contained in a regularly updated document called the "consensus". The DAs maintain it and update it hourly by voting. Here's how it works:
each DA creates a list of known nodes;
then it computes all the other data: node flags, traffic weights, etc.;
it sends this data as a "status vote" to all the other DAs;
it receives the votes of all the others;
it combines and signs all the parameters of all the votes;
it sends the signed data to the others;
a majority of the DAs must agree on the data and confirm that a consensus exists;
the consensus is published by each DA.
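The "combine the votes" step can be sketched as simple majority voting. This is a deliberately simplified illustration, not the real algorithm: it ignores signatures, traffic weights, and everything else in a vote except node flags, and the function name and data layout are assumptions made for the example.

```python
from collections import Counter


def build_consensus(votes: list[dict]) -> dict:
    """Combine per-DA votes into one document by simple majority.

    `votes` holds one dict per DA, mapping a node name to the set of flags
    that DA assigned to it. A node makes it into the consensus if a majority
    of DAs list it; a flag is kept if a majority of DAs assigned it.
    """
    majority = len(votes) // 2 + 1
    listed = Counter(node for vote in votes for node in vote)

    consensus = {}
    for node, count in listed.items():
        if count < majority:
            continue  # too few DAs know about this node
        flag_counts = Counter(
            flag for vote in votes for flag in vote.get(node, ())
        )
        consensus[node] = {f for f, c in flag_counts.items() if c >= majority}
    return consensus


# Three made-up votes; with three voters, a majority is two.
votes = [
    {"A": {"Guard", "Stable"}, "B": {"Exit"}},
    {"A": {"Guard"}, "B": {"Exit", "Stable"}, "C": {"Stable"}},
    {"A": {"Guard", "Stable"}},
]

print(build_consensus(votes))
```

Here node A keeps both flags, node B keeps only Exit, and node C is dropped because only one DA listed it. With 9 voting DAs, the same rule means at least 5 must agree, and a tie is impossible.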
The consensus is published over HTTP so that anyone can download the latest version. You can check for yourself by downloading the consensus through Tor or directly from the tor26 directory authority.
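Here is a small sketch of downloading the consensus and pulling the node list out of it. Some assumptions: the path /tor/status-vote/current/consensus and the layout of the "r" lines follow my reading of the Tor directory specification for the standard (non-microdescriptor) consensus; the host and port are left as parameters because I do not want to hard-code an authority address here; the sample document is made up and heavily shortened.

```python
import urllib.request

CONSENSUS_PATH = "/tor/status-vote/current/consensus"


def consensus_url(host: str, port: int) -> str:
    """URL of the current consensus on a directory server (host/port supplied by you)."""
    return f"http://{host}:{port}{CONSENSUS_PATH}"


def fetch_consensus(host: str, port: int, timeout: float = 30.0) -> str:
    """Download the consensus as text. Performs a real network request."""
    with urllib.request.urlopen(consensus_url(host, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_relays(consensus_text: str) -> list[dict]:
    """Extract nickname, address and OR port from the 'r' lines.

    Assumed layout: r nickname identity digest date time IP ORPort DirPort
    """
    relays = []
    for line in consensus_text.splitlines():
        fields = line.split()
        if len(fields) >= 9 and fields[0] == "r":
            relays.append({
                "nickname": fields[1],
                "address": fields[6],
                "or_port": int(fields[7]),
            })
    return relays


# Made-up, shortened sample; a real consensus is far longer.
SAMPLE = """\
network-status-version 3
r ExampleRelay AAAA BBBB 2015-01-01 00:00:00 192.0.2.1 9001 0
s Fast Running Stable Valid
r OtherRelay CCCC DDDD 2015-01-01 00:00:00 198.51.100.2 443 80
s Exit Fast Running Valid
"""

for relay in parse_relays(SAMPLE):
    print(relay["nickname"], relay["address"], relay["or_port"])
```

To try it against the live network, call fetch_consensus with the address and directory port of a directory authority and pass the result to parse_relays.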