Understanding BitTorrent
Iqbal Mohomed
Enter BitTorrent
Released in the summer of 2001
Uses basic ideas from game theory to largely eliminate the free-rider problem
All previous systems could not deal with this problem well
Makes no strong guarantees, unlike DHTs
It is working extremely well in practice, unlike DHTs
Basic Idea
Chop the file into many pieces
Replicate DIFFERENT pieces on different peers as soon as possible
As soon as a peer has a complete piece, it can trade it with other peers
Hopefully, we will be able to assemble the entire file at the end
Basic Components
Seed
Peer that has the entire file
Leecher
Peer that has an incomplete copy of the file
A Torrent file
Passive component
Files are typically fragmented into 256 KB pieces
The torrent file lists the SHA-1 hashes of all the pieces so that peers can verify their integrity
Typically hosted on a web server
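As a rough illustration of the piece hashes described above, the following Python sketch splits a byte string into fixed-size pieces and checks a received piece against its SHA-1 digest. The names `piece_hashes` and `verify_piece` are hypothetical, and the on-disk encoding of a real torrent file is not shown here.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # 256 KB, the typical piece size mentioned above


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Return the SHA-1 digest of each fixed-size piece of data."""
    return [
        hashlib.sha1(data[offset:offset + piece_size]).digest()
        for offset in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, hashes: list[bytes]) -> bool:
    """Check a downloaded piece against the hash published for its index."""
    return hashlib.sha1(piece).digest() == hashes[index]
```

A peer would call `verify_piece` on every completed piece before offering it to others, so a corrupted piece is discarded rather than spread.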
A Tracker
Active component
Allows peers to find each other
Returns a random list of peers
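The tracker's role can be sketched as a toy in-memory registry. This is only an illustration under stated assumptions: the class `Tracker`, its `announce` method, and the `max_peers` cap are hypothetical, and the network protocol a real tracker speaks is left out entirely.

```python
import random


class Tracker:
    """Toy in-memory tracker: remembers peers per torrent and hands out random subsets."""

    def __init__(self, max_peers: int = 50):
        self.max_peers = max_peers  # illustrative cap, not taken from the slides
        self.swarms: dict[str, set[str]] = {}

    def announce(self, torrent_id: str, peer: str) -> list[str]:
        """Register peer and return a random list of other peers in the same swarm."""
        swarm = self.swarms.setdefault(torrent_id, set())
        others = sorted(swarm - {peer})
        swarm.add(peer)
        return random.sample(others, min(self.max_peers, len(others)))
```

Because the list is random, each peer ends up with a different neighbourhood, which helps different pieces spread through different parts of the swarm.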
Operation
Pipelining
When transferring data over TCP, it is critical to always have several requests pending at once, to avoid a delay between pieces being sent
BitTorrent breaks pieces into sub-pieces
At any point in time, some number of sub-pieces, typically 5, are requested simultaneously
Every time a sub-piece arrives, a new request is sent
This scheme has been found to saturate most connections in practice
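The pipelining rule above can be simulated in a few lines of Python. This is a sketch, not client code: the function `download` is hypothetical, there is no real network I/O, and sub-pieces are assumed to arrive in the order they were requested.

```python
from collections import deque

PIPELINE_DEPTH = 5  # typical number of simultaneous requests, per the text above


def download(sub_pieces, depth=PIPELINE_DEPTH):
    """Simulate fetching sub-pieces while keeping up to `depth` requests pending.

    Returns the number of pending requests observed just before each arrival.
    """
    todo = deque(sub_pieces)
    pending = deque()
    observed = []

    # Fill the pipeline before waiting for any data.
    while todo and len(pending) < depth:
        pending.append(todo.popleft())

    while pending:
        observed.append(len(pending))
        pending.popleft()  # a sub-piece arrives (in request order, for simplicity)
        if todo:
            pending.append(todo.popleft())  # immediately issue a replacement request
    return observed
```

The pipeline stays full until the list of unrequested sub-pieces runs out, and only then drains.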
Piece Selection
The order in which pieces are selected by different peers is critical for good performance
If a bad algorithm is used, we could end up in a situation where every peer has all the pieces that are currently available and none of the missing ones
If the original seed is then taken down, the file cannot be completely downloaded!
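The slides do not name a selection policy at this point. As one illustration, the sketch below uses rarest-first, the policy described in Cohen's paper listed in the references: pieces held by the fewest peers are fetched first, so scarce pieces get replicated before their holders leave. The function name and data layout are assumptions made for this example.

```python
from collections import Counter


def rarest_first(have: set[int], peer_pieces: dict[str, set[int]]) -> list[int]:
    """Order the pieces we still need from rarest to most common among our peers."""
    counts = Counter(p for pieces in peer_pieces.values() for p in pieces)
    needed = [p for p in counts if p not in have]
    # Fewest copies first; ties broken by piece index to keep the result deterministic.
    return sorted(needed, key=lambda p: (counts[p], p))
```

Breaking ties by index is a simplification for readability; picking randomly among equally rare pieces would avoid every peer chasing the same piece.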
Endgame Mode
Policy: When all the sub-pieces that a peer doesn't have are actively being requested, these are requested from EVERY peer
When a sub-piece arrives, the duplicate requests for it are cancelled
This ensures that a download isn't prevented from completing by a single peer with a slow transfer rate
Some bandwidth is wasted, but in practice, this is not too much
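The endgame policy above might be sketched as follows. Both function names and the bookkeeping structure (a map from peer to the set of sub-pieces requested from it) are hypothetical and chosen only for illustration.

```python
def endgame_requests(missing: set[int], requested: set[int], peers: list[str]) -> dict[str, set[int]]:
    """If every missing sub-piece is already requested, ask every peer for all of them."""
    if not missing or not missing <= requested:
        return {}  # not in endgame mode yet
    return {peer: set(missing) for peer in peers}


def on_arrival(sub_piece: int, source: str, outstanding: dict[str, set[int]]) -> list[tuple[str, int]]:
    """Clear the arrived sub-piece everywhere; return the cancels to send to other peers."""
    cancels = []
    for peer, wanted in outstanding.items():
        if sub_piece in wanted:
            wanted.discard(sub_piece)
            if peer != source:
                cancels.append((peer, sub_piece))
    return cancels
```

The wasted bandwidth is bounded by the sub-pieces that are already in flight from other peers when a cancel reaches them.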
Choking
One of BitTorrent's most powerful ideas is the choking mechanism
It ensures that nodes cooperate and eliminates the free-rider problem
Cooperation means uploading sub-pieces that you have to your peer
Choking is a temporary refusal to upload; downloading occurs as normal
The connection is kept open so that setup costs are not borne again and again
Based on game-theoretic concepts
Tit-for-tat strategy in Repeated Games
Prisoner's Dilemma
Repeated Games
Over time, more complex strategies can evolve
For instance, tit-for-tat
Do unto others as they do unto you
If someone cheats, you must retaliate
Have a recovery mechanism to ensure eventual cooperation
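A small simulation can make tit-for-tat in a repeated Prisoner's Dilemma concrete. The payoff values below are the conventional textbook ones and are assumed here for illustration; they do not come from the slides, and the function names are hypothetical.

```python
# Payoffs (mine, theirs) for one round; "C" = cooperate, "D" = defect.
# Conventional textbook values, assumed here for illustration.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last round."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """A free rider: never cooperates."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the total score of each player."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
    return score_a, score_b
```

With these payoffs, two tit-for-tat players score 30 each over ten rounds, while a defector facing tit-for-tat gains only in the first round and ends with 14 against 9. Plain tit-for-tat as written has no recovery step: after one defection, two such players can retaliate forever, which is why a recovery mechanism is needed.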
Choking Algorithm
The goal is to have several bidirectional connections running continuously
Upload to peers who have uploaded to you recently
Upload over unused connections on a trial basis to see whether they offer better transfer rates
Choking Specifics
A peer always unchokes a fixed number of its peers (default of 4)
The decision to choke or unchoke is based on current download rates, evaluated as a rolling 20-second average
The decision on whom to choke or unchoke is re-evaluated every 10 seconds
This avoids the resources that would be wasted by rapidly choking and unchoking peers
Ten seconds is supposedly long enough for TCP to ramp transfers up to their full capacity
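The numbers above (4 unchoke slots, a 20-second rolling average, a decision every 10 seconds) can be put together in a minimal sketch. The class `RateMeter` and the function `choose_unchoked` are hypothetical names, timestamps are passed in explicitly so the example needs no clock, and trial (optimistic) unchokes are left out for brevity.

```python
from collections import deque

UNCHOKE_SLOTS = 4      # default number of unchoked peers
RATE_WINDOW = 20.0     # seconds covered by the rolling average
RECHOKE_INTERVAL = 10  # seconds between decisions; the caller is assumed to honour this


class RateMeter:
    """Rolling average download rate from one peer over the last RATE_WINDOW seconds."""

    def __init__(self):
        self.samples = deque()  # (timestamp, bytes_received)

    def record(self, now: float, nbytes: int) -> None:
        self.samples.append((now, nbytes))

    def rate(self, now: float) -> float:
        # Drop samples that have fallen out of the window, then average the rest.
        while self.samples and self.samples[0][0] <= now - RATE_WINDOW:
            self.samples.popleft()
        return sum(n for _, n in self.samples) / RATE_WINDOW


def choose_unchoked(meters: dict[str, RateMeter], now: float) -> set[str]:
    """Pick the peers with the best recent download rates; everyone else is choked."""
    ranked = sorted(meters, key=lambda peer: meters[peer].rate(now), reverse=True)
    return set(ranked[:UNCHOKE_SLOTS])
```

A client would call `choose_unchoked` once per `RECHOKE_INTERVAL` and send choke or unchoke messages only to peers whose status changed.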
Anti-Snubbing
Policy: When over a minute has gone by without receiving a single sub-piece from a particular peer, do not upload to it except as an optimistic unchoke
A peer might find itself simultaneously choked by all the peers it was just downloading from
Its download will lag until an optimistic unchoke finds better peers
Policy: If choked by everyone, increase the number of simultaneous optimistic unchokes to more than one
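The two anti-snubbing policies above reduce to two small checks. In this sketch the function names are hypothetical, and the values `base=1` and `boosted=2` are assumptions: the text only says "more than one".

```python
SNUB_TIMEOUT = 60.0  # "over a minute" without a single sub-piece


def is_snubbed(last_received: float, now: float) -> bool:
    """True if this peer has sent us nothing for more than SNUB_TIMEOUT seconds."""
    return now - last_received > SNUB_TIMEOUT


def optimistic_slots(choked_by: dict[str, bool], base: int = 1, boosted: int = 2) -> int:
    """Use more than one optimistic unchoke when every peer is choking us."""
    if choked_by and all(choked_by.values()):
        return boosted
    return base
```

A snubbing peer is excluded from the regular unchoke slots but remains eligible for an optimistic unchoke, which gives it a way back into cooperation.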
Upload-Only mode
Once its download is complete, a peer has no download rates to compare and no need for them
The question is, which nodes to upload to?
Policy: Upload to the peers that give the best upload rates
This ensures that pieces get replicated faster
Also, peers that can be uploaded to at good rates are probably not being served by others
References
"BitTorrent Economics Paper", Bram Cohen
"BitTorrent protocol specification", Bram Cohen
"BitTorrent Resource Availability Analysis", Brian Greinke and James Hsia. (Rice)
"Dissecting BitTorrent: Five Months in a Torrent's Lifetime", M. Izal, G. Urvoy-Keller, E.W. Biersack, P.A. Felber, A. Al Hamra, and L. Garcés-Erice. (Institut Eurecom, France)