Paper Menu >>
Journal Menu >>
![]() Int. J. Communications, Network and System Sciences, 2010, 3, 916-924 doi:10.4236/ijcns.2010.312125 Published Online December 2010 (http://www.SciRP.org/journal/ijcns) Copyright © 2010 SciRes. IJCNS A Measurement Study on BitTorrent System Lin Ye*, Hongli Zhang, Fei Li, Majing Su School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China E-mail: hityelin@gmail.com Received September 27, 2010; revised October 29, 2010; accepted November 20, 2010 Abstract Measuring and characterizing peer-to-peer (P2P) file-sharing systems will benefit the optimization and management of P2P systems. Though there are a lot of measurement studies on BitTorrent almost in every important aspect, few of them focus on the measurement issues and the corresponding solutions, which can strongly influence the accuracy of measurement results. This paper analyzes the key difficulties of measuring BitTorrent and presents a measurement system with combination of active and passive ways, which can han- dle with the problems well and balance the efficiency and integrity. Then compared to other work, a more complete and representative measurement was performed for nearly two months and several characteristics are concerned: 1) there are diverse content sharing in BitTorrent system, but multimedia files that are larger than 100 MB are the most. 2) Distributed Hash Tables has indeed enhanced the ability of peer discovery though there are some pitfalls to be addressed. 3) Pieces are distributed uniformly after the early stage and there are few rare pieces. Furthermore, peer arrival rate shows a periodical pattern, which was not well mod- eled before. Then an improved model is proposed and the experiment results indicate that new model is fitted in with actual measurement results with high accuracy. Keywords: P2P, BitTorrent, Measurement, Peer, Piece 1. Introduction With the enhancement of PC performance and bandwidth, peer-to-peer (P2P) systems have become immensely popular and attracted millions of users in the past few years. Particularly, BitTorrent has become a ruling heavyweight application that contributes about 53% of P2P traffic. Though BitTorrent scales fairly well and is now widely used in many fields, such as data distribution [1] and media streaming [2], the performance still gets much attention in the literature. With the introduction of Distributed Hash Tables (DHT) [3], BitTorrent has gone from single center to hybrid structure, which might bring about the change of performance. Nowadays more than fifty kinds of BitTorrent clients have been in use. Unlike other P2P systems such as eMule [4] and Gnutella [5], BitTorrent is simple in de- sign, which is made up of four parts: torrent, peer index, seeds and leechers. A torrent, which is usually uploaded onto a website, is an encoded file that digests the infor- mation of sharing files and is necessary for peers to boot- strap themselves into a swarm. Peer index is the set of peers owning the same files. Peer index tracks the status of the peers that are currently active, and acts as a ren- dezvous point for all peers. So far BitTorrent system has developed three mechanisms for index storage: tracker, DHT and gossip [6]. They are complementary with dif- ferent working principles. The diversity meets the de- mand for one peer to connect enough peers to achieve better performance. Peers are divided into two classes according to their states: peers that have already downloaded all files and continue to serve others are called seeds; peers that are still downloading are called leechers. In BitTorrent, the files are divided into small pieces, and one piece is further divided into smaller blocks. Therefore, a peer can download multiple blocks of the files in parallel, which capitalizes the resources from peers to distribute large contents efficiently. Furthermore, BitTorrent develops a “tit-for-tat” incentive mechanism, which enables peers with high uploading bandwidth to have priority of being served. In this way, peers will pay the penalty for their selfishness, which effectively pre- vents free-riding behaviors common in P2P systems. The purpose of this paper is to aid in the understand- ing of a real and developing P2P system, to provide ![]() L. YE ET AL. 917 measurement data that may be useful in modeling and improving BitTorrent, and to identify design issues in such systems. Our contribution is to discuss the problems and errors in the measurement in depth firstly and design a complete solution to settle them, which is the essential prerequisite to analysis work. We show for the first time the consistency of peers from trackers and DHT, and explore the inherent factors how the differences bring about. Also, we demonstrate the contrast between dif- ferent torrents in the piece view and propose an im- proved model of peer arrival rate in the peer view. The remainder of the paper is organized as follows. Section 2 presents the related work. And then we intro- duce the measurement system for BitTorrent through analyzing several key difficulties with the corresponding solutions in Section 3. We focus on some new findings in our measurement results in Section 4. Finally an im- proved model of peer arrival rate is proposed for better fitting in with actual measurement results. We conclude the paper with a discussion of the results in Section 6. 2. Related Work The amount of P2P traffic and the population of P2P users on the Internet keep increasing. Previous work on BitTorrent has focused on modeling [7,8], incentive mechanisms and improvements [9-11]. In order to get more understanding of BitTorrent, a lot of measurement studies [12-20] have been performed, which can be summarized in four aspects: 1) Torrents: Study [12] collects the information from two popular trackers that continuously update the statis- tics on their torrents and connected users, revealing that torrents have a wide range in size and the average size exceeds 600MB. Work [13] gathers and parses the web pages of the Supernova, and downloads all torrents. They gives the number of downloads among three types (games, movies and music) and finds that movies are the most popular. Work [14] focuses on the difference be- tween video and non-video swarms. Their results show the torrents shared by video swarms are mostly large while the size of non-video swarms is relatively smaller. But existing measurement on torrents are insufficient due to their limited range with only two trackers or single site, which is sensitive to specific user community. 2) Core components: Work [13] monitors the status of all trackers for a long time and points out the overall per- formance and stability are strongly influenced by the availability of core components. Study [15] analyzes the prevalence and impact of new mechanisms, including multiple trackers and DHT. There are some findings: a) the introduction of multiple tracker and DHT both can improve availability; b) Trackers might not be inde- pendent because many of them are hosted in the same machine and multiple trackers may cause swarm splitting; c) Tracker and DHT show complementary characteristic features that trackers provide more information and faster, but DHT can significantly increase the availability of the whole system. 3) Peers: Work [16] depicts the evolution of peers by analyzing five-month tracker log, such as the characteris- tics of session and geographical analysis. The results demonstrate that BitTorrent is highly effective and can sustain flash-crowd. However, the measurement on sin- gle torrent (Linux Redhat 9 distribution) cannot stand for torrents with different popularity. Study [7] finds that the number of peer arrivals decreases exponentially with time in general after its birth. But the work lacks in fine-grained modeling, especially in the peer arrival rate at the early stage. Study [17] reveals that the session length in BitTorrent follows a Weibull distribution more accurately. Work [13] also points out only 17% of peers have an uptime longer than one hour after downloading. 4) Pieces: Studies [18,19] both prove that rarest-first piece selection strategy is better than random strategy. But the simulation results [20] present that the perform- ance benefits provided by network coding in terms of throughput can be more than 2-3 times better in com- parison with transmitting unencoded blocks. Different from simulation methodology, study [21] explores the efficiency of rarest-first mechanism by means of instru- mented clients that are able to record messages sent or received with the detailed content of the messages, state change in the choke algorithm and other important events. The measurement results show that rarest-first algorithm guarantees close to ideal diversity of the pieces among peers. But limited measurement scope (at most 80 peers) and time (8 hours) cannot give a direct impression on piece distribution in a long term. Work [22] performs a detailed measurement study on the distribution and evolution of piece population. They analyze snapshot data of the near-instantaneous population of pieces, and long-term data of evolution of the piece population over several days. The results validate that the downloading policy of BitTorrent is quite effective in the view of piece distribution and evolution. Compared with previous work, this paper differs from the following aspects: The data are more representative. Our measure- ment collected about 382,624 torrents from 72 hot websites, which are comprehensive as well as par- ticular only serving cartoon, TV, games and so on. Compared to studies [13,14], the data are more convincing. The measurement methods and content are more complete. Several methods including active and Copyright © 2010 SciRes. IJCNS ![]() 918 L. YE ET AL. passive ones are integrated to measure BitTorrent system in multilevel view. The combined design can complement each other and solve the limitation of using any single method. The difference between torrents is concerned. Two torrents with different popularity are measured continuously for a long time. The details of them are shown in Table 1. We can get a deep under- standing of different torrents by comparing them. 3. Measurement Methodology P2P file-sharing systems usually adopt two-level index to maintain the announcement and search of the files, which means descriptive meta data and peer information are separately stored in overlay network. Therefore, a stan- dard process is that users search meta information, choose interested items and then start downloading. If piece information is also viewed as a kind of index, a complete measurement system for P2P should contain three levels: Meta measurement: meta data such as filename, file ID, size and type are collected through search- ing keywords in this level. Peer measurement: peer information (IP/Port pairs) is gathered by file ID. Piece measurement: in order to get piece informa- tion, the system needs to connect as many peers as possible and exchange piece information with each other based on peer measurement. 3.1. Meta Measurement Most of P2P systems provide an interface to search meta information for users, but BitTorrent has only limited support. The reason lies in the fact that torrents exist on the websites. Users have to browse many pages and choose proper torrents. To gather meta data rapidly and automatically, a crawler based on Nutch is developed and focuses on the following problems: 1) Unrelated Pages Filtering. There are many pages unrelated to torrents on the websites, such as descriptions or advertisements, which will waste a lot of bandwidth to fetch. By analyzing the hierarchical structure of most BitTorrent websites, it is found that URLs on the same site are similar and can be represented in one or more regular expressions. So URLs are clustered to extract the regular expressions and the useful pattern of torrents’ links in our work. When crawling the site, we decide whether pages should be fetched according to regular expressions. Furthermore, these filtering rules have the generalization ability and can be adjusted dynamically by the previous result to guide the next fetching. Table 1. Information of two torrents. Torrents Hot Ordinary Type video (movie) video (cartoon) Size 457.06 MB 176.15 MB Website bbs.wofei.net bt.ktxp.com Publishing Time 2009-4-20 16:39 2009-4-20 15:56 Start Time 2009-4-20 16:41 2009-4-20 15:58 End Time 2009-5-19 22:03 2009-5-19 22:05 The Number of Peers 28856 6315 2) Complex Pages Parsing. Complex pages are the ones with a lot of javascripts that involve user interac- tions to trigger some events for real content, which poses an obstacle to traditional crawlers. Many sites have al- ready used complex pages which make the torrents dif- ficult to obtain. To solve the problem, an ajax parsing engine is designed to deal with the javascripts. The whole process includes parsing javascripts, triggering corresponding events and abstracting real content. The architecture is shown as Figure 1. 3.2. Peer Measurement There are three ways to collect peers: log analysis, pas- sive and active. Although tracker log can record the status of peers more accurately, there are a few draw- backs to be addressed. First, tracker log is not available for anyone, which puts a hard restriction on this method. Second, tracker log does not cover the whole information because more than one tracker may be appointed to a torrent, and it is not practical to obtain their logs all. Third, tracker log cannot represent DHT. Passive method usually deploys several controlled clients in target swarm and waits for incoming connections from others. Passive Figure 1. Ajax Parsing Engine. Copyright © 2010 SciRes. IJCNS ![]() L. YE ET AL. 919 method is able to handle with the peers behind NAT or firewall, which cannot be connected by external hosts. But passive method needs a long measurement period with low efficiency, especially for a large swarm. Active method acts as a normal user to make requests for other peers to tracker and DHT. However, some mechanisms reduce the efficiency of active method. For example, peer index randomly returns a few peers and does not guarantee the dissimilarity between successive requests. The churn [17] also changes the index at all times, which makes active method hard to converge. Compared to log analysis and passive method, active method can obtain peers freely and diversely. So this paper mainly implements active measurement on tracker and DHT. To address the limitations, the following set- tles for bounding or estimating the errors and makes a tradeoff between the efficiency and integrity. To bound the errors, we assume peer index contains N peers for one torrent and responses n peers every request. We de- note T(m) as total number of distinct peers that we dis- covered after m requests and P(m) representing the cover- age as the fraction of all peers. Obviously, if n > N, T(1) = N and P(1) = 1 in one request. If n < N, T(1) = n at the first request. Since the index randomly returns peers, we need to avoid the redundancy and T(m) does not increase with linear growth. We suppose T(m−1) is the total number of distinct peers after m−1 requests. When the mth request is sent, the probability that undiscovered peers will be returned is: 1NTm N (1) The recursion formula of T(m) after m requests is: 1 1 NTm Tm Tmnm N 1 (2) The coverage P(m) is: 11 1 TmNTmn Pm m NNN (3) According to (2) and (3), for example, more than 230 requests have to be made for measuring a torrent about 5000 peers with 50 peers returned every time, which can obtain 90% of all peers. The process may last 8 minutes or more, and will introduce additional errors into the measurement results. Because every swarm has its own size, there is different measurement time for all of them in real experiments. It is about 10 minutes on average in our measurements for nearly 1,000 torrents. So we dis- guise ourselves as many legal peers to probe in parallel, which succeeds in speeding up the measurement. To overcome the churn and make the measurement easy to converge, we define F(m) as the number of new discov- ered peers between successive measurements, which is equal with T(m) − T(m−1). We use a threshold of F(m) to finish the measurement. When F(m) is less than the threshold in successive measurement, we will end up the measurement. From (2), we can conclude that as long as the threshold is less than n/10, 90% of peers should be discovered. In order to avoid the influence of randomness, we finish the measurement when the times that the num- ber of new discovered peers is less than the threshold is more than 5, which always empirically discovers 95% of peers. In addition, when active method is running, many dis- guised clients that we control are also operating in pas- sive mode, which are able to accept incoming connec- tions and collect the peers that might be behind NAT or firewall. As long as we can own enough peers, the high probability that peers are discovered is guaranteed. 3.3. Piece Measurement Every peer sharing the same files owns its local piece information, which means a global view of pieces needs the piece information from all peers. In this paper two methods are used as follows: 1) Active: we instrument several clients to connect the peers actively to exchange piece information with each other. 2) Passive: in order to overcome the shortcoming that some peers cannot be connected from the outside. We register many forged users or entities into trackers and DHT, and then wait for incoming connections from oth- ers. As long as there are enough controlled forged enti- ties, others will connect them with high probability so as to collect the piece information in a passive way. Moreover, a traffic monitor system is deployed for passive data collection, which uses deep packet inspec- tion (DPI) to capture the communication between peers and trackers. The parameters of requests from peers to trackers can help us analyze the users and resources, such as what files are sharing and when the peers download and finish. In a word, the proposed system in this paper can carry off all related measurement contain- ing torrents, peers, pieces and user behaviors, whose ar- chitecture is shown in Figure 2. On the basis of it, a measurement had been performed from 4/2/2009 to 5/27/2009 for about two months. 4. Measurement Results 4.1. Torrents Size and type are the main static characteristics of the torrents, which can tell what ind of torrents are the most k Copyright © 2010 SciRes. IJCNS ![]() L. YE ET AL. Copyright © 2010 SciRes. IJCNS 920 Figure 2. The architecture of BitTorrent measurement system. times more than trackers. The diversity of peer index indeed reinforces the availability of BitTorrent. Second, peers from DHT are more than trackers, which implies peers from them may not follow the same pattern. To show the consistency between trackers and DHT, the percentage of the same peers in trackers and DHT is given in Figure 5. The high consistency implies DHT makes a good complement to trackers. popular in BitTorrent. It is helpful to design better mecha- nisms for resources management and replicas control. Figure 3 gives the distribution of torrent size with sev- eral common bins. About 21% of torrents are more than 1024 MB, which are usually games, HD movies and video collection. 34% of torrents that might be video lie in between 250 MB and 1024 MB. Both of them take up 55% of all torrents, which means large torrents occupy a dominant position in BitTorrent system. In order to have a close view on real sharing file type, we abstracted the packed filenames from each torrent and counted the corresponding percentage according to file extension. Text files are omitted because they are small and should not be considered as the main content. RAR, MP3, RMVB, JPG and AVI are the top 5 file types with percentage 31.23%, 20.42%, 10.42%, 9.03% and 6.19% respectively. Furthermore, we classified file ex- tension into six types: video, audio, image, executable, archive and other and the corresponding percentage are shown in Table 2. 4.2. Core Components As a rendezvous point, trackers and DHT both play an important role in BitTorrent, which maintain the peers’ status and provide other peers for new ones. Study [15] has given a detailed description of tracker type and activ- ity, and drawn a comparison between trackers and DHT in the efficiency of finding peers. Figure 3. The distribution of torrent size. Table 2. File type, file extension and corresponding per- centage. In this paper, we focus on the number of peers and their consistency in trackers and DHT, and explore the inherent factors how the differences bring about. The number of peers is a direct metric presenting the per- formance of different index. Figure 4 gives the result of our measurement on ordinary torrent during its first ten days. The total number is the sum of tracker and DHT peers without duplicate ones. Although there are three gaps missing data due to network failure, it has nothing to do with the conclusion we make. First, Figure 4 shows that with the help of DHT, the total number is almost two File type File extension Percentage video rmvb, avi, mkv, mp4, rm, wmv, mpg,… 22.8% audio mp3, flac, cbr, wma, ogg, cue, m4a,… 23.5% image jpg, png, gif, bmp,… 9.5% executableexe,… 0.9% archive rar, zip,… 32.1% other … 11.2 % ![]() L. YE ET AL. 921 Figure 4. The number of peers in total, trackers and DHT. Figure 5. The percentage of the same peers in trackers and DHT. The number of peers has given a preliminary impres- sion on different index. As above mentioned, we seem to draw a specious conclusion that DHT is better than tracker by the number of peers. Actually, the number of available peers is more convincing than the number of peers. We define a peer is available when the peer can be connected by others. The peers behind NAT or firewall are treated as unavailable ones because they cannot ac- cept incoming requests though they can offer pieces to others in reverse way. The ratio of available peers in tracker on average is 25.3% as the same as the statistics work on peers behind NAT [23], and DHT is 19.1% in Figure 6, which shows the ratio of available peers in tracker is higher than DHT. It is interesting that though there are more peers from DHT than trackers, the peers from DHT are less avail- able. It is the architecture that makes the difference. Tracker is a global component in BitTorrent and can be visible by all peers, but DHT is a completely distributed network. As a result, when some events happen, for ex- ample, arrival or departure, tracker can update the peers’ status without delay, which keeps the index fresh and Figure 6. The percentage of available peers in trackers and DHT. correct. On the other hand, DHT has a complex process to route various messages among peers. A successful event notification needs more than 3-5 messages. More- over there is no corresponding message for departing, and stale peers are not removed until the pre-set timer is timeout. Many useless peers are left because of delay or forgetting, which wastes a lot of resources and introduces unnecessary traffic. 4.3. Pieces Pieces are the smallest appreciable unit of data in Bit- Torrent since smaller blocks do not directly affect whether torrent contents finish transferring. The distribu- tion of the pieces across the swarm is important for availability in two aspects. First, if there are not enough replicas for each piece, the whole process will be held due to some missing pieces. Second, if the distribution is not uniform with many rare pieces, the efficiency will be low because the peers owning rare pieces have to afford the pressure from others who want them. Though study [22] was able to obtain 90% of peers by using instru- mented clients, it failed to cope with peers behind NAT or firewall. Our combined method can give a more com- plete view on the distribution. Moreover, we pay close attention to the difference in replicas and rarity of pieces between torrents. We suppose one file F is divided into n pieces P1, P2,, Pn and there are ri replicas of piece Pi at time t in the swarm. Therefore, the rate of replicas R(t) is defined at time t: 12 min,,, n Rtr trtrt (4) The meaning of R(t) is to give the number of equiva- lent replicas for the whole file F at time t and reflect the availability of the file in BitTorrent system. The value of R(t) is no less than the number of seeds in existence at Copyright © 2010 SciRes. IJCNS ![]() 922 L. YE ET AL. time t. To explore whether the distribution of pieces is uni- form, the rate of rarity Cr(t) is defined at time t: 1 niavg r i rt rt Ct n (5) ravg(t) is the average replicas of all pieces as calculated below: 1 ni avg i rt rt n (6) The rates of replicas between hot and ordinary torrents are contrasted in Figure 7. First, it is shown that at the early stage the rates of replicas are low though the files can be downloaded slowly. Once the seeds leave, the downloading cannot be finished at all. And then the rates go into a steady stage when there are enough equivalent replicas with the best system service ability. The offline of any peer (seed or leecher) has no significant effect on the downloading. With the departure of seeds and the decreasing peer arrival rate, the rates reduce dramatically at the last stage. Second, there is a big difference be- tween two torrents mentioned above. The rate of hot tor- rent is higher than ordinary one, especially at the last stage, and the steady stage of hot torrent is also longer than ordinary one, which means BitTorrent is more suit- able for popular content because it will be difficult for ordinary ones to obtain if users do not find them as soon as possible. Figure 8 demonstrates the rates of rarity between hot and ordinary torrents, which are similar as a whole. The fluctuation is very drastical at the early stage because the downloading has just begun and only a few pieces have the opportunity to be propagated, and after that the rate gradually approach at 0, which indicates the distribution of pieces is nearly uniform. 5. Model of Peer Arrival Rate Many studies have already proved the close relationship between the performance and the peer arrival rate in BitTorrent. However, they usually suppose the peer arri- val rate follows Poisson distribution, which does not fit in with the results [13]. In fact, in our measurement some new pattern is found in the peer arrival rate, which is also not well modeled before. Before the measurement begins, there are always some peers existing that cannot be con- sidered as new ones in target network. So we use the last result as a point of reference. If some peers are not found in last result but discovered in this experiment, they will be viewed as new ones. Figure 9 and Figure 10 show the number of new arrival peers from both torrents. From Figure 9 and Figure 10, the number of new arrival Figure 7. The rate of replicas. Figure 8. The rate of rarity. Figure 9. The number of new peers from hot torrent with fitted curve. Copyright © 2010 SciRes. IJCNS ![]() L. YE ET AL. 923 Figure 10. The number of new peers from ordinary torrent with fitted curve. peers shows a typical fluctuation pattern in both torrents, which seems to follow a daily cycle. Therefore, we count the appearance time of the crest and trough with their corresponding value in detail as shown in Tab le 3. First, Table 3 can tell that the interval between two adjacent crests or troughs on average is about 24 hours as long as one nature day. This pattern implies the fluctuation seems to follow the behaviors of normal users in the whole day. Second, the value of crests shows an overall decreasing trend with time, although the 3rd crest of hot torrent is an exception. However, this obvious fluctuation pattern of new arri- val peers is not well modeled in the literature as far as we know, so the goal of this paper here is to improve the model of the peer arrival rate in the thought of periodic- ity. The new model is based on the study [10] as below: 0sin 1 t tAe TtB (7) In the above formula, is the number of new ar- rival peers at time t. A0 is the initial oscillation amplitude when the measurement starts, which is related to the popularity of torrents and the state (transient or steady). To be specific, the more popular torrent owns a bigger A0. t is the attenuation parameter of the rate. The more popular torrent has a bigger , which means slower attenuation. T is the period and B is the phase shift. In order to quantify the parameters in (7), we define objective function as below: 2 1 N k J ModkObs k (8) Mod(k) is the kth computation value of the model while Obs(k) is the kth measurement value. This paper uses the BFGS quasi-Newton method to search the para- Table 3. The Crest and Trough of new arrival peers in both torrents. Hot Ordinary Crest/Trough appearance time (hour) value appearance time (hour) value Crest 1 6.09 85 4.2 35 Trough 1 12.64 0 14.05 0 Crest 2 22.78 58 24.06 26 Trough 2 36.05 0 40.32 0 Crest 3 49.01 110 50.76 10 Trough 3 60.21 0 62.16 0 Crest 4 75.03 57 73.11 4 Trough 4 84.03 0 86.97 0 meters and make objective function minimum. Figure 9 and Figure 10 also give the fitted curves of the periodical model for both torrents. The new proposed model is close to the same with actual measurement re- sults. Furthermore, compared to ordinary one, hot torrent has a higher A0, which implies hot torrent has a larger network. And hot torrent also has a higher , which means the corresponding resource can stay active for a longer time. Considering the value of T in both torrents with separately 0.26 and 0.29, the period is about 2πT24 hour. 6. Discussion and Conclusions The existing measurement works on BitTorrent pay little attention to the measurement issues and the correspond- ing solutions, which may lead to the inaccuracy of the results. In this paper we design a complete solution to settle them and have presented a detailed measurement study and an analysis of BitTorrent. We believe that this study is a contribution to the ongoing effort to gain in- sight into a real and developing P2P system. Though all kinds of torrents are sharing in BitTorrent, the measurement results show that it is more suitable for popular content that are very “heavy”. And the high con- sistency of peers between trackers and DHT implies DHT makes a good complement to trackers. However, DHT still needs considerate improvements for better performance. Also, the fluctuation of peer arrival rate modeled in this paper will cause the difference of system performance, which needs further research, such as in- centive mechanisms beyond “tit-for-tat” mechanism and torrents collaboration. Copyright © 2010 SciRes. IJCNS ![]() L. YE ET AL. Copyright © 2010 SciRes. IJCNS 924 7. Acknowledgements This work is partially supported by National High-Tech Development 863 Program of China (2009AA01Z437); National Grand Fundamental Research 973 Program of China (G2007CB311101); National Natural Science Foundation of China (60703014);Program for New Cen- tury Excellent Talents in University (NCET-07-0245); The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation. 8. References [1] B. H. Wei, F. Gilles and C. Franck, “Collaborative Data Distribution with BitTorrent for Computational Desktop Grids,” Proceedings of the 4th International Symposium on Parallel and Distributed Computing, Lille, 4-6 July 2005, pp. 250-257. [2] N. Parvez, C. Williamson, M. Anirban and N. Carlsson, “Analysis of Bittorrent-Like Protocols for On-Demand Stored Media Streaming,” Proceedings of the ACM SIG- METRICS International Conference on Measurement and Modeling of Computer Systems, Annapolis, 2-6 June 2008, pp. 301-312. [3] F. Jarret, P. Michael, P. J. John, K. Arvind and A. Tho- mas, “Profiling a Million User DHT,” Proceedings of the 7th ACM SIGCOMM Conference on Internet Measure- ment, San Diego, 24-26 October 2007, pp. 129-134. [4] Emule, 2004. http://www.emule-project.net [5] Gnutella, 2003. http://rfc-gnutella.sourceforge.net [6] Azureus, 2010. http://azureus.sourceforge.net [7] L. Guo, S. Chen, X. Zhen, E. Tan, X. Ding and X. Zhang, “Measurements, Analysis and Modeling of BitTorrent-Like Systems,” Proceedings of the 5th ACM SIGCOMM Con- ference on Internet Measurement, Berkeley, 19-21 Octo- ber 2005, pp. 35-48. [8] D. Qiu and R. Srikant, “Modeling and Performance Analy- sis of BitTorrent-Like Peer-to-Peer Networks,” Proceed- ings of the ACM SIGCOMM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Portland, 30 August-3 September 2004, pp. 367-378. [9] T. Locher, P. Moor, S. Schmid and R. Wattenhofer, “Free Riding in BitTorrent is Cheap,” Proceedings of Hot- Nets-V, Irvine, 29-30 November 2006. [10] M. Piatek, T. Isdal, T. Anderson, A. Krishnamurthy and A. Venkataramari, “Do Incentives Build Robustness in BitTorrent?” Proceedings of 4th USENIX Symposium on Networked Systems Design & Implementation, Cambridge, 11-13 April 2007, pp. 1-14. [11] S. Jun and M. Ahamad, “Incentives in BitTorrent Induce Free Riding,” Proceedings of the ACM SIGCOMM Work- shop on Economics of Peer-to-Peer Systems, Philadelphia, 22-26 August 2005, pp. 116-121. [12] A. Bellissimo, B. N. Levine and P. Shenoy, “Exploring the Use of BitTorrent as the Basis for a Large Trace Re- pository,” Technical Report, University of Massachusetts, Amherst, 2004. [13] J. A. Pouwelse, P. Garbacki, D. H. J Epema and H. J. Sips, “The Bittorrent P2P File-Sharing System: Meas- urements and Analysis,” Proceedings of the 54th Interna- tional Workshop on Peer-to-Peer Systems, Ithaca, Vol. 3640, 24-25 February 2005, pp. 205-216. [14] H. Wang, J. Liu and K. Xu, “On the Locality of BitTor- rent-Based Video File Swarming,” Proceedings of the 8th USENIX International Conference on Peer-to-Peer Sys- tems, Boston, 21 April 2009, pp. 1-6. [15] G. Neglia, G. Reina, H. Zhang, et al., “Availability in BitTorrent Systems,” Proceedings of 26th IEEE Interna- tional Conference on Computer Communications, An- chorage, 6-12 May 2007, pp. 2216-2224. [16] M. Izal, G. Urvoy-Keller, E. W. Biersack, P. A. Felber, A. Al Hamra and L. Garcés-Erice, “Dissecting BitTorrent: Five Months in a Torrent’s Lifetime,” Proceedings of 5th International Passive and Active Measurement Workshop, Antibes Juan-les-Pins, 19-20 April 2004, pp. 1-11. [17] D. Stutzbach and R. Rejaie, “Understanding Churn in Peer-to-Peer Networks,” Proceedings of the 6th ACM SIGCOMM Conference on Internet Measurement, Rio de Janeiro, 25-27 October 2006, pp. 1-13. [18] P. A. Felber and E. W. Biersack, “Self-Scaling Networks for Content Distribution,” Proceedings of the Interna- tional Workshop on Self-Star Properties in Complex In- formation Systems, Bertinoro, 31 May-2 June 2004, pp. 1-4. [19] A. R. Bharambe, C. Herley and V. N. Padmanabhan, “Analyzing and Improving a BitTorrent Networks Per- formance Mechanisms,” Proceedings of 25th IEEE In- ternational Conference on Computer Communications, Barcelona, 23-29 April 2006, pp. 1-12. [20] C. Gkantsidis and P. R. Rodriguez, “Network Coding for Large Scale Content Distribution,” Proceedings of 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Miami, Vol. 4, 13-17 March 2005, pp. 2235-2245. [21] A. Legout, G. Urvoy-Keller and P. Michiardi, “Rarest First and Choke Algorithms are Enough,” Proceedings of 6th ACM SIGCOMM Conference on Internet Measure- ment, Rio de Janeiro, 25-27 October 2006, pp. 203-216. [22] C. Dale and J. Liu, “A Measurement Study of Piece Population in BitTorrent,” Proceedings of IEEE Global Telecommunications Conference, Washington DC, 26-30 November 2007, pp. 405-410. [23] Free Peers, Inc., “BearShare Network Statistics,” 2005. http://www. bearshare.com/stats/ |