BitTorrent: An Educational Autopsy of the Hydra

The Disclaimer comes first 😉

The following post is a ton of stuff I have collected over the last year or two on BitTorrent and its implications for educational institutions. This will all be fodder for an article in the EDUCAUSE Quarterly I have been promising for months, and I found the only way to get it out was to blog it ’cause I’m a sick addict. If you are brave enough to read through it all (it’s frighteningly long), feedback, some healthy peer review, additional sources, and perhaps a public flogging would be greatly appreciated. I am not an expert on BitTorrent, nor on much of anything for that matter, but I have been fascinated with the technology for years now, and have always wondered how educational institutions can, for the most part, categorically deny or ignore forms of distributed sharing that are currently delivering media more effectively and efficiently than anything else out there. Our organizational love affair with corporate solutions like iTunesU is in many ways symptomatic of our refusal to truly explore and examine how the means of distribution and production are changing more radically than we are willing (not able) to acknowledge and adapt to.

What is BitTorrent?

BitTorrent is a peer-to-peer file sharing protocol developed by Bram Cohen in 2001 and designed to share large files efficiently by breaking a regular file into several different pieces which are each independently exchanged within a network of peers. The dispersed logic of such a system enables the delivery of large files without any one website or host incurring the entire costs of hardware, hosting, and bandwidth resources. In fact, BitTorrent does not depend upon a sole origin for a file, but rather a loosely joined network of file ‘bits’ that are simultaneously downloaded from and uploaded to various peers’ computers. The more people sharing a particular torrent file (which has a ‘.torrent’ extension), the more distributed it becomes and the faster it can be downloaded. All of which results in a more efficient and effective use of resources because the bandwidth needed to serve the file centrally is now spread amongst those who are downloading it.
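To make the idea of independently exchanged pieces a bit more concrete, here is a minimal Python sketch. It is an illustration only, not a real client: the function names and the 256 KiB piece size are my own assumptions, though the original protocol specification does describe a SHA-1 hash per piece, which is what lets a downloader check each piece on its own no matter which peer sent it.

```python
import hashlib

PIECE_SIZE = 256 * 1024  # assumed piece length; real torrents choose their own


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list[bytes]:
    """Cut the payload into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list[bytes]) -> list[str]:
    """One SHA-1 digest per piece, so each piece can be verified independently."""
    return [hashlib.sha1(piece).hexdigest() for piece in pieces]


def reassemble(received: dict[int, bytes], hashes: list[str]) -> bytes:
    """Rebuild the file from pieces that arrived in any order, checking each hash."""
    for index, expected in enumerate(hashes):
        if hashlib.sha1(received[index]).hexdigest() != expected:
            raise ValueError(f"piece {index} failed verification")
    return b"".join(received[index] for index in range(len(hashes)))
```

Because every piece carries its own checksum, it does not matter which peer a piece came from or in what order the pieces arrive; a corrupted piece is simply rejected and can be requested again from someone else.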

How does it work?

Given that BitTorrent is a communications protocol, it has a specific set of standards for tracking, sending and receiving information over the internet. Using BitTorrent can be broken down into two categories: 1) creating and publishing a torrent file and 2) downloading torrents and sharing files.

1) Creating and publishing a torrent file

To create and publish a torrent file (also known as seeding) you will need a BitTorrent client. There are a number to choose from, such as Vuze (formerly Azureus, and now billing itself as an “Open Entertainment Platform”), uTorrent, BitComet, and Transmission, to name just a few. Once you download a client, you can create a new torrent and associate it with a centralized tracker that will allow others to find it and initiate the process of sharing pieces of the file amongst and between various users. Peers that have a complete copy of the file and still offer it for upload are called seeders.

2) Downloading torrents and sharing files

To download torrents and share files you will also need a BitTorrent client, but unlike seeding, you simply have to find a torrent file online and then open it. The client then connects to the tracker specified in the torrent file, from which it receives a list of peers currently sharing pieces of the torrent file. The client connects to those peers and begins downloading various parts of the file. Such a group of peers connected to each other while sharing a torrent is called a swarm. Individuals that download far more than they upload are known as leeches, and they generally have a negative effect on such a network. For more BitTorrent terminology see this article on Wikipedia.
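The vocabulary above (tracker, swarm, seeder, leech) can be sketched as a toy model in Python. This is a hedged illustration and not the real tracker protocol: `ToyTracker`, `Peer`, the `announce` method, and the 0.5 upload/download cutoff for calling someone a leech are all hypothetical names and values I made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    uploaded: int = 0       # bytes sent to others
    downloaded: int = 0     # bytes received from others
    complete: bool = False  # True once the peer holds every piece

    def role(self, leech_ratio: float = 0.5) -> str:
        """Label a peer with the post's vocabulary; the 0.5 cutoff is an assumption."""
        if self.complete:
            return "seeder"
        if self.downloaded > 0 and self.uploaded / self.downloaded < leech_ratio:
            return "leech"
        return "peer"


@dataclass
class ToyTracker:
    # Maps a torrent identifier to the swarm (list of peers) sharing it.
    swarms: dict[str, list[Peer]] = field(default_factory=dict)

    def announce(self, torrent_id: str, peer: Peer) -> list[Peer]:
        """Register the peer and return the other peers already in that swarm."""
        swarm = self.swarms.setdefault(torrent_id, [])
        others = [p for p in swarm if p is not peer]
        if not any(p is peer for p in swarm):
            swarm.append(peer)
        return others
```

The point of the sketch is that the tracker never touches the file itself. It only introduces peers to one another; the pieces then move directly between the members of the swarm.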

Changing IT Infrastructures and the Lesson of BitTorrent

The summary definition of BitTorrent on Wikipedia is rather instructive. Not only does it trace the implications of this technology for sharing hardware, hosting, and bandwidth resources, but it also subtly suggests why it has been outlawed by a majority of the universities and colleges around the US.

BitTorrent is a method of distributing large amounts of data widely without the original distributor incurring the entire costs of hardware, hosting, and bandwidth resources [emphasis mine]. Instead, when data is distributed using the BitTorrent protocol, each recipient supplies pieces of the data to newer recipients, reducing the cost and burden on any given individual source, providing redundancy against system problems, and reducing dependence on the original distributor [emphasis mine].

Of particular interest here is how such a description of BitTorrent intersects on certain points (which I have emphasized above) with larger conversations and concerns campus IT organizations are currently having regarding the increasingly prohibitive costs of owning, maintaining, and monitoring data services locally. In fact, this is an issue with much larger scope that is not limited to the education sector by any means. Much of this is a result of our particular moment wherein a plethora of externally hosted options provide college communities the same, if not better, services with far more storage space. And all of this at a fraction of the cost. For some campus IT shops in the business of supporting themselves financially, or even making money, the risks of not going in such a direction are much more dire. The recent news that the University of Washington’s IT department will be laying off 15% of their staff speaks directly to this. In fact, a number of schools have already begun offloading IT staples such as file storage and email to externally hosted solutions. Arizona State University was one of the first large universities to do this in a deal with Google back in the Fall of 2006, and it is a trend we will continue to see much more of in the coming months and years, particularly as budgets shrink and the economy continues to tank.

So given the changing nature of campus IT strategies, why aren’t more campuses exploring the distributed logic of BitTorrent as a self-sustaining network for sharing large files (particularly media) by using various computers throughout campus (or even amongst campuses) that would save everyone on hardware, hosting, and bandwidth costs? Not to mention how a decentralized network like this safeguards against a single point of failure. It would seem that such possibilities for sharing large media files and applications would be an integral part of any campus’s strategic technology plan. Yet BitTorrent technology is all but outlawed on most college campuses across the country. And while there may be several reasons, the most obvious and seemingly insurmountable is the fact that P2P applications provide, to return to the Wikipedia article on BitTorrent, “freedom from the original distributors.” So rather than exploring BitTorrent as a possible solution to clogged bandwidth (an increasingly precious commodity) and the centralized, inefficient serving of large files, campuses have been transformed into full blown legal war zones, fighting entertainment industry interests over copyrighted material and the question of circumventing traditional means of distribution through a decentralized network. The pressure from that fight has pushed most universities to block or tightly control any and all peer-to-peer file sharing on campus.

Universities, P2P, and the Internecine War over Media Distribution

What could be more uncomfortable for higher education IT departments than serving as the copyright police for entertainment industry interest groups like the Recording Industry Association of America (RIAA) or the Motion Picture Association of America (MPAA)? Yet, if you look at the current legal and political landscape of copyright on campus, institutions are increasingly being asked to police, report, and share personal information related to individuals illegally downloading files on campus. As Kenneth C. Green’s Web Seminar for EDUCAUSE last December, “Swiftboating Higher Education on P2P: Why Higher Education Is Not the Real Problem, and Technology Is Not the Real Solution,” suggested, rather than categorically blocking or vilifying peer-to-peer applications like BitTorrent, we need to do education and outreach, and frame a broader, informed discussion about the implications of this technology that does not reduce it simply to a means of breaking the law. For, if we think long and hard about BitTorrent, much of the fear and terror aimed at college students by the entertainment industry has everything to do with the power behind this technology as an open, free, and distributed platform for educational resources. The fact that some may use it to break copyright should not be the primary logic steering the conversation (if we can even call it that!).

Moreover, recent studies from the University of Washington have suggested a couple of interesting points about both the figures for illegal downloading and the degree of accuracy with which the entertainment industry tracks such activity on college campuses around the country. As to the overstated estimates of college students illegally downloading on campuses around the US, Inside Higher Ed reports:

The [MPAA] often notes that according to a 2005 study it commissioned, 44 percent of the money the industry lost within the United States that year was attributable to peer-to-peer file sharing by college students. It now appears that the figure was closer to 15 percent, or $243 million. Mark Luker, a vice president at Educause, an organization promoting technology use in higher education, said the numbers reflected college students both on and off campus even though college Internet service providers, the target of pressure from both Congress and the MPAA to step up anti-piracy efforts, typically only serve on-campus residents. It would be “reasonable,” Luker said, to divide the MPAA numbers by five, since about a fifth of college students live on campus, leaving the figure somewhere around 3 percent of domestic losses.
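The arithmetic in that quote is easy to check. The short Python sketch below uses only the numbers quoted above (15 percent, $243 million, and Luker’s divide-by-five); the helper name `scale_share` is made up for the illustration, and the implied total is my own back-of-envelope derivation that assumes the $243 million really does correspond to the 15 percent share.

```python
def scale_share(share: float, dollars: float, fraction: float) -> tuple[float, float]:
    """Scale a loss share and its dollar amount by the same fraction."""
    return share * fraction, dollars * fraction


revised_share = 0.15        # corrected share attributed to college students
revised_dollars = 243e6     # dollar figure quoted alongside the 15 percent
on_campus_fraction = 1 / 5  # Luker's rough estimate of students living on campus

on_campus_share, on_campus_dollars = scale_share(
    revised_share, revised_dollars, on_campus_fraction
)
# Derived, not quoted: total domestic losses implied by the two revised numbers.
implied_total = revised_dollars / revised_share

print(f"on-campus share: {on_campus_share:.1%}")                  # 3.0%
print(f"on-campus losses: ${on_campus_dollars / 1e6:.1f} million")  # $48.6 million
print(f"implied total: ${implied_total / 1e9:.2f} billion")         # $1.62 billion
```

In other words, by the MPAA’s own corrected numbers, the traffic that campus networks could even theoretically police works out to somewhere under $50 million of a roughly $1.6 billion domestic loss estimate.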

Now take these real figures (as opposed to the insanely inflated figures of the MPAA) and run them through a recent study conducted by the University of Washington, “Tracking the Trackers: Investigating P2P Copyright Enforcement,” which suggests that the entertainment industry’s warnings are often targeting innocent users. According to the report, which is “the first scientific, experimental study of monitoring and copyright enforcement on P2P networks,” they found the following disturbing results:

  • Practically any Internet user can be framed for copyright infringement today.
    By profiling copyright enforcement in the popular BitTorrent file sharing system, we were able to generate hundreds of real DMCA take down notices for computers at the University of Washington that never downloaded nor shared any content whatsoever. Further, we were able to remotely generate complaints for nonsense devices including several printers and a (non-NAT) wireless access point. Our results demonstrate several simple techniques that a malicious user could use to frame arbitrary network endpoints.
  • Even without being explicitly framed, innocent users may still receive complaints.
    Because of the inconclusive techniques used to identify infringing BitTorrent users, users may receive DMCA complaints even if they have not been explicitly framed by a malicious user and even if they have never used P2P software!
  • Software packages designed to preserve the privacy of P2P users are not completely effective.
    To avoid DMCA complaints today, many privacy conscious users employ IP blacklisting software designed to avoid communication with monitoring and enforcement agencies. We find that this software often fails to identify many likely monitoring agents, but we also discover that these agents exhibit characteristics that make distinguishing them straightforward.

Given the results (and you can download the pdf version of the entire study here), the level of error and inaccuracy is so high that the adjusted figure, in which institutions of higher education account for roughly 3% of the domestic losses to the entertainment industry, may itself be greatly inflated. That makes colleges and universities an even smaller fraction of the overall loss sustained by these companies, yet they continue to mercilessly target educational institutions with take down notices and “pre-litigation letters,” which according to several sources have been coming with greater frequency over the last couple of months. One needs to ask why. And while I have a few theories, none of them really accounts for the bizarre fact that the greatest distribution crisis they will ever face has provoked them to further alienate, criminalize, and attack the very population that represents their greatest consumer base. It seems counter-intuitive, yet in many ways remains in line with the fear and terror tactics that have characterized the spirit of both domestic and international relations over the past eight years in this country. Moreover, if the recently proposed copyright Bill C-61 in Canada is any indicator, then the impact of US copyright interests is beginning to reverberate internationally with real and severe consequences for open education.

What is most frustrating about these draconian measures to control copyright, supported and funded by entertainment industry interest groups, is that they are framed as matters of economics, morality, and legality when in fact the bigger question is one of changing models of production, distribution, and consumption. Matt Mason’s “The Pirate’s Dilemma” deals with some of the larger questions surrounding the changing business landscape and how current models employed by a number of industries, including the entertainment industry, are outmoded, reactive, and ultimately doomed:

It’s hard for large organizations that move at glacial speeds to compete with individuals taking their content and creating new distribution systems, revenue streams and business models, but the fall of the major record labels taught the rest of the corporate world a lesson. In many cases, piracy it is helping people to innovate and create new legitimate market spaces. Link.

And this:

The Internet is in its infancy. Electronic information still travels along copper wires left over from the industrial revolution, but the information age is about to hit puberty. Fiber optic cables are sprouting in unexpected places. The piracy and chaos we are collectively experiencing is growing pains. Link

Pirates as economic pioneers that help frame the future of legitimate market spaces? The entertainment industry should be exploring the implications here and partnering with universities rather than alienating students and commandeering IT departments as local soldiers in their war against the future. And if they don’t, why should higher education institutions around the country abandon using these technologies to develop their own means of media production and distribution?

Cutting-edge BitTorrent Research in the Ivy League

Harvard University did something last year that seems almost insane given the increasingly aggressive legal landscape associated with copyright, the DMCA, and draconian state legislation in regards to file sharing. They have researched, designed, and released their own BitTorrent client called Tribler that they are encouraging their campus community to use! In fact, like Vuze, Tribler is a social-based BitTorrent client that not only makes Internet TV easier and sharing files throughout the campus more effective, but encourages interaction and connection. To quote TorrentFreak on the issue of Harvard’s renewed interest in BitTorrent research:

Yes, Harvard, the richest University in the world recently started a new line of P2P research. They have an army of law professors to protect them, so unlike others, they must feel safe to do this controversial research in the land of the free and the home of the RIAA/MPAA.

The Harvard project is all about a fresh new approach…

The Harvard researchers are currently working on one of the hardest P2P problems, ensuring uploads. P2P dies or thrives depending on how much upload people donate. By introducing electronic “currency” for uploads they think they can make P2P HDTV Video on Demand possible.

In fact, Harvard University is doing what the entertainment industry has failed to do for close to a decade now, namely explore the possibilities of this powerful and effective media delivery platform in order to make downloading and sharing movies, TV shows, documentaries, large images and music faster and easier. Ease of use is a crucial factor being ignored by media corporations, as Chris Soghoian, using the example of NBC’s defection from iTunes last year, makes quite clear in his article TV Torrents: When ‘Piracy’ is easier than legal purchase. Yet, beyond ease, the questions of legality and compensation still loom large, and Harvard is using the Tribler client as a way to explore bandwidth as currency for P2P file sharing in an attempt to wrestle with the economic issues that have been the root of the legal and political fear and terror campaigns launched by the entertainment industry against technologies like BitTorrent. In fact, Harvard is using Tribler as a practical implementation of mechanism design theory, the work that won the Nobel Prize in Economics last year, to see if it can be useful in motivating people to share. Here’s a bit more from this TorrentFreak post on the subject:

A lot of people probably wonder how an economical theory can improve the performance of a BitTorrent client, Pouwelse explains: “A structured scientific advancement of P2P file sharing was really lacking. With Mechanism Design we can go beyond the current trial-and-error methodology. We are working on a mechanism design based solution for all 9 elementary actions in P2P by using a distributed reputation system and mechanism that does not degrade to a single shot prisoners dilemma, such as BitTorrent tit-for-tat”

What Pouwelse is basically saying is that the mechanism design theory will be used to improve download speed and to make sure that content will be available for the long run, even when it’s not really popular. This is especially useful in BitTorrent streaming solutions where the incentive to keep sharing is relatively low.

The Nobel-powered BitTorrent/P2P client supports both regular .torrent downloads, but can also be used to stream videos from YouTube and Liveleak. As we reported earlier, the client also enhances the standard tit-for-tat BitTorrent algorithms with a so called give-to-get algorithm where bandwidth is used as a currency.
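To give a rough feel for what “bandwidth as currency” could mean in practice, here is a toy Python sketch. To be clear, this is not Tribler’s give-to-get algorithm or anything taken from the Harvard research, the details of which I don’t know; `BandwidthLedger` and its methods are hypothetical names, and the rule (uploading earns credit, downloading spends it, and whoever holds the most credit gets served first) is just my simplest guess at the general idea.

```python
from collections import defaultdict


class BandwidthLedger:
    """Toy ledger: uploading earns credit, downloading spends it."""

    def __init__(self) -> None:
        # Peers that have never transferred anything start at zero credit.
        self.credit: dict[str, int] = defaultdict(int)

    def record_transfer(self, uploader: str, downloader: str, nbytes: int) -> None:
        """Move nbytes of credit from the downloader to the uploader."""
        self.credit[uploader] += nbytes
        self.credit[downloader] -= nbytes

    def pick_recipients(self, requesters: list[str], slots: int) -> list[str]:
        """Serve the requesters with the most credit first (ties keep request order)."""
        ranked = sorted(requesters, key=lambda peer: self.credit[peer], reverse=True)
        return ranked[:slots]
```

Under a rule like this, a peer that only ever takes sinks to the bottom of the queue, while a peer that keeps sharing an unpopular file builds up credit it can spend later. That is the kind of incentive the quote above is gesturing at for keeping content available over the long run.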

Bandwidth as currency, reputation systems, and a give-to-get algorithm suggest something that all of us already know in one abstract form or another. The future of media is now and it isn’t CDs and DVDs, it is a distributed network of sharing between and amongst a large number of people. And while BitTorrent may not necessarily be the long-term answer to such a question, it seems highly likely that some form of peer-to-peer file sharing will be at the heart of such a future.

In fact, Harvard is not alone with such research. The Computer Science department at Cornell University has been working on a project called Cubit, which they are describing as a peer-to-peer overlay. In other words, while BitTorrent is currently a decentralized means of sharing files, the information about torrent files is still indexed and aggregated centrally by sites such as the Pirate Bay. Cubit promises a “good quality, approximate keyword searching directly through BitTorrent networks, a truly decentralized system that doesn’t rely on aggregators,” allowing direct searches for files apart from any centralized tracking service. That is closer in spirit to fully decentralized search networks like Gnutella than to Napster, whose late 90s model still relied on a central index. Hence the recent suggestions from around the web that sites like the Pirate Bay may become obsolete.

Educational Applications and the Future of Now

I have spent some time thinking about this one, and I am still hard pressed to articulate specific use cases, because the real power behind BitTorrent, and peer-to-peer more generally, for educational institutions would be to frame their own distributed media production and distribution spaces. In fact, this could provide a far wider network of resources shared not only within a particular campus, as we see with Harvard, but also amongst several campuses. As BitTorrent clients become increasingly “social,” adding the ability to follow and share with friends, a truly wide net of public domain resources from the Internet Archive, The Library of Congress, and various other archives can be distributed faster and more effectively, all the while saving institutions the outlandish costs of high end media servers. Think of a kind of cooperative between a wide range of campuses that may afford a way to network and share around the ever important, and increasingly dominant, role of media in education.

When faced with a moment of flux and change we have a couple of options: let the corporations and politicians dictate the terms of our future, or get familiar with the technologies, become active in the discussion, and push back with both the technologies and their possibilities so that we have a clear idea of what we have to lose. The educational applications of BitTorrent are so meager as of now because so few schools allow the technology to operate freely; in order to understand it we must be allowed to use it. And in using it, we may very well discover why billion dollar industries have chosen to push so hard to criminalize this technology until they can figure out how to monetize it, and by then it may be too late for everyone.

12 Responses to BitTorrent: An Educational Autopsy of the Hydra

  1. Alan Levine says:

    Love the Hydra, Reverend. I was mulling over this a year or so ago, thinking it might be a means for sharing high quality video files, particularly DV versions of digital stories. As I asked around my edtech network, it pretty much was a resounding “fuggedaboudit” based on the fact that almost no university or college is going to allow network access to the client.

    Fascinating stuff on Harvard’s taking the lead on this; I often wonder if open courseware would ever have gotten off the blocks without MIT’s lead there.

    I do think you need to identify, dream, propose some educational use cases, even hypothetical– where besides video content would it be efficient to share very large files P2P? Large data sets? Complex 3D models?

    And it also bears some reflection IMHO on the need for the content to be “desired” by more than a few people– how many seeds might something need to make the torrent effect beneficial?

  2. Ernesto says:

    Great read, and thanks for the linkage 😉

  3. Reverend says:

    Alan,

    Thanks for the feedback, I think you’re totally right on both points. I need to dream up some test cases, and the idea of popularity is key to the effective use of BitTorrent. Fact is if there is no critical mass, then all my indignation becomes kinda moot.

    Also, I remember you asking around about BitTorrent last year, and I know for a fact that it would be impossible for us here at UMW. I guess I’m interested in this technology as a kind of HDTV for campus based media, for example U Penn’s mashup contest, or some homegrown media/production studio. I guess it would make sense for a host of reasons, and sharing NMC video files is certainly one. I’ll sleep on these possibilities and more, and hopefully follow up with some ideas like universities torrenting the Internet Archive’s entire Prelinger collection. 🙂

    @Ernesto
    Well thank you, it is your torrents of great reportage on all things BitTorrent that have kept me abreast of so many of these issues.

  4. Brian says:

    These are a couple monster posts. EQ won’t know what hit it.

  5. Johan says:

    Thnx for the kind words on our bandwidth-as-a-currency research.

  6. Reverend says:

    @Brian
    You were right, the article was nixed, and for good reason—I just don’t have the objective overview of bitTorrent fleshed out enough. And it would be abusive to expect an editor to go through this frothing madness to make sense of, so I guess the bava is officially my vanity press 🙂

    Johan,

    It is great stuff, look forward to your reports on how it’s working out.

  7. Dave says:

    “why aren’t more campuses exploring the distributed logic of BitTorrent as a self-sustaining network for sharing large files (particularly media) by using various computers throughout campus (or even amongst campuses) that would save everyone on hardware, hosting, and bandwidth costs?”

    Beside the scare-mongering by media companies?

    You never touch on the fact (or else I tl;dr’d) that, in voluntary BitTorrent networks, the network depends on demand for a file. If I want to download a large file, but I’m only the first person to download it and no one else joins me, then there really aren’t any benefits to using BT versus downloading it straight from the uploader. That problem really takes 3-4 high-speed nodes to overcome. If the files you want to offer on BT aren’t in reasonable demand, there’s no point. (If you have a mandatory BT, like some online games use to distribute uploads or like what I’m expecting the Harvard team is looking at for HDTV, you’ll have more reliable connections and you can force the nodes to share content well past a 1:1 ratio).

    Also, within a geographically small, closed network, it still seems more practical to upgrade the network rather than spend lots of man hours. (see http://thedailywtf.com/Articles/That-Wouldve-Been-an-Option-Too.aspx )

  8. Pingback: edgeek » Blog Archive » Tribler

  9. Jason Priem says:

    I love the idea of seeing P2P as an asset instead of a liability. I’m reminded of an article I just read about an Australian effort to do the same thing at a much larger scale: a worldwide network of distributed video hosting, supported by millions of “TiVo-sized boxes” in people’s homes. They see the main benefit of the project as saving millions on data center cooling bills. The great thing about using torrent hosting, though, is that of course you don’t need to distribute any hardware.

    It will be interesting to see how well these anti-leech strategies work…it’s a big difference between encouraging participation in a self-selected community and getting enough _reliable_ participation to get dependable hosting. I wonder if we’ll hear someone suggest the stick rather than the carrot: students using institutional networks are required to download sharing software that automatically connects to the university’s torrent network and hosts a certain number of files.

  10. Geoff Martin says:

    The BBC in the UK has been using P2P in its iPlayer technology very effectively for about a year now. All BBC broadcasts are freely downloadable via the iPlayer P2P network for seven days after airing. I imagine other broadcasters elsewhere are doing the same.

    I guess my point here is that not all big corporations are anti peer-to-peer. Given time this technology will start to be used widely and we’re already seeing the beginnings of this.

  11. Reverend says:

    @Geoff,

    I didn’t know this, and it’s an excellent example of what @Dave brings up above. Allowing it for seven days kind of maximizes the activity around a file. It seems like a good model, and it is one I will have to investigate further. Appreciate the info.

    @Jason
    I am with you, I like the idea of re-thinking BitTorrent because thanks to the large media interest groups it has been nothing but vilified. As for the anti-leech strategies, I am fascinated by the idea of the currency model at Tribler and will be watching that project closely.

    @Dave
    Your point is a very good one, and I did kind of gloss over it here. I have to clean this up, it was really like a warehousing of all the information I collected online over the last year or so as a way to get more familiar with the issues surrounding BitTorrent. And links like the one you provided just make me that much more informed, which is definitely my goal with this post. Thanks.

  12. Pingback: fak3r » Distributing biodiversity data globally
