How to Share Nicely

Journal started Aug 2, 2005


It has come to my attention that almost all file sharing applications are the same. Well, the same with respect to file transfer. They divide the file into blocks, and send the blocks in random order to different peers. The only distinction, though an important one, is how they find the peers that have the file you want to transfer. Every file sharing network is effectively a search engine, and nobody's figured out the best search engine. Actually it's usually two search engines: one to find the 'hash' of the file using meaningful data like "salad monster sex type:image", and one to find who has the file with that hash, and who's willing to share it with you. ed2k shortcuts the first search nicely, as a matter of fact, by allowing links that already contain the hash to be posted anywhere online. BitTorrent's .torrent files are effectively the same thing.
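To make the "divide into blocks, send in random order" part concrete, here is a minimal sketch in Python. The function names, the tiny block size, and the use of SHA-1 as the file hash are all my own assumptions for illustration; real networks each have their own block sizes and hashing schemes.

```python
import hashlib
import random

BLOCK_SIZE = 4  # assumption: absurdly small, just so the example is readable


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the file into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def file_hash(data: bytes) -> str:
    """Stand-in for the 'hash' that the second search engine looks up."""
    return hashlib.sha1(data).hexdigest()


def random_send_order(num_blocks: int, seed=None) -> list[int]:
    """Pick a random order in which to send the block indices."""
    order = list(range(num_blocks))
    random.Random(seed).shuffle(order)
    return order


def reassemble(received: dict[int, bytes]) -> bytes:
    """The receiver puts blocks back in index order, whatever order they came in."""
    return b"".join(received[i] for i in sorted(received))
```

The receiver can then hash the reassembled bytes and compare against the hash it searched for, which is how it knows it got the right file no matter which peers sent which blocks.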

One of the biggest problems I've seen with BitTorrent is the problem of bandwidth sharing across files. Well, okay, one of the biggest problems I've really seen is convoluted client software, but sharing across files too. The problem I perceived was that you can define the bandwidth to share for 1 file, but for another file you have to allocate a whole new block of bandwidth! If one file is set to upload at 1K/s, and nobody is requesting that file, then you have just effectively lost 1K/s of your bandwidth.

Modern clients, even the "official" Python client, are updating that, and allowing all shared files to draw bandwidth from a pool, so that if you have 20 files you want to share, you can devote all your bandwidth to one of them, if at the time the other 19 are not wanted. But I am seriously beginning to have second thoughts about this strategy. I am starting to sense some extremely wrong tendencies in such strategies. Here is what I've been noticing.
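The difference between the two schemes is easy to put in numbers. This is only a sketch under my own assumptions: the function names are made up, "demand" means how much upload peers would take for each file if you let them, and no real client necessarily accounts for bandwidth this way.

```python
def per_file_usage(caps: dict[str, float], demand: dict[str, float]) -> float:
    """Each file has its own cap; unused cap on one file cannot help another."""
    return sum(min(cap, demand.get(name, 0.0)) for name, cap in caps.items())


def pooled_usage(total: float, demand: dict[str, float]) -> float:
    """All files draw from one pool; any file with demand can use it."""
    return min(total, sum(demand.values()))
```

With two files capped at 1K/s each, where peers want 5K/s of the first and nothing of the second, the per-file scheme uploads 1K/s and the pooled scheme uploads 2K/s. The pool wastes nothing, which is exactly why it looks like an obvious win at first.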

Okay, so I fired up aMule, and pointed it at all my shared files. One of the things I had been sharing is the [url=http://www.ocremix.org]OC Remix[/url] torrents. Unfortunately those torrents are huge monolithic archives, and you can pretty much only share 1 at a time, unless you have huge amounts of bandwidth to devote. So instead of running BitTorrent, I decided to point aMule at all those lovely little music files. And then I watched my upload queue, and began to worry.

There are some files I would really like to share. I mean, they're really important data. Who could possibly go without seeing the dangers of John Kerry and rough gay wolf sex? So, one thing in aMule that's nice is you can prioritize your files. You can give them a higher or lower priority. Unfortunately I have no clue what this means, but I assume it means that people requesting the file will move up your queue faster than people requesting lower priority files, and also that you'll download higher priority files before lower ones, given the chance (but not the download bandwidth).

But even after setting the priority to high for my dearest shared files, I notice that 99% of all my uploads are nothing but those cheesy little OC remixes. And it occurs to me that file sharers would do well to remember the problems that that long-haired Jewish punk ran into when he started healing the lepers. The ingrates dive-bombed the bastard! Saying, "Heal us all" or some creepy zombie stuff.

How many files can we share? There are some files (like OC Remixes) that millions of people are going to be asking for all the time. There will never be a time when these files are not uploading as much as possible. Multiply this across the thousands of popular files you may have copies of and you have a problem. Just by random chance, the numerous low priority files still overwhelm the higher priority ones that you actually want to be known for sharing!
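A back-of-the-envelope model shows why priority alone doesn't save you. Everything here is an assumption of mine: I'm guessing that priority acts as a simple multiplier on how often a file's requests get served, and the request rates are invented, not measured. I don't actually know how aMule weighs its queue.

```python
def upload_shares(files: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Estimate each file's share of uploads.

    files maps a name to (requests_per_hour, priority_weight).
    Assumption: a file's share is proportional to rate * weight.
    """
    weighted = {name: rate * weight for name, (rate, weight) in files.items()}
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


# Hypothetical numbers: 100 popular remixes at normal priority,
# plus one file I care about, rarely requested, at triple priority.
files = {f"remix{i}": (10.0, 1.0) for i in range(100)}
files["valued"] = (1.0, 3.0)
shares = upload_shares(files)
```

Under those made-up numbers the valued file gets 3 parts out of 1003, well under 1% of uploads, even with its priority tripled. The sheer count of popular files swamps any reasonable priority weight.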

So it worries me. I'm beginning to think that the best strategy is to have only 2 files shared per client. That's right: one file that you value, and one file that is popular. And then set aside a block of bandwidth for each of these pairs, until you have no more bandwidth. And that's all you can do. I'm worried that trying to do more will not scale: even after we find the people who have the file we want, their queues may be so cluttered with other popular files that we'll wait longer for the download to start than for the download itself.
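Here is roughly what that planning step might look like, as a sketch only. The function name, the fixed block size, and the idea of simply walking two lists in order are my own assumptions; how you would actually pick which files to pair is an open question.

```python
def plan_pairs(valued: list[str], popular: list[str],
               total_bw: int, block_bw: int) -> list[tuple[str, str, int]]:
    """Pair one valued file with one popular file per client,
    giving each pair a fixed block of bandwidth until none is left."""
    pairs = []
    remaining = total_bw
    for v, p in zip(valued, popular):
        if remaining < block_bw:
            break  # out of bandwidth: everything else stays unshared
        pairs.append((v, p, block_bw))
        remaining -= block_bw
    return pairs
```

For example, with 10K/s total and 4K/s per pair, plan_pairs(["a", "b", "c"], ["x", "y", "z"], 10, 4) gives two pairs and leaves the third unshared. That leftover is the whole point: the plan refuses to share more than the bandwidth can actually serve.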

And that is why, I fear, 90% of the time files in ed2k networks are not downloading at all.



(cc) some rights reserved