Author Topic: Pipelining?  (Read 2743 times)

Offline taemun

  • Contributor
  • ***
  • Posts: 16
Pipelining?
« on: December 12, 2010, 02:39:01 pm »
I'm not sure if Alt.Binz already does this (the internets doesn't seem to know), but if it doesn't....

Pipelining for downloads would be awesome. This means that the next part (section of a file; block) is requested before the current one has finished downloading.

It is allowed by RFC3977: http://tools.ietf.org/html/rfc3977#section-3.5

This has the potential to be very advantageous for people with high-latency, high-bandwidth connections.

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: Pipelining?
« Reply #1 on: December 12, 2010, 03:45:26 pm »
Latency would have to be pretty high for sending article requests in batches to have any real impact on overall speed with multiple threads, I would imagine. Perhaps it would help to describe a genuine use-case where bandwidth isn't being maxed and latency is the identified cause.

Offline taemun

  • Contributor
  • ***
  • Posts: 16
Re: Pipelining?
« Reply #2 on: December 12, 2010, 11:34:39 pm »
A thought exercise:

Assume 100mbit bandwidth and 400ms latency. The default block size on Usenet is 384,000 bytes, or 375KiB (obviously there are posts that go either side of this, but for the sake of argument...).

The 100mbit line can pull roughly 12,000 KiB/s. Downloading that 375KiB block will take 375/12,000 ≈ 31ms. Now we wait 400ms for our new request to go out and for the server to start responding. This is ignoring server-side response latency (I presume it takes a few ms for the server to find the block we want). So each block takes ~430ms to retrieve. Rate = 375KiB / 430ms ≈ 870KiB/s. We're getting pretty poor efficiency per thread.
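If anyone wants to play with the numbers, here's the same arithmetic as a few lines of Python (all values are just the assumptions above, nothing measured):

Code:
# Back-of-the-envelope: effective per-thread rate without pipelining.
# All figures are the assumed example values from the post, not measurements.
BANDWIDTH_KIB_S = 12_000   # ~100mbit line, in KiB/s
LATENCY_MS      = 400      # round-trip network latency
BLOCK_KIB       = 375      # typical article/block size

transfer_ms = BLOCK_KIB / BANDWIDTH_KIB_S * 1000   # ~31 ms on the wire
cycle_ms    = LATENCY_MS + transfer_ms             # wait + download per block
rate_kib_s  = BLOCK_KIB / (cycle_ms / 1000)        # effective per-thread rate

print(f"{transfer_ms:.0f} ms transfer, {cycle_ms:.0f} ms per block, "
      f"~{rate_kib_s:.0f} KiB/s per thread")
# -> about 870 KiB/s per thread on a 12,000 KiB/s line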

Now the naysayers out there are going to point out that most usenet providers allow 20 connections, and that this should iron out the bumps described above. This is partially true. In the real world, latency is somewhat higher than just the network latency (as I said, the server has to find the block), and even a decent 100mbit line won't ramp up to full speed over such a short transmission time.

Put it this way: I've noticed that when downloading files split into 750KiB chunks I get a significantly higher overall transfer rate compared with the normal 375KiB chunks. Since I can't directly affect what the uploaders are doing, I'm left asking for a solution here :P

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: Pipelining?
« Reply #3 on: December 12, 2010, 11:58:47 pm »
So, just to dig down a bit more, because your suggestion seems to have some merit and because I'm naturally curious although sans 100 Mbps: how significant is "significantly higher", and with how many threads?

To consider the practicalities: assuming that articles are being sent on more than one thread, and assuming that servers are configured to support pipelining in this way (i.e. not to ignore everything that follows an ARTICLE or BODY command) and that this can be detected by Alt.Binz, how many should be requested in a batch per thread? What should be the behaviour of a thread if an article can't be found in a batch, and how should this be displayed in the Connections panel?

I'm not sure how practical a new concept of 'batch-per-thread' rather than 'article-per-thread' would be for Alt.Binz, but Rdl will be able to comment on that. ;)
« Last Edit: December 13, 2010, 12:19:53 am by Hecks »

Offline taemun

  • Contributor
  • ***
  • Posts: 16
Re: Pipelining?
« Reply #4 on: December 13, 2010, 12:26:02 am »
I'd presume that the server-side support could be indicated by an option on the server configuration page (much like "Select group" and "ARTICLE command" are now). I'd suggest it default to on, and the user could disable it if needed.

I'd tentatively suggest batches of eight, continually pushing out a new request as soon as each response comes back. Of course, the number of "in-flight" blocks needed depends upon the latency and bandwidth. Taking the example I gave in my previous post, which had ~30ms "download latency" and 400ms network latency, this system needs 400/30 ≈ 13.3 blocks in flight in order to keep the line full (ideally, etc). Perhaps this could be a tunable in Settings.
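The same back-of-the-envelope sum as a sketch, using the example figures from my earlier post (nothing here is an actual Alt.Binz setting):

Code:
import math

LATENCY_MS  = 400   # round-trip latency from the example
TRANSFER_MS = 30    # time to pull one 375 KiB block at ~12,000 KiB/s

# To keep the line full, enough requests have to be outstanding to cover
# the round trip while earlier blocks are still arriving.
in_flight = LATENCY_MS / TRANSFER_MS
print(f"~{in_flight:.1f} blocks in flight, i.e. round up to {math.ceil(in_flight)}")
# -> ~13.3, so 14 outstanding requests for this particular line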

To be frank, I wasn't aware that a miss (block couldn't be found) *was* shown in the connection panel, only in the Log. A block miss should be dealt with as it is presently (retry later); I'm not sure why that would need to change.

Something else to take into account: downloading something with, say, 100 blocks when I have my articles-per-batch set to 20. If I have 20 connections open, the available blocks should be spread evenly over the connections, rather than five connections with 20 each and fifteen connections sitting idle.
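Roughly what I mean by spreading them evenly, as a toy sketch (the function name and numbers are just for illustration):

Code:
def distribute(blocks, num_connections):
    """Spread pending blocks round-robin over the available connections."""
    batches = [[] for _ in range(num_connections)]
    for i, block in enumerate(blocks):
        batches[i % num_connections].append(block)
    return batches

# 100 blocks over 20 connections -> every connection gets 5, none sit idle
batches = distribute(list(range(100)), 20)
print([len(b) for b in batches])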

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: Pipelining?
« Reply #5 on: December 13, 2010, 12:49:57 am »
Quote
I'd presume that the server-side support could be indicated by an option on the server configuration page (much like "Select group" and "ARTICLE command" are now). I'd suggest it default to on, and the user could disable it if needed.
I'm not even certain how universal the support for pipelining is. The RFC seems to indicate it's optional, and different commands may be supported by different servers. If Alt.Binz can't autodetect this, it seems safer to default to off.

Quote
To be frank, I wasn't aware that a miss (block couldn't be found) *was* shown in the connection panel, only in the Log. A block miss should be dealt with as it is presently (retry later); I'm not sure why that would need to change.
Well, it's more a question of what the Connections panel should show at all with a new concept of article batches. Currently it shows per-article request & response progress on each thread, with the progress meter presumably tied to the known byte value for the article. This may be a minor issue, really. Retries are made on different threads, if I recall correctly, and are of course timed, but in any case these should presumably be requested in batches too somehow.

Offline taemun

  • Contributor
  • ***
  • Posts: 16
Re: Pipelining?
« Reply #6 on: December 13, 2010, 01:08:57 am »
To quote from the RFC:
Quote from: RFC3977
Except where stated otherwise, a client MAY use pipelining.  That is, it may send a command before receiving the response for the previous command.  The server MUST allow pipelining and MUST NOT throw away any text received after a command.

So the RFC says that a client *can* and a server *must* support it. Now whether or not most modern usenet servers are in compliance with RFC3977, I have no idea.
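For what it's worth, at the protocol level pipelining is nothing more than writing the next command before reading the previous response. A rough Python sketch against a raw socket, with the hostname and message-IDs as placeholders (real code would also need authentication, TLS for most providers, and proper dot-unstuffing):

Code:
import socket

HOST, PORT = "news.example.com", 119                       # placeholder server
IDS = ["<id1@example>", "<id2@example>", "<id3@example>"]   # placeholder message-IDs

with socket.create_connection((HOST, PORT)) as s:
    f = s.makefile("rwb")
    print(f.readline().decode().strip())        # 200/201 greeting

    # Pipelining: send all the BODY commands up front...
    for mid in IDS:
        f.write(f"BODY {mid}\r\n".encode())
    f.flush()

    # ...then read the responses back in the same order.
    for mid in IDS:
        status = f.readline().decode().strip()
        print(mid, status)                       # e.g. "222 0 <id1@example>" or "430 no such article"
        if status.startswith("222"):
            # the body is terminated by a line containing only "."
            while f.readline().rstrip(b"\r\n") != b".":
                pass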

To be honest, for a high-speed connection the Connections panel is just a lot of text that changes constantly, and isn't really useful information.

Something else to consider: my provider (Astraweb) sometimes has issues where the server hard-disconnects on requests for certain articles. This is a really badly implemented article-not-found case, I think. When this happens, the batched articles would presumably be marked as "not found" as well, when they shouldn't be; it was only the latest block that caused the disconnection.

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: Pipelining?
« Reply #7 on: December 13, 2010, 02:04:57 am »
I was looking more at this:

Quote
However, certain server implementations throw away all text received from the client following certain commands before sending their response.  If this happens, pipelining will be affected [...]

But of course, you're right, that's what the RFC is intended to correct. :) We could start testing on different servers to see which actually implements it.

Retries: presumably these would remain per-article, but would be queued somehow for retrying as a batch, and the timings would need to be overridden, i.e. 'retry after X seconds' relaxed somewhat when a batch-full of articles to retry is 'ready'. Or just ignore pipelining for retries, since they're relatively rare.

Offline taemun

  • Contributor
  • ***
  • Posts: 16
Re: Pipelining?
« Reply #8 on: December 20, 2010, 07:30:26 am »
Quote
We could start testing on different servers to see which actually implements it.

Any ideas how to do this (without, obviously, a custom build of Alt.Binz)?
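The only thing that comes to mind is talking to a server by hand, either via telnet or a few lines of Python like the rough sketch below (the hostname is a placeholder, some providers will want AUTHINFO or SSL first, and this only exercises the trivial DATE case rather than pipelining after ARTICLE/BODY):

Code:
import socket

HOST, PORT = "news.example.com", 119   # put the provider to test here (placeholder)

with socket.create_connection((HOST, PORT), timeout=10) as s:
    f = s.makefile("rwb")
    print(f.readline().decode().strip())    # greeting

    # Two commands sent back-to-back, without waiting for the first response.
    f.write(b"DATE\r\nDATE\r\n")
    f.flush()

    # A server that honours pipelining answers both; one that discards
    # pipelined input will only answer the first (the second read times out).
    for _ in range(2):
        print(f.readline().decode().strip())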