Author Topic: prevent incompletes from rss  (Read 3415 times)

Offline mysteryman

  • Contributor
  • ***
  • Posts: 66
prevent incompletes from rss
« on: October 04, 2008, 11:33:58 pm »
Sometimes when using RSS, the feed will be ahead of your server. Even though you may get 70-80% of a collection, many times the missing parts are more than the available pars can recover. Usually the missing parts are spread across all the files equally (1-2 parts per file). Repairing this requires redownloading the entirety of the incomplete files until you have enough to par recover. This is a painfully boring procedure which wastes tons of bandwidth... and time; especially for x264.

This sometimes can be 'fixed' by ramping up retries really high, but that makes failover VERY slow on old articles that are not on your primary server. I had three ideas that might mitigate that effect:

1. If a part fails to the point of removal, instead put it at the end of the queue (for that collection) and reset the retry counter (remember the reset so it's only done once per article). That way, it is hopefully complete by the time the rest of the files complete. It might be good to enable this feature only when the articles are less than an hour (or a user-configurable time) old.

2. Recognize (either remember, or detect) downloaded ARTICLES/PARTS rather than files. In this case, you could simply download and import the nzb (or reimport from rss) and it would fix all files automatically. It could download only the article parts that are missing; it would NOT need to download the 80% that were completed already.

3. Automatically download and import the nzb file posted inside the collection. Sometimes the nzbs posted with the collection are more accurate than those posted on the web. Also, for things that generally get checked for rss... the nzb file is already indexed with what you download. This is especially useful in combination with #2...
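
Idea #1 could be sketched roughly like this in Python. This is only an illustration of the queue behaviour described above, not Alt.Binz code: `Article`, `process_collection`, `MAX_RETRIES`, `FRESH_WINDOW_SECS` and the `fetch` callback are all hypothetical names, and the one-hour window is the example value from the post.

```python
from collections import deque
from dataclasses import dataclass
import time

MAX_RETRIES = 3            # hypothetical per-article retry limit
FRESH_WINDOW_SECS = 3600   # only articles under an hour old get a second chance


@dataclass
class Article:
    message_id: str
    posted_at: float               # Unix timestamp of the post
    retries: int = 0
    second_chance_used: bool = False


def process_collection(articles, fetch, now=None):
    """Drain one collection's queue; return (downloaded, failed) message-id lists.

    `fetch(article)` stands in for a single download attempt and returns
    True on success.
    """
    now = time.time() if now is None else now
    queue = deque(articles)
    downloaded, failed = [], []
    while queue:
        art = queue.popleft()
        if fetch(art):
            downloaded.append(art.message_id)
            continue
        art.retries += 1
        if art.retries < MAX_RETRIES:
            queue.appendleft(art)      # normal retry: try again right away
            continue
        is_fresh = (now - art.posted_at) < FRESH_WINDOW_SECS
        if is_fresh and not art.second_chance_used:
            # Instead of removing the article, reset its counter once and
            # move it behind everything else in this collection.
            art.retries = 0
            art.second_chance_used = True
            queue.append(art)
        else:
            failed.append(art.message_id)
    return downloaded, failed
```

The `second_chance_used` flag is the "remember the reset" part: an article that is still missing after its second round is dropped as usual, so old articles fail over just as quickly as before.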

And also, altbinz should recognize that you are importing an article that already exists in queue... even if you have not downloaded it yet. This will prevent altbinz from thinking double the number of pars are available when importing the second time to complete in case of an incomplete nzb (rather than slow server).
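
A rough Python sketch of the article-level bookkeeping from idea #2 and the duplicate check just described. It assumes the client keeps the Message-IDs of segments it has already downloaded or queued; `segment_ids` and `segments_to_fetch` are made-up helper names, and the only thing taken from the NZB format itself is that each `<segment>` element carries one Message-ID as its text.

```python
import xml.etree.ElementTree as ET


def segment_ids(nzb_xml):
    """Return the set of segment Message-IDs listed in an NZB document (a string)."""
    root = ET.fromstring(nzb_xml)
    # Match on the local tag name so this works with or without an XML namespace.
    return {
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "segment" and el.text
    }


def segments_to_fetch(nzb_xml, already_downloaded, already_queued):
    """Segments from a re-imported NZB that are neither on disk nor already queued."""
    return segment_ids(nzb_xml) - set(already_downloaded) - set(already_queued)
```

With something like this, re-importing the same nzb would only queue the parts that are still missing, and pars that are already in the queue would not be counted twice.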

Offline cr4zyfr4g

  • Global Moderator
  • *****
  • Posts: 781
  • German n00b
Re: prevent incompletes from rss
« Reply #1 on: October 05, 2008, 12:58:55 am »
that is really complicated. there are options for some feeds to only get "complete" downloads, for example:

nzbindex.nl (general)
tvbinz(dot)net for tv

Offline mysteryman

  • Contributor
  • ***
  • Posts: 66
Re: prevent incompletes from rss
« Reply #2 on: October 05, 2008, 01:11:21 am »
'only complete' feeds may only solve the problem halfway... Though I haven't watched them happen, I am usually able to redownload the same rss entries later and they are complete. That would imply that the nzb was complete... but the server I downloaded from was incomplete. This only happens to me once or twice every few days... but it's a royal pain when it does.

Right now, I'm using newzleech, and I'm pretty happy with the update speed. Has anyone else had this problem on there? I'd like to know if it's my server (not updating fast enough) or the feed (incompletes).

Offline cr4zyfr4g

  • Global Moderator
  • *****
  • Posts: 781
  • German n00b
Re: prevent incompletes from rss
« Reply #3 on: October 05, 2008, 01:24:36 am »
see pm

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: prevent incompletes from rss
« Reply #4 on: October 05, 2008, 03:08:03 am »
Another possibility of course is to increase your RSS refresh time - say from 10 minutes to 30 minutes or more.  Not perfect, but will decrease likelihood of catching stuff not complete on your server.

Actually, for most purposes, RSS is better checked just once or twice a day, IMHO.  Autodownload makes most sense when, for example, grabbing overnight releases, in which case checking the feed once in the morning (depending on timezone) is usually enough.  I just do it once at 9.00am GMT, for example, and that's always worked well for me (needs a scheduled task though). For anything else, the Import as Paused option may be preferable.  I seem to remember an open request about adding that as a per-filter option.

Newzleech I would say is one of the least useful RSS feeds to use for this reason among others - it updates too quickly, and with too many results.  Really you need to decide what exactly you need autodownloading for, and pick the feed to suit.  There's a nice range at nzbs.org, organised by release category, and there's a relatively long delay on them to ensure completion (especially for x264).

There are some interesting ideas here about queue management, though, even if they are complex.  Personally, and for other reasons, I'd be +1 on the ability to expand the queue to see article parts for files.  I presume it's the way it is now because the yEnc decoder needs them all to be able to function, and this is the easiest way of ensuring that.

Let me add another request on the back of yours: expand the refresh intervals, preferably to allow setting specific times rather than just intervals.  So, perhaps a bit like setting a cronjob, using minutes, hours & days of week (0-6):

0,20,40 * * = would check every 20 minutes every day.

0 9 * = would check once at 9.00 am every day.

30 8 1,2,3,4,5,6 = would check once at 8.30 am every day but Sunday

Or something like that. Edit: would probably need to add a random number of minutes to prevent server hammering by everyone with the same settings. :)
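
For what it's worth, the three-field format above could be parsed along these lines. This is a minimal Python sketch, not real cron and not an existing Alt.Binz feature: it only handles `*` and comma lists, takes 0 as Sunday (as in the third example), and `parse_schedule`, `next_check` and `max_jitter_minutes` are made-up names.

```python
import random
from datetime import datetime, timedelta


def parse_field(field, lo, hi):
    """Expand '*' or a comma-separated list like '0,20,40' into a set of ints."""
    if field == "*":
        return set(range(lo, hi + 1))
    values = {int(v) for v in field.split(",")}
    if not all(lo <= v <= hi for v in values):
        raise ValueError(f"value out of range {lo}-{hi}: {field!r}")
    return values


def parse_schedule(spec):
    """Parse 'minutes hours days-of-week' (days 0-6, 0 = Sunday)."""
    minute, hour, dow = spec.split()
    return parse_field(minute, 0, 59), parse_field(hour, 0, 23), parse_field(dow, 0, 6)


def next_check(spec, after, max_jitter_minutes=0, rng=random):
    """First scheduled minute strictly after `after`, plus optional random jitter."""
    minutes, hours, days = parse_schedule(spec)
    t = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(8 * 24 * 60):  # a valid spec always matches within a week
        dow = (t.weekday() + 1) % 7  # datetime has Monday=0; the schedule has Sunday=0
        if t.minute in minutes and t.hour in hours and dow in days:
            return t + timedelta(minutes=rng.randint(0, max_jitter_minutes))
        t += timedelta(minutes=1)
    raise ValueError(f"schedule never matches: {spec!r}")
```

Calling `next_check("0 9 *", datetime.now(), max_jitter_minutes=10)` would give the next 9.00 am check pushed back by up to ten minutes, which is the anti-hammering randomisation mentioned in the edit.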
« Last Edit: October 05, 2008, 03:29:12 am by Hecks »

Offline mysteryman

  • Contributor
  • ***
  • Posts: 66
Re: prevent incompletes from rss
« Reply #5 on: October 05, 2008, 04:56:07 am »
I would rather change my retry count/time to higher numbers... and change them back to low when I need something old... it works, but gets kind of tedious (there should be two options, retries on new/old too).

The things I want to download with rss are time critical. I don't have a whole lot of time in the day. If I wait until morning to download, I won't see it until the following night. I watch shortly after it downloads, after patiently waiting. If I didn't watch it that night, someone the next day will say something like "hey did you see so-n-so do such-n-such on that show?" and I'll spontaneously explode... ok... maybe not, but it would really ruin the show, or whatever.

As far as too fast is concerned... Feeds should update as fast as resources allow. I wouldn't knock them for being too fast, as long as it picks up again with a second entry (which I believe is what happens). The client should figure it out and make it work. However, it WOULD be nice if they did not add until complete (with more AI than simply waiting 5 hours or whatever it is that nzbs.org does). There are part numbers listed in the subjects... they should be able to figure out when something is done.
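
As a rough illustration of the part-number idea, here is a small Python sketch. It assumes subjects end with a "(part/total)" marker, which is a common posting convention but not guaranteed, and it only judges files whose subjects were actually seen; a file missing entirely from the collection would go unnoticed. `missing_parts` and `is_complete` are made-up names.

```python
import re
from collections import defaultdict

# Trailing "(part/total)" marker, e.g. '"show.r00" yEnc (3/45)' (assumed convention).
PART_RE = re.compile(r"^(?P<name>.*)\((?P<part>\d+)/(?P<total>\d+)\)\s*$")


def missing_parts(subjects):
    """Map each file subject (minus the part marker) to its sorted missing part numbers."""
    seen = defaultdict(set)
    totals = {}
    for subject in subjects:
        m = PART_RE.match(subject)
        if not m:
            continue  # not a multipart subject; ignored in this sketch
        name = m.group("name").strip()
        seen[name].add(int(m.group("part")))
        totals[name] = int(m.group("total"))
    return {
        name: sorted(set(range(1, totals[name] + 1)) - parts)
        for name, parts in seen.items()
    }


def is_complete(subjects):
    """True when at least one file was seen and no seen file has missing parts."""
    missing = missing_parts(subjects)
    return bool(missing) and not any(missing.values())
```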

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: prevent incompletes from rss
« Reply #6 on: October 05, 2008, 06:50:43 am »
We're obviously in different timezones & realityzones.

As for calculating completion, yes NZBindex and tvbinz do that (but it's not as straightforward a matter as you might think when dealing with whole collections).

But obviously you expect 'them' to just do it all for you, rather than applying a little common sense at your end.  It all boils down to the fact that you're trying to download articles that haven't propagated to your server yet - RSS or not is kind of irrelevant to that.  These days, propagation is a matter of minutes for all but the worst providers, so really you must be sitting right on top of the posts if you're missing them on your server.

So you can either change providers - e.g. Newzleech uses Giganews for indexing, so if you use Giga too you'll be in synch - or wait 30 minutes (won't kill you I'm sure), or use a better feed, or use another client that supports header downloading so you can see exactly what's available on your server.  In general terms, there's little chance of major changes being made to Alt.Binz to deal with situations that only affect a tiny minority of users.

But then again, I also think there's little chance that you'll listen to anything anyone may suggest to you here.  So, for me at least: /discussion.

« Last Edit: October 05, 2008, 06:56:35 am by Hecks »

Offline mysteryman

  • Contributor
  • ***
  • Posts: 66
Re: prevent incompletes from rss
« Reply #7 on: October 05, 2008, 07:42:06 am »
You might have a point with timezones... possibly on the reality zones too, 'cause I'll wholeheartedly agree I'm a bit opinionated and picky. I want it to download as early as ridiculously possible, because I always watch x264 nowadays, and that takes hours longer than everything else, so... 30 minutes might not matter at 8pm, but it matters a lot more when you add 2-3 hours AND 30 minutes... plus download time on x264.

You're probably right about nothing making me happy... picky, remember. :-P For a long while I had the retries set high, and it worked pretty much every time (much faster than using a 30min timer as well). So, if the requests don't get accepted and added, I'll simply revert back to that. The reason I backed it off (I'm sure someone will ask) is because I was downloading something beyond the reach of my primary server (hitnews... cheap... until my payment on the old price expires) and it was taking ages to download, due to the high retry rate and hitnews only having 20% of the collection I wanted.

That doesn't change the fact that the ideas are good... IMNSHO. Though, adding them all might be a bit redundant ... a few might mitigate the issue very well, while making things nicer at the same time. I think #1 and #2 would probably do everything very well.

However, if those are too complicated to do... or will take significant time to be implemented in a new version... the other idea I came up with, having two retry limits with a definable threshold, would work VERY well. Meaning that articles less than X hours old (could be assumed from other articles in the collection) would have the retry limit Y... and articles more than X hours old would get the retry limit Z. This would stop the low retry rate from failing on new articles, and the high retry rate from making failover to the secondary take ages.
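
The two-limit idea is small enough to sketch in a few lines of Python. The names `retry_limit` and `collection_posted_at` are made up for illustration, and using the newest article's timestamp to date the whole collection is just one possible reading of "assumed from other articles".

```python
import time


def retry_limit(posted_at, threshold_hours, fresh_limit, old_limit, now=None):
    """Pick the retry limit by article age: Y below the threshold X, Z otherwise."""
    now = time.time() if now is None else now
    age_hours = (now - posted_at) / 3600.0
    return fresh_limit if age_hours < threshold_hours else old_limit


def collection_posted_at(article_times):
    """Estimate a collection's post time from its articles (newest timestamp wins)."""
    return max(article_times)
```

For example, with X = 2 hours, Y = 20 and Z = 2, an article posted ten minutes ago would get 20 retries, while one posted yesterday would get 2 and fail over to the secondary server quickly.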

Offline Hecks

  • Contributor
  • ***
  • Posts: 2011
  • naughty cop
Re: prevent incompletes from rss
« Reply #8 on: October 05, 2008, 08:52:53 am »
OK. Moderation in all things. ;)

Offline mysteryman

  • Contributor
  • ***
  • Posts: 66
Re: prevent incompletes from rss
« Reply #9 on: October 06, 2008, 07:59:02 am »
hrmm... I had a bunch happen again today. With these ideas fresh in my mind, I decided to do a little testing. I have had nzb archiving enabled for some time... and I finally tried to re-import a failed collection.

I reimported the nzbs (one par, one sample, one rest, etc) one by one for an incomplete collection. The nzbs contained the same incomplete collection that I had thought was caused by incomplete propagation. Gathering from this... I guess the retries correlation was just a coincidence.

The incompletes are incomplete on Newzleech's end... then updated again later... becoming complete. I may be able to ignore this by upping the refresh time, but instead I think I'm going to look for an alternative, one that DOES have an only-complete option.

For now, I will try out tvbinz as suggested by someone here, but stability was an issue in the past... hoping things have steadied since then. However, are there any other suggestions on the FASTEST updating rss feed with any of:

1. only when complete
2. with an article view (not collection view)
3. has collections, but won't update a collection once posted (creating a second or third entry each time... preferably without splitting any single file between entries)

PS. I had an afterthought: I wonder if I got a bit off topic asking for rss suggestions. Let me know if I should split the second half off into the general category.