I spam filter like a madman. Won't this throw things off?
> 3 groups in alt.binaries*
> Expected: 654 605 576 453
> Actual: 568 507 474 385
> Percent: 87% 84% 82% 85%
> Now I spam filter like a madman. If you really want to see some wild and
> crazy stuff, check out the perl filter I have going in
> /usr/news/bin/control.........the ruleset is quite impressive. Also I run
> gfilter, with a killer filter file in /var/news/etc/gfilter.ctl and NoCem
> with all the cool keys :)
The first thing to note: the alt.binaries newsgroups in the sample set were chosen because they did not appear to get much spam. The "spam magnet" newsgroups are not in the sample set. *warez* is not in the sample set.
The "Expected" values already have spam removed, so if anything, you can expect to see MORE than the reported number of articles. This is one reason you may find yourself at, say, "106%" of expected.
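As a quick check of the arithmetic, here is a minimal Python sketch that recomputes the "Percent" row from the figures quoted in the question. The function name `percent_of_expected` is made up for illustration, and rounding to the nearest whole percent is an assumption about how the report presents its numbers, not a description of how newsrAte computes them.

```python
# Figures quoted in the question above (the four columns are unlabeled there).
expected = [654, 605, 576, 453]
actual = [568, 507, 474, 385]


def percent_of_expected(actual_count, expected_count):
    """Return actual as a whole-number percentage of expected.

    Hypothetical helper; rounding to the nearest percent is an assumption.
    """
    return round(100 * actual_count / expected_count)


percents = [percent_of_expected(a, e) for a, e in zip(actual, expected)]
print(percents)  # [87, 84, 82, 85]

# A server that carries more articles than the spam-removed "Expected"
# count lands above 100%. The counts here are invented for illustration.
print(percent_of_expected(106, 100))  # 106
```

Because "Expected" is a spam-removed count rather than an upper bound, a result above 100 is not an error.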
Spam filter configuration matters a lot, though. You could very well be rejecting articles that are counted in "Expected." Hopefully your spam detector is not rejecting legitimate articles as "false positives."
Finally, depending on how you handled the spam, the sampling method could have counted it as an "article accepted" even if it was cancelled later.
To track down the differences:
- The individual newsgroup report shows how you are doing in each of those
newsgroups. That is the first place to look.
- You can request a "missing content" report, which details the articles missing on your
server in a particular newsgroup. (Usually the missing articles are not spam; they really
are missing!) Every newsgroup gets some spam, though, so spam does account for some
of the difference (less than 10%).
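
To make the idea of a "missing content" comparison concrete, here is a minimal Python sketch that treats it as a set difference over Message-IDs. This is only an assumed illustration: the page does not say how the actual report is produced, and the function name `missing_articles` and all Message-IDs below are hypothetical.

```python
def missing_articles(reference_ids, local_ids):
    """Return Message-IDs in the reference set that are absent locally.

    Hypothetical helper; the real report may be built quite differently.
    """
    return sorted(set(reference_ids) - set(local_ids))


# Invented Message-IDs, for illustration only.
reference = ["<a1@example.com>", "<b2@example.com>", "<c3@example.com>"]
local = ["<a1@example.com>", "<c3@example.com>", "<spam9@example.net>"]

missing = missing_articles(reference, local)
print(missing)  # ['<b2@example.com>']

# Articles held only locally (such as spam that got through) never show up
# as missing, because the comparison runs from the reference set outward.
```

Comparing by Message-ID rather than by article count is what lets such a report separate genuinely missing articles from differences caused by spam.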
Copyright 1998, Forrest J. Cavalier III, Mib Software
INN customization and consulting