I spam filter like a madman. Won't this throw things off?

>   3 groups in alt.binaries*
>       Expected:      654      605      576        453
>         Actual:      568      507      474        385
>        Percent:      87%      84%      82%        85%
> Now I spam filter like a madman.  If you really want to see some wild and
> crazy stuff, check out the perl filter I have going in
> /usr/news/bin/control.........the ruleset is quite impressive. Also I run
> gfilter, with a killer filter file in /var/news/etc/gfilter.ctl and NoCem
> with all the cool keys :)

The first thing to note: the alt.binaries newsgroups in the sample set were chosen because they didn't appear to get much spam. The "spam magnet" newsgroups are not in the sample set. *warez* is not in the sample set.

The "Expected" values already have spam removed, so if anything, you can expect to see MORE than the reported number of articles. This is one reason a site can show up at "106%" of expected, etc.
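As a rough sketch of how the "Percent" row relates to the other two rows: this assumes it is simply Actual divided by Expected, rounded to a whole percent, which agrees with the numbers quoted above. The function name and the 693-article figure are made up for illustration.

```python
def percent_of_expected(actual, expected):
    """Return actual as a whole-number percentage of expected."""
    return round(100 * actual / expected)

# Expected and Actual rows from the quoted report.
expected = [654, 605, 576, 453]
actual = [568, 507, 474, 385]

percents = [percent_of_expected(a, e) for a, e in zip(actual, expected)]
print(percents)  # [87, 84, 82, 85]

# Hypothetical site that accepts more than the spam-removed baseline:
# 693 articles against an expected 654 comes out over 100%.
print(percent_of_expected(693, 654))  # 106
```

Anything over 100 just means the site accepted more articles than the spam-removed baseline for that sample.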

Spam filter configuration matters a lot, though. You could very well be rejecting articles that are counted in "Expected." Hopefully your spam detector isn't rejecting legitimate articles as "false positives."

Finally, depending on how you handled the spam, the sampling method could have counted a spam article as an "article accepted" even if it was cancelled later.

To track down the differences:
