INN FAQ Listing every article

INN FAQ Part 6

NOTE: The maintainers of the INN FAQ stopped publishing in December 1997.
An important update to this topic is provided by Mib Software in the Usenet RKT for Subscribers.

Subject: (6.9) Listing every article

People often ask for a way to list every file in the news spool. There
are several ways of doing this. They work for C News as well as for
INN:

1. Here's the fastest way. However, it lists only the files that are
actually in the history file, and a crossposted article is listed only
once:

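#!/bin/sh
# Load INN's shell settings, which define SPOOL and HISTORY.
. /usr/lib/news/innshellvars
cd "${SPOOL}" || exit 1
# Print the third field of each history line, with dots turned into slashes.
awk '(NF > 2){print $3}' < "${HISTORY}" | tr . /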

Sorting the output (for example, by piping it through sort) groups files
from the same directory together, which improves directory cache
efficiency.
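
As a rough sketch of the sorting step, the pipeline above can be wrapped
in a shell function and its output piped through sort. The function name
history_paths is made up for this example, and the sketch assumes the
same whitespace-separated history layout that the awk command above
relies on:

```shell
#!/bin/sh
# history_paths FILE (hypothetical helper): print the path from the third
# field of each history line in FILE, sorted so that files in the same
# directory come out next to each other.
history_paths() {
    awk '(NF > 2) { print $3 }' < "$1" | tr . / | sort
}
```

After sourcing innshellvars it could be run as: history_paths "${HISTORY}"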

2. This lists every article file, no matter how many links it has,
even if it is not listed in the history file:

	cd /var/spool/news
	gfind . -regex '.*/[0-9][0-9]*$' -print

NOTE: GNU find (called gfind here) will execute this much faster than
the "find" that comes with most versions of Unix (including SunOS).
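
If GNU find is not available, much the same listing can probably be had
from -name patterns alone, which any POSIX find should accept. This is
only a sketch; the function name list_numeric_files is hypothetical. It
also adds -type f, so that directories whose names are all digits are
not printed, which the -regex command above would include:

```shell
#!/bin/sh
# list_numeric_files DIR (hypothetical helper): print every regular file
# under DIR whose name consists only of digits.
# The first -name requires a leading digit; the negated -name rejects any
# name that contains a non-digit character.
list_numeric_files() {
    find "$1" -type f -name '[0-9]*' ! -name '*[!0-9]*' -print
}
```

For the spool used above it could be run as: list_numeric_files /var/spool/news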

3. If you need to do something fancier than what find can do, consider
using perl's find2perl program. Given a find command line, find2perl
will output the perl code to do the same thing. You can then modify
the output to do what you want. For example:

	find2perl . -mtime +30 -name '[0-9][0-9]*$' -exec '/bin/rm {}'

outputs a perl script that deletes any article that is over 30 days old.
However, the regular expression in the generated script is wrong; change
it to:

	/^[0-9]+$/ &&

and it should work just fine.

4. Another efficient way to scan all articles in the spool, including
those that for some reason aren't in the history file, is to read the
active file for a list of newsgroup names, and chdir() to each
directory to scan for files. Remember *not* to do a recursive treewalk
for each directory.
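
A minimal shell sketch of this approach follows. It assumes the usual
active file layout, with the newsgroup name as the first field on each
line, and a spool in which group a.b.c lives in directory a/b/c. The
function name scan_active and its two arguments are invented for the
example:

```shell
#!/bin/sh
# scan_active ACTIVE SPOOL (hypothetical helper): for each newsgroup named
# in the ACTIVE file, print the all-digit regular files found directly in
# that group's directory under SPOOL. Subdirectories are never entered,
# because subgroups have their own lines in the active file.
scan_active() {
    while read -r group rest; do
        [ -n "$group" ] || continue
        dir=$2/$(printf '%s\n' "$group" | tr . /)
        [ -d "$dir" ] || continue      # group has no spool directory
        for f in "$dir"/*; do
            name=${f##*/}
            case $name in
                *[!0-9]*) continue ;;  # name contains a non-digit
            esac
            if [ -f "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done < "$1"
}
```

Groups are visited in active file order, and each directory is read only
once, so no recursive treewalk is involved.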

------------------------------

[Source: INN FAQ Part 6 Archive-name: usenet/software/inn-faq/part6]
[Last Changed: $Date: 1997/07/01 01:25:41 $ $Revision: 2.21 $]
[Copyright: 1997 Heiko Rupp, portions by Tom Limoncelli, Rich Salz, et al.]