Data storage issues in SpamRankings.net

Data storage issues led to the loss of some incoming data for the September 2012 SpamRankings.net rankings. Interestingly, the results seem almost normal anyway. Here is some speculation on why that might be.

Look just under any rankings chart for September 2012 and you’ll see this notice:

CBL dropouts 8,11 September 2012 were on our end.
PSBL data is unusable 4-15 Sep 2012 due to problems on our end.
September 2012 World All SpamRankings.net from CBL Volume
1 (2) AS 9829 BSNL-NIB India IN
2 (1) AS 25019 SAUDINETSTC-AS Saudi Arabia SA
3 (5) AS 6147 SAA Peru PE
4 (3) AS 8386 KOCNET Turkey TR
5 (4) AS 7643 VNPT-AS-VN Vietnam VN
6 (-) AS 9050 ROMTELECOM Romania RO

The source of the problem was embarrassingly simple and easily fixed: not enough inodes. The CBL and PSBL data were affected differently because they arrive differently. From CBL we pick up a daily text summary table with one line per IP address. From PSBL we get an NNTP feed of spam messages, each in its own file, which we boil down to a summary. So for CBL, we either got the whole file (most days of the month) or we didn’t store it at all (8 and 11 September). For PSBL, we either stored each incoming message or we didn’t, which is why there is PSBL data for some days between 4 and 15 Sep, but at lower volume than usual. The notice below the chart is dire because we prefer to be conservative about these things.
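Running out of inodes is easy to miss because a disk can still show plenty of free space while refusing to create new files, and a one-file-per-message feed like the PSBL one uses an inode for every message. Here is a minimal sketch of a check that could catch this early, assuming a POSIX system. The function names, the 90% threshold, and the path are illustrative assumptions, not part of the actual SpamRankings.net pipeline.

```python
import os


def fraction_used(total, free):
    """Return the fraction of inodes in use, or None if the count is unknown."""
    if total <= 0:
        # Some filesystems report 0 total inodes (they allocate them dynamically).
        return None
    return (total - free) / total


def inode_usage(path="/"):
    """Fraction of inodes in use on the filesystem that contains `path`."""
    st = os.statvfs(path)  # POSIX only
    return fraction_used(st.f_files, st.f_ffree)


def inodes_low(path="/", threshold=0.9):
    """True if inode usage is at or above `threshold` (hypothetical default)."""
    usage = inode_usage(path)
    return usage is not None and usage >= threshold


if __name__ == "__main__":
    # "/" stands in for wherever the incoming feed files are stored.
    usage = inode_usage("/")
    if usage is None:
        print("filesystem does not report inode counts")
    else:
        print(f"inode usage: {usage:.1%}, low: {inodes_low('/')}")
```

Run from cron against the directory that receives the feed, a check like this would warn before files start being silently dropped. The same numbers are available from the command line with df -i.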

Yet the PSBL rankings show AS 9829 BSNL-NIB #1 worldwide just like