Tag Archives: data

Recompute Fall 2013 on SpamRankings.net

Glitches happen, and this one illustrates how rankings with big differences in spam volume are robust anyway.

A format change in an ancillary data source, detected through consistency checks, caused recomputation of selected rankings for September, October, and November 2013 in Classic.SpamRankings.net (Cloud.SpamRankings.net was unaffected). The old versions are preserved as v1 rankings, and the differences are visible for these overall rankings:

Geography   Sep 2013   Oct 2013   Nov 2013
            CBL PSBL   CBL PSBL   CBL PSBL
World: CBL** PSBL PSBL* PSBL**
BE: CBL PSBL PSBL PSBL
CA: CBL PSBL PSBL PSBL
TR: CBL PSBL CBL PSBL PSBL
US: CBL* PSBL PSBL PSBL
Countries: CBL PSBL CBL PSBL CBL PSBL
Medical
World: CBL** PSBL
US: PSBL
Countries: CBL* PSBL*
* Completely unchanged in rank order
** Unchanged except for dropout final rank

So the most noticeable rankings, for World, were…

Data storage issues in SpamRankings.net

Data storage issues led to the loss of some incoming data for the September 2012 SpamRankings.net rankings. Interestingly, the results seem almost normal anyway. Here is some speculation about why that might be.

Look just under any rankings chart for September 2012 and you’ll see this notice:

CBL dropouts 8,11 September 2012 were on our end.
PSBL data is unusable 4-15 Sep 2012 due to problems on our end.
September 2012 World All SpamRankings.net from CBL Volume
1 (2) AS 9829 BSNL-NIB India IN
2 (1) AS 25019 SAUDINETSTC-AS Saudi Arabia SA
3 (5) AS 6147 SAA Peru PE
4 (3) AS 8386 KOCNET Turkey TR
5 (4) AS 7643 VNPT-AS-VN Vietnam VN
6 (-) AS 9050 ROMTELECOM Romania RO

The source of the problem was embarrassingly simple and easily fixed: not enough inodes. The CBL and PSBL data were affected differently because they arrive differently. From CBL we pick up a daily text summary table with one line per IP address. From PSBL we get an NNTP feed of spam messages, each in its own file, which we boil down to a summary. So for CBL, we either got the whole file (most days of the month) or didn’t store it at all (8 and 11 September). For PSBL, each incoming message was either stored or not, which is why there are some days with PSBL data between 4 and 15 September, but at lower volume than usual. The notice below the chart is dire because we prefer to be conservative about these things.
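Running out of inodes is easy to miss because a disk can still show plenty of free space. As an illustration only, and not a description of the monitoring SpamRankings.net actually runs, here is a minimal Python sketch that reads inode counts with the standard `os.statvfs` call (Unix only). The function names `inode_usage` and `inodes_low` and the 90% threshold are assumptions made up for this example.

```python
import os


def inode_usage(path="."):
    """Return (total, free, used_fraction) for the filesystem holding path.

    Hypothetical helper for illustration; relies on os.statvfs (Unix only).
    """
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # free inodes
    if total == 0:
        # Some filesystems do not report a fixed inode count.
        return total, free, 0.0
    used = max(0.0, (total - free) / total)
    return total, free, used


def inodes_low(path=".", threshold=0.9):
    """True when the used-inode fraction is at or above threshold.

    The 0.9 default is an arbitrary assumption, not a recommended value.
    """
    _, _, used = inode_usage(path)
    return used >= threshold


if __name__ == "__main__":
    total, free, used = inode_usage("/")
    print(f"inodes: total={total} free={free} used={used:.1%}")
    if inodes_low("/"):
        print("warning: running low on inodes")
```

A feed that writes one file per message, like the PSBL NNTP feed described above, uses one inode per message, so a check along these lines would matter most on the filesystem that holds those files.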

Yet the PSBL rankings show AS 9829 BSNL-NIB #1 worldwide just like…

Massive effects of reputational rankings on law schools

Law schools game weak reputational rankings, a weakness that could be fixed if the law schools, the bar association, or the ranking organization wanted to fix it. If anyone doubts that reputational rankings can have massive effects on ranked organizations, read this.

David Segal wrote in the NYTimes on 30 April 2011, in “Law Students Lose the Grant Game as Schools Win”:

How hard could a 3.0 be? Really hard, it turned out. That might have been obvious if Golden Gate published a statistic that law schools are loath to share: the number of first-year students who lose their merit scholarships. That figure is not in the literature sent to prospective Golden Gate students or on its Web site.

Why would a school offer more scholarships than it planned to renew?

The short answer is this: to build the best class that money can buy, and with it, prestige. But these grant programs often succeed at the expense of students, who in many cases figure out the perils of the merit scholarship game far too late.

What makes law school rankings so easy to game?