Tag Archives: data

Recompute Fall 2013 on SpamRankings.net

Glitches happen, and this one illustrates how rankings with big differences in spam volume are robust anyway.

A format change in an ancillary data source, detected through consistency checks, caused recomputations of selected rankings for September, October, and November 2013 on Classic.SpamRankings.net (Cloud.SpamRankings.net was unaffected). The old versions are preserved as v1 rankings, and the differences are visible for these overall rankings:

Geography	Sep 2013		Oct 2013		Nov 2013
	CBL	PSBL	CBL	PSBL	CBL	PSBL
World	CBL**	PSBL		PSBL*		PSBL**
BE	CBL	PSBL		PSBL		PSBL
CA	CBL	PSBL		PSBL		PSBL
TR	CBL	PSBL	CBL	PSBL		PSBL
US	CBL*	PSBL		PSBL		PSBL
Countries	CBL	PSBL	CBL	PSBL	CBL	PSBL
Medical
World	CBL**	PSBL
US		PSBL
Countries	CBL*	PSBL*
	* Completely unchanged in rank order
	** Unchanged except for dropout final rank
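
The two footnote categories can be sketched as a simple comparison between a preserved v1 ranking and its recomputed version. This is only an illustration, not the code SpamRankings.net uses: the function name `compare_rank_order` and the identifiers in the example are hypothetical, and reading the ** footnote as "only the last-ranked entry differs" is an assumption.

```python
def compare_rank_order(old, new):
    """Classify how a recomputed ranking differs from the preserved v1 ranking.

    Both arguments are lists of identifiers in rank order (rank 1 first).
    """
    if old == new:
        return "unchanged"
    # Assumed reading of the ** footnote: everything matches except
    # the entry in the final rank position.
    if len(old) == len(new) and old[:-1] == new[:-1]:
        return "unchanged except final rank"
    return "changed"


# Hypothetical placeholder identifiers, not real ranking data.
v1 = ["AS-A", "AS-B", "AS-C", "AS-D"]
print(compare_rank_order(v1, ["AS-A", "AS-B", "AS-C", "AS-D"]))
print(compare_rank_order(v1, ["AS-A", "AS-B", "AS-C", "AS-E"]))
print(compare_rank_order(v1, ["AS-B", "AS-A", "AS-C", "AS-D"]))
```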

So the most noticeable rankings, for World, were …

Data storage issues in SpamRankings.net

Data storage issues led to the loss of some incoming data for the September 2012 SpamRankings.net rankings. Interestingly, the results seem almost normal anyway. Here is some speculation on why that might be.

Look just under any rankings chart for September 2012 and you’ll see this notice:

CBL dropouts 8,11 September 2012 were on our end.
PSBL data is unusable 4-15 Sep 2012 due to problems on our end.

September 2012 World All SpamRankings.net from CBL Volume
1	(2)	AS 9829 BSNL-NIB	IN
2	(1)	AS 25019 SAUDINETSTC-AS	SA
3	(5)	AS 6147 SAA	PE
4	(3)	AS 8386 KOCNET	TR
5	(4)	AS 7643 VNPT-AS-VN	VN
6	(-)	AS 9050 ROMTELECOM	RO

The source of the problem was embarrassingly simple and easily fixed: not enough inodes. The CBL and PSBL data were affected differently because they arrive differently. From CBL we pick up a daily text summary table with one line per IP address. From PSBL we get an NNTP feed of spam messages, each in its own file, which we boil down to a summary. So for CBL, we either got the whole file (most days of the month) or didn’t store it at all (8 and 11 September). For PSBL, each incoming message was either stored or not, which is why some days between 4 and 15 Sep have PSBL data, but at lower volume than usual. The notice below the chart is dire because we prefer to be conservative about these things.
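
Running out of inodes is easy to miss because the disk can still show plenty of free space. A minimal sketch of a check for it, assuming a Unix-like system where Python's `os.statvfs` is available, might look like the following. The function names and the 90% threshold are hypothetical choices for illustration, not a description of what we actually run.

```python
import os


def inode_usage(path):
    """Return (total, free, fraction_used) for the filesystem holding path.

    fraction_used is None when the filesystem reports no inode total.
    """
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # free inodes
    if total == 0:
        return total, free, None
    return total, free, (total - free) / total


def inodes_low(path, threshold=0.9):
    """True when the fraction of used inodes reaches the (assumed) threshold."""
    _, _, used = inode_usage(path)
    return used is not None and used >= threshold


# Check the filesystem holding the current directory.
print(inode_usage("."))
print(inodes_low("."))
```

A feed that stores one file per message, as the PSBL feed does, consumes one inode per message, so a check like this is most useful on the filesystem holding that spool.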

Yet the PSBL rankings show AS 9829 BSNL-NIB #1 worldwide just like …

Massive effects of reputational rankings on law schools

Law schools game weak reputational rankings, a problem that could be fixed if the law schools, the bar association, or the ranking organization wanted to fix it. If anyone doubts that reputational rankings can have massive effects on ranked organizations, read this.

David Segal wrote in the NYTimes on 30 April 2011, in “Law Students Lose the Grant Game as Schools Win”:

How hard could a 3.0 be? Really hard, it turned out. That might have been obvious if Golden Gate published a statistic that law schools are loath to share: the number of first-year students who lose their merit scholarships. That figure is not in the literature sent to prospective Golden Gate students or on its Web site.
…
Why would a school offer more scholarships than it planned to renew?
The short answer is this: to build the best class that money can buy, and with it, prestige. But these grant programs often succeed at the expense of students, who in many cases figure out the perils of the merit scholarship game far too late.

What makes law school rankings so easy to game?

Perilocity

Beyond Internet security to risk management
