DIA Report on Internet Filter Test

The Department of Internal Affairs (DIA) have released to me their report (PDF) on the testing of the Internet Filtering system.

The first half of it is a description of the system and doesn’t really contain much new information (except that we now know it runs on FreeBSD and uses the Quagga BGP daemon).

The second half of it is more interesting as it has some results from the DIA’s testing. This was apparently split into three phases:

  1. Single ISP with 5,000 users (it already had its own filtering system, so it was probably Watchdog).
  2. Two ISPs with 25,000 users.
  3. Four ISPs with 600,000 users (at a guess this was when Ihug and TelstraClear joined).

Before we go on, a brief reminder of how it works: the ISP diverts every request destined for an Internet address that is shared with one of the blocked sites. The filter then checks each diverted request and decides whether to block it or let it through. The filter never sees requests for websites that don’t share an Internet address with a blocked site.
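To make the two stages concrete, here is a minimal Python sketch of that flow. It is only my illustration, not anything taken from the report: the hostnames, addresses, blocked URL and the names `DNS`, `BLOCKED_URLS` and `handle_request` are all made up, and the real system does the diversion with BGP routing rather than a lookup table.

```python
from urllib.parse import urlsplit

# Hypothetical data for illustration only (reserved example names and addresses).
BLOCKED_URLS = {"http://bad.example/banned/page.html"}
DNS = {
    "bad.example": "192.0.2.10",
    "innocent.example": "192.0.2.10",    # shares an address with bad.example
    "elsewhere.example": "198.51.100.7",  # shares nothing with a blocked site
}

# Stage 1 (ISP): divert on address only; the ISP never looks at the URL.
DIVERTED_ADDRESSES = {DNS[urlsplit(url).hostname] for url in BLOCKED_URLS}


def handle_request(url):
    """Return 'direct', 'passed' or 'blocked' for a requested URL."""
    address = DNS[urlsplit(url).hostname]
    if address not in DIVERTED_ADDRESSES:
        return "direct"   # the filter never sees this request
    # Stage 2 (filter): only diverted requests are checked against the list.
    if url in BLOCKED_URLS:
        return "blocked"
    return "passed"       # diverted and inspected, but let through


for url in (
    "http://elsewhere.example/",
    "http://innocent.example/index.html",
    "http://bad.example/banned/page.html",
):
    print(url, "->", handle_request(url))
```

The middle case is the one that matters for the numbers that follow: a request to an innocent site that happens to share an address with a blocked one still goes through the filter.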

Interceptions

Now, back to the numbers. The phrasing in the whitepaper is a bit hard to interpret, so the following is based on my best attempt at understanding it:

In phase 1, the system apparently had 3 million requests diverted to it each month and blocked 10,000 of those requests. This means that only a third of 1% of processed requests ended up being blocked.

In phase 2, there were 8 million requests per month, with 30,000 of them being blocked.

In phase 3, there were 40 million requests per month, with 100,000 of them being blocked.

In other words, the DIA’s server processes a very large number of requests compared to the number it blocks: in every phase, more than 99% of the diverted requests were let through.
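For anyone who wants to check the arithmetic, this short Python snippet works out the blocked share for each phase. The figures are the ones quoted above (my reading of the report, so treat them as approximate); the names `PHASES` and `blocked_share` are just mine.

```python
# Phase -> (requests diverted per month, requests blocked per month),
# using the figures quoted in this post.
PHASES = {
    1: (3_000_000, 10_000),
    2: (8_000_000, 30_000),
    3: (40_000_000, 100_000),
}


def blocked_share(diverted, blocked):
    """Fraction of diverted requests that the filter ended up blocking."""
    return blocked / diverted


for phase, (diverted, blocked) in PHASES.items():
    share = blocked_share(diverted, blocked)
    print(f"Phase {phase}: {share:.3%} blocked, {diverted - blocked:,} let through")
```

The blocked share stays well under half a percent in all three phases.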

Effectiveness

There’s no way to measure the effectiveness of the filter at stopping people from finding child pornography – we can’t tell how many people worked around it or downloaded material using peer to peer filesharing or other methods.

One interesting number, however, is the number of blocked requests per user.

In phase 1, there were 2 blocked requests per user per month (10,000 blocked requests per month / 5,000 users).

In phase 2, there were just over 1 blocked request per user per month (average 30,000 blocked requests per month / 25,000 users).

In phase 3, there were about 0.17 blocked requests per user per month (average 100,000 blocked requests per month / 600,000 users).
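The same division in Python, in case you want to plug in different figures. The inputs are the numbers quoted above and the user counts from the phase list; `USERS_AND_BLOCKS` and `blocked_per_user` are names I made up for this sketch.

```python
# Phase -> (blocked requests per month, users), as quoted in this post.
USERS_AND_BLOCKS = {
    1: (10_000, 5_000),
    2: (30_000, 25_000),
    3: (100_000, 600_000),
}


def blocked_per_user(blocked, users):
    """Average number of blocked requests per user per month."""
    return blocked / users


for phase, (blocked, users) in USERS_AND_BLOCKS.items():
    rate = blocked_per_user(blocked, users)
    print(f"Phase {phase}: {rate:.2f} blocked requests per user per month")
```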

What’s odd is the way that the number of blocked requests per user goes down phase by phase. I have no idea what this indicates.

Robustness

According to the report, the system was operating at 80% capacity in the third phase. Apparently this was a bit much for it as: “the system did experience some stability issues processing this amount of requests and required maintenance on two occasions to replace hardware.”

There is no further detail about whether the “80% capacity” referred to the performance of the filtering system or the Internet connection they were using.

3 Responses to “DIA Report on Internet Filter Test”

  1. Matt on Oct 4, 2009 at 2:33 pm:

    So, in summary, it’s already nearing capacity in testing, and it’s filtering 39.9 million requests per month that don’t need blocking. This, on top of the fact that it doesn’t stop any of the real traffic anyway. Smells like success to me :(.

  2. Jason on Oct 4, 2009 at 2:37 pm:

    Usually, it indicates that there are individuals testing the service. With smaller numbers of customers, these individuals will have a larger impact.

    Another possible reason:

    Let’s say that there are 10k blocked addresses. Each ISP (or the DIA itself) as part of the trial is likely to test each of the addresses to see if the filter works. Each ISP will perform this test. However, the test only needs to be done once per ISP, meaning as larger ISPs come online, the impact of the test on the statistics will reduce. Even if only to test the latency differences to filtered vs semi-filtered vs non-filtered.

  3. thomas on Oct 6, 2009 at 7:03 pm:

    Matt – I kind of assume that they’ll be buying more hardware for going live.

    Jason – I guess you’re right that testing could account for some of it.