A departure from my usual topics, in remembrance of my college math classes (and with a nod to Mozilla folks working on related areas like automated testing and software verification): Via Eric Drexler via Emergent Chaos comes this interesting review paper on formal proofs in mathematics and software to verify them.

As a dual math/physics major I was well acquainted with jokes about the lack of mathematical rigor on the part of physicists, who often engaged in rather slapdash simplifications in their drive to get formulas they could use to explain experimental data and make further predictions. However physicists who cut corners are ultimately saved by the fact that nature will check their work and let them know if they’ve made bonehead mistakes.

On the other hand mathematicians traditionally have had only other mathematicians to save them from errors, and most mathematicians find it more personally and professionally rewarding to do their own original work instead of verifying that of others. Enter computers, which are happy to do such relatively mindless tasks. Of course the catch is that you have to instruct them in how to do this, and like conventional programming this requires specifying the task at hand in excruciating detail, including all the steps that mathematicians leave out in conventional proofs (marked by phrases like “we can clearly see that…”).

Read the paper for how this is done; it’s pretty comprehensible even if (like me) you’ve forgotten almost all your college math. My favorite bit of the paper (speaking of the HOL Light theorem prover):

All the basic theorems of mathematics up through the Fundamental Theorem of Calculus are proved from scratch on the user’s laptop in about two minutes every time the system loads, so that the casual user does not need to be concerned with the low-level details.

In other words, it’s booting mathematics! And like conventional booting, people still get impatient; the HOL Light tutorial describes how to cheat and checkpoint the system so you can skip the boring stuff.

I was also enchanted by the fact that multiple theorem provers have been developed, and mathematicians in different countries have their own favorite systems. Thus they replicate in cyberspace the traditional national styles (English, French, German, Hungarian, etc.) of mathematics. However it turns out that this is actually useful and not wasted work, as the different theorem provers can check each other’s efforts and thus increase confidence in their underlying correctness.

Finally, a lot of these systems are released as free and open source software, so if you want to try your hand at formalizing some famous theorems everything you need is just a download away.

(This is part 2 of a two-part post; for background on the Gini coefficient see part 1.)

I previously discussed use of the Gini coefficient as a way to measure income inequality (or equality, as the case may be), and promised to discuss why Howard County is noteworthy in this regard. In brief, Howard County is one of only seven counties in the US (out of 800 counties and other geographic areas) that rank in the top 5% (positions 1-40) for both median household income and income equality (as measured by the Gini coefficient):

Geographic area                  Income rank  Median household income ($)  Equality rank  Gini coefficient
Howard County, Maryland          3            101,672                      29             0.379
Calvert County, Maryland         6            95,134                       26             0.376
Douglas County, Colorado         9            92,824                       25             0.376
Stafford County, Virginia        12           87,629                       12             0.360
Prince William County, Virginia  13           87,243                       6              0.351
Charles County, Maryland         20           83,412                       9              0.353
Scott County, Minnesota          39           77,678                       20             0.369

(By way of comparison, the estimated Gini coefficient for the entire US in 2007 is 0.467, while the estimated US median household income in 2007 is $50,740.)

All of these counties share similar characteristics: They are formerly rural counties, relatively small in population (ranging from roughly 100,000 to 400,000), that are close enough to major cities to benefit from their economic growth but far enough away to exclude urban concentrations of poverty. Except for Douglas County (a suburb of Denver) and Scott County (a suburb of Minneapolis-St Paul), all are located near Washington DC. This points up the role of the Federal government as the economic engine of the region, providing lots of well-paying government and contractor jobs but at the same time not fostering an entrepreneurial culture that might produce more truly wealthy people. [1]

Although people disagree on the exact causes, there’s general agreement that income inequality has been growing over the past few decades, both in the US as a whole and within Maryland specifically. Howard County has been no exception, but even so its current level of inequality, although higher than it was in former years, is apparently no greater than that for the US as a whole in 1967, the year Columbia was founded. [2]

Howard County’s high median household income and low Gini coefficient could be interpreted as an endorsement of the Columbia vision: Columbia and Howard County have achieved 21st century-leading prosperity accompanied by 1960s-level equality. But does the Columbia vision really have anything to do with this?

As noted above, while its situation is special in the US as a whole, Howard County is joined in its relative good fortune by several other Maryland and Virginia counties, all standard garden-variety suburbs with no Jim Rouse-like figures present at the creation (as it were). Rouse was certainly an enlightened developer, but first and foremost he was a canny developer, and the fundamental reason for Columbia’s success was Rouse’s foresight in seeing over forty years ago that Howard County’s location, location, location positioned it for future prosperity.

Despite that, I think the (lingering) vision of what Columbia should be does influence public attitudes toward income inequality in Howard County, and may help account for some of the special characteristics of the debates over Columbia’s future. For example, I’m sure that many opponents of the WCI Plaza Residences were sincerely concerned about the architectural compatibility of a 22-story tower with Columbia Town Center as it is and (in their minds) should be. However I also think some of the opposition was due to unease as to what it meant for the Columbia vision to have rich people in $2M condos looming over the split-levels, townhouses, and apartments of the surrounding villages.

Having the truly wealthy and their luxurious dwellings sprinkled through western Howard is one thing; having them occupy the symbolic heart of Columbia would be quite another, and I can understand why some older Columbians may have been troubled at the thought of it. I think a similar unease may lie behind the concern expressed that future housing in Columbia Town Center would be monopolized by the wealthy few.

In my previous post I mentioned Fairfield County, Connecticut. As Jay Hancock wrote in a blog post a while back, though it has a high median household income Howard County isn’t really rich in the sense that other areas are. On the other hand Fairfield County (or, to be more precise, Greenwich and other towns within Fairfield County) is indeed rich, with a vengeance. (Or at least it was, before the recent financial crisis; I don’t know how it’s doing now.) Home to a number of hedge fund billionaires and other people who made their fortunes in financial services, in 2007 Fairfield County had a mean household income of over $130,000, ranked in the top 5% for median household income ($80,241), and in the bottom 5% for income equality (with a Gini coefficient of 0.534).

Fairfield County is in a sense Howard County as it might have been in an alternative world, if DC were like New York. (In this regard it’s also worth noting that in 2007 New York City surpassed DC in both median household income, $64,217 vs. $54,317, and income inequality, with a Gini coefficient of 0.603 vs. 0.542.) That Howard County isn’t Fairfield County in this world might be seen as an unalloyed blessing: We live in a more fair and equal society, and are more insulated from the vicissitudes of global capitalism.

However it can also be argued that Columbia and Howard County are giving up something in return, and that (within limits) they might benefit from an increased influx of true wealth and the inequality that accompanies it. That’s a subject I hope to address in a future post.


1. It’s worth noting that Frederick County, Maryland almost made the list above as well in 2007; it is ranked #43 for median household income, and #22 for income equality. In fact, as a state Maryland has a Gini coefficient well below the US average, as pointed out by Jay Hancock of the Baltimore Sun last year.

2. The US Gini index in 1967 was 0.397 (US Census Bureau Publication P60-235, Income, Poverty, and Health Insurance Coverage in the United States: 2007, Table A-3, pp. 40-41). Due to a change in methodology in the early 1990s, Gini coefficients published by the Census Bureau for the 1960s cannot be directly compared to current Gini coefficients from the same source. However I think it’s reasonable to conclude that income inequality in Howard County today is at least roughly similar to income inequality in the US as a whole in 1967.

(This is part 1 of a two-part post; for the conclusion see part 2.)

In a previous post I discussed the concept of median income and how it avoids certain distortions inherent in mean (average) income. However median income by itself is not adequate to characterize the economic status of households in Howard County (or anywhere else for that matter). In particular, the median income just provides the midpoint for income, i.e., the income value for which 50% of the households make more and 50% make less; it does not address the question of how income is actually distributed among the various households.

For example, let’s go back to our simple 10-household example from the last post:

Household  Household income  Share of household income  Cumulative share of household income
1          $16,000           1.35%                      1.35%
2          $37,000           3.11%                      4.46%
3          $56,000           4.71%                      9.17%
4          $75,000           6.31%                      15.48%
5          $92,000           7.74%                      23.21%
6          $111,000          9.34%                      32.55%
7          $132,000          11.10%                     43.65%
8          $163,000          13.71%                     57.36%
9          $190,000          15.98%                     73.34%
10         $317,000          26.66%                     100.00%

I’ve added two new columns of data, but otherwise the situation is as I described it previously: the ten households have an average income of $118,900 but a median income of $101,500, very similar to the actual numbers for Howard County [1]. Now let’s look at a second 10-household example:

Household  Household income  Share of household income  Cumulative share of household income
1          $7,000            0.59%                      0.59%
2          $9,000            0.76%                      1.35%
3          $13,000           1.09%                      2.44%
4          $18,000           1.51%                      3.95%
5          $43,000           3.62%                      7.57%
6          $160,000          13.46%                     21.03%
7          $165,000          13.88%                     34.90%
8          $174,000          14.63%                     49.54%
9          $190,000          15.98%                     65.52%
10         $410,000          34.48%                     100.00%

As it happens, these ten households have exactly the same average income ($118,900, $1,189,000 divided by 10) and exactly the same median income ($101,500, halfway between $43,000 and $160,000) as in the first example. However the distribution of income looks very different; in its division of households between rich and poor it looks much more like Baltimore city or Washington, DC, than it does Howard County. Clearly this difference in income inequality is not captured by the median or mean income, or even by related measures like the difference between the mean and the median. How can we quantify this difference?

One commonly-used measure of income inequality is the so-called Gini coefficient or Gini index. The computation of the Gini coefficient is more complicated than that for mean or median income, but it’s still relatively straightforward and comprehensible. The key is to look at the numbers in the last two columns of the tables above, and especially the last column, cumulative share of household income.

The third column simply gives the share of household income going to that particular household. For example, in the first table household #1 has income of $16,000 against a total of $1,189,000 for all households, or 1.35% of all income; similarly household #10 has a 26.66% share of all income ($317,000 divided by $1,189,000), and so on for the other households. The fourth column then uses these figures to compute the share of income going to the poorest n% of households. For example, household #1 has a 1.35% share of total income and household #2 has a 3.11% share, so the poorest 20% of households (i.e., households #1 and #2 out of 10 total households) have 4.46% of all income (1.35% plus 3.11%). Similarly we can add the income share figures for households #1 through #9 to determine that the poorest 90% of households have 73.34% of all income, with the remaining 10% of households (i.e., household #10) having 26.66% as noted above.
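For readers who want to reproduce the last two columns, here is a minimal Python sketch of the arithmetic just described. The function name `income_shares` is made up for illustration; the incomes are the ten from the first example table.

```python
from itertools import accumulate

def income_shares(incomes):
    """Return (share, cumulative share) pairs, poorest household first."""
    ordered = sorted(incomes)
    total = sum(ordered)
    # Each household's share of all income.
    shares = [income / total for income in ordered]
    # Running total of the shares: income going to the poorest n households.
    cumulative = list(accumulate(shares))
    return list(zip(shares, cumulative))

# Incomes from the first 10-household example (in dollars).
example_1 = [16_000, 37_000, 56_000, 75_000, 92_000,
             111_000, 132_000, 163_000, 190_000, 317_000]

rows = income_shares(example_1)
for household, (share, cumulative) in enumerate(rows, start=1):
    print(f"{household:>2}  {share:7.2%}  {cumulative:8.2%}")
```

Swapping in the incomes from the second table should reproduce its share columns in the same way, up to rounding.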

The cumulative share of income can be graphed as shown in the figure below. The red points show the values from the fourth column of the table above, with the red lines then connecting the dots to approximate a curve; if there were more households there would be more points and a correspondingly smoother curve.

Example 1 - Graph of an income distribution similar to that of Howard County, Maryland

Example 1 - Income distribution like Howard County

Now let’s look at the graph for our second example from above:

Example 2 - Graph of a more unequal income distribution

Example 2 - More unequal income distribution

Again the red points represent the values for cumulative share of income from the fourth column of the second table, with the red lines connecting the dots. What about the blue dots in both graphs? Those represent the ideal case where all the household incomes are equal, or nearly so. In that case the poorest 10% of households will have (almost) 10% of total household income, the poorest 20% will have (almost) 20% of income, and so on. The corresponding curve will then be a straight (or nearly straight) line, here shown in blue.

Note that as household income becomes more unequal, the curve of cumulative income share (the red curve) moves further and further away from the blue line representing perfect (or nearly perfect) income equality. This gives us a straightforward way to define the Gini coefficient: It’s the size of the blue-shaded area between the blue line and the red curve, expressed as a fraction (or percentage) of the total area under the blue line. For nearly equal income distributions the red curve will be very close to the blue line, and the Gini coefficient will be close to zero, while for very unequal income distributions the red curve will be far away from the blue line, and the Gini coefficient will approach one (or 100%).

In the first example the Gini coefficient is 0.38, nearly the same as the Gini coefficient of 0.379 for Howard County (see the Census ACS table B19083) [2]. In the second example the Gini coefficient is 0.53. This is comparable to the Gini coefficient for the District of Columbia, which is 0.542. More interestingly for our purposes, it’s nearly the same as 0.534, the Gini coefficient for Fairfield County, Connecticut, a suburban county in the New York City metropolitan area that’s home to many hedge-fund managers and other wealthy financial services professionals.

Unlike DC, Fairfield County is a pretty affluent area overall; it has a median household income of $80,241 (somewhat lower than Howard County’s) and a mean household income of $130,397 (somewhat higher than Howard County’s). [3]

The following 10-household example roughly mirrors the Fairfield County household income breakdown:

Household  Household income  Share of household income  Cumulative share of household income
1          $11,000           0.84%                      0.84%
2          $23,000           1.76%                      2.61%
3          $37,000           2.84%                      5.44%
4          $53,000           4.06%                      9.51%
5          $70,000           5.37%                      14.88%
6          $90,000           6.90%                      21.78%
7          $115,000          8.82%                      30.60%
8          $145,000          11.12%                     41.72%
9          $215,000          16.49%                     58.21%
10         $545,000          41.79%                     100.00%

The corresponding Gini coefficient diagram is as follows:

Example 3 - Graph of an income distribution similar to that of Fairfield County, Connecticut

Example 3 - Income distribution like Fairfield County

What makes Howard County special with respect to income inequality, and Fairfield County particularly interesting as a comparison? The answers to those questions will be the subject of part 2 of this two-part post.


1. The US Census Bureau’s American Community Survey estimates the median household income in Howard County at $101,672 for 2007 (ACS table B19013). (This figure has a margin of error of +/-$3,594, which we’ll ignore for purposes of this discussion.) The ACS tables apparently don’t directly provide a figure for mean household income, but it can be computed by taking the aggregate household income estimate of $11,734,222,700 (ACS table B19025) and dividing it by the number of households, 98,866 (ACS table B19001); the resulting estimate for mean income is $118,688.

2. For those who’d like to check this result, the computation is relatively straightforward. First, we convert all percentages to fractions, so that the horizontal axis goes from 0 to 1, and the vertical axis likewise; the cumulative shares of income are then 0.0135 (for 0.1 of the population), 0.0446 (for 0.2), 0.0917 (for 0.3), and so on. The easiest way to compute the Gini coefficient is to compute the area under the red curve, and then to subtract it from the area under the blue line; the resulting difference is the size of the blue-shaded area, and we can then divide it by the area under the blue line to obtain the Gini coefficient.

The area under the blue line is simple to compute: It’s a triangle that is half of a 1 by 1 square, so its area is 0.5. The area under the red line is composed of a series of nine trapezoids and one triangle (at the left). The area of the triangle is half the base times the height: 0.5 times 0.1 (base) times 0.0135 (height), or 0.000675. The area of each trapezoid is the base times the average of the two vertical sides; for the first trapezoid (counting from the left) this is 0.1 (the base) times the sum of 0.0135 and 0.0446 divided by 2, or 0.02905 (the average of the two vertical sides), giving 0.002905. Continuing with the other areas (left as an exercise for the reader), the sum of all the areas is about 0.31; this is the area under the red curve. We subtract this from 0.5 to get 0.19 as the size of the blue-shaded area, and then divide by 0.5 (the area under the blue line) to get 0.38 as the Gini coefficient.
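The exercise left to the reader can also be handed to a computer. The following is a minimal Python sketch of the triangle-plus-trapezoids calculation described in this footnote, not necessarily the method the Census Bureau uses for its published figures. The function name `gini_from_incomes` is hypothetical, and the income lists are the toy examples from the first two tables.

```python
def gini_from_incomes(incomes):
    """Gini coefficient from the piecewise-linear cumulative income curve."""
    ordered = sorted(incomes)
    total = sum(ordered)
    width = 1 / len(ordered)      # each household is one slice of the x-axis
    area_under_curve = 0.0
    previous = 0.0                # cumulative share at the left edge of a slice
    running = 0
    for income in ordered:
        running += income
        current = running / total
        # Trapezoid area: base times the average of the two vertical sides.
        # The first slice has previous == 0, so it is the triangle.
        area_under_curve += width * (previous + current) / 2
        previous = current
    area_under_line = 0.5         # half of the 1-by-1 square
    return (area_under_line - area_under_curve) / area_under_line

example_1 = [16_000, 37_000, 56_000, 75_000, 92_000,
             111_000, 132_000, 163_000, 190_000, 317_000]
example_2 = [7_000, 9_000, 13_000, 18_000, 43_000,
             160_000, 165_000, 174_000, 190_000, 410_000]

print(f"Example 1: {gini_from_incomes(example_1):.2f}")
print(f"Example 2: {gini_from_incomes(example_2):.2f}")
```

Rounded to two decimal places, the two results agree with the 0.38 and 0.53 quoted in the main text, and a list of ten identical incomes gives a coefficient of zero, as the blue line suggests it should.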

3. As with Howard County, the mean household income for Fairfield County can be computed by taking the aggregate household income of $42,228,652,700 and dividing it by 323,848, the number of households.
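The mean-income estimates in footnotes 1 and 3 are simple divisions, and can be checked with a short Python sketch. The dictionary layout and variable names here are only illustrative; the figures are the ACS aggregate incomes and household counts quoted in those footnotes.

```python
# (aggregate household income in dollars, number of households), 2007 ACS
# estimates as quoted in the footnotes above.
counties = {
    "Howard County, Maryland": (11_734_222_700, 98_866),
    "Fairfield County, Connecticut": (42_228_652_700, 323_848),
}

# Mean household income is aggregate income divided by household count.
mean_income = {
    name: aggregate / households
    for name, (aggregate, households) in counties.items()
}

for name, value in mean_income.items():
    print(f"{name}: ${value:,.0f}")
```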

This doesn’t sound good: While researching a Howard County-related blog post today I happened to follow a Google search to www.columbia-md.com (a domain controlled by General Growth Properties), and got the following message: This Account Is No Longer Active.

I guess when your stock’s in the toilet and you’re flirting with bankruptcy you’ve got more pressing things to worry about than keeping your web sites up.

In a previous post I investigated the question of whether those in Howard County with annual incomes of $120,000 or more truly constituted the wealthy few or not. (The answer: No.) Key to that investigation was the idea of median household income, as reported by the US Census Bureau in its annual American Community Survey. It turns out that the ACS data provide some interesting insights into what makes Howard County special, and can help explain the nature of the conflicts that have raged over the future of Howard County in general and Columbia in particular.

In this post I’ll start with a concept that is easy to understand but has interesting implications, namely median income and its relationship to average (or mean) income. Let’s suppose we want to look at household incomes in Howard County or any other jurisdiction. We can compute the mean income (to use the preferred term) by adding all the household incomes and then dividing by the number of households. To simplify things, let’s assume we have only 10 households with incomes as follows:

$16,000
$37,000
$56,000
$75,000
$92,000
$111,000
$132,000
$163,000
$190,000
$317,000

The sum of all incomes is $1,189,000, and then we divide by 10 to get a mean (or average) income of $118,900. Simple enough, right?

Wrong. The problem with mean income is that it doesn’t necessarily represent the reality for the typical household, because it can be skewed by households that have either disproportionately small or (especially) disproportionately large income. For example, let’s suppose that the highest-income household in our example greatly increases its income (perhaps they’ve been the beneficiary of a successful IPO), so that the incomes now look as follows:

$16,000
$37,000
$56,000
$75,000
$92,000
$111,000
$132,000
$163,000
$190,000
$1,558,000

The total of all 10 household incomes is now $2,430,000, which divided by 10 gives a mean household income of $243,000. Thus the mean household income has more than doubled, but the typical household (9 out of 10 of them, in this example) sees no improvement in its own income.

To correct for this distortion the Census Bureau and others use a different measure of household income, namely median income. The median income is the income that falls in the middle: half of all incomes are lower, and half are higher. In the example we have five households with income of $92,000 or less, and five households with income of $111,000 or greater. We then compute the median income as the value halfway between these two incomes, or $101,500. (This happens to be very close to the Howard County median household income of $101,672 in 2007; as will become apparent later, I deliberately chose these example numbers to create a microcosm of Howard County.)

Note that in computing the median income we didn’t specify whether we were using the first set of incomes or the second set of incomes (in which the highest-income household greatly increased its income). That’s because it doesn’t matter: the median income is exactly the same in both cases. Using median income thus avoids the distortion inherent in average income, where (as the economics joke goes), Bill Gates can walk into a bar and cause the average income to skyrocket.
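The comparison between the two measures is easy to check. Here is a minimal Python sketch using the standard library’s `statistics` module; the variable names are chosen for illustration, and the incomes are the two 10-household lists given above.

```python
from statistics import mean, median

# Household incomes from the toy example (in dollars).
original = [16_000, 37_000, 56_000, 75_000, 92_000,
            111_000, 132_000, 163_000, 190_000, 317_000]

# Same households, except the top earner's income jumps to $1,558,000.
after_windfall = original[:-1] + [1_558_000]

for label, incomes in [("original", original),
                       ("after windfall", after_windfall)]:
    print(f"{label}: mean = ${mean(incomes):,.0f}, "
          f"median = ${median(incomes):,.0f}")
```

The mean moves from $118,900 to $243,000 while the median stays at $101,500, because the median depends only on the two middle incomes ($92,000 and $111,000), which the windfall does not touch.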

But using median income has its own problems as well. Suppose that instead of walking into a bar, Bill Gates moved to Howard County. Or to achieve the same overall effect, 50 billionaires moved into Howard County, or 500 people worth $100M each. There are almost 100,000 households in Howard County, so adding a few hundred super-rich families isn’t going to affect the county’s median income at all (just as making one family wealthier didn’t affect median income in our toy example). But can anyone doubt that such an influx would change the character of Howard County in major ways?

How could this effect be quantified? That will be the subject of a future post.

Last night a post by local blogger Wordbones caught my eye. Based on a story in the Baltimore Sun, it discussed proposed plans for affordable housing in Columbia Town Center, housing that would be reserved for those with income of less than $80,000 (10% of total units) or those with income between $80,000 and $120,000 (another 10% of total units). Wordbones particularly noted a quote in the article from Alan Klein of the Coalition for Columbia’s Downtown:

Klein noted that 80 percent of the 5,500 residential units planned would be for people making more than $120,000, which he defined as the wealthy few. It seems like a pretty elite group, said Klein, referring to the 20 percent [of units reserved for those making less than $120,000] as a drop in the bucket.

Wordbones questioned this characterization, and I myself was curious as to whether it was really true. When in doubt the best rule is to go to the data, which in this case are from the annual American Community Survey [1] produced by the US Census Bureau. In some comments on Wordbones’s blog I referenced the 2006 ACS data, but as it turns out the 2007 ACS estimates were just released, as highlighted in a recent Columbia Flier story about Howard County being the third-wealthiest county in the country.

So what do the data say? According to the detailed breakdown of household income for Howard County in 2007, about 37% of Howard County households have incomes of $125,000 or more [2]; if we add in the households between $120,000 and $125,000, and also allow for the effects of wage inflation over the past year, almost 40% of Howard County households (or nearly two out of five) are part of Alan Klein’s pretty elite group.

But wait, there’s more! The ACS distinguishes households from families [3]; among other things, the ACS definition of family excludes people living alone. If we look at the detailed data for Howard County family income for 2007 we find that 46% of Howard County families have family incomes of $125,000 or more. Also, the median Howard County family income for 2007 was $115,907, meaning 50% of Howard County families had income higher than that. So if we again add in the effects of wage inflation it’s likely that Alan Klein’s wealthy few (for whom 80% of the proposed units are intended) includes half the families in Howard County.

To give Klein his due, $120,000 is an unusually high income for the US as a whole; detailed household income data for the entire US show that only 12% of US households had income of $125,000 or more in 2007, and the corresponding data for family income show that less than 16% of US families had family income above that level. This underscores the unusual position of Howard County as a very wealthy jurisdiction. Wordbones doubted whether any typical family in Columbia with a household income of $120,000 sees themselves as being wealthy, and in the context of Howard County at least he’s absolutely right.

Notes:

  1. The ACS is a survey based on random samples, not a comprehensive survey like the ten-year census. Because of that there is some sampling error in the results, such that the true figures might be a few percentage points higher or lower. I’ve ignored this in my discussion (as do most press stories on the ACS data).
  2. ACS figures are in 2007 inflation-adjusted dollars. I wondered a bit about what this meant until I figured it out: ACS data are taken throughout the year, and wage inflation over the course of the year causes incomes to increase slowly from month to month. The Census Bureau uses monthly inflation figures to adjust the numbers so that they are comparable, i.e., as if they were all taken in a single month in 2007.
  3. According to the official ACS definitions, [a] household includes all the people who occupy a housing unit, while [a] family consists of a householder and one or more other people living in the same household who are related to the householder by birth, marriage, or adoption. In general the number of families will be less than the number of households: Not all households contain families since a household may be comprised of a group of unrelated people or of one person living alone — these are called nonfamily households. Thus, for example, in the 2007 ACS data Howard County is estimated as having 98,866 households but only 73,765 families.

Blogging closer to home

September 9, 2008

For those Mozilla folks and others who’ve been following my full blog feed: I happen to live in Howard County, Maryland, between Washington DC and Baltimore. For a while now I’ve been following Howard County local bloggers but haven’t joined the conversation myself. I’ve now been prompted to write on at least one Howard County topic, and in the event I write more I’ve started a Howard County category for my blog and a corresponding feed. If you’re not interested in this stuff I suggest you consider resubscribing to just my Mozilla feed.

Johnathan Nightingale recently addressed a very common question, namely why Firefox doesn’t automatically accept self-signed SSL certificates as being valid. I don’t have much to add to Johnathan’s discussion of the issues with self-signed certificates, but speaking on behalf of the Mozilla Foundation I do want to address some of the comments that I’ve seen people make with regard to SSL certificates, certification authorities (CAs), and Mozilla.

First, a quick refresher: To support SSL, web sites need a combination of a private key kept on the server and a public key embedded with other information (most notably the server’s domain name, and also in some cases the name of the organization operating the server) in a digitally-signed document, the certificate. When a browser connects to an SSL-enabled web server the server sends its certificate to the browser. If the certificate was digitally signed by a third party certification authority known to the browser, the certificate is treated as valid and the browser proceeds to use the information in the certificate to kick off the SSL protocol. (The public key in the certificate is used in setting up SSL encryption, the domain name in the certificate is double-checked against the domain name the browser was supposedly connecting to, and for Extended Validation certificates the organizational name in the certificate is displayed in the Firefox 3 site identification button to the left of the location bar.)

Otherwise the browser displays an error page, with an option for the user to create a security exception to prevent the error from being displayed again. Among other things, this allows users to have an SSL-enabled site work properly if its certificate is signed by a CA unknown to Mozilla or if the certificate is digitally-signed using the server’s own private key (a self-signed certificate).
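The same accept-or-reject logic can be seen outside a browser. The sketch below uses Python’s standard `ssl` module as an analogy only; it is not how Firefox itself is implemented, and the function names `make_browser_like_context` and `check_site` are hypothetical.

```python
import socket
import ssl

def make_browser_like_context():
    """Build a context that, like a browser, trusts only known CAs."""
    # create_default_context() loads the system's trusted CA certificates,
    # requires a valid certificate chain, and turns on host name checking.
    return ssl.create_default_context()

def check_site(hostname, port=443, timeout=5.0):
    """Return the certificate subject if the server's certificate validates.

    A self-signed certificate, a certificate signed by an unknown CA, or a
    domain name mismatch raises ssl.SSLCertVerificationError instead, which
    is roughly the point at which a browser would show its error page.
    """
    context = make_browser_like_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert().get("subject")

context = make_browser_like_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `check_site` with a host name of your choice needs network access, so it is left out of the snippet; a security exception of the kind described above has no direct equivalent here short of loading the server’s certificate as an additional trusted root.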

Now, to correct a few common misconceptions:

  • SSL certificates are not (necessarily) expensive, and can in fact be free. To cite two examples, Go Daddy offers SSL certificates for $29.99 per year or less, while ipsCA offers no-charge SSL certificates for organizations with .edu domain names, and $38/year certificates to others; certificates from these two CAs are recognized in all commonly-used browsers. Also, StartCom offers SSL certificates at no charge whatsoever, though at present these certificates are recognized only in Firefox. (This may change in the future, if StartCom is able to persuade more browser vendors to support its certificates.)
  • Mozilla does not charge CAs to have their root certificates included in Mozilla. Back in the early days of SSL Netscape charged CAs rather large fees to have their root certificates included in Netscape products. The Mozilla Foundation has never done this; any CA is free to apply to have its certificate(s) included in Mozilla-based products, at no charge to itself or others.
  • Mozilla’s goal is to open up the CA market and support both more CAs and a wider variety of CAs. In particular, the Mozilla CA certificate policy was deliberately designed to make it possible for smaller CAs, including volunteer-run nonprofit CAs like CAcert, to have their certificates recognized in Firefox and other Mozilla-based products. (CAcert’s certificates aren’t yet recognized, but only because CAcert has not yet met Mozilla policy requirements.)

There are legitimate questions about what sort of user experience Firefox and other Mozilla-based products should provide in relation to SSL-enabled sites. However we can’t have a fruitful discussion about what to do about this if people start out by repeating tired myths like SSL certs cost hundreds of dollars and browser vendors are just trying to maintain the traditional CAs’ monopoly. If I have time I’ll post again with my thoughts on what today’s CA market is really like, and how it might evolve in the future.

As announced by Mitchell Baker earlier and followed up by Mark Surman, Mark will be coming on board in a month or so as the Executive Director of the Mozilla Foundation. We in the Foundation have all had a chance to speak with Mark in depth both by phone and in person at the summit in Whistler (including during a healthy-snack-fueled Foundation road trip up from Vancouver). I’ll let others add their own take, but I for one am very happy that Mark decided to take this opportunity to join the Mozilla Foundation and the Mozilla project.

Beyond his personal qualities, I think Mark will bring at least three things to the Foundation that will serve it well:

The first is his background in open education and related initiatives. As I wrote earlier, I think education is an area where Mozilla in general (not just the Foundation) could play at least a supporting role, and perhaps in at least one area (developer education) a significant role. If that’s to happen the more Mozilla people with education experience the better.

Second is his experience with initiatives in the developing world, both in his telecentre work and in his work with the Shuttleworth Foundation. Mozilla has made good inroads into the so-called BRIC countries (Brazil, Russia, India, and China) and other emerging market nations both in terms of Firefox adoption and in terms of formal organizational outreach. However beyond those countries there are a host of less-developed countries that are not online in a major way today (due to overall poverty, internal or regional conflicts, lack of ICT infrastructure, and other reasons) but which may well become much more integrated into the global economy (and thus the Internet and web) over the next 10-20 years. Such frontier market countries are potentially fruitful ground for adoption of and even innovation in open Internet/web technologies, especially those based on mobile devices, and there too may be potential roles to play for Mozilla in general and the Foundation in particular.

Third and finally is Mark’s experience in philanthropy.  Although the Mozilla Foundation is a nonprofit organization and the Mozilla project is operated for the public benefit, it’s fair to say that almost all the people involved in Mozilla are more familiar with the world of software technology and IT than with the world of nonprofits and philanthropic foundations. That’s a good thing in terms of getting new releases of Firefox, Thunderbird, and other products out the door; however the Mozilla Foundation has minimal direct involvement in software development, and much more potential involvement in philanthropic activities. (Though in accordance with the Mozilla DNA I think these activities will be primarily related to open software, open web content, and open technologies in general.) Mark has some interesting ideas on bringing open source ideals and practices into the world of philanthropy, and I think the Mozilla Foundation would make a great testbed for putting those ideas into practice.

So, to conclude, please welcome Mark to the Mozilla project, and introduce yourself to him if you didn’t get the chance to do so at the summit.

[This is part 2 of a two-part post. Part 1 discusses the future of education and the possibility of customized online educational offerings as a disruptive innovation that might eventually grow to rival and even dominate traditional educational systems. It ended with a question: what does this have to do with Mozilla? I now attempt to answer that question.]

Online education evolves to be user-driven, not vendor-driven

By definition disruptive innovations allow users to do things they could previously not do, or could do only at great expense and/or effort. But while disruptive innovations make users’ lives easier, they typically make vendors’ lives harder, at least initially, because creating truly disruptive products can be difficult and expensive. (For example, think of all the industrial design, usability engineering, software development, and other work that Apple put into creating the iPhone and its simplified user experience for running mobile applications and using the web from a handheld device.)

The first products that embody disruptive innovations thus tend to have a high degree of internal integration and a relatively closed architecture (again, consider the iPhone). However over time the state of the art advances to the point where vendors can create comparable products using modular components communicating through standardized interfaces. (Christensen’s favorite example here is Microsoft Windows vs. Linux distributions; in the mobile space would-be contenders include Android and Limo.) This move to modularity also allows disruption in the commercial system, i.e., the context within which a firm establishes its cost structure and operating processes and works with its suppliers and channel partners to respond profitably to customers’ common needs (Disrupting Class, p.124).

In particular, Christensen and his co-authors believe that the first-generation commercial system for online education is too tied to the current commercial system for education in general, and shares its orientation to expensive one-size-fits-all solutions. They predict an eventual move to a new commercial system organized as a facilitated user network, in which users exchange with each other as opposed to being supplied by traditional vendors, with one or more third parties facilitating that exchange (as, for example, YouTube facilitates the exchange of video content):

[In] the first phase of disruption of the instructional system the software will likely be complicated and expensive to build. … Within a few more years, however, two factors that were absent in stage 1 that are critical to the emergence of stage 2 will have fallen into place. The first will be platforms that facilitate the generation of user-created content. The second will be the emergence of a user network …. The tools of the software platform will make it so simple to develop online learning products that students will be able to build products that help them teach other students. Parents will be able to assemble tools to tutor their children. And teachers will be able to create tools to help the different types of learners in their classroom. … User networks … will be the business models of distribution. This will allow parents, teachers, and students to offer these teaching tools to other parents, teachers, and students. (p.134)

So: modular interoperable standards-based products, user-created content, and user networks within which such content gets created and freely distributed. I don’t know about you, but to me that sounds like something Mozilla knows something about.

Tasks for the Mozillas

Let’s assume that education will indeed evolve in the direction of user networks producing user-generated and -distributed content for customized online education. Let’s further suppose the continued growth of a movement to ensure that this and other educational content is freely available for others to use and adapt. This certainly sounds like a movement that is in line with the goals of the Mozilla Manifesto (which notes, among other things, that “[the] Internet is … a key component in education …”), and a trend we might like to encourage. How might we do so in a manner consistent with the Mozilla DNA? I think the answer varies based on the particular Mozilla entity in question (what I call the Mozillas within the overall Mozilla project).

The task of the Mozilla Corporation I think would be mainly to continue on the path it’s currently on. Any modular standards-based personal learning environment or open learning network is likely to be based on web technologies, and the goal is to have Firefox be the very best way there is to bring the web to end users. There are some particular areas that might be relevant to an educational context, though not necessarily limited to that context.

For example, the Mobile Firefox effort will help bring the full power of Firefox to future low-end 4P computing systems that might be deployed for primary and secondary education, and initiatives to support open audio and video formats would assist in efforts to provide rich learning experiences whose delivery doesn’t depend on proprietary technologies. Some supplemental work might also be called for; for example, robust out-of-the-box support for MathML and other specialized markup languages is clearly more important for the educational market than for the general consumer market to which Firefox is pitched.

Mozilla Messaging is a somewhat different case, and perhaps a more interesting one in terms of how a focus on education (which, again, would not be the sole focus) might help shape a future strategy. As I see it, one problem with Thunderbird is that its user base is often conceived of in negative terms: they’re people who don’t like webmail and don’t want to use Outlook. I think Thunderbird and related technologies need a real constituency, a group of people for whom the product is designed to fit their special and distinct needs, and who respond to that focus with enthusiasm. That constituency might be found within the traditional enterprise market, but I confess I’m concerned about Mozilla Messaging trying to re-fight the groupware wars that Netscape lost a decade ago.

Might Mozilla Messaging be able to find its constituency, or at least a significant part of it, within the educational market? Educational institutions are certainly more open to standards-based open source products than your typical enterprise. Also, to the extent that Mozilla Messaging is about not just email but about the broader market for collaboration and communication tools, the education market certainly has a lot of models for collaboration and communication: one-to-many instruction, one-to-one tutoring, small-group collaboration, synchronous vs. asynchronous, text vs. video vs. audio, and so on. So perhaps this might be a fruitful question for Mozilla Messaging to explore: What types of collaboration and communication products would be needed to support advanced online learning environments, and could Mozilla technologies be instrumental in creating such products?

Next comes what might be called the missing Mozilla. As Gerv Markham recently noted, with minor exceptions (e.g., the HTML editor in SeaMonkey) the Mozilla project has for the most part left to others the task of creating tools for web content creation and application development. Is this an area we should look at re-entering over the coming years? In the educational context, consider what sort of rich content might go into a simple Physics 101 online course: mathematical equations, static and dynamic graphs, interactive simulations of experiments, perhaps some archival video, and so on. It would be a shame if people created, distributed, and collaborated on lots of great open education courses like this, but they turned out to be a collection of glorified Flash or Silverlight apps. Should Mozilla do something about this and, if so, how might it best be done? This is a question that extends beyond the context of education, and I think one that needs to be discussed.

Finally we come to the Mozilla Foundation. What role if any might it play in an educational context? The Mozilla Foundation could certainly endorse and perhaps help shape a particular vision for education along the lines discussed above, and could lend moral, financial, and other support to other groups working on the front lines to make it happen. (By coincidence the proposed new executive director for the Mozilla Foundation has relevant experience in this area.) It could also encourage the Mozilla Corporation, Mozilla Messaging, and others within the overall Mozilla project to make Mozilla-based technologies and products the preferred ways by which next-generation customized online education is experienced by end users; where there are gaps in capabilities, the Foundation could provide some funding and other support to help fill those gaps (as we did with accessibility, for example). Finally, the Foundation could go further and pick a particular subproblem within the broad educational space and seek to play a leading role in addressing it.

Most notably, the Mozilla Foundation has a clear interest in (and has already financially supported) the work at Seneca College to bring open source development methodologies into the classroom. The Foundation could continue and expand upon that work, including working with Seneca to promote the adoption of similar Mozilla-related curricula at other like-minded institutions and the creation of Mozilla-related materials suitable for self-education. Beyond focusing just on Mozilla, the Mozilla Foundation could also work with others to change the entire manner in which the next generation of software developers is educated. This could include teaching software development in a more comprehensive and interdisciplinary manner in which topics like QA and release engineering, project organization and governance, user experience, marketing and evangelism, copyright and other legal issues, and others assume equal importance to traditional computer science and programming language instruction. It could also include expanding the range of contexts within which software development is taught — not just in formal academic institutions but also within informal learning collectives associated with open source projects or other groups of people with common interests and objectives.

Ten years until the revolution?

If Christensen and his co-authors are correct, in about ten years’ time we could very well reach a tipping point in which the educational system in the US and elsewhere will rapidly transition from the traditional instructor-in-the-classroom model to a model based on customized online education provided on standards-based platforms and supported by a network of teachers, students, and others collaboratively creating, distributing, and recombining rich collections of instructional materials. Today we stand at a point in online education comparable to the late 1970s and early 1980s with respect to personal computers or the late 1980s and early 1990s with respect to the Internet and the web: We can envision the promise of what might come, and have early examples of that promise to learn from and build on. But we do not know exactly how the story will play out, who its heroes (and villains) will be, and whether it will have a happy or sad ending for those of us who value openness, freedom, and grassroots participation. We may have an opportunity to help shape how that story unfolds. Should Mozilla grasp that opportunity? That’s the question I’m putting forth for discussion.