“42.7 percent of all statistics are made up on the spot.” – Steven Wright

25 Sep

Finally, stats I can believe. (via tmblg)

Technorati: State of the Blogosphere 2008 Methodology

It is important for everyone who is throwing around these new Technorati numbers to understand the methodology Technorati employed. I have no issue with the data, nor with the manner in which it was collected. The only thing to be aware of here is, as usual, how the data is reported.

I am seeing Twitter posts and blogs all day about “bloggers do this” and “bloggers make that,” but remember: this is not a survey of bloggers. This is a survey of Technorati users. My understanding is that there are about 1.2 million bloggers registered with Technorati. To say that this figure under-represents the number of actual bloggers is a massive understatement. The report itself calls this a sample of “the active blogosphere”; however, a more correct representation would be “Technorati-registered bloggers,” which I assume is just too long for folks to use, and certainly too long for 140 characters!

Again, this is not an indictment of the data, which seems to be properly sampled and reported, but every time I see someone tweet “Median income for Bloggers is $200!” I kinda cringe a little.

22 Aug

How the undecideds decide - Cosmic Log - msnbc.com

This study purports to have found a method to predict how “undecided” voters will really vote by examining their reactions to images and how quickly the test subjects associated these images with negative or positive words.

There is a kind of implicit cynicism in this sort of thing, as if the assumption by these researchers is that when you say you are undecided, we know you aren’t really undecided. But my biggest problem with the underlying thesis is that this sort of predictive measure might work in a vacuum: if the “undecided” subject is exposed to no further stimuli or mitigating factors, then these reactions predict “x.” However, my time spent on the day job tells me that no such vacuum exists, and that a vague, subconscious positive association for a candidate is just waiting to be swift-boated in the keister and turned around 180 degrees. An “October Surprise” can similarly scrub out unspoken reservations about a candidate.

So while the method is intriguing, I suspect its accuracy improves the closer you get to Election Day, and it is probably best applied the day before the election, when all the undecideds really have left to go on is their gut.

21 Aug

Pollster.com: The "Loopy" Zogby Polls

No Starch Press: The Manga Guide to Statistics
Oh, NO You DIDN’T!
Put me down for three of these!

A New "Hospital Compare" Tool From the Dep. of Health and Human Services

Boy, are tools like this potentially misleading. This one lets you see the mortality rates for a variety of conditions at your local hospitals. Here’s the obvious, ridiculous danger of such stats: I’m sure the tiny little hospital that I used in northern Maine growing up has great rates, because if I were really sick, I’d drive to Bangor, or Portland. Similarly, if you had a serious life-threatening heart condition, or extreme complications due to a skin infection, you wouldn’t go to your regional hospital; you’d go to a specialist facility, where lots of folks with complicated conditions die each year because they are the worst cases.

It’s kinda like looking at the fielding percentage of an essentially immobile shortstop versus that of Ozzie Smith, or Shawon Dunston when he patrolled SS for the Cubs. They had worse fielding percentages because they tried for—and got their gloves on—more balls than most shortstops can even reach. I’m sure the same is true for many of the specialist care facilities in the U.S. that have somewhat higher mortality rates. You can’t get an error on a ball you don’t reach for.

And that’s gotta be the worst analogy in Datasnob history!
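
For anyone who wants the hospital point in numbers rather than shortstops, here is a minimal sketch in Python. Every figure in it is made up for illustration; none of it comes from the Hospital Compare tool, and the hospital names, severity labels, and helper functions are my own assumptions. It only shows how a hospital can do better on every type of patient and still post a worse overall mortality rate, purely because of who walks through the door.

```python
# Hypothetical (made-up) case counts per hospital: severity -> (patients, deaths).
# None of these figures come from the Hospital Compare tool.
hospitals = {
    "small regional": {"routine": (950, 19), "severe": (50, 10)},
    "specialist center": {"routine": (200, 2), "severe": (800, 120)},
}


def stratum_rate(cases, severity):
    """Mortality rate for one severity group at one hospital."""
    patients, deaths = cases[severity]
    return deaths / patients


def crude_rate(cases):
    """Overall mortality rate, ignoring how sick the patients were."""
    total_patients = sum(patients for patients, _ in cases.values())
    total_deaths = sum(deaths for _, deaths in cases.values())
    return total_deaths / total_patients


for name, cases in hospitals.items():
    print(name)
    for severity in cases:
        print(f"  {severity}: {stratum_rate(cases, severity):.1%}")
    print(f"  overall: {crude_rate(cases):.1%}")
```

With these invented numbers, the specialist center loses fewer routine patients (1% versus 2%) and fewer severe patients (15% versus 20%), yet its overall rate is 12.2% against the small hospital's 2.9%, because 80% of its patients are severe cases. An unadjusted comparison would send you to exactly the wrong place.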

20 Aug

TED | TEDBlog: An unlikely master of card magic: Lennart Green on TED.com

A personal hero of mine.

12 Aug

The Oprah/Obama Connection

This study is fascinating: a pair of University of Maryland researchers claim to have quantified the “Oprah effect” on Obama by correlating sales of her magazine (“O”) and of books recommended by her book club with county-level vote data. Their results (which are available in full here) claim that Oprah’s endorsement and subsequent campaigning for Obama were worth approximately one million votes. A quick scan of the regression reveals this to be a pretty solid correlation, so now I have to figure out how to get Oprah to endorse me for something.

8 Aug

tmblg:

Dawkins vs. O’Reilly

GMail, and Receptivity

So, Google does a lot of smart things, and they make a lot of money doing them. I know I have clicked on my share of text ads and sponsored links in my day—all while I was actually searching for a product or service. In other words, when I am receptive to a message, and I am looking for that message, I am likely to respond to that message. And I do.

But I wonder how much GMail is adding to those profits. I am not necessarily skeptical, I’m just admitting I don’t know. As many links as I have clicked on in my life, I can’t say I have ever clicked on an ad from my GMail page—EVER. Sure, they are the same ads, and they are “contextual”, I suppose, but when I am checking my EMail—in other words, when I am getting info pushed to me, and I am not doing the pulling—I am not as receptive to advertising. In fact, sample of one, I am not receptive at all. I wonder if this receptivity phenomenon has been studied by Google and what they make of it.

GMail boosters will note that there is still value for Google in having access to all of that text and being able to continually get smarter in all the Googly things it does, but surely when they have such a vast trove of data there is a point of diminishing returns—and I am curious whether or not that return justifies what has to be an enormous expense maintaining all of those EMail servers.

Can anyone shed any light on this? Note to Google: even if my speculation is correct, for God’s sake don’t shut down GMail.