July 2011

Update 2011-07-21: I’m leaving the original post unchanged below, for reference, but I just got an email from Jon Trowbridge saying he gets different search results with “Google” and “Google+”. I do too, but not in the way I would expect, at least for more complex searches. When I first try the search I want, Google suggests just eliminating the term “google+” from the search string entirely:

Google suggests that maybe I don't want to search for 'google+'.

(But that could be because the word “google” is itself treated specially, perhaps because people often needlessly type it in searches? I don’t know; only someone with access to the logs can say for sure.)

Anyway, once I choose the real search…

Tell Google I really mean it.

…this blog post comes up at the top, but no other result on the front page contains the character “+” nor the string “plus”, let alone “google+” or “googleplus”:

These results have a notable paucity of '+' signs.

This screenshot doesn’t show the results below the fold, but I did an in-page search to make sure. I also searched within the second and third hits (i.e., the first two after my own blog post) and they don’t mention “google+” or “googleplus” or anything similar either.

Why is it finding this blog post so accurately and yet nothing else on the same topic? Surely I can’t be the only person mentioning Debian, Firefox, Iceweasel, and Google+ in the same article. Even if no one else has this browser-compatibility problem (which seems unlikely anyway), I’d expect people to be writing about that combination for other reasons.

But if I simply compare the terms “google” and “google+” as Jon did, without all the other stuff, I get a fair number of results that are clearly about Google+. Well, actually, they’re all about Google+ and the iPhone, but I guess that just shows what’s important on the Internet these days.

That makes me think that it’s not simply the case that when Jon and I add the “+”, we’re getting the old “match this exact string” behavior (which could still lead to different results for the two terms, because Google might treat various misspellings of its name as synonyms). Google really could be indexing “+” signs now, or (more likely?) at least treating “google+” specially when the crawler encounters it.

Color me baffled. Anyone know what’s really going on here?

Original post below.


I’ve got a zillion Google+ invitations. I’m ready to try it out. But I’m getting an error that my browser is “no longer supported” (I’m not sure how a new service in beta testing can say anything is “no longer supported”, but whatever):

My browser coulda been a contender.

(The “Learn More” link above goes to a non-existent page, by the way, so that doesn’t help.)

Note how it lists Firefox as one of the browsers to try instead. The thing is, I’m running Firefox already!

I have a tentative theory. For complicated reasons, the operating system I run, Debian GNU/Linux, repackages Mozilla Firefox under the name “Iceweasel”:

Iceweasel, I mean, Firefox.

When identifying itself to web servers, the browser transmits something like this:

  Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.19) \
  Gecko/20110701 Iceweasel/3.5.19 (like Firefox/3.5.19)

The server on the other end could interpret that in a number of ways. It could decide to treat its interlocutor like Firefox 3.5.19… or it could decide the browser is some mysterious beast called Iceweasel, and refuse to serve it. No doubt somewhere there’s an RFC that spells out what both sides should do, but if so, I have no idea which one.
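To make those two interpretations concrete, here is a rough Python sketch of how a server might sniff the string above. I have no idea what Google+ actually does; both function names and both matching rules are made up for illustration, and the two-line string is joined with a single space.

```python
import re

# The User-Agent string from the post, joined onto one logical line.
UA = ("Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.19) "
      "Gecko/20110701 Iceweasel/3.5.19 (like Firefox/3.5.19)")


def strict_browser(ua):
    """Hypothetical strict rule: ignore parenthesized comments and
    trust only the last remaining name/version product token."""
    without_comments = re.sub(r"\([^)]*\)", "", ua)
    products = re.findall(r"([A-Za-z]+)/([\d.]+)", without_comments)
    return products[-1] if products else None


def lenient_browser(ua):
    """Hypothetical lenient rule: accept a Firefox token anywhere,
    including inside a "(like Firefox/...)" comment."""
    match = re.search(r"Firefox/([\d.]+)", ua)
    return ("Firefox", match.group(1)) if match else None


print(strict_browser(UA))   # the strict rule sees Iceweasel
print(lenient_browser(UA))  # the lenient rule sees Firefox
```

A site using something like the strict rule with a whitelist of browser names would turn this browser away, while the lenient rule would let it through as Firefox 3.5.19.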

Anyway, I don’t know for sure if this is why Google+ is rejecting my requests. I may test the theory by having my browser impersonate regular Firefox, after I finish this blog post. At the moment, all I know is “It’s Not Working”.
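For the record, the impersonation test could also be approximated outside the browser. The sketch below rewrites the Iceweasel string into a plain Firefox one and attaches it to a request object. The helper name is invented, and `https://example.com/` is only a placeholder URL, not the real Google+ address; nothing is actually sent over the network here.

```python
import re
import urllib.request

ICEWEASEL_UA = ("Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.19) "
                "Gecko/20110701 Iceweasel/3.5.19 (like Firefox/3.5.19)")


def impersonate_firefox(ua):
    """Hypothetical helper: replace 'Iceweasel/X (like Firefox/Y)'
    with just 'Firefox/Y', leaving the rest of the string alone."""
    return re.sub(r"Iceweasel/\S+ \(like (Firefox/[\d.]+)\)", r"\1", ua)


spoofed = impersonate_firefox(ICEWEASEL_UA)

# Placeholder URL (an assumption); the request is built but not sent.
req = urllib.request.Request(
    "https://example.com/",
    headers={"User-Agent": spoofed},
)
print(spoofed)
```

Passing `req` to `urllib.request.urlopen` would send it. If the server's answer changes with the spoofed string, the theory holds.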

But the worst part is: I can’t Google up an answer, because “+” doesn’t work in Google searches.

You can’t do a Google search based on the presence or absence of a “+” sign. You might get results that contain “+”, of course, but the matching will have been based on the alphanumeric words around it, not on the “+” sign itself. (Google even turns the fact that they don’t index “+” to advantage, offering it as a metacharacter in search strings to suppress the synonym matching that would otherwise happen automatically.)

So Googling for debian firefox iceweasel google+ doesn’t work. As far as I know, all the other major search engines are the same way: “+” isn’t indexed, so you can’t search for it. Fine. The reasons for this are technical, having to do with size-versus-completeness tradeoffs in building search indexes.
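Here is a toy illustration of why an unindexed “+” can’t be searched for. It is emphatically not how Google or any real engine works; the tokenizer rule, the two sample documents, and all the function names are assumptions made up for the example.

```python
import re


def tokenize(text):
    """Toy tokenizer: keep lowercase alphanumeric runs, drop everything
    else, so "Google+" and "Google" both become the token "google"."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Made-up sample documents.
docs = {
    1: "Google+ rejects Iceweasel on Debian",
    2: "Google search tips for Debian users",
}
index = build_index(docs)

print(search(index, "google+ debian"))
print(search(index, "google debian"))
```

Both queries return the same two documents, because the “+” is thrown away before either the documents or the query ever reach the index.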

But then I wish they hadn’t named an important new service in such a way that it can’t be searched for :-).

Update: using “googleplus” as a synonym gets some useful results, I guess because people are sometimes using that spelling when writing for the web. I wonder if that’s because they’re aware of this problem. Of course, people sometimes spell it with a space and sometimes without, which means if you use an actual “+” with your search for “googleplus”, you’ll miss half the results. It’s the irony that keeps on ironing.

I just got back from a wonderful three-week vacation in China. The trip did, however, further my worry as a patriotic American that the Chinese are rapidly overtaking us — not just in industrial technologies and renewable energy production and the like, but even in the service economy areas where we might hitherto have presumed ourselves to still retain some advantage. For example, read carefully what’s on offer at the Fu Run Hotel in Xi’an:

Guests check in, but they don't all check out.

Like I said, I had a great visit! Some of the other guests in the hotel didn’t have such a great visit, though.

☺ I’ll try to write a real post about my trip later, with some photos. But that was the single best Chinglish I have ever encountered, and I just couldn’t resist posting about it. In fairness, it’s no worse than some of the Chinese phrases we wear on t-shirts in America… well, okay, maybe only a little bit worse.

Why you should be reading more of Ben Collins-Sussman:

I usually find it easier to encode my home videos into the immutable portion of some custom-fabricated DNA, which I then implant into bacteria and launch into the stars. I figure that in billions of years, they may evolve into sentient life forms, examine their DNA, decipher the codec, then watch my children play ball. Gotta make the important memories last.

(Taken from private mailing list correspondence, in a thread entitled “least-unreliable disk storage?” — but I’m pretty sure Ben won’t mind my posting it here.)