Comments on: Google Big Table paper

By: Dan Creswell

Dan Creswell — Wed, 06 Sep 2006 08:08:22 +0000

Re: data volume – I couldn’t tell whether they were talking about the compressed or uncompressed size.

Re: random read – I seem to recall they allow the programmer to specify a transfer size, and I think they also said it’s typically 8 KB. I guess they were trying to find worst-case performance. Certainly would’ve liked to see the updated figures as well 🙂

I also noticed that they tend to run several “services” on the same machine (BigTable, GFS, etc.) – I would’ve expected more segmentation, but maybe I/O is more of an issue for them than memory or CPU.

By: Ewan

Ewan — Sat, 02 Sep 2006 14:54:52 +0000

I agree – “interesting” choice of name.

A few other points I thought were interesting on an initial skim:

1. The volume of data being stored in the system, i.e. the web crawl contains 800 TB in 1000 billion cells. 1000 billion is quite a large number. I would have thought 800 TB is a low figure for the size of the web though…

2. The fact that random read performance suffers quite badly as the number of tablet servers increases. As they point out, this is because they shift around 64 KB blocks even though the required cell data might only be a KB or so, and the network/CPU overhead ends up saturating. Would have been nice to see how the system performs with a reduced block size – 8 KB?

3. Finally:

“One lesson we learned is that large distributed systems are vulnerable to many types of failures, not just the standard network partitions and fail-stop failures assumed in many distributed protocols.”

Yep – this stuff is hard!
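
As a rough check on points 1 and 2, here is a back-of-the-envelope sketch in Python. It only uses the numbers quoted above; the exact 1 KB value size, the decimal-terabyte convention, and the helper name `read_amplification` are assumptions made for illustration, not figures or names from the paper.

```python
# Back-of-the-envelope figures based on the numbers quoted in the comment.
KB = 1024


def read_amplification(block_bytes: int, value_bytes: int) -> float:
    """Bytes transferred per byte the client actually asked for."""
    return block_bytes / value_bytes


# Assumption: "a KB or so" is treated as exactly 1 KB per cell read.
value = 1 * KB

for block in (64 * KB, 8 * KB):
    ratio = read_amplification(block, value)
    print(f"{block // KB} KB block: {ratio:.0f}x the data actually needed")

# Assumption: decimal units, i.e. 1 TB = 10**12 bytes and 1 billion = 10**9.
avg_cell_bytes = 800 * 10**12 / (1000 * 10**9)
print(f"average cell size: {avg_cell_bytes:.0f} bytes")
```

Under those assumptions, a 64 KB block moves about 64 times more data than the read needs, an 8 KB block about 8 times more, and the crawl table averages roughly 800 bytes per cell.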

By: Dan Creswell

Dan Creswell — Sat, 02 Sep 2006 09:25:17 +0000

And having read it I can tell you that there’s another paper due for release on Google’s HA distributed lock manager (apparently called Chubby, errr, wouldn’t have been my choice).