Todd Lipcon | 17 Dec 17:43 2010

Re: HBase with trillions rows

Hi Alexey,

A trillion rows at 200 bytes each is 200 TB, right? So you're talking
about a cluster storing several hundred TB.
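
As a rough back-of-envelope sketch of that arithmetic: the replication factor of 3 below is the HDFS default and is an assumption about this setup, the 100-node figure is only an illustrative cluster size, and HBase's per-cell key overhead and any compression are ignored, so real numbers would differ.

```python
TB = 10**12  # decimal terabyte, in bytes


def estimate_storage(rows, bytes_per_row, replication=3):
    """Return (raw_bytes, replicated_bytes) for a table.

    replication=3 is the HDFS default; per-cell key overhead and
    compression are deliberately ignored in this estimate.
    """
    raw = rows * bytes_per_row
    return raw, raw * replication


raw, replicated = estimate_storage(rows=10**12, bytes_per_row=200)
nodes = 100  # hypothetical cluster size, for illustration only

print(f"raw data:          {raw / TB:.0f} TB")
print(f"with replication:  {replicated / TB:.0f} TB")
print(f"per node ({nodes}):    {replicated / nodes / TB:.0f} TB")
```

Under those assumptions this comes to 200 TB raw, 600 TB on disk, and about 6 TB per node.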

To give any kind of decent performance, you'll need a pretty big setup
- probably >100 machines at the least if you want any kind of usable

There aren't any architectural reasons this won't work, but it's at
the larger end of clusters I've seen before, so you might hit
scalability roadblocks, etc. I wouldn't try this as your first cluster
- do something smaller and learn about HBase before tackling the big


On Fri, Dec 17, 2010 at 8:08 AM, alexeyy3 <alexeyy3@...> wrote:
> Hi,
> I am a beginner with HBase (obviously). I wonder, is it feasible to use
> an HBase table in "read-mostly" mode with trillions of rows, each containing
> a small structured record (~200 bytes, ~15 fields)? Does anybody know of a
> successful case where tables with this many rows are used with HBase?
> Thanks
> Alexey


Todd Lipcon
Software Engineer, Cloudera