Emulate left outer join?
David Carmean <dlc <at> halibut.com>
2010-02-09 21:52:07 GMT
Hi,
I've been working with numpy for less than a month, having learned about
it after finding matplotlib. My foundation in things like set theory is...
weak to nonexistent, so I need a little help mapping SQL-like thoughts into
set-theory thinking :)
Some context to help me explain: I'm trying to store, chart, and analyze
Unix system performance data (sar/sadf output). On a typical system I have
about 75 fields/variables, all floats, with identical timestamps... or so
we hope. What I want to do in order to save memory/disk space is to stack
the timeseries data all into three or four different arrays, and use a single
timestamp field for each set.
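
To make the intended layout concrete, here is a minimal sketch of one such set: a single shared timestamp vector plus a 2-D array with one row per variable. The variable names and values are made up for illustration and do not come from real sar/sadf output.

```python
import numpy as np

# Hypothetical sample data: three variables sampled at the same three times.
timestamps = np.array([100, 160, 220])          # one shared time axis
cpu_user = np.array([12.0, 15.5, 11.2])
cpu_system = np.array([3.1, 2.9, 3.4])
mem_free = np.array([2048.0, 2040.0, 2035.0])

# One row per variable, one column per timestamp.
stacked = np.vstack([cpu_user, cpu_system, mem_free])

# Column j of `stacked` holds every variable's value at timestamps[j].
values_at_160 = stacked[:, timestamps == 160].ravel()
```

This only works as-is when every variable has exactly one value per shared timestamp, which is the guarantee in question.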
My problem is: I don't know that I can guarantee that the shape of all the
individual arrays will be identical along the time axis. I may receive
truncated textfiles to parse, or new variables may appear and disappear from
the set being reported/recorded.
If these were in flat files or database tables, I'd do a left outer join between
a master timestamp table and each individual variable's table. But... I don't
know the keywords to search for in the numpy docs/web chatter. A thread from
just about one year ago left the question hanging:
http://article.gmane.org/gmane.comp.python.numeric.general/27942
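
For reference, one possible way to emulate that left outer join in plain NumPy is to look up each master timestamp in the variable's own timestamps with `np.searchsorted` and fill the gaps with NaN. This is only a sketch under assumptions: the function name `left_outer_join` and the sample data are hypothetical, timestamps are assumed to be directly comparable (integers here), and each variable is assumed to have at most one value per timestamp.

```python
import numpy as np

def left_outer_join(master_ts, var_ts, var_values, fill=np.nan):
    """Align var_values onto master_ts; unmatched master rows get `fill`.

    Hypothetical helper, not a NumPy function.
    """
    master_ts = np.asarray(master_ts)
    var_ts = np.asarray(var_ts)
    var_values = np.asarray(var_values, dtype=float)

    # Start with every master row marked as missing.
    out = np.full(master_ts.shape, fill, dtype=float)
    if var_ts.size == 0:
        return out

    # searchsorted needs sorted keys, so sort the variable's rows by time.
    order = np.argsort(var_ts, kind="stable")
    var_ts = var_ts[order]
    var_values = var_values[order]

    # Candidate position of each master timestamp within var_ts.
    idx = np.searchsorted(var_ts, master_ts)
    idx = np.clip(idx, 0, var_ts.size - 1)

    # Keep only the positions where the timestamps really are equal.
    matched = var_ts[idx] == master_ts
    out[matched] = var_values[idx[matched]]
    return out

# Made-up example: the variable is missing two of the five master samples.
master = np.array([100, 160, 220, 280, 340])
cpu_ts = np.array([100, 160, 280])
cpu = np.array([0.5, 0.7, 0.9])

aligned = left_outer_join(master, cpu_ts, cpu)
# aligned is [0.5, 0.7, nan, 0.9, nan]
```

Each aligned result has the same length as the master timestamp array, so the per-variable results can then be stacked into a single 2-D array, with NaN marking samples that were never recorded.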
Examples? Pointers? Shoves toward the correct sections of the docs?
Thanks.