Monday, March 16, 2009

Sparse vectors - ALWAYS use Column Vectors

I was working on some 'signal' data that I obtained from a ChIP-seq experiment that measures the binding affinity of a transcription factor to every nucleotide in the human genome. I was trying to manipulate this signal data using sparse vectors in MATLAB.

Most of the time I use column vectors by default. For some reason I decided to switch to row vectors. What a difference!

An empty (all-zeros) sparse column vector of length 2 million takes only a few bytes of memory. An empty sparse row vector of the same length, however, gave me an 'out of memory' error. I knew that MATLAB stores sparse matrices column by column (compressed column format), so the storage overhead grows with the number of columns rather than the number of rows, but this was the first time I actually observed such a vast difference.
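To make the asymmetry concrete, here is a rough back-of-the-envelope model of compressed-column storage, written in plain Python so it runs anywhere. The function name `csc_bytes`, the 8-byte index and value sizes, and the "at least one slot is allocated" rule are my assumptions for illustration, not MATLAB's documented internals, and the numbers are estimates, not measurements.

```python
def csc_bytes(n_rows, n_cols, nnz, index_bytes=8, value_bytes=8):
    """Estimate the storage of a compressed-sparse-column matrix.

    Hypothetical model: one pointer per column (plus one), and one
    row index plus one value per stored entry. n_rows does not affect
    the result, which is the whole point of the comparison.
    """
    stored = max(nnz, 1)  # assumption: space for at least one entry
    col_pointers = (n_cols + 1) * index_bytes
    row_indices = stored * index_bytes
    values = stored * value_bytes
    return col_pointers + row_indices + values


n = 2_000_000

# Empty column vector: n rows, 1 column -> only 2 column pointers.
column_vector = csc_bytes(n, 1, nnz=0)

# Empty row vector: 1 row, n columns -> n + 1 column pointers.
row_vector = csc_bytes(1, n, nnz=0)

print("column vector:", column_vector, "bytes")
print("row vector:   ", row_vector, "bytes")
```

Under these assumptions the empty row vector costs roughly 16 MB of column pointers before a single value is stored, while the column vector costs a few dozen bytes. This only models the size of the vector itself; it does not by itself account for the 'out of memory' error. In MATLAB, running `whos` on `sparse(2e6,1)` and `sparse(1,2e6)` should show the actual figures on your machine.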

Moral of the story: If you are manipulating sparse vectors in MATLAB, ALWAYS use column vectors!
