<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-3369578317387300562</id><updated>2011-10-13T03:34:50.153-07:00</updated><category term='mex'/><category term='screen'/><category term='append'/><category term='math'/><category term='plot'/><category term='high-density'/><category term='round'/><category term='hashtable'/><category term='cluster'/><category term='nohup'/><category term='sequence'/><category term='k-mer'/><category term='save'/><category term='memory'/><category term='smoothing'/><category term='mcc'/><category term='surf'/><category term='matlab'/><category term='standalone'/><category term='motifs'/><category term='python'/><category term='roc'/><category term='unix'/><category term='sparse'/><category term='function'/><category term='image'/><category term='code'/><category term='decimal'/><category term='error'/><category term='matlab compiler'/><category term='scatter'/><title type='text'>MATLAB for compbio</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>13</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-227589893811401381</id><published>2011-01-13T16:06:00.001-08:00</published><updated>2011-01-13T16:09:02.278-08:00</updated><title type='text'>Compiling matlab to a standalone with no display option</title><content type='html'>You will often want to compile matlab files to a standalone executable using the mcc command. By default, however, you will get annoying warning messages about no display being available. To avoid these messages, use the compiler directive -R &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Before Matlab 2010b&lt;/div&gt;&lt;div&gt;mcc -R -nodisplay ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For Matlab 2010b, although the documentation says it should be the same, it isn't. You need to drop the leading hyphen, 
i.e.&lt;/div&gt;&lt;div&gt;mcc -R nodisplay ...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-227589893811401381?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/227589893811401381/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=227589893811401381' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/227589893811401381'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/227589893811401381'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2011/01/compiling-matlab-to-standalone-with-no.html' title='Compiling matlab to a standalone with no display option'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-6958462451737944688</id><published>2010-12-22T00:30:00.000-08:00</published><updated>2010-12-22T00:32:21.739-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='function'/><category scheme='http://www.blogger.com/atom/ns#' term='standalone'/><title type='text'>isdeployed( )</title><content type='html'>isdeployed() is a handy function in matlab to check whether a piece of matlab code is running as a standalone deployed app or whether it is running in native matlab.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/3369578317387300562-6958462451737944688?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/6958462451737944688/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=6958462451737944688' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/6958462451737944688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/6958462451737944688'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2010/12/isdeployed.html' title='isdeployed( )'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-6823521317705164717</id><published>2010-06-29T11:23:00.000-07:00</published><updated>2010-06-29T11:30:29.432-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='decimal'/><category scheme='http://www.blogger.com/atom/ns#' term='round'/><category scheme='http://www.blogger.com/atom/ns#' term='math'/><title type='text'>Truncating or rounding off a decimal value/array to user-specified number of decimal places</title><content type='html'>Sometimes, you want to truncate long floating point numbers to keep just the first few digits following the decimal point. 
The easy way to do this is&lt;br /&gt;&lt;br /&gt;xr = round(x/n) * n&lt;br /&gt;&lt;br /&gt;where&lt;br /&gt;x = original floating point number&lt;br /&gt;n = 10^(-[number of digits after decimal])&lt;br /&gt;&lt;br /&gt;e.g. x=1.5673454, n = 0.01 (2 digits after decimal point)&lt;br /&gt;xr = 1.57&lt;br /&gt;&lt;br /&gt;Note that this rounds to the nearest value; to truncate instead, use fix(x/n) * n, which gives 1.56 for the same example.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-6823521317705164717?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/6823521317705164717/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=6823521317705164717' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/6823521317705164717'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/6823521317705164717'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2010/06/truncating-or-rounding-off-decimal.html' title='Truncating or rounding off a decimal value/array to user-specified number of decimal places'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-2766109693293352855</id><published>2010-04-17T01:25:00.000-07:00</published><updated>2010-04-17T01:27:20.126-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mex'/><category scheme='http://www.blogger.com/atom/ns#' 
term='python'/><title type='text'>Passing data in and out of MATLAB and Python</title><content type='html'>Came across this great package that allows direct exchange between MATLAB and Python.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://vader.cse.lehigh.edu/~perkins/pymex.html"&gt;http://vader.cse.lehigh.edu/~perkins/pymex.html&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-2766109693293352855?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/2766109693293352855/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=2766109693293352855' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/2766109693293352855'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/2766109693293352855'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2010/04/passing-data-in-and-out-of-matlab-and.html' title='Passing data in and out of MATLAB and Python'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-1560571154540305318</id><published>2010-04-05T11:13:00.000-07:00</published><updated>2010-04-05T11:19:20.669-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cluster'/><category scheme='http://www.blogger.com/atom/ns#' term='standalone'/><category 
scheme='http://www.blogger.com/atom/ns#' term='matlab compiler'/><category scheme='http://www.blogger.com/atom/ns#' term='error'/><category scheme='http://www.blogger.com/atom/ns#' term='mcc'/><title type='text'>How to solve MCR cache access problems on a cluster</title><content type='html'>Often when I run compiled matlab applications on a cluster, I get the error message &lt;b&gt;&lt;br /&gt;&lt;br /&gt;"Could not access the MCR component cache."&lt;br /&gt;&lt;/b&gt;&lt;p&gt; &lt;/p&gt;&lt;p&gt;This tends to happen because matlab is not able to access the MCR cache directory. By default this happens to be in your home directory. When a large number of compiled matlab programs start or run simultaneously (e.g. when you submit a job array), the load on the file system becomes too great, giving rise to the problem.&lt;/p&gt;&lt;p&gt;The simplest way to solve this problem is to point the MCR_CACHE_ROOT environment variable to a local temporary directory on each node of the cluster.&lt;br /&gt;&lt;/p&gt; &lt;pre&gt;export MCR_CACHE_ROOT=$TMPDIR&lt;br /&gt;&lt;/pre&gt; &lt;p&gt;This redirects the cache to a temp directory that is able to handle the traffic.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-1560571154540305318?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/1560571154540305318/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=1560571154540305318' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/1560571154540305318'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/1560571154540305318'/><link 
rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2010/04/how-to-solve-mcr-cache-access-problems.html' title='How to solve MCR cache access problems on a cluster'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-7723958998478449510</id><published>2010-01-09T21:37:00.001-08:00</published><updated>2010-01-09T21:48:52.369-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='high-density'/><category scheme='http://www.blogger.com/atom/ns#' term='smoothing'/><category scheme='http://www.blogger.com/atom/ns#' term='image'/><category scheme='http://www.blogger.com/atom/ns#' term='surf'/><category scheme='http://www.blogger.com/atom/ns#' term='code'/><category scheme='http://www.blogger.com/atom/ns#' term='scatter'/><category scheme='http://www.blogger.com/atom/ns#' term='plot'/><title type='text'>High density scatter plots</title><content type='html'>The scatter(x,y) function in MATLAB is useful to visualize the joint distribution of two variables x and y. But this function breaks down (gets too slow and memory intensive) if the number of data points in x/y is large.&lt;br /&gt;&lt;br /&gt;A nice trick to visualize high density scatter plots is to bin the data and smooth the 2-D histogram. Then one can use the image function or surf function with alpha transparency to view the joint distribution. Darker regions could represent a high density of points and lighter regions a low density of points.&lt;br /&gt;&lt;br /&gt;R and several other programming languages have built-in functions for this. It is a little surprising that MATLAB doesn't have one built in yet. 
Anyway, &lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/20/5/623"&gt;here&lt;/a&gt; is a paper that gives a very efficient way of creating these smoothed high-density scatter plots and &lt;a href="http://www.mathworks.com/matlabcentral/fileexchange/13352-smoothhist2d"&gt;here&lt;/a&gt; is an implementation.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-7723958998478449510?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/7723958998478449510/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=7723958998478449510' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/7723958998478449510'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/7723958998478449510'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2010/01/high-density-scatter-plots.html' title='High density scatter plots'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-1322807016899019364</id><published>2009-10-29T17:36:00.000-07:00</published><updated>2009-10-29T18:18:05.914-07:00</updated><title type='text'>PSI-BLAST and BLAST background probabilities</title><content type='html'>This post is not directly related to MATLAB but I felt it was important to post this.&lt;br /&gt;&lt;br /&gt;I recently 
realized that it is not trivial to find the background amino acid probabilities that are used in BLAST and PSI-BLAST. A Google search didn't turn them up. None of the papers cited in the BLAST papers actually give the frequencies in tabular form. I would have thought this should have been documented by NCBI in the BLAST help or something! Anyway, after a few hours of searching and reading papers and eventually code, I found the actual values used. They can be found in this file&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/IEB/ToolBox/C_DOC/lxr/source/tools/blastkar.c"&gt;http://www.ncbi.nlm.nih.gov/IEB/ToolBox/C_DOC/lxr/source/tools/blastkar.c&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Below are the tables which contain the frequencies. They need to be normalized (divide each value by the sum of the frequencies, which is 1000 for each amino acid table) to convert the frequencies to probabilities. The nucleotide table nt_prob sums to 100 instead.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://spreadsheets.google.com/ccc?key=0Am6FxqAtrFDwdDh6WWhabTRyaThJNFBDMV9LZmJkVVE&amp;amp;hl=en"&gt;Google doc spreadsheet&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;NOTE: PSI-BLAST uses the Robinson values by default&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;#if STD_AMINO_ACID_FREQS == Dayhoff_prob&lt;br /&gt;/* M. O. Dayhoff amino acid background frequencies */&lt;br /&gt;static BLAST_LetterProb Dayhoff_prob[] = {&lt;br /&gt;                { 'A', 87.13 },&lt;br /&gt;                { 'C', 33.47 },&lt;br /&gt;                { 'D', 46.87 },&lt;br /&gt;                { 'E', 49.53 },&lt;br /&gt;                { 'F', 39.77 },&lt;br /&gt;                { 'G', 88.61 },&lt;br /&gt;                { 'H', 33.62 },&lt;br /&gt;                { 'I', 36.89 },&lt;br /&gt;                { 'K', 80.48 },&lt;br /&gt;                { 'L', 85.36 },&lt;br /&gt;                { 'M', 14.75 },&lt;br /&gt;                { 'N', 40.43 },&lt;br /&gt;                { 'P', 50.68 },&lt;br /&gt;                { 'Q', 38.26 },&lt;br /&gt;                { 'R', 40.90 },&lt;br /&gt;                { 'S', 69.58 },&lt;br /&gt;                { 'T', 58.54 },&lt;br /&gt;                { 'V', 64.72 },&lt;br /&gt;                { 'W', 10.49 },&lt;br /&gt;                { 'Y', 29.92 }&lt;br /&gt;        };&lt;br /&gt;#endif&lt;br /&gt;&lt;br /&gt;#if STD_AMINO_ACID_FREQS == Altschul_prob&lt;br /&gt;/* Stephen Altschul amino acid background frequencies */&lt;br /&gt;static BLAST_LetterProb Altschul_prob[] = {&lt;br /&gt;                { 'A', 81.00 },&lt;br /&gt;                { 'C', 15.00 },&lt;br /&gt;                { 'D', 54.00 },&lt;br /&gt;                { 'E', 61.00 },&lt;br /&gt;                { 'F', 40.00 },&lt;br /&gt;                { 'G', 68.00 },&lt;br /&gt;                { 'H', 22.00 },&lt;br /&gt;                { 'I', 57.00 },&lt;br /&gt;                { 'K', 56.00 },&lt;br /&gt;                { 'L', 93.00 },&lt;br /&gt;                { 'M', 25.00 },&lt;br /&gt;                { 'N', 45.00 },&lt;br /&gt;                { 'P', 49.00 },&lt;br /&gt;                { 'Q', 39.00 },&lt;br /&gt;                { 'R', 57.00 },&lt;br /&gt;                { 'S', 68.00 },&lt;br /&gt;                { 'T', 58.00 },&lt;br /&gt;                { 'V', 67.00 },&lt;br /&gt;                { 'W', 13.00 },&lt;br /&gt;                { 'Y', 32.00 }&lt;br /&gt;        };&lt;br /&gt;#endif&lt;br /&gt;&lt;br /&gt;#if STD_AMINO_ACID_FREQS == Robinson_prob&lt;br /&gt;/* amino acid background frequencies from Robinson and Robinson */&lt;br /&gt;static BLAST_LetterProb Robinson_prob[] = {&lt;br /&gt;                { 'A', 78.05 },&lt;br /&gt;                { 'C', 19.25 },&lt;br /&gt;                { 'D', 53.64 },&lt;br /&gt;                { 'E', 62.95 },&lt;br /&gt;                { 'F', 38.56 },&lt;br /&gt;                { 'G', 73.77 },&lt;br /&gt;                { 'H', 21.99 },&lt;br /&gt;                { 'I', 51.42 },&lt;br /&gt;                { 'K', 57.44 },&lt;br /&gt;                { 'L', 90.19 },&lt;br /&gt;                { 'M', 22.43 },&lt;br /&gt;                { 'N', 44.87 },&lt;br /&gt;                { 'P', 52.03 },&lt;br /&gt;                { 'Q', 42.64 },&lt;br /&gt;                { 'R', 51.29 },&lt;br /&gt;                { 'S', 71.20 },&lt;br /&gt;                { 'T', 58.41 },&lt;br /&gt;                { 'V', 64.41 },&lt;br /&gt;                { 'W', 13.30 },&lt;br /&gt;                { 'Y', 32.16 }&lt;br /&gt;        };&lt;br /&gt;#endif&lt;br /&gt;&lt;br /&gt;static BLAST_LetterProb nt_prob[] = {&lt;br /&gt;                { 'A', 25.00 },&lt;br /&gt;                { 'C', 25.00 },&lt;br /&gt;                { 'G', 25.00 },&lt;br /&gt;                { 'T', 25.00 }&lt;br /&gt;        };&lt;br /&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-1322807016899019364?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/1322807016899019364/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=1322807016899019364' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/1322807016899019364'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/1322807016899019364'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2009/10/psi-blast-and-blast-background.html' title='PSI-BLAST and BLAST background probabilities'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-4648664077955789435</id><published>2009-03-16T02:50:00.000-07:00</published><updated>2009-03-16T03:00:44.979-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='save'/><category scheme='http://www.blogger.com/atom/ns#' term='append'/><title type='text'>Appending to .MAT files</title><content type='html'>You 
can append variables to a .mat file using&lt;br /&gt;&lt;br /&gt;&gt;&gt; save(oFname,'var','-append');&lt;br /&gt;&lt;br /&gt;Consider 2 scenarios:&lt;br /&gt;1) The variable 'var' is being added to the .mat file for the first time&lt;br /&gt;2) The variable 'var' already exists in the .mat file and is being overwritten or updated&lt;br /&gt;&lt;br /&gt;If 'var' takes up a lot of memory, i.e. it is a large matrix or array, (2) is orders of magnitude slower than (1).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Moral of the story:&lt;/span&gt; As far as possible, avoid overwriting or updating a variable in a .mat file, especially if the variable takes up a lot of memory.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-4648664077955789435?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/4648664077955789435/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=4648664077955789435' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/4648664077955789435'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/4648664077955789435'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2009/03/appending-to-mat-files.html' title='Appending to .MAT files'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-61159355136896168</id><published>2009-03-16T00:47:00.000-07:00</published><updated>2009-03-16T01:38:55.781-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='sparse'/><title type='text'>Sparse vectors - ALWAYS use Column Vectors</title><content type='html'>I was working on some 'signal' data that I obtained from a ChIP-seq experiment that measures the binding affinity of a transcription factor to every nucleotide in the human genome. I was trying to manipulate this signal data using sparse vectors in MATLAB.&lt;br /&gt;&lt;br /&gt;Most of the time I use column vectors by default. For some reason I decided to switch to row vectors. What a difference!&lt;br /&gt;&lt;br /&gt;An empty (all-zeros) sparse column vector of length 2 million barely takes a few bytes of memory. However, an empty sparse row vector of the same length gives an 'out of memory' error. 
While I was aware of the space efficiency of column-based sparse matrices in MATLAB, this was the first time I actually observed such a vast difference. MATLAB stores sparse matrices in compressed column form with one pointer per column, so a 1-by-N sparse row vector needs memory proportional to N even when every entry is zero.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Moral of the story&lt;/span&gt;: If you are manipulating sparse vectors, ALWAYS use column vectors!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-61159355136896168?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/61159355136896168/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=61159355136896168' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/61159355136896168'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/61159355136896168'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2009/03/sparse-vectors-always-use-column.html' title='Sparse vectors - ALWAYS use Column Vectors'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-7113455384162657823</id><published>2009-02-28T17:48:00.000-08:00</published><updated>2009-03-16T01:36:23.792-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='memory'/><title type='text'>Dealing with massive files with limited memory</title><content type='html'>&lt;div style="text-align: left;"&gt;When dealing with extremely 
massive files such as entire genomes, it is pretty much impossible to fit everything in memory. For situations like this MATLAB has an extremely slick function called &lt;a href="http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/memmapfile.html"&gt;memmapfile&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The main advantages are&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The file is not loaded into memory all at once&lt;/li&gt;&lt;li&gt;You can access the entire file or a portion of the file as if it were a standard MATLAB array using indexing operations. Let's say the file holds the sequence of an entire genome, stored as one character per nucleotide. If you say a = memmapfile('genome.dat'), then a.Data(1:10) gives you the first 10 nucleotides of the genome (as uint8 values, the default format).&lt;/li&gt;&lt;li&gt;It can handle single formats or multiple formats&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Much faster than &lt;a href="jar:file:///C:/Program%20Files/MATLAB/R2008b/help/techdoc/help.jar%21/ref/fread.html"&gt;&lt;tt&gt;fread&lt;/tt&gt;&lt;/a&gt; and &lt;a href="jar:file:///C:/Program%20Files/MATLAB/R2008b/help/techdoc/help.jar%21/ref/fwrite.html"&gt;&lt;tt&gt;fwrite&lt;/tt&gt;&lt;/a&gt;.&lt;/li&gt;&lt;/ul&gt;This is extremely useful for handling large binary files.&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-7113455384162657823?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/7113455384162657823/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=7113455384162657823' title='4 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/3369578317387300562/posts/default/7113455384162657823'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/7113455384162657823'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2009/02/dealing-with-massive-files-with-limited.html' title='Dealing with massive files with limited memory'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-2979800761603891272</id><published>2009-01-24T10:56:00.000-08:00</published><updated>2009-02-05T11:58:18.655-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='code'/><category scheme='http://www.blogger.com/atom/ns#' term='roc'/><title type='text'>Vectorized ROC curve code + AUC</title><content type='html'>ROC curves are often used to display the predictive performance of binary classifiers. The area under the ROC curve (AUC) is a way to compare various classifiers. A perfect classifier has an AUC of 1 and a completely bogus (random) classifier has an AUC of 0.5. You can read more about ROC curves &lt;a href="http://en.wikipedia.org/wiki/Receiver_operating_characteristic"&gt;here&lt;/a&gt;. There is a ton of code for plotting ROC curves and calculating AUC. But most use 'for' loops. And as we all know, loops slow everything down in MATLAB. 
You can download my vectorized code for plotting ROC curves for multiple classifiers and calculating the AUC for each.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://sites.google.com/site/anshulkundaje/icode/calc_roc.m"&gt;Download Link&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-2979800761603891272?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/2979800761603891272/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=2979800761603891272' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/2979800761603891272'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/2979800761603891272'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2009/01/vectorized-roc-curve-code-auc.html' title='Vectorized ROC curve code + AUC'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-3145897786020674763</id><published>2008-10-31T05:49:00.001-07:00</published><updated>2009-02-05T11:37:41.588-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='screen'/><category scheme='http://www.blogger.com/atom/ns#' term='unix'/><category scheme='http://www.blogger.com/atom/ns#' term='nohup'/><category scheme='http://www.blogger.com/atom/ns#' term='matlab'/><title type='text'>Running 
MATLAB on UNIX</title><content type='html'>nohup matlab -nodisplay -nosplash -nodesktop -nojvm -r "matlab_command;exit;" &gt; logfile&lt;br /&gt;&lt;br /&gt;The nohup command essentially allows you to run MATLAB from a remote terminal without worrying about connection drops or other hang-up issues. However, sometimes it doesn't behave as expected on some UNIX systems. It might be better to use the 'screen' command.&lt;br /&gt;&lt;br /&gt;A simple tutorial on how to use the screen command is &lt;a href="http://kb.iu.edu/data/acuy.html" target="blank"&gt;here.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;All you need to do is type the following from your terminal&lt;br /&gt;&gt;screen   %This will open up a new screen (Duh!)&lt;br /&gt;&gt;Type your favorite commands&lt;br /&gt;&lt;br /&gt;You can now comfortably disconnect your session and reconnect to it any time.&lt;br /&gt;&lt;br /&gt;If you want to get out of this screen and back to the original terminal, press Ctrl + a and then d.&lt;br /&gt;&lt;br /&gt;To reconnect to a screen session simply type&lt;br /&gt;&gt;screen -r&lt;br /&gt;&lt;br /&gt;This will either bring up the screen session (if you have just one session going) or give you a list of screen ids.&lt;br /&gt;&lt;br /&gt;To connect to a particular screen session&lt;br /&gt;&gt; screen -r &amp;lt;screen_id&amp;gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-3145897786020674763?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/3145897786020674763/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=3145897786020674763' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/3145897786020674763'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/3145897786020674763'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2008/10/running-matlab-on-unix.html' title='Running MATLAB on UNIX'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-3369578317387300562.post-8239837129768692793</id><published>2008-10-29T19:58:00.000-07:00</published><updated>2008-10-29T20:15:33.856-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='motifs'/><category scheme='http://www.blogger.com/atom/ns#' term='k-mer'/><category scheme='http://www.blogger.com/atom/ns#' term='hashtable'/><category scheme='http://www.blogger.com/atom/ns#' term='sequence'/><title type='text'>Hash functions for sequence scanning</title><content type='html'>INPUT: A set of sequences (DNA/Protein etc.)&lt;br /&gt;OUTPUT: A motif matrix of all possible &lt;span style="font-style: italic;"&gt;k&lt;/span&gt;-mers and gapped elements (dimers for example) in the set of sequences&lt;br /&gt;&lt;br /&gt;MATLAB doesn't have any built-in hashing functions that run in O(1) time. You would want something that can do a quick array index lookup for each &lt;span style="font-style: italic;"&gt;k&lt;/span&gt;-mer or dimer into the motif matrix. There are several hacks you can pull off.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;You can use a for loop. This simply sucks. Way too slow.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;If you are scanning DNA sequences then you can encode A = 1, C = 2, G = 3, T = 4 ... 
In this way every kmer automatically becomes a number which can be used as an index into a sparse matrix. You can then prune the sparse matrix to remove indices that do not match any kmer sequence. This is extremely fast. However, it doesn't work for dimers, very long kmers, or more complex sequence elements such as regular expressions. It also won't work for protein sequences because there are 21 amino acids, so you would start generating very large array indices for &lt;span style="font-style: italic;"&gt;k&lt;/span&gt;-mers with &lt;span style="font-style: italic;"&gt;k&lt;/span&gt;&gt;8.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;I feel the best option though is to use the Java hash object ht = java.util.Hashtable&lt;/li&gt;&lt;/ol&gt;More on (3) ...&lt;br /&gt;&lt;br /&gt;You create the hash table object as ht = java.util.Hashtable. Check out its member functions &lt;a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Hashtable.html"&gt;here&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The keys would be the kmers/dimers etc. and the values would be the motif matrix indices. The only problem with this is that you can add only a single (key,value) pair at a time and get the value corresponding to only a single key at a time. So it would be better to write Java code that would take a set of kmers and add them to the hash table and return indices ... 
basically a vectorized version of get() and put().&lt;br /&gt;&lt;br /&gt;I need to do this.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3369578317387300562-8239837129768692793?l=matlab4compbio.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://matlab4compbio.blogspot.com/feeds/8239837129768692793/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=3369578317387300562&amp;postID=8239837129768692793' title='9 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/8239837129768692793'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3369578317387300562/posts/default/8239837129768692793'/><link rel='alternate' type='text/html' href='http://matlab4compbio.blogspot.com/2008/10/hash-functions-for-sequence-scanning.html' title='Hash functions for sequence scanning'/><author><name>Anshul</name><uri>http://www.blogger.com/profile/02178466793315780705</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>9</thr:total></entry></feed>
