<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>R on Statistics @ Home</title>
    <link>http://statsathome.com/tags/r/</link>
    <description>Recent content in R on Statistics @ Home</description>
    <generator>Hugo</generator>
    <language>en-EN</language>
    <managingEditor>stats.at.home@gmail.com (Justin and Rachel Silverman)</managingEditor>
    <webMaster>stats.at.home@gmail.com (Justin and Rachel Silverman)</webMaster>
    <copyright>(c) 2017 Justin and Rachel Silverman</copyright>
    <lastBuildDate>Sat, 27 Oct 2018 00:00:00 +0000</lastBuildDate>
    <atom:link href="http://statsathome.com/tags/r/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Sampling from the Singular Normal</title>
      <link>http://statsathome.com/2018/10/27/sampling-from-the-singular-normal/</link>
      <pubDate>Sat, 27 Oct 2018 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2018/10/27/sampling-from-the-singular-normal/</guid>
      <description>&lt;p&gt;Following up on the previous post on &lt;a href=&#34;http://www.statsathome.com/2018/10/19/sampling-from-multivariate-normal-precision-and-covariance-parameterizations/&#34;&gt;sampling from the multivariate normal&lt;/a&gt;, I decided to describe in more detail the situation where the covariance matrix or precision matrix is singular (i.e., it is not positive definite). A normal distribution with such a singular covariance/precision matrix is referred to as a singular normal distribution. Here are 100 samples from a two-dimensional example:&lt;/p&gt;&#xA;&lt;p&gt;&lt;img src=&#34;http://statsathome.com/post/2018-10-27-sampling-from-the-singular-normal_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;384&#34; style=&#34;display: block; margin: auto;&#34; /&gt; Notice that a singular normal essentially has fewer dimensions (in this case 1 dimension) than the dimension of the random variable (in this case 2 dimensions).&lt;/p&gt;</description>
    </item>
    <item>
      <title>Sampling from Multivariate Normal (precision and covariance parameterizations)</title>
      <link>http://statsathome.com/2018/10/19/sampling-from-multivariate-normal-precision-and-covariance-parameterizations/</link>
      <pubDate>Fri, 19 Oct 2018 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2018/10/19/sampling-from-multivariate-normal-precision-and-covariance-parameterizations/</guid>
      <description>&lt;p&gt;Two things are motivating this quick post. First, I have seen a lot of R code that is slower than it should be due to unoptimized sampling from a multivariate normal. Second, yesterday I spent a frustrating few hours tracking down a bug that ultimately was due to a slight subtlety in sampling from the multivariate normal parameterized by a precision matrix (the inverse of a covariance matrix).&lt;/p&gt;&#xA;&lt;p&gt;&lt;strong&gt;Key Idea:&lt;/strong&gt; It is easy to draw univariate standard (i.e., zero mean and unit variance) normal random variables. In fact, most programming languages provide efficient vectorized (e.g., parallelized) algorithms for doing this. In contrast, it is challenging to draw multivariate random variables directly. Motivated by this fact, the approach I discuss below transforms samples from standard normal random variables into samples from the desired multivariate normal random variable&lt;a href=&#34;#fn1&#34; class=&#34;footnoteRef&#34; id=&#34;fnref1&#34;&gt;&lt;sup&gt;1&lt;/sup&gt;&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Bayesian Decision Theory Made Ridiculously Simple</title>
      <link>http://statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/</link>
      <pubDate>Thu, 12 Oct 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/</guid>
      <description>&lt;div id=&#34;TOC&#34;&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#framing-the-decision-space&#34;&gt;Framing the decision space&lt;/a&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#examples-part-1&#34;&gt;Examples: Part 1&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#the-other-information-that-helps-us-make-a-decision&#34;&gt;The other information that helps us make a decision&lt;/a&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#examples-part-2&#34;&gt;Examples: Part 2&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#the-loss-function&#34;&gt;The Loss Function&lt;/a&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#examples-part-3&#34;&gt;Examples: Part 3&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#uncertainty&#34;&gt;Uncertainty&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#fully-worked-example-what-price-should-i-sell-my-used-phone-for&#34;&gt;Fully Worked Example: What price should I sell my used phone for?&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#next-steps&#34;&gt;Next steps&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;/div&gt;&#xA;&#xA;&lt;p&gt;Bayesian Decision Theory is a wonderfully useful tool that provides a formalism for decision making under uncertainty. It is used in a diverse range of applications including, but definitely not limited to, finance (for guiding investment strategies) and engineering (for designing control systems). In what follows I hope to distill a few of the key ideas in Bayesian decision theory. In particular, I will give examples that rely on simulation rather than analytical closed-form solutions to global optimization problems. My hope is that such a simulation-based approach will provide a gentler introduction while allowing readers to solve more difficult problems right from the start.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Plotting a Sequential Binary Partition on a Tree in R</title>
      <link>http://statsathome.com/2017/09/20/plotting-a-sequential-binary-partition-on-a-tree-in-r/</link>
      <pubDate>Wed, 20 Sep 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/09/20/plotting-a-sequential-binary-partition-on-a-tree-in-r/</guid>
      <description>&lt;p&gt;For users of PhILR (&lt;a href=&#34;https://elifesciences.org/articles/21887&#34;&gt;Paper&lt;/a&gt;, &lt;a href=&#34;https://bioconductor.org/packages/release/bioc/html/philr.html&#34;&gt;R Package&lt;/a&gt;), and also for users of the ILR transform who want to make use of the awesome plotting functions in R, I wanted to share a function for plotting a sequential binary partition on a tree using the &lt;a href=&#34;https://bioconductor.org/packages/release/bioc/html/ggtree.html&#34;&gt;ggtree package&lt;/a&gt;. I recently wrote this for a manuscript but figured it might be of more general use to others as well.&lt;/p&gt;&#xA;&lt;p&gt;In its simplest form a sequential binary partition can be represented as a binary tree.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Visualizing the Multinomial in the Simplex</title>
      <link>http://statsathome.com/2017/09/14/visualizing-the-multinomial-in-the-simplex/</link>
      <pubDate>Thu, 14 Sep 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/09/14/visualizing-the-multinomial-in-the-simplex/</guid>
      <description>&lt;p&gt;Lately I have been working on figures for a manuscript. In this process I created a few visualizations that I thought might help others visualize the Multinomial distribution. I will focus on describing how counting processes introduce uncertainty into estimates of relative abundances, and I will end with a discussion of how understanding the Multinomial has impacted my view of analyses of sequence count data (e.g., data from 16S studies of the microbiome, RNA-seq, and more). Here I have chosen to focus on the Multinomial distribution; however, much of what I discuss also relates to the &lt;a href=&#34;https://en.wikipedia.org/wiki/Hypergeometric_distribution#Multivariate_hypergeometric_distribution&#34;&gt;Multivariate Hypergeometric Distribution&lt;/a&gt;.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Does Gauss Love Me More in the Kitchen?</title>
      <link>http://statsathome.com/2017/08/27/does-gauss-love-me-more-in-the-kitchen/</link>
      <pubDate>Sun, 27 Aug 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/08/27/does-gauss-love-me-more-in-the-kitchen/</guid>
      <description>&lt;div id=&#34;the-idea&#34; class=&#34;section level2&#34;&gt;&#xA;&lt;h2&gt;The Idea&lt;/h2&gt;&#xA;&lt;p&gt;First things first, Gauss is our dog.&lt;/p&gt;&#xA;Since I am able to work from home, my dog Gauss and I spend a lot of time together. As a result, I like to think I know why he does what he does. But of course I will never really know - though it’s nice to think that I do. Both of us being creatures of habit, we have fallen into a nice routine during the day - one where he sleeps the day away and comes to get me around 4pm for some outdoor training/playing. I have noticed that whenever I do anything interesting or out of the norm, he is right there, waiting to see if he can benefit from the activity. Most remarkably, it feels like whenever we are in the kitchen, he sits down right in the middle of everything, waiting for scraps and food that drops on the floor.&#xA;&lt;center&gt;&#xA;&lt;img src=&#34;http://statsathome.com/img/2017-08-27-does-gauss-love-me-more-in-the-kitchen/gauss_oven.jpg&#34; alt=&#34;Gauss at Thanksgiving.&#34; /&gt;&#xA;&lt;/center&gt;&#xA;&lt;p&gt;I know Gauss loves me, but I wonder if I am more valuable to him in certain rooms? Does he “love” me more in the kitchen?&lt;/p&gt;</description>
    </item>
    <item>
      <title>A New(?) Regression Clustering Algorithm</title>
      <link>http://statsathome.com/2017/08/13/a-new-functional-clustering-algorithm/</link>
      <pubDate>Sun, 13 Aug 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/08/13/a-new-functional-clustering-algorithm/</guid>
      <description>&lt;div id=&#34;TOC&#34;&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#motivation&#34;&gt;Motivation&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#my-solution---hybrid-k-meanslinear-regression-with-transformation&#34;&gt;My Solution - Hybrid K-Means/Linear-Regression with Transformation&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#starting-on-boring-simulated-data&#34;&gt;Starting on Boring Simulated Data&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#now-a-more-interesting-simulated-dataset&#34;&gt;Now a more interesting simulated dataset&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#more-realistic-presense-of-observational-noise&#34;&gt;More realistic, presence of observational noise&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#conclusions-and-future-directions&#34;&gt;Conclusions and future directions&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;/div&gt;&#xA;&#xA;&lt;div id=&#34;motivation&#34; class=&#34;section level1&#34;&gt;&#xA;&lt;h1&gt;Motivation&lt;/h1&gt;&#xA;&lt;p&gt;I am a fan of the &lt;a href=&#34;https://stackexchange.com/&#34;&gt;Stack Exchange forums&lt;/a&gt;. In particular, I like &lt;a href=&#34;https://stats.stackexchange.com/&#34;&gt;Cross Validated&lt;/a&gt; and &lt;a href=&#34;https://stackoverflow.com/&#34;&gt;Stack Overflow&lt;/a&gt;. An &lt;a href=&#34;https://stats.stackexchange.com/questions/297689/method-to-group-linear-features-in-a-graph/297745#297745&#34;&gt;interesting question regarding clustering&lt;/a&gt; was posted recently. 
Essentially someone had the following dataset.&lt;/p&gt;&#xA;&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;plot(Retirees)&lt;/code&gt;&lt;/pre&gt;&#xA;&lt;p&gt;&lt;img src=&#34;http://statsathome.com/post/2017-08-13-a-new-functional-clustering-algorithm_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;&#xA;&lt;p&gt;The poster wanted a way of clustering the observations into the “lines” that are fairly easy to observe in the data. I am going to ignore the fact that these lines are actually the result of an artifact (e.g., conversion of discrete values to percentages and then plotting the percentages vs. a variable used to calculate the percentages) and just pretend they are real, as I think it's still an interesting problem. I am also going to simulate some non-artifactual data and use this as well.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Stochastic Loading of Microfluidic Droplets</title>
      <link>http://statsathome.com/2017/07/08/stochastic-loading-of-microfluidic-droplets/</link>
      <pubDate>Sat, 08 Jul 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/07/08/stochastic-loading-of-microfluidic-droplets/</guid>
      <description>&lt;div id=&#34;TOC&#34;&gt;&#xA;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#the-basic-model&#34;&gt;The Basic Model&lt;/a&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#the-first-step---multinomial&#34;&gt;The First Step - Multinomial&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#focusing-on-our-question---binomial&#34;&gt;Focusing on our Question - Binomial&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#approximating-the-binomial---poisson&#34;&gt;Approximating the Binomial - Poisson&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#when-is-the-poisson-approximation-to-the-binomial-valid&#34;&gt;When is the Poisson approximation to the Binomial valid?&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#looking-at-real-parameters-values&#34;&gt;Looking at Real Parameters Values&lt;/a&gt;&lt;/li&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#calculating-the-distribution-of-quantities-in-light-of-uncertainty-in-lab-measurements&#34;&gt;Calculating the distribution of quantities in light of uncertainty in lab measurements&lt;/a&gt;&lt;ul&gt;&#xA;&lt;li&gt;&lt;a href=&#34;#bivariate-distributions&#34;&gt;Bivariate Distributions&lt;/a&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&lt;/li&gt;&#xA;&lt;/ul&gt;&#xA;&lt;/div&gt;&#xA;&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Microfluidics#Droplet-based_microfluidics&#34;&gt;Droplet-based microfluidics&lt;/a&gt; are emerging as a useful technology in various fields of biomedicine. Both &lt;a href=&#34;https://en.wikipedia.org/wiki/Digital_polymerase_chain_reaction#Droplet_Digital_PCR&#34;&gt;droplet digital PCR&lt;/a&gt; and droplet based culture methods require that droplets are created with either a single DNA molecule or a single cell per droplet. 
Obviously it is difficult to individually place DNA molecules or cells into droplets; instead, people turn to stochastic models to estimate the distribution of cells per droplet, tuning the experimental parameters to achieve an acceptable distribution.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Fitting Non-Linear Growth Curves in R</title>
      <link>http://statsathome.com/2017/06/07/fitting-non-linear-groth-curves-in-r/</link>
      <pubDate>Wed, 07 Jun 2017 00:00:00 +0000</pubDate><author>stats.at.home@gmail.com (Justin and Rachel Silverman)</author>
      <guid>http://statsathome.com/2017/06/07/fitting-non-linear-groth-curves-in-r/</guid>
      <description>&lt;p&gt;A few months ago I offered to help a friend fit a bunch of microbial growth curves using R. When I was looking over possible solutions I was quite surprised by how little information was available online. Apart from the R package &lt;code&gt;grofit&lt;/code&gt; (which, after playing around with it, I decided seemed a little over-designed for my uses) I found very limited resources or code available. As a result, I wanted to share a few functions I wrote to quickly fit non-linear growth models. I was specifically asked to help fit growth curves using the Gompertz function and this is what I demonstrate below. I hope that this example gives some insight into how to fit non-linear models in R, beyond simply Gompertz growth curves.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
