In this post I describe an algorithm for clustering regression data that is loosely based on K-Means. I cooked it up yesterday while looking over Cross Validated questions. A very smart professor at Duke has since informed me that this is basically a mixture of regressions model (or a mixture of experts). So, don't I feel silly about the title of this post. Still, I left it in to grip the reader's attention! (Is it working?)

# Statistics @ Home

Following up on a recent post on limitations of the ALR and softmax transforms, I wanted to briefly show how we can derive an Isometric Log-Ratio (ILR) transform from the Additive Log-Ratio (ALR) transform.

Short post describing one of the key limitations with the additive log-ratio (ALR) transform (which is essentially the same as the softmax transform).
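To make the "essentially the same" claim concrete, here is a minimal sketch of the ALR transform and its inverse. It assumes the last part of the composition is used as the reference (any part would do), and the function names are my own for illustration. Under that assumption, the inverse ALR is just a softmax applied to the ALR coordinates with a zero appended for the reference part.

```python
import math

def alr(p):
    """Additive log-ratio: log of each part relative to the last part.

    Assumption: the last component is the reference part.
    """
    ref = p[-1]
    return [math.log(x / ref) for x in p[:-1]]

def softmax(z):
    """Standard softmax, shifted by max(z) for numerical stability."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def alr_inverse(y):
    """Inverse ALR: softmax with a zero appended for the reference part."""
    return softmax(list(y) + [0.0])

# Hypothetical three-part composition (proportions summing to 1)
p = [0.2, 0.3, 0.5]
y = alr(p)               # two ALR coordinates for three parts
p_back = alr_inverse(y)  # recovers the original proportions
```

Note that a composition with D parts maps to only D - 1 ALR coordinates, since the reference part is always fixed at zero on the log-ratio scale.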

I wanted to write a quick post responding to a question that we received about our last post (Error Analysis Made Ridiculously Simple). A reader asked us to give some more detailed examples of how to estimate uncertainty/error in more complicated experimental designs. My response, in short: "When in doubt, try to collect replicate samples in an appropriate way and try to think of ways to benchmark your measurements against known standards." Beyond this somewhat cryptic answer, I will try to give a few examples that should be a little clearer. At the end I will also say a few words on accuracy vs. precision, a distinction that I have found can inspire some ideas.

All measurements have uncertainty. This is not a subjective opinion but an objective fact that should never be ignored. In light of this, I have always been curious about how infrequently uncertainty is actually taken into account in science. In this post I will advocate the use of simple simulation studies for error/uncertainty propagation.
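As a rough illustration of what a simulation study for error propagation can look like, here is a minimal sketch. It assumes each measured input is independent and normally distributed around its measured value; the `propagate` helper and the mass and volume numbers are hypothetical and not taken from the post.

```python
import random
import statistics

def propagate(f, inputs, n_draws=20000, seed=1):
    """Monte Carlo error propagation.

    inputs: list of (mean, sd) pairs, each treated as an independent
    normal distribution (an assumption made for this sketch).
    Returns the mean and standard deviation of f over the simulated draws.
    """
    rng = random.Random(seed)
    draws = [
        f(*[rng.gauss(mu, sd) for mu, sd in inputs])
        for _ in range(n_draws)
    ]
    return statistics.mean(draws), statistics.stdev(draws)

# Hypothetical measurements: mass 10.0 +/- 0.1 g, volume 2.0 +/- 0.05 mL
mean_density, sd_density = propagate(
    lambda mass, volume: mass / volume,
    [(10.0, 0.1), (2.0, 0.05)],
)
```

The appeal of this approach is that `f` can be any calculation at all, so there is no need to work out partial derivatives by hand for each new experimental design.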

Droplet-based microfluidics are emerging as a useful technology in various fields of biomedicine. Both droplet digital PCR and droplet-based culture methods require that droplets are created with either a single DNA molecule or a single cell per droplet. Because it is difficult to place DNA molecules or cells into droplets individually, people instead turn to stochastic models to estimate the distribution of cells per droplet, tuning the experimental parameters to achieve an acceptable distribution. In this post I derive a Poisson approximation to this process and demonstrate how to calculate quantities of interest under uncertainty in lab measurements.
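For a feel of the kind of quantity involved, here is a minimal sketch assuming the number of cells per droplet follows a Poisson distribution with mean `lam`. The value of `lam` below is hypothetical, and the sketch ignores the measurement uncertainty that the post goes on to address.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet under a Poisson model."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 0.1  # hypothetical mean number of cells per droplet

p_empty = poisson_pmf(0, lam)         # droplets with no cell
p_single = poisson_pmf(1, lam)        # droplets with exactly one cell
p_multi = 1.0 - p_empty - p_single    # droplets with two or more cells

# Among droplets that contain at least one cell, the fraction with exactly one
single_given_occupied = p_single / (1.0 - p_empty)
```

This shows the trade-off being tuned: a small `lam` keeps multi-cell droplets rare, at the cost of leaving most droplets empty.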

When hosting our annual Matzah Hunt event, we wanted to come up with a cool and unusual way of picking teams. We decided that we wanted each participant to complete each puzzle only once and to complete each puzzle with a different partner.

Measure theory is actually really simple. Here are some core concepts of measure theory, introduced in a ridiculously simple way.

I am sitting on a plane from Rome to Philadelphia, marveling at how quickly we can move around the globe. There is a 6-hour time-zone difference between Rome and Philadelphia. My flight took off at 11am in Rome and was set to arrive 9 hours and 45 minutes later at 3:45pm in Philadelphia. This got me thinking: what would your daily schedule look like if you continuously flew west?

A few notes on non-linear least squares in R, with code. The example relates to fitting Gompertz models for microbial growth curves.