Blog Posts - 2025

Anomalies in Physics

19 minute read

Published:

This is a note based on a talk by Dan Freed entitled “What is an Anomaly?” and on the notes he posted for it. There are also slides from the talk. This is not my area of expertise, so any mistakes here are mine, not Dan Freed’s. The main slogan of the talk is the following: quantum theory takes place in projective geometry, and an anomaly, mathematically speaking, can be viewed as an expression of this projectivity.

Visualizing Phase Spaces

4 minute read

Published:

Consider a simple swinging pendulum. One way to describe its physical state is to give an angle describing its position with respect to the vertical (where it hangs at rest) together with a number describing its angular velocity. The space of all such states, also called a phase space, is $S^1 \times \mathbb{R}$: the circle $S^1$ describes the angle and the real numbers describe the angular velocity. This space is also known as the cotangent bundle $T^* S^1$. Similarly, we can consider a double pendulum with two rods, each with its own angle and angular velocity. The pair of angles lives in a torus $T^2 = S^1 \times S^1$ and the pair of angular velocities lives in $\mathbb{R}^2$, so this phase space is the cotangent bundle $T^* T^2$. The double pendulum is known to be chaotic; here is an animation depicting this. The plot on the right looks like a square, but we identify opposite edges to form a torus.
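A minimal sketch of the single pendulum's phase space as the cylinder $S^1 \times \mathbb{R}$: we evolve the state with a crude explicit Euler step (an illustrative choice, not what the post necessarily uses) and wrap the angle back into $[-\pi, \pi)$ to realize the circle factor.

```python
import math

def pendulum_step(theta, omega, dt=0.01, g=9.81, length=1.0):
    """One explicit-Euler step for a simple pendulum. The state (theta, omega)
    lives on the cylinder S^1 x R, so theta is wrapped back into [-pi, pi)."""
    omega_new = omega - (g / length) * math.sin(theta) * dt
    theta_new = theta + omega * dt
    # Identify theta with theta + 2*pi: this is the S^1 factor of the phase space.
    theta_new = (theta_new + math.pi) % (2 * math.pi) - math.pi
    return theta_new, omega_new

theta, omega = 2.5, 0.0  # released from a large angle, at rest
for _ in range(1000):
    theta, omega = pendulum_step(theta, omega)
print(theta, omega)  # a point on the phase cylinder after 10 simulated seconds
```

The same wrapping, applied to each of the two angles of a double pendulum, is exactly the edge identification that turns the square plot into a torus.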

High Dimensional Phenomena

8 minute read

Published:

This post is largely based on Richard Hamming’s The Art of Doing Science and Engineering. The book has a large technical focus on error correcting codes, including Hamming codes, but this part is just about high dimensional phenomena that may break our intuition.

Shannon Entropy

11 minute read

Published:

Let’s consider a discrete probability distribution. We have events indexed by $i$ with probability $p_i$ of occurring. Then $H(p)=-E[\log(p)]=-\sum_i p_i \log(p_i)$ is the Shannon entropy of the distribution; we can use any base for the logarithm. Note that since $p_i \in [0,1]$, if $p_i$ is close to 0, $-\log(p_i)$ is large. If $p_i$ is close to 1, then $-\log(p_i)$ is close to 0. If $p_i=0$, we’ll define the term $p_i \log(p_i)=0$ since $\lim_{x \to 0} x\log(x) = 0$. Note that for $x \in [0,1]$, $-x\log(x) \geq 0$. Hence, for discrete probability distributions, $H(p)\geq 0$.
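A minimal sketch of this formula, with the convention $0 \log 0 = 0$ handled by skipping zero-probability terms (the function name is mine, not from the post):

```python
import math

def shannon_entropy(probs, base=2):
    """Shannon entropy H(p) = -sum_i p_i log(p_i), with 0 log 0 := 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 outcomes: 2 bits
print(shannon_entropy([1.0, 0.0]))                # a certain event: 0 bits
```

Both outputs illustrate the bound $H(p) \geq 0$, with equality exactly when one outcome has probability 1.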

The Entropic Central Limit Theorem and Berry-Esseen Theorem

7 minute read

Published:

In this post, I’ll revisit the Central Limit Theorem, this time with a focus on entropy. I’ll state a few results at the end without proof, including the Berry-Esseen Theorem, which quantifies the rate of convergence in the usual Central Limit Theorem. We’ll work with continuous random variables and their density functions. Recall that for a random variable $X:(\Omega,\mu) \to \mathbb{R}$ with density $f$, the entropy is defined as $H[f] = -\int_\Omega f(x) \ln f(x)\,d\mu$. We’ll just work with $\Omega = \mathbb{R}$ and the usual Lebesgue measure.
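As a sanity check on the definition, here is a rough numerical sketch (a plain midpoint Riemann sum, truncated to $[-10, 10]$; purely illustrative) computing $H[f]$ for a standard normal and comparing it against the known closed form $\tfrac{1}{2}\ln(2\pi e)$:

```python
import math

def gaussian_density(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def differential_entropy(f, lo=-10.0, hi=10.0, n=100_000):
    """H[f] = -int f(x) ln f(x) dx, approximated by a midpoint Riemann sum."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        fx = f(lo + (i + 0.5) * dx)
        if fx > 0:
            total -= fx * math.log(fx) * dx
    return total

print(differential_entropy(gaussian_density))   # numerical estimate
print(0.5 * math.log(2 * math.pi * math.e))     # exact value, about 1.4189
```

The Gaussian is the relevant benchmark here: among densities with fixed variance it maximizes $H[f]$, which is what makes entropy a natural lens on the CLT.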

The Yoneda Lemma

7 minute read

Published:

In math, we often study an object indirectly. For example, if we have a group $G$, one way to study it is to study representations $G \to GL(V)$. Or if we want to study a ring $R$, we can instead study $R$-modules. We can go further and create categories such as the category of $R$-modules.

Efficiency and Sufficiency with an Appendix on Muon Decay, Chirality, and Parity-Violation

16 minute read

Published:

Recall that the mean squared error for an estimator $\hat{\theta}$ of a population parameter $\theta$ is simply $\text{MSE}(\hat{\theta}) = E[(\hat{\theta}-\theta)^2]$, and we can decompose this into $\text{MSE}(\hat{\theta}) = \text{Var}(\hat{\theta})+\text{Bias}(\hat{\theta})^2$ where $\text{Var}(\hat{\theta}) = E[\hat{\theta}^2]-E[\hat{\theta}]^2$ and $\text{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$. This works because $\theta$ is a fixed (unknown) number, so $E[\theta] = \theta$. If the estimator is unbiased, then the bias term is 0 and the mean squared error is simply the variance. In a variety of situations, a biased estimator can be made unbiased simply by multiplying it by a well-chosen scalar factor.
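The classic instance of that last sentence is the variance estimator that divides by $n$: it has expectation $\frac{n-1}{n}\sigma^2$, so multiplying by $\frac{n}{n-1}$ removes the bias. A small simulation sketch (illustrative, not from the post):

```python
import random

random.seed(0)

def biased_var(xs):
    """Variance estimator dividing by n; E[estimate] = (n-1)/n * sigma^2."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

n, trials = 5, 200_000
# Samples from N(0, 1), so the true variance sigma^2 is 1.
estimates = [biased_var([random.gauss(0, 1) for _ in range(n)]) for _ in range(trials)]
mean_est = sum(estimates) / trials

print(mean_est)                # near (n-1)/n = 0.8: biased low
print(mean_est * n / (n - 1))  # rescaled by n/(n-1): near 1.0, unbiased
```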

Time Series Data

19 minute read

Published:

A time series is a sequence of data points $(\vec{x}_1, y_1), (\vec{x}_2, y_2), \dots, (\vec{x}_t, y_t), \dots$ where $\vec{x}_t$ represents a collection of $p$ features and $y_t$ represents a numeric variable of interest at time $t$. We often assume that our time steps are evenly spaced. Depending on the model and data set, we may make additional assumptions on the $\vec{x}_t$ and $y_t$. It’s possible that we don’t actually have the $\vec{x}_t$ (which I will lazily just write as $x_t$ from now on). For example, we might just have $y_t$, a count of the number of sunspots observed at a given time $t$. We could also have, say, interest rates and inflation, and we want to predict inflation. Then the interest rates are the features $x_t$ and the inflation rates are $y_t$.
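When we only have the $y_t$ (as in the sunspot example), a common trick is to manufacture features from the series itself by using the previous few values as $x_t$. A small sketch of that construction (the helper name and lag count are mine):

```python
def make_lag_features(y, n_lags=3):
    """Turn a univariate series y_1..y_T into (features, target) pairs,
    where the features at time t are the n_lags previous values of y."""
    X, targets = [], []
    for t in range(n_lags, len(y)):
        X.append(y[t - n_lags:t])  # the window (y_{t-n_lags}, ..., y_{t-1})
        targets.append(y[t])       # predict y_t from that window
    return X, targets

y = [10, 12, 13, 12, 15, 16, 18]
X, targets = make_lag_features(y, n_lags=2)
print(X[0], targets[0])  # [10, 12] -> 13
```

This turns forecasting into an ordinary supervised-learning problem, at the cost of the first `n_lags` time steps.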

Basics of Linear Regression

22 minute read

Published:

These are notes I made while learning about the subject from my former colleague Victor Churchill’s lecture notes. Regression is a type of supervised learning that models the relationship between one or more independent variables (features) and a continuous dependent variable (target). The goal of regression is to predict the value of the dependent variable based on the values of the independent variables. The main example we look at here is linear regression.
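For the one-feature case, the least-squares fit has a well-known closed form: the slope is the ratio of the sample covariance of $x$ and $y$ to the sample variance of $x$. A minimal sketch (function and variable names are mine, not from the notes):

```python
def fit_simple_ols(xs, ys):
    """Closed-form least-squares fit for simple linear regression y = a*x + b."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # slope = cov(x, y) / var(x); the common 1/n factors cancel.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope, intercept

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]  # exactly y = 2x + 1
slope, intercept = fit_simple_ols(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

With noise-free data the fit recovers the line exactly; the interesting theory, covered in the post, is what happens when it doesn’t.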

Statistics Review

61 minute read

Published:

These are notes that I made when studying statistics from various sources, including the lecture notes of my former colleague Victor Churchill.

Algorithmic Complexity of Newton’s Method for Computing $\sqrt{x}$

4 minute read

Published:

Suppose I want to solve the following Leetcode problem: Given a non-negative integer x, return the square root of x rounded down to the nearest integer. The returned integer should be non-negative as well. You must not use any built-in exponent function or operator. For example, do not use x ** 0.5 in Python.
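One standard approach, and presumably the one the post analyzes, is Newton’s method applied to $f(t) = t^2 - x$, whose update is $t \mapsto \frac{1}{2}(t + x/t)$; done in integer arithmetic it converges to $\lfloor\sqrt{x}\rfloor$. A sketch (the starting point $t = x$ is a simple but not optimal choice):

```python
def integer_sqrt(x):
    """Floor of sqrt(x) via Newton's method on f(t) = t^2 - x,
    using only integer arithmetic (no built-in exponent)."""
    if x < 2:
        return x
    t = x  # any starting value >= sqrt(x) works
    while True:
        nxt = (t + x // t) // 2  # Newton update, rounded down
        if nxt >= t:
            return t  # the iterates stop decreasing exactly at floor(sqrt(x))
        t = nxt

print(integer_sqrt(8))   # 2
print(integer_sqrt(16))  # 4
```

Python’s standard library has `math.isqrt` for exactly this, but writing the iteration by hand is the point of the exercise, and its iteration count is what the complexity analysis is about.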

