Notation popular in mathematical software

Rob Zinkov

Recently, I saw how regular programmers react to numeric code. Numeric code has a different style than code most people are used to. It tends to be based on a pile of equations and is loaded with single-letter variable names. To many programmers they don’t understand why the code can’t be more descriptive.

What they don’t realize is in field of machine learning x, y, w, z, etc have very well defined conventions associated with them. They are not all that different from i, j, k in loops.

  • x typically refers to your data or features
  • y typically refers to your labels
  • z typically refers to unobserved hidden variables
  • w typically refers to weights

n, m usually refer to sizes of vectors and dimensions

  • vectors are usually lowercase letters
  • matrices are uppercase letters

Greek letters will refer to hyperparameters you tune like learning-rate. The other place you often see greek letters is when referring to parameters of probabilistic distributions. Some popular examples include:

  • mu \(\mu\) refers to the center of a Gaussian distribution
  • sigma (\(\sigma\), \(\Sigma\)) refers to the standard deviation of a Gaussian distribution. You might also see cov which refers to the covariance matrix of a multivariate distribution

This list isn’t exhausted, but hopefully it makes some code more readable.