Statistical Physics and Machine Learning

When I was young, I was attracted to physics for two reasons: (1) elegant math that connects abstract beauty with the physical world, and (2) the possibility of explaining complex behaviors in the physical world starting from simple elements. Eventually, I did my graduate research in the field of statistical physics.

In statistical physics, physicists study phenomena that emerge from the interactions among many parts. For example, how water becomes ice is a collective phenomenon. It would not make sense to ask whether a single water molecule is in a liquid phase or a solid phase. There is no “phase transition” of a single water molecule between the liquid and the solid phase. It only makes sense to talk about such a “phase transition” for a system with many small components. When water becomes ice, we are talking about a phenomenon that involves a great many water molecules (~10^23 molecules) and that arises from their collective behavior through the interactions between them.

Since statistical physicists are always thinking about how many things interact with each other, it is not surprising that many of us are also interested in questions outside “traditional” physics, which focuses more on the interactions between fundamental particles.

One research field that has attracted a lot of attention from statistical physicists since the 1980s, if not earlier, is how neurons learn. Many early theoretical works on neural networks (as systems of many neurons interacting with each other) showed the close relationship between neural networks and some “toy” models used in statistical physics research. I bought a book published by the Santa Fe Institute in 1993 (https://www.amazon.com/Introduction-Theory-Neural-Computation-Institute/dp/0201515601/). It summarized the state of the art at the interface between theoretical physics and neural networks pretty well. Here is another example showing how a model called the q-state Potts model, with some modification, can be used to learn different classes in data (http://u.math.biu.ac.il/~louzouy/courses/seminar/magnet2.pdf). I was quite impressed by the work as a physics graduate student. However, my interests then were mostly in the theoretical aspects.

After I moved from theoretical physics to the field of bioinformatics, I came to appreciate looking at “real world” data more and more, rather than just playing with toy models. While I still try to understand some interesting theoretical papers linking statistical mechanics and machine learning/neural networks, I am also interested in applying various techniques to biological datasets. However, most of my research work has required more direct approaches, using a “human learning” system to develop specific algorithms applicable to certain domains. Using neural networks to solve those problems would be like the cliche of “having a hammer and treating everything else as a nail.”

The recent buzz about deep learning is certainly hard to ignore for people like me who work in Silicon Valley. It makes me want to explore it more. It will be useful for me to keep study notes on the recent progress when I get a chance to explore the related subjects as an amateur.