When I was still a physics graduate student, I heard about the coolest thing at the time: the Human Genome Project had been finished. It made me think about how awesome it was that we could read all those little letters, A, C, G, and T, in each of our cells. There must be a lot of problems in human health and medicine that we can solve with the new information about our genomes.

Nevertheless, I did not pay much attention as I had to finish my thesis. …

In recent years, a group of people in the scientific community has started to look into using graphs to represent genomes. A graph is an extremely powerful abstraction for processing various kinds of data. Mathematically, a graph is simple and complicated at the same time. To define a graph, all we need are a set of nodes (vertices) and a set of links (edges). Some theorists even think we can explain what space-time is with graphs, e.g., see what-is-spacetime-really.
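To make the "set of nodes plus set of edges" definition concrete, here is a minimal Python sketch. It is only an illustration: the `Graph` class, its method names, and the 3-letter node labels are made up for this example and are not taken from any genome graph tool.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A directed graph stored as nothing more than two sets."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # each edge is a (source, target) tuple

    def add_edge(self, u, v):
        # Adding an edge implicitly adds both of its endpoints as nodes.
        self.nodes.update((u, v))
        self.edges.add((u, v))

    def successors(self, u):
        # All nodes reachable from u by following exactly one edge.
        return sorted(t for (s, t) in self.edges if s == u)


# Hypothetical toy data: short DNA words as nodes, with one branch point at "CGT".
g = Graph()
for u, v in [("ACG", "CGT"), ("CGT", "GTA"), ("CGT", "GTT")]:
    g.add_edge(u, v)

print(g.successors("CGT"))
```

The branch at `"CGT"` hints at why graphs are attractive for genomes: two alternative continuations of the same sequence can live side by side in one structure.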

At this moment, we are not so ambitious as to explain the whole universe with graphs. We are interested in applications…

(A slide deck for this work can be found at https://speakerdeck.com/jchin/decomposing-dynamics-from-different-time-scale-for-time-lapse-image-sequences-with-a-deep-cnn)

I left my job as a Scientific Fellow at PacBio after a 9-year venture helping to make single-molecule sequencing useful for the scientific community (see my story about the first couple of years at PacBio there). Most of my technical/scientific work had something to do with DNA sequences. While there are some exciting deep learning approaches for solving a couple of interesting problems, I would like to explore a bit outside just the DNA sequencing space.

I joined DNAnexus a while ago. The company has established itself as a leader of cloud…

I left PacBio a bit more than a year ago so I could work on different things. The announcement last Thursday (Nov. 2, 2018) that Illumina is acquiring PacBio still had a big emotional impact on me. There are so many interesting stories that I experienced myself in my 9-year tenure with PacBio. I just want to write something down before I forget, so I decided to share some of these stories. I hope you find some of them interesting if you work in the DNA sequencing and/or genomics field.

**Can we construct an error-free consensus sequence from a set of noisy sequences using a neural network?**

(Associated Jupyter notebooks can be found at https://github.com/cschin/DCNet)

Using a deep convolutional neural network (CNN) for denoising images or constructing super-resolution images has generated some quite amazing results. The basic idea is that one can train a CNN to remove noisy pixels by training the network to reconstruct the original image from images corrupted by noise. …

No doubt the fast recent progress in deep neural networks has changed the way we can solve various problems, from image recognition to genomics. Many startups (Viome/CZ Biohub/Deep Genomics) have emphasized using “AI/machine learning” for their research and products. Other relatively “big” biotech startups in Silicon Valley, e.g., Verily, Calico, and Grail, are either funded by Alphabet or have deep connections to Google.

Certainly, these companies will be focusing on applying machine learning, deep neural networks, and challenging statistical analysis to the large amount of data that they can collect (with the deep pocket of investment…

“In 1904 at a physics conference in St. Louis most physicists seemed to reject atoms and he was not even invited to the physics section.” — Wikipedia page about Ludwig Boltzmann

Lucky for us, we live in an era in which we can “see” atoms with an atomic force microscope. And, fortunately for me, I am personally involved in developing the technologies to read DNA molecule by molecule to get a whole human or gorilla genome. Without the theory of atoms, many modern-day technological miracles would be impossible. …

I downloaded the paper “Deep Unsupervised Learning using Nonequilibrium Thermodynamics” a while ago but only read it through today. The typical example of a direct connection between neural networks and statistical physics is a spin system such as the Ising model. This paper is interesting in that it connects deep learning to a different area of statistical physics, namely non-equilibrium diffusion processes.

“Information Theory and Statistical Mechanics” is the title of a paper that E.T. Jaynes published about 60 years ago. (Official journal link: Information Theory and Statistical Mechanics; you can download the PDF from https://journals.aps.org/pr/abstract/10.1103/PhysRev.106.620 )

People who know both information theory and statistical mechanics will recognize that the same form, “p log(p)”, appears in both fields. In statistical mechanics, we learn that “p_i” is the probability of a “system” being in state “i”, and the entropy of such a system is minus the sum of p_i log(p_i) over all possible states. …
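As a small illustration of the “p log(p)” form, the sketch below computes the entropy of a discrete probability distribution in Python. The function name `entropy` and the example distributions are assumptions made for this illustration. With base 2 the result is in bits, as in information theory; with the natural logarithm it matches the statistical mechanics form up to the Boltzmann constant.

```python
import math


def entropy(probs, base=math.e):
    """Return -sum(p_i * log(p_i)) for a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # States with p = 0 contribute nothing, since p * log(p) -> 0 as p -> 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)


fair_coin = entropy([0.5, 0.5], base=2)       # maximally uncertain two-state system
certain = entropy([1.0, 0.0], base=2)         # no uncertainty at all
four_states = entropy([0.25] * 4, base=2)     # uniform over four states

print(fair_coin, certain, four_states)
```

The uniform distribution gives the largest entropy for a fixed number of states, which is the starting point of the maximum-entropy argument that links the two fields.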

Statistical Physics and Machine Learning

When I was young, I was attracted to physics for two things: (1) elegant math that connects abstract beauty and the physical world, and (2) the possibility of explaining complex behaviors in the physical world starting from simple elements. Eventually, I did my graduate research in the field of statistical physics.

In statistical physics, physicists study phenomena that emerge from the interactions of many parts. For example, how water becomes ice is a collective phenomenon. It would not make sense to talk about whether a single water molecule is in a liquid phase or a solid phase…