Part 7 of the series where I interview my heroes.
This is another very special version of the series.
Index and about the series: “Interviews with ML Heroes”
You can find me on Twitter @bhutanisanyam1
Today, I’m honoured to be talking to the GANFather, the inventor of Generative Adversarial Networks, a pioneer of cutting-edge Deep Learning research and author of one of the best theoretical books on Deep Learning: Dr. Ian Goodfellow.
About the Series:
I have very recently started making some progress with my Self-Taught Machine Learning Journey. But to be honest, it wouldn’t be possible at all without the amazing community online and the great people that have helped me.
In this series of blog posts, I talk with people who have really inspired me and whom I look up to as my role models.
My motivation is that you might see some patterns and, hopefully, learn from the amazing people that I have had the chance of learning from.
Sanyam Bhutani: Hello GANFather, thank you so much for doing this interview.
Dr. Ian Goodfellow: Very welcome! Thank you very much for interviewing me, and for writing a blog to help other students.
Sanyam Bhutani: Today, you’re working as a research scientist at Google. You’re the inventor of the most exciting development in Deep Learning: GAN(s).
Could you tell the readers about how you got started? What got you interested in Deep Learning?
Dr. Ian Goodfellow: I was studying artificial intelligence as an undergrad, back when machine learning was mostly support vector machines, boosted trees, and so on. I was also a hobbyist game programmer, making little hobby projects using OpenGL shader language. My friend Ethan Dreyfuss, who works at Zoox now, told me about two things: 1) Geoff Hinton’s tech talk at Google on deep belief nets, and 2) CUDA GPUs, which were new at the time.
It was obvious to me right away that deep learning would fix a lot of my complaints about SVMs. SVMs don’t give you a lot of freedom to design the model. There isn’t an easy way to make the SVM smarter by throwing more resources at it. But deep neural nets tend to get better as they get bigger. At the same time, CUDA GPUs would make it possible to train much bigger neural nets, and I knew how to write GPU code already from my game programming hobby.
Over winter break, Ethan and I built the first CUDA machine at Stanford (as far as I know) and I started training Boltzmann machines.
Sanyam Bhutani: You’ve mentioned that you coded the first GAN model overnight, whereas the general belief is that a research breakthrough might take months, if not years.
Could you tell us what allowed you to make the breakthrough just overnight?
Dr. Ian Goodfellow: If you have a good codebase related to a new idea, it’s easy to try out a new idea quickly. My colleagues and I had been working for several years on the software libraries that I used to build the first GAN: Theano and Pylearn2. The first GAN was mostly a copy-paste of our MNIST classifier from an earlier paper called “Maxout Networks”. Even the hyperparameters from the Maxout paper worked fairly well for GANs, so I didn’t need to do much new. Also, MNIST models train very quickly. I think the first MNIST GAN only took me an hour or so to make.
Sanyam Bhutani: Since their inception, we have seen tremendous growth in GAN(s). Which one are you most excited about?
Dr. Ian Goodfellow: It’s hard to choose. Emily Denton and Soumith Chintala’s LAPGAN was the first moment I really knew GANs were going to be big. Of course, LAPGAN was just a small taste of what was to come.
Sanyam Bhutani: Apart from GAN(s), what other domains of Deep Learning research do you find to be really promising?
Dr. Ian Goodfellow: I spend most of my own time working on robustness to adversarial examples. I think this is important for being able to use machine learning in settings where security is a concern. I also hope it will help us understand machine learning better.
Sanyam Bhutani: For the readers and beginners who are interested in working on Deep Learning and dream of working at Google someday, what would be your best advice?
Dr. Ian Goodfellow: Start by learning the basics really well: programming, debugging, linear algebra, probability. Most advanced research projects require you to be excellent at the basics much more than they require you to know something extremely advanced. For example, today I am working on debugging a memory leak that is preventing me from running one of my experiments, and I am working on speeding up the unit tests for a software library so that we can try out more research ideas faster. When I was an undergrad and early PhD student I used to ask Andrew Ng for advice a lot and he always told me to work on thorough mastery of these basics. I thought that was really boring and had been hoping he’d tell me to learn about hyperreal numbers or something like that, but now several years in I think that advice was definitely correct.
Sanyam Bhutani: Could you tell us what a day at Google research is like?
Dr. Ian Goodfellow: It’s very different for different people, or even for the same person at different times in their career. I’ve had times when I mostly just wrote code, ran experiments, and read papers. I’ve had times when I mostly just worked on the deep learning book. I’ve had times when I mostly just went to several different meetings each day checking in on many different projects. Today I try to have about a 60–40 split between supervising others’ projects and working firsthand on my own projects.
Sanyam Bhutani: It’s a common belief that you need major resources to produce significant results in Deep Learning.
Do you think a person who does not have the resources that someone at Google might have access to, could produce significant contributions to the field?
Dr. Ian Goodfellow: Yes, definitely, but you need to choose your research project appropriately. For example, proving an interesting theoretical result probably does not require any computational resources. Designing a new algorithm that generalizes very well from an extremely small amount of data will require some resources but not as much as it takes to train on a very large dataset. It is probably not a good idea to try to make the world’s fastest-training ImageNet classifier if you don’t have a lot of hardware to parallelize across though.
Sanyam Bhutani: Given the explosive growth rates in research, how do you stay up to date with the cutting edge?
Dr. Ian Goodfellow: Not very long ago I followed almost everything in deep learning, especially while I was writing the textbook. Today that does not seem feasible, and I really only follow topics that are clearly relevant to my own research. I don’t even know everything that is going on with GANs.
Sanyam Bhutani: Do you feel Machine Learning has been overhyped?
Dr. Ian Goodfellow: In terms of its long-term potential, I actually think machine learning is still underhyped, in the sense that people outside of the tech industry don’t seem to talk about it as much as I think they should. I do think machine learning is often “incorrectly hyped”: people often exaggerate how much is possible already today, or exaggerate how much of an advance an individual project is, and so on.
Sanyam Bhutani: Do you feel a Ph.D. or Masters level of expertise is necessary, or can one contribute to the field of Deep Learning without being an “expert”?
Dr. Ian Goodfellow: I do think that it’s important to develop expertise but I don’t think that a PhD is the only way to get this expertise. The best PhD students are usually very self-directed learners, and it’s possible to do this kind of learning in any job that gives you the time and freedom to learn.
Sanyam Bhutani: Before we conclude, any advice for the beginners who feel overwhelmed to even get started with Deep Learning?
Dr. Ian Goodfellow: Start with an easy project, where you are just re-implementing something that you already know should work, like a CIFAR-10 classifier. A lot of people want to dive straight into doing something new first, and then it’s very hard to tell whether your project doesn’t work because your idea doesn’t work, or whether your project doesn’t work because you have a slight misunderstanding of something that is already known. I do think it’s important to have a project though: deep learning is a bit like flying an airplane. You can read a lot about it but you also need to get hands-on experience to learn the more intuition-based parts of it.
Sanyam Bhutani: Thank you so much for doing this interview.
Subscribe to my Newsletter for updates on my new posts and interviews with My Machine Learning heroes and Chai Time Data Science