Part 22 of the series where I interview my heroes.
Index and about the series: “Interviews with ML Heroes”
Today I’m honored to be interviewing a Kaggle Grandmaster from the ods.ai community.
I’m excited to be talking to Artur Kuzin (kaggle: @drn01z3): Competitions Grandmaster (Ranked #29), Kernels Expert (Ranked #159), and Discussions Expert (Ranked #58).
Artur has a background in Physics and Applied Math with a Master’s degree. Currently, he is working as the Head of Computer Vision at X5 Retail Group (the largest multi-format retailer in Russia). Before X5, he worked as Lead Data Scientist at Dbrain (Dbrain.io) and as a Data Scientist at Avito (the second largest classifieds site in the world, part of the OLX group).
About the Series:
I have very recently started making some progress with my Self-Taught Machine Learning Journey. But to be honest, it wouldn’t be possible at all without the amazing community online and the great people that have helped me.
In this Series of Blog Posts, I talk with people that have really inspired me and whom I look up to as my role-models.
The motivation behind doing this is that you might see some patterns and hopefully, you’d be able to learn from the amazing people that I have had the chance to learn from.
Sanyam Bhutani: Hello Grandmaster, Thank you for taking the time to do this.
Artur Kuzin: Hello! Thanks for doing the interview series.
Sanyam Bhutani: Currently, you’re a Comp GrandMaster as well as a Discussions & Kernels Expert.
You have a background in Physics and Applied Math. How did you get interested in Machine Learning and in kaggle at first?
Artur Kuzin: When I was a student, I tried different activities in parallel with working in the lab. A few of my friends were invited to work at a strange startup that was developing Artificial Intelligence. I still do not understand how I was persuaded to join, because it was far from the direction my life was taking at that time. The tasks were mostly related to computer vision.
The startup was fun and diverse, but now it seems to me that I was learning rather slowly. I got a real boost when I started participating in local ML competitions from Avito. I placed 3rd in my first competition, which was about the classification of cars. I was excited and motivated. In the next competition from Avito, I took 1st place, and as a result I got a job offer from them.
Sanyam Bhutani: You’re currently working as the Head of Computer Vision at X5 Retail Group and have been working in the Data Science space during the past few years.
Where does kaggle come in the picture? Is it related to your other projects?
Artur Kuzin: After winning the Avito competition, I realized that machine learning competitions are a very cool activity with a unique atmosphere. Since then, I have always tried to take part in interesting competitions. All this time I had a full-time job, and kaggle looked like a second, unpaid job (yes, there were prizes, but they can hardly be called a stable income).
I really got into it seriously, and for a long time, after Kaggle Dstl Satellite Imagery Feature Detection. Vladimir Iglovikov (Kaggle: @iglovikov) pushed me and many other participants from top teams to participate, for which I am very grateful to him. It was a very difficult, interesting and emotional competition. After that, I realised that kaggle had become my addiction. So far, competitions themselves have not led me to start my own projects. But I have often been approached by people with interesting offers.
Sanyam Bhutani: Could you tell us more about your role at your current job?
What projects are you working on and what is your role in the same?
Artur Kuzin: The X5 Retail Group has recently created a department for video analytics. Its tasks are to develop and implement solutions using computer vision, including analysis of the availability of goods on the shelf, queue control, KYC, staff analysis, etc. I am leading a team of 10 engineers and researchers. The team’s tasks cover both R&D in the ML and CV domains and the whole solution, from shaping the hardware architecture to integration with the data warehouse.
Sanyam Bhutani: You’ve had many amazing finishes in competitions.
Could you tell us what your favorite challenge was?
Artur Kuzin: The most important and significant one for me was the 2nd place in the kaggle IEEE camera competition. In that period of my life, I was considering taking up the role of a team lead, but I was not sure that I would be able to manage. So I decided to try this role in a kaggle team. Typically, everyone in the team develops their own solution from beginning to end, and then someone blends them or builds second-level models. We went the other way. Arthur Fattakhov collected data, and Ilya Kibardin trained models. I just gave them hardware, ideas, and advice. The only parts that I did with my own hands were data filtering and mixing the final submission 5 minutes before the deadline. As a result, we jumped from 6th place on the public leaderboard to 2nd on the private one. This gave me firm assurance that I could handle leadership.
There are two resources about our solution to this competition that you can check out:
- A video with English subtitles
- An article
We also decided to make use of our high place on the leaderboard, so we cooperated with Vladimir Iglovikov and his graduate student to write an article. As a result, the article was accepted to the 2018 IEEE International Conference on Big Data: Link to the abstract.
Sanyam Bhutani: You’ve had great results, both in solo finishes and team finishes.
For a noob kaggler, what tips do you have on whether or not to form a team?
Artur Kuzin: If you are participating for the first time, then I highly recommend participating in a team. Moreover, it is highly desirable to find an experienced participant, at least at the Master level. They will protect you from stupid mistakes and save you a lot of nerves. It is also desirable for the team to have a person who is familiar with development practices and can set up the processes: a split into common folds, git, a place for exchanging data, and a place for discussion.
In fact, even experienced participants are better off participating in a team. I see a lot of value in forming a team with different levels of skills. Young and inexperienced people are usually overflowing with enthusiasm, so if you motivate them correctly, you can assign them the boring stuff, such as cleaning the data and testing non-trivial hypotheses. Middles / kaggle Masters can write good code under the supervision of senior team members. And Seniors / kaggle Grandmasters can be the solution architects and fully delegate the work to others.
Sanyam Bhutani: What kind of challenges do you look for today? How do you decide to enter a new competition?
Artur Kuzin: Now the main challenge for me is to bring a large-scale project at X5 Retail Group to rollout, and I’m completely focused on that. However, I still see a lot of value in participating in contests, especially if they are relevant to my work. For example, there is currently a competition with whales, and this task is very similar to identifying goods on a shelf.
Sanyam Bhutani: What opportunities does kaggle open up for newbies if they give it enough time?
Artur Kuzin: Kaggle allows you to develop a specific skill set very quickly. With the right approach, these skills can be converted into the qualities you need for work (https://habr.com/ru/company/ods/blog/439812/). Also, competitions allow you to try a lot of different tasks and greatly expand your knowledge. Finally, it is insanely fun if you are able to find a friendly community like ODS.ai.
If we are talking about more experienced participants at the Master or Grandmaster level, then for them it is an opportunity to try themselves as a team leader. At work, you very rarely get to become a team leader just because you want to. But after a successful performance in this format, it becomes easier to move towards leadership.
For myself, I used kaggle as a way to find smart guys. For instance, I participated in several competitions together with Ilya Kibardin (https://www.kaggle.com/ikibardin) and Miras Amir (https://www.kaggle.com/amiras). So they were at the top of the shortlist of candidates for my team at X5 Retail Group.
And as a last argument: when Ilya and Miras got their jobs, one of the arguments in favor of a high salary was their achievements in kaggle, despite the fact that they are still students.
Sanyam Bhutani: What are your best pieces of advice for beginners who want to score well in Deep Learning competitions but do not have a beefy GPU box of their own?
How can they do well against others who are doing tremendous stacking at times?
Artur Kuzin: I see the following strategy:
- Find computing resources. These can be Kaggle GPU kernels or credits for Google Cloud or AWS for academics.
- Start participating as early as possible. The lack of computing resources can be compensated by time and the number of attempts.
- As soon as you get a decent result, propose teaming up with more experienced participants who have computing resources.
- Fight together till the end.
This strategy was implemented by Valery Babushkin. He did not have any video cards at all, however, he was able to show a good result on Kaggle Carvana. Therefore, I teamed up with him and as a result, our team received a gold medal.
Sanyam Bhutani: What are your first steps and go-to techniques when starting out on a new competition?
Artur Kuzin: Everything is typical here:
- I read the description and the forum. If no leaks have been found yet, then you can participate.
- Split into folds and think about a local validation scheme.
- Next steps depend on the competition.
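As a rough illustration of the “split into folds” step, here is a minimal Python sketch of one way a team could share a common fold split. It is an assumption for illustration, not Artur’s actual method: the hash-based assignment, the function names, and the `seed` string are all hypothetical. It also ignores stratification and grouping of related samples, which many competitions need to avoid leakage between train and validation.

```python
import hashlib
from collections import defaultdict


def assign_fold(sample_id: str, n_folds: int = 5, seed: str = "comp-v1") -> int:
    """Map a sample ID to a fold index in [0, n_folds) deterministically.

    Hashing the ID (instead of shuffling) means every teammate gets the
    same fold for the same sample without exchanging a split file.
    """
    digest = hashlib.md5(f"{seed}:{sample_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % n_folds


def make_folds(sample_ids, n_folds: int = 5, seed: str = "comp-v1"):
    """Return {fold_index: [ids]}; each ID lands in exactly one fold."""
    folds = defaultdict(list)
    for sid in sample_ids:
        folds[assign_fold(sid, n_folds, seed)].append(sid)
    return dict(folds)


def train_val_split(sample_ids, fold: int, n_folds: int = 5, seed: str = "comp-v1"):
    """Use one fold as local validation and the remaining folds for training."""
    train, val = [], []
    for sid in sample_ids:
        target = val if assign_fold(sid, n_folds, seed) == fold else train
        target.append(sid)
    return train, val


if __name__ == "__main__":
    ids = [f"img_{i}" for i in range(1000)]  # hypothetical sample IDs
    for fold in range(5):
        train, val = train_val_split(ids, fold)
        print(f"fold {fold}: {len(train)} train / {len(val)} validation")
```

Because the fold depends only on the sample ID and the seed, models trained by different teammates are validated on the same held-out samples, which keeps their local scores comparable when the solutions are later blended.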
Sanyam Bhutani: For the readers and noobs like me who want to become better kagglers, what would be your best advice?
Artur Kuzin: Perhaps I have a somewhat alternative point of view. But I strongly believe that the most important thing is the ability to desire and to be obsessed with something. This is the ability to get interested, not to give up halfway, to go all in until the end and fight to the last second. If you have the desire, then you will figure out how to become the best.
Sanyam Bhutani: Given the explosive growth rate of ML, how do you stay updated with the recent developments?
Artur Kuzin: Almost all events and results are discussed in ODS.ai. But I’m interested less in academic discoveries than in practical techniques that allow one to train an accurate and lightweight model for production. In this regard, you can follow the overviews of solutions published after a competition. Some of the participants share their solutions, even with source code, which is highly respectable.
Sanyam Bhutani: What developments in the field do you find to be the most exciting?
Artur Kuzin: I am still impressed by AlphaStar. We live in a very interesting time. I think this is the beginning of very significant discoveries that will change our lifestyle soon.
Sanyam Bhutani: What are your thoughts about Machine Learning as a field? Do you think it’s overhyped?
Artur Kuzin: I think it is definitely overhyped. But it happens not without reason. This area does not look like a fake to me, because now a lot of companies are using machine learning in a large variety of different applications. The labor market also seems to be still unsettled. But over time, the level of competence on the part of ML will grow and everything will even out, as it happens with the usual software development.
Sanyam Bhutani: Before we conclude, any tips for the beginners who aspire to be like you someday but feel completely overwhelmed to even start competing?
Artur Kuzin: Just get started! Just do it! It’s not gods who make pots. I do not consider myself insanely talented, clever or shrewd. I have a bunch of friends who are better than me in every aspect. But I know for sure that I can compensate for all my shortcomings with the number of attempts and the time I devote to achieving the result.
Sanyam Bhutani: Thank you so much for doing this interview.