An overview of Data Science Network’s largest community event


You can find me on twitter @bhutanisanyam1
Full House for the event

This weekend, the Data Science Network team hosted one of its largest community events, which was also one of India’s largest Kaggle Days Meetups.

I’d also like to point out that, like all DSNet and Kaggle-related community events, this one was free of cost, thanks to our venue sponsor, Manipal ProLearn.

What is Kaggle Days Meetup?

You might be familiar with KaggleDays, an offline, global-scale Kaggle event hosted by Kaggle and Logic AI.

KaggleDays Meetup is the community-driven meetup version of KaggleDays.

This means a few things:

  • A meetup with a Kaggle theme.
  • Different Levels: Beginner, Intermediate and Advanced.
  • Free of Cost.

All three points really stood out to us, so the Data Science Network team decided to host an event in mid-September for the “beginners track”.

Kaggle Days Meetup Beginner Track (14th-15th Sept) Recap

We were lucky to have an amazing set of speakers who kindly agreed to brave the Bangalore traffic on a weekend and share their knowledge with us.

For a detailed set of notes and a recap, I’d highly recommend this blog post by Vinay Kumar, who was kind enough to take notes throughout the event and turn them into a writeup.

Here is a quick recap, along with links to recordings and slide decks for the meetup.

The event ran for 2 full days and included 5 talks, 2 sets of workshops and a Kaggle InClass competition.

Day 1

Rohan Rao’s Talk

“On-Kaggle Vs Off-Kaggle”

by Rohan Rao (Kaggle GrandMaster, Data Scientist at H2O.ai)

About: The differences, positives, and negatives of data science on and off Kaggle, and why both should be balanced.

Link to Recording:

Rohan Rao was kind enough to share his data science journey both on and off Kaggle.

“I have sacrificed a lot of things to spend time on Kaggle; Kaggle is my dream second job but it comes at a sacrifice.”

“Is model.fit() enough?”

Aakash Nain on “Is model.fit() enough?”

by Aakash Nain (Kaggle Expert, Research Engineer at Ola)

About: Is building models a sufficient skill for deep learning engineers? Are you thinking thoroughly about how to apply deep learning to your next project? And what are the different real-world scenarios of deep learning that you probably aren’t aware of?

Aakash Nain’s talk picked up right on the theme set by GrandMaster Rohan’s presentation. In the talk, Aakash walked us through the “not-so-sexy” parts of an ML pipeline: things to avoid, and things one faces in the real world of data science and when working with business clients.

“Neural Networks fail, and they fail silently.”

Day 2

Feature Engineering To Crack Top 1% Private LB on Kaggle

by Mohammad Shahebaz (Kaggle x2 Master, Data Analyst at Societe Generale)

About: Have you ever wondered why the features you make end up overfitting or not giving you a significant jump on Kaggle’s leaderboard? Is the private leaderboard a challenging ladder to climb?

Mohammad Shahebaz

In the 1-hour session, Shaz shared his experience with feature engineering and the best approaches for your next Kaggle competition.

“How to track ML Experiments Effectively”

by Sanyam Bhutani (Kaggle x3 Expert, Data Science Engineer at Swiftace)

The last session was by me. About: “The usual pipeline for working on a machine learning experiment is very different from software engineering. This talk highlights how to track experiments, and their iterative nature, effectively inside a Jupyter notebook, how to apply these ideas to Kaggle competitions, and how to make them work with data science teams.”
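To make the idea of tracking experiments concrete, here is a minimal sketch that appends every run to a JSON Lines file and picks the best one afterwards. It is not the tooling shown in the talk: the function names (log_run, load_runs, best_run), the file name runs.jsonl and the metric name val_accuracy are all assumptions made up for illustration, and it uses only the Python standard library.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_run(log_path, params, metrics, notes=""):
    """Append one experiment record (hypothetical schema) to a JSON Lines file."""
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "params": params,
        "metrics": metrics,
        "notes": notes,
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def load_runs(log_path):
    """Read every logged run back; a missing file means no runs yet."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def best_run(runs, metric, higher_is_better=True):
    """Return the run with the best value of `metric`, or None if nobody logged it."""
    scored = [run for run in runs if metric in run["metrics"]]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda run: run["metrics"][metric])


if __name__ == "__main__":
    # Made-up numbers, only to show the calls.
    with tempfile.TemporaryDirectory() as tmp:
        log_file = Path(tmp) / "runs.jsonl"
        log_run(log_file, {"lr": 0.1, "epochs": 5}, {"val_accuracy": 0.81})
        log_run(log_file, {"lr": 0.01, "epochs": 5}, {"val_accuracy": 0.86}, notes="lower lr")
        print(best_run(load_runs(log_file), "val_accuracy")["params"])
```

An append-only log like this can be called from a notebook cell after each training run, so earlier results survive kernel restarts and can be shared with teammates as a plain file.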

Video:

Demystifying SVM

by Usha Rengaraju

Usha explained how SVMs can be used on linearly inseparable data for classification problems.
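The slides aren’t reproduced here, but the core idea can be sketched in a few lines. Two concentric rings of points cannot be split by a straight line in (x, y), yet they become linearly separable once we add the feature x² + y². This is a toy illustration under stated assumptions: a plain perceptron stands in for the linear classifier, the feature map is written out explicitly, and all function names are made up. A real SVM would maximise the margin and use a kernel instead of explicit features.

```python
import math


def make_rings(n_per_ring=20):
    """Return (point, label) pairs on two concentric rings (radius 1 and 3)."""
    samples = []
    for radius, label in ((1.0, -1), (3.0, 1)):
        for k in range(n_per_ring):
            angle = 2 * math.pi * k / n_per_ring
            point = (radius * math.cos(angle), radius * math.sin(angle))
            samples.append((point, label))
    return samples


def lift(point):
    """Explicit feature map: append the squared distance from the origin."""
    x, y = point
    return (x, y, x * x + y * y)


def score(weights, bias, features):
    """Signed output of the linear classifier."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def train_perceptron(samples, epochs=1000):
    """Plain perceptron; stops early once an epoch has no mistakes."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in samples:
            if label * score(weights, bias, features) <= 0:
                weights = [w + label * f for w, f in zip(weights, features)]
                bias += label
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


def accuracy(samples, weights, bias):
    """Fraction of samples on the correct side of the decision boundary."""
    hits = sum(1 for features, label in samples
               if label * score(weights, bias, features) > 0)
    return hits / len(samples)


def compare_feature_spaces():
    """Train on raw (x, y) and on lifted (x, y, x^2 + y^2) features."""
    raw = make_rings()
    lifted = [(lift(point), label) for point, label in raw]
    raw_acc = accuracy(raw, *train_perceptron(raw))
    lifted_acc = accuracy(lifted, *train_perceptron(lifted))
    return raw_acc, lifted_acc


if __name__ == "__main__":
    raw_acc, lifted_acc = compare_feature_spaces()
    print(f"raw features: {raw_acc:.2f}, lifted features: {lifted_acc:.2f}")
```

The classifier on raw coordinates can never reach perfect accuracy because the inner ring sits inside the outer one, while the lifted version separates the rings with a single threshold on the new feature.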


That wraps up the talks. During the second half of the event days, we conducted 2 workshops. These were aimed at helping beginners get started with hands-on experiments and Kaggle competitions.

Workshops

Analysing WhatsApp Chats

“Understanding your WhatsApp chat data”

Link to resource notebook

This workshop was inspired by Kartik Godawat’s amazing writeup: “Understanding my browsing pattern using Pandas and Seaborn”

In this hands-on session, we looked at how to extract and preprocess WhatsApp chat data. The workshop is aimed at helping you get started with “Exploratory Data Analysis”: answering questions through data using numpy, pandas and some visualization tools. We also looked at texting patterns and set up a pipeline for processing chat messages.
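As a rough sketch of the extract-and-preprocess step, the snippet below parses exported chat lines into records and counts messages per sender and per hour. It is not the workshop notebook: the line format ("dd/mm/yy, hh:mm - Sender: text") is an assumption, since WhatsApp exports differ by phone and locale, the names in the sample are invented, and the standard library is used instead of pandas to keep the example self-contained.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed export format: "14/09/19, 10:15 - Sender: message text"
PREFIX = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2}), (\d{1,2}:\d{2}) - (.*)$")

SAMPLE_EXPORT = """14/09/19, 10:00 - Messages to this group are now secured.
14/09/19, 10:15 - Asha: Are you at the venue?
14/09/19, 10:16 - Ravi: Yes, just reached.
Traffic was terrible.
14/09/19, 22:40 - Asha: Great talks today!
"""


def parse_chat(lines):
    """Turn raw export lines into dicts with timestamp, sender and text."""
    messages = []
    for raw_line in lines:
        line = raw_line.strip()
        match = PREFIX.match(line)
        if match:
            date_part, time_part, body = match.groups()
            sender, separator, text = body.partition(": ")
            if not separator:
                continue  # system notice without a sender, skip it
            timestamp = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%y %H:%M")
            messages.append({"timestamp": timestamp, "sender": sender, "text": text})
        elif line and messages:
            # No timestamp prefix: treat as a continuation of the previous message.
            messages[-1]["text"] += "\n" + line
    return messages


def messages_per_sender(messages):
    """Count how many messages each sender wrote."""
    return Counter(message["sender"] for message in messages)


def messages_per_hour(messages):
    """Count messages by hour of day (0-23)."""
    return Counter(message["timestamp"].hour for message in messages)


if __name__ == "__main__":
    chat = parse_chat(SAMPLE_EXPORT.splitlines())
    print(messages_per_sender(chat).most_common())
    print(sorted(messages_per_hour(chat).items()))
```

Once the messages are in this shape, loading them into a pandas DataFrame for plotting is a one-liner, which is where the visual exploration can start.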

Looking at your own texting patterns can reveal a lot of details. Even patterns that seem obvious in hindsight might not be visible by default. Because participants worked with their own chat data, the insights were more relatable.

This also sets the stage for our future workshops, where we’ll look at how to make language models work with the data and, later, with Transformers.

Getting started with Kaggle Competition, In-Class Hackathon

Making our first Submissions to Kaggle

We are Kaggle addicts and a little biased towards fast.ai. Keeping this in mind, and wanting to bring more people onto these platforms, DSNet decided to host a workshop and a 1-week-long in-class competition where we’d look at classifying artworks.

You can find the competition page here. However, this is a limited-participation competition, so it won’t accept your submissions.

The workshop covered teaching everyone about Kaggle, how to submit to a competition, Kaggle lingo, and giving them a taste of a leaderboard.
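For readers who have never made a submission, it usually boils down to uploading a CSV with one prediction per test item. Here is a minimal sketch of writing such a file. The column names (id, label), the image ids and the class names are placeholders I made up; every competition defines its own format, so always check the sample submission on the competition page.

```python
import csv
import io


def write_submission(predictions, handle, id_column="id", label_column="label"):
    """Write {item_id: predicted_label} as a two-column CSV (hypothetical columns)."""
    writer = csv.writer(handle)
    writer.writerow([id_column, label_column])
    for item_id in sorted(predictions):
        writer.writerow([item_id, predictions[item_id]])


if __name__ == "__main__":
    # Invented predictions, only to show the shape of the file.
    buffer = io.StringIO()
    write_submission({"img_2": "sculpture", "img_1": "painting"}, buffer)
    print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open("submission.csv", "w", newline="").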

Kaggle allows hosting “in-class” competitions, where we can leverage the platform to host our dataset and run a competition on it. This made our work really easy and helped us stay on the platform and not rely on other options.

This enabled us to have around 90 submissions from individuals to the leaderboard. To encourage participation, we’ve promised to share some compute credits with the top scorers and with people who create a great kernel or discussion thread. You’ll be able to find the winner on DSNet’s Twitter once the competition ends.

Summary

We really enjoyed hosting the event and meeting so many great Kagglers and data science students over the weekend.

A quick shout-out to the amazing folks on the core team responsible for making DSNet one of India’s largest active communities:

Aakash N S, Deepak Rawat, Kartik Godawat, Prajwal Prashanth, Siddhant Ujjain.

Even though I’m usually the lucky one who gets to share exciting updates about the community, it’s thanks to the amazing core team, which consists of people much smarter and wiser than me who constantly work together on important things, that we are able to empower the Indian ML scene and host these meetups.

We’re thankful to our great speakers and everyone who came out on the weekend. We really hope to continue hosting events and to improve both their frequency and quality.


Subscribe to my Newsletter for updates on my new posts and interviews with My Machine Learning heroes and Chai Time Data Science