FAST AI ML Meetup #0 Notes
Intro: FastAI ML Meetup.
Purpose of the Meetup: Accountability, Motivation, Encouragement.
- Train vs test: effective validation set construction
- Trees and ensembles: creating random forests, interpreting random forests
- What is ML? Why do we use it? What makes a good ML project? Structured vs unstructured data. Examples of failures/mistakes.
- Feature engineering: domain-specific (dates, URLs, text); embeddings / latent factors
- Regularized models trained with SGD: GLMs, Elastic Net, etc. (NB: see what James covered)
- Basic neural nets: PyTorch; broadcasting, matrix multiplication; training loop, backpropagation
- CV / bootstrap (Diabetes data set?)
- Ethical considerations
Cloud Options: Crestle, Paperspace, VectorDash
BlueBook For Bulldozers: The goal of the contest is to predict the sale price of a particular piece of heavy equipment at auction based on its usage, equipment type, and configuration. The data is sourced from auction result postings and includes information on usage and equipment configurations.
Structured Data: (Unofficial Def) Columns of data of varying types.
The FastAI imports include it.
- Download Data
- Load CSV Using Pandas
- Display data. Write a function to display all rows and columns.
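The load-and-display steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook used at the meetup: the tiny in-memory CSV and its values are made up so the snippet runs on its own, and `display_all` is just one possible way to write such a helper.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for the competition CSV (values are made up).
SAMPLE_CSV = io.StringIO(
    "SalesID,SalePrice,saledate,UsageBand\n"
    "1,66000,2006-11-16,Low\n"
    "2,57000,2004-03-26,High\n"
    "3,10000,2004-02-26,\n"
)

# parse_dates turns the date column into datetime64 instead of plain strings.
df_raw = pd.read_csv(SAMPLE_CSV, parse_dates=["saledate"])

def display_all(df, max_rows=1000, max_cols=1000):
    """Print a DataFrame without pandas truncating rows or columns."""
    with pd.option_context("display.max_rows", max_rows,
                           "display.max_columns", max_cols):
        print(df)

# Transposing puts one column per line, which is easier to scan on wide tables.
display_all(df_raw.tail().T)
```

With the real data, the `io.StringIO` object would be replaced by the path to the downloaded CSV file.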
Evaluation: RMSLE. MATH ALERT! Root mean squared log error between the actual and predicted auction prices.
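To make the metric concrete, here is a small sketch of RMSLE in plain Python. It assumes the common log(1 + x) convention; the notes do not say which variant was used, and for prices in the thousands the extra 1 barely changes the result. The function name `rmsle` is just illustrative.

```python
import math

def rmsle(actual, predicted):
    """Root mean squared log error between two equal-length price sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Squared difference of log(1 + price) for each (actual, predicted) pair.
    squared = [(math.log1p(p) - math.log1p(a)) ** 2
               for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

print(rmsle([66000, 57000, 10000], [60000, 59000, 12000]))
```

Because the error is measured on logs, being off by 10% costs about the same on a cheap machine as on an expensive one. In practice this often means training on the log of the price and measuring ordinary RMSE.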
Random Forests: a great start. Why?
- Universal ML technique for predicting categorical or continuous values
- Generally doesn't overfit
- Works without a separate validation set
- No statistical assumptions
Curse of Dimensionality. No Free Lunch Theorem.
Feature Engineering: RF Expects Numerical Data.
1. add_datepart()
2. Categorical
3. Fix Missing Values
4. proc_df()
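The four steps above can be approximated in plain pandas, as sketched below. These are hand-written stand-ins that show the idea behind each step, not the FastAI implementations of `add_datepart()` or `proc_df()`. The column names and values are illustrative assumptions.

```python
import pandas as pd

# Made-up rows with one date, one categorical and one numeric column.
df = pd.DataFrame({
    "saledate": pd.to_datetime(["2006-11-16", "2004-03-26", "2004-02-26"]),
    "UsageBand": ["Low", "High", None],
    "MachineHours": [68.0, None, 2838.0],
    "SalePrice": [66000, 57000, 10000],
})

# 1. Date parts: replace the datetime column with numeric components.
df["saleYear"] = df["saledate"].dt.year
df["saleMonth"] = df["saledate"].dt.month
df["saleDayofweek"] = df["saledate"].dt.dayofweek
df = df.drop(columns="saledate")

# 2. Categoricals: convert strings to category codes.
#    Adding 1 moves the missing-value code from -1 to 0.
df["UsageBand"] = df["UsageBand"].astype("category").cat.codes + 1

# 3. Missing values: record where they were, then fill with the median.
df["MachineHours_na"] = df["MachineHours"].isna()
df["MachineHours"] = df["MachineHours"].fillna(df["MachineHours"].median())

# 4. Split into an all-numeric feature table and the target column.
y = df["SalePrice"]
X = df.drop(columns="SalePrice")
print(X)
```

After these steps every column in `X` is numeric or boolean, which is what a random forest expects.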
Finally, the regressor is called on the processed data.
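As a rough sketch of this last step, the snippet below fits scikit-learn's `RandomForestRegressor` on synthetic numeric data. The notes do not list the exact call or hyperparameters used at the meetup, so the data, `n_estimators=20` and `random_state=0` are assumptions chosen only to keep the example small and repeatable.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic, fully numeric features and a made-up "log price" target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 3))
y = np.log(1000 + 500 * X[:, 0] + 50 * X[:, 1])

# Fit the forest; score() returns R^2 on the data it is given.
m = RandomForestRegressor(n_estimators=20, random_state=0)
m.fit(X, y)
r2 = m.score(X, y)
print(f"R^2 on training data: {r2:.3f}")
```

The score here is measured on the training rows, so it is optimistic. A held-out validation set gives a more honest estimate.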
You can find me on Twitter @bhutanisanyam1, or feel free to reach out via the Contact menu.