Statistics for Beginners
I often find people either terrified or bored by statistics. In some friend circles it is even shorthand for ‘boring’. I attribute this misjudgment to people never having been shown the beauty and everyday usefulness of statistics. As a result, they liken stats to something used by boring professors and crazy scientists. I want to break this notion by breaking statistics down into its constituent parts, thereby showing everyone that it’s really just common sense on steroids....
Dealing with columns in pandas
Why even bother? When working with DataFrames in pandas, I have always missed the ease with which I could select rows and columns in Excel. Simple things like deleting a particular cell or rearranging columns were so incredibly easy and intuitive. Of course, Google Sheets takes this one step further with its fancy editor. I have yet to find such ease of use within pandas. In today’s post I would like to share a couple of handy techniques I use every so often to deal with the seemingly simple task of selecting and moving around rows and columns in pandas....
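As a rough illustration of the kind of column handling this post is about, here is a minimal sketch. The DataFrame and its column names (`store`, `city`, `sales`) are made up for the example, and these are not necessarily the techniques the full post covers.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({
    "store": ["A", "B", "C"],
    "city": ["X", "Y", "Z"],
    "sales": [100, 250, 175],
})

# Select a subset of columns by name.
subset = df[["store", "sales"]]

# Move the 'sales' column to the front, keeping the others in order.
reordered = df[["sales"] + [c for c in df.columns if c != "sales"]]

# Drop a column without modifying the original DataFrame.
without_city = df.drop(columns=["city"])

# Read a single cell by row label and column name.
cell = df.loc[1, "sales"]
```

Each of these returns a new object rather than changing `df` in place, which makes it easier to experiment without losing the original data.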
Self notes
The analysis itself is secondary. The base data columns, the key facts you collect, are much more important. E.g., while predicting the sales of a list of stores in a city: count of items in store, sqft area of store, outlet type, city type.
Preparations: Deal with null values (substitute them with the mean). Convert categorical columns into one-hot (from sklearn.preprocessing import LabelEncoder).
Definitions:
Target variable: the variable you want to predict with your analysis.
ID columns: remove ‘ID’-type columns from your DF, as they do not add any value to the predictive model.
Process...
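The preparation steps in the notes above could be sketched roughly as follows. The data, the column names, and the `prepare` helper are all hypothetical. The sketch also uses `pd.get_dummies` for the one-hot step rather than `LabelEncoder`, because `LabelEncoder` produces integer labels, not one-hot columns.

```python
import pandas as pd

# Hypothetical store data, invented for this example.
df = pd.DataFrame({
    "store_id": ["S1", "S2", "S3", "S4"],
    "item_count": [120.0, None, 80.0, 100.0],
    "outlet_type": ["grocery", "supermarket", "grocery", "supermarket"],
    "sales": [1000.0, 2500.0, 900.0, 2100.0],
})


def prepare(frame, target, id_cols):
    """Hypothetical helper: drop ID columns, fill nulls, one-hot encode."""
    # ID columns add nothing to the model; the target is kept separately.
    features = frame.drop(columns=id_cols + [target])

    # Replace missing numeric values with the column mean.
    numeric_cols = features.select_dtypes(include="number").columns
    features[numeric_cols] = features[numeric_cols].fillna(
        features[numeric_cols].mean()
    )

    # One-hot encode the remaining (categorical) columns.
    cat_cols = features.select_dtypes(exclude="number").columns.tolist()
    features = pd.get_dummies(features, columns=cat_cols)

    return features, frame[target]


X, y = prepare(df, target="sales", id_cols=["store_id"])
```

Here `X` holds the cleaned feature columns and `y` the target variable, ready to be passed to whatever model comes next.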