Frequently used Pandas methods/functions

I have no argument for using anything other than import pandas as pd, so that is what I use.

Import Data

pd.read_csv(csvfile) reads a CSV file. Above that line I usually define the path as file=r'c:\datasets\csvfile.csv', because I keep the ipynb in Git.

pd.read_excel(excelfile) to work with an Excel file.

pd.read_sql(query, connection_object) is one I don't use frequently today, since Power BI or Tableau is faster, but it is handy when the process has to be repeated.

pd.read_json(jsonfile) to work with JSON. This one is fast and very useful when you want to manipulate JSON files again and again.
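The import functions above can be sketched roughly as follows. The data here is made up, the in-memory strings and the SQLite table named scores are only stand-ins for real files and a real database, and pd.read_excel is left out because it needs an extra Excel engine installed.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample data standing in for real files on disk.
csv_text = "name,score\nann,90\nbob,85\n"
json_text = '[{"name": "ann", "score": 90}, {"name": "bob", "score": 85}]'

# read_csv and read_json accept file paths or file-like objects.
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

# read_sql needs a query and a connection object.
# An in-memory SQLite database is used here as an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("ann", 90), ("bob", 85)])
conn.commit()
df_sql = pd.read_sql("SELECT name, score FROM scores", conn)
conn.close()
```

With real data you would pass the file path (for example the file variable defined above) instead of the io.StringIO object.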

Export Data

Normally, after we import the data, we put it into a dataframe called df,
like df = pd.read_csv('csv.csv'). So when we export, we begin with df.

df.to_csv(csvfile) exports to CSV so we can continue working in another program.

df.to_excel(excelfile) is mostly used when we do a quick ETL.

df.to_json(jsonfile) exports to JSON after we have edited the JSON data.
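A minimal sketch of the export step, assuming a small made-up dataframe and a temporary folder in place of real output paths. The index=False and orient="records" options are my own choices for the example, not something the notes above require. df.to_excel works the same way but needs an Excel engine such as openpyxl installed, so it is not run here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical dataframe standing in for imported data.
df = pd.DataFrame({"name": ["ann", "bob"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "out.csv")
    json_path = os.path.join(tmp, "out.json")

    # index=False keeps the row index out of the CSV file.
    df.to_csv(csv_path, index=False)
    # orient="records" writes a list of {"column": value} objects.
    df.to_json(json_path, orient="records")

    # Read the files back to check that the export worked.
    csv_back = pd.read_csv(csv_path)
    json_back = pd.read_json(json_path)
```

Reading the files back is only there to confirm the round trip. In a notebook you would normally stop after the to_csv or to_json call.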

Inspect data

df.head() looks at the first 5 rows.

df.tail() looks at the last 5 rows.

df.shape checks the number of rows and columns. It returns something like (20640, 10), meaning this dataframe has 20640 rows and 10 columns. We mostly use df.info() rather than .shape, because it also shows the column names, data types, and non-null counts that we need to continue working.
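Here is a small sketch of the inspection calls on a made-up dataframe with 7 rows. The column names and values are hypothetical and only there to show what each call gives back.

```python
import pandas as pd

# Hypothetical dataframe with 7 rows and 2 columns.
df = pd.DataFrame({
    "city": ["a", "b", "c", "d", "e", "f", "g"],
    "population": [10, 20, 30, 40, 50, 60, 70],
})

first = df.head()      # first 5 rows by default
last = df.tail()       # last 5 rows by default
rows, cols = df.shape  # shape is an attribute, not a method

# info() prints column names, dtypes and non-null counts.
df.info()
```

Both head and tail accept a number, so df.head(10) shows the first 10 rows instead of 5.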

Clean data

df.dropna() is a quick and easy way to get rid of rows that have null values.
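As a rough illustration with made-up data, dropna removes every row that has at least one null value. It returns a new dataframe and leaves the original alone, so the result has to be assigned to something.

```python
import pandas as pd

# Hypothetical dataframe: the second row has a missing score,
# the third row has a missing name.
df = pd.DataFrame({
    "name": ["ann", "bob", None],
    "score": [90.0, None, 70.0],
})

# Keeps only the rows where no column is null.
cleaned = df.dropna()
```

Here only the first row survives, because each of the other two rows has a null in one column.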