I have no counter-argument for why we would use anything other than import pandas as pd
Import Data
pd.read_csv(csvfile)
Above that, I usually define the path first, e.g. file = r'c:\datasets\csvfile.csv',
because I keep my .ipynb files in Git and the datasets stay outside the repository.
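Putting those two lines together, a minimal sketch (the path here is just a placeholder):
import pandas as pd

file = r'c:\datasets\csvfile.csv'   # placeholder path, kept outside the repo
df = pd.read_csv(file)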
pd.read_excel(excelfile)
to work with Excel files.
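For example, reading one tab out of a workbook (the path and sheet name are made up):
# sheet_name picks a single tab; reading .xlsx needs openpyxl installed
df = pd.read_excel(r'c:\datasets\report.xlsx', sheet_name='Sales')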
pd.read_sql(query, connection_object)
I don't use this as often these days, since Power BI or Tableau is faster for one-off work, but it is still handy when the process has to be repeated.
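A minimal sketch using SQLite from the standard library; the database file and table name here are hypothetical:
import sqlite3
import pandas as pd

conn = sqlite3.connect(r'c:\datasets\sales.db')   # hypothetical database file
df = pd.read_sql('SELECT * FROM sales', conn)     # hypothetical table
conn.close()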
pd.read_json(jsonfile)
to work with JSON; it is fast and very useful when you need to manipulate JSON files again and again.
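For example, assuming a records-style file, i.e. a list of objects (the path is a placeholder):
# orient='records' expects JSON shaped like [{"a": 1}, {"a": 2}]
df = pd.read_json(r'c:\datasets\data.json', orient='records')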
Export Data
Normally, after we import the data, we put it into a dataframe called df, like
df = pd.read_csv('csv.csv')
so when we export, we start from df.
df.to_csv(csvfile)
export to CSV to continue working with the data in another program
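For example (the path is a placeholder; index=False keeps the row index out of the file):
df.to_csv(r'c:\datasets\output.csv', index=False)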
df.to_excel(excelfile)
mostly used when we do a quick ETL job
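For example, dumping the result of a quick ETL step into a workbook (path and sheet name are placeholders):
df.to_excel(r'c:\datasets\output.xlsx', sheet_name='clean', index=False)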
df.to_json(jsonfile)
to write the JSON back out after editing it
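For example, writing the edited data back out as a list of records (the path is a placeholder):
df.to_json(r'c:\datasets\output.json', orient='records')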
Inspect Data
df.head()
look at the first 5 rows
df.tail()
look at the last 5 rows
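Both accept an optional row count if 5 rows is not enough:
df.head(10)   # first 10 rows
df.tail(3)    # last 3 rows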
df.shape
check the number of rows and columns; it returns something like (20640, 10), meaning this dataframe has 20,640 rows and 10 columns
df.info()
we mostly use .info() rather than .shape because, on top of the row count, it shows the column names, data types, and non-null counts we need to keep working
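Roughly what df.info() prints for the (20640, 10) example above, with the per-column lines abridged:
df.info()
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 20640 entries, 0 to 20639
# Data columns (total 10 columns):
#  #   Column   Non-Null Count   Dtype
# ... one line per column, followed by a dtypes summary and memory usage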
Clean Data
df.dropna()
a quick and easy way to get rid of rows that contain null values
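Note that dropna() returns a new dataframe by default, so assign the result back; the column name below is hypothetical:
df = df.dropna()                              # drop rows with any null value
df = df.dropna(subset=['total_bedrooms'])     # or only rows where this (hypothetical) column is null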