Data Science

Data analysis with Python - Importing dataset

Hiru_93 2022. 8. 13. 17:12

Data Analysis with Python by IBM series - Lecture 1: Importing dataset

 

  • Python packages for Data Science
  1. Scientific computing: Pandas / NumPy / SciPy
  2. Visualization: Matplotlib / Seaborn
  3. Algorithmic libraries (used for things like linear regression): Scikit-learn (machine learning library) / Statsmodels (estimate statistical models, perform statistical tests)
  • Datatype comparison: Pandas vs Python
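To make the Pandas vs Python datatype comparison concrete, here is a minimal sketch that builds a small dataframe and looks at the dtype Pandas assigns to each column. The sample data and column names (make, doors, price, listed) are made up for illustration and are not from the course. Text columns are typically reported as object (newer Pandas versions may report a dedicated string dtype instead), while Python int and float map to int64 and float64.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "make": ["audi", "bmw", "volvo"],         # Python str
    "doors": [2, 4, 4],                       # Python int
    "price": [13950.0, 16430.0, 12940.0],     # Python float
})
# Dates parsed by Pandas get a datetime64 dtype.
df["listed"] = pd.to_datetime(["2022-08-01", "2022-08-05", "2022-08-13"])

# Map each column name to the name of its Pandas dtype.
dtype_names = {col: str(dtype) for col, dtype in df.dtypes.items()}
print(dtype_names)

# A single value from a text column is still a plain Python str.
first_make = df["make"].iloc[0]
print(type(first_make))
```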

 

 

  • dataframe.describe() skips non-numeric columns. So, to check string-type columns, use:
dataframe.describe(include=['object'])

Or, to check columns of every data type:

 

df.describe(include = 'all')
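The three describe() variants can be compared side by side with a rough sketch like the one below. The dataframe is hypothetical sample data, and the make column is explicitly created with dtype object so that include=['object'] selects it regardless of the Pandas version.

```python
import pandas as pd

# Hypothetical sample data; "make" is forced to the object dtype on purpose.
df = pd.DataFrame({
    "make": pd.Series(["audi", "bmw", "audi"], dtype="object"),
    "price": [13950.0, 16430.0, 17450.0],
})

# Default: only numeric columns are summarized (count, mean, std, min, ...).
numeric_summary = df.describe()

# Only object (string) columns: count, unique, top, freq.
text_summary = df.describe(include=["object"])

# Every column; statistics that do not apply to a column show up as NaN.
full_summary = df.describe(include="all")

print(numeric_summary)
print(text_summary)
print(full_summary)
```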
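  • df.info() prints a concise summary of the dataframe (index range, column names, non-null counts, dtypes, memory usage). It does not display rows; to check the top 30 rows and bottom 30 rows, use df.head(30) and df.tail(30).
df.info()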
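As a quick check of what df.info() reports versus what head() and tail() return, here is a small sketch on a hypothetical 100-row dataframe (the id and value columns are made up). The summary is captured into a string buffer through the buf parameter only so that it can be inspected; calling df.info() with no arguments prints the same text directly.

```python
import io

import pandas as pd

# Hypothetical 100-row dataframe for illustration.
df = pd.DataFrame({
    "id": range(100),
    "value": [i * 0.5 for i in range(100)],
})

# info() writes a summary (index, columns, non-null counts, dtypes, memory).
buffer = io.StringIO()
df.info(buf=buffer)
summary_text = buffer.getvalue()
print(summary_text)

# head(n) and tail(n) return the first and last n rows as new dataframes.
first_rows = df.head(30)
last_rows = df.tail(30)
print(first_rows.shape, last_rows.shape)
```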
  • Python DB-API has two main concepts: Connection object and Cursor object
  • Connection object: mainly used to connect to the database and manage transactions
  • Cursor object: mainly used to run database queries
  • Commonly used Connection methods
cursor() # returns a new cursor object using the connection

commit() # commit any pending transaction to the database

rollback() # causes the database to roll back to the start of any pending transaction

close() # to close a database connection
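The four Connection methods above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in, since sqlite3 follows the DB-API; the course itself may use a different database driver, and the cars table and its rows are made up for illustration.

```python
import sqlite3

# Connection object: represents the database connection and its transactions.
connection = sqlite3.connect(":memory:")

# cursor(): returns a new Cursor object, which is what runs the queries.
cursor = connection.cursor()
cursor.execute("CREATE TABLE cars (make TEXT, price REAL)")

# commit(): make the pending insert permanent.
cursor.execute("INSERT INTO cars VALUES (?, ?)", ("audi", 13950.0))
connection.commit()

# rollback(): discard everything since the last commit.
cursor.execute("INSERT INTO cars VALUES (?, ?)", ("bmw", 16430.0))
connection.rollback()

# Only the committed row is still there.
cursor.execute("SELECT make, price FROM cars")
rows = cursor.fetchall()
print(rows)  # [('audi', 13950.0)]

# close(): release the cursor and the database connection.
cursor.close()
connection.close()
```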