python basic : Introduction to Python for Data Science ( Microsoft: DAT208x )

Online course
- Introduction to Python for Data Science
- https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+3T2018/

Further Readings

numpy (numeric python)
- http://www.numpy.org/
Online code testing site
- https://www.onlinegdb.com/online_python_interpreter
matplotlib vs ggplot
- matplotlib: data visualization in Python
- ggplot: a Python port of ggplot2, a very popular R visualization package.
  It is based on the so-called grammar of graphics, a term coined by Leland Wilkinson, who wrote influential books on data visualization.
- http://ggplot.yhathq.com/
- http://ggplot2.org/

Control flow: further explanation
- https://docs.python.org/3.5/tutorial/controlflow.html
Built-in Types
- https://docs.python.org/3/library/stdtypes.html#boolean-operations-and-or-not
pandas official site
- http://pandas.pydata.org/
indexing and selecting data
- http://pandas.pydata.org/pandas-docs/version/0.8.1/indexing.html
Extra material for advanced study
- https://medium.com/dunder-data/selecting-subsets-of-data-in-pandas-6fcd0170be9c
matplotlib examples
- http://matplotlib.org/1.5.1/gallery.html
Hans Rosling, a data visualization pioneer
- https://www.youtube.com/watch?v=hVimVzgtD6w
numpy.column_stack
- https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.column_stack.html
pandas groupby
- https://www.tutorialspoint.com/python_pandas/python_pandas_groupby.htm
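
As a quick illustration of the groupby idea linked above, here is a minimal sketch. The DataFrame and its column names (continent, life_exp) are made up for illustration and are not taken from the course material.

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({
    'continent': ['Asia', 'Asia', 'Europe', 'Europe'],
    'life_exp': [70.0, 80.0, 78.0, 82.0],
})

# Split the rows by continent, then average life_exp within each group.
mean_by_continent = df.groupby('continent')['life_exp'].mean()

print(mean_by_continent['Asia'])    # 75.0
print(mean_by_continent['Europe'])  # 80.0
```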

Example 1

a=8
b=[-1,5]
c=[3,"B",19]
print("[%s, %s, %s]" % (a,b,c))

[8, [-1, 5], [3, 'B', 19]]

Example 2

x="education"
print(x.replace("i","_"))

educat_on

Example 3

import numpy as np
np_2d = np.array([[1,2,3],[14,15,16]])
print(np_2d[0:, :1])

[[ 1]
 [14]]
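
The slice in Example 3 keeps the result two-dimensional because both axes are sliced. The sketch below contrasts that with integer indexing, which drops the indexed axis. The variable names other than np_2d are illustrative additions.

```python
import numpy as np

np_2d = np.array([[1, 2, 3], [14, 15, 16]])

first_col_2d = np_2d[0:, :1]   # slice on both axes: shape (2, 1)
first_col_1d = np_2d[:, 0]     # integer index on columns: shape (2,)
second_row = np_2d[1, :]       # integer index on rows: shape (3,)

print(first_col_2d.shape)  # (2, 1)
print(first_col_1d)        # [ 1 14]
print(second_row)          # [14 15 16]
```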

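Example 4: min/max

import numpy as np
x = np.array([5, 4, 3, 2, 1, 0])
#    index:   0  1  2  3  4  5
np.max(x)            # maximum value: 5
x[np.where(x >= 3)]  # values that are >= 3: array([5, 4, 3])
np.min(x)            # minimum value: 0
np.where(x >= 3)     # indices where x >= 3: (array([0, 1, 2]),)
np.argmax(x)         # index of the maximum: 0
np.argmin(x)         # index of the minimum: 5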

Reference: https://rfriend.tistory.com/356
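
The same selection as in Example 4 can also be written with a boolean mask, which is a common alternative to np.where. This is a minimal sketch; the names mask, values and indices are illustrative.

```python
import numpy as np

x = np.array([5, 4, 3, 2, 1, 0])

mask = x >= 3                   # boolean array, one entry per element
values = x[mask]                # same result as x[np.where(x >= 3)]
indices = np.flatnonzero(mask)  # positions where the condition holds

print(values)   # [5 4 3]
print(indices)  # [0 1 2]
```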

Example 5: numpy.column_stack

>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.column_stack((a,b))
array([[1, 2],
       [2, 3],
       [3, 4]])
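
To see what column_stack does with the arrays from Example 5, it may help to compare it with np.vstack, which stacks the same 1-D arrays as rows instead of columns. A minimal sketch; as_columns and as_rows are illustrative names.

```python
import numpy as np

a = np.array((1, 2, 3))
b = np.array((2, 3, 4))

as_columns = np.column_stack((a, b))  # shape (3, 2): a and b become columns
as_rows = np.vstack((a, b))           # shape (2, 3): a and b become rows

print(as_columns.shape)  # (3, 2)
print(as_rows.shape)     # (2, 3)
print(np.array_equal(as_columns, as_rows.T))  # True
```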

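Example 6: using matplotlib

import numpy as np
import matplotlib.pyplot as plt

# gdp_cap, life_exp, pop and col are not defined here; the DataCamp
# exercise preloads them. col is expected to hold one color per country.
# The mapping below is the continent-to-color scheme (renamed from dict
# to avoid shadowing the built-in dict type).
continent_colors = {
    'Asia':'red',
    'Europe':'green',
    'Africa':'blue',
    'Americas':'yellow',
    'Oceania':'black'
}

# Specify c and alpha inside plt.scatter()
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha=0.8)

# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])

# Additional customizations
plt.text(1550, 71, 'India')
plt.text(5700, 80, 'China')

# Add grid() call
plt.grid(True)

# Show the plot
plt.show()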

Source: https://campus.datacamp.com/courses/introduction-to-python-for-data-science-microsoft/lab-53-customization?ex=4
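
Example 6 passes a list col to plt.scatter() without showing how it is built. One plausible way to derive it from the continent-to-color mapping is sketched below. The continents list is hypothetical sample data, and the name continent_colors is an assumption used here for the mapping.

```python
# Hypothetical sample data; the real exercise supplies one continent per country.
continents = ['Asia', 'Europe', 'Africa', 'Asia', 'Oceania']

# Continent-to-color mapping from Example 6.
continent_colors = {
    'Asia': 'red',
    'Europe': 'green',
    'Africa': 'blue',
    'Americas': 'yellow',
    'Oceania': 'black',
}

# One color per data point, in the same order as the points.
col = [continent_colors[c] for c in continents]
print(col)  # ['red', 'green', 'blue', 'red', 'black']
```

A list built this way has the same length as the plotted data, which is what the c argument of plt.scatter() needs when it is given one color per point.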

Example 7: pandas

# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)


# Print out country column as Pandas Series
print(cars['country'])

# Print out country column as Pandas DataFrame
print(cars[['country']])

# Print out observation for Japan
print(cars.loc['JAP'])

# Print out observations for Australia and Egypt
print(cars.loc[['AUS', 'EG']])

# Print out drives_right value of Morocco
print(cars.loc['MOR', 'drives_right'])

# Print sub-DataFrame
print(cars.loc[['RU', 'MOR'], ['country', 'drives_right']])
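
Example 7 needs cars.csv, which is not included here. The sketch below uses a small stand-in DataFrame (its contents are assumptions for illustration) to show the Series/DataFrame distinction and how label-based loc relates to position-based iloc.

```python
import pandas as pd

# Hypothetical stand-in for cars.csv; only two columns, three rows.
cars = pd.DataFrame(
    {
        'country': ['Japan', 'Morocco', 'Egypt'],
        'drives_right': [False, True, True],
    },
    index=['JAP', 'MOR', 'EG'],
)

by_label = cars.loc['MOR', 'drives_right']  # label-based lookup
by_position = cars.iloc[1, 1]               # same cell, by integer position

print(type(cars['country']).__name__)    # Series
print(type(cars[['country']]).__name__)  # DataFrame
print(by_label == by_position)           # True
```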
