- Online course
- Introduction to Python for Data Science
- https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+3T2018/
Further Readings
- variable types
- data structures
- functions
- numpy (numeric python)
- online code testing site
- matplotlib vs ggplot
- matplotlib: data visualization in Python
- ggplot: a Python port of ggplot2, a very popular R package for visualization. It is based on the so-called grammar of graphics, a term coined by Leland Wilkinson, who wrote some amazing books on data visualization.
- http://ggplot.yhathq.com/
- http://ggplot2.org/
- additional notes on control flow
- Built-in Types
- official pandas site
- indexing and selecting data
- extra material for advanced study
- (matplotlib examples)
- (Hans Rosling, a pioneer of data visualization)
- python group by function
Example 1
a=8
b=[-1,5]
c=[3,"B",19]
print("[%s, %s, %s]" % (a,b,c))
[8, [-1, 5], [3, 'B', 19]]
Example 2
x="education"
print(x.replace("i","_"))
educat_on
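A small follow-up sketch, not part of the course notes: `str.replace` returns a new string and leaves the original untouched, because Python strings are immutable. The name `y` is introduced here only for illustration.

```python
x = "education"
y = x.replace("i", "_")  # replace() builds and returns a new string

print(x)  # the original string is unchanged
print(y)  # the modified copy
```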
Example 3
import numpy as np
np_2d = np.array([[1,2,3],[14,15,16]])
print(np_2d[0:, :1])
[[ 1]
 [14]]
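The output above keeps two dimensions because `:1` is a slice. As a minimal sketch (the names `col_2d` and `col_1d` are made up for this illustration), slicing a column keeps the axis, while indexing it with an integer drops it:

```python
import numpy as np

np_2d = np.array([[1, 2, 3], [14, 15, 16]])

col_2d = np_2d[:, :1]  # slice on the column axis keeps that axis
col_1d = np_2d[:, 0]   # integer index on the column axis drops it

print(col_2d.shape)  # two-dimensional result
print(col_1d.shape)  # one-dimensional result
```

Which form is preferable depends on what consumes the result; the sliced form is convenient when a later step expects a 2D array.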
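Example 4: min/max
x = np.array([5, 4, 3, 2, 1, 0])
# index:      0  1  2  3  4  5
np.max(x)            # 5, the maximum value
x[np.where(x >= 3)]  # array([5, 4, 3]), the values that are >= 3
np.min(x)            # 0, the minimum value
np.where(x >= 3)     # (array([0, 1, 2]),), the indices of values >= 3
np.argmax(x)         # 0, the index of the maximum
np.argmin(x)         # 5, the index of the minimum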
Reference: https://rfriend.tistory.com/356
Example 5: numpy.column_stack
>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.column_stack((a,b))
array([[1, 2],
       [2, 3],
       [3, 4]])
Example 6: matplotlib usage
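import matplotlib.pyplot as plt
import numpy as np

# gdp_cap, life_exp, pop and col are assumed to be lists preloaded by the course exercise
# Continent-to-color mapping (renamed from dict to avoid shadowing the built-in);
# col is assumed to hold one of these colors per country
continent_colors = {
'Asia':'red',
'Europe':'green',
'Africa':'blue',
'Americas':'yellow',
'Oceania':'black'
}
# Specify c and alpha inside plt.scatter()
plt.scatter(x = gdp_cap, y = life_exp, s = np.array(pop) * 2, c = col, alpha=0.8)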
# Previous customizations
plt.xscale('log')
plt.xlabel('GDP per Capita [in USD]')
plt.ylabel('Life Expectancy [in years]')
plt.title('World Development in 2007')
plt.xticks([1000,10000,100000], ['1k','10k','100k'])
# Additional customizations
plt.text(1550, 71, 'India')
plt.text(5700, 80, 'China')
# Add grid() call
plt.grid(True)
# Show the plot
plt.show()
Example 7: pandas
# Import cars data
import pandas as pd
cars = pd.read_csv('cars.csv', index_col = 0)
# Print out country column as Pandas Series
print(cars['country'])
# Print out country column as Pandas DataFrame
print(cars[['country']])
# Print out observation for Japan
print(cars.loc['JAP'])
# Print out observations for Australia and Egypt
print(cars.loc[['AUS', 'EG']])
# Print out drives_right value of Morocco
print(cars.loc['MOR', 'drives_right'])
# Print sub-DataFrame
print(cars.loc[['RU', 'MOR'], ['country', 'drives_right']])
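The example above needs `cars.csv`, which is not included here. A minimal sketch of the same `.loc` patterns, assuming a small hand-built DataFrame as a stand-in (the rows below are illustrative, not the contents of the course file, and `row`, `sub`, and `value` are made-up names):

```python
import pandas as pd

# Hypothetical stand-in for cars.csv; only three rows and two columns
cars = pd.DataFrame(
    {
        "country": ["Japan", "Russia", "Morocco"],
        "drives_right": [False, True, True],
    },
    index=["JAP", "RU", "MOR"],
)

row = cars.loc["JAP"]                        # single row label -> Series
sub = cars.loc[["RU", "MOR"], ["country"]]   # lists of labels -> DataFrame
value = cars.loc["MOR", "drives_right"]      # row label + column label -> single value

print(type(row).__name__)
print(type(sub).__name__)
print(value)
```

The general pattern is that a bare label returns a lower-dimensional object, while a list of labels, even a list of one, keeps the DataFrame shape.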