1-7 Converting a DataFrame to a NumPy Array

numpy provides statistical functions that operate on arrays, so this exercise asks you to convert a pandas DataFrame to a numpy array. The next exercise applies one of numpy's statistics functions to that array.

Select the columns unemployed and low_wage_jobs from recent_grads, then convert them to a numpy array. Save this as recent_grads_np.
Print the type of recent_grads_np to see that it is a numpy array.

answer

# Convert to numpy array
# (as_matrix() was removed in pandas 1.0; to_numpy() is its replacement)
recent_grads_np = recent_grads[['unemployed','low_wage_jobs']].to_numpy()

# Print the type of recent_grads_np
print(type(recent_grads_np))

In [3]: # Convert to numpy array
        recent_grads_np = recent_grads[['unemployed','low_wage_jobs']].to_numpy()

        # Print the type of recent_grads_np
        print(type(recent_grads_np))
<class 'numpy.ndarray'>
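
The conversion can also be tried without the course data. The following is a minimal sketch, assuming a small made-up DataFrame: the name toy and all of its values are hypothetical and are not taken from recent_grads. It shows that selecting columns with a list still returns a DataFrame, and that calling to_numpy() on that selection is what produces the array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for recent_grads; names and values are made up.
toy = pd.DataFrame({
    "major": ["A", "B", "C"],
    "unemployed": [10, 20, 30],
    "low_wage_jobs": [5, 15, 40],
})

# Selecting columns with a list of names still returns a DataFrame.
subset = toy[["unemployed", "low_wage_jobs"]]
print(type(subset))

# to_numpy() converts the selection into a 2-D ndarray (rows x columns).
toy_np = subset.to_numpy()
print(type(toy_np))
print(toy_np.shape)
```

The shape is (number of rows, number of selected columns), so each column of the DataFrame becomes a column of the array.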

trial errors

In [1]: # Convert to numpy array
        recent_grads_np = recent_grads[['unemployed','low_wage_jobs']]
        # Print the type of recent_grads_np
        print(recent_grads_np)
     unemployed  low_wage_jobs
0            37            193
1            85             50
2            16              0
3            40              0
4          1672            972
5           400            244
6           308            259
7            33            220
8          4650           3253
9          3895           3170
10         2275            980
11          794            372
12         1019            789
13           78             81
14           23            263
15          589            524
16          699            640
17         2859           3192
18          170            137
19           11            144
20         6884           5144
21          338            485
22          824            696
23           70             70
24         1015            708
25         3270           2899
26         1042            703
27          504            285
28          597            365
29          670            340
..          ...            ...
143         314           1231
144         266            591
145       28169          48207
146        3918           9286
147        1920           2042
148        1128           3426
149        5486          11880
150        3355           5248
151        3329           4344
152         917           2125
153        1465           2840
154         496            722
155         419           1650
156         326            724
157         372           1141
158        1617           3304
159        1368           3586
160         510           3163
161          82             31
162        3395           6866
163        1487           5125
164        1360           2868
165         846           1115
166        3040          11068
167        1340           3466
168         304            743
169         148             82
170         368            622
171         214            308
172          87            192

[173 rows x 2 columns]

In [2]: # Convert to numpy array
        recent_grads_np = recent_grads[['unemployed','low_wage_jobs']]
        # Print the type of recent_grads_np
        print(type(recent_grads_np))
<class 'pandas.core.frame.DataFrame'>

1-8 Correlation Coefficient

You suspect that there is a relationship between the low_wage_jobs and unemployed columns (the two columns stored in recent_grads_np), so you decide to use numpy to calculate the correlation coefficient.

Calculate the correlation matrix of the numpy array recent_grads_np.

# Calculate correlation matrix
print(np.corrcoef(____))

# Calculate correlation matrix
#print(recent_grads_np[:,0])
print(np.corrcoef(recent_grads_np[:,0],recent_grads_np[:,1]))

trial errors

#print(recent_grads_np)
#print(np.corrcoef(low_wage_jobs,unemployment_rate))
#print(np.corrcoef(recent_grads_np))
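
A likely reason the np.corrcoef(recent_grads_np) attempt does not give a 2 x 2 matrix: by default np.corrcoef treats each row of a 2-D input as a variable, so a 173 x 2 array yields a 173 x 173 result. The sketch below illustrates this on a hypothetical 5 x 2 array (arr and its values are made up) and shows that rowvar=False gives the same matrix as passing the two columns separately.

```python
import numpy as np

# Hypothetical 5 x 2 array standing in for recent_grads_np (values made up).
arr = np.array([[1, 2], [2, 4], [3, 7], [4, 8], [5, 11]])

# Default rowvar=True: each of the 5 rows is treated as a variable -> 5 x 5.
by_rows = np.corrcoef(arr)
print(by_rows.shape)

# rowvar=False: each column is treated as a variable -> 2 x 2.
by_cols = np.corrcoef(arr, rowvar=False)
print(by_cols.shape)

# Passing the two columns as separate arguments gives the same 2 x 2 matrix.
pairwise = np.corrcoef(arr[:, 0], arr[:, 1])
print(np.allclose(by_cols, pairwise))
```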

answer

[[1.         0.95538815]
 [0.95538815 1.        ]]
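
The matrix is symmetric with ones on the diagonal; the off-diagonal entry is the Pearson correlation coefficient between the two columns. As a rough check of what that entry means, the sketch below computes the coefficient by hand and compares it with np.corrcoef. The samples x and y are hypothetical values chosen for illustration, not course data.

```python
import numpy as np

# Hypothetical paired samples (values made up for illustration).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# Pearson r: sum of products of deviations from the mean, divided by the
# square root of the product of the summed squared deviations.
dx = x - x.mean()
dy = y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# np.corrcoef returns the full 2 x 2 matrix; r sits off the diagonal.
corr = np.corrcoef(x, y)
print(r_manual)
print(corr[0, 1])
```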

reference values

In [1]: # Calculate correlation matrix
        print(recent_grads_np)
        #print(np.corrcoef(low_wage_jobs,unemployment_rate))
[[   37   193]
[   85    50]
[   16     0]
[   40     0]
[ 1672   972]
[ 400   244]
[ 308   259]
[   33   220]
[ 4650 3253]
[ 3895 3170]
[ 2275   980]
[ 794   372]
[ 1019   789]
[   78    81]
[   23   263]
[ 589   524]
[ 699   640]
[ 2859 3192]
[ 170   137]
[   11   144]
[ 6884 5144]
[ 338   485]
[ 824   696]
[   70    70]
[ 1015   708]
[ 3270 2899]
[ 1042   703]
[ 504   285]
[ 597   365]
[ 670   340]
[ 308   260]
[ 163   142]
[ 286   755]
[   49    49]
[ 8497 6193]
[ 9413 9910]
[11452 10653]
[ 1165 1284]
[ 129   480]
[ 137   124]
[12411 10886]
[ 2884 4569]
[ 2934 1672]
[ 1282 1823]
[ 505 1002]
[ 639   608]
[ 401   343]
[ 385   357]
[ 107    93]
[   99   186]
[   74   245]
[ 407 1270]
[    0    25]
[ 419   263]
[ 223   135]
[   88     0]
[ 2271 2499]
[14946 27320]
[ 4366 4221]
[ 2092 3046]
[ 977 1121]
[ 1067 1168]
[ 1150 1758]
[ 649 1362]
[ 178   839]
[ 416   386]
[ 250   406]
[   87   201]
[ 215   573]
[ 138   302]
[ 286   272]
[ 182    94]
[   42   269]
[    0     0]
[ 2769 4288]
[   64    81]
[21502 32395]
[11663 27968]
[15022 19803]
[ 1799 1905]
[ 693 1246]
[ 721   308]
[ 2249 3012]
[    0    56]
[ 1100   352]
[ 677   959]
[ 1315 1906]
[ 757 1336]
[ 893 1422]
[ 789   496]
[   36   221]
[   33    37]
[ 1779 3175]
[14602 27440]
[11268 18404]
[ 8947 14839]
[ 4535 8512]
[ 2727 5751]
[ 3305 7214]
[ 1668 3677]
[ 1067 1179]
[ 1088 2237]
[ 1743 1895]
[ 975 2449]
[ 1518 1391]
[ 2006 2495]
[ 962   557]
[ 842 1405]
[ 463   902]
[ 749 1061]
[   78   237]
[ 322   327]
[    0     0]
[ 7195 11443]
[11176 16839]
[ 3132 5267]
[ 1718 3168]
[ 1012 1806]
[ 1833 1854]
[ 216   786]
[    0   111]
[ 529 1159]
[ 483   459]
[13874 28339]
[ 8608 13748]
[ 4410 6429]
[ 2409 4468]
[ 2393 9063]
[ 1379 2819]
[ 1302 2085]
[ 547   657]
[ 757 1470]
[ 437   976]
[ 833 1385]
[ 2183 3816]
[ 4267 8051]
[ 1206 2767]
[14345 26503]
[ 7297 11502]
[ 5593 16838]
[ 4657 9030]
[ 3718 5862]
[ 1108 1634]
[ 314 1231]
[ 266   591]
[28169 48207]
[ 3918 9286]
[ 1920 2042]
[ 1128 3426]
[ 5486 11880]
[ 3355 5248]
[ 3329 4344]
[ 917 2125]
[ 1465 2840]
[ 496   722]
[ 419 1650]
[ 326   724]
[ 372 1141]
[ 1617 3304]
[ 1368 3586]
[ 510 3163]
[   82    31]
[ 3395 6866]
[ 1487 5125]
[ 1360 2868]
[ 846 1115]
[ 3040 11068]
[ 1340 3466]
[ 304   743]
[ 148    82]
[ 368   622]
[ 214   308]
[   87   192]]