
[DAT208x] final lab 3-7 : Data Visualization (Dropping Missing Values)

rararara 2021. 10. 23. 17:09

Dropping Missing Values

Now that you've replaced the UN values with NaN values, you realize it's better to delete these rows completely. In this exercise, you'll do just that. To confirm that the rows were filtered out, you'll then check that the size of the DataFrame is smaller.

 

  • With a single line of code, drop all the rows that have a missing value.
  • Print the size of the manipulated DataFrame.

 

# Print the size of the DataFrame
print(recent_grads.size)

# Drop all rows with a missing value

recent_grads.dropna(axis=0, inplace=True)

# Print the size of the DataFrame
print(recent_grads.size)

 

<script.py> output:
    3633
    3549
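The two numbers printed above come from the size attribute, which counts cells (rows times columns) rather than rows, so the difference 3633 - 3549 = 84 is a number of cells, not a number of dropped rows. A minimal sketch of the difference between size and shape, using a made-up toy DataFrame (the name toy and its values are assumptions, not the recent_grads data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for recent_grads
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [7.0, 8.0, 9.0],
})

cleaned = toy.dropna(axis=0)  # drop every row that contains a NaN

# size = rows * columns, shape = (rows, columns)
print(toy.size, toy.shape)          # 9 (3, 3)
print(cleaned.size, cleaned.shape)  # 6 (2, 3)

# Number of rows actually removed
rows_removed = len(toy) - len(cleaned)
print(rows_removed)                 # 1
```

To count rows directly, len(df) or df.shape[0] is probably easier to read than size.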

 


recent_grads.dropna(axis=0, inplace=True)

->  I got stuck here for quite a while, on the axis=0 and inplace=True arguments.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html


# axis : {0 or ‘index’, 1 or ‘columns’}, default 0

Determine if rows or columns which contain missing values are removed.

  • 0, or ‘index’ : Drop rows which contain missing values.
  • 1, or ‘columns’ : Drop columns which contain missing values.
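The two axis settings described above can be compared side by side. This is only a sketch on a made-up toy DataFrame (df and its values are assumptions for illustration, not the exercise data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one NaN, located in row 1 of column "a"
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, 6.0],
})

rows_dropped = df.dropna(axis=0)  # removes row 1, keeps both columns
cols_dropped = df.dropna(axis=1)  # removes column "a", keeps all rows

print(rows_dropped.shape)          # (2, 2)
print(cols_dropped.shape)          # (3, 1)
print(list(cols_dropped.columns))  # ['b']
```

Because axis defaults to 0, df.dropna() alone should behave the same as df.dropna(axis=0).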

# inplace : bool, default False (when set to True, the result is saved into the DataFrame itself)

If True, do operation inplace and return None.
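A rough sketch of what inplace changes in practice, again on made-up toy data (the names original and in_place are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Two identical hypothetical DataFrames, each with one NaN row
original = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})
in_place = original.copy()

# Default (inplace=False): original is untouched, a new DataFrame is returned
cleaned = original.dropna(axis=0)
print(len(original), len(cleaned))  # 3 2

# inplace=True: the DataFrame itself is modified, so the result is not assigned back
in_place.dropna(axis=0, inplace=True)
print(len(in_place))                # 2
```

Since the documentation quoted above says the call returns None when inplace=True, writing df = df.dropna(inplace=True) would presumably replace df with None. Using either the assignment form or the inplace form, but not both, avoids that.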