
All categories (11)

[DAT208x] final lab 3-7 : Data Visualization (Dropping Missing Values) Dropping Missing Values Now that you've replaced the UN values with NaN values, you realize it's better that you just delete these rows completely. In this exercise, you'll do just that. To confirm you filtered out the rows, you’ll then check the size of the DataFrame to confirm the size is smaller. With a single line of code, drop all the rows that have a missing value. Print the size of the ma..
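
A minimal sketch of the row-dropping step described above. The DataFrame name `recent_grads` and its contents are assumptions made up for illustration, since the excerpt is cut off before it names the DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real lab DataFrame is not shown in the excerpt.
recent_grads = pd.DataFrame({
    "median": [50000, np.nan, 42000],
    "unemployment_rate": [0.05, 0.07, np.nan],
})

size_before = recent_grads.size       # number of cells (rows * columns)
recent_grads = recent_grads.dropna()  # drop every row that has at least one NaN
size_after = recent_grads.size

print(size_before, size_after)
```

Comparing `.size` before and after is one way to confirm that rows were actually removed.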

[DAT208x] final lab 3-1 : visualization data Plotting Scatterplots Now that you've calculated the correlation coefficient between the low_wage_jobs and unemployment_rate columns, you want to create a visualization to effectively display this relationship. You'll use matplotlib to create a scatterplot of these two columns. The DataFrame dept_stats is available in your workspace again, and the columns low_wage_jobs and unemployment_rate have..
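
A rough sketch of the scatterplot step, assuming made-up values for `dept_stats`. Which column goes on which axis is also an assumption, because the excerpt is truncated before the instructions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the lab's dept_stats DataFrame.
dept_stats = pd.DataFrame({
    "low_wage_jobs": [120, 300, 80, 450],
    "unemployment_rate": [0.04, 0.07, 0.03, 0.09],
})

fig, ax = plt.subplots()
# Axis assignment is an assumption; swap the arguments if the lab expects the reverse.
ax.scatter(dept_stats["unemployment_rate"], dept_stats["low_wage_jobs"])
ax.set_xlabel("unemployment_rate")
ax.set_ylabel("low_wage_jobs")
```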

[DAT208x] final lab 2-6, 2-7, 2-8 : Manipulating Data (Grouping) Grouping with Counts There are various department categories but no sense of how many departments there are in each category. You'll use pandas to gain insight into this information. In particular, you'll use the .groupby() method of pandas. This was not introduced to you in the course, but you'll see it very frequently in your data science journey and it's an important method to understand. This..
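
A small sketch of counting departments per category with `.groupby()`. The column names `major` and `major_category` and the rows are assumptions for illustration; the excerpt does not name them.

```python
import pandas as pd

# Hypothetical data: one row per department, with an assumed category column.
recent_grads = pd.DataFrame({
    "major": ["A", "B", "C", "D"],
    "major_category": ["Engineering", "Engineering", "Arts", "Business"],
})

# Group rows by category, then count how many departments fall into each group.
counts = recent_grads.groupby("major_category")["major"].count()
print(counts)
```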

[DAT208x] final lab 2-1, 2-2, 2-3, 2-4, 2-5 : Section 2: Manipulating Data Creating Columns I If you look at the dataset, you'll notice that while there's a column which shows the percentage of women in each department, there is no column which shows the percentage of men. Create a new column named sharemen that contains the percentage of men for a given department by dividing the number of men by the total number of students for each department. # Add sharemen column..
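
A minimal sketch of the `sharemen` calculation. The column names `men` and `total` are assumptions based on the wording of the exercise, and the numbers are invented.

```python
import pandas as pd

# Hypothetical data; column names "men" and "total" are assumed.
recent_grads = pd.DataFrame({
    "total": [100, 200],
    "men": [40, 150],
    "women": [60, 50],
})

# Element-wise division gives the share of men per department.
recent_grads["sharemen"] = recent_grads["men"] / recent_grads["total"]
print(recent_grads)
```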

[DAT208x] final lab 1-7, 1-8 1-7 Converting a DataFrame to Numpy Array Since numpy is such a powerful Python module, this exercise asks you to convert a pandas DataFrame to a numpy array to then utilize a statistics metric available through numpy in the next exercise. Select the columns unemployed and low_wage_jobs from recent_grads, then convert them to a numpy array. Save this as recent_grads_np. Print the type of recent_..
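
A short sketch of the conversion described above, using invented values for the two columns. `DataFrame.to_numpy()` is used here; the lab itself may expect a different accessor such as `.values`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for recent_grads with only the two relevant columns.
recent_grads = pd.DataFrame({
    "unemployed": [10, 25, 7],
    "low_wage_jobs": [120, 300, 80],
})

# Select the two columns, then convert the result to a 2-D numpy array.
recent_grads_np = recent_grads[["unemployed", "low_wage_jobs"]].to_numpy()
print(type(recent_grads_np))
```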

[DAT208x] final lab 1-4, 1-5, 1-6 Select a Column Python's pandas module allows you to select a specific column from a DataFrame, which is especially useful for when you only need to manipulate one piece of data. In this exercise, you'll select the sharewomen column, which shows the percentage of women for a given department. The DataFrame recent_grads is still in your workspace. Instructions Select the sharewomen column from re..
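
A tiny sketch of selecting the `sharewomen` column. The data and the variable name `sw_col` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for recent_grads.
recent_grads = pd.DataFrame({
    "major": ["A", "B"],
    "sharewomen": [0.6, 0.25],
})

# Bracket indexing with a single column name returns a pandas Series.
sw_col = recent_grads["sharewomen"]
print(sw_col.head())
```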

[DAT208x] final lab 1-3 Replacing Missing Values There are some missing values in the dataset that are coded as a string. You'll update these to a value that Python understands as "missing." The list columns contains the names of the columns you'll be working with in this exercise. Instructions Look at the dtypes of the columns in columns to make sure that the data is numeric. It looks like a string is being used to en..
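
A hedged sketch of the replacement step. The excerpt is cut off before it names the placeholder string, so `"UN"` is taken from the later lab 3-7 excerpt; the contents of `columns` and the data are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical column list and data; "UN" marks a missing value.
columns = ["median", "p25th"]
recent_grads = pd.DataFrame({
    "median": ["50000", "UN", "42000"],
    "p25th": ["30000", "28000", "UN"],
})

print(recent_grads[columns].dtypes)  # string-typed, not numeric, because of "UN"

for col in columns:
    # Swap the placeholder for NaN, then convert the column to numbers.
    recent_grads[col] = recent_grads[col].replace("UN", np.nan)
    recent_grads[col] = pd.to_numeric(recent_grads[col])

print(recent_grads[columns].dtypes)
```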

[DAT208x] final lab 1-1 : Section 1: Importing and Summarizing Data Read and explore your data In this lab, you'll explore a dataset containing information on a university's recent graduates for each department. The URL this dataset can be downloaded from is stored in a variable called recent_grads_url. In this exercise, you'll read in this data using Python's pandas module. Instructions (100 XP) Import pandas as pd. Read in the data from recent_grads_url (which is..
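
A minimal sketch of reading and summarizing the data, assuming the file is a CSV. Because `recent_grads_url` is not given in the excerpt, an in-memory string with invented rows stands in for it here.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file behind recent_grads_url.
csv_text = "major,total,men,women\nA,100,40,60\nB,200,150,50\n"

# In the lab this would presumably be pd.read_csv(recent_grads_url).
recent_grads = pd.read_csv(io.StringIO(csv_text))

print(recent_grads.head())      # first rows
print(recent_grads.describe())  # summary statistics for numeric columns
```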
