Data Cleaning

Kaggle and Microsoft SQL Server

Data Origin

The data for the Udemy project comes from Kaggle Datasets. It covers the finance courses on Udemy and includes course reviews, subscriber counts, prices, and so on. We will draw some insights from the data after cleaning it with Microsoft SQL Server. The link to the dataset is here.

Data Cleaning

The data cleaning is done in Microsoft SQL Server. The code is also posted on GitHub: https://github.com/smzahir/Portfolio-Project

-- Exploratory Data Analysis Of Udemy Financial Courses


SELECT *
FROM Udemy_data;

-- Removing unwanted columns from the table

ALTER TABLE Udemy_data
DROP COLUMN  url, avg_rating, avg_rating_recent, is_wishlisted,
            num_published_practice_tests, created, discount_price__currency,
            discount_price__price_string, price_detail__price_string,
            price_detail__currency
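
After dropping columns, it can be useful to confirm what is left in the table. The query below is a minimal sketch that reads the standard INFORMATION_SCHEMA.COLUMNS view; it assumes the table lives in the current database and that no other schema has a table with the same name.

```sql
-- List the remaining columns of Udemy_data in their table order
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Udemy_data'
ORDER BY ORDINAL_POSITION;
```

The DATA_TYPE column is also a quick way to see how each field was typed during import, which matters for the later date conversion.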

-- Renaming columns for better understanding
-- EXEC lets the calls run inside a larger batch, and 'COLUMN' states
-- explicitly that the object being renamed is a column

EXEC sp_rename 'Udemy_data.id', 'ID', 'COLUMN';
EXEC sp_rename 'Udemy_data.title', 'Title', 'COLUMN';
EXEC sp_rename 'Udemy_data.is_paid', 'PaidCourse', 'COLUMN';
EXEC sp_rename 'Udemy_data.num_subscribers', 'Subscribers', 'COLUMN';
EXEC sp_rename 'Udemy_data.rating', 'Rating', 'COLUMN';
EXEC sp_rename 'Udemy_data.num_reviews', 'Reviews', 'COLUMN';
EXEC sp_rename 'Udemy_data.num_published_lectures', 'LecturesPublished', 'COLUMN';
EXEC sp_rename 'Udemy_data.published_time', 'DatePublished', 'COLUMN';
EXEC sp_rename 'Udemy_data.discount_price__amount', 'OfferPrice', 'COLUMN';
EXEC sp_rename 'Udemy_data.price_detail__amount', 'Price', 'COLUMN';

-- On inspection, the OfferPrice and Price columns contain NULL values

SELECT * 
FROM
Udemy_data
WHERE Price IS NULL
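
The query above lists the rows with a missing Price. To see how many values are missing in each of the two columns, a count can help. This is a minimal sketch that relies on COUNT(column) skipping NULLs; the alias names are illustrative only.

```sql
-- COUNT(*) counts every row, COUNT(column) counts only non-NULL values,
-- so the difference is the number of NULLs in that column
SELECT
    COUNT(*)                     AS TotalRows,
    COUNT(*) - COUNT(Price)      AS NullPrices,
    COUNT(*) - COUNT(OfferPrice) AS NullOfferPrices
FROM Udemy_data;
```

Running the same query again after the replacement step should show zero for both NULL counts.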

-- Replacing NULL values with zeros

-- Only rows that actually hold NULL are touched
UPDATE Udemy_data
SET Price = 0
WHERE Price IS NULL;

UPDATE Udemy_data
SET OfferPrice = 0
WHERE OfferPrice IS NULL;

-- Converting DatePublished to a date-only value

UPDATE Udemy_data
SET DatePublished =
   CAST(DatePublished AS DATE)
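
The UPDATE above rewrites the stored values, but the column keeps whatever data type it was given at import. If the goal is a true DATE column, one option is to change the column type afterwards. The sketch below assumes every value in DatePublished converts cleanly to DATE and that the column's current type supports that conversion; TRY_CONVERT (available from SQL Server 2012) is used first to surface any rows that would not convert.

```sql
-- Rows whose value cannot be converted to DATE (expect none before altering)
SELECT ID, DatePublished
FROM Udemy_data
WHERE DatePublished IS NOT NULL
  AND TRY_CONVERT(DATE, DatePublished) IS NULL;

-- Change the declared column type so only the date part is stored
ALTER TABLE Udemy_data
ALTER COLUMN DatePublished DATE;
```

If the first query returns rows, those values would need fixing before the ALTER COLUMN statement can succeed.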
