All about SQL Data Cleaning
The next step after data profiling: once you have identified the errors in the data that must be corrected for accurate analysis, it is time to clean the data
Data Cleaning is Indispensable
When you first receive a data set to explore, the first thing to check is whether the data is ready for analysis. Often, the answer is no. Raw data can be unstructured for a variety of reasons, and it commonly contains mistakes, typos, duplicates, missing values, and other problems that make analysis more difficult.
That’s where data cleaning comes in. The objective is to ensure that the data you analyze is consistent, trustworthy, and correct.
In this article, I will walk you through different cases of data cleaning in SQL, along with approaches to dealing with them quickly. Most of my examples use PostgreSQL.
Now, let’s start to see what we’ve got!
Handling Missing or NULL values
There are a few common ways to deal with missing values: removing the rows that contain them, filling them in with a fixed value, or imputing a calculated value.
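To make these three options concrete, here is a minimal PostgreSQL sketch. The orders table, its columns, and the sample rows are hypothetical and exist only for illustration. The 'Unknown' placeholder and the column average are also just assumptions about what a sensible fill value might be for your data.

```sql
-- Hypothetical sample table (not from a real data set).
CREATE TEMP TABLE orders (
    order_id integer PRIMARY KEY,
    customer text,
    amount   numeric(10, 2)
);

INSERT INTO orders (order_id, customer, amount) VALUES
    (1, 'Alice', 120.00),
    (2, NULL,     80.00),
    (3, 'Carol',  NULL),
    (4, 'Dave',   40.00);

-- Option 1: leave out rows with a missing value.
-- This filter is non-destructive; a DELETE with the opposite
-- condition (customer IS NULL) would remove the rows permanently.
SELECT order_id, customer, amount
FROM orders
WHERE customer IS NOT NULL;

-- Option 2: fill in a fixed value at query time.
-- COALESCE returns its first non-NULL argument.
SELECT order_id,
       COALESCE(customer, 'Unknown') AS customer,
       amount
FROM orders;

-- Option 3: fill in a calculated value.
-- AVG ignores NULLs, so the average is taken over known amounts only.
UPDATE orders
SET amount = (SELECT AVG(amount) FROM orders)
WHERE amount IS NULL;
```

Which option fits depends on the column: dropping rows shrinks the data set, while filling with an average can hide the fact that a value was never recorded, so it is worth checking how many rows are affected before choosing.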