Loading...

Data Cleaning in MySQL

Topan Septiadi Mediana

Data Cleaning in MySQL

As a data analyst, one of the most critical steps before performing any analysis is data cleaning. No matter how sophisticated your dashboard or model is, if the data is messy, the outcome will likely be misleading.

In many of my projects—especially those involving MySQL as the main data source—data cleaning serves as a foundational task. Below, I’ll walk through several real-world data cleaning techniques I’ve applied using SQL queries in MySQL.

Detect and Handle Missing Values

SELECT * FROM sales WHERE customer_name IS NULL;

Once identified, missing values can be:

  • Filled with default values
  • Imputed based on logic or averages
  • Removed completely (when justified)

Remove Duplicate Records

DELETE FROM sales WHERE id NOT IN ( SELECT MIN(id) FROM (SELECT * FROM sales) AS temp GROUP BY order_id );

I often use MIN(id) to retain one original row while eliminating duplicate records based on a unique column like order_id.

Trim Extra Spaces in Text

UPDATE customers SET customer_name = TRIM(customer_name);

TRIM() is a simple but powerful function to ensure clean strings, especially useful for preventing errors during JOIN operations.

Standardize Date Formats

SELECT STR_TO_DATE(order_date, '%d/%m/%Y') AS formatted_date FROM orders;

Using STR_TO_DATE, I make sure all date formats are consistent so functions like YEAR() or MONTH() work flawlessly.

Normalize Categorical Values

UPDATE orders SET ship_mode = 'Standard Class' WHERE ship_mode IN ('standard', 'std', 'STD');

Normalizing category values avoids data being split into several variants that should actually be grouped together.

Real-World Project Insight

In one of my sales analysis projects, I discovered that over 15% of the data contained duplicates or inconsistent labels. After cleaning, the analysis results became much more reliable, helping the business make smarter, data-driven decisions.

Final Thoughts

Data cleaning is not just about technical skills—it's about understanding context. With MySQL, I can clean, normalize, and prepare raw data efficiently to ensure quality insights.

If you're working with data, never skip this step. Because in analytics, quality > quantity.

From Blog

News And Updates

Lorem ipsum dolor, sit amet consectetur adipisicing elit. Tenetur adipisci facilis cupiditate recusandae aperiam temporibus corporis itaque quis facere, numquam, ad culpa deserunt sint dolorem autem obcaecati, ipsam mollitia hic.

MySQL
Topan Septiadi
24 Jul 2025
0 Comments
Data Cleaning in MySQL

In this project, i am cleaning raw data using MySQL and it turn out that this is realy easy to use more often if you are getting to know about the data it self.

Read More
Business
Topan Septiadi
30 Jul 2025
0 Comments
Bike Sales Analysis Using Microsoft Excel

Excel provide me more easy because at the end of the day, excel came up with huge feature and it make it more compatible since we are know the tools.

Read More
Business
Martin.C
30 Dec 2025
6 Comments
Leverage agile frameworks to provide

Lorem, ipsum dolor sit amet consectetur adipisicing elit. Eius libero soluta impedit eligendi? Quibusdam, laudantium.

Read More
View All of My Project