Below are five SQL query examples demonstrating some of the most common data cleansing tasks improving the quality of your dataset.
- Removing duplicate rows:
SELECT *
FROM table1
WHERE rowid
IN (
SELECT
MIN(rowid)
FROM table1
GROUP BY column1, column2, ...
) - Replacing misspelled values in a column:
SELECT
CASE
WHEN column1 LIKE '[Misspelled value]' THEN REPLACE(column1, '[Misspelled value]', '[Correct value]')
ELSE column1
END AS newcolumn1
FROM table1 - Converting data types:
SELECT
CAST(
column1 AS [Data Type of your choice]
) AS newcolumn1
FROM table1 - Replacing NULL values with 0:
SELECT
CASE
WHEN column1 IS NULL THEN 0
ELSE column1
END AS newcolumn1
FROM table1 - Standardizing date formats:
SELECT
DATE_FORMAT(datecolumn, '[Date Format of your choice]')
AS newdatecolumn
FROM table1
These SQL queries can be easily copied and pasted into your SQL environment to clean and prepare your data for analysis or further processing.