
Data Cleaning

This section outlines the data manipulation and transformation steps applied to the dataset.

Pre-Modification Measures

Before any manipulation or transformation was performed, measures were taken to ensure data integrity. These measures included:

  • Creation of a Backup Table
    A backup of the "session" table, containing the exact columns of the imported dataset, was created. This served as a safeguard against unintended changes to the original records.
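A backup like this can be created in two statements (a sketch; the backup table name `session_backup` is an assumption, not from the project):

```sql
-- Sketch (MySQL): clone the structure, then copy the data.
-- `session_backup` is an assumed name for the backup table.
CREATE TABLE session_backup LIKE session;
INSERT INTO session_backup SELECT * FROM session;
```

`CREATE TABLE ... LIKE` copies column definitions and indexes but no rows, which is why the separate `INSERT ... SELECT` is needed.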

Cleaning Steps Applied:

Data Inconsistency
The DISTINCT function was used to inspect categorical variables and standardise their values; e.g. the device field contained inconsistent values such as "m" and "d", which were updated.

Duplicate Record Removal
192 duplicate records were identified and removed based on multiple criteria, including session_id, created_at, device, and status, leaving 2,118 unique records.
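For fully duplicated rows, one common approach is to rebuild the table from the distinct rows and swap it in (a sketch; the intermediate table name `session_dedup` is an assumption):

```sql
-- Sketch (MySQL): keep one copy of each distinct row.
CREATE TABLE session_dedup AS
SELECT DISTINCT * FROM session;

-- Swap the deduplicated table in place of the original.
DROP TABLE session;
RENAME TABLE session_dedup TO session;
```

Because a backup table was created beforehand, the destructive `DROP TABLE` step carries little risk here.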

Handled Null & Empty Values
Fields with null or empty values were identified and addressed. Nulls were filled with the column average (mean imputation) rather than dropped, since the dataset is small and dropping rows would lose too much data.
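Mean imputation for one column might be sketched as below; the same pattern applies to the other numeric fields. Joining a derived table sidesteps MySQL's restriction on re-reading the table being updated.

```sql
-- Sketch (MySQL): fill missing load_time values with the column mean.
-- AVG() ignores NULLs, so the average is computed over known values only.
UPDATE session s
JOIN (SELECT AVG(load_time) AS avg_lt FROM session) a
SET s.load_time = a.avg_lt
WHERE s.load_time IS NULL;
```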

Outlier Removal
Negative, zero and extreme values in the "load_time" and "view_time" variables were examined and removed to ensure data integrity.
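A removal step along these lines could be sketched as follows. The 60 000 ms upper bound is an assumed threshold for illustration; the source does not state the cutoff used for "extreme" values.

```sql
-- Sketch: drop rows with non-positive or implausibly large timings.
-- The 60 000 ms (60 s) cap is an assumption, not from the project.
DELETE FROM session
WHERE load_time <= 0 OR view_time <= 0
   OR load_time > 60000 OR view_time > 60000;
```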

Time Unit Conversion
The "load_time" and "view_time" variables were converted from milliseconds to seconds for consistency and easy interpretation.

Data Type Modification
Data types were inspected for categorical and numerical variables, and appropriate changes were made to enhance storage efficiency and query performance.

The data dictionary below summarises the changes:

| Field      | Before       | After                                   |
|------------|--------------|-----------------------------------------|
| session_id | INT          | INT NOT NULL AUTO_INCREMENT PRIMARY KEY |
| created_at | VARCHAR(255) | DATE                                    |
| device     | VARCHAR(255) | VARCHAR(10)                             |
| load_time  | VARCHAR(255) | DECIMAL(10,1)                           |
| view_time  | VARCHAR(255) | DECIMAL(10,1)                           |
| status     | VARCHAR(255) | TINYINT UNSIGNED                        |
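The type changes in the dictionary can be applied in one statement (a sketch; it assumes the created_at strings parse as dates and the numeric columns are clean after the steps above):

```sql
-- Sketch (MySQL): apply the target types from the data dictionary.
ALTER TABLE session
    MODIFY created_at DATE,
    MODIFY device     VARCHAR(10),
    MODIFY load_time  DECIMAL(10,1),
    MODIFY view_time  DECIMAL(10,1),
    MODIFY status     TINYINT UNSIGNED,
    MODIFY session_id INT NOT NULL AUTO_INCREMENT,
    ADD PRIMARY KEY (session_id);
```

Narrower types such as VARCHAR(10) and TINYINT UNSIGNED reduce row size, which in turn improves scan and index performance.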