Question
Which data cleaning technique is most appropriate for handling missing data when missing values are randomly distributed across a dataset?
Solution
When missing data points are randomly distributed, imputing them with the mean (for roughly symmetric continuous data) or the median (for skewed distributions) can be an effective technique. This approach keeps every row and column of the dataset available and helps reduce the bias that missing values would otherwise introduce. Substituting a measure of central tendency leaves the center of each column unchanged, so the analysis stays close to what the complete data would show, although the filled-in column will have somewhat less variance than the original.
Option A is incorrect because removing rows may lead to significant data loss, especially if many rows contain missing values. Option C is incorrect because dropping columns with missing values reduces the number of features and may discard useful information. Option D is incorrect because placeholder values can introduce bias or mislead the analysis, especially if the placeholder skews the distribution. Option E is incorrect because ignoring missing values leaves gaps in the data, making accurate analysis difficult.
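The mean and median imputation described in the solution can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive implementation: the helper names `is_missing` and `impute`, the use of `None` or NaN to mark a missing entry, and the sample values are all assumptions made for the example.

```python
import math
from statistics import mean, median


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute(values, strategy="mean"):
    """Return a copy of `values` with missing entries replaced by the
    mean or median of the observed (non-missing) entries."""
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if is_missing(v) else v for v in values]


# Roughly symmetric column: the mean is a reasonable fill value.
ages = [20, None, 30, 40]
print(impute(ages, "mean"))

# Skewed column with one large outlier: the median is less affected by it.
incomes = [30, 35, None, 40, 500]
print(impute(incomes, "median"))
```

In the second column the mean of the observed values would be pulled far upward by the outlier, which is why the median is the safer fill value for skewed data.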
The 11th edition of Exercise EKUVERIN between India and ________ will be conducted at Kadhdhoo Island.
What is the headquarters of the Asian Infrastructure Investment Bank (AIIB)?
On which date will Karnataka officially hand over kumki elephants to Andhra Pradesh?
Which is not a type of External Commercial Borrowings (ECB)?
What is the stake of NIRL in the Joint Venture with MAHAPREIT?
Why did RBI revoke the license of Purvanchal Co-operative Bank?
Which company holds the top position as the most valuable unlisted company in India according to the '2023 Burgundy Private Hurun India 500' list?
The International Hockey Federation (FIH) appointed P.R. Sreejesh and which other individual as co-chairs of the FIH Athletes Committee?
Which state is known for the traditional dish 'Bai'?
When is National Education Day celebrated in India?