Question
Which data cleaning technique is most appropriate for handling missing data when missing values are randomly distributed across a dataset?

Solution
When missing values are randomly distributed, imputing them with the mean (for roughly symmetric continuous data) or the median (for skewed distributions) can be an effective technique. This approach keeps every row and column in the dataset and limits the bias that missing values would otherwise introduce. Substituting a measure of central tendency preserves the overall structure of the data without significantly distorting it, which supports a more accurate analysis.

Option A is incorrect because removing rows can cause significant data loss, especially if many rows contain missing values.
Option C is incorrect because dropping columns with missing values reduces the number of features and may discard useful information.
Option D is incorrect because placeholder values can introduce bias or mislead the analysis, especially if the placeholder skews the distribution.
Option E is incorrect because ignoring missing values leaves gaps that make accurate analysis difficult.
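As a rough illustration of the imputation described in the solution, the sketch below fills one column with its mean and a skewed column with its median using pandas. The column names and values are hypothetical and chosen only for demonstration; they do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "age" is roughly symmetric, "income" is skewed
# by one large value. NaN marks the missing entries.
df = pd.DataFrame({
    "age": [25.0, np.nan, 35.0, 40.0],
    "income": [30000.0, 32000.0, np.nan, 250000.0],
})

imputed = df.copy()

# Mean imputation for the roughly symmetric column.
# Series.mean() skips NaN by default.
imputed["age"] = df["age"].fillna(df["age"].mean())

# Median imputation for the skewed column, so the single large
# income does not pull the filled value upward.
imputed["income"] = df["income"].fillna(df["income"].median())

print(imputed)
```

Here the median is used for the skewed column because the mean would be dragged toward the outlier. Note that filling many values with a single constant shrinks the column's variance, so this sketch fits best when only a small share of values is missing.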