Replacing missing values with the mean or median is one of the most common methods used during data wrangling. This method works best when the values are missing at random and the gaps need to be filled without introducing significant bias. The mean is often used for normally distributed data, while the median is preferred for skewed data because it is less sensitive to outliers. This technique lets analysts retain all the available data and proceed with the analysis, rather than discarding records and risking distorted statistics or machine learning models.
Option A (Remove rows with missing data) is incorrect because it can lead to a significant loss of data, especially if the missing values are scattered across the dataset. Option B (Replace missing values with zeros) is not ideal because zeros can distort the analysis, especially when zero is not a meaningful value in the context of the data. Option D (Ignore the missing values) is not recommended because it can lead to biased or inaccurate results. Option E (Use machine learning to predict missing values) is valid in advanced scenarios, but it is typically applied only after more straightforward methods such as mean/median imputation have been tried.
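As a rough illustration of the mean-versus-median choice described above, here is a minimal Python sketch that uses only the standard library. The `impute` helper and the sample `ages` and `incomes` columns are hypothetical and made up for this example. Missing values are assumed to be represented as `None`.

```python
from statistics import mean, median


def impute(values, strategy="mean"):
    """Return a copy of `values` with None entries replaced by the
    mean or median of the observed (non-missing) entries.

    Hypothetical helper for illustration only.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in values]


# Made-up sample data: `ages` is roughly symmetric, `incomes` has one outlier.
ages = [20, 22, None, 24, 26]
incomes = [30, 32, 35, None, 400]

ages_filled = impute(ages, "mean")               # gap filled with 23
incomes_median = impute(incomes, "median")       # gap filled with 33.5
incomes_mean = impute(incomes, "mean")           # gap filled with 124.25

print(ages_filled)
print(incomes_median)
print(incomes_mean)
```

In the skewed `incomes` column the single outlier pulls the mean up to 124.25, far from the typical values, while the median stays at 33.5. With pandas, the same idea is usually written as `df[col].fillna(df[col].median())`.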