Question
Which of the following is the most appropriate method to handle missing data in a dataset for predictive modeling?
Solution
Explanation: Replacing missing values with a summary statistic is a simple, widely used imputation technique: the mean for roughly symmetric continuous data, the median for skewed distributions, and the mode for categorical data. It keeps every row, so no data is lost and the dataset's structure stays intact. It is most effective when values are missing completely at random (MCAR) and only a small share of values is missing, because the fill then introduces little bias. It works less well when a high proportion of values is missing or when patterns in the missing data need to be preserved. In such cases, more advanced methods such as k-Nearest Neighbors (KNN) imputation or predictive models can be used.
Option A: Deleting rows with missing values can result in significant data loss and make the dataset less representative.
Option C: Ignoring missing data leads to inaccuracies and potential errors in analysis.
Option D: Filling with an arbitrary constant such as zero distorts the distribution of the data and introduces bias.
Option E: Duplicating rows compromises the dataset's integrity and can lead to overfitting in predictive models.
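As a rough illustration of the mean, median, and mode imputation described above, here is a minimal sketch in plain Python. The helper name `impute`, the use of `None` to mark a missing value, and the sample columns are all assumptions made for this example; they do not come from the question itself.

```python
from statistics import mean, median, mode


def impute(values, strategy):
    """Return a copy of `values` with None entries replaced by a summary statistic.

    `strategy` is one of "mean", "median", or "mode" (hypothetical names chosen
    for this sketch). None is assumed to mark a missing value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    # Pick the statistic and compute it from the observed values only.
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]


# Illustrative data, made up for this example.
ages = [20, 30, None, 40, 30]                       # roughly symmetric: use the mean
incomes = [30_000, 32_000, 31_000, None, 250_000]   # skewed by one large value: use the median
cities = ["Pune", "Delhi", "Pune", None, "Mumbai"]  # categorical: use the mode

ages_filled = impute(ages, "mean")
incomes_filled = impute(incomes, "median")
cities_filled = impute(cities, "mode")
```

In the income column the single large value would pull the mean far above the typical income, which is why the median is the safer fill for skewed data.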