Question
A data analyst is assessing a dataset with inconsistent categorical entries, such as "USA," "U.S.A," "United States," and "US" for the country field. Which of the following is the best approach for handling this inconsistency?

Solution
Standardizing categorical entries consolidates multiple spellings of the same entity into a single label. For example, mapping "USA," "U.S.A," "United States," and "US" to one uniform label, such as "United States," ensures that every entry is interpreted the same way. This step is essential in data cleaning, because inconsistent categorical values can split one group into several, which leads to inaccurate counts, skewed results, and duplicated rows in reports. A uniform categorical format enables reliable grouping, sorting, and filtering.

The other options are incorrect because:
• Option 1 (Filtering duplicates) removes identical rows but does not address inconsistent values within a single field.
• Option 2 (Using normalization) rescales numeric values; it does not reconcile categorical labels.
• Option 3 (Applying data transformation) would carry the inconsistent labels through, for example by encoding each variant as a separate category, rather than correct them.
• Option 5 (Converting to uppercase) removes case differences but leaves variants such as "U.S.A" and "UNITED STATES" distinct.
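
The consolidation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup table `CANONICAL_COUNTRY` and the helper `standardize_country` are hypothetical names, and the choice of "United States" as the canonical label follows the example in the solution. A real dataset would need a mapping built from its own observed variants.

```python
# Hypothetical lookup table: cleaned key -> canonical label.
# Keys are upper-cased with periods removed, so "U.S.A" and "usa" share a key.
CANONICAL_COUNTRY = {
    "USA": "United States",
    "US": "United States",
    "UNITED STATES": "United States",
}


def standardize_country(raw: str) -> str:
    """Return the canonical label for a country entry, if one is known."""
    # Trim whitespace, ignore case, and drop periods before the lookup.
    key = raw.strip().upper().replace(".", "")
    # Unknown values are returned trimmed but otherwise unchanged.
    return CANONICAL_COUNTRY.get(key, raw.strip())


entries = ["USA", "U.S.A", "United States", "US", "us ", "Canada"]
cleaned = [standardize_country(e) for e in entries]
print(cleaned)
```

Upper-casing alone would still leave "USA", "U.S.A", "UNITED STATES", and "US" as four separate values, which is why the sketch pairs light cleanup with an explicit mapping to one label.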