Question
A company has a large dataset with a mix of numeric and categorical data. To ensure fair comparisons between variables, which data transformation technique should the analyst apply?

1. Imputation
2. Standardization
3. Normalization
4. Encoding
5. Aggregation

Solution
Normalization is a data transformation technique that rescales numeric values to a common scale, often between 0 and 1, while retaining the relative differences between them. This is crucial when dealing with mixed data types because it allows fair comparisons between numerical variables, especially when they are measured on different scales. Normalization prevents variables with large values from dominating variables with smaller values, which matters particularly for machine learning models. When working with mixed data, normalization ensures that each variable contributes to the analysis without scale bias.

The other options are incorrect because:

• Option 1 (Imputation) deals with missing data, not rescaling variables.
• Option 2 (Standardization) adjusts for mean and variance but does not rescale values to a fixed range, which may not suit all models.
• Option 4 (Encoding) converts categorical data to numeric form but does not affect the scale of numeric variables.
• Option 5 (Aggregation) combines data points but does not standardize or normalize them.
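As a rough illustration of what min-max normalization does in practice, the sketch below rescales the numeric columns of a small, made-up pandas DataFrame to the [0, 1] range while leaving the categorical column untouched. The column names and values are hypothetical and are not taken from the question.

```python
import pandas as pd

# Hypothetical mixed dataset: two numeric columns on very different scales
# plus a categorical column that normalization leaves untouched.
df = pd.DataFrame({
    "salary":     [30000, 52000, 78000, 120000],    # large-magnitude numeric
    "age":        [22, 35, 41, 58],                  # small-magnitude numeric
    "department": ["sales", "hr", "it", "sales"],    # categorical (not rescaled)
})

def min_max_normalize(series: pd.Series) -> pd.Series:
    """Rescale a numeric series to the [0, 1] range, preserving relative differences."""
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)  # note: a constant column would need a divide-by-zero guard

numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].apply(min_max_normalize)

print(df)
# After normalization, salary and age both lie in [0, 1], so neither column
# dominates a distance-based model simply because of its original magnitude.
```

In a full pipeline, the categorical column would typically be handled separately with encoding (Option 4), since normalization applies only to numeric scales.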