Question
A company has a large dataset with a mix of numeric and categorical data. To ensure fair comparisons between variables, which data transformation technique should the analyst apply?
Solution
Normalization is a data transformation technique that rescales numeric values to a common scale, often between 0 and 1, while retaining the relative differences between them. In a dataset with mixed data types, it allows fair comparisons between numeric variables that are measured on different scales. Normalization keeps variables with large values from dominating those with smaller ones, which matters particularly in machine learning models, so that each numeric variable contributes to the analysis without scale bias.
The other options are incorrect because:
• Option 1 (Imputation) deals with missing data, not rescaling variables.
• Option 2 (Standardization) adjusts for mean and variance but does not rescale to a fixed range, which may not be suitable for all models.
• Option 4 (Encoding) converts categorical data to numeric form but doesn't affect the scales of numeric variables.
• Option 5 (Aggregation) combines data points but doesn't standardize or normalize them.
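As a rough illustration of the rescaling described above, here is a minimal sketch of min-max normalization in plain Python. The function name `min_max_normalize` and the sample income and age figures are hypothetical, chosen only for this example; they do not come from the question.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale;
        # returning zeros is an assumption made for this sketch.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Hypothetical columns on very different scales.
incomes = [20000, 50000, 80000]
ages = [20, 35, 50]

print(min_max_normalize(incomes))  # [0.0, 0.5, 1.0]
print(min_max_normalize(ages))     # [0.0, 0.5, 1.0]
```

After rescaling, both columns lie in the same 0 to 1 range, so neither dominates purely because of its units.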
If Aman and Bhanu together earn Rs. ___ per month, and their incomes increase by 25% and 12.5% respectively, Bhanu's new income b...
What will come in place of ?
4, 5, 13, 40, 104, ?.
I. x² + 31x + 238 = 0
II. 2y² + 70y + 612 = 0
What does "M" stand for in the term CDMA?
The value of tan 225° is same as the value of:
In a right triangle, the sides are in the ratio 3:4:5. A circle is inscribed in the triangle. What is the area of the triangle covered by the circle?
(5.5% of 470) – (6.5% of 510) + 2 = ?
If a + b + c = 18 and a³ + b³ + c³ - 3abc = 54, then find the value of 5(ab + bc + ca).
The value of cos² 10° + cos² 20° + cos² 30° + … + cos² 90° is
If x = 3 tan t, y = 3 sec t, then the value of d²y/dx² at t = π/4 is: