Which encoding technique creates binary indicators for each category in nominal data?

Prepare for the GARP Risk and AI (RAI) Exam. Master concepts with flashcards and multiple-choice questions, each with hints and clarifications. Get exam-ready with extensive practice!

Multiple Choice

Which encoding technique creates binary indicators for each category in nominal data?

Explanation:
When dealing with nominal data, you want to convert categories into numeric features without implying any order between them. One-hot encoding achieves this by creating a separate binary feature for each category and placing a 1 in the column that corresponds to the category while placing 0s in the others.

For example, a feature like fruit with categories apple, banana, and cherry becomes three columns: fruit_apple, fruit_banana, and fruit_cherry. If the observation is banana, you’d have 0, 1, 0 in those columns, respectively. This representation keeps the categories distinct and imposes no ordinal relationship between them, which matters because models that take numeric inputs would otherwise read integer codes such as 1, 2, 3 as ordered magnitudes.
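The fruit example can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper name one_hot_encode and its prefix parameter are hypothetical, and the columns are ordered alphabetically only so the output is deterministic.

```python
def one_hot_encode(values, prefix):
    """Return (column_names, rows) with one binary column per distinct category."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    columns = [f"{prefix}_{category}" for category in categories]
    # Each row gets a 1 in the column matching its category and 0 everywhere else.
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return columns, rows


# Hypothetical sample data matching the fruit example in the text.
fruits = ["apple", "banana", "cherry", "banana"]
columns, rows = one_hot_encode(fruits, prefix="fruit")

print(columns)
for fruit, row in zip(fruits, rows):
    print(fruit, row)
```

Every row contains exactly one 1, so no category is ranked above another. In practice a library routine such as pandas.get_dummies or scikit-learn's OneHotEncoder would usually do this job.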

The other options aren’t encoding techniques. Categorical Data refers to the type of data, not a method to convert it. Data Scaling is about adjusting numeric features to a common scale. Imputation addresses missing values, not how to represent categories numerically.
