Feature Engineering

Feature engineering is the process of transforming raw data into inputs that a machine learning model can actually use. Raw data (a timestamp, a street address, a block of text) often isn't directly useful to a model. Feature engineering turns that timestamp into "day of week" and "hours since last purchase," that address into latitude and longitude, and that text into numerical representations.

Good feature engineering requires domain knowledge. Someone who understands the problem space can create features that capture meaningful patterns, giving the model a significant head start. A skilled engineer working on fraud detection, for example, might create features like "number of transactions in the last hour" or "distance from previous transaction location": signals that raw transaction records don't explicitly contain.

Historically, feature engineering was where data scientists spent much of their time. Deep learning has reduced this burden for some use cases, because neural networks can learn useful representations directly from raw data in domains like images and text. But for tabular data, time series, and many business applications, thoughtful feature engineering remains one of the most effective ways to improve model performance. It's also one of the areas where human expertise is hardest to replace with automation.
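To make the timestamp and fraud-detection examples concrete, here is a minimal sketch in Python using only the standard library. The `Transaction` record, its field names, and the `engineer_features` helper are assumptions made for illustration, not part of any particular library or production pipeline. It derives "day of week," "hours since last purchase," "number of transactions in the last hour," and "distance from previous transaction location" from raw records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Transaction:
    # Hypothetical raw record; the field names are assumptions for illustration.
    account: str
    timestamp: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def engineer_features(transactions):
    """Return one feature dict per transaction, in chronological order."""
    history = {}  # account -> that account's earlier transactions
    rows = []
    for tx in sorted(transactions, key=lambda t: t.timestamp):
        previous = history.setdefault(tx.account, [])
        last = previous[-1] if previous else None
        window_start = tx.timestamp - timedelta(hours=1)
        rows.append({
            # 0 = Monday ... 6 = Sunday
            "day_of_week": tx.timestamp.weekday(),
            # None when the account has no earlier transaction
            "hours_since_last": (
                (tx.timestamp - last.timestamp).total_seconds() / 3600 if last else None
            ),
            # Earlier transactions on the same account within the past hour
            "tx_count_last_hour": sum(1 for p in previous if p.timestamp >= window_start),
            "km_from_previous": (
                haversine_km(last.lat, last.lon, tx.lat, tx.lon) if last else None
            ),
        })
        previous.append(tx)
    return rows


# Made-up sample data: two purchases in New York, then one in Los Angeles.
sample = [
    Transaction("acct-1", datetime(2024, 1, 1, 9, 0), 40.7128, -74.0060),
    Transaction("acct-1", datetime(2024, 1, 1, 9, 30), 40.7128, -74.0060),
    Transaction("acct-1", datetime(2024, 1, 1, 12, 30), 34.0522, -118.2437),
]
features = engineer_features(sample)
```

In this made-up sample, the third row pairs a gap of only a few hours with a jump of thousands of kilometres. Neither value appears in the raw records, which is the kind of signal the fraud-detection example describes.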