A near-infrared spectroscopy system that identifies material composition — Paper, Wood, Plastic, Metal — in real time.
Three students built this end-to-end spectroscopy pipeline from scratch: designing the cardboard enclosure, wiring the AS7263 sensor to the ESP32, collecting training data, and writing the ML classifier.
An LED driven at 25 mA illuminates the target material. The AS7263 sensor captures the reflected light across six near-infrared channels (610–860 nm), producing a characteristic spectral fingerprint for each material type.
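Raw channel values are expanded into 21 features: the 6 raw readings, total intensity, 6 normalized ratios, 5 inter-channel gradients (differences between consecutive channels), and 3 statistical descriptors (mean, standard deviation, maximum). Together they capture the shape of each reflection curve, not just its amplitude.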
A 150-tree Random Forest classifier, trained on hundreds of labelled samples with oversampling for class balance, identifies the material in real time over a live serial stream from the ESP32.
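The class-balancing step can be sketched as follows. The source does not say which oversampling method was used, so this assumes plain random oversampling (duplicating minority-class rows with replacement); the function name `oversample` and the toy readings are hypothetical. It uses only the standard library so the idea stays visible without the classifier itself.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (assumed method)."""
    rng = random.Random(seed)

    # Group rows by their material label.
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    target = max(len(rows) for rows in by_class.values())

    out_samples, out_labels = [], []
    for label, rows in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            out_samples.append(row)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data: three Paper rows, one Wood row, one Metal row.
samples = [[1200, 800], [1180, 790], [1210, 805], [400, 900], [300, 310]]
labels = ["Paper", "Paper", "Paper", "Wood", "Metal"]

balanced_samples, balanced_labels = oversample(samples, labels)
counts = Counter(balanced_labels)
print(counts)
```

Oversampling is normally applied to the training split only, so that duplicated rows cannot leak into the evaluation set.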
The AS7263 outputs calibrated intensity values across six NIR bands simultaneously. Channel R at 610 nm dominates for most materials, while the ratios between channels encode the material identity. Gain: 64×, Integration time: 168 ms.
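A small sketch of why channel ratios are a useful signal: dividing each channel by the total intensity removes overall brightness, so the same surface read at two illumination levels gives the same normalized profile. The readings below are invented for illustration, and `normalize` is a hypothetical helper, not the project's code.

```python
CHANNELS = ["R", "S", "T", "U", "V", "W"]


def normalize(reading):
    """Scale each channel by the summed intensity of all six channels."""
    # Fall back to 1 for an all-zero reading to avoid division by zero.
    total = sum(reading[ch] for ch in CHANNELS) or 1
    return {ch: reading[ch] / total for ch in CHANNELS}


# Hypothetical readings: the same material, measured dim and twice as bright.
dim = {"R": 400, "S": 200, "T": 150, "U": 100, "V": 100, "W": 50}
bright = {ch: 2 * value for ch, value in dim.items()}

dim_profile = normalize(dim)
bright_profile = normalize(bright)

# The raw values differ by a factor of two, but the profiles agree.
for ch in CHANNELS:
    print(ch, dim[ch], bright[ch], round(dim_profile[ch], 3), round(bright_profile[ch], 3))
```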
A region-of-interest (ROI) is extracted from the camera feed and analysed for RGB channel gradients across pixel positions. This visual fingerprint complements the NIR spectral data, providing an additional discriminative signal for material classification.
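One way the ROI analysis could work is sketched below. The source does not describe how the gradients are computed, so this assumes each colour channel is averaged down the ROI's columns and then differenced between adjacent pixel positions. The image is a synthetic nested list of `(r, g, b)` tuples rather than a real camera frame, and `roi_channel_gradients` is a hypothetical name.

```python
def roi_channel_gradients(image, top, left, height, width):
    """Return per-channel gradients across the ROI's horizontal pixel positions.

    image is a list of rows, each row a list of (r, g, b) tuples.
    The result is [red_gradients, green_gradients, blue_gradients].
    """
    # Crop the region of interest.
    roi = [row[left:left + width] for row in image[top:top + height]]

    profiles = []
    for c in range(3):  # 0 = R, 1 = G, 2 = B
        # Average each column so the profile is less sensitive to row noise.
        col_means = [sum(row[x][c] for row in roi) / len(roi) for x in range(width)]
        profiles.append(col_means)

    # First difference between adjacent columns (assumed gradient definition).
    return [[b - a for a, b in zip(p, p[1:])] for p in profiles]


# Synthetic 4x6 frame: red rises by 10 per column, green is flat, blue falls by 5.
image = [[(10 * x, 128, 100 - 5 * x) for x in range(6)] for _ in range(4)]

grads = roi_channel_gradients(image, top=1, left=1, height=2, width=4)
print(grads)
```

The resulting gradient lists could be appended to the NIR feature vector, though how the two signals are actually combined is not stated in the source.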
import pandas as pd


def engineer_features(X_raw):
    """
    Transforms raw spectroscopy channels into relational features
    to capture the 'shape' of the material reflection curve.
    """
    X_eng = pd.DataFrame(index=X_raw.index)
    raw_channels = ['R', 'S', 'T', 'U', 'V', 'W']

    # 1. Retain the original raw values
    for col in raw_channels:
        X_eng[col] = X_raw[col]

    # 2. Total Spectral Footprint Intensity
    #    (zeros are replaced with 1 to avoid division by zero in step 3)
    X_eng['Total_Intensity'] = X_raw[raw_channels].sum(axis=1).replace(0, 1)

    # 3. Normalized Curves (Ratios relative to total brightness)
    for col in raw_channels:
        X_eng[f'{col}_norm'] = X_raw[col] / X_eng['Total_Intensity']

    # 4. Spectral Gradients (differences between consecutive channels)
    X_eng['R_S_diff'] = X_raw['R'] - X_raw['S']
    X_eng['S_T_diff'] = X_raw['S'] - X_raw['T']
    X_eng['T_U_diff'] = X_raw['T'] - X_raw['U']
    X_eng['U_V_diff'] = X_raw['U'] - X_raw['V']
    X_eng['V_W_diff'] = X_raw['V'] - X_raw['W']

    # 5. Row-wise Statistical Profiles
    X_eng['Spectral_Mean'] = X_raw[raw_channels].mean(axis=1)
    X_eng['Spectral_Std'] = X_raw[raw_channels].std(axis=1)
    X_eng['Spectral_Max'] = X_raw[raw_channels].max(axis=1)

    return X_eng