Explore how counterfactual token weighting compares to heuristic masking for preventing student models from inheriting teacher-conditioned artifacts.