Explanation methods that help users understand and trust machine-learning models often describe how much certain features used in the model contribute to its prediction. For example, if a model predicts a patient’s risk of developing cardiac disease, a physician might want to know how strongly the patient’s heart rate data influences that prediction. But if those features are so complex or convoluted that the user can’t understand them, does the explanation method do any good? MIT researchers are striving to improve the interpretability of features so decision makers will be…
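The article does not name a specific attribution method, but the idea of measuring how much a feature contributes to a prediction can be sketched with a simple mean-ablation score: replace one feature with its dataset average and see how much the model's output shifts. Everything below is a hypothetical illustration — the `risk_model` weights, the patient records, and the `ablation_importance` helper are all made up for this sketch, not taken from any real clinical model or from the MIT work described here.

```python
def risk_model(heart_rate, age):
    # Hypothetical linear scorer; weights chosen only for illustration.
    return 0.02 * heart_rate + 0.01 * age

def ablation_importance(model, rows, feature):
    """Score a feature by replacing it with its dataset mean and
    measuring the average absolute change in the model's prediction."""
    mean_val = sum(r[feature] for r in rows) / len(rows)
    base = [model(**r) for r in rows]
    ablated = [model(**{**r, feature: mean_val}) for r in rows]
    return sum(abs(b - a) for b, a in zip(base, ablated)) / len(rows)

# Toy patient records (synthetic).
patients = [
    {"heart_rate": 60, "age": 40},
    {"heart_rate": 90, "age": 55},
    {"heart_rate": 110, "age": 70},
]

hr_score = ablation_importance(risk_model, patients, "heart_rate")
age_score = ablation_importance(risk_model, patients, "age")
```

In this toy setup, heart rate both varies more across patients and carries a larger weight, so its ablation score comes out higher than age's — the kind of per-feature contribution a physician would be shown. The interpretability question the researchers raise is separate: scores like these only help if the features themselves (here, raw heart rate) are ones the user can understand.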