Transfer Learning Capable Symbolic Regression
Conference Year
January 2020
Abstract
The ever-growing accumulation of data makes the automated distillation of understandable models from that data ever more desirable. Deriving equations directly from data using symbolic regression, as performed by genetic programming, remains appealing because of its algorithmic simplicity and its lack of assumptions about equation form. However, genetic programming has not yet been shown to be capable of transfer learning: the ability to rapidly and successfully distill equations from new data in a previously unseen domain, owing to experience distilling equations in other domains. Given neural networks' proven capacity for transfer learning, here we introduce a neural architecture that, after training, iteratively rewrites an inaccurate equation given its current error, regardless of the domain. We found that networks trained on data from several training domains can improve their ability to derive equations from data produced by a test domain. Although this phenomenon did not arise in all cases we tested, it suggests that symbolic regression can more rapidly distill equations from data when exposed to data from a growing set of domains.
Primary Faculty Mentor Name
Jim Bagrow
Secondary Mentor Name
Josh Bongard
Faculty/Staff Collaborators
Jim P. Bagrow (Advisor), Josh Bongard (Advisor)
Status
Graduate
Student College
College of Engineering and Mathematical Sciences
Program/Major
Mathematical Sciences
Primary Research Category
Engineering & Physical Sciences
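The abstract above describes a network that repeatedly rewrites an inaccurate equation based on its current error. Below is a minimal, self-contained sketch of that rewrite-and-evaluate loop; the polynomial representation, the propose_rewrite stand-in policy, and all function names are illustrative assumptions made for this sketch, not the authors' trained architecture.

```python
import numpy as np

# Minimal sketch of an iterative equation-rewriting loop.
# The candidate equation is represented as polynomial coefficients, and a
# random perturbation policy stands in for the trained neural rewriter;
# both are assumptions made purely for illustration.

rng = np.random.default_rng(0)

def evaluate(coeffs, x):
    """Evaluate the candidate equation (a polynomial) at the points x."""
    return np.polyval(coeffs, x)

def error(coeffs, x, y):
    """Mean squared error of the candidate equation on the data."""
    return float(np.mean((evaluate(coeffs, x) - y) ** 2))

def propose_rewrite(coeffs, current_error):
    """Stand-in for the trained network: propose an edit to the equation
    given its current error (here, a small error-scaled random step)."""
    step = rng.normal(scale=np.sqrt(current_error) * 0.1, size=coeffs.shape)
    return coeffs + step

def fit_equation(x, y, degree=2, max_steps=500, tol=1e-4):
    """Iteratively rewrite an inaccurate equation until its error is small."""
    coeffs = np.zeros(degree + 1)            # start from an inaccurate equation
    best_err = error(coeffs, x, y)
    for _ in range(max_steps):
        if best_err < tol:
            break
        candidate = propose_rewrite(coeffs, best_err)
        cand_err = error(candidate, x, y)
        if cand_err < best_err:              # keep rewrites that reduce error
            coeffs, best_err = candidate, cand_err
    return coeffs, best_err

# Usage example: recover y = 3x^2 - 2x + 1 from noisy observations.
x = np.linspace(-1, 1, 100)
y = 3 * x**2 - 2 * x + 1 + rng.normal(scale=0.01, size=x.shape)
coeffs, err = fit_equation(x, y)
print(coeffs, err)
```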