After reading this from John Langford:
The Deep Learning problem remains interesting. How do you effectively learn complex nonlinearities capable of better performance than a basic linear predictor? An effective solution avoids feature engineering. Right now, this is almost entirely dealt with empirically, but theory could easily have a role to play in phrasing appropriate optimization algorithms, for example.
Does this sound related to modeling the deep interactions you often talk about? (I [Jimmy] never understand the stuff on hunch, but thought that might be so?)
My reply: I don’t understand that stuff on hunch so well either–he uses slightly different jargon than I do! That said, it looks interesting and important so I’m pointing you all to it.