What are the "no free lunch" theorems?

The "no free lunch" theorems assert that, averaged over all possible learning tasks, every learning algorithm performs equally well. An algorithm that does better than chance at predicting some sequences must "pay for lunch" by doing worse on other sequences.
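As a toy illustration of what this averaging means, consider next-bit prediction on binary sequences. The sketch below (a minimal example with made-up predictor names, not drawn from the theorems' original papers) enumerates every binary sequence of a fixed length and shows that any deterministic prediction rule averages exactly 50% accuracy across all of them.

```python
# Toy illustration (hypothetical example): averaged uniformly over ALL binary
# sequences of a given length, every deterministic next-bit predictor gets
# exactly 50% accuracy, no matter how clever its rule is.
from itertools import product

def predict_repeat_last(history):
    """Toy predictor: guess that the next bit repeats the previous one (0 if no history)."""
    return history[-1] if history else 0

def predict_always_one(history):
    """Toy predictor: always guess 1."""
    return 1

def average_accuracy(predictor, n=8):
    """Average per-bit accuracy of `predictor` over all 2**n binary sequences of length n."""
    correct = 0
    for seq in product([0, 1], repeat=n):
        for i, actual in enumerate(seq):
            if predictor(seq[:i]) == actual:
                correct += 1
    return correct / (n * 2 ** n)

print(average_accuracy(predict_repeat_last))  # 0.5 -- exactly chance
print(average_accuracy(predict_always_one))   # 0.5 -- exactly chance
```

Whatever the predictor guesses at a given point, the sequences that agree with it are exactly balanced by the sequences that disagree, so the average can never rise above chance.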

Some have argued that these theorems imply that fully general intelligence is impossible, and therefore worries about AGI are overblown.

However, "no free lunch" holds only on the entire set of all theoretically possible sequences. The ones our algorithm does worse at may just be fully random, or designed to trick it. But if we start out knowing that the environment that our algorithm operates in has a certain structure, then the “no free lunch” results are not an impediment to designing algorithms with superior predictive or optimizing abilities.

Therefore, for practical AI design purposes, these theorems are often irrelevant. We aren't interested in "predicting" completely random sequences, and we don't mind if another algorithm outperforms us on that "task". No system can be so general as to perform well in every possible universe, but AGI is only required to perform well in one universe - ours. Our universe is far from maximally random, and its laws provide a lot of structure that lets us make good predictions while "paying for lunch" in other possible universes without that structure.

"No free lunch" hasn't prevented humans from exploiting the universe's structure for research and development, and won't prevent artificial systems from doing so much more effectively. The generality needed for AGI to exceed human abilities across the board is not the same kind of generality forbidden by these theorems.