Beyond Program Synthesis

2021.08.12

Generating source code from natural language is not formal artificial intelligence, even though it sounds similar to some of what I have described about it before.

To help illustrate, let's take the source generation idea all the way to its logical conclusion.

Consider a hypothetical piece of software that listens to you describe your problem and then answers back with a program or circuit design that would solve it. This software would still not be formal artificial intelligence.

For a system to qualify as formal AI, it would need to be possible to pause it at any time and get a snapshot of its progress, among several other conditions. One of those conditions is that the snapshot must be a partial solution that already has all the essential properties we would want of a complete solution.

Formal AI does not operate on vector spaces, neural networks, or other abstractions. Instead, it recursively synthesizes, refines, and elaborates on internal representations of source code, which must be capable, at every iteration, of being transformed into human-readable form. And not merely legible, but comprehensible: built from data structures and algorithms that we can follow.
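
To make this concrete, here is a minimal sketch in Haskell of what such an internal representation might look like. The `Expr` type and the `render` and `refine` functions are hypothetical stand-ins, not part of any real system; the point is only that the working state is an ordinary data structure describing a program, that it can be printed in readable form at any iteration, and that each refinement pass maps a complete program to a complete program.

```haskell
-- A toy program representation: an ordinary data structure, inspectable at
-- every step. (Entirely hypothetical; a stand-in for a real representation.)
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Show, Eq)

-- Any snapshot of the representation can be rendered human-readable.
render :: Expr -> String
render (Lit n)   = show n
render (Add a b) = "(" ++ render a ++ " + " ++ render b ++ ")"
render (Mul a b) = "(" ++ render a ++ " * " ++ render b ++ ")"

-- One refinement pass: constant folding, standing in for a single step of
-- "improvement". Every pass takes a complete program to a complete program.
refine :: Expr -> Expr
refine (Add (Lit x) (Lit y)) = Lit (x + y)
refine (Mul (Lit x) (Lit y)) = Lit (x * y)
refine (Add a b)             = Add (refine a) (refine b)
refine (Mul a b)             = Mul (refine a) (refine b)
refine e                     = e

main :: IO ()
main =
  -- Pausing after any number of passes still yields a readable, complete program.
  mapM_ (putStrLn . render) (take 3 (iterate refine (Mul (Add (Lit 1) (Lit 2)) (Lit 4))))
```

Running it prints the program before and after each pass, which is exactly the kind of snapshot described above: "((1 + 2) * 4)", then "(3 * 4)", then "12".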

A formal AI system of sufficient sophistication could be given a legible computer program made by any means, including by a human, and then recursively improve upon it for as much time as one was willing to wait. And I do not just mean the kind of improvement an optimizing compiler would make, but the kind that the most effective hypothetical team of programmers would make if tasked with the job.

I am staying within human terms of intelligence so as not to get sidetracked by debates about the effectiveness of artificial intelligence and related measures. We can safely assume that no one knows everything, and that a team of the best experts in your field would likely be able to at least improve upon what you have made. That is enough to illustrate the point, and the point only grows stronger the more effective the formal AI becomes.

The process of iteratively improving upon complete computer programs in source code form is one of the defining characteristics of formal artificial intelligence. Merely generating source code is not enough. The entire process of search, refinement, and inference operates over representations of well-formed programs, both to ensure the correctness of each component and to guide the evolution of the program in a controlled way.

That means it is possible to control the recursive improvement of formal AI up to arbitrary complexity by constraining it within a type-directed framework. This is a candidate solution to the local control problem, and it is likely as good a solution as we are going to get, since the global version of the control problem has no solution.
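
As a sketch of what "type-directed" could mean here, again in Haskell and again hypothetical (the GADT `Expr`, `rewrite`, and `eval` below are illustrations, not a proposal for an actual framework): the representation is indexed by the type of the program it denotes, so ill-typed candidates cannot be constructed at all, and any rewrite step is forced by its own signature to preserve the program's type. The only assumption is the standard GHC `GADTs` extension.

```haskell
{-# LANGUAGE GADTs #-}

-- A hypothetical typed representation: the index `t` is the type of the
-- program fragment it denotes, so ill-typed candidates cannot be built at all.
data Expr t where
  LitI :: Int  -> Expr Int
  LitB :: Bool -> Expr Bool
  Add  :: Expr Int -> Expr Int -> Expr Int
  If   :: Expr Bool -> Expr t -> Expr t -> Expr t

-- A rewrite step is constrained by its signature: whatever improvement it
-- makes, it must return a program of the same type it was given.
rewrite :: Expr t -> Expr t
rewrite (If (LitB True)  t _)   = rewrite t
rewrite (If (LitB False) _ e)   = rewrite e
rewrite (Add (LitI x) (LitI y)) = LitI (x + y)
rewrite (Add a b)               = Add (rewrite a) (rewrite b)
rewrite (If c t e)              = If (rewrite c) (rewrite t) (rewrite e)
rewrite e                       = e

-- Because every representable program is well-typed, evaluation needs no
-- "type error at runtime" case.
eval :: Expr t -> t
eval (LitI n)   = n
eval (LitB b)   = b
eval (Add a b)  = eval a + eval b
eval (If c t e) = if eval c then eval t else eval e

main :: IO ()
main = print (eval (rewrite (If (LitB True) (Add (LitI 1) (LitI 2)) (LitI 0))))
-- prints 3
```

The constraint is local and mechanical: the system can search, refine, and rewrite as aggressively as it likes, but every intermediate result it can even express is a well-typed program of the intended type.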

So how does this tie in with the original example of the oracle software that produces a program given a natural language description? The difference is that formal AI involves programs every step of the way, much as a person would edit a program, as opposed to merely generating them.

It is valuable to be able to iteratively improve upon programs because that makes it possible for us to work with formal AI at every step. We can use as little or as much automation as we want, and in the specific areas where we need it. The rest of the time we can retain control over the process.

The other aspect is much more economic in nature. By operating on programs, the formal AI process itself becomes amenable to modularization and reuse. We can make progress and share results as programs. The automation brought by formal AI integrates into the entire ecosystem of developer tools, programming languages, and libraries that build the world today. Nothing has to change significantly: software continues to play its role in changing the world, but now with formal AI systems as participants.

With all of these results also come the benefits of knowledge. We may be able to learn something by looking at the programs created by formal AI. That idea is complex enough that it will have to be saved for another time, but the gist is that source code is a form of knowledge, and that we regularly advance technology by sharing ways of building programs that we had not thought of before.

With formal AI as an assistant and guide, we may be able to reach new ways of thinking about our problems and explore solutions in ever more complex domains, limited only by our trust in these systems. It is a matter of trust because we may reach a point where we can no longer follow the programs generated by the automation we have created, not even with teams of people working around the clock. We will then have the choice of either artificially limiting progress or allowing it to blaze a trail for us to follow.