Thoughts on AI

Thoughts on the future of humanity, usually posted while I am drunk.

Monday, December 13, 2010

Dr. Zorg's thinking machine.

So I have been checking out Prolog lately. Prolog is a programming language from the early 1970s. It is based on a model fundamentally different from all the popular modern (procedural) programming languages, in that it's declarative. The most succinct explanation I have heard is: "You tell procedural languages how to solve a problem. You tell declarative languages a set of declarations, then you query them and let the computer solve the problem." On a pragmatic level, what it looks like is that you make declarations, which you then query. For instance:

assert(likes(bob, alice)).

Is a declaration that the relationship "likes" exists from Bob to Alice, i.e. that Bob likes Alice. You can then query it by asking:
?- likes(bob, alice).
>true
?- likes(X, alice).
>X = bob
Those queries could be read "does Bob like Alice?" and "who likes Alice?"
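To make the declare-then-query model concrete for readers who only know procedural languages, here is a minimal sketch in Python (not Prolog — all the names here are illustrative inventions): facts are stored as tuples, and a query pattern with `None` playing the role of a Prolog variable returns every matching binding.

```python
# A tiny fact store: each fact is a (relation, subject, object) tuple.
facts = {("likes", "bob", "alice")}

def query(rel, a, b):
    """Return all (subject, object) pairs matching the pattern.
    None acts as an unbound variable, like X in Prolog."""
    return [(x, y) for (r, x, y) in facts
            if r == rel
            and (a is None or a == x)
            and (b is None or b == y)]

# "does bob like alice?" -> a non-empty result means true
print(query("likes", "bob", "alice"))  # [('bob', 'alice')]
# "who likes alice?" -> the first element of each pair is a binding for X
print(query("likes", None, "alice"))   # [('bob', 'alice')]
```

Of course, real Prolog adds unification, rules, and backtracking on top of this, but the basic shape — state facts, then pattern-match queries against them — is the same.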

There are many more features: recursion, lists, etc. It's a Turing-complete language — you can do anything in Prolog that you can do in a procedural language — but its model is fundamentally different.

As far as market share and real use go, Prolog resides in the dustbin of CS history. Like any 40-year-old language, the debugging is nightmarish. Also, it's hard to formulate normal problems in its terms, and unlike in procedural languages, the programmer feels completely divorced from the computational complexity of their own code... It's unclear how to write for performance. The current crowd who keep it alive are the math nerds, the theorists. While this is great, there is an air of academia around it; it has the sense of one of those academic languages that asks what its users can do for it, not what it can do for its users.

But is it a failure? Not at all. The thing about Prolog that matters is when it came out: around the time of Codd's relational model. There was active, pragmatic research at the time into the question of how to store data. The relational model had the advantage of coming out of IBM, and thus of being presented with pragmatic business concerns at the forefront. As a result, it is the model for all the relational databases of our time, and will probably be around 100 years from now. It's popular and here to stay because it's too useful for too many things to go away, and those who really understand it understand it's not about tables: it's expressive enough for any object-oriented language and beyond (as argued in Date and Darwen's Third Manifesto), even though SQL is likely to go the way of the dodo.

So how does this relate to Prolog? It's simple. 90% of current solutions have two parts: a procedural part (manifest in the code — Java, C#, whatever) and a declarative part (manifest in the database that the app calls on, usually through SQL). This separation isn't by design; it's just that nobody ever came up with a language that is both procedural and declarative.

(I have to back up here and note that programming language classifications are ill-defined. Strictly speaking, Java, C#, and C++ are object-oriented, not procedural. But all OO talks about is how data is encapsulated. The difference between C++ and C is much smaller than the difference between C++ and Prolog/SQL. So when I say procedural, I am referring to all the languages where the programmer specifies a procedure for producing an output, rather than a set of declarations and queries upon them, as in Prolog and SQL.)

Kudos to Microsoft for trying, but it really hasn't been done yet. The next language to die will be SQL, and its replacement will be a unified Turing-complete language with both procedural/OO and declarative (SQL-like) behavior. In this language, every library (DLL) will be a database file, and every database will be a library. Learning new APIs and learning new schemata will be one and the same. The database admin will be the programmer and vice versa. The user who can edit certain database tables through a GUI will be a (highly limited) code contributor. Object access control and database security will be one and the same.

The key is that part about users as code contributors. This is about seeing clearly what's already going on: application users dictating application behavior. Good web apps do it — popular news stories go straight to the top, etc. But the next step is beyond that: things like CMS systems where the users really are application designers, with some restrictions. The clarity and introspection of the relational model offer promise for doing this securely, avoiding the insanity of letting users write procedures, with all their ambiguity and shadowy behavior.

Next Steps: Obstacles to be Overcome

To create such a language, we must borrow some things Prolog got right. You need the idea of a virtual table. This is a database table that doesn't actually exist on disk but is constructed in memory, and it can answer queries about itself, such as membership. Consider the following Prolog:
assert((cubed(X, Y) :- Y is X*X*X)).
This reads "if Y equals X times X times X, then the relationship 'cubed' exists between X and Y." Prolog then answers the query cubed(3, 27) as 'true', and answers cubed(3, Y) with 'Y = 27'. However, it is unable to answer the query cubed(X, 27), because it is not smart enough to invert the definition and compute the cube root of 27. So the answer is like a table you can run queries on, except the query SELECT * FROM `cubed` WHERE Y = 27; just doesn't work: you can search by X, but not by Y.
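The one-directional behavior of that virtual table can be sketched in Python (an illustration of the idea, not a real Prolog engine — the function name and conventions are my own): when X is bound, the table can check or produce Y, but with X unbound it simply gives up, just as Prolog does.

```python
# A "virtual table" for the relation cubed(X, Y): no rows are stored;
# answers are computed on demand, but only when X is bound.

def cubed(x=None, y=None):
    if x is not None:
        result = x * x * x
        if y is None:
            return result          # cubed(3, Y) -> Y = 27
        return result == y         # cubed(3, 27) -> true
    # cubed(X, 27): we never wrote the inverse, so the query fails,
    # like Prolog throwing an instantiation error.
    raise ValueError("cannot solve for X: no inverse defined")

print(cubed(3, 27))  # True
print(cubed(3))      # 27
```

Calling `cubed(y=27)` raises the error — the table is queryable in one direction only, exactly the limitation the SQL example above runs into.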

This is a fundamental issue with exposing procedural computations as tables: some can be inverted, some can't. It would be easy to write in cube-root functionality, but larger issues exist, such as when a function can be computed in P time but its inversion only in NP time.

The key here is to think of it in terms of database access limitations. If a DB admin says you can look up a customer's name from a provided SS number but not an SS number from a provided name, that's a rule you can do your job around. If God says you can look up an integer product in polynomial time but not its factors, that's another rule you can do your job around.

You do want the language to be able to find inversions, simplifications, and so on. How doable that is, I do not know. In its simple form, you want users to be able to provide an inversion f'(x) for their own function f(x), and you want it caught if f(f'(x)) != x. Ideally you want to be able to prove at compile time that the provided f'(x) is indeed the inversion of f(x). The difference between proving and producing the inversion of a function seems slim.
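The runtime half of that check is easy to sketch in Python (the helper name and the sampling approach are my own — this is spot-checking, not the compile-time proof the paragraph hopes for): given f and a user-supplied candidate inverse f', test f(f'(x)) == x on sample values.

```python
# Spot-check that f_inv is a right inverse of f on some sample outputs.
# A real language would want a compile-time proof; this only samples.

def check_inverse(f, f_inv, samples):
    """Return True if f(f_inv(x)) == x for every sampled x."""
    return all(f(f_inv(x)) == x for x in samples)

cube = lambda x: x ** 3
cube_root = lambda x: round(x ** (1 / 3))  # assumes exact-cube inputs

print(check_inverse(cube, cube_root, [1, 8, 27, 64]))   # True
print(check_inverse(cube, lambda x: x, [8]))            # False: 8**3 != 8
```

A wrong candidate is caught as soon as one sample disagrees, which is exactly the "catch it if f(f'(x)) != x" behavior described above, minus any guarantee for unsampled inputs.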

The other thing you need is a language oriented natively toward collections and sets. Set constructors need to be common objects, especially where inverting functions is concerned — the kind of functions that lose information, like 27 % 5 = 2. Inverting that: which Y satisfy Y % 5 = 2? The answer is given by the set constructor n*5 + 2 for n in N. Even though it is infinite, this constructor will list every Y that satisfies Y % 5 = 2. It acts as if you had queried a table and the set of rows satisfying the condition were infinite, but it exists only as a tiny little function in memory.

The deep integration with sets, where operations are performed not in the order specified by a procedure but simultaneously, is an important part of the new, vastly multi-core computing paradigm. The key to such a language is to make the programmer aware of his or her interaction with the multi-core architecture, so he or she can plan around it consciously — not to make the adaptation happen invisibly behind the scenes, which is a mistake I think current solutions are making.
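As a small illustration of explicit (rather than hidden) parallelism over a set, here is a Python sketch using the standard library's executor pool — chosen only for brevity; a set-native language would make this the default evaluation strategy rather than an opt-in construct:

```python
from concurrent.futures import ThreadPoolExecutor

def cube(x):
    return x ** 3

data = {1, 2, 3, 4, 5}

# The pool is named explicitly in the code: the programmer sees and
# controls the parallelism instead of having it hidden behind the scenes.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = set(pool.map(cube, data))

print(sorted(results))  # [1, 8, 27, 64, 125]
```

Because the operation is applied over an unordered set, no element's result depends on another's, which is precisely what makes the simultaneous evaluation safe.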

This post was completed about a week ago but not published until now.

