Poker and Other Bluffing Games

In July of 2007, a computer program called Polaris played against two of the best human poker players, Phil Laak Four Acesand Ali Aslami, in a four day tournament of limit Texas Hold ‘em. Oddly for poker, the match was carefully designed so that the result could not be attributable to good or bad cards.

To do this, two copies of the program played on separate computers in separate rooms. One computer played against Phil Laak in one room and the other computer played against Ali Aslami in the other room. The decks were arranged so that the cards dealt in one room were identical to the cards dealt in the other room. However, the cards given to Polaris in one room were given to the human in the other, and vice versa. This ensured that if the cards favored Polaris in one room, then they favored the human in the other room by exactly the same amount (since they were the same cards). As a result, whether Polaris won or not, it could not be because it received good or bad cards.

The first day of the match was a draw. What this actually means is that Polaris won, but only by $70, and a win by less than $250 was considered statistically insignificant. The second day Polaris won, and the third and fourth days the humans won. Although Polaris lost the tournament, it put on an unquestionably good showing.

What is particularly interesting about this is that in order to win at poker, one must be able to bluff well, and it seems to run rather counterintuitive to most people’s opinions of machines that they should actually be able to bluff. This was highlighted in the Star Trek Next Generation episode “The Measure of a Man” when Riker bluffed Data1:

DATA: I will fold.
Riker rakes in the chips, then turns over his cards revealing a busted hand. Data turns his cards face up. He held a winning hand.
DATA: You had nothing.
GEORDI: He bluffed you, Data.
DATA: It makes very little sense to bet when you cannot win.
RIKER: But I did win. I was gambling that you wouldn’t call.
DATA: But how can you tell?
O’BRIEN: Instinct, Data, instinct.

I like Star Trek, but they got it wrong here. In fact, there is a mathematically perfect way to bluff, and no instinct is required at all. Unfortunately, perfect bluffing in a complex game like Texas Hold ‘Em requires more computational power than will ever be available. Fortunately, near perfect bluffing can be done on a desktop. It appears that the best humans bluff in a fashion that is still nearer perfect than a computer can yet achieve, but this is almost certain to reverse in the next few years (at least for two player limit Texas Hold ‘Em).

Unlike humans, who often bluff on instinct, a computer bluffs based on random chance. This ensures that its opponent cannot guess when it is bluffing. Of course, the probability of the computer bluffing is dependent on its cards, the opponent’s play, and so forth, but there are optimal values for these probabilities, and the “art” of bluffing really becomes a science.

1. Excerpted from Star Trek Minutiae

Scene Completion

Scene Completion Teaser

Scene completion is the task of filling in holes in a picture, and algorithms for doing it automatically are starting to become impressive. In the series of pictures above, the original picture (left) has an obnoxious roof taking up a lot of the picture. In the next picture, the user erased the roof, leaving a gaping hole in the picture. A current algorithm for filling in holes uses other similar pictures to figure out what should go in the hole, so the next picture shows similar pictures from a picture database. The final picture has the hole filled in with a reasonable guess.

The algorithm that generated the above sequence of pictures is due to James Hays and Alexei Efros and is explained in this paper. It is driven by a database of 2.3 million images of landscape, travel, and city photography. The first stage of the algorithm is to get the “gist” of the scene, which is a descriptor that can distinguish between cities, hillsides, beaches, and so forth. The algorithm uses this to find a set of images from its database that are of the same general type of thing. This prevents the algorithm from filling in hillsides with car roofs.

Once it has a set of images of similar things, it looks at the region around the hole and tries to find similar regions in the list of pictures. Next it chooses a small set of these best fitting regions and blends each of them back into the original hole, making a set of composite pictures without holes. Finally, it presents each of these composite pictures to the user so they can choose which one they like the best.

The algorithm works very well most of the time. However, it still has a few drawbacks. First, it is currently only trained on city, landscape, and travel pictures. As the range of pictures grows, the problem will become more difficult. Second, the program currently runs on a cluster of 15 machines, instead of a single desktop PC. Third, it still makes serious mistakes occasionally (they show a failure case in the article where the bottom half of a lady was filled in with what looks like a trash can). These drawbacks are serious enough to show that scene completion is not yet ready for the mass market, but the overriding message I took from the article was that it will be ready quite a bit sooner than I would have guessed.