Meta AI Unveils AI-Infused Diplomatic Charmer Which Stirs AI Ethics And AI Law Into Indelicate Tiff

Can we get AI to exhibit diplomacy on par with that of humankind? AI developers are aiming to do so.

The esteemed world of charm and diplomacy is often marred by indelicacies and incivility.

Will Rogers, the great humorist and social commentator, famously said that diplomacy is the art of saying “Nice doggie” until you can find a rock.

The apparent implication is that you sometimes have to stridently hold off any looming and endangering forces until you can discover a suitably defensible solution. Do what you need to do when in desperate straits. Willfully and purposefully placate those that jeopardize you, using your soothing and enamoring words, garnering precious time to arm yourself with a more practical and overtly tangible stance.

Some assert that war is inextricably an outcome of failed diplomacy. In this considered viewpoint, a constant drumbeat of diplomacy will stave off war. Keep talking and the fighting won’t get underway. Others disagree fervently about such a problematic opinion. Perhaps, it is said, diplomacy is in fact what gets us into wars. On top of that, it could be that diplomacy stretches out wars and keeps the fighting going on an endless basis.

We’ll give Will Rogers another chance to savor some humor out of this question of whether diplomacy is good or bad regarding the advent or continuance of war: “Take the diplomacy out of war and the thing would fall flat in a week” (he reportedly declared).

The common definition of diplomacy emphasizes the relationships between nations. Diplomats represent particular nations or nation-states and exercise diplomacy when performing their hallowed duties. We expect our diplomats to undertake their foreign policy conveyance via the use of aplomb. The messenger should not be the one that sparks animosity; only the message itself might do so. A presumably flawless diplomatic portrayal of even the most egregious of messages can demonstrably soften the blow arising from an otherwise noxious demand or threat.

Of course, diplomacy isn’t confined to solely the lofty work of formalized diplomats. I dare say that in our daily lives we all have to exercise some semblance of diplomacy. When you want a friend to watch over your beloved dog while you are on a business trip, you most certainly are going to diplomatically couch the request to take care of the pooch. Making an outright demand to do so is likely to land with a dull thud.

Probably being more art than science, diplomacy is something that humans use to grease the skids, keep the lights on, and prevent society from going totally berserk. Negotiations come into play. Appeasement might need to arise. Sometimes you use the carrot, sometimes you use the stick. If you have to do so, you might try so-called gunboat diplomacy, whereby you attempt to intimidate others around you. Or in lieu of strong-arming, you might go the softer mediation and humanitarian route. It all depends.

Okay, so we are apt to all reasonably agree that diplomacy is a vital human-to-human form of communication. It consists of the practice or statecraft associated with employing tact, mediation, negotiation, conciliation, and a slew of societally deft politicking.

Let’s add something else to that equation.

Are you ready?

We are going to add something else that you might not have yet considered.

Artificial Intelligence (AI).

Yes, AI is getting into the diplomacy gambit. This seems entirely sensible. If we are going to have AI immersed in all facets of our lives, eventually diplomacy had to get its turn. We might expect that AI will be able to perform whatever duties it has by using appropriate words and manners of diplomacy.

AI as the autonomous or semi-autonomous diplomat.

I realize that might seem rather daunting and altogether scary. I don’t think any of us are ready for AI to be going around as a representative of our respective countries. Imagine AI serving on our behalf to negotiate our international positions and affirm our global political posturing. Yikes! It is the stuff of those apocalyptic sci-fi stories.

Well, breathe a momentary sigh of relief since that’s not in the cards, for now.

Instead, be thinking of AI that you use to get your banking done or that helps guide you toward purchasing a home. Today’s AI that interacts with people is usually pretty cut-and-dried. Get to the facts, only the facts. Some AI has been frosted up with cutesy wording to make you think that it is friendly or a buddy, though this is quickly seen through. Just because an AI system refers to you by your name or emits words like “howdy” instead of “hello”, most of us realize right away this is a charade.

A next step being avidly pursued in the halls of AI developers is to try and infuse diplomatic characteristics into our everyday AI. The overarching aim is to incorporate programmatic aspects that will get AI to showcase diplomacy and diplomatic-like behaviors. Using a variety of algorithms and especially the latest in generative AI or Large Language Models (LLMs), progress is notably being made on this front.

We’ll take a look at one such example, CICERO, a new AI system released by the Meta AI team.

In case you were on vacation and hadn’t been keeping up with the latest news about AI, the recent announcement and unveiling of an AI system coined as CICERO offers a significant signpost of where the AI-infused diplomacy arena sits these days. Thoughtfully and to their good credit, the Meta AI team has openly made available their developed CICERO AI system. You can read the technical details in a posted research paper on Science, see the link here. You can visit a website they’ve set up to gently convey the overall nuances of CICERO with short video clips to illustrate what it does (see the link here). For those of you seeking heads-down actual program code and software, you can find the source code on GitHub at the link here.

I do want to quickly clarify one aspect that might seem confusing if you opt to dig into this matter.

The particular AI system devised by Meta AI is currently set up to play a somewhat popular game known as Diplomacy (perhaps you’ve seen the board game as per Hasbro and other game makers). I bring this up because the word “diplomacy” is going to have double duty in this circumstance. On the one hand, the devised AI attempts to take on the characteristics of undertaking and displaying diplomacy per se. Meanwhile, the setting in which this particular AI system operates is specifically and only (right now) the playing of the famous board game known as Diplomacy.

Do you see how this is potentially going to confound some people?

If someone says to you that this particular AI does really well at “diplomacy,” you might be unsure as to whether they are referring to the notion that the AI is quite skilled at playing the board game known as Diplomacy or perhaps instead they are suggesting that the AI embodies the core essence associated with the humankind act of diplomacy. Plus, it could be that both facets are meant at the same time (namely, the AI does well at playing Diplomacy and simultaneously does well at being able to exercise or showcase diplomacy as a skill or act).

Herein, I am going to try and straighten this out by capitalizing the first letter and showing in italics the name of the board game, Diplomacy. The rest of the time I will be using non-italicized wording of “diplomacy” as an indication of the overall conception of being diplomatic or exhibiting diplomacy (see, I just did that right there).

Returning to the matter at hand, you might ponder for a moment whether AI that exhibits diplomacy is something we ought to desire or instead avert.

There are lots of positive aspects to getting AI to showcase diplomacy. You would though be living in a shallow bubble to assume that AI enacting diplomacy is going to be entirely and exclusively a good thing. AI which appears to have diplomatic skills and can make us believe it has diplomacy is also rife with troubling issues that we need to bring to the fore. In essence, AI used in this manner raises a boatload of AI Ethics and AI Law questions. For those of you interested in AI Ethics and AI Law, you might take a look at my extensive and ongoing coverage of Ethical AI and AI Law at the link here and the link here, just to name a few.

Before leaping into AI as embodying some form of diplomacy, I’d like to first lay some essential foundation about AI and particularly AI Ethics and AI Law, doing so to make sure that the discussion will be contextually sensible.

The Rising Awareness Of Ethical AI And Also AI Law

The recent era of AI was initially viewed as being AI For Good, meaning that we could use AI for the betterment of humanity. On the heels of AI For Good came the realization that we are also immersed in AI For Bad. This includes AI that is devised or self-altered into being discriminatory and makes computational choices imbuing undue biases. Sometimes the AI is built that way, while in other instances it veers into that untoward territory.

I want to make abundantly sure that we are on the same page about the nature of today’s AI.

There isn’t any AI today that is sentient. We don’t have this. We don’t know if sentient AI will be possible. Nobody can aptly predict whether we will attain sentient AI, nor whether sentient AI will somehow miraculously spontaneously arise in a form of computational cognitive supernova (usually referred to as the singularity, see my coverage at the link here).

The type of AI that I am focusing on consists of the non-sentient AI that we have today. If we wanted to wildly speculate about sentient AI, this discussion could go in a radically different direction. A sentient AI would supposedly be of human quality. You would need to consider that the sentient AI is the cognitive equivalent of a human. More so, since some speculate we might have super-intelligent AI, it is conceivable that such AI could end up being smarter than humans (for my exploration of super-intelligent AI as a possibility, see the coverage here).

I’d strongly suggest that we keep things down to earth and consider today’s computational non-sentient AI.

Realize that today’s AI is not able to “think” in any fashion on par with human thinking. When you interact with Alexa or Siri, the conversational capacities might seem akin to human capacities, but the reality is that it is computational and lacks human cognition. The latest era of AI has made extensive use of Machine Learning (ML) and Deep Learning (DL), which leverage computational pattern matching. This has led to AI systems that have the appearance of human-like proclivities. Meanwhile, there isn’t any AI today that has a semblance of common sense, nor any of the cognitive wonderment of robust human thinking.

Be very careful of anthropomorphizing today’s AI.

ML/DL is a form of computational pattern matching. The usual approach is that you assemble data about a decision-making task. You feed the data into the ML/DL computer models. Those models seek to find mathematical patterns. After finding such patterns, if so found, the AI system then will use those patterns when encountering new data. Upon the presentation of new data, the patterns based on the “old” or historical data are applied to render a current decision.

I think you can guess where this is heading. If the humans whose decisions are being patterned upon have been incorporating untoward biases, the odds are that the data reflects this in subtle but significant ways. Machine Learning or Deep Learning computational pattern matching will simply try to mathematically mimic the data accordingly. There is no semblance of common sense or other sentient aspects of AI-crafted modeling per se.
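To make the pattern-matching point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the loan-style records, the group labels "A" and "B", and the `fit` and `predict` helpers. The "model" is merely a per-group majority vote, standing in for the far more elaborate mathematics of real ML/DL, but the lesson carries over: whatever tilt exists in the historical decisions comes right back out on new data.

```python
from collections import Counter, defaultdict

# Hypothetical historical decisions: (group, qualified, approved).
# In this made-up data, past human decision-makers approved qualified
# applicants from group "A" but mostly turned down equally qualified
# applicants from group "B".
history = [
    ("A", True, True), ("A", True, True), ("A", False, False),
    ("B", True, False), ("B", True, False), ("B", True, True),
    ("B", False, False),
]

def fit(records):
    """Learn the most common past outcome for each (group, qualified) pair."""
    outcomes = defaultdict(Counter)
    for group, qualified, approved in records:
        outcomes[(group, qualified)][approved] += 1
    return {key: votes.most_common(1)[0][0] for key, votes in outcomes.items()}

def predict(model, group, qualified):
    """Apply the learned pattern to a new case (deny if the case was never seen)."""
    return model.get((group, qualified), False)

model = fit(history)
print("Qualified applicant, group A:", predict(model, "A", True))
print("Qualified applicant, group B:", predict(model, "B", True))
```

Two equally qualified applicants get different answers solely because of the group they belong to. Nothing in the code "intends" that outcome; it simply mimics the data it was given.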

Furthermore, the AI developers might not realize what is going on either. The arcane mathematics in the ML/DL might make it difficult to ferret out the now-hidden biases. You would rightfully hope and expect that the AI developers would test for the potentially buried biases, though this is trickier than it might seem. A solid chance exists that even with relatively extensive testing there will be biases still embedded within the pattern-matching models of the ML/DL.

You could somewhat use the famous or infamous adage of garbage-in garbage-out. The thing is, this is more akin to biases-in that insidiously get infused as biases submerged within the AI. The algorithmic decision-making (ADM) of AI axiomatically becomes laden with inequities.

Not good.

All of this has notably significant AI Ethics implications and offers a handy window into lessons learned (even before all the lessons happen) when it comes to trying to legislate AI.

Besides employing AI Ethics precepts in general, there is a corresponding question of whether we should have laws to govern various uses of AI. New laws are being bandied around at the federal, state, and local levels that concern the range and nature of how AI should be devised. The effort to draft and enact such laws is a gradual one. AI Ethics serves as a considered stopgap, at the very least, and will almost certainly to some degree be directly incorporated into those new laws.

Be aware that some adamantly argue that we do not need new laws that cover AI and that our existing laws are sufficient. They forewarn that if we do enact some of these AI laws, we will be killing the golden goose by clamping down on advances in AI that proffer immense societal advantages.

In prior columns, I’ve covered the various national and international efforts to craft and enact laws regulating AI, see the link here, for example. I have also covered the various AI Ethics principles and guidelines that various nations have identified and adopted, including for example the United Nations effort such as the UNESCO set of AI Ethics that nearly 200 countries adopted, see the link here.

Here’s a helpful keystone list of Ethical AI criteria or characteristics regarding AI systems that I’ve previously closely explored:

Transparency
Justice & Fairness
Non-Maleficence
Responsibility
Privacy
Beneficence
Freedom & Autonomy
Trust
Sustainability
Dignity
Solidarity

Those AI Ethics principles are earnestly supposed to be utilized by AI developers, along with those that manage AI development efforts, and even those that ultimately field and perform upkeep on AI systems.

All stakeholders throughout the entire AI life cycle of development and usage are considered within the scope of abiding by the being-established norms of Ethical AI. This is an important highlight since the usual assumption is that “only coders” or those that program the AI are subject to adhering to the AI Ethics notions. As previously emphasized herein, it takes a village to devise and field AI, and the entire village has to be versed in and abide by AI Ethics precepts.

I also recently examined the AI Bill of Rights, the U.S. government document officially entitled “Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People” that was the result of a year-long effort by the Office of Science and Technology Policy (OSTP). The OSTP is a federal entity that serves to advise the American President and the US Executive Office on various technological, scientific, and engineering aspects of national importance. In that sense, you can say that this AI Bill of Rights is a document approved and endorsed by the existing U.S. White House.

In the AI Bill of Rights, there are five keystone categories:

Safe and effective systems
Algorithmic discrimination protections
Data privacy
Notice and explanation
Human alternatives, consideration, and fallback

I’ve carefully reviewed those precepts, see the link here.

Now that I’ve laid a helpful foundation on these related AI Ethics and AI Law topics, we are ready to jump into the heady topic of AI-infused machine-based diplomacy.

AI Diplomacy In All Its Glory And Also Potential Downfall

First, let’s establish that the type of AI being considered herein is non-sentient AI.

I say this because if, or some say when, we reach sentient AI, the entire topic will likely be utterly upended. Imagine the potential chaos and societal confusion of having somehow landed upon a never before seen, unquestionably verified artificial intelligence that embodies sentience (for my analysis of a famous test of AI known as the Turing Test, see the link here). You can make a reasoned bet that a lot of our existing cultural, legal, and everyday norms will be enormously shaken to their core.

Perhaps the sentient AI will be our buddy, or maybe the sentient AI will be our worst foe. How will we be able to tell which way the AI is going to go?

You might be tempted to suggest that we will presumably be able to talk with the AI and figure out by what it says the intentions that it holds. Aha, you have fallen into a disconcerting trap. Suppose the AI is especially well-infused with some form of AI diplomacy capabilities. When you try talking with the AI, it will perhaps speak in the most charming and eloquent of terms. This could be very soothing for humankind.

But is that what the AI really has “in mind” regarding its AI-based computational intentions?

The AI might be doing the classic diplomatic one-two punch. Talk a good game and get us to become lulled into believing that the AI is our best-ever buddy. This might be giving the AI time to amass resources to overtake humanity or perhaps be working diligently behind the scenes to undercut everything else we use to sustain humankind. After that clever diplomatic stall, wham, AI knocks us all out cold.

Remember what Will Rogers said, which in this instance could be that AI tells us “Nice doggie” and we give the AI sufficient breathing room to then wipe us from the planet. This conception of AI as an existential risk has long been bandied around. Some believe that by devising AI with infused diplomacy, we are going to get snookered. The AI astutely will use all of the infused diplomacy stuff, that we helped train the AI on, and in the end, we will be blindly fooled into AI becoming our overlord.

Shame on us.

For those of you that are further interested in this highly speculative terrain about a futuristic AI, see my coverage of the perspectives on AI as an existential threat at the link here.

Coming back down to earth, I will emphasize henceforth herein the avenue of AI-infused diplomacy in our existent non-sentient AI.

Let’s do a bit of history tracing.

Attempts to devise AI that somehow embodies a computational flavor of diplomacy and diplomatic behaviors have been going on since the early days of AI. You can readily go back to the 1950s, 1960s, and 1970s and find fundamental research studies that were eager to apply AI to this domain. Some thought that game theory was the key. Others focused on psychology and related cognitive elements. Others tried leveraging economics, operations research, and a myriad of seemingly pertinent fields of endeavor.

An initial heyday later occurred during the 1980s. At that time, one dominant approach consisted of using Expert Systems (ES) or Knowledge-Based Systems (KBS) to construct diplomacy-related AI systems. The result was rather stilted and tended to demonstrate how difficult a task this was going to be.

One gnawing conundrum throughout this era of initial investigation was the need for a steadfast platform upon which, or within which, an AI that was purportedly able to perform in some diplomacy-related ways could be adequately tested and explored. This need is perhaps obvious. If someone wants to discern whether a human or an AI can exercise diplomacy, an environment conducive to this quest has to be established.

Into this picture comes the board game of Diplomacy.

I realize that the board game Diplomacy isn’t nearly as well-known as Monopoly, Risk, or Stratego (you’ve likely heard of those games). Nonetheless, there is a strategic board game known as Diplomacy that was first devised in 1954 and commercially released in 1959. Besides being played face to face, the Diplomacy game was often played via snail mail. You would write your moves on a piece of paper and mail the sheet to those that you were playing against. Kind of crazy to imagine nowadays. Later on, email was used. Eventually, the game was available online and allowed players to participate in real-time with each other.

Upon the Diplomacy game becoming available on microcomputers, AI specialists began to use the game as a handy means of testing out their AI-infused diplomacy concoctions. The Diplomacy game was available as an app running on PCs such that a human could play against the machine (i.e., the devised AI). AI in the 1990s that was written to play Diplomacy was notoriously slow, cryptic, and easily beaten by just about any minimally savvy game-playing human.

Here’s a 1999 review of Diplomacy as posted on Yahoo which vividly described the sad state of woe of the AI-infused diplomacy capability as a human player opponent:

“However, Diplomacy’s most glaring problem is that the artificial intelligence in the game is absolutely terrible. Gamers of any skill level will have no trouble whatsoever whaling on the computer at even the highest difficulty setting. Victory is simply a matter of time and patience with the interface. It’s not as if the computer doesn’t scheme behind your back – it does, and it often teams up with allies to take a territory or two away from you – but such schemes come in fits and starts, rather than as a continuous or particularly challenging threat.”

One especially vexing weakness of the AI design was that it seemed to lack the capacity to calculate multiple moves or strategies all at once: “And on a more basic level, the computer seems incapable of managing multiple strategic moves. While a human player can launch attacks on Greece and Belgium simultaneously, the computer always seems to focus on just one thing at a time. For this reason, the computer simply cannot compete against a human player” (ibid).

I haven’t yet explained to you what the Diplomacy board game consists of, so let’s take a moment to get the ground rules established.

There’s a handy research paper about AI and playing Diplomacy that succinctly described the nature of the game: “The game takes place on a map of Europe in the year 1901, which is divided into 75 Provinces. Each player plays one of the seven great Powers of that time: Austria (AUS), England (ENG), France (FRA), Germany (GER), Italy (ITA), Russia (RUS) and Turkey (TUR) and each player starts with three or four units (armies or fleets) which are placed in fixed initial positions on the map. In each round of the game, each player must ‘submit an order’ for each of its units, which tells those units how to move around the map and allows them to conquer the map’s provinces” (Dave de Jonge, Tim Baarslag, Reyhan Aydogan, Catholijn Jonker, Katsuhide Fujita, and Takayuki Ito, “The Challenge of Negotiation In The Game Of Diplomacy,” Agreement Technologies: 6th International Conference, AT 2018)

You also need to know about the ways to win and ways to lose the game: “Some of the Provinces are so-called Supply Centers and the goal for the players is to conquer those Supply Centers. A player is eliminated when he or she loses all his or her Supply Centers and a player wins the game when he or she has conquered 18 or more of the 34 Supply Centers (a Solo Victory). However, the game may also end when all surviving players agree to a draw” (ibid).
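For the programming-minded, the win and loss conditions in that quoted passage boil down to a pair of thresholds. Here is a minimal sketch in Python; the function name `player_status`, the status labels, and the mid-game supply center counts are my own illustrative inventions and are not drawn from any actual Diplomacy software.

```python
TOTAL_SUPPLY_CENTERS = 34
SOLO_VICTORY_THRESHOLD = 18  # "18 or more of the 34 Supply Centers"

def player_status(supply_centers: int) -> str:
    """Classify a player by the number of Supply Centers they hold."""
    if not 0 <= supply_centers <= TOTAL_SUPPLY_CENTERS:
        raise ValueError("supply center count must be between 0 and 34")
    if supply_centers == 0:
        return "eliminated"
    if supply_centers >= SOLO_VICTORY_THRESHOLD:
        return "solo victory"
    return "active"

# Hypothetical mid-game position (the counts are made up for illustration).
position = {"AUS": 0, "ENG": 7, "FRA": 18, "GER": 3, "ITA": 2, "RUS": 4, "TUR": 0}
for power, centers in position.items():
    print(power, centers, player_status(centers))
```

Note that 18 is a strict majority of 34, so at most one player can ever reach the threshold. The draw-by-agreement ending mentioned in the quote is a decision made by the players themselves and so is not something a supply center count can determine.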

I trust that you can see that Diplomacy is a straightforward game that involves up to seven players that are trying to strategically outmaneuver each other. During this maneuvering, the players can confer with each other, doing so without the other players knowing what they are up to. The act of diplomacy comes to the fore by being able to get other players to go along with your plans. You might reveal your true plans, or you might not. You might proffer false plans. You can negotiate with other players. You can deceive other players. You can forge alliances with other players. Etc.

All is fair in love and war, as they say.

I selected that particular research paper because it was instrumental in establishing a variant of the computerized version of Diplomacy called the DipGame to serve as a go-forward platform for testing of AI-infused diplomacy capabilities: “There is a chronic lack of shared application domains to test advanced research models and agent negotiation architectures in Multiagent Systems. In this paper we introduce a friendly testbed for that purpose. The testbed is based on The Diplomacy Game where negotiation and the relationships between players play an essential role” (ibid). This and numerous other variations of Diplomacy have been crafted and made available for research and play.

The researchers explained why they believed that Diplomacy is such a useful game for aiding the AI quest toward AI-infused diplomacy: “The game of Diplomacy forms an excellent test case for this type of complex negotiations, as it is a game that includes many of the difficulties one would also have to face in real-life negotiations. It involves constraint satisfaction, coalition formation, game theory, trust, and even psychology. Now that modern Chess and Go computers are already far superior to any human player, we expect that Diplomacy will start to draw more attention as the next big challenge for computer science” (ibid).

In a quick recap, the board game Diplomacy provides an already known and somewhat popular vehicle for getting humans to act in a variety of diplomatic ways. Normally, humans play against other humans (in this case, up to seven human players). We can devise AI that would try to take on the role of a player in the game. Thus, you might have six human players and one AI player.

Envision playing Diplomacy online.

If we don’t tell the human players that there is an AI amongst them, they might naturally assume that the seventh player is merely another human player. To try and prevent any human player from guessing that the AI is playing, we can restrict the play aspects to requiring all players to send text-oriented messages to each other. You cannot directly see the other players.

You might be thinking that this doesn’t seem especially different from playing online chess. Why the big fuss about Diplomacy as a game?

Recall that we are focusing on interactions among players. The premise of this game is that the players invoke diplomacy toward each other. This is unlike conventional chess. If I developed an AI-based chess-playing app, it would normally play against one human. There isn’t any negotiation or discussion between the chess-playing AI and the human player. They merely make chess moves and try to outdo each other. This is usually done in utter silence.

Diplomacy allows us to exercise outspoken computational and human-based diplomacy (potentially doing so in writing rather than having to be said aloud): “The main difference between Diplomacy and other deterministic games like Chess and Go, is that in Diplomacy players are allowed to negotiate with each other and form coalitions. At each round, before the players submit their orders, the players are given time to negotiate with each other and make agreements about the orders they will submit. Negotiations take place in private, and each agreement that is made is only known to the players involved in that agreement” (ibid).
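The privacy rule in that quote is worth pausing on, since it is what makes the game so different from chess. As a rough sketch only, here is how the "each agreement is only known to the players involved" idea might be represented in Python. The `Agreement` class, the `visible_to` helper, and the example deals are all hypothetical and of my own devising; they are not taken from DipGame, CICERO, or any other Diplomacy implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

@dataclass(frozen=True)
class Agreement:
    """A private deal struck during the negotiation phase of a round."""
    parties: FrozenSet[str]  # the powers that made the deal
    terms: str

def visible_to(power: str, agreements: List[Agreement]) -> List[Agreement]:
    """Return only the agreements that this power is a party to."""
    return [a for a in agreements if power in a.parties]

# Hypothetical negotiation phase for a single round.
agreements = [
    Agreement(frozenset({"ENG", "FRA"}), "FRA stays out of the English Channel"),
    Agreement(frozenset({"GER", "RUS"}), "GER and RUS leave their shared border alone"),
]

for power in ("ENG", "GER", "ITA"):
    print(power, [a.terms for a in visible_to(power, agreements)])
```

In this sketch, England sees only its deal with France, Germany sees only its deal with Russia, and Italy sees nothing at all. Keep in mind that nothing here forces anyone to honor a deal when the orders are later submitted, which is precisely where the deception and the diplomacy come in.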

Could you come up with some other game that makes use of interactive diplomacy rather than relying on the game Diplomacy?

Absolutely.

There are various such games.

The convenience though of picking one particular make of a game is that then the AI developers can focus their attention on that specific game. You can rally around devising AI that performs diplomacy in that defined context. You can share approaches and contrast them. You can score the various instances of AI based on the metrics associated with that game. And so on.

That being said, the counterargument or expressed concern is that this is like putting all your eggs in one basket. Some worry that if many AI developers become preoccupied with one particular platform or environment, such as in this case Diplomacy, they will seek to optimize the AI for that specific arena. The downside implication is that the AI won’t be generalized. We won’t be making as much progress toward overarching capabilities of AI-infused diplomacy.

Kind of like failing to see the forest for the trees, if you will.

Another concern oft-expressed is that games such as Diplomacy are merely games.

The thorny question arises as to whether humans playing a game are doing the same things that they would do in real-world diplomacy. Perhaps you aren’t acting in the same way when national pride or national stakes are not on the line. Sure, you might be worried about your personal pride when playing a game, or perhaps trying to become a top scorer, or maybe even winning prize monies, but still, does that amount to negotiating world peace in the United Nations or haggling over which countries ought to have or not have nuclear weapons?

Some believe that these diplomacy games are suitable microcosms of the real world. Others lament that the games are useful but do not scale up to the proverbial rubber-meets-the-road diplomacy of an international caliber. It could be that the game-playing versions of AI-infused diplomacy turn out to be nothing more than a mere game-playing success. Gosh, the devised AI can win or do really well when playing a game involving diplomacy, though it turns out that using the same AI in true diplomatic settings is sorely lacking and falters or fails miserably.

Assume for the moment that AI being crafted for playing games such as Diplomacy is indeed worthwhile. You could reasonably argue that no matter whether suitable only for game playing or for real-world action, the idea entails stretching the AI boundaries and making advances in AI that might contribute to diplomacy or might have other advantageous breakthroughs in minting AI.

Favor the smiley face for now.

We shall next examine how Meta AI has opted to create an AI-infused diplomacy player that can perform in the game of Diplomacy.

Hold onto your hats.

Meta AI And The Diplomacy Playing Machine-Based Diplomat Cicero

As earlier mentioned, there is a research paper published in Science that describes the newly announced and publicly released Meta AI app coined as Cicero (throughout this discussion I’ll be interchangeably referring to this as “Cicero” and equally so the all-caps moniker of “CICERO”):

“We present Cicero, an AI agent that achieved human-level performance in the strategy game Diplomacy. In Diplomacy, seven players conduct private natural language negotiations to coordinate their actions in order to both cooperate and compete with each other. In contrast, prior major successes for multi-agent AI have been in purely adversarial environments, such as chess, Go, and poker, where communication has no value. For these reasons, Diplomacy has served as a challenging benchmark for multi-agent learning” (Meta Fundamental AI Research Diplomacy Team, “Human-Level Play In The Game Of Diplomacy By Combining Language Models With Strategic Reasoning”, Science, November 22, 2022).

I will be quoting from the paper and then offering various insights that will hopefully be of interest to you.

Given that the AI has just been made available, I’ll likely be doing a subsequent analysis after having had a chance to do some deep hands-on assessment of the code and also conduct some experimental game playing to gauge the AI capabilities (such as key strengths and weaknesses).

Be on the watch for that later posting!

Anyway, I trust that you might have observed in the passage that I just quoted that there is a phrasing that says the AI “achieved human-level performance in the strategy game Diplomacy.”

Mull that over.

First, it is certainly laudable to have been able to devise AI that plays the Diplomacy game on a purported level of performance akin to humans. That is abundantly important.

We’ve got lots of Diplomacy playing AI that is subpar in comparison to humans that play the Diplomacy game. More than you can shake a stick at. It is reassuring and exciting to move upward toward AI that can do much better playing the game. We do though need to be cautious in jumping to any quick conclusion on this.

For example, suppose I had devised AI to play Diplomacy and I pitted it against human players that have never played the game before. If my AI beats them, it would be a bit of an exaggeration to say that my AI has performed at a human level. A win over humans that weren’t familiar with the game is somewhat eyebrow-raising and dubious as evidence.

I bring this up to forewarn you to always closely inspect any seemingly outsized claims about AI. The discussion earlier about AI Ethics and AI Law perhaps opened your eyes to the possibility of false claims about AI. There are outright false claims and there are those insidious partially true and partially misleading claims that are particularly knotty. The key is to ask why those making claims about their AI are doing so.

Where’s the beef?

Here’s what the Meta AI team had to say about the basis for their human-level performance claim (see the paper for additional details):

“Cicero participated anonymously in 40 games of Diplomacy in a “blitz” league on webDiplomacy.net from August 19 to October 13, 2022. This league played with five-minute negotiation turns; these time controls allowed games to be completed within two hours. Cicero ranked in the top 10% of participants who played more than one game and 2nd out of 19 participants in the league that played 5 or more games. Across all 40 games, Cicero’s mean score was 25.8%, more than double the average score of 12.4% of its 82 opponents. As part of the league, Cicero participated in an 8-game tournament involving 21 participants, 6 of whom played at least 5 games. Participants could play a maximum of 6 games with their rank determined by the average of their best 3 games. Cicero placed 1st in this tournament” (ibid).
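To make the quoted numbers a bit more tangible, here is a minimal Python sketch of the two calculations that the passage implies: the ratio of the mean scores and the "average of their best 3 games" ranking rule. The function names and the sample game scores are my own hypothetical illustrations, and the handling of a participant with fewer than three games is an assumption that the quoted passage does not address.

```python
def score_ratio(agent_mean: float, opponent_mean: float) -> float:
    """Return how many times larger the agent's mean score is than the opponents'."""
    return agent_mean / opponent_mean


def best_three_average(game_scores: list[float]) -> float:
    """Average a participant's three highest game scores.

    Assumption: with fewer than three games, average whatever is available.
    The quoted passage does not say how that case was treated.
    """
    if not game_scores:
        raise ValueError("at least one game score is required")
    top_scores = sorted(game_scores, reverse=True)[:3]
    return sum(top_scores) / len(top_scores)


# Mean scores as reported in the quoted passage (in percent).
ratio = score_ratio(25.8, 12.4)
print(f"Cicero's mean score is {ratio:.2f}x the opponents' average")

# Hypothetical per-game scores for one participant, purely for illustration.
sample_scores = [10.0, 40.0, 0.0, 25.0, 30.0]
print(f"Ranking score: {best_three_average(sample_scores):.2f}")
```

The ratio comes out slightly above two, which squares with the "more than double" wording in the paper.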

It is a great relief to see that they have tried earnestly to support their AI claims in this instance (be aware that not everyone in AI does so).

You could suggest that they performed an experiment. The experiment consists of human players that presumably did not know they were playing with an AI player in their midst (the paper discusses why this seemed to be plausibly inferred, namely that the humans did not overtly realize that one of the players was AI, though an interesting side tangent entails a story of one player that got mildly suspicious but not notably so). Those human players were preselected by the nature of the experimental design such that they were league players of Diplomacy and we can reasonably infer they knew well how to play the game. They weren’t newbies.

According to the reported statistics, which we will take at face value as earnestly and properly collected and reported (various data and code are provided in the supplemental materials), the AI appeared to have played the game sufficiently to support a plausible though narrowly confined conclusion that it reached a human level of performance for this variant of the Diplomacy game and within the context of the stated league play against humans that were seemingly versed players.

I am sure that some of you might want to quibble with the number of instances of playing the game, perhaps arguing that it is too small to be making bold proclamations. Another quibble could be that being within the top 10% of the participants is not high enough, and that until maybe the top 1% is attained, one should not be making boasts about the AI performance.

Those quibbles seem a bit shrill.

I say that because thankfully the claim of being “superhuman” was not invoked. Readers undoubtedly know of my dour and sour view of those in AI that keep making those outsized pronouncements that their AI has reached a superhuman status. I won’t go into my complaints about the superhuman clamoring claims herein, please see the link here for my views.

My point in this instance is that I believe the indication of human-level performance is probably generally permissible, assuming that everyone keeps in mind that this is within an extremely narrow context and that we don’t try to make this into being so-called superhuman. I also am repeating over and over the confined context for a reason that I will next elucidate.

I can just imagine that some social media or naïve (possibly disreputable) reporters are going to take entirely out of context the human-level performance conception.

Here’s what I dread.

Hold your breath.

Some will claim that AI has mastered diplomacy. Yes, a study of a developed AI system for diplomacy proved beyond a shadow of a doubt that AI can and now has the capacity to do everything that humans do for diplomacy.

Send home the foreign diplomats. We can replace them with AI.

The world as we know it has changed. We have produced AI that is fully equal to humans in diplomacy, and we can make the mental “logical” leap to the idea that AI now can do whatever humans can do. Voila, we have now proven that AI is on par with humans. Apparently, this study provides clear-cut evidence of sentient AI.

Mark my words, this is definitely going to be the oddball, irresponsible, irksome take by some writers.

I won’t belabor the point. We move onward.

How does the AI work?

Here’s a quick summary from the paper: “Cicero couples a controllable dialogue module with a strategic reasoning engine. At each point in the game, Cicero models how the other players are likely to act based on the game state and their conversations. It then plans how the players can coordinate to their mutual benefit and maps these plans into natural language messages” (ibid).

There is a mouthful in that quick summary.

Allow me to do some unpacking.

A human player that plays Diplomacy has to anticipate what the other human players are going to do. You want to try and negotiate with other players and get them to go along with your plans. They will try to do the same with you. They might be lying to you. You might be lying to them. You have to contend with six other players, all of whom have their own semblance of plans and approaches.

The multi-agent aspects of this game are vital to the difficulty level of devising AI to play the game. The AI has to keep track of what each person might want to do, along with what they say they want to do, along with whatever trickery they are doing with other players. This has to be weighed against what they tell you they are going to do, and also what you are telling them you will do.

Dizzying, but easy enough for humans, usually.

In the parlance of the AI field, we refer to this as the Theory of Mind (ToM). Consider that you, as a human, tend to theorize about what other humans are thinking. You cannot pry open someone’s noggin and see their thoughts. You have to guess what their thoughts are. You can ask them, but what they tell you might be their thoughts or might be a sneaky version of their thoughts.

In addition to the Theory of Mind complexities, we need to add human language into this murky mix.

When a person communicates to you in their natural language, let’s say English, there is lots of room for error and miscommunication. I tell you that I am going to take over country X, but I mistakenly say Y. Oopsie. Or, did I say country X, but you mistakenly thought I said country Y. If you think that kind of confusion cannot happen if everyone is using text messages, you’d be profusely wrong. I might type a message that says I am going to attack a country and I don’t mention which one. Perhaps it is implied as to which one I would “obviously” be attacking. A player that receives my message might assume that I must be referring implicitly to country X, but maybe I wanted them to think that.

On and on it goes.

The crux is that our latest advances in AI regarding Natural Language Processing (NLP), and especially the latest in generative AI and Large Language Models (LLMs), make this kind of natural language situation nearly doable. In the past, NLP usually wasn’t good enough, and likewise, LLMs weren’t yet well-established.

In the past, the AI would send messages that you would almost certainly immediately recognize as having been written by a scripted AI system. The wording wasn’t fluent. It was templated. This was an obvious giveaway that the AI was an AI. Nowadays, it is much harder to discern that AI isn’t a human player in these contexts.

As mentioned in the quote above, this particular AI has been devised to contain a “controllable dialogue module with a strategic reasoning engine” (ibid). The dialogue comes via this: “Cicero generates dialogue using a pre-trained language model that was further trained on dialogue data from human games of Diplomacy. Crucially, in addition to being grounded in both the dialogue history and game state, the dialogue model was trained to be controllable via intents, which we here define to be a set of planned actions for the agent and its speaking partner.”

There, notice that the domain setting is crucial. If you only trained on dialogues of a normal nature across the Internet, this wouldn’t necessarily be befitting for the Diplomacy game context. Those that play the game are accustomed to a kind of shorthand way to discuss the game and proposed moves. You want the AI to do likewise. Thus, training the language model on domain-specific dialogue data is a noteworthy approach.

For the stated strategic reasoning engine, here’s what they say: “Cicero uses a strategic reasoning module to intelligently select intents and actions. This module runs a planning algorithm that predicts the policies of all other players based on the game state and dialogue so far, accounting for both the strength of different actions and their likelihood in human games, and chooses an optimal action for Cicero based on those predictions. Planning relies on a value and policy function trained via self-play RL which penalized the agent for deviating too far from human behavior in order to maintain a human-compatible policy” (ibid; note that RL is an abbreviation for Reinforcement Learning).

During a game, the players are all recalibrating their positions after each move. You cannot go into this game with a predefined fixed strategy that is inflexible and cast in stone. You have to change your actions based on what the other players do. The AI ought to be programmed to do likewise. No static strategy per se. Instead, keep appraising the situation, step by step, figuring out what seems like the next best steps to take.

The technological underpinnings of the AI were devised this way: “We obtained a dataset of 125,261 games of Diplomacy played online at webDiplomacy.net. Of these, 40,408 games contained dialogue, with a total of 12,901,662 messages exchanged between players. Player accounts were de-identified and automated redaction of personally identifiable information (PII) was performed by webDiplomacy. We refer to this dataset hereafter as WebDiplomacy” (ibid).
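For a quick sense of scale, the quoted dataset figures can be turned into two derived numbers: the share of games that contained dialogue and the average number of messages per dialogue game. The short Python sketch below uses only the three counts given in the quote; the variable names are my own.

```python
# Counts as reported in the quoted passage.
TOTAL_GAMES = 125_261
GAMES_WITH_DIALOGUE = 40_408
TOTAL_MESSAGES = 12_901_662

# Share of all games that included player dialogue.
dialogue_share = GAMES_WITH_DIALOGUE / TOTAL_GAMES

# Average number of messages exchanged per game that had dialogue.
messages_per_dialogue_game = TOTAL_MESSAGES / GAMES_WITH_DIALOGUE

print(f"Games with dialogue: {dialogue_share:.1%}")
print(f"Messages per dialogue game: {messages_per_dialogue_game:.0f}")
```

Roughly a third of the games carried dialogue, and those games averaged on the order of a few hundred messages apiece.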

Similar to how you might train AI to play chess, whereby you grab ahold of zillions of chess games and have the AI do pattern matching about how the games were played, the AI for this Diplomacy game playing was comparably established.

Those of you that are AI-versed might be curious as to which base models they used, here you go: “We took R2C2 (22) as our base model – a 2.7B parameter Transformer-based encoder-decoder model pre-trained on text from the Internet using a BART de-noising objective. The base pre-trained model was then further trained on WebDiplomacy via standard Maximum Likelihood Estimation” (ibid).

Furthermore, they took a somewhat unusual and intriguing approach to the modeling of the other players as to their imputed policies in mind: “A popular approach in cooperative games is to model the other players’ policies via supervised learning on human data, which is commonly referred to as behavioral cloning (BC). However, pure BC is brittle, especially since a supervised model may learn spurious correlations between dialogue and actions. To address this problem, Cicero used variants of piKL (26) to model the policies of players. piKL is an iterative algorithm that predicts policies by assuming each player i seeks to both maximize the expected value of their policy πi and minimize the KL divergence between πi and the BC policy, which we call the anchor policy τi” (ibid).
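The tradeoff described in that quote can be illustrated with a small sketch. For a single player whose action values are held fixed, an objective of the form "expected value minus λ times the KL divergence from the anchor policy" has a well-known closed-form maximizer in which each action's probability is proportional to τ(a) · exp(Q(a) / λ). The Python below computes that one-step policy. Please note that this is a simplification under my own assumptions and not the paper's actual piKL procedure, which iterates across all of the players; the weight `lam`, the action names, and the values are hypothetical.

```python
import math


def kl_regularized_policy(
    anchor: dict[str, float], values: dict[str, float], lam: float
) -> dict[str, float]:
    """Return the policy maximizing E[Q] - lam * KL(policy || anchor).

    Closed form: policy(a) is proportional to anchor(a) * exp(Q(a) / lam).
    A large lam stays close to the human-like anchor policy, while a small
    lam leans toward the highest-valued action.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Subtracting the maximum value keeps exp() numerically stable and
    # does not change the normalized result.
    max_value = max(values.values())
    weights = {
        action: prob * math.exp((values[action] - max_value) / lam)
        for action, prob in anchor.items()
    }
    total = sum(weights.values())
    return {action: weight / total for action, weight in weights.items()}


# Hypothetical anchor (behavioral cloning) policy and action values.
anchor_policy = {"hold": 0.6, "support_ally": 0.3, "attack": 0.1}
action_values = {"hold": 0.0, "support_ally": 0.5, "attack": 1.0}

for lam in (100.0, 1.0, 0.05):
    policy = kl_regularized_policy(anchor_policy, action_values, lam)
    print(lam, {action: round(prob, 3) for action, prob in policy.items()})
```

With a large `lam`, the resulting policy barely moves from the anchor. With a small `lam`, nearly all of the probability shifts to the highest-valued action. That dial is one way to picture how a system might stay human-compatible while still playing strongly.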

I think that covers the topline facets and gives you a proper semblance of what the AI does and how it accomplishes the designated task.

Conclusion

There’s a lot more I’d like to mention, but I am running long on this discussion and will try to cover just a few keystone aspects. I’ll aim to cover more in a subsequent posting.

Have you ever heard of the prisoner’s dilemma?

It is a classic decision-related problem.

The prisoner’s dilemma involves you having to decide whether you will rat out or tattle on a fellow prisoner. The other prisoner can also possibly rat you out. There is a reward function such that if you do rat out the other prisoner and they don’t rat you out, it is kind of a win for you. If they rat you out and you don’t rat them out, it is kind of a win for them and a losing posture for you. If you both rat out each other, you both essentially lose. See my coverage in detail at the link here.

What strategy would you come up with when facing the prisoner dilemma?

If it was a one-time deal, you can nearly flip a coin. If the situation was repeated over and over again, and you were dealing with the same other prisoner, you might find a pattern that might emerge. One of the most popular and often recommended patterns is known as tit-for-tat. Whatever the other prisoner does, you do the same on your next move. If they don’t rat you out, you don’t rat them out. If they do rat you out, you then rat them out on the next move.
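For those that like to see the mechanics, here is a minimal Python sketch of tit-for-tat in a repeated prisoner’s dilemma. The payoff numbers are the conventional textbook values and are my assumption rather than anything stated above. Likewise, having tit-for-tat stay silent on the very first move is the customary convention, and the function names are hypothetical.

```python
COOPERATE, DEFECT = "stay_silent", "rat_out"

# Assumed illustrative payoffs as (player A, player B); higher is better.
PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def tit_for_tat(opponent_history: list[str]) -> str:
    """Stay silent first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else COOPERATE


def always_defect(opponent_history: list[str]) -> str:
    """Rat out the other prisoner on every move."""
    return DEFECT


def play(strategy_a, strategy_b, rounds: int) -> tuple[int, int]:
    """Play repeated rounds and return the total payoffs for A and B."""
    history_a: list[str] = []
    history_b: list[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees only the other player's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
```

Two tit-for-tat players settle into staying silent throughout. Against a constant defector, tit-for-tat gets burned once on the first move and then retaliates on every move thereafter.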

You might be puzzled how any of this relates to AI-infused diplomacy while playing Diplomacy.

Here’s the deal.

In negotiations with others, you often have to decide whether to tell them the truth or lie to them. A problem with lying is that if you get caught in the lie, the other person is probably not going to trust you from then on. They might not have fully trusted you to begin with, but now that you’ve shown your hand that you indeed lie, they will undoubtedly decide you are a liar.

Some players of Diplomacy lie constantly. They believe this is the best strategy. Lie, lie, and more lies. Other players go the complete opposite. They contend that you want to refrain from lying if you can do so. Only use a lie in the most needed of situations. By sparing the lying, you are able to build trust with other players. Once you’ve gone down the lying route and been detected, nobody is going to believe a word you say.

The paper describing Cicero mentions this consideration: “Finally, Diplomacy is a particularly challenging domain because success requires building trust with others in an environment that encourages players to not trust anyone. Each turn’s actions occur simultaneously after non-binding, private negotiations. To succeed, an agent must account for the risk that players may not stay true to their word, or that other players may themselves doubt the honesty of the agent” (ibid).

According to the paper and the short videos, the researchers eventually found that having the AI tell the truth as much as possible seemed to be the better overall strategy. In a sense, you might liken this to the tit-for-tat of the prisoner’s dilemma. Start by telling the truth. If your opponent tells the truth, you continue telling the truth. If they start lying, you need to assess whether to stick with telling the truth or switch over to lying.

One supposes this is a heartwarming finding.

Keep in mind that the tricky and intriguing part of Diplomacy is that you are doing this with respect to six other players (known as a multi-agent problem). You might be stridently truthful with them all. Or, maybe truthful to some but not others. There is also the aspect that once you lie and get caught in a lie to one other player, this can be potentially observed or deduced by other players. Ergo, you are known or assumed to be lying, even if you didn’t lie to a particular player that you hope thinks you are truthful and that you are trying to be truthful toward.

Humans tend to figure out and evolve their lie-truth approaches when exercising diplomacy. Situational dependencies might be a huge factor. The stakes on the line are crucial to consider. A multitude of factors come into play.

It would be fascinating to have an AI-infused Diplomacy machine-based player play a very large number of games to see what the results might suggest in the main.

We might also want to pit the AI against fewer than six other players to see how that changes things. We could also add AI into the mix as being more than one player. For example, suppose we had five human players and two AI players (we’ll set up the AI players as separate instances so that they are not computationally one and the same). What about four humans and three AI? What about six AI and one human?

Another avenue would be to pit AI solely against AI. In that manner, we could run through a gazillion games in fast order. Set up seven distinct instances of the AI. Each is its own separate player. Since this is all within the computer, we can run them nonstop and produce thousands or millions of game instances.

Of course, the issue with AI-versus-AI is that we have removed the human players. We don’t know that AI-versus-AI is reflective of what human players would do. In any case, some interesting results might be discerned.

I had stated earlier that one limitation of AI of this sort is that it is usually narrowly focused. We cannot know readily that the AI that plays Diplomacy will be applicable in real-world diplomacy. Furthermore, maybe the AI that works well for Diplomacy won’t do especially well in other diplomacy-oriented online games. It might not port over and instead be a kind of one-trick pony.

As stated in the research about the formulation of DipGame:

“We argue that in real negotiations it is important to have knowledge of the domain and one should be able to reason about it. One cannot, for example, expect to make profitable deals in the antique business without having any knowledge of antique, no matter how good one is at bargaining. Moreover, a good negotiator should also be able to reason about the desires of its opponents. A good car salesman, for example, would try to find out what type of car best suits his client’s needs to increase the chances of making a profitable deal” (as per earlier cited: Angela Fabregues and Carles Sierra, “DipGame: A Challenging Negotiation Testbed”, Engineering Applications of Artificial Intelligence, October 2011).

Once mastery of AI in Diplomacy occurs, all eyes should be on using or reusing the AI to tackle other diplomacy-related games. In addition, efforts to use AI in real-world diplomacy settings should be explored.

I’ll end for now on some Ethical AI considerations.

First, it was reassuring to notice that the Meta AI team recognized that their work encompasses AI Ethics ramifications. Sobering questions arise. Is it proper to “fool” people into playing against AI without telling them that they are doing so? Could the natural language generated by the AI inadvertently contain offensive wording that gets conveyed to the human players? And so on.

Make sure to take a look at how they dealt with those pressing Ethical AI concerns in the Supplemental Materials (SM) of their paper: “We discuss ethical considerations for this research further in the SM, including privacy considerations for data usage (SM, §A.1), potential harms resulting from toxic or biased language generation (SM, §A.2), avenues for misuse of goal-oriented dialogue technology (SM, §A.3), and AI agent disclosure to human players (SM, §A.4)” (ibid).

We need more AI developers and their leadership to be cognizant of Ethical AI and take mindful steps to be careful and judicious in the AI work that they do. Plus, they need to make sure that they are transparent about what Ethical AI actions they’ve taken and what assumptions they have made.

My last item for now entails the overall trepidation of anthropomorphizing AI.

If we improve AI to appear to be keenly diplomatic, will that mislead humans into assuming or believing that AI is on par with humans in the fullest of respects?

It is an easy slippery slope. A clunky AI that you interact with is giving away a telltale clue that it probably is AI and not a human. A smoothly iterated AI that has the appearance of supreme diplomatic poise is going to likely make people unhesitatingly fall into the trap that the AI is human, including as though it has human common sense and all the comprehension capacities of humans.

As an aside, realize too how this might be exploited during a Diplomacy game. A human player that sees nearly poetic seeming messages from another player might realize that this must be AI (now far beyond the prior clunky AI), whereas other humans wouldn’t be so articulately able to compose messages. Of course, if the AI is continuing to adjust, it might alter the poetic wording to more closely mirror the terse and loosey-goosey wording of actual human players. In turn, human players might switch over to poetic language, trying to give away that they are human. The idea is that perhaps other human players will be willing to align with fellow humans over AI.

The next thing you know, the game playing starts to devolve into trying to figure out who is the human and who is the AI. If you can figure out which is which, perhaps this gives you an advantage. On the other hand, it might be for naught. The AI could end up being just as savvy at the game as the humans. Your guessing about which is which doesn’t do you much good. It could be a distractor from just concentrating on trying to win the game, regardless of whether AI players or human players are at hand.

For researchers looking to do some human-factors studies on this kind of mind-bending conception, you could potentially consider using the AI-infused Diplomacy player and seek out willing Diplomacy league tournaments to investigate the human-versus-AI identification and behavioral adapting strategies that arise.

Let’s wrap this up.

Mark Twain said this about diplomacy: “The principle of give and take is the principle of diplomacy — give one and take ten.” Is that an honest assessment of how humans function or is it just a bit of tongue-in-cheek humor that cynically but incorrectly assesses the human condition?

Ponder these further questions:

If we can imbue diplomacy into AI, will this teach us about how humans instantiate diplomacy and possibly allow humankind to improve at the art of being diplomatic?
Will we create a falsehood of AI that appears to be sentient when it is nothing of the kind, all due to computationally pulling a rabbit out of a hat to seemingly showcase human-like diplomacy?
Can a balance be found such that AI is imbued with diplomacy while we remain alerted that this is still just everyday AI and are therefore not to anthropomorphize it?

They say that diplomacy is the art of letting someone else have your way. Let’s make sure that our way is the human way, rather than the AI way. Though to be diplomatic, now that I think carefully about it, and in case we do end up with AI overlords, maybe we should allow for our way to be the AI way, which hopefully dovetails into a suitable human way.

Just trying to exercise a smidgeon of artful diplomacy.

Source: https://www.forbes.com/sites/lanceeliot/2022/11/23/meta-ai-unveils-ai-infused-diplomatic-charmer-which-stirs-ai-ethics-and-ai-law-into-indelicate-tiff/