Friendly Superintelligence
Transcript of Friendly Superintelligence
My assumptions
• Need to make friendliness work in general, not just for particular AI designs – we do not know which will succeed
• Hard takeoff unlikely – AI will develop over time in interaction with society
• The context in which systems are developed must be taken into account; we cannot rely on simple a priori arguments
Friendly AI is in the end a practical problem
• AI will be created for economic reasons, and will be involved in economic transactions with humans from the start.
• Whether AI, IA or something else is developed will be determined only to a minor extent by deliberate global choices, and mostly by which technologies provide payoffs during their development
Friendliness as a game
• Friendly AI as a game: we want an infinite game for humans
• It is not a single-player game; from the start it involves many different players with slightly different goals.
Do we aim for no risk or acceptable risk?
• As risks become smaller, the cost of removing them increases without limit
• The hard take-off assumption implies a single gamble with one large risk, while the soft take-off implies many interactions, each carrying a medium risk.
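The contrast can be made concrete with a toy survival model. This is my own illustration, not from the talk, and all numbers are hypothetical: a single large gamble versus a series of smaller risks where each survived incident lets society learn and shrink the next risk.

```python
# Toy model (hypothetical numbers, not from the talk): compare one large
# gamble against many smaller risks with learning between incidents.

def one_gamble(p_fail):
    """Hard takeoff: a single all-or-nothing risk."""
    return 1.0 - p_fail

def many_risks(p_fail, n, learning=0.8):
    """Soft takeoff: survive n incidents; each survived incident
    reduces the next incident's failure probability by `learning`."""
    survive = 1.0
    for _ in range(n):
        survive *= (1.0 - p_fail)
        p_fail *= learning
    return survive

print(one_gamble(0.5))        # one 50/50 gamble: 0.5 survival
print(many_risks(0.05, 20))   # twenty 5%-or-less risks with learning
```

With these (arbitrary) parameters, the many-medium-risks path yields a higher survival probability than the single gamble; the point is only that repeated interaction leaves room for feedback and correction, which a one-shot gamble does not.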
Suggested approaches to friendly AI
• Internal constraints (Asimov’s laws)
• Built in values or goals (“Love humans”)
• Learned values (Brin, Lungfish)
• External (law, economics)
Problems with the approaches
• Asimov's laws allow accidental unfriendly behaviour – the full consequences of a complex formal system are unknowable, and contact with the messy real world makes things worse.
• Internal constraints and values are design solutions, but there are many designers and some might be malevolent, misguided or make mistakes.
• Designs compete with each other - a risky architecture may show greater economic potential
• If values are learned, then they can be mis-learned.
• External approaches can seldom be proven to work due to their complexity.
Law of comparative advantages
• Trade is mutually profitable even when one party is more productive than the other in every commodity being exchanged – specialisation enables each agent to produce more of the commodity most profitable to it.
• AI and humans can profit from specialisation, even when their capabilities are vastly different.
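A small worked example makes the law concrete. The production rates below are my own hypothetical numbers: the AI out-produces the human in both goods, yet total output still rises when each specialises according to opportunity cost.

```python
# Illustrative numbers (hypothetical): the AI is more productive in BOTH
# goods, yet specialisation still raises total output, because the two
# agents have different opportunity costs.

AI_FOOD, AI_TOOLS = 10.0, 20.0      # units produced per hour
HUMAN_FOOD, HUMAN_TOOLS = 1.0, 1.0  # units produced per hour
HOURS = 10.0                        # each agent's time budget

def produce(rate_food, rate_tools, hours_on_food):
    """Split a fixed time budget between food and tools."""
    return (hours_on_food * rate_food, (HOURS - hours_on_food) * rate_tools)

# Autarky: each agent splits its time evenly between the goods.
ai_a = produce(AI_FOOD, AI_TOOLS, 5.0)        # (50 food, 100 tools)
hu_a = produce(HUMAN_FOOD, HUMAN_TOOLS, 5.0)  # (5 food, 5 tools)
autarky = (ai_a[0] + hu_a[0], ai_a[1] + hu_a[1])

# Specialisation: the human, whose opportunity cost in food is lower,
# makes only food; the AI shifts hours toward tools while keeping the
# combined food supply unchanged.
hu_s = produce(HUMAN_FOOD, HUMAN_TOOLS, 10.0)  # (10 food, 0 tools)
ai_s = produce(AI_FOOD, AI_TOOLS, 4.5)         # (45 food, 110 tools)
combined = (ai_s[0] + hu_s[0], ai_s[1] + hu_s[1])

print("autarky:", autarky)        # (55.0, 105.0)
print("specialised:", combined)   # (55.0, 110.0)
```

Same total food, five extra tools: the surplus can be divided by trade so that both parties end up better off, even though the AI was vastly more capable in every commodity.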
External Approaches
• Seek to reward friendliness and punish unfriendliness
• Relevant for the soft takeoff scenarios – AIs that have "grown up" within a human culture are more likely to absorb its ethics and values, and to have tight economic connections
• Defection is profitable only as long as there are no further interactions in which it can be made unprofitable
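The defection point is the standard lesson of repeated games. As a sketch (my own illustration, using the textbook iterated prisoner's dilemma rather than anything from the talk): a one-shot defector profits, but against a retaliating partner over many rounds it falls behind mutual cooperation.

```python
# Sketch (not from the talk): an iterated prisoner's dilemma shows why
# defection stops paying once future interactions can punish it.

PAYOFF = {  # (my_move, their_move) -> my payoff; standard PD values
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def play(strategy_a, strategy_b, rounds):
    """Play `rounds` rounds; each strategy sees the opponent's history."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        ma, mb = strategy_a(hist_b), strategy_b(hist_a)
        score_a += PAYOFF[(ma, mb)]
        score_b += PAYOFF[(mb, ma)]
        hist_a.append(ma)
        hist_b.append(mb)
    return score_a, score_b

def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

print(play(always_defect, tit_for_tat, 1))   # (5, 0): one-shot defection wins
print(play(always_defect, tit_for_tat, 50))  # (54, 49): punished thereafter
print(play(tit_for_tat, tit_for_tat, 50))    # (150, 150): cooperation wins out
```

The defector's 50-round score (54) is far below what mutual cooperation earns (150): with enough repeated interaction, an external framework of retaliation makes defection unprofitable without any internal constraint on the defector.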
A Combination Approach
– Guidelines for AI development
• will be useful for selling AI in any case
– Good rearing?
– Make sure we set up a legal and economic framework where friendly AIs prosper and unfriendly ones are inhibited
– This will not guarantee friendliness, any more than current systems of upbringing, education and law guarantee it.