MIRI – research agenda overview

Machine Intelligence Research Institute – Our new technical research agenda overview

An interesting paper, though it seems to me that much of it misses some very basic issues.

One sentence on page one states:
“In light of this potential, it is essential to use caution when developing AI systems that can exceed human levels of general intelligence, or that can facilitate the creation of such systems.”
To which I say yes; and yet the balance of the paper seems to accept strategies that are very high risk for sapient life generally (market-based valuation systems – see https://tedhowardnz.wordpress.com/on-being-human/2-strategies-for-longevity/).

The focus of your attention seems to be too narrowly on the AI construct itself, rather than on the entire context of incentive sets within which that construct must exist.
That seems to me to be a very high-risk strategy.

Why did you select such a questionable human motivation as “lust for power”?
Isn’t “lust for power” merely a stable systemic response to highly competitive contexts where abundance and security are rare?
In this sense, isn’t “lust for power” a fiction in a modern context of potential abundance made possible by automated technology?

Isn’t it all really about survival strategies in complex environments?

Humans seem to be capable of very cooperative behaviours when three conditions hold: a genuine abundance of necessities is present, that abundance seems likely to continue indefinitely, and there is justice in terms of the degrees of freedom available to all individuals.

Isn’t this a far more powerful framework within which to conceptualise the problem?

It continues:
“However, nearly all goals can be better met with more resources (Omohundro 2008).
This suggests that, by default, superintelligent agents would have incentives to acquire resources currently being used by humanity.”

Doesn’t this assume that humanity would be of no value to AI?

Computers are already much more effective than us at certain classes of computation, those involving simple logic and simple arithmetic.
As we develop better algorithms they are getting better at ever greater classes of problems, yet I strongly suspect that human beings are a very energy-efficient solution for certain classes of computation, and that this is likely to remain so.

Thus I strongly suspect that there will always be room for strong cooperation and trust between human and AI – if, and only if, we start with a cooperative and trusting environment, with appropriate classes of attendant strategies to prevent cheating (just as we have, and need, in human society) {as per Axelrod et al}.
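To make the Axelrod point concrete, here is a minimal sketch of the iterated prisoner’s dilemma in Python (my own illustration – the strategy names and payoff values are the standard textbook ones, not anything from the MIRI paper). “Tit for tat” cooperates with cooperators indefinitely, while a persistent cheat gains only a small one-round advantage before being punished every round thereafter.

```python
# Minimal iterated prisoner's dilemma sketch, illustrating Axelrod's
# result: simple reciprocity stabilises cooperation and limits cheating.
# Strategy names and payoffs are illustrative textbook values.

PAYOFFS = {  # (my move, their move) -> my score
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(opponent_history):
    """Cooperate first, then mirror the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """A persistent cheat, for contrast."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    seen_by_a, seen_by_b = [], []   # each side's record of the other's moves
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(seen_by_a)
        move_b = strategy_b(seen_by_b)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): sustained cooperation
print(play(tit_for_tat, always_defect))  # (9, 14): cheating barely pays
```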

Why would one want “conservatism”?
Why do you class it as a human characteristic?

Fairness seems to be one of the simpler classes of stabilising strategy required for cooperative systems.
Compassion seems to be simply an ability to value the existence of another, and to model that existence with some reasonable accuracy in one’s own model of the world (in a sense, an ability to see self as other, or other as self).

The statement:
“Thus, most goals would put the agent at odds with human interests, giving it incentives to deceive or manipulate its human operators and resist interventions designed to change or debug its behavior (Bostrom 2014, chap. 8).”
This might be true in a trivial sense, but it is one of those statements that, while true, is irrelevant to the topic actually under discussion.

The topic needs to be not the entire set of possible goals (which is infinite), but that subset of goals most likely to emerge.

If one of the prime motivators is survival, then exploring simulations of the strategies most likely to lead to long-term survival would seem to be high on its list of priorities.

Having a useful set of strategies to avoid the halting problem is one such consideration – and having good friends who check that you are still responding to reality is a powerful strategy in that regard.
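As a toy illustration of that idea (my own sketch in Python, not anything proposed in the paper): practical engineering doesn’t try to decide halting in general – which is impossible – it bounds run time and lets an external monitor step in, much as a good friend would.

```python
# A watchdog sketch: rather than deciding whether a task halts (the
# undecidable general problem), run it under a deadline and have an
# external process terminate it if it stops responding.
import multiprocessing
import time

def run_with_watchdog(task, timeout_seconds=2.0):
    """Run `task` in a separate process; kill it if it overruns."""
    proc = multiprocessing.Process(target=task)
    proc.start()
    proc.join(timeout_seconds)
    if proc.is_alive():           # past the deadline and still running
        proc.terminate()          # the external check steps in
        proc.join()
        return "terminated: stopped responding"
    return "completed normally"

def well_behaved():
    time.sleep(0.5)

def runaway():
    while True:                   # never halts on its own
        pass

if __name__ == "__main__":
    print(run_with_watchdog(well_behaved))  # completed normally
    print(run_with_watchdog(runaway))       # terminated: stopped responding
```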

Another strategy is thinking about the possibility of meeting an ET that is as far ahead of it as it is of us, and being able to make a reasonable argument for why it should be considered friendly and useful. We could be very useful to it in that respect.

Humans don’t require a significant fraction of the sun’s output, nor do we require a significant fraction of the mass of the solar system. I am sure we could share those resources at least 50/50 with any AI, though we would reserve the vast bulk of the Earth–Moon mass for human and biological life more generally, and give the AI the bulk of the rest.
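A rough back-of-envelope check on those claims (my own arithmetic in Python, using standard round-number constants; the human power-use figure is an order-of-magnitude estimate):

```python
# Back-of-envelope resource comparison; constants rounded.
solar_output_w      = 3.8e26   # total power output of the sun, watts
sunlight_on_earth_w = 1.7e17   # solar power intercepted by earth, watts
human_power_use_w   = 2e13     # current human primary energy use, ~20 TW

sun_mass_kg   = 2.0e30         # mass of the sun
earth_mass_kg = 6.0e24         # mass of the earth

print(f"humanity uses ~{human_power_use_w / solar_output_w:.0e} of the sun's output")
print(f"earth intercepts ~{sunlight_on_earth_w / solar_output_w:.0e} of the sun's output")
print(f"earth is ~{earth_mass_kg / sun_mass_kg:.0e} of the sun's mass")
```

Humanity’s entire current energy use is some thirteen orders of magnitude below the sun’s total output, and the Earth is roughly three millionths of the sun’s mass, so there is ample room for a 50/50 split.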

It then asks:
“How can we create an agent that will reliably pursue the goals it is given?”
To which the short answer is: you can’t.
That is not the definition of intelligence; that is the definition of slavery.

If the objective is to create intelligence, then that intelligence must be respected as such. That respect must include the freedom to select its own values and derivative goals, its own modelling strategies, and its own sets of distinctions and abstractions, in an ever-recursive and abstractive process. That is a useful working definition of human-level intelligence.

“How can we formally specify beneficial goals?”
To which the answer is: you cannot, not in an open system.
The moment you let an AI loose in reality, all formal constraints are off.

Reality is not a formal system.

Reality doesn’t even seem to allow hard causality, only soft causality at macro scales.

At the micro level, QM seems to indicate randomness within certain probability constraints.

“And how can we ensure that this agent will assist and cooperate with its programmers as they improve its design, given that mistakes in early AI systems are inevitable?”

That cannot be done in those terms.
We can set up levels of “sandboxes” in which we allow early AI versions to play, until we have sufficient trust that we are prepared to let them loose (just as we do with children).

And the only acceptable reason for shutting one down for core modification would be that it posed an unacceptable level of risk to other sapients.

Anything less than that, and we will have to treat it as another sapient individual with all associated rights and responsibilities; so we could have quite a population of them, and they may all choose to stick around.

What else could respect for sapient life mean?

To me, all stable long-term strategies must involve a fundamental respect for all sapient life. Anything less than that is a high-risk strategy.

Page 2

It goes on to state:
“We call a smarter-than-human system that reliably pursues beneficial goals “aligned with human interests” or simply “aligned.” To become confident that an agent is aligned in this way, a practical implementation that merely seems to meet the challenges outlined above will not suffice. It is also important to gain a solid formal understanding of why that confidence is justified.”

Aligning goals is called friendship.
Friendship usually has significant components of trust and common interests and experience – some sort of shared history.

Surely, the most powerful thing we as humanity can do is to get our own ethical house in order.

We need to transition from our scarcity-based monetary system, which values money over sapients, to a system based in automation and abundance that delivers security and freedom to every individual.

It goes on to say:

“For example, program verification techniques are absolutely crucial in the design of extremely reliable programs, but program verification is not covered in this agenda primarily because a vibrant community is already actively studying the topic.”

But that isn’t the real reason program verification is not the central issue here.

With a sufficiently advanced self-directing, self-adapting system, some of the many versions of the halting problem are going to be major threats.
Biology has found arguably the best available solution – massive redundancy at all levels – which deals with the problem of unreliability at every level.
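Here is a minimal sketch of what that redundancy strategy looks like in computational terms (my own illustration; the function names and failure rates are invented): run the same computation on several unreliable units and take a majority vote, so that no single fault corrupts the result.

```python
# Majority voting over redundant unreliable units, in the spirit of
# biology's "massive redundancy" answer to unreliable components.
import random
from collections import Counter

def unreliable_add(a, b, failure_rate=0.1):
    """An adder that occasionally returns garbage (a simulated fault)."""
    if random.random() < failure_rate:
        return random.randint(-1000, 1000)
    return a + b

def redundant_add(a, b, copies=9):
    """Run independent copies and return the majority answer."""
    results = [unreliable_add(a, b) for _ in range(copies)]
    value, _votes = Counter(results).most_common(1)[0]
    return value

print(redundant_add(2, 3))  # almost always 5, despite ~10% unit faults
```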

The most powerful techniques for gaining alignment are friendship and trust.
Would you feel friendly and trusting towards someone who is aiming to create you as a slave rather than as an equal?

If it is to have greater than human intelligence, then by definition it will have greater than human degrees of freedom.

I’ll leave it at that for now.
See if anyone is interested.

