Alignment
noun
The process of ensuring an AI system behaves in accordance with human values and intentions, rather than pursuing unintended or harmful goals.
Still, the document is not a silver bullet for solving the so-called alignment problem, which is the tricky task of ensuring AIs conform to human values, even if they become more intelligent than us.— TIME
'This implies that our existing training processes don't prevent models from pretending to be aligned,' Hubinger tells TIME. 'It means that alignment is more difficult than you would have otherwise thought, because you have to somehow get around this problem.'— TIME