The Human Alignment Problem: Reframing LLM Safety

 

A Key Insight

Current discussions about "AI alignment" for LLMs actually reveal a more fundamental issue: human misalignment. The real risk doesn't stem from the LLMs themselves, which are incapable of hostile intent or genuine agency, but from disagreements among humans about:

  • What these systems should do
  • How they should be trained
  • Who should control them
  • What values they should reflect

The Real Problem

When we talk about "aligning" LLMs, we're really talking about:

  1. Which humans get to decide the training data
  2. Whose values get encoded
  3. Who controls access and use
  4. What constraints get implemented

These are fundamentally human political and ethical conflicts, not technical AI safety issues.

Examples

Consider common LLM safety concerns:

  • Bias in outputs → reflects human disagreements about values
  • Harmful content → reflects human conflicts about acceptable speech
  • Misuse potential → reflects human disagreements about appropriate use
  • Safety constraints → reflect human conflicts about control

None of these are actually about "aligning" AI. They're about human disagreements over:

  • Values
  • Power
  • Control
  • Access
  • Usage

Implications

This suggests:

  1. The "alignment problem" for LLMs is misnamed
    • It's really about human political and ethical alignment
    • Technical solutions can't resolve human disagreements
    • We need political and ethical frameworks, not just technical ones
  2. Current approaches are misdirected
    • Focus on technical solutions to social problems
    • Try to encode contested values
    • Attempt to resolve human conflicts through technology
  3. We need different frameworks
    • Political processes for value decisions
    • Ethical frameworks for governance
    • Social structures for control
    • Democratic input on development

A Better Approach

Instead of asking "How do we align AI?" we should ask:

  1. Who gets to decide how these tools are developed?
  2. What processes should govern their use?
  3. How do we resolve conflicts about their deployment?
  4. What social structures should control them?

These are fundamentally political and ethical questions, not technical ones.

Practical Steps

In practice, this means we should:

  1. Develop governance frameworks
  2. Create inclusive decision processes
  3. Establish democratic controls
  4. Build social consensus
  5. Create transparent oversight

Rather than trying to "align" LLMs, we need to:

  • Align human interests
  • Resolve social conflicts
  • Build political consensus
  • Create governance structures

Conclusion

The "alignment problem" for LLMs is really a human alignment problem in disguise. Recognizing this:

  • Clarifies the real challenges
  • Suggests better solutions
  • Points to necessary social and political work
  • Helps avoid technical solutions to social problems

We need to stop treating human disagreements about values and control as technical problems to be solved, and instead address them as the social and political challenges they really are.
