Public Lecture: AI Safety, Watermarking, and Neurocryptography by Scott Aaronson
Jul 3, 2024
Introduction
Speaker: Scott Aaronson
Host: Andre Asashi, Chair of the Department of Mathematics
Event: First public lecture with an invited speaker since the COVID-19 pandemic
Topic: Neurocryptography, the interface between AI and cryptography
About Scott Aaronson
Theoretical computer scientist specializing in Quantum Computing
PhD from Berkeley (2004)
Former Assistant/Associate Professor at MIT EECS
Currently a Professor at UT Austin
Working on theoretical foundations of AI safety at OpenAI
Awards: NSF Waterman Award, ACM Prize in Computing, Simons Investigator
Overview of Talk
Shifted from Quantum Computing to AI safety
Collaboration with OpenAI on AI safety
Unexpected rapid advancement in AI capabilities (e.g., ChatGPT)
Rising interest in AI alignment and safety
Five future scenarios for AI's impact: AI Fizzle, Futurama, AI Dystopia, Singularity, AI Apocalypse
Near-term AI safety focuses on practical problems like watermarking and cryptographic backdoors
AI Safety and Alignment
AI alignment: Ensuring AI actions align with human values and interests
AI ethics vs AI alignment: Different focal points but ultimately related
Reform AI alignment: Addressing practical, near-term issues with AI safety
Neurocryptography
Definition: Integrating cryptographic functionality into or alongside neural networks
Applications:
Watermarking: Recognizing AI-generated content
Cryptographic Backdoors: Secret commands for controlling AI
Preserving privacy and protecting copyright
AI's capability to break CAPTCHAs
Watermarking in AI
Importance: Recognize AI-generated text to prevent misuse (e.g., academic cheating, misinformation)
Techniques:
Invisible to an ordinary reader, but statistically detectable by anyone who knows the secret key
Inserted at generation time by steering the pseudorandom choices used to sample each token (a minimal sketch follows this section)
Effective without degrading text quality
Problems: Vulnerable to attacks such as translating the text or inserting dummy words
Future approaches: Watermarking at the semantic level, similar to tree-ring watermarking in images
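As a rough illustration of the sampling-time idea above, here is a minimal Python sketch in the spirit of the scheme Aaronson has described publicly: each candidate token t gets a keyed pseudorandom value r_t, and the sampler picks the token maximizing r_t ** (1 / p_t), which reproduces the model's distribution on average while leaving a bias only the key holder can test for. The function names and the hash-based PRF are illustrative assumptions, not code from the lecture.

```python
import hashlib

def prf(key: bytes, context: tuple, token_id: int) -> float:
    # Keyed pseudorandom value in (0, 1), derived from the recent context
    # and a candidate token. Sketch only: a real system would use a
    # proper PRF rather than a bare hash.
    digest = hashlib.sha256(key + repr((context, token_id)).encode()).digest()
    return (int.from_bytes(digest[:8], "big") + 0.5) / 2.0**64

def watermarked_sample(probs: dict, context: tuple, key: bytes) -> int:
    # Pick the token t maximizing r_t ** (1 / p_t). Averaged over the PRF,
    # this matches sampling from probs exactly, but the chosen tokens have
    # systematically large r_t, a bias only the key holder can detect.
    candidates = {t: p for t, p in probs.items() if p > 0}
    return max(candidates, key=lambda t: prf(key, context, t) ** (1.0 / candidates[t]))
```

Because the marginal distribution of the output is unchanged, this kind of scheme does not degrade text quality, matching the point above.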
Cryptographic Backdoors
Concept: Insert secret, cryptographically keyed commands that trigger specific behaviors (e.g., an off-switch); a toy sketch follows below
Challenges: Creating unremovable backdoors; ensuring safety if AI modifies itself
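To make the secret-command idea concrete, here is a toy sketch using an HMAC tag so that only the key holder can issue a valid trigger. The function name and the "#"-separated message format are invented for illustration; the lecture's open problem is the hard part, namely embedding such a check in the model's weights so it cannot be found or removed.

```python
import hmac
import hashlib

SECRET_KEY = b"operator-held key"  # hypothetical; held only by the operator

def is_shutdown_trigger(message: str) -> bool:
    # Accept only messages of the form "<payload>#<hex tag>" whose tag is a
    # valid HMAC of the payload under the secret key. Without the key,
    # forging a valid trigger is as hard as breaking the MAC.
    parts = message.rsplit("#", 1)
    if len(parts) != 2:
        return False
    payload, tag_hex = parts
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag_hex, expected)
```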
Key Points and Discussions
AI's Capability Growth: Rapid improvements leading to ethical and safety challenges
Reliability of Watermarking: Effective with a large number of tokens; robust against simple attacks (see the detection sketch after this list)
Regulation and Coordination: Necessary involvement and alignment among AI companies
Various AI Misuse Scenarios: From academic cheating to generating tailored phishing attacks
Potential Solutions: Combining watermarking with discriminator models, cryptographic databases
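Continuing the sampling sketch above, detection reduces to a statistical test. Assuming the generator keyed its PRF on the previous few tokens (a four-token window here, purely an assumption), a detector holding the key recomputes each r_t and checks whether the values are suspiciously large: for ordinary text, -ln(1 - r_t) is exponentially distributed with mean 1, so the z-score grows like sqrt(n), which is why reliable detection needs many tokens.

```python
import hashlib
import math

def prf(key: bytes, context: tuple, token_id: int) -> float:
    # Same keyed pseudorandom value as in the sampling sketch above.
    digest = hashlib.sha256(key + repr((context, token_id)).encode()).digest()
    return (int.from_bytes(digest[:8], "big") + 0.5) / 2.0**64

def detection_z_score(token_ids: list, key: bytes) -> float:
    # Sum -ln(1 - r_t) over the emitted tokens, recomputing r_t with the
    # same PRF and context window the (assumed) generator used. For
    # unwatermarked text each term has mean 1 and variance 1, so the sum
    # is about n +/- sqrt(n); watermarked sampling inflates it.
    n = len(token_ids)
    if n == 0:
        return 0.0
    score = 0.0
    for i, t in enumerate(token_ids):
        context = tuple(token_ids[max(0, i - 4):i])  # assumed 4-token window
        score += -math.log(1.0 - prf(key, context, t))
    return (score - n) / math.sqrt(n)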
Conclusion
Neurocryptography's potential impact on AI safety
Combining modern cryptographic techniques with AI to solve emerging problems
Open questions remain around monitoring and controlling AI-generated content
Q&A Highlights
False Positives with Watermarking: Needs many tokens to significantly lower the false-positive rate (a rough calculation follows below)
Verification and Regulation: Government regulation essential; how to implement it is still debated
Future AI Capabilities: Unpredictability and potential existential risks
Role of Discriminator Models: Combined with watermarking for flexibility and accuracy
Impacts on Education: AI's role in applications like college essays; ethical trade-offs
Continuing Development: The necessity for empirical studies and testing to address evolving challenges
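A back-of-the-envelope version of the "needs many tokens" point, with illustrative numbers rather than figures from the lecture: under a normal approximation, a z-score threshold of about 4.75 corresponds to a roughly one-in-a-million false-positive rate, so the required token count scales as the square of (threshold / per-token signal).

```python
import math

def tokens_needed(per_token_lift: float, z_target: float = 4.75) -> int:
    # z_target ~ 4.75 gives a ~1e-6 one-sided false-positive rate under a
    # normal approximation; per_token_lift is the average amount the
    # watermark adds to the detection score per token (illustrative).
    return math.ceil((z_target / per_token_lift) ** 2)

print(tokens_needed(0.3))  # a 0.3-per-token lift needs roughly 251 tokens
```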