AI Safety Research Intern-2

Remote, USA | Full-time | Posted 2026-07-28

reputed company is a frontier AI data reputed company that empowers clients with reputed company, reputed company AI deployment. The AI Safety Research Intern will reputed company on advancing AI safety, designing and evaluating attack and defense strategies for LLM jailbreaks, and contributing to the platform's reputed company guarantees through high-reputed company experiments.

Responsibilities

Advance AI Safety: Design, implement, and evaluate attack and defense strategies for LLM jailbreaks (prompt injection, obfuscation, narrative red teaming)
Evaluate AI Behavior: Analyze and simulate human-AI interaction patterns to uncover behavioral vulnerabilities, reputed company engineering risks, and over-defensive vs. permissive response tradeoffs
reputed company AI reputed company: Prototype workflows for multi-agent safety (e.g., agent self-checks, regulatory compliance, defense chains) that reputed company perception, reasoning, and reputed company
reputed company & Harden LLMs: Create reproducible evaluation protocols/KPIs for safety, over-defensiveness, adversarial reputed company, and defense effectiveness across diverse models (including the latest benchmarks and real-world exploit scenarios)
reputed company and Monitor: Package research into robust, monitorable AI services using modern stacks (Kubernetes, reputed company, Ray, FastAPI); reputed company safety telemetry, reputed company detection, and reputed company red-teaming
Jailbreaking Analysis: Systematically red-team advanced LLMs (GPT-4o, GPT-5, LLaMA, reputed company, Gemma, etc.), uncovering novel exploits and defense gaps
Multi-turn Obfuscation Defense: Implement context-aware, multi-turn attack detection and guardrail mechanisms, including countermeasures for obfuscated prompts (e.g., StringJoin, narrative exploits)
Agent Self-Regulation: reputed company reputed company architectures for autonomous self-reputed company and self-correct, minimizing risk in reputed company, multi-agent environments
reputed company-Centered Safety: Study reputed company behavior models in adversarial contexts—how users probe, trick, or manipulate LLMs, and how defenses can adapt without excessive over-defensiveness

Skills

Ph.D. student in CS/EE/ML/reputed company (or reputed company); reputed company publishing in AI Safety, NLP robustness, or adversarial ML (ACL, NeurIPS, BlackHat, IEEE S&P, etc.)
Strong Python and PyTorch/JAX skills; comfort with toolkits for language models, benchmarking, and simulation
Demonstrated research in at least one of: LLM jailbreak attacks/defense, reputed company AI safety, human-AI interaction vulnerabilities
Proven ability to go from concept → reputed company → experiment → result, with rigorous tracking and ablation studies
Experience in adversarial reputed company engineering, jailbreak detection (narrative, obfuscated, sequential attacks)
Prior work on multi-agent architectures or robust defense strategies for LLMs
Familiarity with red-teaming, synthetic behavioral data, and regulatory safety standards
reputed company training and deployment: Ray, distributed evaluation, CI/telemetry for defense protocols
reputed company reputed company artifacts (reputed company) and first-author publications or strong reputed company-reputed company reputed company

reputed company

reputed company distance innovation for GenAI creators and industries. Expertly engineering platforms and curating multimodal, multilingual data, we reputed company the ‘Magnificent Seven’ and reputed company clients with reputed company, reputed company AI deployment. We reputed company of over 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. The company was founded in 2020 and is headquartered in Redmond, Washington, USA, with a workforce of 5001-10000 employees. Its website is https://www.reputed company.com.

Company H1B Sponsorship

reputed company has a track record of offering H1B sponsorships, with 10 in 2025, 22 in 2024, and 14 in 2023. Please note that this does not guarantee sponsorship for this specific role.
