[Remote] AI Safety Research Intern (PhD)

Remote, USA Full-time Posted 2025-11-24

Note: The job is a remote job and is open to candidates in USA. Centific is focused on advancing AI safety and responsible AI development. As a Ph.D. Research Intern, you will conduct high-impact experiments and contribute to the security guarantees of AI systems through innovative research and practical implementations.

Responsibilities

Advance AI Safety: Design, implement, and evaluate attack and defense strategies for LLM jailbreaks (prompt injection, obfuscation, narrative red teaming)
Evaluate AI Behavior: Analyze and simulate human-AI interaction patterns to uncover behavioral vulnerabilities, social engineering risks, and over-defensive vs. permissive response tradeoffs
Agentic AI Security: Prototype workflows for multi-agent safety (e.g., agent self-checks, regulatory compliance, defense chains) that span perception, reasoning, and action
Benchmark & Harden LLMs: Create reproducible evaluation protocols/KPIs for safety, over-defensiveness, adversarial resilience, and defense effectiveness across diverse models (including latest benchmarks and real-world exploit scenarios)
Deploy and Monitor: Package research into robust, monitorable AI services using modern stacks (Kubernetes, Docker, Ray, FastAPI); integrate safety telemetry, anomaly detection, and continuous red-teaming
Jailbreaking Analysis: Systematically red-team advanced LLMs (GPT-4o, GPT-5, LLaMA, Mistral, Gemma, etc.), uncovering novel exploits and defense gaps
Multi-turn Obfuscation Defense: Implement context-aware, multi-turn attack detection and guardrail mechanisms, including countermeasures for obfuscated prompts (e.g., StringJoin, narrative exploits)
Agent Self-Regulation: Develop agentic architectures for autonomous self-check and self-correct, minimizing risk in complex, multi-agent environments
Human-Centered Safety: Study human behavior models in adversarial contexts—how users probe, trick, or manipulate LLMs, and how defenses can adapt without excessive over-defensiveness

Skills

Ph.D. student in CS/EE/ML/Security (or related); actively publishing in AI Safety, NLP robustness, or adversarial ML (ACL, NeurIPS, BlackHat, IEEE S&P, etc.)
Strong Python and PyTorch/JAX skills; comfort with toolkits for language models, benchmarking, and simulation
Demonstrated research in at least one of: LLM jailbreak attacks/defense, agentic AI safety, human-AI interaction vulnerabilities
Proven ability to go from concept → code → experiment → result, with rigorous tracking and ablation studies
Experience in adversarial prompt engineering, jailbreak detection (narrative, obfuscated, sequential attacks)
Prior work on multi-agent architectures or robust defense strategies for LLMs
Familiarity with red-teaming, synthetic behavioral data, and regulatory safety standards
Scalable training and deployment: Ray, distributed evaluation, CI/telemetry for defense protocols
Public code artifacts (GitHub) and first-author publications or strong open-source impact

Benefits

Comprehensive healthcare, dental, and vision coverage
401k plan
Paid time off (PTO)
And more!

Company Overview

Zero distance innovation for GenAI creators and industries Expertly engineering platforms and curating multimodal, multilingual data, we empower the ‘Magnificent Seven’ and enterprise clients with safe, scalable AI deployment We a team of over 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. It was founded in 2020, and is headquartered in Redmond, Washington, USA, with a workforce of 5001-10000 employees. Its website is https://www.centific.com.

Company H1B Sponsorship

Centific has a track record of offering H1B sponsorships, with 10 in 2025, 22 in 2024, 14 in 2023. Please note that this does not guarantee sponsorship for this specific role.

Apply To This Job

Apply Now

[Remote] AI Safety Research Intern (PhD)

Similar Jobs

Seasonal Sales Associate-282 Southeast Richmond...

Experienced Customer Support Remote Representative – Delivering Magical Experiences to arenaflex Enthusiasts from the Comfort of Your Own Home

Remote Care Manager - RN 3 Locations

Experienced Customer Service Representative – Pet Industry Expert (Remote in Florida)

Intelligence Analyst – RFI Triage (Remote, East...

Business Development Director, Commercial Enter...

Data Entry Remote Jobs-JetBlue Airline At Home ...

[Hiring] Temporary Team Lead @TTEC

Senior Data Scientist - Revenue Intelligence

Delivery Director - US-Based

Experienced Customer Service Representative for Healthcare Claims and Benefits – Remote Opportunity in Nebraska

Resort Sales Associate, Disney Central (Orlando, FL) - Full Time

Scheduling Assistant - Entry Level (Remote)

Associate Attorney - AZ

Mid Market Account Executive | Remote within Las Vegas, NV. OR...

Experienced Virtual Data Entry Clerk – Entry Level No Experience Required for Part-Time Remote Role with Comprehensive Benefits and Growth Opportunities at Blithequark

Want Customer Service Agent - Remote/Hybrid in Cedar Falls, IA

Experienced Financial Analyst – Planning and Analysis for Canada - Driving Merchant Discount Revenue Growth at Blithequark

Temporary Admininstrative Assistant - Front Desk Operations - UT Online High School - (UTEMPS)

[PART_TIME Remote] Part-Time Bookkeeper (Atlanta Preferred)

[Remote] AI Safety Research Intern (PhD)

Similar Jobs

Seasonal Sales Associate-282 Southeast Richmond...

Experienced Customer Support Remote Representative – Delivering Magical Experiences to arenaflex Enthusiasts from the Comfort of Your Own Home

Remote Care Manager - RN 3 Locations

**Experienced Customer Service Representative – Pet Industry Expert (Remote in Florida)**

Intelligence Analyst – RFI Triage (Remote, East...

Business Development Director, Commercial Enter...

Data Entry Remote Jobs-JetBlue Airline At Home ...

[Hiring] Temporary Team Lead @TTEC

Senior Data Scientist - Revenue Intelligence

Delivery Director - US-Based

Experienced Customer Service Representative for Healthcare Claims and Benefits – Remote Opportunity in Nebraska

Resort Sales Associate, Disney Central (Orlando, FL) - Full Time

Scheduling Assistant - Entry Level (Remote)

Associate Attorney - AZ

Mid Market Account Executive | Remote within Las Vegas, NV. OR...

Experienced Virtual Data Entry Clerk – Entry Level No Experience Required for Part-Time Remote Role with Comprehensive Benefits and Growth Opportunities at Blithequark

Want Customer Service Agent - Remote/Hybrid in Cedar Falls, IA

Experienced Financial Analyst – Planning and Analysis for Canada - Driving Merchant Discount Revenue Growth at Blithequark

Temporary Admininstrative Assistant - Front Desk Operations - UT Online High School - (UTEMPS)

[PART_TIME Remote] Part-Time Bookkeeper (Atlanta Preferred)

Experienced Customer Service Representative – Pet Industry Expert (Remote in Florida)