[Remote] AI Safety Research Intern (PhD)

Remote, USA Full-time Posted 2025-11-24

Note: This is a remote position open to candidates in the USA. Centific is focused on advancing AI safety and responsible AI development. As a Ph.D. Research Intern, you will conduct high-impact experiments and contribute to the security guarantees of AI systems through innovative research and practical implementations.


Responsibilities

  • Advance AI Safety: Design, implement, and evaluate attack and defense strategies for LLM jailbreaks (prompt injection, obfuscation, narrative red teaming)
  • Evaluate AI Behavior: Analyze and simulate human-AI interaction patterns to uncover behavioral vulnerabilities, social engineering risks, and over-defensive vs. permissive response tradeoffs
  • Agentic AI Security: Prototype workflows for multi-agent safety (e.g., agent self-checks, regulatory compliance, defense chains) that span perception, reasoning, and action
  • Benchmark & Harden LLMs: Create reproducible evaluation protocols/KPIs for safety, over-defensiveness, adversarial resilience, and defense effectiveness across diverse models (including latest benchmarks and real-world exploit scenarios)
  • Deploy and Monitor: Package research into robust, monitorable AI services using modern stacks (Kubernetes, Docker, Ray, FastAPI); integrate safety telemetry, anomaly detection, and continuous red-teaming
  • Jailbreaking Analysis: Systematically red-team advanced LLMs (GPT-4o, GPT-5, LLaMA, Mistral, Gemma, etc.), uncovering novel exploits and defense gaps
  • Multi-turn Obfuscation Defense: Implement context-aware, multi-turn attack detection and guardrail mechanisms, including countermeasures for obfuscated prompts (e.g., StringJoin, narrative exploits)
  • Agent Self-Regulation: Develop agentic architectures for autonomous self-checking and self-correction, minimizing risk in complex, multi-agent environments
  • Human-Centered Safety: Study human behavior models in adversarial contexts—how users probe, trick, or manipulate LLMs, and how defenses can adapt without excessive over-defensiveness

Skills

  • Ph.D. student in CS/EE/ML/Security (or related); actively publishing in AI Safety, NLP robustness, or adversarial ML (ACL, NeurIPS, BlackHat, IEEE S&P, etc.)
  • Strong Python and PyTorch/JAX skills; comfort with toolkits for language models, benchmarking, and simulation
  • Demonstrated research in at least one of: LLM jailbreak attacks/defense, agentic AI safety, human-AI interaction vulnerabilities
  • Proven ability to go from concept → code → experiment → result, with rigorous tracking and ablation studies
  • Experience in adversarial prompt engineering and jailbreak detection (narrative, obfuscated, sequential attacks)
  • Prior work on multi-agent architectures or robust defense strategies for LLMs
  • Familiarity with red-teaming, synthetic behavioral data, and regulatory safety standards
  • Scalable training and deployment: Ray, distributed evaluation, CI/telemetry for defense protocols
  • Public code artifacts (GitHub) and first-author publications or strong open-source impact

Benefits

  • Comprehensive healthcare, dental, and vision coverage
  • 401k plan
  • Paid time off (PTO)
  • And more!

Company Overview

  • Zero distance innovation for GenAI creators and industries. Expertly engineering platforms and curating multimodal, multilingual data, we empower the ‘Magnificent Seven’ and enterprise clients with safe, scalable AI deployment. We are a team of over 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. Centific was founded in 2020 and is headquartered in Redmond, Washington, USA, with a workforce of 5001-10000 employees. Its website is https://www.centific.com.

Company H1B Sponsorship

  • Centific has a track record of offering H1B sponsorships, with 10 in 2025, 22 in 2024, and 14 in 2023. Please note that this does not guarantee sponsorship for this specific role.
