Introduction
Our client is seeking an AI architect who designs and deploys production-grade agentic systems and expert chatbots that automate conversations and solve complex business problems at enterprise scale!
Architect sophisticated AI solutions using Amazon Bedrock AgentCore, LangChain, LangGraph, and PyTorch on containerized Kubernetes infrastructure with advanced MLOps practices!
The ideal candidate is an industry-leading AI expert with a proven track record of delivering large-scale enterprise AI solutions. They excel at orchestrating multi-step reasoning and agentic workflows, bring deep expertise in model training and deployment with PyTorch/TensorFlow, and demonstrate mastery of containerization, networking, and performance engineering for ML workloads. They also establish robust MLOps/GenAI Ops practices, including CI/CD and observability, and provide technical leadership while mentoring teams on AI architecture and operational excellence.
✅ Expert AI/ML engineering with agentic systems and Amazon Bedrock
✅ Hybrid and remote working flexibility with 1960 flexible annual hours
✅ AI architecture leadership role with MLOps and enterprise chatbots
POSITION: Contract: 01 July 2026 – 31 December 2028
EXPERIENCE: 8+ years of related experience
COMMENCEMENT: 01 July 2026
LOCATION: Hybrid: Midrand/Menlyn/Rosslyn/Home Office rotation
TEAM: Expert AI Products / Expert Chatbots
Qualifications / Experience
Appropriate academic qualification such as Computer Science, Engineering or Statistics
Demonstrated track record delivering large-scale AI solutions for enterprise customers, including end-to-end ownership of architecture, operations, and stakeholder engagement
Level Criteria
Communication: Negotiate (discussion and compromise; issues are of a short-term operational, medium-term tactical or limited strategic nature)
Delivery: Outcomes are complex and require the integration of several nuanced systems and processes in order to deliver on the required standards. Employee may be expected to provide guidance to lower-level employees
Knowledge: Industry Leader - represents the best practice leader for a product or multiple products in a region or country and will typically represent the product or service on professional boards
Problem Solving: Create a product or system based on international best practice or guidelines from other companies that have implemented the same or similar solutions. The problems are complex and may require complex solutions in single systems
Supervision: Can solve escalated tasks that require a deep understanding of the product or service and for which there is no senior-level authority to defer the task to. Can lead team leaders who each have their own team members
Duties & Responsibilities
Role Requirements
Define and build agentic system architectures that leverage Amazon Bedrock AgentCore and agent frameworks to enable multi-step reasoning and automated workflows
Lead technical strategy for model selection, fine-tuning, and inference, advising on cost vs. performance trade-offs
Design and implement containerized deployment standards using Docker and Kubernetes to ensure consistent, scalable, and fault-tolerant ML operations
Architect secure, low-latency networking for model-to-service and service-to-service communication across private and public networks
Perform systems-level performance engineering: select appropriate compute accelerators, run load and stress tests, and conduct capacity planning for production readiness
Establish and operate MLOps and GenAI Ops practices, including CI/CD pipelines, model versioning, and deployment automation
Implement observability, logging, monitoring, and incident response for production AI systems to ensure operational excellence
Own end-to-end system design for AI workloads: data pipelines, model training, inference, orchestration, and lifecycle management
Integrate foundation models into enterprise RAG and tool-use pipelines, enabling complex, real-world use cases
Provide technical leadership and mentorship to engineers and stakeholders on architecture, best practices, and operational standards
NB:
South African citizens/residents preferred. Valid work permit holders will be considered. By applying, you consent to be added to the database and to receive updates until you unsubscribe. If you do not receive a response within 2 weeks, please consider your application unsuccessful.
#isanqa #DataScientist #Expert #AgenticAI #AmazonBedrock #LangChain #LangGraph #MLOps #PyTorch #TensorFlow #RAG #Kubernetes #GenAI #ITHub #NowHiring #fuelledbypassionintegrityexcellence
Desired Experience & Qualification
Essential Skills Requirements
Proven experience designing and building agentic system architectures using Amazon Bedrock AgentCore and agent frameworks (e.g., LangChain, LangGraph, Strands Agents)
Strong expertise in orchestrating multi-step reasoning, tool invocation, state management, and workflow automation for AI agents
Deep hands-on knowledge of training and deploying models with PyTorch and TensorFlow
Experience defining model strategy, including architecture selection, fine-tuning approaches, inference patterns, and cost/performance trade-offs
Containerization and orchestration skills: Docker and Kubernetes for scalable, fault-tolerant ML/GenAI deployments
Solid understanding of networking for ML workloads, including VPC design, ingress/egress, private and internet-facing communication patterns, and low-latency design
Systems-level performance engineering: selecting CPU/GPU/accelerator hardware, plus experience with load testing, stress testing, and capacity planning for ML systems
MLOps and GenAI Ops experience: CI/CD for models, model versioning, observability, logging, monitoring, and incident response practices
Strong software engineering skills in Python and familiarity with building robust, production-ready APIs and back-end services
Experience integrating foundation models with Retrieval-Augmented Generation (RAG) pipelines, tool use, and agentic workflows for enterprise use cases
Advantageous Skills Requirements
Prior experience working with Amazon Bedrock and other cloud-managed foundation model services
Familiarity with LangChain extensions, LangGraph, Strands Agents or similar orchestration toolkits at scale
Experience with infrastructure-as-code tools (e.g., Terraform, Terragrunt) for reproducible cloud infrastructure
Knowledge of serverless components (Lambda, Step Functions, EventBridge) for orchestration and event-driven workflows
Background in secure cloud architectures, IAM best practices, and security hardening for AI platforms
Experience with data engineering and building reliable ETL/data pipelines for model training and feature stores
Familiarity with observability stacks (Prometheus, Grafana, CloudWatch) and distributed tracing for ML services
Experience optimizing inference costs through batching, quantization, and model distillation techniques
Prior work with enterprise customers in regulated industries (e.g., automotive, pharma, finance) and understanding of compliance considerations
Knowledge of hybrid and multi-cloud deployment patterns for AI workloads
Interested? Feel free to also view our other opportunities > https://www.careers-page.com/isanqa
iSanqa Roles notifications are available on Telegram 🙂 - Please join the group of your choice.
iSanqa IT Roles
iSanqa SAP Roles
iSanqa Pharmaceutical/Medical Roles
iSanqa Finance Roles
iSanqa Manufacturing/Engineering Roles
iSanqa Supply Chain/Procurement Roles
iSanqa Sales/Marketing Roles
As well as 👇
Whatsapp Channel Link: https://whatsapp.com/channel/0029VbCaAEBKrWQyhHxpiV1Y
iSanqa
Sourced from PNet