Salary range: 80 million to 120 million
About You
- You have evolved from senior software development into NLP/ML, bringing the architectural thinking and code craftsmanship needed to tackle the complex memory persistence, multi-tier retrieval, and real-time personalization challenges that separate production systems from proofs of concept.
- You champion frequent deployments. You believe in shipping code regularly and improving through real-world feedback rather than seeking perfection.
- You solve problems creatively. You tackle challenges with a can-do attitude, recognizing when to build new features and when to strengthen foundations.
Key Responsibilities
- Design and implement the core memory subsystem architecture, including storage tiers, retrieval pipelines, and privacy-aware filtering mechanisms
- Build evaluation frameworks to measure memory retrieval quality, relevance scoring, and augmentation effectiveness with clear metrics and benchmarks
- Implement sophisticated memory extraction and classification systems that identify, rank, and surface relevant user context from chat histories
- Research and adapt cutting-edge memory architectures from academic papers and open-source implementations (e.g., MemGPT, Reflexion, cognitive architectures)
- Architect the memory update pipeline that learns from user interactions, including relevance feedback, correction mechanisms, and adaptive privacy classification
- Design hybrid search strategies that combine semantic embeddings with structured metadata, graph relationships, and temporal patterns for optimal memory retrieval (a minimal sketch follows this list)
- Collaborate with product management to define technical constraints and assess the feasibility of memory system features
- Take ownership of production reliability for AI features, including being on call for prompt failures, extraction issues, and integration breakdowns
- Apply a working understanding of AI system failure modes and fallback strategies, knowing when to degrade gracefully, retry, or escalate to human review
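
To make the memory retrieval responsibilities above concrete, here is a minimal, self-contained sketch of how a hybrid scorer might combine dense-vector similarity with structured-metadata filtering and a temporal decay. Everything in it is illustrative: MemoryItem, retrieve, half_life_days, and the 0.8/0.2 weighting are hypothetical names and values, and a production system would push the vector math and filtering down into a vector database rather than looping in Python.

```python
# Illustrative sketch only: hypothetical types and weights, no real vector DB.
from dataclasses import dataclass, field
from datetime import datetime, timezone
import math


@dataclass
class MemoryItem:
    text: str
    embedding: list[float]                       # dense vector from any embedding model
    tags: set[str] = field(default_factory=set)  # structured metadata (e.g. "preferences")
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec: list[float], required_tags: set[str],
             memories: list[MemoryItem], k: int = 5,
             half_life_days: float = 30.0) -> list[MemoryItem]:
    """Score memories by semantic similarity, metadata match, and recency."""
    now = datetime.now(timezone.utc)
    scored = []
    for m in memories:
        if required_tags and not required_tags.issubset(m.tags):
            continue                                     # structured-metadata filter
        age_days = (now - m.created_at).total_seconds() / 86400
        recency = 0.5 ** (age_days / half_life_days)     # temporal decay
        score = 0.8 * cosine(query_vec, m.embedding) + 0.2 * recency
        scored.append((score, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # sort on score only
    return [m for _, m in scored[:k]]
```

The shape is the point, not the storage: replacing the in-memory loop with a vector-database query keeps the same semantic-plus-metadata-plus-recency structure.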
Requirements (Must-haves)
- 5+ years total experience, with at least 3 years in software engineering and 2 years building ML/NLP systems
- Hands-on experience building LLM-powered systems with a focus on context management, memory persistence, or personalization features
- Production experience with vector databases (Pinecone, Weaviate, Chroma) and hybrid search systems combining semantic and keyword search
- Proven experience shipping LLM-powered features to production, with an understanding of prompt engineering, context window management, and API reliability patterns
- Hands-on experience with LLM orchestration frameworks (LangChain, LangGraph, Mem0, MemGPT, Letta) for building complex AI workflows
- Demonstrated ability to implement hybrid search systems combining dense vectors, sparse indices, and metadata filtering
- Strong debugging skills for RAG pipelines, prompt reliability, extraction accuracy, and function calling/MCP integrations
- Experience with agentic architectures, including tool calling, MCP (Model Context Protocol), and function orchestration for database access and external integrations
- Experience with privacy-preserving ML techniques or building systems with strict data isolation requirements
- Expert Python programming, plus working knowledge of JavaScript/TypeScript for API integration and JSON schema design for LLM tool definitions (see the example after this list)
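
As an illustration of the JSON schema work in the last requirement, here is a hypothetical tool definition for a memory-search function, written as a Python dict. It is shaped loosely after the JSON-schema-based function-calling format common across model provider APIs; the exact envelope and field names vary by provider, and the tool name and parameters below are invented for illustration.

```python
# Hypothetical tool definition; provider-specific wrappers differ.
SEARCH_MEMORY_TOOL = {
    "name": "search_memory",
    "description": "Retrieve stored user memories relevant to the current turn.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Natural-language description of what to recall.",
            },
            "tags": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional metadata tags to filter on.",
            },
            "max_results": {
                "type": "integer",
                "minimum": 1,
                "maximum": 20,
                "default": 5,
            },
        },
        "required": ["query"],
        "additionalProperties": False,
    },
}
```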
Nice-to-Have Skills
- Familiarity with underlying technologies such as transformer networks and attention mechanisms, and with how they enable models to generate coherent responses, make function calls, and handle other language tasks
- Experience with graph databases or knowledge graphs for representing user memory relationships
- Familiarity with testing LLM applications: prompt regression tests, evaluation datasets, and A/B testing for AI features (see the sketch after this list)
- Experience designing LLM-friendly APIs with structured outputs, streaming responses, and token usage optimization
- Experience with experiment tracking and model versioning for memory/retrieval systems
- Background in recommendation systems or personalization engines
- A Master’s or PhD in physics, biology, computer science, electrical engineering, or a related field is helpful but not required
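
For the prompt regression testing mentioned in the nice-to-haves, a minimal pytest sketch might look like the following. extract_user_facts is a stub standing in for a real extraction call, and the cases are invented; the idea is that each case pins behaviour that previously worked, so a prompt or model change that breaks it fails in CI instead of reaching users.

```python
# Illustrative regression-test sketch; the extractor below is a stub.
import pytest


def extract_user_facts(text: str) -> list[str]:
    """Stub standing in for the real LLM-backed extraction call."""
    facts = []
    if "vegetarian" in text.lower():
        facts.append("diet: vegetarian")
    return facts


REGRESSION_CASES = [
    ("I've been vegetarian for ten years.", "diet: vegetarian"),
    ("By the way, I'm vegetarian.", "diet: vegetarian"),
]


@pytest.mark.parametrize("utterance,expected_fact", REGRESSION_CASES)
def test_extraction_regressions(utterance: str, expected_fact: str) -> None:
    # Each case pins previously working behaviour for the extraction prompt.
    assert expected_fact in extract_user_facts(utterance)
```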
Tech Stack & Tools
Core
- Python – Primary language for ML/AI development
- FastAPI – For building high-performance async APIs
- Git/GitHub – Version control and collaboration workflow
- GitHub Actions – CI/CD automation pipelines
- JSON – Standard for structured data exchange and API contracts
Databases
- Vector Databases – Pinecone, Weaviate, Chroma, Qdrant
- PostgreSQL – Primary relational database for structured data
- MongoDB – Document store for flexible, user-defined memory structures
AI/ML Frameworks
- LLM Orchestration – LangChain, LangGraph, LlamaIndex, Haystack
- Model Provider APIs – OpenAI, Anthropic, Cohere, and others
Nice to Have
- Elasticsearch – Full-text search and log-based memory indexing
- Docker – For local development and containerized deployment