About the Role
We are looking for a motivated AI Engineer who is passionate about large language models (LLMs), machine learning, and applied AI systems. This position is focused on building real-world AI systems, not just experimenting with models.
You will work on projects involving open-source LLMs, retrieval-augmented generation (RAG) pipelines, vector databases, and AI-powered document processing systems. The goal is to build scalable AI workflows that solve practical problems such as knowledge retrieval, document analysis, and AI-assisted automation.
This role is ideal for engineers who want hands-on experience deploying AI systems in production-like environments.
Responsibilities
AI & Machine Learning Development
- Build and experiment with open-source LLM and small language model (SLM) pipelines.
- Design and implement retrieval-augmented generation (RAG) systems.
- Develop AI pipelines capable of processing documents, PDFs, and structured data.
- Work with embedding models and vector search systems.
- Implement prompt engineering, model evaluation, and response optimization.
- Assist with fine-tuning or adapting open-source models when necessary.
Data Processing & AI Pipelines
- Build ingestion pipelines for PDFs, documents, and datasets.
- Implement document chunking, embedding generation, and indexing strategies.
- Work with vector databases to support semantic search and retrieval.
- Optimize pipelines for latency, scalability, and cost efficiency.
Research & Experimentation
- Evaluate different open-source models and architectures.
- Compare embedding models and retrieval methods.
- Test techniques for improving RAG performance and reducing hallucinations.
- Explore emerging techniques in Vision Language Models (VLMs).
Collaboration
- Work with engineers to integrate AI components into applications.
- Document experiments and technical findings.
- Participate in weekly discussions on AI architecture decisions and improvements.
Required Skills & Knowledge
Core AI Knowledge
- Strong understanding of:
  - Machine Learning
  - Neural Networks
  - Deep Learning fundamentals
- Familiarity with:
  - Large Language Models (LLMs)
  - Small Language Models (SLMs)
Programming
- Strong proficiency in Python
- Experience with AI/ML frameworks such as:
  - PyTorch
  - TensorFlow
  - Hugging Face Transformers
AI Application Frameworks
Experience with:
- LangChain or similar orchestration frameworks
- Prompt engineering and AI workflow building
Vector Databases
Conceptual and practical understanding of at least one vector database, such as:
- Pinecone
- ChromaDB
- Milvus
- Qdrant
- FAISS
- Weaviate
Understanding of:
- Embeddings
- Similarity search
- Indexing strategies
- Metadata filtering
RAG Systems
Ability to design or understand:
- Retrieval pipelines
- Document chunking strategies
- Embedding pipelines
- Hybrid search
- Context window optimization
- RAG evaluation methods
Data Processing
Experience with:
- PDF extraction
- Document parsing pipelines
- Data preprocessing
Bonus Knowledge
- Vision Language Models (VLMs)
- Multimodal AI systems
- Distributed AI inference
- GPU inference optimization
Preferred Project Experience
Candidates should have completed one to three hands-on AI projects, such as:
Example Project 1 – RAG Knowledge Assistant
- Built a chatbot that answers questions from internal documentation.
- Implemented document ingestion, chunking, embedding generation, and vector search.
- Used LangChain, a vector database, and an open-source LLM.
Example Project 2 – Document AI System
- Created a system that extracts structured information from PDFs.
- Built pipelines for PDF parsing → embeddings → AI summarization.
Example Project 3 – AI Research Experiment
- Compared multiple embedding models and evaluated search accuracy.
- Benchmarked RAG response quality and hallucination rates.
Example Project 4 – LLM Application
- Built a real-world tool using open-source models (e.g., summarizer, Q&A system, coding assistant).
Real-World Experience
Candidates may also have experience such as:
- Contributing to open-source AI projects
- Participating in AI hackathons
- Research experience in machine learning or NLP
- Building production-style AI APIs
- Deploying models using Docker or cloud platforms
- Working with LLM inference servers (vLLM, TGI, Ollama, etc.)
Tools & Technologies (Exposure Preferred)
AI / ML
- PyTorch
- Hugging Face
- Transformers
- Sentence Transformers
LLM Tools
- LangChain
- LlamaIndex
- Open-source LLMs (LLaMA, Mistral, etc.)
Vector Databases
- Pinecone
- ChromaDB
- Milvus
- Qdrant
- FAISS
- Weaviate
Data Tools
Deployment
- Docker
- REST APIs / FastAPI
- Basic cloud exposure (AWS/GCP/Azure)
What You Will Gain
- Hands-on experience building real AI systems
- Exposure to modern LLM architecture and AI infrastructure
- Experience with RAG pipelines and vector databases
- Mentorship from engineers working in applied AI
- Opportunity to contribute to real production-style AI tools
Candidate Profile
We are looking for someone who:
- Is curious and loves experimenting with AI systems
- Enjoys solving practical engineering problems
- Can quickly learn new frameworks and models
- Is comfortable reading research papers and technical documentation
- Has strong problem-solving and debugging skills
Application Requirements
Please include:
- Resume
- GitHub profile (required)
- Links to AI/ML projects
- Brief description of a RAG or LLM system you have built
Pay: $60,000.00 - $80,000.00 per year
Work Location: In person