Acoustic Modeling Engineer

Voci Technologies - Pittsburgh, PA

Full-time
Estimated: $72,000 - $100,000 a year

Company Description
Voci Technologies enables enterprises to extract actionable intelligence from their voice data.
Voci’s hardware-accelerated speech recognition engine runs orders of magnitude faster and
with greater accuracy than alternatives, enabling both real-time and batch transcription of 100%
of companies’ voice data. Voci works with partners and clients to deliver domain- and
data-specific speech intelligence solutions that meet their business requirements for customer
experience, call center operations, compliance, surveys, and transcription applications such as
voicemail. Voci is a privately held company headquartered in Pittsburgh, PA, in the Strip
District, just minutes away from Carnegie Mellon University. Voci offers exceptional
individuals a dynamic, high-energy environment to work with world experts on cutting-edge
technology designed to improve customer experience.

Position Summary
As part of Voci’s speech engineering team, you will work with talented peers to
develop novel algorithms and modeling techniques to advance the state of the art in
acoustic modeling for Voci’s highly accurate hardware-accelerated speech recognition
products. You will also be responsible for training, optimizing, and customizing
acoustic models to specific customer requirements, including supervised and semi-supervised
acoustic model adaptation. As a senior modeler, you will understand the
entire training chain and be able to develop models for new languages. You will support our
development engineers in implementing highly scalable speech recognition software.

Primary Responsibilities

Work with the data team and other modeling team members to train, test, optimize, and deliver production-grade acoustic models.
Retrain and test models with the inclusion of increasing amounts of training data.
Conduct optimization experiments under supervised and semi-supervised conditions, including customization to customer environments.
Assist in migrating our model suite across major version upgrades.
Improve our training and testing environment to enable increased capacity.
Remain current in speech recognition tools and techniques.

Education & Experience

Master's or PhD in Electrical Engineering, Computer Science, or Language Technologies.
3+ years in Speech R&D in an academic or commercial setting.
Excellent programming skills with C/C++ and Python on Linux platforms.
Solid NVIDIA CUDA programming skills are beneficial.
Expertise in neural networks (DNN / LSTM / CNN / TDNN) is essential.
Expertise in WFSTs, lattice processing, and Viterbi decoding is beneficial.
Deep familiarity with one or more of Sphinx / HTK / Kaldi / EESEN / Julius / TensorFlow / Keras / PyTorch is a plus.

Required Behaviors

Ability to formulate and carry out research and development plans.
Ability to function in small cross-disciplinary teams.
Ability to create deliverables (models) according to schedule.
Adaptable and able to manage multiple processes simultaneously.
Strong work ethic.

Required Skills

Excellent written and verbal communication skills and interpersonal skills.
Highly organized, with excellent time and project management skills.
Knowledge of Microsoft Office and Google Apps is nice to have.

Company Offerings

Benefits (medical and dental).
Paid vacation along with 10 annual company-paid holidays.
Modern, attractive office space in the heart of Pittsburgh’s high-tech community.
Office snacks, drinks and pretty good coffee.
Company gatherings and outings.
Dynamic small-company environment with the excitement of a later-stage CMU start-up.

Additional Information

All information will be kept confidential according to Equal Employment Opportunity guidelines.