Research Intern - Program Synthesis

Microsoft - Redmond, WA

Empower every person and organization on the planet to achieve more. That’s what inspires us, drives our work and pushes us to challenge the status quo every day. At Microsoft, we also work to empower our employees, so they can achieve more. We believe we should each find meaning in our work and we ensure employees have the freedom and the reach to help make a difference in the world.

Microsoft Research provides a dynamic environment for research careers with a network of world-class research labs led by globally-recognized scientists and engineers. Our researchers and engineers pursue innovation in a range of scientific and technical disciplines, to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

Program synthesis is a new frontier in AI wherein the computer programs itself: a synthesizer interprets user intent and produces a script or a program for the task. The user intent may be given as input-output examples, constraints on the program, or other high-level specifications. This is significant because 99% of people who own computers do not know how to program. Even for programmers, this can provide a 10-100x productivity increase in many task domains, especially in data wrangling/preparation and code refactoring.
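To make the idea of specifying intent through input-output examples concrete, here is a minimal sketch of an enumerative synthesizer. It is an illustration only and does not reflect how the PROSE SDK works: the tiny string DSL, the primitive names, and the functions run and synthesize are all hypothetical, chosen for this example.

```python
from itertools import product

# Hypothetical mini-DSL: every primitive maps a string to a string.
# These names are assumptions for illustration, not part of any real SDK.
PRIMITIVES = {
    "lower": str.lower,
    "upper": str.upper,
    "strip": str.strip,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "initial": lambda s: s[:1],
}


def run(program, text):
    """Apply the primitives named in `program` to `text`, left to right."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def synthesize(examples, max_len=3):
    """Return the first (shortest) pipeline consistent with every example.

    Brute-force search over all pipelines of up to `max_len` primitives;
    returns None when no pipeline in that space fits the examples.
    """
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None


# The user states intent with two examples instead of writing code.
examples = [("Ada Lovelace", "LOVELACE"), ("alan turing", "TURING")]
program = synthesize(examples)
print(program)
print(run(program, "grace hopper"))  # HOPPER
```

Brute-force enumeration only scales to toy search spaces like this one; it is used here because it keeps the link between examples and the resulting program easy to see.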

The PROSE SDK is a powerful and flexible framework that uses state-of-the-art program synthesis and machine learning techniques to automate repetitive tasks in a variety of domains. The strength and breadth of the framework are evidenced by the range of Microsoft products that use it, including Excel, PowerShell, SQL Server Management Studio and Azure Machine Learning Workbench.

Our experience in using program synthesis in practice has helped us identify several unique and interesting research challenges. How does one test and debug programs when a synthesizer is involved? What user experience and programming interfaces are appropriate when the primary mode of interaction is through input-output examples? How can we adapt the generated programs to each user's programming style and coding conventions? How can we adapt and improve ML techniques to a setting where logical constraints are present? These challenges require both novel research ideas and practical-minded engineering to address satisfactorily.

Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, interns learn, collaborate, and network for life. Interns not only advance their own careers, but they also contribute to exciting research and development strides. During the 12-week to 24-week internship, students are paired with mentors and expected to collaborate with other interns and researchers, present findings, and contribute to the vibrant life of the community. Research internships are available in all areas of research, and are offered year-round, though they typically begin in the summer.

Our team is looking for research interns to work on several projects related to the above challenges and several other aspects of the PROSE ecosystem. Below are some specific research directions we are exploring:

a) Task-guided string clustering. In the setting of data preparation and data wrangling, strings are clustered with the aim of processing the clusters separately for standardization or analysis. Can we leverage knowledge about these further processing steps to produce better clusters than traditional methods?

b) User experiences around program synthesis. The nature of program synthesis is that it expects some level of user involvement and expertise: either in specifying behavior, verifying that the synthesized program meets their needs, or providing a means through which the user can provide additional specifications. How can we surface program synthesis through novel user interfaces to help users effectively complete tasks?

c) Generate human-readable and modifiable code. Many current program synthesis approaches are black boxes: they assume that the user is only interested in completing the task, without needing to know the internals of the synthesized programs. However, in domains such as data science, users not only want to verify the generated code, they also want to incorporate this code into their own programs or extend it. How can we synthesize programs that are also human-readable?
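As a point of reference for direction (a), the sketch below shows one simple task-agnostic way to cluster strings: group them by a coarse character-class signature. It is an assumed baseline for illustration, not a method taken from PROSE, and the names signature and cluster are hypothetical. It ignores the downstream processing step entirely, which is the information a task-guided approach would try to exploit.

```python
import re
from collections import defaultdict


def signature(text):
    """Collapse each run of digits to "9" and each run of ASCII letters to "a".

    Punctuation and whitespace are kept, so strings that share a layout
    (for example ISO dates) map to the same signature.
    """
    sig = re.sub(r"[0-9]+", "9", text)
    return re.sub(r"[A-Za-z]+", "a", sig)


def cluster(strings):
    """Group strings by signature, preserving input order within a group."""
    groups = defaultdict(list)
    for s in strings:
        groups[signature(s)].append(s)
    return dict(groups)


dates = ["2019-03-01", "03/01/2019", "2020-12-31", "Mar 1, 2019"]
for sig, members in cluster(dates).items():
    print(sig, members)
```

With these inputs the two ISO-style dates share a cluster, while the slash-separated and spelled-out dates each form their own, so each cluster could then be standardized by a separate transformation.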

Required Qualifications

Must be currently enrolled in a PhD program in Computer Science or a STEM-related field.

Preferred Qualifications

Specialization in or at the intersection of Programming Languages, Formal Methods, Machine Learning, or Human-Computer Interaction is preferred. However, good candidates from other disciplines will also be considered.

Ability to collaborate effectively with other researchers and product development teams.

Experience and/or interest in developing software and a desire to see their research ideas implemented in practice.

Prior publications from top-tier conferences in Computer Science or a related field.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.