Jobs at Q Systems

Site Reliability Engineer (NY)

New York, NY
The candidate will be part of the Global Fixed Income Quantitative Development team, which is responsible for developing the Real-time Pricing and Risk system and trading tools. Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. Our SRE is responsible for the availability and reliability of our business's most critical Trading applications and services, and ensures they meet the requirements of our business users. We look for engineers who are motivated to collaborate with our business to build and run sustainable production systems that can evolve and adapt to changes in our fast-paced, global business environment.

Duties and Responsibilities:
  • Establish best practices that improve reliability, reduce latency, and strengthen the software development process to minimize risk around critical systems, under both the current and future architecture
  • Balance feature development velocity and reliability with well-defined SLOs
  • Run the Production environment by monitoring availability and taking a holistic view of system health
  • Drive the incident management process and support a constructive post-mortem culture
  • Partner with development teams to improve services via rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplifts
Background / Experience:
  • Technical education should include an undergraduate or advanced degree in Science, Technology, Engineering or Mathematics (STEM)
  • Strong background in computer science fundamentals, data structures, algorithms, distributed systems
Required Skills:
  • Programming experience in scripting languages (e.g. Python, Perl) and/or compiled languages (e.g. C++)
  • Comfortable with a range of current software development tools and practices (testing, source control, build systems, etc.)
  • Experience in Agile and Site Reliability Engineering concepts
  • Commitment to excellence and a strong attention to detail
  • Highly motivated with a desire to work in a hands-on collaborative environment
  • A passion for learning, adapting to changing requirements and technology, and inventing new approaches to hard problems
  • Understanding of the operational, maintenance and support aspects of business critical systems
Desirable Skills:
  • Experience with real-time architecture and analytic engines like Apache Spark and with distributed computing
  • Experience with distributed systems design, maintenance, and troubleshooting
  • Hands-on experience with debugging and optimizing code, as well as automation
  • Experience with large-scale batch processing systems, efficient scheduling and dependency management
  • Experience with owning and leading projects