I am a driven self-starter who can multi-task while maintaining focus in a fast-paced, changing environment. I have 20+ years of experience in pure software engineering, infrastructure and operations, DevOps / SRE / Agile, data engineering, and analytics.
Principal Consultant
Founded Vivanti USA, LLC along with Chairman Tony Nicol and National Managing Partner Mike Walker. Handled a variety of tasks across the enterprise, including business development, marketing, interviewing & hiring, internal IT / infrastructure, partner relationship-building, customer relationships, and delivery.
Worked on several data-focused modeling contracts, API integration and "glue-code" projects, and provided end-to-end support from first consultation call, through requirements gathering, solutioning, and ultimately execution and status reporting.
Director of Research & Development
Responsible for a team of a dozen senior and mid-level consultants, both local and remote (Canada, CA, UT, EU), focused on building next-generation cloud enablement technology. Helped with content marketing efforts (blogging, whitepapers, public speaking, conference speaking, video content, etc.) to evangelize our efforts in the public open source and enterprise spheres.
Senior Cloud Engineer
Senior member of a highly flexible consulting team focused on enabling enterprises to build dedicated cloud offerings, using BOSH and Cloud Foundry.
Led several dedicated consulting teams at various clients. Responsible for technical direction, client ops training, system architecture design, and some implementation efforts. Interfaced with all levels of client hierarchy, from upper management to individual contributors.
Responsible for several internal mentoring and training efforts. Wrote and conducted the Go programming segment of on-boarding training curriculum. Mentored several junior consultants on everything from hard technical skills like programming, systems administration, and networking, to soft skills like communication, task prioritization, and problem resolution.
Projects
Wrote SHIELD, a backup solution for BOSH deployments, complete with a CLI for automation and a Web UI for operator use. Allowed several clients to effortlessly perform backups of Cloud Foundry deployments.
Wrote Spruce, a YAML multi-tool that features several operators to enable de-duplication and dynamic generation of parts of the final YAML structure. Spruce was a major factor in several initiatives.
Wrote Genesis, a tool that facilitates a BOSH deployment paradigm based on localization of general manifests to more specific environments, allowing re-use of common structure (jobs, releases, properties, etc.) across multiple installations.
Wrote Jumpbox, a utility for outfitting vanilla Ubuntu VMs with all the software and configuration necessary to run BOSH deployments.
Wrote Safe, an alternative CLI for HashiCorp's Vault secure storage solution, with an emphasis on making life easier for operators by providing higher-level functionality (move, rename, full-tree listing, etc.).
Principal Systems Engineer
Senior member of a Technical Operations team tasked with developing high-quality application and service health check logic, metric data collection / aggregation systems, and data visualization and analysis tools.
Operationally responsible for systems monitoring more than 4,200 hosts and 110,000 service checks, and for curating ~1.2 million performance data metrics, across 9 data centers.
Received 3 promotions and several merit bonuses, both during normal review cycles and as recognition for excellence in execution on key initiatives.
Projects
Led a four-month effort to build a new monitoring system using Open Source components (Project Hammer Throw). Allowed Synacor to discontinue use of commercially-licensed Groundwork Enterprise Monitoring System, saving approximately $160,000/year. Migrated to new system in less than four weeks, with no data loss and no downtime.
Wrote IRIS, a custom Icinga Event Broker module, written in C, enabling the solution to scale to 2x the throughput of Groundwork. Released as Open Source in 2014.
Wrote NLMA, the NLMA Local Monitoring Agent, for scheduling and executing local service check plugins, and feeding the data back to the core monitoring system. Released as Open Source in 2014.
Wrote the NLMA::Plugin framework for writing new Nagios Check Plugins quickly, correctly, and with less invested effort. Released as Open Source in 2014.
Wrote image analysis software to generate graph images under both systems and find anomalies indicative of data loss or corruption.
Patched Icinga to improve throughput (#8140, #8141) and stability (#8139). Changes were contributed back to the community and accepted for inclusion in 1.7.13.
Wrote Synformer, a modern web application that coalesces alerts and graphs from all data centers into a cohesive user interface. Synformer facilitates interaction with monitoring, allowing users to acknowledge problems, schedule downtime, clear alerts, search configuration metadata and review alert history. It includes DashCode, a language for writing custom graph and alert dashboards.
Wrote MAD, a custom rules engine that analyzes alerts in the Synformer database, detects patterns of causality and suppresses symptom alerts in the dashboard display.
Wrote ProcLog, a log processing framework used to stream real-time log data from nginx, Apache, Varnish, Jetty and syslog, aggregating request volume, load time, error rate and other web metrics into the monitoring system.
Wrote monitoring check plugins for gathering metrics from Java/JMX installations, Riak, MongoDB, MySQL, Cassandra, Hadoop, PostgreSQL, Apache, Varnish, nginx, Jetty, OpenLDAP, BIND, Postfix, and many other platforms. Consulted during troubleshooting / diagnostics / RCA due to familiarity with these systems.
Methodology
All software development work done within Git revision control, under collaborative code review and some pair programming. Strong foundation of test coverage (90% target across all codebases) and test-driven development practices.
Operational work, including check configuration, package deployments, and configuration file management, was carried out exclusively through Puppet, enabling fast stand-up of new and replacement monitoring servers.
Systems Administrator
Hired on as a Helpdesk Operator in 2006, but quickly demonstrated value and competence in systems design, implementation and administration.
Wrote Ticket Center, a trouble ticket, service request and change communication system to enable the 30+ person IT and Telecom departments to self-manage workload and prioritize requests.
Leveraged CFEngine configuration management system to enable two system administrators to manage 100+ Linux servers, and still have time to devote to other projects as required.
Implemented Nagios monitoring system for 350+ servers, switches, routers and firewalls, to improve operational visibility.
Championed virtualization of infrastructure with Xen, iSCSI and debootstrap VM deployment automation. Pioneered use of blade server technology to increase server rack density and maximize data center floor space usage.
Replaced aging HP-UX infrastructure by writing a custom file transmissions job framework (in Bash) and migrating 300+ nightly and weekly jobs. Implemented secure, managed sftp bastion hosts and CIFS gateways for interacting with Windows sources / destinations.
Designed and built a custom VoIP PBX system using Asterisk for application functionality (Voicemail, Dial-by-name directory, etc.) and OpenSIPS for routing and registration. Served 250+ internal customers across 5 locations. Wrote custom software for provisioning Grandstream GXP phones via TFTP.
Contractor
Brought in after the departure of a senior engineer to finish PHP projects and revamp server configuration / web site deployment. Ported legacy code to newer versions of PHP and implemented version control.
Senior Software Engineer
Created the Exponent CMS. Responsible for all technical decisions related to features and functionality of CMS software. Accompanied sales team as a technical resource to discuss potential new business with clients. Helped manage production web, email and DNS servers running Apache, qmail and BIND.
https://github.com/jhunt/netip.cc and https://netip.cc
A small DNS nameserver that echoes back the address you give it as an A
record. I used this all the time for name-based virtual host routing in
HTTP applications. For example, the name api.10.16.0.17.netip.cc will
always resolve to the A record 10.16.0.17.
This is a service I run free-of-charge to the wider community, on my own personal infrastructure.
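The name-to-address mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual netip.cc implementation: the function name `embedded_ipv4` and the rule "take the last four labels before the suffix" are assumptions made for the example.

```python
import ipaddress
from typing import Optional

# Zone suffix taken from the example name above.
SUFFIX = ".netip.cc"


def embedded_ipv4(name: str) -> Optional[str]:
    """Return the IPv4 address embedded in `name`, or None if there is none.

    Hypothetical helper: the parsing rule (last four labels before the
    suffix) is an assumption, not the real server's logic.
    """
    name = name.rstrip(".").lower()
    if not name.endswith(SUFFIX):
        return None
    labels = name[: -len(SUFFIX)].split(".")
    if len(labels) < 4:
        return None
    candidate = ".".join(labels[-4:])
    try:
        # IPv4Address validates that each octet is numeric and in range.
        return str(ipaddress.IPv4Address(candidate))
    except ipaddress.AddressValueError:
        return None


print(embedded_ipv4("api.10.16.0.17.netip.cc"))  # prints 10.16.0.17
```

Any leading labels (such as `api`) are ignored, which is what makes the scheme convenient for name-based virtual host routing.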
https://github.com/jhunt/shout
A smart notification server, written in Common Lisp (primarily Steel Bank Common Lisp / SBCL), implementing edge-triggered notification messages. SHOUT! attempts to solve a personal gripe I have with most notification schemes whereby recipients are constantly inundated with "it failed!" messages, but rarely get notification when the problem gets resolved.
SHOUT! is especially useful in pipelines and automated workflows, including automated deployment pipelines and software release automation.
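As a rough illustration of what "edge-triggered" means here, the Python sketch below sends a message only when a topic's state changes, so a run of repeated failures produces one "broken" message and the eventual recovery produces one "fixed" message. It is not SHOUT!'s code or API: the class `EdgeNotifier`, the method `report`, and the message labels are hypothetical, and treating a first-ever success as not newsworthy is an assumption.

```python
from typing import Dict, List, Tuple


class EdgeNotifier:
    """Hypothetical edge-triggered notifier: notify on state changes only."""

    def __init__(self) -> None:
        self._last: Dict[str, bool] = {}        # last known state per topic
        self.sent: List[Tuple[str, str]] = []   # (topic, message) pairs "sent"

    def report(self, topic: str, ok: bool) -> bool:
        """Record a result; return True if a notification was sent."""
        previous = self._last.get(topic)
        self._last[topic] = ok
        if previous is None:
            # Assumption: the first report only matters if it is a failure.
            changed = not ok
        else:
            changed = previous != ok
        if changed:
            self.sent.append((topic, "fixed" if ok else "broken"))
        return changed


notifier = EdgeNotifier()
for result in [True, False, False, False, True, True]:
    notifier.report("deploy-pipeline", result)

print(notifier.sent)
# prints [('deploy-pipeline', 'broken'), ('deploy-pipeline', 'fixed')]
```

Six results yield only two messages, one at each transition.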
https://github.com/jhunt/k8s-boshrelease
A distribution of a working Kubernetes cluster, to be deployed atop the BOSH cloud orchestration tool. BOSH spins up VMs, and this BOSH release puts Kubernetes on those VMs, in a wide variety of topologies: all-in-one lab machines, multi-host homogeneous clusters, production-grade scaled-out clusters, etc.
https://github.com/jhunt/clockwork
System Configuration Management system written in C, with NaCl crypto support and ZeroMQ transport layer, designed with security and speed as primary objectives. Enables administrators to configure multiple hosts through policy definitions, and then enforces those policies on client systems.
A lightweight testing framework aimed at C programmers practicing TDD. Designed to be expressive, through judicious use of preprocessor macros. Works with the popular prove testing tool from Perl.
https://github.com/jhunt/icinga-iris
Submitted patches against Icinga 1.7.x (a Nagios fork) for improving performance and increasing reliability and throughput of monitoring systems.
IRIS is a custom Event Broker module that increases monitoring capacity, tripling the throughput of an individual host from (unpatched) ~20,000 results per minute to ~60,000.
https://github.com/jhunt/verse
A flexible and powerful static site generator. Powers http://jameshunt.us