Architecture - Open source software engineering developer community and events

Test all the (Network) Things by Dan McCormick

Nov 15, 2013

(Contributor article by Dan McCormick, SVP of Technology at Shutterstock; originally published elsewhere.)

Our engineering team supports many different sites, including Bigstock, Offset, and Skillfeed.

All these sites rely on a core set of REST services for functionality like authentication, payment, and search. Since these core services are so critical, we need to know at all times whether they’re functioning properly, and get alerted if they aren’t. There are plenty of solutions for server-level monitoring, but we couldn’t find a good, simple solution for service or API monitoring. So we built one. It’s called NTF, for network testing framework.
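To make the idea of service-level monitoring concrete, here is a minimal sketch in Python of a check against a REST endpoint. It is not the framework the article describes: the `/status` URL, the `{"status": "ok"}` payload, and the names `http_get` and `check_service` are all assumptions made for illustration.

```python
import json
import urllib.request


def http_get(url, timeout=2.0):
    """Fetch a URL and return (status_code, body_text)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


def check_service(url, fetch=http_get):
    """Return (healthy, detail) for a hypothetical JSON status endpoint.

    `fetch` is injectable so the check can be exercised without a network.
    """
    try:
        status, body = fetch(url)
    except OSError as exc:
        # Covers connection errors, timeouts, and urllib's HTTPError.
        return False, "request failed: %s" % exc
    if status != 200:
        return False, "unexpected status %d" % status
    try:
        payload = json.loads(body)
    except ValueError:
        return False, "response is not valid JSON"
    # Assumed contract: a healthy service answers {"status": "ok"}.
    if not isinstance(payload, dict) or payload.get("status") != "ok":
        return False, "unexpected payload"
    return True, "ok"
```

A scheduler could run `check_service` against each core service and raise an alert whenever the first element of the result is `False`.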

Slide & Bio…

November 15, 2013
Articles, Shutterstock
API, REST

Scaling Deployment at Etsy by Daniel Schauenberg

Oct 15, 2013

In this talk, “Scaling Deployment,” Daniel Schauenberg from Etsy discusses the development and deployment infrastructure they use at Etsy. The talk was recorded at the Continuous Delivery NYC meetup at Etsy Labs. Etsy has over 100 engineers deploying more than 60 times a day. This culture of continuously deploying small change sets enables them to build and release robust features while serving over a billion page views per month. To keep up this pace, they have development and deployment infrastructure in place that makes it comfortable and simple to make changes. So simple that as an engineer at Etsy you deploy the site on your first day, even if you’re a dog.

But how is it possible to deploy so frequently among so many engineers and yet maintain a stable system? To answer this, Daniel gives a high-level overview of the basic application structure to introduce the specific architecture at Etsy and how their development environment is set up. For development, every engineer gets their own VM with the full application stack configured. This makes it easy to get started and puts every developer in the same, familiar setup. This is a crucial part of removing confusion and ambiguity about how to work on and deploy changes.

For the actual deployment, Etsy uses a one-button deploy system, Deployinator, which they developed and open sourced. This system is tightly integrated with their company-wide IRC server and the set of tools they’ve built to foster confidence, fast feedback, and easy communication and collaboration between engineers. The talk gives a detailed overview of how the system works, how they use it, and what problems they had to solve to make sure everyone can deploy as easily and quickly as possible. One of the pillars of keeping it fast and scalable was implementing Atomic Deploys for their web stack. The talk goes into detail about the challenges they faced and how Etsy made it work with minimal interruption to the developer workflow.

Slides & Bio…

October 15, 2013
Etsy, Video
Etsy, scaling, Scaling Deployment

Etsy – A Deep Dive into Monitoring with Skyline by Abe Stanway – Transcript

Oct 04, 2013

(Original post with video of talk here)

Abe Stanway: Okay. Hi, I’m Abe. I’m a data engineer at Etsy and today we’re going to talk about Skyline, of which I was the primary author. And so we’re going to talk about how we monitor, why we decided to build this, and how it advances the art of monitoring. So let’s start.

So Etsy is the world’s handmade and vintage marketplace. We are based right here in Dumbo, so this wasn’t too much of a pain to get up here. We have a large stack. We’ve got a lot of stuff going on. Specifically, or actually not specifically at all, these are just some of our numbers of some of the servers that we’re dealing with – 41 MySQL shards, 24 API servers, 72 web servers, 42 Gearman boxes, a 150-node Hadoop cluster, 15 memcached boxes, and around 60 search machines, and a lot more than that. Probably on a scale of a hundred to two hundred, for sure other various services come with a lot of things that we have.

And that’s not to mention the app itself, which is running on top of all these machines, and all the services that are actually running on these machines. In addition to that, we practice something called continuous deployment, which is kind of the new hotness we’ve developed with DevOps, right. It’s kind of always deploying every single day, so we deploy around thirty to sixty times a day, every day, and we make this really really easy to do for all our engineers.

October 4, 2013
Etsy, Transcript
anomaly detection, architecture, data engineering, Etsy, open source, Skyline

Rent The Runway – Building Web Services with DropWizard by Camille Fournier

Sep 06, 2013

In this talk, Camille Fournier from Rent The Runway explains why they chose Dropwizard to build their SOA, and the advantages it gives for “operation-driven webservices”. This talk was recorded at the NYC Tech Talks group at Meetup HQ.

Camille also explains how Rent The Runway ended up with Dropwizard after trying other alternatives (Play, Glassfish, Spring) and what advantages it gives out of the box for operational transparency (metrics, Graphite) and ease of use, and gives some demos of building simple software services in Dropwizard. She also discusses why they wanted to build operations-driven webservices in the first place.

Slides and Bio…

Etsy – A Deep Dive into Monitoring with Skyline

Aug 15, 2013

Abe Stanway, Data Engineer at Etsy, talks about Skyline, a real-time anomaly detection tool. The talk was recorded at eBay NYC. Abe takes a deep dive into Skyline’s architecture and design.

Skyline is a real-time anomaly detection system, built to enable passive monitoring of hundreds of thousands of metrics, without the need to configure a model or thresholds for each one, as you might do with Nagios. It is designed to be used wherever there is a large quantity of high-resolution time series that need constant monitoring. Once a metrics stream is set up (from StatsD, Graphite, or another source), additional metrics are automatically added to Skyline for analysis. Skyline’s easily extensible algorithms allow you to define what each metric’s baseline should be, thereby also defining anomalous behavior. After Skyline detects an anomalous metric, it surfaces the entire time series to the web app, where the anomaly can be viewed and acted upon.

github.com/etsy/skyline
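As a rough illustration of threshold-free detection, the sketch below flags a metric when its latest value sits more than three standard deviations from the mean of its own history. This is a generic three-sigma check, not Skyline's actual algorithms or API; the function names and the metric names in the usage note are made up for the example.

```python
from statistics import mean, pstdev


def is_anomalous(series, sigmas=3.0):
    """Flag the latest point if it deviates strongly from the earlier points.

    The baseline is derived from the series itself, so no per-metric
    threshold has to be configured.
    """
    if len(series) < 3:
        return False  # too little history to estimate a baseline
    history, latest = series[:-1], series[-1]
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return latest != baseline
    return abs(latest - baseline) > sigmas * spread


def find_anomalies(metrics, sigmas=3.0):
    """Return the sorted names of metrics whose latest point is anomalous."""
    return sorted(
        name for name, series in metrics.items() if is_anomalous(series, sigmas)
    )
```

For example, given `{"web.requests": [100, 102, 98, 101, 99, 100, 250], "db.latency": [5, 6, 5, 6, 5, 6, 5]}`, `find_anomalies` returns only `"web.requests"`, because the jump to 250 lies far outside that metric's usual spread while the latency series stays within its own.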

August 15, 2013
Etsy, Video
anomaly detection, data engineering, Etsy, open source, Skyline

Bitly – Realtime Distributed Message Processing at Scale with NSQ

Jun 20, 2013

This talk is by Matt Reiferson and Jehiah Czebotar, engineers at Bitly, and was recorded at the eBay NYC offices.

Matt and Jehiah talk about NSQ, their open source project for realtime distributed message processing, designed to operate at Bitly’s scale and handle billions of messages per day. It promotes distributed and decentralized topologies without single points of failure, enabling fault tolerance and high availability coupled with a reliable message delivery guarantee.

Bio…

Our NYC Data Engineering events are live!

May 10, 2013

Because of NYC startups’ interest in big data technologies, we’ve recently launched a brand new data engineering meetup. The meetup is for engineers only and features New York’s top startups presenting what they have learned building real-world data processing architectures.

We are hosting a talk by Bitly on their recently open sourced data processing technology next Wednesday. Come join us!

May 15th: Realtime Distributed Message Processing at Scale with NSQ
(Matt Reiferson from Bit.ly speaking)
http://www.meetup.com/NYC-Data-Engineering/events/113291272/

May 10, 2013
Articles

Dynamic Scaling at Pinterest

Mar 22, 2013

This is a talk from a recent Dynamic Scaling meetup at Yahoo! URL’s cafe. In this talk, Aren Sandersen, Head of Tech Ops at Pinterest, covers how they scale dynamically at Pinterest.

March 22, 2013
Audio, Pinterest, Slides
aren sandersen, Dynamic scaling, pinterest, scaling

Select Gig: Senior Platform Engineer

Mar 22, 2013

PlaceIQ is a rapidly growing, venture funded “Big Data” business with a tremendous opportunity to become the market leader in the exploding location intelligence marketplace. Starting with an extensive library of unstructured/unrelated geospatial, temporal and social data, PlaceIQ transforms information into time and location context. Revolutionary techniques in data mining and machine learning at scale are being developed on a daily basis to uncover the reality of the world like no other company can.

March 22, 2013
PlaceIQ
big data, classification algorithms, complex data visualizations, front end, geospatial clustering, regression models

Redis at Pinterest

Mar 11, 2013

Today we’ve got the recording of the first geek talk from the recent SF Redis meetup. The Head of Tech Operations at Pinterest talks to us about their usage of the advanced key-value store in his talk titled ‘Redis at Pinterest’.
