Netflix
In this talk, Jeff Magnusson, Manager of Data Platform Architecture at Netflix, discusses Lipstick, a tool that visualizes and monitors the progress and performance of Apache Pig scripts. This talk was recorded at the Big Data Guru meetup at Samsung R&D. Comments are available here.
While Pig provides a great level of abstraction between MapReduce and dataflow logic, once scripts reach a sufficient level of complexity it becomes very difficult to understand how data is being transformed and manipulated across MapReduce jobs. The recently open-sourced Lipstick solves this problem. Jeff covers the architecture, implementation, and future of Lipstick, as well as several use cases at Netflix, such as using Lipstick to speed up development and improve the efficiency of new and existing scripts.
Controlled Experimentation (or A/B testing) has evolved into a powerful tool for driving product strategy and innovation. The dramatic growth in online and mobile content, media, and commerce has enabled companies to make principled data-driven decisions. Large numbers of experiments are typically run to validate hypotheses, study causation, and optimize user experience, engagement, and monetization.
The concept of controlled experimentation is simple: randomly divide the user population into two groups, called the Control (A) and the Treatment (B). The experiment involves concurrently exposing the Control group users to one experience (typically the existing experience) and the Treatment group users to another (the new experience). A set of performance metrics is computed for both groups, and statistical tests are run to determine whether any change in metrics for the Treatment group relative to the Control group can be attributed purely to chance. Typical use cases of A/B testing include testing a modified web user interface, evaluating a new call to action for mobile app downloads, and examining the effects of a new personalization algorithm.
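The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used at Netflix: it assumes the metric is a simple conversion rate and uses a two-sided two-proportion z-test, which is one common choice among many. The function names `split_users` and `two_proportion_z_test`, and the counts in the usage lines, are hypothetical.

```python
import math
import random


def split_users(user_ids, seed=0):
    """Randomly assign each user to Control (A) or Treatment (B)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    control, treatment = [], []
    for uid in user_ids:
        (control if rng.random() < 0.5 else treatment).append(uid)
    return control, treatment


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates.

    Returns (z, p_value). Assumes samples large enough for the
    normal approximation to hold.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("standard error is zero; no variation in the data")
    z = (rate_b - rate_a) / std_err
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    control, treatment = split_users(range(2000))
    # Hypothetical conversion counts, for illustration only.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
    print(f"z = {z:.3f}, p = {p:.4f}")
```

A small p-value suggests the observed difference is unlikely to be due to chance alone. The significance threshold (0.05 is a common convention) should be fixed before the experiment is run.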
In this talk, the second of five recorded at The Hive’s Big Data Think Tank Meetup at Microsoft, the speaker discusses controlled experimentation as practiced at Netflix.
We were at Box.net this week for the SV Cloud Computing group to hear an awesome talk from a Netflix engineer on the massive improvements Netflix has made to its API infrastructure over the past year or so. The Netflix API currently serves up to 1.6 billion requests/day (!) and supports more than 800 different connected devices. The API architecture is designed for resiliency, and you’ll hear Ben talk about exactly how they achieve that, plus some other very interesting info on how they’ve recently re-architected the API for ease of maintenance and maximum performance.