This talk is by Adam Ilardi, a data scientist at eBay, and was recorded at the NY-Scala meetup at eBay NYC. Adam talks about eBay’s transition from Pig and raw Cascading to Scalding and explains other ways they use Scala.

 

 

Slides and Bio…

 

This talk is by Mike Nolet, CTO & Co-Founder at AppNexus, recorded at Aerospike.

Mike discusses the AppNexus infrastructure and their custom-built continuous deployment system.

Bio…

 

This talk is by Steve Souders, Veteran Software Engineer at Google, and is part of the Airbnb tech talks.

Steve will cover a variety of tools, both old and new, to measure, analyze and fix the performance of popular websites.

 

Bio…

 

This talk is by Mikael Ohlson, Software Engineer at Spotify, and was recorded at the Mobile Development meetup at Spotify in New York earlier this month. Mikael will cover how they build dynamic user interfaces on Android.

Bio…

 


In this talk, we’ll see how recommendation systems are created from data. What’s the algorithm? What’s the evaluation method? What’s the optimization procedure? When does it converge? We’ll talk about parallelizing in order to scale up to “big data” size via the MapReduce framework. Finally, we’ll think about priors and how they are overloaded. Content from this talk draws from chapters in Doing Data Science contributed by David Crawshaw and Matt Gattis.


Bio, etc…

 

Because of NYC startups’ interest in big data technologies we’ve recently launched a brand new data engineering meetup. The meetup is for engineers only and features New York’s top startups presenting their learnings on building real-world data processing architectures.

Next Wednesday we are hosting a talk by bit.ly on their recently open-sourced data processing technology. Come join us!

May 15th: Realtime Distributed Message Processing at Scale with NSQ
(Matt Reiferson from Bit.ly speaking)
http://www.meetup.com/NYC-Data-Engineering/events/113291272/

 


QCon New York is back!

http://qconnewyork.com/

QCon is a practitioner-driven conference designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams. Some of the 15 tracks at QCon New York include: Polyglot Architectures, Applied Data Science, Java, Lean Startup Applied, Continuous Delivery, the JavaScript Ecosystem, Mobile Dev and more.

Pete Soderling, g33ktalk founder, will also be there, sharing hacks on how to get better at hiring as a software engineer, practical ways to build a unique engineering brand that will put your team at the top of the pack, and better techniques for screening and interviewing engineers more quickly and effectively.

Don’t miss out!

Use promo code “g33ktalk” to save $100!

http://qconnewyork.com/?utm_source=infoq&utm_medium=newsletter&utm_campaign=g33ktalk

 


Today we will hear from SriSatish Ambati, founder of 0xdata (pronounced ‘hexadata’), which makes H2O, an open-source prediction and math engine for big data. He will be giving a talk on scaling GLM, Random Forest, and other popular big data algorithms, using examples such as the AllState Kaggle dataset.

Continue reading »

 


 

(Original post with video of talk here)

Ben Engber, CEO and founder of Thumbtack Technology, will discuss how to perform tuned benchmarking across a number of NoSQL solutions. He describes a NoSQL Database Comparison across Couchbase, Aerospike, MongoDB, Cassandra, HBase and others in a way that does not artificially distort the data in favor of a particular database or storage paradigm. This includes hardware and software configurations, as well as ways of measuring to ensure repeatable results.

Ben: Hi, my name’s Ben Engber. I’m the founder of a company called Thumbtack Technology. We are a consulting company, and one of our primary practice areas is NoSQL development and advising clients on NoSQL. The background of this talk is that one of the first things clients ask us is, ‘What NoSQL database should we use?’ And then, you know, the followup is, ‘Well, we need to learn a little bit about your business, so let’s do some discovery’. It’s the correct answer, but it often doesn’t go over that well. So what we wanted was at least a basic baseline that would introduce them to the main concepts right off the bat, and then lead into a deeper discussion based on that.

So, about six months ago, we started doing some NoSQL evaluations and research on the subject within our company. And this presentation sort of presents a way that we can compare across these products. So, in some ways, what I’m going to do is come in and argue with everything that Will just said about why you can’t build an abstraction layer.

Continue reading »
