This talk is by Adam Ilardi, a data scientist at eBay, and was recorded at the NY-Scala meetup at eBay NYC. Adam talks about eBay’s transition from Pig and raw Cascading to Scalding and explains other ways they use Scala.
This talk is by Mikael Ohlson, Software Engineer at Spotify, and was recorded at the Mobile Development meetup at Spotify in New York earlier this month. Mikael will cover how they build dynamic user interfaces on Android.
In this talk, we’ll see how recommendation systems are created from data. What’s the algorithm? What’s the evaluation method? What’s the optimization procedure? When does it converge? We’ll talk about parallelizing in order to scale up to “big data” size via the MapReduce framework. Finally, we’ll think about priors and how they are overloaded. Content from this talk draws from chapters in Doing Data Science contributed by David Crawshaw and Matt Gattis.
Because of NYC startups’ interest in big data technologies, we’ve recently launched a brand new data engineering meetup. The meetup is for engineers only and features New York’s top startups presenting their learnings on building real-world data processing architectures.
We are having a talk by bit.ly on their recently open-sourced data processing technology next Wednesday. Come join us!
May 15th: Realtime Distributed Message Processing at Scale with NSQ
(Matt Reiferson from Bit.ly speaking)
QCon New York is back!
Pete Soderling, g33ktalk founder, will also be there, giving out hacks on how to get better at hiring as a software engineer, discover practical ways to build a unique engineering brand that will put your team at the top of the pack, and develop better techniques for screening and interviewing engineers more quickly and effectively.
Don’t miss out!
Use promo code “g33ktalk” to save $100!
(Original post with video of talk here)
Ben Engber, CEO and founder of Thumbtack Technology, will discuss how to perform tuned benchmarking across a number of NoSQL solutions. He describes a NoSQL Database Comparison across Couchbase, Aerospike, MongoDB, Cassandra, HBase and others in a way that does not artificially distort the data in favor of a particular database or storage paradigm. This includes hardware and software configurations, as well as ways of measuring to ensure repeatable results.
Ben: Hi, my name’s Ben Engber. I’m the founder of a company called Thumbtack Technology. We are a consulting company, with one of our primary practice areas being NoSQL development and advising clients on NoSQL. And, the background of this talk is, you know, one of the things that comes up really often when we talk to clients, one of the first things they ask us is, ‘What NoSQL database should we use?’ And then, you know, the followup is, ‘Well, we need to learn a little bit about your business, so let’s do some discovery’. It’s the correct answer, but it often doesn’t go over that well. So, what we wanted to do, is we wanted to have sort of at least a basic baseline which would introduce them to the main concepts to give them right off the bat, and then sort of introduce a deeper discussion based on that.
So, about six months ago, we started researching within our company to do some NoSQL evaluations, and research on the subject. And, this presentation sort of presents a way that we can compare across these products. So, in some ways, what I’m going to do is come in and argue with everything that Will just said about why you can’t build an abstraction layer.