OkCupid has a long history of building things in-house. They have always been a deeply technical company that doesn’t settle for what’s already available and isn’t afraid to build something new. In fact, OkCupid doesn’t use Apache or Nginx; they built their own web server. So it isn’t surprising that when it came time to reconsider how they manage logs, they considered both existing solutions and the possibility of building something in-house. In the end, they found that Scalyr both meets their log management needs today and has the potential to grow into a solution for a number of other challenges they face. I recently had a chance to catch up with Alex Dumitriu, CIO at OkCupid, about their early experiences with Scalyr.
Since its launch in 2004, OkCupid has become one of the world’s fastest-growing dating sites. Their effective matching algorithm is responsible for more than 40,000 first dates every day. OkCupid leads important conversations around identity, sex, and relationships on its popular blog, and is the first dating site to offer expanded gender and orientation options.
As a leading online dating site, OkCupid sees enormous use all day, every day. The site is accessed both via browsers and via mobile apps, and making sure their customers have a quality experience during every interaction is a top priority. OkCupid’s existing log management solutions did not allow easy access to their web server logs or application logs. Instead, engineers had to search through logs and analyze data by hand with tools like grep and uniq, which was slow and cumbersome.
Awesome Early Results
OkCupid evaluated both an ELK-based in-house solution and other log management products. After a deep evaluation, they chose Scalyr for its speed, ease of use, and potential to help phase out a number of other home-grown systems in the future.
When OkCupid works with vendors, they choose carefully. They like to move fast and prefer to work with companies that understand that and also move quickly. They were very happy with the access they had to deeply technical resources at Scalyr and felt all of their questions and concerns were addressed very quickly.
Within a month of purchasing, OkCupid had successfully deployed the Scalyr agent to their production servers, set up parsing of the logs, and was using Scalyr to track down issues in minutes that would previously have taken hours of developer and operations time. The Operations team at OkCupid was very happy with the support they received while getting Scalyr rolled out to the company. In particular, the fact that Scalyr created custom parsers for all of their log formats saved them time and made getting things up and running a breeze.
Alex told us that they already see “hours saved daily by some of their most expensive employees.”
Those engineers love the speed and responsiveness that Scalyr provides. With other solutions, you need to carefully think about queries before running them. If you get something wrong, you are punished by having to wait minutes before getting a chance to try again. Using Scalyr, OkCupid loves the ability to dig into their log data in a very ad-hoc way and “poke around in the data to tease out interesting patterns.” (Aside: OkCupid has a history of teasing out patterns in data.)
That speed doesn’t just save people time. In the event of fraudulent (or malevolent) web traffic, Scalyr makes it easy for OkCupid to quickly search for unusual events across all of their servers. For example, if someone has a list of stolen usernames and passwords and they are walking through that list trying the credentials on OkCupid, that pattern of behavior can be easily spotted using Scalyr.
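To make that pattern concrete, here is a minimal Python sketch of the idea: flag any IP address that fails to log in as many different usernames. This is not how Scalyr or OkCupid implement it. The log line format, the `suspicious_ips` function name, and the threshold are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical log line format, assumed for illustration only:
#   2016-03-01T12:00:00Z login ip=203.0.113.7 user=alice result=fail
LOGIN_RE = re.compile(
    r"login ip=(?P<ip>\S+) user=(?P<user>\S+) result=(?P<result>ok|fail)"
)


def suspicious_ips(lines, min_distinct_users=3):
    """Map each IP to its count of distinct usernames with failed logins,
    keeping only IPs at or above the (illustrative) threshold."""
    failed_users = defaultdict(set)
    for line in lines:
        match = LOGIN_RE.search(line)
        if match and match.group("result") == "fail":
            failed_users[match.group("ip")].add(match.group("user"))
    return {
        ip: len(users)
        for ip, users in failed_users.items()
        if len(users) >= min_distinct_users
    }


# Made-up sample data: one IP walks a list of usernames, another IP
# belongs to a user who mistypes a password twice and then gets in.
sample = [
    "2016-03-01T12:00:00Z login ip=203.0.113.7 user=alice result=fail",
    "2016-03-01T12:00:01Z login ip=203.0.113.7 user=bob result=fail",
    "2016-03-01T12:00:02Z login ip=203.0.113.7 user=carol result=fail",
    "2016-03-01T12:00:03Z login ip=203.0.113.7 user=dave result=fail",
    "2016-03-01T12:01:00Z login ip=198.51.100.2 user=alice result=fail",
    "2016-03-01T12:01:05Z login ip=198.51.100.2 user=alice result=fail",
    "2016-03-01T12:01:10Z login ip=198.51.100.2 user=alice result=ok",
]

flagged = suspicious_ips(sample)
```

Counting distinct usernames rather than raw failures is a deliberate choice in this sketch: a person who forgets a password produces repeated failures for one account, while someone walking a stolen list produces failures across many accounts from the same source.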
OkCupid has only recently adopted Scalyr for log management, and they have some cool ideas for how they can use Scalyr to improve their ability to see what’s going on across their vast collection of servers. They are already very happy with how quickly engineers can get the information they need from log data, and we are looking forward to catching up with them again in six months to talk about some of the other data they are planning to move to Scalyr.