Transform Your Data On The Fly!

Introducing Scalyr PowerQueries

Today we are excited to announce PowerQueries, a new set of data operations within Scalyr that gives users the ability to transform and manipulate data. Query languages have been around for some time, but they’re often slow and hard to learn, and therefore hard to use. PowerQueries give you the same capabilities as comparable query languages to compute and analyze your data, but with the performance you’ve come to expect from Scalyr.

Most day-to-day observability problems aren’t that hard. Thousands of engineers rely on our existing full-text and facet-based search every day. But there are times when you run into those bears of problems. It’s at these times that you need something fast and powerful.

PowerQueries help you perform multi-conditional searches or create a data pipeline in support of specific reports. Use cases generally follow a similar pattern: take some data, group it, compute some statistics for each group, and then show only the groups that match some criteria. For instance, DevOps teams often only consider a specific route to be problematic if it generates more than N errors per minute. Another example could be observing whether a single IP address issues more than N login requests in some time period. You get the idea. And the power.

Our PowerQueries are designed with a simple syntax, so they are easy to learn and easy to use, and they integrate with and augment our split-second search capabilities. In fact, you can seamlessly pivot from our existing facet-based search into a PowerQuery. This is incredibly useful when you don’t know what fields you are looking for, which is often the case with unstructured log data.

Here’s a basic query that illustrates our simple yet powerful query language:

$logfile= '/var/log/nginx/access.log'
| group requests = count(), errors = count(status >= 500) by uriPath
| let error_percent = errors * 100 / requests
| sort -error_percent
| filter requests >= 500 && error_percent > 1
| limit 5
| columns uriPath, errors, error_percent, requests

This query pulls data from the Nginx access log to produce the following table. It shows up to five URI paths that received at least 500 requests and had more than 1% of them fail with a 5xx status, sorted in descending order by error rate (please excuse the formatting of the table!).


PowerQueries results table

As you can see, the query syntax is simple and easy to use.
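
If it helps to see the stages spelled out, here is a rough Python sketch of what the query above computes, run over a list of in-memory records instead of a log file. It is only an illustration of the group, compute, sort, filter, and limit pattern, not how Scalyr executes PowerQueries. The function name `top_error_paths`, its parameters, and the record layout (dicts with `uriPath` and `status` keys) are assumptions made for this example, as is the use of floating-point division for `error_percent`.

```python
from collections import defaultdict


def top_error_paths(records, min_requests=500, min_error_percent=1, limit=5):
    """Emulate the group / let / sort / filter / limit stages on in-memory records.

    `records` is assumed to be an iterable of dicts with "uriPath" and "status" keys.
    """
    # group requests = count(), errors = count(status >= 500) by uriPath
    groups = defaultdict(lambda: {"requests": 0, "errors": 0})
    for rec in records:
        group = groups[rec["uriPath"]]
        group["requests"] += 1
        if rec["status"] >= 500:
            group["errors"] += 1

    rows = []
    for uri_path, group in groups.items():
        # let error_percent = errors * 100 / requests
        # (float division is an assumption of this sketch)
        error_percent = group["errors"] * 100 / group["requests"]
        # columns uriPath, errors, error_percent, requests
        rows.append({
            "uriPath": uri_path,
            "errors": group["errors"],
            "error_percent": error_percent,
            "requests": group["requests"],
        })

    # sort -error_percent (descending)
    rows.sort(key=lambda row: row["error_percent"], reverse=True)

    # filter requests >= 500 && error_percent > 1
    rows = [
        row for row in rows
        if row["requests"] >= min_requests and row["error_percent"] > min_error_percent
    ]

    # limit 5
    return rows[:limit]
```

Each commented step maps to one pipe stage in the query, which is the point of the pipeline style: every stage takes a table and hands a smaller or richer table to the next one.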

If there is one thing that we’ve learned over the years, it’s that when tools are fast, simple and effective, engineers tend to use them. We’ve designed PowerQueries to provide useful capabilities with the performance you’ve come to expect of our product.

PowerQueries are available in beta. Please let us know if you’re interested in trying them out!

Live, log and prosper!