The big data revolution changed everything about how data is collected, stored, and analyzed.
While big data is now commonly used to drive decision-making in almost every arena, several industries have lagged behind, including the electric and energy sectors.
The team of data scientists and software engineers at PingThings is working to fill this industry gap.
They have built a platform that ingests, stores, analyzes, and learns from time series sensor data at scale.
We talked with the CEO, Sean Murphy, to learn about the problems they’re solving and the impact of their work in various industries.
The conversation below has been edited for length and content.
What is the main challenge that your team is solving?
It’s actually an interesting confluence of challenges. If you look at the big data space, most of the technologies that exist were built by the big tech titans, like Google and Facebook.
Data is their lifeblood; it’s one of their competitive advantages. Handling data at scale is just who they are. As a result, big data software has been created that solves problems these companies face.
If you look at time series data from the perspective of Facebook or Google, it’s a timestamp and a Tweet or a timestamp and a Facebook post. However, for the scientist or engineer, time series has a very different connotation; it’s about some sensor regularly measuring some underlying physics-based process.
Often, the more important the process or asset being monitored, the faster you take measurements – a thousand times a second or faster. Since Google and Facebook don’t typically deal with problems requiring such high sample rates, the problem has remained unaddressed. How do I store, compress, and analyze large numbers of time series data streams arriving at a thousand (or million!) samples per second?
It seemed like a worthwhile challenge to tackle.
What kind of industries does this technology impact? Who are your main users?
Our technology impacts numerous traditional industries, such as the electric or energy sectors.
We started by focusing on the energy space, particularly on distribution and transmission utilities, because they have underused high-frequency sensors already deployed on the grid. We also had help from the Department of Energy and the Advanced Research Projects Agency for Energy (ARPA-E).
The current set of grid sensors, called synchrophasors, takes measurements 30, 60, or 120 times per second. The newest sensors sample at a hundred thousand times per second. When you have a transmission system that covers a large region, such as the entire state of Texas or New York, you have hundreds and even thousands of sensors generating petabytes of time series data.
Because much of our electric infrastructure is well past its expected lifetime, it is more susceptible to wildfires, extreme weather, cyber-attacks, and other unexpected insults or perturbations. The more at-risk the patient, the more closely he or she should be monitored. Likewise, the grid needs to be monitored continuously and at high frequency. This results in an enormous amount of time series data.
Our PredictiveGrid Platform is designed to get insights from that sensor data.
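To give a rough sense of the data volumes described above, here is a back-of-the-envelope sketch in Python. The fleet sizes, the number of channels per sensor, and the 16 bytes per sample (an 8-byte timestamp plus an 8-byte value) are illustrative assumptions, not figures from PingThings or any utility. Only the sample rates (120 and 100,000 per second) come from the interview.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year


def raw_bytes_per_year(num_sensors, channels_per_sensor, samples_per_second,
                       bytes_per_sample=16):
    """Return the uncompressed bytes produced by a sensor fleet in one year.

    bytes_per_sample=16 assumes an 8-byte timestamp and an 8-byte float value;
    this is an assumption made for illustration only.
    """
    samples_per_sec_total = num_sensors * channels_per_sensor * samples_per_second
    return samples_per_sec_total * bytes_per_sample * SECONDS_PER_YEAR


# Hypothetical fleet: 1,000 synchrophasors, 10 channels each, 120 samples/s.
synchrophasor_bytes = raw_bytes_per_year(1000, 10, 120)

# Hypothetical fleet: 100 newer sensors, 6 channels each, 100,000 samples/s.
fast_sensor_bytes = raw_bytes_per_year(100, 6, 100_000)

print(f"synchrophasors: {synchrophasor_bytes / 1e12:.1f} TB/year")
print(f"fast sensors:   {fast_sensor_bytes / 1e15:.1f} PB/year")
```

Under these assumptions the synchrophasor fleet produces roughly 605 TB of raw data per year and the smaller fleet of faster sensors about 30 PB, which is consistent with the petabyte scale mentioned in the interview.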
Can you describe the PredictiveGrid Platform?
We offer the PredictiveGrid platform as a service running on most major cloud providers, and we can deploy it on premises as well.
There’s an ingestion layer to handle whatever data comes our way. Then there’s our time series database, purpose-built for large volumes of high-frequency sensor data. We actually invented a new data structure to handle the real-world problems that arise when you work with time series data, such as out-of-order insertions and data cleaning. Then we have a custom analytics layer that’s essentially a big data framework for time series data processing. There’s also a rich API layer upon which we and our customers build applications, analytics, and use cases to extract value from the data.
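To illustrate what "out-of-order insertions" means in practice, here is a minimal Python sketch of a series that stays sorted by timestamp even when points arrive late. This is not the PingThings data structure, which the interview does not describe. The class name `SortedSeries`, its methods, and the rule that a duplicate timestamp overwrites the earlier value are all hypothetical choices made for illustration.

```python
import bisect


class SortedSeries:
    """Toy in-memory time series kept sorted by timestamp (illustrative only)."""

    def __init__(self):
        self._times = []   # timestamps, always in ascending order
        self._values = []  # values, parallel to self._times

    def insert(self, timestamp, value):
        """Insert a point at its sorted position, even if it arrives late."""
        i = bisect.bisect_left(self._times, timestamp)
        if i < len(self._times) and self._times[i] == timestamp:
            # Assumed cleaning rule: a repeated timestamp replaces the old value.
            self._values[i] = value
        else:
            self._times.insert(i, timestamp)
            self._values.insert(i, value)

    def window(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_left(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))


series = SortedSeries()
# The point at t=10 arrives after t=20, and t=20 is later corrected.
for t, v in [(0, 59.98), (20, 60.01), (10, 60.00), (20, 60.02)]:
    series.insert(t, v)

print(series.window(0, 30))
```

Inserting into a Python list costs time proportional to the list length, so this approach would not scale to the sample rates discussed here. A production system would need a structure designed for that workload, which is presumably why a purpose-built one was developed.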
When you compare our platform with the best time series platforms coming out of Silicon Valley, ours is faster by at least an order of magnitude, can handle petabyte-scale data sets, and was built specifically to enable analytics.
With that foundation we’re set up to serve not just in energy, but in oil and gas, finance, medicine, healthcare, telecommunications, transportation, and manufacturing.
There are many other verticals that we could enter, since we can build custom applications for each industry that run on top of our platform.
Why have these markets been underserved in the past? Are there other companies providing a similar product?
We’re solving the issues of industries that have been neglected during the big data revolution.
The sales cycle is very long for a lot of these industries, like oil, gas, energy, and electric utilities.
There’s a saying in the energy space that it’s a race to be second. No one wants to be first; everyone wants to follow. Until recently, the mentality in this space was “if it ain’t broke don’t fix it.” Companies would simply rely on old systems.
But now we’re getting to the point where society’s expectations have changed. Consumers have Apple Watches that literally monitor every move they make, their EKG, and the oxygenation of their blood. If you can do this for a few hundred dollars, why aren’t we monitoring critical infrastructure that is the foundation of our country with such fidelity and care?
That pressure is now driving not only the adoption of new sensors and sensing technology but also the expectation that we do more than just collect and archive the data: we should actually put it to good use.
Data at rest tends to stay at rest. Data historians make your data history. It’s time to innovate and actually evolve and extract as much value as possible from industrial data.