How to Build the World’s Most Productive Data Lake and Data Analytics Platform with Scala

Silectis’ founder and CEO, Demetrios Kotsikopoulos, shares how he’s using Scala to build a team, culture, and technology that finds valuable insights in data.

The conversation below has been edited for length and content.

Silectis aims to be the world’s most productive platform for data analytics. If you manage data pipelines, data security, data compliance or have anything to do with data, they want to make you more productive. Silectis is built by people who have done your job in the past, and they know how to make data effective.

“The way you do that is by creating individual wins,” says Demetrios. “A lot of the features we’ve built into the platform are things that automate routine tasks that everyone does.”

One of those features is data profiling. When you have a new data source, Silectis will preprocess it to help illuminate what that data is. At a glance, you’ll know how big your data set is, what range it covers, how it’s distributed, and whether there are any “unexpected gaps.” It’s a task you nearly always have to do when integrating a new data source or understanding data you already have. By automating this routine task, Silectis saves analysts hours of time.
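
To make the idea concrete, here is a minimal sketch in Scala of the kind of summary a profiler might compute for a single numeric column. It is an illustration only: `ColumnProfiler`, `Profile`, and `profile` are hypothetical names, not part of the Silectis platform, and a real profiler would also report distributions and gaps.

```scala
// Hypothetical sketch: none of these names come from the Silectis platform.
object ColumnProfiler {

  /** Summary statistics for one numeric column. */
  final case class Profile(
      rowCount: Int,       // total rows, including missing values
      missingCount: Int,   // rows where the value is absent
      min: Option[Double], // None when every value is missing
      max: Option[Double], // None when every value is missing
      mean: Option[Double] // None when every value is missing
  )

  /** Profiles a column in which a missing value is represented as None. */
  def profile(column: Seq[Option[Double]]): Profile = {
    val present = column.flatten // keep only the values that exist
    Profile(
      rowCount = column.size,
      missingCount = column.size - present.size,
      min = present.reduceOption(_ min _),
      max = present.reduceOption(_ max _),
      mean = if (present.isEmpty) None else Some(present.sum / present.size)
    )
  }
}
```

Calling `ColumnProfiler.profile(Seq(Some(3.0), None, Some(1.0), Some(5.0)))` reports four rows, one missing value, and a range of 1.0 to 5.0, which is the sort of at-a-glance answer an analyst would otherwise compute by hand.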

At the same time, Silectis aims to make the technology more approachable. Silectis’ users have varying technical capabilities. Some are savvy analysts or experts in data science and machine learning. Others might just know how to make SQL queries. Silectis aims to make all of those people more effective.

Why build a data lake and analytics platform with Scala?

From a core application development standpoint, Silectis has largely built its platform in Scala. Demetrios chose Scala because he already knew the language and because it lets the team write concise code and accomplish what they need efficiently. To Demetrios, Scala was a “no-brainer” for the core of the platform.

One of Scala’s many strengths as a programming language is that it allows developers to specify a grammar and gives them a built-in way to parse that grammar. Demetrios’ team was able to build their own scripting language to manipulate the platform.
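
One common way to do this in Scala is with parser combinators; the interview does not say which approach Silectis used, so treat the following as a sketch under that assumption. The one-command language (`load <source> into <table>`) and the names `MiniScript`, `Load`, and `parseCommand` are hypothetical. In current Scala versions the combinators ship as the separate scala-parser-combinators module, which must be on the classpath.

```scala
// Assumes the scala-parser-combinators module is on the classpath.
import scala.util.parsing.combinator.RegexParsers

// Hypothetical mini-language with a single command: load <source> into <table>
object MiniScript extends RegexParsers {

  /** The parsed form of a load command. */
  final case class Load(source: String, table: String)

  // Grammar rule: an identifier is a letter or underscore followed by
  // letters, digits, or underscores.
  private def identifier: Parser[String] = """[A-Za-z_][A-Za-z0-9_]*""".r

  // Grammar rule: the keywords are matched and discarded (~>), and the two
  // identifiers are kept and turned into a Load value (^^).
  private def loadCommand: Parser[Load] =
    "load" ~> identifier ~ ("into" ~> identifier) ^^ {
      case source ~ table => Load(source, table)
    }

  /** Parses one full line of input, returning an error message on failure. */
  def parseCommand(input: String): Either[String, Load] =
    parseAll(loadCommand, input) match {
      case Success(command, _) => Right(command)
      case failure: NoSuccess  => Left(failure.msg)
    }
}
```

Here `MiniScript.parseCommand("load sales into staging")` yields a `Load` value, while an incomplete command such as `"load sales"` yields an error message instead. The grammar reads almost like its written specification, which is what makes this style attractive for small scripting languages.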

Another advantage of Scala is its compatibility with Apache Spark. Spark is written in Scala, and Silectis’ platform builds on top of it. The team chose Spark as the basis for the platform because it gave them the scalability and performance they needed to manage large data sets. That let the team focus on building productivity and management tools rather than on system performance alone.

Aside from Scala and Spark, Silectis uses a number of other languages and technologies as needed. They use JavaScript on the front end. Parts of the platform are written in Python and R so that users can run code in either language on top of it. They use PostgreSQL as their metadata repository. They also have infrastructure across multiple clouds: Amazon Web Services, Google Cloud Platform, and Microsoft Azure.

Silectis’ engineering team isn’t large (yet), but they’re already thinking about scalability. They know they’ll be growing this year. They’re also considering building richer UI components in the future. While they may stay inside of Scala and use a framework like Play for backend web services, they might also look at other options, such as Python. They are still evaluating front-end frameworks: some of their tools currently use AngularJS, and the team is also investigating React.

How do you build a Scala engineering team that can scale with the product?

Demetrios is open to engineers without a Scala background joining his team.

“You can convert people,” he explained. A skilled Java, Python, or C# programmer is capable of picking up a new language, especially if they have a good foundation in software engineering.

Silectis team members who hadn’t worked with Scala before have been able to pick it up, even with little or no previous exposure to functional programming. Demetrios knew they had a sound foundation that would allow them to learn Scala and good functional programming practices.

At Silectis, engineers are hired for their intellectual curiosity, critical thinking skills, dedication, and thoughtfulness. Those characteristics can balance out a lack of skills in specific areas. Hiring this way helps balance the team, and Demetrios is happy to be a mentor for new engineers.

That’s not to say they don’t consider academic backgrounds. Demetrios describes academics as a “convenient filter” that allows them to evaluate candidates more effectively. Rigorous coursework in math, engineering, or computer science shows that engineers have the right intellectual mapping to handle the work.

How do values drive your team culture and protect your core platform?

Recently, the Silectis team conducted an exercise where they wrote down the mission, vision, and core values of the company. The exercise sharpened their understanding of the principles driving their work.

“We didn’t invent them,” Demetrios says about the team’s principles, “but we documented what we are already doing and created a picture so now when we bring people into the team we can be more intentional and explicit about our values.”

One of the principles that came out of that exercise was a commitment to great software engineering. The team is focused on building a robust process and is committed to automated test coverage. They maintain very high test coverage and see automated testing as central to productive software development.

In addition to automated unit testing, they have a number of ways of measuring code quality. They maintain consistent naming and formatting conventions.

The stability and scalability of the code allow the team to hire engineers who will need training.

“Appreciating those things is going to be a measure of fit for us. It’s easier to learn a new language than it is to change someone’s entire mindset around the way they approach engineering,” explains Demetrios.

“If you don’t believe writing tests is useful, you’re going to hate automated tests [but] that’s what allows us to make substantial changes quickly without breaking anything. When we think about team scalability, having a process, having a lot of automation, that’s what allows us to add people without them having to learn every detail of the platform – it won’t shatter if somebody tries to make a change.” As long as the entire team believes in it, there’s a stable foundation for everyone.

Interviewer: Chris Mills