What is Data Engineering?

May 11, 2021

When you look at a dashboard or report in Power BI, you are seeing the tip of the iceberg; the data you see has likely taken many steps from its source on its way to Power BI.

Data engineering is the work of bringing in data from one or more sources and then shaping, validating, cleaning, correlating, and (often) storing it. It’s a lot of what we do at Imaginet.
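To make those steps concrete, here is a minimal sketch in Python. It is an illustration only, not how Imaginet builds its pipelines: the sample rows, the function names, and the `orders` table are all hypothetical, and an in-memory SQLite database stands in for wherever the data would really be stored.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from two different sources.
raw_orders = [
    {"order_id": "1001", "customer": " acme ", "amount": "250.00"},
    {"order_id": "1002", "customer": "Globex", "amount": "n/a"},
    {"order_id": "1003", "customer": "ACME", "amount": "99.50"},
]
raw_customers = [
    {"name": "Acme", "region": "West"},
    {"name": "Globex", "region": "East"},
]


def shape(row):
    """Shape: convert text fields to proper types; None marks an unparseable amount."""
    try:
        amount = float(row["amount"])
    except ValueError:
        amount = None
    return {"order_id": int(row["order_id"]), "customer": row["customer"], "amount": amount}


def is_valid(row):
    """Validate: keep only rows with a usable, non-negative amount."""
    return row["amount"] is not None and row["amount"] >= 0


def clean(row):
    """Clean: normalize customer names so ' acme ' and 'ACME' both become 'Acme'."""
    return {**row, "customer": row["customer"].strip().title()}


def correlate(rows, customers):
    """Correlate: attach each order's region by looking up its customer."""
    region_by_name = {c["name"]: c["region"] for c in customers}
    return [{**r, "region": region_by_name.get(r["customer"])} for r in rows]


def store(rows, conn):
    """Store: write the finished rows to a table (SQLite is a stand-in here)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, region TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (:order_id, :customer, :amount, :region)", rows
    )
    conn.commit()


def run_pipeline(conn):
    """Run every step in order and return the finished rows."""
    shaped = [shape(r) for r in raw_orders]
    valid = [r for r in shaped if is_valid(r)]
    cleaned = [clean(r) for r in valid]
    finished = correlate(cleaned, raw_customers)
    store(finished, conn)
    return finished
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` drops the order with the unusable amount and stores the other two, both matched to the same customer despite the inconsistent spelling in the raw data.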

Power BI has its own tool for much of this kind of work, called Power Query. It is a very powerful part of Power BI Desktop that helps power users build “data mashups,” clean and transform data, and do amazing things like scraping data off web pages or scanning a SharePoint document library and combining all matching files into a single table. It’s a great tool for “quick-n-dirty” Power BI data models and prototyping.

Often, clients come to us with complex Power BI models in which they have done remarkable things with their source data, but the models are hard to live with: they’re slow and difficult to change, they break easily (“who changed that Excel file!?”), they take a long time to refresh, or the model source must be changed every month and re-published manually.

Other clients have tried to do complex things in Power Query but have run into a roadblock or just don’t have the time to put into it.

This is where the Data Engineering team at Imaginet shines.

Our goal is to build a Power BI data model with the right tables, right names, right columns, and right measures (calculations) that can be published once and refreshed regularly without any manual intervention. With this kind of Power BI model, our clients can focus on visualizing the data to create reports and dashboards for others, or on exploring the data to see what stories it can tell and what insights it can reveal.

We often use tools and services from Microsoft Azure, such as Azure Data Factory and Azure SQL Database, to build a robust process that extracts data from your sources as raw materials and stores the end product in Azure SQL Database. The cost of these services is very low – most of our clients pay well under $1 a day for using Azure Data Factory, and a small Azure SQL Database is around $15 per month.

Just like a real factory that starts with raw materials, runs them through an assembly line or manufacturing process to create new widgets, and stores those widgets in a warehouse, Azure Data Factory consumes raw data and stores its finished products in a data warehouse, which for us is an Azure SQL Database.

We have expertise in extracting data from lots of different kinds of sources – not only databases like Microsoft SQL Server, Oracle, Sybase, PostgreSQL, and MySQL, but also data in text files or Excel documents, cloud services via REST APIs, SharePoint lists and libraries, and files stored on your own computers or in the cloud (not just the Azure cloud, but also AWS and Google).

It can be daunting to try to get data from a cloud-based application on your own, but it’s what we do regularly. It’s not always easy, but we have always figured out how.

We’ve also seen lots of tangled messes and dirty laundry. Sometimes what we find looks like a dumpster fire or a tire fire, but we can help sort out the mess and get it in order. Every client that we’ve worked with has data quality issues, and we help identify them, mitigate them, and work around them.
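Identifying data quality issues usually starts with simply counting them. The sketch below is one hypothetical way to do that in Python, assuming the data has already been loaded as a list of dictionaries; the `profile` function, the column names, and the sample rows are invented for illustration and are not a description of Imaginet's tooling.

```python
from collections import Counter


def profile(rows, key, required):
    """Count two common problems: duplicate keys and missing required values."""
    issues = Counter()
    seen = set()
    for row in rows:
        # A key value we have already seen means a duplicate record.
        k = row.get(key)
        if k in seen:
            issues["duplicate_key"] += 1
        seen.add(k)
        # None, empty strings, and whitespace-only strings all count as missing.
        for col in required:
            value = row.get(col)
            if value is None or str(value).strip() == "":
                issues[f"missing_{col}"] += 1
    return dict(issues)


# Hypothetical sample data with a few deliberate problems.
sample_rows = [
    {"id": 1, "email": "a@example.com", "city": "Winnipeg"},
    {"id": 2, "email": "", "city": "Calgary"},
    {"id": 2, "email": "b@example.com", "city": None},
    {"id": 3, "email": "  ", "city": "Dallas"},
]

report = profile(sample_rows, key="id", required=["email", "city"])
```

For this sample, `report` records one duplicate key, two missing emails, and one missing city. A summary like that is typically the starting point for deciding what to fix at the source and what to work around downstream.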

So, if you can relate to any of these issues, Imaginet can help you. We typically start with a short engagement to do a proof of concept so we get familiar with your data (and identify whatever messes you have). It’s a bit like hiring a personal organizer – we’ll come in and do the heavy lifting, and you can focus on the stuff that matters – your data.

Thank you for reading this post! If you enjoyed it, I encourage you to check out some of the other content on this blog. We have a range of articles on various topics that I think you’ll find interesting. Don’t forget to subscribe to our newsletter to stay updated with all of the latest information on Imaginet’s recent successful projects.

Mike Diehl

Mike Diehl is the practice lead for Databases and Business Intelligence at Imaginet Resources, a Microsoft partner with offices in Dallas, Calgary, and Winnipeg. Mike has over 20 years of experience using Microsoft database technologies and is an expert in Agile Analytics, Scrum, Kanban, and Microsoft Azure DevOps.
