
What is Delta Lake in Databricks?

If you’re not familiar with Delta Lake in Databricks, I’ll cover what you need to know here. Delta Lake is a technology developed by the original creators of Apache Spark. It’s designed to bring reliability to your data lakes by providing ACID transactions, scalable metadata handling, and unified streaming and batch data processing.

Let’s begin with some of the challenges of data lakes:

  • Data lakes are notoriously messy as everything gets dumped there. Sometimes, we may not have a rhyme or reason for dumping data there; we may be thinking we’ll need it at some later date.
  • Much of this mess is because your data lake will have a lot of small files and different data types. Because there are many small files that are not compacted, trying to read them in any shape or form is difficult, if not impossible.
  • Data lakes often contain bad data or corrupted data files so you can’t analyze them unless you go back and pretty much start over again.

This is where Delta Lake comes to the rescue! It delivers an open-source storage layer that brings ACID transactions to Apache Spark big data workloads. So, instead of the mess I described above, Delta Lake gives you a layer on top of your data lake. Delta Lake provides ACID transactions through a log that is associated with each Delta table created in your data lake. This log records the history of everything ever done to that table or data set, so you gain a high level of reliability and stability in your data lake.
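The transaction-log idea above can be sketched in plain Python. This is a toy analogy only – Delta’s real log is a set of JSON commit files managed by the Delta engine, and the class and method names here are invented for illustration:

```python
import json


class ToyDeltaLog:
    """Append-only commit log: every change to the table is recorded
    as a numbered commit, so the full history can be replayed."""

    def __init__(self):
        self.commits = []  # each entry is one atomic commit

    def commit(self, operation, data):
        entry = {"version": len(self.commits), "operation": operation, "data": data}
        # Round-trip through JSON to mimic a durable, immutable commit file
        self.commits.append(json.loads(json.dumps(entry)))
        return entry["version"]

    def table_at(self, version):
        """Replay the log up to a version - this replay is what makes
        'time travel' queries possible."""
        rows = []
        for c in self.commits[: version + 1]:
            if c["operation"] == "append":
                rows.extend(c["data"])
        return rows


log = ToyDeltaLog()
log.commit("append", [{"id": 1}])
log.commit("append", [{"id": 2}])
print(log.table_at(0))  # the table as of version 0: [{'id': 1}]
```

Because every write goes through the log, reading the table at any version is just replaying commits up to that point – which is the same mechanism behind the Time Travel feature described below.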

Key Features of Delta Lake are:

  • ACID Transactions (Atomicity, Consistency, Isolation, Durability) – with Delta you don’t need to write any code; transactions are automatically written to the log. This transaction log is the key, and it represents a single source of truth.
  • Scalable Metadata Handling – handles terabytes or even petabytes of data with ease. Metadata is stored just like data, and you can display it with the DESCRIBE DETAIL command, which shows all the metadata associated with a table. This puts the full force of Spark behind your metadata.
  • Unified Batch & Streaming – no more need for separate architectures to read a stream of data versus a batch of data, overcoming the limitations of separate streaming and batch systems. A Delta table is both a batch and streaming source and sink. You can do concurrent streaming or batch writes to your table, and it all gets logged, so it’s safe and sound in your Delta table.
  • Schema Enforcement – this is what makes Delta strong in this space. If you put a schema on a Delta table and try to write data that doesn’t conform to it, Delta raises an error and refuses the write, protecting you from bad writes. The enforcement reads the schema as part of the metadata, checks every column, data type, and so on, and ensures that what you’re writing matches the table’s schema – no need to worry about writing bad data to your table.
  • Time Travel (Data Versioning) – you can query an older snapshot of your data, giving you data versioning and the ability to roll back or audit changes.
  • Upserts and Deletes – these operations are typically hard to do without something like Delta. Delta lets you do upserts, or merges, very easily. A merge works like a SQL MERGE into your Delta table: you can merge data from another DataFrame into your table and perform updates, inserts, and deletes. You can also do a regular update or delete with a predicate on a table – something that was almost unheard of before Delta.
  • 100% Compatible with Apache Spark
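The schema-enforcement behavior in the list above can be illustrated with a small plain-Python sketch. This is a toy analogy of the check, not Delta’s implementation, and the column names and types are invented:

```python
# Schema stored as metadata: column name -> expected Python type
schema = {"id": int, "amount": float, "category": str}


def enforce_schema(row, schema):
    """Reject a write whose columns or types don't match the table schema,
    mirroring how Delta refuses non-conformant writes."""
    if set(row) != set(schema):
        raise ValueError(f"column mismatch: {set(row) ^ set(schema)}")
    for col, expected in schema.items():
        if not isinstance(row[col], expected):
            raise TypeError(
                f"{col}: expected {expected.__name__}, got {type(row[col]).__name__}"
            )
    return row  # conformant rows are allowed through


enforce_schema({"id": 1, "amount": 9.99, "category": "asset"}, schema)  # accepted
try:
    enforce_schema({"id": "one", "amount": 9.99, "category": "asset"}, schema)
except TypeError as e:
    print("write rejected:", e)
```

The real engine does the same kind of comparison against the schema stored in the table’s metadata before any data lands in the table.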

Delta Lake is really a game changer and I hope you educate yourself more and start using it in your organization. You’ll find a great training resource from the Databricks community at: https://academy.databricks.com/category/self-paced

Or reach out to us at 3Cloud. Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].

 

Brian Custer

Azure Trends: Looking Ahead to 2021

2020 has been many things. We’ve seen massive change and resiliency in our interactions with clients and each other. Working remotely for most organizations has changed the way we think about running our business and what we need to do for the future.
In a recent webinar, our delivery leaders discussed the lessons learned from 2020 and looked forward to 2021 to share what they’re looking for in the new year. From DevOps to modernizing applications to data, all the key areas of technology are discussed. Check out this webinar for an engaging, round robin-style conversation on the technical challenges and opportunities you and your team can expect as we say farewell to 2020.
You can watch the complete webinar below.


3Cloud is committed to bringing you the most up-to-date topics covering all areas of Azure, data, and the cloud in our free weekly webinars. We are filling our calendar with hot topics delivered by industry experts in 2021. Follow us on social media to stay up to date and keep checking our event calendar on our website.

Our free webinars happen every Tuesday at 11 a.m. ET. We look forward to you joining us each week in 2021!

3Cloud

Azure Synapse Analytics Now in GA and the Public Preview of Azure Purview

I’m here with some exciting news from Microsoft! Last week at a digital conference, Satya Nadella announced the general availability of Azure Synapse Analytics and the preview of Azure Purview, a unified data governance service. Azure Synapse Analytics has been gaining traction while in preview, and adding Azure Purview gives businesses the ability to get the most out of their data and analytics.


Let’s talk about Azure Purview. This is a comprehensive data governance service that helps organizations discover all data across the organization. Demos at the digital conference showcased different ways you can use Purview for governance. One key capability is that it can go multi-cloud – not only Azure, but other clouds as well. You can also connect it to your on-prem environment and your Azure data assets.

For quite some time, those of us in the data disciplines have worked to inventory all the different aspects of data, like column, database and table names, etc., and put all those pieces into a common repository, often referred to as a data dictionary. Microsoft has been working for years to create a product that would be comprehensive enough to help most people with their governance and compliance needs. We’ve now got this with Azure Purview.

Some key highlights pointed out are:

  • A business glossary – no need to manually build a data dictionary.
  • Automated data classification – allows you to know things like data type (a Social Security number, for instance). You also have custom options and can schedule future scanning and classification on a routine basis. This way you’re getting continual updates, as opposed to a data dictionary, where you get a snapshot in time unless you manually update it.
  • Cloud-based search facility – gives you the ability to find things quickly and easily across a broad series of data assets.
  • Data lineage and reporting – supports the end to end data lifecycle.
  • Power BI facilities

I feel Azure Purview is a very strong offering. Without it, I would have had to either create my own versions of these pieces or use something like Embarcadero, which I used years ago. Another thing to note is that the experience is very similar to the canvas workspace experience in Azure Synapse Analytics, so if you’ve been working with that, it will feel very familiar.

The next part of Microsoft’s announcement is that Azure Synapse Analytics is now generally available. Azure Synapse Analytics is a limitless analytics service that brings together traditional data warehousing and big data analytics in one offering, providing a unified experience to ingest, prepare, manage, and serve data for immediate machine learning and BI applications. I, and many of our customers, have been using this great product a lot, so its move to GA is surely exciting news.

Some noteworthy things with Azure Synapse Analytics are:

  • A new native cloud distributed SQL engine
  • Deep integration with Spark
  • Flexible query options such as serverless and dedicated
  • Integration with Power BI and machine learning
  • TPC-H benchmark at petabyte scale
  • Native Row Level Security (this is not possible with Amazon Redshift or Google BigQuery)
  • Native ML integration for the citizen data scientist
  • Code management – by that they’re talking about Azure DevOps as another piece that plays well with it.
  • Power BI integration to Teams which I found to be kind of cool

Again, great announcements with both the general availability of Azure Synapse Analytics and the public preview of Azure Purview. These two products combined empower teams to remove data silos and leverage all data for analytics and data governance.

Need further help with these or any Azure product or service? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].

Rowland Gosling

Getting Started with Custom Power BI Report Themes

Do you want to learn how to create visual consistency across your reports and eliminate the monotonous task of setting common visual properties? Power BI report themes are more than just a color palette for your reports. With report themes, you can apply design changes to your entire report. All the visuals in your report will use the colors and formatting of the theme you’ve selected as a default. You can make design changes like using corporate colors, changing icon sets or applying new default visual formatting.

In a recent webinar, 3Cloud Senior Consultant, Jeremy Black, takes you on a deep dive into Power BI report themes. Jeremy covers the two types of report themes: built-in and custom. With custom themes you can customize via Power BI Desktop and/or by manually manipulating the JSON theme file used by Power BI Desktop. He’ll also cover the benefits of creating custom report themes, as well as their limitations.

Lastly, the presentation will delve into the Power BI report theme building blocks including data and structural colors, text, and visual styles. Most of this webinar is spent on demos of examples of how to begin customizing your report themes.

You can watch the complete webinar below.


So, if you’re looking to create visual consistency in your reports without spending tons of time setting common visual properties for your reports, this webinar is for you. Join us for our free weekly webinars covering a variety of Azure topics every Tuesday at 11 a.m. ET. Check out our events calendar for upcoming topics.

Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].

3Cloud

Setting Variables Using DAX in Power BI

Welcome to another Azure Every Day focused on one of my favorite topics, using DAX in Power BI. In a previous post, I covered a couple of my favorite DAX formulas, CALCULATE and FILTER. Here, I’ll discuss variables and how to work those into CALCULATE/FILTER to expand on that and make it even more powerful. You may want to check out my previous post here, as I’ll be continuing using that demo.

  • I started by building in a CALCULATE/FILTER function in a table to calculate the beginning balance for 2017 for all my assets.
  • My code (see my video demo for code detail) tells it to calculate the sum of the beginning balance and I filtered the table where fiscal year equals 2017, and finance type equals assets.
  • Now, let’s say we want to know the assets for every year, not just 2017. To do this, we need to set the year into a variable and then it will calculate that asset for each individual year.
  • You use the VAR keyword to set the variable in the code, and you’ll need to give it a name. In my case I’ll use YR for year and set it equal to the fiscal year. It’s important to note that anytime you set variables, you must follow them with RETURN at the end.
  • Next, I’m going to update my FILTER. My code is calculating the sum of the beginning balance and I’m filtering the table where fiscal year equals 2017, but now I want to take that out and change it to fiscal year equals year.
  • How this works is when this goes through each row it will calculate for each year by using the variable instead of hard coding the year into there.
  • So previously we only had one outcome for 2017, now when we submit this, we’ll see four outcomes as we have four years’ worth of data, so we’re getting a calculation for our beginning balance for each year.
  • We can even step this up a bit if we wanted not only the beginning balance for each year, but also for each finance type. Maybe we don’t just want the assets but other values like equity, expense or liability.
  • All we need to do is set another variable that I’ll call FT for finance type. And instead of doing this for where finance type equals asset, we’ll say where finance type equals our variable.
  • Now we’ll have the calculation for every year for every individual finance type.
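In pandas terms, swapping the hard-coded year and finance type for variables is like replacing a filter with a group-by. This is a plain-Python analogy with invented table and column names, not the author’s actual DAX:

```python
import pandas as pd

# Invented sample data standing in for the balance table from the demo
balance = pd.DataFrame({
    "FiscalYear":       [2016, 2016, 2017, 2017],
    "FinanceType":      ["Asset", "Equity", "Asset", "Equity"],
    "BeginningBalance": [100.0, 40.0, 50.0, 10.0],
})

# One result per year and per finance type, instead of a single
# hard-coded (2017, Asset) pair
per_group = balance.groupby(["FiscalYear", "FinanceType"])["BeginningBalance"].sum()
print(per_group)
```

Just as in the DAX version, each row’s own year and finance type drive the calculation, so you get one beginning balance per combination rather than a single hard-coded result.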


 

I hope this quick example helps you to start using these DAX formulas in your reports. The CALCULATE/FILTER functions and using variables are something I use all the time in Power BI.

Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].

Alex Beech

Microsoft Announces the General Availability of Azure Synapse Analytics and the Preview of Azure Purview

There was some exciting news from Microsoft today! At a digital conference, Satya Nadella, CEO of Microsoft, announced the general availability of Azure Synapse Analytics and the preview of Azure Purview, Microsoft’s new unified data governance service.

Why are these announcements so important? Because to thrive, businesses need to harness the power of their data.

One of the best ways to harness the power of your data is to remove data silos. While not a new concept, achieving this has been a constant challenge in the history of data and analytics, as many ecosystems continue to be complex and heterogeneous. Organizations must break down all silos to get the most out of data and analytics, in a consolidated, secure, and compliant manner. Azure Synapse Analytics and Azure Purview remove silos by creating a unified data analytics and governance service.

Now that we’ve given you an overview of the announcements, let’s break down the details of each Azure service.

What is Azure Synapse Analytics? Azure Synapse Analytics is a limitless analytics service that brings together traditional data warehousing and big data analytics – into one offering! Azure Synapse brings these two worlds together with a centralized experience to ingest, prepare, manage, and serve data for immediate BI and machine learning applications.

Azure Synapse Analytics Highlights

  • New cloud native distributed SQL engine
  • Deep integration with Spark
  • Flexible service query options – Serverless + Dedicated
  • Power BI + ML integration
  • Azure Synapse Link – Enables real-time data analytics with link to your operational database
  • TPC-H benchmark at PB scale
  • Native Row Level Security – This is not possible with Amazon Redshift, Google BigQuery
  • Citizen Data Scientist capabilities with direct ML integration
  • Code Management – Automation, code sync to dev/master branch, and end to end deployment lifecycle
  • Power BI integration from inside the Synapse service
  • Ability to add Power BI reports into Teams for end users

What is Azure Purview? It is critical to enable these capabilities through a comprehensive data governance solution. An organization that does not know where its data is does not know what its future will be. Azure Purview is a unified data governance service that helps organizations achieve a complete understanding of their data. It empowers users to discover all data across the business wherever it is stored – on-premises, across clouds, in SaaS applications, or in Power BI – track data lineage, and create a business glossary.

Azure Purview Highlights

  • Key important and rich features: Compliance, Catalog and Data Map
  • Sourcing from on-prem, SaaS, and multi-cloud services
  • Data Map: search/browse data by tech/non-tech domain
  • Business glossary
  • No manual building of data dictionary
  • Data scanning by clicking on a button
  • Automated data classification with custom options
  • Schedule for future scanning and classification
  • Easy cloud based data search
  • Data lineage and reporting
  • Free scanning: On-premise SQL Servers, Power BI Service, data sensitivity labeling for O365 E5 customers

The combination of Azure Synapse Analytics and Azure Purview enables organizations to develop the capabilities needed to empower their teams to leverage all data for analytics and data governance, silo-free.

3Cloud

Introduction to Azure Synapse Analytics

Do you want to learn how to bring together enterprise data warehousing and big data analytics? In a recent webinar, 3Cloud consultant Mike Donnelly explores the components included with an Azure Synapse Workspace. This presentation goes through all the various pieces of Synapse – what it is and what it can do – as well as why people will use it more and more as they build modern data warehouses.
When you search Azure Synapse Analytics, you’ll see there is Azure Synapse Analytics (formerly known as Azure SQL DW) which has been around for a bit as a data warehouse solution. But you will also find Azure Synapse Analytics Workspaces Preview. This preview resource is the focus today and many of the pieces it includes will be covered (and demoed), including:
• SQL Pools (formerly known as SQL Data Warehouse)
• Built-in SQL Pool (aka SQL On-Demand/Serverless)
• Azure Data Lake Storage (Gen2)
• Spark Compute Pools (not Azure Databricks)
• Integrate (Azure Data Factory)
If you’re interested in learning more about Azure Synapse Analytics and why it will become the tool of choice for building modern data warehouses, this webinar is for you. Watch the complete webinar below.

Be sure to watch for our free weekly webinars every Tuesday at 11 a.m. ET. Check out the events calendar on our website: https://www.3cloudsolutions.com/resources/events-calendar/

Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or  [email protected].

3Cloud

DAX CALCULATE and FILTER Functions in Power BI

We all know Power BI is a powerful analytics tool and it’s easy to create a new Power BI Desktop file, import some data into it and create reports that show valuable insight. Adding Data Analysis Expression, better known as DAX, enables you to get the most out of your data.
If you’re new to Power BI or DAX, DAX is a collection of formulas, or expressions, that are used to calculate and return one or more values. DAX helps you to use data already in your model and create new information from it that can solve real business problems.
When I first started using DAX functions, it brought my Power BI skills to the next level. I was able to tackle some analytical needs that I had struggled with in the past. I’m here to share a couple favorite formulas that I use all the time called the CALCULATE function and the FILTER function. Please be sure to watch my video included in this post as I walk through using this DAX formula.
• In my demo, I’m working with a data set to find the beginning balance for 2017 for our assets.
• To do that I need to sum a column in my table called beginning balance when fiscal year equals 2017 and when financial type equals asset.
• I’ll do this by using a combination of the CALCULATE function and the FILTER function. The CALCULATE function allows you to calculate a function on the entire table.
• In my code I’m going to CALCULATE the sum on our beginning balance. This would calculate the sum for the entire table.
• But we only want to calculate the sum for 2017 and for just the asset finance type. For this, once we have calculated the table, we need to filter that table. Think of the FILTER function as making a virtual table in the background.
• We need to FILTER it where fiscal year equals 2017 and where finance type equals asset. In my code, I’ll add FILTER for the function, and we need to tell it what table we are going to be filtering, in my case it’s the balance table. Then add where fiscal year equals 2017 and where finance type equals asset.
• Using these DAX functions, our result will show the beginning balance for our assets for 2017.
• My video shows you exactly how to write the code I used here, so be sure to check it out.
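The filter logic in the steps above can be approximated in pandas. This is a plain-Python analogy with invented table and column names, not the author’s actual DAX (which is roughly CALCULATE of a SUM wrapped around a FILTER, as shown in the video):

```python
import pandas as pd

# Invented sample data standing in for the balance table from the demo
balance = pd.DataFrame({
    "FiscalYear":       [2016, 2017, 2017, 2017],
    "FinanceType":      ["Asset", "Asset", "Asset", "Equity"],
    "BeginningBalance": [100.0, 50.0, 25.0, 10.0],
})

# FILTER: keep only rows where fiscal year is 2017 and finance type is Asset
mask = (balance["FiscalYear"] == 2017) & (balance["FinanceType"] == "Asset")

# CALCULATE(SUM(...)): sum the beginning balance over the filtered rows
beginning_balance_2017_assets = balance.loc[mask, "BeginningBalance"].sum()
print(beginning_balance_2017_assets)  # 75.0
```

The FILTER step plays the role of the “virtual table in the background,” and the sum is then computed over only those rows.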

As you can see, this is super simple, and the formula gives you flexibility in how you write it. You can FILTER tables in many ways and use different functions within CALCULATE. I hope you enjoyed this simple use case of these powerful DAX functions in Power BI.

Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or [email protected]

Alex Beech

3Cloud Welcomes Mark Nelson as New VP of Managed Services

We’re excited to announce Mark Nelson as our VP of Managed Services at 3Cloud! In this role, Mark will lead our Managed Services practice and drive the growth of this key division of 3Cloud’s business.

“We’re thrilled to welcome Mark as VP of Managed Services. He has stepped into a pivotal role and we’re excited to see him bring his knowledge and experience to deliver excellent experiences for our clients, and to drive the growth of our Managed Services business,” said Matt Morse, COO of 3Cloud.

Mark comes to us with a deep background in consulting and managed services. He led the creation and growth of West Monroe Partners’ award-winning Managed Services practice—which ultimately grew to more than 125 people—and shepherded that team’s spinoff into Ascend Technologies. Prior to West Monroe Partners, Mark held technology leadership roles at Arthur Andersen and BearingPoint. He spent nine months serving as interim CIO for a client in the futures trading industry, and that experience was an eye-opener to the opportunity to deliver managed IT services that mid-market firms really need. He is a lifelong resident of Illinois and attended Indiana University.

“I was impressed with the culture and strategy of 3Cloud’s leadership and the capabilities of the 3Cloud team,” Mark states. “3Cloud has a strong Managed Services foundation to support our clients as they evolve on their journey with Azure.  I look forward to building on our current Managed Services offering and continuing to expand our capabilities to meet the changing needs of our clients and to be the premier Managed Services Provider for Microsoft Azure customers.”

As the cloud is forever changing, with new updates emerging daily, companies can quickly fall behind. 3Cloud’s Managed Services practice has the tools, processes, and IP to optimize our clients’ spend and performance and ensure they stay on the leading edge. Our Managed Services team monitors and manages clients’ environments to proactively mitigate problems, contain costs, and get the maximum performance and availability from their systems. 3Cloud is one of only 55 Azure Expert MSPs in the world.

3Cloud

Create a Custom Visual in Power BI with React

Welcome to another edition of Azure Every Day! I’m an App Development Consultant at 3Cloud and in this post I’ll show you how to create a custom visual in Power BI with React.
When creating a custom visual in Power BI, you use the TypeScript language. You’ll be fine with TypeScript if you are familiar with JavaScript. Once you’ve created a custom visual from scratch, you can test and debug your visuals online using live reports, as well as package and deploy your visual to a production report. You can check out the Microsoft documentation about custom visuals at https://powerbi.microsoft.com/en-us/developers/custom-visualization/ to learn more.
I’ll walk you through the process here, but also be sure to watch my video included at the end of this post.
• To get started, be sure you have node installed on your machine.
• Navigate to your project directory and run the npm install command for the Power BI visual tools. Once the tools are installed, run pbiviz new followed by your project name.
• Next, run pbiviz start, which starts your custom visualization on the local host.
• To debug your visualization, go to app.powerbi.com and create a new report (or use an existing one) and add the developer visual to the report.
• In order to add React elements to your custom visual, you’ll need to install React and ReactDOM. Import React and ReactDOM into the visual.ts file or whatever file you will be using to render the HTML elements to the DOM.
• Create a React element within the visual.ts. Also create the DOM element and add an element to the DOM that is referenced by this element.
• Next, create the React component using React.createElement of the component name that you’ve imported, then pass any props you would like in the second parameter of the React.createElement function.
• Finally, you can add the React element to the DOM by using ReactDOM.render and reference the React component that you built and the HTML element on the page that you would like to add the React element to.
• When you’re finished building and debugging your project, you can set the project and author details in the pbiviz.json file.
• Run the pbiviz package command to generate a pbiviz file to import into your Power BI report.
• Now you can go into your report and import a visual from file. Select the visual and add the data you would like to add to your visual and configure the settings you had previously.
• Please check out my video for code detail on all the above steps.
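Pieced together, the React steps above might look roughly like this inside visual.ts. This is a sketch only: it assumes the powerbi-visuals-api scaffolding generated by pbiviz new, React 17-era ReactDOM.render, and a hypothetical MyComponent that you’ve written and imported:

```typescript
import * as React from "react";
import * as ReactDOM from "react-dom";
import powerbi from "powerbi-visuals-api";
import { MyComponent } from "./myComponent"; // hypothetical component

export class Visual implements powerbi.extensibility.visual.IVisual {
    private target: HTMLElement;

    constructor(options: powerbi.extensibility.visual.VisualConstructorOptions) {
        // The Power BI host hands us a container element for the visual
        this.target = options.element;
    }

    public update(options: powerbi.extensibility.visual.VisualUpdateOptions) {
        // Create the React element, passing any props in the second parameter
        const element = React.createElement(MyComponent, {
            width: options.viewport.width,
            height: options.viewport.height,
        });
        // Mount the React element onto the host-provided DOM element
        ReactDOM.render(element, this.target);
    }
}
```

Re-rendering inside update keeps the React tree in sync whenever Power BI resizes the visual or changes its data.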

 

This post walked you through the steps to create a custom visual in Power BI using React. I hope you’ll give it a try.

Need further help? Our expert team and solution offerings can help your business with any Azure product or service, including Managed Services offerings. Contact us at 888-8AZURE or [email protected]

Tom Ward