Do you have issues with data storage space? In today’s post I’d like to share a scenario that may be familiar to you and how Azure Data Lake Storage can help.
Azure Data Lake
Using Databricks in Azure Data Factory for ETL
A common discussion we’ve had lately is about using Azure Databricks within Azure Data Factory for ETL.
Transitioning from Traditional to Azure Data Architectures
Summary of the Matter: If you only read one thing, please read this: transitioning to Azure is absolutely “doable”, but do not let anyone sell you “lift and shift”. Azure data architecture is a new way of thinking. Decide to think differently.
First Determine Added Value: Below are snippets from a slide deck I shared during Pragmatic Works’ 2018 Azure Data Week. (You can still sign up for the minimal cost of $29 and watch all 40 recorded sessions.) However, before we begin, let’s have a little chat. Why in the world would anyone take on an Azure migration if their on-prem SQL database(s) and SSIS packages are humming along with optimum efficiency? The first five reasons given below are my personal favorites.
- Cost (scale up, scale down)
- Event Based File Ingestion
- File based history (SCD2 equivalent but in your Azure Data Lake)
- Support for Near Real Time Requirements
- Support for Unstructured Data
- Large Data Volumes
- Offset Limited Local IT Resources
- Data Science Capabilities
- Development Time to Production
- Support for large audiences
- Mobile
- Collaboration
Each of the reasons given above is a minimum one-hour working session on its own, but I’m sharing my thoughts in brief in an effort to help you get started compiling your own list. Please also look at the following diagram (Figure 1) and note two things: a.) the coinciding “traditional” components and b.) the value add boxed in red.
Using ORC, Parquet and Avro Files in Azure Data Lake
In today’s post I’d like to review some information about using ORC, Parquet and Avro files in Azure Data Lake, in particular when we’re extracting data with Azure Data Factory and loading it to files in Data Lake.
Azure Data Week – Modern Data Warehouse Design Patterns
In his Azure Data Week session, Modern Data Warehouse Design Patterns, Bob Rubocki gave an overview of modern cloud-based data warehousing and data flow patterns based on Azure technologies including Azure Data Factory, Azure Logic Apps, Azure Data Lake Store, and Azure SQL DB.
There were many questions he was unable to answer during his session and we’re happy to share them with you now. If you missed Bob’s session or the entire week, you can still purchase access to the recordings by visiting azuredataweek.com.
Migrating from Oracle to Azure Data Warehouse
In this post, I’d like to tell you a story about a customer who chose to migrate from Oracle to Azure Data Warehouse and tell you their reasons for doing so, as well as the benefits they’re seeing after making the move.
Can Azure Data Factory Read Excel Files from Data Lake?
Today’s post is in response to a question I was recently asked. It’s about using Azure Data Lake Store with Azure Data Factory, in particular about using the Copy Activity within Data Factory to read data from Azure Data Lake.
Azure Data Lake vs Azure Blob Storage in Data Warehousing
In today’s post I’ll look at some considerations for choosing between Azure Blob Storage and Azure Data Lake Store when processing data to be loaded into a data warehouse. My basis here is a reference architecture that Microsoft published (see diagram below).
Hybrid Cloud Strategies and Management
Are you running a hybrid environment between on-premises and Azure? Do you want to be? In a recent webinar, Sr. Principal Architect Chris Seferlis answered the question: How can my organization begin using hybrid cloud today? In this webinar, he defines the four key pillars of true hybrid development (identity, security, data platform and development) and shows actionable resources to help get you started down the hybrid road.
Introduction to Azure Data Lake
We talk to a lot of customers about their data strategies, specifically their data cloud strategies. One great tool we have is Azure Data Lake. I’d like to introduce that tool and tell you about some benefits you will gain.