Have you ever noticed that when you ask about data strategy, you sometimes receive a response about digitalization strategy instead? This can be confusing and frustrating, as it may not address your original question. Misunderstandings can easily happen when discussing complex concepts, even among professionals in the same field. Communication is challenging, and it's crucial to establish a shared language to ensure clarity. As data professionals, we often need to clarify terminology with business partners, even...
Let's start right from the beginning with a disclaimer: artificial intelligence still doesn't know how to figure out data quality deficiencies. It can't even search for them very well on its own. You need to show it some love for it to manage. But you can easily get quite a few benefits for the task at hand from machine learning as it is today. Data profiling is one of my all-time favorite data development tools. A few years ago, I got to know the Pandas Profiling Python library, which does so much of the work...
I don't recall any single technology generating as much hype during my career as OpenAI’s ChatGPT has. As a regular person, I am just as into that hype as the next person. But for it to revolutionize the data industry? Well, for that we might still have a way to go. On that path, however, Azure OpenAI is a hefty step in the right direction. Here is a story of my first impressions trying to utilize the new service offering. I wanted to do a text analysis that would, instead of just picking up words, use...
In Power BI, as in quite a few other analytics tools, there have always been challenges when moving to really large amounts of data, for example when analyzing a table with a billion rows. The usual answer to this kind of problem has been to store part of the data in the analytics tool's memory, which is usually limited, and then direct detailed queries to a database. A few years back, while testing how Databricks would perform against Power BI's direct queries, I was...
Voland Partners will make its first investment in Cloud1, a company that produces solutions for Microsoft Azure’s cloud services, and thus it will become a minority shareholder in the company. Cloud1 has an excellent reputation amongst both its employees and customers. The megatrend of cloud services and data utilisation, combined with a good working practice, has propelled the company to an average annual growth of about 40 percent.
There are many different services on Azure that are used for IIoT solutions. Playfully, I call this versatile stack a jungle. And why not? We can see a few different layers there, from both a usage and an architectural perspective. But before we go into details, let’s take a step back to see the forest for the trees, so to speak. Download our blueprint about IIoT.
I 💙 pandas, I really do. And I also like to do data profiling. So I can’t possibly go wrong with pandas profiling, right? Well, jokes aside, in my opinion data profiling should, at least to some extent, be a standard practice in all data processing work. For me and for many of my colleagues, though, our history with data development goes back to well before we had all the nice tools we have now. The process of checking for data issues has previously been mainly manual. Some of us who are lazy enough...
Microsoft Power Platform is a cloud-based set of applications that allows companies to automate processes, build custom applications, visualize data, and distribute reports. It brings low-code/no-code development and business intelligence tools to business users.
A week after I started to write the first blog post covering the Great Expectations framework, I am back at it again. I first managed to create a custom expectation (i.e., a custom data validation rule), after which I investigated the more formal way of using the framework. Here’s how it went and what I learned.