Snowflake Started to Support Python – The Fastest Growing Programming Language

When Snowflake Inc launched nine years ago, its founders built a data solution to harness the power of the cloud. They created the Data Cloud, a global network where organisations could mobilise their data on a virtually unlimited scale. This meant organisations no longer needed data silos scattered throughout their offices and subsidiaries; given its sheer scale, they could use Snowflake instead.

Snowflake has now announced support for Python as part of Snowpark, its developer framework. From now on, developers can collaborate on data in their preferred language while leveraging Snowflake’s platform to build scalable, optimised pipelines, applications, and machine learning workflows.

Snowpark continues to support Java and Scala, so different users can work in different languages. They all work against the same data with one processing engine, without needing to copy or move it. This gives developers flexibility and a simpler environment that requires less administration and maintenance. Christian Kleinerman, SVP of Product at Snowflake, commented:

“Snowflake has long provided the building blocks for pipeline development and machine learning workflows, and the introduction of Snowpark has dramatically expanded the scope of what’s possible in the Data Cloud. As with Snowpark for Java and Scala, Snowpark for Python is natively integrated into Snowflake’s engine so users can enjoy the same security, governance and manageability benefits they’ve come to expect when working with Snowflake. As we continue to focus on mobilising the world’s data, Python broadens even further the choices for programming data in Snowflake, while streamlining data architectures.”

Canva’s Head of Data Platforms says it’s easier to grow with Snowflake

At a media launch for Snowpark for Python this week, Canva’s Head of Data Platforms Greg Roodt said that Canva is taking advantage of the technologies Snowflake provides. He added that Canva had previously used a platform with “fixed costs”, which became difficult as the company grew so quickly.

Novartis works with Snowflake

Loic Giraud, the Global Head of Digital Platform & Product Delivery at Novartis, said they’re using Snowflake because:

“… the flexibility and scale of Snowflake’s Data Cloud allow us to accelerate our pace of knowledge through data interpretation and insight generation, bringing more focus and speed to our business. Bringing together all available data ultimately unlocks more value for our employees, patients, and health care providers, and data science innovations help us realise this goal.”

With Snowpark for Python, data teams can:

  • Accelerate their pace of innovation using Python’s familiar syntax and ecosystem of open-source libraries (a minimal usage sketch follows this list).
  • Optimise development time with an integrated Python package dependency manager that removes the time otherwise spent dealing with broken Python environments.
  • Operate with improved security by eliminating ungoverned copies of data, with all code running in a secure sandbox inside Snowflake.
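
To make the first point concrete, here is a minimal sketch of what a Snowpark for Python pipeline could look like. The connection parameters, the ORDERS table and its columns are placeholders, and the calls follow Snowflake’s publicly documented Snowpark Python API rather than anything quoted in the announcement; the key idea is that the filter and aggregation are pushed down and executed inside Snowflake’s engine, so no data is copied out.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

# Placeholder credentials; real values come from your Snowflake account.
connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}

session = Session.builder.configs(connection_parameters).create()

# Build a lazily evaluated DataFrame over a (hypothetical) ORDERS table.
# The filter and group-by are translated to SQL and run inside Snowflake.
orders = session.table("ORDERS")
large_orders_by_region = (
    orders.filter(col("AMOUNT") > 1000)
          .group_by(col("REGION"))
          .count()
)
large_orders_by_region.show()

session.close()
```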

New developments within Snowflake include:

  • Cross-cloud account replication;
  • Improved replication performance;
  • Expanded governance capabilities and integration;
  • Snowpark: Stored Procedures to define, execute, and schedule complex application code entirely within Snowflake, with no separate client to manage;
  • Snowpark: Unstructured File Processing using Java functions directly within Snowflake;
  • Snowpark: Logging Framework to help improve development productivity.
Rethinking Data Architectures For A Cloud World

Data analytics solutions are continuing to emerge at a fast and furious rate. Data teams are at the center of the storm because they have to balance all the demands for access, data integrity, security, and proper governance, which entails compliance with policies and regulations. The businesses they serve need information as quickly as possible and have little patience for that precarious balancing act. The data teams have to move fast and smart.

They also have to be fortune tellers, because they need to build not just the systems for today but also the platforms for tomorrow. The first key question a data team must consider is whether to adopt an open or a closed data architecture.

Open vs. closed data architecture

Traditional databases are, by definition, what we would call “closed data architectures.” That’s not a value statement; it’s a descriptive one. It means that the data itself is closed off from other applications and must be accessed through the database engine. This is true even for moving data around with ETL jobs: at some point, to do the export or the import, you need to go through the database, whether or not that is the optimal way to achieve what you want. The data is “closed” off from the rest of the architecture in this important sense.
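
As a rough illustration of that point, in the sketch below the only way to reach the data is through the engine’s own query interface. SQLite stands in here purely as a stand-in for any database engine, and the file and table names are made up.

```python
import sqlite3

# The data lives inside the database's own storage; nothing else reads it
# directly. Every access, even a bulk export for an ETL job, goes through
# the engine's query interface.
conn = sqlite3.connect("warehouse.db")  # hypothetical database file
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS orders (region TEXT, amount REAL)")
cur.execute("INSERT INTO orders VALUES ('emea', 1200.0), ('apac', 800.0)")
conn.commit()

# "Exporting" the data still means asking the engine for it.
rows = cur.execute("SELECT region, amount FROM orders").fetchall()
print(rows)
conn.close()
```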

In contrast, an “open data architecture” is one that stores the data in its own independent tier within the architecture, which allows different best-of-breed engines to be used for an organization’s variety of analytic needs. That’s important because there’s never been a silver bullet when it comes to analytic processing needs, and there likely never will be. An open architecture puts you in an ideal position to be able to use whatever best-of-breed services exist today or in the future.
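
The contrast can be sketched in a few lines, under the assumption that the shared tier is a set of Parquet files on storage every engine can reach; the article doesn’t prescribe a format or particular engines, so pandas and DuckDB appear here only as examples, and the file name is made up.

```python
import duckdb          # assumes the duckdb package is installed
import pandas as pd    # assumes pandas with pyarrow for Parquet support

# The data lives in its own independent tier: plain files that no single
# engine owns. Materialise a small example file for the demo.
pd.DataFrame(
    {"region": ["emea", "apac", "amer"], "amount": [1200.0, 800.0, 950.0]}
).to_parquet("orders.parquet")

# Engine 1: pandas reads the files directly for in-memory analysis.
df = pd.read_parquet("orders.parquet")
print(df["amount"].sum())

# Engine 2: DuckDB runs SQL over the very same files, with no import step.
print(
    duckdb.query(
        "SELECT region, SUM(amount) FROM 'orders.parquet' GROUP BY region"
    ).fetchall()
)
```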

Open, services-oriented data architecture

When applications moved from client-server to web, the fundamental architecture changed. We went from monolithic applications that ran in one process to services-oriented applications broken into smaller, more specialized software services. Eventually, these became known as “microservices”, and they remain the dominant design for web and mobile applications. The microservices approach held many advantages, and the nature of cloud infrastructure allowed them to be realized. In a scale-out system with on-demand resource models and numerous teams working on pieces of functionality, the “application” became nothing more than a facade for dozens or hundreds of microservices.

Everyone agrees that this approach has many advantages for building modular and scalable applications. For some reason, though, we’re expected to believe that the same paradigm isn’t nearly as effective for data. At Dremio, we believe that’s inaccurate: looking at our data in the same open, services-oriented manner as our applications is intuitively obvious and desirable.
