Blog posts tagged "spark"

Rob Gibbon
28 May 2026

Migrating from Apache Spark 3 to Spark 4

Data Platform Ubuntu tech blog

The purpose of this guide is to highlight the key differences between Apache Spark 3 and Spark 4, and provide advice on how to plan a migration. Let’s get started. The biggest changes: Let’s talk about the biggest changes between Apache Spark 3.x and Spark 4. Scala 2.12 no more: First up, there’s no support ...

Giulia Lanzafame
4 September 2025

Implement an enterprise-ready data lakehouse architecture with Spark and Kyuubi

Data Platform Ubuntu tech blog

Here at Canonical we are excited to announce that we have shipped the first release of our solution for enterprise-ready data lakehouses, built on the combination of Apache Spark and Apache Kyuubi. Using our Charmed Apache Kyuubi in integration with Spark, you can deliver a robust, production-level, and open source data lakehouse. Our Ap ...

Giulia Lanzafame
26 June 2025

Accelerating data science with Apache Spark and GPUs

Data Platform Ubuntu tech blog

Apache Spark has always been well known for distributing computation among multiple nodes with the help of partitions, and CPU cores have always performed processing within a single partition. What’s less widely known is that it is possible to accelerate Spark with GPUs. Harnessing this power in the right situation brings imm ...

Giulia Lanzafame
10 June 2025

Apache Spark security: start with a solid foundation

Data Platform Ubuntu tech blog

Everyone agrees security matters – yet when it comes to big data analytics with Apache Spark, it’s not just another checkbox. Spark’s open source Java architecture introduces special security concerns that, if neglected, can quietly reveal sensitive information and interrupt vital functions. Unlike standard software, Spark design permits ...

Giulia Lanzafame
10 December 2024

Spark or Hadoop: the best choice for big data teams?

Data Platform Ubuntu tech blog

I always find the Olympics to be an unusual experience. I’m hardly an athletics fanatic, yet I can’t help but get swept up in the spirit of the competition. When the Olympics took place in Paris last summer, I suddenly began rooting for my country in sports I barely knew existed. I would spend random ...

Rob Gibbon
15 October 2024

Apache Spark 4.0 beta release – try it now

Data Platform Ubuntu tech blog

Apache Spark is a popular framework for developing distributed, parallel data processing applications. Our solution for Apache Spark on Kubernetes has made significant progress in the past year since we launched, adding support for Apache Iceberg, a new GPU accelerated image using the NVIDIA Spark-RAPIDS plugin, and support for the Volcan ...

Rob Gibbon
15 July 2024

Deploying and scaling Apache Spark on Amazon AWS EKS

Data Platform Ubuntu tech blog

Move over Hadoop, it’s time for Spark on Kubernetes. Apache Spark, a framework for parallel distributed data processing, has become a popular choice for building streaming applications, data lake houses and big data extract-transform-load data processing (ETL). It is horizontally scalable, fault-tolerant, and performs well at high scale. H ...

Rob Gibbon
23 May 2024

Can it play Doom? Running an AI LAN party on a Spark cluster with ViZDoom

AI Ubuntu tech blog

It’s all about AI these days, so I decided to try and answer the important question: can you make a Spark cluster run AI agents that play a game of Doom, in a multiplayer LAN party? Although I’m no data scientist, I was able to get this to work and I’ll show you how so ...

Rob Gibbon
17 October 2023

Why we built a Spark solution for Kubernetes

Data Platform Ubuntu tech blog

We’re super excited to announce that we have shipped the first release of our solution for big data – Charmed Spark. Charmed Spark packages a supported distribution of Apache Spark and optimises it for deployment to Kubernetes, which is where most of the industry is moving these days. Reimagining how to work with big data ...

Hasmik Zmoyan
21 September 2023

Open source tooling at GITEX Global

AI Ubuntu tech blog

Innovate at speed with AI. Stay secure and compliant with Ubuntu Pro. Date: 16-20 October 2023. Location: Dubai, UAE. Booth: B31, Hall 26, DevSlam. Canonical is excited to attend GITEX Global 2023, the largest event in the Middle East. Generative AI, predictive analytics and multi-cloud environments are at the heart of a technological r ...

Rob Gibbon
10 August 2023

Write a Spark big data job with ChatGPT

AI Ubuntu tech blog

I’ve read and watched more than a few articles about ChatGPT in the last couple of months. It seems the large language model AI hype machine just can’t stop. As somebody with a passion for music production, some of the more interesting things I’ve seen included a guy using ChatGPT to build a virtual effect ...

