RAG and the challenge of raw to embedded dataset size. With the rise in enterprise RAG deployments, it is always interesting to see how things work under the covers. With that in mind, in this… Jul 22
SQL query S3 objects with Apache drill. In this article I will cover how to set up and configure Apache Drill to run SQL queries against Parquet files stored in an S3 bucket. May 10
Lakefs.io on kubernetes with PureStorage FlashBlade. According to its website, lakeFS takes software engineering best practices and applies them to data engineering. Let’s go over how to get up… Mar 1
Clickhouse and Flashblade S3. In this quick how-to I will cover the steps to use Pure Storage FlashBlade S3 storage with a ClickHouse installation. Aug 22, 2023
Trino S3 via hive-metastore integration. In this blog I will go over how to use S3 storage on a Pure Storage FlashBlade with Trino, the fast distributed SQL query engine for big… Aug 10, 2023
Hive-metastore on K8S with S3 external table. In this blog I will cover how to set up Hive Metastore on K8s and then use external S3 datasets. Aug 10, 2023
Dremio S3 and NFS integration. In this blog I will go over how you can use fast NFS and S3 storage from Pure Storage to power your Dremio deployments on K8s. Aug 10, 2023
Airbyte S3 connector on k8s. In this blog I will show a simple implementation of Airbyte on Kubernetes with S3 integration on Pure Storage FlashBlade. Aug 8, 2023
Golang prometheus exporter for timestamped metrics. I am posting this quick code snippet for anyone looking to export metrics with a timestamp for scraping by Prometheus. Feb 22, 2023