jboothomas
RAG and the challenge of raw to embedded dataset size
With the increase in enterprise RAG deployments, it is always interesting to see how things work under the covers. With that in mind, in this…
Jul 22

jboothomas
SQL query S3 objects with Apache drill
In this article I will cover how to set up and configure Apache Drill to run SQL queries against Parquet files stored in an S3 bucket.
May 10

jboothomas
Lakefs.io on kubernetes with PureStorage FlashBlade
According to its website, LakeFS brings software engineering best practices and applies them to data engineering. Let’s go over how to get up…
Mar 1

jboothomas
Clickhouse and Flashblade S3
In this quick how-to I will cover the steps to use Pure Storage FlashBlade S3 storage with a ClickHouse installation.
Aug 22, 2023

jboothomas
Trino S3 via hive-metastore integration
In this blog I will go over how to use S3 storage on a Pure Storage FlashBlade with Trino, the fast distributed SQL query engine for big…
Aug 10, 2023

jboothomas
Hive-metastore on K8S with S3 external table
In this blog I will cover how to set up Hive Metastore on K8S and then use external S3 datasets.
Aug 10, 2023

jboothomas
Dremio S3 and NFS integration
In this blog I will go over how you can use fast NFS and S3 from Pure Storage to power your Dremio K8S deployments.
Aug 10, 2023

jboothomas
Airbyte S3 connector on k8s
In this blog I will show a simple implementation of Airbyte on Kubernetes with S3 integration on Pure Storage FlashBlade.
Aug 8, 2023

jboothomas
Golang prometheus exporter for timestamped metrics
Posting this quick code snippet for those who may be looking to export metrics with a timestamp for scraping by Prometheus.
Feb 22, 2023