<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[group by 1]]></title><description><![CDATA[Thoughts on data systems]]></description><link>https://groupby1.mattarderne.com</link><image><url>https://substackcdn.com/image/fetch/$s_!F2FM!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53b442d1-8699-4eda-921e-af91f12ebab5_750x750.png</url><title>group by 1</title><link>https://groupby1.mattarderne.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 18 May 2026 04:43:09 GMT</lastBuildDate><atom:link href="https://groupby1.mattarderne.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Matt Arderne]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[groupby1@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[groupby1@substack.com]]></itunes:email><itunes:name><![CDATA[Matt Arderne]]></itunes:name></itunes:owner><itunes:author><![CDATA[Matt Arderne]]></itunes:author><googleplay:owner><![CDATA[groupby1@substack.com]]></googleplay:owner><googleplay:email><![CDATA[groupby1@substack.com]]></googleplay:email><googleplay:author><![CDATA[Matt Arderne]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[A Modern Data Benchmark]]></title><description><![CDATA[A comparison of dbt (SQL) and Drizzle (TS) as an infra choice for Data Analysis]]></description><link>https://groupby1.mattarderne.com/p/a-modern-data-benchmark</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/a-modern-data-benchmark</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Mon, 09 Feb 2026 18:56:37 
GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!If1s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is a comparison of dbt (SQL) and Drizzle (TS) as an infra choice for Data Analysis. The findings seem to confirm my inkling that dbt might be more human coded than Coding Agent coded... I&#8217;m interested in hearing thoughts, as this is the first poke at this idea.<br><br>Way back I wrote a </em><a href="https://groupby1.mattarderne.com/">few blogs</a><em> about the Modern Data Stack, this is the first look back into the space (and it was a brief look) since I stopped that and started a startup.</em></p><p><em>If you have any ideas for improving this investigation, I&#8217;m all ears! </em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://groupby1.mattarderne.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading group by 1! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!If1s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!If1s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 424w, https://substackcdn.com/image/fetch/$s_!If1s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 848w, https://substackcdn.com/image/fetch/$s_!If1s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!If1s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!If1s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg" 
width="1456" height="970" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:970,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5238060,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://groupby1.mattarderne.com/i/187428984?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!If1s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 424w, https://substackcdn.com/image/fetch/$s_!If1s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 848w, https://substackcdn.com/image/fetch/$s_!If1s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!If1s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcde2203a-945a-451d-bb30-27d280f83f78_6720x4476.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>----------------------------</em></p><p>Since reading the <a href="https://openai.com/index/inside-our-in-house-data-agent/">OpenAI data stack post</a>, I&#8217;ve <a href="https://x.com/mattarderne/status/2017517042484568538">suspected</a> that dbt/SQL might get in the way of LLMs when looking at the data stack more holistically. </p><p>By data stack, I&#8217;m talking Modern Data Stack: ETL core app data into Snowflake/BigQuery, load other API data like Stripe in as well, do SQL joins to get answers (if unfamiliar then this post <em>might </em>not be that clear).</p><p>All the SQL + metadata might just be more human useful than LLM useful. </p><p>Or the system might have been better designed for when we didn&#8217;t have Coding Agents. 
</p><p>I&#8217;ve wondered about this a bit:</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/mattarderne/status/1897279053784383925&quot;,&quot;full_text&quot;:&quot;in 2025, what option does an LLM have but to do the data modelling in dbt?\n\nsomewhat serious question&quot;,&quot;username&quot;:&quot;mattarderne&quot;,&quot;name&quot;:&quot;Matt Arderne &#127754;&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1784251602750103552/U1WgaMkf_normal.jpg&quot;,&quot;date&quot;:&quot;2025-03-05T13:32:30.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{&quot;full_text&quot;:&quot;in essence, what other option do startups have but to rely on data modelling done in dbt?&quot;,&quot;username&quot;:&quot;mattarderne&quot;,&quot;name&quot;:&quot;Matt Arderne &#127754;&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1784251602750103552/U1WgaMkf_normal.jpg&quot;},&quot;reply_count&quot;:2,&quot;retweet_count&quot;:0,&quot;like_count&quot;:8,&quot;impression_count&quot;:1097,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>On a related note to dbt, a strongly coupled feeling, I&#8217;ve <a href="https://x.com/mattarderne/status/1891898809132650654">felt</a> that your main app&#8217;s code is underutilized as a source of meaning and structure:</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/mattarderne/status/1569674410675535874&quot;,&quot;full_text&quot;:&quot;What % of the data stack need would be entirely negated if data modelling was better applied at the application layer?\n\n*exclude multi-source centralising \n\n<a class=\&quot;tweet-url\&quot; 
href=\&quot;https://blog.codecentric.de/en/2017/07/agile-database-design-using-anchor-modeling/\&quot;>blog.codecentric.de/en/2017/07/agi&#8230;</a>&quot;,&quot;username&quot;:&quot;mattarderne&quot;,&quot;name&quot;:&quot;Matt Arderne &#127754;&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1784251602750103552/U1WgaMkf_normal.jpg&quot;,&quot;date&quot;:&quot;2022-09-13T13:08:37.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:1,&quot;retweet_count&quot;:0,&quot;like_count&quot;:7,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>Then I see this point from Open AI, and I&#8217;m like, YES!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gTFx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gTFx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 424w, https://substackcdn.com/image/fetch/$s_!gTFx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 848w, https://substackcdn.com/image/fetch/$s_!gTFx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 1272w, 
https://substackcdn.com/image/fetch/$s_!gTFx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gTFx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png" width="856" height="974" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:974,&quot;width&quot;:856,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:469899,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!gTFx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 424w, https://substackcdn.com/image/fetch/$s_!gTFx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 848w, https://substackcdn.com/image/fetch/$s_!gTFx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 1272w, 
https://substackcdn.com/image/fetch/$s_!gTFx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e4561bc-7658-4979-bf46-232fe5fe5399_856x974.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://openai.com/index/inside-our-in-house-data-agent/#:~:text=Lesson%20%233%3A%20Meaning%20Lives%20in%20Code">https://openai.com/index/inside-our-in-house-data-agent/#:~:text=Lesson%20%233%3A%20Meaning%20Lives%20in%20Code</a></figcaption></figure></div><p>I&#8217;ve felt this acutely. 
Maintaining SQL files in dbt means <strong>so much surface area, and so little logic!</strong></p><p>But reading the OpenAI post, I also see they are <a href="https://openai.com/index/inside-our-in-house-data-agent/#:~:text=Even%20with%20the,the%20right%20columns.">still running most of their analytics logic in</a> SQL!?<br><br>I really struggled to believe that OpenAI&#8217;s data stack, at the fastest-growing, best-funded supercompany in recent memory, <strong>is doing exactly what I would do.</strong></p><p>They are running the Covid-era MDS at the core of all the other stuff. </p><p>Same way, same tools, driven by the same FinOps need. </p><p>I don&#8217;t expect SQL to go anywhere; that is not what I&#8217;m getting at. I also don&#8217;t think dbt should necessarily go anywhere. <a href="https://x.com/mattarderne/status/1717204732882698378">Standards</a> are set in times of disruption, and if dbt is the analytics standard then so be it. </p><p>But I wanted to scratch the itch. So I built a benchmark.</p><p>I present <em>The Modern Data Benchmark (or MDS Gym?). </em></p><p>It is a small experiment comparing how LLM agents perform across different data architectures, given the same data and the same questions.</p><p><strong>The pro-forma result: </strong></p><p>Across 7 LLMs, warehouse+dbt had a 5% pass rate. App-unified architectures had 38-48%. </p><p>*The numbers in this post are directionally correct, like the MDS.</p><h2><strong>A quick history of how we got here</strong></h2><p>Before we dive into it, some context. </p><p>The Modern Data Stack came out of a specific organizational need. </p><ol><li><p>Marketing needs to track ad spend. </p></li><li><p>Finance needs to reconcile Stripe revenue. </p></li></ol><p>These are important but non-core functions, so they get staffed by analyst-operator types who can usually write SQL but not Python. 
Add Redshift/Snowflake and all roads lead to SQL.</p><p>dbt emerged to give those SQL queries just enough software engineering discipline (version control, modularity, templating) without forcing anyone to leave SQL. It was a rational solution to a real constraint: <strong>these teams could not write very good code, and didn&#8217;t have a place to write it.</strong></p><p>The gravity of that constraint pulled everything towards raw SQL against a data warehouse. No types, no abstractions, nothing but SQL. dbt came along to solve the obvious shortcomings (shitshow) of lots of SQL. Version control + Jinja &#129327;. And it was mind-blowing: I had worked in enterprise data before, and dbt was wild!</p><p><strong>OpenAI&#8217;s data warehouse confirms the pattern</strong> </p><p>Looking at OpenAI&#8217;s data warehouse surprised me. On the surface it looked sophisticated: context layers, embeddings, the works. But at the core, two things stood out.</p><p><strong>First, it was driven by FinOps. </strong>The example given was revenue reconciliation from Stripe. Even at frontier AI companies, FinOps drives the data warehouse.</p><p><strong>Second, it uses SQL (and I guess dbt). </strong>The team had likely used dbt before, so they reached for it again. This isn&#8217;t a criticism; it&#8217;s how standards form: not by systematic evaluation, but by repetition.</p><h2><strong>The question worth asking</strong></h2><p>dbt&#8217;s value was making SQL manageable for human analysts. But at scale there are now tens of thousands of lines of SQL that no human is ever going to read. The SQL is increasingly being consumed by agents.</p><p>If the consumer is an agent, &#8220;easiest for humans to read&#8221; stops being that important. 
</p><p>The relevant question becomes:</p><p>For an agent answering business questions, which representation of the system is the most legible, robust, and correct?</p><p>It all felt rather benchmarkable.</p><p><strong>The split-brain problem </strong></p><p>The modern data stack creates a structural split. Your app has your core business logic: users, statuses, transactions. Separately, you have a data warehouse that holds a lagging copy of that data, <em>plus</em> third-party data like Stripe that only exists in the warehouse.</p><p>To answer anything useful, you need to join app data onto Stripe data inside the warehouse, using SQL, with constrained logic. The &#8220;single source of truth&#8221; in the warehouse is never truly trustworthy. Your actual source of truth is the production database and someone else&#8217;s API, and the warehouse is always behind.</p><p>The crux: what if you brought the data closer to home? Shift left? Strongly typed? Asked Codex 5.3 to spar with Opus 4.6 on turbo mode? What would they do? </p><p>I guess they&#8217;d load Stripe data into a structure your app understands, with proper types and constraints, and they&#8217;d run analytics against the unified codebase..?</p><p><strong>If revenue is a key business capability, and we&#8217;re no longer as code-constrained as we were, why not model it as a first-class concept?</strong></p><p>(this has a million small holes, but stay with me)</p><h2><strong>The benchmark</strong></h2><p>Three sandbox environments, same data, same three analytical tasks: <strong>ARPU</strong>, <strong>churn rate</strong>, and <strong>LTV</strong>. </p><p>Small data, known correct answers. Size isn&#8217;t the test. 
We are looking at architecture.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8cXt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8cXt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 424w, https://substackcdn.com/image/fetch/$s_!8cXt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 848w, https://substackcdn.com/image/fetch/$s_!8cXt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 1272w, https://substackcdn.com/image/fetch/$s_!8cXt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8cXt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png" width="836" height="186" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:186,&quot;width&quot;:836,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!8cXt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 424w, https://substackcdn.com/image/fetch/$s_!8cXt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 848w, https://substackcdn.com/image/fetch/$s_!8cXt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 1272w, https://substackcdn.com/image/fetch/$s_!8cXt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F316d4130-98ee-408b-ae03-1e922a7c34fb_836x186.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>In the app sandboxes, Stripe data is represented as internal data with types. </p><blockquote><p><em>App Tables + Stripe Tables &#8594; Single typed context &#8594; Code &#8594; Metric</em></p></blockquote><p>In the dbt sandbox, it follows the traditional pattern: third-party data loaded and joined via SQL. 
</p><blockquote><p><em>App DB + Stripe &#8594; Replication &#8594; DuckDB raw tables &#8594; Staging SQL &#8594; Marts SQL &#8594; Metric</em></p></blockquote><p>The model must discover the schema and produce executable code that returns the correct number. No hints, no hand-holding.</p><p>The key difference: in app architectures, the model has one typed context. In the warehouse, it must navigate staging models, column naming conventions, and SQL casting to arrive at the same answer.</p><p><strong>How the evaluation works</strong> </p><p>Each run works like this:</p><ol><li><p><strong>Fresh sandbox.</strong> A clean copy of the sandbox template is created with the synthetic data loaded. No prior work carries over between tasks.</p></li><li><p><strong>Agent loop.</strong> The model gets a system prompt describing the architecture and four tools: read_file, write_file, list_files, and done. It has up to 10 turns (API round-trips) to explore the codebase, discover the schema, write its solution, and signal completion. Temperature is set to 0.</p></li><li><p><strong>No hints.</strong> The model is told <em>what</em> to compute (e.g., &#8220;ARPU for active users&#8221;) and given the function signature, but not <em>how</em>. It must figure out join keys (users.stripe_customer_id &#8594; invoices.customer_id), column names, and time anchoring on its own by reading files.</p></li><li><p><em><strong>Execution</strong> <strong>app-typed</strong>: the TypeScript function is imported and called with the data arrays. </em></p></li><li><p><em><strong>Execution app-drizzle</strong>: the async function runs against a pre-loaded SQLite database via Drizzle ORM. 
</em></p></li><li><p><em><strong>Execution warehouse-dbt</strong>: the SQL is executed in DuckDB with raw tables created from JSON (staging/mart views are built from any SQL files the model wrote)</em></p></li><li><p><strong>Scoring.</strong> Pass/fail is purely numeric: does the output match the expected value within tolerance? (&#177;1 for integers like ARPU/LTV, &#177;0.001 for rates like churn). No partial credit, no style points. If the code crashes, it&#8217;s a fail. If it returns the wrong number, it&#8217;s a fail.</p></li><li><p><strong>Flexible matching.</strong> The validator accepts naming variations (calculateARPU, computeArpu, getArpu, etc.) and searches multiple directories for SQL files, so models aren&#8217;t penalized for reasonable naming choices.</p></li></ol><p>Expected values are computed from the same data by a reference implementation in the benchmark harness itself, not hand-coded, so they&#8217;re guaranteed consistent.</p><h2><strong>THE RESULTS</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X_8h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X_8h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 424w, https://substackcdn.com/image/fetch/$s_!X_8h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!X_8h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!X_8h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X_8h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg" width="900" height="623" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:623,&quot;width&quot;:900,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!X_8h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 424w, https://substackcdn.com/image/fetch/$s_!X_8h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!X_8h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!X_8h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62348d41-3465-4217-9374-f327fa74d2ea_900x623.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Z2Sg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 424w, https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 848w, https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 1272w, https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png" width="610" height="317" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec9b434b-be6c-4824-892d-3424074b4615_610x317.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:317,&quot;width&quot;:610,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 424w, https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 848w, https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 1272w, https://substackcdn.com/image/fetch/$s_!Z2Sg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec9b434b-be6c-4824-892d-3424074b4615_610x317.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The ORM sandbox was the clear winner: Opus got a perfect 3/3 every single run, and cheaper models like Kimi and Grok matched it.</p><p>The warehouse-dbt column is almost entirely zeros. Only Opus managed a single pass, and even that was inconsistent across runs.</p><p>Looking closer at variance on multi-run stability (n=5 per Anthropic model)</p><ul><li><p><strong>Opus</strong>: Zero variance.</p></li><li><p><strong>Sonnet</strong>: High variance (~0.9 std).</p></li><li><p><strong>Haiku</strong>: Fluctuates between 0-1 passes.</p></li></ul><p>Mid-tier models are at the edge, and the architecture is what pushes them over or pulls them back.</p><p><strong>Where dbt struggles</strong></p><p>The failure modes:</p><ul><li><p><strong>Schema mismatch</strong>: wrong column names (created_at vs usage_created_at, org_id vs organization_id). 
Staging conventions that the model has to guess.</p></li><li><p><strong>Type mismatch</strong>: interval math on VARCHAR timestamps without casting.</p></li><li><p><strong>File naming</strong>: incorrect output filenames or failure to write the metric model.</p></li></ul><p><strong>Even thorough models struggle with dbt</strong></p><p>Here is the number of unique files read per task in warehouse-dbt:</p><ul><li><p><strong>Haiku</strong>: 1.1 files (barely looks at the schema)</p></li><li><p><strong>Sonnet</strong>: 3.7 files (reads staging files but still gets column names wrong)</p></li><li><p><strong>Opus</strong>: 4.4 files (reads everything, still only 1/3 pass rate)</p></li></ul><p>Even when the model does its homework, the warehouse architecture introduces enough indirection to trip it up. It&#8217;s not a laziness problem; the representation has too many seams.</p><p>In the <strong>app sandboxes</strong>, failures were simpler (wrong join key, missing function) and more recoverable in typed code.</p><h2><strong>What I take from this</strong></h2><p>This benchmark tests a narrow but important thing: can an agent read a schema, write executable logic, and return the correct metric? It doesn&#8217;t test full dbt workflows with Jinja, ref/source, or materializations. It tests the core analytical task that everything else is built to support.</p><p>A few observations (not conclusions; this is early and the sample is small):</p><ol><li><p><strong>The architecture matters more than the model.</strong> The same model that fails at dbt can succeed in a typed environment.</p></li><li><p><strong>ORMs are surprisingly agent-friendly.</strong> Drizzle over SQLite was the strongest sandbox; even mid-tier models could navigate it. 
Typed schema + query builder + unified context seems to hit a sweet spot.</p></li><li><p><strong>Indirection has a cost that compounds.</strong> Each layer of staging, naming convention, and type casting is a place where an agent can silently go wrong. Types and co-location seem to reduce that surface area.</p></li></ol><h2><strong>What&#8217;s next</strong></h2><p>The current tasks (ARPU, churn, LTV) are intentionally simple, canonical SaaS metrics on synthetic data. The architecture signal is clear, but the questions need to get harder to be convincing.</p><p>A few directions:</p><ol><li><p><strong>What is a &#8220;fair&#8221; dbt project?</strong> After the initial results I started adding hints to the dbt sandbox to get some passing runs: things like cast annotations in staging models so the model doesn&#8217;t trip on DuckDB timestamp arithmetic. I added <code>CAST(created_at AS TIMESTAMP) AS usage_created_at</code> in the staging layer because the model kept tripping on that. It sort of helped: adding a single cast hint let Sonnet pass org_churn_rate where it previously crashed on a runtime error (<a href="https://github.com/mattarderne/modern-data-benchmarks/blob/main/architecture-compare/artifacts/reports/warehouse-dbt-documented-experiment-2026-02-09.md">details</a>), but it wasn&#8217;t consistently helpful. It also felt like a slippery slope. How much documentation and scaffolding do you add before the dbt sandbox stops being representative of what an AI agent would set up? Remember, this was all set up by Codex 5.2; I didn&#8217;t touch a thing! A real dbt project lives somewhere on this spectrum, and where exactly is an open question. But maybe that&#8217;s the point. You can keep adding scaffolding to SQL (cast hints, schema docs, Jinja templating, ref() pointers, semantic YAML), and each one closes a small piece of the gap. 
At some point you have to ask: is the SQL architecture, with all the scaffolding you need to make it work for agents, converging on the thing you&#8217;d build if you just started with types and a unified codebase?</p></li><li><p><strong>Realistic drift.</strong> The current benchmark is static: the data is clean and complete. Real analytics is messier: late-arriving Stripe invoices, missing stripe_customer_id mappings, schema changes mid-pipeline. Adding sync delay scenarios would test whether the split-brain problem is as bad in practice as it is in theory.</p></li><li><p><strong>Linting as agent feedback.</strong> Early experiments with TypeScript typecheck + SQLFluff showed that giving agents lint feedback and extra fix attempts improved ORM scores more than dbt scores, but the improvement might just be from extra turns, not the lint signal. SQLFluff style rules seem to add noise that distracts smaller models. A schema-only mode that only surfaces missing tables/columns could be a cleaner signal. I tried a few things there, but nothing clear emerged.</p></li><li><p><strong>Information parity.</strong> A typed codebase inherently carries more structural information (types, constraints, relationships) than raw SQL with YAML docs. You could argue that&#8217;s confounding. I guess so? But that&#8217;s also the point: the architecture <em>is</em> the information density. Still, enriching the dbt sandbox with comprehensive YAML schema docs would test how much of the gap is &#8220;types help&#8221; vs &#8220;unified context helps.&#8221;</p></li><li><p><strong>Turn-level analysis.</strong> Currently I&#8217;m only tracking file-read counts. 
Understanding the step-by-step reasoning (where models go wrong, when they recover) would give sharper insight into why architecture matters.</p></li></ol><p>Other interesting things:</p><ul><li><p><strong>Semantic layer sandbox (</strong><a href="https://github.com/cliftonc/drizzle-cube">Drizzle-Cube</a><strong>).</strong> The baseline benchmark included Drizzle-Cube but the architecture benchmark didn&#8217;t. Adding it would test whether pre-defined measures help or constrain agents. I&#8217;m hopeful this pushes Drizzle well beyond comparison!</p></li><li><p><strong>Does the &#8220;context layer&#8221; exist?</strong> There&#8217;s a lot of hand-waving right now about &#8220;context layers&#8221; and &#8220;context graphs.&#8221; When you boil these down, they often look like a data warehouse or semantic layer in new language. My position: <strong>if you can&#8217;t demonstrate a simple instance of a complex idea, it doesn&#8217;t meaningfully exist.</strong> Next step is to build sandboxes for the best-case context graph blogs and run them through the same benchmark.</p></li><li><p><strong>Harder queries.</strong> <a href="https://github.com/matsonj/bird-bench">@matsonj&#8217;s platinum set</a> from the BIRD text-to-SQL benchmark, covering complex joins, CTEs, NULL handling, and conditional aggregation. Also <a href="https://github.com/mitdbg/Kramabench">KramaBench</a>, which tests full data pipelines, not just single queries, and the benchmarks referenced by <a href="https://www.sphinx.ai/blog/sphinx-1-0-re-inventing-ai-for-data-science/">Sphinx</a>, including DABStep (real Adyen payments data). If agents struggle with simple ARPU, what happens with real analytical complexity?</p></li><li><p><strong>Costs.</strong> I tracked the costs, and the results were somewhat interesting: Opus often seemed to be cheaper because it took fewer laps to get the answer. 
I need to benchmark this more carefully (<a href="https://github.com/mattarderne/modern-data-benchmarks/blob/main/architecture-compare/artifacts/benchmark_cost_curve.png">Pareto performance cost curve</a>).</p></li></ul><h2><strong>A request</strong></h2><p>I&#8217;m no longer that close to dbt. Things may have moved on. If you&#8217;re actively working in dbt and you <a href="https://github.com/mattarderne/modern-data-benchmarks/tree/main/architecture-compare/sandboxes/warehouse-dbt">look at the warehouse sandbox</a> and think &#8220;that&#8217;s not how we&#8217;d set it up,&#8221; I genuinely want to hear that. Is this a realistic task? Is this a fair test? The benchmark is <a href="https://github.com/mattarderne/modern-data-benchmarks/tree/main/architecture-compare">open source</a>, so you can set up the dbt sandbox the way you think it should be and run the same evaluation. If a well-configured dbt project closes the gap, that&#8217;s a finding worth publishing too.</p><p><strong>Caveats</strong>: </p><ol><li><p>Small n, synthetic data, single-pass runs for some models, no full dbt compilation. This is directional, not definitive. Run it yourself, add harder tasks, prove me wrong.</p></li><li><p>Claude wrote this out from a voice note I recorded.</p></li></ol><p><em><a href="https://github.com/mattarderne/modern-data-benchmarks/tree/main/architecture-compare">Link to benchmark repo</a> &#183; <a href="https://openai.com/index/inside-our-in-house-data-agent/">Link to OpenAI data stack post</a></em></p>
]]></content:encoded></item><item><title><![CDATA[The Way of Ways]]></title><description><![CDATA[The tale of MDS - the Modern Data Stack]]></description><link>https://groupby1.mattarderne.com/p/the-way-of-ways</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/the-way-of-ways</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Fri, 18 Aug 2023 13:41:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>The beauty of ideas is that they cannot die. That said, many consider the Modern Data Stack <a href="https://twitter.com/matsonj/status/1691245983911567360">to have developed a bit of a rot</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, so I thought I would pull out the eulogy I've had in the back of my mind for a while. It is a Tour de Links that follows the journey of <strong>the Modern Data Stack</strong>.</em> </p><p><em>The analysis tracks three frameworks for mapping the cyclical nature of cultural phenomena (<a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures">one</a>, <a href="https://twitter.com/johncutlefish/status/1616539104669470720">two</a>, <a href="https://meaningness.com/geeks-mops-sociopaths">three</a>). These don&#8217;t map perfectly, and number three doesn&#8217;t fit the enterprise context. 
The idea: Something new catches on, it grows big, it loses its way, there is a fight / collapse / phase-change and then it stabilises. The subheadings are from <a href="https://twitter.com/johncutlefish/status/1616539104669470720">John Cutler</a>:</em></p><blockquote><p>Anything helpful will eventually become commodified, industrialized, and watered down to the point of being unrecognizable. It happened with Agile, and is happening with strands of product management, DevOps, design, etc.</p></blockquote><h1>Phase 1: Precycle</h1><div class="pullquote"><p>People start a movement around a weird thing, with no hope of payoff, <br><strong><a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures">for sheer love of the thing.</a></strong><a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures"> </a></p></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1W5v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1W5v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 424w, https://substackcdn.com/image/fetch/$s_!1W5v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 848w, https://substackcdn.com/image/fetch/$s_!1W5v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1W5v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1W5v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png" width="704" height="400" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:400,&quot;width&quot;:704,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:491375,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1W5v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 424w, https://substackcdn.com/image/fetch/$s_!1W5v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 848w, https://substackcdn.com/image/fetch/$s_!1W5v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 1272w, 
https://substackcdn.com/image/fetch/$s_!1W5v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">the early days camping out in the woods comparing SQL style guides</figcaption></figure></div><h4>MDS EMERGES FROM PRACTICE</h4><p>People like solving problems that resonate with other people, open source vibes, this was the early MDS scene. 
</p><p>Before MDS, there were enterprise data vendors, and mostly they weren&#8217;t very useful for fast-moving companies, in the sense that you could only afford them if you budgeted in the $xx millions and you planned in half decades. Redshift precipitated a change and Snowflake accelerated it. Pay-per-query.</p><p>Redshift and Snowflake laid the soil that led to the Modern Data Stack. They created the easiest-to-buy large-scale data storage that anyone had ever seen. The problem now became keeping track of the transformations necessary to deal with the <strong><a href="https://twitter.com/mattarderne/status/1684462316278910976">vast data recycling centres</a></strong> that smaller and smaller teams could suddenly afford to create.</p><p><a href="https://github.com/dbt-labs/dbt-core/tree/549282110f393a22c6331ba828a4895bdee9c26e">dbt</a>, a relatively generic idea (<a href="https://news.ycombinator.com/item?id=12862474">SQL templating</a> in Python), became the foundation for a new movement to deal with the issue: <a href="https://medium.com/fishtown-analytics/the-missing-layers-of-the-analytics-stack-af420e6214bd">right place, people, audience, right time</a>.</p><p>I was instantly hooked. dbt took the lead among <a href="https://medium.com/@jthandy/the-modern-data-platform-is-too-big-to-fit-on-one-slide-377b9d28d01e">an array of relatively fragmented solutions</a> all jockeying around the task at hand.</p><h4>MDS NAME COINED / FIRST BLOGS</h4><p>A brief Google search suggests dbt first used the term &#8220;Modern Data Stack&#8221; on <a href="https://www.getdbt.com/blog/how-do-you-decide-what-to-model-in-dbt-vs-lookml/#:~:text=The%20modern%20data%20stack%20is%20modular">January 29, 2018</a> (I couldn&#8217;t be bothered to dig any deeper; history in the form of a sea-shanty with a broken banjo).</p><p>The concept caught on. I was working as a Data Engineer doing Redshift transformations, and dbt achieved this in a far better way. 
Before that, doing this stuff was super expensive, testing and version control were tricky, and everything was slow.</p><p>Blogs came out describing the way. Here I generously <a href="https://groupby1.substack.com/p/dataform-and-dbt#:~:text=They%20generously%20shared,startup%20data%20analytics">quote myself</a> describing the series of blog posts that introduced the new way:</p><blockquote><p>We, the desperate, listened closely. The message I heard: <em>bring the best of software development to startup data analytics</em></p></blockquote><p>I was likely referring to this <a href="https://medium.com/fishtown-analytics/the-missing-layers-of-the-analytics-stack-af420e6214bd">blog post</a>. It all sounds rather exciting in retrospect, and it was!</p><h4>MDS MINDSET / MANIFESTO / PRINCIPLES / EVANGELISTS</h4><p>The whole thing coalesced (yup) around the <strong>Community</strong>. The dbt Slack and the Locally Optimistic Slack were the <a href="https://perell.com/fellowship/conjuring-scenius/#:~:text=When%20those%20inside%20the%20cutting%2Dedge%20scenes%20band%20together%20to%20support%2C%20teach%2C%20and%20create%20with%20each%20other%2C%20their%20niche%20and%20experimental%20projects%20can%20become%20the%20new%20normal%20on%20top%20of%20which%20the%20next%20generation%20builds">absolute centre</a>(s) of the data world for a fair amount of time.</p><blockquote><p>When those inside the cutting-edge scenes band together to support, teach, and create with each other, their niche and experimental projects can become the new normal on top of which the next generation builds.</p></blockquote><p>The <a href="https://docs.getdbt.com/community/resources/community-rules-of-the-road">vendor guidelines</a> kept things civil in the dbt chat, and Locally Optimistic was generally smaller and less frantic. 
dbt was often the target of staffing firms that would point their junior devs at the dbt Slack and say <em>&#8220;here is your technical support&#8221;</em>, after which the juniors would paste 200-line error logs into the main chat, say <strong>&#8220;what do??&#8221;</strong> and then disappear.</p><p>I think a real kicker was the <strong>Open Source</strong> hook. Everyone wants to work with open source software. Now <a href="https://medium.com/fishtown-analytics/its-time-for-open-source-analytics-194902ae5c5">data people could do that</a>!</p><p>There were reams of <strong>Guides</strong>; <a href="https://groupby1.substack.com/p/data-as-a-utility-tool">here is mine</a> (filler content for <a href="https://dataform.co/blog/data-tools">Dataform</a>, since acquired by Google). The writing wasn&#8217;t very good in hindsight, but I stand by my suggestions: <em>move fast with simple tools</em>. Here is another <a href="https://locallyoptimistic.com/post/one-size-fits-none/">good guide</a>. These ideas were evaluated on some measure or other of believability, and the good ones were amplified.</p><p>Alongside this, <strong>Validation</strong> that the <a href="https://www.getdbt.com/blog/analytics-is-a-trade/">data trade was noble</a> fanned the flames.</p><p>dbt ran their <a href="https://www.getdbt.com/blog/coalesce-2020/">first conference in 2020</a>. <a href="https://www.getdbt.com/coalesce-2021/keynote-how-big-is-this-wave/">In 2021 they ran their second</a>, and this time it got big.</p><h1>Phase 2: Growth</h1><div class="pullquote"><p>Because it&#8217;s so new, there is a vast frontier, waiting to be explored. Anyone willing to work hard can go to some virgin tract of ideaspace and start mining it for status. 
<a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures">The returns on talent are high.</a></p></div><h4>MDS BIG WINS</h4><p>The Snowflake IPO in <a href="https://edition.cnn.com/2020/09/16/investing/snowflake-ipo/index.html">late 2020</a> set things in motion; the biggest software IPO ever was just the beginning of the end for MDS. dbt raised a <a href="https://www.crunchbase.com/organization/dbt-labs/company_financials">ton of money</a>, and so did Fivetran and most of the others.</p><h4>EXPLOSION OF MDS PATTERNS / BEST PRACTICES</h4><p><a href="https://locallyoptimistic.com/post/category/tools/">Locally Optimistic</a> was always and remains an ardent supporter of delivering relatively hype-free and insightful best practices. Their community remains well moderated and insightful, with very friendly non-vendor participants. It remains a top place to visit.</p><p><a href="https://clrcrl.com/2021/03/03/how-to-build-a-community-why">Community best practices</a> and advice about <a href="https://clrcrl.com/2022/05/06/mds-company-slack">starting a company community slack</a> started to pop up. If you were VC-backed, you needed 1000+ people in your Slack; otherwise, how would you get any product feedback? This worked well in my opinion, despite the snark. If I am legitimately interested in your product I&#8217;d love to speak to the engineer building it. Companies started adding the Slack activity into their go-to-market strategy, ELT&#8217;ing my activity into their CDP for better PLG or whatever. 
</p><h4>SMALL MDS CONSULTANCIES</h4><p>An acquaintance (or quite a few actually) went full throttle on consulting and quickly employed 100s of people and made a ton of revenue with massive companies.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>dbt has a huge <a href="https://partners.getdbt.com/english/directory/search?f0=Partner+Type&amp;f0v0=Consulting+Partner&amp;f1=Partner+Tier&amp;f1v0=Premier+Partner">list of vendors</a>. This was the most reliable way to make good money in the Modern Data Stack ecosystem, but the allure of building products pulled lots of people into building data tools instead.</p><h4>YOU ARE NOT DOING MDS / CERTIFICATIONS TITLES</h4><p>The need to teach analysts how to be <em><strong>Analytics Engineers</strong></em> was pretty clear: they needed Git, Python, Jinja, data models, star schemas, denormalization, Kimball.</p><p>A few courses came out teaching this stuff; the first was the <a href="https://analyticsengineers.club/course-overview/">Analytics Engineers Club</a>. I think the <strong>data modelling skills</strong> described are super useful, but they generally fell into a gap: Data Engineers didn&#8217;t do them and Analysts didn&#8217;t do them, so they were just ignored.</p><p>This wasn&#8217;t explicitly gatekeeping, more just demand being met by an increase in supply. I was firmly of the opinion that hiring ops people was beneficial in this space, as they could easily grasp the tech and generally had a better sense for the problems that were worth addressing. 
</p><h4>MDS VENDORS FUND CONFERENCES</h4><p>There were a few <a href="https://www.moderndatastack.xyz/summit">smaller conferences</a> other than Coalesce (the conference) and the Snowflake summits; <a href="https://www.datacouncil.ai/">Data Council</a> seemed to be the most consistently worthwhile.</p><p>In 2021 the dbt <a href="https://benn.substack.com/p/delirium#:~:text=This%20cultural%20energy%20reaches%20its%20crescendo%20during%20Coalesce%2C%20dbt%20Labs%E2%80%99%20annual%20conference">Coalesce conference</a> peaked:</p><blockquote><p>This cultural energy reaches its crescendo during <a href="https://coalesce.getdbt.com/">Coalesce</a>, dbt Labs&#8217; annual conference. The conference ostensibly takes place over a number of live-streamed talks, but its beating heart is on Slack. Every talk inspires a tidal wave of excitement, encouragement, and general good cheer. Every channel is the parents&#8217; section at track meet: Ready to erupt when their kid crosses the finish line, and equally ready to hop the fence and pick them up if they fall.&nbsp;</p></blockquote><h4>MDS ECOSYSTEM INFOGRAPHICS</h4><p>The vast ecosystem now needed a map. <a href="https://www.moderndatastack.xyz/community">ModernDataStack.XYZ</a> maps out the components and, I think, does an adequate job of categorising them.</p><p>Here is <a href="https://www.indicative.com/resource/modern-data-infrastructure/">another map</a>, whose listed criteria give you an idea of what qualifies. <a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><h4>MDS VENDOR COMPETITION HEATS UP!</h4><p>Once the ball started rolling, there was a lot of money being poured into the space. New BI tools, new ELT, data quality, data reliability, observability, metrics, semantics: all of it started being picked over.</p><p>I wrote a blog comparing dbt and Dataform. Dataform was acquired as mentioned, and dbt took centre stage, centre diagram. 
</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;1742052e-9a70-40e9-b8ed-49cd9c32e113&quot;,&quot;caption&quot;:&quot;Welcome to my third post, one I have wanted to write from the beginning. Getting these posts done isn&#8217;t easy, and the time between publishing is a commitment that I undertook rather lightly. Like most good ideas, this one is late, irrelevant, and likely only to be marginally useful. That said, here is a quick rundown on two of the &#8220;indicative-of-the-fut&#8230;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Dataform and dbt&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:10635483,&quot;name&quot;:&quot;Matt Arderne&quot;,&quot;bio&quot;:&quot;I write about data systems that improve business productivity.\n\nBuilding something new with friends\n\nTweet at https://twitter.com/mattarderne&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ff2ad675-4846-49d6-a397-599b7dd13538_1291x1104.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2021-06-21T14:10:04.539Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://groupby1.substack.com/p/dataform-and-dbt&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:37844886,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:4,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;group by 
1&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7fbee6-d181-479e-bd71-c4704b2b4c80_1216x1216.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Two big themes were <a href="https://mattturck.com/mad2023-part-iii/#:~:text=trend%20is%20noteworthy.-,Reverse%20ETL%20vs%20CDP,-Another%20somewhat%2Din">Reverse ETL and CDP</a>:</p><blockquote><p>Another somewhat-in-the-weeds, but fun to watch part of the landscape has been the tension between Reverse ETL (again, the process of taking data out of the warehouse and putting it back into SaaS and other applications) and Customer Data Platforms (products that aggregate customer data from multiple sources, run analytics on them like segmentation, and enable actions like marketing campaigns).&nbsp;</p></blockquote><p><strong>Reverse ETL</strong> was always a  tricky thing. 
I like the idea of reverse ETL, but ultimately it would often build on very weak foundations:<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a> </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MjCn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MjCn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 424w, https://substackcdn.com/image/fetch/$s_!MjCn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 848w, https://substackcdn.com/image/fetch/$s_!MjCn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 1272w, https://substackcdn.com/image/fetch/$s_!MjCn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MjCn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp" width="528" height="430.3820224719101" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:798,&quot;width&quot;:979,&quot;resizeWidth&quot;:528,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!MjCn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 424w, https://substackcdn.com/image/fetch/$s_!MjCn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 848w, https://substackcdn.com/image/fetch/$s_!MjCn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 1272w, https://substackcdn.com/image/fetch/$s_!MjCn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7f2b8e2b-38ec-4088-bd50-a5eb3679f7ee_979x798.webp 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The data analytics marathon. <a href="https://twitter.com/mattarderne/status/1604528546784870402">My tweet on why Reverse ETL never took off</a> - Reverse ETL is the 5% at the end. That tweet did take off relatively speaking. <a href="https://www.forbes.com/sites/brentdykes/2022/01/12/data-analytics-marathon-why-your-organization-must-focus-on-the-finish/"> Source here</a></figcaption></figure></div><p><strong>The CDP space</strong> also sort of heated up, and had significant overlap with MDS. The MDS gave you the tools to construct your own solution, and design the specifics to suit your needs, whereas the CDP kinda gave you much deeper capabilities, but less control (for the customer analytics niche). 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GSC-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GSC-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 424w, https://substackcdn.com/image/fetch/$s_!GSC-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 848w, https://substackcdn.com/image/fetch/$s_!GSC-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 1272w, https://substackcdn.com/image/fetch/$s_!GSC-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GSC-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png" width="670" height="478" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9acc96d-0060-402b-83dd-7a8816029ced_670x478.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:478,&quot;width&quot;:670,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54958,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GSC-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 424w, https://substackcdn.com/image/fetch/$s_!GSC-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 848w, https://substackcdn.com/image/fetch/$s_!GSC-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 1272w, https://substackcdn.com/image/fetch/$s_!GSC-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9acc96d-0060-402b-83dd-7a8816029ced_670x478.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">I don&#8217;t think either really gets that high on the hierarchy triangle that often in practice. </figcaption></figure></div><h4>FIRST ENTERPRISE VENDOR MDS OFFERINGS</h4><p>Enterprise-oriented, MDS-style tools and even direct clones started cropping up. The awkwardly named <a href="https://coalesce.io/">Coalesce</a> was an early dbt alternative, the first since Dataform (Coalesce is also the name of dbt&#8217;s conference). I don&#8217;t know much about it.</p><h4>MDS GARTNER MAGIC QUADRANT</h4><p>There isn&#8217;t a fully fledged Gartner Quadrant. The whole point of MDS was the unbundled solutions. There was a <a href="https://www.gartner.com/peer-community/poll/modern-data-stack-component-looking">post on their forum</a>. I predict we will see a &#8220;new wave&#8221; once all the tools are more consolidated and Gartner figures out how to position all of this. 
</p><p>I guess there are probably enterprise-branded, MDS-style solutions out there too. The only funny one I could find within 5 seconds of Googling was <a href="https://www.ibm.com/products/z-and-cloud-modernization-stack">IBM Gen z/X</a>.</p><h4>OFFERINGS TO SOLVE MDS PROBLEMS</h4><p>I <strong>almost</strong> wrote a blog decrying the state of things when I was spammed by an ex digitisation evangelist crypto expert, now selling <em><strong>data-something-dot-AI</strong></em>. Selling magic and then supplying junior analysts with git and SQL just stank of an enterprise sales cycle. This was indicative of the beginning of the end.</p><h4>DON'T DO MDS / ENTERPRISE MDS</h4><p><strong>Data mesh</strong> was the funniest part of the entire data hype cycle. Effectively a fully fledged framework to decentralise data ownership, it read like a <strong>Business School Blockchain Certification</strong> (apparently, I didn&#8217;t read it). The response from the MDS crowd was pure snark. <a href="https://databased.pedramnavid.com/p/the-last-thing-ill-ever-say-about">Pedram had the last word here</a>. I guess maybe you had to be there.</p><p>There was heated posturing on LinkedIn about the failure of MDS to properly address the <strong>problem of data modelling</strong>. Data Vault? Anchor modelling? How about a full circle all the way back to <a href="https://twitter.com/EcZachly/status/1683583788998328320">one big table?</a></p><p>This was also an Enterprise / Startup culture clash, but probably mostly a mutual lack of context and incompatible cadence. A 100-person SF startup growing beyond terminal velocity and a declining mid-tier bank in the Bavarian hinterland aren&#8217;t going to meet each other with much common language. 
I wrote something about that here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;5d00e010-beac-4afd-8dd5-bf0275d3889f&quot;,&quot;caption&quot;:&quot;Enterprise data systems can often be quite distinct from their smaller startup cousins. This post takes a look at how our conversations around data systems and techniques need to be more sensitive to context, specifically when conversationalists have varying backgrounds.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Context, and the Lack Thereof&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:10635483,&quot;name&quot;:&quot;Matt Arderne&quot;,&quot;bio&quot;:&quot;I write about data systems that improve business productivity.\n\nBuilding something new with friends\n\nTweet at https://twitter.com/mattarderne&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ff2ad675-4846-49d6-a397-599b7dd13538_1291x1104.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2023-01-18T11:57:42.399Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://groupby1.substack.com/p/context&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:97285650,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;group by 
1&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7fbee6-d181-479e-bd71-c4704b2b4c80_1216x1216.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h1>Phase 3: Involution / Stagflation</h1><div class="pullquote"><p>The movement has picked the low-hanging fruit of their object-level goals. Artistic movements have created enough works that it&#8217;s hard not to seem derivative. Intellectual movements have explored most of the implications of their ideas. Political movements have absorbed their natural base and are facing organized opposition. It&#8217;s still possible to do object-level work, but <strong><a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures">unless you&#8217;re a hard-working genius, someone will have beaten you to most good ideas</a></strong></p></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oZN6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oZN6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 424w, https://substackcdn.com/image/fetch/$s_!oZN6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 848w, 
https://substackcdn.com/image/fetch/$s_!oZN6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 1272w, https://substackcdn.com/image/fetch/$s_!oZN6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oZN6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png" width="347" height="191" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/663642d0-f76f-4983-a053-b93aee17609f_347x191.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:191,&quot;width&quot;:347,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47643,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!oZN6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 424w, https://substackcdn.com/image/fetch/$s_!oZN6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 848w, 
https://substackcdn.com/image/fetch/$s_!oZN6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 1272w, https://substackcdn.com/image/fetch/$s_!oZN6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663642d0-f76f-4983-a053-b93aee17609f_347x191.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>The cracks started to show in a few ways: cost, complexity, BI tools, sprawl. </p><p><strong>Cost:</strong> </p><p>Snowflake et al. being super expensive was the first ominous crack. When interrogated, massive cloud bills often weren&#8217;t attributable to any value. It was presumed at kick-off that these projects would be <strong>both expensive and lead to value.</strong> The expensive part was duly stomached, but the value part was late to the party.</p><p>ELT, synonymous with MDS, came under cost pressure. People realised that the work was pretty predictable, and so cheaper, better, faster options became abundant. </p><p>And then AWS, Salesforce and Snowflake all began to murmur about &#8216;<strong>zero-ETL</strong>&#8217;, which basically meant that the data sitting in Salesforce could be accessed directly from Snowflake (or whatever), basically ruining the <a href="https://benn.substack.com/p/how-fivetran-fails">Fivetran business model</a>. </p><blockquote><p>&#8220;<em>What if we could eliminate ETL entirely? That would be a world we would all love. This is our vision, what we&#8217;re calling a zero ETL future. </em></p></blockquote><p><strong>dbt: </strong></p><p>dbt was really just a victim of its own success: when you push into the unknown you will find out, and dbt found out. Before dbt, teams kept their reliance on a data warehouse simple and un-collaborative. 
One engineer would generally be responsible, with a de facto veto on data modelling changes. dbt <a href="https://roundup.getdbt.com/p/complexity-the-new-analytics-frontier">democratised that</a>:</p><blockquote><p>Did we achieve more collaboration on an analytics code base? &#9989;</p><p>Did we achieve more leverage through reusable and modular code? &#9989;</p><p>Did we also buy more complexity, resulting in longer maintenance and debugging cycles? Unfortunately, also &#9989; &#129299;</p><p>Turns out the price of enabling people to build a more complex code base is&#8230; <strong>a more complex codebase</strong>, and everything that comes with that.<strong> </strong></p></blockquote><p>Solving one problem and in doing so creating another is the essence of progress. I give dbt a pass here. Nonetheless, things got very complex, and not just in dbt.</p><p>There were numerous <a href="https://mattpalmer.io/posts/hot-takes/">hot takes</a> pointing to the shortcomings of dbt, and a slew of <a href="https://news.ycombinator.com/item?id=37121543">dbt alternatives popped up</a>, all &#8220;faster horses&#8221; in my mind. We need cars.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> </p><p>dbt, having raised a ton of money, had to do <a href="https://www.getdbt.com/blog/dbt-labs-update-a-message-from-ceo-tristan-handy/">layoffs</a> and quickly <a href="https://www.getdbt.com/blog/consumption-based-pricing-and-the-future-of-dbt-cloud/">figure out a business model</a>.</p><p><strong>Complexity: </strong></p><p>You could be led to believe that you need one from each of the following &#8220;MDS categories&#8221;. However, <a href="https://www.moderndatastack.xyz/stacks/pitch">most teams</a> generally limited themselves to a BI tool, a database, an ETL tool and dbt. 
</p><p>The issue was the abundance of overlapping options.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BLFd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BLFd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 424w, https://substackcdn.com/image/fetch/$s_!BLFd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 848w, https://substackcdn.com/image/fetch/$s_!BLFd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 1272w, https://substackcdn.com/image/fetch/$s_!BLFd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BLFd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png" width="1070" height="1295" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1295,&quot;width&quot;:1070,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:345045,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BLFd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 424w, https://substackcdn.com/image/fetch/$s_!BLFd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 848w, https://substackcdn.com/image/fetch/$s_!BLFd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 1272w, https://substackcdn.com/image/fetch/$s_!BLFd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff760e11c-1a3b-459d-92e7-01bdf130e66a_1070x1295.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">https://www.moderndatastack.xyz/categories</figcaption></figure></div><p><strong>Team size:</strong></p><p>The other awkward thing to address was the growth in MDS team size. I worked with a team that wanted to more than double their data team, from 20 to 50. MDS became known to run on <a href="https://medium.com/@laurengreerbalik/the-modern-data-stack-through-the-gervais-principle-bfd4b4e33ac7">human middleware</a>:</p><blockquote><p>Cash is injected. This means more employees of all sorts are hired. More employees increases the demand for more reports and analytics. More demand means more human middleware is created in the Clueless layer when you adopt the Modern Data Stack paradigm of throwing everything into your cloud data warehouse of choice.</p><p>More Clueless human middleware creates more tables, tables, tables to many more reports and KPIs and metrics. They have to buy new products and hire new people to manage the complexity. 
</p></blockquote><p>This criticism was semi-reasonable<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>, but honestly I think this is often just the cycle of technology. Something comes along and presents an opportunity to differentiate; it works for some, it doesn&#8217;t work for others. People need to be involved. How many people? Probably a few. Oops, too many. OK, fewer people.</p><p>A more fundamental issue with <strong>buying anything is knowing if you are ready for it</strong>. Does this company need a better data capability? Falling behind is a <a href="https://erikbern.com/2020/12/16/giving-more-tools-to-software-engineers-the-reorganization-of-the-factory.html#:~:text=Lack%20of%20adoption%20of%20new%20tools%20means%20falling%20behind%20the%20companies%20leveraging%20those%20tools.">real risk that compounds</a>!</p><p>The Data Stack was sold broadly; for some it worked, quite often it didn&#8217;t. </p><p><strong>BI tools:</strong></p><p>Business Intelligence tools continued to underwhelm, primarily because of the split between traditional reporting and exploratory analytics. </p><p>The paradigm of &#8220;reporting&#8221; is in my mind a dead end. <strong>Like delivering a menu and then never taking an order, BI tools were informative instead of interactive. </strong></p><p>Traditional BI tools just don&#8217;t move the needle in the same way that newer tools like Hex do. 
(Hex described this paradigm in a <a href="https://web.archive.org/web/20211113014204/https://hex.tech/blog/bi-tools-hex/">deleted article</a>; they now position themselves as a data tool that does <a href="https://hex.tech/blog/data-driven-decisions-with-kpi-dashboards/#:~:text=They%20can%20be%20simple%20or%20complex%20depending%20on%20need%2C%20and%20as%20beautiful%20or%20sparse%20as%20you%20can%20make%20it.%20But%20the%20data%20always%20takes%20center%20stage.%20Your%20focus%20when%20building%20one%20should%20always%20be%20%E2%80%9Cdoes%20this%20help%20drive%20the%20organization%20forward%3F%E2%80%9D">reporting too</a>.) I use Hex daily. It is relatively cheap, it works very well, and it has sufficient depth to replace a <a href="https://twitter.com/mattarderne/status/1656361016983265280">fair chunk of MDS and technology infrastructure</a> too. </p><p>To be fair to BI tools, they were the last-mile delivery problem built on a relative house of cards, so they were pretty much destined to be <a href="https://twitter.com/mattarderne/status/1679109539789000710">the pain cafe</a>. </p><p><strong>Sprawl:</strong></p><p>There was vocal criticism from product analytics people about the sprawling nature of MDS. </p><p>Product analytics is a mature side-car to the MDS, and the tooling built by PostHog et al is generally end-to-end integrated. 
From <a href="https://substack.timodechau.com/p/why-product-analytics-is-completely#:~:text=If%20you%20already,for%20this%20job.">their perspective</a>, the MDS approach led to poor outcomes:<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cNtm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cNtm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 424w, https://substackcdn.com/image/fetch/$s_!cNtm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 848w, https://substackcdn.com/image/fetch/$s_!cNtm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 1272w, https://substackcdn.com/image/fetch/$s_!cNtm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cNtm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png" width="327" height="282.9807692307692" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1260,&quot;width&quot;:1456,&quot;resizeWidth&quot;:327,&quot;bytes&quot;:2049001,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!cNtm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 424w, https://substackcdn.com/image/fetch/$s_!cNtm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 848w, https://substackcdn.com/image/fetch/$s_!cNtm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 1272w, https://substackcdn.com/image/fetch/$s_!cNtm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2951bfd7-0934-4002-926f-df7e84c6d303_1502x1300.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A meme from <a href="https://posthog.com/blog/modern-data-stack-sucks">PostHog</a>.</figcaption></figure></div><p>Those are a few of the criticisms that illustrate the point. Essentially, they were the <a href="https://en.wikipedia.org/wiki/Bellwether#:~:text=bellewether%2C%20which%20referred%20to%20the%20practice%20of%20placing%20a%20bell%20around%20the%20neck%20of%20a%20castrated%20ram%20(a%20wether)%20leading%20a%20flock%20of%20sheep.%20A%20shepherd%20could%20then%20note%20the%20movements%20of%20the%20animals%20by%20hearing%20the%20bell%2C%20even%20when%20the%20flock%20was%20not%20in%20sight.%5B3%5D">bellwether</a> for the change in direction of the flock. </p><h4>MDS IS DEAD POSTS</h4><p>There were <a href="https://duckduckgo.com/?q=%22modern+data+stack%22+%22dead%22&amp;va=a&amp;t=hp&amp;ia=web">a few</a>. This became a bit of a trope. Meta-analysis of the trope is far more palatable. Hope you agree. 
</p><h4>BACK TO BASICS MOVEMENT</h4><p>There was always talk of a <strong>bundling</strong>. In essence the MDS was an unbundling, and at some stage the tide would turn.</p><p>This was actually <a href="https://towardsdatascience.com/the-great-data-debate-unbundling-or-bundling-7d7721ee8514#:~:text=What%20actually%20happened%3F">discussed at length</a> in early 2022, where most everyone agreed that the unbundled approach was great for many things (experimentation, investing, entertainment, curiosity), but it <strong>wasn&#8217;t very productive.</strong> </p><h4>MDS LATE ADOPTERS</h4><p>There is still the <a href="https://www.mdsfest.com/">MDSFest</a>:</p><blockquote><p>&#8220;A community-led celebration of ideas and perspectives on the modern data stack&#8221;. </p></blockquote><p>Sounds great, but I literally just came across it while researching for this blog, so I don&#8217;t actually know much about it.</p><p>I don&#8217;t know where this video comes from, but I think it demonstrates the idea that <strong>one person can pull together an entirely viable, semi-scalable data platform </strong>from the best of the open-source stuff, as was always intended. Silver lining. </p><div id="youtube2-WlpnVvPpS8U" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;WlpnVvPpS8U&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/WlpnVvPpS8U?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h4>EVANGELISTS MOURN STATE OF WAY</h4><p>Some of the early evangelists <a href="https://twitter.com/sethrosen/status/1508425872268746755">mourned their loss</a>. It did literally feel like magic. You could achieve so much with so little. 
This was a reality.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bp--!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bp--!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 424w, https://substackcdn.com/image/fetch/$s_!bp--!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 848w, https://substackcdn.com/image/fetch/$s_!bp--!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 1272w, https://substackcdn.com/image/fetch/$s_!bp--!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bp--!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png" width="615" height="401.25" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:856,&quot;width&quot;:1312,&quot;resizeWidth&quot;:615,&quot;bytes&quot;:226769,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!bp--!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 424w, https://substackcdn.com/image/fetch/$s_!bp--!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 848w, https://substackcdn.com/image/fetch/$s_!bp--!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 1272w, https://substackcdn.com/image/fetch/$s_!bp--!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8c0adfd3-9f1a-4e5b-bc8f-aedc970d5c3e_1312x856.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A shrinking pie makes zero-sum games more likely too. A shrinking pie means you have to work pretty hard just to <a href="https://databased.pedramnavid.com/p/what-the-hell-is-going-on-with-data">stem the flow</a>:</p><blockquote><p><strong>Everyone Wants a Piece of the Pie, Nobody Wants to Bake<br></strong>&#8230;<br>if you don&#8217;t build things to solve a pain you&#8217;ve had in the hopes that it&#8217;ll solve someone else&#8217;s, if you don&#8217;t give away your hard work for free, then kindly, please, shut the fuck up.</p></blockquote><p>Matt Turck knows <a href="https://mattturck.com/mad2023-part-iii/#:~:text=of%20Fiddler.-,The%20Modern%20Data%20Stack%20under%20pressure,-A%20hallmark%20of">all about this</a>, having documented the data space for ages: </p><blockquote><p>The MDS is now under pressure. In a world of tight budgets and rationalization, it is almost too obvious a target. 
It&#8217;s <strong>complex</strong> (as customers need to stitch everything together and deal with multiple vendors). It&#8217;s <strong>expensive</strong> (lots of copying and moving data; every vendor in the chain wants their revenue and margin; customers often need an in-house team of data engineers to make it all work, etc). And it is, <strong>arguably, elitist</strong> (as those are the most bleeding-edge, best-in-breed tools, serving the needs of the more sophisticated users with the more advanced use cases).</p></blockquote><p>Overfunded startups, overcrowded teams, too many compute credits, too much badly structured SQL and lots of criticism. </p><p>The substance boiled down: </p><p><strong>MDS was a series of relatively experimental tools strung together to demonstrate varyingly good levels of Product Market Fit, but not quite demonstrating an ideal operating model. </strong></p><p><strong>The result is a great target for more refined, more niche, bundled equivalents. </strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SZCe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SZCe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 424w, https://substackcdn.com/image/fetch/$s_!SZCe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 848w, 
https://substackcdn.com/image/fetch/$s_!SZCe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 1272w, https://substackcdn.com/image/fetch/$s_!SZCe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SZCe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png" width="727.9948120117188" height="479.3870598825468" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:484,&quot;width&quot;:735,&quot;resizeWidth&quot;:727.9948120117188,&quot;bytes&quot;:452539,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!SZCe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 424w, https://substackcdn.com/image/fetch/$s_!SZCe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 848w, 
https://substackcdn.com/image/fetch/$s_!SZCe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 1272w, https://substackcdn.com/image/fetch/$s_!SZCe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa5a1c705-cf73-4e24-a26d-11b3f16182f7_735x484.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">&#8220;Mrs Armitage full-send on her MDS&#8221;  - The analogy here is that the bike demonstrates the makings of a 
motorcar. <a href="https://www.quentinblake.com/gallery/mrs-armitage-under-full-sail">I&#8217;d love a copy of this poster. - rights reserved Quentin Blake </a></figcaption></figure></div><h1>Phase 4: Postcycle</h1><div class="pullquote"><p>At some point, everyone realizes you can&#8217;t get easy status from the subculture anymore. <br><strong><a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures">The people who want easy status stop joining</a></strong><a href="https://astralcodexten.substack.com/p/a-cyclic-theory-of-subcultures">, </a><br>and the movement stabilizes in a low-growth state.</p></div><h4>END OF MDS STATUS GAMES</h4><p>There is less attention and so less opportunity for status games. Most people have some <a href="https://benn.substack.com/">trusted source of information</a> and they rely on it, less interested in the minutiae and more concerned with whatever their pressing problems are. </p><p>A large chunk of data practitioners are now<strong> product focussed, in role, in company, in orientation</strong>. This is sensible, as product is probably one of the best maturation directions: it was a primary consumer of data outputs. Ex-data people know how far to trust the data systems and what lies they tell. The same applies to operations. </p><h4>MDS CONSOLIDATES WITH FOCUS</h4><p>The <strong>data systems are largely </strong><em><strong>good enough</strong></em> and so the bottleneck becomes what to do with the data. Data engineering <em><strong>was</strong></em> the bottleneck. In January 2022 I gave the opinion that Data Engineering was <a href="https://groupby1.substack.com/p/data-engineering">no longer the primary bottleneck</a> to delivering insights/value/whatever.</p><p>Now, in August 2023, I will say that <strong>the Data Function is no longer the primary bottleneck</strong> in delivering insights/value<strong>. </strong>A good data team can ingest, model, analyse and distribute insights pretty easily. 
The challenges are now more subtle:<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a> </p><p>Track 1: &#8220;<a href="https://benn.substack.com/p/all-i-want-is-to-know-whats-different">All I want is to know what's different</a>&#8221;</p><p>Track 2: &#8220;<a href="https://benn.substack.com/p/the-emotionally-informed-company">The emotionally informed company</a>&#8221;</p><p>Track 3: &#8220;<a href="https://benn.substack.com/p/the-truth-is-out-there">The truth is out there: The only thing stopping us from finding it is us</a>&#8221;</p><p>Track 4: &#8220;<a href="https://benn.substack.com/p/will-we-ever-have-clean-data">Will we ever have clean data? Probably not, but maybe we can work with messy data</a>&#8221;</p><p>Matt Turck describes where things <a href="https://mattturck.com/mad2023-part-iii/#:~:text=Throughout%20this%20section,analytical%20(OLAP)%20workloads">go from here</a>:</p><blockquote><p>The convergence of streaming and batch processing is an evergreen, and important theme. So is the convergence of transactional (OLTP) and analytical (OLAP) workloads</p></blockquote><p>This Analytics vs Operational thing is critical. All I will add is that <strong>all data problems stem from the fact that the blue and pink blobs are handled by different teams. 
</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aomn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aomn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 424w, https://substackcdn.com/image/fetch/$s_!aomn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 848w, https://substackcdn.com/image/fetch/$s_!aomn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 1272w, https://substackcdn.com/image/fetch/$s_!aomn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aomn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png" width="772" height="321" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:321,&quot;width&quot;:772,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:29734,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aomn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 424w, https://substackcdn.com/image/fetch/$s_!aomn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 848w, https://substackcdn.com/image/fetch/$s_!aomn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 1272w, https://substackcdn.com/image/fetch/$s_!aomn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a88021-fbf1-4e40-b47b-d4c9cf2dcddc_772x321.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">https://materialize.com/blog/warehouse-abuse</figcaption></figure></div><h4>MDS LEFT US WITH MORE / PRACTITIONERS STILL CARE</h4><p>With the whole argument around MDS now mostly dusty, it can simply be described as a good idea that explored all the avenues and turned over all the stones. 
</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PRAp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PRAp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 424w, https://substackcdn.com/image/fetch/$s_!PRAp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 848w, https://substackcdn.com/image/fetch/$s_!PRAp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 1272w, https://substackcdn.com/image/fetch/$s_!PRAp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PRAp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png" width="435" height="124.6712158808933" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:231,&quot;width&quot;:806,&quot;resizeWidth&quot;:435,&quot;bytes&quot;:102321,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PRAp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 424w, https://substackcdn.com/image/fetch/$s_!PRAp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 848w, https://substackcdn.com/image/fetch/$s_!PRAp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 1272w, https://substackcdn.com/image/fetch/$s_!PRAp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc089dd66-1859-4995-9150-e0d7a73aef7b_806x231.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Data technology was <a href="https://davidsj.substack.com/p/the-modern-data-stack-is-dead-long">just a nightmare before</a>, and a better solution is always necessary. The unbundling of the big systems into discrete elements was effective, and allowed people to experiment, learn, share and iterate on that cycle towards something that was very effective. 
<a href="https://substack.timodechau.com/p/after-the-modern-data-stack-welcome">Timo Dechau</a> describes this:</p><blockquote><p>The funny thing about evolution and potentially the one often missed out. Evolution is never linear. It branches out, explores, and creates massive amounts of variants. That is the beauty of it.</p><p>But it is also why there is never &#8220;the&#8221; next. But hundreds of next. And out of them, at some point, we will see a step changing the ways in such a good way that we could declare it as a new paradigm.</p><p>We are not there yet. But we can already see the branches, which is exciting.</p></blockquote><p>He has another good line of thinking that maybe <a href="https://substack.timodechau.com/p/leaving-product-analytics#:~:text=Product%20analytics%20on%20top%20of%20your%20events%20in%20your%20data%20warehouse.%20No%20more%20weird%20data%20loadings%20and%20enrichment%20(where%20most%20of%20them%20never%20worked).%20And%20mostly%2C%20no%20two%20setups%20for%20classic%20BI%20and%20product%20analytics%20use%20cases.%20We%20spent%20some%20more%20time%20with%20this%20approach%20later.">Product and MDS will converge</a>. </p><blockquote><p>Product analytics on top of your events in your data warehouse. No more weird data loadings and enrichment (where most of them never worked). And mostly, no two setups for classic BI and product analytics use cases. We spent some more time with this approach later.</p></blockquote><h1>IN CLOSING</h1><h4>STILL REAL PROBLEMS TO BE SOLVED </h4><p>The opportunities are there. My core issue with MDS was <strong>that data modelling remained a complete nightmare</strong>. As the SaaS systems that run a business got more complex, the effort to consolidate went up and the accuracy of the consolidation went down. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ys4b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ys4b!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 424w, https://substackcdn.com/image/fetch/$s_!Ys4b!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 848w, https://substackcdn.com/image/fetch/$s_!Ys4b!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!Ys4b!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ys4b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png" width="1456" height="866" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:866,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:726311,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ys4b!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 424w, https://substackcdn.com/image/fetch/$s_!Ys4b!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 848w, https://substackcdn.com/image/fetch/$s_!Ys4b!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 1272w, https://substackcdn.com/image/fetch/$s_!Ys4b!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45f72c34-94b5-4be0-82ae-656cf535ae1c_2088x1242.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption"><a href="https://twitter.com/mattarderne/status/1599818945401262080/photo/1">more in the thread</a></figcaption></figure></div><p>My take on data modelling:</p><ol><li><p>There is an optimum level of data modelling done by software developers building apps and not just <a href="https://news.ycombinator.com/item?id=37010349">leaving it to the data team</a> (Analytics vs Operational) </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Z4y6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Z4y6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 424w, https://substackcdn.com/image/fetch/$s_!Z4y6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 848w, https://substackcdn.com/image/fetch/$s_!Z4y6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 1272w, https://substackcdn.com/image/fetch/$s_!Z4y6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Z4y6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png" width="1160" height="306" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:306,&quot;width&quot;:1160,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:62186,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Z4y6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 424w, https://substackcdn.com/image/fetch/$s_!Z4y6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 848w, https://substackcdn.com/image/fetch/$s_!Z4y6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 1272w, https://substackcdn.com/image/fetch/$s_!Z4y6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff15790a0-2dfb-4a78-b6ba-f4011c9ef068_1160x306.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div></li><li><p>There is more work to be done exposing better data models via API from SaaS companies, and <a href="https://twitter.com/mattarderne/status/1593251129147920386">especially internally</a>.</p></li><li><p>There absolutely has to be <strong>better (</strong>easier, pre-populated, more automated, less fragile, less complicated) data modelling techniques. </p></li></ol><p><strong>The immediate future for data modelling involves an important role in the next <a href="https://twitter.com/mattarderne/status/1679107249233436675">big thing</a></strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!X5tH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!X5tH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 424w, https://substackcdn.com/image/fetch/$s_!X5tH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 848w, https://substackcdn.com/image/fetch/$s_!X5tH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 1272w, 
https://substackcdn.com/image/fetch/$s_!X5tH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!X5tH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png" width="533" height="722" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6529334e-144b-41f3-92f6-4052ed46d661_533x722.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:722,&quot;width&quot;:533,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:225722,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!X5tH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 424w, https://substackcdn.com/image/fetch/$s_!X5tH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 848w, https://substackcdn.com/image/fetch/$s_!X5tH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 1272w, 
https://substackcdn.com/image/fetch/$s_!X5tH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6529334e-144b-41f3-92f6-4052ed46d661_533x722.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div>
<div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Jacob epitomised the true spirit of MDS - he built a <a href="https://www.dataduel.co/modern-data-stack-in-a-box-with-duckdb/">data analytics &#8220;in a box&#8221;</a> using entirely open source tools. It just seems like the absolute best way to demonstrate the value of the idea:<br><em>TLDR: A fast, free, and open-source Modern Data Stack (MDS) can now be fully deployed on your laptop or to a single machine using the combination of&nbsp;<a href="https://duckdb.org/">DuckDB</a>,&nbsp;<a href="https://meltano.com/">Meltano</a>,&nbsp;<a href="https://www.getdbt.com/">dbt</a>, and&nbsp;<a href="https://superset.apache.org/">Apache Superset</a>.</em></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>I started a small consultancy, of which we never had more than 4 or 5 people operating at any one time, but we worked with many great companies and some exceptional ones. 
I never pushed that hard on this as the learning balance quickly falls in the favour of the client (initially you learn, then once you&#8217;ve stopped learning then they start to benefit, and you just hopefully get paid enough).</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Based on those criteria, I will riff a bit and say that in 2023 and onward, the following are clear <strong>requirements for buying data tools</strong></p><ol><li><p>The product must be aware of the data warehouse, the CRM and the related tools. Complementary technology is essential. </p></li><li><p>The <strong>intro-demo-trial-buy</strong> process must be accessible without hand holding, screening calls, hidden pricing and other crap. (once in growth phase out of beta etc)</p></li><li><p>Some form of value must be obvious and demonstrated within 3 hours of the trial. 
</p></li></ol></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>Niche verticalized versions of the Reverse ETL concept do very well, as do niche CDPs.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>My two favourite cars - <a href="https://github.com/malloydata/malloy">Malloy</a> and <a href="https://relational.ai/blog/losing-the-middle-tier">Relational.ai</a> are both sensible concept cars that take a novel approach to the heart of the problem - data modelling is a nightmare!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>What started out as reasoned rapidly became directed <a href="https://twitter.com/oldjacket/status/1686210267539988482">at individuals</a> that kinda gave you that awkward vibe that is difficult to engage with but to me indicated the <a href="https://twitter.com/mattarderne/status/1689631249286242304">signs of the end</a>. </p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>There is some irony that they then propose Posthog as a Data warehouse <em>and</em> CDP, but I&#8217;ve used Posthog and it is a good product analytics tool. 
</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p><em>These read as if you asked ChatGPT to write a song list for an angsty B2B data team leader&#8217;s third album</em></p></div></div>]]></content:encoded></item><item><title><![CDATA[Context, and the Lack Thereof]]></title><description><![CDATA[Where we come from, we&#8217;d call you crazy]]></description><link>https://groupby1.mattarderne.com/p/context</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/context</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Wed, 18 Jan 2023 11:57:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tUK0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Enterprise data systems can often be quite distinct from their smaller startup cousins. This post takes a look at how our conversations around data systems and techniques need to be more sensitive to context, specifically when conversationalists have varying backgrounds.</em></p><p><em>A puddle-deep dive into the world of enterprises and how they compare to startups.</em></p><h1>Does it generalise?</h1><p>A key thing I come back to when working in data, writing about it and especially reading about it, is the following: does this generalise? Does this apply to a general context?</p><p>We all have a specific set of experiences and sufficient time to learn from others. 
This gives each of us the perspective that forms our opinions on the world.</p><p>So occasionally, we will say something like<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>:</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/rdrn_/status/1593251129147920386&quot;,&quot;full_text&quot;:&quot;I think tech/SW teams should (be required) to build APIs of core business data concepts (customers, orders, products) way earlier\n\nThe anti-pattern of this being defined by a new (data) team in a new tech paradigm (dbt) and then exposed in a new tool (BI) causes much pain&quot;,&quot;username&quot;:&quot;mattarderne&quot;,&quot;name&quot;:&quot;Matt Arderne&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Thu Nov 17 14:34:05 +0000 2022&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:2,&quot;like_count&quot;:39,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:false}" data-component-name="Twitter2ToDOM"></div><p>This tweet led to fascinating conversations with many people, experts and non-experts alike.</p><p>What is even more fascinating, is that the context that caused this statement to resonate so strongly<strong> is entirely presumed</strong>, or even unknown.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> The context, the specific experiences that led to the specific insight, is unknown.</p><p>This I would argue is a key issue with a few of the rifts in data discussions. 
<a href="https://twitter.com/petehanssens/status/1426517732023963649">Data mesh</a>, modern data stack, bundling, data modelling, dbt, human middleware, contracts, <a href="https://benn.substack.com/p/day-of-reckoning#:~:text=But%20what%20if,that%20owns%20it%3F">purpose</a>, <a href="https://stkbailey.substack.com/p/what-exactly-isnt-dbt#:~:text=To%20execute%20%E2%80%9Cdbt,a%20new%20age.">gods</a>, etcetera.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>How many of these rifts are due to a lack of generalised context?</p><p>Let&#8217;s have a look.</p><h1>Who are we?</h1><p>Data people do come from a pretty consistent set of <a href="https://stkbailey.substack.com/p/perennial-truth-architectures">contexts</a>:<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a></p><blockquote><p>The monolithic startup driven by a charismatic CEO. The meshy enterprise and its tangled mess of systems. The methodical R&amp;D group with its tinkering innovators. 
The mutinous midsize company and its C-Suite fights over whose department initiatives are&nbsp;<em>really</em>&nbsp;driving growth.</p></blockquote><p>Across all of these complexity-busting teams, possibly the clearest dimension that I've noticed is the almost opposed operating models of the scrappy <strong>startup/scaleup</strong> mode and the sprawling <strong>enterprise</strong> mode when it comes to implementing data solutions and systems.</p><p>Here are a few key characteristics that really distinguish the two:</p><ul><li><p>CTO vs CIO<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a> responsible for data</p></li><li><p>Data needs to be useful vs data needs to be accurate</p></li><li><p>Growing vs stable (<a href="https://twitter.com/hkarthik/status/1581339228515930112">dying</a>)</p></li><li><p>Emergent vs traditional</p></li><li><p>Product market fit vs risk averse</p></li></ul><p>This is worth discussing because we have long since reached the point where these conversations feel quite like <em>worlds colliding</em> </p><p>(the enterprise data-mesh car crashing through the fence of the startup analytics world), </p><p>like two previously uncontacted tribes, each with their own gods and deities, coming together and not loving the look of each other. People just screaming right past each other. 
Here is me screaming:</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/rdrn_/status/1580469472783056900&quot;,&quot;full_text&quot;:&quot;<span class=\&quot;tweet-fake-link\&quot;>@mullinsms</span> <span class=\&quot;tweet-fake-link\&quot;>@DSJayatillake</span> <span class=\&quot;tweet-fake-link\&quot;>@DomenicRavita</span> <span class=\&quot;tweet-fake-link\&quot;>@getdbt</span> <span class=\&quot;tweet-fake-link\&quot;>@fivetran</span> This circular conversation seems most common when the enterprise context clashes with rapid growth context. \n\nDecay vs Acceleration orientation largely not reconcilable, which IMO explain these&quot;,&quot;username&quot;:&quot;mattarderne&quot;,&quot;name&quot;:&quot;Matt Arderne&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Thu Oct 13 08:04:20 +0000 2022&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:0,&quot;like_count&quot;:1,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><h2>My Background</h2><p>I have always worked in and with smaller businesses, the median headcount being ~150. I briefly worked as a solutions engineer for a company that nearly exclusively sold products to big enterprise companies, multinationals, banks.</p><p>This experience was <em>not agreeable</em> and shocked me in a few ways. Primarily, how incredibly complicated the enterprise customer operations can be, and how dysfunctionally the enterprise customers ran their technology projects.</p><p>Complicated is fine, <a href="https://www.mindtools.com/pages/article/cynefin-framework.htm">complicated is not inherently complex</a>, and just requires careful consideration, rather than rocket science, to figure out what to do. 
But in these older businesses, what to do is staggeringly complicated.</p><p>And so if you have worked in a startup, you may experience enterprise as static, as if nothing happens.</p><p>Well, things happen, just nothing changes.</p><p>At least not satisfyingly so, certainly not quickly enough for any feedback loops to develop, and rarely with anyone around long enough to be held accountable for their actions.</p><p>What ends up happening is pretty organic, often incredibly misguided, and very likely even counterproductive without anyone realising <a href="https://en.wikipedia.org/wiki/Systemantics#cite_note-%5BPink2011%5D-4">it</a>:</p><blockquote><p>Not only do systems expand well beyond their original goals, but as they evolve they tend to oppose even their own original goals. &#8230; For example, incentive reward systems set up in business can have the effect of institutionalizing mediocrity. This leads to the following principle. <em><strong>Systems tend to oppose their own proper function.</strong></em></p></blockquote><p>Working in a company selling software was great, but selling to such established mega-corps was not my style of toasted cheese sandwich. </p><p>This blog has simmered as a comparison between Startup-land and "your typical lower tier enterprise". <strong>The result is a massive generalisation</strong>, highlighting the best of startups and comparing them to the worst of enterprise. I have a minor sliver of context, and I'm adding it to the mix. 
I'm not an enterprise/startup expert, but I do have experience, which means I have some context to share.</p><p>Many people in this space go their whole lives not looking under the hood of an enterprise, or startup, and this is for them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tUK0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tUK0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 424w, https://substackcdn.com/image/fetch/$s_!tUK0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 848w, https://substackcdn.com/image/fetch/$s_!tUK0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 1272w, https://substackcdn.com/image/fetch/$s_!tUK0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tUK0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png" width="1456" height="710" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:710,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5283102,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tUK0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 424w, https://substackcdn.com/image/fetch/$s_!tUK0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 848w, https://substackcdn.com/image/fetch/$s_!tUK0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 1272w, https://substackcdn.com/image/fetch/$s_!tUK0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F255bcb25-cf47-45d7-bed6-6e604e99de9a_1989x970.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">LinkedIn or Twitter - <a href="https://www.artmajeur.com/en/magazine/29-pop-culture/octavio-ocampo-s-metamorphic-paintings/331291">source</a></figcaption></figure></div><h1>So what about startups</h1><p>Anyone reading this probably has an <a href="https://en.wikipedia.org/wiki/Silicon_Valley_(TV_series)">idea</a> of startup land.</p><p>Startups have the benefit of a mostly clean slate, minimal baggage and an attitude of <strong>must take the time to get the tech systems right.</strong> Data is very often part of the strategic differentiation (guided or <a href="https://benn.substack.com/p/day-of-reckoning">misguided</a> as it may be).</p><p>Single data system, single tech, migrate quickly, do it now, cutting edge, <em>sorry cancel the contract we turned that off</em>.</p><p>Often wrong but always correcting, both the data stack and the company. 
Startups pivot on a cent (dime/penny), making extremely confident moves, with everyone in the team ready to have their role/responsibility severely disrupted, and be thrilled about it (unless you get fired).</p><p>Some great points I seem to have copied from <a href="https://locallyoptimistic.slack.com/archives/CHF1E9NUS/p1641404224097000">a Slack</a> about what to expect at a startup if you&#8217;ve only worked in an enterprise:</p><blockquote><p>You will ALWAYS be building the plane as you fly it. Take time to step back and really think about your foundation and the scale it needs to operate at.<br><br>My advice would be to try to be as &#8216;T&#8217; shaped as possible. Don&#8217;t worry about being amazing at lots of things - just be <em>good enough</em> at lots of things and <em>best-in-team</em> at one thing in particular.</p><p>Just do things, permission is unaffordable overhead.</p><p>Prioritize aggressively because if you do the wrong things the company dies.</p></blockquote><p>Hot on the trail of customer needs and wants, iterating through the early stages of getting product market fit, and then once the <em>scaleup stage</em> is reached, all hands to the optimisation engine.</p><p>Startups are relatively simple - everyone is aligned. If not, then it is probably just a small enterprise.</p><p><strong>Summary</strong></p><ul><li><p>Uncompromised, as everything matters, and the business will probably fail</p></li><li><p>Any single failure could be fatal, but can only be minimally mitigated</p></li><li><p>Aligned incentives</p></li><li><p>Fragile, like a newly lit fire with massive potential &#129512;</p></li></ul><h3>Scale-ups?</h3><p>Scale-ups are what happens when a startup does exceedingly well (product market fit or excessive funding) and transitions from chaos into chaos on a mad hiring spree. 
The metamorphosis is described <a href="https://compilerqueen.substack.com/p/when-growth_stage-pupa">here</a>:</p><blockquote><p>The growth of a company from startup to enterprise has a lot in common with the growth of a caterpillar into a butterfly. And those parallels are especially meaningful when you&#8217;re in the goopy, amorphous pupa phase.</p><p>It can be disorienting to go from a stage where the emphasis is on quick execution for immediate results to a stage where ideas take longer to incubate and impact is measured in years, not days.</p></blockquote><p>Often enough a chaotic nightmare where no one knows what truly matters. The inherent value of this stage is being tested by the current lack of growth VC funding. </p><p>Pouring fuel on the fire now that you won&#8217;t smother it &#128293;</p><h1>So then what about enterprise</h1><p>Enterprise is entirely different, with endless silos and centralising and decentralising efforts, all complicated by mergers, locations, borders, timezones and subdivisions. The focus of data is on basic coherence, control, and standardisation.</p><p>I'm not talking about Apple here. <a href="https://twitter.com/GergelyOrosz/status/1574729116850561024">FAANG</a> doesn't have this issue; they just build solutions. CV buffering for engineers, they have other issues: hiring people they don't need in order to prevent other FAANGs from having access to them, and rumours about <a href="https://twitter.com/Austen/status/1412322657907793925">deliberately depressing profits</a>.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a></p><p>I am talking about your mid-tier banks, in forgotten regions. Companies that need technology to survive, but have a complicated relationship with it. Think COBOL stories. Think of General Electric post-merger with one of the hundreds of companies in one of the very many sectors. 
Declining profits, fired CEOs, disillusioned staff. Failed <a href="https://www2.deloitte.com/content/dam/Deloitte/mx/Documents/human-capital/01_ERP_Top10_Challenges.pdf">ERP</a> implementations. The long tail of companies that just exist on momentum and middle management careering.</p><p>At a true enterprise, one can expect many legacy tech stacks, each with a plan and migration timeline. Keep it reliable, supported, staffed and contracted, ideally from a single vendor.</p><p>What this leads to is <a href="https://earthly.dev/blog/bullshit-software-projects/#:~:text=the%20Zombie%20Projects.-,Zombie%20Projects,was%20the%20case%20at%20the%20networking%20start%2Dup%20he%20worked%20at%3A,-We%20had%20no">Zombie Projects</a>, with incredibly disillusioned teams running them:</p><blockquote><p>when a project has failed or has ballooned in size to the extent that it will never be completed, and everyone knows it. When that happens, and no one wants to face the facts, and so the project continues to move forward, then it becomes a BS project.</p></blockquote><p>Individuals start to realise that the project has died but will remain funded, and the system supports itself. They realise that regardless of whether they work or not, nothing happens and nothing matters, other than the charade itself. A reminder that this really <a href="https://earthly.dev/blog/bullshit-software-projects/#:~:text=Marcelle%2C%20another%20developer%2C%20also%20found%20doomed%20projects%20hard%20to%20handle%3A">drains motivation</a>.</p><p><em>That intro is probably a bit harsh. What does the enterprise do well? <br>Running incredibly complex, massively scaled businesses, often <a href="https://twitter.com/rkoutnik/status/1501346600202825731">successfully</a>, credit where it is due.</em></p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/rkoutnik/status/1501346600202825731&quot;,&quot;full_text&quot;:&quot;Working at an early startup is like driving a go-kart.  
Feels fast only because it's small, you're not actually going quickly.\n\nWorking at a mature company is like riding a plane.  Doesn't feel like you're moving at all until you look out the window and have gone 100s of miles&quot;,&quot;username&quot;:&quot;rkoutnik&quot;,&quot;name&quot;:&quot;Randall Koutnik&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Tue Mar 08 23:58:17 +0000 2022&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:2,&quot;like_count&quot;:21,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><h3>Hero Warship</h3><p>Another recognisable enterprise pattern is hiring <a href="https://twitter.com/sergey_brine/status/1594726277554094082">saviours</a>. </p><p>Person X, accolade Y, education Z (<a href="https://benn.substack.com/p/the-emperor-and-his-clothes#:~:text=None%20of%20this,some%20harebrained%20catastrophe">MBA</a>) is heralded as the big new hire with the big new plan, assigned a big new budget to (finally) achieve the big broad goal.</p><p>Hero comes off the back of recent success in the exact same situation at a competitor. They start with an investigation into the &#8220;business needs&#8221; through many committees, leading to a feature hit-list-box-ticking exercise. Eventually, only massive companies like Oracle or IBM are in the running.</p><p>Cynically, this project is entirely for the purpose of leveraging it into a new role at a new bank (<a href="https://twitter.com/shreyas/status/1339997380335128576">Failing Up</a>), years before the project has any chance of fruition.</p><p>Another version of this game: In walk the management consultants. 
Strategise the flavour of the month (Modern Data Mesh), sell the dream, plan the diagram:</p><p><em>optimise cut reduce expand single source modern truth AI prediction profit, drops mic</em></p><p>The <a href="https://doomedprojects.com/post/it-would-be-career-limiting">contract staffing</a> company arrives, waterfall chart in hand and by the time the paint has dried most of the original stakeholders have jumped ship as <s>digitisation</s> <s>data</s> <s>AI</s> <s>blockchain</s> cost management experts, and are <a href="https://twitter.com/Stonks_dot_com/status/1581395492977618944">spamming</a> the success of the project before it has even begun!   </p><h3>What are you sinking about</h3><p>I think of enterprise as a system that has built so many incredibly reliable systems to prevent the business from failing that they become rigid. </p><p>Incredibly strong. Incredibly inflexible. Entire functional areas begin to decay, while the <a href="https://experimentalhistory.substack.com/p/bureaucratic-psychosis">rigidity remains</a>.</p><blockquote><p>Put people in charge of rules, meetings, and forms, and their first inkling will be &#8220;there should be more rules, meetings, and forms&#8221;.</p></blockquote><p>The rigidity and strength prevent flexibility, and so change becomes increasingly unlikely. This eventually leads to a state of dysfunction. The structure remains, but the function gradually dissolves. Idiosyncrasies like change review boards outright inhibiting changes.</p><p>Broken systems, operating in an organic state of <a href="https://how.complexsystems.fail/#5">failure</a>:</p><blockquote><p>The system continues to function because it contains so many redundancies and because people can make it function, despite the presence of many flaws. After accident reviews nearly always note that the system has a history of prior &#8216;proto-accidents&#8217; that nearly generated catastrophe. 
</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 424w, https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 848w, https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 1272w, https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 1456w" sizes="100vw"><img src="https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb" width="4096" height="3072" data-attrs="{&quot;src&quot;:&quot;https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3072,&quot;width&quot;:4096,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb" 
title="https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb" srcset="https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 424w, https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 848w, https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 1272w, https://images.unsplash.com/photo-1596248723887-3b002a4c1c90?ixlib=rb-4.0.3&amp;q=80&amp;fm=jpg&amp;crop=entropy&amp;cs=tinysrgb 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Summary</strong></p><ul><li><p>Compromised, primarily due to risk aversion</p></li><li><p>Severe misalignment</p></li><li><p>Hot embers, of a big fire &#128219;</p></li></ul><h1>Looking Ahead</h1><p>As we have seen, the enterprise has largely taken note of the successes (and failings) of the organic brew of recent data tech.</p><p>My guess is that over the next few years, we will see this accelerate (they don&#8217;t read the salty tweets), but it will largely squish the pulp out of it in the outsourced and cross-matrix-managed divisional regional organisational chaos, and after decades of implementation, will abandon it to the pile of supported and business-critical but largely ignored technology.</p><p>Alongside we&#8217;ll see ever more of the offering of cargo-cult as a service from the consultancies forcing their way into the fray, with AI names and outsourced quasi-bundled solutions. </p><p>Often these kinds of trends are misguided, but some are indeed necessary. Digitisation? Yes if you must, no one wants to call for a delivery update. Cloud? Probably.</p><p>So Modern Data Stack then? Well, it comes with a way of working, an ethos around software best practices, some thoughts around who is responsible for what, and how to engage as an impactful product minded team.</p><p>Therein lies hope. Where Hadoop was a big data solution, Modern analytics is possibly a <a href="https://twitter.com/bennstancil/status/1605969780636205057">process</a> improvement. Doing things a bit more coherently. 
A slightly easier way to do the same old.</p><p>But I worry that it is probably not enough, or not even the case.</p><p>The ease with which enterprise data modelling experts have co-opted the data modelling conversation on LinkedIn tells me that the post-fact analysis is more likely to conclude something along the lines of:</p><div class="preformatted-block" data-component-name="PreformattedTextBlockToDOM"><pre class="text">MDS brought Enterprise capabilities to Startups, <strong>and not the other way around.</strong></pre></div><p>That is a shuddering thought if ever there was one.</p><p>As mentioned, and regardless, Modern Data Stack has started simmering in the Enterprise, and is now well on its way. A new song with the same dance.</p><h1>So what?</h1><p>Back to the opening point on context. </p><p>Take a moment to understand the context behind opinions, product features, books, blogs, tweets and toots. </p><p>When consuming content in this space, it is critical to check the relevance of what you are reading <strong>to you.</strong></p><p>Is that (this) advice or content applicable to you?</p><p>More importantly, consider the person writing that (this) blog. What is their background? What are they saying? What have they not said?</p><div><hr></div><p><em>Thanks to <a href="https://twitter.com/sspaeti/">Simon Sp&#228;ti</a> for insightful feedback on an early draft. Simon published a very insightful analysis of how enterprise data systems <a href="https://airbyte.com/blog/modern-data-stack-struggle-of-enterprise-adoption">actually work</a>. 
Also thanks to my team of <a href="https://en.wikipedia.org/wiki/Extended_family">editors</a> for the proofreading.</em></p><p><em><strong>20 Feb 2023 edit</strong> - Google <a href="https://medium.com/@pravse/the-maze-is-in-the-mouse-980c57cfd61a">is indeed an enterprise</a> by the definition broadly described in this post</em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><a href="https://twitter.com/mattarderne/status/1593251129147920386">The tweet</a> was inspired by a conversation with a company, who have tasked their newest team member with building a Data Warehouse so that the Engineering team could keep shipping.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>The reason Slack/Twitter/LinkedIn works at all is that we broadly presume everyone is of the following: data folks, largely all working in tech, scaling and not so much, generally all building web-apps, B2B and B2C, VC funded, ~Head of Data 
consulting, engineers too, building a modern Data Analytics practice, hate Data Mesh, jaded Jaffles, ~SQL, mostly ready to move into Product.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>I could try to find a link for each of those, but as a reader I find the pressure to discover ALL of the context a bit <a href="https://benn.substack.com/p/the-modern-data-experience#:~:text=To%20analytics%20engineers,would%20care%20about.">overwhelming</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p><em>Data people do actually come from a pretty consistent set of <a href="https://stkbailey.substack.com/p/perennial-truth-architectures">contexts</a>. </em>Why is this? My guess: data people are the glue that is applied to deal with complexity. Filling the <a href="https://youtu.be/wB0ulHmvU7E?t=1287">gaps in the matrix</a>. People, varyingly hired to fix the complexity problem.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>CIO? <a href="https://www.joincolossus.com/episodes/75184353/slootman-narrow-the-focus-increase-the-quality?tab=transcript">Frank Slootman</a> on the enterprise CIO:</p><blockquote><p>It used to be IT was king of the hill, it still is in some place. But now business is just as technical as IT. So their roles are shifting and you get a much more balanced environment between what the business makes decisions on and what IT is really in charge of, because IT doesn't really know how to apply technology to the business, but the business does. 
We see that balance changing.</p><p>And I have that conversation often, by the way, with CIOs who are your typical infrastructure guys, they manage for cost and risk and these kinds of things. But they're infrastructure people, they really are enablers, but they don't really know how technology impacts the business. The business does.</p></blockquote></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>This was written before they started firing people.</p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[Data Navigators]]></title><description><![CDATA[Dealing with uncertainty is what differentiates the navigator from the auditor]]></description><link>https://groupby1.mattarderne.com/p/data-navigators</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/data-navigators</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Fri, 19 Aug 2022 11:12:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/h_600,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Rich with historical trivia, linking to great practical insights, and sublime section headings. 
A must skim for any slow Friday - </em>Data Workers Daily </p><h1><strong>Where am I?</strong></h1><p>Throughout history, this question has usually been resolved somewhat approximately, but always with a strong desire for better accuracy.</p><p>From the perspective of explorers deep in the fog of discovering new worlds, each incremental step in <strong>increasingly</strong> <strong>accurate methods of navigation was a new opportunity</strong>, much like an entrepreneur sees the incremental tech capabilities as a wonder of possibilities.</p><p>Let us pick up the thread of increasing accuracy at the very interesting inflection point of <a href="https://www.celestialnavigation.info/what-is-celestial-navigation/">celestial navigation</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0u9G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0u9G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0u9G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0u9G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!0u9G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0u9G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg" width="344" height="309.84397163120565" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/be91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:508,&quot;width&quot;:564,&quot;resizeWidth&quot;:344,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Sextant_(PSF).png (2320&#215;2090) | Tattoo | Pinterest&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Sextant_(PSF).png (2320&#215;2090) | Tattoo | Pinterest" title="Sextant_(PSF).png (2320&#215;2090) | Tattoo | Pinterest" srcset="https://substackcdn.com/image/fetch/$s_!0u9G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0u9G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!0u9G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0u9G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe91da8e-94d5-48c8-8ebb-b818921db2c9_564x508.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Pinterest <a 
href="https://www.pinterest.co.uk/pin/461337555552779242/">captures the gist of how it works</a></figcaption></figure></div><p>Using the angle of stars to the horizon, coupled with some maths and reference charts, one can approximate latitude, with fair accuracy.</p><p>Latitude (remember, steps on a La(t)dder) are the ones that go across the map. How far North or South.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-hwm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-hwm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 424w, https://substackcdn.com/image/fetch/$s_!-hwm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 848w, https://substackcdn.com/image/fetch/$s_!-hwm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 1272w, https://substackcdn.com/image/fetch/$s_!-hwm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!-hwm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png" width="504" height="294.2307692307692" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:850,&quot;width&quot;:1456,&quot;resizeWidth&quot;:504,&quot;bytes&quot;:843648,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-hwm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 424w, https://substackcdn.com/image/fetch/$s_!-hwm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 848w, https://substackcdn.com/image/fetch/$s_!-hwm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 1272w, https://substackcdn.com/image/fetch/$s_!-hwm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06185b53-46cf-4f83-bb2f-7308d5a4aa8e_1599x933.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Figure 1 - Latitude and Atlantic Ocean &#8212; Encyclopedia Britannica</figcaption></figure></div><p>Longitude (the down ones), however, could only <a href="https://en.wikipedia.org/wiki/Lunar_distance_(navigation)">be very roughly estimated</a> due to the lack of effective seagoing clocks and the process being more complicated.</p><p>For most of history, Marine Navigators could at best get to the latitude they knew their destination was on, and sail along that latitude until they bumped into their destination (Land Ho!), or whatever was between them and the 
destination.</p><p>Often this was something hard and sharp.</p><p>Crossing the Atlantic Ocean between Europe and the Caribbean (see Figure 1) was pretty attainable, with a great heuristic: </p><p><strong>Sail South until the butter melts, and then follow the Sun</strong>. </p><p>The point at which the butter melts marks the approximate latitude you are aiming for in the Caribbean. Following the Sun is heading West. When you spot land, you have arrived!</p><p>Beyond that, things get far trickier.</p><p>Rounding Cape Horn or the Cape of Storms is a formidable challenge, even today. Without accurate East/West determination, you were left with far fewer heuristics and some mediocre workarounds:</p><blockquote><p>As the Crow Flies &#8211; When lost or unsure of their position in coastal waters, ships would release a caged crow. The crow would fly straight towards the nearest land thus giving the vessel some sort of a navigational fix. <a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p></blockquote><p>The race to map the world meant that enhancing location accuracy was as valuable then as it is now (maps, the new oil).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RgZl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RgZl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 424w, 
https://substackcdn.com/image/fetch/$s_!RgZl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 848w, https://substackcdn.com/image/fetch/$s_!RgZl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 1272w, https://substackcdn.com/image/fetch/$s_!RgZl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RgZl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png" width="800" height="506" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:506,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:520530,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RgZl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 424w, 
https://substackcdn.com/image/fetch/$s_!RgZl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 848w, https://substackcdn.com/image/fetch/$s_!RgZl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 1272w, https://substackcdn.com/image/fetch/$s_!RgZl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc22fff2e-1ab7-4f48-844e-f100a3849e34_800x506.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline 
points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://en.wikipedia.org/wiki/James_Cook#First_voyage_(1768%E2%80%931771)">James Cook</a>, without much sensitivity for how it would be interpreted after the fact, hammered around the earth, charting the world with only a rough idea of his longitude. Here he is running aground on the Great Barrier Reef, presumably due to an intern&#8217;s PR.</figcaption></figure></div><p>And so, a 1700s <a href="https://en.wikipedia.org/wiki/Longitude_rewards#Establishing_the_rewards">British government-sponsored</a> technical arms race was established, matched only by the Data Stack sparring of 2021 in terms of strong narratives. The desired outcome: a better clock.</p><p>The marine chronometer (a much better clock) was conceived of and improved to such a degree that it enabled navigators to <a href="https://en.wikipedia.org/wiki/James_Cook#Second_voyage_(1772%E2%80%931775)">know where they were</a>.&nbsp;</p><p>The chronometer enabled better longitude determination. With an extreme emphasis on better (just a tiny snippet of the history of navigation, which may or may not be fascinating, I won&#8217;t impose).</p><p>The point, and why I find this a relevant thing to idle upon, is that this approximate location resolution process still exists. 
When you learn to navigate a modern sailboat, you are instructed to determine your location by taking 3 compass readings, resulting in a triangle, resolving to a fair chance that you are <strong>or at least were </strong>in that triangle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JGaR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JGaR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JGaR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JGaR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!JGaR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JGaR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg" width="620" 
height="412" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/c6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:412,&quot;width&quot;:620,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;https://i1.wp.com/www.paddlinglight.com/pl/wp-content/uploads/2011/02/fix-example.jpg?fit=620%2C412&amp;ssl=1&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://i1.wp.com/www.paddlinglight.com/pl/wp-content/uploads/2011/02/fix-example.jpg?fit=620%2C412&amp;ssl=1" title="https://i1.wp.com/www.paddlinglight.com/pl/wp-content/uploads/2011/02/fix-example.jpg?fit=620%2C412&amp;ssl=1" srcset="https://substackcdn.com/image/fetch/$s_!JGaR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 424w, https://substackcdn.com/image/fetch/$s_!JGaR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 848w, https://substackcdn.com/image/fetch/$s_!JGaR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!JGaR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fc6cee5b7-07f7-4672-a557-5c2cab02fc75_620x412.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">3 lines are better than 2. You are possibly within that <a href="https://www.paddlinglight.com/articles/navigation-fixes-and-triangulation/">red triangle</a>. 
</figcaption></figure></div><p>One does this despite having a GPS, for reasons obvious to anyone who has ever relied on any highly available service: back up and validate.</p><p>And no, the numbers don&#8217;t match. Find peace therein.</p><p>The point is that within the context of &#8220;where am I?&#8221; there is always a certain uncertainty around how well all the data feeds are working. Quite literally in the rules of the ocean, you need to continuously consult &#8220;all available means&#8221; to assure yourself that you know where you are. Granted, <a href="https://ecolregs.com/index.php?option=com_k2&amp;view=item&amp;id=281:using-all-available-means-to-determine-if-risk-of-collision-exists&amp;lang=en">the rule is for avoiding a collision</a>, but the point is that you may not presume to trust a single data point from a single system.</p><h2><strong>The link to the data</strong></h2><p>&#8230;is still coming. First, another bit of history.</p><p><a href="https://en.wikipedia.org/wiki/Francis_Chichester#Aviator">Sir Francis Chichester</a> was another straightforward British navigator type with an entrepreneurial streak, having gone to New Zealand to set up a forestry startup with NZ Combinator. 
He later picked up an interest in flying, and went to the UK to buy a plane and fly it BACK to New Zealand over a few months -  a new Tesla the modern equivalent I presume.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!swAK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!swAK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 424w, https://substackcdn.com/image/fetch/$s_!swAK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 848w, https://substackcdn.com/image/fetch/$s_!swAK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 1272w, https://substackcdn.com/image/fetch/$s_!swAK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!swAK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png" width="397" height="287.15810276679844" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:183,&quot;width&quot;:253,&quot;resizeWidth&quot;:397,&quot;bytes&quot;:59288,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!swAK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 424w, https://substackcdn.com/image/fetch/$s_!swAK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 848w, https://substackcdn.com/image/fetch/$s_!swAK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 1272w, https://substackcdn.com/image/fetch/$s_!swAK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F94e15e2a-e505-4767-9b68-6273c9517b00_253x183.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption">The plane spent a fair bit of its time upside down, and this was not the most spectacular nor dangerous of the <a href="https://www.a-e-g.org.uk/sir-francis-chichester.html">calamities</a> he suffered in his 
plane.</figcaption></figure></div><p>Throughout this long and dangerous trip (England to New Zealand in 1929 in the above single-seater float plane), he encountered a few setbacks in terms of knowing where he was.</p><p>Once in NZ, he decided to tackle crossing the Tasman Sea, thought to be a bad idea at the time because navigation would be an issue. He needed to stop halfway across to refuel, and the tolerance for missing the refuelling stop at Lord Howe Island was near zero. If he wasn&#8217;t able to land there, <strong>he&#8217;d certainly be beyond rescue or recovery.</strong></p><p>He was suddenly outrunning the metrics layer of a previous generation (neat), so he needed to <a href="https://www.a-e-g.org.uk/sir-francis-chichester.html">develop his own</a>:</p><blockquote><p>The challenge of the Tasman remained and Chichester realised that he could reach Australia if he fitted [the plane] with floats to alight on the sea and refuel at Norfolk Island and Lord Howe Island.</p><p>The real problem now was to find these tiny spots in the sea. The only method of position-fixing available was to take sunshots with a sextant - not easy when you&#8217;re alone in the cramped vibrating cockpit - then laboriously work out the fix with pencil and paper.</p><p>He decided to use the principle of &#8216;off-course navigation&#8217; i.e. deliberately aiming for a point to one side of the island. When this point was reached there would be no doubt which way to make a 90&#186; turn for the final leg to the island. 
To reduce potential errors in calculation, Chichester worked out a series of examples based on his estimate of the sun&#8217;s position at the time of his expected arrival at critical points on his course.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c_iX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c_iX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 424w, https://substackcdn.com/image/fetch/$s_!c_iX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 848w, https://substackcdn.com/image/fetch/$s_!c_iX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 1272w, https://substackcdn.com/image/fetch/$s_!c_iX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c_iX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png" width="700" height="367" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:367,&quot;width&quot;:700,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:298199,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c_iX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 424w, https://substackcdn.com/image/fetch/$s_!c_iX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 848w, https://substackcdn.com/image/fetch/$s_!c_iX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 1272w, https://substackcdn.com/image/fetch/$s_!c_iX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0558efd2-2ff4-40a6-96b5-f90adb84168b_700x367.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Notice the dog-leg about a third of the way across (East to West) &#8212; <a href="https://www.a-e-g.org.uk/sir-francis-chichester.html">More pics</a></figcaption></figure></div><p>The primary issue was the speed at which he moved, but through a clever application of logic and a tolerance for approximations, he managed (only barely, going by his book) to make it across.</p><p>He then went on to become the Chief Data Officer for the Royal Air Force flying school (training WW2 pilots), a prolific seller of maps, as well as a solo sailor (which is where he became famous).</p><p>The point here is that he was moving too fast for existing methods. The dynamic was changing too quickly for existing technologies. SQL Server SSIS was no longer sufficient. 
It may have been sufficient if you were doing milk runs in a steamship from Bristol to Londonderry (or running a mid-tier bank with a CIO intent on minimal risk), but certainly not if you were trying to set a navigation record across the Tasman Sea.</p><h1><strong>The Data Navigator</strong></h1><p>This is the link: Data + Navigation</p><p>Today, the problem of longitude has been solved to centimetre accuracy through satellite navigation<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>. However, the reason for all the nautical-themed trivia is that these navigators demonstrate an excellent ability to manage, and thrive with, uncertainty.</p><p>Both Chichester and Cook knew the limitations of the BI tools of their time. They supplemented these with experience and intelligence to surmount what were pretty tough odds, and to get to wherever they felt they needed to arrive.</p><p>For clarity, most navigators do indeed play a supporting role to the Captain. The Navigator would use their combined technical skills, intuition and experience (collectively, wisdom) to give the Captain guidance on where and when. The Captain would then integrate all insights into their resource planning (a startup on the sea).</p><p>What is clear is that the Navigator didn&#8217;t rely on navigation aids as a crutch, but rather as a means to better outcomes, through a clear understanding of their weaknesses and an attempt at improving them.</p><p>So what is all of this leading to?</p><p><strong>First,</strong> the Data Navigator is a useful analogy for startups and their data abuse. </p><p>When exploring, there is a benefit to be had from pushing into the unknown, and new methods are often required.</p><p><strong>Second,</strong> bad data causes problems. 
Often the problem is not the data, but rather the navigator.</p><p><strong>Third,</strong> the navigator provides context which can be used as part of a higher purpose: alignment, motivation and maybe even consensus.</p><p>Three thoughts, three sections. Back to the sea, from whence we came.</p><h1><strong>1 - Exploration as a startup analogy</strong></h1><p>The navigation analogy applies well to a data team that is supporting the exploration of new worlds.</p><p>What the navigator does is supplement and confirm a mental model that someone has built about how the world works.</p><h3><strong>The map is not the territory</strong></h3><p>The idea maze maps nicely from historical exploration to startups: barely anything is known, and there is a commercial framework based on discovery leading to reward. A startup is entirely lost, by definition, in the <a href="https://spark-public.s3.amazonaws.com/startup/lecture_slides/lecture5-market-wireframing-design.pdf">idea maze</a>:</p><blockquote><p>A good founder is capable of anticipating which turns lead to treasure and which lead to certain death. A bad founder is just running to the entrance of (say) the &#8220;movies/music/filesharing/P2P&#8221; maze &#8230;&nbsp; without any sense for the history of the industry, the players in the maze, the casualties of the past, and the technologies that are likely to move walls and change assumptions.</p></blockquote><p>The navigator has a theory about where they are and where they are going. This information complements the plan.</p><p>The startup has a plan of action and looks for confirmation or invalidation.</p><p>A navigator needs to understand their science in addition to the history and future of their space. 
This becomes acutely more important when there is some form of race, as was typically the case with all historic navigation.</p><h3><strong>Innovation mixed in with luck and intuition</strong></h3><p>A great supplement to this is that the navigator doesn&#8217;t rely entirely on data. They use it to complement their intuition.</p><p>In a rapidly developing environment, the intuition delivered by a navigator gives an indication of which inputs can be trusted over others, what to prioritise and how to simplify the decision space.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a></p><p>I think the Data Navigator concept echoes the purpose of data within a risky environment, where there is upside to getting <em>it</em> right.</p><h1><strong>2 - When it goes wrong</strong></h1><p>There are many documented situations where among other failings, huge ships just run aground.</p><blockquote><p>He left her on autopilot, but strong currents overnight pushed the ship to the north and east and the chief officer altered her course towards the north. When Captain Rugiati awoke he saw that the Scilly Isles were unexpectedly off his port, not starboard bow<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a></p></blockquote><p>The issue arises when the Captain receives information from the navigator that undermines the model. That number doesn&#8217;t look right. That island doesn&#8217;t look right.</p><p>Twitter bots, Substack daily views<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>. These issues are easily overlooked until they blow up.</p><blockquote><p>It ain&#8217;t what you don&#8217;t know that gets you into trouble. It&#8217;s what you know for sure that just ain&#8217;t so. 
</p><p>&#8212; <a href="https://quoteinvestigator.com/2018/11/18/know-trouble/">?</a></p></blockquote><p>Dealing with these is the actual skill of the navigator. Being sure of what they know, and balancing this with what they don&#8217;t.</p><p>Navigators provide the best supplement to decision making: trading off speed against accuracy, relying on intuition, and overcoming ingrained preconceptions and bias.</p><h3><strong>Speed not haste</strong></h3><p>Accuracy trades off against speed almost directly.</p><p>Crucially, this needs to be seen in the appropriate context. Some data teams function to ensure accuracy. In that context, the Navigator mindset is likely misapplied, and it&#8217;s probably better to rely on an auditor/historian mindset<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-6" href="#footnote-6" target="_self">6</a>.</p><p>Startups require speed, due to competition. Accounting requires accuracy, due to compliance.</p><p>Speed and hurry should not be seen as excuses for bad data. As with navigation, so with data: there are pretty firm fundamentals that are easy to achieve, can be relied upon, and supplement the measurement and improvement of the basics.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-7" href="#footnote-7" target="_self">7</a></p><blockquote><p>"To err is human, but to persist [in error] is diabolical." </p><p>&#8212; <a href="https://en.wiktionary.org/wiki/errare_humanum_est">Latinum Sayinum</a></p></blockquote><p>As a navigation route starts to mature and grow in traffic, the balance shifts to favour accuracy, to drive efficiency. A similar dynamic applies when a startup becomes an established scale-up.</p><h3><strong>Hierarchy</strong></h3><p>It should be noted that the Nautical Navigator operates in a strictly hierarchical environment. 
There are no committees at sea; the Captain has the final word, typically for good reason. The shipping disaster above, though, shows how this can become an issue, as described in <a href="https://www.samuelthomasdavies.com/book-summaries/business/black-box-thinking/">Black Box Thinking</a>:</p><blockquote><p>When we are confronted with evidence that challenges our deeply held beliefs we are more likely to reframe the evidence than we are to alter our beliefs. We simply invent new reasons, new justifications, new explanations. Sometimes we ignore the evidence altogether.</p></blockquote><p>Many maritime disasters, when deconstructed, relate to power dynamic issues, indicating mental inflexibility around what was true. Running your ship into a reef or ignoring a warning about a storm are great examples of ignoring the data.</p><p>This is because new insights are disruptive and hard to integrate. People are happier with consensus than they are with uncertainty, even when the uncertainty might save them.</p><blockquote><p>Organisations think they want more insights and innovation.</p><p>They are deluding themselves.</p><p>Organisations suppress insights for reasons that are locked inside their corporate DNA: <strong>organisations prize predictability and they recoil from errors</strong>. 
</p><p>&#8212; <a href="https://www.wired.co.uk/article/gary-klein">Gary Klein</a></p></blockquote><p>Granted, the above line was in the context of innovation, not avoiding disaster, but the same &#8220;don&#8217;t rock the boat&#8221; mentality underlies both aversion to innovation and failure to alter course.</p><p>My anecdote - If you are on a UK train and someone is in your seat, and has the same seat booked, <strong>then</strong> <strong>at least</strong> <strong>one of you is on the wrong train</strong> - an uncomfortable insight.</p><p>The hardest thing for the navigator to overcome is an unwanted insight.</p><h1><strong>3 - Purpose</strong></h1><p>To bring things closer to the point, then: a data team ultimately brings a shared context to an organisation. <strong>We are here, and aiming there.</strong></p><p>As we know, the shared context is abstract, flexible, and transient. All the data does is add a foundation to the abstraction of reality we use to support and enhance our shared context. The data tends to be the one part that is consistent, reliable and infallible.</p><p>Underlying this is a seemingly heretical belief: <strong>a single source of truth is an abstraction</strong>.</p><p>A single source of truth is a fairytale; data teams help reconcile this untruth.</p><p>What it looks like is shared context, with firmly drawn lines, in pen, on paper, that indicate that the boundary of the territory is exactly HERE.</p><h3>Trails</h3><p>And so what the data team does is simplify, create some abstractions, and identify options and trails:</p><blockquote><p>Complete freedom is not what a trail offers. 
Quite the opposite; a trail is a tactful reduction of options.</p><p>&#8213; Robert Moor, <a href="https://www.goodreads.com/work/quotes/47328278">On Trails: An Exploration</a></p></blockquote><p>This is to simplify and create shorthands, narratives, and reasonable decisions.</p><p>However, the Data Navigator needs to keep this simplification in mind, always apply their experience and skill to avoid disaster, and ensure the drift from reality to map isn&#8217;t too severe:</p><p>The map maker, the surveyor, the compass reader, the ship&#8217;s crew, and everyone who had some part in the final pen-to-paper determination knows that there are many sources of uncertainty, each possibly stretching the truth in its own limited capacity, which may indeed lead to the pen-on-paper determination being slightly but disastrously off.</p><blockquote><p>&#8220;We had 424,000 daily active users yesterday.&#8221; The pessimetricist thinks &#8212; hopefully, he does not say this &#8212; &#8220;Actually, you had in excess of 424,000 HTTP requests from devices associated at least temporarily with unique user accounts registered in your internal systems over a 24-hour time period that survived a number of arbitrary assumptions in your data processing systems that passed muster six months ago but which haven&#8217;t been re-evaluated meaningfully since.&#8221; 
</p><p>&#8212; <a href="https://stkbailey.substack.com/p/beyond-one-and-zero">Stephen Bailey</a></p></blockquote><p>But pen to paper it is, and from that moment onwards, and until a better map is created, that is the single source of truth, the true story.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-8" href="#footnote-8" target="_self">8</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iAjd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iAjd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 424w, https://substackcdn.com/image/fetch/$s_!iAjd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 848w, https://substackcdn.com/image/fetch/$s_!iAjd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 1272w, https://substackcdn.com/image/fetch/$s_!iAjd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!iAjd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png" width="516" height="271" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:271,&quot;width&quot;:516,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iAjd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 424w, https://substackcdn.com/image/fetch/$s_!iAjd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 848w, https://substackcdn.com/image/fetch/$s_!iAjd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 1272w, https://substackcdn.com/image/fetch/$s_!iAjd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2018d8ac-c395-405a-ab5d-30268500b8a5_516x271.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a><figcaption class="image-caption">Substack doesn&#8217;t embed replies, <a href="https://twitter.com/imightbemary/status/1501059963036209152">so I took a screenshot</a></figcaption></figure></div><h3><strong>Certainty Matters</strong></h3><p>If collecting, storing, transforming, distributing, consuming and then sharing data each carry between a 0.1% and 1% chance of error, then that error accumulates, and in some cases is amplified. <strong>Apply a narrative and suddenly the truth could be anything.</strong></p><p>Is the data correct? What is correct? 
What is?</p><p>This doesn&#8217;t land well with people on the receiving end of data systems and old maps. Think of accountants doing financial reconciliation on data warehouses, or someone who has smashed their fibreglass boat into an <a href="https://www.notion.so/blog-Data-Navigators-Accurate-data-ce52ac5c4201401a91112b35f6eda130">uncharted granite rock</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-9" href="#footnote-9" target="_self">9</a>.</p><p>When a data team gives an indication of uncertainty, this causes thoughts along the lines of YOU CANNOT BE SAYING WHAT YOU ARE SAYING.</p><p>The map, still, is not the territory. It is part of building consensus and alignment.</p><h3><strong>In the same boat</strong></h3><p>Context isn&#8217;t enough; one needs consensus:</p><blockquote><p>Data professionals can <strong>build consensus</strong> as the company becomes more diverse. Data systems can <strong>establish methods</strong> for understanding the world even as it becomes more complex. [My emphasis]</p><p>&#8212; Stephen Bailey again! <a href="https://stkbailey.substack.com/p/perennial-truth-architectures?utm_campaign=myspace">Perennial Truth Architectures</a></p></blockquote><p>What that means is that, using context (meaning), the navigator and the team need to build from that shared context towards consensus (an opinion or position reached by a group as a whole).</p><p>Without consensus, we get stuck in the wrong quadrant of a 2x2 matrix, where autonomy and indirection lead to the night watch sailing the boat in one direction and then the day watch on handover reversing course and backtracking. Consensus is king.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-10" href="#footnote-10" target="_self">10</a></p><h1><strong>Last Words</strong></h1><p>To wrap things up, the story here is as follows:</p><ol><li><p>Help your team/company by navigating. 
Reconciling uncertainty is your special responsibility.</p></li><li><p>Hierarchy and inertia are cultural issues that flummox good intentions and amplify bad data. A purely technical orientation will only get the message so far, a voice and an opinion are necessary to succeed. </p></li><li><p>Build a shared context, and use it to reach consensus.</p></li></ol><blockquote><p>Navigation <strong>[Data]</strong> is easy. </p><p>If it wasn't, they wouldn't be able to teach it to Sailors <strong>[Business People]</strong>. </p><p>&#8212; <strong>James Lawrence</strong></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qb1R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Qb1R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 424w, https://substackcdn.com/image/fetch/$s_!Qb1R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 848w, https://substackcdn.com/image/fetch/$s_!Qb1R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 1272w, https://substackcdn.com/image/fetch/$s_!Qb1R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qb1R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png" width="976" height="549" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/b796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:549,&quot;width&quot;:976,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:985101,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
title="" srcset="https://substackcdn.com/image/fetch/$s_!Qb1R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 424w, https://substackcdn.com/image/fetch/$s_!Qb1R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 848w, https://substackcdn.com/image/fetch/$s_!Qb1R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 1272w, https://substackcdn.com/image/fetch/$s_!Qb1R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb796b558-2b1f-48fd-a099-7d383e7fea9b_976x549.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Cook Inc team building off-site &#8212; Australia </figcaption></figure></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>The best list of <a href="https://spiritofbuffalo.com/nautical-resources/nautical-phrases-and-terms/">nautical phrases</a>, some need a fact-check (Windfall)</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Accuracy is still not &#8220;solved&#8221; if you need sub-centimetre precision. 
Precision is probably another post or line of thinking</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Excellent podcast on why product teams need to take more risk, and a great perspective on data vs intuition <a href="https://www.thetwentyminutevc.com/grant-lafontaine/">20VC: Startups Fail Because They Do Not Take Enough Risk, Why A/B Testing is Inefficient and Slows You Down</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>So much detail around SS Torrey Canyon running aground: <a href="https://professionalmariner.com/torrey-canyon-alerted-the-world-to-the-dangers-that-lay-ahead/">summary details</a>, <a href="https://www.fedcourt.gov.au/digital-law-library/judges-speeches/justice-rares/rares-j-20171005">legal proceedings</a>, <a href="https://timharford.com/2019/02/lessons-from-the-wreck-of-the-torrey-canyon/">excellent podcast</a></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" target="_self">5</a><div class="footnote-content"><p>Post-posting Edit: <br>Benn <a href="https://benn.substack.com/p/when-we-get-it-wrong">released a blog not 6 hours</a> after I posted this (a reasonable time to be inspired and put penn to paper if you ask me), with a breakdown of <strong>&#8220;What do we do when we get it wrong?&#8221;</strong></p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-6" href="#footnote-anchor-6" class="footnote-number" contenteditable="false" target="_self">6</a><div class="footnote-content"><p>Once we go beyond generalists and into specialists, we start to see 
the need for all kinds of data archetypes. A few that I&#8217;ve plucked from the mind space, in order of likely usefulness from startup to enterprise:</p><ul><li><p>Data Navigators - less worried about the truth, more concerned about the objective</p></li><li><p>Data Plumbers - quality, speed, reliability</p></li><li><p>Data Journalists - truth-seeking, relentless, individual</p></li><li><p>Data Librarians - availability, discoverability, comprehensive</p></li><li><p>Citizen Navigators - business users who can navigate data without engineering skills</p></li></ul><p>I&#8217;d like to think further on how to matrix these against the <a href="https://twitter.com/swardley/status/1509478040174305282">Pioneer - Settle - Plan concept</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-7" href="#footnote-anchor-7" class="footnote-number" contenteditable="false" target="_self">7</a><div class="footnote-content"><p>I deliberately didn&#8217;t include considerations around mistakes, which, while they should be expected, <a href="https://seattledataguy.substack.com/p/data-horror-stories-what-could-possibly">are their own category</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-8" href="#footnote-anchor-8" class="footnote-number" contenteditable="false" target="_self">8</a><div class="footnote-content"><p>When asked if I would task the embedding of &#8220;data-stack analytics&#8221; into a web app, my first question is <em>how important is accuracy?</em> This is probably too inflammatory, but ultimately comes to an important point - it is much easier to constrain the possibilities of embedding analytics if the system is one coherent stack: same DB, same framework, same developer.</p><p>Introduce an entirely different stack, with different latencies, different processes, a DIFFERENT TEAM, and the chance of amplifying error increases, just like if you subcontract the printing of 
your maps to the lowest bidder and they distort the scaling inadvertently to get it to fit into their printer. </p><p>Substack had this issue, where they were double counting the readers of posts; presumably the root cause is related.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-9" href="#footnote-anchor-9" class="footnote-number" contenteditable="false" target="_self">9</a><div class="footnote-content"><p>Crazy <a href="https://www.youtube.com/watch?v=lmw7_DzM2JI">replay</a> of this boat running aground. To be fair to that rock, it&#8217;s actually an island!</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-10" href="#footnote-anchor-10" class="footnote-number" contenteditable="false" target="_self">10</a><div class="footnote-content"><p><a href="https://roundup.getdbt.com/p/two-types-of-power">dbt captures an important subtlety</a> on the road to consensus - <strong>power.</strong> As we&#8217;ve seen (in boats above, in varying political systems and in companies we&#8217;ve worked for), consensus can be reached in varying ways.</p><blockquote><p>[There are] two paths to growth in an organization that represent two different approaches to truth: the path of power, in which the word of God CEO comes down and slowly diverges via apostles organizational hierarchy; and the path of consensus, in which multiple humans converge on a truth based on shared principles</p></blockquote></div></div>]]></content:encoded></item><item><title><![CDATA[The future history of Data Engineering]]></title><description><![CDATA[On Data Engineers and their place in a Data SaaS world]]></description><link>https://groupby1.mattarderne.com/p/data-engineering</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/data-engineering</guid><dc:creator><![CDATA[Matt 
Arderne]]></dc:creator><pubDate>Thu, 06 Jan 2022 07:33:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yLBh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Trigger warning: this may trip your thought leadership nerve, but I&#8217;m mostly riffing, and I think it is additive. This post was easy to start, a pain to finish and lots of fun in the middle.</em></p><p><em>This is a narrative for the near future of Data Engineering in startups, and I think makes some interesting points. I do think the post has avenues for expansion, especially counter-arguments in the context of enterprise tech.&nbsp;</em></p><p><em>Hello to new subscribers, and a shoutout to anyone using the RSS feed.</em></p><p><em>Thanks to the <a href="https://locallyoptimistic.com/community/">LocallyOptimistic.com</a> community for some lively discussions on an early draft of this post.</em></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yLBh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yLBh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!yLBh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yLBh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yLBh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yLBh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg" width="1100" height="743" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/b2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:743,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yLBh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!yLBh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yLBh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yLBh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">credit: me</figcaption></figure></div><h2><strong>Intro</strong></h2><p>The core premise of this post is:</p><p>Most businesses' <strong>data engineering</strong> needs have been solved, or will shortly be solved, by managed services. Ten years ago the same needs would have required endless, extensive self-built ETL pipelines, databases and tools.</p><p>For the vast majority of businesses, this means they can and should focus on building capacity for business logic, analysis and predictions instead of data engineering. </p><p>The minority of businesses that need streaming services or low-latency batch data will push the boundary further, using specialist Data Engineers.</p><p>The implications are that while Data Engineering is growing rapidly, so too are the forces that will undermine the need for Data Engineers, and the current under-supply of competent engineers will lead to an over-supply of junior engineers (this should ring a bell for anyone who watched the Web-dev, then Full-stack, then Data Science boot camps).</p><p>Let us break down the premise further, as that is a massive generalisation and, to the reader of this niche corner of opinion, may seem counterintuitive, inflammatory and, frankly, stupid.&nbsp;</p><p>6 points, expanded into 6 sections. Let's go.</p><h3><strong>1. Majority</strong></h3><p>Keep in mind the context for the majority of businesses - technology is often an expensive misdirect when implemented badly. This is especially true when the technology is not directly aligned with their competitive advantage. Majority here means all businesses investing in technology, not just the typecast "blitz scale tech business". 
All businesses should take advantage of data tooling, or they will in effect be flying blind relative to their peers (or get an advantage over their peers if they get it right).</p><h3><strong>2. Data engineering</strong></h3><p>Plumbing of the data - ETL, data warehouse, streaming, batch, orchestration, infra etc. Niche skills that are hard to hire for.&nbsp;</p><p>Distinct from Software Engineering. Mostly not backend developers, more commonly generalists, occasionally highly skilled specialists.&nbsp;</p><p>Data Engineering is quite contentiously named, as is made clear later.</p><h3><strong>3. Business logic</strong></h3><p>This is covered elsewhere - but in the most abstract terms - businesses should hope that their engineers&#8217; primary focus is on improving the ability to represent the business&#8217;s various states, enhancing the ability to interact with and modify these states, and even predicting the outcomes of modifications.</p><p>Businesses should strive not to have people worrying about managing infrastructure, plumbing, ops etc over and above what is strictly necessary. Playing on the margin of this point is what the CTO does.&nbsp;</p><h3><strong>4. Managed services</strong></h3><p>Think about Sysadmins of the mid-2000s: arcane knowledge that is now redundant in almost every business, due to AWS, then Heroku, now Vercel, Supabase etc flying up the stack. (Or Hadoop specialists. Big Data DBA anyone?)</p><p>Same with Data Engineering. <strong>Tech abstraction as a service.</strong> Managed Services are arriving fast with the likes of Snowflake, Fivetran and the commodifying follow-ons. They&#8217;re aggressively <a href="https://benn.substack.com/p/data-and-the-almighty-dollar">chasing down the almighty dollar,</a> undercutting margins and offering better cost structures, amid a flurry of bundling, mergers and consolidation.</p><h3><strong>5. 
The minority</strong></h3><p>Many businesses will still have an exceedingly strong need to increase their advantage through data engineering. Take High-Frequency Trading as an example. These businesses will progress the field, and the best data specialists will be needed in those spaces.</p><h3><strong>6. Implications</strong></h3><p>This one is clear: don&#8217;t get caught on the wrong side of any sea change. I would (do) argue that the ETL engineer skill-set is mostly going to be marginalised until :skull:.</p><p>In much the same way that the market demand for boot-camp Data Scientists is low, due in part to oversupply, better tooling and a reorientation around the expectations of a Data Scientist, so too do I propose that Data Engineering demand dynamics will change.&nbsp;</p><p>I'd like to hope that the rate of ETL code being written is in decline because most can rely on managed services or open-source ELT extractors.&nbsp;</p><p>This point gets some pushback, discussed further in part 6.</p><p>But more generally, here is something about changes in the tide:</p><p>When the tide turns, there is a definite moment when the tide has indeed turned, but that change in direction becomes apparent to different boats at different times. This depends on context, location, keel depth and distance from both the equator and the moon (not to mention the sun). The gravitational pull has changed, but the water doesn&#8217;t start moving everywhere at the same time.&nbsp;</p><p>As became clear through writing it and quoting sources, this blog also agrees with a certain viewpoint on specialisation; the link should become obvious.</p><p>So, that is the intro. I&#8217;m going to stick with those 6 sections so that coherence abounds, and explain and expand the points in finer detail.&nbsp;</p><h1><strong>1. Majority&nbsp;</strong></h1><p><strong>Nearly every company needs a data person</strong>. 
Any company that has ambitions to beat market returns on its investors&#8217; capital and doesn't have someone in a broadly data-dedicated role will certainly struggle to compete.</p><p>10-15 years ago an easy indication that a company wasn't keeping up was having no IT person; the modern equivalent is having no Data Person.&nbsp;</p><p><strong>But the premise is that that person no longer needs to be a Data Engineer.</strong></p><p>Consider the sysadmin/DBA-type roles, which for 99.99% of businesses do not exist, because cloud providers hire those people and abstract their role into a service.</p><p>The thrust is that Data Engineering could go the way of the DBA. Niche, specialised.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4qXy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4qXy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 424w, https://substackcdn.com/image/fetch/$s_!4qXy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 848w, https://substackcdn.com/image/fetch/$s_!4qXy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4qXy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4qXy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png" width="1100" height="578" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/fb927377-6fba-430b-a338-2797b8342190_1100x578.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:578,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4qXy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 424w, https://substackcdn.com/image/fetch/$s_!4qXy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 848w, https://substackcdn.com/image/fetch/$s_!4qXy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4qXy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb927377-6fba-430b-a338-2797b8342190_1100x578.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Hard work to find a combination of roles, location and time frame where applicable roles are on the same scale AND cross over.</figcaption></figure></div><h3><strong>Who/What is the data person then?</strong></h3><p>Data engineering in the ETL/ELT sense has historically been complex, difficult, emergent, at times 
chaotic, and required niche software engineering skills.</p><p>Now, Extract and Load is solved for most businesses using generic SaaS tools. If you use the standard set of CRM, HR, Finance, and Ops tools, 80% of your ELT work is done for you at a standard, predictable price.&nbsp;</p><p>Commoditised EL SaaS is ubiquitous, with the <em>second wave</em> (of EL providers) offering better services at more favourable terms than Fivetran, with <em>multiple variants and mutations.</em></p><p><strong>T for transform, with general best practices courtesy of dbt, is where the bulk of the analytics work lies. Critically, this is where the data person should start.&nbsp;</strong></p><p>They should be, to some degree, what is known as an Analytics Engineer, but possibly more usefully, <strong>not a specialist Data Engineer </strong>(nor a Data Scientist for that matter, but that bridge has been crossed). </p><p><strong>They should be a <a href="https://blog.getdbt.com/we-the-purple-people/">purple data generalist</a>:</strong></p><blockquote><p>The data world needs more purple people &#8212; generalists who can navigate both the business context and the modern data stack. Let's put aside skillset dichotomies, and learn to feel comfortable in the space between.</p></blockquote><p>If you do need a Data Engineer, probably for some or other niche API that isn&#8217;t supported by your EL tool, then this is great work to outsource! (My day-to-day work: supporting companies on their fringe data engineering needs when their internal team wants some extra capacity or capability.)</p><p><strong>But the first full-time data hire needs to be obsessed with business impact.</strong></p><h1><strong>2. Data Engineering</strong></h1><p><strong>A tale of two types of Data Engineers: </strong>Again with the generalisations! 
In my view and from what I've seen in the job market, there are two types of data engineers at the moment:</p><h3><strong>(1) Data Engineers: Software engineers, Data</strong></h3><p><strong>Described as</strong>: Software engineering specialists, with data as the core specialisation, who can focus on the niche areas of data engineering and can work with complex real-time data systems.</p><p><strong>Needed When</strong>: Only required in tech businesses, and only when software engineers cannot assist. This is not needed for 99% of businesses and these candidates know what they want to work on and have the agency to decide.</p><p><strong>Characteristics:</strong></p><ul><li><p>Tools-oriented</p></li><li><p>Computer scientists / Very good software engineers</p></li><li><p>Driven by curiosity</p></li><li><p>Driven towards perfection of the craft</p></li><li><p>Want the solutions to be elegant, optimised</p></li><li><p>A specialised role for a specialised business problem</p></li></ul><p><strong>Currently and into the future</strong> hired to do the following (quote from a slack group from someone who may or may not want the shoutout):</p><blockquote><p>When building out some data-focused applications, like, say, a streaming data enrichment layer that serves up some curated data real-time to other micro-services, we need software engineers and data engineers. Occasionally you&#8217;ll find unicorns who can do it all (we have a few of them), but the vast majority of software engineers aren&#8217;t experienced enough with data to also be able to solve complex, big-data, non-SQL problems as well as someone more specialised could.</p></blockquote><h3><strong>(2) Data Engineers: Solutions oriented engineers, Data</strong></h3><p><strong>Described as</strong>: Business optimisers. Data engineers that engineer data because it is the biggest blocker in the optimisation of a bigger picture issue, namely <strong>analytics</strong> as it relates to business improvement efforts. 
I love this post from <a href="https://erikbern.com/2021/07/07/the-data-team-a-short-story.html">erikbern.com</a>:</p><blockquote><p>You work with the recruiting team to define a profile for a generalist data role, that emphasizes core software skills, but with a generalist attitude and a deep empathy for business needs. For now, you remove all the mentions of artificial intelligence and machine learning from the job posting.</p></blockquote><p><strong>Needed when:</strong> Data extraction and centralisation is identified as the key issue in a long line of issues: the primary bottleneck in the optimisation process.</p><p><strong>Characteristics:</strong></p><ul><li><p>Goals-oriented</p></li><li><p>Background in an adjacent engineering field</p></li><li><p>Driven by optimisation, the ultimate goal</p></li><li><p>Utilitarian problem solvers, relied upon to get the job done</p></li><li><p>Functionally broader skill set, maybe even new to the domain, and not (yet) experts in technology</p></li></ul><p><strong>Historically</strong> hired to <a href="https://www.getdbt.com/what-is-analytics-engineering/">do the following:</a></p><blockquote><p>If you were on a &#8220;traditional data team&#8221; pre 2012, your first data hire was probably a data engineer. 
You needed this person to build your infrastructure: extract data from the Postgres database and SaaS tools that ran your business, transform that data, and then load it into your data warehouse.</p></blockquote><p><strong>Currently</strong> hired to:</p><p>Build data warehouse, pipelines, dimensional modelling, deploy analytics tools, string it together, but critically, to drive change in a business.</p><p>In short - Type 2 wants the solution to be cheaper, easier, faster, best fit, 80/20, is less intrinsically interested in the how and more interested in the impact on outcomes.&nbsp;</p><p>In my opinion, this is basically now Analytics Engineers, and if you disagree with my take on this concept, speak to an Analytics Engineer who had the title Data Engineer, and ask them if they can relate. Similar experience to those <a href="https://jasnonaz.medium.com/data-scientist-or-analytics-engineer-how-i-made-the-decision-that-defined-my-career-1646d4296467">Data Scientists who preferred Analytics Engineering</a>.</p><p><strong>Another way to think about these distinctions is (</strong><a href="https://erikbern.com/2021/07/23/what-is-the-right-level-of-specialization.html">erikbern.com</a><strong> </strong>again<strong>):</strong></p><blockquote><p>I often think of people as (and this is an unfair crude generalization etc) roughly on a spectrum between tools-oriented and goal-oriented.</p></blockquote><p><strong>Memed as:</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CnkK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!CnkK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 424w, https://substackcdn.com/image/fetch/$s_!CnkK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 848w, https://substackcdn.com/image/fetch/$s_!CnkK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 1272w, https://substackcdn.com/image/fetch/$s_!CnkK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CnkK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png" width="525" height="499" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/d25e30e5-68c4-440c-ae38-b120344a57df_525x499.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:499,&quot;width&quot;:525,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:718461,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!CnkK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 424w, https://substackcdn.com/image/fetch/$s_!CnkK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 848w, https://substackcdn.com/image/fetch/$s_!CnkK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 1272w, https://substackcdn.com/image/fetch/$s_!CnkK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fd25e30e5-68c4-440c-ae38-b120344a57df_525x499.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Based on those holding the blades, this is a Type 2 DE</figcaption></figure></div><h1><strong>3. Business Logic</strong></h1><h3><strong>Engineers as optimisation specialists</strong></h3><p>My background is in industrial engineering, which is broadly a stats'y engineering field incubated in the optimisation of systems (typically factories).</p><p>Layouts, flows, bottlenecks, JIT, supply chain etc. A solved field in many regards (shout out to the bullwhip effect &#128667; &#128666; &#128667;).</p><p>The broad optimisation process for <strong>most of those businesses</strong> looks something like:</p><ol><li><p>Collection of SaaS and ERP-like systems to track and account for things</p></li><li><p>Data engineering to extract the various states</p></li><li><p>Analytics on the states</p></li><li><p>Decisions to change the states</p></li><li><p>Track decisions in ERP (i.e. repeat)</p></li></ol><p>When I left engineering school, ERP implementation was where the demand was, and large numbers of engineers ended up implementing/consulting/suggesting various guises of an ERP / CRM / database / app / spreadsheet / chalkboard.</p><p>However, it became clear to me (with hindsight) that this was quickly becoming a commodity technology and skillset <strong>(i.e. 
outsource to contractors)</strong>, and that Data Engineering was the real skill bottleneck.</p><p>Businesses were amassing large data sets but struggling to access them, let alone analyse them, and so having lucked into a DE role, I made this transition.</p><h3><strong>Optimising down the optimisation list</strong></h3><p>Data Engineering is no longer the bottleneck! This is a huge relief, because Data Engineering is not optimising; it is just a necessary lift and shift. It is purely an operational burden brought about by decisions made with siloed data as the tradeoff.&nbsp;</p><p>Now surely <strong>3. Analytics on the states </strong>is the biggest hurdle and opportunity.&nbsp;</p><p>Analytics is currently a headache that requires significant investment, and it is where I suggest the investment be made. There is much more value in time spent on the building of &#8220;Business Logic&#8221;. In this case, Analytics.&nbsp;</p><p>Analytics extends far beyond data modelling and analysis, encompassing business processes, people processes, management and communication.&nbsp;</p><p>Analytics also pushes back into software engineering, system design and overall value chain analysis.&nbsp;</p><h1><strong>4. Managed services</strong></h1><h3><strong>The Data engineering "type (2)" makeover</strong></h3><p>My day to day is where I form this opinion. I have done much less data engineering, in the sense of Data Warehouse fine-tuning and ELT troubleshooting, since the tools became so much easier. I do a lot more analytics and a lot more modelling. The problem has moved onwards, up-system. The old bottleneck has largely been removed and solved. Optimised.</p><p>To quantify this stance, consider why there is a tsunami of new spins on data products: metric-stores, reverse ETL, metadata, discovery, quality, etc. 
<a href="https://benn.substack.com/p/the-data-os">Great data from Benn Stancil:</a></p><blockquote><p>In 2017, Y Combinator&#8212;an incubator of both startups and the Silicon Valley zeitgeist&#8212;funded 15 analytics, data engineering, and AI and ML companies. <strong>In 2021, they funded</strong> <strong>100</strong> (my emphasis)</p></blockquote><p>These are viable partly because the EL bottleneck was eased, the storage got cheaper and <a href="https://www.getdbt.com/">dbt</a> made the whole thing more manageable.</p><p>Suddenly the problem wasn't getting the data, <strong>it was using the data.</strong></p><p>Using the data was typically the domain of the elite. Airbnb, LinkedIn etc have needed a data catalog for nearly a decade because they had the engineering clout to make it necessary.</p><p>The sudden simplification of this process has meant that the next, hitherto unknown bottleneck gets suddenly bashed into, and there is <a href="https://duckduckgo.com/?t=ffab&amp;q=snowflake+ipo&amp;ia=web">immense</a> value to be gained by unlocking it.</p><p><strong>Build it, will they come?</strong></p><p>If offered, many businesses will jump at a SaaS subscription rather than spending that money on hiring/expanding an engineering team.</p><blockquote><p>The term engineering is derived from the Latin ingenium, meaning "cleverness" and ingeniare, meaning "to contrive, devise" <a href="https://en.wikipedia.org/wiki/Engineering">wiki</a></p></blockquote><p>When the data is easy to centralise, combine and analyse, engineers won't be needed to <strong>devise and contrive</strong> data-combining solutions.</p><p>They can go and contrive and devise something else that is complex and that gives the company a competitive advantage.</p><p>Eventually, analytics engineering could face the same turn of the tide. 
That happens when the tooling gets so good that the <strong>team is composed entirely of analysts and product people, with no contriving engineers.</strong></p><p>In the same way that structural engineers are only required when building on quicksand, data engineers are only required when building upon a data swamp. As the tooling gets better, the foundations stabilise.</p><h3><strong>On the margin</strong></h3><p>The companies I advise and work with often have much less need for Data Engineering at the outset.</p><p>However, to clarify one point: when they do need Data Engineering, it is a requirement for specialisation. There is indeed more Data Engineering to be done, but this is increasingly specialised (this is a semi-deliberate contradiction to this entire post that I am OK with).</p><p>The companies need help with the edge-case, marginally viable solution, where something emerges, crucial to them, that falls through the cracks of the 80/20 SaaS solutions. The point is that these needs come later down the line: not at the outset of a data project, but once the bulk of the crucial, impactful elements are working and generalist data practitioners have exhausted their options.</p><p><strong>A caveat: the assemblage of the appropriate tools in the appropriate order to match business needs and maturity is a tricky problem indeed. Probably something that would benefit from the skills of a Data Engineer. More on that in section 6.</strong></p><h1><strong>5. Minority</strong></h1><h3><strong>Data ENGINEERING isn't going anywhere.</strong></h3><p>I recently discussed this with someone from a quant hedge fund, and while they had a computer science background, they were "data" + "engineering" to a profound degree. They needed a real-time (real time real-time) data feed from all of the brokers, with extensive transformation across all of them. 
They also needed multiple decision systems integrated with predictive models, reliably sending orders back into that system in near real-time.</p><p>This system literally was the business. Complex, differentiating. Building this was one of maybe two things that the company needed to execute to beat the competition. </p><p><strong>Data engineering in certain contexts is necessary, but likely to be a specialisation increasingly of interest to the minority.</strong></p><p>The above point alone isn't that contentious. </p><p>What is contentious is the WHEN. </p><p>Has the tide turned, or is it still rising? Who is seeing the signs and who is missing them? Who is seeing evidence where there is none? </p><h1><strong>6. Implications and Evidence from the field</strong></h1><p>Hello?</p><p>2 more minutes, less hand waving I promise.</p><h3><strong>Implications for Engineers</strong></h3><p>This entire post makes the same point as this specialisation bombshell:</p><p><a href="https://erikbern.com/2021/07/23/what-is-the-right-level-of-specialization.html">What is the right level of specialization? For data teams and anyone else.</a></p><blockquote><p>It seems fair that, if tools didn't require so much knowledge to use (I'm looking at you, Kubernetes), then on the margin, the need for specialisation would be less.</p></blockquote><p>The extension of this point is that because the data engineering toolset got so much better, the specialisation required is now less. 
Snowflake and BigQuery users agree.&nbsp;</p><p>The implication for engineers whose work is now easier is the following:</p><p><strong>Either you move in the direction of the new business problem.</strong></p><p><strong>Or you move to a new business that still has the old problem.&nbsp;</strong></p><p><strong>Or you specialise further until you find another domain to play in, and wait for the tide to turn again.</strong></p><p>Erik's blog above makes another point, which made me realise this is a mostly &#8220;deeply inspired&#8221; notion, so much so that I've used the <strong>tools/goals oriented</strong> concept in an earlier section.</p><blockquote><p>I often think of people as (and this is an unfair crude generalisation etc) roughly on a spectrum between tools-oriented and goal-oriented. Some people have their favourite tools, and that's what they like to use. They make their whole career about honing a craft with those skills. Other people are more entrepreneurial, and don't care about what tools they use: they care about the ultimate goal.</p></blockquote><p>This topic was quite contentious on Twitter. People made some very stern remarks about specialisation when Erik posted it initially, and I guess I'm not surprised. 
People are very likely going to fight against any concept that undermines their career domain.</p><p>However, this contentiousness further highlights the opportunity:</p><p><strong>Contrarian ideas, when right, are "the valuable thing" from the Taleb and Zero to One books:</strong></p><blockquote><p>&#8220;What important truth do very few people agree with you on?&#8221;</p></blockquote><p>Should this point be right, it will be proven right by (another) tool that reduces the need for specialisation and sells for ${LOTS} because it enables achieving Goal X (data-driven-whatnot) without hiring a team of 100 ludicrously demanding human specialists with endless needs.</p><p>There are a few arguments against this, including that the <strong>startup ELT paradigm</strong> is a minority and that data engineering work is firmly entrenched in the structures of larger businesses, especially enterprises. The refinement I think is worth making is that, while this may be true, the hope is that it will become less so. Like the shift to the cloud, I would hope that what we describe as ELT now leads to us finding a better way of doing things, whatever it may end up being, that is as transformational for Data Teams as cloud computing was for Software Teams. (It is noteworthy that &#8220;hybrid-cloud&#8221; has proven so popular with enterprises.)</p><p>And a pushback to this: enterprises aren&#8217;t most businesses. Most businesses don&#8217;t have a large tech team, and most businesses didn&#8217;t exist a decade ago. However, most Data Engineers are not employed by most businesses, hence this disconnect. <strong>Most Data Engineers would disagree with this premise, but the point is that most businesses won&#8217;t need a Data Engineer</strong>.&nbsp;</p><p>Looking at history, this has happened before: take a look at <a href="https://www.youtube.com/watch?v=gmFhOJhJ_aI">Data Science as a field</a>, maybe due for a renaissance in the guise of ML. 
&#8220;Data Science&#8221; was a crutch for companies not knowing what to do with their data.&nbsp;</p><h3><strong>Implications for Businesses</strong></h3><p>The message from the communities and my experience is clear - Data Engineering as it once was is generally less of a challenge - but building a coherent &#8220;data platform&#8221; remains a chore.&nbsp;</p><p>What is possibly the most complex part of &#8220;Data&#8221;, and what businesses really need help with, is what I suppose is quite fairly called <strong>Data Platform Engineering:</strong></p><ul><li><p>An EL tool can start costing inordinate amounts relative to the value gained.&nbsp;</p></li><li><p>X tool sunsetting Y feature&nbsp;</p></li><li><p>Adding a new business tool with an unsupported API that needs a Singer tap built. This work typically is open-sourced, so eventually there will be fewer tools needing Singer taps (pray)</p></li><li><p>Airflow proving to be a headache. According to Slack, 90% of Airflow users are using managed services, so less specialisation in Airflow will be needed (pray pray)</p></li></ul><p>As an example of this, I&#8217;ve recently consulted on the best way to ELT some data from a few API sources unsupported by Fivetran, as well as Stitch/Airbyte. The decision complexity is quite high:</p><ol><li><p>Is an orchestration tool such as Airflow/Prefect needed yet, and if so, which one?</p><ul><li><p>If Airflow, then the AWS instance, the Astronomer version, or self-host?</p></li><li><p>Do we try at the outset to use Kubernetes? Is Airflow stable yet? 
It still feels overcomplicated.</p></li><li><p>If Prefect, will they as a new entrant be more reliable or still have teething issues?</p></li><li><p>What level of CI/CD for the tools?&nbsp;</p></li><li><p>Would they benefit from Terraform?</p></li></ul></li><li><p>Meltano, Airbyte or Singer extractor/tap spec?</p><ul><li><p>Meltano [1] seems to be making excellent progress, but requires some minor hosting effort, and also requires an orchestrator.&nbsp;</p></li><li><p>Airbyte seems to (seems to) be making more of a commitment to quality.</p></li><li><p>Both are wrestling with how to incentivise community maintainers.</p></li></ul></li></ol><p>This is just one &#8220;component&#8221; of the team&#8217;s ELT, not even the full picture of their Data Platform, and it is a subtly complex and consuming decision for those familiar.&nbsp;</p><p>A great way to frame this is the quasi-architectural, DataOps-flavoured, generalist Data Guru role of the <strong>Data Platform Engineer (DPE):</strong></p><blockquote><p>DPE are thinking about what data exists, who should have which access, how to make it available for usage by people and tools, how to make it redundant (disaster recovery), how to enable discovery (catalog), etc&nbsp;&nbsp;&nbsp;</p></blockquote><p>Or another spin:</p><blockquote><p>DPE just means that you are the Tech Lead of the Analytics Engineering.</p></blockquote><p>While I don&#8217;t necessarily care for the DPE term over DE, I do think DPE aptly captures the key work that many Data Engineers now do, combining and ensuring cooperation between competing tools to build a coherent consumable data platform.&nbsp;</p><p>More than anything, the developer experience for most of the necessary Data Platform tools is just garbage. Airflow is a nightmare, GCP is really a frightening pain, and AWS is just so much worse. 
The correct abstraction over all of this is a huge opportunity and the thing that the DPE needs to keep an eye on.&nbsp;&nbsp;</p><p><em>[1] Worth your time to have a look at the <a href="http://sdk.meltano.com/">Meltano SDK</a> if you need to build an API extractor. Great team, developer experience and ambition. If you are a Data Engineer (either type), these open source projects are possibly the best intersection of your skills, interests and market demand. I set up the most lightweight way to run a Meltano <a href="https://github.com/mattarderne/meltano-batch">ELT on AWS, using Terraform</a>, and could use a review!</em></p><h1><strong>Closing</strong></h1><p>In closing, I broadly see the below chart as usefully inflammatory and marginally useful.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dmgz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dmgz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 424w, https://substackcdn.com/image/fetch/$s_!dmgz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 848w, https://substackcdn.com/image/fetch/$s_!dmgz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dmgz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dmgz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png" width="1100" height="727" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:727,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dmgz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 424w, https://substackcdn.com/image/fetch/$s_!dmgz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 848w, https://substackcdn.com/image/fetch/$s_!dmgz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dmgz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F06deac28-57d0-4b4d-82e3-e86938286e84_1100x727.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">I think this might actually be incorrect. 
The DE should probably be hitting the Plateau if this post has any merit.</figcaption></figure></div><p>As Data Science gave way to Data Engineering enthusiasm, I'll say that Data Engineering enthusiasm possibly will have to give way to Analytics, currently called Analytics Engineering.</p><p>Following this will be the traditional Data Analyst role, in whatever new guise, which will make some resurgence.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://twitter.com/imrobertyi/status/1446144033747832839&quot;,&quot;full_text&quot;:&quot;Analytics is not\n- Writing SQL\n- Writing Python\n- Making dashboards\n- Doing research\n- Making pipelines\n\nAnalytics is driving impact using data.\n\n[caveat: in the corporate world]&quot;,&quot;username&quot;:&quot;imrobertyi&quot;,&quot;name&quot;:&quot;Robert Yi &#128051;&quot;,&quot;profile_image_url&quot;:&quot;&quot;,&quot;date&quot;:&quot;Thu Oct 07 16:03:00 +0000 2021&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:0,&quot;retweet_count&quot;:49,&quot;like_count&quot;:214,&quot;impression_count&quot;:0,&quot;expanded_url&quot;:{},&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>However, the core Data Engineering skill-set, technological awareness and systems thinking, will remain vitally important, but perhaps not in the historical and existing notion of a Data Engineer.&nbsp;</p><h1><strong>Appendix</strong></h1><p>Questions to ponder, hit the comments if you have some thoughts:</p><ol><li><p>Will data science re-emerge now that the data wrangling tooling is getting so much better? What will this do to the hierarchy of data science? 
Maybe <a href="https://twitter.com/AmplifyPartners/status/1468327066873565189">ML Engineer is a better candidate</a>.</p></li><li><p>Where does MLOps sit? It has largely felt disconnected from the &#8220;Modern Data Stack&#8221;.</p></li><li><p>The enterprise dynamic is entirely different. Enterprise companies will need ETL engineers until the heat death of the sun, &#822;a&#822;n&#822;d&#822; &#822;n&#822;o&#822; &#822;I&#822; &#822;d&#822;o&#822;n&#822;&#8217;&#822;t&#822; &#822;w&#822;a&#822;n&#822;t&#822; &#822;t&#822;o&#822; &#822;h&#822;e&#822;a&#822;r&#822; &#822;a&#822;b&#822;o&#822;u&#822;t&#822; &#822;i&#822;t&#822;.&#822;  and I&#8217;m somewhat curious how that plays out.</p></li><li><p>Is training Data Engineers a lost cause, along with training Data Scientists and Front-end devs?</p></li><li><p>Remember that 92% of startups disappear, but while we are stealing fun from tomorrow, we can satisfy ourselves knowing that someone will get it right, but for someone to be right someone else must be wrong.</p></li></ol><div><hr></div><p><em>Please consider subscribing for more on the subject of data systems thinking</em></p><p><em>What is <a href="https://groupby1.substack.com/about">group by 1</a></em></p><p><em>Who is <a href="https://substack.com/profile/10635483-matt">Matt Arderne</a></em></p>]]></content:encoded></item><item><title><![CDATA[Getting into Data]]></title><description><![CDATA[Rapid fire thoughts on transitioning from a technical role into "data"]]></description><link>https://groupby1.mattarderne.com/p/getting-into-data</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/getting-into-data</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Mon, 09 Aug 2021 16:11:17 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!NN8n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>I&#8217;ve been asked frequently enough about making a transition into the Data Analytics space, aka my day job, that I thought it would be useful to combine my thoughts into a coherent post. This is a quick take on who this industry/job/role suits, what the skills required typically&nbsp;look like, and some background info on the industry as a whole. Note that this is oriented towards the Data Analyst / Analytics Engineer. Also note this post is mostly links.</em></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NN8n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NN8n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 424w, https://substackcdn.com/image/fetch/$s_!NN8n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!NN8n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NN8n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NN8n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1279409,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NN8n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!NN8n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 848w, https://substackcdn.com/image/fetch/$s_!NN8n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!NN8n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F198998f2-5862-4f45-8838-1c50b1128820_5184x3456.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">credit <a href="https://unsplash.com/@emilymorter?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Emily Morter</a></figcaption></figure></div><p>The barrier to entry to working as a data analyst is reducing rapidly. If you are a combination of curious, technically astute, outcomes-oriented, driven, observant, people-oriented and a natural leader then the technology should not be a hindrance, as it is quite literally being made easier every single day.</p><p>Consider this as a primer, and <strong>reach out to me directly</strong> if you think you&#8217;d like to work in this space, especially if you are motivated and have <em>any</em> data experience. </p><h2><strong>Is it a good fit for me/you?</strong></h2><ol><li><p>It is a great fit for people with good soft skills, strong intuition for business operations and a desire to make a difference in how a company operates. Are you interested in finding out the facts about what happened in the past and suggesting changes as to how they should operate in the future? To tell the CEO this? To back the decision and measure the changes? To possibly have been wrong? Good things to like the sound of.</p></li><li><p>There is Very Strong Demand in the startup space for these kinds of roles, so if you can convince the right person that you have what it takes then it is entirely possible to transition into this career, without any formal qualification. 
Typically, good transitions are from engineering and technical operations in fast-moving businesses.</p></li><li><p>With that goes the obvious implication that the current high demand may not be sustainable in the long term.&nbsp;</p></li><li><p>The term Data Scientist has largely been merged into the analytics roles, as pure data science is very niche, and often misapplied. Typically companies need more data analytics than they do data science unless their core product is somehow related to data science, and even if they say it is, in reality it often very much is not.</p></li><li><p>I prefer working in, or as a provider to, startups and scaleups, as this environment is more dynamic. Larger organisations and enterprises have analytics functions, but they often have more functionally or departmentally specialised roles, which means less exposure to the business as a whole. As you become more senior, larger companies will provide a great growth opportunity.&nbsp;</p></li></ol><h2><strong>Data Analyst &amp; Analytics Engineer stuff: </strong></h2><p><em>Necessary tech skills. </em></p><ol><li><p>The Data Analyst / Data Engineer concepts have been pushed together into a single role, <a href="https://www.getdbt.com/what-is-analytics-engineering/">Analytics Engineer</a>, which broadly makes sense, and can be considered a role related to the Data Scientist. I explained this to some extent visually in <a href="https://twitter.com/rdrn_/status/1314115799951515649">this tweet</a>. Further, <a href="https://www.holistics.io/blog/what-we-know-and-dont-know-about-analytics-engineering/">this describes</a> in a bit more detail what we do and don&#8217;t know about analytics engineering.</p></li><li><p>These roles are generally distinct from or work alongside Business Analyst and Product Manager roles. 
</p></li><li><p>If you can do some part-time education, <a href="https://analyticsengineers.club/">this short course</a> from one of the ex-dbt employees looks worth considering. It is brand new, but they intend to teach the exact thing I usually look to hire for.</p></li><li><p>So, onto <em>The Analytics Engineer skills/tools to know</em></p><p><em><strong>I&#8217;d give 60% focus on SQL, with maybe 10% each for Git, CLI, Python, BI tools.</strong></em></p><ol><li><p>SQL</p><ol><li><p>Analytical queries <a href="https://mode.com/sql-tutorial/">like this tutorial</a>, especially the &#8220;advanced&#8221; section, also this <a href="https://popsql.com/sql-templates">tutorial</a></p></li><li><p><a href="https://mode.com/blog/use-common-table-expressions-to-keep-your-sql-clean/">CTEs</a></p></li><li><p>Window functions</p></li><li><p>Focus on consistency, neatness, style</p></li></ol></li><li><p>Git basics</p><ol><li><p>Pull, branch, commit, merge</p></li><li><p>Most people don&#8217;t need to know much more</p></li></ol></li><li><p>Command-line basics</p><ol><li><p>Navigation, creating, deleting, moving</p></li></ol></li><li><p>Python</p><ol><li><p>Python is a huge domain, super useful but harder to get to grips with, and also less useful unless you are in a specific niche that calls for it</p></li><li><p>Jupyter notebooks</p></li><li><p>Basic pandas for data manipulation</p></li><li><p><a href="https://realpython.com/defining-your-own-python-function/#the-importance-of-python-functions">Functions</a></p></li><li><p>Virtual environments basics</p></li></ol></li><li><p>BI Tools</p><ol><li><p>BI tools are often prohibitively expensive, so the experience is often limited to one, and often the wrong one or an old one.</p></li><li><p><a href="https://www.metabase.com/">Metabase</a> is the de facto open source quick and simple option; it functions in a reasonably useful way. 
It is worth downloading and testing out.&nbsp;</p></li><li><p>Another shoutout to <a href="https://www.lightdash.com/">Lightdash</a>, a very simple BI tool that runs with dbt, and is familiar to users of Looker.</p></li></ol></li><li><p><strong>Amendment: *Excel*</strong></p><ol><li><p>This post strongly assumes Excel expertise, but some readers pointed out that this would be worth stating.  </p></li><li><p>Pivots, XLOOKUP etc</p></li><li><p>Highly recommend reading the post <a href="https://counting.substack.com/p/doing-better-with-excel">Doing Better With Excel</a> </p></li></ol></li></ol></li></ol><h2><strong>Analytics industry context:</strong> </h2><p><em>Articles/books etc that I think are worth a look to add some colour to the above.</em></p><ol><li><p><a href="https://groupby1.substack.com/">I wrote some articles</a> about what I think about technology; the <a href="https://groupby1.substack.com/p/data-as-a-utility-tool">data-as-a-utility-tool</a> one is probably the only one worth reading.</p></li><li><p><a href="https://erikbern.com/2021/07/07/the-data-team-a-short-story.html">Building a data team at a mid-stage startup: a short story</a></p><ol><li><p>This very neatly describes my jobs and career so far&nbsp;</p></li><li><p>A good representation of what the &#8220;data&#8221; industry looks like</p></li></ol></li><li><p><a href="https://blog.getdbt.com/future-of-the-modern-data-stack/">The Modern Data Stack: Past, Present, and Future</a><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><ol><li><p>dbt is the tool that I use for SQL transformations. 
It is probably the single most useful thing to learn in addition to the SQL, Git, command line, Python list</p></li><li><p>They are doing lots of thought leadership in the analytics space, with very good blog posts</p></li></ol></li><li><p><a href="https://technically.dev/posts/what-your-data-team-is-using">Technology in Data Analytics</a></p><ol><li><p>Simplest, shortest overview of the most important tech tools, anything not on this list is possibly out of date or redundant, or used in a different context</p></li></ol></li><li><p><a href="https://www.holistics.io/books/setup-analytics/start-here-introduction/">The Analytics Setup Guidebook</a></p><ol><li><p>This is a deep dive into modern analytics, opinions are now pretty generic, but it is a <strong>BOOK</strong></p></li><li><p>Some interesting meta-analysis takes on the jobs in the industry <a href="https://www.holistics.io/books/setup-analytics/data-servicing-a-tale-of-three-jobs/">here</a></p></li><li><p>Possibly the thing that is done most badly in the analytics space is data modelling, <a href="https://www.holistics.io/books/setup-analytics/data-modeling-layer-and-concepts/">with the background here</a>. This is probably what will be the hardest to attain, and near impossible to hire for, but something that is mostly trial and error anyway!</p></li></ol></li></ol><h2>OK I learned / am learning those skills above</h2><p><em>How do I get into analytics?</em></p><blockquote><p>General advice being - go and start doing analytics in your current position, utilizing whichever data skills/tools/etc you have access to. Guaranteed there are data pain points you can solve (which requires you to have some business acumen and soft skills to identify &amp; work on!). This will get you real-life experience which you can leverage into a full-time position within a couple years. </p></blockquote><p>Thanks Nate Sooter, copied verbatim from your feedback on this post! 
Totally agree with this advice.</p><div><hr></div><p>That is all I&#8217;ve got. Topical and relevant as of publishing. I&#8217;ll continue to add and amend, but for now, if you know of someone interested in Data, send them this. There is a world of nuance not covered here; the most interesting themes will all be covered in due course in a post here. Or not.&nbsp;</p><p>As may be obvious from the long list of articles, there are lots of thoughts and opinions in this space. I have personally experienced the tension and growth of data analytics alongside data science and data engineering (and also a frustrated relationship with software engineering). In very many contexts the data science/engineering/analytics terms are used interchangeably: they often end up meaning the exact same thing, and just as often mean nothing similar at all, just to make it confusing, especially <a href="https://news.ycombinator.com/item?id=27779264">data engineering</a>. See an upcoming post on <strong>Data Engineering: Backend Developer, or Data Analyst.</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a></p><p>And as I said, I&#8217;d love to hear from you if you are making this transition, or feel like understanding it in more detail. The above is just a primer! 
Get in touch to set up time directly, quickest via the dreaded: <a href="https://www.linkedin.com/in/m-ard/">https://www.linkedin.com/in/m-ard/</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bdpJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bdpJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bdpJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bdpJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bdpJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bdpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg" width="1456" height="2330" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2330,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:456121,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bdpJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bdpJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bdpJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bdpJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5ce95f1e-c7e4-480f-abf9-58bdb98920c8_1920x3072.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Credit <a href="https://unsplash.com/@dayee?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">&#22823;&#29239; &#24744;</a> </figcaption></figure></div><div><hr></div><p><em>Please consider subscribing for more on the subject of data systems thinking</em></p><p><em>What is <a href="https://groupby1.substack.com/about">group by 1</a></em></p><p><em>Who is <a href="https://rdrn.dev/?utm_source=groupby1.substack.com">Matt Arderne</a></em></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>See my post on Modern Data Stack [end of]:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;bb0832b2-c25a-4bf6-8c6c-7c8ca13ae66f&quot;,&quot;caption&quot;:&quot;The 
beauty of ideas is that they cannot die. That said, many consider the Modern Data Stack to have developed a bit of a rot I thought to pull out the eulogy I've had in the back of my mind for a while. It is a Tour de Links that follows the journey of&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Way of Ways&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:41772521,&quot;name&quot;:&quot;Matt Arderne&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1074168c-41cd-4592-af47-8cf5f26262f2_1818x1216.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2023-08-18T13:41:08.673Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbd3edbd-f1f2-4776-ab37-09b6c5c1b52b_704x400.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://groupby1.mattarderne.com/p/the-way-of-ways&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:136115515,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:9,&quot;comment_count&quot;:5,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;group by 1&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7fbee6-d181-479e-bd71-c4704b2b4c80_1216x1216.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div 
class="footnote-content"><p>See my post on Data Engineering:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;fbd686eb-bd3e-4da5-94f3-e6063c6ccfbb&quot;,&quot;caption&quot;:&quot;Trigger warning: this may trip your thought leadership nerve, but I&#8217;m mostly riffing, and I think it is additive. This post was easy to start, a pain to finish and lots of fun in the middle. This is a narrative for the near future of Data Engineering in startups, and I think makes some interesting points. I do think the post has avenues for expansion, es&#8230;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The future history of Data Engineering&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:41772521,&quot;name&quot;:&quot;Matt Arderne&quot;,&quot;bio&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1074168c-41cd-4592-af47-8cf5f26262f2_1818x1216.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2022-01-06T07:33:48.641Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2cac65f-aaeb-442f-a945-3790c3343ec0_1100x743.jpeg&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://groupby1.mattarderne.com/p/data-engineering&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:46624409,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:14,&quot;comment_count&quot;:9,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;group by 
1&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1d7fbee6-d181-479e-bd71-c4704b2b4c80_1216x1216.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></div></div>]]></content:encoded></item><item><title><![CDATA[Dataform and dbt]]></title><description><![CDATA[Some commentary on transformation tooling]]></description><link>https://groupby1.mattarderne.com/p/dataform-and-dbt</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/dataform-and-dbt</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Mon, 21 Jun 2021 14:10:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!iDjA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><p><em>Welcome to my third post, one I have wanted to write from the beginning. Getting these posts done isn&#8217;t easy, and the time between publishing is a commitment that I undertook rather lightly. Like most good ideas, this one is late, irrelevant, and likely only to be marginally useful. That said, here is a quick rundown on two of the &#8220;indicative-of-the-future&#8221; SQL tools in data analytics at the moment.</em></p><div><hr></div><h1>Dataform and dbt, Dbt and dataform</h1><p>If neither dbt nor Dataform is familiar to you, stop and read my <a href="https://groupby1.substack.com/p/data-as-a-utility-tool">primer post</a> on the modern data analytics stack. This post will still likely be a bit meta unless you are familiar with at least one of these tools. 
The below two-liner explanation from <a href="http://tamaszilagyi.com/blog/2019/2019-03-05-dbt/">No frills data warehousing with dbt</a> might be sufficient to get you through.</p><blockquote><p>To use [dbt/Dataform], you only need to be familiar with SQL. The package relies on templating using [jinja/javascript] to enable nifty features like dependency graphs, macros or schema tests. Upon compilation, everything is translated into pure SQL and run on the database&#8217;s execution engine. It is quite fascinating how much you can do with such a minimalist tool.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iDjA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iDjA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iDjA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iDjA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iDjA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iDjA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg" width="1456" height="2184" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/e3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2184,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2869044,&quot;alt&quot;:&quot;https://unsplash.com/photos/UMncYEfO9-U&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://unsplash.com/photos/UMncYEfO9-U" title="https://unsplash.com/photos/UMncYEfO9-U" srcset="https://substackcdn.com/image/fetch/$s_!iDjA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iDjA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!iDjA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iDjA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3d7da52-a627-40e5-87af-f383a32ae202_4445x6667.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption"><a 
href="https://unsplash.com/photos/UMncYEfO9-U">unsplash</a></figcaption></figure></div><h1>Context</h1><p>Because I find it interesting, here is my understanding of the history of these tools. Cmd+f to THE TLDR if the <em>what, why, how and hearsay</em> of technology isn&#8217;t your thing.</p><p>dbt was born from a common need. Fishtown Analytics, an analytics consulting team, solved their need for a <em><strong>better way to transform data inside a data warehouse</strong></em>. They built an open-source tool to manage SQL transformations <a href="https://github.com/fishtown-analytics/dbt/tree/549282110f393a22c6331ba828a4895bdee9c26e">way back in 2016</a> (epic to git time-travel).</p><p>They generously shared their experience through a transformative series of blog posts from their CEO <a href="https://blog.getdbt.com/author/tristan/">Tristan</a> <a href="https://medium.com/@jthandy">Handy</a>. He captured the attention of many struggling engineers and analysts with his &#8220;new way&#8221; of doing analytics in startups. We, the desperate, listened closely. The message I heard: <em>bring the best of software development to startup data analytics</em>.</p><p>This was great, but what set dbt apart among the ever-growing set of open-source developer tools was, in my opinion, the &#8220;hype house&#8221; that was the early dbt Slack community.</p><blockquote><p>The Hype House is part of a millennia-old tradition of collaboration among those at the avant-garde of new forms of media, technology, and thought. Outsiders like me have always dismissed the novel as silly, faddish, or worse. When those inside the cutting-edge scenes band together to support, teach, and create with each other, their niche and experimental projects can become the new normal on top of which the next generation builds.</p><p><a href="https://perell.com/fellowship/conjuring-scenius/">Source</a> (I&#8217;ve waited a long time to paste this snippet. 
Scenius is quite a word, but the article is epic)</p></blockquote><p>The building of this dbt tool was &#8220;in the open&#8221;, and a community was incubated alongside it. An enthusiastic community. An <em>investible</em> community. The new paradigm was incubated in the dbt Slack, with contribution, opinion and criticism all weighed and measured by that passionate and growing community.</p><p>dbt was first a CLI tool. As is the trend with open-source, evolution is dynamic and open. Dataform was born when a team of engineers spotted the opportunity to bring dbt to even less technical analysts and built a front-end GUI for it. Later on, Dataform migrated their frontend to a newly built backend of their own that replaced dbt (also open-source). Whether to match this development or because they had planned to anyway, dbt launched their own GUI SaaS tool (called Sinter, now dbt Cloud, for the historians).</p><p>Personally, dbt led me to Dataform, and as a consulting data-engineer turned head-of-data, I was drawn by Dataform&#8217;s relative ease of getting started. 
I wanted a tool that I could stick in the hands of a Looker developer and have them hitting the same notes as I was hitting with dbt, with less friction, less cognitive burden, and less fiddle.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_mtI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_mtI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_mtI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_mtI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_mtI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_mtI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:719180,&quot;alt&quot;:&quot;https://unsplash.com/photos/OxHPDs4WV8Y&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://unsplash.com/photos/OxHPDs4WV8Y" title="https://unsplash.com/photos/OxHPDs4WV8Y" srcset="https://substackcdn.com/image/fetch/$s_!_mtI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 424w, https://substackcdn.com/image/fetch/$s_!_mtI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 848w, https://substackcdn.com/image/fetch/$s_!_mtI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!_mtI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5cad9813-4ab5-48e7-996a-e873427a9049_6000x3375.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>So, this post. We&#8217;ll briefly run through the most notable differences between dbt and Dataform, how/why to choose, and end with some thoughts on evolution.</p><p>Why me? I&#8217;ve been working with these tools for a while. I deployed dbt at a company and then migrated to Dataform when Dataform launched. I&#8217;ve deployed Dataform with another client and recently worked on a dbt project that used some of the dbt plugins. 
I now consider both on their relative merits when I make the assessments below.</p><p>If you&#8217;ve got this far but can go no further, know that these tools function similarly, and the TLDR is worth a look.</p><p>The bulk of the value for me has been in having these tools &#8220;ready to use&#8221; for data analysts who are strong with LookML and SQL and have exposure to data modelling concepts, but are less familiar with Airflow, CLI tools, Python environments, and git workflows (<a href="https://ohshitgit.com/">git undo everything</a>). The singular terror of doing anything Python-related on Windows (as a Mac user) has been reason enough to avoid, by default, setting someone up with the CLI: enabling git, managing Python dependencies, C++ redistributables, :shock:. Both tools allow developers to move in that direction if they choose, but offer easier onboarding via the SaaS version.</p><p>Thus, dbt/Dataform <strong>cloud</strong> is my primary point of consideration, because that has been my primary interest. Both are extensively used as CLI tools, which for many is the go-to. My focus has been on enabling teams that initially do not have the time or resources to upskill on, maintain and support the tech, and so my considerations are within the context of the cloud offering. This may be contentious, but on average the analysts who come from Bizops or something of that nature seem to make for more rounded data analysts, and enabling them is key.</p><p>*Complete aside: Denodo, a previous employer and data integration technology company, was born out of the needs of their consulting team working with large multinationals. They too built a successful enterprise Data Virtualisation tool, which in a way achieves a similar outcome to dbt + Snowflake. 
In essence a powerful enterprise database multi-plug adaptor, and a SQL DAG builder.</p><h1>dbt</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qf2U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qf2U!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 424w, https://substackcdn.com/image/fetch/$s_!Qf2U!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 848w, https://substackcdn.com/image/fetch/$s_!Qf2U!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 1272w, https://substackcdn.com/image/fetch/$s_!Qf2U!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qf2U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png" width="1456" height="420" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:420,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qf2U!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 424w, https://substackcdn.com/image/fetch/$s_!Qf2U!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 848w, https://substackcdn.com/image/fetch/$s_!Qf2U!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 1272w, https://substackcdn.com/image/fetch/$s_!Qf2U!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F75469982-6458-4dd6-a2e8-460a96a89615_1600x462.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption"><a href="https://getdbt.com/">dbt</a></figcaption></figure></div><p>dbt is an open-source tech success story. A dedicated team created the technology they wished they could pay for, built in the open, engaging the community and incubating a &#8220;new way&#8221; that enthusiasts began to feel very enthusiastic about.</p><p>dbt&#8217;s strengths lie in the powerful ecosystem they have developed. Illustrative of this are the integrations with <a href="https://www.getdbt.com/ecosystem/">tools</a> such as Fivetran, Census, Databricks, that make raw dbt projects pretty mobile and feel like the beginnings of a standard for SQL transformation DAGs.</p><p>This ecosystem enables developers to build powerful <a href="https://hub.getdbt.com/">open-source</a> plugins, available to use and improve. 
These include useful macros, prebuilt transformations for specific sources, infrastructure management and even some interesting machine learning plugins. The community has built the tools they need on a common standard.</p><p>Because dbt is platform-agnostic, it feels like an ecosystem rather than a tool alone. Any modern data warehouse is a viable engine to run it, and any modern data tool is likely considering how they can incorporate elements of it to take advantage of the momentum.</p><p>dbt Cloud, the online version, feels like an online IDE rather than a standalone tool, and according to the founder, as of a year ago, they&#8217;re</p><blockquote><p>in the very early days in improving the developer experience of writing dbt code. The dbt Cloud IDE is still in its infancy, and is only one of the many ways in which we ultimately believe that users will write dbt code that we want to facilitate. <a href="https://blog.getdbt.com/four-years-in-from-misfits-to-mainstream/">source</a></p></blockquote><p>My take is that this rings true. The usability features are not quite there; a simple example is the lack of autocomplete, which is readily available as a VSCode plugin. 
[TODO: Validate this is still true (please comment below)]</p><h1>Dataform</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PpUW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PpUW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 424w, https://substackcdn.com/image/fetch/$s_!PpUW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 848w, https://substackcdn.com/image/fetch/$s_!PpUW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 1272w, https://substackcdn.com/image/fetch/$s_!PpUW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PpUW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png" width="1456" height="444" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/a2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:444,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PpUW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 424w, https://substackcdn.com/image/fetch/$s_!PpUW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 848w, https://substackcdn.com/image/fetch/$s_!PpUW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 1272w, https://substackcdn.com/image/fetch/$s_!PpUW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2a1566f-69d0-459a-bde0-8ea85477ed8d_1600x488.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://dataform.co/">Dataform</a></figcaption></figure></div><p>Dataform built a great user interface and my initial impression was that it felt like an easy transition for someone from Looker to orientate and become productive quickly. The user interface has continuously been improved and the backlog of feature requests was quickly moved through, as the team was reactive to suggestions and rapidly added features that provided better prompts, insights and contextual information. Things like prompting about an unbuilt view being referenced, an effective autocomplete, and continuously compiling the SQL to raise errors like typos, glitches, invalid elements introduced prior to execution. These all added to the great user experience of developing.</p><p>My least favourite feature is that any templating is done in Javascript (instead of jinja in dbt). 
Ideally, I&#8217;d prefer something closer to Python than either, but jinja feels easier and more intuitive to a non <code>.js</code> type.&nbsp;</p><p>[Amendment] A <a href="https://lightdash.com/">friend&#8217;s</a> opinion, below, runs counter to mine on the jinja/js thing, and I agree with him, especially if you have a team who know JavaScript! My counterpoint is that Looker analysts often find jinja easier to get started with. Jinja does get very complicated if you try to do things like cohort funnel analysis. </p><blockquote><p>Your main disadvantage of dataform I actually found to be a great advantage: templating is just javascript. Sometimes jinja starts feeling like a programming language (macros in macros etc.). But that makes a horrible developer experience. In dataform, you can just declare models in a regular old .js file, meaning you&#8217;re completely free to build whatever you want. Plus you have all the power and tooling of a mature programming language. </p></blockquote><p>[End Amendment]</p><p>Like dbt, Dataform has benefitted from community-maintained packages.</p><p>The most interesting thing about this story is that Dataform was acquired by Google. It seems likely that Dataform will be incorporated into Google&#8217;s BigQuery data warehouse, I suppose as a &#8220;transformation&#8221; feature, or perhaps still standalone? This has interesting implications, but it does mean that it will take some time to get Dataform integrated. It also means that the tool may lose some of the team&#8217;s historically great responsiveness to user suggestions and demands as they work through the enterprise compliance backlog. It does mean that Dataform will be an easy first choice for BigQuery customers to test out this paradigm of data transformation. 
Sadly, it also means that support for other data warehouses has been sunset.</p><h1>Implications</h1><p>The implications of the Google acquisition of Dataform are interesting to me, maybe given my unique experience with both tools, in that the acquisition pretty firmly plants dbt as the SQL data transformation standard. This has likely been the case regardless, based purely on the momentum and head start that dbt has maintained.&nbsp;</p><p>However, had Google maintained Dataform as an independent entity rather than incorporating it into BigQuery as an exclusive feature or tool, it might have given Dataform the staging area and resources to pose a serious threat as an alternative. This would have been a preferable outcome for me personally. Dataform going exclusively BigQuery suggests that the play is likely a more direct threat towards Snowflake&#8217;s dominance in the data warehouse space, as the feature race heats up.</p><p>From a more practical perspective, the TLDR comes into play. When heading along this road, the first decision you&#8217;re likely to make is&nbsp;BigQuery or Snowflake (or even Redshift if you have lots of AWS credits&#8230; (those are ominous dot dot dots)), and then, if BigQuery, deciding between dbt and Dataform. My own preference comes from my familiarity with Snowflake, and as a practitioner and consultant, my bet is generally on dbt for its coverage and standard-setting position.&nbsp;</p><h1>TLDR</h1><p><code>THE TLDR/</code></p><ul><li><p>Dataform was acquired by Google and is now exclusively GCP, and my guess is that it will be integrating directly into the BigQuery platform. If you&#8217;re a startup in the GCP ecosystem, then that is compelling. 
If you like or are already using Snowflake, then Dataform is no longer an option.</p></li><li><p>dbt has a huge ecosystem built around it, with real momentum benefitting from plugins, integrations and general compatibility across all the analytics tools.</p></li><li><p>Both tools have a SaaS offering, effectively a cloud IDE + CI/CD. Great for getting connected and developing quickly. Both happily run a hybrid of cloud IDE and open-source core.</p></li><li><p>The dbt community has produced many ecosystem &#8220;gap fillers&#8221; such as <a href="https://spectacles.co/">spectacles</a>, <a href="https://lightdash.com">lightdash</a>, and technically even Dataform itself, all taking the dbt base and extending it or recombining it in different ways.</p></li></ul><p>The modern ELT data stack that no one is going to second-guess has Snowflake as the data warehouse, Fivetran doing the Extract and Load, and dbt doing the T for transforms. Note that Snowflake and Fivetran are doing the heavy lifting, but don&#8217;t provide the magic. 
They are doing the quiet and dependable, probably get taken for granted (other than the price tag), but developing is where insight is captured, and developing is done with dbt.</p><p><code>/ THE TLDR</code></p><h1>Evolution</h1><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bFUO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bFUO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bFUO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bFUO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bFUO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bFUO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1252542,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bFUO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bFUO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bFUO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bFUO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F3b0f7805-55be-40f2-9c0d-12ddb7953491_1920x1920.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a><figcaption class="image-caption"><a href="https://unsplash.com/photos/O68cqTi5k-I">unsplash</a></figcaption></figure></div><p>If you jumped to the TLDR and then went further, this is likely where you&#8217;d want to stop. 
The rest is a quick self-indulgent attempt at hypotho-philosophising the evolutionary aspects of software building. Beware!</p><p>Watching the evolution of dbt and then Dataform quite closely and with entrenched interest, I have found the dynamics at play interesting, engaging, and commentable. What was also interesting were the entrenched alliances that quickly developed along with people&#8217;s preferences or opportunities. Sentiment generally went something along the lines of <em>Dataform ripped the idea off dbt</em>, which I feel is mostly wrong, as the idea <a href="https://hn.algolia.com/?q=sql+template">isn&#8217;t novel</a>, just very well executed and well-timed (IMO, comment below)<em>.</em></p><p>I made a meta-take comment on the sentiment in another <a href="https://locallyoptimistic.slack.com">wonderful slack channel</a>. The comment has now been repurposed for this section; as rambly and raw as it is, I thought it captured some of the point:</p><blockquote><p>The tribalism vibe on this was always a weird one for me. us/them always felt counter-productive and tribal [1], but then perhaps that was inevitable. Early dbt felt very much like the beginnings of a tribe: lots of in-jokes, data-eology, new/old. Exciting!&nbsp;</p><p>Dataform beginning public life as a dbt front-end ~early 2018 [2] felt pretty confirming.</p><p>It did two things for (us) users:&nbsp;</p><p>1. affirmed the dbt value prop [3]&nbsp;</p><p>2. made using the "new-paradigm" much more accessible to non-techy-data-devs [4]&nbsp;</p><p>For reasons, Dataform built their own backend dbt replacement (Google software engineers, I don't know, but they probably do). This change positioned them at odds with one another, but besides that point, the 2 different tools enabled more users to adopt the new paradigm[5].</p><p>For me, as a user of dbt and then early dataform, best enabling this new paradigm was a combination of the two, and it was/is great! 
Collectively this is all a quantum leap for SQL analysts. I'm beyond thrilled that I don't have to use the tools I had to use pre-dbt.&nbsp;</p><p>My post-ramble point: I think evolutionary pressure on technology is bloody brilliant for users. Competition is great, iteration should be encouraged. I'm looking forward to someone building an iteration on Looker to this same end.</p><p>The evolutionary pressure did two interesting things:</p><p>1. It affirmed the need for a SaaS tool for smaller and larger teams. dbt may never have needed to create a user interface, which would have frustrated my team building analytics efforts in smaller tech communities.</p><p>2. It created _an alternative_, which is the most important way to add creativity to the space. In real-time users expressed their preferences for different ways of achieving the same goal, in one great experiment.</p><p>Interesting parallels to this comparison can loosely be considered in other data tools such as Fivetran &amp; Stitch (closed vs open-source), Snowflake vs Redshift (developer-friendly vs cheaper on paper) and even Tableau vs Looker (old way vs new way).</p><p>Both dbt and Dataform now have relatively independent niches, having benefited from the existence of the other. I'm personally glad for having met and got to know the Dataform team and enjoyed being tangentially involved in their growth. The product is great and has been a pleasure to use, and they are nice people who have put a lot of thought into solving my team&#8217;s problems.</p><p>I think dbt's success in raising lots of VC money is great, they have created many hours of additional output for each hour of development, and more broadly for an enhancement on the thinking in this space, also great people to deal with!</p><p>---comment footnotes---</p><p>[1] - Tribal ie The Robbers Cave Study</p><blockquote><p>The Robbers Cave experiment studied how hostilities quickly developed between two groups of boys at a summer camp. 
The researchers were later able to reduce the tensions between the two groups by having them work towards shared goals. The Robbers Cave study helps to illustrate several key ideas in psychology, including realistic conflict theory, social identity theory, and the contact hypothesis. <a href="https://www.thoughtco.com/robbers-cave-experiment-4774987">Source</a> (disclaimer - I read about robbers cave this week) </p></blockquote><p>[2] - <a href="https://getdbt.slack.com/archives/C0VLZM3U2/p1527001713000902">dataform announcement date ~ May 2018</a></p><p>[3] - <a href="https://getdbt.slack.com/archives/C0VLNUUTZ/p1562825154386300">dbt on dataform</a></p><blockquote><p>Hah! This is one of the implications of open-source! ... In my 3-year post, I explicitly called out this likelihood and embraced it</p></blockquote><p>[4] - <a href="https://app.slack.com/client/T0VLPD22H/CTMTMFNH5/thread/C0VLNUUTZ-1562825154.386300">early dbt-cloud called Sinter, early 2017</a></p><p>[5] See superordinate goals conveniently linked in [1]</p></blockquote><h1>Closing</h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TDbP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TDbP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!TDbP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TDbP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TDbP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TDbP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg" width="1456" height="1459" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1459,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:314014,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!TDbP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 424w, https://substackcdn.com/image/fetch/$s_!TDbP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TDbP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TDbP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F427e7f80-c8d1-4413-9fc5-c896e170a6f3_1920x1924.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg 
xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><a href="https://unsplash.com/photos/hoS3dzgpHzw">unsplash</a></figcaption></figure></div><p>I enjoyed witnessing first-hand the evolution of "a new technology". But I think it is probably fairer to call it a recombination, or a reorg. Bottoms up top-down shuffle.</p><p>That this is a reshuffle rather than blinding innovation is made clear by the fact that I could port the code I wrote in 2016 in Denodo&#8217;s VQL query language, designed in 2008, which was a large and involved SQL DAG, into something that dbt and Snowflake would happily use. Entirely independently from dbt, and I&#8217;d bet mutually unknown, a company created, from a user&#8217;s perspective, in essence the same way of working. I&#8217;m sure from an evolutionary perspective this divergent and now convergent evolution is an interesting quirk.</p><p>In that sense, nothing here is &#8220;new&#8221; enough to interest a computer scientist. The computer scientists I know have an allergic reaction to most of the analytics tooling. Messy and inefficient new ways of doing things become the standards and look like inflexion points after the fact, and I think for an analyst this change is in progress. Going back to the hype house concept:</p><blockquote><p>Outsiders &#8230; always dismissed the novel as silly, faddish, or worse. 
When those inside the cutting-edge scenes band together to support, teach, and create with each other, their niche and experimental projects can become the new normal</p></blockquote><p>The user is one of the many pressures that drive the evolutionary development of these tools but is best served by choice. Consolidation of technology into profit mode leaves the user&#8217;s needs as a low priority, whereas competitive pressure and alternatives put the users at the front. Current trends are about enabling a data analyst turned developer to <em>create as much insight with as little engineering as possible,</em> making it a great time to be an analyst. The users ultimately benefit when the technology scene is in inclusive, collaborative and open mode. Ultimately we are probably due for a period of consolidation mode, but perhaps more on that later.</p><p><strong>Please comment if you have any feedback on any of this, I aim to improve with your help.</strong></p><p><em>edit: Somehow the version on my <a href="https://github.com/mattarderne/rdrn.dev/issues/8#issuecomment-1311423227">personal backup blog</a> got all the comments</em></p><div><hr></div><p>* again interesting aside: Denodo was part of a different paradigm, with virtualisation being the word. Instead of ETL&#8217;ing your data into a data warehouse, you&#8217;d rather leave it where it lives and query it directly. Whether in a data warehouse, API, transaction database, or many others! A dream for a data architect faced with decades of legacy tech and a desperate need to unify access for analytics and as a data bus. 
Dremio is now continuing this trend, focussing more on the data lake concept.</p><div><hr></div><p><em>Please consider subscribing for more on the subject of data systems thinking</em></p><p><em>What is <a href="https://groupby1.substack.com/about">group by 1</a></em></p><p><em>Who is <a href="https://rdrn.dev/?utm_source=groupby1.substack.com">Matt Arderne</a></em></p>]]></content:encoded></item><item><title><![CDATA[Snowflake Field Notes]]></title><description><![CDATA[This is a technical intro to deploying Snowflake data warehouse, as a follow on from my previous post. This may be useful if you have decided to implement Snowflake.]]></description><link>https://groupby1.mattarderne.com/p/snowflake-field-notes</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/snowflake-field-notes</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Mon, 13 Jul 2020 07:55:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UwwM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UwwM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UwwM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!UwwM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UwwM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UwwM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UwwM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UwwM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!UwwM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 848w, https://substackcdn.com/image/fetch/$s_!UwwM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!UwwM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F11363802-d543-41b8-b457-8a993b9abb62_1600x1066.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p><em>In <a href="https://groupby1.substack.com/p/data-as-a-utility-tool">my first post,</a> I justified an approach to achieve a scalable system for <strong>loading, storing, transforming and distributing</strong> data within an analytics context.&nbsp;</em></p><p><em>In this post, we&#8217;ll be taking a look into my notebook on <strong>storing</strong>. Specifically, the things I&#8217;ve noted as useful when implementing Snowflake. These few notes, scripts and points of reference should save you some time and get you out onto the water sooner.&nbsp;</em></p><p><em>Welcome to my pocket notebook, heading <strong>Snowflake Important Things - Jan 2020.</strong></em></p><div><hr></div><h2>Why Snowflake</h2><p>This isn&#8217;t paid content. Though with the flowery praise to come it should be (see contacts below). However, this post doesn't get too far into <em>why</em> Snowflake. Rather it explores <em>how</em> Snowflake. Nonetheless, we may need some justification.</p><p>Snowflake&#8217;s real value is the <strong>reduction of non-value-adding complexity for the user</strong>. 
It puts useful things in your path and keeps anything and everything operationally complex out of your way. Simple as that. If you&#8217;ve used PostgreSQL then this shouldn&#8217;t feel too foreign, minus index maintenance, table locks, performance issues and upgrades. Pretty standard SQL otherwise, and a few new concepts.</p><p>And you only pay for the capacity and performance you use.</p><p>That&#8217;s it really.&nbsp;</p><p>It is <em>just</em> a SQL database. A very fast one, that handles loads of data, and has lots of usability features. It stores data by column rather than by row, which makes analytical queries that scan a few columns across many rows very fast. But you&#8217;ll still be writing SQL queries, in a mostly familiar, pleasant SQL syntax.&nbsp;</p><p>The primary alternative to Snowflake in this context is Google BigQuery. I&#8217;m no expert, but you&#8217;d struggle to go wrong with either. Snowflake offers a choice of AWS, Azure or GCP for your horsepower, so that might be reason enough for you to choose Snowflake. At some point, it should start to become clear that Snowflake is just a clever interface for storage and computation built on commodity cloud infrastructure. Very clever. S3 buckets + EC2 for anyone feeling like they&#8217;d rather DIY this part, or build a competitor.&nbsp;</p><p>Last part of the intro fanfare: Snowflake is a Data Platform. This is made clear in their recent manoeuvring into the crowded, polluted sea of Data Marketplaces, and a peek into the BI world, with their very simple new Dashboards tool. However, the most platformy move here is a direct integration with Salesforce. More on this in the closing.</p><h2>Context&nbsp;</h2><p>This post doesn&#8217;t get <em>too far</em> into the details of the doing, but rather points out things that are somewhat peculiar or unique to Snowflake. 
These are things to keep in mind when doing the initial deployment.&nbsp;</p><p>This post also assumes you&#8217;ll do your transformation work in a SQL transformation tool like <a href="https://dataform.co/?utm_source=groupby1.substack.com">Dataform</a> or <a href="https://getdbt.com/?utm_source=groupby1.substack.com">dbt</a>.&nbsp;</p><p>The structure of this post will loosely follow the order in which you&#8217;ll encounter and want to consider various new concepts and features as you implement Snowflake.&nbsp;</p><p>We will start with an intro to a Snowflake deployment. We&#8217;ll then apply some structure to loading and, after getting the security and costs watertight, we will finally set sail with some interesting new features and capabilities.&nbsp;</p><h1>1. Deployment&nbsp;</h1><p>As of publishing this, you can sign up and get started with a free (no credit card), month-long trial, which gets you floating. </p><p>Once you&#8217;ve signed up, you&#8217;ll need a few things in place as part of the deployment. 
These include roles, users, databases and warehouses.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Tzs8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Tzs8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Tzs8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Tzs8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Tzs8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Tzs8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/e6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Tzs8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Tzs8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Tzs8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Tzs8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6935cff-ff66-4f5c-ae6a-8cee5110c5d9_1600x1067.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>New Concepts</h2><p>The new concepts introduced here are warehouses and credits.&nbsp;</p><p><strong>/Warehouses</strong></p><p>Essentially a warehouse is how you specify the <strong>power of compute</strong> that you use to run queries. This is interesting because you can assign a warehouse to a role. <code>TRANSFORM</code> roles can use a different warehouse to <code>REPORT</code> roles. This allows you to fine-tune your compute power and response time for various scenarios. Predictable power for <code>TRANSFORM</code>, snappy and responsive for <code>REPORT</code> to keep the end-users happy!&nbsp;</p><p>Warehouses are NOT where you keep your data. 
Think of a warehouse like a sail that you hoist when the cold query winds blow from the East (or when the warm Summer trade-winds blow from the East depending on your preference).</p><p>Practically, a role is granted privileges to use a warehouse in much the same way a role is granted privileges to access a database. A warehouse also needs to be specified whenever a connection is made to Snowflake.&nbsp;</p><pre><code><code>grant all privileges on warehouse WAREHOUSE_REPORT 
to role ROLE_REPORT;</code></code></pre><p><strong>/Credits</strong></p><p>You get billed based on your usage of credits.</p><p>Credits are consumed by storage and warehouses.</p><p>Every time* you start a warehouse, you pay per second in credits, and so credits are effectively your unit of currency.&nbsp;</p><p>At the time of writing a credit is <a href="https://www.snowflake.com/pricing/?utm_source=groupby1.substack.com">$2-$3</a>, and negotiating that down when your annual contract value reaches ~$10k is the typical script.&nbsp;</p><p>The outcome of this warehouse/credit scenario is that you get <a href="https://www.snowflake.com/blog/understanding-snowflake-utilization-warehouse-profiling/?utm_source=groupby1.substack.com">a very granular cost breakdown</a> of your queries.</p><p><em>*Not every query starts a warehouse - see cached data section below.</em></p><p><strong>Additional Notes:</strong></p><ul><li><p>See a walkthrough of cost calculations, product tiers and implications <a href="https://www.tropos.io/blog/how-to-calculate-your-snowflake-monthly-cost/">here</a>.</p></li></ul><h2>Permissions</h2><p>This is the <code>grant &lt;PERMISSION&gt; to &lt;ROLE&gt;</code> part of the database deployment process.&nbsp;</p><p>I like to follow one of two deployment patterns:&nbsp;</p><ol><li><p>The <strong>Proof Of Concept</strong> (POC) keeps things as simple as possible, while still being stable and scalable.&nbsp;</p></li><li><p>The <strong>Production</strong> option adds some additional structure on top of the POC.&nbsp;</p></li></ol><h3>1. 
Proof of Concept&nbsp;</h3><p>This setup doesn't distinguish between <code>PROD</code> and <code>DEV</code>, and rather relies on branching features later on in the transformation, which is perfectly fine.</p><p>At the core are the 3 roles, with each only having the permissions necessary to function, without the ability to interfere with the other roles&#8217; domains.&nbsp;</p><ul><li><p><strong>INGEST</strong></p><ul><li><p>Loads data</p></li><li><p>Can create schemas in <code>RAW</code> database</p></li></ul></li><li><p><strong>TRANSFORM</strong></p><ul><li><p>Creates transformation scripts</p></li><li><p>Can read data in <code>RAW</code></p></li><li><p>Can create schemas in <code>ANALYTICS</code></p></li></ul></li><li><p><strong>REPORT</strong></p><ul><li><p>Read-only access to <code>ANALYTICS</code></p></li></ul></li></ul><p>This is shown in the relationship diagram below, where connections indicate permissions assigned.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ptyk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ptyk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 424w, https://substackcdn.com/image/fetch/$s_!Ptyk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 848w, 
https://substackcdn.com/image/fetch/$s_!Ptyk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 1272w, https://substackcdn.com/image/fetch/$s_!Ptyk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ptyk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png" width="1456" height="614" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/69161c61-2125-4f33-b112-4517401729ed_1600x675.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:614,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ptyk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 424w, https://substackcdn.com/image/fetch/$s_!Ptyk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 848w, 
https://substackcdn.com/image/fetch/$s_!Ptyk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 1272w, https://substackcdn.com/image/fetch/$s_!Ptyk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F69161c61-2125-4f33-b112-4517401729ed_1600x675.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>You&#8217;ll notice in the diagram that the <code>USER_REPORT</code> cannot 
access the <code>RAW</code> data. This is an entirely deliberate move to ensure that downstream tools cannot build a dependency on <code>RAW</code> data.</p><p>For further clarification on how all this works, I&#8217;ve created a starter kit for Snowflake, which creates the setup in the above diagram exactly, ready for a POC. If you&#8217;re considering a Snowflake implementation, it is well worth an hour to take a look. Pull requests welcome!</p><ul><li><p><a href="https://github.com/mattarderne/snowflake-starter">https://github.com/mattarderne/snowflake-starter</a></p></li></ul><h3>2. Production</h3><p>The following configuration takes the basics from the <strong>Proof Of Concept</strong> and enhances them to include a more robust separation between <code>PROD</code> and <code>DEV</code>. Every <code>_PROD</code> entity is duplicated as a <code>_DEV</code> version (<code>_DEV</code> not shown in this diagram for simplicity), with a distinct role breakdown for accessing databases.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2GFL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2GFL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 424w, https://substackcdn.com/image/fetch/$s_!2GFL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 848w, 
https://substackcdn.com/image/fetch/$s_!2GFL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 1272w, https://substackcdn.com/image/fetch/$s_!2GFL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2GFL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png" width="1456" height="910" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:207960,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2GFL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 424w, https://substackcdn.com/image/fetch/$s_!2GFL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 
848w, https://substackcdn.com/image/fetch/$s_!2GFL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 1272w, https://substackcdn.com/image/fetch/$s_!2GFL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F71aa7043-46d8-4938-bf05-fa57c2b65b99_1652x1033.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>Additional Notes:&nbsp;</strong></p><ul><li><p>Snowflake case 
sensitivity is subtly <a href="https://github.com/mattarderne/snowflake-starter/blob/master/utils/case_sensitivity.sql">different to PostgreSQL</a>.&nbsp;</p><ul><li><p>Unquoted object identifiers are case-insensitive</p></li><li><p><code>&#8220;ANALYTICS&#8221; = ANALYTICS = analytics</code></p></li></ul></li><li><p>Create a user for every connecting system, and a user for every developer. This will enable you to <strong>track the source and cost of all queries</strong>.&nbsp;</p></li><li><p>If you already have a Snowflake database, you can visually analyse your setup with the <a href="http://snowflakeinspector.hashmapinc.com/?utm_source=groupby1.substack.com">snowflakeinspector.com</a>, great for tracking poorly configured snowflake permissions that you may inherit.</p></li><li><p>A very useful bit of code is the <strong><code>grant on future</code></strong> snippet, which allows you to grant all future tables in a schema with a certain permission.&nbsp;</p></li></ul><pre><code><code>g</code>rant usage on future SCHEMAS in database RAW to role TRANSFORM

grant select on future TABLES in database RAW to role TRANSFORM</code></pre><h1>2. Extract and Load Nuance</h1><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2CsO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2CsO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2CsO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2CsO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2CsO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2CsO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/b12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2CsO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2CsO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2CsO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2CsO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fb12d25d5-57a8-45a1-9742-801d66c84d1e_1600x1157.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>/Loaders</strong></p><p>If you are using <a href="https://stitchdata.com/?utm_source=groupby1.substack.com">Stitch</a>, <a href="https://fivetran.com/?utm_source=groupby1.substack.com">Fivetran</a> or similar, you can 
target your data warehouse at this point. Assign the tool the appropriate role, warehouse, database and schema as specified in the deployment script (<code>ROLE_INGEST, WAREHOUSE_INGEST, RAW</code>).&nbsp;</p><p>Stitch will create a schema based on the name you give to the job, so stick with something scalable. I like the <code>&lt;loader&gt;_&lt;source&gt;</code> format, so you&#8217;ll start with something like <code>STITCH_HUBSPOT</code>. It&#8217;s key to note that this means you can later swap out the Stitch part for a <code>FIVETRAN_HUBSPOT</code> or an <code>ETL_HUBSPOT</code>.&nbsp;</p><p><strong>/JSON</strong></p><p>Managed ELT tools will load data as best they can, typically as rows and columns, but often will insert your data as raw JSON into a single column. This is a good thing. It allows you to become familiar with the incredibly useful Snowflake JSON SQL syntax.&nbsp;</p><p>If you write any custom ELT scripts, make sure they load all data as JSON into a <code>VARIANT</code> column. This is the crux of ELT. Schemaless loading means your data lands without any notion of a schema, and so you can define the schema later on in one go in the transformation step. This can be seen as a big step, but it helps to be able to define ALL transformations in the transformation stage, and not have to go back to your Python scripts to add new fields.</p><p><strong>Additional Notes:</strong></p><ul><li><p>Start with a tutorial for <a href="https://calogica.com/sql/2018/12/17/parsing-nested-json-snowflake.html/?utm_source=groupby1.substack.com">handling JSON in Snowflake</a>, just to get the <a href="https://interworks.com/blog/hcalder/2018/06/19/the-ease-of-working-with-json-in-snowflake/?utm_source=groupby1.substack.com">basics</a>.&nbsp;</p></li></ul><h1>3. Secure the perimeter&nbsp;</h1><p>At this stage there is a risk of moving too fast, and that awkward speed wobble is avoided by taking stock and balancing the books. 
</p><p>The pre-retrospective things to attend to are Costs and Sensitive Data.</p><h2>Costs</h2><p>Snowflake is a powerful tool, and with the largest warehouse running into the thousands of dollars <em>per hour</em>, you want to do two things:</p><p><strong>/Set a budget and limit</strong></p><p>Determining what you are willing to spend in a month is a good start, and setting a policy (a resource monitor, in Snowflake&#8217;s terms) to alert you at various increments of that amount will avoid a broadside attack from Finance. Setting the policy to disable future queries across specific warehouses or all of them is a good trip switch to ensure that you aren&#8217;t caught at sea.</p><p><strong>/Get alerted&nbsp;</strong></p><p>Worse than running up a large bill (depending on who you ask) would be for your credit limit policy to come into play the moment you click run when demo&#8217;ing your fancy analytics to a client or stakeholder.&nbsp;</p><p>For this reason, keeping close tabs on spikes in credit usage and becoming familiar with how and where your credits are going is very high on your new agenda. Remember this is SaaS, i.e. <em>Operational Expense</em>. <strong>All the costs lie ahead of you on this one.</strong></p><p><a href="https://github.com/snowflakedb/SnowAlert">SnowAlert</a> is a tool that Snowflake maintains. I&#8217;ve adopted some of the queries as part of my suggested monitoring in the <a href="https://github.com/mattarderne/snowflake-starter/#snowalert">Snowflake-Starter</a> repo. The queries look for spending spikes across the infrastructure and will return results only if they detect a spike.&nbsp;</p><p>One last thing on cost management, and this is more of an opinion.</p><p>Historically, database resources were specified against a budget for their max expected load. This left lots of performance headroom for the median query. 
One could view Snowflake costs as roughly equivalent to this performance headroom, in that a Snowflake query could run faster if you assign it a larger warehouse at increased cost.&nbsp;</p><p>However, <strong>there is a premium being paid for the flexibility</strong>, and so it benefits you to manage your fleet of warehouses carefully, lest they turn on you. Snowflake is an operational expense. This is a subtle shift. The crux is that every credit spent should &#8220;deliver value&#8221; in a somewhat meaningful way.&nbsp;</p><p><strong>Additional Notes:</strong></p><ul><li><p>Snowflake caches results of queries, meaning that you won&#8217;t get charged for queries that hit the cache. This requires some nuance when modelling credit-intensive processes like incremental updates. See this <a href="https://medium.com/hashmapinc/30-second-snowflake-cloud-data-warehouse-cheat-sheet-e72c42b863a4">blog</a> for a run-through.</p></li><li><p>Snowflake charges lightly for metadata queries. This matters because each time your transform tool runs, it queries the schema definition <em><strong>heavily</strong></em>. This used to be free; it now isn&#8217;t. The cost is negligible but it is worth noting what is going on.&nbsp;</p></li></ul><h2>Sensitive Data</h2><p><strong>/Masking</strong></p><p>Snowflake&#8217;s <strong>&#8220;Dynamic Data Masking&#8221; </strong>feature isn&#8217;t quite as dynamic as it sounds but is a welcome addition. You&#8217;ll <strong><code>create or replace masking policy EMAIL_MASK</code></strong> and attach it to a column; the policy body then decides, based on the querying role, whether to return the real or the masked value. See this <a href="https://www.youtube.com/watch?v=ByyfTAj97xY">video</a> for an explanation. Being able to define masks at an object level is helpful. 
This is a new (enterprise only) feature and works in conjunction with, or in addition to, the <a href="https://community.snowflake.com/s/article/Methods-for-Securing-PII-Data-in-Snowflake/?utm_source=groupby1.substack.com">standard masking features</a>.</p><p><strong>/Access Control</strong></p><p>Enable a <a href="https://docs.snowflake.com/en/user-guide/network-policies.html">network policy</a> that whitelists the IPs of Stitch, your BI tool, VPN, etc.</p><p>Enable <a href="https://docs.snowflake.com/en/user-guide/ui-preferences.html#enrolling-in-mfa-multi-factor-authentication">multi-factor authentication</a> (MFA) with the <a href="https://duo.com/product/multi-factor-authentication-mfa/duo-mobile-app">Duo app</a>. Duo is GREAT. It prompts for a password-protected authorisation on your phone&#8217;s home screen. No excuses. At a minimum, all users assigned the <code>ACCOUNTADMIN</code> role should be required to use MFA.</p><h1>4. Setting Sail</h1><p>Where you take Snowflake from this point, like setting sail, depends on where you want to go. In my <a href="https://groupby1.substack.com/p/data-as-a-utility-tool">previous post</a>, I outlined what I&#8217;d do next, and it looks something like setting up a few data loading tools, writing transforms in <a href="https://dataform.co/?utm_source=groupby1.substack.com">Dataform</a> and then distributing the results in an analytics tool. 
If you haven&#8217;t, <a href="https://groupby1.substack.com/p/data-as-a-utility-tool">please check it out</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l0bS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l0bS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 424w, https://substackcdn.com/image/fetch/$s_!l0bS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 848w, https://substackcdn.com/image/fetch/$s_!l0bS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!l0bS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l0bS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg" width="1456" height="970" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:970,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l0bS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 424w, https://substackcdn.com/image/fetch/$s_!l0bS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 848w, https://substackcdn.com/image/fetch/$s_!l0bS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!l0bS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fee90038c-bd14-472b-87f7-301f36998802_1600x1066.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>I will not be overemphasising this section, but rather point out a few of the most interesting features that fall under <strong>analysing data</strong>. You could at this point treat Snowflake like you would a very tiny <code>t2.tiny</code> PostgreSQL instance, forget about it (other than the $) and continue.&nbsp;</p><p>New features in themselves are not always so interesting, but what is interesting is what they enable when combined with existing features. As in technology, so in databases.&nbsp;</p><p><strong>/Swap With</strong></p><pre><code><code>alter database PROD swap with STAGE</code></code></pre><blockquote><p>Swaps all content and metadata  between two specified tables, including any integrity constraints  defined for the tables. Also swap all access control privilege grants.  
<strong>The two tables are essentially renamed in a single transaction</strong>.&nbsp;</p></blockquote><p>Applied to databases, this enables a Blue/Green deployment, which in simple terms means: create a new database with your changes (<code>STAGE</code>), run tests on it and, if they pass, swap it with <code>PROD</code>. If an hour later you realise you&#8217;ve deployed something terrible, swap it back.&nbsp;</p><p><strong>/Zero copy clone</strong></p><pre><code><code>create or replace table USERS_V2 clone USERS</code></code></pre><p>Create an instant clone of Tables, Schemas, and Databases at zero cost (until you change the data). Great for testing, development and deployment.</p><p><strong>/Time Travel</strong></p><p>Combined with the clone function, one can <a href="https://docs.snowflake.com/en/user-guide/data-time-travel.html">time travel to a table</a> as it existed at a specified time (up to 1 day back on the standard plan, up to 90 days on enterprise). The command below will recover the schema as it was at the timestamp (after a wayward <code>DROP</code>, perchance).</p><pre><code><code>create schema TEST_RESTORE clone TEST at (timestamp =&gt; to_timestamp(40*365*86400));</code></code></pre><p><strong>/External functions</strong></p><p>Call a <a href="https://docs.snowflake.com/en/sql-reference/external-functions-introduction.html">REST API</a> from your SQL. Great for those pesky ML functions.&nbsp;</p><pre><code><code>select zipcode_to_city_external_function(ZIPCODE)
from ADDRESS;</code></code></pre><h1>Closing Meta Industry Thoughts</h1><p>Snowflake is building a platform, meaning they are building the one-stop shop for your data needs. The notion of Data Loading is likely going to become more fringe. Snowflake has already moved in this direction with <a href="https://blocksandfiles.com/2020/06/04/snowflake-salesforce-integration-tools/?utm_source=groupby1.substack.com">Salesforce</a>.</p><blockquote><p>Einstein Analytics Output Connector for Snowflake lets customers move their Salesforce data into the Snowflake data warehouse alongside data from other sources. Joint customers can consolidate all their Salesforce data in Snowflake. Automated data import keeps the Snowflake copy up to date.</p></blockquote><p>This off-the-shelf analytics integration is a reasonable next step, perhaps in this case due to investment by Salesforce in Snowflake, but that aside, the data space is finding where its <em><strong>layers of abstraction</strong></em> lie, and this is shown in these industry moves.&nbsp;&nbsp;</p><p>Snowflake is building a platform, doing it well, and charging you for it. Engineering time remains expensive, and so outsourcing this to Snowflake&#8217;s managed platform will be a welcome relief. However, there are no free lunches, and Snowflake is building something bigger than a data warehouse. What this means is that if you take too much, you&#8217;ll be stuck with too much.&nbsp;</p><p>Echoing <a href="https://www.dremio.com/getting-locked-in-and-locked-out-with-snowflake/?utm_source=groupby1.substack.com">Dremio</a>, there is always a thought towards a modular data architecture <em><strong>&#8220;that&#8217;s built around an open cloud data lake* (e.g S3) instead of a proprietary data warehouse&#8221;</strong>. </em>I generally agree with this premise. 
Snowflake is built on top of AWS or Azure or GCP, and so is (was) a thin layer on top of raw storage and compute.&nbsp;</p><p><em>* More on <a href="https://fivetran.com/blog/when-to-adopt-a-data-lake//?utm_source=groupby1.substack.com">data lakes here</a></em></p><p>Snowflake is marching towards the abstractions seen in Software Engineering, where every job is a feature for them to build. Snowflake has built Data Warehouse Engineer, it is building ETL Engineer <em>and will likely build Data Engineer in some version soon</em>.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iyT8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iyT8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iyT8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iyT8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iyT8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iyT8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg" width="1456" height="972" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/cf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:972,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iyT8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iyT8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iyT8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iyT8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf206714-d7ad-4e9f-a72e-41d12b408620_1600x1068.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><blockquote><p>&#8220;It is not the ship so much as the skilful sailing that assures the prosperous voyage.&#8221; - George William Curtis</p></blockquote><div><hr></div><p><strong>Please comment if you have any feedback on any of this, I aim to improve with your help.</strong></p><p>Thanks to Dan Lee for reviewing and contributing to 
this post. </p><div><hr></div><p><em>What is <a href="https://groupby1.substack.com/about">group by 1</a></em></p><p><em>Who is <a href="https://rdrn.dev/?utm_source=groupby1.substack.com">Matt Arderne</a></em></p>]]></content:encoded></item><item><title><![CDATA[Data as a Utility Tool]]></title><description><![CDATA[Reasoning through a modern data analytics architecture]]></description><link>https://groupby1.mattarderne.com/p/data-as-a-utility-tool</link><guid isPermaLink="false">https://groupby1.mattarderne.com/p/data-as-a-utility-tool</guid><dc:creator><![CDATA[Matt Arderne]]></dc:creator><pubDate>Sun, 07 Jun 2020 15:04:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AAoX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Welcome to <strong>group by 1</strong>. In this first post, I&#8217;ve started broad with my opinion on a few of the typical compromises made when implementing a modern data warehouse solution. Modern meaning cloud, data warehouse meaning the back-end for an analytics tool. 
This post is a primer for my future content.</em></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AAoX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AAoX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AAoX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AAoX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AAoX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AAoX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1350592,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AAoX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 424w, https://substackcdn.com/image/fetch/$s_!AAoX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 848w, https://substackcdn.com/image/fetch/$s_!AAoX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!AAoX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5fc281e7-c1f3-4813-b538-aacd2cda2764_2914x1943.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Within the companies I have worked for and plan on working for, uncertainty is a common thread. Sales <em>may</em> continue to accelerate, funding <em>should</em> land next quarter, we <em>hope</em> to keep in touch. The uncertainty may be more concrete. We <em>should</em> change to a new CRM. We <em>probably </em>need to stop reporting in Excel.&nbsp;</p><p>I&#8217;ve put together some opinions on what has worked for me in managing uncertainty when architecting data systems that need to cater for many parallel futures.</p><h3>Two Buckets</h3><p>Designing solutions for analytics systems can stylistically or abstractly be described as a problem of two buckets. Bucket one is full of the typical problems a business might have. 
The business usually then approaches &#8220;the data team&#8221; with problems such as <strong>help us define a metric / store the data / visualise the KPI / distribute the report</strong>.</p><p>In this simplistic utopia, bucket two, the Solutions Bucket, is typically filled with lots of products and opinions, like <strong>Snowflake / Big Query / my last company used Tableau / group by 1</strong> etc.</p><pre><code> 
select * from problems_bucket
&nbsp;  inner join (select * from solutions_bucket) </code></pre><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mnUg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mnUg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 424w, https://substackcdn.com/image/fetch/$s_!mnUg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 848w, https://substackcdn.com/image/fetch/$s_!mnUg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 1272w, https://substackcdn.com/image/fetch/$s_!mnUg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mnUg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png" width="526" height="332.2780487804878" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:259,&quot;width&quot;:410,&quot;resizeWidth&quot;:526,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mnUg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 424w, https://substackcdn.com/image/fetch/$s_!mnUg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 848w, https://substackcdn.com/image/fetch/$s_!mnUg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 1272w, https://substackcdn.com/image/fetch/$s_!mnUg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F465b1ebb-a438-4537-b2e9-3aedf405bcf6_410x259.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><p>The catch when architecting a solution is that you&#8217;re <em><strong>given only one scoop</strong></em> from the solution bucket, with the hope that it covers as many of the items in the problem bucket as possible. The solution bucket is usually very resource-intensive, expensive, time-consuming and gathers momentum once that scoop is moving. </p><p>The decision to re-scoop is not going to be taken lightly. The first scoop usually needs to be made under significant uncertainty and pressure. This is a time to be making bets that will serve you in many of your uncertain futures.</p><h3>Travel Light</h3><p>For this reason, I invoke the spirit of a prepper, where travelling light is as essential as being prepared. Enter the Swiss army knife.</p><p>My ideal scoop of the solution bucket, like a good utility tool, has a nice healthy mix of scalability, ease of use, and utilitarian functionality. 
Bare metal that stands the test of time and rests easily on the hip, ready for action!</p><p>More concretely, a lightweight data architecture is a modular one, where each component plays a specified part in the greater whole, without restricting the system. This enables upgrading, downgrading and replacing as necessary.&nbsp;</p><p>With that in mind, I&#8217;ll be describing <em>my opinion / experience / preference</em> for a utilitarian data architecture.</p><h3>Context</h3><p>The context of this article skews heavily to the typical <strong>first-hired-one-person-data-team</strong> scenario and is generally applicable if that person is within a small business, a startup or a small team within a larger organisation. It can be extended to a data team within a larger organisation that is rethinking its architecture.&nbsp;</p><p>New paradigms start from the ground up, and so it can safely be assumed that this paradigm will be what banks implement in 50 years while the rest of us use quantum computing to think the data into order.</p><p>The driver for this workflow is the need to centralise data across multiple systems, typically at the point where there are 3+ key business apps or systems.&nbsp;</p><p>If you're the one called in to take over from the last guy who burnt the ETL (extract-transform-load) candle at both ends and now has a 1000-yard stare, then this might hit a nerve.</p><p><strong>Ingesting, Storing, Transforming, Distributing</strong>. Four verbs for four (4) sections that describe what will be covered, and in what order.</p><h3>1. Ingesting</h3><p>I generally subscribe to the opinion that engineers should avoid writing custom ETL code whenever practically possible, and instead use a managed SaaS ETL tool. This resembles the corkscrew of our Swiss army knife. Powerful and simple.</p><p>Managed ETL tools allow you to connect to your supported sources, point those at your data warehouse and have data flowing in a matter of minutes. 
You are paying for specialisation here. Post-implementation, the ETL specialist at the end of an intercom is worth their (initially) nominal fee.</p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CZ4K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CZ4K!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 424w, https://substackcdn.com/image/fetch/$s_!CZ4K!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 848w, https://substackcdn.com/image/fetch/$s_!CZ4K!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 1272w, https://substackcdn.com/image/fetch/$s_!CZ4K!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CZ4K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png" width="500" height="333" 
data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/70020299-3b46-402c-b41e-4de0e3867437_500x333.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:333,&quot;width&quot;:500,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:288088,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CZ4K!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 424w, https://substackcdn.com/image/fetch/$s_!CZ4K!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 848w, https://substackcdn.com/image/fetch/$s_!CZ4K!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 1272w, https://substackcdn.com/image/fetch/$s_!CZ4K!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F70020299-3b46-402c-b41e-4de0e3867437_500x333.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><p>If you cannot get your data into your data warehouse with a managed ETL tool, or you cannot strike the cost/benefit balance, then you&#8217;ll have to start building. This is a good time to consider contracting the work to a specialist. They&#8217;ll bring the expertise, and over time you can consider internalising that skill as you see fit. Because the work is narrowly described and easily measured, this is a great piece of work for outsourcing. Budget for a maintenance contract, and keep an eye on managed ETL services as a replacement option over time.</p><p>Some additional thoughts:</p><ul><li><p>Ingesting raw data (JSON or tables) into your data warehouse is key. Don&#8217;t spend time at this point doing transforms in Python; there isn&#8217;t time. ETL has been surpassed by ELT (extract-load-transform). 
This new paradigm is now established.</p></li><li><p>A Google sheet is a data source. Time is of the essence and done is better than perfect. Data validation, spreadsheet protection and read-only permissions <em>do a database maketh</em>. Use this one sparingly, as word may get out.</p></li></ul><h3>2. Storing</h3><p>Balancing the trifecta of scalability, cost and performance is key when picking the backbone of your system. Your data may start small, or large, or small with a risk of growing large. Stopping to change a tire in bear country is never a good look, and neither is a data warehouse migration.</p><p>Managed data warehouses balance the trifecta, with scalability from a team of 1 to 100(n), megabytes to terabytes+, cost starting near zero, and performance flexibility to suit your budget and need.</p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c8SE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c8SE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c8SE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 848w, https://substackcdn.com/image/fetch/$s_!c8SE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!c8SE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c8SE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg" width="504" height="336.11538461538464" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:504,&quot;bytes&quot;:2753931,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c8SE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 424w, https://substackcdn.com/image/fetch/$s_!c8SE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!c8SE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!c8SE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F05e61e78-d226-44c8-a58b-de561533e6c2_5472x3648.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><p>Your contract for a data warehouse should begin near $0 and go up from there. 
Start negotiating your unit costs down once the value of the contract becomes significant, or sooner. The point is that you can get started, prove value, and iron out the details down the line.</p><p>Snowflake is a good start; BigQuery does wonders. Microsoft is probably up to something with Azure. Redshift is squarely in the <strong><code>migrated_from</code></strong> category. All can scale beyond your VC backer's wildest dreams.</p><p>This is the knife of your Swiss army knife. Simply put, a knife needs to be sharp and a data warehouse needs to be powerful. The main attraction.</p><h3>3. Transforming</h3><p>Pliers apply leverage. A Swiss army knife doesn&#8217;t have pliers, which is why no one owns one and everyone prefers a utility-tool. Loosely applying the same logic, the Transformation Layer has long been the missing link in the analytics stack, with various frustrating attempts at enabling elegant management of transformations.&nbsp;</p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HDV1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HDV1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 424w, https://substackcdn.com/image/fetch/$s_!HDV1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!HDV1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HDV1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HDV1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg" width="520" height="346.7857142857143" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:520,&quot;bytes&quot;:5529323,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HDV1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!HDV1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 848w, https://substackcdn.com/image/fetch/$s_!HDV1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!HDV1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F02752614-89fb-4bc7-9720-ce2329e640b6_6240x4160.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><p>The broad goal here is to enable access to your data for your business users while abstracting away as much of the source system complexity as possible. The outcome is clean, documented, coherent, reliable, logical, self-explanatory and performant data that can be relied upon by the <strong>Distributing</strong> tools. This is the highest leverage point in your pipeline. Leverage that magnifies both gains and mistakes.</p><p>SQL is the language of analysis, and a collection of SQL scripts best describes the transformation of data from landed <strong><code>RAW</code></strong> in your data warehouse to fully transformed and ready for <strong><code>ANALYTICS</code></strong>. The Analytics mentioned here is the schema that you expose to your <strong>Distributing</strong> tools. </p><p><a href="https://dataform.co/?utm_source=groupby1.substack.com">Dataform</a> is a tool that takes that simple concept and runs with it, making writing sophisticated transformations a delight for analysts. Simply explained, Dataform is a SQL editor that enables analysts to build complex transformations in a way that is maintainable and interpretable. Dataform is differentiated by three concepts from software engineering that are put in the hands of the analyst:&nbsp;</p><p><strong>1/ Continuous Deployment</strong></p><p>Deploying new code or changes to your transforms should happen continuously, and without fear. This is achieved through automated schema tests, continuously deploying code, and data validity and quality tests. 
Dataform supports this with its <strong><code>assertions</code></strong>, among other useful features.</p><p><strong>2/ Version Control</strong></p><p>If your job involves writing SQL code, and doesn't involve version control, then perhaps more than anything else, this article was written for you.</p><p><strong>3/ Modularity</strong></p><p>If your SQL queries typically run into the 100s or 1000s of lines, with sub-queries galore, then breaking them into individual reusable modular components will feel like our man on a rock below. Extend this with JavaScript and suddenly you will be able to <em>truly</em> <em>express yourself.</em></p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z3bL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z3bL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 424w, https://substackcdn.com/image/fetch/$s_!z3bL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 848w, https://substackcdn.com/image/fetch/$s_!z3bL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!z3bL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z3bL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg" width="498" height="332.114010989011" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:498,&quot;bytes&quot;:1664038,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z3bL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 424w, https://substackcdn.com/image/fetch/$s_!z3bL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!z3bL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!z3bL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F2dc8effc-17e2-494a-9500-e300f9df28fe_6354x4236.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><h3>4. Distributing</h3><p><code>### TODO - setup a BI tool</code></p><p>Distribution of data. 
Commonly described as an analytics tool or BI tool, aka <em>the Last Mile delivery problem.</em></p><p>In a physical product, as in an analytics project, the <strong>last mile of delivery</strong> is often both the most expensive and the most time-consuming part of the delivery mechanism. This is the point where the surface area expands massively, and the usage pattern permutations explode. Bluntly: the neatly organised cookie-cutter data pipeline gets punched in the face by the needs of the user.</p><p>The utility-tool analogy falls apart somewhat at this point, as arguably the pliers should be used here. Just like <a href="https://www.leatherman.com/tread-425.html">this</a> utility-tool, it can get a bit confusing.</p><p>Broadly speaking, the distribution problem breaks into two categories: BI tools and analytics tools. The distinction is murky, like your requirements. Generally speaking, these tools are either:</p><p><strong>1/</strong> Good at solving the operational reporting problems of business: metrics, KPIs, lots of users, lots of operational complexity (tools like Looker, <a href="https://metabase.com/?utm_source=groupby1.substack.com">Metabase</a>).</p><p><strong>2/</strong> Good at solving the analysts&#8217; problems: complicated questions, nuanced analysis, vague outcomes, forecasts, predictions (tools like Mode, Periscope, Jupyter Notebooks).</p><p>A rule of thumb is that you need a good few <em>business users</em> who are comfortable writing complicated SQL or Python before Option 2 will be feasible. This decision is largely based on the operational complexity and technical fluency of the stakeholders in this grand adventure, and generally Option 1 is more broadly applicable.</p><p>If you&#8217;ve done good work in your <strong>Transforming</strong> layer, then you can get away with a compromise here, and use a cheaper tool as a stop-gap, or use an array of tools, or allow the team to choose whatever suits them. 
Ultimately, you want to trend towards a single source of truth for KPI / metric-type numbers, and aim to automate their delivery.</p><h3>My experience</h3><p>I've homed in on my preferred data stack, described below. This stack is likely a feasible option for your goals if they are related to aligning your business on key metrics, especially if you have multiple SaaS or custom software systems floating around that drive these metrics. What you&#8217;ll end up with is something like the following diagram.</p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lkyQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lkyQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 424w, https://substackcdn.com/image/fetch/$s_!lkyQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 848w, https://substackcdn.com/image/fetch/$s_!lkyQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 1272w, https://substackcdn.com/image/fetch/$s_!lkyQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lkyQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png" width="556" height="349.0274725274725" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/e0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:914,&quot;width&quot;:1456,&quot;resizeWidth&quot;:556,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lkyQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 424w, https://substackcdn.com/image/fetch/$s_!lkyQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 848w, https://substackcdn.com/image/fetch/$s_!lkyQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 1272w, https://substackcdn.com/image/fetch/$s_!lkyQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0996748-9c20-4b96-a212-264c33e4ca9e_1600x1004.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a><p><strong>Ingesting/ </strong>As mentioned, I prefer to use a SaaS ELT tool like <a href="https://www.stitchdata.com/?utm_source=groupby1.substack.com">Stitch</a> or <a href="https://www.fivetran.com/?utm_source=groupby1.substack.com">Fivetran</a>, as they reduce the need for ongoing maintenance where possible. 
Stitch is the cheaper of the two and a great starting point, with the following useful additions:</p><ul><li><p>It has a great <a href="https://www.stitchdata.com/integrations/import-api/?utm_source=groupby1.substack.com">Import API</a> that allows some simplification of ELT scripts if you do need to write them.</p></li><li><p>It has a useful <a href="https://www.stitchdata.com/integrations/google-sheets/?utm_source=groupby1.substack.com">Google Sheets</a> integration, as well as the usual Postgres, HubSpot, Salesforce, Google/Facebook Ads, etc. </p></li></ul><p><strong>Storing/</strong> The stack described orients towards <a href="https://cloud.google.com/bigquery/?utm_source=groupby1.substack.com">BigQuery</a> or <a href="https://snowflake.com/?utm_source=groupby1.substack.com">Snowflake</a>, with PostgreSQL also a feasible option. I prefer the scale / cost model of Snowflake.</p><ul><li><p>Snowflake scales up to enterprise but starts from $2/credit, so it can be a very cost-effective bet, with typical small loads running around 2-5 credits per day. 
It can, however, get very expensive if you don&#8217;t manage usage carefully with limits.</p></li><li><p>PostgreSQL will require a migration in the future, so unless you are very cost-sensitive, the cost / benefit generally leans in favour of Snowflake.</p></li><li><p>I have a <a href="https://github.com/mattarderne/snowflake_init/blob/master/first_run.sql">simple SQL script</a> that I use to set up Snowflake for a POC, and I like to use <a href="https://github.com/snowflakedb/SnowAlert/blob/master/packs/snowflake_query_pack.sql">these</a> <a href="https://github.com/snowflakedb/SnowAlert/blob/master/packs/snowflake_cost_management.sql">scripts</a> to track Snowflake credit usage in combination with Dataform assertions.</p></li></ul><p><strong>Distributing/ </strong>This is where business users will interact with and judge the success of your system, so spending your budget on the rest of the components while cutting corners on the distribution tool is a bad idea. That said, BI tools can have expensive annual contracts.</p><ul><li><p><a href="https://metabase.com/?utm_source=groupby1.substack.com">Metabase</a> is a great open-source BI tool and should give you a good place to start. 
The jump in cost up to <a href="https://looker.com/?utm_source=groupby1.substack.com">Looker</a> / <a href="https://chartio.com/?utm_source=groupby1.substack.com">ChartIO</a> is quite severe, but so is the jump in features.</p></li><li><p>These tools are trickier to migrate from, so it is reasonable to expect to be locked in for the mid-term.</p></li><li><p>I have deployed Metabase successfully with HTTPS and good scalability using <a href="https://github.com/mattarderne/metabase">these Docker scripts</a>.</p></li></ul><p><strong>Transforming/ </strong>This may be premature depending on the level of sophistication of the logical transformations required to answer your questions, but at some stage it will make sense to move your transforms from the BI tool to the data warehouse.</p><ul><li><p>The best of breed at this stage is <a href="https://dataform.co/?utm_source=groupby1.substack.com">Dataform</a> or <a href="https://www.getdbt.com/?utm_source=groupby1.substack.com">dbt</a>. These tools enable software development best practices (git, testing, documentation).</p></li><li><p>There is relatively little involved in adding this from the start, and significant gains to be had if you use it to build a logical data model early on.</p></li></ul><p>In future editions I&#8217;ll be diving into the above specifics; stay tuned.</p><h3>Conclusion</h3><p>Taking the time to properly implement a reasoned and scalable analytics infrastructure is an axe-sharpening exercise with benefits that may compound massively over time. 
Second-order benefits to aim for include increasing the data proficiency of your team, enabling evidence-based decision making and most importantly, increasing alignment.</p><p>Most businesses follow similar patterns, and in survival as in business, preparation is key.</p><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!G9O1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!G9O1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 424w, https://substackcdn.com/image/fetch/$s_!G9O1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 848w, https://substackcdn.com/image/fetch/$s_!G9O1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!G9O1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!G9O1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg" width="1456" 
height="972" data-attrs="{&quot;src&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/ccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:972,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;https://images.unsplash.com/photo-1545476745-9211a9e7cca8?ixlib=rb-1.2.1&amp;q=85&amp;fm=jpg&amp;crop=entropy&amp;cs=srgb&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="https://images.unsplash.com/photo-1545476745-9211a9e7cca8?ixlib=rb-1.2.1&amp;q=85&amp;fm=jpg&amp;crop=entropy&amp;cs=srgb" title="https://images.unsplash.com/photo-1545476745-9211a9e7cca8?ixlib=rb-1.2.1&amp;q=85&amp;fm=jpg&amp;crop=entropy&amp;cs=srgb" srcset="https://substackcdn.com/image/fetch/$s_!G9O1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 424w, https://substackcdn.com/image/fetch/$s_!G9O1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 848w, https://substackcdn.com/image/fetch/$s_!G9O1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!G9O1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fccaa6be0-683a-4961-abe6-abec07d6018b_1600x1068.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><blockquote><p>&#8220;Give me six hours to chop down a tree and I will spend the first four sharpening the axe.&#8221; - Abe</p></blockquote><div><hr></div><p><em><a href="https://dataform.co/?utm_source=groupby1.substack.com">This was a guest post on the Dataform.co blog</a></em></p><p><em>Please consider subscribing for more on the subject of data 
systems thinking</em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://groupby1.mattarderne.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://groupby1.mattarderne.com/subscribe?"><span>Subscribe now</span></a></p><p><em>What is <a href="https://groupby1.substack.com/about">group by 1</a></em></p><p><em>Who is <a href="https://rdrn.dev/?utm_source=groupby1.substack.com">Matt Arderne</a></em></p>]]></content:encoded></item></channel></rss>