<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" > <channel> <title>data mesh Archives - CDInsights</title> <atom:link href="https://www.clouddatainsights.com/tag/data-mesh/feed/" rel="self" type="application/rss+xml" /> <link>https://www.clouddatainsights.com/tag/data-mesh/</link> <description>Transform Your Business in a Cloud Data World</description> <lastBuildDate>Tue, 10 Oct 2023 19:08:14 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod> hourly </sy:updatePeriod> <sy:updateFrequency> 1 </sy:updateFrequency> <generator>https://wordpress.org/?v=6.6.1</generator> <image> <url>https://www.clouddatainsights.com/wp-content/uploads/2022/05/CDI-Favicon-2-45x45.jpg</url> <title>data mesh Archives - CDInsights</title> <link>https://www.clouddatainsights.com/tag/data-mesh/</link> <width>32</width> <height>32</height> </image> <site xmlns="com-wordpress:feed-additions:1">207802051</site> <item> <title>Data Mesh Implementation: Your Blueprint for a Successful Launch</title> <link>https://www.clouddatainsights.com/data-mesh-implementation-your-blueprint-for-a-successful-launch/</link> <comments>https://www.clouddatainsights.com/data-mesh-implementation-your-blueprint-for-a-successful-launch/#respond</comments> <dc:creator><![CDATA[Jon Osborn]]></dc:creator> <pubDate>Mon, 21 Aug 2023 23:42:17 +0000</pubDate> <category><![CDATA[Cloud Data Platforms]]></category> <category><![CDATA[data mesh]]></category> <guid isPermaLink="false">https://www.clouddatainsights.com/?p=4197</guid> <description><![CDATA[Simply understanding the principles of data mesh isn’t enough to spark transformative change. Real progress comes when we move beyond comprehension to the realm of application.]]></description> <content:encoded><![CDATA[<div class="wp-block-image"> <figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="1000" height="599" src="https://www.clouddatainsights.com/wp-content/uploads/2023/08/data-mesh-Depositphotos_78917668_S.jpg" alt="" class="wp-image-4205" srcset="https://www.clouddatainsights.com/wp-content/uploads/2023/08/data-mesh-Depositphotos_78917668_S.jpg 1000w, https://www.clouddatainsights.com/wp-content/uploads/2023/08/data-mesh-Depositphotos_78917668_S-300x180.jpg 300w, https://www.clouddatainsights.com/wp-content/uploads/2023/08/data-mesh-Depositphotos_78917668_S-768x460.jpg 768w" sizes="(max-width: 1000px) 100vw, 1000px" /><figcaption class="wp-element-caption">Simply understanding the principles of data mesh isn’t enough to spark transformative change.
Real progress comes when we move beyond comprehension to the realm of application.</figcaption></figure></div> <p><em>This article is sponsored and originally appeared on <a href="https://www.ascend.io/blog/data-mesh-implementation-your-blueprint-for-a-successful-launch/?utm_campaign=rtinsights&utm_source=rtinsights" target="_blank" rel="noreferrer noopener">Ascend.io</a>.</em></p> <p>Ready or not, data mesh is fast becoming an indispensable part of the data landscape. For data leaders, the question isn’t if you’ll cross paths with this emerging architectural pattern. The question is when. </p> <p>A shift this monumental can seem daunting, often leading to analysis paralysis, overthinking, or other implementation delays. This is where we want to step in. While the journey will differ from company to company — because of their unique business and data needs — there are fundamental principles that provide a blueprint for action. </p> <p>In this article, <strong>we provide practical, actionable steps to kick-start your data mesh implementation using the well-established approach of managing People, Processes, and Technology.</strong> This isn’t another theoretical deep dive. Consider this your primer to stop overthinking, start acting, and truly harness the power of data mesh.</p> <h3 class="wp-block-heading">Establishing the Baseline for Data Mesh Implementation</h3> <p>Data teams everywhere are leaning in, eager to figure out how data mesh can help. Sure, we’ve all seen trends come and go, causing a bit of chaos as we shift strategies and adjust our tech stacks. But something about data mesh feels different, doesn’t it?</p> <p>For one, data mesh tackles the real headaches caused by an overburdened data lake and the annoying game of tag that’s too often played between the people who make data, the ones who use it, and everyone else caught in the middle.
It feels like data mesh showed up just in time, as we’re all hustling to refine our data platform strategies to build better datasets, dashboards, analytical apps, algorithms, or, in short, <a href="https://www.ascend.io/blog/introducing-data-products/" target="_blank" rel="noreferrer noopener">data products</a>.</p> <figure class="wp-block-image size-full"><img decoding="async" width="936" height="652" src="https://www.clouddatainsights.com/wp-content/uploads/2023/08/Ascend2-1.jpg" alt="" class="wp-image-4198" srcset="https://www.clouddatainsights.com/wp-content/uploads/2023/08/Ascend2-1.jpg 936w, https://www.clouddatainsights.com/wp-content/uploads/2023/08/Ascend2-1-300x209.jpg 300w, https://www.clouddatainsights.com/wp-content/uploads/2023/08/Ascend2-1-768x535.jpg 768w" sizes="(max-width: 936px) 100vw, 936px" /></figure> <p>In this article, we won’t be diving into the deep end of what data mesh is. We’ve got plenty of resources for that — check out the <a href="https://datameshlearning.com/library/" target="_blank" rel="noreferrer noopener">Data Mesh Learning community</a> or our previous articles:</p> <ol class="wp-block-list" type="1" start="1"> <li><a href="https://www.ascend.io/blog/what-is-a-data-mesh/" target="_blank" rel="noreferrer noopener">What is a Data Mesh? — And Why You Might Consider Building One</a></li> <li><a href="https://www.ascend.io/blog/benefits-of-data-mesh-and-top-examples-to-unlock-success/" target="_blank" rel="noreferrer noopener">Benefits of Data Mesh and Top Examples to Unlock Success</a></li> <li><a href="https://www.ascend.io/blog/data-mesh-vs-data-fabric/" target="_blank" rel="noreferrer noopener">Data Mesh vs. Data Fabric: Which One Is Right for You?</a></li> </ol> <p>The tricky part isn’t understanding what data mesh is — it’s figuring out how to put it to work for your organization. So, the million-dollar question is, where do we begin this journey?</p> <p>While the data mesh concept is relatively new, we can leverage the established framework of managing People, Processes, and Technology to successfully guide the implementation. This framework has long been a trusted approach to complex transformations. Each of its three pillars offers a specific focus area that is essential for the successful implementation of any large-scale change, including a data mesh.</p> <figure class="wp-block-image size-full"><img decoding="async" width="970" height="664" src="https://www.clouddatainsights.com/wp-content/uploads/2023/08/ascend2-2.jpg" alt="" class="wp-image-4199" srcset="https://www.clouddatainsights.com/wp-content/uploads/2023/08/ascend2-2.jpg 970w, https://www.clouddatainsights.com/wp-content/uploads/2023/08/ascend2-2-300x205.jpg 300w, https://www.clouddatainsights.com/wp-content/uploads/2023/08/ascend2-2-768x526.jpg 768w" sizes="(max-width: 970px) 100vw, 970px" /></figure> <h3 class="wp-block-heading">People: Defining your Domains, Assessing Domain Maturity, and Selecting Your First Partner(s)</h3> <p>The data mesh model inherently pushes for domain-oriented decentralization, which implies a significant focus on the people within those domains. Establishing ownership and understanding the varying technical maturity levels across these domains is the initial, crucial step.</p> <p><strong>Identifying Domains</strong></p> <p>The shift towards a data mesh implies treating each domain as its own miniature data ecosystem. 
Defining this organizational architecture should come first.</p> <p>Here are the first steps to take: </p> <ul class="wp-block-list"> <li><strong>Identify data domains within your organization.</strong> These could range from functional groups like sales, marketing, and finance, to broader LOB units, which would likely be further broken down into sub-domains. Your first pass may not be perfect but will be the start of the organizational architecture. Understand the role each domain plays in creating, maintaining, and consuming data products.</li> <li><strong>Find a partner within each domain</strong> – someone influential who can champion the cause of the data mesh within their domain, making the transition smoother.</li> </ul> <p><strong>Assessing Domain Maturity</strong></p> <p>One major but often overlooked need when implementing a mesh is a well-defined maturity model. By using such a model, you can create repeatable patterns and processes to support domains and move them up the maturity scale. At the end of the day, the best data mesh implementation pulls the entire organization in to participate, not just the domains that have the dollars and skills to do so.</p> <ul class="nv-cv-m wp-block-list"> <li>Use the same People, Process, and Technology framework to create a maturity model. A good starting point looks at the following dimensions: <ul class="wp-block-list"> <li>People: What level of technical and curation skill does the domain have?</li> <li>Process: Are there processes already in place that support the move to data product creation and curation?</li> <li>Technology: What platforms and tools already exist?</li> </ul> </li> <li>Rank and categorize the domains you already know based on the maturity model.</li> </ul> <p><strong>Select Your First Partners</strong></p> <p>This is the key to moving quickly and showing the value of continued investment in a mesh. Using your domain assessment combined with the relationships you already have, choose two partners that are well up the maturity scale to begin this journey with you. As part of this, it’s key to create a Steering Committee. </p> <h3 class="wp-block-heading">Process: Building Robust Systems from Data Governance to Product Publishing</h3> <p>Just as a well-oiled machine needs precise processes, a successful data mesh needs to establish and adhere to processes regarding data governance, data curation, and data product publishing.</p> <p><strong>Implementing Data Governance</strong></p> <p>Effective data governance is the backbone of any data management strategy, and even more so for a decentralized approach like data mesh. A robust data governance framework is crucial to ensure data quality, manage data access, and maintain compliance with regulatory requirements.</p> <p><strong>Establish clear data governance policies.</strong></p> <p>The policies should outline rules and standards for data. These should be explicit and prescriptive, addressing the five aspects below:</p> <ol class="nv-cv-m wp-block-list" type="1" start="1"> <li><strong>Domain and business key definitions:</strong> Clearly define your business keys and the domains they belong to. This ensures everyone in the organization speaks the same data language.</li> <li><strong>Security access and roles:</strong> Define who has access to what data and the extent of their permissions.
This is essential for maintaining the integrity and confidentiality of your data.</li> <li><strong>Data quality expectations and minimum acceptable criteria:</strong> Set standards for what constitutes acceptable data quality. This helps maintain the reliability of data products across domains.</li> <li><strong>Contractual data obligations and acceptable use standards:</strong> This includes rules for sharing, aggregating, or other usage of data to avoid legal complications.</li> <li><strong>Data product sharing and version management:</strong> Establish a policy for managing the lifecycle of data products.</li> </ol> <p><strong>Promote cross-domain collaboration.</strong></p> <p>Encourage different domains to collaborate on establishing and enforcing data governance policies. This collaboration ensures consistency and compliance across the data mesh implementation.</p> <p>Dealing with regulatory compliance can be challenging for individual domains. This is why core IT often still handles a significant part of data management. Therefore, consider providing specific guidance or resources to help domains understand and navigate compliance implications.</p> <p>Don’t forget to publish business key definitions for relating domain data and maintain a simple-to-use catalog of domain data. These practices can support data users across domains in efficiently locating and utilizing the data they need.</p> <p><strong>Streamlining Data Curation</strong></p> <p>Data curation is the process of organizing and defining data in a way that maintains its usefulness over time. It’s crucial to have a consistent approach to data curation across all domains.</p> <p><strong>Establish standardized processes and guidelines for curating data products.</strong></p> <p>To ensure consistency in the data product definitions across domains, these guidelines should at least cover:</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li><strong>Metadata standards:</strong> Define a standard set of metadata to accompany every data product. This might include information about the data source, the type of data, the date of creation, and any relevant context or description.</li> <li><strong>Data validation checks:</strong> Outline a systematic approach to ensuring data accuracy, completeness, and consistency. This might involve data checks at different stages of the data lifecycle.</li> <li><strong>Data lineage documentation:</strong> Establish a clear process for tracking the journey of data from its origin to its current state. This is vital for understanding the history and reliability of your data products.</li> <li><strong>Data documentation best practices:</strong> Define a standard for documenting data products. This should include how to describe data products, how to document any transformations or modifications made to the data, and any other relevant notes or comments.</li> </ul> <p><strong>Train domain teams in these processes.</strong></p> <p>Providing detailed training ensures the guidelines and processes are effectively followed. Make sure that training materials are accessible and easy to understand, and consider offering regular refresher courses to account for any updates or changes to the processes.</p>
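<p>To make these curation guidelines concrete, here is a minimal sketch, in Python, of what a standardized metadata descriptor for a data product might look like. The field names and the <code>orders_curated</code> example are illustrative assumptions, not a prescribed schema.</p> <pre class="wp-block-code"><code>from dataclasses import dataclass, field
from datetime import date

@dataclass
class DataProductMetadata:
    """Standard metadata accompanying every data product (illustrative field names)."""
    name: str                  # business-facing name of the data product
    owner_domain: str          # domain accountable for the product
    description: str           # context for consumers
    source_systems: list       # where the data originates
    created_on: date           # date of creation
    version: str               # version of the product definition
    upstream_products: list = field(default_factory=list)  # lineage: products this one builds on

    def catalog_entry(self):
        """Render a one-line summary for a simple, searchable catalog of domain data."""
        lineage = ", ".join(self.upstream_products) or "none"
        sources = ", ".join(self.source_systems)
        return f"{self.name} v{self.version} (owner: {self.owner_domain}, sources: {sources}, upstream: {lineage})"

# Hypothetical example of a curated data product owned by the sales domain.
orders = DataProductMetadata(
    name="orders_curated",
    owner_domain="sales",
    description="Curated order facts shared across domains",
    source_systems=["crm", "billing"],
    created_on=date(2023, 8, 1),
    version="1.2.0",
)
print(orders.catalog_entry())</code></pre> <p>Publishing descriptors like this to a shared location is one straightforward way to provide the simple-to-use catalog of domain data recommended above.</p>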
<p><strong>Navigating Data Product Publishing</strong></p> <p>Having a clear process for data product publishing is vital to avoid any hiccups in the mesh due to changing data product definitions.</p> <p><strong>Develop a data product lifecycle framework.</strong></p> <p>This framework should guide the entire lifecycle of data products, from creation to retirement. Key elements to consider when working on the data mesh implementation include:</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li><strong>Creation:</strong> Define the processes involved in the creation of new data products, such as data collection, validation, and initial documentation.</li> <li><strong>Versioning: </strong>Establish a versioning system to track different iterations of a data product, enabling users to understand the evolution of the product and helping prevent confusion or misuse of outdated versions.</li> <li><strong>Updating:</strong> Set out clear procedures for making updates to existing data products. This should include steps for documenting changes, notifying users, and ensuring the updated product meets established quality standards.</li> <li><strong>Retirement:</strong> Outline the steps to decommission a data product, which could involve archiving the data, notifying users, and updating any related documentation or metadata.</li> </ul> <p><strong>Implement a publishing approval process.</strong></p> <p>Create a process to ensure new data products or updates meet quality standards before being published. Key elements include:</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li><strong>Quality checks:</strong> Set up systematic checks to validate the quality of the data product based on your established criteria.</li> <li><strong>Approval workflow:</strong> Define who is responsible for reviewing and approving new data products or updates. This might involve multiple levels of review depending on the complexity and impact of the data product.</li> <li><strong>Notification system:</strong> Establish a system to notify relevant stakeholders once a data product is approved and published. This could be through email notifications, updates on your data catalog, or automated alerts through your data platform.</li> </ul> <p>If you need further guidance on treating data as a product in the context of data mesh, we recommend reading our article <a href="https://www.ascend.io/blog/essential-capabilities-to-treat-data-as-a-product/" target="_blank" rel="noreferrer noopener">Essential Capabilities to Treat Data as a Product</a>.</p>
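<p>As an illustration of the publishing approval process described above, the following minimal Python sketch runs a set of quality checks and notifies stakeholders of the outcome before a new data product version goes live. The checks, product fields, and notification hook are hypothetical placeholders for whatever your platform provides.</p> <pre class="wp-block-code"><code># A minimal pre-publication gate (illustrative): run quality checks, collect any
# failures, and notify stakeholders of the outcome.

def has_owner(product):
    """Every data product must name an accountable owner domain."""
    return None if product.get("owner_domain") else "missing owner_domain"

def has_rows(product):
    """Reject empty publications."""
    return None if product.get("row_count", 0) else "product has no rows"

QUALITY_CHECKS = [has_owner, has_rows]

def notify(stakeholders, message):
    # Placeholder: in practice this could be email, a catalog update, or a platform alert.
    for recipient in stakeholders:
        print(f"notify {recipient}: {message}")

def publish(product, stakeholders):
    failures = [msg for check in QUALITY_CHECKS if (msg := check(product))]
    if failures:
        notify(stakeholders, f"publication of {product['name']} rejected: {failures}")
        return False
    notify(stakeholders, f"{product['name']} v{product['version']} approved and published")
    return True

# Hypothetical example run.
publish(
    {"name": "orders_curated", "version": "1.2.0", "owner_domain": "sales", "row_count": 1250},
    stakeholders=["data-governance-team"],
)</code></pre> <p>In practice, a gate like this would typically run in the build plane’s deployment pipeline, with the notification step wired to your data catalog or alerting tool.</p>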
<h3 class="wp-block-heading">Technology: Laying the Foundation with Storage, Build Plane, and Sharing Layer</h3> <p>The technological infrastructure forms the backbone of a data mesh. It is crucial to have a reliable storage and compute layer, a unified build plane, and a truly user-friendly sharing layer.</p> <p><strong>Building a Resilient Storage and Compute Layer</strong></p> <p>The storage and compute layer is where data is housed and processed, forming the base of your data mesh.</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li><strong>Decide on a centralized or decentralized approach.</strong> Consider whether a centralized storage and compute layer would work best for your organization, or whether allowing each domain to use its existing infrastructure would be more practical.</li> <li><strong>Evaluate your infrastructure.</strong> Understand your organization’s technical capabilities and existing infrastructure when deciding on the storage and compute layer.</li> </ul> <p><strong>Standardizing the Build Plane</strong></p> <p>The build plane is where the data products are built, refined, and maintained. It’s important to have a consistent and user-friendly build plane for a successful data mesh implementation.</p> <ol class="nv-cv-d nv-cv-m wp-block-list" type="1" start="1"> <li>Aim to consolidate your build plane onto a single platform. A unified platform brings the benefit of a shared, familiar environment for data product development across different domains.</li> <li>Having a single platform is particularly advantageous as it provides a consistent user experience, helping users from various domains navigate and use the tools efficiently and effectively.</li> </ol> <p>The choice of platform should accommodate users of varying technical capabilities. Ensuring that non-technical users can operate effectively within the platform broadens the range of users who can contribute to and leverage <a href="https://www.ascend.io/blog/benefits-of-data-mesh-and-top-examples-to-unlock-success/" target="_blank" rel="noreferrer noopener">the benefits of the data mesh</a>.</p> <p><strong>Streamlining the Sharing Layer</strong></p> <p>The sharing layer is the bridge that connects users to data products in the data mesh. It not only facilitates the discovery, connection, and use of data products but also ensures that the diligent work put into creating data products reaches its intended audience. As such, it’s one of the most challenging aspects of the data mesh, primarily due to the need for dynamic data sharing and change management, particularly as data products evolve.</p> <p><strong>Dynamic data sharing:</strong></p> <p>One of the greatest challenges is establishing a mechanism for dynamic data sharing. This process involves creating a system that automatically updates data for everyone who has subscribed whenever there are changes in the data products. This requires a tight integration between the sharing layer and the data product versioning system to ensure subscribers always have access to the most updated data.</p> <p><strong>Accessible and user-friendly design:</strong></p> <p>It is essential to ensure the sharing layer is intuitive and easy to navigate. Users should be able to find and connect to data products with ease, making the discovery and connection process as frictionless as possible.</p> <p><strong>Prioritizing discoverability:</strong></p> <p>The effectiveness of a data mesh greatly relies on the ease with which users can find and utilize existing data products.
Therefore, investing in features that enhance discoverability, such as advanced search functions or a well-structured data catalog, is critical.</p> <p><strong>Integration with data governance:</strong></p> <p>Lastly, the technical sharing capability should be closely aligned with data governance processes. This helps simplify change management when data products are updated, ensuring that changes are reflected promptly and accurately across all instances where the data product is used. Such integration also ensures that any changes comply with established data governance policies, maintaining the quality and reliability of your data products.</p> <h3 class="wp-block-heading">Taking the First Steps Towards Data Mesh Implementation</h3> <p>With countless resources at our disposal, the theory surrounding data mesh is well-established and readily available. However, simply understanding the principles of data mesh isn’t enough to spark transformative change. Real progress comes when we move beyond comprehension to the realm of application.</p> <p>This article has laid out the foundational principles integral to embarking on a data mesh journey. By treating it as a comprehensive project plan and systematically addressing the aspects of people, processes, and technology, the task becomes much more manageable. </p> <p>While the data mesh is a relatively new concept, the framework we’ve discussed is a classic approach to getting things done and can be effectively applied to implement a data mesh. As you navigate this journey, remember that the ultimate goal of a data mesh is to unlock the true potential of your data, enabling rapid insights and more effective decision-making across your organization.</p> <p></p> <div class="saboxplugin-wrap" itemtype="http://schema.org/Person" itemscope itemprop="author"><div class="saboxplugin-tab"><div class="saboxplugin-gravatar"><img alt='Jon Osborn' src='https://secure.gravatar.com/avatar/e0399f0ed9eb58554728eb4d9fe99693?s=100&d=mm&r=g' srcset='https://secure.gravatar.com/avatar/e0399f0ed9eb58554728eb4d9fe99693?s=200&d=mm&r=g 2x' class='avatar avatar-100 photo' height='100' width='100' itemprop="image"/></div><div class="saboxplugin-authorname"><a href="https://www.clouddatainsights.com/author/jon-osborn/" class="vcard author" rel="author"><span class="fn">Jon Osborn</span></a></div><div class="saboxplugin-desc"><div itemprop="description"><p>Jon Osborn is Field CTO at Ascend.io.</p> </div></div><div class="clearfix"></div></div></div>]]></content:encoded> <wfw:commentRss>https://www.clouddatainsights.com/data-mesh-implementation-your-blueprint-for-a-successful-launch/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">4197</post-id> </item> <item> <title>Data Mesh: From Concept to Reality with BairesDev, Monte Carlo, and Databricks</title> <link>https://www.clouddatainsights.com/data-mesh-from-concept-to-reality-with-bairesdev-monte-carlo-and-databricks/</link> <comments>https://www.clouddatainsights.com/data-mesh-from-concept-to-reality-with-bairesdev-monte-carlo-and-databricks/#respond</comments> <dc:creator><![CDATA[Elizabeth Wallace]]></dc:creator> <pubDate>Fri, 25 Nov 2022 13:01:40 +0000</pubDate> <category><![CDATA[Data Architecture]]></category> <category><![CDATA[Webinar]]></category> <category><![CDATA[data architecture]]></category> <category><![CDATA[data mesh]]></category> <category><![CDATA[databricks]]></category> <category><![CDATA[monte carlo]]></category> 
<category><![CDATA[Practitioner]]></category> <guid isPermaLink="false">https://www.clouddatainsights.com/?p=2032</guid> <description><![CDATA[Find out how BairesDev implemented a data mesh using tools from Databricks and Monte Carlo.]]></description> <content:encoded><![CDATA[ <div class="wp-block-uagb-image uagb-block-f85ad338 wp-block-uagb-image--layout-default wp-block-uagb-image--effect-static wp-block-uagb-image--align-none"><figure class="wp-block-uagb-image__figure"><img decoding="async" srcset="https://www.clouddatainsights.com/wp-content/uploads/2022/11/Depositphotos_166481500_S.jpg " src="https://www.clouddatainsights.com/wp-content/uploads/2022/11/Depositphotos_166481500_S.jpg" alt="" class="uag-image-2033" width="" height="" title="" loading="lazy"/></figure></div> <p>Data mesh is once again on everyone’s mind thanks to press from industry analysts like Gartner and McKinsey. It promises to help companies finally become data-oriented if they can only figure out how to implement and execute it within their own data structures. In a fascinating webinar presented by Data Science Salon, “Data Mesh: From Concept to Reality,” speakers Matheus Espanhol of BairesDev, Jason Pohl of Databricks, and Jon So of Monte Carlo demonstrate just how possible it is to leverage this decentralized data architecture approach using Databricks and Monte Carlo tools. </p> <h3 class="wp-block-heading">The four principles of a data mesh</h3> <p>In order to make the most of this concept, you must understand the four principles of a data mesh.</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li>Data domain ownership: Companies must host and serve data in an easily consumable way</li> <li>Data as a product: Application of product-making to data, i.e., easily discoverable and read, as well as versioning and security policies.</li> <li>Self-serve data: Tools and user-friendly interfaces.</li> <li>Federated governance: An overarching set of policies to govern operations</li> </ul> <p>These are the foundation of a well-executed data mesh. Companies must have this foundation in place before they can build a functioning data mesh.</p> <h3 class="wp-block-heading">Technical challenges of managing data </h3> <p>The companies participating in this webinar understood the challenges of becoming data-driven. They experienced challenges in scale, as well as the limits of their existing infrastructures. In addition, a lack of trust and quality prevented real data-driven decision-making.</p> <p>For Bairesdev, executing a data mesh required planning and restructuring their existing technology. And it wasn’t easy. The company includes over 5,000 engineers across 36 countries and delivers its services to a host of brands around the globe. Their solution needed to cause as little disruption as possible while improving the insights given to them by big data so they could help their customers in turn.</p> <h3 class="wp-block-heading">The team evaluated solutions to build a custom data mesh</h3> <p>BairesDev looked at some of its most perplexing challenges and noticed an overlap with the four foundational requirements of a data mesh. This helped make decisions a little easier because the team knew and understood what they were working towards.</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li>The company had strong autonomy and domain ownership already in place. It was able to find and define data owners.</li> <li>However, consumers didn’t trust the data. 
A lack of observability and the absence of a dedicated data product role kept performance and availability to a minimum.</li> <li>The company had good privacy practices, but metadata management and global policies were centralized.</li> </ul> <h3 class="wp-block-heading">Implementation presented its own challenges, but planning and creativity helped</h3> <p>The goal was to reduce complexity to begin the journey towards a data mesh. The company purposefully chose managed options to implement automation. This would help reduce the time to market for data products. Tools such as Fivetran, Monte Carlo, and Databricks provided these capabilities.</p> <p>The company also needed to reduce the complexity and scope. Kafka Connector Manager and Databricks CD provided automated integration tools and supported the creation of new architecture without building from scratch.</p> <ul class="nv-cv-d nv-cv-m wp-block-list"> <li><strong>Databricks:</strong> The lakehouse construction simplified the architecture and helped cover distinct domain needs. </li> <li><strong>Monte Carlo:</strong> Incident IQ helps with root cause analysis and encourages data discoverability. Users were able to maintain high-quality data products, including shared options.</li> </ul> <h3 class="wp-block-heading">The two keys of success: Data lakehouses and data observability</h3> <p>The lakehouse is simple, multi-cloud, and open. It is a complementary, not a competing, technology. In addition, the Databricks Unity Catalog allows administrators to manage and authenticate users from a central location.</p> <p>Another tool for executing a data mesh is Delta Sharing, the first open protocol for data sharing. Users can share data within their existing data lake with partners, suppliers, or even customers outside the identity provider. It allows users to scale their data mesh and integrate with other users and tools.</p> <p>As for data observability, Monte Carlo integrates with the Databricks Lakehouse. It automatically notifies domain or data team owners of anomalies and nudges teams to resolve the incident. Monte Carlo tools also help them understand how changes downstream or in the schema will affect the overall system.</p> <p>It can automate observability checks and facilitate the self-serve portion of a data mesh. These checks are preprogrammed to look for common issues and work out of the box. They are also customizable through the platform, ensuring that even a decentralized architecture offers a cohesive governance strategy.</p> <h3 class="wp-block-heading">Two common ways to organize a data mesh using Databricks</h3> <p>Companies must decide how to balance autonomy with complexity. </p> <h4 class="wp-block-heading">Harmonized</h4> <p>This is the truest form of a data mesh. It requires each domain to have the skills to manage the end-to-end data lifecycle but can create inefficiencies if there is a high level of data reuse.</p> <h4 class="wp-block-heading">Hub-and-Spoke</h4> <p>This option offers a hybrid data mesh with some centralization. If there are a large number of domains, it can reduce data sharing and management overheads. However, it blurs the boundaries around a truly decentralized system. </p> <h3 class="wp-block-heading">Data Mesh is attainable and actionable</h3> <p>The webinar clarifies how companies can implement new concepts, such as the data mesh, to transform how they handle data. It isn’t just a conceptual architecture but one that companies can achieve with planning and the right tools.</p>
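<p>To make the sharing layer discussion concrete, here is a minimal consumer-side sketch, assuming the open-source <code>delta-sharing</code> Python connector for the Delta Sharing protocol mentioned above. The profile file path and the share, schema, and table names are hypothetical placeholders issued by a data provider.</p> <pre class="wp-block-code"><code>import delta_sharing  # open-source connector for the Delta Sharing protocol

# A profile file from the data provider holds the sharing server endpoint and a token.
profile = "config.share"  # hypothetical path

# Discover everything the provider has shared with this recipient.
client = delta_sharing.SharingClient(profile)
for table in client.list_all_tables():
    print(table)

# Load one shared table as a pandas DataFrame.
# Table URLs follow the pattern "profile#share.schema.table".
url = profile + "#sales_share.curated.orders"
df = delta_sharing.load_as_pandas(url)
print(df.head())</code></pre> <p>Because the provider keeps the data in its own lake and grants read access through the protocol, recipients can keep pulling the latest version of each shared data product without maintaining their own copies.</p>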
<p>To view the entire webinar on demand and see more details about how the pieces fit together, visit the <a href="https://info.datascience.salon/data-mesh-from-concept-to-reality?utm_campaign=DSS%20Webinars&utm_medium=email&_hsmi=233858190&_hsenc=p2ANqtz-9KjPywLDr8nWRCqqNjz6xfnM8W1emRRO5TlwgjMpygSOBiHb5kVHXetiorwhrnQemRCC9h8JXJVh7Ui3USroIh8Pt7GA&utm_content=231026650&utm_source=hs_email">Data Science Salon</a>.</p> <div class="saboxplugin-wrap" itemtype="http://schema.org/Person" itemscope itemprop="author"><div class="saboxplugin-tab"><div class="saboxplugin-gravatar"><img loading="lazy" decoding="async" src="https://www.clouddatainsights.com/wp-content/uploads/2022/05/Elizabeth-Wallace-RTInsights-141x150-1.jpg" width="100" height="100" alt="" itemprop="image"></div><div class="saboxplugin-authorname"><a href="https://www.clouddatainsights.com/author/elizabeth-wallace/" class="vcard author" rel="author"><span class="fn">Elizabeth Wallace</span></a></div><div class="saboxplugin-desc"><div itemprop="description"><p>Elizabeth Wallace is a Nashville-based freelance writer with a soft spot for data science and AI and a background in linguistics. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain – clearly – what it is they do.</p> </div></div><div class="clearfix"></div></div></div>]]></content:encoded> <wfw:commentRss>https://www.clouddatainsights.com/data-mesh-from-concept-to-reality-with-bairesdev-monte-carlo-and-databricks/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">2032</post-id> </item> <item> <title>Enabling Innovation with the Right Cloud Data Architecture </title> <link>https://www.clouddatainsights.com/enabling-innovation-with-the-right-cloud-data-architecture/</link> <comments>https://www.clouddatainsights.com/enabling-innovation-with-the-right-cloud-data-architecture/#respond</comments> <dc:creator><![CDATA[David Curry]]></dc:creator> <pubDate>Fri, 10 Jun 2022 22:19:31 +0000</pubDate> <category><![CDATA[Data Architecture]]></category> <category><![CDATA[data architecture]]></category> <category><![CDATA[data fabric]]></category> <category><![CDATA[data mesh]]></category> <guid isPermaLink="false">https://www.clouddatainsights.com/?p=1339</guid> <description><![CDATA[With the wrong data architecture, businesses often end up collecting a huge amount of unstructured data but fail to achieve the business goals set out by the company.]]></description> <content:encoded><![CDATA[ <p>Businesses are collecting more data than ever before, with IDC and Seagate <a href="https://www.seagate.com/files/www-content/our-story/rethink-data/files/Rethink_Data_Report_2020.pdf" target="_blank" rel="noreferrer noopener">projecting a 42.2% annual growth rate</a> in enterprise data collection over the next four years. However, without a proper framework in place, a lot of this data ends up going unused or fails to reach the right destination. At a time when organizations are moving data to the cloud and experimenting with artificial intelligence and machine learning, the need for competent data architecture processes is more critical than ever before.</p> <h3 class="wp-block-heading"><a></a>What is data architecture?</h3> <p>Data architecture is the framework organizations use to document all data assets and govern how these assets are stored, arranged, and integrated into other data systems.
The implementation of standards, rules, and policies on data assets that flow into an organization can provide a clearer view of what is happening inside the business, enable faster analysis and visualization of data, and permit all teams to make data-driven decisions without the need for a complete understanding of the data process.</p> <p>Modern data architecture practices are built for applications that require real-time processing and visualization by integrating various data pipelines into a unified platform that can be accessed by all those in the company. The use of container orchestration tools automates much of the operational effort to run workloads and services, and data fabric or mesh technologies are the newest data-as-a-platform layer that fills the gaps in unifying an organization’s data flows.</p> <h3 class="wp-block-heading"><a></a>Why is data architecture important?</h3> <p>Without data architecture, businesses often end up collecting a huge amount of unstructured data and fail to achieve the goals set out by the company. This happens all too often, as organizations regularly take a ‘technology-first approach’ to data collection, according to McKinsey, and rush to implement new digital technologies without going through the necessary processes of designing data architecture and implementing infrastructure to support data and analytics at scale.</p> <p>This can lead to all sorts of issues further down the pipeline, including redundant and inconsistent data storage, in which an organization lacks a single source of truth and has similar datasets stored in numerous data silos. This lack of focus can also lead to development sprawl, where a single function requires multiple technologies to support it. As a business gets larger, these issues will be magnified and can lead to lower productivity, as more technologies inevitably lead to more potential blockages in the data pipelines, and new employees need to be taught how to use more technologies.</p> <h3 class="wp-block-heading"><a></a>What are the benefits of using modern data architecture?</h3> <p>Modern data architecture offers many benefits to data leaders looking to bring the new wave of collection, analytics, and visualization technologies into their businesses.</p> <ul class="wp-block-list"><li>Automation – The amount of data that digital organizations can collect today is far more than any manual process can handle, and many automation tools have been built with modern data architecture practices in mind to handle the workload.</li><li>Unified dashboard – All stakeholders should be able to access data regardless of the platform or system, improving collaboration and real-time access.</li><li>Performance and productivity – Modern data architecture enables the handling of data-intensive workloads, such as AI, ML, and data analytics platforms.</li><li>Scalability – Modern data architecture allows businesses to scale to meet current demands, while the infrastructure does not need to be tied to specific platforms or environments as it was in the previous generation of data collection.</li></ul> <h3 class="wp-block-heading"><a></a>A look to the future</h3> <p>One of the key reasons business leaders cite for failing to reach their business analytics goals is a lack of data architecture. This lack of focus on the data strategy from the start is likely going to cost businesses even more than it did a decade ago because of the cost of storing more data than ever before and running it through analytics platforms.
The creation of modern data architecture practices, such as data fabrics and data meshes, has provided smart organizations with the power to collect data from disparate sources and process and display it in one unified platform.</p> <div class="saboxplugin-wrap" itemtype="http://schema.org/Person" itemscope itemprop="author"><div class="saboxplugin-tab"><div class="saboxplugin-gravatar"><img loading="lazy" decoding="async" src="https://www.clouddatainsights.com/wp-content/uploads/2022/05/curry-150x150-1.webp" width="100" height="100" alt="" itemprop="image"></div><div class="saboxplugin-authorname"><a href="https://www.clouddatainsights.com/author/david-curry/" class="vcard author" rel="author"><span class="fn">David Curry</span></a></div><div class="saboxplugin-desc"><div itemprop="description"><div class="author-info"> <div class="author-description"> <p>David is a technology writer with several years’ experience covering all aspects of IoT, from technology to networks to security.</p> </div> </div> </div></div><div class="clearfix"></div></div></div>]]></content:encoded> <wfw:commentRss>https://www.clouddatainsights.com/enabling-innovation-with-the-right-cloud-data-architecture/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">1339</post-id> </item> </channel> </rss>