<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" > <channel> <title>Yash Mehta, Author at CDInsights</title> <atom:link href="https://www.clouddatainsights.com/author/yash-mehta/feed/" rel="self" type="application/rss+xml" /> <link>https://www.clouddatainsights.com/author/yash-mehta/</link> <description>Transform Your Business in a Cloud Data World</description> <lastBuildDate>Tue, 24 Jan 2023 22:11:23 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod> hourly </sy:updatePeriod> <sy:updateFrequency> 1 </sy:updateFrequency> <generator>https://wordpress.org/?v=6.6.1</generator> <image> <url>https://www.clouddatainsights.com/wp-content/uploads/2022/05/CDI-Favicon-2-45x45.jpg</url> <title>Yash Mehta, Author at CDInsights</title> <link>https://www.clouddatainsights.com/author/yash-mehta/</link> <width>32</width> <height>32</height> </image> <site xmlns="com-wordpress:feed-additions:1">207802051</site> <item> <title>Data Masking Methods for Data-Centric Security</title> <link>https://www.clouddatainsights.com/data-masking-methods-for-data-centric-security/</link> <comments>https://www.clouddatainsights.com/data-masking-methods-for-data-centric-security/#respond</comments> <dc:creator><![CDATA[Yash Mehta]]></dc:creator> <pubDate>Tue, 24 Jan 2023 22:11:20 +0000</pubDate> <category><![CDATA[Cloud Data Platforms]]></category> <category><![CDATA[Governance]]></category> <category><![CDATA[Security]]></category> <category><![CDATA[cloud governance]]></category> <guid isPermaLink="false">https://www.clouddatainsights.com/?p=2263</guid> <description><![CDATA[Data 
masking of cloud data helps businesses meet data privacy regulations and protect operations, customers, users, and sensitive information.]]></description> <content:encoded><![CDATA[<div class="wp-block-image"> <figure class="aligncenter size-full is-resized"><img fetchpriority="high" decoding="async" src="https://www.clouddatainsights.com/wp-content/uploads/2023/01/privacy-Depositphotos_27088923_S.jpg" alt="" class="wp-image-2265" width="750" height="563" srcset="https://www.clouddatainsights.com/wp-content/uploads/2023/01/privacy-Depositphotos_27088923_S.jpg 1000w, https://www.clouddatainsights.com/wp-content/uploads/2023/01/privacy-Depositphotos_27088923_S-300x225.jpg 300w, https://www.clouddatainsights.com/wp-content/uploads/2023/01/privacy-Depositphotos_27088923_S-768x576.jpg 768w" sizes="(max-width: 750px) 100vw, 750px" /><figcaption class="wp-element-caption"><em>Data masking of cloud data helps businesses meet data privacy regulations and protect operations, customers, users, and sensitive information.</em></figcaption></figure></div> <p>Data is an enterprise’s most valuable asset and the primary input for business analytics. As enterprises recognized its potential, they began collecting large volumes of data and moving it to cloud storage. Collecting and moving so much sensitive data opened up wide swaths of security vulnerabilities – the average data breach cost companies <a href="https://www.ibm.com/security/data-breach" target="_blank" rel="noreferrer noopener">$4.24 million in 2021</a>. That hefty price is pushing businesses to adopt data security solutions such as data masking, which protects data from external or unauthorized intrusion.</p> <p>As business becomes more data-driven, the chances of data leaks also increase, making the implementation of data security methods a priority for businesses. 
The adoption of proven security <a href="https://www.k2view.com/what-is-data-masking" target="_blank" rel="noreferrer noopener">methods like data masking</a> builds confidence and trust in the company. Data masking methods secure sensitive data by creating a dummy substitute that database teams can work with without compromising security.</p> <p><strong>See also: </strong><a href="https://www.clouddatainsights.com/big-three-launch-sovereign-cloud-efforts/">Big Three Launch Sovereign Cloud Efforts</a></p> <h3 class="wp-block-heading">Types of data masking</h3> <p>Businesses can mask sensitive data in many ways and can select the type of data masking that fits their requirements. The main types are described below:</p> <h4 class="wp-block-heading">Static data masking (SDM)</h4> <p>This type of data masking creates a sanitized version of production data (<strong>a fully or partially masked data set) that is later used in or sent to </strong>different environments, such as testing, development, or training. With SDM, masked data can be passed to downstream teams within the organization, or even to third parties, where sharing the real data would carry a risk of leakage. Thus, SDM produces an altered, masked version of the sensitive data that can be forwarded to the intended environment.</p> <h4 class="wp-block-heading">Dynamic data masking (DDM)</h4> <p>Dynamic data masking (DDM) is more commonly used to conceal or mask real-time data – data sets within business processes are altered depending on the access level or authentication required for a particular process. Unlike SDM, dynamic masking makes no physical changes to the original production database; instead, data is masked and copied to the different environments on demand, so concealment is applied at the point of transfer, as data sets are requested or accessed. 
With DDM, businesses can implement role-based (object-level) authentication for databases or systems.</p> <h4 class="wp-block-heading">On-the-fly data masking (OFDM)</h4> <p>This type is typically used when business processes require continuous movement of data that needs to be masked; for instance, when a business performs software testing extensively. It works best for supplying a development or testing environment with masked data as soon as the data is produced, so no separate staging environment is needed to prepare the masked data for transfer. The process masks subsets or pieces of data, as required.</p> <h4 class="wp-block-heading">Deterministic data masking</h4> <p>This type of data masking is used for databases in which the same kind of data appears across mapped data sets. Deterministic data masking always substitutes a given value with the same replacement value across those data sets. For instance, in a database with multiple tables containing a customer’s personal or sensitive information, such as the first name, each name is replaced with a fixed substitute. If ABC is a first name present in multiple tables, ABC is masked with XYZ at every instance in the database.</p> <h4 class="wp-block-heading">Unstructured data masking</h4> <p>As the name suggests, this type of data masking is useful for unstructured data (data that is qualitative and often cannot be categorized as sensitive by data tools). Such data includes scanned images, such as insurance claims, bank checks, and medical records. 
This data is shared and accessed by many people in different formats within businesses, putting sensitive information at risk.</p> <p><strong>See also: </strong><a href="https://www.rtinsights.com/automating-data-governance-leverage-ai-as-your-digital-doorman/">Automating Data Governance: Leverage AI as Your Digital Doorman</a></p> <h3 class="wp-block-heading">Data masking with various platforms – enterprise data masking tools</h3> <p>Secured and sanitized sensitive information enables businesses to maximize the potential of big data. Enterprise data masking tools provide an end-to-end platform with a wide range of features for integrating raw, scattered, structured and unstructured data from various sources.</p> <p>For example, Informatica makes a robust and versatile data masking platform capable of handling difficult data use cases. <a href="https://www.informatica.com/products/data-security/data-masking.html" target="_blank" rel="noreferrer noopener">Informatica offers a resource</a> called Cloud Data Masking to help safeguard data privacy in sensitive scenarios. It helps provide a complete, cloud-native data governance (compliance) and privacy solution, allowing data to be masked based on users, roles, and locations. Another enterprise data masking tool is K2view, which has been top-scored in the <a href="https://www.gartner.com/reviews/market/data-masking" target="_blank" rel="noreferrer noopener">Gartner Data Masking Report 2022</a> and offers data masking through its data product platform. The data product platform streamlines the process of masking all the data pertaining to particular business entities, including clients, orders, credit card details, etc., and controls the integration and transmission of the encrypted Micro-Data of each business entity. 
For operational services such as customer data management (Customer 360) or test data management, it uses dynamic data masking techniques to modify, disguise, or deny access to sensitive data based on user responsibilities and rights.</p> <p>The graphical data transformation and orchestration tool uses in-flight data masking to avoid fully masking huge data sets up front; instead, it integrates and masks data when a quick transition is needed from any source (production) system into any target application. It also uses a combination of data masking types to protect unstructured data.</p> <h3 class="wp-block-heading"><a></a>Conclusion</h3> <p>As businesses lean more on cloud software and applications, they need to raise the level of security and privacy assurance. Data masking methods help businesses comply with numerous data protection requirements, including CCPA, HIPAA, and PCI DSS. Security measures like data masking protect business operations, customers, users, and sensitive information. Various types of data masking are available, and businesses dealing with sensitive data can choose among them depending on their requirements.</p> <div class="saboxplugin-wrap" itemtype="http://schema.org/Person" itemscope itemprop="author"><div class="saboxplugin-tab"><div class="saboxplugin-gravatar"><img decoding="async" src="https://www.clouddatainsights.com/wp-content/uploads/2022/05/Yash-Mehta-150x150-1.jpg" width="100" height="100" alt="" itemprop="image"></div><div class="saboxplugin-authorname"><a href="https://www.clouddatainsights.com/author/yash-mehta/" class="vcard author" rel="author"><span class="fn">Yash Mehta</span></a></div><div class="saboxplugin-desc"><div itemprop="description"><div class="author-info"> <div class="author-description"> <p>Yash Mehta is an internationally recognized IoT, M2M and Big Data technology expert. He has written a number of widely acknowledged articles on Data Science, IoT, Business Innovation, Cognitive intelligence. 
His articles have been featured in the most authoritative publications and awarded as one of the most innovative and influential works in the connected technology industry by IBM and Cisco IoT department. He heads Intellectus (thought-leadership platform for experts) and is a board member in various tech startups.</p> </div> </div> </div></div><div class="clearfix"></div></div></div>]]></content:encoded> <wfw:commentRss>https://www.clouddatainsights.com/data-masking-methods-for-data-centric-security/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">2263</post-id> </item> <item> <title>From Big Data to Bigger Data: Redoing Data Preparation</title> <link>https://www.clouddatainsights.com/from-big-data-to-bigger-data-redoing-data-preparation/</link> <comments>https://www.clouddatainsights.com/from-big-data-to-bigger-data-redoing-data-preparation/#respond</comments> <dc:creator><![CDATA[Yash Mehta]]></dc:creator> <pubDate>Mon, 04 Apr 2022 18:52:26 +0000</pubDate> <category><![CDATA[Integration]]></category> <guid isPermaLink="false">https://clouddatainsights.com/from-big-data-to-bigger-data-redoing-data-preparation/</guid> <description><![CDATA[Data is growing, and so is the time spent on data preparation to retrieve, process, and manage it. In the pursuit of real-time business intelligence, the enterprise’s reaction to big… <a href="https://www.clouddatainsights.com/from-big-data-to-bigger-data-redoing-data-preparation/" class="" rel="bookmark">Read More »<span class="screen-reader-text">From Big Data to Bigger Data: Redoing Data Preparation</span></a>]]></description> <content:encoded><![CDATA[ <p>Data is growing, and so is the time spent on data preparation to retrieve, process, and manage it. In the pursuit of real-time business intelligence, the enterprise’s reaction to big data is ambitious yet insufficient and inconsistent. 
The volume of data is outpacing enterprises’ readiness to handle it.</p> <p>Such inefficiencies are a particular concern when tasks like preparation account for 44% of a professional’s time. In a <a href="https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/">survey</a> of professionals, data scientists reported spending 40% of their working hours on <a rel="noreferrer noopener" href="https://www.rtinsights.com/tag/etl/" target="_blank">manual data preparation</a>, while only 11% was left for their core tasks. Now imagine how much time could be recovered if the same preparation were automated. As it stands, sluggish and error-prone processes are doing more harm than good to the data science ecosystem.</p> <p>This increase in data traffic, and the challenges that come with it, has compelled organizations to think beyond conventional methods and embrace contemporary data preparation solutions. According to <a href="https://www.marketsandmarkets.com/PressReleases/data-prep.asp" target="_blank" rel="noreferrer noopener">Markets And Markets</a>, the data preparation market is growing at a compound annual growth rate (CAGR) of 25.2%. If that is true, organizations should wake up to self-servicing data preparation methodologies.</p> <p><em>Self-servicing tools </em><a href="https://www.k2view.com/data-preparation/" target="_blank" rel="noreferrer noopener"><em>automate the data preparation</em></a><em> methods, thereby enabling data scientists and business users to execute the life cycle with ease.</em></p> <p>That is, the process, from exploration, access, and profiling to cleansing and transformation, occurs in a predefined yet interactive pattern. Since the self-servicing tools hosted on cloud-native platforms automate the life cycle, the users (and other professionals) get to focus on core analytics. 
Put simply, it empowers non-technical professionals such as business users to execute the preparation life cycle without coding skills or knowledge of the underlying technology layers.</p> <p>How does it work? After collection and reconciliation, the self-servicing tool runs the data files through a workflow that is designed to perform all the steps iteratively. By the end of the workflow, the data sets are populated into a final file that is then loaded into a data store or a warehouse for business analytics.</p> <p>While <a href="https://www.cio.com/article/3235394/how-to-select-the-best-self-service-bi-tool-for-your-business.html" target="_blank" rel="noreferrer noopener">selecting a self-servicing tool</a>, check for the following attributes to ensure optimal value:</p> <ul class="wp-block-list"><li>Compatibility with all data sets: exploration and access should support all sources, from Excel and CSV files to data lakes, warehouses, and SaaS platforms.</li><li>ML-engineered cleansing, profiling, and enrichment functions.</li><li>Default support for self-triggered discovery, profiling, standardization, suggestions, and visualization.</li><li>Seamless export to different file types and targets, such as Excel, SaaS-native formats, and analytics dashboards like Tableau.</li><li>Support for features like automated versioning and advanced design for a variety of ETL processes.</li></ul> <h3 class="wp-block-heading"><strong>Data preparation expertise a must</strong></h3> <p>Despite the availability of self-servicing tools, enterprises struggle with their data preparation expertise and do not fully realize the benefits for real-time as well as offline applications.</p> <p>This happens due to complex UIs and the inability to populate quality data consistently. At the core are conventional preparation methods that work database by database, table by table, and row by row, not to mention the complex joins to other tables through scripts and indexes. 
Here, the data mapping and validation logic is complicated and requires assuring referential integrity for every request.</p> <p>To address this, micro-databases could be used to store and populate data for every business entity. Such a system then performs end-to-end data preparation (discover, collect, cleanse, transform, and mask) for a specific business entity as and when required. Each micro-database would store the master data of a single business entity, such as one customer.</p> <p>Among many attempts over the years, K2View’s Data Fabric is a notable example of using micro-databases to automate data preparation. It captures data from multiple source systems and stores it as a standalone digital entity in an exclusive micro DB. This micro DB is readily available for consumer apps.</p> <p>The solution achieves <a rel="noreferrer noopener" href="https://www.k2view.com/products/data-preparation-hub/" target="_blank">end-to-end data preparation</a> at the business entity level. Unlike conventional approaches, this data preparation hub defines a digital entity schema that includes all attributes for the specific business entity, regardless of their source systems. It automatically locates the desired data sets specific to the business user in the system’s landscape and creates a connection to all those sources. The system automatically synchronizes data sets with the sources on a predefined schedule. It also automatically applies filters, enrichments, and masking.</p> <p>Besides optimal utilization of resources, such an approach leads to complete, correct, and high-quality data preparation. </p> <h3 class="wp-block-heading">Remember the end goal</h3> <p>A personalized, faster, and more profitable consumer experience should be the ultimate goal of all business processes, including data analytics and preparation. Unless you strengthen your foundation, the impact at the front end is bound to suffer and affect the utility of your products and services. 
That being said, there’s only one principle for mastering data issues: act upon them in advance.</p> <div class="saboxplugin-wrap" itemtype="http://schema.org/Person" itemscope itemprop="author"><div class="saboxplugin-tab"><div class="saboxplugin-gravatar"><img decoding="async" src="https://www.clouddatainsights.com/wp-content/uploads/2022/05/Yash-Mehta-150x150-1.jpg" width="100" height="100" alt="" itemprop="image"></div><div class="saboxplugin-authorname"><a href="https://www.clouddatainsights.com/author/yash-mehta/" class="vcard author" rel="author"><span class="fn">Yash Mehta</span></a></div><div class="saboxplugin-desc"><div itemprop="description"><div class="author-info"> <div class="author-description"> <p>Yash Mehta is an internationally recognized IoT, M2M and Big Data technology expert. He has written a number of widely acknowledged articles on Data Science, IoT, Business Innovation, Cognitive intelligence. His articles have been featured in the most authoritative publications and awarded as one of the most innovative and influential works in the connected technology industry by IBM and Cisco IoT department. He heads Intellectus (thought-leadership platform for experts) and is a board member in various tech startups.</p> </div> </div> </div></div><div class="clearfix"></div></div></div>]]></content:encoded> <wfw:commentRss>https://www.clouddatainsights.com/from-big-data-to-bigger-data-redoing-data-preparation/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">481</post-id> </item> </channel> </rss>