Master data management (MDM) is the process of creating and managing an organisation's critical data as a single master copy, known as master data. A large organisation has many departments, each department runs a number of software systems, and each system holds a large amount of data to share or use. Overall, a huge volume of data flows through the organisation. All of this data needs to be connected in one place, called a master file, that provides a common point of reference. So we can say that “Master data is basically a shared master copy of data from different departments, such as product, supplier, employee and customer data, used by several applications within an organisation”.
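To make the idea of a shared master copy concrete, here is a minimal sketch in Python. The department names, the fields and the merge rule (keep the first non-empty value seen for each field) are all assumptions made for illustration; a real MDM tool applies far richer matching and survivorship rules.

```python
from collections import defaultdict

# Hypothetical records for the same customers, held by different departments.
sales = [{"customer_id": "C001", "name": "Asha Rao", "email": "asha@example.com"}]
support = [{"customer_id": "C001", "name": "Asha Rao", "phone": "555-0100"}]
billing = [{"customer_id": "C002", "name": "Liam Chen", "email": "liam@example.com"}]


def build_master(*sources):
    """Merge records that share a customer_id into one master record each."""
    master = defaultdict(dict)
    for source in sources:
        for record in source:
            key = record["customer_id"]
            for field, value in record.items():
                # Assumed rule: keep the first non-empty value seen for a field.
                if value and field not in master[key]:
                    master[key][field] = value
    return dict(master)


master_data = build_master(sales, support, billing)
print(master_data["C001"])
```

Each application can then read customer C001 from `master_data` instead of keeping its own partial copy.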
According to Stamford, Conn.-based IT research firm Gartner Inc., “The growth of ‘big data,’ as well as social media and cloud computing, must be factored into master data management strategies”.
Master data and big data are closely connected. In the previous article we explained the relationship between master data and big data. Applying the master data concept brings significant benefits when leveraging big data, and big data tools in turn have their own importance in master data management. In this article we relate one big data tool, Hadoop, to master data management and look at its use and application in MDM.
Hadoop is an open source platform used for big data processing. It breaks large amounts of data into small “chunks” and stores these chunks across the different nodes of a Hadoop cluster. A MapReduce job then runs a query over all the chunks stored on those nodes. This helps an organisation answer many of the questions it expects big data solutions to address. In other words, Hadoop is a relatively new open source platform for big data, designed to let users handle increasing volumes of data quickly and efficiently, and any piece of data within a data set stored in a Hadoop cluster can be used in a query.
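The chunk-and-query idea can be sketched in a few lines of plain Python. This is only a single-machine simulation of the MapReduce pattern, not the Hadoop API: the function names, the sample records and the chunk size are assumptions for illustration, and in a real cluster the chunks would sit on different nodes and be mapped in parallel.

```python
from collections import defaultdict
from itertools import chain


def split_into_chunks(records, chunk_size):
    """Break the data set into fixed-size chunks."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_chunk(chunk):
    """Map step: emit a (country, 1) pair for every record in one chunk."""
    return [(record["country"], 1) for record in chunk]


def reduce_pairs(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical customer records.
records = [
    {"customer": "C001", "country": "IN"},
    {"customer": "C002", "country": "US"},
    {"customer": "C003", "country": "IN"},
    {"customer": "C004", "country": "UK"},
    {"customer": "C005", "country": "US"},
]

chunks = split_into_chunks(records, chunk_size=2)
mapped = [map_chunk(chunk) for chunk in chunks]  # one map call per chunk
counts = reduce_pairs(chain.from_iterable(mapped))
print(counts)
```

The query here is "how many customers per country"; the same split, map and reduce shape applies to other aggregations over the chunks.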
The Information Difference, an analyst firm, conducted a survey on big data and MDM in which 209 companies shared their views on the subject. The respondents were from North American and European companies, plus a few (11%) Asian respondents.
Of the companies surveyed, 77% said that big data was important to them and that they were leveraging it. Of those with active projects, 80% were using Hadoop, clearly the preferred technology for tackling such data. On master data management, 56% had live MDM implementations, with implementations upcoming in a further 14%. Asked whether big data and MDM are linked, 59% of respondents strongly agreed, and 67% of survey respondents saw MDM driving big data.
Doing master data management on a big data platform such as Hadoop has a lot of potential, especially when we leverage big data to build a single version of the customer and the product. If you have customer data with hundreds of attributes, Hadoop allows you to explore all of those attributes and their values. That makes Hadoop worth considering for problems related to a single version of the customer. Hadoop systems are highly scalable and relatively cost-efficient for large enterprise projects.
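As a rough illustration of what exploring attributes and values can mean, the sketch below profiles a handful of customer records in plain Python. The records, the attribute names and the `profile_attributes` helper are hypothetical; on Hadoop the same kind of profiling would run as a distributed job over far more records and attributes.

```python
from collections import defaultdict


def profile_attributes(records):
    """For each attribute, count filled records and collect distinct values."""
    profile = defaultdict(lambda: {"filled": 0, "values": set()})
    for record in records:
        for attribute, value in record.items():
            # Treat None and empty strings as missing values.
            if value not in (None, ""):
                profile[attribute]["filled"] += 1
                profile[attribute]["values"].add(value)
    return dict(profile)


# Hypothetical customer records with uneven attribute coverage.
customers = [
    {"id": "C001", "city": "Pune", "segment": "retail"},
    {"id": "C002", "city": "Pune", "segment": ""},
    {"id": "C003", "city": "Leeds"},
]

profile = profile_attributes(customers)
for attribute, stats in profile.items():
    print(attribute, stats["filled"], sorted(stats["values"]))
```

A profile like this shows which attributes are reliably filled and which values they take, which is useful input when deciding how to build a single version of the customer.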