Talk:Data profiling

From Wikipedia, the free encyclopedia
WikiProject Statistics (Rated Start-class, Low-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as Start-Class on the quality scale.
This article has been rated as Low-importance on the importance scale.

Six Sigma

The introduction states that one of the purposes of data profiling is to be able to apply Six Sigma methodologies to enterprise data. Six Sigma is a commercial product. This is inappropriate in a definition of the term. Countersubject 11:02, 5 September 2006 (UTC)

References to Cite

Here are some good references to pull from that should help legitimize and clean up the content here.

  • Data Quality and Data Profiling by David Loshin, 2008 - [1]
  • The Practitioner's Guide to Data Profiling by David Loshin, SAS/Dataflux - [2]
  • Three-Dimensional Data Analysis by Ed Lindsey, 2008 - [3]
  • Data Analysis with Open Source Tools - [4]

Preceding unsigned comment added by Paulboal (talk • contribs) 13:33, 14 May 2013 (UTC)

Data Profiling is Not Just for Tabular Data

One or more of the authors implied that data profiling is only used on tabular data. However, data profiling also works on graph and document structures. For example, JSON and XML data can be profiled.
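
To illustrate the point about document structures, here is a minimal sketch in Python, assuming the documents are already available as JSON strings. The function name `profile_documents`, the path notation (`$`, `.key`, `[]`), and the sample records are all hypothetical and chosen only for this example; a real profiler would typically also track value ranges, lengths, and distinct counts.

```python
import json
from collections import Counter, defaultdict


def profile_documents(docs):
    """Profile parsed JSON documents by path.

    Returns {path: {"count": int, "types": Counter}}, where "count" is how
    often the path occurs and "types" tallies the Python type names seen there.
    """
    stats = defaultdict(lambda: {"count": 0, "types": Counter()})

    def walk(value, path):
        entry = stats[path]
        entry["count"] += 1
        entry["types"][type(value).__name__] += 1
        if isinstance(value, dict):
            # Descend into each object member under "<path>.<key>".
            for key, child in value.items():
                walk(child, f"{path}.{key}")
        elif isinstance(value, list):
            # All array elements share one path, "<path>[]".
            for child in value:
                walk(child, f"{path}[]")

    for doc in docs:
        walk(doc, "$")
    return dict(stats)


# Hypothetical sample records with inconsistent types and a null field.
sample = [
    json.loads('{"id": 1, "tags": ["a", "b"], "owner": {"name": "Ann"}}'),
    json.loads('{"id": "2", "tags": [], "owner": null}'),
]
profile = profile_documents(sample)

for path, entry in sorted(profile.items()):
    print(path, entry["count"], dict(entry["types"]))
```

Even on two records, the per-path summary surfaces the kind of findings profiling is meant to produce: `$.id` appears as both an integer and a string, and `$.owner` is sometimes null.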

Not just for Data Warehousing

The references to Kimball are good, but the article should indicate that data profiling has much wider use than just warehouses. It is equally applicable to OLTP systems, Big Data, and machine learning/predictive analytics that are not warehouse-driven, among others. (talk) 17:13, 2 August 2016 (UTC)
