
Data Mining Map

http://chem-eng.utoronto.ca/~datamining/dmc/data_mining_map.htm

Big data has big implications for knowledge management A goal of knowledge management over the years has been the ability to integrate information from multiple perspectives to provide the insights required for valid decision-making. Organizations do not make decisions just based on one factor, such as revenue, employee salaries or interest rates for commercial loans. The total picture is what should drive decisions, such as where to invest marketing dollars, how much to invest in R&D or whether to expand into a new geographic market.

Perceptual Edge - Library Contents Books Information Dashboard Design: Displaying data for at-a-glance monitoring, Second Edition, Stephen Few, $40.00 (U.S.), Analytics Press, 2013 This book alone addresses the visual design of dashboards. Don't be misled by the title.

Data mining Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.[1] Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information (with intelligent methods) from a data set and transform the information into a comprehensible structure for further use.[1][2][3][4] Data mining is the analysis step of the "knowledge discovery in databases" process or KDD.[5] Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[1] Etymology In the 1960s, statisticians and economists used terms like data fishing or data dredging to refer to what they considered the bad practice of analyzing data without an a-priori hypothesis.

Tertiary data: Big data's hidden layer Big data isn’t just about multi-terabyte datasets hidden inside eventually-concurrent distributed databases in the cloud, or enterprise-scale data warehousing, or even the emerging market in data. It’s also about the hidden data you carry with you all the time; the slowly growing datasets on your movements, contacts and social interactions. Until recently, most people’s understanding of what can actually be done with the data collected about us by our own cell phones was theoretical. There were few real-world examples. But over the last couple of years, this has changed dramatically.

Visualising Ad Hoc Tweeted Link Communities, via BackType So you’ve tweeted a link as part of your social media/event amplification strategy, and it’s job done, right? Or is there maybe some way you can learn something about who else found that interesting? Notwithstanding the appearance of yet another patent of the bleedin’ obvious, here’s one way I’ve been experimenting with for tracking informal, ad hoc communities around a link. (In part this harkens back to some of my previous “social life of a URL” doodles such as delicious URL History – Hyperbolic Tree Visualisation, More Hyperbolic Tree Visualisations – delicious URL History: Users by Tag.) In part inspired by a comment by Chris Jobling on one of my flickr Twitter network images, here’s a recipe for identifying a core community that may be interested in a retweeted link:

Protovis Protovis composes custom views of data with simple marks such as bars and dots. Unlike low-level graphics libraries that quickly become tedious for visualization, Protovis defines marks through dynamic properties that encode data, allowing inheritance, scales and layouts to simplify construction. Protovis is free and open-source, provided under the BSD License.

Connecting the Dots: Finding Patterns in Large Piles of Numbers - Rebecca J. Rosen - Technology A new program can find and compare relationships in complicated data without having to be asked specific queries Are there subtle patterns lurking in data that can foretell of a coming financial-system crash? What can explain the variations in sports-star salaries? How about the complex relationship between genes and certain diseases? Scientists in various fields have been searching for better ways to analyze large piles of data for such patterns, but the difficulty has always been that they need to know what they're looking for in order to find it. A new software program, described in the latest issue of Science, is designed to find the patterns in data that scientists don't know to look for.

Government data UK: what's really been achieved? "I think this project is doomed" - not perhaps what you expect on the front page of a website launched to huge fanfare a year ago today by Sir Tim Berners-Lee, the British inventor of the world wide web. One year after the Labour government launched the data.gov.uk portal, intended to provide a front door to a library of government data that developers in the outside world could use to analyse trends and create commercial services, there is disquiet that the initial enthusiasm has worn off and that civil servants are quietly blocking widespread release of useful information. Peter Austin, a web developer using the site, commented: "I became a member of this community nearly a year ago. I wanted to use my programming skills for the public good … [but] I can only describe it as Yes Minister data. Harmless.
