Large Network Dataset Collection

Dataset categories: Social networks; Networks with ground-truth communities; Communication networks; Citation networks; Collaboration networks; Web graphs; Product co-purchasing networks; Internet peer-to-peer networks; Road networks; Autonomous systems graphs; Signed networks; Location-based online social networks; Wikipedia networks, articles, and metadata; Temporal networks; User Actions; Memetracker and Twitter; Online Communities; Online Reviews; Face-to-Face Communication Networks; Graph classification datasets.

Network types:
Directed: directed network
Undirected: undirected network
Bipartite: bipartite network
Multigraph: network has multiple edges between a pair of nodes
Temporal: for each node/edge we know the time when it appeared in the network
Labeled: network contains labels (weights, attributes) on nodes and/or edges

Citing SNAP: We encourage you to cite our datasets if you have used them in your work.
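To make the directed, undirected, and multigraph labels concrete, here is a minimal Python sketch. It assumes a network is given as a plain list of (source, target) pairs; the function name `describe_edges` and the sample data are hypothetical and are not part of the SNAP collection or its tools.

```python
from collections import Counter


def describe_edges(edges, directed):
    """Summarize an edge list given as (source, target) pairs.

    Hypothetical helper for illustration only.
    """
    if directed:
        # In a directed network (1, 2) and (2, 1) are different edges.
        keys = [(u, v) for u, v in edges]
    else:
        # In an undirected network the order of endpoints does not matter.
        keys = [tuple(sorted((u, v))) for u, v in edges]
    counts = Counter(keys)
    nodes = {n for edge in edges for n in edge}
    return {
        "nodes": len(nodes),
        "distinct_edges": len(counts),
        # A multigraph has more than one edge between some pair of nodes.
        "is_multigraph": any(c > 1 for c in counts.values()),
    }


edges = [(1, 2), (2, 1), (2, 3)]
as_directed = describe_edges(edges, directed=True)
as_undirected = describe_edges(edges, directed=False)
```

The same three pairs form a simple graph when read as directed, but a multigraph when read as undirected, because (1, 2) and (2, 1) then connect the same pair of nodes.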

Machine Learning Repository

Datasets for Data Mining and Data Science
See also: Data repositories
AssetMacro: historical data of macroeconomic indicators and market data.
Awesome Public Datasets on GitHub, curated by caesar0301.
AWS (Amazon Web Services) Public Data Sets: a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications.
BigML: big list of public data sources.

StatLib Datasets Archive
If you have an interesting dataset, or a collection of data from a book, please consider submitting the data. To submit a dataset, please see the submissions guidelines. Some of the entries are shar archives. The datasets archive currently contains:
NIST Statistical Reference Datasets (StRD): A pointer to a NIST site that contains reference datasets for the objective evaluation of the computational accuracy of statistical software.
agresti: Contains data from "An Introduction to Categorical Data Analysis," by Alan Agresti, John Wiley, 1996, plus SAS code for various analyses.
Aldrich_Nelson.zip: This data is used in the following book: Aldrich, J. and Nelson, F. (1984) "Linear Probability, Logit and Probit Models".
alr: This file contains data from Applied Linear Regression, 2nd Edition, by Sanford Weisberg, John Wiley, 1985 (sandy@umnstat.stat.umn.edu) (36808 bytes)
analcatdata: A collection of the data sets used in the book "Analyzing Categorical Data," by Jeffrey S.
Andrews
Arsenic
arsenic.zip

50 Resources for Getting the Most Out of Google Analytics
Google Analytics is a very useful free tool for tracking site statistics. For most users, however, it never becomes more than a pretty interface with interesting graphs. The resources below will help anyone, from beginners to long-time users, learn how to get the most out of this tool.
For Beginners: The following links will help you get started with Google Analytics, from setup to understanding the data it presents. How to Use Google Analytics for Beginners: Mahalo's how-to guide for beginners.
Tips & Tricks: If you are already fairly familiar with Google Analytics and ready to dig deeper into the information it makes available, this list of tips and tricks is for you.
Plugins, Hacks & Additions: Want to get even more out of Google Analytics by extending it with third-party plugins, additions and hacks?

Public Data Sets on AWS
See the detailed list of available data sets. Here are some examples of popular Public Data Sets:
NASA NEX: A collection of Earth science data sets maintained by NASA, including climate change projections and satellite images of the Earth's surface
Common Crawl Corpus: A corpus of web crawl data composed of over 5 billion web pages
1000 Genomes Project: A detailed map of human genetic variation
Google Books Ngrams: A data set containing Google Books n-gram corpora
US Census Data: US demographic data from the 1980, 1990, and 2000 US Censuses
Freebase Data Dump: A data dump of all the current facts and assertions in the Freebase system, an open database covering millions of topics
The data sets are hosted in two possible formats: Amazon Elastic Block Store (Amazon EBS) snapshots and/or Amazon Simple Storage Service (Amazon S3) buckets. If you have any questions or want to participate in our Public Data Sets community, please visit our Public Data Sets forum.

Common Google Universal Analytics Mistakes that kill your Analysis & Conversions
I have audited hundreds of web analytics accounts and profiles, and each account/view had at least one or two issues that seriously stood in the way of getting optimum results from my analysis. I have put all of these issues into five broad categories:
Directional Issues
Data Collection Issues
Data Integration Issues
Data Interpretation Issues
Data Reporting Issues
These are the most common mistakes that kill your analysis, reporting and conversions. To get optimum results from your analysis of Universal Analytics reports, you must aim to find and fix as many of these issues as possible. Failing to do so will almost always result in inaccurate analysis, interpretation and reporting.
1. These issues are not associated with Google Universal Analytics or any other analytics software you use. They are commonly found in analysts themselves and are reflected in the way they set up Google Analytics accounts, advanced segments, conversion segments, filters and custom reports.

Gapminder: Unveiling the beauty of statistics for a fact based world view.

Using the New Cohort Analysis in Google Analytics
The cohort was the basic tactical unit of Roman Legions following the reforms of Gaius Marius in 107 BC. Initially a Roman legion consisted of ten cohorts, each consisting of 480 men. Today we use the term cohort to distinguish between groups of consumers to help us make them spend more money on things they probably don't need. Progress? I guess I'd rather live in a world where we try and get people to spend more money on shoes than die violently by taking a spear to my chest while fighting Carthaginians; but it's close. And now Google Analytics has a fancy new Cohort Analysis Report that lets us analyze the death rates from the Second Punic War… Er… no… it helps us analyze the consumer/shoe thing.
Ok, So What are Cohorts? Cohorts are a way of grouping together people (or content), usually based on date. For our purposes, that means grouping them by their first session on a website.
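The grouping described above (users bucketed by the date of their first session) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the session log format, the user IDs, and the function name `build_cohorts` are hypothetical, and it does not reproduce how the Google Analytics report computes its cohorts.

```python
from collections import defaultdict
from datetime import date


def build_cohorts(sessions):
    """Group user IDs by the date of their first session.

    `sessions` is an iterable of (user_id, session_date) pairs
    (an assumed format, for illustration only).
    """
    first_seen = {}
    for user_id, day in sessions:
        # Keep the earliest date on which each user appears.
        if user_id not in first_seen or day < first_seen[user_id]:
            first_seen[user_id] = day
    cohorts = defaultdict(set)
    for user_id, day in first_seen.items():
        cohorts[day].add(user_id)
    return dict(cohorts)


sessions = [
    ("u1", date(2015, 3, 1)),
    ("u2", date(2015, 3, 1)),
    ("u1", date(2015, 3, 2)),  # returning visit, does not change u1's cohort
    ("u3", date(2015, 3, 2)),
]
cohorts = build_cohorts(sessions)
```

Each key of `cohorts` is a first-session date, and each value is the set of users who first appeared on that date; later sessions by the same user leave the cohort assignment unchanged.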

Open Data in Russia | Open data: government, commercial, public. All of it.

Advanced Content Analysis in Google Analytics
The author's posts are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
We analyze the performance of our content every day. Sometimes it's subconscious, like when we check the number of tweets we get from a new blog post. Other times, we make more conscious efforts, like reviewing performance metrics in Google Analytics. This feedback, both formal and anecdotal, informs what we do next. Paying attention to which of your content efforts are working well is the cornerstone of data-driven marketing. These articles show how taking a data-driven approach to producing content can produce great results. I don't know about you, but exponential traffic sounds pretty great to me! But we will never get there without taking a methodical and data-driven approach to our efforts. It's time to take things to the next level!
Using Google Analytics Content Groupings and Dimensions to inform our content strategy
The definitions work as a waterfall.
