Data Science Dictionary
We created a data science dictionary in 2012, and we are still adding keywords. It is also in our Wiley book (better English, recent update). Here we share with you another similar dictionary, from BigDataProjects.org. Here are the first few entries, from the Techniques section (there are two sections: techniques and technologies): A/B testing: A technique in which a control group is compared with a variety of test groups in order to determine what treatments (i.e., changes) will improve a given objective variable, e.g., marketing response rate. Big data enables huge numbers of tests to be executed and analyzed, ensuring that groups are of sufficient size to detect meaningful (i.e., statistically significant) differences between the control and treatment groups. When more than one variable is simultaneously manipulated in the treatment, the multivariate generalization of this technique, which applies statistical modeling, is often called “A/B/N” testing. Association rule learning:
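To make the A/B testing entry above concrete, here is a minimal sketch of one common way to compare a control group with a single test group: a pooled two-proportion z-test on response rates. The dictionary entry does not prescribe a particular test, so the choice of test, the function name `two_proportion_z_test`, and the sample counts are all assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test (illustrative helper, not from the source).

    conv_a / n_a: responders and group size for the control group.
    conv_b / n_b: responders and group size for the treatment group.
    Returns (z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled response rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: the same 2.0% vs 2.6% response rates at two group sizes.
z_small, p_small = two_proportion_z_test(20, 1_000, 26, 1_000)
z_large, p_large = two_proportion_z_test(200, 10_000, 260, 10_000)

print(f"1,000 per group:  z = {z_small:.2f}, p = {p_small:.3f}")
print(f"10,000 per group: z = {z_large:.2f}, p = {p_large:.4f}")
```

With these made-up counts, the same difference in response rate is not statistically significant at 1,000 users per group but is at 10,000 per group, which is the point the entry makes about groups being "of sufficient size".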
Big Data Timeline
Interesting interactive timeline featuring a number of "big data" milestones since 1932. There's way too much emphasis on BI, ERP and SAP, but still, it contains lots of interesting history when you filter out these references. Big data, back in 1940. Here are some highlights: July 1997 - The Problem of Big Data. Check the presentation.
ELC 015: Don't Fear The Data
Collecting, analyzing and using big data has become embedded in how governments, corporations and institutions work. It’s now an expected, though often resented, part of our culture. In this episode, I interview Ellen Wagner, Ph.D., who helps us understand big data and how it can be leveraged to improve learning and development as well as higher education. Ellen is Partner and Senior Analyst for Sage Road Solutions. Topics covered: what big data is and isn’t; why the learning industry should be paying attention to big data; decisions we can make with big data; problems we can prevent; using data to follow and get ahead of trends; why we push back on measurement and evaluation of our work; the benefits of being (partially) data driven; patterns that people are finding in learning analytics; how to cooperate and share de-identified data; the difference between inferential and predictive statistics; the Experience API. TIME: 30 minutes
Twitter beyond 140: The social network’s new growth plan
In December of 2012, four days before Christmas, Twitter flew Jeff Seibert and Wayne Chang from Boston to the company’s headquarters in downtown San Francisco. A year earlier, the two men had founded a startup called Crashlytics. The startup’s namesake product is a bug-reporting tool, designed to help mobile software developers figure out when and why their applications crash. The men had a hunch they were about to make a lot of money. “We were just saying, if we get an offer, make sure you don’t show any emotion,” Mr. Their hunch proved correct. And yet, something about the deal didn’t make sense. It wasn’t until a couple of weeks ago that Twitter finally made clear the overarching strategy behind the Crashlytics acquisition. “This is about something much broader than Twitter the consumer application,” says Kevin Weil, Twitter’s vice-president of revenue products. Why? It’s no easy feat. Reaching advertisers “We don’t judge this by revenue,” Mr. In many ways, Mr. Luring users But Mr.
Phones and Wearables Will Spur Tenfold Growth in Wireless Data by 2019
Persistent growth in the use of smartphones, plus the adoption of wireless wearable devices, will cause the total amount of global wireless data traffic to grow to roughly 10 times its current level by 2019, according to a forecast by networking giant Cisco Systems out today. The forecast, which Cisco calls its Visual Networking Index, is based in part on the growth of wireless traffic during 2014, which Cisco says reached 30 exabytes, the equivalent of 30 billion gigabytes. If growth patterns remain consistent, Cisco’s analysts reckon, the wireless portion of traffic crossing the global Internet will reach 292 exabytes by the close of the decade. To put that 292 exabytes in terms you might be able to understand: Imagine every person on Earth taking 23 Instagram selfies a day, every day, for an entire year. It would add up to about 65 trillion pictures. One key driver will be the raw number of mobile users, which should rise to 5.2 billion from 4.3 billion now.
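The traffic figures quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch: it uses decimal units (1 exabyte = 10^9 gigabytes, matching the "30 billion gigabytes" in the excerpt) and assumes a five-year span from 2014 to 2019. The implied yearly growth rate is derived here from the two endpoints and is not a number taken from the article.

```python
# Figures as quoted in the excerpt (Cisco's forecast, as reported).
TRAFFIC_2014_EB = 30    # exabytes of wireless traffic in 2014
TRAFFIC_2019_EB = 292   # forecast exabytes in 2019

GB_PER_EB = 10**9       # decimal units: 1 EB = 1 billion GB
YEARS = 2019 - 2014     # assumption: growth measured over five years

traffic_2014_gb = TRAFFIC_2014_EB * GB_PER_EB
growth_factor = TRAFFIC_2019_EB / TRAFFIC_2014_EB
# Constant yearly growth rate that would turn 30 EB into 292 EB in YEARS years.
implied_annual_growth = growth_factor ** (1 / YEARS) - 1

print(f"2014 traffic: {traffic_2014_gb:,} GB")
print(f"Growth factor 2014 to 2019: {growth_factor:.1f}x")
print(f"Implied compound annual growth: {implied_annual_growth:.1%}")
```

The growth factor comes out just under 10, which is consistent with the "tenfold" in the headline.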
What The Hell is... Big Data
The Hidden Biases in Big Data - Kate Crawford
by Kate Crawford | 2:00 PM April 1, 2013 This looks to be the year that we reach peak big data hype. From wildly popular big data conferences to columns in major newspapers, the business and science worlds are focused on how large datasets can give insight on previously intractable challenges. Sadly, they can’t. For example, consider the Twitter data generated by Hurricane Sandy, more than 20 million tweets between October 27 and November 1. While massive datasets may feel very abstract, they are intricately linked to physical place and human culture. Fortunately Boston’s Office of New Urban Mechanics is aware of this problem, and works with a range of academics to take into account issues of equitable access and digital divides. Big data’s signal problems won’t disappear as the use of smartphones and other digital technologies increases. This points to the next frontier: how to address these weaknesses in big data science.