More R and Python
Nadal or Djokovic? Predicting the winner of the US Open
9 September 2013, last updated at 10:03 ET
By Kim Gittleson, BBC reporter, Flushing Meadows, New York

Who will win this year's US Open tennis championship - Novak Djokovic or Rafael Nadal?

Most pundits will have an opinion on who will triumph in this year's US Open men's final - Rafael Nadal or Novak Djokovic - but the best insights into who will be crowned champion will come from the same technology that has helped cities to lower crime rates and plan for extreme weather.

Deep in the bowels of Arthur Ashe Stadium in Flushing Meadows, Queens, New York, beats the data heart of the 2013 US Open. In a bland room accessed through an unmarked door, more than 60 laptops are piled high, arranged like a command control centre for a mission to the moon. This room is known as "scoring central", according to US Open officials. It's where data is pushed to scoreboards on Louis Armstrong Court - the second largest US Open tennis court - or to TV screens across the globe.

'Unusual statistics'

Djokovic v Nadal
[Chart: Priority matrix for big data]
[Chart: Big data opportunity by industry]
[Chart: Big data investments by industry]
Google introduces new 'Hummingbird' search algorithm
Don't be fooled into thinking that big data just relies on technology – but it can help
Even the oldest market in the world uses big data

As big data becomes increasingly popular, it's occasionally worth taking a step back to think about what companies are looking to achieve as well as the necessary elements. Is investment in new technology really necessary, for example, if there is a cheaper – and ultimately simpler – means of collecting the required information?

Reza Soudagar, writing for multinational software company SAP, eloquently summarised how cost-effective 'low-tech' methods can be implemented to collect big data, citing the example of Istanbul's Grand Bazaar. While one could easily mistake this for an attempt to keep customers interested for longer (a happy side effect), their reasoning is more sophisticated: it's all about big data. This gives a fairly basic understanding of just how businesses use big data.

But sometimes, it's easier to automate – just ask Tesco

It's also worth noting the experiences of larger stores.

And how about Square?
Tesco petrol stations use face-scan tech to target ads
4 November 2013, last updated at 12:46 ET

Face-scan screens at Tesco petrol stations will target ads at drivers

Tesco is installing face-scanning technology at its petrol stations to target advertisements to individual customers at the till. The technology, made by Lord Sugar's digital signage company Amscreen, will use a camera to identify a customer's gender and approximate age. It will then show an advertisement tailored to that demographic. Tesco says the screens will be rolled out across all 450 of its forecourts in the UK.

"It's like something out of Minority Report," said Amscreen's chief executive Simon Sugar, Lord Sugar's eldest son. "But this could change the face of British retail, and our plans are to expand the screens into as many supermarkets as possible."

A Tesco spokeswoman said: "This is not new technology." "No data or images are collected or stored and the system does not use eyeball scanners or facial-recognition technology," she added.

'Ethically deployed'
How big data is changing the cost of insurance
14 November 2013, last updated at 19:26 ET
By Kim Gittleson, BBC reporter, New York

Car insurers want to monitor driver behaviour to help them lower rates - and help improve customer driving

Dave Pratt winced when his teenage son bought himself a Jeep, thinking of how high the insurance would be on a young driver with a flashy car. But unlike most parents, Mr Pratt is at the forefront of the insurance industry's efforts to change the way car insurance is priced.

So Mr Pratt, who works for insurer Progressive, installed the company's Snapshot device in his son's car. It's what's known in the business as a "telematic" device, which monitors the speed his son drives every second and what time of day he drives. When his son accused him of trying to train him to be a better driver, Mr Pratt agreed that was what he was trying to do.

"The way we've done insurance now compared to what we can do is sloppy - most people are actually overpaying"

'Sloppy business'
10 Big Data Projects in the Insurance Industry
If there were a competition for breathless hype in technology, big data would be the current champion - there's even a Brooklyn-based band by that name. And though the phrase is ubiquitous in boardrooms and IT departments across categories of companies, the insurance industry is in many ways taking the lead in getting real business value from the volume, velocity, and variety of massive datasets.

Why are insurers taking this challenge on at the same time they are grappling with core-systems transformation, evolving customer expectations and regulatory upheaval? Well, says Pawan Divakarla, big data business leader at Progressive Insurance, "Big data actually does work." And the results can be dramatic. According to recent research from Accenture, a third of insurers now are using data from wearable technologies, such as FitBits and Jawbones, to collect lifestyle data from insureds.

Here are the case studies:
6 Tricks I Learned From The OTTO Kaggle Challenge
Here are a few things I learned from the OTTO Group Kaggle competition. I had the chance to team up with the great Kaggle Master Xavier Conort, and the French community as a whole has been very active.

1 — Stacking, blending and averaging

Teaming up with Xavier was an opportunity to practice some ensembling techniques. We used stacking heavily: to an initial set of 93 features we added new features, namely the predictions of N different classifiers (Random Forest, GBM, Neural Networks, …). We tested two tricks:

- for averaging, use the harmonic mean (instead of the geometric mean): it improved our score a bit
- when adding the N features, add the logit of the prediction instead of the prediction itself (it didn't improve things in our case)

2 — Calibration

This is one of the great new features of the latest scikit-learn release (0.16). Here is a mini notebook explaining how to use calibration and demonstrating how well it worked on the OTTO challenge data (an illustrative sketch also appears at the end of this excerpt).

3 — GridSearchCV and RandomizedSearchCV
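The excerpt breaks off at the heading above, so none of the original code survives here. As illustrations only, here are three short sketches, one per section, of what the techniques described might look like in scikit-learn; the synthetic data, model choices and parameter values are all assumptions for demonstration, not the author's actual setup. First, section 1's idea of appending out-of-fold classifier predictions as extra features, plus a harmonic-mean blend of two models' probabilities:

```python
# Illustrative sketch only: stack out-of-fold predictions as extra features and
# blend two models with a harmonic mean, on synthetic OTTO-like data
# (93 features, 9 classes).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Hypothetical stand-in for the OTTO training set.
X, y = make_classification(n_samples=1000, n_features=93, n_informative=30,
                           n_classes=9, random_state=0)

# Out-of-fold class probabilities from a first-level model become new features.
rf_oof = cross_val_predict(RandomForestClassifier(n_estimators=100, random_state=0),
                           X, y, cv=3, method="predict_proba")
X_stacked = np.hstack([X, rf_oof])  # 93 original + 9 prediction features
# (the post also mentions adding the logit of the prediction instead)

# A next-level model is then trained on the augmented feature set.
second_level = LogisticRegression(max_iter=1000).fit(X_stacked, y)

# Harmonic-mean blend of two sets of predicted probabilities, renormalised.
p1 = rf_oof
p2 = cross_val_predict(GradientBoostingClassifier(random_state=0),
                       X, y, cv=3, method="predict_proba")
eps = 1e-15
blend = 2.0 / (1.0 / np.clip(p1, eps, 1) + 1.0 / np.clip(p2, eps, 1))
blend = blend / blend.sum(axis=1, keepdims=True)
```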
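Section 2's calibration uses the CalibratedClassifierCV class introduced in scikit-learn 0.16, the API the post refers to. The mini notebook itself is not reproduced here; this is a minimal sketch with an assumed dataset and model:

```python
# Illustrative sketch only (assumed data and model, not the notebook from the
# post): wrap a random forest in CalibratedClassifierCV and compare multi-class
# log loss before and after calibration.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the OTTO training set (93 features, 9 classes).
X, y = make_classification(n_samples=3000, n_features=93, n_informative=30,
                           n_classes=9, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Uncalibrated baseline.
raw = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# CalibratedClassifierCV refits the base model on cross-validation folds and
# learns a mapping (here isotonic regression) from its scores to probabilities.
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    method="isotonic",
    cv=3,
).fit(X_train, y_train)

print("raw log loss       :", log_loss(y_test, raw.predict_proba(X_test)))
print("calibrated log loss:", log_loss(y_test, calibrated.predict_proba(X_test)))
```

The OTTO challenge was scored on multi-class log loss, so better-calibrated probabilities feed directly into the leaderboard metric, which is consistent with the author's report that calibration worked well on this data.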
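Finally, for section 3, a sketch of the two hyper-parameter search utilities the heading names. The gradient-boosting model, parameter ranges and data are again assumptions, not the author's settings:

```python
# Illustrative sketch only: hyper-parameter search with RandomizedSearchCV and
# GridSearchCV on synthetic OTTO-like data (93 features, 9 classes).
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
# (in scikit-learn releases contemporary with the post these lived in sklearn.grid_search)

# Hypothetical stand-in for the OTTO training set.
X, y = make_classification(n_samples=1000, n_features=93, n_informative=30,
                           n_classes=9, random_state=0)

# RandomizedSearchCV samples n_iter settings from distributions, which is
# usually far cheaper than exhaustively enumerating a grid.
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": randint(3, 8),
    "learning_rate": uniform(0.01, 0.2),
    "subsample": uniform(0.6, 0.4),
}
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,
    scoring="neg_log_loss",  # the OTTO metric; named "log_loss" in older scikit-learn
    cv=3,
    random_state=0,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)

# The exhaustive alternative: GridSearchCV tries every combination in a fixed
# grid (fit it the same way with grid.fit(X, y)).
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [100, 200], "max_depth": [3, 5]},
    scoring="neg_log_loss",
    cv=3,
)
```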