
Where can I find large datasets open to the public?

Related:  Big Data / Analytics

Publicly Available Big Data Sets :: Hadoop Illuminated. Public data sets on Amazon AWS: Amazon provides the following data sets: ENSEMBL annotated genome data, US Census data, UniGene, and a Freebase dump. Data transfer is 'free' within the Amazon ecosystem (within the same zone). AWS data sets. InfoChimps: InfoChimps has a data marketplace with a wide variety of data sets. InfoChimps market place. Comprehensive Knowledge Archive Network: an open source data portal platform; data sets are available on datahub.io from ckan.org. Stanford network data collection. Open Flights: crowd-sourced flight data. Flight arrival data.

Tell me, Dad, what is open data? Many believe that the "open data" movement, like the appearance of the alphabet, the internet, or the explosion of social networks, will have major repercussions on our societies. Known for its proprietary software, Microsoft had the very good idea of asking Regards sur le numérique (RSLN, run by Spintank), its "online laboratory of ideas, reflection and experimentation", to look into the notion of open data, that is, the sharing of public data in open formats, in order to free the data collected or produced by public authorities and return it, free of charge if possible, to society: its citizens, associations, private companies and public administrations. On the menu, which is very complete, digestible and instructive: an investigation and some thirty articles, available on its site as well as in the special issue of its magazine, followed by a conference entitled L’Open data, et nous, et nous, et nous ?

Email any web page to any one / EmailTheWeb.com. Easy Java Simulations Wiki | Main / Home Page. About Easy Java/Javascript Simulations: Easy Java/Javascript Simulations, also known as EjsS (and, formerly, EJS or Ejs), is a free authoring tool written in Java that helps non-programmers create interactive simulations in Java or Javascript, mainly for teaching or learning purposes. EjsS was created by Francisco Esquembre and is part of the Open Source Physics project. A brief historical and naming remark: before release 5.0, EjsS could only create Java simulations. In this wiki: Science SPORE Prize, November 2011. A password is required only for helping with the documentation: if you follow a link in this wiki and get a ‘Password required’ message, the page you tried to visit does not exist yet.

Finding Data on the Internet. A Community Site for R – Sponsored by Revolution Analytics. By RevoJoe on October 6, 2011. The following list of data sources has been modified as of 3/18/14. If an (R) appears after a source, the data are already in R format or R commands exist for importing the data directly from within R. Economics: American Economic Ass. Data Science Practice: this section contains data sets used in the book "Doing Data Science" by Rachel Schutt and Cathy O'Neil (O'Reilly 2014). Datasets on the book site: Enron Email Dataset; GetGlue (time-stamped events: users rating TV shows); Titanic Survival Data Set; half a million Hubway rides. Finance. Government. Health Care: Gapminder. Machine Learning. Networks. Science.

Solvent. Why do I need screen scrapers? Piggy Bank needs web pages to embed information in a format that it can understand. In short, screen scrapers allow you to turn a regular web page into a regular web page plus semantic data, and thus free the data from the page or site that contains it. How do I use it? Watch a screencast of Solvent scraping the locations of Starbucks coffee shops in Cambridge, MA, and then use Piggy Bank to show the scraped data on a map. Also read the Piggy Bank screen scraping howto, which uses Solvent to write a screen scraper for Piggy Bank. There is another tutorial about using Solvent to scrape web pages containing data about baseball players. What are the main features of Solvent? Writing screen scrapers can be hard and tedious, which is why you need a tool to help you. Where do I find other scrapers to learn from? See the list of Piggy Bank scrapers available. How can I help/complain/thank? There are several ways you can help. Licensing & Legal Issues. Credits.

SWF Charts > Buy. Free License: XML/SWF Charts is free to download and use. The free, unregistered version contains all the features, except that clicking a chart takes the user to the XML/SWF Charts web site. Developing and maintaining XML/SWF Charts takes a lot of effort. Web site developers may use unregistered copies of XML/SWF Charts in client web sites. Software developers may redistribute unregistered copies of XML/SWF Charts within other software products, with the copyright attached. $29 - Single License: the single license is for one domain name, all its sub-domains (www.yourdomain.com, sales.yourdomain.com, www.sales.yourdomain.com, tech.yourdomain.com, etc.), all its ports (yourdomain.com, yourdomain.com:8000, etc.), and for "localhost". Make a payment with PayPal, and get a registration code at the end of the payment process. Credit card transactions are processed immediately. $399 - Bulk License

Wall of Films! | Films For Action Just imagine what could become possible if an entire city had seen just one of the documentaries above. Just imagine what would be possible if everyone in the country was aware of how unhealthy the mainstream media was for our future and started turning to independent sources in droves. Creating a better world really does start with an informed citizenry, and there's lots of subject matter to cover. From all the documentaries above, it's evident that our society needs a new story to belong to. The old story of empire and dominion over the earth has to be looked at in the full light of day - all of our ambient cultural stories and values that we take for granted and which remain invisible must become visible. But most of all, we need to see the promise of the alternatives - we need to be able to imagine new exciting ways that people could live, better than anything that the old paradigm could ever dream of providing. So take this library of films and use it.

IT Operations Analytics. In the fields of information technology and systems management, IT Operations Analytics (ITOA) is an approach or method applied to application software designed to retrieve, analyze and report data for IT operations. ITOA has been described as applying big data analytics to large datasets where IT operations can extract unique business insights.[1][2] In its Hype Cycle Report, Gartner rated the business impact of ITOA as being ‘high’, meaning that its use will see businesses enjoy significantly increased revenue or cost saving opportunities.[3] By 2017, Gartner predicts that 15% of enterprises will use IT operations analytics technologies to deliver intelligence for both business execution and IT operations.[2] History: Due to the mainstream embrace of cloud computing and the increasing desire for businesses to adopt more Big Data practices, the ITOA industry has grown significantly since 2010.

The Big Clean. Sandia's Computational Software Site. Data Visualisation: What's the big deal? | Career and Hiring Insights | Aquent. The concept of using pictures to understand complex information, especially data, has been around for a very long time, centuries in fact. One of the most cited examples of statistical graphics is Napoleon’s invasion of Russia mapped by Charles Minard. The maps showed the size of the army and the path of Napoleon’s retreat from Moscow. They also included detailed information like temperature and time scales, providing the audience with an in-depth understanding of the event. However, as with most things, it’s technology that has truly allowed data visualisation to take the stage and get noticed. It’s no surprise that with big data there’s potential for BIG opportunity (someone pass me the shot glass), but many corporates are genuinely challenged when it comes to: understanding the data they have; finding value in it; and getting the wider business to buy in and just GET IT!!! So how do you tackle this? How do you get people to comprehend this information quickly? One word: INSIGHT.

DocumentCloud. Compressed sensing and single-pixel cameras. I’ve had a number of people ask me (especially in light of some recent publicity) exactly what “compressed sensing” means, and how a “single pixel camera” could possibly work (and how it might be advantageous over traditional cameras in certain circumstances). There is a large literature on the subject, but as the field is relatively recent, there does not yet appear to be a good non-technical introduction to the subject. So here’s my stab at the topic, which should hopefully be accessible to a non-mathematical audience. For the sake of concreteness I’ll primarily discuss the camera application, although compressed sensing is a more general measurement paradigm which is applicable to contexts other than imaging (e.g. astronomy, MRI, statistical selection, etc.), as I’ll briefly remark upon at the end of this post. The purpose of a camera is, of course, to record images. How can one compress an image? [...] pixels, which are all exactly the same colour, e.g. all white. [...] combinations to consider!)
