Welcome to Hive!

Apache Thrift
HBase
Big Data 2011 by GigaOM - Infrastructure - Web - Eventbrite

Apache ZooKeeper - Home
Apache Pig
[repost] How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data « New IT Farmer
How do you query hundreds of gigabytes of new data each day streaming in from over 600 hyperactive servers? If you think this sounds like the perfect battleground for a head-to-head skirmish in the great MapReduce Versus Database War, you would be correct. Bill Boebel, CTO of Mailtrust (Rackspace’s mail division), has generously provided a fascinating account of how they evolved their log processing system from an early amoeba’ic text-file-on-each-machine approach, to a Neandertholic relational database solution that just couldn’t compete, and finally to a Homo sapien’ic Hadoop-based solution that works wisely for them and has virtually unlimited scalability potential. Rackspace faced a now familiar problem: facing exponential growth, they spent about three months building a new log processing system using Hadoop (an open-source implementation of Google File System and MapReduce), Lucene and Solr.
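As a rough illustration of the MapReduce side of that story, here is a minimal Hadoop Streaming sketch in Python that counts log lines per server. The hostname-first log layout, the script name, and the map/reduce split are assumptions for illustration only; they are not details from the Rackspace/Mailtrust system described above.

```python
#!/usr/bin/env python3
"""Minimal Hadoop Streaming sketch: count log lines per server.

Run as the mapper with `log_count.py map` and as the reducer with
`log_count.py reduce`. The hostname-in-the-first-field log layout is
an assumption for this example, not the actual Mailtrust format.
"""
import sys


def mapper() -> None:
    # Emit (hostname, 1) for every log line; Hadoop shuffles and sorts
    # these pairs by key before handing them to the reducers.
    for line in sys.stdin:
        fields = line.split()
        if fields:
            print(f"{fields[0]}\t1")


def reducer() -> None:
    # Keys arrive grouped and sorted, so a running total per hostname
    # is enough to produce one (hostname, count) line per server.
    current, total = None, 0
    for line in sys.stdin:
        host, value = line.rstrip("\n").split("\t")
        if host != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = host, 0
        total += int(value)
    if current is not None:
        print(f"{current}\t{total}")


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "map":
        mapper()
    else:
        reducer()
```

Submitted with something like `hadoop jar hadoop-streaming.jar -files log_count.py -input /logs -output /counts -mapper "log_count.py map" -reducer "log_count.py reduce"` (jar location and paths vary by installation), Hadoop fans the mapper out across the cluster and merges the sorted per-host counts in the reducers.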

Apache Kafka
Sqoop
About | Elastic Web Mining | Bixo Labs
Scale Unlimited is based in Nevada City, California and provides consulting and training services for big data analytics, search, and web mining. The company was founded in 2008 by Stefan Groschupf, Chris Wensel, and Ken Krugler, three of the world’s leading experts in scalable, reliable data analytics, workflow design and web mining. All are well-known community members and contributors to key open source projects, including Hadoop, Bixo, Cascading, Solr, Lucene, Katta and Tika. Solutions from Scale Unlimited are built using these and other widely used and well supported open source packages, providing maximum flexibility with no commercial lock-in. Scale Unlimited solves three major problems that the founders experienced first-hand at previous startups and consulting projects. First, processing big data requires a workflow system that is efficient, reliable and scalable. With Scale Unlimited, solutions are built using Hadoop and Cascading-based workflows.

The Internet of Things (and the myth of the “Smart” Fridge)
Flume

A data warehouse system for Hadoop that offers a SQL-like query language to facilitate easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. by sergeykucherov Jul 15
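To make the SQL-like query language concrete, below is a minimal sketch of an ad-hoc summarization query issued from Python. It assumes the third-party PyHive client and a HiveServer2 endpoint on localhost; the `page_views` table and its columns are hypothetical, chosen only to illustrate the kind of aggregation Hive compiles into jobs over files in HDFS.

```python
# Minimal sketch: an ad-hoc HiveQL summarization query from Python.
# Assumes the PyHive client library and HiveServer2 on localhost:10000;
# the `page_views` table and its columns are made up for illustration.
from pyhive import hive

conn = hive.Connection(host="localhost", port=10000, database="default")
cursor = conn.cursor()

# A SQL-like aggregation that Hive turns into distributed jobs over
# data stored in HDFS or another Hadoop-compatible file system.
cursor.execute(
    """
    SELECT dt, COUNT(*) AS views
    FROM page_views
    GROUP BY dt
    ORDER BY views DESC
    LIMIT 10
    """
)
for dt, views in cursor.fetchall():
    print(dt, views)

cursor.close()
conn.close()
```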
