Hortonworks. We Do Hadoop.

Amazon Web Services offers a broad set of global compute, storage, database, analytics, application, and deployment services that help organizations move faster, lower IT costs, and scale applications. These services are trusted by the largest enterprises and the hottest start-ups to power a wide variety of workloads, including web and mobile applications, data processing and warehousing, storage, archive, and many others. Amazon Web Services provides a variety of cloud-based computing services, including a wide selection of compute instances that can scale up and down automatically to meet the needs of your application, a managed load balancing service, and fully managed desktops in the cloud. Sign up with Amazon Web Services to receive 12 months of access to the AWS Free Usage Tier and AWS Basic Support features, including 24x7x365 customer service, support forums, and more. Amazon EC2 provides resizable compute capacity in the cloud.

Hadoop 2.0.3-alpha Apache Hadoop 2.7.2 is a minor release in the 2.x.y release line, building upon the previous stable release 2.7.1. Here is a short overview of the major features and improvements. Common: Authentication improvements when using an HTTP proxy server. This is useful when accessing WebHDFS via a proxy server. A new Hadoop metrics sink that allows writing directly to Graphite. Specification work related to the Hadoop Compatible Filesystem (HCFS) effort. The Hadoop documentation includes the information you need to get started using Hadoop.

Giraph - Welcome To Apache Giraph Sandbox Sandbox is a personal, portable Hadoop environment that comes with a dozen interactive Hadoop tutorials. Sandbox includes many of the most exciting developments from the latest HDP distribution, packaged up in a virtual environment that you can get up and running in 15 minutes! Learn Hadoop: Sandbox comes with a dozen hands-on tutorials that will guide you through the basics of Hadoop; tutorials built on the experience gained from training thousands of people in our Hortonworks University Training classes. Build a Proof of Concept: The Sandbox includes the Hortonworks Data Platform in an easy-to-use form. Test New Functionality: You can test new functionality with the Sandbox before you put it into production.

InfoSphere Platform – big data, information integration, data warehousing, master data management, lifecycle management & data security Why InfoSphere The InfoSphere Platform provides all the foundational building blocks of trusted information, including data integration, data warehousing, master data management, big data and information governance. The platform provides an enterprise-class foundation for information-intensive projects, providing the performance, scalability, reliability and acceleration needed to simplify difficult challenges and deliver trusted information to your business faster. Core Capability. Information Integration and Governance: Bringing together diverse data, managing its quality, maintaining master data, securing and protecting data, managing data across its lifecycle. Additional Capabilities. Big Data: Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible. Data Warehousing: Data warehouse appliances, systems and software optimized for deep and operational analytics.

Presto | Distributed SQL Query Engine for Big Data 16 Top Big Data Analytics Platforms Teradata delivers unified big data architecture Analytical DBMS: Teradata, Teradata Aster. In-memory DBMS: Although not an in-memory DBMS, Teradata Intelligent Memory monitors queries and automatically moves the most-requested data to the fastest storage tiers available, with options including RAM, flash, SSD, and various speeds of conventional spinning discs. Stream-analysis option: None. Hadoop distribution: Resells and supports the Hortonworks Data Platform. Hardware/software systems: Teradata and Teradata Aster are integrated software/hardware systems. Teradata entered the big-data era boasting the largest roster of petabyte-scale enterprise data warehouse (EDW) customers of any vendor. The Teradata DBMS is at the heart of the UDA, supporting EDWs and marts for production BI and analytical needs. Aster is the UDA data-discovery platform, a small, transient store for day-to-day exploration of structured and multi-structured (clickstream, social, or machine) data.

Infosphere MDM Every day, employees and business leaders make decisions that affect the competitiveness of their enterprises. They need to be confident that the information they act on is a trusted “version of the truth.” Today, they also want to extend existing customer views with big data by incorporating information from additional internal and external sources such as social media. Otherwise, they may miss opportunities to provide outstanding customer service, grow revenues or cut costs. To help you deliver this trusted view, IBM offers InfoSphere Master Data Management, which delivers a trusted view of customers, products, accounts, reference data and more. It helps you establish governed, trusted views of your data assets, including big data.

Manhattan, our real-time, multi-tenant distributed database for Twitter scale As Twitter has grown into a global platform for public self-expression and conversation, our storage requirements have grown too. Over the last few years, we found ourselves in need of a storage system that could serve millions of queries per second, with extremely low latency in a real-time environment. Availability and speed of the system became the most important factors. Over the years, we have used and made significant contributions to many open source databases. Our holistic view into storage systems at Twitter: Different databases today have many capabilities, but through our experience we identified a few requirements that would enable us to grow the way we wanted while covering the majority of use cases and addressing our real-world concerns, such as correctness, operability, visibility, performance and customer support. Developers should be able to store whatever they need on a system that just works. We designed with the following goals in mind:

16 Top Big Data Analytics Platforms Data analysis is a do-or-die requirement for today's businesses. We analyze notable vendor choices, from Hadoop upstarts to traditional database players. Revolutionary. That pretty much describes the data analysis time in which we live. Apache Hadoop, a nine-year-old open-source data-processing platform first used by Internet giants including Yahoo and Facebook, leads the big-data revolution. In-memory analysis gains steam as Moore's Law brings us faster, more affordable, and more-memory-rich processors. Advances in bandwidth, memory, and processing power also have improved real-time stream-processing and stream-analysis capabilities, but this technology has yet to see broad adoption. Our slideshow includes broad-based data-management vendors -- IBM, Microsoft, Oracle, SAP -- that offer everything from data-integration software and database-management systems (DBMSs) to business intelligence and analytics software, to in-memory, stream-processing, and Hadoop options.

Data visualization

The only vendor that uses 100% open source Apache Hadoop without its own (non-open) modifications. Hortonworks is the first vendor to use Apache HCatalog functionality for metadata services. In addition, its Stinger initiative substantially optimizes the Hive project. Hortonworks offers a very good, easy-to-use sandbox for getting started. Hortonworks developed and committed enhancements into the core trunk that make Apache Hadoop run natively on Microsoft Windows platforms, including Windows Server and Windows Azure. by sergeykucherov Jul 15
