
MapR


The C10K problem It's time for web servers to handle ten thousand clients simultaneously, don't you think? After all, the web is a big place now. And computers are big, too. You can buy a 1000MHz machine with 2 gigabytes of RAM and a 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. In 1999 one of the busiest ftp sites, cdrom.com, actually handled 10000 clients simultaneously through a Gigabit Ethernet pipe. And the thin client model of computing appears to be coming back in style -- this time with the server out on the Internet, serving thousands of clients. With that in mind, here are a few notes on how to configure operating systems and write code to support thousands of clients. See Nick Black's excellent Fast UNIX Servers page for a circa-2009 look at the situation.
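The per-client figures quoted in the excerpt can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `per_client_budget` is made up for illustration, and decimal units are assumed (1 GB = 10^9 bytes, 1 Mbit = 10^6 bits), which is what makes the numbers come out round.

```python
def per_client_budget(cpu_hz, ram_bytes, net_bits_per_sec, clients):
    """Divide each machine resource evenly among the clients.

    Hypothetical helper for illustration only; it assumes every client
    gets an equal share, which a real server would not guarantee.
    """
    return {
        "cpu_hz": cpu_hz // clients,
        "ram_bytes": ram_bytes // clients,
        "net_bits_per_sec": net_bits_per_sec // clients,
    }


# Machine from the excerpt: 1000 MHz CPU, 2 GB RAM, 1000 Mbit/s Ethernet.
budget = per_client_budget(
    cpu_hz=1_000_000_000,
    ram_bytes=2_000_000_000,
    net_bits_per_sec=1_000_000_000,
    clients=20_000,
)

for name, value in budget.items():
    print(f"{name}: {value}")
```

With 20000 clients this gives 50000 Hz, 100000 bytes and 50000 bit/s per client, matching the 50KHz, 100Kbytes and 50Kbits/sec in the text.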

High-Performance Analytics - SAS's Big Data architecture Hadoop, MapReduce, NoSQL, appliances... all these technical terms are springing up to describe the Big Data phenomenon, which is at the origin of Big Analytics at SAS. While the petabyte is not yet the base unit of business intelligence applications, the data available to the analytics world can be expected to grow and diversify. The ability to extract value from this information and use it within a short time frame is the major challenge of the next three years. Big Data and Analytics: the answer to business challenges We offer a free downloadable white paper that sets out the uses and business benefits in different industries. These examples shed useful light on the management, storage, analysis and exploitation of large volumes of data carried out with SAS in the Big Data context. SAS® High-Performance Analytics Server: a dedicated offering Visual data exploration with SAS® Visual Analytics

Box plot In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions about the underlying statistical distribution. Types of boxplots Box and whisker plots are uniform in their use of the box: the bottom and top of the box are always the first and third quartiles, and the band inside the box is always the second quartile (the median). Some box plots include an additional character to represent the mean of the data.[2]
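The quantities a box plot draws can be computed directly. The sketch below uses Python's standard `statistics.quantiles` for the box (first quartile, median, third quartile). The excerpt does not say how whiskers and outliers are chosen, so the 1.5 × IQR fence used here is an assumption (a common convention), and `box_plot_stats` and the sample data are hypothetical names and values for illustration.

```python
import statistics


def box_plot_stats(data, fence_factor=1.5):
    """Return the values a box plot would draw for `data`.

    The box uses the quartiles, as described in the text. The whisker
    and outlier rule (points beyond fence_factor * IQR from the box)
    is an assumed convention, not something the excerpt specifies.
    """
    values = sorted(data)
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - fence_factor * iqr
    high_fence = q3 + fence_factor * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # lowest point that is not an outlier
        "whisker_high": max(inside),  # highest point that is not an outlier
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
stats = box_plot_stats(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Other quartile definitions (for example `method="exclusive"`) move the box edges slightly, which is one reason box plots of the same data can differ between tools.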

MapR uses some concepts that differ from those of its competitors, most notably a native Unix file system (with non-open-source components) instead of HDFS, for better performance and ease of use. Native Unix commands can be used instead of Hadoop commands. MapR also differentiates itself from its competitors with high-availability features such as snapshots, mirroring and stateful failover. The company is also spearheading the Apache Drill project, an open-source re-envisioning of Google's Dremel that offers SQL-like queries on Hadoop data for real-time processing. by sergeykucherov Jul 15
