
NoSQL

A NoSQL (often interpreted as Not Only SQL[1][2]) database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling, and finer control over availability. The data structures used (e.g. key-value, graph, or document) differ from those of an RDBMS, so some operations are faster in NoSQL and others in an RDBMS. The suitability of a given NoSQL database depends on the problem it must solve (e.g. does the solution use graph algorithms?). There have been various approaches to classifying NoSQL databases, each with different categories and subcategories. Stephen Yen gives a more detailed classification.[9]

Kyoto Cabinet: a straightforward implementation of DBM. Copyright (C) 2009-2012 FAL Labs. Last update: Fri, 04 Mar 2011 23:07:26 -0800. Kyoto Cabinet is a library of routines for managing a database. The database is a simple data file containing records, each of which is a pair of a key and a value. Every key and value is a variable-length sequence of bytes. Kyoto Cabinet runs very fast. It is written in C++ and provides APIs for C++, C, Java, Python, Ruby, Perl, and Lua. Source packages of the core library (C/C++) and binary packages for Windows (C/C++/Java) are available. Kyoto Cabinet was written and is maintained by FAL Labs. Its sibling projects include the remote service Kyoto Tycoon.
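The record model described above (a data file of key/value pairs, each a variable-length byte string) can be tried without installing Kyoto Cabinet. The following is a minimal sketch that uses `dbm.dumb` from the Python standard library as a stand-in; it is not Kyoto Cabinet's API, and the file name `casket` is a placeholder chosen for this example.

```python
import dbm.dumb
import os
import tempfile


def demo_key_value_store():
    """Store and fetch variable-length byte records in a DBM-style file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "casket")  # placeholder file name
        # "c" opens the database for reading and writing, creating it if needed.
        with dbm.dumb.open(path, "c") as db:
            db[b"japan"] = b"tokyo"           # short value
            db[b"empty"] = b""                # zero-length value
            db[b"blob"] = bytes(range(256))   # arbitrary binary value
        # Reopen read-only to show that the records were written to disk.
        with dbm.dumb.open(path, "r") as db:
            return {key: db[key] for key in db.keys()}


records = demo_key_value_store()
```

Keys and values are opaque bytes to the store: any structure inside a value is the application's concern, which is the main contrast with the document stores described further down.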

MongoDB MongoDB (from "humongous") is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster. Released under a combination of the GNU Affero General Public License and the Apache License, MongoDB is free and open-source software. The software company 10gen (now MongoDB Inc.) began developing MongoDB in October 2007 as a component of a planned platform as a service product. The company shifted to an open source development model in 2009, with 10gen offering commercial support and other services.[1] Since then, MongoDB has been adopted as backend software by a number of major websites and services, including Brave Collective, Craigslist, eBay, Foursquare, SourceForge, Viacom, and the New York Times.

Document-oriented database A document-oriented database is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Document-oriented databases are inherently a subclass of the key-value store, another NoSQL database concept. XML databases are a specific subclass of document-oriented databases. The central concept of a document-oriented database is the document, used in the usual English sense of a group of data that encodes some sort of user-readable information. To see why structure matters, consider this text document: Bob Smith 123 Back St. Although it is clear to the reader that this document contains the address for a contact, there is no information within the document that indicates that, nor information on what the individual fields represent. Marking up the same document in pseudo-XML makes those fields explicit.
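The contrast between the raw text and a marked-up document can be sketched in a few lines. The excerpt's pseudo-XML is not reproduced here; this sketch uses a Python dict serialized as JSON instead, and the field names `name` and `street`, the second document, and the helper `find_by_field` are all assumptions made for illustration.

```python
import json

# The raw text from the passage: readable to a person, opaque to a program.
raw_text = "Bob Smith 123 Back St."

# The same information with each field named. The field names are
# assumptions chosen for this sketch, not taken from the source.
contact = {
    "name": "Bob Smith",
    "street": "123 Back St.",
}


def find_by_field(documents, field, value):
    """Return the documents whose given field equals value."""
    return [doc for doc in documents if doc.get(field) == value]


# The second document is hypothetical and omits "street": documents in the
# same collection need not share a schema.
documents = [contact, {"name": "Alice Jones"}]
matches = find_by_field(documents, "street", "123 Back St.")
serialized = json.dumps(contact, sort_keys=True)
```

Once the fields are named, a store can index and query on them, which is not possible with the raw string alone.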

Welcome to MariaDB! - MariaDB

NoSQL Data Modeling Techniques « Highly Scalable Blog NoSQL databases are often compared by various non-functional criteria, such as scalability, performance, and consistency. This aspect of NoSQL is well-studied both in practice and theory because specific non-functional properties are often the main justification for NoSQL usage and fundamental results on distributed systems like the CAP theorem apply well to NoSQL systems. At the same time, NoSQL data modeling is not so well studied and lacks the systematic theory found in relational databases. In this article I provide a short comparison of NoSQL system families from the data modeling point of view and digest several common modeling techniques. I would like to thank Daniel Kirkdorffer who reviewed the article and cleaned up the grammar. To explore data modeling techniques, we have to start with a more or less systematic view of NoSQL data models that preferably reveals trends and interconnections. Key-value storage is a very simple but very powerful model.

CouchDB Apache CouchDB, commonly referred to as CouchDB, is an open source database that focuses on ease of use and on being "a database that completely embraces the web".[1] It is a NoSQL database that uses JSON to store data, JavaScript as its query language using MapReduce, and HTTP for an API.[1] One of its distinguishing features is multi-master replication. CouchDB was first released in 2005 and became an Apache project in 2008. Unlike a relational database, CouchDB does not store data and relationships in tables. Instead, each database is a collection of independent documents. Each document maintains its own data and self-contained schema. CouchDB implements a form of Multi-Version Concurrency Control (MVCC) in order to avoid the need to lock the database file during writes. Other features include document-level ACID semantics with eventual consistency, (incremental) MapReduce, and (incremental) replication.
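The idea behind revision-based concurrency control can be illustrated with a toy in-memory store. This is a sketch under stated assumptions, not CouchDB's implementation: the class `RevisionedStore`, the exception `ConflictError`, and the integer revision numbers are all hypothetical, and CouchDB's actual revision identifiers and HTTP interface are not modeled. What it shows is that writers append new versions instead of overwriting, and that a write based on a stale revision is rejected without any lock being taken.

```python
class ConflictError(Exception):
    """Raised when an update is based on a stale revision."""


class RevisionedStore:
    """Toy in-memory document store that keeps every version (hypothetical)."""

    def __init__(self):
        # doc_id -> list of bodies; revision n is stored at index n - 1
        self._history = {}

    def get(self, doc_id, rev=None):
        """Return (revision, body) for the latest or a specific revision."""
        versions = self._history[doc_id]
        rev = len(versions) if rev is None else rev
        return rev, dict(versions[rev - 1])  # copy to protect stored state

    def put(self, doc_id, body, rev=None):
        """Append a new version if rev names the current revision."""
        versions = self._history.setdefault(doc_id, [])
        current_rev = len(versions) or None
        if rev != current_rev:
            raise ConflictError(f"expected rev {current_rev}, got {rev}")
        versions.append(dict(body))
        return len(versions)


store = RevisionedStore()
rev1 = store.put("contact", {"name": "Bob"})
rev2 = store.put("contact", {"name": "Bob Smith"}, rev=rev1)

try:
    store.put("contact", {"name": "Robert"}, rev=rev1)  # stale revision
    conflict_detected = False
except ConflictError:
    conflict_detected = True

# A reader holding the old revision still sees a consistent snapshot.
old_snapshot = store.get("contact", rev=rev1)
```

The losing writer is expected to re-read the document, merge its change, and retry against the new revision.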

XML database An XML database is a data persistence software system that allows data to be stored in XML format. These data can then be queried, exported, and serialized into the desired format. XML databases are usually associated with document-oriented databases. Two major classes of XML database exist.[1] XML-enabled databases may either map XML to traditional database structures (such as a relational database[2]), accepting XML as input and rendering XML as output, or, more recently, support native XML types within the traditional database. O'Connell gives one reason for the use of XML in databases: the increasingly common use of XML for data transport, which has meant that "data is extracted from databases and put into XML documents and vice-versa".[3] It may prove more efficient (in terms of conversion costs) and easier to store the data in XML format.
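Querying and re-serializing stored XML, as described above, can be sketched with `xml.etree.ElementTree` from the Python standard library. This is an in-memory illustration only, not an XML database; the document, its element names, and its attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are invented for this sketch.
XML_TEXT = """
<contacts>
  <contact id="1"><name>Bob Smith</name><city>Springfield</city></contact>
  <contact id="2"><name>Alice Jones</name><city>Shelbyville</city></contact>
</contacts>
"""

root = ET.fromstring(XML_TEXT)

# Query: names of contacts whose <city> child has the given text.
# ElementTree supports a limited subset of XPath, including this predicate form.
names = [
    contact.findtext("name")
    for contact in root.findall("./contact[city='Springfield']")
]

# Export: select one element by attribute and serialize it back to XML text.
second = root.find("./contact[@id='2']")
exported = ET.tostring(second, encoding="unicode")
```

A real XML database would add persistence, indexing, and a fuller query language on top of the same query-then-serialize pattern.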

Oracle Database 12c Release 1 (12.1) New Features
The following sections describe the new business intelligence and data warehousing features for Oracle Database 12c Release 1 (12.1).
1.2.1 Oracle Advanced Analytics. The following sections describe new Oracle Advanced Analytics features.
1.2.1.1 Decision Tree Mining Text Data. The Decision Tree algorithm now supports nested data and can be used for text mining. Decision Tree is popular due to its transparency and prevalence; therefore, it is important to enable the algorithm to handle unstructured data.
1.2.1.2 Expectation Maximization (EM) Clustering and Density Estimation. In Release 11g, Oracle Data Mining offered two clustering algorithms. In bringing analytics to applications, Oracle Data Mining provides different types of clustering capabilities currently being used by multiple applications.
1.2.1.3 Feature Extraction Using Singular Value Decomposition. PCA can be viewed as a special scoring method under the SVD algorithm.
1.2.1.5 Native Double in Data Mining Functions. ALTER TABLE ...

Visual Guide to NoSQL Systems - Nathan Hurst's Blog There are so many NoSQL systems these days that it's hard to get a quick overview of the major trade-offs involved when evaluating relational and non-relational systems in non-single-server environments. I've developed this visual primer with quite a lot of help (see credits at the end), and it's still a work in progress, so let me know if you see anything misplaced or missing, and I'll fix it. Without further ado, here's what you came here for (and further explanation after the visual). Note: RDBMSs (MySQL, Postgres, etc) are only featured here for comparison purposes. Also, some of these systems can vary their features by configuration (I use the default configuration here, but will try to delve into others later). As you can see, there are three primary concerns you must balance when choosing a data management system: consistency, availability, and partition tolerance. According to the CAP Theorem, you can only pick two.
