NoSQL From Wikipedia, the free encyclopedia. In computing, NoSQL refers to a family of database management systems (DBMS) that departs from the classical paradigm of relational databases. The most popular expansion of the acronym is Not only SQL, although this interpretation is open to debate[1]. The exact definition of the NoSQL DBMS family remains a matter of debate. The term refers as much to technical characteristics as to a historical generation of DBMSs that emerged in the late 2000s and early 2010s[2]. A clustered machine architecture leads to distributed software that works with aggregates spread across different servers, which allows concurrent reads and writes but also calls into question many foundations of the traditional relational DBMS architecture, notably the ACID properties. Historical background
PostgreSQL Hardware Performance Tuning POSTGRESQL is an object-relational database developed on the Internet by a group of developers spread across the globe. It is an open-source alternative to commercial databases like Oracle and Informix. POSTGRESQL was originally developed at the University of California at Berkeley. In 1996, a group began development of the database on the Internet. They use email to share ideas and file servers to share code. There are two aspects of database performance tuning. To understand hardware performance issues, it is important to understand what is happening inside the computer. You can see that storage areas increase in size as they get farther from the CPU. Moving information between various storage areas happens automatically. CPU registers and the CPU cache cannot be effectively tuned by the database administrator. You might think this is easy to do, but it is not. POSTGRESQL does not directly change information on disk. The default POSTGRESQL configuration allocates 1000 shared buffers.
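To put the figure of 1000 shared buffers in perspective, here is a rough calculation of how much memory that represents. It assumes the stock PostgreSQL block size of 8 kB per buffer, which the excerpt above does not state; the function name is purely illustrative.

```python
# Assumption: each shared buffer holds one 8 kB block (the stock block size).
BLOCK_SIZE_BYTES = 8 * 1024


def shared_buffer_bytes(n_buffers: int, block_size: int = BLOCK_SIZE_BYTES) -> int:
    """Return the memory taken by n_buffers shared buffers, in bytes."""
    return n_buffers * block_size


total = shared_buffer_bytes(1000)
print(f"{total} bytes = {total / (1024 * 1024):.2f} MiB")
```

Under that assumption the default cache is under 8 MiB, which is small next to the RAM of most servers.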
Object-relational mapping The problem Implementations The most widely used databases are SQL databases, which appeared before the rise of object-oriented programming in the 1990s. SQL databases use a series of tables to organize data. An object-relational mapping implementation may need to choose, systematically and predictably, which tables to use and to generate the necessary SQL statements. Many packages have been developed to reduce the tedious work of building object-relational mapping systems by providing class libraries that can perform the mappings automatically. From a programmer's point of view, the system should look like a persistent object store. In practice, however, it is not that simple. A good number of object-relational mapping systems have been developed over the years, but their success in the market has been mixed. See also
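As a minimal sketch of what "choosing the table and generating the SQL" can look like, the toy mapper below derives a table name and an INSERT statement from a dataclass. The `Person` class, the lowercase-class-name convention, and the helper names are all hypothetical; real mapping libraries handle far more (types, keys, relationships).

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:  # hypothetical example class
    name: str
    age: int


def table_name(cls) -> str:
    # Assumed convention: the table is the lowercased class name.
    return cls.__name__.lower()


def insert_sql(cls) -> str:
    # Build a parameterized INSERT from the dataclass field names.
    cols = [f.name for f in fields(cls)]
    placeholders = ", ".join("?" for _ in cols)
    return f"INSERT INTO {table_name(cls)} ({', '.join(cols)}) VALUES ({placeholders})"


def save(conn, obj) -> None:
    conn.execute(insert_sql(type(obj)), astuple(obj))


def load_all(conn, cls) -> list:
    cols = ", ".join(f.name for f in fields(cls))
    rows = conn.execute(f"SELECT {cols} FROM {table_name(cls)}")
    return [cls(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
save(conn, Person("Ada", 36))
print(load_all(conn, Person))
```

The calling code only sees `Person` objects going in and coming out, which is the "persistent object store" view described above.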
Boy or Girl paradox The Boy or Girl paradox surrounds a well-known set of questions in probability theory which are also known as The Two Child Problem,[1] Mr. Smith's Children[2] and the Mrs. Smith Problem. Mr. Gardner initially gave the answers 1/2 and 1/3, respectively; but later acknowledged that the second question was ambiguous.[1] Its answer could be 1/2, depending on how you found out that one child was a boy. The paradox has frequently stimulated a great deal of controversy.[4] Many people argued strongly for both sides with a great deal of confidence, sometimes showing disdain for those who took the opposing view. Common assumptions Each child is either male or female. Each child has the same chance of being male as of being female. The sex of each child is independent of the sex of the other. First question Mr. Under the aforementioned assumptions, in this problem, a random family is selected. Only two of these possible events meet the criteria specified in the question (i.e., GG, GB).
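The two answers can be checked by enumerating the four equally likely families under the common assumptions. The excerpt does not spell the questions out, so this sketch assumes the first conditions on the older child being a girl and the second on at least one child being a boy.

```python
from fractions import Fraction
from itertools import product

# Sample space: (older child, younger child), all four outcomes equally likely.
families = list(product("BG", repeat=2))


def conditional(event, given) -> Fraction:
    """P(event | given) by counting equally likely outcomes."""
    matching = [f for f in families if given(f)]
    hits = [f for f in matching if event(f)]
    return Fraction(len(hits), len(matching))


# Assumed first question: the older child is a girl; are both girls?
p_first = conditional(lambda f: f == ("G", "G"), lambda f: f[0] == "G")

# Assumed second question: at least one child is a boy; are both boys?
p_second = conditional(lambda f: f == ("B", "B"), lambda f: "B" in f)

print(p_first, p_second)  # 1/2 1/3
```

The second result depends on reading "one child is a boy" as "at least one", which is exactly the ambiguity the excerpt mentions; a different way of learning the fact changes the conditioning set and can give 1/2.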
Why use a database instead of just saving your data to disk? This Q&A is part of a weekly series of posts highlighting common questions encountered by technophiles and answered by users at Stack Exchange, a free, community-powered network of 100+ Q&A sites. Dokkat appears to think that databases are overused. "Instead of a database, I just serialize my data to JSON, saving and loading it to disk when necessary," he writes. "All the data management is made on the program itself, which is faster AND easier than using SQL queries." What is missing here? Why should a developer use a database when saving data to a disk might work just as well? See the original question here. The laundry list Robert Harvey Answers (117 votes): In short, you benefit from a wide range of well-known, proven technologies developed over many years by a wide variety of very smart people. If you're worried that a database is overkill, check out SQLite. Related: "When would someone use MongoDB (or similar) over traditional RDMS?" Complexity deserves thought Yannis Rizos comments:
Tuning Your PostgreSQL Server by Greg Smith, Robert Treat, and Christopher Browne PostgreSQL ships with a basic configuration tuned for wide compatibility rather than performance. Odds are good the default parameters are very undersized for your system. Rather than get dragged into the details of everything you should eventually know (which you can find if you want it at the GUC Three Hour Tour), here we're going to sprint through a simplified view of the basics, with a look at the most common things people new to PostgreSQL aren't aware of. You should click on the name of the parameter in each section to jump to the relevant documentation in the PostgreSQL manual for more details after reading the quick intro here. Background Information on Configuration Settings PostgreSQL settings can be manipulated a number of different ways, but generally you will want them changed in your configuration files, either directly or, starting with PostgreSQL 9.4, through ALTER SYSTEM. The types of settings When they take effect
The ACID Model The ACID model is one of the oldest and most important concepts of database theory. It sets forth four goals that every database management system must strive to achieve: atomicity, consistency, isolation and durability. No database that fails to meet any of these four goals can be considered reliable. Let’s take a moment to examine each one of these characteristics in detail: Atomicity states that database modifications must follow an “all or nothing” rule.
FPF Announces New Group to Develop Best Practices for Retail Location Analytics Companies « Future of Privacy First Step for Shaping Privacy Principles for Technologies Aiming to Improve the In-Store Shopping Experience Date: July 16, 2013 WASHINGTON, D.C. – The Future of Privacy Forum (FPF) today announced that it is working with a group of leading technology companies to develop best practices for retail location analytics. FPF’s goal is to make sure these technologies are subject to privacy controls and are used responsibly to improve the consumer shopping experience. “Companies need to ensure they have data protection standards in place to de-identify data, to provide consumers with effective choices to not be tracked and to explain to consumers the purposes for which data is being used,” said Jules Polonetsky, Director of the Future of Privacy Forum. “ShopperTrak is working with FPF because we believe in individuals’ rights to privacy.
About indexes | DataStax Cassandra 1.2 Documentation An index is a data structure that allows for fast, efficient lookup of data matching a given condition. About primary indexes In relational database design, a primary key is the unique key used to identify each row in a table. A primary key index, like any index, speeds up random access to data in the table. In Cassandra, the primary index for a table is the index of its row keys. Rows are assigned to nodes by the cluster-configured partitioner and the keyspace-configured replica placement strategy. With randomly partitioned row keys (the default in Cassandra), row keys are partitioned by their MD5 hash and cannot be scanned in order like traditional b-tree indexes. About secondary indexes Secondary indexes in Cassandra refer to indexes on column values (to distinguish them from the primary row key index for a table). Secondary index names, in this case users_state, must be unique within the keyspace. SELECT * FROM users WHERE state = 'TX'; Using multiple secondary indexes
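To see why hash-partitioned row keys cannot be scanned in key order, the sketch below orders a few hypothetical row keys by an MD5-derived number. It only illustrates the idea; it is not Cassandra's exact token computation, and the `md5_token` helper and sample keys are made up for the example.

```python
import hashlib


def md5_token(row_key: str) -> int:
    """Map a row key to an integer derived from its MD5 digest (illustrative only)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


row_keys = ["alice", "bob", "carol", "dave"]  # hypothetical row keys

by_key = sorted(row_keys)                   # the order a b-tree index would scan
by_token = sorted(row_keys, key=md5_token)  # the order implied by the hash

print("key order:  ", by_key)
print("token order:", by_token)
```

Because the hash scatters keys across the token range, neighbouring keys generally land far apart, so a range scan over row keys has no contiguous region to walk.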
ACID: An acronym for atomicity, consistency, isolation, and durability, which are the main requirements for guaranteed transaction processing.
Found in: Hurwitz, J., Nugent, A., Halper, F. & Kaufman, M. (2013) Big Data For Dummies. Hoboken, New Jersey, United States of America: For Dummies. ISBN: 9781118504222. by raviii Dec 31
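The "all or nothing" rule behind atomicity can be illustrated with Python's built-in sqlite3 module. This is a minimal sketch, not a recommended design: the `accounts` table, the CHECK constraint, and the `transfer` helper are hypothetical. The credit is deliberately applied before the debit so that a failed debit shows the earlier write being rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False


def balances(conn) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(conn, "alice", "bob", 500), balances(conn))
print(transfer(conn, "alice", "bob", 30), balances(conn))
```

The first transfer fails and leaves both balances untouched; the second succeeds and changes both rows together.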