NoSQL Study Cards

Enhance your learning with NoSQL flash cards for quick review



NoSQL

A type of database management system that provides a flexible and scalable approach to storing and retrieving data, especially for large-scale applications and distributed environments.

Key Characteristics

NoSQL databases are typically schema-less (or schema-flexible), horizontally scalable, and designed for high availability and fault tolerance.

Database Models

NoSQL databases support various data models, including document, key-value, column-family, graph, and object models.

CAP Theorem

The CAP theorem states that a distributed data system cannot simultaneously guarantee all three of consistency, availability, and partition tolerance.

ACID vs BASE

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee reliable processing of database transactions, while BASE (Basically Available, Soft state, Eventually consistent) prioritizes availability and scalability over strict consistency.

Document Databases

Document databases store and retrieve data in the form of semi-structured documents, typically using JSON or XML formats.
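
A minimal sketch of the idea, using an in-memory Python dict to stand in for a document collection (the collection and field names are illustrative, not tied to any particular product):

```python
import json

# In-memory stand-in for a document collection, keyed by document id.
users = {}

def insert(doc_id, document):
    # Documents are semi-structured: each one can carry different fields.
    users[doc_id] = document

def find(doc_id):
    return users.get(doc_id)

insert("u1", {"name": "Ada", "email": "ada@example.com", "tags": ["admin"]})
insert("u2", {"name": "Grace", "signup": "2024-01-15"})  # different shape, same collection

print(json.dumps(find("u1"), indent=2))
```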

Key-Value Databases

Key-value databases store and retrieve data as a collection of key-value pairs, providing fast access to values based on their keys.
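
A minimal sketch of key-value access, with a plain dict standing in for the store (key names are illustrative); real key-value stores expose the same put/get/delete idea as commands:

```python
# Minimal in-memory key-value store: values are opaque blobs looked up by key.
store = {}

def put(key, value):
    store[key] = value

def get(key, default=None):
    # Constant-time lookup by key is the core operation.
    return store.get(key, default)

def delete(key):
    store.pop(key, None)

put("session:42", b'{"user": "ada", "expires": 1735689600}')
print(get("session:42"))
delete("session:42")
```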

Column-Family Databases

Column-family databases store and retrieve data in column families, which are containers for related data columns.
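
A rough sketch of the column-family layout, assuming illustrative family and column names: each family maps a row key to whatever columns that row happens to have.

```python
from collections import defaultdict

# A column family groups related columns under a shared row key.
# Each family is a mapping: row_key -> {column_name: value}.
column_families = {
    "profile": defaultdict(dict),
    "activity": defaultdict(dict),
}

def put(family, row_key, column, value):
    column_families[family][row_key][column] = value

def get_row(family, row_key):
    # Rows in the same family may have different column sets (wide rows).
    return dict(column_families[family][row_key])

put("profile", "user:1", "name", "Ada")
put("profile", "user:1", "country", "UK")
put("activity", "user:1", "2024-06-01", "login")

print(get_row("profile", "user:1"))
```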

Graph Databases

Graph databases store and retrieve data in the form of nodes, edges, and properties, allowing efficient representation and traversal of complex relationships.
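
A small sketch of nodes, edges, and properties with a simple traversal (node and relationship names are made up for illustration):

```python
# Nodes and edges carry properties; queries walk the edges.
nodes = {
    "alice": {"label": "Person", "age": 34},
    "bob": {"label": "Person", "age": 29},
    "acme": {"label": "Company"},
}
edges = [
    ("alice", "KNOWS", "bob", {"since": 2019}),
    ("bob", "WORKS_AT", "acme", {}),
]

def neighbors(node, rel=None):
    # Follow outgoing edges, optionally filtered by relationship type.
    return [dst for src, r, dst, _ in edges if src == node and (rel is None or r == rel)]

# Who does Alice know, and where do those people work?
for person in neighbors("alice", "KNOWS"):
    print(person, "works at", neighbors(person, "WORKS_AT"))
```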

Object Databases

Object databases store and retrieve data in the form of objects, providing support for object-oriented programming concepts and relationships.

Distributed Databases

Distributed databases store and retrieve data across multiple nodes or servers, enabling scalability, fault tolerance, and high availability.

Data Replication

Data replication is the process of creating and maintaining multiple copies of data across different nodes or servers for improved availability and fault tolerance.

Sharding

Sharding is the process of horizontally partitioning data across multiple nodes or servers to improve scalability and performance.
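
A minimal sketch of hash-based shard routing, assuming an illustrative four-shard setup: the same key always hashes to the same shard, so reads find the data that writes placed there.

```python
import hashlib

NUM_SHARDS = 4  # illustrative; real deployments often use many more shards

def shard_for(key):
    # Hash the key and map it to one of the shards.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

shards = [dict() for _ in range(NUM_SHARDS)]

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:1001", {"name": "Ada"})
print(shard_for("user:1001"), get("user:1001"))
```

Note that this naive modulo scheme reshuffles most keys whenever the shard count changes; production systems commonly use consistent hashing or range-based partitioning to limit that movement.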

Consistency Models

Consistency models define the level of consistency that a distributed database system guarantees, such as eventual consistency or strong consistency.

Eventual Consistency

Eventual consistency is a consistency model where updates propagate to replicas asynchronously, so that once new updates stop arriving, all replicas eventually converge to the same state.
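
A toy sketch of convergence, assuming three in-memory replicas and a last-write-wins rule based on a timestamp attached to each write (all names and the sync mechanism are illustrative):

```python
# Replicas accept writes locally and exchange updates later; conflicts are
# resolved by keeping the version with the newest timestamp.
replicas = [dict(), dict(), dict()]

def write(replica_id, key, value, ts):
    replicas[replica_id][key] = (ts, value)

def anti_entropy():
    # Periodic sync: every replica adopts the newest version seen anywhere.
    for key in {k for r in replicas for k in r}:
        newest = max((r[key] for r in replicas if key in r), key=lambda v: v[0])
        for r in replicas:
            r[key] = newest

write(0, "cart:42", ["book"], ts=1)         # accepted locally, not yet replicated
write(2, "cart:42", ["book", "pen"], ts=2)  # concurrent write on another replica
anti_entropy()                              # replicas converge on the newest value
print([r["cart:42"] for r in replicas])
```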

Strong Consistency

Strong consistency is a consistency model where every read reflects the most recent committed write, so all nodes present the same view of the data at all times.

Concurrency Control

Concurrency control ensures that multiple concurrent transactions can access and modify data in a consistent and isolated manner.

Indexing

Indexing is the process of creating data structures, such as B-trees or hash tables, to improve the speed and efficiency of data retrieval.
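
A small sketch of a hash index over documents (field names are illustrative): instead of scanning every record to find a match, one dictionary lookup returns the matching document ids.

```python
# Without an index, finding documents by email means scanning every record.
documents = {
    1: {"name": "Ada", "email": "ada@example.com"},
    2: {"name": "Grace", "email": "grace@example.com"},
}

email_index = {}

def index_document(doc_id, doc):
    # Map the indexed field's value to the ids of documents containing it.
    email_index.setdefault(doc["email"], set()).add(doc_id)

for doc_id, doc in documents.items():
    index_document(doc_id, doc)

def find_by_email(email):
    # One dictionary lookup instead of a full scan of `documents`.
    return [documents[i] for i in email_index.get(email, ())]

print(find_by_email("grace@example.com"))
```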

Querying

Querying is the process of retrieving specific data from a database using query languages, such as SQL or NoSQL-specific query languages.
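
A tiny query-by-example sketch in the spirit of document-store query languages (the documents and fields are made up): a query is a dict of field/value pairs, and a document matches if it contains all of them.

```python
documents = [
    {"name": "Ada", "role": "admin", "active": True},
    {"name": "Grace", "role": "user", "active": True},
    {"name": "Linus", "role": "admin", "active": False},
]

def find(query):
    # A document matches when every field in the query equals the document's value.
    return [d for d in documents if all(d.get(k) == v for k, v in query.items())]

print(find({"role": "admin", "active": True}))  # -> [{'name': 'Ada', ...}]
```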

Data Modeling

Data modeling is the process of designing the structure and relationships of data in a database, ensuring efficient storage and retrieval.

Normalization

Normalization is the process of organizing data in a database to eliminate redundancy and improve data integrity and consistency.

Denormalization

Denormalization is the process of intentionally introducing redundancy in a database to improve performance and simplify data retrieval.
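
A sketch of the trade-off, with illustrative customer and order records: the normalized form needs a second lookup to answer a query, while the denormalized form answers it in one read but duplicates data that must be kept in sync.

```python
# Normalized: the order references the customer by id; reading the order's
# customer name requires a second lookup (a "join" done in application code).
customers = {"c1": {"name": "Ada", "city": "London"}}
orders_normalized = {"o1": {"customer_id": "c1", "total": 42}}

order = orders_normalized["o1"]
print(customers[order["customer_id"]]["name"], order["total"])

# Denormalized: the customer's name is copied into the order document, so a
# single read answers the query, at the cost of updating every copy if the
# customer's name ever changes.
orders_denormalized = {"o1": {"customer_name": "Ada", "total": 42}}
print(orders_denormalized["o1"]["customer_name"], orders_denormalized["o1"]["total"])
```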

ACID Transactions

ACID transactions ensure that database operations are atomic, consistent, isolated, and durable, providing reliability and data integrity.

CAP Theorem Revisited

The CAP theorem revisited acknowledges that a distributed system can guarantee at most two of the three properties: consistency, availability, and partition tolerance. Since network partitions cannot be ruled out in practice, the real trade-off is between consistency and availability when a partition occurs.

Scalability

Scalability is the ability of a system to handle increasing amounts of data, traffic, or workload without sacrificing performance or availability.

Fault Tolerance

Fault tolerance is the ability of a system to continue operating properly in the event of failures or errors, ensuring high availability and reliability.

Data Integrity

Data integrity ensures that data remains accurate, consistent, and reliable throughout its lifecycle, preventing unauthorized modifications or corruption.

Data Security

Data security involves protecting data from unauthorized access, use, disclosure, disruption, modification, or destruction, ensuring confidentiality, integrity, and availability.

Backup and Recovery

Backup and recovery strategies involve creating copies of data and implementing processes to restore data in the event of data loss, corruption, or system failures.

Performance Optimization

Performance optimization techniques aim to improve the speed, efficiency, and responsiveness of a database system, ensuring optimal resource utilization.

Data Partitioning

Data partitioning involves dividing a database into smaller, more manageable parts called partitions or shards, allowing parallel processing and improved performance.

Data Distribution

Data distribution refers to the process of distributing data across multiple nodes or servers in a distributed database system, ensuring load balancing and fault tolerance.

Data Consistency

Data consistency ensures that data remains accurate and valid across different replicas or copies in a distributed database system, preventing conflicts or inconsistencies.

Data Replication Strategies

Data replication strategies determine how data is replicated across different nodes or servers, such as master-slave replication or multi-master replication.
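
A minimal sketch of single-leader (master-slave) replication, with plain dicts standing in for nodes and replication done synchronously only to keep the example short:

```python
# Writes go to the leader and are copied to followers; reads can be served
# by any replica.
leader = {}
followers = [dict(), dict()]

def write(key, value):
    leader[key] = value
    for follower in followers:   # replication step (synchronous here for simplicity)
        follower[key] = value

def read(key, replica=0):
    return followers[replica].get(key)

write("profile:7", {"name": "Ada"})
print(read("profile:7", replica=1))
```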

Data Compression

Data compression techniques reduce the size of data to save storage space and improve data transfer efficiency, while maintaining data integrity and accessibility.
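
A quick round-trip sketch using Python's standard zlib module: compression shrinks repetitive data and decompression restores the exact original bytes (the payload is illustrative).

```python
import zlib

payload = b'{"event": "page_view", "url": "/home"}' * 100  # repetitive JSON compresses well

compressed = zlib.compress(payload, 6)   # smaller on disk / over the wire
restored = zlib.decompress(compressed)   # lossless: the original bytes come back

assert restored == payload
print(f"{len(payload)} bytes -> {len(compressed)} bytes")
```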

Data Encryption

Data encryption involves transforming data into a secure and unreadable format using encryption algorithms, ensuring confidentiality and protection against unauthorized access.
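
A small symmetric-encryption round trip using the third-party `cryptography` package's Fernet recipe (assumed to be installed; the plaintext is illustrative, and real keys would live in a key-management system rather than being generated inline):

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice, fetch this from a key-management system
fernet = Fernet(key)

ciphertext = fernet.encrypt(b"card_number=4111-1111-1111-1111")
print(ciphertext)                  # unreadable without the key
print(fernet.decrypt(ciphertext))  # original plaintext, recoverable only with the key
```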

Data Backup Strategies

Data backup strategies involve creating regular backups of data to protect against data loss, corruption, or accidental deletion, ensuring data recovery and business continuity.

Data Recovery Strategies

Data recovery strategies involve restoring data from backups or other sources in the event of data loss, corruption, or system failures, ensuring data integrity and availability.

Data Migration

Data migration is the process of transferring data from one system or storage device to another, ensuring data integrity, compatibility, and minimal downtime.

Data Warehousing

Data warehousing involves collecting, organizing, and analyzing large volumes of data from various sources to support business intelligence and decision-making processes.

Data Lake

A data lake is a centralized repository that stores raw and unprocessed data from various sources, enabling flexible data exploration, analysis, and processing.

Data Governance

Data governance refers to the overall management and control of data assets within an organization, ensuring data quality, compliance, and security.

Data Quality

Data quality refers to the accuracy, completeness, consistency, and reliability of data, ensuring that data meets the requirements and expectations of users and applications.

Data Privacy

Data privacy involves protecting sensitive and personally identifiable information (PII) from unauthorized access, use, or disclosure, ensuring compliance with privacy regulations.

Data Access Control

Data access control involves implementing security measures to control and restrict access to data based on user roles, permissions, and authentication mechanisms.

Data Auditing

Data auditing involves monitoring and recording data access, modifications, and activities to ensure compliance, detect unauthorized actions, and investigate security incidents.

Data Archiving

Data archiving involves moving infrequently accessed or historical data to long-term storage for compliance, regulatory, or historical purposes, freeing up primary storage resources.

Data Integration

Data integration involves combining data from multiple sources or systems into a unified view, enabling data analysis, reporting, and decision-making across the organization.