RDBMS Flash Cards: Key Terms and Definitions
RDBMS: Stands for Relational Database Management System, a software system used to manage relational databases.
Relational Database: A type of database that organizes data into tables with rows and columns, and establishes relationships between tables.
SQL: Stands for Structured Query Language, the standard language used to define, query, and manipulate data in relational databases.
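For example, a minimal query against a hypothetical employees table (all names here are illustrative):

    -- Return the name and salary of everyone in one department, highest paid first.
    SELECT name, salary
    FROM employees
    WHERE department = 'Engineering'
    ORDER BY salary DESC;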
Primary Key: A unique identifier for a record in a table, used to ensure data integrity and enable efficient data retrieval.
Foreign Key: A field in a table that refers to the primary key of another table, establishing a relationship between the two tables.
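A sketch of both kinds of key in standard SQL DDL, using hypothetical customers and orders tables:

    -- Each customer is uniquely identified by customer_id (primary key).
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- orders.customer_id is a foreign key into customers, so every order
    -- must belong to an existing customer.
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
        order_date  DATE
    );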
Normalization: The process of organizing data in a database to eliminate redundancy and improve data integrity.
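A sketch of what normalization removes, reusing the illustrative customers/orders design above:

    -- Before: one wide table repeats the customer's address on every order.
    CREATE TABLE orders_unnormalized (
        order_id         INTEGER PRIMARY KEY,
        customer_name    TEXT,
        customer_address TEXT,  -- duplicated on every order by the same customer
        order_date       DATE
    );
    -- After: store the address once per customer in a customers table and
    -- have orders reference it through customer_id, as in the tables above.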
Index: A data structure that improves the speed of data retrieval operations on a database table.
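A minimal sketch, reusing the hypothetical orders table above:

    -- Speeds up lookups such as SELECT * FROM orders WHERE customer_id = 42;
    -- at the cost of extra storage and slightly slower writes.
    CREATE INDEX idx_orders_customer_id ON orders (customer_id);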
Transaction: A sequence of database operations that are treated as a single unit, ensuring data consistency and integrity.
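The classic illustration is a funds transfer, where both updates must succeed or neither should (accounts is a hypothetical table; transaction syntax varies slightly by vendor):

    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
    COMMIT;  -- or ROLLBACK; to undo both updates if anything failed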
ACID: Stands for Atomicity, Consistency, Isolation, and Durability, a set of properties that guarantee reliable processing of database transactions.
Database Security: Measures and techniques used to protect a database from unauthorized access, data breaches, and other security threats.
Backup: The process of creating copies of database files to protect against data loss in case of hardware failure, human error, or other disasters.
Recovery: The process of restoring a database to a consistent state after a failure or data loss.
Data Warehouse: A large, centralized repository of integrated data from various sources, used for reporting, analysis, and decision-making.
Data Mining: The process of discovering patterns, relationships, and insights from large datasets using statistical and machine learning techniques.
Big Data: Extremely large and complex datasets that cannot be easily managed, processed, or analyzed using traditional database systems.
NoSQL: Stands for 'Not Only SQL', a type of database management system that provides flexible data models and scalability for handling big data.
Entity: A distinct object, concept, or event that is represented in a database and can be uniquely identified.
Attribute: A characteristic or property of an entity, represented as a column in a database table.
Tuple: A single row or record in a database table, containing data values for each attribute.
Query: A request for data or information from a database, typically written in SQL.
Join: A database operation that combines rows from two or more tables based on a related column between them.
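A sketch using the hypothetical customers and orders tables above:

    -- Pair each order with the name of the customer who placed it.
    SELECT o.order_id, c.name, o.order_date
    FROM orders AS o
    INNER JOIN customers AS c
        ON o.customer_id = c.customer_id;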
View: A virtual table derived from the data in one or more tables, presenting a customized or filtered view of the data.
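For example, a view over the hypothetical orders table:

    -- A filtered, always-current slice of the data: recent orders only.
    CREATE VIEW recent_orders AS
    SELECT order_id, customer_id, order_date
    FROM orders
    WHERE order_date >= '2024-01-01';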
Schema: A logical structure that defines the organization and layout of a database, including tables, relationships, and constraints.
Data Integrity: The accuracy, consistency, and reliability of data stored in a database, ensured through constraints and validation rules.
Concurrency Control: Techniques and mechanisms used to manage simultaneous access to a database by multiple users or applications, ensuring data consistency.
Deadlock: A situation where two or more transactions are each waiting for the other to release resources, resulting in a standstill.
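A classic interleaving that produces a deadlock, reusing the hypothetical accounts table (time flows downward; most RDBMSs detect the cycle and abort one transaction):

    -- Session A: BEGIN;
    -- Session A: UPDATE accounts SET balance = balance - 10 WHERE account_id = 1;  -- locks row 1
    -- Session B: BEGIN;
    -- Session B: UPDATE accounts SET balance = balance - 10 WHERE account_id = 2;  -- locks row 2
    -- Session A: UPDATE accounts SET balance = balance + 10 WHERE account_id = 2;  -- blocks, waiting on B
    -- Session B: UPDATE accounts SET balance = balance + 10 WHERE account_id = 1;  -- blocks, waiting on A
    -- Neither session can proceed until the DBMS rolls one of them back.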
Normal Forms: A set of rules or guidelines for designing and organizing relational databases to minimize redundancy and improve efficiency.
Data Warehouse Architecture: The structure and components of a data warehouse system, including data sources, ETL processes, and analytical tools.
OLAP: Stands for Online Analytical Processing, a category of software tools used for analyzing multidimensional data from data warehouses.
Data Mart: A subset of a data warehouse that is focused on a specific business function or department, providing tailored data for analysis.
Data Mining Techniques: Methods and algorithms used to extract patterns, trends, and insights from large datasets, including clustering, classification, and regression.
Hadoop: An open-source framework for distributed storage and processing of big data, based on the MapReduce programming model.
MapReduce: A programming model and algorithm for processing large datasets in parallel across a distributed cluster of computers.
CAP Theorem: Stands for Consistency, Availability, and Partition Tolerance; the theorem states that a distributed system cannot guarantee all three simultaneously, so during a network partition it must trade consistency against availability.
Key-Value Store: A type of NoSQL database that stores data as a collection of key-value pairs, providing fast and scalable access to data.
Document Database: A type of NoSQL database that stores and retrieves data in the form of documents, typically using JSON or XML formats.
Columnar Database: A type of NoSQL database that stores and retrieves data in columns rather than rows, enabling efficient data compression and query performance.
Graph Database: A type of NoSQL database that represents data as nodes and edges, allowing for efficient traversal and analysis of complex relationships.
ACID vs. BASE: A comparison between traditional ACID properties and the BASE properties (Basically Available, Soft state, Eventually consistent) of NoSQL databases.
Replication: The process of creating and maintaining multiple copies of data across different nodes or servers in a distributed database system.
Sharding: A technique used in distributed databases to horizontally partition data across multiple servers or nodes for improved scalability and performance.
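One common scheme is hash-based sharding, where a row's home shard is derived from its key. A toy sketch in SQL, with the routing rule made explicit through CHECK constraints (real systems route rows in the application or middleware layer, and MOD syntax varies by vendor):

    -- Two shard tables hold disjoint halves of the users data,
    -- split by user_id modulo the shard count.
    CREATE TABLE users_shard_0 (
        user_id INTEGER PRIMARY KEY CHECK (MOD(user_id, 2) = 0),
        name    TEXT
    );
    CREATE TABLE users_shard_1 (
        user_id INTEGER PRIMARY KEY CHECK (MOD(user_id, 2) = 1),
        name    TEXT
    );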
Consistency Models: Different levels of data consistency guarantees provided by distributed database systems, such as strong consistency, eventual consistency, and causal consistency.
Data Consistency: The property of a database system that ensures all data in the database is accurate, valid, and up-to-date at all times.
Data Warehouse vs. Data Lake: A comparison between traditional data warehouses and data lakes, which store raw and unprocessed data for flexible analysis and exploration.
ETL: Stands for Extract, Transform, Load, a process used to extract data from various sources, transform it into a consistent format, and load it into a data warehouse.
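A tiny sketch of the transform-and-load step in SQL, assuming a hypothetical staging_sales table already extracted from a source system:

    -- Transform (normalize codes, cast types) while loading into the warehouse.
    INSERT INTO warehouse_sales (sale_id, product_code, amount, sale_date)
    SELECT id,
           UPPER(prod_code),                      -- transform: normalize product codes
           CAST(amount_text AS DECIMAL(10, 2)),   -- transform: text to numeric
           sale_date
    FROM staging_sales
    WHERE amount_text IS NOT NULL;                -- basic cleansing: skip incomplete rows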
Data Cleansing: The process of identifying and correcting or removing errors, inconsistencies, and inaccuracies in data to improve its quality and reliability.
Data Visualization: The representation of data in visual formats, such as charts, graphs, and maps, to facilitate understanding, analysis, and decision-making.
Data Mining Models: Mathematical models and techniques used to discover patterns, trends, and relationships in large datasets, including decision trees, neural networks, and association rules.
Data Privacy: The protection of sensitive and personal data from unauthorized access, use, or disclosure, ensuring compliance with privacy regulations and laws.
Backup Strategies: Different approaches and techniques for creating backups of data, including full backups, incremental backups, and differential backups.
Data Recovery: Methods and procedures for restoring data from backups in case of data loss, system failure, or disaster.
Data Mining Applications: Real-world examples and use cases of data mining, such as customer segmentation, fraud detection, market basket analysis, and recommendation systems.
Data Warehousing Tools: Software applications and platforms used for designing, building, and managing data warehouses, including ETL tools, OLAP servers, and reporting tools.
Data Lake Architecture: The structure and components of a data lake system, including data ingestion, storage, and processing layers.
Data Lake vs. Data Mart: A comparison between data lakes and data marts, which are subsets of data warehouses focused on specific business functions or departments.
Data Mining Challenges: Obstacles and issues faced during the data mining process, such as data quality, scalability, interpretability, and privacy concerns.
Data Management: The overall management and control of data assets within an organization, including data policies, standards, and data stewardship.
Data Warehouse Design: The process of designing and structuring a data warehouse to meet the analytical needs of an organization, including dimensional modeling and star schemas.
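A minimal star schema sketch with one fact table and two dimension tables (all names illustrative):

    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        name        TEXT,
        category    TEXT
    );
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,  -- e.g. 20240115
        year     INTEGER,
        month    INTEGER
    );
    -- The fact table holds the measures plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product (product_key),
        date_key    INTEGER REFERENCES dim_date (date_key),
        quantity    INTEGER,
        revenue     DECIMAL(12, 2)
    );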
Data Mining Tools: Software applications and algorithms used for data mining tasks, such as classification, clustering, regression, and association analysis.
Data Security: Measures and practices used to protect data from unauthorized access, disclosure, alteration, or destruction, ensuring its confidentiality, integrity, and availability.
Backup and Recovery Strategies: Different approaches and techniques for creating backups of data and recovering it in case of data loss or system failure.
Data Mining Process: A systematic approach to discovering patterns, trends, and insights from data, including data preparation, model building, evaluation, and deployment.
Data Warehouse Implementation: The process of building and deploying a data warehouse system, including data extraction, transformation, loading, and schema design.
Model Evaluation: The assessment and validation of data mining models and results, measuring their accuracy, precision, recall, and other performance metrics.
Data Integration: The process of combining data from different sources and formats into a unified view, enabling comprehensive analysis and decision-making.
Data Warehouse Performance Tuning: Techniques and optimizations used to improve the speed and efficiency of data retrieval and analysis in a data warehouse system.
Classification Algorithms: Algorithms used to classify data into predefined categories or classes, such as decision trees, naive Bayes, support vector machines, and k-nearest neighbors.
Data Privacy Regulations: Laws and regulations that govern the collection, use, storage, and sharing of personal and sensitive data, such as GDPR and CCPA.
Backup and Recovery Tools: Software and hardware solutions for creating backups of data and recovering it in case of data loss or system failure.
Clustering Algorithms: Algorithms used to group similar data points together based on their characteristics or attributes, such as k-means, hierarchical clustering, and DBSCAN.
Data Quality: The accuracy, completeness, consistency, and reliability of data, ensuring it is fit for its intended purpose and meets the needs of users.
Data Warehouse vs. Data Mart vs. Data Lake: A comparison between data warehouses, data marts, and data lakes, highlighting their differences in terms of data structure, purpose, and usage.
Association Rule Mining: Algorithms used to discover relationships and associations between items or variables in large datasets, such as Apriori and FP-growth.
Data Governance: A set of policies, processes, and controls for managing and ensuring the quality, availability, and security of data within an organization.
Time Series Algorithms: Algorithms used to analyze and forecast data points over time, such as ARIMA, exponential smoothing, and recurrent neural networks.
Data Lineage: The complete record of the origin, movement, and transformation of data throughout its lifecycle, ensuring data traceability and accountability.
Regression Algorithms: Algorithms used to predict and model the relationship between variables, such as linear regression, logistic regression, and decision trees.
Data Masking: A technique used to protect sensitive data by replacing it with fictional or scrambled data, while preserving its format and characteristics.
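A sketch using standard SQL string functions, assuming a customers table that stores a 19-character dashed card_number and an email column (function syntax varies by vendor):

    -- Keep only the last four card digits and hide the local part of the email,
    -- preserving each value's overall format.
    SELECT customer_id,
           'XXXX-XXXX-XXXX-' || SUBSTRING(card_number FROM 16 FOR 4) AS card_masked,
           '***@' || SUBSTRING(email FROM POSITION('@' IN email) + 1) AS email_masked
    FROM customers;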
Text Mining Algorithms: Algorithms used to extract and analyze information from unstructured text data, such as natural language processing, sentiment analysis, and topic modeling.
Data Anonymization: The process of removing or modifying personally identifiable information from data to protect individual privacy and comply with data protection regulations.
Anomaly Detection Algorithms: Algorithms used to identify unusual or abnormal patterns in data, such as clustering, outlier detection, and support vector machines.
Data Archiving: The process of moving data from active storage to long-term storage for historical or compliance purposes, freeing up resources in the primary database.
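A common sketch: copy old rows into an archive table and then delete them from the active table, inside one transaction (assumes a hypothetical orders_archive table with the same columns as orders):

    BEGIN;
    INSERT INTO orders_archive
    SELECT * FROM orders
    WHERE order_date < '2020-01-01';   -- rows past the retention window

    DELETE FROM orders
    WHERE order_date < '2020-01-01';   -- free up the active table
    COMMIT;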
Decision Tree Algorithms: Algorithms used to build models that represent decisions as a tree of tests on input data, such as ID3, C4.5, and CART.
Data Masking Techniques: Methods and approaches used to anonymize or obfuscate sensitive data, such as tokenization, encryption, and data substitution.
Neural Networks: Algorithms inspired by the structure and function of the human brain, used for pattern recognition, classification, and prediction tasks.
Data Masking Best Practices: Guidelines and recommendations for implementing data masking techniques effectively and securely, ensuring data privacy and compliance.
Data Masking Challenges: Obstacles and issues faced during the data masking process, such as preserving data utility, maintaining referential integrity, and ensuring performance.
Data Masking Tools: Software applications and solutions used for implementing data masking techniques, providing features for data discovery, masking, and monitoring.
Data Masking in Databases: Approaches and methods for implementing data masking in databases, such as dynamic data masking, static data masking, and data scrambling.
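One lightweight way to approximate dynamic masking is a view that exposes masked columns while the base table is left untouched, with query access granted on the view instead of the table (names illustrative; native dynamic-masking features vary by DBMS):

    CREATE VIEW customers_masked AS
    SELECT customer_id,
           name,
           '***@' || SUBSTRING(email FROM POSITION('@' IN email) + 1) AS email
    FROM customers;
    -- GRANT SELECT ON customers_masked TO analyst_role;  -- role name illustrative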
Data Masking in Files: Methods and approaches for implementing data masking in files, such as data encryption, data shuffling, and data substitution.