Database ======== While I have extensive experience using various databases in team projects, I feel that my understanding of them lacks a systematic depth. Frequently, I find myself without a clear resolution for certain database-related challenges. - SQL databases: PostgreSQL, SQLite - NoSQL databases: DynamoDB, MongoDB, Redis I intend to conduct a retrospective to consolidate and summarize my experiences with databases. I believe that databases can be categorized into two major parts: source-related (src-related) and deployment-related (deploy-related). - src-related: how to use databases. - deploy-related: how to configure databases and deploy them. --- Foreign key violation (src) ~~~~~~~~~~~~~~~~~~~~~~~~~~~ A primary key generally focuses on the uniqueness of the table. A foreign key is generally used to build a relationship between the two tables. .. code-block:: none -- Enable foreign key constraints PRAGMA foreign_keys = ON; -- Create customers table CREATE TABLE customers ( customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL ); -- Create orders table with a foreign key reference to customers CREATE TABLE orders ( order_id INTEGER PRIMARY KEY, order_number TEXT NOT NULL, customer_id INTEGER, FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ); -- Insert data into customers table INSERT INTO customers (customer_id, customer_name) VALUES (1, 'Customer A'), (2, 'Customer B'); -- Insert data into orders table with a foreign key violation INSERT INTO orders (order_id, order_number, customer_id) VALUES (101, 'Order 101', 3); -- This should now cause a foreign key violation Database migration with Alembic (src) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Key aspects of database migration include: 1) Schema Changes: Database migration involves making changes to the structure of the database, such as adding or removing tables, altering columns, or creating indexes. 2) Data Migration: Data Transfer: Moving existing data to match the new schema, ensuring that no data is lost during the migration. Data Transformation: Modifying data as needed to conform to the new schema or data model. 3) Versioning Thread-safety issues (src) ~~~~~~~~~~~~~~~~~~~~~~~~~~ 1) Debug a multi-thread issue with `threading.Lock` .. code-block:: python import logging import os from urllib.parse import urljoin +from threading import Lock + +LOCK = Lock() import requests from requests.packages.urllib3.util.retry import Retry # pylint: disable=import-error @@ -68,9 +71,11 @@ class HTTPFetcher(Fetcher): remote_path, stream=True, allow_redirects=True, timeout=self.timeout ) response.raise_for_status() - with open(local_path, "wb") as local_file: - for chunk in response.iter_content(chunk_size=1024): - local_file.write(chunk) + with LOCK: + import pdb; pdb.set_trace() + with open(local_path, "wb") as local_file: + for chunk in response.iter_content(chunk_size=1024): + local_file.write(chunk) return local_path def fetch(self, path, download_root): 2) Lazy initialization and thread-safety What's lazy initialization? Lazy initialization is a design pattern where an object or resource is not created until it is actually needed. When lazy initialization is used in a multithreaded environment, potential issues can arise if multiple threads attempt to access the uninitialized resource simultaneously, zum beispiel "race conditions". Use the lightweight sqlite for unit test (src) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Using SQLite for unit testing is a common and effective approach. Set up and tear down the necessary database fixtures for each test. This ensures a consistent starting point for your tests and prevents interference between different test cases. .. code-block:: python @pytest.fixture(autouse=True) def sqlite_in_tests(): """Any 'real' usage of sqlalchemy during this test suite makes use of a fresh sqlite DB for each test run. """ remove_old_db_files() set_env_vars() ... @pytest.fixture(autouse=True) def sqlite_broker_in_tests(sqlite_in_tests): """Decouple the sqlite initialization and usage by broker. """ set_broker() ...