Database¶
While I have extensive experience using various databases in team projects, I feel that my understanding of them lacks a systematic depth. Frequently, I find myself without a clear resolution for certain database-related challenges.
SQL databases: PostgreSQL, SQLite
NoSQL databases: DynamoDB, MongoDB, Redis
I intend to conduct a retrospective to consolidate and summarize my experiences with databases.
I believe that databases can be categorized into two major parts: source-related (src-related) and deployment-related (deploy-related).
src-related: how to use databases.
deploy-related: how to configure databases and deploy them.
—
Foreign key violation (src)¶
A primary key generally focuses on the uniqueness of the table.
A foreign key is generally used to build a relationship between the two tables.
-- Enable foreign key constraints
PRAGMA foreign_keys = ON;
-- Create customers table
CREATE TABLE customers (
customer_id INTEGER PRIMARY KEY,
customer_name TEXT NOT NULL
);
-- Create orders table with a foreign key reference to customers
CREATE TABLE orders (
order_id INTEGER PRIMARY KEY,
order_number TEXT NOT NULL,
customer_id INTEGER,
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
-- Insert data into customers table
INSERT INTO customers (customer_id, customer_name) VALUES
(1, 'Customer A'),
(2, 'Customer B');
-- Insert data into orders table with a foreign key violation
INSERT INTO orders (order_id, order_number, customer_id) VALUES
(101, 'Order 101', 3); -- This should now cause a foreign key violation
Database migration with Alembic (src)¶
Key aspects of database migration include:
Schema Changes:
Database migration involves making changes to the structure of the database, such as adding or removing tables, altering columns, or creating indexes.
Data Migration:
Data Transfer: Moving existing data to match the new schema, ensuring that no data is lost during the migration.
Data Transformation: Modifying data as needed to conform to the new schema or data model.
Versioning
Thread-safety issues (src)¶
Debug a multi-thread issue with threading.Lock
import logging
import os
from urllib.parse import urljoin
+from threading import Lock
+
+LOCK = Lock()
import requests
from requests.packages.urllib3.util.retry import Retry # pylint: disable=import-error
@@ -68,9 +71,11 @@ class HTTPFetcher(Fetcher):
remote_path, stream=True, allow_redirects=True, timeout=self.timeout
)
response.raise_for_status()
- with open(local_path, "wb") as local_file:
- for chunk in response.iter_content(chunk_size=1024):
- local_file.write(chunk)
+ with LOCK:
+ import pdb; pdb.set_trace()
+ with open(local_path, "wb") as local_file:
+ for chunk in response.iter_content(chunk_size=1024):
+ local_file.write(chunk)
return local_path
def fetch(self, path, download_root):
Lazy initialization and thread-safety
What’s lazy initialization?
Lazy initialization is a design pattern where an object or resource is not created until it is actually needed.
When lazy initialization is used in a multithreaded environment, potential issues can arise if multiple threads attempt to access the uninitialized resource simultaneously, zum beispiel “race conditions”.
Use the lightweight sqlite for unit test (src)¶
Using SQLite for unit testing is a common and effective approach. Set up and tear down the necessary database fixtures for each test. This ensures a consistent starting point for your tests and prevents interference between different test cases.
@pytest.fixture(autouse=True)
def sqlite_in_tests():
"""Any 'real' usage of sqlalchemy during this test suite makes use of
a fresh sqlite DB for each test run.
"""
remove_old_db_files()
set_env_vars()
...
@pytest.fixture(autouse=True)
def sqlite_broker_in_tests(sqlite_in_tests):
"""Decouple the sqlite initialization and usage by broker.
"""
set_broker()
...