top of page
Search

Python and DuckDB

  • Staff
  • Feb 5
  • 2 min read

Updated: Feb 9



dxdy is implemented using Python and DuckDB. In this article, we explain this software architecture choice and describe our experience with this technology stack.


Database App Programming

  • The application uses a single DuckDB database for all data and reports.


  • The SQL database table schemas and views are created using a Python script. This way, the database schema is easily version controlled (using Git).


  • The entire database can be rebuilt from scratch using a testing script. This turns out to be very useful for rapid application prototyping. The SQL database schema can be easily modified during the development cycle, since there is no centralized database server (DuckDB is an embedded database).


Column-Oriented Database

  • DuckDB is an in-memory, columnar database. This makes it very efficient at querying data for analytics. A trade-off is that it is not optimized for write-operations.

  • Using only modest hardware (e.g. a MacBook M1 Air), and efficiently implemented SQL queries, the database is able to process millions of records nearly instantaneously.

  • There are several benefits to effecient run-time performance. The obvious benefit is ability to handle very large datasets. Other benefits include a responsive UI (User Interface) and capabilities to run simulations or backtesting at scale.


Optimizing Inventory and P&L (Profit & Loss) Computations

  • A trading portfolio is an inventory that is built up through a series of transactions (similar to shop inventory).

  • Legacy portfolio software usually computes and stores a daily "snapshot" of the inventory. DxDy is implemented to compute the inventory "on-the-fly" from transactions, using DuckDB's in-memory database technology.


Python and DuckDB

  • The Python programming language has a well-known subsystem known as Pandas (Pandas itself is built on top of NumPy). The central feature of Pandas/NumPy is the DataFrame.

  • The DuckDB Python API is integrated with Pandas. This allows for seamlessly embedding SQL queries in Python. Many "data wrangling" and ETL (Extract, Transform, Load) tasks are more easily solved using SQL queries as opposed to Pandas DataFrame operations (e.g. joining two tables, etc).


Conclusion

  • Python and DuckDB are a highly effective tech stack for rapid application prototyping.


 
 
 

Opmerkingen


bottom of page