Using UUID as a Primary Key


I am developing a personal finance application on mobile and web. Data should be synced two ways between the mobile app and the web application.

The data schema is:

  • Currency table with currency name and conversion rate
  • Account table with name and a foreign key to its currency
  • Transactions table with an amount, withdraw account related, deposit account related, and date.

As for compatibility with server database, I added user id to all tables and a little User table with just our user ID, name and email.

I am using sample data with about 4500 transactions. The total size of the SQLite files is 324K Byte.

I am using the auto-increment primary key that is managed by SQLite. As I wanted to implement a sync feature between mobile apps and web app. I thought changing autoincrement id with UUID is a good idea to be able to sync data in two ways.

I did some research, and I found two options:

  • Implement UUID as clear text. This has the advantage of clearly seeing the ID in the database so debugging would be more comfortable. The disadvantage is having 36 bytes per id.
  • Implement UUID as a BLOB of 16 bytes. Data is stored as binary so that you save 20 bytes per id. You can easily use HEX(id) in the SQL query to view the BLOB UUID as a text.

For the sake of saving space on the mobile device, I decided to use BLOB. But I worried about performance, so I did some research.

I read multiple blogs, articles, and SQLite reference and found that there is no negative performance impact on using UUID either in binary or text. But of course, size will be much bigger especially in case of using the 36 bytes text-based UUID.

Before doing any changes. This is my performance numbers of the main APIs per one call to compute balances that are being displayed in the application dashboard.

  1 ║ Calculations time   │ 634  │ milliseconds  
  2 ║ Asset Accounts      │ 358  │ milliseconds  
  3 ║ Liability Accounts  │ 70   │ milliseconds  
  4 ║ Expense Accounts    │ 68   │ milliseconds  
  5 ║ Income Accounts     │ 67   │ milliseconds  
  6 ║ Other Accounts      │ 69   │ milliseconds  
  7 ║ Total               │ 1266 │ milliseconds        

Rows from 1 to 6 is an SQLite fetch from database and balance calculations. I did timing measurements for display refresh and found it very fast. So the bottleneck is in SQL data access.

As the whole database is 324K, I expect SQLite to load most of it to memory to give the best performance.

Then I spent an extensive effort to change all Primary Keys and Foreign Keys to be BLOB 16 bytes and migrate data.  I notice that:

The database grew to 807 KB.
The performance almost degrades immediately as follows:

  1 ║ Totals Calculations │ 1295 │ milliseconds  
  2 ║ Asset Accounts      │ 670  │ milliseconds  
  3 ║ Liability Accounts  │ 120  │ milliseconds  
  4 ║ Expense Accounts    │ 147  │ milliseconds  
  5 ║ Income Accounts     │ 147  │ milliseconds  
  6 ║ Other Accounts      │ 98   │ milliseconds  
  7 ║ Total               │ 2477 │ milliseconds  

This is almost double the time!

The first thing to think of to improve DB performance is adding indexes. So, I added indexes gradually and measured performance after each index. In the end, I added 4 indexes to fields that are used in the query where part and order by parts. I measured again and found these performance numbers.

  1 ║ Totals Calculations │ 1078 │ milliseconds  
  2 ║ Asset Accounts      │ 639  │ milliseconds  
  3 ║ Liability Accounts  │ 111  │ milliseconds  
  4 ║ Expense Accounts    │ 123  │ milliseconds  
  5 ║ Income Accounts     │ 141  │ milliseconds  
  6 ║ Other Accounts      │ 97   │ milliseconds  
  7 ║ Total               │ 2189 │ milliseconds  

The performance improved just by 188 milliseconds! While DB size jumped up to 1.4 Mega. Remember it was initially about 300KB.

I do expect database size to increase by adding indexes, but how performance is not improving at all!!

Lessons Learned and Ideas:

  • DB size increase decreases performance.
  • Using UUID is slower than using auto increment.
  • Adding indexes has no significant positive impact on performance. This does not make sense at all. It means simply database access is not the real problem.
  • As a hint: To make sure the queries are using indexes, I used the query “EXPLAIN QUERY PLAN” before my SQL query in “DB Browser for SQLite” open source tool. It shows clearly that my indexes are being used.

Conclusions:

  • The size increase is really significant. So, I will go back to use autoincrement ids. I am happily using GIT, so it is so easy to just get back to develop branch.
  • Performance needs analysis with a more appropriate tool, I tried Android Profiler and will continue using it to optimize performance. Measuring time the old way is not working properly.
  • The assumption that I must use UUID in all PK and FK is not true. I will consider other options.

Useful Links:

  • https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-careful-7b2aa3dcb439
  • https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-indexes-indexes-c4e175f3c346
  • https://developer.android.com/studio/profile/cpu-profiler
  • http://plaintexttools.github.io/plain-text-table/

From ahm507.blogspot.com

Leave a Reply

Your email address will not be published. Required fields are marked *