
Exist Integrations Developer Log 2

This is the second development log for my project Exist Integrations.

See previous entries: 1.

In the last log I mentioned that my motivation to work on Exist Integrations was low. Getting WhatPulse working was exciting, but I took too many shortcuts.

After putting in a few hours over the last couple of days, I've merged two PRs. These PRs will finally let me move on to the next integrations. I have ideas on how to enhance the Trakt, Toggl Track, and YNAB integrations. Plus, I am closer to removing an unneeded server.

PR #7 - UI Clean-up

The new Exist Integrations user interface was a mess. I added UI elements without consideration or consistency. I suspect this was a big catalyst for my lack of motivation.

I went into this pull request with two goals: set standards and rules for color usage, and give the user more context while confirming intent on destructive actions. I use a lot of colors on the buttons. A real designer would shudder if they saw it. I don't care. I like the colors. They're fun, and now they are consistent. I also clarified the text around different actions. The user must now confirm destructive actions like disconnecting an integration.

PR #8 - WhatPulse Code Clean-up

Next, my focus shifted to the backend.

Data storage is a big challenge. In a project like this, source systems format the data one way. Then I must process it and send it to Exist, which has its own formatting requirements. In the live version, I use a table unique to each integration that stores the data in the source's format. Then I calculate aggregated daily totals in another table. Only then do I send the data, one API call to Exist for each day where I detected a change.
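
As a rough illustration of that live-version flow, here is a minimal sketch in Python. The table and column names (source_pulses, daily_totals, value) are hypothetical, not the actual schema, and the real site may be built on a different stack entirely:

```python
import sqlite3

def recalculate_daily_totals(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Aggregate raw source rows into per-day totals, then return only
    the days whose totals changed since the last send."""
    changed = []
    rows = conn.execute(
        # 'source_pulses' stands in for the integration-specific table
        # that holds data in the source's own format.
        "SELECT date, SUM(value) FROM source_pulses GROUP BY date"
    )
    for day, total in rows:
        prev = conn.execute(
            "SELECT total FROM daily_totals WHERE date = ?", (day,)
        ).fetchone()
        if prev is None or prev[0] != total:
            conn.execute(
                "INSERT OR REPLACE INTO daily_totals (date, total) VALUES (?, ?)",
                (day, total),
            )
            changed.append((day, total))  # one Exist API call per changed day
    return changed
```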

This poses challenges that I want to address with the new version.

Take the API calls to Exist, for example. For a new user, the site pulls 30 days of data from the source. The first time the site sends data to Exist, that means 30 API calls. Then, for users with a lot of data, the logic the site uses to calculate the values by date slows down. This caused the largest issue with the site to date and is one of the reasons I am doing this rewrite. The YNAB integration deals with a lot of data. The integration processes all financial transactions against the relevant categories. My user had a year's worth of data. The logic was slow and pinned the server at 100% CPU for days. I got around this with a band-aid: deleting data older than 30 days. Because of that, limiting the data stored for each user is an edict for the new version.

I removed the dedicated table that stored the pulses from WhatPulse. I replaced it with a User Data table that contains the records formatted for Exist. This posed some challenges. I had to split each Pulse record into many rows in the table. With each row, I include a flag for its status in Exist along with the API response. I plan to use this table for a log view for the end user, but I'm not sure I will do that.
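
Roughly, the split looks like this. This is a sketch: the attribute names and row fields are my guesses at the shape, not the real schema or WhatPulse's actual field names:

```python
from dataclasses import dataclass

@dataclass
class UserDataRow:
    """One row per Exist attribute value, already in Exist's format."""
    attribute: str   # Exist attribute name
    date: str        # YYYY-MM-DD
    value: int
    status: str      # e.g. "pending", "sent", "failed" -- hypothetical states
    response: str | None = None  # raw Exist response, kept for a future log view

def split_pulse(pulse: dict) -> list[UserDataRow]:
    """Split one WhatPulse pulse into multiple User Data rows,
    one per attribute the integration tracks."""
    day = pulse["date"]
    return [
        UserDataRow("keystrokes", day, pulse["keys"], "pending"),
        UserDataRow("mouse_clicks", day, pulse["clicks"], "pending"),
        UserDataRow("download_mb", day, pulse["download_mb"], "pending"),
    ]
```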

After I pull the data from WhatPulse and format it, the processor batches up to 35 records per call to Exist. This limits the calls to Exist and makes me a better API consumer. I did struggle with this. When batching, I lose the unique identifier back to my User Data table. That matters when checking whether the calls succeeded: 34 records could succeed and 1 could fail. I am happy with how I tackled it. It may not be perfect, so I am monitoring it with my user account.
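
Here is a minimal sketch of that batching, assuming Exist's batch update endpoint accepts up to 35 values per call and reports per-item results keyed by attribute name and date. Matching failures back to rows by (name, date) instead of a row ID is the workaround the sketch shows; the row shape mirrors the User Data sketch above:

```python
import requests

EXIST_UPDATE_URL = "https://exist.io/api/2/attributes/update/"
BATCH_SIZE = 35  # the per-call limit mentioned above

def send_batches(rows: list[dict], token: str) -> None:
    """Each row is a User Data record: {"attribute", "date", "value", "status"}."""
    headers = {"Authorization": f"Bearer {token}"}
    for i in range(0, len(rows), BATCH_SIZE):
        batch = rows[i : i + BATCH_SIZE]
        payload = [
            {"name": r["attribute"], "date": r["date"], "value": r["value"]}
            for r in batch
        ]
        result = requests.post(EXIST_UPDATE_URL, headers=headers, json=payload).json()
        # My row IDs don't survive the batch, so failures come back keyed
        # only by attribute name and date; match them back the same way.
        failed = {(f["name"], f["date"]) for f in result.get("failed", [])}
        for r in batch:
            r["status"] = "failed" if (r["attribute"], r["date"]) in failed else "sent"
```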

I am using the incremental update endpoint. This thing rocks. I no longer have to calculate the data by date. I can send a value and Exist will add it to the day's total. I do foresee that it could cause issues with custom attributes. On Exist's Review page, users can change those values in the Exist UI. So I implemented a function a user can trigger on my site to zero out the values in Exist. Then, on the next update, the data resends and the issues should resolve. It's not perfect. It is functional though.
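
The difference between the two calls, plus the zero-out reset, might look something like this. Again a sketch, assuming an increment endpoint that adds to the stored total and a set-style update endpoint for the reset:

```python
import requests

EXIST_INCREMENT_URL = "https://exist.io/api/2/attributes/increment/"
EXIST_UPDATE_URL = "https://exist.io/api/2/attributes/update/"

def send_increment(attribute: str, day: str, delta: int, token: str) -> None:
    """Exist adds `delta` to whatever total it already holds for that day,
    so I never have to recalculate the full daily value myself."""
    requests.post(
        EXIST_INCREMENT_URL,
        headers={"Authorization": f"Bearer {token}"},
        json=[{"name": attribute, "date": day, "value": delta}],
    )

def zero_out(attribute: str, days: list[str], token: str) -> None:
    """User-triggered reset: set each day's value to 0 via the set-style
    endpoint, so the next run's increments rebuild clean totals."""
    requests.post(
        EXIST_UPDATE_URL,
        headers={"Authorization": f"Bearer {token}"},
        json=[{"name": attribute, "date": d, "value": 0} for d in days],
    )
```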

With my data edict, I implemented logic to delete any user data older than 3 days. Time will tell if that window is long enough to resolve issues with values deleted from the source. For now, it should mean I can keep running the server with minimal resources.
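
The retention logic itself can be tiny. A sketch against a hypothetical SQLite schema; the real storage may differ:

```python
import sqlite3
from datetime import date, timedelta

RETENTION_DAYS = 3

def prune_user_data(conn: sqlite3.Connection) -> int:
    """Delete User Data rows older than the retention window so the table
    stays small enough to run on a minimal server."""
    cutoff = (date.today() - timedelta(days=RETENTION_DAYS)).isoformat()
    cur = conn.execute("DELETE FROM user_data WHERE date < ?", (cutoff,))
    conn.commit()
    return cur.rowcount  # number of rows pruned
```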