Commit Graph

279 Commits (75e9b3f8b25aeeafe9682c73862f0349cf9281a4)

Author SHA1 Message Date
Aditya Maru f440133b20 wip 2025-01-10 15:52:55 -05:00
Aayush 8554acbf59 src: prevent path duplication when dockerfile is within context 2025-01-09 10:03:58 -05:00
Aditya Maru 5ac445ae84 src: fix error message 2025-01-08 07:14:25 -05:00
Aayush 0e4788906e src: bump buildkit startup timeout to 30sec 2025-01-07 21:18:32 -05:00
Aayush Shah d8a061af73 src: update timeout on `setupStickyDisk` (#91) 2025-01-01 15:09:21 -05:00
Aditya Maru 34ea2f79e5 src: change warning to debug 2025-01-01 13:16:46 +04:00
Aayush Shah 4ed3ba5c73 src: ignore unset sentinel value for tailscale token (#89) 2025-01-01 02:05:30 -05:00
Aditya Maru 42b59d67c9 src: bump timeout from 30s to 45s 2025-01-01 09:25:31 +04:00
Aayush Shah c03b613806 use local dockerfile path over git context (#86) 2024-12-31 13:08:49 -05:00
Aditya Maru aa6b213b0b src: join and leave tailnet on start and cleanup of builder 2024-12-31 15:52:49 +04:00
Aditya Maru 9fdeb57c53 src: disable automatic buildkit GC
We have reason to believe that automatic GC is affecting
daemon startup times. In this patch we disable automatic GC
and instead rely on manual pruning of the buildkit cache.
Once the daemon is ready we spawn an async task to run prune
on any objects older than 14 days. We are already managing the
ceph volume approaching its size limit ourselves in the VM
Agent.

Patch also adds some alerting when inode usage is high on a mountpoint.
2024-12-23 09:15:34 -05:00
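
The manual prune described in 9fdeb57c53 can be sketched roughly as follows. This is a minimal sketch, not the code from the commit: it assumes the prune is issued through `buildctl prune --keep-duration`, and the helper names (`keepDuration`, `pruneArgs`, `pruneInBackground`) and the socket address parameter are hypothetical.

```typescript
import { spawn } from "node:child_process";

// Convert a retention window in days to a Go-style duration string, e.g. "336h".
function keepDuration(days: number): string {
  return `${days * 24}h`;
}

// Arguments for a manual prune that keeps anything newer than `days`.
// Assumption: pruning goes through `buildctl prune --keep-duration`.
function pruneArgs(days: number, addr: string): string[] {
  return ["--addr", addr, "prune", "--keep-duration", keepDuration(days)];
}

// Fire-and-forget: start the prune once the daemon is ready, without
// blocking the build on its completion.
function pruneInBackground(days: number, addr: string): void {
  const child = spawn("buildctl", pruneArgs(days, addr), { stdio: "ignore" });
  child.on("error", (err) => console.warn(`prune failed to start: ${err.message}`));
  child.unref();
}
```

A caller would invoke `pruneInBackground(14, addr)` after the daemon's socket is reachable; the 14-day window comes from the commit message, everything else is illustrative.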
Aditya Maru 61713d1849 src: print api url in debug info 2024-12-21 23:42:52 -05:00
Aditya Maru 6fe2467492 src: silence metric warnings for now 2024-12-21 23:12:08 -05:00
Aditya Maru 4759d93c12 src: use the plumbed BLACKSMITH_BACKEND_URL if present 2024-12-21 12:08:11 -05:00
Aditya Maru def1585067 *: report metrics to the VM agent 2024-12-20 17:43:40 -05:00
Aditya Maru 4723a2a346 src: stop spurious warnings on buildkit shutdown 2024-12-19 19:04:07 -05:00
Aditya Maru 1672d6fbad src: fix shutdown retry behavior 2024-12-19 13:04:09 -05:00
Aditya Maru 9302d2aea9 src: stop running process as nohup to avoid missing logs 2024-12-19 12:44:35 -05:00
Aditya Maru ac42783fa9 src: cleanup flakiness in different parts of the action 2024-12-18 09:58:15 -05:00
Aditya Maru 54bc4e0788 src: refactor cleanup logic to expose buildkitd.log
Previously, we only killed the buildkitd process and unmounted
if builderInfo was non-null. This was wrong because we could have set up
buildkitd but failed after that step. This would then rely on the last-ditch
effort by the post action to clean up. We now change the proc kill
and unmount to happen on any build error.
2024-12-16 19:25:47 -05:00
Aditya Maru d43ee61bb7 *: move to grpc backed communication for the agent 2024-12-16 15:29:30 -05:00
Aditya Maru 53000f0f59 ignore error when nothing is mounted 2024-12-15 17:16:24 -05:00
Aditya Maru 1df1b3c361 src: ignore error when there's nothing mounted 2024-12-13 12:32:05 -05:00
Aditya Maru de0451e517 src: make post unmount even if buildkitd is no longer present
Also increase retries when trying to unmount the buildkit directory.
Retry up to 3 seconds now, previously we were only retrying 3 times
with a 100ms backoff.
2024-12-10 21:26:18 -05:00
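
The change in de0451e517 from a fixed retry count to a time budget can be illustrated like this. It is a sketch under stated assumptions: `attemptsWithin` and `retryFor` are hypothetical names, and only the 3-second budget and 100ms backoff come from the commit message.

```typescript
// Number of attempts that fit in a time budget at a fixed backoff.
function attemptsWithin(budgetMs: number, backoffMs: number): number {
  return Math.max(1, Math.floor(budgetMs / backoffMs));
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry `op` with a fixed backoff until it succeeds or the budget is spent.
// The last error is rethrown if every attempt fails.
async function retryFor(
  op: () => Promise<void>,
  budgetMs = 3000,
  backoffMs = 100,
): Promise<void> {
  const attempts = attemptsWithin(budgetMs, backoffMs);
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      await op();
      return;
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) await sleep(backoffMs);
    }
  }
  throw lastError;
}
```

With the defaults above, an unmount wrapped in `retryFor` gets about 30 attempts instead of the earlier 3.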
Aditya Maru 0f99a0b1c7 src: start sending get request with query params
We are incorrectly using formData in a get request. To move
away from this we send both query params and formData until
the server is fully upgraded, after which we can stop sending
formData.
2024-12-09 13:01:35 -05:00
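
The transitional request shape from 0f99a0b1c7 (the same fields sent as query params and as formData) might look like the sketch below. The helper name `buildDualGet`, the URL, and the field names are hypothetical; the actual client code is not shown in this log.

```typescript
// Hypothetical helper: mirror the same fields into the URL query string and
// a form body, so old and new servers can both read them during the upgrade.
function buildDualGet(
  baseUrl: string,
  fields: Record<string, string>,
): { url: string; form: FormData } {
  const url = new URL(baseUrl);
  const form = new FormData();
  for (const [name, value] of Object.entries(fields)) {
    url.searchParams.set(name, value); // read by the upgraded server
    form.append(name, value);          // read by the not-yet-upgraded server
  }
  return { url: url.toString(), form };
}
```

Once the server only reads query params, the `form` half can be dropped without touching callers that use `url`.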
Aditya Maru 0186286e06 *: use axios-retry instead of handrolled retry methods 2024-12-09 13:01:20 -05:00
Aayush Shah 7b8642822f src: make `getDockerfilePath` return the full path to the dockerfile (#64)
Previously we were just returning the path to the dir containing the dockerfile
in most cases.
2024-12-09 12:20:46 -05:00
Aditya Maru f06a558c36 src: alert if an exception is thrown on cleanup 2024-12-08 19:21:46 -05:00
Aditya Maru b76cd7bf3b src: fix bug in conditional that zeroed out expose ID 2024-12-08 18:44:36 -05:00
Aayush f9d1e150a9 *: unify error handling and add more unit tests 2024-12-08 16:41:30 -05:00
Aditya Maru c71ad2dbef *: refactor methods to support mocking
Additionally, write some tests to ensure the driver method
`startBlacksmithBuilder` handles all exceptions correctly in
both nofallback=true and nofallback=false configurations.
2024-12-08 14:35:30 -05:00
Aditya Maru 5ab78173d3 backend: default to /dev/vdb when ExposeVolume response is empty 2024-12-06 22:39:20 -05:00
Aditya Maru edc01b36df backend: use device returned in ExposeVolume response 2024-12-06 22:31:43 -05:00
Aditya Maru 72c7e93db9 src: send stickydisk key with commit 2024-12-03 18:31:15 -05:00
Aditya Maru 6d7db93fa2 src: bump socket creation timeout to 5s from 3s 2024-12-03 16:44:56 -05:00
Aditya Maru c308f14958 src: report the correct sticky disk key 2024-12-03 15:34:30 -05:00
Aditya Maru 17d922af1a src: shuttle an expose ID between expose and commit 2024-12-02 17:33:33 -05:00
Aditya Maru a55bae5255 src: prevent top-level ref variable from being shadowed 2024-12-01 18:07:10 -05:00
Aditya Maru 9841eabab1 src: only resolve buildref on success 2024-12-01 12:55:52 -05:00
Aditya Maru 4938a7e10a src: change arch to use BLACKSMITH_ENV 2024-11-30 12:22:23 -05:00
Aditya Maru 9336122050 src: add some idempotent cleanup safeguard in post action 2024-11-30 11:13:02 -05:00
Aditya Maru bdf7f0bb37 src: wrap all steps after blacksmith builder in try catch
This ensures we always run cleanup if any step after creating the Blacksmith
builder errors out.
2024-11-30 09:25:09 -05:00
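
The guarantee described in bdf7f0bb37 (cleanup always runs if a later step fails) can be sketched as below. The commit message says try/catch; this sketch uses `finally` instead so the same cleanup also covers the success path. `buildWithCleanup` and its parameters are hypothetical names.

```typescript
// Run the post-builder steps and always run cleanup afterwards.
// If `steps` throws, cleanup still runs and the original error propagates.
async function buildWithCleanup(
  steps: () => Promise<void>,
  cleanup: () => Promise<void>,
): Promise<void> {
  try {
    await steps();
  } finally {
    await cleanup();
  }
}
```

Keeping `cleanup` idempotent means a later safety-net call, such as one from a post action, is harmless.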
Aditya Maru 9b63433194 src: make blacksmith builder name unique 2024-11-27 22:41:36 -05:00
Aditya Maru bda6587832 src: change sticky disk key to repo name 2024-11-27 21:13:49 -05:00
Aditya Maru c33190b3c9 src: add local mirror to buildkit toml 2024-11-27 17:24:38 -05:00
Aditya Maru 1dee25cffd src: fix the movement of cleanup to the main step 2024-11-27 15:38:13 -05:00
Aditya Maru f16c36e819 src: add resize2fs call if the block device is formatted 2024-11-27 12:40:10 -05:00
Aditya Maru e1da38ff9d src: add petname and vmID to notification 2024-11-27 10:38:29 -05:00
Aditya Maru ac4af6279b src: move shutdown, cleanup, commit from post to after build 2024-11-26 22:09:05 -05:00
Aditya Maru 138e3a2a14 dist: cat buildkit log file if build fails 2024-11-26 21:07:24 -05:00