Refactoring deployer exec root-homes command #5279

Merged 14 commits on Dec 17, 2024
46 changes: 24 additions & 22 deletions deployer/README.md
@@ -384,6 +384,30 @@ setup or an outage), or when taking down a hub.
All these commands take a cluster and hub name as parameters, and perform appropriate
authentication before performing their function.

#### `exec hub`

This subcommand gives you an interactive shell on the hub pod itself, so you
can poke around to see what's going on. Particularly useful if you want to peek
at the hub db with the `sqlite` command.

#### `exec homes`

This subcommand gives you a shell with the home directories of all the
users on the given hub in the given cluster mounted under `/home`.
Very helpful when doing (rare) manual operations on user home directories,
such as renames.

When you exit the shell, the temporary pod spun up is removed.

#### `exec root-homes`

Similar to `exec homes` but mounts the _entire_ NFS filesystem to `/root-homes`.
You can optionally mount a secondary NFS share if required, which is useful when migrating data across servers.
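
As a rough illustration of how the optional secondary share is requested, the sketch below assembles the command line for `exec root-homes`. The helper `root_homes_argv` and the placeholder values are hypothetical and exist only for this example; the flag names (`--persist`, `--extra-nfs-server`, `--extra-nfs-base-path`, `--extra-nfs-mount-path`) are the ones that appear in this PR's changes to `infra_components.py` and `storage-quota.md`.

```python
def root_homes_argv(
    cluster_name,
    hub_name,
    persist=False,
    extra_nfs_server=None,
    extra_nfs_base_path=None,
    extra_nfs_mount_path=None,
):
    """Build the argv for `deployer exec root-homes` (hypothetical helper)."""
    # Cluster and hub stay positional; everything else is an option.
    argv = ["deployer", "exec", "root-homes", cluster_name, hub_name]
    if persist:
        argv.append("--persist")
    optional_flags = (
        ("--extra-nfs-server", extra_nfs_server),
        ("--extra-nfs-base-path", extra_nfs_base_path),
        ("--extra-nfs-mount-path", extra_nfs_mount_path),
    )
    for flag, value in optional_flags:
        # Only emit flags the caller actually set.
        if value is not None:
            argv.append(f"{flag}={value}")
    return argv


if __name__ == "__main__":
    # Placeholder values, not real cluster, hub, or server names.
    print(" ".join(root_homes_argv(
        "<cluster_name>",
        "<hub_name>",
        extra_nfs_server="<nfs_service_ip>",
        extra_nfs_base_path="/",
        extra_nfs_mount_path="/new-nfs-volume",
    )))
```

Because the extra NFS settings are options rather than positional arguments, a plain `deployer exec root-homes <cluster_name> <hub_name>` stays valid when no secondary share is needed.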

#### `exec aws`

This sub-command can exec into a shell with appropriate AWS credentials (including MFA).

#### `exec debug`

This sub-command is useful for debugging.
@@ -410,28 +434,6 @@ in our 2i2c cluster that can be accessed via this command, and speeds up image b
Once you run this command, run `export DOCKER_HOST=tcp://localhost:23760` in another terminal to use the faster remote
docker daemon.

#### `exec shell`
This exec sub-command can be used to acquire a shell in various places of the infrastructure.

##### `exec shell hub`

This subcommand gives you an interactive shell on the hub pod itself, so you
can poke around to see what's going on. Particularly useful if you want to peek
at the hub db with the `sqlite` command.

##### `exec shell homes`

This subcommand gives you a shell with the home directories of all the
users on the given hub in the given cluster mounted under `/home`.
Very helpful when doing (rare) manual operations on user home directories,
such as renames.

When you exit the shell, the temporary pod spun up is removed.

##### `exec shell aws`

This sub-command can exec into a shall with appropriate AWS credentials (including MFA).

## Running Tests

To execute tests on the `deployer`, you will need to install the development requirements and then invoke `pytest` from the root of the repository.
68 changes: 46 additions & 22 deletions deployer/commands/exec/infra_components.py
@@ -1,6 +1,7 @@
import json
import os
import subprocess
import tempfile

import typer
from ruamel.yaml import YAML
@@ -22,13 +23,18 @@
def root_homes(
cluster_name: str = typer.Argument(..., help="Name of cluster to operate on"),
hub_name: str = typer.Argument(..., help="Name of hub to operate on"),
extra_nfs_server: str = typer.Argument(
persist: bool = typer.Option(
False,
"--persist",
help="Do not automatically delete the pod after completing. Useful for long-running processes.",
),
extra_nfs_server: str = typer.Option(
None, help="IP address of an extra NFS server to mount"
),
extra_nfs_base_path: str = typer.Argument(
extra_nfs_base_path: str = typer.Option(
None, help="Path of the extra NFS share to mount"
),
extra_nfs_mount_path: str = typer.Argument(
extra_nfs_mount_path: str = typer.Option(
None, help="Mount point for the extra NFS share"
),
):
@@ -40,12 +46,11 @@ def root_homes(
with open(config_file_path) as f:
cluster = Cluster(yaml.load(f), config_file_path.parent)

with cluster.auth():
hubs = cluster.hubs
hub = next((hub for hub in hubs if hub.spec["name"] == hub_name), None)
if not hub:
print_colour(f"Hub does not exist in {cluster_name} cluster")
return
hubs = cluster.hubs
hub = next((hub for hub in hubs if hub.spec["name"] == hub_name), None)
if not hub:
print_colour(f"Hub does not exist in {cluster_name} cluster")
return

server_ip = base_share_name = ""
for values_file in hub.spec["helm_chart_values_files"]:
@@ -94,6 +99,10 @@ def root_homes(
pod = {
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": pod_name,
"namespace": hub_name,
},
"spec": {
"terminationGracePeriodSeconds": 1,
"automountServiceAccountToken": False,
@@ -112,27 +121,42 @@ def root_homes(
},
}

cmd = [
# Command to exec into pod
exec_cmd = [
"kubectl",
"-n",
hub_name,
"run",
"--rm", # Remove pod when we're done
"-it", # Give us a shell!
"--overrides",
json.dumps(pod),
"--image",
# Use ubuntu image so we get GNU rm and other tools
# Should match what we have in our pod definition
UBUNTU_IMAGE,
"exec",
"-it",
pod_name,
"--",
"/bin/bash",
"-l",
]

with cluster.auth():
subprocess.check_call(cmd)
with tempfile.NamedTemporaryFile(mode="w", suffix=".json") as tmpf:
# Dump the pod spec to a temporary file
json.dump(pod, tmpf)
tmpf.flush()

with cluster.auth():
try:
# We are splitting the previous `kubectl run` command into a
# create and exec couplet, because using run meant that the bash
# process we start would be assigned PID 1. This is a 'special'
# PID and killing it is equivalent to sending a shutdown command
# that will cause the pod to restart, killing any processes
# running in it. By using create then exec instead, we will be
# assigned a PID other than 1 and we can safely exit the pod to
# leave long-running processes if required.
#
# Ask api-server to create a pod
subprocess.check_call(["kubectl", "create", "-f", tmpf.name])
# Exec into pod
subprocess.check_call(exec_cmd)
finally:
if not persist:
delete_pod(pod_name, hub_name)
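
The create-then-exec flow described in the comment above can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the deployer's implementation: `run_in_pod` and its injectable `run` parameter are hypothetical names introduced so the sequence can be exercised without a cluster, and the cleanup step uses a plain `kubectl delete pod` because the body of the `delete_pod` helper is not visible in this diff. The sketch reads the pod name and namespace from the manifest's `metadata`, which `kubectl create -f` needs since, unlike `kubectl run`, it is not given a pod name on the command line.

```python
import json
import subprocess
import tempfile


def run_in_pod(pod, persist=False, run=subprocess.check_call):
    """Create a pod from a manifest, open a login shell in it, then clean up.

    Hypothetical helper: `run` defaults to subprocess.check_call and is
    injectable only so the call sequence can be checked without kubectl.
    """
    name = pod["metadata"]["name"]
    namespace = pod["metadata"]["namespace"]

    with tempfile.NamedTemporaryFile(mode="w", suffix=".json") as tmpf:
        json.dump(pod, tmpf)
        tmpf.flush()  # make sure the whole manifest is on disk before kubectl reads it

        try:
            # Step 1: ask the API server to create the pod.
            run(["kubectl", "create", "-f", tmpf.name])
            # Step 2: start a separate shell process inside the running pod.
            run(["kubectl", "-n", namespace, "exec", "-it", name, "--", "/bin/bash", "-l"])
        finally:
            # Step 3: remove the pod unless the caller asked to keep it.
            if not persist:
                run(["kubectl", "-n", namespace, "delete", "pod", name])
```

With `persist=True` the `finally` branch does nothing, so the pod and anything still running in it outlive the interactive shell. The sketch does not wait for the pod to become ready before calling `exec`; a real command may need such a wait.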


@exec_app.command()
@@ -180,7 +204,7 @@ def homes(
"-n",
hub_name,
"run",
"--rm", # Remove pod when we're done
"--rm", # Deletes the pod when the process completes, successfully or otherwise
"-it", # Give us a shell!
"--overrides",
json.dumps(pod),
2 changes: 1 addition & 1 deletion docs/howto/features/storage-quota.md
@@ -82,7 +82,7 @@ Here's an example of how to do this:

```bash
# Create a throwaway pod with both the existing home directories and the new NFS server mounted
deployer exec root-homes <cluster_name> <hub_name> --extra_nfs_server=<nfs_service_ip> --extra_nfs_base_path=/ --extra_nfs_mount_path=/new-nfs-volume
deployer exec root-homes <cluster_name> <hub_name> --extra-nfs-server=<nfs_service_ip> --extra-nfs-base-path=/ --extra-nfs-mount-path=/new-nfs-volume

# Copy the existing home directories to the new NFS server while keeping the original permissions
rsync -av --info=progress2 /root-homes/<path-to-the-parent-of-user-home-directories> /new-nfs-volume/