How to Put a Legacy Application in Git

The risk of encountering an application that is not yet under version control decreases over time, but it still exists. Do you know how to pull such an application from a server and put it in a Git repository?

Photo by Min An on Pexels.

Have you ever encountered an old application that, in 2022, is still not under a version control system (VCS) like Git or Apache Subversion? A system where, to change anything, you have to connect to the server hosting the application and edit the files directly with vim? 🤮

At Ville de Sherbrooke, one of our main systems was in exactly that situation. Anyone could easily have introduced changes or flaws, and only the person who made a change would know why it was made! 😲

It was time to start tracking the changes in that system. 😎

Structuring

Since a legacy system can be very complex, we first need to decide what we want to version and how. 🤔

Git branches

Our system has two environments: production and development. Because each can run a different version of the application, each needs its own branch in Git: main and develop.

While you're thinking about Git, you can go to GitLab or GitHub and create a new repository. Make sure it's empty (no README or license file), because the first push will be a lot easier. 😉

Development files

Since the developers made changes directly on the servers, chances are there are files that we don't want to be versioned, such as copies or to-do lists.

So we need to list the files we want with an allowlist (also known as a whitelist).

We can start this list with the files on the production server since there shouldn't be many files we don't want. 🤗

index.html
api.js
search.js
main.js
main.css
README.md
An example of a text file, simply named files.txt, that lists all the files that should be versioned for a custom application.
💡
If you know the extensions of files you want, you can easily create that file with a command similar to the one below.

find . -type f \( -regex ".*\.js" -o -regex ".*\.css" \) > files.txt

It will list all files that match the regexes. You'll be able to modify this base file to include or exclude more content. 😉
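
Before relying on the list, it can help to confirm that every entry points to an existing file, since a typo would only surface later when the archive is created. The following is a minimal sketch, assuming the list holds one path per line, relative to the application directory. The function name check_allowlist is hypothetical.

```shell
# Hypothetical helper: print every entry of an allowlist that is not a regular file.
# Returns 0 when all entries exist, 1 otherwise.
# Usage: check_allowlist LIST_FILE [BASE_DIRECTORY]
check_allowlist() {
  list_file=$1
  base_dir=${2:-.}
  missing=0
  while IFS= read -r path || [ -n "$path" ]; do
    # Ignore blank lines
    if [ -z "$path" ]; then
      continue
    fi
    if [ ! -f "$base_dir/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done < "$list_file"
  return "$missing"
}
```

Run from the application directory, check_allowlist files.txt prints nothing when every listed file exists.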

Now that we're structured, let's move on to the scripting phase! 🤗

Scripting

Since we shouldn't add Git to the production server (this could introduce potential vulnerabilities there), we'll download the files we want to our local machine with scp. 😌

💡
For those interested, here's a useful cheat sheet to have on hand if you're new to this command. 😋

Create an archive of our application

# From our local computer, upload the file that contains the list of files we want to version
scp files.txt USERNAME@PRODUCTION_SERVER_NAME:/PATH_OF_OUR_APPLICATION

# Connect to the production server
ssh USERNAME@PRODUCTION_SERVER_NAME

# Go to where the application is
cd /PATH_OF_OUR_APPLICATION

# Create an archive containing the files we want
tar --create --preserve-permissions --file APPLICATION_NAME.tar --files-from files.txt --verbose

# Go back to our local computer
exit
The commands you need to use to create the application archive.
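
Before leaving the server, it may be worth confirming that the archive holds exactly the listed files. The sketch below compares the archive listing with files.txt. It assumes GNU tar (as in the commands above) and a list that contains only file paths, written the same way tar stores them. The name verify_archive is hypothetical.

```shell
# Hypothetical helper: succeed only if the archive contains exactly the allowlisted paths.
# Usage: verify_archive ARCHIVE LIST_FILE
verify_archive() {
  archive=$1
  list_file=$2
  # Sort both sides so that the order of the entries does not matter
  expected=$(sort "$list_file")
  actual=$(tar --list --file "$archive" | sort)
  [ "$expected" = "$actual" ]
}
```

For example, verify_archive APPLICATION_NAME.tar files.txt && echo "Archive matches the list" could be run right after the tar command.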

Download the source code to our local computer

# Create a new directory to receive the source code
mkdir -p ~/Downloads/APPLICATION_NAME

# Download our TAR archive to our computer
scp USERNAME@PRODUCTION_SERVER_NAME:/PATH_OF_OUR_APPLICATION/APPLICATION_NAME.tar ~/Downloads

# Extract the TAR archive to our new directory
tar --extract --preserve-permissions --file ~/Downloads/APPLICATION_NAME.tar --directory ~/Downloads/APPLICATION_NAME --verbose
The commands you need to use to download the source code to our local computer.
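
To make sure the archive was not truncated or altered during the transfer, its checksum can be compared on both sides. This is a sketch, assuming sha256sum from GNU coreutils is available on the server and on the local machine (on macOS, shasum -a 256 is the usual equivalent). The helper name same_checksum is hypothetical.

```shell
# Hypothetical helper: succeed if the SHA-256 digest of FILE equals EXPECTED_DIGEST.
# Usage: same_checksum FILE EXPECTED_DIGEST
same_checksum() {
  # sha256sum prints "<digest>  <file name>"; keep only the digest
  actual=$(sha256sum "$1" | cut -d ' ' -f 1)
  [ "$actual" = "$2" ]
}

# Example with the placeholders used above (commented out because it needs a real server):
# remote=$(ssh USERNAME@PRODUCTION_SERVER_NAME "sha256sum /PATH_OF_OUR_APPLICATION/APPLICATION_NAME.tar" | cut -d ' ' -f 1)
# same_checksum ~/Downloads/APPLICATION_NAME.tar "$remote" && echo "Archive intact"
```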

Upload the source code to a GitLab repository

# Go in the directory that now contains our source code
cd ~/Downloads/APPLICATION_NAME

# Initialize the Git repository in that directory
git init
git remote add origin git@gitlab.com:benjaminrancourt/APPLICATION.git

# Add all files to the Git repository
git add .

# Commit changes
git commit -m "Source code from the production environment"

# Rename the current branch to main
git branch -M main

# Push the main branch to GitLab
git push origin main
The commands to execute to upload our source code to a GitLab repository.
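
Because git add . stages everything in the directory, it can be useful to check what is staged before committing, in case a stray file slipped in. The sketch below prints the staged files that are not in the allowlist. It assumes the list is kept outside the repository and that its paths have no leading ./ (git ls-files prints paths without it). The name unexpected_files is hypothetical.

```shell
# Hypothetical helper: print the staged files that are absent from the allowlist.
# Run it inside the repository, after "git add ." and before "git commit".
# Usage: unexpected_files LIST_FILE
unexpected_files() {
  # --fixed-strings and --line-regexp compare whole paths literally;
  # --invert-match keeps only the paths that are not in the list.
  # grep exits with 1 when nothing is printed, hence the "|| true".
  git ls-files | grep --invert-match --line-regexp --fixed-strings --file="$1" || true
}
```

An empty output means that only allowlisted files are about to be committed.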

Repeat for the other environment

Remember that our legacy application may have a different version in each environment? If so, you'll probably want to know which changes haven't reached the production server yet. 😙

So we need to push those versions to Git as well, so we can compare them easily. 🙂

For example, for my development environment, I would:

# Download the archive from the development server (follow the previous procedure) and save it as ~/Downloads/APPLICATION_NAME_DEV.tar

# Go in the directory that now contains our source code
cd ~/Downloads/APPLICATION_NAME

# Make sure we are on the main branch
git checkout main

# Create a new branch from main
git checkout -b develop

# Extract the archive from the development server 
tar --extract --preserve-permissions --file ~/Downloads/APPLICATION_NAME_DEV.tar --directory ~/Downloads/APPLICATION_NAME --verbose

# Add all changes
git add .

# Commit the changes
git commit -m "Source code from the development environment"

# Push the develop branch to GitLab
git push origin develop
The commands to push source code from the development environment to GitLab.
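
One caveat with extracting over the main checkout: a file that exists in production but was deleted in development stays in the working tree, so the develop branch would still contain it. A possible workaround, sketched here as a hypothetical helper named replace_tracked_files, is to delete the tracked files right before extracting the development archive so that the removals are staged too. It assumes GNU tar and an absolute path to the archive.

```shell
# Hypothetical helper: replace the tracked files with the content of an archive.
# Run it inside the repository, on the develop branch.
# Usage: replace_tracked_files ARCHIVE
replace_tracked_files() {
  # Delete every tracked file from the working tree (the .git directory is untouched)
  git ls-files -z | xargs -0 rm -f --
  # Extract the archive; files that are absent from it stay deleted
  tar --extract --preserve-permissions --file "$1"
  # Stage additions, modifications and deletions
  git add --all .
}
```

It would take the place of the tar --extract and git add . steps, followed by the same commit and push.
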
A merge request from develop to main, where I can easily identify changes that are not yet in production.
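
The same comparison can also be made locally, without opening a merge request. Below is a small sketch built on git diff. The wrapper name pending_changes is hypothetical, and the branch names default to the two branches created above.

```shell
# Hypothetical helper: list the files that differ between two branches.
# Output letters: A = added in the second branch, M = modified, D = deleted.
# Usage: pending_changes [BASE_BRANCH] [OTHER_BRANCH]
pending_changes() {
  git diff --name-status "${1:-main}" "${2:-develop}"
}

# To see the detailed changes of a single file (main.js is only an example):
# git diff main develop -- main.js
```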

Conclusion

By putting your legacy system into a version control system, you greatly reduce the risk of losing your application's source code, and every future change can be tracked.

A step forward towards a better system! 😎