The Code Archaeologist


I fell in love with the idea of archaeology, probably like many other kids my age, when Harrison Ford first donned his famous fedora and bullwhip in the first Indiana Jones movie, “Raiders of the Lost Ark”.

By day, he was a mild-mannered professor of archaeology at a prestigious university; by night, a rough-and-tumble adventurer seeking fame and glory. He searched for fabulous lost cities and treasures, fought villains and goons and, of course, won the heart of the beautiful girl at the end.

Of course, archaeology, in reality, is about as far from the romantic and adventurous Indiana Jones movies as you can get.

A lot of real archaeology is dry research, and when it actually is time to get out into the field, it’s not as if whole lost cities like Atlantis and El Dorado are waiting for an adventuresome archaeologist to find the right treasure map leading directly to the lost treasure.

I found that out real fast when I took an Intro to Archaeology course for fun during my freshman year in college.

Indiana Jones may have been able to find the Ark of the Covenant, just waiting for him to pluck out of an underground tomb, but that’s a fantasy of the silver screen.

A lot of real archaeology in the field is dealing with tiny little shards and fragments that have eroded and decayed for hundreds, if not thousands, of years deep inside the earth. Hours of backbreaking, sweaty work outdoors, all for the faint glimmer of a hope that you’ll find even the tiniest bits of nothing.

So there went my hopes of being the next Indiana Jones, world-traveling archaeologist.

I ended up in a completely unrelated field, which is the world of software programming.

But has it been so unrelated?

Lately, I’ve been reminiscing about how eerily similar many of my programming tasks are to archaeology.

It all goes back to the concept of legacy technology.

Ask any software developer whether they’ve ever had to deal with legacy code, and I practically guarantee every single one will say ‘yes’.

Legacy code is software that has stood the test of time and has managed to stay alive in a production environment for years.

You might actually be the original developer of a particular legacy code project, but it’s more likely that you inherited one that was developed by someone else, probably many years before you even came onboard to your particular development team.

Even worse, that original developer or designer has likely moved to a different division of the company or even to a completely different organization.

The end result is that you have forever lost access to any original expertise and knowledge that went into the original conceptual design and implementation of that codebase.

Why is that such a big deal? After all, code is code, right? A programmer worth their salt should be able to start examining the source code and related documentation and be able to pick it up, right?

While that may be technically correct, the answer is much more nuanced than that.

Code is code, yes, that’s true.

But on the flip side, code isn’t meant for us puny humans. Code is the language of machines.

At the lowest level, computer machine code is simply long strings of 1s and 0s.

Binary has been the “native tongue” of computers since the earliest days of digital computing, and it remains the native language of every modern computer and device on the planet to this day.

But what computer code does NOT convey or explain is WHY a computer program was created in the first place.

Let’s step back a little and examine this a little further.

Why Are Software Programs Written?

At least in the business world, computer programs are written to solve a particular problem more quickly and efficiently than the equivalent effort by a human being.

Before a single line of code is written, there’s ideally a lot of initial analysis and requirements gathering carried out by the business stakeholders and the developer or development team.

These discussions revolve around the high-level business requirements and an understanding of the problem statement.

For instance, perhaps the original project was kicked off to automate the collection of sales lead information and store it in a digital repository.

Maybe all this sales lead information was tracked haphazardly and, in many cases, manually on paper by the salespeople.

So the Vice President of Sales, in a fit of inspiration, decides he wants to bring all this valuable sales lead information into a centralized, easy-to-access digital repository so that it can be easily queried and shared with other divisions of the company.

It would also make it easier to generate valuable reports for upper management.

And of course, a lot of these initial discussions would cover the specific business rules and logic that need to go into the application.

This is usually the way most business applications go through their original inception and implementation … lots of analysis and requirements gathering.

At the end of this phase of project development, the original software developers should technically become business domain experts around the project. They should be able to understand, at a business level, the business problem they are trying to solve.

This is an extremely important milestone for the developer to achieve.

Because without truly understanding the business domain and the related problem, there’s literally no chance the developer will be able to solve the problem, no matter how much whiz-bang technology you throw at the problem.

So assuming the original developer or development team successfully implements the project, all is right with the world.

And then time passes.

And more time. Days, weeks, months, and eventually years.

The project is so successful, it continues to gain popularity inside the organization. More and more people and departments come to rely on the original system until it reaches mission-critical status.

That means even a few minutes of downtime or outages could potentially lead to lost revenue for the company.

But sooner or later, the original developer or development team moves on, either to a different team or division or to greener pastures outside of the organization.

But even if the original developer or team is no longer there, it doesn’t change the fact that the application continues to live and exist. In fact, the application grows even more important and critical to the organization.

And herein lies the crux of the problem.

It’s very likely that you will lose all access to the original developer(s) of the project.

All that business domain knowledge and expertise is no longer around. But you better believe the organization continues to assign newer developers to own and maintain that codebase.

The big challenge for the new developers is attempting to understand all the original effort and thought processes that went into the initial development.

Since the original technical staff are no longer around to ask questions, the new developer or development team has no choice but to dig through the codebase and make their own educated guesses about the design decisions and effort that were poured into the original application.

Easier said than done.

The Art of Digital Code Archaeology

What the newer developers are essentially doing is digital code archaeology… digging through the original codebase and sifting through the code to try to decipher what’s going on… much like a real archaeologist in the field.

And keep in mind, there’s no roadmap or guide leading the way.

The more time passes, the more technically challenging it becomes for these digital code archaeologists.


The problem is the nature of technology.

It’s always evolving.

And these days, it’s evolving faster than ever.

What that means is that the hot new programming language that was used in the original implementation of the project has been aging over time.

Take the high-level programming language COBOL.

Originally designed in 1959, it is still widely used today by many financial institutions and banks.

That’s a testament to its staying power.

But at the same time, it continues to create long-term maintenance problems.

Remember the big Y2K brouhaha around the turn of the millennium?

It was caused largely by programs, many of them written in COBOL for banks, that used two digits to represent a year … i.e. 99 to represent the year “1999”.

The problem revolved around what would happen when the calendar switched to the year 2000.

Would all these original COBOL programs, which were designed to implement very important bank transactions, continue to process those transactions correctly? Or would they cause monumental software crashes and unwanted side effects as the year 2000 rolled around?

Could it potentially cause major banks and stock markets around the world to go into unwanted chaos?

This was a real problem that many banks and other institutions poured significant money, time and resources into, in order to ensure this wouldn’t happen.
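
To make the two-digit year problem concrete, here is a minimal sketch in Python (not COBOL, and not taken from any real banking system). The function names and the pivot value of 50 are my own assumptions, purely for illustration. The first function does date arithmetic directly on two-digit years, the way a legacy record might store them. The second applies "windowing", one commonly used Y2K remediation, which guesses the century from a pivot year.

```python
def years_elapsed_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive elapsed-years calculation on two-digit years.

    Hypothetical example: works fine within one century,
    but breaks as soon as the dates straddle the year 2000.
    """
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing: treat yy below the pivot as 20yy, otherwise as 19yy.

    The pivot of 50 is an assumption; a real system would pick one
    based on the range of dates its data can actually contain.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year between 0 and 99")
    return 2000 + yy if yy < pivot else 1900 + yy


def years_elapsed_windowed(start_yy: int, end_yy: int, pivot: int = 50) -> int:
    """Elapsed years after expanding both values to four digits."""
    return expand_year(end_yy, pivot) - expand_year(start_yy, pivot)


if __name__ == "__main__":
    # An account opened in "99" and checked in "00"
    print(years_elapsed_two_digit(99, 0))  # -99 (one year looks like minus 99)
    print(years_elapsed_windowed(99, 0))   # 1
```

Windowing only postpones the problem, since the pivot eventually becomes wrong too, which is part of why some organizations chose to expand their stored years to four digits instead.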

Why Maintaining Legacy Code Is So Hard

Part of the challenge was finding anyone who even knew how to write COBOL.

Many of those original COBOL programmers were retired or working for other organizations.

Those old-time veterans who were willing to go back into the workforce to fix potential Y2K problems in the code probably made a handsome profit as independent consultants.

But it illustrates the point that the longer a codebase continues to exist and function within an organization, the greater the problem of maintaining that codebase becomes.

The Y2K millennium problem is not just an isolated case.

All code, sooner or later, becomes legacy code, like it or not.

And I foresee a potentially new avenue of software development that will arise out of this … the CODE ARCHAEOLOGIST.

The code archaeologist will know how to go into a legacy codebase and decipher the code, and either continue to make additions and enhancements to that codebase, OR know how to translate that legacy codebase into something more modern.

Easier said than done, of course.

But for those adventuresome software developers who love these kinds of challenges, I foresee many companies and organizations willing to fork out the dough to entice these code archaeologists to go in and do exactly this.

It actually sounds quite interesting.

All I’m going to need is that fedora and bullwhip …
