Refactoring / Decoupling Highly Complex and Confusing Legacy Code
I'm a mid-level back-end developer who has just joined a very big (and important) enterprise Java legacy project (around 13k files) that started 10 years ago. The code is pretty bad: I'm sure you can find every single anti-pattern in existence inside that labyrinth of source code, and somehow it also has 0 tests.
It doesn't even help much to go through the effort of making UML diagrams, because a single module is too big for the UML to clarify anything worthwhile.
In my free time I've been studying some books related to code quality, like:
- Clean Code & Clean Architecture
- Structure and Interpretation of Computer Programs
- Refactoring: Improving the Design of Existing Code
- Unit Testing: Principles, Practices, and Patterns
- Working Effectively with Legacy Code
- Domain-Driven Design: Tackling Complexity in the Heart of Software
I can see some commonalities between them, and how their different concepts relate to building a better-quality code base. But up until now I've only managed to make simpler refactorings. The overall structure of files and dependencies still can't be changed without introducing some weird bug, breaking some unexpected temporal coupling, disturbing a hidden side effect three classes deep, etc.
I feel like if (hopefully when) I manage to properly refactor this code, my expertise will skyrocket (as far above average as a single project can reasonably take it). I'm willing to do whatever it takes to learn how to deal with such complexity effectively (as far as is humanly reasonable). I already feel like I've improved faster than before with just the simple refactorings, as they are more complex than normal in this project.
I guess my real problem started when I moved from refactoring at the variable/name/function level to refactoring at the class/component/dependency level, so it's more of a software architecture problem.
If you are curious or need some context, here are some examples of bad code that can be found in there (some are rarer but still considerable):
Names (basically everything Clean Code says is bad)
- Noise Words (Helper, Info, etc..)
- Unpronounceable Names (c, DtPesqIni, etc..)
- Prefixes & Encodings (strSomething, intSomething, namesList, optionString, etc...)
- Type Redundancies (Abcd abcd, etc..)
- Misinformation (some very misleading names)
- Functions Without Verbs, Class Names as Verbs
- No Clear Intent, some function names don't describe even a little of what the function does
Comments
- Lots & lots of Comments
- Outdated Comments
- 70% are completely Redundant
- 20% are covering for bad names or stand in place of extracting functions
- 10% don't even make sense, don't clarify anything, and would need a comment of their own
Functions & Classes
- 400-line Functions & 3k-line Classes
- Functions that could be divided into 3 or more Classes
- No Function Abstraction Levels considerations
- Extreme examples of how to break every single SOLID principle
- Demeter is crying in his grave (Law of Demeter broken freely)
- Lots of repetitions without Abstracting into Functions or even Variables (while having terrible names, picture that)
- Classes Feature Envy
- Abstract Classes that depend on Child Classes (wtf?)
- Big Circular Dependencies
- Classes that should be Data Structures
- Functions repeating inside different Classes
- Lots & Lots of Parameters (there is a function that managed to have 9 parameters lol)
- Boolean Parameters, Double Boolean Parameters, even a Triple Boolean one
- Output Parameters
- Unused Parameters
- Some clearly bad error handling techniques (although I didn't manage to study it yet)
And more, believe me...
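To make a couple of these concrete, here is a minimal made-up sketch (not actual project code) that combines a noise-word class name, a triple boolean parameter, and an output parameter. Every name in it (`ReportHelper`, `build`, the flags, `errors`) is hypothetical and only meant to show the shape of the problem.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example; none of these names come from the real project.
class ReportHelper {
    // Triple boolean parameter plus an output parameter (`errors`):
    // the call site gives no hint about what each flag means.
    static String build(String name, boolean upper, boolean trim, boolean strict, List<String> errors) {
        String result = name;
        if (trim) {
            result = result.trim();
        }
        if (upper) {
            result = result.toUpperCase();
        }
        if (strict && result.isEmpty()) {
            errors.add("empty name"); // the caller's list is mutated as a side effect
        }
        return result;
    }
}

class SmellDemo {
    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        // Reading this line alone, nobody can tell what true, true, false stand for.
        String out = ReportHelper.build("  acme ", true, true, false, errors);
        System.out.println(out + " " + errors);
    }
}
```

Now picture this repeated across thousands of files, with the flags passed along through several layers of calls.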
I've read that Unit Tests are essential for sustainability and scalability, as well as a first step before starting proper Refactoring. But at the same time, it is harder to Unit Test highly coupled complex components, and after refactoring I would have to refactor the Unit Tests as well.
There is this weird loop: to start complex refactoring you have to build Unit Tests, but to build quality Unit Tests you have to refactor the code.
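As far as I understand it, Working Effectively with Legacy Code approaches this loop with characterization tests: tests that pin down whatever the code currently does, side effects included, before anything is changed. Below is a minimal sketch of my understanding, written in plain Java so it runs without a test framework. `OrderProcessor`, its discount rule, and `auditLog` are all invented for illustration and are not from the real project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical legacy class: the name, fields, and rules are invented for illustration.
class OrderProcessor {
    final List<String> auditLog = new ArrayList<>(); // hidden side effect lives here

    int total(int unitPriceCents, int quantity, boolean vip) {
        int total = unitPriceCents * quantity;
        if (vip) {
            total = total - total / 10; // undocumented 10% discount
        }
        auditLog.add("total=" + total);
        return total;
    }
}

class OrderProcessorCharacterization {
    // Plain-Java check so the sketch does not depend on JUnit or similar.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        OrderProcessor processor = new OrderProcessor();
        // The expected values were obtained by running the code, not from a spec.
        check(processor.total(1000, 3, false) == 3000, "regular total changed");
        check(processor.total(1000, 3, true) == 2700, "vip total changed");
        // Pin the side effect too, so a refactoring cannot silently drop it.
        check(processor.auditLog.equals(Arrays.asList("total=3000", "total=2700")), "audit log changed");
        System.out.println("current behavior pinned");
    }
}
```

The part I'm still unsure about is how well this scales once a class cannot even be instantiated without pulling in half of the module.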
I guess I have 2 questions:
- How does one usefully map out the component decoupling process? (It is basically impossible to juggle all the related, badly named components in one's head while also thinking about how to reformulate specific complex functionality/logic.)
- How/Where does one start to effectively build Unit Tests for huge and complex legacy code from scratch?
Maybe a last one that would already be helpful:
- How does one decide which value/function belongs in which class, given a highly coupled and complex group of classes? (I can't even tell which class should hold specific variables/functions, because none has a clear, distinct role; each one does a little bit of a job that another one does very similarly.)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
