Understand Source Code — Diving Deep into the Code Base, Locally and in Production | by Shai Almog | Jun, 2022

Why you need to debug when there’s no bug

Say you have a brand new code base to study, or you picked up an open source project. You might be a seasoned developer for whom this is another project in a packed resume. Alternatively, you might be a junior engineer for whom this is the first “real” project.

It doesn’t matter!

With completely new source code repositories, we still know nothing.

The seasoned senior might have a leg up in finding some things and recognizing patterns. But none of us can read and truly follow a project with 1M+ lines of code. We look over the docs, which in my experience usually have only a passing resemblance to the actual code base. We separate the various modules and make assumptions about them, but picking them up is hard.

We use IDE tools to search for connections, but that is really hard. It’s like following a yarn thread after an army of cats had its way with it.

As a consultant for over a decade, I picked up new client projects on a weekly basis. In this post, I’ll describe the approach I used to do that and how I adapted this approach further at Lightrun.

The programming language can help a lot. As Java developers, we’re lucky: code base exploration tools are remarkably reliable. We can dig into the code and find usages. IDEs highlight unused code and they’re pretty great for this. But this has several problems:

  • We need to know where to look and how deeply
  • Code might be used by tests or by APIs that aren’t actually used by users
  • Flow is hard to understand via usage, especially asynchronous flow
  • There’s no context, such as data, to help explain the flow of the code

There has to be a better way than randomly combing through source files.

Another option is UML chart generation from source files. My personal experience with these tools hasn’t been great. They’re supposed to help with the “big picture,” but I often felt even more confused by these tools.

They elevate minor implementation details and unused code to equal footing in a mind-bogglingly complex chart. With a typical code base, we need more than a high-level view. The devil is in the details, and our perception should be of the actual code base in version control, not some theoretical model.

Debuggers instantly solve all these problems. We can instantly verify assumptions, see “real world” usage, and step over a code block to understand the flow. We can place a breakpoint to see if we reached a piece of code. If it’s reached too frequently and we can’t figure out what’s going on, we can make this breakpoint conditional.
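To make the conditional breakpoint concrete, here is a minimal sketch. The `OrderAudit` class, its `Order` type, and the 1,000 threshold are all hypothetical, invented for illustration. The comment marks where the breakpoint would go, and `expectedStops` mirrors the kind of boolean expression you would type into the IDE’s condition field.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example; none of these names come from a real code base.
class OrderAudit {
    static final class Order {
        final int id;
        final double total;

        Order(int id, double total) {
            this.id = id;
            this.total = total;
        }
    }

    static double totalRevenue(List<Order> orders) {
        double sum = 0;
        for (Order order : orders) {
            // Breakpoint on the next line, with the condition: order.total > 1000.0
            // Without the condition, the debugger would stop on every iteration.
            sum += order.total;
        }
        return sum;
    }

    // Mirrors the breakpoint condition so we can count how often it would fire.
    static int expectedStops(List<Order> orders) {
        int stops = 0;
        for (Order order : orders) {
            if (order.total > 1000.0) {
                stops++;
            }
        }
        return stops;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order(1, 20.0), new Order(2, 1500.0), new Order(3, 999.0));
        System.out.println("Revenue: " + totalRevenue(orders));
        System.out.println("Breakpoint would suspend " + expectedStops(orders) + " time(s)");
    }
}
```

On a list with thousands of orders, the condition is what turns an unusable breakpoint into one that stops only on the cases you want to study.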

I make it a habit to read the values of variables in the watch when using a debugger to study the code base.

Does this value make sense at this point in time?

If it doesn’t, then I have something to look at and figure out. With this tool, I can quickly understand the semantics of the code base. In the following sections, I’ll cover techniques for learning code both in the debugger and in production.

Notice that I use Java, but this should work for any other programming language since the concepts are (mostly) universal.

I think most developers know about field watchpoints and just forget about them!

“Who changed this value and why?” is probably the most common question asked by developers. When we look through the code, there might be dozens of code flows that trigger a change. But placing a watchpoint on a field will tell you everything in seconds.

Understanding state mutation and propagation is probably the most important thing you can do when studying a code base.
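As a sketch of the question a field watchpoint answers, consider a field that several flows can mutate. The `Account` class and its methods are hypothetical. In the IDE, a watchpoint on `balance` suspends on every write, whichever flow triggered it. The `lastWriter` bookkeeping below is only a rough, hand-rolled stand-in for the call stack the debugger would show you, not something to leave in real code.

```java
// Hypothetical example class used only to illustrate field watchpoints.
class Account {
    private long balance;                 // The field watchpoint goes on this field.
    private String lastWriter = "none";

    private void setBalance(long newBalance) {
        // Frame 0 is setBalance itself; frame 1 is the method that requested the change.
        lastWriter = new Throwable().getStackTrace()[1].getMethodName();
        balance = newBalance;
    }

    // Three separate code flows that all end up mutating the same field.
    void deposit(long amount)  { setBalance(balance + amount); }
    void withdraw(long amount) { setBalance(balance - amount); }
    void applyMonthlyFee()     { setBalance(balance - 5); }

    long balance()      { return balance; }
    String lastWriter() { return lastWriter; }

    public static void main(String[] args) {
        Account account = new Account();
        account.deposit(100);
        account.applyMonthlyFee();
        System.out.println("balance=" + account.balance()
                + ", last changed by " + account.lastWriter());
    }
}
```

The watchpoint gives you the same answer with no code change at all, and it also catches writes that bypass a setter.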

One of the most important things to understand when stepping through methods is the return value. Unfortunately, with debuggers, this information is often “lost” when returning from a method, and we might miss a critical part of the flow.

Luckily, most IDEs let us inspect the return value dynamically and see what the method returned from its execution. In JetBrains IDEs, such as IntelliJ IDEA, we can enable “Show Method Return Values” as I discuss here.
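The hypothetical snippet below shows why a return value gets “lost.” The result of `normalize` is consumed inline and never lands in a variable you could watch. With the return value option enabled, the debugger shows it after you step out of the call. Without it, you would have to edit the code and introduce a temporary variable.

```java
import java.util.Locale;

// Hypothetical example; the names are invented for illustration.
class ReturnValueDemo {
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    static boolean isAdmin(String rawRole) {
        // normalize(...) returns a value that is never stored in a local variable,
        // so there is nothing to inspect in the variables view once the call returns.
        return normalize(rawRole).equals("admin");
    }

    public static void main(String[] args) {
        System.out.println(isAdmin("  ADMIN "));
        System.out.println(isAdmin("guest"));
    }
}
```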

Why is this line needed?

What would happen if it wasn’t there?

That’s a pretty common set of questions. With a debugger, we can change control flow to jump to a specific line of code or force an early return from a method with a specific value. This can help us check situations around a specific line, e.g., what if this method was invoked with value X instead of Y?

Simple. Just drag the execution back a bit and invoke the method again with a different value. This is much easier than reading through a deep hierarchy.

Object marking is one of those little-known debugger capabilities that are invaluable and remarkably powerful. It has a big role in understanding “what the hell is going on.”

You know how, when you debug a value, you write down the pointer to the object so you can keep track of “what’s going on in this code block?”

This becomes very hard to keep track of, so we limit the interaction to just a few pointers. Object marking lets us skip this by keeping a reference to a pointer under a fixed name. Even when the object is out of scope, the reference marker is still valid.

We can start tracking objects to understand the flow and see how things work. For example, if we’re looking at a User object in the debugger and want to keep track of it, we can just keep a reference to it. Then we can use conditional breakpoints with the user object in place to detect the areas of the system that access the user.

This is also remarkably useful for keeping track of threads, which can help in understanding code where the threading logic is complex.
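A small sketch of why marking matters. The `User` class and `touch` method are hypothetical. Two instances can be equal by value yet be different objects, and a marked object lets a breakpoint condition test for one exact instance. In IntelliJ IDEA, an object marked with the label `MyUser` can be referenced in expressions as `MyUser_DebugLabel`; other debuggers may use a different syntax.

```java
import java.util.Objects;

// Hypothetical example; the names are invented for illustration.
class MarkingDemo {
    static final class User {
        final String name;

        User(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof User && ((User) other).name.equals(name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }
    }

    static int touches = 0;

    static void touch(User user) {
        // Conditional breakpoint on the next line, after marking one instance as "MyUser":
        //   user == MyUser_DebugLabel
        // The same idea works for a marked thread:
        //   Thread.currentThread() == MyThread_DebugLabel
        touches++;
    }

    public static void main(String[] args) {
        User first = new User("dana");
        User second = new User("dana");
        touch(first);
        touch(second);
        // Equal by value, but two distinct objects: only identity tells them apart.
        System.out.println("equals: " + first.equals(second) + ", same instance: " + (first == second));
    }
}
```

A condition based on `equals` or on the name would fire for both objects. The label fires only for the one you marked, even in code where that instance is no longer reachable from the current scope.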

A common situation I run into is a case where I see a component in the debugger, but I’ve been looking for a different instance of this object. For example, say you have an object called UserMetaData. Does every user object have a corresponding UserMetaData object?

As a solution, we can use the memory inspection tool and see which objects of the given type are held in memory!

Seeing actual object instance values and reviewing them helps put numbers/facts behind the objects. It’s a powerful tool that helps us visualize the data.

During development, we often just add logs to see “was this line reached.” Obviously, a breakpoint has advantages, but we don’t always want to stop. Stopping might change threading behavior, and it can also be quite tedious.

But adding a log can be worse. We need to recompile, rerun, and might accidentally commit it to the repository. It’s a problematic tool for debugging and for studying code.

Luckily, we have tracepoints, which are effectively logs that let us print out expressions, etc.
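As a rough illustration, the hypothetical method below shows the throwaway `println` many of us add, and where a tracepoint would replace it. In IntelliJ IDEA, this is a breakpoint with “Suspend” unchecked and “Evaluate and log” set to an expression; other IDEs call the same thing a logpoint or tracepoint.

```java
// Hypothetical example; the names are invented for illustration.
class TracepointDemo {
    static int countRetries(int[] latenciesMs, int timeoutMs) {
        int retries = 0;
        for (int latency : latenciesMs) {
            if (latency > timeoutMs) {
                // Instead of adding (and later forgetting to remove) a line such as
                //   System.out.println("retry, latency=" + latency);
                // place a non-suspending breakpoint on the next line and have it log
                // the expression: "retry, latency=" + latency
                retries++;
            }
        }
        return retries;
    }

    public static void main(String[] args) {
        System.out.println(countRetries(new int[] {10, 250, 90, 400}, 100));
    }
}
```

The source file stays untouched, so there is nothing to recompile and nothing to commit by accident.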

This works great for “simple” systems. But there are platforms and settings in our industry that are remarkably hard to reproduce in a debugger. Knowing how our code works locally is one thing. The way it works in production is something completely different.

Production is the one thing that really matters, and in multi-developer projects, it’s really hard to evaluate the gap between production and assumption.

We call this reality coverage. For example, you can get 80% coverage in your tests. But if your coverage is low on classes that are heavily accessed in your source code repository… the QA might be less effective. We can study the repo over and over. We can use every code analysis tool and setting. But they won’t show us the two things that really matter:

Is this actually used in production?

How is this used in production?

Without this information, we might waste our time. For example, when dealing with millions of lines in the repo, you don’t want to waste your study time reading a method that isn’t heavily used.

In order to get insight into production and debug that environment, we’ll need a developer observability tool, like Lightrun. You can install it for free here.

A counter lets us see how frequently a line of code was reached. This is one of the most valuable tools at our disposal.

Is this method even reached?

Is this block in the code reached? How often?

If you want to understand where to focus your energies first, the counter is probably the most useful tool at your disposal. You can read about counters here.
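To show what a counter reports, here is a hand-rolled stand-in. It is not Lightrun’s API: a developer observability tool injects the equivalent at runtime, without a code change or a redeploy. The `HitCounter` class, the two sample methods, and the location labels are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, hand-written approximation of a "line reached" counter.
class HitCounter {
    private static final ConcurrentHashMap<String, LongAdder> HITS = new ConcurrentHashMap<>();

    // Records that the named code location was reached once. Safe to call from many threads.
    static void hit(String location) {
        HITS.computeIfAbsent(location, key -> new LongAdder()).increment();
    }

    // Returns how many times the location was reached, or 0 if it never was.
    static long count(String location) {
        LongAdder adder = HITS.get(location);
        return adder == null ? 0 : adder.sum();
    }

    static void checkout()     { hit("checkout"); }
    static void legacyExport() { hit("legacyExport"); }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            checkout();
        }
        // legacyExport() is never called here, so its count stays at zero.
        System.out.println("checkout: " + count("checkout"));
        System.out.println("legacyExport: " + count("legacyExport"));
    }
}
```

A method whose counter stays at zero over a representative period is a good candidate to skip on a first reading of the code base.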

We often look at a statement and make various assumptions. When the code is new to us, these assumptions might be pivotal to our understanding of the code. A good example is something like “most of the users who use this feature have been with the system for a while and should be familiar with it.”

You can test that with conditional statements that you can attach to any action (logs, counters, snapshots, etc.). As a result, we can use a condition like user.signupDate.getMillis() < ….

You can add this to a counter and literally count the users that don’t fit your expectations.
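Here is a sketch of the same check in plain Java. The condition in the text reads like Joda-Time (`getMillis()`); this version swaps in `java.time.Instant` and leaves the cutoff as a parameter, since the right threshold depends on your system. `SignupAssumption`, `User`, and `countRecentUsers` are hypothetical names, and the dates are made up.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

// Hypothetical example; the names and dates are invented for illustration.
class SignupAssumption {
    static final class User {
        final String name;
        final Instant signupDate;

        User(String name, Instant signupDate) {
            this.name = name;
            this.signupDate = signupDate;
        }
    }

    // Counts the users who contradict the assumption that everyone using
    // the feature signed up before the cutoff.
    static long countRecentUsers(List<User> featureUsers, Instant cutoff) {
        long recent = 0;
        for (User user : featureUsers) {
            // Same shape as a counter with a condition on the signup date.
            if (user.signupDate.isAfter(cutoff)) {
                recent++;
            }
        }
        return recent;
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User("veteran", Instant.parse("2019-01-01T00:00:00Z")),
                new User("newcomer", Instant.parse("2022-05-01T00:00:00Z")));
        Instant cutoff = Instant.parse("2021-06-01T00:00:00Z");
        System.out.println("Users who break the assumption: " + countRecentUsers(users, cutoff));
    }
}
```

If the count is a large share of the feature’s users, the assumption was wrong and the code deserves a second read with that in mind.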

I think it’s obvious how injecting a log at runtime can make an enormous difference in understanding our system. But in production, this comes at a cost. I’m studying the system while looking at the logs, and all my “methodX reached with value Y” logs add noise for our poor DevOps/SRE teams.

We can’t have that. Studying something should be solitary, but by definition, production is the exact opposite of that.

With piping, we can log everything locally to the IDE and spare everyone else the noise. Since the logic is sandboxed, there will be no overhead if you log too much. So go wild!

One of the enormous challenges in learning is understanding the things we don’t know “yet.” Snapshots help us get a bigger picture of the code. Snapshots are like placing a breakpoint and reviewing the values of the variables in the stack to see if we understand the concepts. They just “don’t break,” so you get all the information you can use to study, but the system keeps performing as usual, including threading behavior.

Again, using conditional snapshots is very helpful in pinpointing a specific question. For example, what’s going on when a user with permission X uses this method?

Simple. Place a conditional snapshot for the permission. Then inspect the resulting snapshot to see how variable values are affected along the way.
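The following is a hand-rolled approximation of a conditional snapshot, not Lightrun’s implementation. When the condition matches, it records variable values and the call stack, then lets the thread carry on. All names here (`SnapshotDemo`, `exportReport`, the `"EXPORT"` permission) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, hand-written approximation of a non-breaking conditional snapshot.
class SnapshotDemo {
    static final class Snapshot {
        final Map<String, Object> variables;
        final StackTraceElement[] stack;

        Snapshot(Map<String, Object> variables, StackTraceElement[] stack) {
            this.variables = variables;
            this.stack = stack;
        }
    }

    static final List<Snapshot> SNAPSHOTS = Collections.synchronizedList(new ArrayList<>());

    static int exportReport(String userName, List<String> permissions, int rows) {
        boolean canExport = permissions.contains("EXPORT");
        int exported = canExport ? rows : 0;
        // The condition: capture only when the user holds the permission in question.
        if (canExport) {
            Map<String, Object> variables = new LinkedHashMap<>();
            variables.put("userName", userName);
            variables.put("rows", rows);
            variables.put("exported", exported);
            SNAPSHOTS.add(new Snapshot(variables, new Throwable().getStackTrace()));
        }
        // Execution continues immediately; nothing was suspended.
        return exported;
    }

    public static void main(String[] args) {
        exportReport("guest", Arrays.asList("READ"), 10);
        exportReport("dana", Arrays.asList("READ", "EXPORT"), 10);
        for (Snapshot snapshot : SNAPSHOTS) {
            System.out.println(snapshot.variables + " at " + snapshot.stack[0]);
        }
    }
}
```

A real snapshot action does this from outside the process, so the method under study never needs the extra bookkeeping.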

Developers often have a strained relationship with debugging tools.

On the one hand, they’re often the tool that saves our bacon and helps us find the bug. On the other hand, they’re the tool we look at when we realize we were complete morons for the past few hours.

I hope this guide will encourage you to pick up debuggers when you don’t have a bug. The insights provided by running a debugger, even when you aren’t actively debugging, can be “game changing.”
