Reversing and debugging EVM Smart contracts: The Execution flow if/else/for/functions (Part 5)
In this part, we will talk about the execution flow: how are statements like if/for and nested functions handled by the EVM in assembly?
Let’s find out!
🔴 This is the 5th part of our series about reversing and debugging EVM smart contracts, here you can find previous & next parts:
- ✅ Reversing and debugging EVM Smart contracts: First steps in assembly (part 1️⃣)
- ✅ Reversing and debugging EVM Smart contracts: Deployment of a smart contract (Part 2️⃣)
- ✅ Reversing and debugging EVM Smart contracts: How the storage layout works? (part 3️⃣)
- ✅ Reversing and Debugging EVM Smart contracts: 5 Instructions to end/abort the Execution (part 4️⃣)
- Reversing and debugging EVM Smart contracts: The Execution flow if/else/for/functions (part 5️⃣)
- Reversing and debugging EVM Smart contracts: Full Smart Contract layout (part 6️⃣)
- Reversing and debugging EVM Smart contracts: External Calls and contract deployment (part 7️⃣)
1. IF/ELSE in assembly
This is our first example of reversing an if/else statement. Compile it WITHOUT the optimizer and call the function flow() with x=true.
Here is the full disassembly of the function.
062 JUMPDEST |0x01|stack after arguments discarded|
063 DUP1 |0x01|0x01|
064 ISZERO |0x00|0x01|
065 PUSH1 4b |0x4b|0x00|0x01|
067 JUMPI |0x01|
068 PUSH1 04 |0x04|0x01|
070 PUSH1 00 |0x00|0x04|0x01|
072 SSTORE |0x01|
073 POP
074 JUMP
075 JUMPDEST
076 PUSH1 09
078 PUSH1 00
080 SSTORE
081 POP
082 JUMP
When a function is called, its arguments are ALWAYS placed on the stack (we will “prove” it later). Since x = true = 1 in the EVM (and therefore false = 0), the stack contains 1 at Stack(0).
At bytes 63 and 64, the top of the stack is duplicated by DUP1 and the ISZERO instruction is executed.
ISZERO pops Stack(0) and checks whether it equals 0. If yes, 1 is pushed to the stack; otherwise 0 is pushed.
As Stack(0) = 1 then 0 is pushed to the stack | 0x00 | 0x01 |
After that 4b is PUSHed to the stack too. And JUMPI is called
| 0x4b | 0x00 | 0x01 |
As Stack(1) = 0, the EVM will NOT jump to 4b.
Therefore, we can easily deduce that if the first argument on the stack is 0, the EVM jumps to 4b (75 in decimal); otherwise the EVM continues the execution flow.
Between 68 and 74, we already know what is happening: the EVM stores 4 in slot number 0. The code between 75 and 82 is the same: the EVM stores 9 in slot number 0.
In both cases, the EVM then jumps to 3c, where the execution ends.
In fact, whenever you see a JUMPI instruction, there is most likely a condition in the solidity source: an IF statement, a WHILE/FOR loop, or a check added by the compiler.
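To check this reading of the listing, here is a minimal Python sketch that replays the instructions above with a toy interpreter. It is not a real EVM: it only knows the opcodes of this listing, it ignores gas, and it assumes that reaching the return address 3c simply stops the execution. The names CODE, EXIT and run are made up for the illustration.

```python
# Toy interpreter for the handful of opcodes used in the listing above.
# Byte offsets are decimal, PUSH1 operands are hex, as in the disassembly.
CODE = {
    62: ("JUMPDEST",), 63: ("DUP1",), 64: ("ISZERO",), 65: ("PUSH1", 0x4B),
    67: ("JUMPI",), 68: ("PUSH1", 0x04), 70: ("PUSH1", 0x00), 72: ("SSTORE",),
    73: ("POP",), 74: ("JUMP",), 75: ("JUMPDEST",), 76: ("PUSH1", 0x09),
    78: ("PUSH1", 0x00), 80: ("SSTORE",), 81: ("POP",), 82: ("JUMP",),
}
EXIT = 0x3C  # return address mentioned in the article; assumed to mean "stop"


def run(x: bool) -> dict:
    stack = [EXIT, int(x)]  # top of the stack is the END of the list
    storage = {}
    pc = 62
    while pc != EXIT:
        op, *arg = CODE[pc]
        size = 2 if op == "PUSH1" else 1  # PUSH1 carries one operand byte
        if op == "PUSH1":
            stack.append(arg[0])
        elif op == "DUP1":
            stack.append(stack[-1])
        elif op == "ISZERO":
            stack.append(1 if stack.pop() == 0 else 0)
        elif op == "POP":
            stack.pop()
        elif op == "SSTORE":
            slot, value = stack.pop(), stack.pop()
            storage[slot] = value
        elif op == "JUMP":
            pc = stack.pop()
            continue
        elif op == "JUMPI":
            dest, cond = stack.pop(), stack.pop()
            if cond != 0:
                pc = dest
                continue
        pc += size  # JUMPDEST and non-taken JUMPI simply fall through
    return storage
```

Calling run(True) leaves 4 in slot 0 and run(False) leaves 9, which matches the two branches described above.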
2. ELSE IF in assembly
What if we use a slightly more complex if statement? This time there will be more “else” branches, but is the assembly code a lot more complex?
(SPOILER: no)
Compile this code (without the optimizer) with solidity 0.8.7 and call flow with any value you want.
As always, let’s disassemble the function:
062 JUMPDEST
063 DUP1
064 PUSH1 01
066 EQ
067 ISZERO
068 PUSH1 4e
070 JUMPI
071 PUSH1 04
073 PUSH1 00
075 SSTORE
076 POP
077 JUMP
078 JUMPDEST
079 DUP1
080 PUSH1 02
082 EQ
083 ISZERO
084 PUSH1 5e
086 JUMPI
087 PUSH1 09
089 PUSH1 00
091 SSTORE
092 POP
093 JUMP
094 JUMPDEST
095 DUP1
096 PUSH1 03
098 EQ
099 ISZERO
100 PUSH1 6e
102 JUMPI
103 PUSH1 0e
105 PUSH1 00
107 SSTORE
108 POP
109 JUMP
110 JUMPDEST
111 DUP1
112 PUSH1 04
114 EQ
115 ISZERO
116 PUSH1 7e
118 JUMPI
119 PUSH1 13
121 PUSH1 00
123 SSTORE
124 POP
125 JUMP
126 JUMPDEST
127 PUSH1 18
129 PUSH1 00
131 SSTORE
132 POP
133 JUMP
This structure looks similar to something we have already seen. Despite being quite long, it’s very simple: it is just a repetition of 2 different blocks:
1. The intermediate condition block (DUP1, PUSH1, EQ, ISZERO, PUSH1, JUMPI)
- It verifies whether the value in Stack(0) is equal to the constant of an intermediate “else if” statement. If not, it JUMPs to the next intermediate condition block, which does the same.
- If yes, it does NOT JUMP and executes the SSTORE block (2).
2. The SSTORE block (PUSH1, PUSH1, SSTORE, POP, JUMP), which we already know very well
- It pushes the value and the slot to the stack and calls SSTORE.
- Once it’s done, the EVM JUMPs to another location and ends the execution.
If none of the conditions is met (i is not equal to 1, 2, 3 or 4), the else statement is triggered between 127 and 133: this is the last SSTORE block, but without any condition.
In fact, the whole structure is very similar to the function selector at the beginning of the EVM execution, which selects the code matching the function signature (we saw it in part 1).
To summarize, the else if statement can be translated into multiple nested ifs in solidity; this yields exactly the same result as if we used else if.
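The equivalence between the else if chain and nested ifs can be sketched in Python. The constants and stored values below are read from the listing (01, 02, 03, 04 and 04, 09, 0e, 13, 18); the function names and the idea of returning the stored value instead of writing to storage are my own simplifications.

```python
# (constant compared against i, value stored in slot 0), read from the listing
BRANCHES = [(0x01, 0x04), (0x02, 0x09), (0x03, 0x0E), (0x04, 0x13)]
DEFAULT = 0x18  # the unconditional SSTORE block at bytes 127-133


def flow_else_if(i: int) -> int:
    """Walk the condition blocks in order, like the chain of JUMPI tests."""
    for constant, value in BRANCHES:
        if i == constant:  # DUP1; PUSH1 constant; EQ
            return value   # no jump: fall into the SSTORE block
        # ISZERO; PUSH1 next; JUMPI -> try the next condition block
    return DEFAULT


def flow_nested_if(i: int) -> int:
    """The same logic written as nested if statements."""
    if i == 1:
        return 4
    else:
        if i == 2:
            return 9
        else:
            if i == 3:
                return 14
            else:
                if i == 4:
                    return 19
                else:
                    return 24
```

Both functions give the same result for every input, which is what the repeated condition blocks in the disassembly suggest.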
3. For Loops in assembly
Contrary to other programming languages, for statements are not widely used in solidity.
The main reason is that functionality which needs a for statement often needs a lot of gas to execute, which can make the smart contract unusable. The same is roughly true for the while statement.
Here is the code we will study: compile it, deploy it and call the function flow with x=10.
The disassembly is a bit harder to study, so you’ll need to be more focused this time :)
062 JUMPDEST
063 PUSH1 00
065 JUMPDEST
066 DUP2
067 DUP2
068 LT
069 ISZERO
070 PUSH1 6c
072 JUMPI
Byte 62 is the entry point of the function flow().
At byte 62, the value 0x0a (10 in decimal) is on the stack: this is the value of x. (Don’t forget what I wrote: function arguments are always on the stack.)
| 0x0a |
At byte 63, 0 is pushed to the stack. This is very likely our variable i = 0 (the initialization in the for loop). | 0x00 | 0x0a |
At byte 65, there is a JUMPDEST instruction; we will see why later.
Between 66 and 69, the value of i is compared to the value of x by using the instruction LT (which means Less Than), and ISZERO inverts the result.
If i is NOT less than x, the EVM jumps to 6c (108 in decimal) at byte 72. If it is, the EVM continues.
Here 0x00 is less than 0x0a, so the execution continues at byte 73.
This is the i < x condition of the for loop.
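The operand order is easy to get wrong here, so here is a small Python sketch of bytes 66 to 72. The stack is modelled as a list whose last element is Stack(0); the function name loop_exits is hypothetical.

```python
def loop_exits(i: int, x: int) -> bool:
    """Return True when the JUMPI at byte 72 would jump to 6c (loop exit)."""
    stack = [x, i]                    # top of the stack is the END of the list
    stack.append(stack[-2])           # DUP2 copies x
    stack.append(stack[-2])           # DUP2 copies i, which is now on top
    a, b = stack.pop(), stack.pop()   # LT pops a = i first, then b = x
    stack.append(1 if a < b else 0)   # LT pushes (i < x)
    stack.append(1 if stack.pop() == 0 else 0)  # ISZERO inverts the result
    condition = stack.pop()           # JUMPI jumps when this is non-zero
    return condition != 0
```

With i = 0 and x = 10 the function returns False, so the loop body runs; with i = 10 it returns True and the loop is left.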
073 DUP1
074 PUSH1 00
076 DUP1
077 DUP3
078 DUP3
079 SLOAD
080 PUSH1 57
082 SWAP2
083 SWAP1
084 PUSH1 88
086 JUMP
The purpose of this code is to SLOAD slot 0 and to push 57 (87 in decimal), which we will need later.
YES, it looks more complex than necessary, but if you turn ON the optimizer, this should simplify to PUSH1 0; SLOAD; PUSH1 88.
After that, the code JUMPs unconditionally to 88 (136 in decimal).
136 JUMPDEST
137 PUSH1 00
139 DUP3
140 NOT
141 DUP3
142 GT
143 ISZERO
144 PUSH1 98
146 JUMPI
147 PUSH1 98
149 PUSH1 b5
151 JUMP
As this article will be long, I won’t explain everything here. All you need to know is that in solidity 0.8.0 and later, the compiler injects code to detect overflows when we add numbers.
For example, for the uint256 type, 2²⁵⁶ - 1 is the largest possible number. If I add 1 to this number without any check, the result wraps around to 0, as 2²⁵⁶ can’t be contained in a 256-bit slot.
The purpose of this code is to test whether there will be an overflow BEFORE the arithmetic operation:
- if yes, the code JUMPs to b5 (181 in dec) and reverts. (You can check the disassembly at 181.)
- if not, the execution continues at 98 (152 in dec).
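The test done by DUP3, NOT, DUP3, GT can be modelled in a few lines of Python. This is only a sketch: MAX and would_overflow are names I chose, and Python integers are unbounded, so NOT is emulated with an XOR against the 256-bit mask.

```python
MAX = 2**256 - 1  # largest uint256


def would_overflow(value: int, i: int) -> bool:
    """Mirror of the check at bytes 139-142: value > NOT(i)."""
    not_i = MAX ^ i       # NOT on a 256-bit word, equal to MAX - i
    return value > not_i  # GT: true exactly when value + i > MAX
```

Since NOT(i) is MAX - i, the comparison value > NOT(i) is just another way of writing value + i > MAX without ever computing the sum that could overflow.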
152 JUMPDEST
153 POP
154 ADD
155 SWAP1
156 JUMP
Once the overflow verification is done, this code adds the value previously SLOADed from slot 0 to i (the loop counter).
After that, the EVM jumps to 57 (87 in dec); 57 was pushed to the stack at byte 80. You’ll understand in the next section why 57 was saved.
087 JUMPDEST
088 SWAP1
089 SWAP2
090 SSTORE
091 POP
092 DUP2
093 SWAP1
094 POP
095 PUSH1 65
097 DUP2
098 PUSH1 9d
100 JUMP
This code SSTOREs the result of the previous addition in slot 0, pushes 65 (101 in dec) for later and jumps directly to 9d (157 in dec).
157 JUMPDEST
158 PUSH1 00
160 PUSH1 00
162 NOT
163 DUP3
164 EQ
165 ISZERO
166 PUSH1 ae
168 JUMPI
169 PUSH1 ae
171 PUSH1 b5
173 JUMP
This code does the same job as the code between 136 and 151: it verifies that the upcoming arithmetic operation (here i + 1) will not overflow, by checking that i is not already equal to 2²⁵⁶ - 1.
If all is OK, it JUMPs to ae (174 in dec).
174 JUMPDEST
175 POP
176 PUSH1 01
178 ADD
179 SWAP1
180 JUMP
It adds 1 to the loop counter i and JUMPs to 65 (101 in dec, which was pushed at byte 95 just before).
101 JUMPDEST
102 SWAP2
103 POP
104 POP
105 PUSH1 41
107 JUMP
The purpose of this code is just to “clean” the stack and jump to 41 (65 in dec).
But do you remember what is at byte 65?
It is the beginning of the loop. With this information, it’s possible to summarize the execution flow:
1. Declare i = 0.
2. Test if i < x; if not, jump directly to the end (8).
3. Load slot 0 (the value variable).
4. Verify that adding i to value will not overflow. If the test fails, go to 181, where the function reverts.
5. Add i to value and SSTORE the result to slot 0.
6. Verify that adding 1 to i (for the incrementation) will not overflow. If the test fails, go to 181, where the function reverts.
7. Add 1 to i and return to 2.
8. End the execution.
The loop runs between 2 and 7 while i < x (in this example we called flow() with x = 10).
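The whole loop can be modelled in Python as follows. This is a sketch under assumptions: the solidity source is not reproduced here, so I assume the loop body adds i to the variable stored in slot 0, and that slot 0 starts at 0. The names Panic, checked_add and flow are only for the illustration.

```python
MAX = 2**256 - 1  # largest uint256


class Panic(Exception):
    """Stands in for the revert at byte 181."""


def checked_add(a: int, b: int) -> int:
    # Same idea as the injected checks: test BEFORE adding.
    if a > MAX - b:
        raise Panic("arithmetic overflow")
    return a + b


def flow(x: int, value: int = 0) -> int:
    i = 0                              # declare i = 0
    while i < x:                       # test i < x, leave the loop if false
        value = checked_add(value, i)  # load slot 0, check, add, store
        i = checked_add(i, 1)          # check, then increment i
    return value                       # end of the execution
```

Under these assumptions, flow(10) leaves 0 + 1 + ... + 9 = 45 in slot 0, and a value close to 2²⁵⁶ - 1 makes the function revert instead of wrapping around.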
This was the longest part of this article, but we’re done with for loops. Let’s now talk about functions.
4. Function call without arguments
This part is the most important part of this post, it will help us to understand the next post. DO NOT SKIP IT.
How do functions behave in assembly?
Here is the code we will analyse:
Compile it WITHOUT the optimizer (but still with solidity version 0.8.7)
And of course, you need to disassemble it :
071 JUMPDEST
072 PUSH1 4d
074 PUSH1 4f
076 JUMP
077 JUMPDEST
078 JUMP
At byte 72, 4d is pushed.
At byte 74, 4f is pushed.
At byte 76, the EVM jumps to Stack(0), which is 4f (79 in dec) in our case.
At byte 79, the code is pretty obvious: it’s the flow2() function.
079 JUMPDEST
080 PUSH1 05
082 PUSH1 00
084 DUP2
085 SWAP1
086 SSTORE
087 POP
088 JUMP
It stores the value 5 in slot 0, nothing more.
After storing the value, the opcode JUMP is executed at byte 88, but where does the JUMP go? What is the value of Stack(0) at this time? This is not obvious.
Do you remember that 4d was PUSHed at byte 72 (just before 4f at byte 74)?
In the function flow2() (defined between 79 and 88), 3 values are added to the stack by PUSH, PUSH and DUP, and 3 are removed by SSTORE (which consumes 2 values) and POP.
So at byte 88, just before the JUMP, the stack is the same as at byte 74, before 4f was pushed for the call.
As a result, Stack(0) = 4d at byte 88, exactly as before the start of flow2(). So the JUMP at 88 goes to Stack(0) = 4d (77 in dec).
THIS IS THE CASE FOR ALL FUNCTIONS!!! ALL FUNCTIONS IN SOLIDITY USE THE STACK WHILE THEY EXECUTE AND CLEAN IT AFTERWARDS. AS A RESULT, THE STACK WILL BE EXACTLY THE SAME BEFORE AND AFTER EXECUTION!
We can notice that after the end of the function flow2(), the EVM lands at byte 77, just after the JUMP at byte 76 that called flow2(). Why is this the case?
After the end of the function flow2(), the function flow() continues. This is why 4d was PUSHed: to save where the execution of flow() must resume.
As flow2() is nested in flow(), after the execution of flow2() the EVM needs to continue the execution flow of flow(). To do that, before calling flow2(), the EVM saves on the stack the address of the next instruction (the JUMPDEST) after the JUMP.
Every time an internal function is called in solidity (as in other assembly languages like x86 or ARM), the byte/address where the calling function must resume is saved on the stack, so the execution can continue once the called function is done.
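Here is a short Python sketch of this call and return mechanism, written opcode by opcode. The stack is a list whose last element is Stack(0). The Python functions flow and flow2 only mimic the listing; the tuple that flow returns is an invention of mine to make the result easy to inspect.

```python
def flow2(stack: list, storage: dict) -> int:
    """Bytes 79-88: store 5 in slot 0, then jump back to the saved byte."""
    stack.append(0x05)                           # PUSH1 05
    stack.append(0x00)                           # PUSH1 00
    stack.append(stack[-2])                      # DUP2
    stack[-1], stack[-2] = stack[-2], stack[-1]  # SWAP1
    slot, value = stack.pop(), stack.pop()       # SSTORE consumes 2 values
    storage[slot] = value
    stack.pop()                                  # POP
    return stack.pop()                           # JUMP pops its destination


def flow():
    """Bytes 71-76: save the resume byte, then call flow2()."""
    stack, storage = [], {}
    stack.append(0x4D)          # PUSH1 4d: where flow() must resume
    stack.append(0x4F)          # PUSH1 4f: address of flow2()
    dest = stack.pop()          # JUMP at byte 76
    assert dest == 0x4F
    before = list(stack)        # stack when flow2() starts
    resume_at = flow2(stack, storage)
    return before, resume_at, stack, storage
```

When flow2() starts, the only value on the stack is 4d, and this is exactly the value its final JUMP consumes, so the execution resumes at byte 77.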
If all is OK, let’s make the function a bit more complex: what if we add arguments to the flow2() function (like a uint)?
5. Function call with arguments
This is our 2nd example of reversing function calls: compile it WITHOUT the optimizer.
Let’s disassemble THAT:
087 JUMPDEST
088 DUP1
089 PUSH1 00
091 DUP2
092 SWAP1
093 SSTORE
094 POP
095 POP
096 JUMP
097 JUMPDEST
098 PUSH1 69
100 PUSH1 05
102 PUSH1 57
104 JUMP
105 JUMPDEST
106 JUMP
The entry point of the function flow() is at instruction 97; just after it, the EVM PUSHes 69, 05 and 57 to the stack.
As you may guess:
- 69 (105 in dec) saves the byte where flow() resumes after the function call
- 05 is the argument of the function
- 57 (87 in dec) is the address of the function flow2(), which is called by the JUMP at byte 104.
Between 87 and 96 lies the function flow2(), which SSTOREs the content of Stack(0), 05 in this case (the argument provided to flow2()).
After that, it jumps back to 69 (105 in dec), as this is the end of the function.
The disassembly is almost exactly the same as before (apart from the functions lying in other areas of the code); the only true difference is that the argument 5 is pushed to the stack.
As with the first code, the function cleans the stack every time.
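One way to convince yourself that both versions of flow2() clean the stack is to count what each opcode adds and removes. The table below is a sketch limited to the opcodes used in the two flow2() listings (with and without an argument); NET, FLOW2_NO_ARGS, FLOW2_ONE_ARG and net_effect are hypothetical names.

```python
# Net stack change of each opcode: items pushed minus items popped.
NET = {
    "JUMPDEST": 0, "PUSH1": +1, "DUP1": +1, "DUP2": +1,
    "SWAP1": 0, "SSTORE": -2, "POP": -1, "JUMP": -1,
}

# flow2() without arguments (bytes 79-88 of the previous section)
FLOW2_NO_ARGS = ["JUMPDEST", "PUSH1", "PUSH1", "DUP2", "SWAP1",
                 "SSTORE", "POP", "JUMP"]
# flow2() with one argument (bytes 87-96 of this section)
FLOW2_ONE_ARG = ["JUMPDEST", "DUP1", "PUSH1", "DUP2", "SWAP1",
                 "SSTORE", "POP", "POP", "JUMP"]


def net_effect(opcodes: list) -> int:
    """Total change of the stack height after running the opcodes."""
    return sum(NET[op] for op in opcodes)
```

The first version has a net effect of -1 (only the saved byte is consumed) and the second one -2 (the saved byte and the argument), so in both cases nothing pushed for the call is left behind.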
6. Function Call with return value
Now, instead of passing an argument, let’s see what happens if the flow2() function returns a value.
SPOILER: the idea is the same and the difference is marginal too.
The full disassembly (entry point of the function flow() is 90) :
090 PUSH1 00
092 PUSH1 61
094 PUSH1 6d
096 JUMP
097 JUMPDEST
098 SWAP1
099 POP
100 DUP1
101 PUSH1 00
103 DUP2
104 SWAP1
105 SSTORE
106 POP
107 POP
108 JUMP
109 JUMPDEST
110 PUSH1 00
112 PUSH1 05
114 SWAP1
115 POP
116 SWAP1
117 JUMP
Between 90 and 94, 0x00, 0x61 and 0x6d are pushed to the stack.
The function then JUMPs to 6d (109 in dec).
The function flow2(), between 109 and 117, pushes 5 to the stack (its instructions boil down to PUSH 5; I think the optimizer would have to be enabled to see that in the code).
At 117, the top of the stack is | 0x61 | 0x05 |
61 we already know: this is the saved byte (97 in dec) where flow() resumes.
But what is 5? As you may guess, this is the return value of the function.
As you may have expected, the return value is pushed to the stack too.
After the execution of flow2(), the flow() function continues with the same stack as before the call (as said before), but with the value 5 on top!
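The stack states around the JUMP at 117 can be replayed with a few lines of Python. As before, the last element of the list is Stack(0); the snapshot variable is my own addition to capture the stack right before the final JUMP.

```python
def flow2(stack: list):
    """Bytes 109-117: leave the return value 5 under the saved byte."""
    stack.append(0x00)                           # PUSH1 00 (initial return value)
    stack.append(0x05)                           # PUSH1 05
    stack[-1], stack[-2] = stack[-2], stack[-1]  # SWAP1
    stack.pop()                                  # POP: drop the initial 0
    stack[-1], stack[-2] = stack[-2], stack[-1]  # SWAP1: saved byte on top
    snapshot = list(stack)                       # stack just before byte 117
    return stack.pop(), snapshot                 # JUMP pops its destination


stack = [0x00]      # PUSH1 00 at byte 90
stack.append(0x61)  # PUSH1 61: where flow() must resume
stack.append(0x6D)  # PUSH1 6d: address of flow2()
stack.pop()         # JUMP at byte 96
resume_at, snapshot = flow2(stack)
```

Just before the JUMP, the two top values are 61 and 5; once 61 is consumed, flow() resumes at byte 97 with the return value 5 at Stack(0).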
7. Let’s bring it together
Finally, this is the last example of part 5:
What if we bring the 3 examples together, by adding a return value and 2 arguments to the flow2() function?
Let’s analyse that! (SPOILER: there isn’t much difference either)
117 JUMPDEST
118 PUSH1 00
120 PUSH2 0083
123 PUSH1 05
125 PUSH1 07
127 PUSH2 008f
130 JUMP
131 JUMPDEST
132 SWAP1
133 POP
134 DUP1
135 PUSH1 00
137 DUP2
138 SWAP1
139 SSTORE
140 POP
141 POP
142 JUMP
143 JUMPDEST
144 PUSH1 00
146 DUP3
147 SWAP1
148 POP
149 SWAP3
150 SWAP2
151 POP
152 POP
153 JUMP
The flow() function starts with the JUMPDEST at byte 117, and its first PUSH is at 118.
- 83 (131 in dec) is the saved byte where flow() resumes
- 05 and 07 are the arguments
- 8f (143 in dec) is the address of the function flow2()
Between 143 and 153, the function flow2() removes y (7) because it doesn’t need this value, places x (5) on the stack and returns it.
The function JUMPs to the saved byte (83, 131 in dec), and the execution of flow() resumes by storing the returned value 5.
Once it’s done, it jumps to the STOP spot and the execution ends there.
There isn’t much to say about this function; this behavior was expected. The arguments, the saved byte and the return value were all stored on the stack, and the function did its job correctly.
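To double-check this walkthrough, the listing can be replayed by a toy interpreter in Python. This is a sketch, not a real EVM: it only handles the opcode families present in the listing, and STOP_SPOT is a hypothetical placeholder because the address flow() finally jumps to is not part of the listing. The names parse and run are mine.

```python
import re

LISTING = """\
117 JUMPDEST
118 PUSH1 00
120 PUSH2 0083
123 PUSH1 05
125 PUSH1 07
127 PUSH2 008f
130 JUMP
131 JUMPDEST
132 SWAP1
133 POP
134 DUP1
135 PUSH1 00
137 DUP2
138 SWAP1
139 SSTORE
140 POP
141 POP
142 JUMP
143 JUMPDEST
144 PUSH1 00
146 DUP3
147 SWAP1
148 POP
149 SWAP3
150 SWAP2
151 POP
152 POP
153 JUMP
"""
STOP_SPOT = -1  # hypothetical placeholder for the final return address


def parse(listing: str) -> dict:
    code = {}
    for line in listing.split("\n"):
        if line.strip():
            offset, op, *operand = line.split()
            code[int(offset)] = (op, int(operand[0], 16) if operand else None)
    return code


def run(code: dict, entry: int, exit_pc: int):
    stack, storage, jumps = [exit_pc], {}, []  # top of the stack = end of list
    pc = entry
    while pc != exit_pc:
        op, operand = code[pc]
        name, n = re.fullmatch(r"([A-Z]+)(\d*)", op).groups()
        n = int(n) if n else 0
        next_pc = pc + 1 + (n if name == "PUSH" else 0)  # PUSHn has n operand bytes
        if name == "PUSH":
            stack.append(operand)
        elif name == "DUP":
            stack.append(stack[-n])
        elif name == "SWAP":
            stack[-1], stack[-1 - n] = stack[-1 - n], stack[-1]
        elif name == "POP":
            stack.pop()
        elif name == "SSTORE":
            slot, value = stack.pop(), stack.pop()
            storage[slot] = value
        elif name == "JUMP":
            next_pc = stack.pop()
            jumps.append((pc, next_pc, list(stack)))  # record stack after the jump
        pc = next_pc
    return storage, jumps


storage, jumps = run(parse(LISTING), 117, STOP_SPOT)
```

The recorded jumps show the call at 130 going to 143 with 7 and 5 on top of the saved byte 83, the return at 153 going to 131 with only the value 5 left above the initial 0, and slot 0 finally containing 5.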
So what do you need to remember?
When you call a function in solidity (in assembly):
- The EVM PUSHes the byte where to resume and all arguments to the stack before the call
- The function is executed
- ALL return values are PUSHED to the stack
8. Conclusion
We’re done with part 5. This was the hardest part of this series, but it was necessary. Now we have a better understanding of the execution flow in assembly in solidity.
Next, we will learn the full solidity smart contract layout and its different parts.